Commit e4b53016 authored by Mauro Carvalho Chehab

edac.txt: update information about newer Intel CPUs



There's a chapter at edac.rst written by the time Nehalem
support was added. Such information is used not only by the
Nehalem driver (i7core_edac), but by all newer Intel CPU
architectures that are supported by i7core_edac, sb_edac
and sbx_edac drivers.

Update the information to reflect that.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
parent 96714bd7
+29 −15
@@ -741,13 +741,25 @@ The ``test_device_edac`` sample driver is located at the
 http://bluesmoke.sourceforge.net project site for EDAC.


-Nehalem Usage of EDAC APIs
---------------------------
+Usage of EDAC APIs on Nehalem and newer Intel CPUs
+--------------------------------------------------

-Due to the way Nehalem exports Memory Controller data, some adjustments
-were done at i7core_edac driver. This chapter will cover those differences
+On older Intel architectures, the memory controller was part of the North
+Bridge chipset. Nehalem, Sandy Bridge, Ivy Bridge, Haswell, Sky Lake and
+newer Intel architectures integrated an enhanced version of the memory
+controller (MC) inside the CPUs.

-1) On Nehalem, there is one Memory Controller per Quick Patch Interconnect
+This chapter will cover the differences of the enhanced memory controllers
+found on newer Intel CPUs, such as ``i7core_edac``, ``sb_edac`` and
+``sbx_edac`` drivers.
+
+.. note::
+
+   The Xeon E7 processor families use a separate chip for the memory
+   controller, called Intel Scalable Memory Buffer. This section doesn't
+   apply for such families.
+
+1) There is one Memory Controller per Quick Patch Interconnect
    (QPI). At the driver, the term "socket" means one QPI. This is
    associated with a physical CPU socket.

@@ -757,7 +769,7 @@ were done at i7core_edac driver. This chapter will cover those differences

    The minimum known unity is DIMMs. There are no information about csrows.
    As EDAC API maps the minimum unity is csrows, the driver sequentially
-   maps channel/dimm into different csrows.
+   maps channel/DIMM into different csrows.

    For example, supposing the following layout::

@@ -780,8 +792,8 @@ were done at i7core_edac driver. This chapter will cover those differences

    Each QPI is exported as a different memory controller.

-2) Nehalem MC has the ability to generate errors. The driver implements this
-   functionality via some error injection nodes:
+2) The MC has the ability to inject errors to test drivers. The drivers
+   implement this functionality via some error injection nodes:

    For injecting a memory error, there are some sysfs nodes, under
    ``/sys/devices/system/edac/mc/mc?/``:
@@ -855,13 +867,14 @@ were done at i7core_edac driver. This chapter will cover those differences

 	EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))

-3) Nehalem specific Corrected Error memory counters
+3) Corrected Error memory register counters

-   Nehalem have some registers to count memory errors. The driver uses those
-   registers to report Corrected Errors on devices with Registered Dimms.
+   Those newer MCs have some registers to count memory errors. The driver
+   uses those registers to report Corrected Errors on devices with Registered
+   DIMMs.

-   However, those counters don't work with Unregistered Dimms. As the chipset
-   offers some counters that also work with UDIMMS (but with a worse level of
+   However, those counters don't work with Unregistered DIMM. As the chipset
+   offers some counters that also work with UDIMMs (but with a worse level of
    granularity than the default ones), the driver exposes those registers for
    UDIMM memories.

@@ -896,8 +909,8 @@ were done at i7core_edac driver. This chapter will cover those differences
 4) Standard error counters

    The standard error counters are generated when an mcelog error is received
-   by the driver. Since, with udimm, this is counted by software, it is
-   possible that some errors could be lost. With rdimm's, they display the
+   by the driver. Since, with UDIMM, this is counted by software, it is
+   possible that some errors could be lost. With RDIMM's, they display the
    contents of the registers

 Reference documents used on ``amd64_edac``
@@ -958,6 +971,7 @@ Credits
 * |copy| Mauro Carvalho Chehab

   - 05 Aug 2009	Nehalem interface
+  - 26 Oct 2016 Converted to ReST and cleanups at the Nehalem section

 * EDAC authors/maintainers: