<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/edac, branch v3.4</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/drivers/edac?h=v3.4</id>
<link rel='self' href='https://git.amat.us/linux/atom/drivers/edac?h=v3.4'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-04-07T00:56:20Z</updated>
<entry>
<title>Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile</title>
<updated>2012-04-07T00:56:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-04-07T00:56:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4157368edbc3d69b05e9294a73c84fc9c96bdec4'/>
<id>urn:sha1:4157368edbc3d69b05e9294a73c84fc9c96bdec4</id>
<content type='text'>
Pull arch/tile bug fixes from Chris Metcalf:
 "This includes Paul Gortmaker's change to fix the &lt;asm/system.h&gt;
  disintegration issues on tile, a fix to unbreak the tilepro ethernet
  driver, and a backlog of bugfix-only changes from internal Tilera
  development over the last few months.

  They have all been to LKML and on linux-next for the last few days.
  The EDAC change to MAINTAINERS is an oddity but discussion on the
  linux-edac list suggested I ask you to pull that change through my
  tree since they don't have a tree to pull edac changes from at the
  moment."

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (39 commits)
  drivers/net/ethernet/tile: fix netdev_alloc_skb() bombing
  MAINTAINERS: update EDAC information
  tilepro ethernet driver: fix a few minor issues
  tile-srom.c driver: minor code cleanup
  edac: say "TILEGx" not "TILEPro" for the tilegx edac driver
  arch/tile: avoid accidentally unmasking NMI-type interrupt accidentally
  arch/tile: remove bogus performance optimization
  arch/tile: return SIGBUS for addresses that are unaligned AND invalid
  arch/tile: fix finv_buffer_remote() for tilegx
  arch/tile: use atomic exchange in arch_write_unlock()
  arch/tile: stop mentioning the "kvm" subdirectory
  arch/tile: export the page_home() function.
  arch/tile: fix pointer cast in cacheflush.c
  arch/tile: fix single-stepping over swint1 instructions on tilegx
  arch/tile: implement panic_smp_self_stop()
  arch/tile: add "nop" after "nap" to help GX idle power draw
  arch/tile: use proper memparse() for "maxmem" options
  arch/tile: fix up locking in pgtable.c slightly
  arch/tile: don't leak kernel memory when we unload modules
  arch/tile: fix bug in delay_backoff()
  ...
</content>
</entry>
<entry>
<title>MCE, AMD: Drop too granulary family model checks</title>
<updated>2012-04-04T13:50:11Z</updated>
<author>
<name>Borislav Petkov</name>
<email>borislav.petkov@amd.com</email>
</author>
<published>2012-04-04T12:21:02Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ec3e82d6dc46cac7309b01ff9761f469b0263019'/>
<id>urn:sha1:ec3e82d6dc46cac7309b01ff9761f469b0263019</id>
<content type='text'>
MCA details seldom change between the models of a family, so don't
be too conservative and enable decoding on everything from K8
onwards. Minor adjustments can come in later but, most importantly,
we have some decoding infrastructure in place for upcoming models by
default.

Signed-off-by: Borislav Petkov &lt;borislav.petkov@amd.com&gt;
</content>
</entry>
<entry>
<title>edac: say "TILEGx" not "TILEPro" for the tilegx edac driver</title>
<updated>2012-04-02T16:14:06Z</updated>
<author>
<name>Chris Metcalf</name>
<email>cmetcalf@tilera.com</email>
</author>
<published>2012-03-30T22:58:37Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e2e110d7596656e2badd21c48713bd01e1b40f44'/>
<id>urn:sha1:e2e110d7596656e2badd21c48713bd01e1b40f44</id>
<content type='text'>
This is just an aesthetic change but it was silly to say TILEPro
when booting up on the tilegx architecture.

Signed-off-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac</title>
<updated>2012-03-28T21:24:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-28T21:24:40Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f0f3680e50352c57b6cfc5b0d44d63bb0aa20f80'/>
<id>urn:sha1:f0f3680e50352c57b6cfc5b0d44d63bb0aa20f80</id>
<content type='text'>
Pull EDAC fixes from Mauro Carvalho Chehab:
 "A series of EDAC driver fixes.  It also has one core fix in the
  documentation, and a rename patch, fixing the name of the struct that
  contains the rank information."

* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  edac: rename channel_info to rank_info
  i5400_edac: Avoid calling pci_put_device() twice
  edac: i5100 ack error detection register after each read
  edac: i5100 fix erroneous define for M1Err
  edac: sb_edac: Fix a wrong value setting for the previous value
  edac: sb_edac: Fix a INTERLEAVE_MODE() misuse
  edac: sb_edac: Let the driver depend on PCI_MMCONFIG
  edac: Improve the comments to better describe the memory concepts
  edac/ppc4xx_edac: Fix compilation
  Fix sb_edac compilation with 32 bits kernels
</content>
</entry>
<entry>
<title>Merge tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux</title>
<updated>2012-03-24T17:41:37Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-24T17:41:37Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=250f6715a4112d6686670c5a62ceb9305da94616'/>
<id>urn:sha1:250f6715a4112d6686670c5a62ceb9305da94616</id>
<content type='text'>
Pull &lt;linux/device.h&gt; avoidance patches from Paul Gortmaker:
 "Nearly every subsystem has some kind of header with a proto like:

	void foo(struct device *dev);

  and yet there is no reason for most of these guys to care about the
  sub fields within the device struct.  This allows us to significantly
  reduce the scope of headers including headers.  For this instance, a
  reduction of about 40% is achieved by replacing the include with the
  simple fact that the device is some kind of a struct.

  Unlike the much larger module.h cleanup, this one is simply two
  commits.  One to fix the implicit &lt;linux/device.h&gt; users, and then one
  to delete the device.h includes from the linux/include/ dir wherever
  possible."

* tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  device.h: audit and cleanup users in main include dir
  device.h: cleanup users outside of linux/include (C files)
</content>
</entry>
<entry>
<title>Merge tag 'amd64-edac-updates-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp</title>
<updated>2012-03-24T00:59:47Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-24T00:59:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=dae430c6f6e5d0b98c238c340a41a39e221e8940'/>
<id>urn:sha1:dae430c6f6e5d0b98c238c340a41a39e221e8940</id>
<content type='text'>
Pull AMD64 EDAC fixes from Borislav Petkov:
 "A bunch of fixes/updates for the AMD side of EDAC including

   * MCE decoding updates
   * tree-wide EDAC sweep making pci_device_ids __devinitconst
   * Scrub rate API correction
   * two amd64_edac corrections for K8 boxes and sysfs csrow nodes"

* tag 'amd64-edac-updates-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  MCE, AMD: Constify error tables
  MCE, AMD: Correct bank 5 error signatures
  MCE, AMD: Rework NB MCE signatures
  MCE, AMD: Correct VB data error description
  MCE, AMD: Correct ucode patch buffer description
  MCE, AMD: Correct some MC0 error types
  EDAC: Make pci_device_id tables __devinitconst.
  EDAC: Correct scrub rate API
  amd64_edac: Fix K8 revD and later chip select sizes
  amd64_edac: Fix missing csrows sysfs nodes
</content>
</entry>
<entry>
<title>edac: rename channel_info to rank_info</title>
<updated>2012-03-21T18:22:50Z</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab@redhat.com</email>
</author>
<published>2012-01-27T13:26:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a4b4be3fd7a76021f67380b03d8bccebf067db72'/>
<id>urn:sha1:a4b4be3fd7a76021f67380b03d8bccebf067db72</id>
<content type='text'>
What a csrow/channel vector points to is rank information, not
channel information.

On a traditional architecture, the memory controller directly accesses
the memory ranks, via chip select rows. Different ranks on the same DIMM
are selected via different chip select rows. So, typically, one
csrow/channel pair means one different DIMM.

On FB-DIMMs, there's a microcontroller chip on the DIMM, called the
Advanced Memory Buffer (AMB), that serves as the interface between the
memory controller and the memory chips.

The AMB selection is via the DIMM slot, and not via a csrow.

It is up to the AMB to talk with the csrows of the DRAM chips.

So, the FB-DIMM memory controllers see the DIMM slot, and not the DIMM
rank. RAMBUS is similar.

Newer memory controllers, like the ones found on Intel Sandy Bridge and
Nehalem, even when working with normal DDR3 DIMMs, don't use the usual
channel A/channel B interleaving scheme to provide 128-bit data access.

Instead, they have more channels (3 or 4 channels), and they can use
several interleaving schemes. Such memory controllers see the DIMMs
directly in their registers, instead of the ranks, which is better for
the driver, as its main usage is to point to a broken DIMM stick (the
Field Replaceable Unit), and not to point to a broken DRAM chip.

The drivers that support such newer memory architecture models
currently need to fake information and to abuse the EDAC structures, as
the subsystem was conceived with the idea that the csrow would always be
visible to the CPU.

To make things a little worse, those drivers don't currently fake
csrows/channels in a consistent way, as the concepts there don't apply
to the memory controllers they're talking with. So, each driver author
interpreted the concepts using a different logic.

To fix it, let's rename the data structure that points to a
DIMM rank to "rank_info", in order to be clearer about what's stored
there.

Later patches will provide a better way to represent the memory
hierarchy for the other types of memory controller.

Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>i5400_edac: Avoid calling pci_put_device() twice</title>
<updated>2012-03-21T18:22:49Z</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab@redhat.com</email>
</author>
<published>2012-02-12T11:21:34Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0142877aa4e54dd9943fb727e9b386c36c8e3ab7'/>
<id>urn:sha1:0142877aa4e54dd9943fb727e9b386c36c8e3ab7</id>
<content type='text'>
When the i5400_edac driver is removed and reloaded a few times, it
causes an OOPS, as it currently decrements the usage count of some PCI
devices twice.

When called inside a loop, pci_get_device() already drops the reference
on the device it was passed as the starting point. Dropping it again in
the driver mangles the usage count. In this specific case, it seems
easier to just duplicate the call.

Also fixes the error logic when pci_get_device() fails.

Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>edac: i5100 ack error detection register after each read</title>
<updated>2012-03-21T18:22:49Z</updated>
<author>
<name>Niklas Söderlund</name>
<email>niklas.soderlund@ericsson.com</email>
</author>
<published>2011-12-09T16:12:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=df95e42e1f20a561f2fe0a632d5b8fd6c26f1bb9'/>
<id>urn:sha1:df95e42e1f20a561f2fe0a632d5b8fd6c26f1bb9</id>
<content type='text'>
If I only ack the detection register after an error has been detected,
I'm unable to reliably detect errors. I have verified this behavior
using both an error injection DIMM and software to inject errors.

I can't find any documentation supporting this behavior in the Intel
5100 Memory Controller Hub Chipset datasheet, see [1]. So this is all
based on experimentation.

[1] Intel® 5100 Memory Controller Hub Chipset
    http://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf

Signed-off-by: Niklas Söderlund &lt;niklas.soderlund@ericsson.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>edac: i5100 fix erroneous define for M1Err</title>
<updated>2012-03-21T18:20:55Z</updated>
<author>
<name>Niklas Söderlund</name>
<email>niklas.soderlund@ericsson.com</email>
</author>
<published>2012-02-17T10:36:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b6378cb3e545912a19e6355aa9171326fdc004d8'/>
<id>urn:sha1:b6378cb3e545912a19e6355aa9171326fdc004d8</id>
<content type='text'>
According to [1] the define for M1Err in the FERR_NF_MEM register is
wrong. It should be at bit position 1, not 0.

[1] Intel 5100 Memory Controller Hub Chipset Doc.Nr: 318378
    http://www.intel.com/content/dam/doc/datasheet/5100-memory-controller-hub-chipset-datasheet.pdf

Reported-by: Ba Thang Nguyen &lt;thang.b.nguyen@dektech.com.au&gt;
Signed-off-by: Niklas Söderlund &lt;niklas.soderlund@ericsson.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
</feed>
