aboutsummaryrefslogtreecommitdiff
path: root/drivers/edac
AgeCommit message (Collapse)Author
2011-02-06amd64_edac: Fix interleaving checkBorislav Petkov
commit e726f3c368e7c1919a7166ec09c5705759f1a69d upstream. When matching error address to the range contained by one memory node, we're in valid range when node interleaving 1. is disabled, or 2. enabled and when the address bits we interleave on match the interleave selector on this node (see the "Node Interleaving" section in the BKDG for an enlightening example). Thus, when we early-exit, we need to reverse the compound logic statement properly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-02-06EDAC: Fix workqueue-related crashesBorislav Petkov
commit bb31b3122c0dd07d2d958da17a50ad771ce79e2b upstream. 00740c58541b6087d78418cebca1fcb86dc6077d changed edac_core to un-/register a workqueue item only if a lowlevel driver supplies a polling routine. Normally, when we remove a polling low-level driver, we go and cancel all the queued work. However, the workqueue unreg happens based on the ->op_state setting, and edac_mc_del_mc() sets this to OP_OFFLINE _before_ we cancel the work item, leading to NULL ptr oops on the workqueue list. Fix it by putting the unreg stuff in proper order. Reported-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com> LKML-Reference: <1291201307.3029.21.camel@Tobias-Karnat> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-28i7core_edac: fix panic in udimm sysfs attributes registrationMarcin Slusarz
commit 64aab720bdf8771214a7c88872bd8e3194c2d279 upstream. Array of udimm sysfs attributes was not ended with NULL marker, leading to dereference of random memory. EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2 BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4 IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name RIP: 0010:[<ffffffff81330b36>] [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 (...) Call Trace: [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1 [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2 [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557 [<ffffffff81428901>] i7core_probe+0xccf/0xec0 RIP [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 ---[ end trace 20de320855b81d78 ]--- Kernel panic - not syncing: Attempted to kill init! Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10amd64_edac: Fix operator precendence errorBorislav Petkov
commit 962b70a1eb22c467b95756a290c694e73da17f41 upstream. The bitwise AND is of higher precedence, make that explicit. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10amd64_edac: Correct scrub rate settingBorislav Petkov
commit bc57117856cf1e581135810b37d3b75f9d1749f5 upstream. Exit early when setting scrub rate on unknown/unsupported families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10amd64_edac: Fix DCT base address selectorBorislav Petkov
commit 9975a5f22a4fcc8d08035c65439900a983f891ad upstream. The correct check is to verify whether in high range we're below 4GB and not to extract the DctSelBaseAddr again. See "2.8.5 Routing DRAM Requests" in the F10h BKDG. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-07-27edac: mpc85xx: fix coldplug/hotplug module autoloadingAnton Vorontsov
The MPC85xx EDAC driver is missing module device aliases, so the driver won't load automatically on boot. This patch fixes the issue by adding proper MODULE_DEVICE_TABLE() macros. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-26quiesce EDAC initialisation on desktop/mobile i7Daniel J Blueman
Don't print failure to detect Core i7 EDAC facilities to the console at boot time, most often occurring on Core i7 desktops and laptops. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-20edac: mpc85xx: add support for MPC8569 EDAC controllersAnton Vorontsov
Simply add a proper ID into the device table. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-20edac: mpc85xx: fix MPC85xx dependencyAnton Vorontsov
Since commit 5753c082f66eca5be81f6bda85c1718c5eea6ada ("powerpc/85xx: Kconfig cleanup"), there is no MPC85xx Kconfig symbol anymore, so the driver became non-selectable. This patch fixes the issue by switching to PPC_85xx symbol. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-04Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: MAINTAINERS: Add an entry for i7core_edac i7core_edac: Avoid doing multiple probes for the same card i7core_edac: Properly discover the first QPI device
2010-07-02i7core_edac: Avoid doing multiple probes for the same cardMauro Carvalho Chehab
As Nehalem/Nehalem-EP/Westmere devices uses several devices for the same functionality (memory controller), the default way of proping devices doesn't work. So, instead of a per-device probe, all devices should be probed at once. This means that we should block any new attempt of probe, otherwise, it will try to register the same device several times. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-07-02i7core_edac: Properly discover the first QPI deviceMauro Carvalho Chehab
On Nehalem/Nehalem-EP/Westmere, the first QPI device is the last PCI bus. The last bus is generally at 0x3f or 0xff, but there are also other systems using different setups. For example, HP Z800 has 0x7f as the last bus. This patch adds a logic to discover the last bus, dynamically detecting it at runtime. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-07-02amd64_edac: Fix syndrome calculation on K8Borislav Petkov
When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: Jeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-06-04Merge branch 'linux_next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits) i7core_edac: Better describe the supported devices Add support for Westmere to i7core_edac driver i7core_edac: don't free on success i7core_edac: Add support for X5670 Always call i7core_[ur]dimm_check_mc_ecc_err i7core_edac: fix memory leak of i7core_dev EDAC: add __init to i7core_xeon_pci_fixup i7core_edac: Fix wrong device id for channel 1 devices i7core: add support for Lynnfield alternate address i7core_edac: Add initial support for Lynnfield i7core_edac: do not export static functions edac: fix i7core build edac: i7core_edac produces undefined behaviour on 32bit i7core_edac: Use a more generic approach for probing PCI devices i7core_edac: PCI device is called NONCORE, instead of NOCORE i7core_edac: Fix ringbuffer maxsize i7core_edac: First store, then increment i7core_edac: Better parse "any" addrmask i7core_edac: Use a lockless ringbuffer edac: Create an unique instance for each kobj ...
2010-06-02of/edac: fix build breakage in driversAnatolij Gustschin
Fixes build errors in EDAC drivers caused by the OF device_node pointer being moved into struct device Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-27drivers/edac: convert logging messages direct uses of __FILE__ to %s, __FILEJoe Perches
Reduces text by eliminating multiple __FILE__ uses. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Joe Perches <joe@perches.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Tim Small <tim@buttersideup.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-22Merge remote branch 'origin' into secretlab/next-devicetreeGrant Likely
Merging in current state of Linus' tree to deal with merge conflicts and build failures in vio.c after merge. Conflicts: drivers/i2c/busses/i2c-cpm.c drivers/i2c/busses/i2c-mpc.c drivers/net/gianfar.c Also fixed up one line in arch/powerpc/kernel/vio.c to use the correct node pointer. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22of: Remove duplicate fields from of_platform_driverGrant Likely
.name, .match_table and .owner are duplicated in both of_platform_driver and device_driver. This patch is a removes the extra copies from struct of_platform_driver and converts all users to the device_driver members. This patch is a pretty mechanical change. The usage model doesn't change and if any drivers have been missed, or if anything has been fixed up incorrectly, then it will fail with a compile time error, and the fixup will be trivial. This patch looks big and scary because it touches so many files, but it should be pretty safe. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-18i7core_edac: Better describe the supported devicesMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18Add support for Westmere to i7core_edac driverVernon Mauery
This adds new PCI IDs for the Westmere's memory controller devices and modifies the i7core_edac driver to be able to probe both Nehalem and Westmere processors. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in commentsRoman Fietze
This fixes all occurrences of pci_enable_device and pci_disable_device in all comments. There are no code changes involved. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-18i7core_edac: don't free on successTony Luck
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18i7core_edac: Add support for X5670Mauro Carvalho Chehab
As reported by Vernon Mauery <vernux@us.ibm.com>, X5670 (Westmere-EP) uses a different register for one of the uncore PCI devices. Add support for it. Those are the PCI ID's on this new chipset: fe:00.0 0600: 8086:2c70 (rev 02) fe:00.1 0600: 8086:2d81 (rev 02) fe:02.0 0600: 8086:2d90 (rev 02) fe:02.1 0600: 8086:2d91 (rev 02) fe:02.2 0600: 8086:2d92 (rev 02) fe:02.3 0600: 8086:2d93 (rev 02) fe:02.4 0600: 8086:2d94 (rev 02) fe:02.5 0600: 8086:2d95 (rev 02) fe:03.0 0600: 8086:2d98 (rev 02) fe:03.1 0600: 8086:2d99 (rev 02) fe:03.2 0600: 8086:2d9a (rev 02) fe:03.4 0600: 8086:2d9c (rev 02) fe:04.0 0600: 8086:2da0 (rev 02) fe:04.1 0600: 8086:2da1 (rev 02) fe:04.2 0600: 8086:2da2 (rev 02) fe:04.3 0600: 8086:2da3 (rev 02) fe:05.0 0600: 8086:2da8 (rev 02) fe:05.1 0600: 8086:2da9 (rev 02) fe:05.2 0600: 8086:2daa (rev 02) fe:05.3 0600: 8086:2dab (rev 02) fe:06.0 0600: 8086:2db0 (rev 02) fe:06.1 0600: 8086:2db1 (rev 02) fe:06.2 0600: 8086:2db2 (rev 02) fe:06.3 0600: 8086:2db3 (rev 02) (as usual, the same PCI devices repeat at ff: bus) The PCI device 8086:2c70 is shown as: fe:00.0 Host bridge: Intel Corporation QuickPath Architecture Generic Non-core Registers (rev 02) So, for this device to be recognized, it is only a matter of adding this new PCI ID to the driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18Always call i7core_[ur]dimm_check_mc_ecc_errVernon Mauery
This fixes an error in function i7core_check_error In commit ca9c90ba09ca3c9799319f46a56f397afbf617c2 which converts the driver to use double buffering, there is a change in the logic. Before, if mce_count was zero, it skipped over a couple of statements and finished out with a call to the *check_mc_ecc_err function. The current code checks to see if mce_count is 0 and then exits. This change reverts the behavior back to the original where if there are no errors to report, we skip to the end and call the *check_mc_ecc_err function. This fix allows the driver to work again on my Nehalem based blades again. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18i7core_edac: fix memory leak of i7core_devAlexander Beregalov
Free already allocated i7core_dev. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18EDAC: add __init to i7core_xeon_pci_fixupJiri Slaby
It's called only from an __init function and is the only user of pcibios_scan_specific_bus which will be marked as __devinit in the next patch. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix wrong device id for channel 1 devicesMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core: add support for Lynnfield alternate addressMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Add initial support for LynnfieldMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: fix i7core buildRandy Dunlap
Fix build warning (missing header file) and build error when CONFIG_SMP=n. drivers/edac/i7core_edac.c:860: error: implicit declaration of function 'msleep' drivers/edac/i7core_edac.c:1700: error: 'struct cpuinfo_x86' has no member named 'phys_proc_id' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: i7core_edac produces undefined behaviour on 32bitAlan Cox
Fix the shifts up Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Use a more generic approach for probing PCI devicesMauro Carvalho Chehab
Currently, only one PCI set of tables is allowed. This prevents using the driver for other devices like Lynnfield, with have a different set of PCI ID's. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: PCI device is called NONCORE, instead of NOCOREMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix ringbuffer maxsizeMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: First store, then incrementMauro Carvalho Chehab
Fix ringbuffer store logic. While here, add a few comments to the code and remove the undesired printk that could otherwise be called during NMI time. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Better parse "any" addrmaskMauro Carvalho Chehab
Instead of accepting just "any", accept also "any\n" Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Use a lockless ringbufferMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: Create an unique instance for each kobjMauro Carvalho Chehab
Current code only works when there's just one memory controller, since we need one kobj for each instance. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Convert UDIMM error counters into a proper sysfs groupMauro Carvalho Chehab
Instead of displaying 3 values at the same var, break it into 3 different sysfs nodes: /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2 For registered dimms, however, the error counters are already being displayed at: /sys/devices/system/edac/mc/mc0/csrow*/ce_count So, there's no need to add any extra sysfs nodes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: Don't create csrow entries on instance groupsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac: store/show methods for device groups weren't workingMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Add support for sysfs addrmatch groupMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10edac_core: Allow the creation of sysfs groupsMauro Carvalho Chehab
Currently, all sysfs nodes are stored at /sys/.*/mc. (regex) However, sometimes it is needed to create attribute groups. This patch extends edac_core to allow groups creation. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Avoid printing a warning when debug is disabledMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: We need to use list_for_each_entry_safe to avoid errorsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: change remove module strategyMauro Carvalho Chehab
The old remove module stragegy didn't work on devices with multiple cores, since only one PCI device is used to open all mc's, due to Nehalem nature. Also, it were based at pdev value. However, this doesn't point to the pci device used at mci->dev. So, instead, it unregisters all devices at once, deleting them from the device list. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: remove static counter for max socketsMauro Carvalho Chehab
The number of sockets is now fully dynamic. Get rid of this obsolete var. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: at remove, don't remove all pci devices at onceMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10i7core_edac: Fix a bug when printing error counts with RDIMMsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>