aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv
AgeCommit message (Collapse)Author
2014-07-28powerpc/powernv: Change BUG_ON to WARN_ON in elog codeVasant Hegde
We can continue to read the error log (up to MAX size) even if we get the elog size more than MAX size. Hence change BUG_ON to WARN_ON. Also updated error message. Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-25powerpc/powernv: Remove OPAL v1 takeoverMichael Ellerman
In commit 27f4488872d9 "Add OPAL takeover from PowerVM" we added support for "takeover" on OPAL v1 machines. This was a mode of operation where we would boot under pHyp, and query for the presence of OPAL. If detected we would then do a special sequence to take over the machine, and the kernel would end up running in hypervisor mode. OPAL v1 was never a supported product, and was never shipped outside IBM. As far as we know no one is still using it. Newer versions of OPAL do not use the takeover mechanism. Although the query for OPAL should be harmless on machines with newer OPAL, we have seen a machine where it causes a crash in Open Firmware. The code in early_init_devtree() to copy boot_command_line into cmd_line was added in commit 817c21ad9a1f "Get kernel command line accross OPAL takeover", and AFAIK is only used by takeover, so should also be removed. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Dump PE location codeGavin Shan
As Ben suggested, it's meaningful to dump PE's location code for site engineers when hitting EEH errors. The patch introduces function eeh_pe_loc_get() to retireve the location code from dev-tree so that we can output it when hitting EEH errors. If primary PE bus is root bus, the PHB's dev-node would be tried prior to root port's dev-node. Otherwise, the upstream bridge's dev-node of the primary PE bus will be check for the location code directly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Enable POWER8 doorbell IPIsMichael Neuling
This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix killed EEH eventGavin Shan
On PowerNV platform, EEH errors are reported by IO accessors or poller driven by interrupt. After the PE is isolated, we won't produce EEH event for the PE. The current implementation has possibility of EEH event lost in this way: The interrupt handler queues one "special" event, which drives the poller. EEH thread doesn't pick the special event yet. IO accessors kicks in, the frozen PE is marked as "isolated" and EEH event is queued to the list. EEH thread runs because of special event and purge all existing EEH events. However, we never produce an other EEH event for the frozen PE. Eventually, the PE is marked as "isolated" and we don't have EEH event to recover it. The patch fixes the issue to keep EEH events for PEs that have been marked as "isolated" with the help of additional "force" help to eeh_remove_event(). Reported-by: Rolf Brudeseth <rolfb@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Don't escalate non-existing frozen PEGavin Shan
Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE") escalates the frozen state on non-existing PE to fenced PHB. It was to improve kdump reliability. After that, commit 361f2a2a ("powrpc/powernv: Reset PHB in kdump kernel") was introduced to issue complete reset on all PHBs to increase the reliability of kdump kernel. Commit cb5b242c becomes unuseful and it would be reverted. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Report frozen parent PE prior to child PEGavin Shan
When we have the corner case of frozen parent and child PE at the same time, we have to handle the frozen parent PE prior to the child. Without clearning the frozen state on parent PE, the child PE can't be recovered successfully. The patch searches the EEH PE hierarchy tree and returns the toppest frozen PE to be handled. It ensures the frozen parent PE will be handled prior to child PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Reduce panic timeout from 180s to 10sAnton Blanchard
We've already dropped the default pseries timeout to 10s, do the same for powernv. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix reading of OPAL msglogJoel Stanley
memory_return_from_buffer returns a signed value, so ret should be ssize_t. Fixes the following issue reported by David Binderman: [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style) Checking if unsigned variable 'ret' is less than zero. [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style) Checking if unsigned variable 'ret' is less than zero. Local variable "ret" is of type size_t. This is always unsigned, so it is pointless to check if it is less than zero. https://bugzilla.kernel.org/show_bug.cgi?id=77551 Fixing this exposes a real bug for the case where the entire count bytes is successfully read from the POS_WRAP case. The second memory_read_from_buffer will return EINVAL, causing the entire read to return EINVAL to userspace, despite the data being copied correctly. The fix is to test for the case where the data has been read and return early. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix endianness problems in EEHGuo Chao
EEH information fetched from OPAL need fix before using in LE environment. To be included in sparse's endian check, declare them as __beXX and access them by accessors. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powernv: Fix permissions on sysparam sysfs entriesAnton Blanchard
Everyone can write to these files, which is not what we want. Cc: stable@vger.kernel.org # 3.15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv : Disable subcore for UP configsShreyas B. Prabhu
Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’: arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function) arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment 'setup_max_cpus' variable is relevant only on SMP, so there is no point working around it for UP. Furthermore, subcore itself is relevant only on SMP and hence the better solution is to exclude subcore.o and subcore-asm.o for UP builds. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Include asm/smp.h to fix UP build failureShreyas B. Prabhu
Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/setup.c: In function ‘pnv_kexec_wait_secondaries_down’: arch/powerpc/platforms/powernv/setup.c:179:4: error: implicit declaration of function ‘get_hard_smp_processor_id’ rc = opal_query_cpu_status(get_hard_smp_processor_id(i), The usage of get_hard_smp_processor_id() needs the declaration from <asm/smp.h>. The file setup.c includes <linux/sched.h>, which in-turn includes <linux/smp.h>. However, <linux/smp.h> includes <asm/smp.h> only on SMP configs and hence UP builds fail. Fix this by directly including <asm/smp.h> in setup.c unconditionally. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-10Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "Here is the bulk of the powerpc changes for this merge window. It got a bit delayed in part because I wasn't paying attention, and in part because I discovered I had a core PCI change without a PCI maintainer ack in it. Bjorn eventually agreed it was ok to merge it though we'll probably improve it later and I didn't want to rebase to add his ack. There is going to be a bit more next week, essentially fixes that I still want to sort through and test. The biggest item this time is the support to build the ppc64 LE kernel with our new v2 ABI. We previously supported v2 userspace but the kernel itself was a tougher nut to crack. This is now sorted mostly thanks to Anton and Rusty. We also have a fairly big series from Cedric that add support for 64-bit LE zImage boot wrapper. This was made harder by the fact that traditionally our zImage wrapper was always 32-bit, but our new LE toolchains don't really support 32-bit anymore (it's somewhat there but not really "supported") so we didn't want to rely on it. This meant more churn that just endian fixes. This brings some more LE bits as well, such as the ability to run in LE mode without a hypervisor (ie. under OPAL firmware) by doing the right OPAL call to reinitialize the CPU to take HV interrupts in the right mode and the usual pile of endian fixes. There's another series from Gavin adding EEH improvements (one day we *will* have a release with less than 20 EEH patches, I promise!). Another highlight is the support for the "Split core" functionality on P8 by Michael. This allows a P8 core to be split into "sub cores" of 4 threads which allows the subcores to run different guests under KVM (the HW still doesn't support a partition per thread). And then the usual misc bits and fixes ..." [ Further delayed by gmail deciding that BenH is a dirty spammer. Google knows. ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits) powerpc/powernv: Add missing include to LPC code selftests/powerpc: Test the THP bug we fixed in the previous commit powerpc/mm: Check paca psize is up to date for huge mappings powerpc/powernv: Pass buffer size to OPAL validate flash call powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC() powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC() powerpc/powernv: Set memory_block_size_bytes to 256MB powerpc: Allow ppc_md platform hook to override memory_block_size_bytes powerpc/powernv: Fix endian issues in memory error handling code powerpc/eeh: Skip eeh sysfs when eeh is disabled powerpc: 64bit sendfile is capped at 2GB powerpc/powernv: Provide debugfs access to the LPC bus via OPAL powerpc/serial: Use saner flags when creating legacy ports powerpc: Add cpu family documentation powerpc/xmon: Fix up xmon format strings powerpc/powernv: Add calls to support little endian host powerpc: Document sysfs DSCR interface powerpc: Fix regression of per-CPU DSCR setting powerpc: Split __SYSFS_SPRSETUP macro arch: powerpc/fadump: Cleaning up inconsistent NULL checks ...
2014-06-07powerpc/powernv: Add missing include to LPC codeBenjamin Herrenschmidt
kbuild bot spotted that one: arch/powerpc/platforms/powernv/opal-lpc.c: In function 'opal_lpc_init_debugfs': >> arch/powerpc/platforms/powernv/opal-lpc.c:319:35: error: 'powerpc_debugfs_root' undeclared (first use in this function) root = debugfs_create_dir("lpc", powerpc_debugfs_root); ^ We neet to include the definition explicitely. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Pass buffer size to OPAL validate flash callVasant Hegde
We pass actual buffer size to opal_validate_flash() OPAL API call and in return it contains output buffer size. Commit cc146d1d (Fix little endian issues) missed to set the size param before making OPAL call. So firmware image validation fails. This patch sets size variable before making OPAL call. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Tested-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Set memory_block_size_bytes to 256MBAnton Blanchard
powerpc sets a low SECTION_SIZE_BITS to accomodate small pseries boxes. We default to 16MB memory blocks, and boxes with a lot of memory end up with enormous numbers of sysfs memory nodes. Set a more reasonable default for powernv of 256MB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Fix endian issues in memory error handling codeAnton Blanchard
struct OpalMemoryErrorData is passed to us from firmware, so we have to byteswap it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Provide debugfs access to the LPC bus via OPALBenjamin Herrenschmidt
This provides debugfs files to access the LPC bus on Power8 non-virtualized using the appropriate OPAL firmware calls. The usage is simple: one file per space (IO, MEM and FW), lseek to the address and read/write the data. IO and MEM always generate series of byte accesses. FW can generate word and dword accesses if aligned properly. Based on an original patch from Rob Lippert and reworked. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-05powerpc/powernv: Add calls to support little endian hostBenjamin Herrenschmidt
When running as a powernv "host" system on P8, we need to switch the endianness of interrupt handlers. This does it via the appropriate call to the OPAL firmware which may result in just switching HID0:HILE but depending on the processor version might need to do a few more things. This call must be done early before any other processor has been brought out of firmware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-04Merge tag 'devicetree-for-3.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into next Pull DeviceTree updates from Rob Herring: - Another round of clean-up of FDT related code in architecture code. This removes knowledge of internal FDT details from most architectures except powerpc. - Conversion of kernel's custom FDT parsing code to use libfdt. - DT based initialization for generic serial earlycon. The introduction of generic serial earlycon support went in through the tty tree. - Improve the platform device naming for DT probed devices to ensure unique naming and use parent names instead of a global index. - Fix a race condition in of_update_property. - Unify the various linker section OF match tables and fix several function prototype errors. - Update platform_get_irq_byname to work in deferred probe cases. - 2 binding doc updates * tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits) of: handle NULL node in next_child iterators of/irq: provide more wrappers for !CONFIG_OF devicetree: bindings: Document micrel vendor prefix dt: bindings: dwc2: fix required value for the phy-names property of_pci_irq: kill useless variable in of_irq_parse_pci() of/irq: do irq resolution in platform_get_irq_byname() of: Add a testcase for of_find_node_by_path() of: Make of_find_node_by_path() handle /aliases of: Create unlocked version of for_each_child_of_node() lib: add glibc style strchrnul() variant of: Handle memory@0 node on PPC32 only pci/of: Remove dead code of: fix race between search and remove in of_update_property() of: Use NULL for pointers of: Stop naming platform_device using dcr address of: Ensure unique names without sacrificing determinism tty/serial: pl011: add DT based earlycon support of/fdt: add FDT serial scanning for earlycon of/fdt: add FDT address translation support serial: earlycon: add DT support ...
2014-05-28powerpc/powernv: Add support for POWER8 split core on powernvMichael Ellerman
Upcoming POWER8 chips support a concept called split core. This is where the core can be split into subcores that although not full cores, are able to appear as full cores to a guest. The splitting & unsplitting procedure is mildly complicated, and explained at length in the comments within the patch. One notable detail is that when splitting or unsplitting we need to pull offline cpus out of their offline state to do work as part of the procedure. The interface for changing the split mode is via a sysfs file, eg: $ echo 2 > /sys/devices/system/cpu/subcores_per_core Currently supported values are '1', '2' and '4'. And indicate respectively that the core should be unsplit, split in half, and split in quarters. These modes correspond to threads_per_subcore of 8, 4 and 2. We do not allow changing the split mode while KVM VMs are active. This is to prevent the value changing while userspace is configuring the VM, and also to prevent the mode being changed in such a way that existing guests are unable to be run. CPU hotplug fixes by Srivatsa. max_cpus fixes by Mahesh. cpuset fixes by benh. Fix for irq race by paulus. The rest by mikey and mpe. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()Michael Ellerman
To support split core we need to be able to force all secondaries into nap, so the core can detect they are idle and do an unsplit. Currently power7_nap() will return without napping if there is an irq pending. We want to ignore the pending irq and nap anyway, we will deal with the interrupt later. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-20Revert "powerpc/powernv: Fundamental reset on PLX ports"Benjamin Herrenschmidt
This reverts commit b2b5efcf208ddc9444aca77336627428782a39f4. This code was way too board specific, there are quirks as to how the PERST line is wired on different boards, we'll have to revisit this using/creating appropriate firmware interfaces. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-13Merge branch 'dt-bus-name' into for-nextRob Herring
2014-05-12powerpc/powernv: Reset root port in firmwareGavin Shan
Resetting root port has more stuff to do than that for PCIe switch ports and we should have resetting root port done in firmware instead of the kernel itself. The problem was introduced by commit 5b2e198e ("powerpc/powernv: Rework EEH reset"). Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-05Merge remote-tracking branch 'anton/abiv2' into nextBenjamin Herrenschmidt
This series adds support for building the powerpc 64-bit LE kernel using the new ABI v2. We already supported running ABI v2 userspace programs but this adds support for building the kernel itself using the new ABI.
2014-04-30of/fdt: update of_get_flat_dt_prop in prep for libfdtRob Herring
Make of_get_flat_dt_prop arguments compatible with libfdt fdt_getprop call in preparation to convert FDT code to use libfdt. Make the return value const and the property length ptr type an int. Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Michal Simek <michal.simek@xilinx.com> Tested-by: Grant Likely <grant.likely@linaro.org> Tested-by: Stephen Chivers <schivers@csc.com>
2014-04-28powerpc: powernv: Implement ppc_md.get_proc_freq()Gautham R. Shenoy
Implement a method named pnv_get_proc_freq(unsigned int cpu) which returns the current clock rate on the 'cpu' in Hz to be reported in /proc/cpuinfo. This method uses the value reported by cpufreq when such a value is sane. Otherwise it falls back to old way of reporting the clockrate, i.e. ppc_proc_freq. Set the ppc_md.get_proc_freq() hook to pnv_get_proc_freq() on the PowerNV platform. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Return secondary CPUs to firmware before FW updateVasant Hegde
Firmware update on PowerNV platform takes several minutes. During this time one CPU is stuck in FW and the kernel complains about "soft lockups". This patch returns all secondary CPUs to firmware before starting firmware update process. [ Reworked a bit and cleaned up -- BenH ] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Don't use pe->pbus to get the domain numberGavin Shan
If the PE contains single PCI function, "pe->pbus" would be NULL. It's not reliable to be used by pci_domain_nr(). We just grab the PCI domain number from the PCI host controller (struct pci_controller) instance. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Missed IOMMU table typeGavin Shan
In function pnv_pci_ioda2_setup_dma_pe(), the IOMMU table type is set to (TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE) unconditionally. It was just set to TCE_PCI by pnv_pci_setup_iommu_table(). So the primary IOMMU table type (TCE_PCI) is lost. The patch fixes it. Also, pnv_pci_setup_iommu_table() already set "tbl->it_busno" to zero and we needn't do it again. The patch removes the redundant assignment. The patch also fixes similar issues in pnv_pci_ioda_setup_dma_pe(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Fundamental reset on PLX portsGavin Shan
The patch intends to support fundamental reset on PLX downstream ports. If the PCI device matches any one of the internal table, which includes PLX vendor ID, bridge device ID, register offset for fundamental reset and bit, fundamental reset will be done accordingly. Otherwise, it will fail back to hot reset. Additional flag (EEH_DEV_FRESET) is introduced to record the last reset type on the PCI bridge. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powrpc/powernv: Reset PHB in kdump kernelGavin Shan
In the kdump scenario, the first kerenl doesn't shutdown PCI devices and the kdump kerenl clean PHB IODA table at the early probe time. That means the kdump kerenl can't support PCI transactions piled by the first kerenl. Otherwise, lots of EEH errors and frozen PEs will be detected. In order to avoid the EEH errors, the PHB is resetted to drop all PCI transaction from the first kerenl. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/pci: Mask linkDown on resetting PCI busGavin Shan
The problem was initially reported by Wendy who tried pass through IPR adapter, which was connected to PHB root port directly, to KVM based guest. When doing that, pci_reset_bridge_secondary_bus() was called by VFIO driver and linkDown was detected by the root port. That caused all PEs to be frozen. The patch fixes the issue by routing the reset for the secondary bus of root port to underly firmware. For that, one more weak function pci_reset_secondary_bus() is introduced so that the individual platforms can override that and do specific reset for bridge's secondary bus. Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Make the delay for PE reset unifiedGavin Shan
Basically, we have 3 types of resets to fulfil PE reset: fundamental, hot and PHB reset. For the later 2 cases, we need PCI bus reset hold and settlement delay as specified by PCI spec. PowerNV and pSeries platforms are running on top of different firmware and some of the delays have been covered by underly firmware (PowerNV). The patch makes the delays unified to be done in backend, instead of EEH core. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Reset root port in firmwareGavin Shan
Resetting root port has more stuff to do than that for PCIe switch ports and we should have resetting root port done in firmware instead of the kernel itself. The problem was introduced by commit 5b2e198e ("powerpc/powernv: Rework EEH reset"). Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Fix endless reporting frozen PEGavin Shan
Once one specific PE has been marked as EEH_PE_ISOLATED, it's in the middile of recovery or removed permenently. We needn't report the frozen PE again. Otherwise, we will have endless reporting same frozen PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: No hotplug on permanently removed devGavin Shan
The issue was detected in a bit complicated test case where we have multiple hierarchical PEs shown as following figure: +-----------------+ | PE#3 p2p#0 | | p2p#1 | +-----------------+ | +-----------------+ | PE#4 pdev#0 | | pdev#1 | +-----------------+ PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p bridges. We accidentally had less-known scenario: PE#4 was removed permanently from the system because of permanent failure (e.g. exceeding the max allowd failure times in last hour), then we detects EEH errors on PE#3 and tried to recover it. However, eeh_dev instances for pdev#0/1 were not detached from PE#4, which was still connected to PE#3. All of that was because of the fact that we rely on count-based pcibios_release_device(), which isn't reliable enough. When doing recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which are not valid any more. Eventually, we run into kernel crash. The patch fixes above issue from two aspects. For unplug, we simply skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED && !EEH_PE_STATE_RECOVERING) and its frozen count should be greater than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read its PCI config so that PCI core will omit them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Allow to disable EEHGavin Shan
The patch introduces bootarg "eeh=off" to disable EEH functinality. Also, it creates /sys/kerenl/debug/powerpc/eeh_enable to disable or enable EEH functionality. By default, we have the functionality enabled. For PowerNV platform, we will restore to have the conventional mechanism of clearing frozen PE during PCI config access if we're going to disable EEH functionality. Conversely, we will rely on EEH for error recovery. The patch also fixes the issue that we missed to cover the case of disabled EEH functionality in function ioda_eeh_event(). Those events driven by interrupt should be cleared to avoid endless reporting. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Use cached capability for log dumpGavin Shan
When calling into eeh_gather_pci_data() on pSeries platform, we possiblly don't have pci_dev instance yet, but eeh_dev is always ready. So we use cached capability from eeh_dev instead of pci_dev for log dump there. In order to keep things unified, we also cache PCI capability positions to eeh_dev for PowerNV as well. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Avoid I/O access during PE resetGavin Shan
We have suffered recrusive frozen PE a lot, which was caused by IO accesses during the PE reset. Ben came up with the good idea to keep frozen PE until recovery (BAR restore) gets done. With that, IO accesses during PE reset are dropped by hardware and wouldn't incur the recrusive frozen PE any more. The patch implements the idea. We don't clear the frozen state until PE reset is done completely. During the period, the EEH core expects unfrozen state from backend to keep going. So we have to reuse EEH_PE_RESET flag, which has been set during PE reset, to return normal state from backend. The side effect is we have to clear frozen state for towice (PE reset and clear it explicitly), but that's harmless. We have some limitations on pHyp. pHyp doesn't allow to enable IO or DMA for unfrozen PE. So we don't enable them on unfrozen PE in eeh_pci_enable(). We have to enable IO before grabbing logs on pHyp. Otherwise, 0xFF's is always returned from PCI config space. Also, we had wrong return value from eeh_pci_enable() for EEH_OPT_THAW_DMA case. The patch fixes it too. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Use EEH PCI config accessorsGavin Shan
For EEH PowerNV backends, they need use their own PCI config accesors as the normal one could be blocked during PE reset. The patch also removes necessary parameter "hose" for the function ioda_eeh_bridge_reset(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/eeh: Block PCI-CFG access during PE resetGavin Shan
We've observed multiple PE reset failures because of PCI-CFG access during that period. Potentially, some device drivers can't support EEH very well and they can't put the device to motionless state before PE reset. So those device drivers might produce PCI-CFG accesses during PE reset. Also, we could have PCI-CFG access from user space (e.g. "lspci"). Since access to frozen PE should return 0xFF's, we can block PCI-CFG access during the period of PE reset so that we won't get recrusive EEH errors. The patch adds flag EEH_PE_RESET, which is kept during PE reset. The PowerNV/pSeries PCI-CFG accessors reuse the flag to block PCI-CFG accordingly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Remove fields in PHB diag-data dumpGavin Shan
For some fields (e.g. LEM, MMIO, DMA) in PHB diag-data dump, it's meaningless to print them if they have non-zero value in the corresponding mask registers because we always have non-zero values in the mask registers. The patch only prints those fieds if we have non-zero values in the primary registers (e.g. LEM, MMIO, DMA status) so that we can save couple of lines. The patch also removes unnecessary spare line before "brdgCtl:" and two leading spaces as prefix in each line as Ben suggested. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Move PNV_EEH_STATE_ENABLED aroundGavin Shan
The flag PNV_EEH_STATE_ENABLED is put into pnv_phb::eeh_state, which is protected by CONFIG_EEH. We needn't that. Instead, we can have pnv_phb::flags and maintain all flags there, which is the purpose of the patch. The patch also renames PNV_EEH_STATE_ENABLED to PNV_PHB_FLAG_EEH. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Remove PNV_EEH_STATE_REMOVEDGavin Shan
The PHB state PNV_EEH_STATE_REMOVED maintained in pnv_phb isn't so useful any more and it's duplicated to EEH_PE_ISOLATED. The patch replaces PNV_EEH_STATE_REMOVED with EEH_PE_ISOLATED. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28ppc/powernv: Set the runlatch bits correctly for offline cpusPreeti U Murthy
Up until now we have been setting the runlatch bits for a busy CPU and clearing it when a CPU enters idle state. The runlatch bit has thus been consistent with the utilization of a CPU as long as the CPU is online. However when a CPU is hotplugged out the runlatch bit is not cleared. It needs to be cleared to indicate an unused CPU. Hence this patch has the runlatch bit cleared for an offline CPU just before entering an idle state and sets it immediately after it exits the idle state. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Fix little endian issues in OPAL dump codeAnton Blanchard
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/powernv: Create OPAL sglist helper functions and fix endian issuesAnton Blanchard
We have two copies of code that creates an OPAL sg list. Consolidate these into a common set of helpers and fix the endian issues. The flash interface embedded a version number in the num_entries field, whereas the dump interface did did not. Since versioning wasn't added to the flash interface and it is impossible to add this in a backwards compatible way, just remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>