aboutsummaryrefslogtreecommitdiff
path: root/arch/sparc/mm/fault_64.c
diff options
context:
space:
mode:
authorbob picco <bpicco@meloft.net>2014-09-16 09:26:47 -0400
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2014-10-30 09:40:19 -0700
commitac1addf5ab3a937de68cd1460460dee3aa7271c7 (patch)
tree0a45afd7308a9b05d0c855690684cf754a53a18a /arch/sparc/mm/fault_64.c
parent7907ea428efbfb7cc645b1ae5723a1817348b358 (diff)
sparc64: sun4v TLB error power off events
[ Upstream commit 4ccb9272892c33ef1c19a783cfa87103b30c2784 ] We've witnessed a few TLB events causing the machine to power off because of prom_halt. In one case it was some nfs related area during rmmod. Another was an mmapper of /dev/mem. A more recent one is an ITLB issue with a bad pagesize which could be a hardware bug. Bugs happen but we should attempt to not power off the machine and/or hang it when possible. This is a DTLB error from an mmapper of /dev/mem: [root@sparcie ~]# SUN4V-DTLB: Error at TPC[fffff80100903e6c], tl 1 SUN4V-DTLB: TPC<0xfffff80100903e6c> SUN4V-DTLB: O7[fffff801081979d0] SUN4V-DTLB: O7<0xfffff801081979d0> SUN4V-DTLB: vaddr[fffff80100000000] ctx[1250] pte[98000000000f0610] error[2] . This is recent mainline for ITLB: [ 3708.179864] SUN4V-ITLB: TPC<0xfffffc010071cefc> [ 3708.188866] SUN4V-ITLB: O7[fffffc010071cee8] [ 3708.197377] SUN4V-ITLB: O7<0xfffffc010071cee8> [ 3708.206539] SUN4V-ITLB: vaddr[e0003] ctx[1a3c] pte[2900000dcc800eeb] error[4] . Normally sun4v_itlb_error_report() and sun4v_dtlb_error_report() would call prom_halt() and drop us to OF command prompt "ok". This isn't the case for LDOMs and the machine powers off. For the HV reported error of HV_ENORADDR for HV HV_MMU_MAP_ADDR_TRAP we cause a SIGBUS error by qualifying it within do_sparc64_fault() for fault code mask of FAULT_CODE_BAD_RA. This is done when trap level (%tl) is less or equal one("1"). Otherwise, for %tl > 1, we proceed eventually to die_if_kernel(). The logic of this patch was partially inspired by David Miller's feedback. Power off of large sparc64 machines is painful. Plus die_if_kernel provides more context. A reset sequence isn't a brief period on large sparc64 but better than power-off/power-on sequence. Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'arch/sparc/mm/fault_64.c')
-rw-r--r--arch/sparc/mm/fault_64.c3
1 files changed, 3 insertions, 0 deletions
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 587cd056512..18fcd716709 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -346,6 +346,9 @@ retry:
down_read(&mm->mmap_sem);
}
+ if (fault_code & FAULT_CODE_BAD_RA)
+ goto do_sigbus;
+
vma = find_vma(mm, address);
if (!vma)
goto bad_area;