aboutsummaryrefslogtreecommitdiff
path: root/arch/sh/kernel/dwarf.c
AgeCommit message (Collapse)Author
2010-05-25sh: handle early calls to return_address() when using dwarf unwinder.Paul Mundt
The dwarf unwinder ties in to an early initcall, but it's possible that return_address() calls will be made prior to that. This implements some additional error handling in to the dwarf unwinder as well as an exit path in the return_address() case to bail out if the unwinder hasn't come up yet. This fixes a NULL pointer deref in early boot when mempool_alloc() blows up on the not-yet-ready mempool via dwarf_unwind_stack(). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-05-22sh: fix up the dwarf unwinder build for MODULES=n.Paul Mundt
Presently the dwarf unwinder build blows up if modules are disabled, fix it up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-26Merge branch 'sh/stable-updates'Paul Mundt
Conflicts: arch/sh/kernel/dwarf.c drivers/dma/shdma.c Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-04-20sh: dwarf unwinder needs linux/module.h.Paul Mundt
Previously the struct module definition was pulled in from other headers, but we want the reference to be explicit. Fixes up randconfig build issues. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-23sh: Silence unintialized variable warnings in dwarf unwinder.Paul Mundt
The parent rb_node needs to be initialized to shut up the compiler, even though we're unlikely to ever hit this issue at run time. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-02-08sh: Optimise FDE/CIE lookup by using red-black treesMatt Fleming
Now that the DWARF unwinder is being used to provide perf callstacks unwinding speed is an issue. It is no longer being used in exceptional circumstances where we don't care about runtime performance, e.g. when panicing, so it makes sense improve performance is possible. With this patch I saw a 42% improvement in unwind time when calling return_address(1). Greater improvements will be seen as the number of levels unwound increases as each unwind is now cheaper. Note that insertion time has doubled but that's just the price we pay for keeping the trees balanced. However, this is a one-time cost for kernel boot/module load and so the improvements in lookup time dominate the extra time we spend keeping the trees balanced. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-02-08sh: Don't continue unwinding across interruptsMatt Fleming
Unfortunately, due to poor DWARF info in current toolchains, unwinding through interrutps cannot be done reliably. The problem is that the DWARF info for function epilogues is wrong. Take this standard epilogue sequence, 80003cc4: e3 6f mov r14,r15 80003cc6: 26 4f lds.l @r15+,pr 80003cc8: f6 6e mov.l @r15+,r14 <---- interrupt here 80003cca: f6 6b mov.l @r15+,r11 80003ccc: f6 6a mov.l @r15+,r10 80003cce: f6 69 mov.l @r15+,r9 80003cd0: 0b 00 rts If we take an interrupt at the highlighted point, the DWARF info will bogusly claim that the return address can be found at some offset from the frame pointer, even though the frame pointer was just restored. The worst part is if the unwinder finds a text address at the bogus stack address - unwinding will continue, for a bit, until it finally comes across an unexpected address on the stack and blows up. The only solution is to stop unwinding once we've calculated the function that was executing when the interrupt occurred. This PC can be easily calculated from pt_regs->pc. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-02-02sh: Fix access to released memory in dwarf_unwinder_cleanup()Marek Skuczynski
Signed-off-by: Marek Skuczynski <mareksk7@gmail.com> Acked-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-11-09Merge branch 'sh/stable-updates'Paul Mundt
2009-11-06sh: unwinder: Fix up invalid PC refetch in dwarf unwinder.Paul Mundt
The dwarf unwinder presently attempts to provide a sane PC value if none is provided, however the logic is broken and cases where a previous valid dwarf frame exists along with a bogus PC value can still proceed. This fixes up the test and prevents the unwinder from blowing up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-10-26Merge branch 'sh/stable-updates'Paul Mundt
Conflicts: arch/sh/kernel/dwarf.c
2009-10-26sh: Check for return_to_handler when unwinding the stackMatt Fleming
When CONFIG_FUNCTION_GRAPH_TRACER is enabled the function graph tracer may patch return addresses on the stack with the address of return_to_handler(). This really confuses the DWARF unwinder because it will try find the caller of return_to_handler(), not the caller of the real return address. So teach the DWARF unwinder how to find the real return address whenever it encounters return_to_handler(). This patch does not cope very well when multiple return addresses on the stack have been patched. To make it work properly it would require state to track how many return_to_handler()'s have been seen so that we'd know where to look in current->curr_ret_stack[]. So for now, instead of trying to handle this, just moan if more than one return address on the stack has been patched. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-10-19sh: Fix up uninitialized variable warning in dwarf unwinder.Paul Mundt
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-10-13sh: Tidy up the dwarf module helpers.Paul Mundt
This enables us to build the dwarf unwinder both with modules enabled and disabled in addition to reducing code size in the latter case. The helpers are also consolidated, and modified to resemble the BUG module helpers. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-10-12Merge branch 'sh/dwarf-unwinder'Paul Mundt
Conflicts: arch/sh/kernel/dwarf.c
2009-10-11sh: Remove any reference to recursive functions from commentsMatt Fleming
Originally, dwarf_unwind_stack() was a recursive function and it seems that some of the old comments were never updated. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-10-11sh: Fix memory leak in dwarf_unwind_stack()Matt Fleming
If we broke out of the while (1) loop because the return address of "frame" was zero, then "frame" needs to be free'd before we return. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-10-11sh: Teach the DWARF unwinder about modulesMatt Fleming
Pass a module's .eh_frame section to the DWARF unwinder at module load time so that the section's FDEs and CIEs can be registered with the DWARF unwinder. This allows us to unwind the stack through module code when generating backtraces. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-09-24sh: includecheck fix: dwarf.cJaswinder Singh Rajput
fix the following 'make includecheck' warning: arch/sh/kernel/dwarf.c: asm/dwarf.h is included more than once. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-31sh: unwinder: Fix up uninitialized variable warnings on sh2a build.Paul Mundt
A couple of these popped up on the sh2a build, causing build failures. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-22sh: unwinder: cacheline align slab cache objects.Paul Mundt
The CIE and FDE structs are big enough and accessed regularly enough in certain configurations to make cacheline alignment useful. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-21sh: Handle the DWARF op, DW_CFA_undefinedMatt Fleming
Allow a DWARF register to have an undefined value. When applied to the DWARF return address register this lets lets us label a function as having no direct caller, e.g. kernel_thread_helper(). Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-21sh: Fix bug calculating the end of the FDE instructionsMatt Fleming
The 'end' member of struct dwarf_fde denotes one byte past the end of the CFA instruction stream for an FDE. The value of 'end' was being calcualted incorrectly, it was being set too high. This resulted in dwarf_cfa_execute_insns() interpreting data past the end of valid instructions, thus causing all sorts of weird crashes. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-21sh: unwinder: Introduce UNWINDER_BUG() and UNWINDER_BUG_ON()Matt Fleming
We can't assume that if we execute the unwinder code and the unwinder was already running that it has faulted. Clearly two kernel threads can invoke the unwinder at the same time and may be running simultaneously. The previous approach used BUG() and BUG_ON() in the unwinder code to detect whether the unwinder was incapable of unwinding the stack, and that the next available unwinder should be used instead. A better approach is to explicitly invoke a trap handler to switch unwinders when the current unwinder cannot continue. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-21sh: unwinder: Set the flags for DW_CFA_val_offset ops as DWARF_VAL_OFFSETMatt Fleming
The handling of DW_CFA_val_offset ops was incorrectly using the DWARF_REG_OFFSET flag but the register's value cannot be calculated using the DWARF_REG_OFFSET method. Create a new flag to indicate that a different method must be used to calculate the register's value even though there is no implementation for DWARF_VAL_OFFSET yet; it's mainly just a place holder. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-21sh: unwinder: Fix memory leak and create our own kmem cacheMatt Fleming
Plug a memory leak in dwarf_unwinder_dump() where we didn't free the memory that we had previously allocated for the DWARF frames and DWARF registers. Now is also a opportune time to implement our own mempool and kmem cache. It's a good idea to have a certain number of frame and register objects in reserve at all times, so that we are guaranteed to have our allocation satisfied even when memory is scarce. Since we have pools to allocate from we can implement the registers for each frame as a linked list as opposed to a sparsely populated array. Whilst it's true that the lookup time for a linked list is larger than for arrays, there's only usually a maximum of 8 registers per frame. So the overhead isn't that much of a concern. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-17sh: unwinder: Move initialization to early_initcall() and tidy up locking.Paul Mundt
This moves the initialization over to an early_initcall(). This fixes up some lockdep interaction issues. At the same time, kill off some superfluous locking in the init path. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-16sh: Add support for DWARF GNU extensionsMatt Fleming
Also, remove the "fix" to DW_CFA_def_cfa_register where we reset the frame's cfa_offset to 0. This action is incorrect when handling DW_CFA_def_cfa_register as the DWARF spec specifically states that the previous contents of cfa_offset should be used with the new register. The reason that I thought cfa_offset should be reset to 0 was because it was being assigned a bogus value prior to executing the DW_CFA_def_cfa_register op. It turns out that the bogus cfa_offset value came from interpreting .cfi_escape pseudo-ops (those used by the GNU extensions) as CFA_DW_def_cfa ops. Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-16sh: Try again at getting the initial return address for an unwindMatt Fleming
The previous hack for calculating the return address for the first frame we unwind (dwarf_unwinder_dump) didn't always work. The problem was that it assumed once it read the rule for calculating the return address, there would be no new rules for calculating it. This isn't true because the way in which the CFA is calculated can change as you progress through a function and the return address is figured out using the CFA. Therefore, the way to calculate the return address can change. So, instead of using some offset from the beginning of dwarf_unwind_stack which is just a flakey approach, and instead of executing instructions from the FDE until the return address is setup, we now figure out the pc in dwarf_unwind_stack() just before we call dwarf_cfa_execute_insns(). Signed-off-by: Matt Fleming <matt@console-pimps.org>
2009-08-15sh: Set the cfa_offset to 0 if we see a DW_CFA_def_cfa_register opMatt Fleming
The way that the CFA is calculated can change as we progress through a function. If we see a DW_CFA_def_cfa_register op we need to reset the frame's cfa_offset value which may have been previously setup. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-14sh: unwinder: Convert frame allocations to GFP_ATOMIC.Paul Mundt
save_stack_trace_tsk() and friends can be called from atomic context (as triggered by latencytop), and subsequently hit two problematic allocation points that were using GFP_KERNEL (these were dwarf_unwind_stack() and dwarf_frame_alloc_regs()). Convert these over to GFP_ATOMIC and get latencytop working with the DWARF unwinder. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-14sh: Delete DWARF_ARCH_UNWIND_OFFSETMatt Fleming
Trying to figure out the best value for DWARF_ARCH_UNWIND_OFFSET is tricky at best. Various things can change the size (and offset from the beginning of the function) of the prologue. Notably, turning on ftrace adds calls to mcount at the beginning of functions, thereby pushing the prologue further into the function. So replace DWARF_ARCH_UNWIND_OFFSET with some code that continues to execute CFA instructions until the value of return address register is defined. This is safe to do because we know that the return address must have been pushed onto the frame before our first function call; we just can't figure out where at compile-time. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-14sh: unwinder: Restore put_unaligned() for an unaligned destination.Paul Mundt
The destination address might be unaligned, so set it with put_unaligned() for safety. This restores the previous behaviour, albeit through the proper API. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-14sh: unwinder: Fix up usage of unaligned accessors.Paul Mundt
This was using internal symbols for unaligned accesses, bypassing the exposed interface for variable sized safe accesses. This converts all of the __get_unaligned_cpuXX() users over to get_unaligned() directly, relying on the cast to select the proper internal routine. Additionally, the __put_unaligned_cpuXX() case is superfluous given that the destination address is aligned in all of the current cases, so just drop that outright. Furthermore, this switches to the asm/unaligned.h header instead of the asm-generic version, which was silently bypassing the SH-4A optimized unaligned ops. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-14sh: dwarf unwinder support.Matt Fleming
This is a first cut at a generic DWARF unwinder for the kernel. It's still lacking DWARF64 support and the DWARF expression support hasn't been tested very well but it is generating proper stacktraces on SH for WARN_ON() and NULL dereferences. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>