aboutsummaryrefslogtreecommitdiff
path: root/mm/slob.c
AgeCommit message (Collapse)Author
2009-07-30fix RCU-callback-after-kmem_cache_destroy problem in sl[aou]bPaul E. McKenney
commit 7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5 upstream. Jesper noted that kmem_cache_destroy() invokes synchronize_rcu() rather than rcu_barrier() in the SLAB_DESTROY_BY_RCU case, which could result in RCU callbacks accessing a kmem_cache after it had been destroyed. Acked-by: Matt Mackall <mpm@selenic.com> Reported-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-06mm: SLOB fix reclaim_stateNick Piggin
SLOB does not correctly account reclaim_state.reclaimed_slab, so it will break memory reclaim. Account it like SLAB does. Cc: stable@kernel.org Cc: linux-mm@kvack.org Acked-by: Matt Mackall <mpm@selenic.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-04-03kmemtrace: trace kfree() calls with NULL or zero-length objectsPekka Enberg
Impact: also output kfree(NULL) entries This patch moves the trace_kfree() calls before the ZERO_OR_NULL_PTR check so that we can trace call-sites that call kfree() with NULL many times which might be an indication of a bug. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <1237971957.30175.18.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03kmemtrace: use tracepointsEduard - Gabriel Munteanu
kmemtrace now uses tracepoints instead of markers. We no longer need to use format specifiers to pass arguments. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> [ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ] [ folded: fix build when CONFIG_KMEMTRACE is disabled. ] [ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02Merge branch 'tracing/core-v2' into tracing-for-linusIngo Molnar
Conflicts: include/linux/slub_def.h lib/Kconfig.debug mm/slob.c mm/slub.c
2009-03-30Merge branch 'locking-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits) lockdep: fix deadlock in lockdep_trace_alloc lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB lockdep: annotate reclaim context (__GFP_NOFS), fix lockdep: build fix for !PROVE_LOCKING lockstat: warn about disabled lock debugging lockdep: use stringify.h lockdep: simplify check_prev_add_irq() lockdep: get_user_chars() redo lockdep: simplify get_user_chars() lockdep: add comments to mark_lock_irq() lockdep: remove macro usage from mark_held_locks() lockdep: fully reduce mark_lock_irq() lockdep: merge the !_READ mark_lock_irq() helpers lockdep: merge the _READ mark_lock_irq() helpers lockdep: simplify mark_lock_irq() helpers #3 lockdep: further simplify mark_lock_irq() helpers lockdep: simplify the mark_lock_irq() helpers lockdep: split up mark_lock_irq() lockdep: generate usage strings lockdep: generate the state bit definitions ...
2009-03-30lockdep: annotate reclaim context (__GFP_NOFS), fix SLOBIngo Molnar
Impact: build fix fix typo in mm/slob.c: mm/slob.c:469: error: ‘flags’ undeclared (first use in this function) mm/slob.c:469: error: (Each undeclared identifier is reported only once mm/slob.c:469: error: for each function it appears in.) Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090128135457.350751756@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24Merge branches 'topic/slob/cleanups', 'topic/slob/fixes', 'topic/slub/core', ↵Pekka Enberg
'topic/slub/cleanups' and 'topic/slub/perf' into for-linus
2009-03-23slob: fix lockup in slob_free()Nick Piggin
Don't hold SLOB lock when freeing the page. Reduces lock hold width. See the following thread for discussion of the bug: http://marc.info/?l=linux-kernel&m=123709983214143&w=2 Reported-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-03-04Merge branch 'core/locking' into tracing/ftraceIngo Molnar
2009-02-14lockdep: annotate reclaim context (__GFP_NOFS)Nick Piggin
Here is another version, with the incremental patch rolled up, and added reclaim context annotation to kswapd, and allocation tracing to slab allocators (which may only ever reach the page allocator in rare cases, so it is good to put annotations here too). Haven't tested this version as such, but it should be getting closer to merge worthy ;) -- After noticing some code in mm/filemap.c accidentally perform a __GFP_FS allocation when it should not have been, I thought it might be a good idea to try to catch this kind of thing with lockdep. I coded up a little idea that seems to work. Unfortunately the system has to actually be in __GFP_FS page reclaim, then take the lock, before it will mark it. But at least that might still be some orders of magnitude more common (and more debuggable) than an actual deadlock condition, so we have some improvement I hope (the concept is no less complete than discovery of a lock's interrupt contexts). I guess we could even do the same thing with __GFP_IO (normal reclaim), and even GFP_NOIO locks too... but filesystems will have the most locks and fiddly code paths, so let's start there and see how it goes. It *seems* to work. I did a quick test. ================================= [ INFO: inconsistent lock state ] 2.6.28-rc6-00007-ged31348-dirty #26 --------------------------------- inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage. modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes: (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd] {in-reclaim-W} state was registered at: [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60 [<ffffffff80268f71>] lock_acquire+0x91/0xc0 [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310 [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd] [<ffffffff8020903b>] _stext+0x3b/0x170 [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 3929 hardirqs last enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310 hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310 softirqs last enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0 softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0 other info that might help us debug this: 1 lock held by modprobe/8526: #0: (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd] stack backtrace: Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged31348-dirty #26 Call Trace: [<ffffffff80265483>] print_usage_bug+0x193/0x1d0 [<ffffffff80266530>] mark_lock+0xaf0/0xca0 [<ffffffff80266735>] mark_held_locks+0x55/0xc0 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60 [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580 [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd] [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffff8020903b>] _stext+0x3b/0x170 [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10 [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180 [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190 [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', ↵Ingo Molnar
'tracing/urgent' and 'linus' into tracing/core
2009-02-12mm: Export symbol ksize()Kirill A. Shutemov
Commit 7b2cd92adc5430b0c1adeb120971852b4ea1ab08 ("crypto: api - Fix zeroing on free") added modular user of ksize(). Export that to fix crypto.ko compilation. Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-01-19slob: clean up the codeAmérico Wang
- Use NULL instead of plain 0; - Rename slob_page() to is_slob_page(); - Define slob_page() to convert void* to struct slob_page*; - Rename slob_new_page() to slob_new_pages(); - Define slob_free_pages() accordingly. Compile tests only. Signed-off-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-30tracing/kmemtrace: normalize the raw tracer event to the unified tracing APIFrederic Weisbecker
Impact: new tracer plugin This patch adapts kmemtrace raw events tracing to the unified tracing API. To enable and use this tracer, just do the following: echo kmemtrace > /debugfs/tracing/current_tracer cat /debugfs/tracing/trace You will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | type_id 1 call_site 18446744071565527833 ptr 18446612134395152256 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 That was to stay backward compatible with the format output produced in inux/tracepoint.h. This is the default ouput, but note that I tried something else. If you change an option: echo kmem_minimalistic > /debugfs/trace_options and then cat /debugfs/trace, you will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | - C 0xffff88007c088780 file_free_rcu + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp Yeah I shall confess kmem_minimalistic should be: kmem_alternative. Whatever, I find it more readable but this a personal opinion of course. We can drop it if you want. On the ALLOC/FREE column, + means an allocation and - a free. On the type column, you have K = kmalloc, C = cache, P = page I would like the flags to be GFP_* strings but that would not be easy to not break the column with strings.... About the node...it seems to always be -1. I don't know why but that shouldn't be difficult to find. I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would be more easy to find the tracer headers if they are all in their common directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29Merge branch 'topic/kmemtrace' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 into tracing/kmemtrace
2008-12-29kmemtrace: SLOB hooks.Eduard - Gabriel Munteanu
This adds hooks for the SLOB allocator, to allow tracing with kmemtrace. We also convert some inline functions to __always_inline to make sure _RET_IP_, which expands to __builtin_return_address(0), always works as expected. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-15slob: do not pass the SLAB flags as GFP in kmem_cache_create()Catalin Marinas
The kmem_cache_create() function in the slob allocator passes the SLAB flags as GFP flags to the slob_alloc() function. The patch changes this call to pass GFP_KERNEL as the other allocators seem to do. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-09SLOB: fix bogus ksize calculation fixMatt Mackall
This fixes the previous fix, which was completely wrong on closer inspection. This version has been manually tested with a user-space test harness and generates sane values. A nearly identical patch has been boot-tested. The problem arose from changing how kmalloc/kfree handled alignment padding without updating ksize to match. This brings it in sync. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-07SLOB: fix bogus ksize calculationMatt Mackall
SLOB's ksize calculation was braindamaged and generally harmlessly underreported the allocation size. But for very small buffers, it could in fact overreport them, leading code depending on krealloc to overrun the allocation and trample other data. Signed-off-by: Matt Mackall <mpm@selenic.com> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-29mm: unexport ksizeAdrian Bunk
This patch removes the obsolete and no longer used exports of ksize. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-07-26SL*B: drop kmem cache argument from constructorAlexey Dobriyan
Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24slob: record page flag overlays explicitlyAndy Whitcroft
SLOB reuses two page bits for internal purposes, it overlays PG_active and PG_private. This is hidden away in slob.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-19slob: Fix to return wrong pointerMinChan Kim
Although slob_alloc return NULL, __kmalloc_node returns NULL + align. Because align always can be changed, it is very hard for debugging problem of no page if it don't return NULL. We have to return NULL in case of no page. [penberg@cs.helsinki.fi: fix formatting as suggested by Matt.] Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: MinChan Kim <minchan.kim@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-04-27slob: fix bug - when slob allocates "struct kmem_cache", it does not force ↵Yi Li
alignment. This may trigger misaligned memory access exception. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Yi Li <yi.li@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-02-05slob: reduce external fragmentation by using three free listsMatt Mackall
By putting smaller objects on their own list, we greatly reduce overall external fragmentation and increase repeatability. This reduces total SLOB overhead from > 50% to ~6% on a simple boot test. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05slob: fix free block merging at head of subpageMatt Mackall
We weren't merging freed blocks at the beginning of the free list. Fixing this showed a 2.5% efficiency improvement in a userspace test harness. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-09Avoid double memclear() in SLOB/SLUBLinus Torvalds
Both slob and slub react to __GFP_ZERO by clearing the allocation, which means that passing the GFP_ZERO bit down to the page allocator is just wasteful and pointless. Acked-by: Matt Mackall <mpm@selenic.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-05Add EXPORT_SYMBOL(ksize);Tetsuo Handa
mm/slub.c exports ksize(), but mm/slob.c and mm/slab.c don't. It's used by binfmt_flat, which can be built as a module. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Christoph Lameter <clameter@sgi.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-15slob: fix memory corruptionNick Piggin
Previously, it would be possible for prev->next to point to &free_slob_pages, and thus we would try to move a list onto itself, and bad things would happen. It seems a bit hairy to be doing list operations with the list marker as an entry, rather than a head, but... this resolves the following crash: http://bugzilla.kernel.org/show_bug.cgi?id=9379 Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter
Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16Slab allocators: fail if ksize is called with a NULL parameterChristoph Lameter
A NULL pointer means that the object was not allocated. One cannot determine the size of an object that has not been allocated. Currently we return 0 but we really should BUG() on attempts to determine the size of something nonexistent. krealloc() interprets NULL to mean a zero sized object. Handle that separately in krealloc(). Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16{slub, slob}: use unlikely() for kfree(ZERO_OR_NULL_PTR) checkSatyam Sharma
Considering kfree(NULL) would normally occur only in error paths and kfree(ZERO_SIZE_PTR) is uncommon as well, so let's use unlikely() for the condition check in SLUB's and SLOB's kfree() to optimize for the common case. SLAB has this already. Signed-off-by: Satyam Sharma <satyam@infradead.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21slob: reduce list scanningMatt Mackall
The version of SLOB in -mm always scans its free list from the beginning, which results in small allocations and free segments clustering at the beginning of the list over time. This causes the average search to scan over a large stretch at the beginning on each allocation. By starting each page search where the last one left off, we evenly distribute the allocations and greatly shorten the average search. Without this patch, kernel compiles on a 1.5G machine take a large amount of system time for list scanning. With this patch, compiles are within a few seconds of performance of a SLAB kernel with no notable change in system time. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-17Slab allocators: Cleanup zeroing allocationsChristoph Lameter
It becomes now easy to support the zeroing allocs with generic inline functions in slab.h. Provide inline definitions to allow the continued use of kzalloc, kmem_cache_zalloc etc but remove other definitions of zeroing functions from the slab allocators and util.c. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Slab allocators: support __GFP_ZERO in all allocatorsChristoph Lameter
A kernel convention for many allocators is that if __GFP_ZERO is passed to an allocator then the allocated memory should be zeroed. This is currently not supported by the slab allocators. The inconsistency makes it difficult to implement in derived allocators such as in the uncached allocator and the pool allocators. In addition the support zeroed allocations in the slab allocators does not have a consistent API. There are no zeroing allocator functions for NUMA node placement (kmalloc_node, kmem_cache_alloc_node). The zeroing allocations are only provided for default allocs (kzalloc, kmem_cache_zalloc_node). __GFP_ZERO will make zeroing universally available and does not require any addititional functions. So add the necessary logic to all slab allocators to support __GFP_ZERO. The code is added to the hot path. The gfp flags are on the stack and so the cacheline is readily available for checking if we want a zeroed object. Zeroing while allocating is now a frequent operation and we seem to be gradually approaching a 1-1 parity between zeroing and not zeroing allocs. The current tree has 3476 uses of kmalloc vs 2731 uses of kzalloc. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Slab allocators: consistent ZERO_SIZE_PTR support and NULL result semanticsChristoph Lameter
Define ZERO_OR_NULL_PTR macro to be able to remove the checks from the allocators. Move ZERO_SIZE_PTR related stuff into slab.h. Make ZERO_SIZE_PTR work for all slab allocators and get rid of the WARN_ON_ONCE(size == 0) that is still remaining in SLAB. Make slub return NULL like the other allocators if a too large memory segment is requested via __kmalloc. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Slab allocators: consolidate code for krealloc in mm/util.cChristoph Lameter
The size of a kmalloc object is readily available via ksize(). ksize is provided by all allocators and thus we can implement krealloc in a generic way. Implement krealloc in mm/util.c and drop slab specific implementations of krealloc. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: sparsemem supportPaul Mundt
Currently slob is disabled if we're using sparsemem, due to an earlier patch from Goto-san. Slob and static sparsemem work without any trouble as it is, and the only hiccup is a missing slab_is_available() in the case of sparsemem extreme. With this, we're rid of the last set of restrictions for slob usage. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: initial NUMA supportPaul Mundt
This adds preliminary NUMA support to SLOB, primarily aimed at systems with small nodes (tested all the way down to a 128kB SRAM block), whether asymmetric or otherwise. We follow the same conventions as SLAB/SLUB, preferring current node placement for new pages, or with explicit placement, if a node has been specified. Presently on UP NUMA this has the side-effect of preferring node#0 allocations (since numa_node_id() == 0, though this could be reworked if we could hand off a pfn to determine node placement), so single-CPU NUMA systems will want to place smaller nodes further out in terms of node id. Once a page has been bound to a node (via explicit node id typing), we only do block allocations from partial free pages that have a matching node id in the page flags. The current implementation does have some scalability problems, in that all partial free pages are tracked in the global freelist (with contention due to the single spinlock). However, these are things that are being reworked for SMP scalability first, while things like per-node freelists can easily be built on top of this sort of functionality once it's been added. More background can be found in: http://marc.info/?l=linux-mm&m=118117916022379&w=2 http://marc.info/?l=linux-mm&m=118170446306199&w=2 http://marc.info/?l=linux-mm&m=118187859420048&w=2 and subsequent threads. Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: improved alignment handlingNick Piggin
Remove the core slob allocator's minimum alignment restrictions, and instead introduce the alignment restrictions at the slab API layer. This lets us heed the ARCH_KMALLOC/SLAB_MINALIGN directives, and also use __alignof__ (unsigned long) for the default alignment (which should allow relaxed alignment architectures to take better advantage of SLOB's small minimum alignment). Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: remove bigblock trackingNick Piggin
Remove the bigblock lists in favour of using compound pages and going directly to the page allocator. Allocation size is stored in page->private, which also makes ksize more accurate than it previously was. Saves ~.5K of code, and 12-24 bytes overhead per >= PAGE_SIZE allocation. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: rework freelist handlingNick Piggin
Improve slob by turning the freelist into a list of pages using struct page fields, then each page has a singly linked freelist of slob blocks via a pointer in the struct page. - The first benefit is that the slob freelists can be indexed by a smaller type (2 bytes, if the PAGE_SIZE is reasonable). - Next is that freeing is much quicker because it does not have to traverse the entire freelist. Allocation can be slightly faster too, because we can skip almost-full freelist pages completely. - Slob pages are then freed immediately when they become empty, rather than having a periodic timer try to free them. This gives efficiency and memory consumption improvement. Then, we don't encode seperate size and next fields into each slob block, rather we use the sign bit to distinguish between "size" or "next". Then size 1 blocks contain a "next" offset, and others contain the "size" in the first unit and "next" in the second unit. - This allows minimum slob allocation alignment to go from 8 bytes to 2 bytes on 32-bit and 12 bytes to 2 bytes on 64-bit. In practice, it is best to align them to word size, however some architectures (eg. cris) could gain space savings from turning off this extra alignment. Then, make kmalloc use its own slob_block at the front of the allocation in order to encode allocation size, rather than rely on not overwriting slob's existing header block. - This reduces kmalloc allocation overhead similarly to alignment reductions. - Decouples kmalloc layer from the slob allocator. Then, add a page flag specific to slob pages. - This means kfree of a page aligned slob block doesn't have to traverse the bigblock list. I would get benchmarks, but my test box's network doesn't come up with slob before this patch. I think something is timing out. Anyway, things are faster after the patch. Code size goes up about 1K, however dynamic memory usage _should_ be lower even on relatively small memory systems. Future todo item is to restore the cyclic free list search, rather than to always begin at the start. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-17Remove SLAB_CTOR_CONSTRUCTORChristoph Lameter
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-17Slab allocators: Drop support for destructorsChristoph Lameter
There is no user of destructors left. There is no reason why we should keep checking for destructors calls in the slab allocators. The RFC for this patch was discussed at http://marc.info/?l=linux-kernel&m=117882364330705&w=2 Destructors were mainly used for list management which required them to take a spinlock. Taking a spinlock in a destructor is a bit risky since the slab allocators may run the destructors anytime they decide a slab is no longer needed. Patch drops destructor support. Any attempt to use a destructor will BUG(). Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-17slob: implement RCU freeingNick Piggin
The SLOB allocator should implement SLAB_DESTROY_BY_RCU correctly, because even on UP, RCU freeing semantics are not equivalent to simply freeing immediately. This also allows SLOB to be used on SMP. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07slob: fix page order calculation on not 4KB pageAkinobu Mita
SLOB doesn't calculate correct page order when page size is not 4KB. This patch fixes it with using get_order() instead of find_order() which is SLOB version of get_order(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGNChristoph Lameter
This patch was recently posted to lkml and acked by Pekka. The flag SLAB_MUST_HWCACHE_ALIGN is 1. Never checked by SLAB at all. 2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB 3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB. The only remaining use is in sparc64 and ppc64 and their use there reflects some earlier role that the slab flag once may have had. If its specified then SLAB_HWCACHE_ALIGN is also specified. The flag is confusing, inconsistent and has no purpose. Remove it. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07slob: handle SLAB_PANIC flagAkinobu Mita
kmem_cache_create() for slob doesn't handle SLAB_PANIC. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>