<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sysctl.c, branch v2.6.18.7</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/sysctl.c?h=v2.6.18.7</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/sysctl.c?h=v2.6.18.7'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2006-10-13T20:23:22Z</updated>
<entry>
<title>zone_reclaim: dynamic slab reclaim</title>
<updated>2006-10-13T20:23:22Z</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-10-02T17:45:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=be64642c614ee7b193a75da3731c7ee397c21b4b'/>
<id>urn:sha1:be64642c614ee7b193a75da3731c7ee397c21b4b</id>
<content type='text'>
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0ff38490c836dc379ff7ec45b10a15a662f4e5f6


Currently one can enable slab reclaim by setting an explicit option in
/proc/sys/vm/zone_reclaim_mode.  Slab reclaim is then used as a final
option if the freeing of unmapped file backed pages is not enough to free
enough pages to allow a local allocation.

However, that means that the slab can grow excessively and that most memory
of a node may be used by slabs.  We have had a case where a machine with
46GB of memory was using 40-42GB for slab.  Zone reclaim was effective in
dealing with pagecache pages.  However, slab reclaim was only done during
global reclaim (which is a bit rare on NUMA systems).

This patch implements slab reclaim during zone reclaim.  Zone reclaim
occurs if there is a danger of an off node allocation.  At that point we

1. Shrink the per node page cache if the number of pagecache
   pages is more than min_unmapped_ratio percent of pages in a zone.

2. Shrink the slab cache if the number of the nodes reclaimable slab pages
   (patch depends on earlier one that implements that counter)
   are more than min_slab_ratio (a new /proc/sys/vm tunable).

The shrinking of the slab cache is a bit problematic since it is not node
specific.  So we simply calculate what point in the slab we want to reach
(current per node slab use minus the number of pages that neeed to be
allocated) and then repeately run the global reclaim until that is
unsuccessful or we have reached the limit.  I hope we will have zone based
slab reclaim at some point which will make that easier.

The default for the min_slab_ratio is 5%

Also remove the slab option from /proc/sys/vm/zone_reclaim_mode.

[akpm@osdl.org: cleanups]
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>[PATCH] ZVC/zone_reclaim: Leave 1% of unmapped pagecache pages for file I/O</title>
<updated>2006-07-03T22:26:59Z</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-07-03T07:24:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9614634fe6a138fd8ae044950700d2af8d203f97'/>
<id>urn:sha1:9614634fe6a138fd8ae044950700d2af8d203f97</id>
<content type='text'>
It turns out that it is advantageous to leave a small portion of unmapped file
backed pages if all of a zone's pages (or almost all pages) are allocated and
so the page allocator has to go off-node.

This allows recently used file I/O buffers to stay on the node and
reduces the times that zone reclaim is invoked if file I/O occurs
when we run out of memory in a zone.

The problem is that zone reclaim runs too frequently when the page cache is
used for file I/O (read write and therefore unmapped pages!) alone and we have
almost all pages of the zone allocated.  Zone reclaim may remove 32 unmapped
pages.  File I/O will use these pages for the next read/write requests and the
unmapped pages increase.  After the zone has filled up again zone reclaim will
remove it again after only 32 pages.  This cycle is too inefficient and there
are potentially too many zone reclaim cycles.

With the 1% boundary we may still remove all unmapped pages for file I/O in
zone reclaim pass.  However.  it will take a large number of read and writes
to get back to 1% again where we trigger zone reclaim again.

The zone reclaim 2.6.16/17 does not show this behavior because we have a 30
second timeout.

[akpm@osdl.org: rename the /proc file and the variable]
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial</title>
<updated>2006-06-30T22:39:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@g5.osdl.org</email>
</author>
<published>2006-06-30T22:39:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=22a3e233ca08a2ddc949ba1ae8f6e16ec7ef1a13'/>
<id>urn:sha1:22a3e233ca08a2ddc949ba1ae8f6e16ec7ef1a13</id>
<content type='text'>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  Remove obsolete #include &lt;linux/config.h&gt;
  remove obsolete swsusp_encrypt
  arch/arm26/Kconfig typos
  Documentation/IPMI typos
  Kconfig: Typos in net/sched/Kconfig
  v9fs: do not include linux/version.h
  Documentation/DocBook/mtdnand.tmpl: typo fixes
  typo fixes: specfic -&gt; specific
  typo fixes in Documentation/networking/pktgen.txt
  typo fixes: occuring -&gt; occurring
  typo fixes: infomation -&gt; information
  typo fixes: disadvantadge -&gt; disadvantage
  typo fixes: aquire -&gt; acquire
  typo fixes: mecanism -&gt; mechanism
  typo fixes: bandwith -&gt; bandwidth
  fix a typo in the RTC_CLASS help text
  smb is no longer maintained

Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
</content>
</entry>
<entry>
<title>[PATCH] zoned vm counters: zone_reclaim: remove /proc/sys/vm/zone_reclaim_interval</title>
<updated>2006-06-30T18:25:35Z</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2006-06-30T08:55:37Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=34aa1330f9b3c5783d269851d467326525207422'/>
<id>urn:sha1:34aa1330f9b3c5783d269851d467326525207422</id>
<content type='text'>
The zone_reclaim_interval was necessary because we were not able to determine
how many unmapped pages exist in a zone.  Therefore we had to scan in
intervals to figure out if any pages were unmapped.

With the zoned counters and NR_ANON_PAGES we now know the number of pagecache
pages and the number of mapped pages in a zone.  So we can simply skip the
reclaim if there is an insufficient number of unmapped pages.  We use
SWAP_CLUSTER_MAX as the boundary.

Drop all support for /proc/sys/vm/zone_reclaim_interval.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>Remove obsolete #include &lt;linux/config.h&gt;</title>
<updated>2006-06-30T17:25:36Z</updated>
<author>
<name>Jörn Engel</name>
<email>joern@wohnheim.fh-wedel.de</email>
</author>
<published>2006-06-30T17:25:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6ab3d5624e172c553004ecc862bfeac16d9d68b7'/>
<id>urn:sha1:6ab3d5624e172c553004ecc862bfeac16d9d68b7</id>
<content type='text'>
Signed-off-by: Jörn Engel &lt;joern@wohnheim.fh-wedel.de&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
</entry>
<entry>
<title>[PATCH] pi-futex: rt mutex core</title>
<updated>2006-06-28T00:32:47Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2006-06-27T09:54:53Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=23f78d4a03c53cbd75d87a795378ea540aa08c86'/>
<id>urn:sha1:23f78d4a03c53cbd75d87a795378ea540aa08c86</id>
<content type='text'>
Core functions for the rt-mutex subsystem.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] vdso: randomize the i386 vDSO by moving it into a vma</title>
<updated>2006-06-28T00:32:38Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2006-06-27T09:53:50Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e6e5494cb23d1933735ee47cc674ffe1c4afed6f'/>
<id>urn:sha1:e6e5494cb23d1933735ee47cc674ffe1c4afed6f</id>
<content type='text'>
Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Gerd Hoffmann &lt;kraxel@suse.de&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Zachary Amsden &lt;zach@vmware.com&gt;
Cc: Andi Kleen &lt;ak@muc.de&gt;
Cc: Jan Beulich &lt;jbeulich@novell.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] x86_64: Add compat_printk and sysctl to turn off compat layer warnings</title>
<updated>2006-06-26T17:48:16Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@suse.de</email>
</author>
<published>2006-06-26T11:56:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bebfa1013eee1d91b3242e5801cc8fbdfaf148ec'/>
<id>urn:sha1:bebfa1013eee1d91b3242e5801cc8fbdfaf148ec</id>
<content type='text'>
Sometimes e.g. with crashme the compat layer warnings can be noisy.
Add a way to turn them off by gating all output through compat_printk
that checks a global sysctl. The default is not changed.

Signed-off-by: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] Get rid of /proc/sys/proc</title>
<updated>2006-06-25T17:01:15Z</updated>
<author>
<name>Stephen Hemminger</name>
<email>shemminger@osdl.org</email>
</author>
<published>2006-06-25T12:48:31Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=eab03ac7bd3e0da99eb9dc068772a85a5e3f3577'/>
<id>urn:sha1:eab03ac7bd3e0da99eb9dc068772a85a5e3f3577</id>
<content type='text'>
The table is empty, why does it still exist?

Signed-off-by: Stephen Hemminger &lt;shemminger@osdl.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] CONFIG_NET=n build fix</title>
<updated>2006-06-23T14:43:06Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@osdl.org</email>
</author>
<published>2006-06-23T09:05:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=57ae2508610d50893cb3e3bbb869ff70ff724a2a'/>
<id>urn:sha1:57ae2508610d50893cb3e3bbb869ff70ff724a2a</id>
<content type='text'>
Cc: Greg KH &lt;greg@kroah.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
</feed>
