<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/cpuset.c, branch v2.6.18.7</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/cpuset.c?h=v2.6.18.7</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/cpuset.c?h=v2.6.18.7'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2006-08-27T18:01:32Z</updated>
<entry>
<title>[PATCH] cpuset: oom panic fix</title>
<updated>2006-08-27T18:01:32Z</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2006-08-27T08:23:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0d673a5a4775d3dc565b6668ed75fd2db2ede624'/>
<id>urn:sha1:0d673a5a4775d3dc565b6668ed75fd2db2ede624</id>
<content type='text'>
cpuset_excl_nodes_overlap always returns 0 if current is exiting.  This caused
customer's systems to panic in the OOM killer when processes were having
trouble getting memory for the final put_user in mm_release.  Even though
there were lots of processes to kill.

Change to returning 1 in this case.  This achieves parity with !CONFIG_CPUSETS
case, and was observed to fix the problem.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Acked-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] cpuset: top_cpuset tracks hotplug changes to cpu_online_map</title>
<updated>2006-08-27T18:01:32Z</updated>
<author>
<name>Paul Jackson</name>
<email>pj@sgi.com</email>
</author>
<published>2006-08-27T08:23:51Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4c4d50f7b39cc58f1064b93a61ad617451ae41df'/>
<id>urn:sha1:4c4d50f7b39cc58f1064b93a61ad617451ae41df</id>
<content type='text'>
Change the list of cpus allowed to tasks in the top (root) cpuset to
dynamically track what cpus are online, using a CPU hotplug notifier.  Make
this top cpus file read-only.

On systems that have cpusets configured in their kernel, but that aren't
actively using cpusets (for some distros, this covers the majority of
systems) all tasks end up in the top cpuset.

If that system does support CPU hotplug, then these tasks cannot make use
of CPUs that are added after system boot, because the CPUs are not allowed
in the top cpuset.  This is a surprising regression over earlier kernels
that didn't have cpusets enabled.

In order to keep the behaviour of cpusets consistent between systems
actively making use of them and systems not using them, this patch changes
the behaviour of the 'cpus' file in the top (root) cpuset, making it read
only, and making it automatically track the value of cpu_online_map.  Thus
tasks in the top cpuset will have automatic use of hot plugged CPUs allowed
by their cpuset.

Thanks to Anton Blanchard and Nathan Lynch for reporting this problem,
driving the fix, and earlier versions of this patch.

Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Nathan Lynch &lt;ntl@pobox.com&gt;
Cc: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock</title>
<updated>2006-07-23T20:03:05Z</updated>
<author>
<name>Paul Jackson</name>
<email>pj@sgi.com</email>
</author>
<published>2006-07-23T18:36:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=abb5a5cc6bba1516403146c5b79036fe843beb70'/>
<id>urn:sha1:abb5a5cc6bba1516403146c5b79036fe843beb70</id>
<content type='text'>
Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.

It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.

This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.

This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it.  We don't have any other reproducible test case.

The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.

This is in the cpuset_rmdir() code, where we:
  * take the callback_mutex lock
  * mark the cpuset CS_REMOVED
  * call update_cpu_domains for cpu_exclusive cpusets
  * in that call, take the cpu_hotplug lock if the
    cpuset is marked for removal.

Thanks to Jack Steiner for identifying this deadlock.

The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock.  This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.

I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two.  The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.

Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>Remove obsolete #include &lt;linux/config.h&gt;</title>
<updated>2006-06-30T17:25:36Z</updated>
<author>
<name>Jörn Engel</name>
<email>joern@wohnheim.fh-wedel.de</email>
</author>
<published>2006-06-30T17:25:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6ab3d5624e172c553004ecc862bfeac16d9d68b7'/>
<id>urn:sha1:6ab3d5624e172c553004ecc862bfeac16d9d68b7</id>
<content type='text'>
Signed-off-by: Jörn Engel &lt;joern@wohnheim.fh-wedel.de&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
</entry>
<entry>
<title>typo fixes: occuring -&gt; occurring</title>
<updated>2006-06-30T16:27:16Z</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@stusta.de</email>
</author>
<published>2006-06-30T16:27:16Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=80f7228b59e4bbe9d840af3ff0f2fe480d6e7c79'/>
<id>urn:sha1:80f7228b59e4bbe9d840af3ff0f2fe480d6e7c79</id>
<content type='text'>
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
</entry>
<entry>
<title>[PATCH] proc: Use struct pid not struct task_ref</title>
<updated>2006-06-26T16:58:26Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2006-06-26T07:25:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=13b41b09491e5d75e8027dca1ee78f5e073bc4c0'/>
<id>urn:sha1:13b41b09491e5d75e8027dca1ee78f5e073bc4c0</id>
<content type='text'>
Incrementally update my proc-dont-lock-task_structs-indefinitely patches so
that they work with struct pid instead of struct task_ref.

Mostly this is a straight 1-1 substitution.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] proc: don't lock task_structs indefinitely</title>
<updated>2006-06-26T16:58:25Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2006-06-26T07:25:55Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=99f895518368252ba862cc15ce4eb98ebbe1bec6'/>
<id>urn:sha1:99f895518368252ba862cc15ce4eb98ebbe1bec6</id>
<content type='text'>
Every inode in /proc holds a reference to a struct task_struct.  If a
directory or file is opened and remains open after the the task exits this
pinning continues.  With 8K stacks on a 32bit machine the amount pinned per
file descriptor is about 10K.

Normally I would figure a reasonable per user process limit is about 100
processes.  With 80 processes, with a 1000 file descriptors each I can trigger
the 00M killer on a 32bit kernel, because I have pinned about 800MB of useless
data.

This patch replaces the struct task_struct pointer with a pointer to a struct
task_ref which has a struct task_struct pointer.  The so the pinning of dead
tasks does not happen.

The code now has to contend with the fact that the task may now exit at any
time.  Which is a little but not muh more complicated.

With this change it takes about 1000 processes each opening up 1000 file
descriptors before I can trigger the OOM killer.  Much better.

[mlp@google.com: task_mmu small fixes]
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Albert Cahalan &lt;acahalan@gmail.com&gt;
Signed-off-by: Prasanna Meda &lt;mlp@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] SELinux: add security hook call to mediate attach_task (kernel/cpuset.c)</title>
<updated>2006-06-23T14:42:54Z</updated>
<author>
<name>David Quigley</name>
<email>dpquigl@tycho.nsa.gov</email>
</author>
<published>2006-06-23T09:04:00Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=22fb52dd736a62e24c44c50739007496265dc38c'/>
<id>urn:sha1:22fb52dd736a62e24c44c50739007496265dc38c</id>
<content type='text'>
Add a security hook call to enable security modules to control the ability
to attach a task to a cpuset.  While limited control over this operation is
possible via permission checks on the pseudo fs interface, those checks are
not sufficient to control access to the target task, which is looked up in
this function.  The existing task_setscheduler hook is re-used for this
operation since this falls under the same class of operations.

Signed-off-by: David Quigley &lt;dpquigl@tycho.nsa.gov&gt;
Acked-by:  Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
Acked-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] VFS: Permit filesystem to override root dentry on mount</title>
<updated>2006-06-23T14:42:45Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2006-06-23T09:02:57Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=454e2398be9b9fa30433fccc548db34d19aa9958'/>
<id>urn:sha1:454e2398be9b9fa30433fccc548db34d19aa9958</id>
<content type='text'>
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Nathan Scott &lt;nathans@sgi.com&gt;
Cc: Roland Dreier &lt;rolandd@cisco.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
<entry>
<title>[PATCH] cpuset: might_sleep_if check in cpuset_zones_allowed</title>
<updated>2006-05-21T19:59:18Z</updated>
<author>
<name>Paul Jackson</name>
<email>pj@sgi.com</email>
</author>
<published>2006-05-20T22:00:11Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=92d1dbd27417c54c23aac6a84c285e256f6118b6'/>
<id>urn:sha1:92d1dbd27417c54c23aac6a84c285e256f6118b6</id>
<content type='text'>
It's too easy to incorrectly call cpuset_zone_allowed() in an atomic
context without __GFP_HARDWALL set, and when done, it is not noticed until
a tight memory situation forces allocations to be tried outside the current
cpuset.

Add a 'might_sleep_if()' check, to catch this earlier on, instead of
waiting for a similar check in the mutex_lock() code, which is only rarely
invoked.

Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
</entry>
</feed>
