<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/init/Kconfig, branch v3.8</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/init/Kconfig?h=v3.8</id>
<link rel='self' href='https://git.amat.us/linux/atom/init/Kconfig?h=v3.8'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-01-16T20:42:57Z</updated>
<entry>
<title>Tell the world we gave up on pushing CC_OPTIMIZE_FOR_SIZE</title>
<updated>2013-01-16T20:42:57Z</updated>
<author>
<name>Kirill Smelkov</name>
<email>kirr@mns.spb.ru</email>
</author>
<published>2012-11-02T11:41:01Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3a55fb0d9fe8e2f4594329edd58c5fd6f35a99dd'/>
<id>urn:sha1:3a55fb0d9fe8e2f4594329edd58c5fd6f35a99dd</id>
<content type='text'>
In commit 281dc5c5ec0f ("Give up on pushing CC_OPTIMIZE_FOR_SIZE") we
already changed the actual default value, but the help-text still
suggested 'y'. Fix the help text too, for all the same reasons.

Sadly, -Os keeps on generating some very suboptimal code for certain
cases, to the point where any I$ miss upside is swamped by the downside.
The main ones are:

 - using "rep movsb" for memcpy, even on CPU's where that is
   horrendously bad for performance.

 - not honoring branch prediction information, so any I$ footprint you
   win from smaller code, you lose from less code density in the I$.

 - using divide instructions when that is very expensive.

Signed-off-by: Kirill Smelkov &lt;kirr@mns.spb.ru&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: infrastructure to match an allocation to the right cache</title>
<updated>2012-12-18T23:02:14Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2012-12-18T22:22:40Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d7f25f8a2f81252d1ac134470ba1d0a287cf8fcd'/>
<id>urn:sha1:d7f25f8a2f81252d1ac134470ba1d0a287cf8fcd</id>
<content type='text'>
The page allocator is able to bind a page to a memcg when it is
allocated.  But for the caches, we'd like to have as many objects as
possible in a page belonging to the same cache.

This is done in this patch by calling memcg_kmem_get_cache in the
beginning of every allocation function.  This function is patched out by
static branches when kernel memory controller is not being used.

It assumes that the task allocating, which determines the memcg in the
page allocator, belongs to the same cgroup throughout the whole process.
Misaccounting can happen if the task calls memcg_kmem_get_cache() while
belonging to a cgroup, and later on changes.  This is considered
acceptable, and should only happen upon task migration.

Before the cache is created by the memcg core, there is also a possible
imbalance: the task belongs to a memcg, but the cache being allocated from
is the global cache, since the child cache is not yet guaranteed to be
ready.  This case is also fine, since in this case the GFP_KMEMCG will not
be passed and the page allocator will not attempt any cgroup accounting.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: JoonSoo Kim &lt;js1304@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Suleiman Souhlal &lt;suleiman@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: kmem accounting basic infrastructure</title>
<updated>2012-12-18T23:02:12Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2012-12-18T22:21:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=510fc4e11b772fd60f2c545c64d4c55abd07ce36'/>
<id>urn:sha1:510fc4e11b772fd60f2c545c64d4c55abd07ce36</id>
<content type='text'>
Add the basic infrastructure for the accounting of kernel memory.  To
control that, the following files are created:

 * memory.kmem.usage_in_bytes
 * memory.kmem.limit_in_bytes
 * memory.kmem.failcnt
 * memory.kmem.max_usage_in_bytes

They have the same meaning of their user memory counterparts.  They
reflect the state of the "kmem" res_counter.

Per cgroup kmem memory accounting is not enabled until a limit is set for
the group.  Once the limit is set the accounting cannot be disabled for
that group.  This means that after the patch is applied, no behavioral
changes exists for whoever is still using memcg to control their memory
usage, until memory.kmem.limit_in_bytes is set for the first time.

We always account to both user and kernel resource_counters.  This
effectively means that an independent kernel limit is in place when the
limit is set to a lower value than the user memory.  A equal or higher
value means that the user limit will always hit first, meaning that kmem
is effectively unlimited.

People who want to track kernel memory but not limit it, can set this
limit to a very high number (like RESOURCE_MAX - 1page - that no one will
ever hit, or equal to the user memory)

[akpm@linux-foundation.org: MEMCG_MMEM only works with slab and slub]
Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Acked-by: Kamezawa Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: JoonSoo Kim &lt;js1304@gmail.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2012-12-17T23:44:47Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-17T23:44:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6a2b60b17b3e48a418695a94bd2420f6ab32e519'/>
<id>urn:sha1:6a2b60b17b3e48a418695a94bd2420f6ab32e519</id>
<content type='text'>
Pull user namespace changes from Eric Biederman:
 "While small this set of changes is very significant with respect to
  containers in general and user namespaces in particular.  The user
  space interface is now complete.

  This set of changes adds support for unprivileged users to create user
  namespaces and as a user namespace root to create other namespaces.
  The tyranny of supporting suid root preventing unprivileged users from
  using cool new kernel features is broken.

  This set of changes completes the work on setns, adding support for
  the pid, user, mount namespaces.

  This set of changes includes a bunch of basic pid namespace
  cleanups/simplifications.  Of particular significance is the rework of
  the pid namespace cleanup so it no longer requires sending out
  tendrils into all kinds of unexpected cleanup paths for operation.  At
  least one case of broken error handling is fixed by this cleanup.

  The files under /proc/&lt;pid&gt;/ns/ have been converted from regular files
  to magic symlinks which prevents incorrect caching by the VFS,
  ensuring the files always refer to the namespace the process is
  currently using and ensuring that the ptrace_mayaccess permission
  checks are always applied.

  The files under /proc/&lt;pid&gt;/ns/ have been given stable inode numbers
  so it is now possible to see if different processes share the same
  namespaces.

  Through the David Miller's net tree are changes to relax many of the
  permission checks in the networking stack to allowing the user
  namespace root to usefully use the networking stack.  Similar changes
  for the mount namespace and the pid namespace are coming through my
  tree.

  Two small changes to add user namespace support were commited here adn
  in David Miller's -net tree so that I could complete the work on the
  /proc/&lt;pid&gt;/ns/ files in this tree.

  Work remains to make it safe to build user namespaces and 9p, afs,
  ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
  Kconfig guard remains in place preventing that user namespaces from
  being built when any of those filesystems are enabled.

  Future design work remains to allow root users outside of the initial
  user namespace to mount more than just /proc and /sys."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
  proc: Usable inode numbers for the namespace file descriptors.
  proc: Fix the namespace inode permission checks.
  proc: Generalize proc inode allocation
  userns: Allow unprivilged mounts of proc and sysfs
  userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
  procfs: Print task uids and gids in the userns that opened the proc file
  userns: Implement unshare of the user namespace
  userns: Implent proc namespace operations
  userns: Kill task_user_ns
  userns: Make create_new_namespaces take a user_ns parameter
  userns: Allow unprivileged use of setns.
  userns: Allow unprivileged users to create new namespaces
  userns: Allow setting a userns mapping to your current uid.
  userns: Allow chown and setgid preservation
  userns: Allow unprivileged users to create user namespaces.
  userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
  userns: fix return value on mntns_install() failure
  vfs: Allow unprivileged manipulation of the mount namespace.
  vfs: Only support slave subtrees across different user namespaces
  vfs: Add a user namespace reference from struct mnt_namespace
  ...
</content>
</entry>
<entry>
<title>Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma</title>
<updated>2012-12-16T23:18:08Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-16T22:33:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3d59eebc5e137bd89c6351e4c70e90ba1d0dc234'/>
<id>urn:sha1:3d59eebc5e137bd89c6351e4c70e90ba1d0dc234</id>
<content type='text'>
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests shows that balancenuma is
  incapable of converging for this workloads driven by perf which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers which is not reported by
  the tool by default and sometimes missed in treports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore batch
  handles PTEs but I'm no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task&lt;-&gt;node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
</content>
</entry>
<entry>
<title>mm: sched: numa: Control enabling and disabling of NUMA balancing</title>
<updated>2012-12-11T14:42:55Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-11-22T11:16:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1a687c2e9a99335c9e77392f050fe607fa18a652'/>
<id>urn:sha1:1a687c2e9a99335c9e77392f050fe607fa18a652</id>
<content type='text'>
This patch adds Kconfig options and kernel parameters to allow the
enabling and disabling of automatic NUMA balancing. The existance
of such a switch was and is very important when debugging problems
related to transparent hugepages and we should have the same for
automatic NUMA placement.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
</content>
</entry>
<entry>
<title>mm: numa: pte_numa() and pmd_numa()</title>
<updated>2012-12-11T14:42:36Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2012-10-03T23:50:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=be3a728427a605990a7a0b6dbf9e29b68e266146'/>
<id>urn:sha1:be3a728427a605990a7a0b6dbf9e29b68e266146</id>
<content type='text'>
Implement pte_numa and pmd_numa.

We must atomically set the numa bit and clear the present bit to
define a pte_numa or pmd_numa.

Once a pte or pmd has been set as pte_numa or pmd_numa, the next time
a thread touches a virtual address in the corresponding virtual range,
a NUMA hinting page fault will trigger. The NUMA hinting page fault
will clear the NUMA bit and set the present bit again to resolve the
page fault.

The expectation is that a NUMA hinting page fault is used as part
of a placement policy that decides if a page should remain on the
current node or migrated to a different node.

Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
</content>
</entry>
<entry>
<title>context_tracking: New context tracking susbsystem</title>
<updated>2012-11-30T19:40:07Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-11-27T18:33:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=91d1aa43d30505b0b825db8898ffc80a8eca96c7'/>
<id>urn:sha1:91d1aa43d30505b0b825db8898ffc80a8eca96c7</id>
<content type='text'>
Create a new subsystem that probes on kernel boundaries
to keep track of the transitions between level contexts
with two basic initial contexts: user or kernel.

This is an abstraction of some RCU code that use such tracking
to implement its userspace extended quiescent state.

We need to pull this up from RCU into this new level of indirection
because this tracking is also going to be used to implement an "on
demand" generic virtual cputime accounting. A necessary step to
shutdown the tick while still accounting the cputime.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Li Zhong &lt;zhong@linux.vnet.ibm.com&gt;
Cc: Gilad Ben-Yossef &lt;gilad@benyossef.com&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
[ paulmck: fix whitespace error and email address. ]
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Add callback-free CPUs</title>
<updated>2012-11-16T18:05:56Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paul.mckenney@linaro.org</email>
</author>
<published>2012-08-20T04:35:53Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3fbfbf7a3b66ec424042d909f14ba2ddf4372ea8'/>
<id>urn:sha1:3fbfbf7a3b66ec424042d909f14ba2ddf4372ea8</id>
<content type='text'>
RCU callback execution can add significant OS jitter and also can
degrade both scheduling latency and, in asymmetric multiprocessors,
energy efficiency.  This commit therefore adds the ability for selected
CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
to kthreads.  If the "rcu_nocb_poll" boot parameter is also specified,
these kthreads will do polling, removing the need for the offloaded
CPUs to do wakeups.  At least one CPU must be doing normal callback
processing: currently CPU 0 cannot be selected as a no-CBs CPU.
In addition, attempts to offline the last normal-CBs CPU will fail.

This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
this commit includes fixes to problems located by Fengguang Wu's
kbuild test robot.

[ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]

Signed-off-by: Paul E. McKenney &lt;paul.mckenney@linaro.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>userns: Support fuse interacting with multiple user namespaces</title>
<updated>2012-11-15T06:05:33Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-02-08T00:26:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=499dcf2024092e5cce41d05599a5b51d1f92031a'/>
<id>urn:sha1:499dcf2024092e5cce41d05599a5b51d1f92031a</id>
<content type='text'>
Use kuid_t and kgid_t in struct fuse_conn and struct fuse_mount_data.

The connection between between a fuse filesystem and a fuse daemon is
established when a fuse filesystem is mounted and provided with a file
descriptor the fuse daemon created by opening /dev/fuse.

For now restrict the communication of uids and gids between the fuse
filesystem and the fuse daemon to the initial user namespace.  Enforce
this by verifying the file descriptor passed to the mount of fuse was
opened in the initial user namespace.  Ensuring the mount happens in
the initial user namespace is not necessary as mounts from non-initial
user namespaces are not yet allowed.

In fuse_req_init_context convert the currrent fsuid and fsgid into the
initial user namespace for the request that will be sent to the fuse
daemon.

In fuse_fill_attr convert the uid and gid passed from the fuse daemon
from the initial user namespace into kuids and kgids.

In iattr_to_fattr called from fuse_setattr convert kuids and kgids
into the uids and gids in the initial user namespace before passing
them to the fuse filesystem.

In fuse_change_attributes_common called from fuse_dentry_revalidate,
fuse_permission, fuse_geattr, and fuse_setattr, and fuse_iget convert
the uid and gid from the fuse daemon into a kuid and a kgid to store
on the fuse inode.

By default fuse mounts are restricted to task whose uid, suid, and
euid matches the fuse user_id and whose gid, sgid, and egid matches
the fuse group id.  Convert the user_id and group_id mount options
into kuids and kgids at mount time, and use uid_eq and gid_eq to
compare the in fuse_allow_task.

Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
</feed>
