<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/rmap.c, branch v3.3.5</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/mm/rmap.c?h=v3.3.5</id>
<link rel='self' href='https://git.amat.us/linux/atom/mm/rmap.c?h=v3.3.5'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-01-13T04:13:06Z</updated>
<entry>
<title>mm: unify remaining mem_cont, mem, etc. variable names to memcg</title>
<updated>2012-01-13T04:13:06Z</updated>
<author>
<name>Johannes Weiner</name>
<email>jweiner@redhat.com</email>
</author>
<published>2012-01-13T01:18:32Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=72835c86ca15d0126354b73d5f29ce9194931c9b'/>
<id>urn:sha1:72835c86ca15d0126354b73d5f29ce9194931c9b</id>
<content type='text'>
Signed-off-by: Johannes Weiner &lt;jweiner@redhat.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mremap: enforce rmap src/dst vma ordering in case of vma_merge() succeeding in copy_vma()</title>
<updated>2012-01-11T00:30:44Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2012-01-10T23:08:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=948f017b093a9baac23855fcd920d3a970b71bb6'/>
<id>urn:sha1:948f017b093a9baac23855fcd920d3a970b71bb6</id>
<content type='text'>
migrate was doing an rmap_walk with speculative lock-less access on
pagetables.  That could lead it to not serializing properly against mremap
PT locks.  But a second problem remains in the order of vmas in the
same_anon_vma list used by the rmap_walk.

If vma_merge succeeds in copy_vma, the src vma could be placed after the
dst vma in the same_anon_vma list.  That could still lead to migrate
missing some pte.

This patch adds an anon_vma_moveto_tail() function to force the dst vma at
the end of the list before mremap starts to solve the problem.

If the mremap is very large and there are a lots of parents or childs
sharing the anon_vma root lock, this should still scale better than taking
the anon_vma root lock around every pte copy practically for the whole
duration of mremap.

Update: Hugh noticed special care is needed in the error path where
move_page_tables goes in the reverse direction, a second
anon_vma_moveto_tail() call is needed in the error path.

This program exercises the anon_vma_moveto_tail:

===

int main()
{
	static struct timeval oldstamp, newstamp;
	long diffsec;
	char *p, *p2, *p3, *p4;
	if (posix_memalign((void **)&amp;p, 2*1024*1024, SIZE))
		perror("memalign"), exit(1);
	if (posix_memalign((void **)&amp;p2, 2*1024*1024, SIZE))
		perror("memalign"), exit(1);
	if (posix_memalign((void **)&amp;p3, 2*1024*1024, SIZE))
		perror("memalign"), exit(1);

	memset(p, 0xff, SIZE);
	printf("%p\n", p);
	memset(p2, 0xff, SIZE);
	memset(p3, 0x77, 4096);
	if (memcmp(p, p2, SIZE))
		printf("error\n");
	p4 = mremap(p+SIZE/2, SIZE/2, SIZE/2, MREMAP_FIXED|MREMAP_MAYMOVE, p3);
	if (p4 != p3)
		perror("mremap"), exit(1);
	p4 = mremap(p4, SIZE/2, SIZE/2, MREMAP_FIXED|MREMAP_MAYMOVE, p+SIZE/2);
	if (p4 != p+SIZE/2)
		perror("mremap"), exit(1);
	if (memcmp(p, p2, SIZE))
		printf("error\n");
	printf("ok\n");

	return 0;
}
===

$ perf probe -a anon_vma_moveto_tail
Add new event:
  probe:anon_vma_moveto_tail (on anon_vma_moveto_tail)

You can now use it on all perf tools, such as:

        perf record -e probe:anon_vma_moveto_tail -aR sleep 1

$ perf record -e probe:anon_vma_moveto_tail -aR ./anon_vma_moveto_tail
0x7f2ca2800000
ok
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.043 MB perf.data (~1860 samples) ]
$ perf report --stdio
   100.00%  anon_vma_moveto  [kernel.kallsyms]  [k] anon_vma_moveto_tail

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reported-by: Nai Xia &lt;nai.xia@gmail.com&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Pawel Sikora &lt;pluto@agmk.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux</title>
<updated>2011-11-07T03:44:47Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-11-07T03:44:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=32aaeffbd4a7457bf2f7448b33b5946ff2a960eb'/>
<id>urn:sha1:32aaeffbd4a7457bf2f7448b33b5946ff2a960eb</id>
<content type='text'>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include &lt;linux/module.h&gt;
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include &lt;linux/module.h&gt;
  net: sch_generic remove redundant use of &lt;linux/module.h&gt;
  net: inet_timewait_sock doesnt need &lt;linux/module.h&gt;
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
</content>
</entry>
<entry>
<title>ksm: fix the comment of try_to_unmap_one()</title>
<updated>2011-11-01T00:30:49Z</updated>
<author>
<name>Wanlong Gao</name>
<email>gaowanlong@cn.fujitsu.com</email>
</author>
<published>2011-11-01T00:08:51Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=99ef0315f1b320f392acc4364598340e78758fd2'/>
<id>urn:sha1:99ef0315f1b320f392acc4364598340e78758fd2</id>
<content type='text'>
try_to_unmap_one() is called by try_to_unmap_ksm(), too.

Signed-off-by: Wanlong Gao &lt;gaowanlong@cn.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Map most files to use export.h instead of module.h</title>
<updated>2011-10-31T13:20:12Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2011-10-16T06:01:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b95f1b31b75588306e32b2afd32166cad48f670b'/>
<id>urn:sha1:b95f1b31b75588306e32b2afd32166cad48f670b</id>
<content type='text'>
The files changed within are only using the EXPORT_SYMBOL
macro variants.  They are not using core modular infrastructure
and hence don't need module.h but only the export.h header.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback</title>
<updated>2011-07-26T17:39:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-26T17:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f01ef569cddb1a8627b1c6b3a134998ad1cf4b22'/>
<id>urn:sha1:f01ef569cddb1a8627b1c6b3a134998ad1cf4b22</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits)
  mm: properly reflect task dirty limits in dirty_exceeded logic
  writeback: don't busy retry writeback on new/freeing inodes
  writeback: scale IO chunk size up to half device bandwidth
  writeback: trace global_dirty_state
  writeback: introduce max-pause and pass-good dirty limits
  writeback: introduce smoothed global dirty limit
  writeback: consolidate variable names in balance_dirty_pages()
  writeback: show bdi write bandwidth in debugfs
  writeback: bdi write bandwidth estimation
  writeback: account per-bdi accumulated written pages
  writeback: make writeback_control.nr_to_write straight
  writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
  writeback: trace event writeback_queue_io
  writeback: trace event writeback_single_inode
  writeback: remove .nonblocking and .encountered_congestion
  writeback: remove writeback_control.more_io
  writeback: skip balance_dirty_pages() for in-memory fs
  writeback: add bdi_dirty_limit() kernel-doc
  writeback: avoid extra sync work at enqueue time
  writeback: elevate queue_io() into wb_writeback()
  ...

Fix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c
</content>
</entry>
<entry>
<title>[S390] reference bit testing for unmapped pages</title>
<updated>2011-07-24T08:48:00Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2011-07-24T08:47:59Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=50a15981a1fac7e019ff7c3cba87531fb580f065'/>
<id>urn:sha1:50a15981a1fac7e019ff7c3cba87531fb580f065</id>
<content type='text'>
On x86 a page without a mapper is by definition not referenced / old.
The s390 architecture keeps the reference bit in the storage key and
the current code will check the storage key for page without a mapper.
This leads to an interesting effect: the first time an s390 system
needs to write pages to swap it only finds referenced pages. This
causes a lot of pages to get added and written to the swap device.
To avoid this behaviour change page_referenced to query the storage
key only if there is a mapper of the page.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>fs: kill i_alloc_sem</title>
<updated>2011-07-21T00:47:46Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2011-06-24T18:29:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bd5fe6c5eb9c548d7f07fe8f89a150bb6705e8e3'/>
<id>urn:sha1:bd5fe6c5eb9c548d7f07fe8f89a150bb6705e8e3</id>
<content type='text'>
i_alloc_sem is a rather special rw_semaphore.  It's the last one that may
be released by a non-owner, and it's write side is always mirrored by
real exclusion.  It's intended use it to wait for all pending direct I/O
requests to finish before starting a truncate.

Replace it with a hand-grown construct:

 - exclusion for truncates is already guaranteed by i_mutex, so it can
   simply fall way
 - the reader side is replaced by an i_dio_count member in struct inode
   that counts the number of pending direct I/O requests.  Truncate can't
   proceed as long as it's non-zero
 - when i_dio_count reaches non-zero we wake up a pending truncate using
   wake_up_bit on a new bit in i_flags
 - new references to i_dio_count can't appear while we are waiting for
   it to read zero because the direct I/O count always needs i_mutex
   (or an equivalent like XFS's i_iolock) for starting a new operation.

This scheme is much simpler, and saves the space of a spinlock_t and a
struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
system).

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure.c: fix spinlock vs mutex order</title>
<updated>2011-06-28T01:00:13Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2011-06-27T23:18:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9b679320a5fbf46454011e5c62e0b8991b0956d1'/>
<id>urn:sha1:9b679320a5fbf46454011e5c62e0b8991b0956d1</id>
<content type='text'>
We cannot take a mutex while holding a spinlock, so flip the order and
fix the locking documentation.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: avoid anon_vma_chain allocation under anon_vma lock</title>
<updated>2011-06-18T02:24:11Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-06-18T02:05:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=dd34739c03f2f9a79403d33419c2e61e11b4c403'/>
<id>urn:sha1:dd34739c03f2f9a79403d33419c2e61e11b4c403</id>
<content type='text'>
Hugh Dickins points out that lockdep (correctly) spots a potential
deadlock on the anon_vma lock, because we now do a GFP_KERNEL allocation
of anon_vma_chain while doing anon_vma_clone().  The problem is that
page reclaim will want to take the anon_vma lock of any anonymous pages
that it will try to reclaim.

So re-organize the code in anon_vma_clone() slightly: first do just a
GFP_NOWAIT allocation, which will usually work fine.  But if that fails,
let's just drop the lock and re-do the allocation, now with GFP_KERNEL.

End result: not only do we avoid the locking problem, this also ends up
getting better concurrency in case the allocation does need to block.
Tim Chen reports that with all these anon_vma locking tweaks, we're now
almost back up to the spinlock performance.

Reported-and-tested-by: Hugh Dickins &lt;hughd@google.com&gt;
Tested-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
