Age | Commit message | Author
2007-10-16 | memoryless nodes: fixup uses of node_online_map in generic code | Lee Schermerhorn
Here's a cut at fixing up uses of the online node map in generic code.

mm/shmem.c:shmem_parse_mpol()
    Ensure nodelist is subset of nodes with memory. Use node_states[N_HIGH_MEMORY] as default for missing nodelist for interleave policy.

mm/shmem.c:shmem_fill_super()
    initialize policy_nodes to node_states[N_HIGH_MEMORY]

mm/page-writeback.c:highmem_dirtyable_memory()
    sum over nodes with memory

mm/page_alloc.c:zlc_setup()
    allowednodes - use nodes with memory.

mm/page_alloc.c:default_zonelist_order()
    average over nodes with memory.

mm/page_alloc.c:find_next_best_node()
    skip nodes w/o memory. N_HIGH_MEMORY state mask may not be initialized at this time, unless we want to depend on early_calculate_totalpages() [see below]. Will ZONE_MOVABLE ever be configurable?

mm/page_alloc.c:find_zone_movable_pfns_for_nodes()
    spread kernelcore over nodes with memory. This required calling early_calculate_totalpages() unconditionally, and populating N_HIGH_MEMORY node state therein from nodes in the early_node_map[]. If we can depend on this, we can eliminate the population of N_HIGH_MEMORY mask from __build_all_zonelists() and use the N_HIGH_MEMORY mask in find_next_best_node().

mm/mempolicy.c:mpol_check_policy()
    Ensure nodes specified for policy are subset of nodes with memory.

[akpm@linux-foundation.org: fix warnings] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Use N_HIGH_MEMORY for cpusets | Christoph Lameter
cpusets try to ensure that any node added to a cpuset's mems_allowed is on-line and contains memory. The assumption was that online nodes contained memory. Thus, it is possible to add memoryless nodes to a cpuset and then add tasks to this cpuset. This results in a continuous series of oom-kills and an apparent system hang. Change cpusets to use node_states[N_HIGH_MEMORY] [a.k.a. node_memory_map] in place of node_online_map when vetting memories. Return an error if the admin attempts to write a non-empty mems_allowed node mask containing only memoryless nodes. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
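The vetting rule described above can be sketched as a small userspace model. This is not the kernel code: the function name `update_mems_allowed`, the use of Python sets for node masks, and the choice of `ValueError` as the error are all assumptions made for illustration.

```python
def update_mems_allowed(requested, nodes_with_memory):
    """Model of vetting a cpuset 'mems' write against nodes that have memory.

    requested         -- iterable of node ids the admin wrote (hypothetical input)
    nodes_with_memory -- node ids standing in for node_states[N_HIGH_MEMORY]
    """
    requested = set(requested)
    # Keep only the nodes that actually have memory.
    usable = requested & set(nodes_with_memory)
    # A non-empty request that names only memoryless nodes is rejected.
    if requested and not usable:
        raise ValueError("mems contains only memoryless nodes")
    return usable
```

With nodes 0-3 memoryless and node 4 holding all memory, `update_mems_allowed({0, 4}, {4})` keeps only node 4, while `update_mems_allowed({0, 1}, {4})` raises instead of leaving tasks in a cpuset that can never satisfy an allocation.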
2007-10-16 | Memoryless nodes: Fix GFP_THISNODE behavior | Christoph Lameter
GFP_THISNODE checks that the zone selected is within the pgdat (node) of the first zone of a nodelist. That only works if the node has memory. A memoryless node will have its first zone on another pgdat (node). GFP_THISNODE currently will simply return memory on the first pgdat. Thus it is returning memory on other nodes. GFP_THISNODE should fail if there is no local memory on a node. Add a new set of zonelists for each node that only contain the zones that belong to the node itself so that no fallback is possible. Then modify gfp_type to pick up the right zonelist based on the presence of __GFP_THISNODE. Drop the existing GFP_THISNODE checks from the page allocator's hot path. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
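A rough model of the two-zonelist idea, simplified to node granularity (the real lists hold zones, not nodes). The names `build_zonelists` and `alloc_node`, the distance table, and the `this_node_only` flag standing in for __GFP_THISNODE are hypothetical.

```python
def build_zonelists(distance, memory_nodes):
    """Return (fallback, thisnode) lists for every node in `distance`.

    distance     -- {node: {other_node: cost}}, an assumed NUMA distance table
    memory_nodes -- set of nodes that have memory
    """
    fallback, thisnode = {}, {}
    for node in distance:
        # Normal list: every node with memory, nearest first.
        fallback[node] = sorted(memory_nodes,
                                key=lambda n: (distance[node][n], n))
        # Restricted list: only the node itself, empty if it has no memory.
        thisnode[node] = [node] if node in memory_nodes else []
    return fallback, thisnode


def alloc_node(node, zonelists, this_node_only=False):
    """Pick the node an allocation lands on, or None if it must fail."""
    fallback, thisnode = zonelists
    candidates = (thisnode if this_node_only else fallback)[node]
    return candidates[0] if candidates else None
```

Because the restricted list is simply empty for a memoryless node, the failure falls out of list selection and no per-allocation check is needed, which is the point of dropping the checks from the hot path.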
2007-10-16 | Memoryless nodes: drop one memoryless node boot warning | Christoph Lameter
get_pfn_range_for_nid() is called multiple times for each node at boot time. Each time, it will warn about nodes with no memory, resulting in boot messages like:

    Node 0 active with no memory
    Node 0 active with no memory
    Node 0 active with no memory
    Node 0 active with no memory
    Node 0 active with no memory
    Node 0 active with no memory
    On node 0 totalpages: 0
    Node 0 active with no memory
    Node 0 active with no memory
    DMA zone: 0 pages used for memmap
    Node 0 active with no memory
    Node 0 active with no memory
    Normal zone: 0 pages used for memmap
    Node 0 active with no memory
    Node 0 active with no memory
    Movable zone: 0 pages used for memmap

and so on for each memoryless node.

We already have the "On node N totalpages: ..." and other related messages, so drop the "Node N active with no memory" warnings.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Bob Picco <bob.picco@hp.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Add N_CPU node state | Christoph Lameter
We need the check for a node with cpu in zone reclaim. Zone reclaim will not allow remote zone reclaim if a node has a cpu. [Lee.Schermerhorn@hp.com: Move setup of N_CPU node state mask] Signed-off-by: Christoph Lameter <clameter@sgi.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
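The rule stated above ("no remote zone reclaim if a node has a cpu") can be written as a one-line predicate. This is only a sketch: `may_zone_reclaim` and its parameters are invented names, and `nodes_with_cpu` stands in for the N_CPU node state mask.

```python
def may_zone_reclaim(zone_node, current_node, nodes_with_cpu):
    """Return True if reclaim of a zone on `zone_node` may run from `current_node`.

    Reclaim is refused only when the zone is remote AND its node has its own
    cpus, on the assumption that those cpus can do the reclaim themselves.
    """
    is_remote = zone_node != current_node
    return not (is_remote and zone_node in nodes_with_cpu)
```

A cpu-less node (for example a memory-only node) can therefore still be reclaimed from a remote cpu, which would not be expressible with the online map alone.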
2007-10-16 | Memoryless nodes: Update memory policy and page migration | Christoph Lameter
Online nodes now may have no memory. The checks and initialization must therefore be changed to no longer use the online functions. This will correctly initialize the interleave on bootup to only target nodes with memory and will make sys_move_pages return an error when a page is to be moved to a memoryless node. Similarly we will get an error if MPOL_BIND or MPOL_INTERLEAVE is used on a memoryless node. These are somewhat new semantics. So far one could specify memoryless nodes and we would maybe do the right thing and just ignore the node (or we'd do something strange like with MPOL_INTERLEAVE). If we want to allow the specification of memoryless nodes via memory policies then we need to keep checking for online nodes. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Allow profiling data to fall back to other nodes | Christoph Lameter
Processors on memoryless nodes must be able to fall back to remote nodes in order to get a profiling buffer. This may lead to excessive NUMA traffic but I think we should allow this rather than failing. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Uncached allocator updates | Christoph Lameter
The checks for node_online in the uncached allocator are made to make sure that memory is available on these nodes. Thus switch all the checks to use N_HIGH_MEMORY instead of N_ONLINE. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: SLUB support | Christoph Lameter
Simply switch all for_each_online_node to for_each_node_state(NORMAL_MEMORY). That way SLUB only operates on nodes with regular memory. Any allocation attempt on a memoryless node or a node with just highmem will fail, whereupon SLUB will fetch memory from a nearby node (depending on how memory policies and cpuset describe fallback). Signed-off-by: Christoph Lameter <clameter@sgi.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Slab support | Christoph Lameter
Slab should not allocate control structures for nodes without memory. This may seem to work right now but it's unreliable since not all allocations can fall back due to the use of GFP_THISNODE. Switching a few for_each_online_node's to N_NORMAL_MEMORY will allow us to only allocate for nodes that have regular memory. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: No need for kswapd | Christoph Lameter
A node without memory does not need a kswapd. So use the memory map instead of the online map when starting kswapd. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: OOM: use N_HIGH_MEMORY map instead of constructing one on the fly | Christoph Lameter
constrained_alloc() builds its own memory map for nodes with memory. We have that available in N_HIGH_MEMORY now. So simplify the code. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | Memoryless nodes: Fix interleave behavior for memoryless nodes | Christoph Lameter
MPOL_INTERLEAVE currently simply loops over all nodes. Allocations on memoryless nodes will be redirected to nodes with memory. This results in an imbalance because the neighboring nodes to memoryless nodes will get significantly more interleave hits than the rest of the nodes on the system. We can avoid this imbalance by clearing the nodes in the interleave node set that have no memory. If we use the node map of the memory nodes instead of the online nodes then we have only the nodes we want. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
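The imbalance is easy to reproduce in a toy round-robin model. Everything here is an assumption for illustration: the function name `interleave_hits`, the `fallback` table that says where an allocation aimed at a memoryless node ends up, and the four-node layout in the usage note.

```python
from collections import Counter
from itertools import cycle, islice


def interleave_hits(policy_nodes, memory_nodes, fallback, allocations):
    """Count which node each of `allocations` round-robin allocations lands on.

    policy_nodes -- nodes in the interleave set
    memory_nodes -- nodes that actually have memory
    fallback     -- {memoryless_node: node_with_memory} redirect table
    """
    hits = Counter()
    for node in islice(cycle(sorted(policy_nodes)), allocations):
        # An allocation aimed at a memoryless node is redirected.
        target = node if node in memory_nodes else fallback[node]
        hits[target] += 1
    return dict(hits)
```

With nodes 0-3 online, node 2 memoryless and falling back to node 3, interleaving 12 allocations over the online set gives node 3 twice the share of nodes 0 and 1. Interleaving over the memory set `{0, 1, 3}` instead gives each node 4 allocations.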
2007-10-16 | Memoryless nodes: introduce mask of nodes with memory | Christoph Lameter
It is necessary to know if nodes have memory since we have recently begun to add support for memoryless nodes. For that purpose we introduce two new node states: N_HIGH_MEMORY and N_NORMAL_MEMORY. A node has its bit in N_HIGH_MEMORY set if it has any memory regardless of the type of memory. If a node has memory then it has at least one zone defined in its pgdat structure that is located in the pgdat itself. A node has its bit in N_NORMAL_MEMORY set if it has a lower zone than ZONE_HIGHMEM. This means it is possible to allocate memory that is not subject to kmap. N_HIGH_MEMORY and N_NORMAL_MEMORY can then be used in various places to ensure that we do the right thing when we encounter a memoryless node. [akpm@linux-foundation.org: build fix] [Lee.Schermerhorn@hp.com: update N_HIGH_MEMORY node state for memory hotadd] [y-goto@jp.fujitsu.com: Fix memory hotplug + sparsemem build] Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
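The two definitions above can be restated as a small function over per-node zone sizes. This is a sketch under assumptions: the three-zone list, the `present_pages` input shape, and the name `memory_states` are illustrative and not taken from the patch.

```python
# Simplified zone order, lowest first; only HIGHMEM needs kmap.
ZONES = ("DMA", "NORMAL", "HIGHMEM")


def memory_states(present_pages):
    """Compute (n_high_memory, n_normal_memory) bitmasks.

    present_pages -- {node: {zone_name: page_count}}; missing zones count as 0
    """
    high = normal = 0
    for node, zones in present_pages.items():
        # Any populated zone at all sets the N_HIGH_MEMORY bit.
        if any(zones.get(z, 0) > 0 for z in ZONES):
            high |= 1 << node
        # A populated zone below HIGHMEM sets the N_NORMAL_MEMORY bit.
        if any(zones.get(z, 0) > 0 for z in ZONES if z != "HIGHMEM"):
            normal |= 1 << node
    return high, normal
```

For a node with only highmem the first mask has its bit set and the second does not, which is the case that matters to allocators that need directly addressable memory.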
2007-10-16 | Memoryless nodes: Generic management of nodemasks for various purposes | Christoph Lameter
Why do we need to support memoryless nodes?

KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> For fujitsu, problem is called "empty" node.
>
> When ACPI's SRAT table includes "possible nodes", ia64 bootstrap(acpi_numa_init)
> creates nodes, which includes no memory, no cpu.
>
> I tried to remove empty-node in past, but that was denied.
> It was because we can hot-add cpu to the empty node.
> (node-hotplug triggered by cpu is not implemented now. and it will be ugly.)
>
> For HP, (Lee can comment on this later), they have memory-less-node.
> As far as I hear, HP's machine can have following configuration.
>
> (example)
> Node0: CPU0 memory AAA MB
> Node1: CPU1 memory AAA MB
> Node2: CPU2 memory AAA MB
> Node3: CPU3 memory AAA MB
> Node4: Memory XXX GB
>
> AAA is very small value (below 16MB) and will be omitted by ia64 bootstrap.
> After boot, only Node 4 has valid memory (but have no cpu.)
>
> Maybe this is memory-interleave by firmware config.

Christoph Lameter <clameter@sgi.com> wrote:

> Future SGI platforms (actually also current one can have but nothing like
> that is deployed to my knowledge) have nodes with only cpus. Current SGI
> platforms have nodes with just I/O that we so far cannot manage in the
> core. So the arch code maps them to the nearest memory node.

Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:

> For the HP platforms, we can configure each cell with from 0% to 100%
> "cell local memory". When we configure with <100% CLM, the "missing
> percentages" are interleaved by hardware on a cache-line granularity to
> improve bandwidth at the expense of latency for numa-challenged
> applications [and OSes, but not our problem ;-)]. When we boot Linux on
> such a config, all of the real nodes have no memory--it all resides in a
> single interleaved pseudo-node.
>
> When we boot Linux on a 100% CLM configuration [== NUMA], we still have
> the interleaved pseudo-node. It contains a few hundred MB stolen from
> the real nodes to contain the DMA zone. [Interleaved memory resides at
> phys addr 0]. The memoryless-nodes patches, along with the zoneorder
> patches, support this config as well.
>
> Also, when we boot a NUMA config with the "mem=" command line,
> specifying less memory than actually exists, Linux takes the excluded
> memory "off the top" rather than distributing it across the nodes. This
> can result in memoryless nodes, as well.

This patch:

Preparation for memoryless node patches. Provide a generic way to keep nodemasks describing various characteristics of NUMA nodes. Remove the node_online_map and the node_possible map and realize the same functionality using two node states: N_POSSIBLE and N_ONLINE.

[Lee.Schermerhorn@hp.com: Initialize N_*_MEMORY and N_CPU masks for non-NUMA config] Signed-off-by: Christoph Lameter <clameter@sgi.com> Tested-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Bob Picco <bob.picco@hp.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@skynet.ie> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
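The idea of one nodemask per node state, indexed by a state enum, can be modelled in userspace as below. This is a sketch only: the `NodeStates` class, its method names, and `hp_example()` are invented for illustration; the state names follow those mentioned in this series, and the example layout follows the HP configuration quoted above (nodes 0-3 with cpus but no usable memory, node 4 with memory but no cpu).

```python
from enum import IntEnum


class NodeState(IntEnum):
    N_POSSIBLE = 0       # node could exist at some point
    N_ONLINE = 1         # node is online
    N_NORMAL_MEMORY = 2  # node has memory below highmem
    N_HIGH_MEMORY = 3    # node has any memory at all
    N_CPU = 4            # node has at least one cpu


class NodeStates:
    """One integer bitmask per state; bit n set means node n is in that state."""

    def __init__(self):
        self.masks = [0] * len(NodeState)

    def set(self, node, state):
        self.masks[state] |= 1 << node

    def clear(self, node, state):
        self.masks[state] &= ~(1 << node)

    def test(self, node, state):
        return bool((self.masks[state] >> node) & 1)

    def each(self, state):
        """Yield node ids in `state`, lowest first."""
        node, mask = 0, self.masks[state]
        while mask:
            if mask & 1:
                yield node
            node, mask = node + 1, mask >> 1


def hp_example():
    """Build the quoted layout: cpus on nodes 0-3, all memory on node 4."""
    states = NodeStates()
    for node in range(5):
        states.set(node, NodeState.N_POSSIBLE)
        states.set(node, NodeState.N_ONLINE)
    for node in range(4):
        states.set(node, NodeState.N_CPU)
    states.set(4, NodeState.N_NORMAL_MEMORY)
    states.set(4, NodeState.N_HIGH_MEMORY)
    return states
```

Iterating `hp_example().each(NodeState.N_ONLINE)` visits all five nodes, while iterating N_HIGH_MEMORY visits only node 4, which is why code that needs memory should walk the latter mask.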
2007-10-16 | fs: remove some AOP_TRUNCATED_PAGE | Nick Piggin
prepare/commit_write no longer returns AOP_TRUNCATED_PAGE since OCFS2 and GFS2 were converted to the new aops, so we can make some simplifications for that. [michal.k.k.piotrowski@gmail.com: fix warning] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fs: restore nobh | Nick Piggin
Implement nobh in new aops. This is a bit tricky. FWIW, nobh_truncate is now implemented in a way that does not create blocks in sparse regions, which is a silly thing for it to have been doing (isn't it?) ext2 survives fsx and fsstress. jfs is converted as well... ext3 should be easy to do (but not done yet). [akpm@linux-foundation.org: coding-style fixes] Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | ocfs2: convert to new aops | Nick Piggin
Plug ocfs2 into the ->write_begin and ->write_end aops. A bunch of custom code is now gone - the iovec iteration stuff during write and the ocfs2 splice write actor. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fs: affs convert to new aops | Nick Piggin
Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fs: adfs convert to new aops | Nick Piggin
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | jfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | minixfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Andries Brouwer <Andries.Brouwer@cwi.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | sysv: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | udf: convert to new aops | Nick Piggin
Convert udf to new aops. Also seem to have fixed pagecache corruption in udf_adinicb_commit_write -- page was marked uptodate when it is not. Also, fixed the silly setup where prepare_write was doing a kmap to be used in commit_write: just do kmap_atomic in write_end. Use libfs helpers to make this easier. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: <bfennema@falcon.csc.calpoly.edu> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | ufs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | jffs2: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | hostfs: convert to new aops | Nick Piggin
This also gets rid of a lot of useless read_file stuff. And also optimises the full page write case by marking a !uptodate page uptodate. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fuse: convert to new aops | Nick Piggin
[mszeredi] - don't send zero length write requests - it is not legal for the filesystem to return with zero written bytes Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | smbfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | nfs: convert to new aops | Nick Piggin
[akpm@linux-foundation.org: fix against git-nfs] [peterz@infradead.org: fix against git-nfs] Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | With reiserfs no longer using the weird generic_cont_expand, remove it completely. | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | reiserfs: use generic_cont_expand_simple | Vladimir Saveliev
This patch makes reiserfs use AOP_FLAG_CONT_EXPAND in order to get rid of the special generic_cont_expand routine. Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | reiserfs: convert to new aops | Vladimir Saveliev
Convert reiserfs to new aops Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | reiserfs: use generic write | Vladimir Saveliev
Make reiserfs write via generic routines. The original reiserfs write, optimized for big writes, is deadlock prone. Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | qnx4: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Anders Larsen <al@alarsen.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | bfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | hpfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | hfsplus: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | hfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fat: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | fs: new cont helpers | Nick Piggin
Rework the generic block "cont" routines to handle the new aops. Supporting cont_prepare_write would take quite a lot of code, so remove it instead (and we later convert all filesystems to use it). write_begin gets passed AOP_FLAG_CONT_EXPAND when called from generic_cont_expand, so filesystems can avoid the old hacks they used. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | gfs2: convert to new aops | Steven Whitehouse
Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | xfs: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | ext4: convert to new aops | Nick Piggin
Convert ext4 to use write_begin()/write_end() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dmitriy Monakhov <dmonakhov@sw.ru> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | ext3: convert to new aops | Nick Piggin
Various fixes and improvements Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | ext2: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | block_dev: convert to new aops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | implement simple fs aops | Nick Piggin
Implement new aops for some of the simpler filesystems. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | mm: restore KERNEL_DS optimisations | Nick Piggin
Restore the KERNEL_DS optimisation, especially helpful to the 2copy write path. This may be a pretty questionable gain in most cases, especially after the legacy 2copy write path is removed, but it doesn't cost much. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 | deny partial write for loop dev fd | Dmitry Monakhov
Partial write can be easily supported by LO_CRYPT_NONE mode, but it is not easy in the LO_CRYPT_CRYPTOAPI case, because of its block nature. I don't know who still uses cryptoapi, but theoretically it is possible. So let's leave things as they are. The loop device didn't support partial write before Nick's "write_begin/write_end" patch set, so let it behave the same way after. Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>