Age | Commit message (Collapse) | Author |
|
[ Upstream commit 0fbebed682ff2788dee58e8d7f7dda46e33aa10b ]
If our first THP installation for an MM is via the set_pmd_at() done
during khugepaged's collapsing we'll end up in tsb_grow() trying to do
a GFP_KERNEL allocation with several locks held.
Simply using GFP_ATOMIC in this situation is not the best option
because we really can't have this fail, so we'd really like to keep
this an order 0 GFP_KERNEL allocation if possible.
Also, doing the TSB allocation from khugepaged is a really bad idea
because we'll allocate it potentially from the wrong NUMA node in that
context.
So what we do is defer the hugepage TSB allocation until the first TLB
miss we take on a hugepage. This is slightly tricky because we have
to handle two unusual cases:
1) Taking the first hugepage TLB miss in the window trap handler.
We'll call the winfix_trampoline when that is detected.
2) An initial TSB allocation via TLB miss races with a hugetlb
fault on another cpu running the same MM. We handle this by
unconditionally loading the TSB we see into the current cpu
even if it's non-NULL at hugetlb_setup time.
Reported-by: Meelis Roos <mroos@ut.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This is relatively easy since PMD's now cover exactly 4MB of memory.
Our PMD entries are 32-bits each, so we use a special encoding. The
lowest bit, PMD_ISHUGE, determines the interpretation. This is possible
because sparc64's page tables are purely software entities so we can use
whatever encoding scheme we want. We just have to make the TLB miss
assembler page table walkers aware of the layout.
set_pmd_at() works much like set_pte_at() but it has to operate in two
page from a table of non-huge PTEs, so we have to queue up TLB flushes
based upon what mappings are valid in the PTE table. In the second regime
we are going from huge-page to non-huge-page, and in that case we need
only queue up a single TLB flush to push out the huge page mapping.
We still have 5 bits remaining in the huge PMD encoding so we can very
likely support any new pieces of THP state tracking that might get added
in the future.
With lots of help from Johannes Weiner.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We've split up the PTE tables so that they take up half a page instead of
a full page. This is in order to facilitate transparent huge page
support, which works much better if our PMDs cover 4MB instead of 8MB.
What we do is have a one-behind cache for PTE table allocations in the
mm struct.
This logic triggers only on allocations. For example, we don't try to
keep track of free'd up page table blocks in the style that the s390 port
does.
There were only two slightly annoying aspects to this change:
1) Changing pgtable_t to be a "pte_t *". There's all of this special
logic in the TLB free paths that needed adjustments, as did the
PMD populate interfaces.
2) init_new_context() needs to zap the pointer, since the mm struct
just gets copied from the parent on fork.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Narrowing the scope of the page size configurations will make the
transparent hugepage changes much simpler.
In the end what we really want to do is have the kernel support multiple
huge page sizes and use whatever is appropriate as the context dictactes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current asm-generic/page.h only contains the get_order
function, and asm-generic/uaccess.h only implements
unaligned accesses. This renames the file to getorder.h
and uaccess-unaligned.h to make room for new page.h
and uaccess.h file that will be usable by all simple
(e.g. nommu) architectures.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
As sparse warns, without this struct page pointer subtraction is
extremely expensive, and this is a pretty common operation in
fast paths.
With this define struct page becomes 64 bytes which makes for a
simple subtract and shift, instead of a costly divide or reciprocol
multiply.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The majority of this patch was created by the following script:
***
ASM=arch/sparc/include/asm
mkdir -p $ASM
git mv include/asm-sparc64/ftrace.h $ASM
git rm include/asm-sparc64/*
git mv include/asm-sparc/* $ASM
sed -ie 's/asm-sparc64/asm/g' $ASM/*
sed -ie 's/asm-sparc/asm/g' $ASM/*
***
The rest was an update of the top-level Makefile to use sparc
for header files when sparc64 is being build.
And a small fixlet to pick up the correct unistd.h from
sparc64 code.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
|