Diffstat (limited to 'Documentation/cachetlb.txt')
 -rw-r--r--	Documentation/cachetlb.txt	107
 1 file changed, 57 insertions, 50 deletions
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 53245c429f7..d79b008e4a3 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -5,7 +5,7 @@
 
 This document describes the cache/tlb flushing interfaces called
 by the Linux VM subsystem. It enumerates over each interface,
-describes it's intended purpose, and what side effect is expected
+describes its intended purpose, and what side effect is expected
 after the interface is invoked.
 
 The side effects described below are stated for a uniprocessor
@@ -16,7 +16,7 @@ on all processors in the system. Don't let this scare you into
 thinking SMP cache/tlb flushing must be so inefficient, this is
 in fact an area where many optimizations are possible. For
 example, if it can be proven that a user address space has never executed
-on a cpu (see vma->cpu_vm_mask), one need not perform a flush
+on a cpu (see mm_cpumask()), one need not perform a flush
 for this address space on that cpu.
 
 First, the TLB flushing interfaces, since they are the simplest. The
@@ -57,7 +57,7 @@ changes occur:
 	interface must make sure that any previous page table
 	modifications for the address space 'vma->vm_mm' in the range
 	'start' to 'end-1' will be visible to the cpu. That is, after
-	running, here will be no entries in the TLB for 'mm' for
+	running, there will be no entries in the TLB for 'mm' for
 	virtual addresses in the range 'start' to 'end-1'.
 
 	The "vma" is the backing store being used for the region.
@@ -87,43 +87,20 @@ changes occur:
 
 	This is used primarily during fault processing.
 
-5) void flush_tlb_pgtables(struct mm_struct *mm,
-			   unsigned long start, unsigned long end)
-
-	The software page tables for address space 'mm' for virtual
-	addresses in the range 'start' to 'end-1' are being torn down.
-
-	Some platforms cache the lowest level of the software page tables
-	in a linear virtually mapped array, to make TLB miss processing
-	more efficient. On such platforms, since the TLB is caching the
-	software page table structure, it needs to be flushed when parts
-	of the software page table tree are unlinked/freed.
-
-	Sparc64 is one example of a platform which does this.
-
-	Usually, when munmap()'ing an area of user virtual address
-	space, the kernel leaves the page table parts around and just
-	marks the individual pte's as invalid. However, if very large
-	portions of the address space are unmapped, the kernel frees up
-	those portions of the software page tables to prevent potential
-	excessive kernel memory usage caused by erratic mmap/mmunmap
-	sequences. It is at these times that flush_tlb_pgtables will
-	be invoked.
-
-6) void update_mmu_cache(struct vm_area_struct *vma,
-			 unsigned long address, pte_t pte)
+5) void update_mmu_cache(struct vm_area_struct *vma,
+			 unsigned long address, pte_t *ptep)
 
 	At the end of every page fault, this routine is invoked to
 	tell the architecture specific code that a translation
-	described by "pte" now exists at virtual address "address"
-	for address space "vma->vm_mm", in the software page tables.
+	now exists at virtual address "address" for address space
+	"vma->vm_mm", in the software page tables.
 
 	A port may use this information in any way it so chooses.
 	For example, it could use this event to pre-load TLB
 	translations for software managed TLB configurations.
 	The sparc64 port currently does this.
 
-7) void tlb_migrate_finish(struct mm_struct *mm)
+6) void tlb_migrate_finish(struct mm_struct *mm)
 
 	This interface is called at the end of an explicit
 	process migration. This interface provides a hook
@@ -133,12 +110,6 @@ changes occur:
 	The ia64 sn2 platform is one example of a platform
 	that uses this interface.
 
-8) void lazy_mmu_prot_update(pte_t pte)
-	This interface is called whenever the protection on
-	any user PTEs change. This interface provides a notification
-	to architecture specific code to take appropriate action.
-
-
 Next, we have the cache flushing interfaces. In general, when Linux
 is changing an existing virtual-->physical mapping to a new value,
 the sequence will be in one of the following forms:
@@ -179,10 +150,21 @@ Here are the routines, one by one:
 	lines associated with 'mm'.
 
 	This interface is used to handle whole address space
-	page table operations such as what happens during
-	fork, exit, and exec.
+	page table operations such as what happens during exit and exec.
+
+2) void flush_cache_dup_mm(struct mm_struct *mm)
+
+	This interface flushes an entire user address space from
+	the caches. That is, after running, there will be no cache
+	lines associated with 'mm'.
+
+	This interface is used to handle whole address space
+	page table operations such as what happens during fork.
+
+	This option is separate from flush_cache_mm to allow some
+	optimizations for VIPT caches.
 
-2) void flush_cache_range(struct vm_area_struct *vma,
+3) void flush_cache_range(struct vm_area_struct *vma,
 			  unsigned long start, unsigned long end)
 
 	Here we are flushing a specific range of (user) virtual
@@ -199,7 +181,7 @@ Here are the routines, one by one:
 	call flush_cache_page (see below) for each entry which may be
 	modified.
 
-3) void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn)
+4) void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn)
 
 	This time we need to remove a PAGE_SIZE sized range
 	from the cache. The 'vma' is the backing structure used by
@@ -220,7 +202,7 @@ Here are the routines, one by one:
 
 	This is used primarily during fault processing.
 
-4) void flush_cache_kmaps(void)
+5) void flush_cache_kmaps(void)
 
 	This routine need only be implemented if the platform utilizes
 	highmem. It will be called right before all of the kmaps
@@ -232,7 +214,7 @@ Here are the routines, one by one:
 
 	This routing should be implemented in asm/highmem.h
 
-5) void flush_cache_vmap(unsigned long start, unsigned long end)
+6) void flush_cache_vmap(unsigned long start, unsigned long end)
    void flush_cache_vunmap(unsigned long start, unsigned long end)
 
 	Here in these two interfaces we are flushing a specific range
@@ -242,14 +224,14 @@ Here are the routines, one by one:
 
 	The first of these two routines is invoked after map_vm_area()
 	has installed the page table entries. The second is invoked
-	before unmap_vm_area() deletes the page table entries.
+	before unmap_kernel_range() deletes the page table entries.
 
 There exists another whole class of cpu cache issues which currently
 require a whole different set of interfaces to handle properly.
 The biggest problem is that of virtual aliasing in the data cache
 of a processor.
 
-Is your port susceptible to virtual aliasing in it's D-cache?
+Is your port susceptible to virtual aliasing in its D-cache?
 Well, if your D-cache is virtually indexed, is larger in size than
 PAGE_SIZE, and does not prevent multiple cache lines for the same
 physical address from existing at once, you have this problem.
@@ -267,7 +249,7 @@ one way to solve this (in particular SPARC_FLAG_MMAPSHARED).
 Next, you have to solve the D-cache aliasing issue for all other
 cases. Please keep in mind that fact that, for a given page mapped
 into some user address space, there is always at least one more
-mapping, that of the kernel in it's linear mapping starting at
+mapping, that of the kernel in its linear mapping starting at
 PAGE_OFFSET. So immediately, once the first user maps a given
 physical page into its address space, by implication the D-cache
 aliasing problem has the potential to exist since the kernel already
@@ -362,14 +344,15 @@ maps this page at its virtual address.
 	likely that you will need to flush the instruction cache
 	for copy_to_user_page().
 
-  void flush_anon_page(struct page *page, unsigned long vmaddr)
+  void flush_anon_page(struct vm_area_struct *vma, struct page *page,
+			unsigned long vmaddr)
 	When the kernel needs to access the contents of an anonymous
 	page, it calls this function (currently only
 	get_user_pages()). Note: flush_dcache_page() deliberately
 	doesn't work for an anonymous page. The default
 	implementation is a nop (and should remain so for all coherent
 	architectures). For incoherent architectures, it should flush
-	the cache of the page at vmaddr in the current user process.
+	the cache of the page at vmaddr.
 
   void flush_kernel_dcache_page(struct page *page)
 	When the kernel needs to modify a user page is has obtained
@@ -392,5 +375,29 @@ maps this page at its virtual address.
 
   void flush_icache_page(struct vm_area_struct *vma, struct page *page)
 	All the functionality of flush_icache_page can be implemented in
-	flush_dcache_page and update_mmu_cache. In 2.7 the hope is to
-	remove this interface completely.
+	flush_dcache_page and update_mmu_cache. In the future, the hope
+	is to remove this interface completely.
+
+The final category of APIs is for I/O to deliberately aliased address
+ranges inside the kernel. Such aliases are set up by use of the
+vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O
+subsystem assumes that the user mapping and kernel offset mapping are
+the only aliases. This isn't true for vmap aliases, so anything in
+the kernel trying to do I/O to vmap areas must manually manage
+coherency. It must do this by flushing the vmap range before doing
+I/O and invalidating it after the I/O returns.
+
+  void flush_kernel_vmap_range(void *vaddr, int size)
+	flushes the kernel cache for a given virtual address range in
+	the vmap area. This is to make sure that any data the kernel
+	modified in the vmap range is made visible to the physical
+	page. The design is to make this area safe to perform I/O on.
+	Note that this API does *not* also flush the offset map alias
+	of the area.
+
+  void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates
+	the cache for a given virtual address range in the vmap area
+	which prevents the processor from making the cache stale by
+	speculatively reading data while the I/O was occurring to the
+	physical pages. This is only necessary for data reads into the
+	vmap area.
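As a rough illustration of the vmap I/O rules added at the end of the
patch above (flush the vmap range before doing I/O, invalidate it after
the I/O returns), here is a minimal sketch of how a driver might order
the two new calls around a read into a vmap alias. This is not part of
the patch; do_read_io() and the pages array are hypothetical
placeholders, and error handling is elided.

	/* Sketch only: flush the alias before I/O, invalidate it after,
	 * as the documentation added by this patch requires. */
	#include <linux/highmem.h>
	#include <linux/vmalloc.h>

	static int read_into_vmap_alias(struct page **pages, int nr_pages)
	{
		int len = nr_pages * PAGE_SIZE;
		/* Deliberately aliased kernel mapping of the pages. */
		void *vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);

		if (!vaddr)
			return -ENOMEM;

		/* Make any data the kernel wrote through the alias visible
		 * to the physical pages before the device touches them. */
		flush_kernel_vmap_range(vaddr, len);

		do_read_io(pages, nr_pages);	/* hypothetical I/O helper */

		/* Drop lines the cpu may have speculatively pulled in
		 * through the alias while the I/O was in flight. */
		invalidate_kernel_vmap_range(vaddr, len);

		vunmap(vaddr);
		return 0;
	}

Likewise, the update_mmu_cache() change above only alters the prototype:
the translation is now reached through 'ptep' rather than passed by
value. A hypothetical port hook with the new signature might look like
the sketch below; tlb_preload() is a made-up stand-in for whatever TLB
preload primitive a software-managed-TLB port actually provides.

	/* Sketch of an architecture hook using the new prototype. */
	void update_mmu_cache(struct vm_area_struct *vma,
			      unsigned long address, pte_t *ptep)
	{
		pte_t pte = *ptep;	/* read the PTE through the pointer */

		if (pte_present(pte))
			tlb_preload(vma->vm_mm, address, pte);	/* hypothetical */
	}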
