Age | Commit message (Collapse) | Author |
|
commit f7b0e1055574ce06ab53391263b4e205bf38daf3 upstream.
With the current implementation, kstat_cpu(cpu).irqs_sum is also
increased in case of irq_mis_count increment.
So there is no need to count irq_mis_count in arch_irq_stat,
otherwise irq_mis_count will be counted twice in the sum of
/proc/stat.
Reported-by: Liu Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Li Fei <fei.li@intel.com>
Acked-by: Liu Chuansheng <chuansheng.liu@intel.com>
Cc: tomoki.sekiyama.qu@hitachi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1366980611.32469.7.camel@fli24-HP-Compaq-8100-Elite-CMT-PC
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 660696d1d16a71e15549ce1bf74953be1592bcd3 upstream.
Source operand for one byte mov[zs]x is decoded incorrectly if it is in
high byte register. Fix that.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 91cf54feecf815bec0b6a8d6d9dbd0e219f2f2cc upstream.
Fix regression introduced by commit 796211b7953 ("mmc: atmel-mci: add
pdc support and runtime capabilities detection") which removed the need
for CONFIG_MMC_ATMELMCI_DMA but kept the Kconfig-entry as well as the
compile guards around dma_release_channel() in remove(). Consequently,
DMA is always enabled (if supported), but the DMA-channel is not
released on module unload unless the DMA-config option is selected.
Remove the no longer used CONFIG_MMC_ATMELMCI_DMA option completely.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit de53e9caa4c6149ef4a78c2f83d7f5b655848767 upstream.
The Linux Kernel contains some inline assembly source code which has
wrong asm register constraints in arch/ia64/kvm/vtlb.c.
I observed this on Kernel 3.2.35 but it is also true on the most
recent Kernel 3.9-rc1.
File arch/ia64/kvm/vtlb.c:
u64 guest_vhpt_lookup(u64 iha, u64 *pte)
{
u64 ret;
struct thash_data *data;
data = __vtr_lookup(current_vcpu, iha, D_TLB);
if (data != NULL)
thash_vhpt_insert(current_vcpu, data->page_flags,
data->itir, iha, D_TLB);
asm volatile (
"rsm psr.ic|psr.i;;"
"srlz.d;;"
"ld8.s r9=[%1];;"
"tnat.nz p6,p7=r9;;"
"(p6) mov %0=1;"
"(p6) mov r9=r0;"
"(p7) extr.u r9=r9,0,53;;"
"(p7) mov %0=r0;"
"(p7) st8 [%2]=r9;;"
"ssm psr.ic;;"
"srlz.d;;"
"ssm psr.i;;"
"srlz.d;;"
: "=r"(ret) : "r"(iha), "r"(pte):"memory");
return ret;
}
The list of output registers is
: "=r"(ret) : "r"(iha), "r"(pte):"memory");
The constraint "=r" means that the GCC has to maintain that these vars
are in registers and contain valid info when the program flow leaves
the assembly block (output registers).
But "=r" also means that GCC can put them in registers that are used
as input registers. Input registers are iha, pte on the example.
If the predicate p7 is true, the 8th assembly instruction
"(p7) mov %0=r0;"
is the first one which writes to a register which is maintained by the
register constraints; it sets %0. %0 means the first register operand;
it is ret here.
This instruction might overwrite the %2 register (pte) which is needed
by the next instruction:
"(p7) st8 [%2]=r9;;"
Whether it really happens depends on how GCC decides what registers it
uses and how it optimizes the code.
The attached patch fixes the register operand constraints in
arch/ia64/kvm/vtlb.c.
The register constraints should be
: "=&r"(ret) : "r"(iha), "r"(pte):"memory");
The & means that GCC must not use any of the input registers to place
this output register in.
This is Debian bug#702639
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702639).
The patch is applicable on Kernel 3.9-rc1, 3.2.35 and many other versions.
Signed-off-by: Stephan Schreiber <info@fs-driver.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 136f39ddc53db3bcee2befbe323a56d4fbf06da8 upstream.
The Linux Kernel contains some inline assembly source code which has
wrong asm register constraints in arch/ia64/include/asm/futex.h.
I observed this on Kernel 3.2.23 but it is also true on the most
recent Kernel 3.9-rc1.
File arch/ia64/include/asm/futex.h:
static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
u32 oldval, u32 newval)
{
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
return -EFAULT;
{
register unsigned long r8 __asm ("r8");
unsigned long prev;
__asm__ __volatile__(
" mf;; \n"
" mov %0=r0 \n"
" mov ar.ccv=%4;; \n"
"[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n"
" .xdata4 \"__ex_table\", 1b-., 2f-. \n"
"[2:]"
: "=r" (r8), "=r" (prev)
: "r" (uaddr), "r" (newval),
"rO" ((long) (unsigned) oldval)
: "memory");
*uval = prev;
return r8;
}
}
The list of output registers is
: "=r" (r8), "=r" (prev)
The constraint "=r" means that the GCC has to maintain that these vars
are in registers and contain valid info when the program flow leaves
the assembly block (output registers).
But "=r" also means that GCC can put them in registers that are used
as input registers. Input registers are uaddr, newval, oldval on the
example.
The second assembly instruction
" mov %0=r0 \n"
is the first one which writes to a register; it sets %0 to 0. %0 means
the first register operand; it is r8 here. (The r0 is read-only and
always 0 on the Itanium; it can be used if an immediate zero value is
needed.)
This instruction might overwrite one of the other registers which are
still needed.
Whether it really happens depends on how GCC decides what registers it
uses and how it optimizes the code.
The objdump utility can give us disassembly.
The futex_atomic_cmpxchg_inatomic() function is inline, so we have to
look for a module that uses the funtion. This is the
cmpxchg_futex_value_locked() function in
kernel/futex.c:
static int cmpxchg_futex_value_locked(u32 *curval, u32 __user *uaddr,
u32 uval, u32 newval)
{
int ret;
pagefault_disable();
ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
pagefault_enable();
return ret;
}
Now the disassembly. At first from the Kernel package 3.2.23 which has
been compiled with GCC 4.4, remeber this Kernel seemed to work:
objdump -d linux-3.2.23/debian/build/build_ia64_none_mckinley/kernel/futex.o
0000000000000230 <cmpxchg_futex_value_locked>:
230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;;
236: 80 40 0d 00 42 00 adds r8=40,r3
23c: 00 00 04 00 nop.i 0x0;;
240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];;
246: 90 08 28 00 42 00 adds r9=1,r10
24c: 00 00 04 00 nop.i 0x0;;
250: 09 00 00 00 01 00 [MMI] nop.m 0x0
256: 00 48 20 20 23 00 st4 [r8]=r9
25c: 00 00 04 00 nop.i 0x0;;
260: 08 10 80 06 00 21 [MMI] adds r2=32,r3
266: 00 00 00 02 00 00 nop.m 0x0
26c: 02 08 f1 52 extr.u r16=r33,0,61
270: 05 40 88 00 08 e0 [MLX] addp4 r8=r34,r0
276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;;
27c: f1 f7 ff 65
280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2]
286: 00 00 00 02 00 c0 nop.m 0x0
28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;;
290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14
296: 00 00 00 02 00 40 nop.m 0x0
29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33
2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0
<cmpxchg_futex_value_locked+0x80>
2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
2b0: 0a 00 00 00 22 00 [MMI] mf;;
2b6: 80 00 00 00 42 00 mov r8=r0
2bc: 00 00 04 00 nop.i 0x0
2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;;
2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv
2cc: 00 00 04 00 nop.i 0x0;;
2d0: 10 00 84 40 90 11 [MIB] st4 [r32]=r33
2d6: 00 00 00 02 00 00 nop.i 0x0
2dc: 20 00 00 40 br.few 2f0
<cmpxchg_futex_value_locked+0xc0>
2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14
2e6: 00 00 00 02 00 00 nop.m 0x0
2ec: 00 00 04 00 nop.i 0x0;;
2f0: 0b 58 20 1a 19 21 [MMI] adds r11=3208,r13;;
2f6: 20 01 2c 20 20 00 ld4 r18=[r11]
2fc: 00 00 04 00 nop.i 0x0;;
300: 0b 88 fc 25 3f 23 [MMI] adds r17=-1,r18;;
306: 00 88 2c 20 23 00 st4 [r11]=r17
30c: 00 00 04 00 nop.i 0x0;;
310: 11 00 00 00 01 00 [MIB] nop.m 0x0
316: 00 00 00 02 00 80 nop.i 0x0
31c: 08 00 84 00 br.ret.sptk.many b0;;
The lines
2b0: 0a 00 00 00 22 00 [MMI] mf;;
2b6: 80 00 00 00 42 00 mov r8=r0
2bc: 00 00 04 00 nop.i 0x0
2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;;
2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv
2cc: 00 00 04 00 nop.i 0x0;;
are the instructions of the assembly block.
The line
2b6: 80 00 00 00 42 00 mov r8=r0
sets the r8 register to 0 and after that
2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;;
prepares the 'oldvalue' for the cmpxchg but it takes it from r8. This
is wrong.
What happened here is what I explained above: An input register is
overwritten which is still needed.
The register operand constraints in futex.h are wrong.
(The problem doesn't occur when the Kernel is compiled with GCC 4.6.)
The attached patch fixes the register operand constraints in futex.h.
The code after patching of it:
static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
u32 oldval, u32 newval)
{
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
return -EFAULT;
{
register unsigned long r8 __asm ("r8") = 0;
unsigned long prev;
__asm__ __volatile__(
" mf;; \n"
" mov ar.ccv=%4;; \n"
"[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n"
" .xdata4 \"__ex_table\", 1b-., 2f-. \n"
"[2:]"
: "+r" (r8), "=&r" (prev)
: "r" (uaddr), "r" (newval),
"rO" ((long) (unsigned) oldval)
: "memory");
*uval = prev;
return r8;
}
}
I also initialized the 'r8' var with the C programming language.
The _asm qualifier on the definition of the 'r8' var forces GCC to use
the r8 processor register for it.
I don't believe that we should use inline assembly for zeroing out a
local variable.
The constraint is
"+r" (r8)
what means that it is both an input register and an output register.
Note that the page fault handler will modify the r8 register which
will be the return value of the function.
The real fix is
"=&r" (prev)
The & means that GCC must not use any of the input registers to place
this output register in.
Patched the Kernel 3.2.23 and compiled it with GCC4.4:
0000000000000230 <cmpxchg_futex_value_locked>:
230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;;
236: 80 40 0d 00 42 00 adds r8=40,r3
23c: 00 00 04 00 nop.i 0x0;;
240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];;
246: 90 08 28 00 42 00 adds r9=1,r10
24c: 00 00 04 00 nop.i 0x0;;
250: 09 00 00 00 01 00 [MMI] nop.m 0x0
256: 00 48 20 20 23 00 st4 [r8]=r9
25c: 00 00 04 00 nop.i 0x0;;
260: 08 10 80 06 00 21 [MMI] adds r2=32,r3
266: 20 12 01 10 40 00 addp4 r34=r34,r0
26c: 02 08 f1 52 extr.u r16=r33,0,61
270: 05 40 00 00 00 e1 [MLX] mov r8=r0
276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;;
27c: f1 f7 ff 65
280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2]
286: 00 00 00 02 00 c0 nop.m 0x0
28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;;
290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14
296: 00 00 00 02 00 40 nop.m 0x0
29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33
2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0
<cmpxchg_futex_value_locked+0x80>
2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0
<cmpxchg_futex_value_locked+0xb0>
2b0: 0b 00 00 00 22 00 [MMI] mf;;
2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34
2bc: 00 00 04 00 nop.i 0x0;;
2c0: 09 58 8c 42 11 10 [MMI] cmpxchg4.acq r11=[r33],r35,ar.ccv
2c6: 00 00 00 02 00 00 nop.m 0x0
2cc: 00 00 04 00 nop.i 0x0;;
2d0: 10 00 2c 40 90 11 [MIB] st4 [r32]=r11
2d6: 00 00 00 02 00 00 nop.i 0x0
2dc: 20 00 00 40 br.few 2f0
<cmpxchg_futex_value_locked+0xc0>
2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14
2e6: 00 00 00 02 00 00 nop.m 0x0
2ec: 00 00 04 00 nop.i 0x0;;
2f0: 0b 88 20 1a 19 21 [MMI] adds r17=3208,r13;;
2f6: 30 01 44 20 20 00 ld4 r19=[r17]
2fc: 00 00 04 00 nop.i 0x0;;
300: 0b 90 fc 27 3f 23 [MMI] adds r18=-1,r19;;
306: 00 90 44 20 23 00 st4 [r17]=r18
30c: 00 00 04 00 nop.i 0x0;;
310: 11 00 00 00 01 00 [MIB] nop.m 0x0
316: 00 00 00 02 00 80 nop.i 0x0
31c: 08 00 84 00 br.ret.sptk.many b0;;
Much better.
There is a
270: 05 40 00 00 00 e1 [MLX] mov r8=r0
which was generated by C code r8 = 0. Below
2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34
what means that oldval is no longer overwritten.
This is Debian bug#702641
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702641).
The patch is applicable on Kernel 3.9-rc1, 3.2.23 and many other versions.
Signed-off-by: Stephan Schreiber <info@fs-driver.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d303e9e98fce56cdb3c6f2ac92f626fc2bd51c77 upstream.
Back 2010 during a revamp of the irq code some initializations
were moved from ia64_mca_init() to ia64_mca_late_init() in
commit c75f2aa13f5b268aba369b5dc566088b5194377c
Cannot use register_percpu_irq() from ia64_mca_init()
But this was hideously wrong. First of all these initializations
are now down far too late. Specifically after all the other cpus
have been brought up and initialized their own CMC vectors from
smp_callin(). Also ia64_mca_late_init() may be called from any cpu
so the line:
ia64_mca_cmc_vector_setup(); /* Setup vector on BSP */
is generally not executed on the BSP, and so the CMC vector isn't
setup at all on that processor.
Make use of the arch_early_irq_init() hook to get this code executed
at just the right moment: not too early, not too late.
Reported-by: Fred Hartnett <fred.hartnett@hp.com>
Tested-by: Fred Hartnett <fred.hartnett@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 57ae1b0532977b30184aaba04b6cafe0a284c21f upstream.
Occurs when CONFIG_CRYPTO_CRC32C_INTEL=y and CONFIG_CRYPTO_CRC32C_INTEL=y.
Older versions of bintuils do not support the pclmulqdq instruction. The
PCLMULQDQ gas macro is used instead.
Signed-off-by: Sandy Wu <sandyw@twitter.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 104ad3b32d7a71941c8ab2dee78eea38e8a23309 upstream.
ARM processors with LPAE enabled use 3 levels of page tables, with an
entry in the top level (pgd) covering 1GB of virtual space. Because of
the branch relocation limitations on ARM, the loadable modules are
mapped 16MB below PAGE_OFFSET, making the corresponding 1GB pgd shared
between kernel modules and user space.
If free_pgtables() is called with the default ceiling 0,
free_pgd_range() (and subsequently called functions) also frees the page
table shared between user space and kernel modules (which is normally
handled by the ARM-specific pgd_free() function). This patch changes
defines the ARM USER_PGTABLES_CEILING to TASK_SIZE when CONFIG_ARM_LPAE
is enabled.
Note that the pgd_free() function already checks the presence of the
shared pmd page allocated by pgd_alloc() and frees it, though with
ceiling 0 this wasn't necessary.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
online/offline
commit 66ff0fe9e7bda8aec99985b24daad03652f7304e upstream.
While we don't use the spinlock interrupt line (see for details
commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 -
xen: disable PV spinlocks on HVM) - we should still do the proper
init / deinit sequence. We did not do that correctly and for the
CPU init for PVHVM guest we would allocate an interrupt line - but
failed to deallocate the old interrupt line.
This resulted in leakage of an irq_desc but more importantly this splat
as we online an offlined CPU:
genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
[<ffffffff811156de>] __setup_irq+0x23e/0x4a0
[<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
[<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff810cad35>] ? update_max_interval+0x15/0x40
[<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
[<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
[<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
[<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff8109402b>] __cpu_notify+0x1b/0x30
[<ffffffff8166834a>] _cpu_up+0xa0/0x14b
[<ffffffff816684ce>] cpu_up+0xd9/0xec
[<ffffffff8165f754>] store_online+0x94/0xd0
[<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
[<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
[<ffffffff811a2864>] vfs_write+0xb4/0x130
[<ffffffff811a302a>] sys_write+0x5a/0xa0
[<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2
And if one looks at the /proc/interrupts right after
offlining (CPU1):
70: 0 0 xen-percpu-ipi spinlock0
71: 0 0 xen-percpu-ipi spinlock1
77: 0 0 xen-percpu-ipi spinlock2
There is the oddity of the 'spinlock1' still being present.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 888b65b4bc5e7fcbbb967023300cd5d44dba1950 upstream.
In the PVHVM path when we do CPU online/offline path we would
leak the timer%d IRQ line everytime we do a offline event. The
online path (xen_hvm_setup_cpu_clockevents via
x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
line for the timer%d.
But we would still use the old interrupt line leading to:
kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
.. snip..
<IRQ>
[<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
[<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
[<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
[<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
[<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
[<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
[<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
<EOI>
[<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
[<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
There is also the oddity (timer1) in the /proc/interrupts after
offlining CPU1:
64: 1121 0 xen-percpu-virq timer0
78: 0 0 xen-percpu-virq timer1
84: 0 2483 xen-percpu-virq timer2
This patch fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7918c92ae9638eb8a6ec18e2b4a0de84557cccc8 upstream.
When we online the CPU, we get this splat:
smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
[<ffffffff810c1fea>] __might_sleep+0xda/0x100
[<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff813036eb>] kvasprintf+0x5b/0x90
[<ffffffff81303758>] kasprintf+0x38/0x40
[<ffffffff81044510>] xen_setup_timer+0x30/0xb0
[<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
[<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
The solution to that is use kasprintf in the CPU hotplug path
that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
and remove the call to in xen_hvm_setup_cpu_clockevents.
Unfortunatly the later is not a good idea as the bootup path
does not use xen_hvm_cpu_notify so we would end up never allocating
timer%d interrupt lines when booting. As such add the check for
atomic() to continue.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6747e83235caecd30b186d1282e4eba7679f81b7 upstream.
In commit 85fe402 (fs: do not assign default i_ino in new_inode), the
initialisation of i_ino was removed from new_inode() and pushed down
into the callers. However spufs_new_inode() was not updated.
This exhibits as no files appearing in /spu, because all our dirents
have a zero inode, which readdir() seems to dislike.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8c2a381734fc9718f127f4aba958e8a7958d4028 upstream.
In __restore_cpu_power8 we determine if we are HV and if not, we return
before setting HV only resources.
Unfortunately we forgot to restore the link register from r11 before
returning.
This will happen on boot and with secondary CPUs not coming online.
This adds the missing link register restore.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3e96ca7f007ddb06b82a74a68585d1dbafa85ff1 upstream.
POWER8 allows us to take interrupts with the MMU on. This gives us a
second set of vectors offset at 0x4000.
Unfortunately when coping these vectors we missed checking for MSR HV
for hardware interrupts (0x500). This results in us trying to use
HSRR0/1 when HV=0, rather than SRR0/1 on HW IRQs
The below fixes this to check CPU_FTR_HVMODE when patching the code at
0x4500.
Also we remove the check for CPU_FTR_ARCH_206 since relocation on IRQs
are only available in arch 2.07 and beyond.
Thanks to benh for helping find this.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 29ce3c5073057991217916abc25628e906911757 upstream.
In __after_prom_start we copy the kernel down to zero in two calls to
copy_and_flush. After the first call (copy from 0 to copy_to_here:)
we jump to the newly copied code soon after.
Unfortunately there's no isync between the copy of this code and the
jump to it. Hence it's possible that stale instructions could still be
in the icache or pipeline before we branch to it.
We've seen this on real machines and it's results in no console output
after:
calling quiesce...
returning from prom_init
The below adds an isync to ensure that the copy and flushing has
completed before any branching to the new instructions occurs.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2a5a461f179509142c661d79f878855798b85201 upstream.
- unneeded whitespace
- missing double quote
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 88fcb59a06556bf10eac97d7abb913cccea2c830 upstream.
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e7619459d47a673af3433208a42f583af920e9db upstream.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b090e5f68c0353534880b95ea0df56b8c0230b8c upstream.
Remove the malformed "mem=" bootargs parameter in at91sam9x5ek.dtsi
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f10491fff07dcced77f8ab1b3bc1f8e18715bfb9 upstream.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
[nicolas.ferre@atmel.com: fix rts/cts for usart3]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0259d9eb30d003af305626db2d8332805696e60d upstream.
The UART1 is on the fast AHB bridge, not on the slow bus.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0d97558901c446a989de202a5d9ae94ec53644e5 upstream.
The TIME_VALID flag is specified for the different states but
the time residency computation is not done, no tk flag, no time
computation in the idle function.
Set the en_core_tk_irqen flag to activate it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f5d6a1441a5045824f36ff7c6b6bbae0373472a6 upstream.
Currently IOP3XX_PERIPHERAL_VIRT_BASE conflicts with PCI_IO_VIRT_BASE:
address size
PCI_IO_VIRT_BASE 0xfee00000 0x200000
IOP3XX_PERIPHERAL_VIRT_BASE 0xfeffe000 0x2000
Fix by moving IOP3XX_PERIPHERAL_VIRT_BASE below PCI_IO_VIRT_BASE.
The patch fixes the following kernel panic with 3.9-rc1 on iop3xx boards:
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.9.0-rc1-iop32x (aaro@blackmetal) (gcc version 4.7.2 (GCC) ) #20 PREEMPT Tue Mar 5 16:44:36 EET 2013
[ 0.000000] bootconsole [earlycon0] enabled
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at mm/vmalloc.c:1145!
[ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 Not tainted (3.9.0-rc1-iop32x #20)
[ 0.000000] PC is at vm_area_add_early+0x4c/0x88
[ 0.000000] LR is at add_static_vm_early+0x14/0x68
[ 0.000000] pc : [<c03e74a8>] lr : [<c03e1c40>] psr: 800000d3
[ 0.000000] sp : c03ffee4 ip : dfffdf88 fp : c03ffef4
[ 0.000000] r10: 00000002 r9 : 000000cf r8 : 00000653
[ 0.000000] r7 : c040eca8 r6 : c03e2408 r5 : dfffdf60 r4 : 00200000
[ 0.000000] r3 : dfffdfd8 r2 : feffe000 r1 : ff000000 r0 : dfffdf60
[ 0.000000] Flags: Nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel
[ 0.000000] Control: 0000397f Table: a0004000 DAC: 00000017
[ 0.000000] Process swapper (pid: 0, stack limit = 0xc03fe1b8)
[ 0.000000] Stack: (0xc03ffee4 to 0xc0400000)
[ 0.000000] fee0: 00200000 c03fff0c c03ffef8 c03e1c40 c03e7468 00200000 fee00000
[ 0.000000] ff00: c03fff2c c03fff10 c03e23e4 c03e1c38 feffe000 c0408ee4 ff000000 c0408f04
[ 0.000000] ff20: c03fff3c c03fff30 c03e2434 c03e23b4 c03fff84 c03fff40 c03e2c94 c03e2414
[ 0.000000] ff40: c03f8878 c03f6410 ffff0000 000bffff 00001000 00000008 c03fff84 c03f6410
[ 0.000000] ff60: c04227e8 c03fffd4 a0008000 c03f8878 69052e30 c02f96eb c03fffbc c03fff88
[ 0.000000] ff80: c03e044c c03e268c 00000000 0000397f c0385130 00000001 ffffffff c03f8874
[ 0.000000] ffa0: dfffffff a0004000 69052e30 a03f61a0 c03ffff4 c03fffc0 c03dd5cc c03e0184
[ 0.000000] ffc0: 00000000 00000000 00000000 00000000 00000000 c03f8878 0000397d c040601c
[ 0.000000] ffe0: c03f8874 c0408674 00000000 c03ffff8 a0008040 c03dd558 00000000 00000000
[ 0.000000] Backtrace:
[ 0.000000] [<c03e745c>] (vm_area_add_early+0x0/0x88) from [<c03e1c40>] (add_static_vm_early+0x14/0x68)
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cea15092f098b7018e89f64a5a14bb71955965d5 upstream.
cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns()
is called by cyc_to_sched_clock(). I suspect that some compilers
inline both of these functions into sched_clock() and so we've
been getting away without having a notrace marking. It seems that
my compiler isn't inlining cyc_to_sched_clock() though, so I'm
hitting a recursion bug when I enable the function graph tracer,
causing my system to crash. Marking these functions notrace fixes
it. Technically cyc_to_ns() doesn't need the notrace because it's
already marked inline, but let's just add it so that if we ever
remove inline from that function it doesn't blow up.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Commits f36391d2790d04993f48da6a45810033a2cdf847 and
f0af97070acbad5d6a361f485828223a4faaa0ee upstream. ]
As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.
So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.
Fix this by using generic infrastructure to synchonize on the
completion of the cross call.
This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().
We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls. If we're not in such a
region, we flush TLBs synchronously.
1) Get rid of xcall_flush_tlb_pending and per-cpu type
implementations.
2) Do TLB batch cross calls instead via:
smp_call_function_many()
tlb_pending_func()
__flush_tlb_pending()
3) Batch only in lazy mmu sequences:
a) Add 'active' member to struct tlb_batch
b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
c) Set 'active' in arch_enter_lazy_mmu_mode()
d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
e) Check 'active' in tlb_batch_add_one() and do a synchronous
flush if it's clear.
4) Add infrastructure for synchronous TLB page flushes.
a) Implement __flush_tlb_page and per-cpu variants, patch
as needed.
b) Likewise for xcall_flush_tlb_page.
c) Implement smp_flush_tlb_page() to invoke the cross-call.
d) Wire up global_flush_tlb_page() to the right routine based
upon CONFIG_SMP
5) It turns out that singleton batches are very common, 2 out of every
3 batch flushes have only a single entry in them.
The batch flush waiting is very expensive, both because of the poll
on sibling cpu completeion, as well as because passing the tlb batch
pointer to the sibling cpus invokes a shared memory dereference.
Therefore, in flush_tlb_pending(), if there is only one entry in
the batch perform a completely asynchronous global_flush_tlb_page()
instead.
Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3b5e50edaf500f392f4a372296afc0b99ffa7e70 upstream.
This reverts commit c17a6554782ad531f4713b33fd6339ba67ef6391.
Manuel Lauss writes:
lmo commit c17a6554 (MIPS: page.h: Provide more readable definition for
PAGE_MASK) apparently breaks ioremap of 36-bit addresses on my Alchemy
systems (PCI and PCMCIA) The reason is that in arch/mips/mm/ioremap.c
line 157 (phys_addr &= PAGE_MASK) bits 32-35 are cut off. Seems the
new PAGE_MASK is explicitly 32bit, or one could make it signed instead
of unsigned long.
From: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4f2e29031e6c67802e7370292dd050fd62f337ee upstream.
Commit b4cbb197c7e7 ("vm: add vm_iomap_memory() helper function") added
a helper function wrapper around io_remap_pfn_range(), and every other
architecture defined it in <asm/pgtable.h>.
The s390 choice of <asm/io.h> may make sense, but is not very convenient
for this case, and gratuitous differences like that cause unexpected errors like this:
mm/memory.c: In function 'vm_iomap_memory':
mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]
Glory be the kbuild test robot who noticed this, bisected it, and
reported it to the guilty parties (ie me).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f1923820c447e986a9da0fc6bf60c1dccdf0408e upstream.
The valid mask for both offcore_response_0 and
offcore_response_1 was wrong for SNB/SNB-EP,
IVB/IVB-EP. It was possible to write to
reserved bit and cause a GP fault crashing
the kernel.
This patch fixes the problem by correctly marking the
reserved bits in the valid mask for all the processors
mentioned above.
A distinction between desktop and server parts is introduced
because bits 24-30 are only available on the server parts.
This version of the patch is just a rebase to perf/urgent tree
and should apply to older kernels as well.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: ak@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cb2d8b342aa084d1f3ac29966245dec9163677fb upstream.
Events may be created with attr->disabled == 1 and attr->enable_on_exec
== 1, which confuses the group validation code because events with the
PERF_EVENT_STATE_OFF are not considered candidates for scheduling, which
may lead to failure at group scheduling time.
This patch fixes the validation check for ARM, so that events in the
OFF state are still considered when enable_on_exec is true.
Reported-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cd272d1ea71583170e95dde02c76166c7f9017e6 upstream.
On Feroceon the L2 cache becomes non-coherent with the CPU
when the L1 caches are disabled. Thus the L2 needs to be invalidated
after both L1 caches are disabled.
On kexec before the starting the code for relocation the kernel,
the L1 caches are disabled in cpu_froc_fin (cpu_v7_proc_fin for Feroceon),
but after L2 cache is never invalidated, because inv_all is not set
in cache-feroceon-l2.c.
So kernel relocation and decompression may has (and usually has) errors.
Setting the function enables L2 invalidation and fixes the issue.
Signed-off-by: Illia Ragozin <illia.ragozin@grapecom.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cab1e0a36c9dd0b0671fb84197ed294513f5adc1 upstream.
This patch enables iomuxc_gate clock. It is necessary to be able to
reconfigure iomux pads. Without this clock enabled, the
clk_disable_unused function will disable this clock and the iomux pads
are not configurable anymore. This happens at every boot. After a reboot
(watchdog system reset) the clock is not enabled again, so all iomux pad
reconfigurations in boot code are without effect.
The iomux pads should be always configurable, so this patch always
enables it.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5dc2eb7da1e387e31ce54f54af580c6a6f512ca6 upstream.
The i.MX35 has two bits per clock gate which are decoded as follows:
0b00 -> clock off
0b01 -> clock is on in run mode, off in wait/doze
0b10 -> clock is on in run/wait mode, off in doze
0b11 -> clock is always on
The reset value for the MAX clock is 0b10.
The MAX clock is needed by the SoC, yet unused in the Kernel, so the
common clock framework will disable it during late init time. It will
only disable clocks though which it detects as being turned on. This
detection is made depending on the lower bit of the gate. If the reset
value has been altered by the bootloader to 0b11 the clock framework
will detect the clock as turned on, yet unused, hence it will turn it
off and the system locks up.
This patch turns the MAX clock on unconditionally making the Kernel
independent of the bootloader.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8f964525a121f2ff2df948dac908dcc65be21b5b upstream.
This patch adds support for kvm_gfn_to_hva_cache_init functions for
reads and writes that will cross a page. If the range falls within
the same memslot, then this will be a fast operation. If the range
is split between two memslots, then the slower kvm_read_guest and
kvm_write_guest are used.
Tested: Test against kvm_clock unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
(CVE-2013-1797)
commit 0b79459b482e85cb7426aa7da683a9f2c97aeae1 upstream.
There is a potential use after free issue with the handling of
MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable
memory such as frame buffers then KVM might continue to write to that
address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins
the page in memory so it's unlikely to cause an issue, but if the user
space component re-purposes the memory previously used for the guest, then
the guest will be able to corrupt that memory.
Tested: Tested against kvmclock unit test
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
(CVE-2013-1796)
commit c300aa64ddf57d9c5d9c898a64b36877345dd4a9 upstream.
If the guest sets the GPA of the time_page so that the request to update the
time straddles a page then KVM will write onto an incorrect page. The
write is done byusing kmap atomic to get a pointer to the page for the time
structure and then performing a memcpy to that page starting at an offset
that the guest controls. Well behaved guests always provide a 32-byte aligned
address, however a malicious guest could use this to corrupt host kernel
memory.
Tested: Tested against kvmclock unit test.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b6c7aabd923a17af993c5a5d5d7995f0b27c000a upstream.
Let's do the changes properly and fix the same problem everywhere, not
just for one case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c5e6cb051c5f7d56f05bd6a4af22cb300a4ced79 upstream.
The existing check handles the case where we've migrated to a different
core than we last ran on, but it doesn't handle the case where we're
still on the same cpu we last ran on, but some other vcpu has run on
this cpu in the meantime.
Without this, guest segfaults (and other misbehavior) have been seen in
smp guests.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d8b92292408831d86ff7b781e66bf79301934b99 upstream.
A label 0 was missed in the patch a9c4e541 (powerpc/kprobe: Complete
kprobe and migrate exception frame). This will cause the kernel
branch to an undetermined address if there really has a conflict when
updating the thread flags.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-By: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 511ba86e1d386f671084b5d0e6f110bb30b8eeb2 upstream.
Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.
Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.
[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
updates" may cause a minor performance regression on
bare metal. This patch resolves that performance regression. It is
somewhat unclear to me if this is a good -stable candidate. ]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com
Tested-by: Josh Boyer <jwboyer@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream.
In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.
One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:
- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries
The OOPs usually looks as so:
------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.
RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <jwboyer@redhat.com>
Reported-and-Tested-by: Krishna Raman <kraman@redhat.com>
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin < |