Age | Commit message (Collapse) | Author |
|
commit 2999b58b795ad81f10e34bdbbfd2742172f247e4 upstream.
When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead the word 83; it also checks the ATA/PI
version support in the word 80 (which the CompactFlash specifications have as
reserved), this having no slightest chance to work on the modern CF cards that
don't have 0x848A in the word 0...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
[ Upstream commit 9c5ff5f75d0d0a1c7928ecfae3f38418b51a88e3 ]
crc32c algorithm provides a byteswaped result. On little-endian
arches, the result ends up in big-endian/network byte order.
On big-endinan arches, the result ends up in little-endian
order and needs to be byte swapped again. Thus calling cpu_to_le32
gives the right output.
Tested-by: Jukka Taimisto <jukka.taimisto@mail.suomi.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 6c5979631b4b03c9288776562c18036765e398c1 upstream.
With the new system call defines we get this on uml:
arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'
Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask. However sys_sigprocmask is missing
because of this.
To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through severel levels.
This was pointed out by Al Viro.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 93f78da405685a756beeaeae4b5e41fcec39eab3 upstream.
This reverts commit c9e587abfdec2c2aaa55fab83bcb4972e2f84f9b, and the
subsequent commits that fixed it up:
- afa9b649 "fbcon: prevent cursor disappearance after switching to 512
character font"
- d850a2fa "vt/fbcon: fix background color on line feed"
- 7fe3915a "vt/fbcon: update scrl_erase_char after 256/512-glyph font
switch"
by request of Alan Cox. Quoth Alan:
"Unfortunately it's wrong and its been causing breakages because
various apps like ncurses expect our previous (and correct)
behaviour."
Alexander sent out a similar patch.
Requested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Tested-by: Jan Engelhardt <jengelh@medozas.de>
Cc: Alexander V. Lukyanov <lav@netis.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Dual 16950 Serial adapter
commit 39aced68d664291db3324d0fcf0985ab5626aac2 upstream.
The PCI-card identified as "Oxford Semiconductor Ltd EXSYS EX-41092 Dual
16950 Serial adapter" is only usable with other devices (i.e. not the same
card) after doing a "setserial /dev/ttyS<n> baud_base 115200". This
baud_base should be default for this card.
Signed-off-by: Niels de Vos <niels.devos@wincor-nixdorf.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 97c44836cdec1ea713a15d84098a1a908157e68f upstream.
This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.
The patch also emits a warnings if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.
Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 777c6c5f1f6e757ae49ecca2ed72d6b1f523c007 upstream.
With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.
Interruptible waiters don't do that when aborting due to a signal. And if
an aborting waiter is concurrently woken up through the waitqueue, noone
will ever wake up the next waiter.
This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently. The aborted contender
didn't acquire the lock and therefor never did an unlock followed by
waking up the next waiter.
Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock. Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.
Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.
[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 7f9a50a5b89b87f8e754f59ae9968da28be618a5 upstream.
Impact: fix spurious BUG_ON() triggered under load
module_refcount() isn't reliable outside stop_machine(), as demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
can give false results, for example).
Almost noone should be using __module_get, but that's another issue.
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d96f94c604453f87fe24154b87e1e9a3a72511f8 upstream.
Bit 11 in intel PDC definitions is meant for OS capability to handle
hardware coordination of P-states. In Linux we have always supported
hwardware coordination of P-states. Just let the BIOSes know that we
support it, by setting this bit.
Some BIOSes use this bit to choose between hardware or software coordination
and without this change below, BIOSes switch to software coordination, which
is not very optimal in terms of power consumption and extra wakeups from idle.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 9db4fcd99f7ef886ded97cd26a8642c70fbe34df upstream.
The "minimal" descriptors such as EndTag are calculated as 12
bytes long, but the actual length in the internal descriptor is
16 because of the round-up to 8 on 64-bit build.
http://www.acpica.org/bugzilla/show_bug.cgi?id=728
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 27663c5855b10af9ec67bc7dfba001426ba21222 upstream
As of version 2.0, ACPI can return 64-bit integers. The current
acpi_evaluate_integer only supports 64-bit integers on 64-bit platforms.
Change the argument to take a pointer to an acpi_integer so we support
64-bit integers on all platforms.
lenb: replaced use of "acpi_integer" with "unsigned long long"
lenb: fixed bug in acpi_thermal_trips_update()
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 57064d213d2e44654d4f13c66df135b5e7389a26 upstream.
This patch adds the Intel Tigerpoint LPC Controller DeviceIDs.
Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 9b22ea560957de1484e6b3e8538f7eef202e3596 upstream.
The changes to deliver hardware accelerated VLAN packets to packet
sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
The __vlan_hwaccel_rx() function is called directly from the drivers
RX function, for non-NAPI drivers that means its still in RX IRQ
context:
[ 27.779463] ------------[ cut here ]------------
[ 27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
...
[ 27.782520] [<c0264755>] netif_nit_deliver+0x5b/0x75
[ 27.782590] [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
[ 27.782664] [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
[ 27.782738] [<c0155b17>] handle_IRQ_event+0x23/0x51
[ 27.782808] [<c015692e>] handle_edge_irq+0xc2/0x102
[ 27.782878] [<c0105fd5>] do_IRQ+0x4d/0x64
Split hardware accelerated VLAN reception into two parts to fix this:
- __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
device lookup, then calls netif_receive_skb()/netif_rx()
- vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
in softirq context, performs the real reception and delivery to
packet sockets.
Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit a229fc61ef0ee3c30fd193beee0eeb87410227f1 upstream.
bsg.h in current form is perfectly suitable for user-mode
consumption. It is needed together with scsi/sg.h for applications
that want to interface with the bsg driver.
Currently the few projects that use it would copy it over into
the projects. But that is not acceptable for projects that need
to provide source and devel packages for distros.
This should also be submitted to stable 2.6.28 and 2.6.27 since bsg had
a stable API since these Kernels and distro users will need the header
for these kernels a swell
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 9df04e1f25effde823a600e755b51475d438f56b upstream.
Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances. That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.
Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:
LOMEM MAX_WATCHES (per user)
512MB ~178000
1GB ~356000
2GB ~712000
A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us. No more max_user_instances
limits then.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Bron Gondwana <brong@fastmail.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit e65f0f8271b1b0452334e5da37fd35413a000de4 upstream.
Add support for Sealevel Systems Model 7803 COMM+8
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit e4d866cdea24543ee16ce6d07d80c513e86ba983 upstream.
It supports VX855 and future chips whose IDE controller uses PCI ID 0x0571.
Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b94b898f3107046b5c97c556e23529283ea5eadd upstream.
On Vortex86SX with IDE controller revision 0x11 ultra DMA must be
disabled. This patch was tested by DMP and seems to work.
It is a cleaned up version of their older Kernel patch:
http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz
Tested-by: Shawn Lin <shawn@dmp.com.tw>
Signed-off-by: Brandon Philips <bphilips@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 42ef73fe134732b2e91c0326df5fd568da17c4b2 upstream.
On -rt we were seeing spurious bad page states like:
Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a
This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.
This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.
union {
struct {
unsigned long private;
struct address_space *mapping;
};
spinlock_t ptl;
struct kmem_cache *slab;
struct page *first_page;
};
Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.
But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.
The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.
It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.
Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 856bf4d717feb8c55d4e2f817b71ebb70cfbc67b upstream.
s_syncing livelock avoidance was breaking data integrity guarantee of
sys_sync, by allowing sys_sync to skip writing or waiting for superblocks
if there is a concurrent sys_sync happening.
This livelock avoidance is much less important now that we don't have the
get_super_to_sync() call after every sb that we sync. This was replaced
by __put_super_and_need_restart.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4f5a99d64c17470a784a6c68064207d82e3e74a5 upstream.
Remove WB_SYNC_HOLD. The primary motiviation is the design of my
anti-starvation code for fsync. It requires taking an inode lock over the
sync operation, so we could run into lock ordering problems with multiple
inodes. It is possible to take a single global lock to solve the ordering
problem, but then that would prevent a future nice implementation of "sync
multiple inodes" based on lock order via inode address.
Seems like a backward step to remove this, but actually it is busted
anyway: we can't use the inode lists for data integrity wait: an inode can
be taken off the dirty lists but still be under writeback. In order to
satisfy data integrity semantics, we should wait for it to finish
writeback, but if we only search the dirty lists, we'll miss it.
It would be possible to have a "writeback" list, for sys_sync, I suppose.
But why complicate things by prematurely optimise? For unmounting, we
could avoid the "livelock avoidance" code, which would be easier, but
again premature IMO.
Fixing the existing data integrity problem will come next.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 25ff1c316f6a763f1eefe7f8984b2d8c03888432 upstream.
This patch (as1189d) adds some hacks to usb-storage for dealing with
the growing problems involving bad capacity values and last-sector
accesses:
A new flag, US_FL_CAPACITY_OK, is created to indicate that
the device is known to report its capacity correctly. An
unusual_devs entry for Linux's own File-backed Storage Gadget
is added with this flag set, since g_file_storage always
reports the correct capacity and since the capacity need
not be even (it is determined by the size of the backing
file).
An entry in unusual_devs.h which has only the CAPACITY_OK
flag set shouldn't prejudice libusual, since the device will
work perfectly well with either usb-storage or ub. So a
new macro, COMPLIANT_DEV, is added to let libusual know
about these entries.
When a last-sector access fails three times in a row and
neither the FIX_CAPACITY nor the CAPACITY_OK flag is set,
we assume the last-sector bug is present. We replace the
existing status and sense data with values that will cause
the SCSI core to fail the access immediately rather than
retry indefinitely. This should fix the difficulties
people have been having with Nokia phones.
This version of the patch differs from the version accepted into the
mainline only in that it does not trigger a WARN() when an
odd-numbered last-sector access succeeds. In a stable kernel series
we don't want to go around spamming users' logs and consoles for no
good reason.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit e8c82c2e23e3527e0c9dc195e432c16784d270fa upstream.
An XFS workload showed up a bug in the lockless pagecache patch. Basically it
would go into an "infinite" loop, although it would sometimes be able to break
out of the loop! The reason is a missing compiler barrier in the "increment
reference count unless it was zero" case of the lockless pagecache protocol in
the gang lookup functions.
This would cause the compiler to use a cached value of struct page pointer to
retry the operation with, rather than reload it. So the page might have been
removed from pagecache and freed (refcount==0) but the lookup would not correctly
notice the page is no longer in pagecache, and keep attempting to increment the
refcount and failing, until the page gets reallocated for something else. This
isn't a data corruption because the condition will be detected if the page has
been reallocated. However it can result in a lockup.
Linus points out that ACCESS_ONCE is also required in that pointer load, even
if it's absence is not causing a bug on our particular build. The most general
way to solve this is just to put an rcu_dereference in radix_tree_deref_slot.
Assembly of find_get_pages,
before:
.L220:
movq (%rbx), %rax #* ivtmp.1162, tmp82
movq (%rax), %rdi #, prephitmp.1149
.L218:
testb $1, %dil #, prephitmp.1149
jne .L217 #,
testq %rdi, %rdi # prephitmp.1149
je .L203 #,
cmpq $-1, %rdi #, prephitmp.1149
je .L217 #,
movl 8(%rdi), %esi # <variable>._count.counter, c
testl %esi, %esi # c
je .L218 #,
after:
.L212:
movq (%rbx), %rax #* ivtmp.1109, tmp81
movq (%rax), %rdi #, ret
testb $1, %dil #, ret
jne .L211 #,
testq %rdi, %rdi # ret
je .L197 #,
cmpq $-1, %rdi #, ret
je .L211 #,
movl 8(%rdi), %esi # <variable>._count.counter, c
testl %esi, %esi # c
je .L212 #,
(notice the obvious infinite loop in the first example, if page->count remains 0)
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 18e6959c385f3edf3991fa6662a53dac4eb10d5b upstream.
This assertion is incorrect for lockless pagecache. By definition if we
have an unpinned page that we are trying to take a speculative reference
to, it may become the tail of a compound page at any time (if it is
freed, then reallocated as a compound page).
It was still a valid assertion for the vmscan.c LRU isolation case, but
it doesn't seem incredibly helpful... if somebody wants it, they can
put it back directly where it applies in the vmscan code.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 54566b2c1594c2326a645a3551f9d989f7ba3c5e upstream.
With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened. They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim. This bug could
cause filesystem deadlocks.
The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called. It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock. The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.
Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on
this flag in their write_begin function. Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).
This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in ints write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a
random example).
[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
untouched to the grab_cache_page_write_begin() function. That
just simplifies everybody, and may even allow future expansion of the
logic. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 2b66421995d2e93c9d1a0111acf2581f8529c6e5 upstream.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d4e82042c4cfa87a7d51710b71f568fe80132551 upstream.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ee6a093222549ac0c72cfd296c69fa5e7d6daa34 upstream.
This enables the use of syscall wrappers to do proper sign extension
for 64-bit programs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 1a94bc34768e463a93cb3751819709ab0ea80a01 upstream.
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
By selecting HAVE_SYSCALL_WRAPPERS architectures can activate
system call wrappers in order to sign extend system call arguments.
All architectures where the ABI defines that the caller of a function
has to perform sign extension probably need this.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit e55380edf68796d75bf41391a781c68ee678587d upstream.
This way it matches the generic system call name convention.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 2ed7c03ec17779afb4fcfa3b8c61df61bd4879ba upstream.
Convert all system calls to return a long. This should be a NOP since all
converted types should have the same size anyway.
With the exception of sys_exit_group which returned void. But that doesn't
matter since the system call doesn't return.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4c696ba7982501d43dea11dbbaabd2aa8a19cc42 upstream.
Move declarations to correct header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4ae8978cf92a96257cd8998a49e781be83571d64 upstream.
The problems lie in the types used for some inotify interfaces, both at the kernel level and at the glibc level. This mail addresses the kernel problem. I will follow up with some suggestions for glibc changes.
For the sys_inotify_rm_watch() interface, the type of the 'wd' argument is
currently 'u32', it should be '__s32' . That is Robert's suggestion, and
is consistent with the other declarations of watch descriptors in the
kernel source, in particular, the inotify_event structure in
include/linux/inotify.h:
struct inotify_event {
__s32 wd; /* watch descriptor */
__u32 mask; /* watch mask */
__u32 cookie; /* cookie to synchronize two events */
__u32 len; /* length (including nulls) of name */
char name[0]; /* stub for possible name */
};
The patch makes the changes needed for inotify_rm_watch().
Signed-off-by: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Robert Love <rlove@google.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 1c5745aa380efb6417b5681104b007c8612fb496 upstream.
Redo:
5b7dba4: sched_clock: prevent scd->clock from moving backwards
which had to be reverted due to s2ram hangs:
ca7e716: Revert "sched_clock: prevent scd->clock from moving backwards"
... this time with resume restoring GTOD later in the sequence
taken into account as well.
The "timekeeping_suspended" flag is not very nice but we cannot call into
GTOD before it has been properly resumed and the scheduler will run very
early in the resume sequence.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 5289f46b9de04bde181d833d48df9671b69c4b08 upstream.
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.
Debugged-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d253eee20195b25e298bf162a6e72f14bf4803e5 upstream.
Due to a wrong safety check in af_can.c it was not possible to filter
for SFF frames with a specific CAN identifier without getting the
same selected CAN identifier from a received EFF frame also.
This fix has a minimum (but user visible) impact on the CAN filter
API and therefore the CAN version is set to a new date.
Indeed the 'old' API is still working as-is. But when now setting
CAN_(EFF|RTR)_FLAG in can_filter.can_mask you might get less traffic
than before - but still the stuff that you expected to get for your
defined filter ...
Thanks to Kurt Van Dijck for pointing at this issue and for the review.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ae8d04e2ecbb233926860e9ce145eac19c7835dc upstream.
VMI initialiation can relocate the fixmap, causing early_ioremap to
malfunction if it is initialized before the relocation. To fix this,
VMI activation is split into two phases; the detection, which must
happen before setting up ioremap, and the activation, which must happen
after parsing early boot parameters.
This fixes a crash on boot when VMI is enabled under VMware.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b563cf59c4d67da7d671788a9848416bfa4180ab upstream.
PnP encodes the resource type directly as its struct resource->flags value
which is an unsigned long. Make it so...
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit bf2a9a39639b8b51377905397a5005f444e9a892 upstream.
binfmt_script and binfmt_misc disallow recursion to avoid stack overflow
using sh_bang and misc_bang. It causes problem in some cases:
$ echo '#!/bin/ls' > /tmp/t0
$ echo '#!/tmp/t0' > /tmp/t1
$ echo '#!/tmp/t1' > /tmp/t2
$ chmod +x /tmp/t*
$ /tmp/t2
zsh: exec format error: /tmp/t2
Similar problem with binfmt_misc.
This patch introduces field 'recursion_depth' into struct linux_binprm to
track recursion level in binfmt_misc and binfmt_script. If recursion
level more then BINPRM_MAX_RECURSION it generates -ENOEXEC.
[akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 4afe978530702c934dfdb11f54073136818b2119 upstream.
When a checkpointing IO fails, current JBD code doesn't check the error
and continue journaling. This means latest metadata can be lost from both
the journal and filesystem.
This patch leaves the failed metadata blocks in the journal space and
aborts journaling in the case of log_do_checkpoint(). To achieve this, we
need to do:
1. don't remove the failed buffer from the checkpoint list where in
the case of __try_to_free_cp_buf() because it may be released or
overwritten by a later transaction
2. log_do_checkpoint() is the last chance, remove the failed buffer
from the checkpoint list and abort the journal
3. when checkpointing fails, don't update the journal super block to
prevent the journaled contents from being cleaned. For safety,
don't update j_tail and j_tail_sequence either
4. when checkpointing fails, notify this error to the ext3 layer so
that ext3 don't clear the needs_recovery flag, otherwise the
journaled contents are ignored and cleaned in the recovery phase
5. if the recovery fails, keep the needs_recovery flag
6. prevent cleanup_journal_tail() from being called between
__journal_drop_transaction() and journal_abort() (a race issue
between journal_flush() and __log_wait_for_space()
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit f2f1fa78a155524b849edf359e42a3001ea652c0 upstream.
There's no point in having too short SG_IO timeouts, since if the
command does end up timing out, we'll end up through the reset sequence
that is several seconds long in order to abort the command that timed
out.
As a result, shorter timeouts than a few seconds simply do not make
sense, as the recovery would be longer than the timeout itself.
Add a BLK_MIN_SG_TIMEOUT to match the existign BLK_DEFAULT_SG_TIMEOUT.
Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
trimed down version of commit 05496769e5da83ce22ed97345afd9c7b71d6bd24 upstream.
Some devices such as "cciss/c0d0p9" will cause jbd2 setup and teardown
failures when /proc filenames are created with embedded slashes. This
is a slimmed down version of commit 05496769, with the stack reduction
aspects of the patch omitted to meet the -stable criteria.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 5f23b734963ec7eaa3ebcd9050da0c9b7d143dd3 upstream.
This is an implementation of David Miller's suggested fix in:
https://bugzilla.redhat.com/show_bug.cgi?id=470201
It has been updated to use wait_event() instead of
wait_event_interruptible().
Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over a
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and oom-killing of unrelated processes.
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit ac70a964b0e22a95af3628c344815857a01461b7 upstream.
Some recent Seagate harddrives have firmware bug which causes FLUSH
CACHE to timeout under certain circumstances if NCQ is being used.
This can be worked around by disabling NCQ and fixed by updating the
firmware. Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these
devices.
The wiki page has been updated to contain information on this issue.
http://ata.wiki.kernel.org/index.php/Known_issues
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
backport of commit 97a70e548bd97d5a46ae9d44f24aafcc013fd701 to the 2.6.27 kernel.
The NUMA code on x86_32 creates special memory mapping that allows
each node's pgdat to be located in this node's memory. For this
purpose it allocates a memory area at the end of each node's memory
and maps this area so that it is accessible with virtual addresses
belonging to low memory. As a result, if there is high memory,
these NUMA-allocated areas are physically located in high memory,
although they are mapped to low memory addresses.
Our hibernation code does not take that into account and for this
reason hibernation fails on all x86_32 systems with CONFIG_NUMA=y and
with high memory present. Fix this by adding a special mapping for
the NUMA-allocated memory areas to the temporary page tables created
during the last phase of resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 58319b802a614f10f1b5238fbde7a4b2e9a60069 upstream.
Now that the PCI core manages the 'name' for each individual
hotplug driver, and all drivers (except rpaphp) have been converted
to use hotplug_slot_name(), there is no need for the PCI hotplug
core to drag around its own copy of name either.
Cc: kristen.c.accardi@intel.com
Cc: matthew@wil.cx
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 0ad772ec464d3fcf9d210836b97e654f393606c4 upstream
In preparation for cleaning up the various hotplug drivers
such that they don't have to manage their own 'name' parameters
anymore, we provide the following convenience functions:
pci_slot_name()
hotplug_slot_name()
These helpers will be used by individual hotplug drivers.
Cc: kristen.c.accardi@intel.com
Cc: matthew@wil.cx
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 828f37683e6d3ab5912989df0d04201db7ad798e upstream.
Slot detection drivers can co-exist with hotplug drivers. The names
of the detected/claimed slots may be different depending on module
load order.
For legacy reasons, we need to allow hotplug drivers to override
the slot name if a detection driver is loaded first (and they find
the same slots).
Creating and overriding slot names should be an atomic operation,
otherwise you get a locking nightmare as various drivers race to
call pci_create_slot().
pci_create_slot() is already serialized by grabbing the pci_bus_sem.
We update the API and add a 'hotplug' param, which is:
set if the caller is a hotplug driver
NULL if the caller is a detection driver
pci_create_slot() does not actually use the 'hotplug' parameter in this
patch. A later patch will add the logic that uses it.
Cc: kristen.c.accardi@intel.com
Cc: matthew@wil.cx
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 1359f2701b96abd9bb69c1273fb995a093b6409a upstream.
Update pci_hp_register() to take a const char *name parameter.
The motivation for this is to clean up the individual hotplug
drivers so that each one does not have to manage its own name.
The PCI core should be the place where we manage the name.
We update the interface and all callsites first, in a
"no functional change" manner, and clean up the drivers later.
Cc: kristen.c.accardi@intel.com
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b627c8b17ccacba38c975bc0f69a49fc4e5261c9 upstream.
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off
Currently these macros evaluate to a no-op except the kernel is compiled
with GART or Calgary support. But we also need these macros when we have
SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at
least with SWIOTLB we can define these macros always.
This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|