aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2013-03-20block: use i_size_write() in bd_set_size()Guo Chao
commit d646a02a9d44d1421f273ae3923d97b47b918176 upstream. blkdev_ioctl(GETBLKSIZE) uses i_size_read() to read size of block device. If we update block size directly, reader may see intermediate result in some machines and configurations. Use i_size_write() instead. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: M. Hindess <hindessm@uk.ibm.com> Cc: Nikanth Karthikesan <knikanth@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-20btrfs: use rcu_barrier() to wait for bdev puts at unmountEric Sandeen
commit bc178622d40d87e75abc131007342429c9b03351 upstream. Doing this would reliably fail with -EBUSY for me: # mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2 ... unable to open /dev/sdb2: Device or resource busy because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it. Using systemtap to track bdev gets & puts shows a kworker thread doing a blkdev put after mkfs attempts a get; this is left over from the unmount path: btrfs_close_devices __btrfs_close_devices call_rcu(&device->rcu, free_device); free_device INIT_WORK(&device->rcu_work, __free_device); schedule_work(&device->rcu_work); so unmount might complete before __free_device fires & does its blkdev_put. Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait until all blkdev_put()s are done, and the device is truly free once unmount completes. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-20ext3: Fix format string issuesLars-Peter Clausen
commit 8d0c2d10dd72c5292eda7a06231056a4c972e4cc upstream. ext3_msg() takes the printk prefix as the second parameter and the format string as the third parameter. Two callers of ext3_msg omit the prefix and pass the format string as the second parameter and the first parameter to the format string as the third parameter. In both cases this string comes from an arbitrary source. Which means the string may contain format string characters, which will lead to undefined and potentially harmful behavior. The issue was introduced in commit 4cf46b67eb("ext3: Unify log messages in ext3") and is fixed by this patch. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-14vfs: fix pipe counter breakageAl Viro
commit a930d8790552658140d7d0d2e316af4f0d76a512 upstream. If you open a pipe for neither read nor write, the pipe code will not add any usage counters to the pipe, causing the 'struct pipe_inode_info" to be potentially released early. That doesn't normally matter, since you cannot actually use the pipe, but the pipe release code - particularly fasync handling - still expects the actual pipe infrastructure to all be there. And rather than adding NULL pointer checks, let's just disallow this case, the same way we already do for the named pipe ("fifo") case. This is ancient going back to pre-2.4 days, and until trinity, nobody naver noticed. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-14cifs: ensure that cifs_get_root() only traverses directoriesJeff Layton
commit ce2ac52105aa663056dfc17966ebed1bf93e6e64 upstream. Kjell Braden reported this oops: [ 833.211970] BUG: unable to handle kernel NULL pointer dereference at (null) [ 833.212816] IP: [< (null)>] (null) [ 833.213280] PGD 1b9b2067 PUD e9f7067 PMD 0 [ 833.213874] Oops: 0010 [#1] SMP [ 833.214344] CPU 0 [ 833.214458] Modules linked in: des_generic md4 nls_utf8 cifs vboxvideo drm snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq bnep rfcomm snd_timer bluetooth snd_seq_device ppdev snd vboxguest parport_pc joydev mac_hid soundcore snd_page_alloc psmouse i2c_piix4 serio_raw lp parport usbhid hid e1000 [ 833.215629] [ 833.215629] Pid: 1752, comm: mount.cifs Not tainted 3.0.0-rc7-bisectcifs-fec11dd9a0+ #18 innotek GmbH VirtualBox/VirtualBox [ 833.215629] RIP: 0010:[<0000000000000000>] [< (null)>] (null) [ 833.215629] RSP: 0018:ffff8800119c9c50 EFLAGS: 00010282 [ 833.215629] RAX: ffffffffa02186c0 RBX: ffff88000c427780 RCX: 0000000000000000 [ 833.215629] RDX: 0000000000000000 RSI: ffff88000c427780 RDI: ffff88000c4362e8 [ 833.215629] RBP: ffff8800119c9c88 R08: ffff88001fc15e30 R09: 00000000d69515c7 [ 833.215629] R10: ffffffffa0201972 R11: ffff88000e8f6a28 R12: ffff88000c4362e8 [ 833.215629] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88001181aaa6 [ 833.215629] FS: 00007f2986171700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 833.215629] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 833.215629] CR2: 0000000000000000 CR3: 000000001b982000 CR4: 00000000000006f0 [ 833.215629] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 833.215629] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 833.215629] Process mount.cifs (pid: 1752, threadinfo ffff8800119c8000, task ffff88001c1c16f0) [ 833.215629] Stack: [ 833.215629] ffffffff8116a9b5 ffff8800119c9c88 ffffffff81178075 0000000000000286 [ 833.215629] 0000000000000000 ffff88000c4276c0 ffff8800119c9ce8 ffff8800119c9cc8 [ 833.215629] ffffffff8116b06e ffff88001bc6fc00 ffff88000c4276c0 ffff88000c4276c0 [ 833.215629] Call Trace: [ 833.215629] [<ffffffff8116a9b5>] ? d_alloc_and_lookup+0x45/0x90 [ 833.215629] [<ffffffff81178075>] ? d_lookup+0x35/0x60 [ 833.215629] [<ffffffff8116b06e>] __lookup_hash.part.14+0x9e/0xc0 [ 833.215629] [<ffffffff8116b1d6>] lookup_one_len+0x146/0x1e0 [ 833.215629] [<ffffffff815e4f7e>] ? _raw_spin_lock+0xe/0x20 [ 833.215629] [<ffffffffa01eef0d>] cifs_do_mount+0x26d/0x500 [cifs] [ 833.215629] [<ffffffff81163bd3>] mount_fs+0x43/0x1b0 [ 833.215629] [<ffffffff8117d41a>] vfs_kern_mount+0x6a/0xd0 [ 833.215629] [<ffffffff8117e584>] do_kern_mount+0x54/0x110 [ 833.215629] [<ffffffff8117fdc2>] do_mount+0x262/0x840 [ 833.215629] [<ffffffff81108a0e>] ? __get_free_pages+0xe/0x50 [ 833.215629] [<ffffffff8117f9ca>] ? copy_mount_options+0x3a/0x180 [ 833.215629] [<ffffffff8118075d>] sys_mount+0x8d/0xe0 [ 833.215629] [<ffffffff815ece82>] system_call_fastpath+0x16/0x1b [ 833.215629] Code: Bad RIP value. [ 833.215629] RIP [< (null)>] (null) [ 833.215629] RSP <ffff8800119c9c50> [ 833.215629] CR2: 0000000000000000 [ 833.238525] ---[ end trace ec00758b8d44f529 ]--- When walking down the path on the server, it's possible to hit a symlink. The path walking code assumes that the caller will handle that situation properly, but cifs_get_root() isn't set up for it. This patch prevents the oops by simply returning an error. A better solution would be to try and chase the symlinks here, but that's fairly complicated to handle. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=53221 Reported-and-tested-by: Kjell Braden <afflux@pentabarf.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-14btrfs: Init io_lock after cloning btrfs device structThomas Gleixner
commit 1cba0cdf5e4dbcd9e5fa5b54d7a028e55e2ca057 upstream. __btrfs_close_devices() clones btrfs device structs with memcpy(). Some of the fields in the clone are reinitialized, but it's missing to init io_lock. In mainline this goes unnoticed, but on RT it leaves the plist pointing to the original about to be freed lock struct. Initialize io_lock after cloning, so no references to the original struct are left. Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-04ext4: fix race in ext4_mb_add_n_trim()Niu Yawei
commit f1167009711032b0d747ec89a632a626c901a1ad upstream. In ext4_mb_add_n_trim(), lg_prealloc_lock should be taken when changing the lg_prealloc_list. Signed-off-by: Niu Yawei <yawei.niu@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-04ocfs2: ac->ac_allow_chain_relink=0 won't disable group relinkXiaowei.Hu
commit 309a85b6861fedbb48a22d45e0e079d1be993b3a upstream. ocfs2_block_group_alloc_discontig() disables chain relink by setting ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple cluster groups. It doesn't keep the credits for all chain relink,but ocfs2_claim_suballoc_bits overrides this in this call trace: ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()-> __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits() ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call ocfs2_search_chain() one time and disable it again, and then we run out of credits. Fix is to allow relink by default and disable it in ocfs2_block_group_alloc_discontig. Without this patch, End-users will run into a crash due to run out of credits, backtrace like this: RIP: 0010:[<ffffffffa0808b14>] [<ffffffffa0808b14>] jbd2_journal_dirty_metadata+0x164/0x170 [jbd2] RSP: 0018:ffff8801b919b5b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0 RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8 RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0 R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800 FS: 00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0) Call Trace: ocfs2_journal_dirty+0x2f/0x70 [ocfs2] ocfs2_relink_block_group+0x111/0x480 [ocfs2] ocfs2_search_chain+0x455/0x9a0 [ocfs2] ... Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28NLS: improve UTF8 -> UTF16 string conversion routineAlan Stern
commit 0720a06a7518c9d0c0125bd5d1f3b6264c55c3dd upstream. The utf8s_to_utf16s conversion routine needs to be improved. Unlike its utf16s_to_utf8s sibling, it doesn't accept arguments specifying the maximum length of the output buffer or the endianness of its 16-bit output. This patch (as1501) adds the two missing arguments, and adjusts the only two places in the kernel where the function is called. A follow-on patch will add a third caller that does utilize the new capabilities. The two conversion routines are still annoyingly inconsistent in the way they handle invalid byte combinations. But that's a subject for a different patch. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28ext4: add missing kfree() on error return path in add_new_gdb()Dan Carpenter
commit c49bafa3842751b8955a962859f42d307673d75d upstream. We added some more error handling in b40971426a "ext4: add error checking to calls to ext4_handle_dirty_metadata()". But we need to call kfree() as well to avoid a memory leak. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28ext4: Free resources in some error path in ext4_fill_superTao Ma
commit dcf2d804ed6ffe5e942b909ed5e5b74628be6ee4 upstream. Some of the error path in ext4_fill_super don't release the resouces properly. So this patch just try to release them in the right way. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28NLM: Ensure that we resend all pending blocking locks after a reclaimTrond Myklebust
commit 666b3d803a511fbc9bc5e5ea8ce66010cf03ea13 upstream. Currently, nlmclnt_lock will break out of the for(;;) loop when the reclaimer wakes up the blocking lock thread by setting nlm_lck_denied_grace_period. This causes the lock request to fail with an ENOLCK error. The intention was always to ensure that we resend the lock request after the grace period has expired. Reported-by: Wangyuan Zhang <Wangyuan.Zhang@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28ocfs2: unlock super lock if lockres refresh failedJunxiao Bi
commit 3278bb748d2437eb1464765f36429e5d6aa91c38 upstream. If lockres refresh failed, the super lock will never be released which will cause some processes on other cluster nodes hung forever. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28inotify: remove broken mask checks causing unmount to be EINVALJim Somerville
commit 676a0675cf9200ac047fb50825f80867b3bb733b upstream. Running the command: inotifywait -e unmount /mnt/disk immediately aborts with a -EINVAL return code. This is however a valid parameter. This abort occurs only if unmount is the sole event parameter. If other event parameters are supplied, then the unmount event wait will work. The problem was introduced by commit 44b350fc23e ("inotify: Fix mask checks"). In that commit, it states: The mask checks in inotify_update_existing_watch() and inotify_new_watch() are useless because inotify_arg_to_mask() sets FS_IN_IGNORED and FS_EVENT_ON_CHILD bits anyway. But instead of removing the useless checks, it did this: mask = inotify_arg_to_mask(arg); - if (unlikely(!mask)) + if (unlikely(!(mask & IN_ALL_EVENTS))) return -EINVAL; The problem is that IN_ALL_EVENTS doesn't include IN_UNMOUNT, and other parts of the code keep IN_UNMOUNT separate from IN_ALL_EVENTS. So the check should be: if (unlikely(!(mask & (IN_ALL_EVENTS | IN_UNMOUNT)))) But inotify_arg_to_mask(arg) always sets the IN_UNMOUNT bit in the mask anyway, so the check is always going to pass and thus should simply be removed. Also note that inotify_arg_to_mask completely controls what mask bits get set from arg, there's no way for invalid bits to get enabled there. Lets fix it by simply removing the useless broken checks. Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix MSG_SENDPAGE_NOTLAST logicEric Dumazet
[ Upstream commit ae62ca7b03217be5e74759dc6d7698c95df498b3 ] commit 35f9c09fe9c72e (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag : MSG_SENDPAGE_NOTLAST meant to be set on all frags but the last one for a splice() call. The condition used to set the flag in pipe_to_sendpage() relied on splice() user passing the exact number of bytes present in the pipe, or a smaller one. But some programs pass an arbitrary high value, and the test fails. The effect of this bug is a lack of tcp_push() at the end of a splice(pipe -> socket) call, and possibly very slow or erratic TCP sessions. We should both test sd->total_len and fact that another fragment is in the pipe (pipe->nrbufs > 1) Many thanks to Willy for providing very clear bug report, bisection and test programs. Reported-by: Willy Tarreau <w@1wt.eu> Bisected-by: Willy Tarreau <w@1wt.eu> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-11nilfs2: fix fix very long mount time issueVyacheslav Dubeyko
commit a9bae189542e71f91e61a4428adf6e5a7dfe8063 upstream. There exists a situation when GC can work in background alone without any other filesystem activity during significant time. The nilfs_clean_segments() method calls nilfs_segctor_construct() that updates superblocks in the case of NILFS_SC_SUPER_ROOT and THE_NILFS_DISCONTINUED flags are set. But when GC is working alone the nilfs_clean_segments() is called with unset THE_NILFS_DISCONTINUED flag. As a result, the update of superblocks doesn't occurred all this time and in the case of SPOR superblocks keep very old values of last super root placement. SYMPTOMS: Trying to mount a NILFS2 volume after SPOR in such environment ends with very long mounting time (it can achieve about several hours in some cases). REPRODUCING PATH: 1. It needs to use external USB HDD, disable automount and doesn't make any additional filesystem activity on the NILFS2 volume. 2. Generate temporary file with size about 100 - 500 GB (for example, dd if=/dev/zero of=<file_name> bs=1073741824 count=200). The size of file defines duration of GC working. 3. Then it needs to delete file. 4. Start GC manually by means of command "nilfs-clean -p 0". When you start GC by means of such way then, at the end, superblocks is updated by once. So, for simulation of SPOR, it needs to wait sometime (15 - 40 minutes) and simply switch off USB HDD manually. 5. Switch on USB HDD again and try to mount NILFS2 volume. As a result, NILFS2 volume will mount during very long time. REPRODUCIBILITY: 100% FIX: This patch adds checking that superblocks need to update and set THE_NILFS_DISCONTINUED flag before nilfs_clean_segments() call. Reported-by: Sergey Alexandrov <splavgm@gmail.com> Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Tested-by: Vyacheslav Dubeyko <slava@dubeyko.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-03fs/cifs/cifs_dfs_ref.c: fix potential memory leakageCong Ding
commit 10b8c7dff5d3633b69e77f57d404dab54ead3787 upstream. When it goes to error through line 144, the memory allocated to *devname is not freed, and the caller doesn't free it either in line 250. So we free the memroy of *devname in function cifs_compose_mount_options() when it goes to error. Signed-off-by: Cong Ding <dinggnu@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21ext4: init pagevec in ext4_da_block_invalidatepagesEric Sandeen
commit 66bea92c69477a75a5d37b9bfed5773c92a3c4b4 upstream. ext4_da_block_invalidatepages is missing a pagevec_init(), which means that pvec->cold contains random garbage. This affects whether the page goes to the front or back of the LRU when ->cold makes it to free_hot_cold_page() Reviewed-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17GFS2: Test bufdata with buffer locked and gfs2_log_lock heldBenjamin Marzinski
commit 96e5d1d3adf56f1c7eeb07258f6a1a0a7ae9c489 upstream. In gfs2_trans_add_bh(), gfs2 was testing if a there was a bd attached to the buffer without having the gfs2_log_lock held. It was then assuming it would stay attached for the rest of the function. However, without either the log lock being held of the buffer locked, __gfs2_ail_flush() could detach bd at any time. This patch moves the locking before the test. If there isn't a bd already attached, gfs2 can safely allocate one and attach it before locking. There is no way that the newly allocated bd could be on the ail list, and thus no way for __gfs2_ail_flush() to detach it. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-17epoll: prevent missed events on EPOLL_CTL_MODEric Wong
commit 128dd1759d96ad36c379240f8b9463e8acfd37a1 upstream. EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to ensure events are not missed. Since the modifications to the interest mask are not protected by the same lock as ep_poll_callback, we need to ensure the change is visible to other CPUs calling ep_poll_callback. We also need to ensure f_op->poll() has an up-to-date view of past events which occured before we modified the interest mask. So this barrier also pairs with the barrier in wq_has_sleeper(). This should guarantee either ep_poll_callback or f_op->poll() (or both) will notice the readiness of a recently-ready/modified item. This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in: http://thread.gmane.org/gmane.linux.kernel/1408782/ Signed-off-by: Eric Wong <normalperson@yhbt.net> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Voellmy <andreas.voellmy@yale.edu> Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu> Cc: netdev@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17udf: don't increment lenExtents while writing to a holeNamjae Jeon
commit fb719c59bdb4fca86ee1fd1f42ab3735ca12b6b2 upstream. Incrementing lenExtents even while writing to a hole is bad for performance as calls to udf_discard_prealloc and udf_truncate_tail_extent would not return from start if isize != lenExtents Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17udf: fix memory leak while allocating blocks during writeNamjae Jeon
commit 2fb7d99d0de3fd8ae869f35ab682581d8455887a upstream. Need to brelse the buffer_head stored in cur_epos and next_epos. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ext4: lock i_mutex when truncating orphan inodesTheodore Ts'o
commit 721e3eba21e43532e438652dd8f1fcdfce3187e7 upstream. Commit c278531d39 added a warning when ext4_flush_unwritten_io() is called without i_mutex being taken. It had previously not been taken during orphan cleanup since races weren't possible at that point in the mount process, but as a result of this c278531d39, we will now see a kernel WARN_ON in this case. Take the i_mutex in ext4_orphan_cleanup() to suppress this warning. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ext4: do not try to write superblock on ro remount w/o journalMichael Tokarev
commit d096ad0f79a782935d2e06ae8fb235e8c5397775 upstream. When a journal-less ext4 filesystem is mounted on a read-only block device (blockdev --setro will do), each remount (for other, unrelated, flags, like suid=>nosuid etc) results in a series of scary messages from kernel telling about I/O errors on the device. This is becauese of the following code ext4_remount(): if (sbi->s_journal == NULL) ext4_commit_super(sb, 1); at the end of remount procedure, which forces writing (flushing) of a superblock regardless whenever it is dirty or not, if the filesystem is readonly or not, and whenever the device itself is readonly or not. We only need call ext4_commit_super when the file system had been previously mounted read/write. Thanks to Eric Sandeen for help in diagnosing this issue. Signed-off-By: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17jbd2: fix assertion failure in jbd2_journal_flush()Jan Kara
commit d7961c7fa4d2e3c3f12be67e21ba8799b5a7238a upstream. The following race is possible between start_this_handle() and someone calling jbd2_journal_flush(). Process A Process B start_this_handle(). if (journal->j_barrier_count) # false if (!journal->j_running_transaction) { #true read_unlock(&journal->j_state_lock); jbd2_journal_lock_updates() jbd2_journal_flush() write_lock(&journal->j_state_lock); if (journal->j_running_transaction) { # false ... wait for committing trans ... write_unlock(&journal->j_state_lock); ... write_lock(&journal->j_state_lock); if (!journal->j_running_transaction) { # true jbd2_get_transaction(journal, new_transaction); write_unlock(&journal->j_state_lock); goto repeat; # eventually blocks on j_barrier_count > 0 ... J_ASSERT(!journal->j_running_transaction); # fails We fix the race by rechecking j_barrier_count after reacquiring j_state_lock in exclusive mode. Reported-by: yjwsignal@empal.com Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ext4: fix extent tree corruption caused by hole punchForrest Liu
commit c36575e663e302dbaa4d16b9c72d2c9a913a9aef upstream. When depth of extent tree is greater than 1, logical start value of interior node is not correctly updated in ext4_ext_rm_idx. Signed-off-by: Forrest Liu <forrestl@synology.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Ashish Sangwan <ashishsangwan2@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17jffs2: hold erase_completion_lock on exitAlexey Khoroshilov
commit 2cbba75a56ea78e6876b4e2547a882f10b3fe72b upstream. Users of jffs2_do_reserve_space() expect they still held erase_completion_lock after call to it. But there is a path where jffs2_do_reserve_space() leaves erase_completion_lock unlocked. The patch fixes it. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ext4: fix memory leak in ext4_xattr_set_acl()'s error pathEugene Shatokhin
commit 24ec19b0ae83a385ad9c55520716da671274b96c upstream. In ext4_xattr_set_acl(), if ext4_journal_start() returns an error, posix_acl_release() will not be called for 'acl' which may result in a memory leak. This patch fixes that. Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-11nfs: fix null checking in nfs_get_option_str()Xi Wang
commit e25fbe380c4e3c09afa98bcdcd9d3921443adab8 upstream. The following null pointer check is broken. *option = match_strdup(args); return !option; The pointer `option' must be non-null, and thus `!option' is always false. Use `!*option' instead. The bug was introduced in commit c5cb09b6f8 ("Cleanup: Factor out some cut-and-paste code."). Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-11nfsd4: fix oops on unusual readlike compoundJ. Bruce Fields
commit d5f50b0c290431c65377c4afa1c764e2c3fe5305 upstream. If the argument and reply together exceed the maximum payload size, then a reply with a read-like operation can overlow the rq_pages array. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-11NFS: Fix calls to drop_nlink()Trond Myklebust
commit 1f018458b30b0d5c535c94e577aa0acbb92e1395 upstream. It is almost always wrong for NFS to call drop_nlink() after removing a file. What we really want is to mark the inode's attributes for revalidation, and we want to ensure that the VFS drops it if we're reasonably sure that this is the final unlink(). Do the former using the usual cache validity flags, and the latter by testing if inode->i_nlink == 1, and clearing it in that case. This also fixes the following warning reported by Neil Brown and Jeff Layton (among others). [634155.004438] WARNING: at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.5.0/lin [634155.004442] Hardware name: Latitude E6510 [634155.004577] crc_itu_t crc32c_intel snd_hwdep snd_pcm snd_timer snd soundcor [634155.004609] Pid: 13402, comm: bash Tainted: G W 3.5.0-36-desktop # [634155.004611] Call Trace: [634155.004630] [<ffffffff8100444a>] dump_trace+0xaa/0x2b0 [634155.004641] [<ffffffff815a23dc>] dump_stack+0x69/0x6f [634155.004653] [<ffffffff81041a0b>] warn_slowpath_common+0x7b/0xc0 [634155.004662] [<ffffffff811832e4>] drop_nlink+0x34/0x40 [634155.004687] [<ffffffffa05bb6c3>] nfs_dentry_iput+0x33/0x70 [nfs] [634155.004714] [<ffffffff8118049e>] dput+0x12e/0x230 [634155.004726] [<ffffffff8116b230>] __fput+0x170/0x230 [634155.004735] [<ffffffff81167c0f>] filp_close+0x5f/0x90 [634155.004743] [<ffffffff81167cd7>] sys_close+0x97/0x100 [634155.004754] [<ffffffff815c3b39>] system_call_fastpath+0x16/0x1b [634155.004767] [<00007f2a73a0d110>] 0x7f2a73a0d10f Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-11NFS: avoid NULL dereference in nfs_destroy_serverNeilBrown
commit f259613a1e4b44a0cf85a5dafd931be96ee7c9e5 upstream. In rare circumstances, nfs_clone_server() of a v2 or v3 server can get an error between setting server->destory (to nfs_destroy_server), and calling nfs_start_lockd (which will set server->nlm_host). If this happens, nfs_clone_server will call nfs_free_server which will call nfs_destroy_server and thence nlmclnt_done(NULL). This causes the NULL to be dereferenced. So add a guard to only call nlmclnt_done() if ->nlm_host is not NULL. The other guards there are irrelevant as nlm_host can only be non-NULL if one of these flags are set - so remove those tests. (Thanks to Trond for this suggestion). This is suitable for any stable kernel since 2.6.25. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-11exec: do not leave bprm->interp on stackKees Cook
commit b66c5984017533316fd1951770302649baf1aa33 upstream. If a series of scripts are executed, each triggering module loading via unprintable bytes in the script header, kernel stack contents can leak into the command line. Normally execution of binfmt_script and binfmt_misc happens recursively. However, when modules are enabled, and unprintable bytes exist in the bprm->buf, execution will restart after attempting to load matching binfmt modules. Unfortunately, the logic in binfmt_script and binfmt_misc does not expect to get restarted. They leave bprm->interp pointing to their local stack. This means on restart bprm->interp is left pointing into unused stack memory which can then be copied into the userspace argv areas. After additional study, it seems that both recursion and restart remains the desirable way to handle exec with scripts, misc, and modules. As such, we need to protect the changes to interp. This changes the logic to require allocation for any changes to the bprm->interp. To avoid adding a new kmalloc to every exec, the default value is left as-is. Only when passing through binfmt_script or binfmt_misc does an allocation take place. For a proof of concept, see DoTest.sh from: http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/ Signed-off-by: Kees Cook <keescook@chromium.org> Cc: halfdog <me@halfdog.net> Cc: P J P <ppandit@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-03jbd: Fix lock ordering bug in journal_unmap_buffer()Jan Kara
commit 25389bb207987b5774182f763b9fb65ff08761c8 upstream. Commit 09e05d48 introduced a wait for transaction commit into journal_unmap_buffer() in the case we are truncating a buffer undergoing commit in the page stradding i_size on a filesystem with blocksize < pagesize. Sadly we forgot to drop buffer lock before waiting for transaction commit and thus deadlock is possible when kjournald wants to lock the buffer. Fix the problem by dropping the buffer lock before waiting for transaction commit. Since we are still holding page lock (and that is OK), buffer cannot disappear under us. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26reiserfs: Protect reiserfs_quota_write() with write lockJan Kara
commit 361d94a338a3fd0cee6a4ea32bbc427ba228e628 upstream. Calls into reiserfs journalling code and reiserfs_get_block() need to be protected with write lock. We remove write lock around calls to high level quota code in the next patch so these paths would suddently become unprotected. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26reiserfs: Move quota calls out of write lockJan Kara
commit 7af11686933726e99af22901d622f9e161404e6b upstream. Calls into highlevel quota code cannot happen under the write lock. These calls take dqio_mutex which ranks above write lock. So drop write lock before calling back into quota code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26reiserfs: Protect reiserfs_quota_on() with write lockJan Kara
commit b9e06ef2e8706fe669b51f4364e3aeed58639eb2 upstream. In reiserfs_quota_on() we do quite some work - for example unpacking tail of a quota file. Thus we have to hold write lock until a moment we call back into the quota code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26reiserfs: Fix lock ordering during remountJan Kara
commit 3bb3e1fc47aca554e7e2cc4deeddc24750987ac2 upstream. When remounting reiserfs dquot_suspend() or dquot_resume() can be called. These functions take dqonoff_mutex which ranks above write lock so we have to drop it before calling into quota code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26NFS: Wait for session recovery to finish before returningBryan Schumaker
commit 399f11c3d872bd748e1575574de265a6304c7c43 upstream. Currently, we will schedule session recovery and then return to the caller of nfs4_handle_exception. This works for most cases, but causes a hang on the following test case: Client Server ------ ------ Open file over NFS v4.1 Write to file Expire client Try to lock file The server will return NFS4ERR_BADSESSION, prompting the client to schedule recovery. However, the client will continue placing lock attempts and the open recovery never seems to be scheduled. The simplest solution is to wait for session recovery to run before retrying the lock. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26eCryptfs: check for eCryptfs cipher support at mountTim Sally
commit 5f5b331d5c21228a6519dcb793fc1629646c51a6 upstream. The issue occurs when eCryptfs is mounted with a cipher supported by the crypto subsystem but not by eCryptfs. The mount succeeds and an error does not occur until a write. This change checks for eCryptfs cipher support at mount time. Resolves Launchpad issue #338914, reported by Tyler Hicks in 03/2009. https://bugs.launchpad.net/ecryptfs/+bug/338914 Signed-off-by: Tim Sally <tsally@atomicpeace.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26eCryptfs: Copy up POSIX ACL and read-only flags from lower mountTyler Hicks
commit 069ddcda37b2cf5bb4b6031a944c0e9359213262 upstream. When the eCryptfs mount options do not include '-o acl', but the lower filesystem's mount options do include 'acl', the MS_POSIXACL flag is not flipped on in the eCryptfs super block flags. This flag is what the VFS checks in do_last() when deciding if the current umask should be applied to a newly created inode's mode or not. When a default POSIX ACL mask is set on a directory, the current umask is incorrectly applied to new inodes created in the directory. This patch ignores the MS_POSIXACL flag passed into ecryptfs_mount() and sets the flag on the eCryptfs super block depending on the flag's presence on the lower super block. Additionally, it is incorrect to allow a writeable eCryptfs mount on top of a read-only lower mount. This missing check did not allow writes to the read-only lower mount because permissions checks are still performed on the lower filesystem's objects but it is best to simply not allow a rw mount on top of ro mount. However, a ro eCryptfs mount on top of a rw mount is valid and still allowed. https://launchpad.net/bugs/1009207 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Reported-by: Stefan Beller <stefanbeller@googlemail.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26UBIFS: introduce categorized lprops counterArtem Bityutskiy
commit 98a1eebda3cb2a84ecf1f219bb3a95769033d1bf upstream. This commit is a preparation for a subsequent bugfix. We introduce a counter for categorized lprops. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26UBIFS: fix mounting problems after power cutsArtem Bityutskiy
commit a28ad42a4a0c6f302f488f26488b8b37c9b30024 upstream. This is a bugfix for a problem with the following symptoms: 1. A power cut happens 2. After reboot, we try to mount UBIFS 3. Mount fails with "No space left on device" error message UBIFS complains like this: UBIFS error (pid 28225): grab_empty_leb: could not find an empty LEB The root cause of this problem is that when we mount, not all LEBs are categorized. Only those which were read are. However, the 'ubifs_find_free_leb_for_idx()' function assumes that all LEBs were categorized and 'c->freeable_cnt' is valid, which is a false assumption. This patch fixes the problem by teaching 'ubifs_find_free_leb_for_idx()' to always fall back to LPT scanning if no freeable LEBs were found. This problem was reported by few people in the past, but Brent Taylor was able to reproduce it and send me a flash image which cannot be mounted, which made it easy to hunt the bug. Kudos to Brent. Reported-by: Brent Taylor <motobud@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26fanotify: fix missing breakEric Paris
commit 848561d368751a1c0f679b9f045a02944506a801 upstream. Anders Blomdell noted in 2010 that Fanotify lost events and provided a test case. Eric Paris confirmed it was a bug and posted a fix to the list https://groups.google.com/forum/?fromgroups=#!topic/linux.kernel/RrJfTfyW2BE but never applied it. Repeated attempts over time to actually get him to apply it have never had a reply from anyone who has raised it So apply it anyway Signed-off-by: Alan Cox <alan@linux.intel.com> Reported-by: Anders Blomdell <anders.blomdell@control.lth.se> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17xfs: fix reading of wrapped log dataDave Chinner
commit 6ce377afd1755eae5c93410ca9a1121dfead7b87 upstream. Commit 4439647 ("xfs: reset buffer pointers before freeing them") in 3.0-rc1 introduced a regression when recovering log buffers that wrapped around the end of log. The second part of the log buffer at the start of the physical log was being read into the header buffer rather than the data buffer, and hence recovery was seeing garbage in the data buffer when it got to the region of the log buffer that was incorrectly read. Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17NFS: Fix Oopses in nfs_lookup_revalidate and nfs4_lookup_revalidateTrond Myklebust
[Fixed upstream as part of 0b728e1911c, but that's a much larger patch, this is only the nfs portion backported as needed.] Fix the following Oops in 3.5.1: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 IP: [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs] PGD 337c63067 PUD 0 Oops: 0000 [#1] SMP CPU 5 Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc af_packet binfmt_misc cpufreq_conservative cpufreq_userspace cpufreq_powersave dm_mod acpi_cpufreq mperf coretemp gpio_ich kvm_intel joydev kvm ioatdma hid_generic igb lpc_ich i7core_edac edac_core ptp serio_raw dca pcspkr i2c_i801 mfd_core sg pps_core usbhid crc32c_intel microcode button autofs4 uhci_hcd ttm drm_kms_helper drm i2c_algo_bit sysimgblt sysfillrect syscopyarea ehci_hcd usbcore usb_common scsi_dh_rdac scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh edd fan ata_piix thermal processor thermal_sys Pid: 30431, comm: java Not tainted 3.5.1-2-default #1 Supermicro X8DTT/X8DTT RIP: 0010:[<ffffffffa03789cd>] [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs] RSP: 0018:ffff8801b418bd38 EFLAGS: 00010292 RAX: 00000000fffffff6 RBX: ffff88032016d800 RCX: 0000000000000020 RDX: ffffffff00000000 RSI: 0000000000000000 RDI: ffff8801824a7b00 RBP: ffff8801b418bdf8 R08: 7fffff0034323030 R09: fffffffff04c03ed R10: ffff8801824a7b00 R11: 0000000000000002 R12: ffff8801824a7b00 R13: ffff8801824a7b00 R14: 0000000000000000 R15: ffff8803201725d0 FS: 00002b53a46cb700(0000) GS:ffff88033fc20000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 000000020a426000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process java (pid: 30431, threadinfo ffff8801b418a000, task ffff8801b5d20600) Stack: ffff8801b418be44 ffff88032016d800 ffff8801b418bdf8 0000000000000000 ffff8801824a7b00 ffff8801b418bdd7 ffff8803201725d0 ffffffff8116a9c0 ffff8801b5c38dc0 0000000000000007 ffff88032016d800 0000000000000000 Call Trace: [<ffffffff8116a9c0>] lookup_dcache+0x80/0xe0 [<ffffffff8116aa43>] __lookup_hash+0x23/0x90 [<ffffffff8116b4a5>] lookup_one_len+0xc5/0x100 [<ffffffffa03869a3>] nfs_sillyrename+0xe3/0x210 [nfs] [<ffffffff8116cadf>] vfs_unlink.part.25+0x7f/0xe0 [<ffffffff8116f22c>] do_unlinkat+0x1ac/0x1d0 [<ffffffff815717b9>] system_call_fastpath+0x16/0x1b [<00002b5348b5f527>] 0x2b5348b5f526 Code: ec 38 b8 f6 ff ff ff 4c 89 64 24 18 4c 89 74 24 28 49 89 fc 48 89 5c 24 08 48 89 6c 24 10 49 89 f6 4c 89 6c 24 20 4c 89 7c 24 30 <f6> 46 38 40 0f 85 d1 00 00 00 e8 c4 c4 df e0 48 8b 58 30 49 89 RIP [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs] RSP <ffff8801b418bd38> CR2: 0000000000000038 ---[ end trace 845113ed191985dd ]--- This Oops affects 3.5 kernels and older, and is due to lookup_one_len() calling down to the dentry revalidation code with a NULL pointer to struct nameidata. It is fixed upstream by commit 0b728e1911c (stop passing nameidata * to ->d_revalidate()) Reported-by: Richard Ems <richard.ems@cape-horn-eng.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17NFS: fix bug in legacy DNS resolver.NeilBrown
commit 8d96b10639fb402357b75b055b1e82a65ff95050 upstream. The DNS resolver's use of the sunrpc cache involves a 'ttl' number (relative) rather that a timeout (absolute). This confused me when I wrote commit c5b29f885afe890f953f7f23424045cdad31d3e4 "sunrpc: use seconds since boot in expiry cache" and I managed to break it. The effect is that any TTL is interpreted as 0, and nothing useful gets into the cache. This patch removes the use of get_expiry() - which really expects an expiry time - and uses get_uint() instead, treating the int correctly as a ttl. This fixes a regression that has been present since 2.6.37, causing certain NFS accesses in certain environments to incorrectly fail. Reported-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17nfsd: add get_uint for u32'sJ. Bruce Fields
commit a007c4c3e943ecc054a806c259d95420a188754b upstream. I don't think there's a practical difference for the range of values these interfaces should see, but it would be safer to be unambiguous. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17NFSv4: nfs4_locku_done must release the sequence idTrond Myklebust
commit 2b1bc308f492589f7d49012ed24561534ea2be8c upstream. If the state recovery machinery is triggered by the call to nfs4_async_handle_error() then we can deadlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-17nfs: Show original device name verbatim in /proc/*/mount{s,info}Ben Hutchings
commit 97a54868262da1629a3e65121e65b8e8c4419d9f upstream. Since commit c7f404b ('vfs: new superblock methods to override /proc/*/mount{s,info}'), nfs_path() is used to generate the mounted device name reported back to userland. nfs_path() always generates a trailing slash when the given dentry is the root of an NFS mount, but userland may expect the original device name to be returned verbatim (as it used to be). Make this canonicalisation optional and change the callers accordingly. [jrnieder@gmail.com: use flag instead of bool argument] Reported-and-tested-by: Chris Hiestand <chiestand@salk.edu> Reference: http://bugs.debian.org/669314 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>