Age | Commit message (Collapse) | Author |
|
commit 5c06fe772da43db63b053addcd2c267f76d0be91 upstream.
D'oh...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-and-tested-by: Peter Palfrader <peter@palfrader.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 0e55a7cca4b66f625d67b292f80b6a976e77c51b upstream
Daemons that need to be launched while the rootfs is read-only can now
poll /proc/mounts to be notified when their O_RDWR requests may no
longer end in EROFS.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit d38b7aa7fc3371b52d036748028db50b585ade2e upstream
Fix a stack corruption caused by a corrupted hfs filesystem. If the
catalog name length is corrupted the memcpy overwrites the catalog btree
structure. Since the field is limited to HFS_NAMELEN bytes in the
structure and the file format, we throw an error if it is too long.
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit b27cf88e9592953ae292d05324887f2f44979433 upstream
The thread_should_wake() function trawls through the list of 'very
dirty' eraseblocks, determining whether the background GC thread should
wake. Doing this without holding the appropriate locks is a bad idea.
OLPC Trac #8615
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit dc8a0843a435b2c0891e7eaea64faaf1ebec9b11 upstream
deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf. Correct this by moving the mutex
unlock after the copy.
In addition, document what deflate_mutex actually protects.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit c87591b719737b4e91eb1a9fa8fd55a4ff1886d6 upstream
In ext3_sync_fs, we only wait for a commit to finish if we started it, but
there may be one already in progress which will not be synced.
In the case of a data=ordered umount with pending long symlinks which are
delayed due to a long list of other I/O on the backing block device, this
causes the buffer associated with the long symlinks to not be moved to the
inode dirty list in the second phase of fsync_super. Then, before they
can be dirtied again, kjournald exits, seeing the UMOUNT flag and the
dirty pages are never written to the backing block device, causing long
symlink corruption and exposing new or previously freed block data to
userspace.
This can be reproduced with a script created
by Eric Sandeen <sandeen@redhat.com>:
#!/bin/bash
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
rm -f /mnt/test2/*
dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
touch
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
ln -s
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
/mnt/test2/link
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
ls /mnt/test2/
umount /mnt/test2
To ensure all commits are synced, we flush all journal commits now when
sync_fs'ing ext3.
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
While testing more corrupted images with hfsplus, i came across
one which triggered the following bug:
[15840.675016] BUG: unable to handle kernel paging request at fffffffb
[15840.675016] IP: [<c0116a4f>] kmap+0x15/0x56
[15840.675016] *pde = 00008067 *pte = 00000000
[15840.675016] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[15840.675016] Modules linked in:
[15840.675016]
[15840.675016] Pid: 11575, comm: ln Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #29)
[15840.675016] EIP: 0060:[<c0116a4f>] EFLAGS: 00010202 CPU: 0
[15840.675016] EIP is at kmap+0x15/0x56
[15840.675016] EAX: 00000246 EBX: fffffffb ECX: 00000000 EDX: cab919c0
[15840.675016] ESI: 000007dd EDI: cab0bcf4 EBP: cab0bc98 ESP: cab0bc94
[15840.675016] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[15840.675016] Process ln (pid: 11575, ti=cab0b000 task=cab919c0 task.ti=cab0b000)
[15840.675016] Stack: 00000000 cab0bcdc c0231cfb 00000000 cab0bce0 00000800 ca9290c0 fffffffb
[15840.675016] cab145d0 cab919c0 cab15998 22222222 22222222 22222222 00000001 cab15960
[15840.675016] 000007dd cab0bcf4 cab0bd04 c022cb3a cab0bcf4 cab15a6c ca9290c0 00000000
[15840.675016] Call Trace:
[15840.675016] [<c0231cfb>] ? hfsplus_block_allocate+0x6f/0x2d3
[15840.675016] [<c022cb3a>] ? hfsplus_file_extend+0xc4/0x1db
[15840.675016] [<c022ce41>] ? hfsplus_get_block+0x8c/0x19d
[15840.675016] [<c06adde4>] ? sub_preempt_count+0x9d/0xab
[15840.675016] [<c019ece6>] ? __block_prepare_write+0x147/0x311
[15840.675016] [<c0161934>] ? __grab_cache_page+0x52/0x73
[15840.675016] [<c019ef4f>] ? block_write_begin+0x79/0xd5
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c019f22a>] ? cont_write_begin+0x27f/0x2af
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c0139ebe>] ? tick_program_event+0x28/0x4c
[15840.675016] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[15840.675016] [<c022b723>] ? hfsplus_write_begin+0x2d/0x32
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c0161988>] ? pagecache_write_begin+0x33/0x107
[15840.675016] [<c01879e5>] ? __page_symlink+0x3c/0xae
[15840.675016] [<c019ad34>] ? __mark_inode_dirty+0x12f/0x137
[15840.675016] [<c0187a70>] ? page_symlink+0x19/0x1e
[15840.675016] [<c022e6eb>] ? hfsplus_symlink+0x41/0xa6
[15840.675016] [<c01886a9>] ? vfs_symlink+0x99/0x101
[15840.675016] [<c018a2f6>] ? sys_symlinkat+0x6b/0xad
[15840.675016] [<c018a348>] ? sys_symlink+0x10/0x12
[15840.675016] [<c01038bd>] ? sysenter_do_call+0x12/0x31
[15840.675016] =======================
[15840.675016] Code: 00 00 75 10 83 3d 88 2f ec c0 02 75 07 89 d0 e8 12 56 05 00 5d c3 55 ba 06 00 00 00 89 e5 53 89 c3 b8 3d eb 7e c0 e8 16 74 00 00 <8b> 03 c1 e8 1e 69 c0 d8 02 00 00 05 b8 69 8e c0 2b 80 c4 02 00
[15840.675016] EIP: [<c0116a4f>] kmap+0x15/0x56 SS:ESP 0068:cab0bc94
[15840.675016] ---[ end trace 4fea40dad6b70e5f ]---
This happens because the return value of read_mapping_page() is passed on
to kmap unchecked. The bug is triggered after the first
read_mapping_page() in hfsplus_block_allocate(), this patch fixes all
three usages in this functions but leaves the ones further down in the
file unchanged.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit efc7ffcb4237f8cb9938909041c4ed38f6e1bf40 upstream
When an hfsplus image gets corrupted it might happen that the catalog
namelength field gets b0rked. If we mount such an image the memcpy() in
hfsplus_cat_build_key_uni() writes more than the 255 that fit in the name
field. Depending on the size of the overwritten data, we either only get
memory corruption or also trigger an oops like this:
[ 221.628020] BUG: unable to handle kernel paging request at c82b0000
[ 221.629066] IP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151
[ 221.629066] *pde = 0ea29163 *pte = 082b0160
[ 221.629066] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
[ 221.629066] Modules linked in:
[ 221.629066]
[ 221.629066] Pid: 4845, comm: mount Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #28)
[ 221.629066] EIP: 0060:[<c022d4b1>] EFLAGS: 00010206 CPU: 0
[ 221.629066] EIP is at hfsplus_find_cat+0x10d/0x151
[ 221.629066] EAX: 00000029 EBX: 00016210 ECX: 000042c2 EDX: 00000002
[ 221.629066] ESI: c82d70ca EDI: c82b0000 EBP: c82d1bcc ESP: c82d199c
[ 221.629066] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 221.629066] Process mount (pid: 4845, ti=c82d1000 task=c8224060 task.ti=c82d1000)
[ 221.629066] Stack: c080b3c4 c82aa8f8 c82d19c2 00016210 c080b3be c82d1bd4 c82aa8f0 00000300
[ 221.629066] 01000000 750008b1 74006e00 74006900 65006c00 c82d6400 c013bd35 c8224060
[ 221.629066] 00000036 00000046 c82d19f0 00000082 c8224548 c8224060 00000036 c0d653cc
[ 221.629066] Call Trace:
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c01302d2>] ? __kernel_text_address+0x1b/0x27
[ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6
[ 221.629066] [<c0109e32>] ? save_stack_address+0x0/0x2c
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d
[ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d
[ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
[ 221.629066] [<c013553d>] ? down+0xc/0x2f
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c013da5d>] ? mark_held_locks+0x43/0x5a
[ 221.629066] [<c013dc3a>] ? trace_hardirqs_on+0xb/0xd
[ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
[ 221.629066] [<c06abec8>] ? _spin_unlock_irqrestore+0x42/0x58
[ 221.629066] [<c013555c>] ? down+0x2b/0x2f
[ 221.629066] [<c022aa68>] ? hfsplus_iget+0xa0/0x154
[ 221.629066] [<c022b0b9>] ? hfsplus_fill_super+0x280/0x447
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c041c9e4>] ? string+0x2b/0x74
[ 221.629066] [<c041cd16>] ? vsnprintf+0x2e9/0x512
[ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d
[ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d
[ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
[ 221.629066] [<c01354d3>] ? up+0xc/0x2f
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c041cfb7>] ? snprintf+0x1b/0x1d
[ 221.629066] [<c01ba466>] ? disk_name+0x25/0x67
[ 221.629066] [<c0183960>] ? get_sb_bdev+0xcd/0x10b
[ 221.629066] [<c016ad92>] ? kstrdup+0x2a/0x4c
[ 221.629066] [<c022a7b3>] ? hfsplus_get_sb+0x13/0x15
[ 221.629066] [<c022ae39>] ? hfsplus_fill_super+0x0/0x447
[ 221.629066] [<c0183583>] ? vfs_kern_mount+0x3b/0x76
[ 221.629066] [<c0183602>] ? do_kern_mount+0x32/0xba
[ 221.629066] [<c01960d4>] ? do_new_mount+0x46/0x74
[ 221.629066] [<c0196277>] ? do_mount+0x175/0x193
[ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
[ 221.629066] [<c01663b2>] ? __get_free_pages+0x1e/0x24
[ 221.629066] [<c06ac07b>] ? lock_kernel+0x19/0x8c
[ 221.629066] [<c01962e6>] ? sys_mount+0x51/0x9b
[ 221.629066] [<c01962f9>] ? sys_mount+0x64/0x9b
[ 221.629066] [<c01038bd>] ? sysenter_do_call+0x12/0x31
[ 221.629066] =======================
[ 221.629066] Code: 89 c2 c1 e2 08 c1 e8 08 09 c2 8b 85 e8 fd ff ff 66 89 50 06 89 c7 53 83 c7 08 56 57 68 c4 b3 80 c0 e8 8c 5c ef ff 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 83 c3 06 8b 95 e8 fd ff ff 0f
[ 221.629066] EIP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151 SS:ESP 0068:c82d199c
[ 221.629066] ---[ end trace e417a1d67f0d0066 ]---
Since hfsplus_cat_build_key_uni() returns void and only has one callsite,
the check is performed at the callsite.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
[ backport of 7c88db0cb589df980acfb2f73c3595a0653004ec to 2.7.27.3 by Joe
Korty <joe.korty@ccur.com ]
proc: fix vma display mismatch between /proc/pid/{maps,smaps}
Commit 4752c369789250eafcd7813e11c8fb689235b0d2 aka
"maps4: simplify interdependence of maps and smaps" broke /proc/pid/smaps,
causing it to display some vmas twice and other vmas not at all. For example:
grep .- /proc/1/smaps >/tmp/smaps; diff /proc/1/maps /tmp/smaps
1 25d24
2 < 7fd7e23aa000-7fd7e23ac000 rw-p 7fd7e23aa000 00:00 0
3 28a28
4 > ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The bug has something to do with setting m->version before all the
seq_printf's have been performed. show_map was doing this correctly,
but show_smap was doing this in the middle of its seq_printf sequence.
This patch arranges things so that the setting of m->version in show_smap
is also done at the end of its seq_printf sequence.
Testing: in addition to the above grep test, for each process I summed
up the 'Rss' fields of /proc/pid/smaps and compared that to the 'VmRSS'
field of /proc/pid/status. All matched except for Xorg (which has a
/dev/mem mapping which Rss accounts for but VmRSS does not). This result
gives us some confidence that neither /proc/pid/maps nor /proc/pid/smaps
are any longer skipping or double-counting vmas.
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
|
(CVE-2008-3528)
This is a trivial backport of the following upstream commits:
- bd39597cbd42a784105a04010100e27267481c67 (ext2)
- cdbf6dba28e8e6268c8420857696309470009fd9 (ext3)
- 9d9f177572d9e4eba0f2e18523b44f90dd51fe74 (ext4)
This addresses CVE-2008-3528
ext[234]: Avoid printk floods in the face of directory corruption
Note: some people thinks this represents a security bug, since it
might make the system go away while it is printing a large number of
console messages, especially if a serial console is involved. Hence,
it has been assigned CVE-2008-3528, but it requires that the attacker
either has physical access to your machine to insert a USB disk with a
corrupted filesystem image (at which point why not just hit the power
button), or is otherwise able to convince the system administrator to
mount an arbitrary filesystem image (at which point why not just
include a setuid shell or world-writable hard disk device file or some
such). Me, I think they're just being silly. --tytso
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit a364bc0b37f14ffd66c1f982af42990a9d77fa43 upstream
We recently fixed the cifs readdir code so that it saves the resume key
before calling CIFSFindNext. Unfortunately, this assumes that we have
just done a CIFSFindFirst (or FindNext) and have resume info to save.
This isn't necessarily the case. Fix the code to save resume info if we
had to reinitiate the search, and after a FindNext.
This fixes connectathon basic test6 against NetApp filers.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 6c5e51dae2c37127e00be392f40842e08077e96a upstream
When we skip unrecognized options in xfs_fs_remount we should just break
out of the switch and not return because otherwise we may skip clearing
the xfs-internal read-only flag. This will only show up on some
operations like touch because most read-only checks are done by the VFS
which thinks this filesystem is r/w. Eventually we should replace the
XFS read-only flag with a helper that always checks the VFS flag to make
sure they can never get out of sync.
Bug reported and fix verified by Marcel Beister on #xfs.
Bug fix verified by updated xfstests/189.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 0752f1522a9120f731232919f7ad904e9e22b8ce upstream
When we do a seekdir() or equivalent, we usually end up doing a
FindFirst call and then call FindNext until we get to the offset that we
want. The problem is that when we call FindNext, the code usually
doesn't have the proper info (mostly, the filename of the entry from the
last search) to resume the search.
Add a "last_entry" field to the cifs_search_info that points to the last
entry in the search. We calculate this pointer by using the
LastNameOffset field from the search parms that are returned. We then
use that info to do a cifs_save_resume_key before we call CIFSFindNext.
This patch allows CIFS to reliably pass the "telldir" connectathon test.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
commit 73f6aa4d44ab6157badc456ddfa05b31e58de5f0 upstream.
Currently we disable barriers as soon as we get a buffer in xlog_iodone
that has the XBF_ORDERED flag cleared. But this can be the case not only
for buffers where the barrier failed, but also the first buffer of a
split log write in case of a log wraparound. Due to the disabled
barriers we can easily get directory corruption on unclean shutdowns.
So instead of using this check add a new buffer flag for failed barrier
writes.
This is a regression vs 2.6.26 caused by patch to use the right macro
to check for the ORDERED flag, as we previously got true returned for
every buffer.
Thanks to Toei Rei for reporting the bug.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
This is debatable, but while we're debating it, let's disallow the
combination of splice and an O_APPEND destination.
It's not entirely clear what the semantics of O_APPEND should be, and
POSIX apparently expects pwrite() to ignore O_APPEND, for example. So
we could make up any semantics we want, including the old ones.
But Miklos convinced me that we should at least give it some thought,
and that accepting writes at arbitrary offsets is wrong at least for
IS_APPEND() files (which always have O_APPEND set, even if the reverse
isn't true: you can obviously have O_APPEND set on a regular file).
So disallow O_APPEND entirely for now. I doubt anybody cares, and this
way we have one less gray area to worry about.
Reported-and-argued-for-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Jens Axboe <ens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The previous patch db203d53d474aa068984e409d807628f5841da1b ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).
However do_truncate is what caused the lock ordering error, due to it
taking i_mutex. In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix inotify lock order reversal with mmap_sem due to holding locks over
copy_to_user.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Tested-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.
CPU0 CPU1
try_to_unuse
looks at mm = task0->mm
increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
mmput(mm) decrements mm->mm_users
task0 freed
dereferencing mm->owner fails
The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.
Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.
Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.
Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.
Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
because it really just essentially replaces a memcmp(). The filesystem
is supposed to just compare the two names with whatever case-independent
or other function.
And when I say 'is supposed to', I obviously mean that 'procfs does odd
things, and actually looks at the dentry that we don't even pass down,
rather than just the name'. Which results in problems, because we
actually call d_compare before we have even verified that the dentry is
still hashed at all.
And that causes a problm since the inode that procfs looks at may have
been free'd and the d_inode pointer is NULL. procfs just assumes that
all dentries are positive, since procfs itself never generates a
negative one. But memory pressure will still result in the dentry
getting torn down, and as it is removed by RCU, it still remains visible
on some lists - and to d_compare.
If the filesystem just did a name comparison, we wouldn't care. And we
could just fix procfs to know about negative dentries too. But rather
than have the low-level filesystems know about internal VFS details,
just move the check for a unhashed dentry up a bit, so that we will only
call d_compare on dentries that are still active.
The actual oops this caused didn't look like a NULL pointer dereference
because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
to get at its internal proc_inode information from the inode pointer,
and accessed a field below the inode. So the oops would look something
like
BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50
and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
ppc64 (Hugh Dickins).
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* git://oss.sgi.com:8090/xfs/linux-2.6:
[XFS] Remove xfs_iext_irec_compact_full()
[XFS] Fix extent list corruption in xfs_iext_irec_compact_full().
|
|
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: fix printk format warnings
UBIFS: remove incorrect assert
UBIFS: TNC / GC race fixes
UBIFS: create the name of the background thread in every case
|
|
Yet another bug was found in xfs_iext_irec_compact_full() and while the
source of the bug was found it wasn't an easy task to track it down
because the conditions are very difficult to reproduce.
A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their
significant effort in tracking down the source of this corruption.
xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost
identical - they both compact indirect extent lists by moving extents from
subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only
moves extents if all of the extents in the next buffer will fit into the
empty space in the buffer before it. xfs_iext_irec_compact_full() will go
a step further and move part of the next buffer if all the extents wont
fit. It will then shift the remaining extents in the next buffer up to the
start of the buffer. The bug here was that we did not update er_extoff and
this caused extent list corruption.
It does not appear that this extra functionality gains us much. Calling
xfs_iext_irec_compact_pages() instead will do a good enough job at
compacting the indirect list and will be quicker too.
For the case in xfs_iext_indirect_to_direct() the total number of extents
in the indirect list will fit into one buffer so we will never need the
extra functionality of xfs_iext_irec_compact_full() there.
Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the
buffers will never overlap) so we don't want the performance hit that can
incur.
SGI-PV: 987159
SGI-Modid: xfs-linux-melb:xfs-kern:32166a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
|
If we don't move all the records from the next buffer into the current
buffer then we need to update the er_extoff field of the next buffer as we
shift the remaining records to the start of the buffer.
SGI-PV: 987159
SGI-Modid: xfs-linux-melb:xfs-kern:32165a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Russell Cattelan <cattelan@thebarn.com>
|
|
In case of error, the function p9_client_walk returns an ERR pointer, but
never returns a NULL pointer. So a NULL test that comes after an IS_ERR
test should be deleted.
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = p9_client_walk(...)
... when != x = E
* if (x != NULL)
S1 else S2
// </smpl>
Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long
unsigned int', but argument 5 has type 'long unsigned int'
fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long
unsigned int', but argument 2 has type 'long unsigned int'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
The assert was not valid because one of the variables
'taken_empty_lebs' has transient values out of sync
with the other variables.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
- update GC sequence number if any nodes may have been moved
even if GC did not finish the LEB
- don't ignore error return when reading
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
If the ubifs partition is mounted RO and then remounted RW we end
up with no thread name in ubifs_remount_rw() and the thread appears
nameless.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.
SGI-PV: 983683
SGI-Modid: xfs-linux-melb:xfs-kern:32055a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
|
|
We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.
SGI-PV: 985757
SGI-Modid: xfs-linux-melb:xfs-kern:32025a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
|
|
If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.
To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks
at once - force calers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.
SGI-PV: 986238
SGI-Modid: xfs-linux-melb:xfs-kern:31999a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
The current code in xlog_iodone() uses the wrong macro to check if the
barrier has been cleared due to an EOPNOTSUPP error form the lower layer.
SGI-PV: 986143
SGI-Modid: xfs-linux-melb:xfs-kern:31984a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
With the help from some tracing I found that we try to map extents beyond
eof when doing a direct I/O read. It appears that the way to inform the
generic direct I/O path (ie do_direct_IO()) that we have breached eof is
to return an unmapped buffer from xfs_get_blocks_direct(). This will cause
do_direct_IO() to jump to the hole handling code where is will check for
eof and then abort.
This problem was found because a direct I/O read was trying to map beyond
eof and was encountering delayed allocations. The delayed allocations
beyond eof are speculative allocations and they didn't get converted when
the direct I/O flushed the file because there was only enough space in the
current AG to convert and write out the dirty pages within eof. Note that
xfs_iomap_write_allocate() wont necessarily convert all the delayed
allocation passed to it - it will return after allocating the first extent
- so if the delayed allocation extends beyond eof then it will stay that
way.
SGI-PV: 983683
SGI-Modid: xfs-linux-melb:xfs-kern:31929a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
|
|
Logically we would return an error in xfs_fs_remount code to prevent users
from believing they might have changed mount options using remount which
can't be changed.
But unfortunately mount(8) adds all options from mtab and fstab to the
mount arguments in some cases so we can't blindly reject options, but have
to check for each specified option if it actually differs from the
currently set option and only reject it if that's the case.
Until that is implemented we return success for every remount request, and
silently ignore all options that we can't actually change.
SGI-PV: 985710
SGI-Modid: xfs-linux-melb:xfs-kern:31908a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
Memory allocations for log->l_grant_trace and iclog->ic_trace are done on
demand when the first event is logged. In xlog_state_get_iclog_space() we
call xlog_trace_iclog() under a spinlock and allocating memory here can
cause us to sleep with a spinlock held and deadlock the system.
For the log grant tracing we use KM_NOSLEEP but that means we can lose
trace entries. Since there is no locking to serialize the log grant
tracing we could race and have multiple allocations and leak memory.
So move the allocations to where we initialize the log/iclog structures.
Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace
since it's not even used.
SGI-PV: 983738
SGI-Modid: xfs-linux-melb:xfs-kern:31896a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
|
|
Herton Krzesinski reports that the error-checking changes in
04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount. "The camera is an Olympus X-840.
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".
Buggy devices happen. It is better for the kernel to warn and to proceed
with the mount.
Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A "Quicklists: 0 kB" line has just started appearing in
/proc/meminfo, but most architectures (including x86) don't have
them configured, so #ifdef it, like the highmem lines.
And those architectures which do have quicklists configured are
using them for page tables: so let's place it next to PageTables.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This fixes:
=============================================
[ INFO: possible recursive locking detected ]
2.6.27-rc5-00283-g70bb089 #68
---------------------------------------------
touch/6855 is trying to acquire lock:
(&info->bfs_lock){--..}, at: [<c02262f5>] bfs_delete_inode+0x9e/0x18c
but task is already holding lock:
(&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187
other info that might help us debug this:
2 locks held by touch/6855:
#0: (&type->i_mutex_dir_key#5){--..}, at: [<c018ad13>] do_filp_open+0x10b/0x62f
#1: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187
stack backtrace:
Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb089 #68
[<c013e769>] validate_chain+0x458/0x9f4
[<c013bece>] ? trace_hardirqs_off+0xb/0xd
[<c013f36b>] __lock_acquire+0x666/0x6e0
[<c013f440>] lock_acquire+0x5b/0x77
[<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
[<c06aab74>] mutex_lock_nested+0xbc/0x234
[<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
[<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
[<c02262f5>] bfs_delete_inode+0x9e/0x18c
[<c0226257>] ? bfs_delete_inode+0x0/0x18c
[<c01925e1>] generic_delete_inode+0x94/0xfe
[<c019265d>] generic_drop_inode+0x12/0x12f
[<c0191b7e>] iput+0x4b/0x4e
[<c0226d1e>] bfs_create+0x163/0x187
[<c0188b42>] vfs_create+0xa6/0x114
[<c018adb5>] do_filp_open+0x1ad/0x62f
[<c0107cdc>] ? native_sched_clock+0x82/0x96
[<c06ac309>] ? _spin_unlock+0x27/0x3c
[<c019379e>] ? alloc_fd+0xbf/0xc9
[<c06ae2f4>] ? sub_preempt_count+0x9d/0xab
[<c019379e>] ? alloc_fd+0xbf/0xc9
[<c0180391>] do_sys_open+0x42/0xb8
[<c041d564>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c0180449>] sys_open+0x1e/0x26
[<c01038bd>] sysenter_do_call+0x12/0x31
=======================
The problem is that we don't unlock the bfs->lock mutex before calling
iput (we do in the other cases).
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Print parent directory name as well.
The aim is to catch non-creation of parent directory when proc_mkdir will
return NULL and all subsequent registrations go directly in /proc instead
of intended directory.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Fixed insane printk string while at it. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
udf: add llseek method
udf: Fix error paths in udf_new_inode()
udf: Fix lock inversion between iprune_mutex and alloc_mutex (v2)
|
|
ocfs2 will become read-only if we try to read the bytes which pass
the end of i_size. This can be easily reproduced by following steps:
1. mkfs a ocfs2 volume with bs=4k cs=4k and nosparse.
2. create a small file(say less than 100 bytes) and we will create the file
which is allocated 1 cluster.
3. read 8196 bytes from the kernel using O_DIRECT which exceeds the limit.
4. The ocfs2 volume becomes read-only and dmesg shows:
OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks:
Inode 66010 has a hole at block 1
File system is now read-only due to the potential of on-disk corruption.
Please run fsck.ocfs2 once the file system is unmounted.
So suppress the ERROR message.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: make minimum fanout 3
UBIFS: fix division by zero
UBIFS: amend f_fsid
UBIFS: fill f_fsid
UBIFS: improve statfs reporting even more
UBIFS: introduce LEB overhead
UBIFS: add forgotten gc_idx_lebs component
UBIFS: fix assertion
UBIFS: improve statfs reporting
UBIFS: remove incorrect index space check
UBIFS: push empty flash hack down
UBIFS: do not update min_idx_lebs in stafs
UBIFS: allow for racing between GC and TNC
UBIFS: always read hashed-key nodes under TNC mutex
UBIFS: fix zero-length truncations
|
|
Automounter maps can contain mount options valid for other NFS
implementations but not for Linux. The Linux automounter uses the
mount command's "-s" command line option ("s" for "sloppy") so that
mount requests containing such options are not rejected.
Commit f45663ce5fb30f76a3414ab3ac69f4dd320e760a attempted to address a
known regression with text-based NFS mount option parsing. Unrecognized
mount options would cause mount requests to fail, even if the "-s"
option was used on the mount command line.
Unfortunately, this commit was not complete as submitted. It adds a
new mount option, "sloppy". But it is missing a hunk, so it now allows
NFS mounts with unrecognized mount options, even if the "sloppy" option
is not present. This could be a problem if a required critical mount
option such as "sync" is misspelled, for example, and is considered a
regression from 2.6.26.
This patch restores the missing hunk. Now, the default behavior of
text-based NFS mount options is as before: any unrecognized mount option
will cause the mount to fail.
Please include this in 2.6.27-rc.
Thanks to Neil Brown for reporting this.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
UDF currently doesn't set a llseek method for regular files, which
means it will fall back to default_llseek. This means no one can seek
beyond 2 Gigabytes on udf, and that there's not protection vs
the i_size updates from writers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
UBIFS does not really work correctly when fanout is 2,
because of the way we manage the indexing tree. It may
just become a list and UBIFS screws up.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
If fanout is 3, we have division by zero in
'ubifs_read_superblock()':
divide error: 0000 [#1] PREEMPT SMP
Pid: 28744, comm: mount Not tainted (2.6.27-rc4-ubifs-2.6 #23)
EIP: 0060:[<f8f9e3ef>] EFLAGS: 00010202 CPU: 0
EIP is at ubifs_reported_space+0x2d/0x69 [ubifs]
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: f0ae64b0 EBP: f1f9fcf4 ESP: f1f9fce0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
Spencer reported a problem where utime and stime were going negative despite
the fixes in commit b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa. The suspected
reason for the problem is that signal_struct maintains it's own utime and
stime (of exited tasks), these are not updated using the new task_utime()
routine, hence sig->utime can go backwards and cause the same problem
to occur (sig->utime, adds tsk->utime and not task_utime()). This patch
fixes the problem
TODO: using max(task->prev_utime, derived utime) works for now, but a more
generic solution is to implement cputime_max() and use the cputime_gt()
function for comparison.
Reported-by: spencer@bluehost.com
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
David Woodhouse suggest |