Age | Commit message (Collapse) | Author |
|
hugetlb_vmtruncate_list was misconverted to prio_tree: its prio_tree is in
units of PAGE_SIZE (PAGE_CACHE_SIZE) like any other, not HPAGE_SIZE (whereas
its radix_tree is kept in units of HPAGE_SIZE, otherwise slots would be
absurdly sparse).
At first I thought the error benign, just calling __unmap_hugepage_range on
more vmas than necessary; but on 32-bit machines, when the prio_tree is
searched correctly, it happens to ensure the v_offset calculation won't
overflow. As it stood, when truncating at or beyond 4GB, it was liable to
discard pages COWed from lower offsets; or even to clear pmd entries of
preceding vmas, triggering exit_mmap's BUG_ON(nr_ptes).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
|
In kernel bugzilla #6248 (http://bugzilla.kernel.org/show_bug.cgi?id=6248),
Adrian Bunk <bunk@stusta.de> notes that CONFIG_HUGETLBFS is missing Kconfig
help text.
Signed-off-by: Arthur Othieno <apgo@patchbomb.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
|
Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir. This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.
Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.
Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
|
This fixes a vulnerability in the "parent process death signal"
implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd.
and iSEC Security Research.
http://marc.info/?l=bugtraq&m=118711306802632&w=2
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
|
Shows how many people are testing coda - the bug had been there for 5 years
and results of stepping on it are not subtle.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Local variable `i' is a byte-counter. Don't use it as an index into an array
of le32's.
Reported-by: "young dave" <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The user can generate console output if they cause do_mmap() to fail
during sys_io_setup(). This was seen in a regression test that does
exactly that by spinning calling mmap() until it gets -ENOMEM before
calling io_setup().
We don't need this printk at all, just remove it.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This patch fixes ext3 block bitmap leakage,
which leads to the following fsck messages on
_healthy_ filesystem:
Block bitmap differences: -64159 -73707
All kernels up to 2.6.17 have this bug.
Found by
Vasily Averin <vvs@sw.ru> and Andrey Savochkin <saw@sawoct.com>
Test case triggered the issue was created by
Dmitry Monakhov <dmonakhov@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
CVE-2006-5753 is for a case where an inode can be marked bad, switching
the ops to bad_inode_ops, which are all connected as:
static int return_EIO(void)
{
return -EIO;
}
#define EIO_ERROR ((void *) (return_EIO))
static struct inode_operations bad_inode_ops =
{
.create = bad_inode_create
...etc...
The problem here is that the void cast causes return types to not be
promoted, and for ops such as listxattr which expect more than 32 bits of
return value, the 32-bit -EIO is interpreted as a large positive 64-bit
number, i.e. 0x00000000fffffffa instead of 0xfffffffa.
This goes particularly badly when the return value is taken as a number of
bytes to copy into, say, a user's buffer for example...
I originally had coded up the fix by creating a return_EIO_<TYPE> macro
for each return type, like this:
static int return_EIO_int(void)
{
return -EIO;
}
#define EIO_ERROR_INT ((void *) (return_EIO_int))
static struct inode_operations bad_inode_ops =
{
.create = EIO_ERROR_INT,
...etc...
but Al felt that it was probably better to create an EIO-returner for each
actual op signature. Since so few ops share a signature, I just went ahead
& created an EIO function for each individual file & inode op that returns
a value.
Adrian Bunk:
backported to 2.6.16
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Due to type confusion, when an nfsacl verison 2 'ACCESS' request
finishes and tries to clean up, it calls fh_put on entiredly the
wrong thing and this can cause an oops.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This also adds he required page "writeback" flag handling, that cifs
hasn't been doing and that the page dirty flag changes made obvious.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Acked-by: Steve French <smfltc@us.ibm.com>
|
|
Fix insecure default behaviour reported by Tigran Aivazian: if an
ext2 or ext3 filesystem is tuned to mount with "acl", but mounted by
a kernel built without ACL support, then umask was ignored when creating
inodes - though root or user has umask 022, touch creates files as 0666,
and mkdir creates directories as 0777.
This appears to have worked right until 2.6.11, when a fix to the default
mode on symlinks (always 0777) assumed VFS applies umask: which it does,
unless the mount is marked for ACLs; but ext[23] set MS_POSIXACL in
s_flags according to s_mount_opt set according to def_mount_opts.
We could revert to the 2.6.10 ext[23]_init_acl (adding an S_ISLNK test);
but other filesystems only set MS_POSIXACL when ACLs are configured. We
could fix this at another level; but it seems most robust to avoid setting
the s_mount_opt flag in the first place (at the expense of more ifdefs).
Likewise don't set the XATTR_USER flag when built without XATTR support.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This patch fixes a confusion reiserfs has for a long time.
On release file operation reiserfs used to try to pack file data stored in
last incomplete page of some files into metadata blocks. After packing the
page got cleared with clear_page_dirty. It did not take into account that
the page may be mmaped into other process's address space. Recent
replacement for clear_page_dirty cancel_dirty_page found the confusion with
sanity check that page has to be not mapped.
The patch fixes the confusion by making reiserfs avoid tail packing if an
inode was ever mmapped. reiserfs_mmap and reiserfs_file_release are
serialized with mutex in reiserfs specific inode. reiserfs_mmap locks the
mutex and sets a bit in reiserfs specific inode flags.
reiserfs_file_release checks the bit having the mutex locked. If bit is
set - tail packing is avoided. This eliminates a possibility that mmapped
page gets cancel_page_dirty-ed.
Signed-off-by: Vladimir Saveliev <vs@namesys.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fix filenames on adfs discs being terminated at the first character greater
than 128 (adfs filenames are Latin 1). I saw this problem when using a
loopback adfs image on a 2.6.17-rc5 x86_64 machine, and the patch fixed it
there.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This was triggered, but not the fault of, the dirty page accounting
patches. Suitable for -stable as well, after it goes upstream.
Unable to handle kernel NULL pointer dereference at virtual address 0000004c
EIP is at _spin_lock+0x12/0x66
Call Trace:
[<401766e7>] __set_page_dirty_buffers+0x15/0xc0
[<401401e7>] set_page_dirty+0x2c/0x51
[<40140db2>] set_page_dirty_balance+0xb/0x3b
[<40145d29>] __do_fault+0x1d8/0x279
[<40147059>] __handle_mm_fault+0x125/0x951
[<401133f1>] do_page_fault+0x440/0x59f
[<4034d0c1>] error_code+0x39/0x40
[<08048a33>] 0x8048a33
=======================
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
OpenVZ Linux kernel team has found a problem with mounting in compat mode.
Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
leads to oops:
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
PGD 34d48067 PUD 34d03067 PMD 0
Oops: 0000 [1] SMP
CPU: 0
Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota
parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev
iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS
iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal
processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801
i2c_core e100 mii floppy ide_cd cdrom
Pid: 14656, comm: mount
RIP: 0060:[<ffffffff802bc7c6>] [<ffffffff802bc7c6>]
compat_sys_mount+0xd6/0x290
RSP: 0000:ffff810034d31f38 EFLAGS: 00010292
RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90
RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d
R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000
R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000
FS: 0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0
CS: 0060 DS: 007b ES: 007b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0
Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task
ffff810034c86bc0)
Stack: 0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48
0000000000000000 0000000000000000 0000000000000000 0000000000000000
000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8021e64e>] ia32_sysret+0x0/0xa
Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a
RIP [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
RSP <ffff810034d31f38>
CR2: 0000000000000000
The problem is that data_page pointer can be NULL, so we should skip data
conversion in this case.
Signed-off-by: Andrey Mirkin <amirkin@openvz.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Steve Grubb's fzfuzzer tool (http://people.redhat.com/sgrubb/files/
fsfuzzer-0.6.tar.gz) generates corrupt Cramfs filesystems which cause
Cramfs to kernel oops in cramfs_uncompress_block(). The cause of the oops
is an unchecked corrupted block length field read by cramfs_readpage().
This patch adds a sanity check to cramfs_readpage() which checks that the
block length field is sensible. The (PAGE_CACHE_SIZE << 1) size check is
intentional, even though the uncompressed data is not going to be larger
than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than
the original source data. Mkcramfs checks that the compressed size is
always less than or equal to PAGE_CACHE_SIZE << 1. Of course Cramfs could
use the original uncompressed data in this case, but it doesn't.
Signed-off-by: Phillip Lougher <phillip@lougher.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz
Basically it makes a filesystem, splats some random bits over it, then
tries to mount it and do some simple filesystem actions.
At best, the filesystem catches the corruption gracefully. At worst,
things spin out of control.
As you might guess, we found a couple places in ext3 where things spin out
of control :)
First, we had a corrupted directory that was never checked for
consistency... it was corrupt, and pointed to another bad "entry" of
length 0. The for() loop looped forever, since the length of
ext3_next_entry(de) was 0, and we kept looking at the same pointer over and
over and over and over... I modeled this check and subsequent action on
what is done for other directory types in ext3_readdir...
(adding this check adds some computational expense; I am testing a followup
patch to reduce the number of times we check and re-check these directory
entries, in all cases. Thanks for the idea, Andreas).
Next we had a root directory inode which had a corrupted size, claimed to
be > 200M on a 4M filesystem. There was only really 1 block in the
directory, but because the size was so large, readdir kept coming back for
more, spewing thousands of printk's along the way.
Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks worth of bytes,
stop trying, and break out of the loop.
With these two changes fsfuzz test survives quite well on ext3.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This one was pointed out on the MOKB site:
http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html
If a directory's i_size is corrupted, ext2_find_entry() will keep processing
pages until the i_size is reached, even if there are no more blocks associated
with the directory inode. This patch puts in some minimal sanity-checking
so that we don't keep checking pages (and issuing errors) if we know there
can be no more data to read, based on the block count of the directory inode.
This is somewhat similar in approach to the ext3 patch I sent earlier this
year.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
http://kernelfun.blogspot.com/2006/11/mokb-14-11-2006-linux-26x-selinux.html
mount that image...
fs: filesystem was not cleanly unmounted, running fsck.hfs is recommended. mounting read-only.
hfs: get root inode failed.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018
printing eip
...
EIP is at superblock_doinit+0x21/0x767
...
[] selinux_sb_kern_mount+0xc/0x4b
[] vfs_kern_mount+0x99/0xf6
[] do_kern_mount+0x2d/0x3e
[] do_mount+0x5fa/0x66d
[] sys_mount+0x77/0xae
[] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
hfs_fill_super() returns success even if
root_inode = hfs_iget(sb, &fd.search_key->cat, &rec);
or
sb->s_root = d_alloc_root(root_inode);
fails. This superblock finds its way to superblock_doinit() which does:
struct dentry *root = sb->s_root;
struct inode *inode = root->d_inode;
and boom. Need to make sure the error cases return an error, I think.
[akpm@osdl.org: return -ENOMEM on oom]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
If grow_buffers() is for some reason passed a block number which wants to li
outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then
will accidentally truncate `index' and will then instnatiate a page at the
wrong pagecache offset. This causes __getblk_slow() to go into an infinite
loop.
This can happen with corrupted disks, or with software errors elsewhere.
Detect that, and handle it.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fuse didn't always call i_size_write() with i_mutex held which caused
rare hangs on SMP/32bit. This bug has been present since fuse-2.2,
well before being merged into mainline.
The simplest solution is to protect i_size_write() with the
per-connection spinlock. Using i_mutex for this purpose would require
some restructuring of the code and I'm not even sure it's always safe
to acquire i_mutex in all places i_size needs to be set.
Since most of vmtruncate is already duplicated for other reasons,
duplicate the remaining part as well, making all i_size_write() calls
internal to fuse.
Using i_size_write() was unnecessary in fuse_init_inode(), since this
function is only called on a newly created locked inode.
Reported by a few people over the years, but special thanks to Dana
Henriksen who was persistent enough in helping me debug it.
Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
If the open intents tell us that a given lookup is going to result in a,
exclusive create, we currently optimize away the lookup call itself. The
reason is that the lookup would not be atomic with the create RPC call, so
why do it in the first place?
A problem occurs, however, if the VFS aborts the exclusive create operation
after the lookup, but before the call to create the file/directory: in this
case we will end up with a hashed negative dentry in the dcache that has
never been looked up.
Fix this by only actually hashing the dentry once the create operation has
been successfully completed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fix check for bad address; use macro instead of open-coding two checks.
Taken from RHEL4 kernel update.
From: Ernie Petrides <petrides@redhat.com>
For background, the BAD_ADDR() macro should return TRUE if the address is
TASK_SIZE, because that's the lowest address that is *not* valid for
user-space mappings. The macro was correct in binfmt_aout.c but was wrong
for the "equal to" case in binfmt_elf.c. There were two in-line validations
of user-space addresses in binfmt_elf.c, which have been appropriately
converted to use the corrected BAD_ADDR() macro in the patch you posted
yesterday. Note that the size checks against TASK_SIZE are okay as coded.
The additional changes that I propose are below. These are in the error
paths for bad ELF entry addresses once load_elf_binary() has already
committed to exec'ing the new image (following the tearing down of the
task's original address space).
The 1st hunk deals with the interp-side of the outer "if". There were two
problems here. The printk() should be removed because this path can be
triggered at will by a bogus interpreter image created and used by a
malicious user. Further, the error code should not be ENOEXEC, because that
causes the loop in search_binary_handler() to continue trying other exec
handlers (twice, in fact). But it's too late for this to work correctly,
because the user address space has already been torn down, and an exec()
failure cannot be returned to the user code because the code no longer
exists. The only recovery is to force a SIGSEGV, but it's best to terminate
the search loop immediately. I somewhat arbitrarily chose EINVAL as a
fallback error code, but any error returned by load_elf_interp() will
override that (but this value will never be seen by user-space).
The 2nd hunk deals with the non-interp-side of the outer "if". There were
two problems here as well. The SIGSEGV needs to be forced, because a prior
sigaction() syscall might have set the associated disposition to SIG_IGN.
And the ENOEXEC should be changed to EINVAL as described above.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
fcntl(F_SETSIG) no longer works on leases because
lease_release_private_callback() gets called as the lease is copied in
order to initialise it.
The problem is that lease_alloc() performs an unnecessary initialisation,
which sets the lease_manager_ops. Avoid the problem by allocating the
target lease structure using locks_alloc_lock().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
diRead and diWrite are representing the page number as an unsigned int.
This causes file system corruption on volumes larger than 16TB.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree
stopped calling the readdir member of a file_operations struct with the big
kernel lock held, and fixed up all the readdir functions to do their own
locking. However, that change added calls to unlock_kernel() in
vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to
lock_kernel().
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call. Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Fixes Samba bugzilla bug # 4182
Rename by handle failures (retry after rename by path) were not
being returned back.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Following function can drops d_count twice against one reference
by lookup_one_len.
<SOURCE>
/**
* sysfs_update_file - update the modified timestamp on an object attribute.
* @kobj: object we're acting for.
* @attr: attribute descriptor.
*/
int sysfs_update_file(struct kobject * kobj, const struct attribute * attr)
{
struct dentry * dir = kobj->dentry;
struct dentry * victim;
int res = -ENOENT;
mutex_lock(&dir->d_inode->i_mutex);
victim = lookup_one_len(attr->name, dir, strlen(attr->name));
if (!IS_ERR(victim)) {
/* make sure dentry is really there */
if (victim->d_inode &&
(victim->d_parent->d_inode == dir->d_inode)) {
victim->d_inode->i_mtime = CURRENT_TIME;
fsnotify_modify(victim);
/**
* Drop reference from initial sysfs_get_dentry().
*/
dput(victim);
res = 0;
} else
d_drop(victim);
/**
* Drop the reference acquired from sysfs_get_dentry() above.
*/
dput(victim);
}
mutex_unlock(&dir->d_inode->i_mutex);
return res;
}
</SOURCE>
PCI-hotplug (drivers/pci/hotplug/pci_hotplug_core.c) is only user of
this function. I confirmed that dentry of /sys/bus/pci/slots/XXX/*
have negative d_count value.
This patch removes unnecessary dput().
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
In bugzilla #6941, Jens Kilian reported:
"The function befs_utf2nls (in fs/befs/linuxvfs.c) writes a 0 byte past the
end of a block of memory allocated via kmalloc(), leading to memory
corruption. This happens only for filenames which are pure ASCII and a
multiple of 4 bytes in length. [...]
Without DEBUG_SLAB, this leads to further corruption and hard lockups; I
believe this is the bug which has made kernels later than 2.6.8 unusable
for me. (This must be due to changes in memory management, the bug has
been in the BeFS driver since the time it was introduced (AFAICT).)
Steps to reproduce:
Create a directory (in BeOS, naturally :-) with files named, e.g.,
"1", "22", "333", "4444", ... Mount it in Linux and do an "ls" or "find""
This patch implements the suggested fix. Credits to Jens Kilian for
debugging the problem and finding the right fix.
Signed-off-by: Diego Calleja <diegocg@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
For files other than IFREG, nobh option doesn't make sense. Modifications
to them are journalled and needs buffer heads to do that. Without this
patch, we get kernel oops in page_buffers().
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The current sun disklabel code uses a signed int for the sector count.
When partitions larger than 1 TB are used, the cast to a sector_t causes
the partition sizes to be invalid:
# cat /proc/paritions | grep sdan
66 112 2146435072 sdan
66 115 9223372036853660736 sdan3
66 120 9223372036853660736 sdan8
This patch switches the sector count to an unsigned int to fix this.
Eric Sandeen also submitted the same patch.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This just turns off chmod() on the /proc/<pid>/ files, since there is no
good reason to allow it, and had we disallowed it originally, the nasty
/proc race exploit wouldn't have been possible.
The other patches already fixed the problem chmod() could cause, so this
is really just some final mop-up..
This particular version is based off a patch by Eugene and Marcel which
had much better naming than my original equivalent one.
Signed-off-by: Eugene Teo <eteo@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Else a subsequent bio_clone might make a mess.
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Make cifsd allow us to suspend if it has lost the connection with a server
Ref: http://bugzilla.kernel.org/show_bug.cgi?id=6811
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Since cifs_unlink can also be called from rename path and there
was one report of oops am making the extra check for null inode.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
heavy stress.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
When found, it is obvious. nfds calculated when allocating fdsets is
rewritten by calculation of size of fdtable, and when we are unlucky, we
try to free fdsets of wrong size.
Found due to OpenVZ resource management (User Beancounters).
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This prevents bad inode numbers from triggering errors in
ext2_get_inode.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
2.6.16 leaks like hell. While testing, I found massive filp leakage
(reproduced in openvz) in the bowels of namei.c.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
blatantly ripped off from Neil Brown's ext2 patch.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
The inode number out of an NFS file handle gets passed eventually to
ext3_get_inode_block() without any checking. If ext3_get_inode_block()
allows it to trigger an error, then bad filehandles can have unpleasant
effect - ext3_error() will usually cause a forced read-only remount, or a
panic if `errors=panic' was used.
So remove the call to ext3_error there and put a matching check in
ext3/namei.c where inode numbers are read off storage.
Andrew Morton fixed an off-by-one error.
Dann Frazier ported the patch to 2.6.16.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
UDF code is not really ready to handle extents larger that 1GB. This is
the easy way to forbid creating those.
Also truncation code did not count with the case when there are no
extents in the file and we are extending the file.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Looking at the reiser4 crash, I found a leak in debugfs. In
debugfs_mknod(), we create the inode before checking if the dentry
already has one attached. We don't free it if that is the case.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
If get_user_pages() returns less pages than what we asked for, we
jump to out_unmap which will return ERR_PTR(ret). But ret can contain
a positive number just smaller than local_nr_pages, so be sure to set
it to -EFAULT always.
Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
Relax /proc fix a bit
Clearign all of i_mode was a bit draconian. We only really care about
S_ISUID/ISGID, after all.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Fix nasty /proc vulnerability
We have a bad interaction with both the kernel and user space being able
to change some of the /proc file status. This fixes the most obvious
part of it, but I expect we'll also make it harder for users to modify
even their "own" files in /proc.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|