From 1a8edf40e7c3eee955e0dd0316a7c9d85e36f597 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 15 Jan 2011 13:12:53 -0500 Subject: do_lookup() fix do_lookup() has a path leading from LOOKUP_RCU case to non-RCU crossing of mountpoints, which breaks things badly. If we hit need_revalidate: and do nothing in there, we need to come back into LOOKUP_RCU half of things, not to done: in non-RCU one. Signed-off-by: Al Viro --- fs/namei.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/namei.c b/fs/namei.c index 8df7a78ace5..529e917ad2f 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1089,6 +1089,7 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, nd->seq = seq; if (dentry->d_flags & DCACHE_OP_REVALIDATE) goto need_revalidate; +done2: path->mnt = mnt; path->dentry = dentry; __follow_mount_rcu(nd, path, inode); @@ -1143,6 +1144,8 @@ need_revalidate: goto need_lookup; if (IS_ERR(dentry)) goto fail; + if (nd->flags & LOOKUP_RCU) + goto done2; goto done; fail: -- cgit v1.2.3-18-g5258 From 9875cf806403fae66b2410a3c2cc820d97731e04 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:21 +0000 Subject: Add a dentry op to handle automounting rather than abusing follow_link() Add a dentry op (d_automount) to handle automounting directories rather than abusing the follow_link() inode operation. The operation is keyed off a new dentry flag (DCACHE_NEED_AUTOMOUNT). This also makes it easier to add an AT_ flag to suppress terminal segment automount during pathwalk and removes the need for the kludge code in the pathwalk algorithm to handle directories with follow_link() semantics. The ->d_automount() dentry operation: struct vfsmount *(*d_automount)(struct path *mountpoint); takes a pointer to the directory to be mounted upon, which is expected to provide sufficient data to determine what should be mounted. If successful, it should return the vfsmount struct it creates (which it should also have added to the namespace using do_add_mount() or similar). If there's a collision with another automount attempt, NULL should be returned. If the directory specified by the parameter should be used directly rather than being mounted upon, -EISDIR should be returned. In any other case, an error code should be returned. The ->d_automount() operation is called with no locks held and may sleep. At this point the pathwalk algorithm will be in ref-walk mode. Within fs/namei.c itself, a new pathwalk subroutine (follow_automount()) is added to handle mountpoints. It will return -EREMOTE if the automount flag was set, but no d_automount() op was supplied, -ELOOP if we've encountered too many symlinks or mountpoints, -EISDIR if the walk point should be used without mounting and 0 if successful. The path will be updated to point to the mounted filesystem if a successful automount took place. __follow_mount() is replaced by follow_managed() which is more generic (especially with the patch that adds ->d_manage()). This handles transits from directories during pathwalk, including automounting and skipping over mountpoints (and holding processes with the next patch). __follow_mount_rcu() will jump out of RCU-walk mode if it encounters an automount point with nothing mounted on it. follow_dotdot*() does not handle automounts as you don't want to trigger them whilst following "..". I've also extracted the mount/don't-mount logic from autofs4 and included it here. It makes the mount go ahead anyway if someone calls open() or creat(), tries to traverse the directory, tries to chdir/chroot/etc. into the directory, or sticks a '/' on the end of the pathname. If they do a stat(), however, they'll only trigger the automount if they didn't also say O_NOFOLLOW. I've also added an inode flag (S_AUTOMOUNT) so that filesystems can mark their inodes as automount points. This flag is automatically propagated to the dentry as DCACHE_NEED_AUTOMOUNT by __d_instantiate(). This saves NFS and could save AFS a private flag bit apiece, but is not strictly necessary. It would be preferable to do the propagation in d_set_d_op(), but that doesn't normally have access to the inode. [AV: fixed breakage in case if __follow_mount_rcu() fails and nameidata_drop_rcu() succeeds in RCU case of do_lookup(); we need to fall through to non-RCU case after that, rather than just returning with ungrabbed *path] Signed-off-by: David Howells Was-Acked-by: Ian Kent Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 + Documentation/filesystems/vfs.txt | 14 +++ fs/dcache.c | 5 +- fs/namei.c | 226 ++++++++++++++++++++++++++++---------- include/linux/dcache.h | 7 +- include/linux/fs.h | 2 + 6 files changed, 198 insertions(+), 58 deletions(-) diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 977d8919cc6..5f0c52a0738 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -19,6 +19,7 @@ prototypes: void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen); + struct vfsmount *(*d_automount)(struct path *path); locking rules: rename_lock ->d_lock may block rcu-walk @@ -29,6 +30,7 @@ d_delete: no yes no no d_release: no no yes no d_iput: no no yes no d_dname: no no no no +d_automount: no no yes no --------------------------- inode_operations --------------------------- prototypes: diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index cae6d27c9f5..726a4f6fa3c 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -864,6 +864,7 @@ struct dentry_operations { void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); + struct vfsmount *(*d_automount)(struct path *); }; d_revalidate: called when the VFS needs to revalidate a dentry. This @@ -930,6 +931,19 @@ struct dentry_operations { at the end of the buffer, and returns a pointer to the first char. dynamic_dname() helper function is provided to take care of this. + d_automount: called when an automount dentry is to be traversed (optional). + This should create a new VFS mount record, mount it on the directory + and return the record to the caller. The caller is supplied with a + path parameter giving the automount directory to describe the automount + target and the parent VFS mount record to provide inheritable mount + parameters. NULL should be returned if someone else managed to make + the automount first. If the automount failed, then an error code + should be returned. + + This function is only used if DCACHE_NEED_AUTOMOUNT is set on the + dentry. This is set by __d_instantiate() if S_AUTOMOUNT is set on the + inode being added. + Example : static char *pipefs_dname(struct dentry *dent, char *buffer, int buflen) diff --git a/fs/dcache.c b/fs/dcache.c index 0c6d5c549d8..51f7bb6463a 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1380,8 +1380,11 @@ EXPORT_SYMBOL(d_set_d_op); static void __d_instantiate(struct dentry *dentry, struct inode *inode) { spin_lock(&dentry->d_lock); - if (inode) + if (inode) { + if (unlikely(IS_AUTOMOUNT(inode))) + dentry->d_flags |= DCACHE_NEED_AUTOMOUNT; list_add(&dentry->d_alias, &inode->i_dentry); + } dentry->d_inode = inode; dentry_rcuwalk_barrier(dentry); spin_unlock(&dentry->d_lock); diff --git a/fs/namei.c b/fs/namei.c index 529e917ad2f..16109da68bb 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -896,51 +896,120 @@ int follow_up(struct path *path) } /* - * serialization is taken care of in namespace.c + * Perform an automount + * - return -EISDIR to tell follow_managed() to stop and return the path we + * were called with. */ -static void __follow_mount_rcu(struct nameidata *nd, struct path *path, - struct inode **inode) +static int follow_automount(struct path *path, unsigned flags, + bool *need_mntput) { - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted; - mounted = __lookup_mnt(path->mnt, path->dentry, 1); - if (!mounted) - return; - path->mnt = mounted; - path->dentry = mounted->mnt_root; - nd->seq = read_seqcount_begin(&path->dentry->d_seq); - *inode = path->dentry->d_inode; + struct vfsmount *mnt; + + if (!path->dentry->d_op || !path->dentry->d_op->d_automount) + return -EREMOTE; + + /* We want to mount if someone is trying to open/create a file of any + * type under the mountpoint, wants to traverse through the mountpoint + * or wants to open the mounted directory. + * + * We don't want to mount if someone's just doing a stat and they've + * set AT_SYMLINK_NOFOLLOW - unless they're stat'ing a directory and + * appended a '/' to the name. + */ + if (!(flags & LOOKUP_FOLLOW) && + !(flags & (LOOKUP_CONTINUE | LOOKUP_DIRECTORY | + LOOKUP_OPEN | LOOKUP_CREATE))) + return -EISDIR; + + current->total_link_count++; + if (current->total_link_count >= 40) + return -ELOOP; + + mnt = path->dentry->d_op->d_automount(path); + if (IS_ERR(mnt)) { + /* + * The filesystem is allowed to return -EISDIR here to indicate + * it doesn't want to automount. For instance, autofs would do + * this so that its userspace daemon can mount on this dentry. + * + * However, we can only permit this if it's a terminal point in + * the path being looked up; if it wasn't then the remainder of + * the path is inaccessible and we should say so. + */ + if (PTR_ERR(mnt) == -EISDIR && (flags & LOOKUP_CONTINUE)) + return -EREMOTE; + return PTR_ERR(mnt); } -} + if (!mnt) /* mount collision */ + return 0; -static int __follow_mount(struct path *path) -{ - int res = 0; - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted = lookup_mnt(path); - if (!mounted) - break; - dput(path->dentry); - if (res) - mntput(path->mnt); - path->mnt = mounted; - path->dentry = dget(mounted->mnt_root); - res = 1; + if (mnt->mnt_sb == path->mnt->mnt_sb && + mnt->mnt_root == path->dentry) { + mntput(mnt); + return -ELOOP; } - return res; + + dput(path->dentry); + if (*need_mntput) + mntput(path->mnt); + path->mnt = mnt; + path->dentry = dget(mnt->mnt_root); + *need_mntput = true; + return 0; } -static void follow_mount(struct path *path) +/* + * Handle a dentry that is managed in some way. + * - Flagged as mountpoint + * - Flagged as automount point + * + * This may only be called in refwalk mode. + * + * Serialization is taken care of in namespace.c + */ +static int follow_managed(struct path *path, unsigned flags) { - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted = lookup_mnt(path); - if (!mounted) - break; - dput(path->dentry); - mntput(path->mnt); - path->mnt = mounted; - path->dentry = dget(mounted->mnt_root); + unsigned managed; + bool need_mntput = false; + int ret; + + /* Given that we're not holding a lock here, we retain the value in a + * local variable for each dentry as we look at it so that we don't see + * the components of that value change under us */ + while (managed = ACCESS_ONCE(path->dentry->d_flags), + managed &= DCACHE_MANAGED_DENTRY, + unlikely(managed != 0)) { + /* Transit to a mounted filesystem. */ + if (managed & DCACHE_MOUNTED) { + struct vfsmount *mounted = lookup_mnt(path); + if (mounted) { + dput(path->dentry); + if (need_mntput) + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + need_mntput = true; + continue; + } + + /* Something is mounted on this dentry in another + * namespace and/or whatever was mounted there in this + * namespace got unmounted before we managed to get the + * vfsmount_lock */ + } + + /* Handle an automount point */ + if (managed & DCACHE_NEED_AUTOMOUNT) { + ret = follow_automount(path, flags, &need_mntput); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + continue; + } + + /* We didn't change the current path point */ + break; } + return 0; } int follow_down(struct path *path) @@ -958,13 +1027,37 @@ int follow_down(struct path *path) return 0; } +/* + * Skip to top of mountpoint pile in rcuwalk mode. We abort the rcu-walk if we + * meet an automount point and we're not walking to "..". True is returned to + * continue, false to abort. + */ +static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, + struct inode **inode, bool reverse_transit) +{ + while (d_mountpoint(path->dentry)) { + struct vfsmount *mounted; + mounted = __lookup_mnt(path->mnt, path->dentry, 1); + if (!mounted) + break; + path->mnt = mounted; + path->dentry = mounted->mnt_root; + nd->seq = read_seqcount_begin(&path->dentry->d_seq); + *inode = path->dentry->d_inode; + } + + if (unlikely(path->dentry->d_flags & DCACHE_NEED_AUTOMOUNT)) + return reverse_transit; + return true; +} + static int follow_dotdot_rcu(struct nameidata *nd) { struct inode *inode = nd->inode; set_root_rcu(nd); - while(1) { + while (1) { if (nd->path.dentry == nd->root.dentry && nd->path.mnt == nd->root.mnt) { break; @@ -987,12 +1080,28 @@ static int follow_dotdot_rcu(struct nameidata *nd) nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq); inode = nd->path.dentry->d_inode; } - __follow_mount_rcu(nd, &nd->path, &inode); + __follow_mount_rcu(nd, &nd->path, &inode, true); nd->inode = inode; return 0; } +/* + * Skip to top of mountpoint pile in refwalk mode for follow_dotdot() + */ +static void follow_mount(struct path *path) +{ + while (d_mountpoint(path->dentry)) { + struct vfsmount *mounted = lookup_mnt(path); + if (!mounted) + break; + dput(path->dentry); + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + } +} + static void follow_dotdot(struct nameidata *nd) { set_root(nd); @@ -1057,12 +1166,14 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, struct vfsmount *mnt = nd->path.mnt; struct dentry *dentry, *parent = nd->path.dentry; struct inode *dir; + int err; + /* * See if the low-level filesystem might want * to use its own hash.. */ if (unlikely(parent->d_flags & DCACHE_OP_HASH)) { - int err = parent->d_op->d_hash(parent, nd->inode, name); + err = parent->d_op->d_hash(parent, nd->inode, name); if (err < 0) return err; } @@ -1092,20 +1203,25 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, done2: path->mnt = mnt; path->dentry = dentry; - __follow_mount_rcu(nd, path, inode); - } else { - dentry = __d_lookup(parent, name); - if (!dentry) - goto need_lookup; + if (likely(__follow_mount_rcu(nd, path, inode, false))) + return 0; + if (nameidata_drop_rcu(nd)) + return -ECHILD; + /* fallthru */ + } + dentry = __d_lookup(parent, name); + if (!dentry) + goto need_lookup; found: - if (dentry->d_flags & DCACHE_OP_REVALIDATE) - goto need_revalidate; + if (dentry->d_flags & DCACHE_OP_REVALIDATE) + goto need_revalidate; done: - path->mnt = mnt; - path->dentry = dentry; - __follow_mount(path); - *inode = path->dentry->d_inode; - } + path->mnt = mnt; + path->dentry = dentry; + err = follow_managed(path, nd->flags); + if (unlikely(err < 0)) + return err; + *inode = path->dentry->d_inode; return 0; need_lookup: @@ -2203,11 +2319,9 @@ static struct file *do_last(struct nameidata *nd, struct path *path, if (open_flag & O_EXCL) goto exit_dput; - if (__follow_mount(path)) { - error = -ELOOP; - if (open_flag & O_NOFOLLOW) - goto exit_dput; - } + error = follow_managed(path, nd->flags); + if (error < 0) + goto exit_dput; error = -ENOENT; if (!path->dentry->d_inode) diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 59fcd24b146..ee6c26d142c 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -167,6 +167,7 @@ struct dentry_operations { void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); + struct vfsmount *(*d_automount)(struct path *); } ____cacheline_aligned; /* @@ -205,13 +206,17 @@ struct dentry_operations { #define DCACHE_CANT_MOUNT 0x0100 #define DCACHE_GENOCIDE 0x0200 -#define DCACHE_MOUNTED 0x0400 /* is a mountpoint */ #define DCACHE_OP_HASH 0x1000 #define DCACHE_OP_COMPARE 0x2000 #define DCACHE_OP_REVALIDATE 0x4000 #define DCACHE_OP_DELETE 0x8000 +#define DCACHE_MOUNTED 0x10000 /* is a mountpoint */ +#define DCACHE_NEED_AUTOMOUNT 0x20000 /* handle automount on this dir */ +#define DCACHE_MANAGED_DENTRY \ + (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT) + extern seqlock_t rename_lock; static inline int dname_external(struct dentry *dentry) diff --git a/include/linux/fs.h b/include/linux/fs.h index 3984f2358d1..4c00ed53be8 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -242,6 +242,7 @@ struct inodes_stat_t { #define S_SWAPFILE 256 /* Do not truncate: swapon got its bmaps */ #define S_PRIVATE 512 /* Inode is fs-internal */ #define S_IMA 1024 /* Inode has an associated IMA struct */ +#define S_AUTOMOUNT 2048 /* Automount/referral quasi-directory */ /* * Note that nosuid etc flags are inode-specific: setting some file-system @@ -277,6 +278,7 @@ struct inodes_stat_t { #define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE) #define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE) #define IS_IMA(inode) ((inode)->i_flags & S_IMA) +#define IS_AUTOMOUNT(inode) ((inode)->i_flags & S_AUTOMOUNT) /* the read-only stuff doesn't really belong here, but any other place is probably as bad and I don't want to create yet another include file. */ -- cgit v1.2.3-18-g5258 From cc53ce53c86924bfe98a12ea20b7465038a08792 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:26 +0000 Subject: Add a dentry op to allow processes to be held during pathwalk transit Add a dentry op (d_manage) to permit a filesystem to hold a process and make it sleep when it tries to transit away from one of that filesystem's directories during a pathwalk. The operation is keyed off a new dentry flag (DCACHE_MANAGE_TRANSIT). The filesystem is allowed to be selective about which processes it holds and which it permits to continue on or prohibits from transiting from each flagged directory. This will allow autofs to hold up client processes whilst letting its userspace daemon through to maintain the directory or the stuff behind it or mounted upon it. The ->d_manage() dentry operation: int (*d_manage)(struct path *path, bool mounting_here); takes a pointer to the directory about to be transited away from and a flag indicating whether the transit is undertaken by do_add_mount() or do_move_mount() skipping through a pile of filesystems mounted on a mountpoint. It should return 0 if successful and to let the process continue on its way; -EISDIR to prohibit the caller from skipping to overmounted filesystems or automounting, and to use this directory; or some other error code to return to the user. ->d_manage() is called with namespace_sem writelocked if mounting_here is true and no other locks held, so it may sleep. However, if mounting_here is true, it may not initiate or wait for a mount or unmount upon the parameter directory, even if the act is actually performed by userspace. Within fs/namei.c, follow_managed() is extended to check with d_manage() first on each managed directory, before transiting away from it or attempting to automount upon it. follow_down() is renamed follow_down_one() and should only be used where the filesystem deliberately intends to avoid management steps (e.g. autofs). A new follow_down() is added that incorporates the loop done by all other callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS and CIFS do use it, their use is removed by converting them to use d_automount()). The new follow_down() calls d_manage() as appropriate. It also takes an extra parameter to indicate if it is being called from mount code (with namespace_sem writelocked) which it passes to d_manage(). follow_down() ignores automount points so that it can be used to mount on them. __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have that determine whether to abort or not itself. That would allow the autofs daemon to continue on in rcu-walk mode. Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't required as every tranist from that directory will cause d_manage() to be invoked. It can always be set again when necessary. ========================== WHAT THIS MEANS FOR AUTOFS ========================== Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to trigger the automounting of indirect mounts, and both of these can be called with i_mutex held. autofs knows that the i_mutex will be held by the caller in lookup(), and so can drop it before invoking the daemon - but this isn't so for d_revalidate(), since the lock is only held on _some_ of the code paths that call it. This means that autofs can't risk dropping i_mutex from its d_revalidate() function before it calls the daemon. The bug could manifest itself as, for example, a process that's trying to validate an automount dentry that gets made to wait because that dentry is expired and needs cleaning up: mkdir S ffffffff8014e05a 0 32580 24956 Call Trace: [] :autofs4:autofs4_wait+0x674/0x897 [] avc_has_perm+0x46/0x58 [] autoremove_wake_function+0x0/0x2e [] :autofs4:autofs4_expire_wait+0x41/0x6b [] :autofs4:autofs4_revalidate+0x91/0x149 [] __lookup_hash+0xa0/0x12f [] lookup_create+0x46/0x80 [] sys_mkdirat+0x56/0xe4 versus the automount daemon which wants to remove that dentry, but can't because the normal process is holding the i_mutex lock: automount D ffffffff8014e05a 0 32581 1 32561 Call Trace: [] __mutex_lock_slowpath+0x60/0x9b [] do_path_lookup+0x2ca/0x2f1 [] .text.lock.mutex+0xf/0x14 [] do_rmdir+0x77/0xde [] tracesys+0x71/0xe0 [] tracesys+0xd5/0xe0 which means that the system is deadlocked. This patch allows autofs to hold up normal processes whilst the daemon goes ahead and does things to the dentry tree behind the automouter point without risking a deadlock as almost no locks are held in d_manage() and none in d_automount(). Signed-off-by: David Howells Was-Acked-by: Ian Kent Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 ++ Documentation/filesystems/vfs.txt | 21 +++++++++++- drivers/staging/autofs/dirhash.c | 5 ++- fs/afs/mntpt.c | 5 +-- fs/autofs4/autofs_i.h | 13 ------- fs/autofs4/dev-ioctl.c | 2 +- fs/autofs4/expire.c | 2 +- fs/autofs4/root.c | 11 +++--- fs/cifs/cifs_dfs_ref.c | 5 +-- fs/namei.c | 72 +++++++++++++++++++++++++++++++++++++-- fs/namespace.c | 14 ++++---- fs/nfs/namespace.c | 5 +-- fs/nfsd/vfs.c | 5 +-- include/linux/dcache.h | 11 ++++-- include/linux/namei.h | 3 +- 15 files changed, 126 insertions(+), 50 deletions(-) diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 5f0c52a0738..cbf98b989b1 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -20,6 +20,7 @@ prototypes: void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen); struct vfsmount *(*d_automount)(struct path *path); + int (*d_manage)(struct dentry *, bool); locking rules: rename_lock ->d_lock may block rcu-walk @@ -31,6 +32,7 @@ d_release: no no yes no d_iput: no no yes no d_dname: no no no no d_automount: no no yes no +d_manage: no no yes no --------------------------- inode_operations --------------------------- prototypes: diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 726a4f6fa3c..4682586b147 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -865,6 +865,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); + int (*d_manage)(struct dentry *, bool); }; d_revalidate: called when the VFS needs to revalidate a dentry. This @@ -938,12 +939,30 @@ struct dentry_operations { target and the parent VFS mount record to provide inheritable mount parameters. NULL should be returned if someone else managed to make the automount first. If the automount failed, then an error code - should be returned. + should be returned. If -EISDIR is returned, then the directory will + be treated as an ordinary directory and returned to pathwalk to + continue walking. This function is only used if DCACHE_NEED_AUTOMOUNT is set on the dentry. This is set by __d_instantiate() if S_AUTOMOUNT is set on the inode being added. + d_manage: called to allow the filesystem to manage the transition from a + dentry (optional). This allows autofs, for example, to hold up clients + waiting to explore behind a 'mountpoint' whilst letting the daemon go + past and construct the subtree there. 0 should be returned to let the + calling process continue. -EISDIR can be returned to tell pathwalk to + use this directory as an ordinary directory and to ignore anything + mounted on it and not to check the automount flag. Any other error + code will abort pathwalk completely. + + If the 'mounting_here' parameter is true, then namespace_sem is being + held by the caller and the function should not initiate any mounts or + unmounts that it will then wait for. + + This function is only used if DCACHE_MANAGE_TRANSIT is set on the + dentry being transited from. + Example : static char *pipefs_dname(struct dentry *dent, char *buffer, int buflen) diff --git a/drivers/staging/autofs/dirhash.c b/drivers/staging/autofs/dirhash.c index d3f42c8325f..a08bd735503 100644 --- a/drivers/staging/autofs/dirhash.c +++ b/drivers/staging/autofs/dirhash.c @@ -88,14 +88,13 @@ struct autofs_dir_ent *autofs_expire(struct super_block *sb, } path.mnt = mnt; path_get(&path); - if (!follow_down(&path)) { + if (!follow_down_one(&path)) { path_put(&path); DPRINTK(("autofs: not expirable\ (not a mounted directory): %s\n", ent->name)); continue; } - while (d_mountpoint(path.dentry) && follow_down(&path)) - ; + follow_down(&path, false); // TODO: need to check error umount_ok = may_umount(path.mnt); path_put(&path); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index e83c0336e7b..f3e891d57a2 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -273,10 +273,7 @@ static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd) break; case -EBUSY: /* someone else made a mount here whilst we were busy */ - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); default: mntput(newmnt); break; diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index 0fffe1c24ce..eb67953452b 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -229,19 +229,6 @@ int autofs4_wait(struct autofs_sb_info *,struct dentry *, enum autofs_notify); int autofs4_wait_release(struct autofs_sb_info *,autofs_wqt_t,int); void autofs4_catatonic_mode(struct autofs_sb_info *); -static inline int autofs4_follow_mount(struct path *path) -{ - int res = 0; - - while (d_mountpoint(path->dentry)) { - int followed = follow_down(path); - if (!followed) - break; - res = 1; - } - return res; -} - static inline u32 autofs4_get_dev(struct autofs_sb_info *sbi) { return new_encode_dev(sbi->sb->s_dev); diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index eff9a419469..1442da4860e 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -551,7 +551,7 @@ static int autofs_dev_ioctl_ismountpoint(struct file *fp, err = have_submounts(path.dentry); - if (follow_down(&path)) + if (follow_down_one(&path)) magic = path.mnt->mnt_sb->s_magic; } diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c index cc1d0136590..6a930b90d38 100644 --- a/fs/autofs4/expire.c +++ b/fs/autofs4/expire.c @@ -56,7 +56,7 @@ static int autofs4_mount_busy(struct vfsmount *mnt, struct dentry *dentry) path_get(&path); - if (!follow_down(&path)) + if (!follow_down_one(&path)) goto done; if (is_autofs4_dentry(path.dentry)) { diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 651e4ef563b..20225636a4e 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -234,7 +234,7 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) nd->flags); /* * For an expire of a covered direct or offset mount we need - * to break out of follow_down() at the autofs mount trigger + * to break out of follow_down_one() at the autofs mount trigger * (d_mounted--), so we can see the expiring flag, and manage * the blocking and following here until the expire is completed. */ @@ -243,7 +243,7 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) if (ino->flags & AUTOFS_INF_EXPIRING) { spin_unlock(&sbi->fs_lock); /* Follow down to our covering mount. */ - if (!follow_down(&nd->path)) + if (!follow_down_one(&nd->path)) goto done; goto follow; } @@ -292,11 +292,10 @@ follow: * multi-mount with no root offset so we don't need * to follow it. */ - if (d_mountpoint(dentry)) { - if (!autofs4_follow_mount(&nd->path)) { - status = -ENOENT; + if (d_managed(dentry)) { + status = follow_down(&nd->path, false); + if (status < 0) goto out_error; - } } done: diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index c68a056f27f..83479cf63f9 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -273,10 +273,7 @@ static int add_mount_helper(struct vfsmount *newmnt, struct nameidata *nd, break; case -EBUSY: /* someone else made a mount here whilst we were busy */ - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); default: mntput(newmnt); break; diff --git a/fs/namei.c b/fs/namei.c index 16109da68bb..9d3033dc22e 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -960,6 +960,7 @@ static int follow_automount(struct path *path, unsigned flags, /* * Handle a dentry that is managed in some way. + * - Flagged for transit management (autofs) * - Flagged as mountpoint * - Flagged as automount point * @@ -979,6 +980,16 @@ static int follow_managed(struct path *path, unsigned flags) while (managed = ACCESS_ONCE(path->dentry->d_flags), managed &= DCACHE_MANAGED_DENTRY, unlikely(managed != 0)) { + /* Allow the filesystem to manage the transit without i_mutex + * being held. */ + if (managed & DCACHE_MANAGE_TRANSIT) { + BUG_ON(!path->dentry->d_op); + BUG_ON(!path->dentry->d_op->d_manage); + ret = path->dentry->d_op->d_manage(path->dentry, false); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + } + /* Transit to a mounted filesystem. */ if (managed & DCACHE_MOUNTED) { struct vfsmount *mounted = lookup_mnt(path); @@ -1012,7 +1023,7 @@ static int follow_managed(struct path *path, unsigned flags) return 0; } -int follow_down(struct path *path) +int follow_down_one(struct path *path) { struct vfsmount *mounted; @@ -1029,14 +1040,19 @@ int follow_down(struct path *path) /* * Skip to top of mountpoint pile in rcuwalk mode. We abort the rcu-walk if we - * meet an automount point and we're not walking to "..". True is returned to + * meet a managed dentry and we're not walking to "..". True is returned to * continue, false to abort. */ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, struct inode **inode, bool reverse_transit) { + unsigned abort_mask = + reverse_transit ? 0 : DCACHE_MANAGE_TRANSIT; + while (d_mountpoint(path->dentry)) { struct vfsmount *mounted; + if (path->dentry->d_flags & abort_mask) + return true; mounted = __lookup_mnt(path->mnt, path->dentry, 1); if (!mounted) break; @@ -1086,6 +1102,57 @@ static int follow_dotdot_rcu(struct nameidata *nd) return 0; } +/* + * Follow down to the covering mount currently visible to userspace. At each + * point, the filesystem owning that dentry may be queried as to whether the + * caller is permitted to proceed or not. + * + * Care must be taken as namespace_sem may be held (indicated by mounting_here + * being true). + */ +int follow_down(struct path *path, bool mounting_here) +{ + unsigned managed; + int ret; + + while (managed = ACCESS_ONCE(path->dentry->d_flags), + unlikely(managed & DCACHE_MANAGED_DENTRY)) { + /* Allow the filesystem to manage the transit without i_mutex + * being held. + * + * We indicate to the filesystem if someone is trying to mount + * something here. This gives autofs the chance to deny anyone + * other than its daemon the right to mount on its + * superstructure. + * + * The filesystem may sleep at this point. + */ + if (managed & DCACHE_MANAGE_TRANSIT) { + BUG_ON(!path->dentry->d_op); + BUG_ON(!path->dentry->d_op->d_manage); + ret = path->dentry->d_op->d_manage(path->dentry, mounting_here); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + } + + /* Transit to a mounted filesystem. */ + if (managed & DCACHE_MOUNTED) { + struct vfsmount *mounted = lookup_mnt(path); + if (!mounted) + break; + dput(path->dentry); + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + continue; + } + + /* Don't handle automount points here */ + break; + } + return 0; +} + /* * Skip to top of mountpoint pile in refwalk mode for follow_dotdot() */ @@ -3530,6 +3597,7 @@ const struct inode_operations page_symlink_inode_operations = { }; EXPORT_SYMBOL(user_path_at); +EXPORT_SYMBOL(follow_down_one); EXPORT_SYMBOL(follow_down); EXPORT_SYMBOL(follow_up); EXPORT_SYMBOL(get_write_access); /* binfmt_aout */ diff --git a/fs/namespace.c b/fs/namespace.c index 3ddfd9046c4..d94ccd6ddaf 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1844,9 +1844,10 @@ static int do_move_mount(struct path *path, char *old_name) return err; down_write(&namespace_sem); - while (d_mountpoint(path->dentry) && - follow_down(path)) - ; + err = follow_down(path, true); + if (err < 0) + goto out; + err = -EINVAL; if (!check_mnt(path->mnt) || !check_mnt(old_path.mnt)) goto out; @@ -1940,9 +1941,10 @@ int do_add_mount(struct vfsmount *newmnt, struct path *path, down_write(&namespace_sem); /* Something was mounted here while we slept */ - while (d_mountpoint(path->dentry) && - follow_down(path)) - ; + err = follow_down(path, true); + if (err < 0) + goto unlock; + err = -EINVAL; if (!(mnt_flags & MNT_SHRINKABLE) && !check_mnt(path->mnt)) goto unlock; diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index 74aaf3963c1..bfcb933e575 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -176,10 +176,7 @@ out_err: path_put(&nd->path); goto out; out_follow: - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); goto out; } diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 230b79fbf00..0f79e33a65d 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -88,8 +88,9 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, .dentry = dget(dentry)}; int err = 0; - while (d_mountpoint(path.dentry) && follow_down(&path)) - ; + err = follow_down(&path, false); + if (err < 0) + goto out; exp2 = rqst_exp_get_by_name(rqstp, &path); if (IS_ERR(exp2)) { diff --git a/include/linux/dcache.h b/include/linux/dcache.h index ee6c26d142c..1a87760d653 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -168,6 +168,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); + int (*d_manage)(struct dentry *, bool); } ____cacheline_aligned; /* @@ -214,8 +215,9 @@ struct dentry_operations { #define DCACHE_MOUNTED 0x10000 /* is a mountpoint */ #define DCACHE_NEED_AUTOMOUNT 0x20000 /* handle automount on this dir */ +#define DCACHE_MANAGE_TRANSIT 0x40000 /* manage transit from this dirent */ #define DCACHE_MANAGED_DENTRY \ - (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT) + (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT) extern seqlock_t rename_lock; @@ -404,7 +406,12 @@ static inline void dont_mount(struct dentry *dentry) extern void dput(struct dentry *); -static inline int d_mountpoint(struct dentry *dentry) +static inline bool d_managed(struct dentry *dentry) +{ + return dentry->d_flags & DCACHE_MANAGED_DENTRY; +} + +static inline bool d_mountpoint(struct dentry *dentry) { return dentry->d_flags & DCACHE_MOUNTED; } diff --git a/include/linux/namei.h b/include/linux/namei.h index 18d06add0a4..8ef2c789c2a 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -79,7 +79,8 @@ extern struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry extern struct dentry *lookup_one_len(const char *, struct dentry *, int); -extern int follow_down(struct path *); +extern int follow_down_one(struct path *); +extern int follow_down(struct path *, bool); extern int follow_up(struct path *); extern struct dentry *lock_rename(struct dentry *, struct dentry *); -- cgit v1.2.3-18-g5258 From 6f45b65672c8017d5e210e338bb5858a938ef445 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:31 +0000 Subject: Add an AT_NO_AUTOMOUNT flag to suppress terminal automount Add an AT_NO_AUTOMOUNT flag to suppress terminal automounting of automount point directories. This can be used by fstatat() users to permit the gathering of attributes on an automount point and also prevent mass-automounting of a directory of automount points by ls. Signed-off-by: David Howells Acked-by: Ian Kent Signed-off-by: Al Viro --- fs/namei.c | 6 ++++++ fs/stat.c | 4 +++- include/linux/fcntl.h | 1 + include/linux/namei.h | 2 ++ 4 files changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/namei.c b/fs/namei.c index 9d3033dc22e..dc50bfb2f5d 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -908,6 +908,12 @@ static int follow_automount(struct path *path, unsigned flags, if (!path->dentry->d_op || !path->dentry->d_op->d_automount) return -EREMOTE; + /* We don't want to mount if someone supplied AT_NO_AUTOMOUNT + * and this is the terminal part of the path. + */ + if ((flags & LOOKUP_NO_AUTOMOUNT) && !(flags & LOOKUP_CONTINUE)) + return -EISDIR; /* we actually want to stop here */ + /* We want to mount if someone is trying to open/create a file of any * type under the mountpoint, wants to traverse through the mountpoint * or wants to open the mounted directory. diff --git a/fs/stat.c b/fs/stat.c index 12e90e21390..d5c61cf2b70 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -75,11 +75,13 @@ int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat, int error = -EINVAL; int lookup_flags = 0; - if ((flag & ~AT_SYMLINK_NOFOLLOW) != 0) + if ((flag & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT)) != 0) goto out; if (!(flag & AT_SYMLINK_NOFOLLOW)) lookup_flags |= LOOKUP_FOLLOW; + if (flag & AT_NO_AUTOMOUNT) + lookup_flags |= LOOKUP_NO_AUTOMOUNT; error = user_path_at(dfd, filename, lookup_flags, &path); if (error) diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h index afc00af3229..a562fa5fb4e 100644 --- a/include/linux/fcntl.h +++ b/include/linux/fcntl.h @@ -45,6 +45,7 @@ #define AT_REMOVEDIR 0x200 /* Remove directory instead of unlinking file. */ #define AT_SYMLINK_FOLLOW 0x400 /* Follow symbolic links. */ +#define AT_NO_AUTOMOUNT 0x800 /* Suppress terminal automount traversal */ #ifdef __KERNEL__ diff --git a/include/linux/namei.h b/include/linux/namei.h index 8ef2c789c2a..f276d4fa01f 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -45,6 +45,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; * - ending slashes ok even for nonexistent files * - internal "there are more path components" flag * - dentry cache is untrusted; force a real lookup + * - suppress terminal automount */ #define LOOKUP_FOLLOW 0x0001 #define LOOKUP_DIRECTORY 0x0002 @@ -53,6 +54,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_PARENT 0x0010 #define LOOKUP_REVAL 0x0020 #define LOOKUP_RCU 0x0040 +#define LOOKUP_NO_AUTOMOUNT 0x0080 /* * Intent data */ -- cgit v1.2.3-18-g5258 From d18610b0ce9eb48c60649d8fcbf68374c84349d3 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 19:04:05 +0000 Subject: AFS: Use d_automount() rather than abusing follow_link() Make AFS use the new d_automount() dentry operation rather than abusing follow_link() on directories. Signed-off-by: David Howells Signed-off-by: Al Viro --- fs/afs/dir.c | 1 + fs/afs/inode.c | 3 ++- fs/afs/internal.h | 1 + fs/afs/mntpt.c | 44 +++++++++++++++----------------------------- 4 files changed, 19 insertions(+), 30 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index e6a4ab980e3..20c106f2492 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -66,6 +66,7 @@ const struct dentry_operations afs_fs_dentry_operations = { .d_revalidate = afs_d_revalidate, .d_delete = afs_d_delete, .d_release = afs_d_release, + .d_automount = afs_d_automount, }; #define AFS_DIR_HASHTBL_SIZE 128 diff --git a/fs/afs/inode.c b/fs/afs/inode.c index 0747339011c..db66c520147 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -184,7 +184,8 @@ struct inode *afs_iget_autocell(struct inode *dir, const char *dev_name, inode->i_generation = 0; set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); - inode->i_flags |= S_NOATIME; + set_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags); + inode->i_flags |= S_AUTOMOUNT | S_NOATIME; unlock_new_inode(inode); _leave(" = %p", inode); return inode; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 58c633b8024..5a9b6843bac 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -592,6 +592,7 @@ extern const struct inode_operations afs_mntpt_inode_operations; extern const struct inode_operations afs_autocell_inode_operations; extern const struct file_operations afs_mntpt_file_operations; +extern struct vfsmount *afs_d_automount(struct path *); extern int afs_mntpt_check_symlink(struct afs_vnode *, struct key *); extern void afs_mntpt_kill_timer(void); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index f3e891d57a2..d23b2e344a7 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -24,7 +24,6 @@ static struct dentry *afs_mntpt_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd); static int afs_mntpt_open(struct inode *inode, struct file *file); -static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd); static void afs_mntpt_expiry_timed_out(struct work_struct *work); const struct file_operations afs_mntpt_file_operations = { @@ -34,13 +33,11 @@ const struct file_operations afs_mntpt_file_operations = { const struct inode_operations afs_mntpt_inode_operations = { .lookup = afs_mntpt_lookup, - .follow_link = afs_mntpt_follow_link, .readlink = page_readlink, .getattr = afs_getattr, }; const struct inode_operations afs_autocell_inode_operations = { - .follow_link = afs_mntpt_follow_link, .getattr = afs_getattr, }; @@ -88,6 +85,7 @@ int afs_mntpt_check_symlink(struct afs_vnode *vnode, struct key *key) _debug("symlink is a mountpoint"); spin_lock(&vnode->lock); set_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags); + vnode->vfs_inode.i_flags |= S_AUTOMOUNT; spin_unlock(&vnode->lock); } @@ -238,49 +236,37 @@ error_no_devname: } /* - * follow a link from a mountpoint directory, thus causing it to be mounted + * handle an automount point */ -static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd) +struct vfsmount *afs_d_automount(struct path *path) { struct vfsmount *newmnt; int err; - _enter("%p{%s},{%s:%p{%s},}", - dentry, - dentry->d_name.name, - nd->path.mnt->mnt_devname, - dentry, - nd->path.dentry->d_name.name); - - dput(nd->path.dentry); - nd->path.dentry = dget(dentry); + _enter("{%s,%s}", path->mnt->mnt_devname, path->dentry->d_name.name); - newmnt = afs_mntpt_do_automount(nd->path.dentry); - if (IS_ERR(newmnt)) { - path_put(&nd->path); - return (void *)newmnt; - } + newmnt = afs_mntpt_do_automount(path->dentry); + if (IS_ERR(newmnt)) + return newmnt; mntget(newmnt); - err = do_add_mount(newmnt, &nd->path, MNT_SHRINKABLE, &afs_vfsmounts); + err = do_add_mount(newmnt, path, MNT_SHRINKABLE, &afs_vfsmounts); switch (err) { case 0: - path_put(&nd->path); - nd->path.mnt = newmnt; - nd->path.dentry = dget(newmnt->mnt_root); queue_delayed_work(afs_wq, &afs_mntpt_expiry_timer, afs_mntpt_expiry_timeout * HZ); - break; + _leave(" = %p {%s}", newmnt, newmnt->mnt_devname); + return newmnt; case -EBUSY: /* someone else made a mount here whilst we were busy */ - err = follow_down(&nd->path, false); + mntput(newmnt); + _leave(" = NULL [EBUSY]"); + return NULL; default: mntput(newmnt); - break; + _leave(" = %d", err); + return ERR_PTR(err); } - - _leave(" = %d", err); - return ERR_PTR(err); } /* -- cgit v1.2.3-18-g5258 From 36d43a43761b004ad1879ac21471d8fc5f3157ec Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:42 +0000 Subject: NFS: Use d_automount() rather than abusing follow_link() Make NFS use the new d_automount() dentry operation rather than abusing follow_link() on directories. Signed-off-by: David Howells Acked-by: Trond Myklebust Acked-by: Ian Kent Signed-off-by: Al Viro --- fs/nfs/dir.c | 4 ++- fs/nfs/inode.c | 4 +-- fs/nfs/internal.h | 1 + fs/nfs/namespace.c | 84 ++++++++++++++++++++++++-------------------------- include/linux/nfs_fs.h | 1 - 5 files changed, 46 insertions(+), 48 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index df8c03a0216..2c3eb33b904 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -970,7 +970,7 @@ int nfs_lookup_verify_inode(struct inode *inode, struct nameidata *nd) { struct nfs_server *server = NFS_SERVER(inode); - if (test_bit(NFS_INO_MOUNTPOINT, &NFS_I(inode)->flags)) + if (IS_AUTOMOUNT(inode)) return 0; if (nd != NULL) { /* VFS wants an on-the-wire revalidation */ @@ -1173,6 +1173,7 @@ const struct dentry_operations nfs_dentry_operations = { .d_revalidate = nfs_lookup_revalidate, .d_delete = nfs_dentry_delete, .d_iput = nfs_dentry_iput, + .d_automount = nfs_d_automount, }; static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *nd) @@ -1246,6 +1247,7 @@ const struct dentry_operations nfs4_dentry_operations = { .d_revalidate = nfs_open_revalidate, .d_delete = nfs_dentry_delete, .d_iput = nfs_dentry_iput, + .d_automount = nfs_d_automount, }; /* diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index ce00b704452..d8512423ba7 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -300,7 +300,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr) else inode->i_op = &nfs_mountpoint_inode_operations; inode->i_fop = NULL; - set_bit(NFS_INO_MOUNTPOINT, &nfsi->flags); + inode->i_flags |= S_AUTOMOUNT; } } else if (S_ISLNK(inode->i_mode)) inode->i_op = &nfs_symlink_inode_operations; @@ -1208,7 +1208,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr) /* Update the fsid? */ if (S_ISDIR(inode->i_mode) && (fattr->valid & NFS_ATTR_FATTR_FSID) && !nfs_fsid_equal(&server->fsid, &fattr->fsid) && - !test_bit(NFS_INO_MOUNTPOINT, &nfsi->flags)) + !IS_AUTOMOUNT(inode)) server->fsid = fattr->fsid; /* diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index bfa3a34af80..4644f04b4b4 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -252,6 +252,7 @@ extern char *nfs_path(const char *base, const struct dentry *droot, const struct dentry *dentry, char *buffer, ssize_t buflen); +extern struct vfsmount *nfs_d_automount(struct path *path); /* getroot.c */ extern struct dentry *nfs_get_root(struct super_block *, struct nfs_fh *); diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index bfcb933e575..f3fbb1bf3f1 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -97,9 +97,8 @@ Elong: } /* - * nfs_follow_mountpoint - handle crossing a mountpoint on the server - * @dentry - dentry of mountpoint - * @nd - nameidata info + * nfs_d_automount - Handle crossing a mountpoint on the server + * @path - The mountpoint * * When we encounter a mountpoint on the server, we want to set up * a mountpoint on the client too, to prevent inode numbers from @@ -109,84 +108,81 @@ Elong: * situation, and that different filesystems may want to use * different security flavours. */ -static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) +struct vfsmount *nfs_d_automount(struct path *path) { struct vfsmount *mnt; - struct nfs_server *server = NFS_SERVER(dentry->d_inode); + struct nfs_server *server = NFS_SERVER(path->dentry->d_inode); struct dentry *parent; struct nfs_fh *fh = NULL; struct nfs_fattr *fattr = NULL; int err; - dprintk("--> nfs_follow_mountpoint()\n"); + dprintk("--> nfs_d_automount()\n"); - err = -ESTALE; - if (IS_ROOT(dentry)) - goto out_err; + mnt = ERR_PTR(-ESTALE); + if (IS_ROOT(path->dentry)) + goto out_nofree; - err = -ENOMEM; + mnt = ERR_PTR(-ENOMEM); fh = nfs_alloc_fhandle(); fattr = nfs_alloc_fattr(); if (fh == NULL || fattr == NULL) - goto out_err; + goto out; dprintk("%s: enter\n", __func__); - dput(nd->path.dentry); - nd->path.dentry = dget(dentry); - /* Look it up again */ - parent = dget_parent(nd->path.dentry); + /* Look it up again to get its attributes */ + parent = dget_parent(path->dentry); err = server->nfs_client->rpc_ops->lookup(parent->d_inode, - &nd->path.dentry->d_name, + &path->dentry->d_name, fh, fattr); dput(parent); - if (err != 0) - goto out_err; + if (err != 0) { + mnt = ERR_PTR(err); + goto out; + } if (fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL) - mnt = nfs_do_refmount(nd->path.mnt, nd->path.dentry); + mnt = nfs_do_refmount(path->mnt, path->dentry); else - mnt = nfs_do_submount(nd->path.mnt, nd->path.dentry, fh, - fattr); - err = PTR_ERR(mnt); + mnt = nfs_do_submount(path->mnt, path->dentry, fh, fattr); if (IS_ERR(mnt)) - goto out_err; + goto out; mntget(mnt); - err = do_add_mount(mnt, &nd->path, nd->path.mnt->mnt_flags|MNT_SHRINKABLE, + err = do_add_mount(mnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE, &nfs_automount_list); - if (err < 0) { + switch (err) { + case 0: + dprintk("%s: done, success\n", __func__); + schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); + break; + case -EBUSY: + /* someone else made a mount here whilst we were busy */ mntput(mnt); - if (err == -EBUSY) - goto out_follow; - goto out_err; + dprintk("%s: done, collision\n", __func__); + mnt = NULL; + break; + default: + mntput(mnt); + dprintk("%s: done, error %d\n", __func__, err); + mnt = ERR_PTR(err); + break; } - path_put(&nd->path); - nd->path.mnt = mnt; - nd->path.dentry = dget(mnt->mnt_root); - schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); + out: nfs_free_fattr(fattr); nfs_free_fhandle(fh); - dprintk("%s: done, returned %d\n", __func__, err); - - dprintk("<-- nfs_follow_mountpoint() = %d\n", err); - return ERR_PTR(err); -out_err: - path_put(&nd->path); - goto out; -out_follow: - err = follow_down(&nd->path, false); - goto out; +out_nofree: + dprintk("<-- nfs_follow_mountpoint() = %p\n", mnt); + return mnt; } const struct inode_operations nfs_mountpoint_inode_operations = { - .follow_link = nfs_follow_mountpoint, .getattr = nfs_getattr, }; const struct inode_operations nfs_referral_inode_operations = { - .follow_link = nfs_follow_mountpoint, }; static void nfs_expire_automounts(struct work_struct *work) diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 0779bb8f95b..6023efa9f5d 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -215,7 +215,6 @@ struct nfs_inode { #define NFS_INO_ADVISE_RDPLUS (0) /* advise readdirplus */ #define NFS_INO_STALE (1) /* possible stale inode */ #define NFS_INO_ACL_LRU_SET (2) /* Inode is on the LRU list */ -#define NFS_INO_MOUNTPOINT (3) /* inode is remote mountpoint */ #define NFS_INO_FLUSHING (4) /* inode is flushing out data */ #define NFS_INO_FSCACHE (5) /* inode can be cached by FS-Cache */ #define NFS_INO_FSCACHE_LOCK (6) /* FS-Cache cookie management lock */ -- cgit v1.2.3-18-g5258 From 01c64feac45cea1317263eabc4f7ee1b240f297f Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:47 +0000 Subject: CIFS: Use d_automount() rather than abusing follow_link() Make CIFS use the new d_automount() dentry operation rather than abusing follow_link() on directories. [NOTE: THIS IS UNTESTED!] Signed-off-by: David Howells Cc: Steve French Signed-off-by: Al Viro --- fs/cifs/cifs_dfs_ref.c | 131 +++++++++++++++++++++++++------------------------ fs/cifs/cifsfs.h | 6 +++ fs/cifs/dir.c | 2 + fs/cifs/inode.c | 8 +-- 4 files changed, 80 insertions(+), 67 deletions(-) diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index 83479cf63f9..0fc163808de 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -255,32 +255,6 @@ static struct vfsmount *cifs_dfs_do_refmount(struct cifs_sb_info *cifs_sb, } -static int add_mount_helper(struct vfsmount *newmnt, struct nameidata *nd, - struct list_head *mntlist) -{ - /* stolen from afs code */ - int err; - - mntget(newmnt); - err = do_add_mount(newmnt, &nd->path, nd->path.mnt->mnt_flags | MNT_SHRINKABLE, mntlist); - switch (err) { - case 0: - path_put(&nd->path); - nd->path.mnt = newmnt; - nd->path.dentry = dget(newmnt->mnt_root); - schedule_delayed_work(&cifs_dfs_automount_task, - cifs_dfs_mountpoint_expiry_timeout); - break; - case -EBUSY: - /* someone else made a mount here whilst we were busy */ - err = follow_down(&nd->path, false); - default: - mntput(newmnt); - break; - } - return err; -} - static void dump_referral(const struct dfs_info3_param *ref) { cFYI(1, "DFS: ref path: %s", ref->path_name); @@ -290,45 +264,43 @@ static void dump_referral(const struct dfs_info3_param *ref) ref->path_consumed); } - -static void* -cifs_dfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) +/* + * Create a vfsmount that we can automount + */ +static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt) { struct dfs_info3_param *referrals = NULL; unsigned int num_referrals = 0; struct cifs_sb_info *cifs_sb; struct cifsSesInfo *ses; - char *full_path = NULL; + char *full_path; int xid, i; - int rc = 0; - struct vfsmount *mnt = ERR_PTR(-ENOENT); + int rc; + struct vfsmount *mnt; struct tcon_link *tlink; cFYI(1, "in %s", __func__); - BUG_ON(IS_ROOT(dentry)); + BUG_ON(IS_ROOT(mntpt)); xid = GetXid(); - dput(nd->path.dentry); - nd->path.dentry = dget(dentry); - /* * The MSDFS spec states that paths in DFS referral requests and * responses must be prefixed by a single '\' character instead of * the double backslashes usually used in the UNC. This function * gives us the latter, so we must adjust the result. */ - full_path = build_path_from_dentry(dentry); - if (full_path == NULL) { - rc = -ENOMEM; - goto out_err; - } + mnt = ERR_PTR(-ENOMEM); + full_path = build_path_from_dentry(mntpt); + if (full_path == NULL) + goto free_xid; - cifs_sb = CIFS_SB(dentry->d_inode->i_sb); + cifs_sb = CIFS_SB(mntpt->d_inode->i_sb); tlink = cifs_sb_tlink(cifs_sb); + mnt = ERR_PTR(-EINVAL); if (IS_ERR(tlink)) { - rc = PTR_ERR(tlink); - goto out_err; + mnt = ERR_CAST(tlink); + goto free_full_path; } ses = tlink_tcon(tlink)->ses; @@ -338,46 +310,77 @@ cifs_dfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) cifs_put_tlink(tlink); + mnt = ERR_PTR(-ENOENT); for (i = 0; i < num_referrals; i++) { int len; - dump_referral(referrals+i); + dump_referral(referrals + i); /* connect to a node */ len = strlen(referrals[i].node_name); if (len < 2) { cERROR(1, "%s: Net Address path too short: %s", __func__, referrals[i].node_name); - rc = -EINVAL; - goto out_err; + mnt = ERR_PTR(-EINVAL); + break; } mnt = cifs_dfs_do_refmount(cifs_sb, full_path, referrals + i); cFYI(1, "%s: cifs_dfs_do_refmount:%s , mnt:%p", __func__, referrals[i].node_name, mnt); - - /* complete mount procedure if we accured submount */ if (!IS_ERR(mnt)) - break; + goto success; } - /* we need it cause for() above could exit without valid submount */ - rc = PTR_ERR(mnt); - if (IS_ERR(mnt)) - goto out_err; - - rc = add_mount_helper(mnt, nd, &cifs_dfs_automount_list); + /* no valid submounts were found; return error from get_dfs_path() by + * preference */ + if (rc != 0) + mnt = ERR_PTR(rc); -out: - FreeXid(xid); +success: free_dfs_info_array(referrals, num_referrals); +free_full_path: kfree(full_path); +free_xid: + FreeXid(xid); cFYI(1, "leaving %s" , __func__); - return ERR_PTR(rc); -out_err: - path_put(&nd->path); - goto out; + return mnt; +} + +/* + * Attempt to automount the referral + */ +struct vfsmount *cifs_dfs_d_automount(struct path *path) +{ + struct vfsmount *newmnt; + int err; + + cFYI(1, "in %s", __func__); + + newmnt = cifs_dfs_do_automount(path->dentry); + if (IS_ERR(newmnt)) { + cFYI(1, "leaving %s [automount failed]" , __func__); + return newmnt; + } + + mntget(newmnt); + err = do_add_mount(newmnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE, + &cifs_dfs_automount_list); + switch (err) { + case 0: + schedule_delayed_work(&cifs_dfs_automount_task, + cifs_dfs_mountpoint_expiry_timeout); + cFYI(1, "leaving %s [ok]" , __func__); + return newmnt; + case -EBUSY: + /* someone else made a mount here whilst we were busy */ + mntput(newmnt); + cFYI(1, "leaving %s [EBUSY]" , __func__); + return NULL; + default: + mntput(newmnt); + cFYI(1, "leaving %s [error %d]" , __func__, err); + return ERR_PTR(err); + } } const struct inode_operations cifs_dfs_referral_inode_operations = { - .follow_link = cifs_dfs_follow_mountpoint, }; - diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index 897b2b2b28b..851030f7493 100644 --- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -93,6 +93,12 @@ extern int cifs_readdir(struct file *file, void *direntry, filldir_t filldir); extern const struct dentry_operations cifs_dentry_ops; extern const struct dentry_operations cifs_ci_dentry_ops; +#ifdef CONFIG_CIFS_DFS_UPCALL +extern struct vfsmount *cifs_dfs_d_automount(struct path *path); +#else +#define cifs_dfs_d_automount NULL +#endif + /* Functions related to symlinks */ extern void *cifs_follow_link(struct dentry *direntry, struct nameidata *nd); extern void cifs_put_link(struct dentry *direntry, diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c index 1e95dd63563..dd5f22918c3 100644 --- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -675,6 +675,7 @@ cifs_d_revalidate(struct dentry *direntry, struct nameidata *nd) const struct dentry_operations cifs_dentry_ops = { .d_revalidate = cifs_d_revalidate, + .d_automount = cifs_dfs_d_automount, /* d_delete: cifs_d_delete, */ /* not needed except for debugging */ }; @@ -711,4 +712,5 @@ const struct dentry_operations cifs_ci_dentry_ops = { .d_revalidate = cifs_d_revalidate, .d_hash = cifs_ci_hash, .d_compare = cifs_ci_compare, + .d_automount = cifs_dfs_d_automount, }; diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index b06b6062024..6c9ee8014ff 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -32,7 +32,7 @@ #include "fscache.h" -static void cifs_set_ops(struct inode *inode, const bool is_dfs_referral) +static void cifs_set_ops(struct inode *inode) { struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); @@ -60,7 +60,7 @@ static void cifs_set_ops(struct inode *inode, const bool is_dfs_referral) break; case S_IFDIR: #ifdef CONFIG_CIFS_DFS_UPCALL - if (is_dfs_referral) { + if (IS_AUTOMOUNT(inode)) { inode->i_op = &cifs_dfs_referral_inode_operations; } else { #else /* NO DFS support, treat as a directory */ @@ -167,7 +167,9 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) } spin_unlock(&inode->i_lock); - cifs_set_ops(inode, fattr->cf_flags & CIFS_FATTR_DFS_REFERRAL); + if (fattr->cf_flags & CIFS_FATTR_DFS_REFERRAL) + inode->i_flags |= S_AUTOMOUNT; + cifs_set_ops(inode); } void -- cgit v1.2.3-18-g5258 From db3729153e82ba3ada89681f26c4f1b6d6807a80 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:53 +0000 Subject: Remove the automount through follow_link() kludge code from pathwalk Remove the automount through follow_link() kludge code from pathwalk in favour of using d_automount(). Signed-off-by: David Howells Acked-by: Ian Kent Signed-off-by: Al Viro --- fs/namei.c | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index dc50bfb2f5d..61995fba4e2 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1341,17 +1341,6 @@ fail: return PTR_ERR(dentry); } -/* - * This is a temporary kludge to deal with "automount" symlinks; proper - * solution is to trigger them on follow_mount(), so that do_lookup() - * would DTRT. To be killed before 2.6.34-final. - */ -static inline int follow_on_final(struct inode *inode, unsigned lookup_flags) -{ - return inode && unlikely(inode->i_op->follow_link) && - ((lookup_flags & LOOKUP_FOLLOW) || S_ISDIR(inode->i_mode)); -} - /* * Name resolution. * This is the basic name resolution function, turning a pathname into @@ -1490,7 +1479,8 @@ last_component: err = do_lookup(nd, &this, &next, &inode); if (err) break; - if (follow_on_final(inode, lookup_flags)) { + if (inode && unlikely(inode->i_op->follow_link) && + (lookup_flags & LOOKUP_FOLLOW)) { if (nameidata_dentry_drop_rcu_maybe(nd, next.dentry)) return -ECHILD; BUG_ON(inode != next.dentry->d_inode); @@ -2543,8 +2533,7 @@ reval: struct inode *linki = link.dentry->d_inode; void *cookie; error = -ELOOP; - /* S_ISDIR part is a temporary automount kludge */ - if (!(nd.flags & LOOKUP_FOLLOW) && !S_ISDIR(linki->i_mode)) + if (!(nd.flags & LOOKUP_FOLLOW)) goto exit_dput; if (count++ == 32) goto exit_dput; -- cgit v1.2.3-18-g5258 From 10584211e48036182212b598cc53331776406d60 Mon Sep 17 00:00:00 2001 From: Ian Kent Date: Fri, 14 Jan 2011 18:45:58 +0000 Subject: autofs4: Add d_automount() dentry operation Add a function to use the newly defined ->d_automount() dentry operation for triggering mounts instead of doing the user space callback in ->lookup() and ->d_revalidate(). Note, to be useful the subsequent patch to add the ->d_manage() dentry operation is also needed so the discussion of functionality is deferred to that patch. Signed-off-by: Ian Kent Signed-off-b