| field | value | date |
|---|---|---|
| author | Grant Likely <grant.likely@secretlab.ca> | 2010-01-28 14:38:25 -0700 |
| committer | Grant Likely <grant.likely@secretlab.ca> | 2010-01-28 14:38:25 -0700 |
| commit | 0ada0a73120c28cc432bcdbac061781465c2f48f | |
| tree | d17cadd4ea47e25d9e48e7d409a39c84268fbd27 /Documentation/filesystems/nfs/Exporting | |
| parent | 6016a363f6b56b46b24655bcfc0499b715851cf3 | |
| parent | 92dcffb916d309aa01778bf8963a6932e4014d07 | |
Merge commit 'v2.6.33-rc5' into secretlab/test-devicetree
Diffstat (limited to 'Documentation/filesystems/nfs/Exporting')
| -rw-r--r-- | Documentation/filesystems/nfs/Exporting | 147 | 
1 files changed, 147 insertions, 0 deletions
diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting
new file mode 100644
index 00000000000..87019d2b598
--- /dev/null
+++ b/Documentation/filesystems/nfs/Exporting
@@ -0,0 +1,147 @@
+
+Making Filesystems Exportable
+=============================
+
+Overview
+--------
+
+All filesystem operations require a dentry (or two) as a starting
+point.  Local applications have a reference-counted hold on suitable
+dentries via open file descriptors or cwd/root.  However remote
+applications that access a filesystem via a remote filesystem protocol
+such as NFS may not be able to hold such a reference, and so need a
+different way to refer to a particular dentry.  As the alternative
+form of reference needs to be stable across renames, truncates, and
+server-reboot (among other things, though these tend to be the most
+problematic), there is no simple answer like 'filename'.
+
+The mechanism discussed here allows each filesystem implementation to
+specify how to generate an opaque (outside of the filesystem) byte
+string for any dentry, and how to find an appropriate dentry for any
+given opaque byte string.
+This byte string will be called a "filehandle fragment" as it
+corresponds to part of an NFS filehandle.
+
+A filesystem which supports the mapping between filehandle fragments
+and dentries will be termed "exportable".
+
+
+
+Dcache Issues
+-------------
+
+The dcache normally contains a proper prefix of any given filesystem
+tree.  This means that if any filesystem object is in the dcache, then
+all of the ancestors of that filesystem object are also in the dcache.
+As normal access is by filename this prefix is created naturally and
+maintained easily (by each object maintaining a reference count on
+its parent).
+
+However when objects are included into the dcache by interpreting a
+filehandle fragment, there is no automatic creation of a path prefix
+for the object.  This leads to two related but distinct features of
+the dcache that are not needed for normal filesystem access.
+
+1/ The dcache must sometimes contain objects that are not part of the
+   proper prefix. i.e that are not connected to the root.
+2/ The dcache must be prepared for a newly found (via ->lookup) directory
+   to already have a (non-connected) dentry, and must be able to move
+   that dentry into place (based on the parent and name in the
+   ->lookup).   This is particularly needed for directories as
+   it is a dcache invariant that directories only have one dentry.
+
+To implement these features, the dcache has:
+
+a/ A dentry flag DCACHE_DISCONNECTED which is set on
+   any dentry that might not be part of the proper prefix.
+   This is set when anonymous dentries are created, and cleared when a
+   dentry is noticed to be a child of a dentry which is in the proper
+   prefix.
+
+b/ A per-superblock list "s_anon" of dentries which are the roots of
+   subtrees that are not in the proper prefix.  These dentries, as
+   well as the proper prefix, need to be released at unmount time.  As
+   these dentries will not be hashed, they are linked together on the
+   d_hash list_head.
+
+c/ Helper routines to allocate anonymous dentries, and to help attach
+   loose directory dentries at lookup time. They are:
+    d_alloc_anon(inode) will return a dentry for the given inode.
+      If the inode already has a dentry, one of those is returned.
+      If it doesn't, a new anonymous (IS_ROOT and
+        DCACHE_DISCONNECTED) dentry is allocated and attached.
+      In the case of a directory, care is taken that only one dentry
+      can ever be attached.
+    d_splice_alias(inode, dentry) will make sure that there is a
+      dentry with the same name and parent as the given dentry, and
+      which refers to the given inode.
+      If the inode is a directory and already has a dentry, then that
+      dentry is d_moved over the given dentry.
+      If the passed dentry gets attached, care is taken that this is
+      mutually exclusive to a d_alloc_anon operation.
+      If the passed dentry is used, NULL is returned, else the used
+      dentry is returned.  This corresponds to the calling pattern of
+      ->lookup.
+
+
+Filesystem Issues
+-----------------
+
+For a filesystem to be exportable it must:
+
+   1/ provide the filehandle fragment routines described below.
+   2/ make sure that d_splice_alias is used rather than d_add
+      when ->lookup finds an inode for a given parent and name.
+      Typically the ->lookup routine will end with a:
+
+		return d_splice_alias(inode, dentry);
+	}
+
+
+
+  A file system implementation declares that instances of the filesystem
+are exportable by setting the s_export_op field in the struct
+super_block.  This field must point to a "struct export_operations"
+struct which has the following members:
+
+ encode_fh  (optional)
+    Takes a dentry and creates a filehandle fragment which can later be used
+    to find or create a dentry for the same object.  The default
+    implementation creates a filehandle fragment that encodes a 32bit inode
+    and generation number for the inode encoded, and if necessary the
+    same information for the parent.
+
+  fh_to_dentry (mandatory)
+    Given a filehandle fragment, this should find the implied object and
+    create a dentry for it (possibly with d_alloc_anon).
+
+  fh_to_parent (optional but strongly recommended)
+    Given a filehandle fragment, this should find the parent of the
+    implied object and create a dentry for it (possibly with d_alloc_anon).
+    May fail if the filehandle fragment is too small.
+
+  get_parent (optional but strongly recommended)
+    When given a dentry for a directory, this should return  a dentry for
+    the parent.  Quite possibly the parent dentry will have been allocated
+    by d_alloc_anon.  The default get_parent function just returns an error
+    so any filehandle lookup that requires finding a parent will fail.
+    ->lookup("..") is *not* used as a default as it can leave ".." entries
+    in the dcache which are too messy to work with.
+
+  get_name (optional)
+    When given a parent dentry and a child dentry, this should find a name
+    in the directory identified by the parent dentry, which leads to the
+    object identified by the child dentry.  If no get_name function is
+    supplied, a default implementation is provided which uses vfs_readdir
+    to find potential names, and matches inode numbers to find the correct
+    match.
+
+
+A filehandle fragment consists of an array of 1 or more 4byte words,
+together with a one byte "type".
+The decode_fh routine should not depend on the stated size that is
+passed to it.  This size may be larger than the original filehandle
+generated by encode_fh, in which case it will have been padded with
+nuls.  Rather, the encode_fh routine should choose a "type" which
+indicates the decode_fh how much of the filehandle is valid, and how
+it should be interpreted.
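
The closing paragraph of the added file describes a filehandle fragment as an array of 4-byte words plus a one-byte "type", and says the decoding side should trust the type rather than the stated size. The following is a minimal userspace sketch of that rule only, not kernel code: it does not use struct export_operations, dentries, or any kernel API. The type codes, struct obj_id, and the sketch_encode_fh / sketch_decode_fh names are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes for this sketch; not the kernel's values. */
enum sketch_fh_type {
	SKETCH_FH_INO_GEN = 1,        /* 2 words: inode, generation */
	SKETCH_FH_INO_GEN_PARENT = 2  /* 4 words: plus parent inode, generation */
};

/* Hypothetical object identity: 32-bit inode number and generation. */
struct obj_id {
	uint32_t ino;
	uint32_t gen;
};

/* Number of valid 4-byte words implied by a type, or 0 if unknown. */
static int sketch_words_for_type(int type)
{
	switch (type) {
	case SKETCH_FH_INO_GEN:
		return 2;
	case SKETCH_FH_INO_GEN_PARENT:
		return 4;
	default:
		return 0;
	}
}

/*
 * Encode self (and the parent, if given) into fh[].
 * On entry *max_words is the buffer capacity in words; on success it is
 * set to the number of words written.  Returns the chosen type, or -1
 * if the buffer is too small.
 */
static int sketch_encode_fh(const struct obj_id *self,
			    const struct obj_id *parent,
			    uint32_t *fh, int *max_words)
{
	int type = parent ? SKETCH_FH_INO_GEN_PARENT : SKETCH_FH_INO_GEN;
	int need = sketch_words_for_type(type);

	assert(self && fh && max_words);
	if (*max_words < need)
		return -1;

	fh[0] = self->ino;
	fh[1] = self->gen;
	if (parent) {
		fh[2] = parent->ino;
		fh[3] = parent->gen;
	}
	*max_words = need;
	return type;
}

/*
 * Decode a fragment.  The type, not stated_words, decides how many words
 * are meaningful: stated_words may be larger because of nul padding.
 * Returns 0 on success, -1 for an unknown type or a truncated fragment.
 */
static int sketch_decode_fh(const uint32_t *fh, int stated_words, int type,
			    struct obj_id *self, struct obj_id *parent,
			    int *has_parent)
{
	int valid = sketch_words_for_type(type);

	if (valid == 0 || stated_words < valid)
		return -1;

	self->ino = fh[0];
	self->gen = fh[1];
	*has_parent = (type == SKETCH_FH_INO_GEN_PARENT);
	if (*has_parent) {
		parent->ino = fh[2];
		parent->gen = fh[3];
	}
	return 0;
}
```

In this sketch a caller can encode into a larger, zero-filled buffer and pass the full buffer size to the decoder. The decoder still reads only the words that the type declares valid, which is the behaviour the paragraph asks for.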
