aboutsummaryrefslogtreecommitdiff
path: root/Documentation/filesystems/porting
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems/porting')
-rw-r--r--Documentation/filesystems/porting119
1 files changed, 94 insertions, 25 deletions
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index dfbcd1b00b0..0f3a1390bf0 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -94,9 +94,8 @@ protected.
---
[mandatory]
-BKL is also moved from around sb operations. ->write_super() Is now called
-without BKL held. BKL should have been shifted into individual fs sb_op
-functions. If you don't need it, remove it.
+BKL is also moved from around sb operations. BKL should have been shifted into
+individual fs sb_op functions. If you don't need it, remove it.
---
[informational]
@@ -282,7 +281,7 @@ ext2_write_failed and callers for an example.
[mandatory]
- ->truncate is going away. The whole truncate sequence needs to be
+ ->truncate is gone. The whole truncate sequence needs to be
implemented in ->setattr, which is now mandatory for filesystems
implementing on-disk size changes. Start with a copy of the old inode_setattr
and vmtruncate, and the reorder the vmtruncate + foofs_vmtruncate sequence to
@@ -296,21 +295,22 @@ in the beginning of ->setattr unconditionally.
->clear_inode() and ->delete_inode() are gone; ->evict_inode() should
be used instead. It gets called whenever the inode is evicted, whether it has
remaining links or not. Caller does *not* evict the pagecache or inode-associated
-metadata buffers; getting rid of those is responsibility of method, as it had
-been for ->delete_inode().
- ->drop_inode() returns int now; it's called on final iput() with inode_lock
-held and it returns true if filesystems wants the inode to be dropped. As before,
-generic_drop_inode() is still the default and it's been updated appropriately.
-generic_delete_inode() is also alive and it consists simply of return 1. Note that
-all actual eviction work is done by caller after ->drop_inode() returns.
- clear_inode() is gone; use end_writeback() instead. As before, it must
-be called exactly once on each call of ->evict_inode() (as it used to be for
-each call of ->delete_inode()). Unlike before, if you are using inode-associated
-metadata buffers (i.e. mark_buffer_dirty_inode()), it's your responsibility to
-call invalidate_inode_buffers() before end_writeback().
- No async writeback (and thus no calls of ->write_inode()) will happen
-after end_writeback() returns, so actions that should not overlap with ->write_inode()
-(e.g. freeing on-disk inode if i_nlink is 0) ought to be done after that call.
+metadata buffers; the method has to use truncate_inode_pages_final() to get rid
+of those. Caller makes sure async writeback cannot be running for the inode while
+(or after) ->evict_inode() is called.
+
+ ->drop_inode() returns int now; it's called on final iput() with
+inode->i_lock held and it returns true if filesystems wants the inode to be
+dropped. As before, generic_drop_inode() is still the default and it's been
+updated appropriately. generic_delete_inode() is also alive and it consists
+simply of return 1. Note that all actual eviction work is done by caller after
+->drop_inode() returns.
+
+ As before, clear_inode() must be called exactly once on each call of
+->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike
+before, if you are using inode-associated metadata buffers (i.e.
+mark_buffer_dirty_inode()), it's your responsibility to call
+invalidate_inode_buffers() before clear_inode().
NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out
if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput()
@@ -354,12 +354,10 @@ protects *all* the dcache state of a given dentry.
via rcu-walk path walk (basically, if the file can have had a path name in the
vfs namespace).
- i_dentry and i_rcu share storage in a union, and the vfs expects
-i_dentry to be reinitialized before it is freed, so an:
-
- INIT_LIST_HEAD(&inode->i_dentry);
-
-must be done in the RCU callback.
+ Even though i_dentry and i_rcu share storage in a union, we will
+initialize the former in inode_init_always(), so just leave it alone in
+the callback. It used to be necessary to clean it there, but not anymore
+(starting at 3.2).
--
[recommended]
@@ -394,3 +392,74 @@ file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
so the i_size should not change when hole punching, even when puching the end of
a file off.
+
+--
+[mandatory]
+ ->get_sb() is gone. Switch to use of ->mount(). Typically it's just
+a matter of switching from calling get_sb_... to mount_... and changing the
+function type. If you were doing it manually, just switch from setting ->mnt_root
+to some pointer to returning that pointer. On errors return ERR_PTR(...).
+
+--
+[mandatory]
+ ->permission() and generic_permission()have lost flags
+argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask.
+ generic_permission() has also lost the check_acl argument; ACL checking
+has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl
+to read an ACL from disk.
+
+--
+[mandatory]
+ If you implement your own ->llseek() you must handle SEEK_HOLE and
+SEEK_DATA. You can hanle this by returning -EINVAL, but it would be nicer to
+support it in some way. The generic handler assumes that the entire file is
+data and there is a virtual hole at the end of the file. So if the provided
+offset is less than i_size and SEEK_DATA is specified, return the same offset.
+If the above is true for the offset and you are given SEEK_HOLE, return the end
+of the file. If the offset is i_size or greater return -ENXIO in either case.
+
+[mandatory]
+ If you have your own ->fsync() you must make sure to call
+filemap_write_and_wait_range() so that all dirty pages are synced out properly.
+You must also keep in mind that ->fsync() is not called with i_mutex held
+anymore, so if you require i_mutex locking you must make sure to take it and
+release it yourself.
+
+--
+[mandatory]
+ d_alloc_root() is gone, along with a lot of bugs caused by code
+misusing it. Replacement: d_make_root(inode). The difference is,
+d_make_root() drops the reference to inode if dentry allocation fails.
+
+--
+[mandatory]
+ The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and
+->lookup() do *not* take struct nameidata anymore; just the flags.
+--
+[mandatory]
+ ->create() doesn't take struct nameidata *; unlike the previous
+two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
+local filesystems can ignore tha argument - they are guaranteed that the
+object doesn't exist. It's remote/distributed ones that might care...
+--
+[mandatory]
+ FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate()
+in your dentry operations instead.
+--
+[mandatory]
+ vfs_readdir() is gone; switch to iterate_dir() instead
+--
+[mandatory]
+ ->readdir() is gone now; switch to ->iterate()
+[mandatory]
+ vfs_follow_link has been removed. Filesystems must use nd_set_link
+ from ->follow_link for normal symlinks, or nd_jump_link for magic
+ /proc/<pid> style links.
+--
+[mandatory]
+ iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
+ called with both ->i_lock and inode_hash_lock held; the former is *not*
+ taken anymore, so verify that your callbacks do not rely on it (none
+ of the in-tree instances did). inode_hash_lock is still held,
+ of course, so they are still serialized wrt removal from inode hash,
+ as well as wrt set() callback of iget5_locked().