NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery

commit f22e5edd2244609aed3906207a62223e7707a34d upstream.

Andy Adamson reports:

The state manager is recovering expired state and recovery OPENs are being
processed. If kswapd is pruning inodes at the same time, a deadlock can occur
when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
resultant layoutreturn gets an error that the state manager is to handle,
causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.

At the same time an open is waiting for the inode deletion to complete in
__wait_on_freeing_inode.

If the open is either the open called by the state manager, or an open from
the same open owner that is holding the NFSv4 sequence id which causes the
OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
then the state is deadlocked with kswapd.

The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY.
We already know that layouts are dropped on all server reboots, and that
the server has to be coded to deal with the "forgetful client model" that
doesn't send layoutreturns.

Reported-by: Andy Adamson <andros@netapp.com>
Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com
Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
author: Trond Myklebust <Trond.Myklebust@netapp.com> 2013-12-04 12:09:45 -0500
committer: Jiri Slaby <jslaby@suse.cz> 2014-04-03 10:32:13 +0200
commit: da6bf1d4a5ac4141869039c2a36bf1c80ed36953
tree: dd7b051e438eab81402f441a34d0ba114535b4df /fs
parent: f5ddb3e95e4caa2414ddd79ed6f046f61da218a7
1 files changed, 8 insertions, 1 deletions
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 5a5fb98edb8..9edb753d9e3 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7273,7 +7273,14 @@ static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
 		return;
 
 	server = NFS_SERVER(lrp->args.inode);
-	if (nfs4_async_handle_error(task, server, NULL) == -EAGAIN) {
+	switch (task->tk_status) {
+	default:
+		task->tk_status = 0;
+	case 0:
+		break;
+	case -NFS4ERR_DELAY:
+		if (nfs4_async_handle_error(task, server, NULL) != -EAGAIN)
+			break;
 		rpc_restart_call_prepare(task);
 		return;
 	}
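
The status filtering introduced by this hunk can be modelled outside the kernel. The following is a minimal sketch, not the kernel code: `struct fake_task` and `layoutreturn_should_retry()` are hypothetical names made up for illustration, and the model always retries on NFS4ERR_DELAY, whereas the patch retries only when nfs4_async_handle_error() returns -EAGAIN. The numeric error values are those assigned in RFC 5661.

```c
#include <stdbool.h>

/* NFSv4 status codes as assigned in RFC 5661. */
#define NFS4ERR_DELAY   10008
#define NFS4ERR_EXPIRED 10011

/* Hypothetical stand-in for struct rpc_task; only the status field is modelled. */
struct fake_task {
	int tk_status;
};

/*
 * Hypothetical helper mirroring the shape of the switch in the hunk:
 * returns true when the LAYOUTRETURN should be restarted.
 * Any error other than NFS4ERR_DELAY is cleared to 0, so it is never
 * handed to the state manager and the caller never blocks on recovery.
 */
static bool layoutreturn_should_retry(struct fake_task *task)
{
	switch (task->tk_status) {
	case 0:
		/* Success: nothing to do. */
		return false;
	case -NFS4ERR_DELAY:
		/* Simplification: always retry; the patch asks the error handler first. */
		return true;
	default:
		/* Ignore the error so that inode eviction can complete. */
		task->tk_status = 0;
		return false;
	}
}
```

In this model an expired-state error such as -NFS4ERR_EXPIRED is reset to 0 and no retry is requested, which is the behaviour that lets evict_inode finish instead of waiting on state recovery.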