From fa9f0e4925c7796afd14bf7bbf7a064078818bbc Mon Sep 17 00:00:00 2001
From: David Teigland
Date: Fri, 8 Sep 2006 08:36:35 -0500
Subject: [DLM] confirm master for recovered waiting requests

Fixing the following scenario:
- A request is on the waiters list waiting for a reply from a remote node.
- The request is the first one on the resource, so first_lkid is set.
- The remote node fails causing recovery.
- During recovery the requesting node becomes master.
- The request is now processed locally instead of being a remote operation.
- At this point we need to call confirm_master() on the resource since
  we're certain we're now the master node.  This will clear first_lkid.
- We weren't calling confirm_master(), so first_lkid was not being cleared
  causing subsequent requests on that resource to get stuck.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse
---
 fs/dlm/lock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 67247f0b508..af2f2f01bd5 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3283,6 +3283,8 @@ int dlm_recover_waiters_post(struct dlm_ls *ls)
 			hold_rsb(r);
 			lock_rsb(r);
 			_request_lock(r, lkb);
+			if (is_master(r))
+				confirm_master(r, 0);
 			unlock_rsb(r);
 			put_rsb(r);
 			break;
-- 
cgit v1.2.3-18-g5258
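
The scenario in the commit message can be sketched with a small user-space toy model. This is not the code in fs/dlm/lock.c: the toy_* names, the struct layouts, the "stuck" counter and the apply_fix flag are all hypothetical, and the convention that nodeid 0 means "this node is master" is an assumption made for the sketch. It only illustrates why a first_lkid that is never cleared keeps later requests on the same resource from proceeding, and how confirming mastership after the local request releases them.

```c
/* Toy model of the first_lkid gate. NOT the kernel's fs/dlm/lock.c code;
 * every toy_* name and field below is hypothetical. */
#include <stdbool.h>
#include <stdint.h>

struct toy_rsb {
	int nodeid;          /* assumed: 0 = this node is master, >0 = remote master */
	uint32_t first_lkid; /* id of the first request while mastery is unconfirmed, 0 if none */
	int stuck;           /* number of requests parked behind first_lkid */
};

struct toy_lkb {
	uint32_t id;
};

static bool toy_is_master(const struct toy_rsb *r)
{
	return r->nodeid == 0;
}

/* Returns true if the request may proceed, false if it is parked
 * because another request still holds first_lkid. */
static bool toy_request_lock(struct toy_rsb *r, const struct toy_lkb *lkb)
{
	if (r->first_lkid && r->first_lkid != lkb->id) {
		r->stuck++;
		return false;
	}
	return true;
}

/* Mastership is now certain: clear first_lkid so parked requests can run. */
static void toy_confirm_master(struct toy_rsb *r)
{
	r->first_lkid = 0;
	r->stuck = 0; /* in this model, parked requests are simply released */
}

/* Post-recovery handling of a request that was waiting on a remote reply.
 * apply_fix selects the behavior with or without the two added lines. */
static void toy_recover_waiter(struct toy_rsb *r, const struct toy_lkb *lkb,
			       bool apply_fix)
{
	(void)toy_request_lock(r, lkb); /* the request is now processed locally */
	if (apply_fix && toy_is_master(r))
		toy_confirm_master(r);
}
```

With apply_fix set to false, a resource recovered with first_lkid still set keeps rejecting any request whose id differs from first_lkid. With apply_fix set to true, the same sequence leaves first_lkid at 0 and the next request proceeds.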