From e6d265e8504ab4a3368b8645d318b344ee88b280 Mon Sep 17 00:00:00 2001 From: Jay Vosburgh Date: Fri, 28 Oct 2011 15:42:50 +0000 Subject: bonding: eliminate bond_close race conditions This patch resolves two sets of race conditions. Mitsuo Hayasaka reported the first, as follows: The bond_close() calls cancel_delayed_work() to cancel delayed works. It, however, cannot cancel works that were already queued in workqueue. The bond_open() initializes work->data, and proccess_one_work() refers get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when work->data has been initialized. Thus, a panic occurs. He included a patch that converted the cancel_delayed_work calls in bond_close to flush_delayed_work_sync, which eliminated the above problem. His patch is incorporated, at least in principle, into this patch. In this patch, we use cancel_delayed_work_sync in place of flush_delayed_work_sync, and also convert bond_uninit in addition to bond_close. This conversion to _sync, however, opens new races between bond_close and three periodically executing workqueue functions: bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon. The race occurs because bond_close and bond_uninit are always called with RTNL held, and these workqueue functions may acquire RTNL to perform failover-related activities. If bond_close or bond_uninit is waiting in cancel_delayed_work_sync, deadlock occurs. These deadlocks are resolved by having the workqueue functions acquire RTNL conditionally. If the rtnl_trylock() fails, the functions reschedule and return immediately. For the cases that are attempting to perform link failover, a delay of 1 is used; for the other cases, the normal interval is used (as those activities are not as time critical). Additionally, the bond_mii_monitor function now stores the delay in a variable (mimicing the structure of activebackup_arp_mon). Lastly, all of the above renders the kill_timers sentinel moot, and therefore it has been removed. Tested-by: Mitsuo Hayasaka Signed-off-by: Jay Vosburgh Signed-off-by: David S. Miller --- drivers/net/bonding/bonding.h | 1 - 1 file changed, 1 deletion(-) (limited to 'drivers/net/bonding/bonding.h') diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h index 82fec5fc75d..1aecc37e5b4 100644 --- a/drivers/net/bonding/bonding.h +++ b/drivers/net/bonding/bonding.h @@ -222,7 +222,6 @@ struct bonding { struct slave *); rwlock_t lock; rwlock_t curr_slave_lock; - s8 kill_timers; u8 send_peer_notif; s8 setup_by_slave; s8 igmp_retrans; -- cgit v1.2.3-18-g5258