diff options
| author | Jay Vosburgh <fubar@us.ibm.com> | 2011-10-28 15:42:50 +0000 | 
|---|---|---|
| committer | David S. Miller <davem@davemloft.net> | 2011-10-30 03:13:14 -0400 | 
| commit | e6d265e8504ab4a3368b8645d318b344ee88b280 (patch) | |
| tree | 6fd2bc16819bae491e56be2658fc3298b7efa92c /tools/perf/util/scripting-engines/trace-event-python.c | |
| parent | 10ee0faed92c8af4baebd633372136a6608a41ea (diff) | |
bonding: eliminate bond_close race conditions
This patch resolves two sets of race conditions.
	Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the
first, as follows:
The bond_close() calls cancel_delayed_work() to cancel delayed works.
It, however, cannot cancel works that were already queued in workqueue.
The bond_open() initializes work->data, and proccess_one_work() refers
get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when
work->data has been initialized. Thus, a panic occurs.
	He included a patch that converted the cancel_delayed_work calls
in bond_close to flush_delayed_work_sync, which eliminated the above
problem.
	His patch is incorporated, at least in principle, into this
patch.  In this patch, we use cancel_delayed_work_sync in place of
flush_delayed_work_sync, and also convert bond_uninit in addition to
bond_close.
	This conversion to _sync, however, opens new races between
bond_close and three periodically executing workqueue functions:
bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon.
	The race occurs because bond_close and bond_uninit are always
called with RTNL held, and these workqueue functions may acquire RTNL to
perform failover-related activities.  If bond_close or bond_uninit is
waiting in cancel_delayed_work_sync, deadlock occurs.
	These deadlocks are resolved by having the workqueue functions
acquire RTNL conditionally.  If the rtnl_trylock() fails, the functions
reschedule and return immediately.  For the cases that are attempting to
perform link failover, a delay of 1 is used; for the other cases, the
normal interval is used (as those activities are not as time critical).
	Additionally, the bond_mii_monitor function now stores the delay
in a variable (mimicing the structure of activebackup_arp_mon).
	Lastly, all of the above renders the kill_timers sentinel moot,
and therefore it has been removed.
Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'tools/perf/util/scripting-engines/trace-event-python.c')
0 files changed, 0 insertions, 0 deletions
