aboutsummaryrefslogtreecommitdiff
path: root/drivers/scsi/scsi_transport_fc.c
AgeCommit message (Collapse)Author
2012-08-01Merge branch 'for-3.6/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block IO bits from Jens Axboe: "The most complicated part if this is the request allocation rework by Tejun, which has been queued up for a long time and has been in for-next ditto as well. There are a few commits from yesterday and today, mostly trivial and obvious fixes. So I'm pretty confident that it is sound. It's also smaller than usual." * 'for-3.6/core' of git://git.kernel.dk/linux-block: block: remove dead func declaration block: add partition resize function to blkpg ioctl block: uninitialized ioc->nr_tasks triggers WARN_ON block: do not artificially constrain max_sectors for stacking drivers blkcg: implement per-blkg request allocation block: prepare for multiple request_lists block: add q->nr_rqs[] and move q->rq.elvpriv to q->nr_rqs_elvpriv blkcg: inline bio_blkcg() and friends block: allocate io_context upfront block: refactor get_request[_wait]() block: drop custom queue draining used by scsi_transport_{iscsi|fc} mempool: add @gfp_mask to mempool_create_node() blkcg: make root blkcg allocation use %GFP_KERNEL blkcg: __blkg_lookup_create() doesn't need radix preload
2012-07-20[SCSI] core, classes, mpt2sas: have scsi_internal_device_unblock take new stateMike Christie
This has scsi_internal_device_unblock/scsi_target_unblock take the new state to set the devices as an argument instead of always setting to running. The patch also converts users of these functions. This allows the FC and iSCSI class to transition devices from blocked to transport-offline, so that when fast_io_fail/replacement_timeout has fired we do not set the devices back to running. Instead, we set them to SDEV_TRANSPORT_OFFLINE. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] fc: add some more FC specific stats to fc_hostVasu Dev
The libfc provides more flexibility and with that we can monitor some more FC specific stats for FC exches or FCP error cases, this patch add such new FC stats. The patch adds *only* FC specific new stats to existing fc_host attribute container. Added stats names are self explanatory as existing FC stats already has, however anyway still added commentary along their definition to describe them. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Acked-by : Robert Love <robert.w.love@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-25block: drop custom queue draining used by scsi_transport_{iscsi|fc}Tejun Heo
iscsi_remove_host() uses bsg_remove_queue() which implements custom queue draining. fc_bsg_remove() open-codes mostly identical logic. The draining logic isn't correct in that blk_stop_queue() doesn't prevent new requests from being queued - it just stops processing, so nothing prevents new requests to be queued after the logic determines that the queue is drained. blk_cleanup_queue() now implements proper queue draining and these custom draining logics aren't necessary. Drop them and use bsg_unregister_queue() + blk_cleanup_queue() instead. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: James Smart <james.smart@emulex.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-10[SCSI] fc class: fix scanning when devs are offlineMike Christie
When a rport is added back or the role is changed the fc class will queue a scan and then call scsi_target_unblock. The problem with this is if the devices are in the SDEV_OFFLINE state and the scan is run before the scsi_target_unblock, then the scan will see LUN0 as offline and the scan will fail. This patch moves the unblock call to before the scan, so we know the device state will be set correctly when the scan is run. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] scsi_transport_fc: Add FDMI host attributesNeerav Parikh
This adds FC-GS Fabric Device Management Interface (FDMI) related attributes to fc_host_attr structure. This is in preparation for allowing FDMI attributes to be registered via libfc. Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Acked-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-01-16[SCSI] scsi_transport_fc: Clear Devloss Callback Done flag in ↵James Smart
fc_remote_port_rolechg This patch fixes a bug where devloss is not called on fc_host teardown. The issue is seen if the LLDD uses rport_rolechg to add the target role to an rport. When an rport goes away, the LLDD will call fc_remote_port_delete, which will start the devloss timer. If the timer expires, the transport will call the devloss callback and set the FC_RPORT_DEVLOSS_CALLBK_DONE flag. However, the rport structure is not deleted, it is retained to store the SCSI id mappings for the rport in case it comes back. In the scenario where it does come back, and the driver calls fc_remote_port_add, but does not indicate the "target" role for the rport - the create will clear the structure, but forgets to clear FC_RPORT_DEVLOSS_CALLBK_DONE flag (which is cleared if it's added with the target role). The secondary call, of fc_remote_port_rolechg to add the target role also does not clear the flag. Thus, the next time the rport goes away, the resulting devloss timer expiration will not call the driver callback as the flag is still set. This patch adds the FC_RPORT_DEVLOSS_CALLBK_DONE flags to the list of those that are cleared upon reuse of the rport structure. Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-05-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (110 commits) [SCSI] qla2xxx: Refactor call to qla2xxx_read_sfp for thermal temperature. [SCSI] qla2xxx: Unify the read/write sfp mailbox command routines. [SCSI] qla2xxx: Clear complete initialization control block. [SCSI] qla2xxx: Allow an override of the registered maximum LUN. [SCSI] qla2xxx: Add host number in reset and quiescent message logs. [SCSI] qla2xxx: Correctly read sfp single byte mailbox register. [SCSI] qla2xxx: Add qla82xx_rom_unlock() function. [SCSI] qla2xxx: Log if qla82xx firmware fails to load from flash. [SCSI] qla2xxx: Use passed in host to initialize local scsi_qla_host in queuecommand function [SCSI] qla2xxx: Correct buffer start in edc sysfs debug print. [SCSI] qla2xxx: Update firmware version after flash update for ISP82xx. [SCSI] qla2xxx: Fix hang during driver unload when vport is active. [SCSI] qla2xxx: Properly set the dsd_list_len for dsd_chaining in cmd type 6. [SCSI] qla2xxx: Fix virtual port failing to login after chip reset. [SCSI] qla2xxx: Fix vport delete hang when logins are outstanding. [SCSI] hpsa: Change memset using sizeof(ptr) to sizeof(*ptr) [SCSI] ipr: Rate limit DMA mapping errors [SCSI] hpsa: add P2000 to list of shared SAS devices [SCSI] hpsa: do not attempt PCI power management reset method if we know it won't work. [SCSI] hpsa: remove superfluous sleeps around reset code ...
2011-05-01[SCSI] scsi_transport_fc: Fix deadlock during fc_remove_hostNithin Nayak Sujir
Creating and destroying fcoe interface in a tight loop leads to a system deadlock with the following call traces: Call Trace: [<ffffffff814f4b3d>] schedule_timeout+0x1fd/0x2c0 [<ffffffff814f469f>] ? wait_for_common+0x4f/0x190 [<ffffffff814f469f>] ? wait_for_common+0x4f/0x190 [<ffffffff814f4737>] wait_for_common+0xe7/0x190 [<ffffffff81042fa0>] ? default_wake_function+0x0/0x20 [<ffffffff81082c2d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff814f48bd>] wait_for_completion+0x1d/0x20 [<ffffffff81066d90>] flush_workqueue+0x290/0x5f0 [<ffffffff81066b00>] ? flush_workqueue+0x0/0x5f0 [<ffffffff81067148>] destroy_workqueue+0x38/0x340 [<ffffffffa0260289>] fc_remove_host+0x1b9/0x1f0 [scsi_transport_fc] [<ffffffffa02ed195>] bnx2fc_if_destroy+0xc5/0x1f0 [bnx2fc] [<ffffffffa02ed33a>] bnx2fc_destroy+0x7a/0x100 [bnx2fc] [<ffffffffa02c789b>] fcoe_transport_destroy+0x9b/0x1b0 [libfcoe] [<ffffffff81069ec2>] param_attr_store+0x52/0x80 [<ffffffff81069976>] module_attr_store+0x26/0x30 [<ffffffff8119e726>] sysfs_write_file+0xe6/0x170 [<ffffffff81134710>] vfs_write+0xd0/0x1a0 [<ffffffff811348e4>] sys_write+0x54/0xa0 [<ffffffff81002e02>] system_call_fastpath+0x16/0x1b Call Trace: [<ffffffff81074865>] async_synchronize_cookie_domain+0x75/0x120 [<ffffffff8106caa0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81074925>] async_synchronize_cookie+0x15/0x20 [<ffffffff8107494c>] async_synchronize_full+0x1c/0x40 [<ffffffffa0057466>] sd_remove+0x36/0xc0 [sd_mod] [<ffffffff81358a75>] __device_release_driver+0x75/0xe0 [<ffffffff81358bef>] device_release_driver+0x2f/0x50 [<ffffffff81357aee>] bus_remove_device+0xbe/0x120 [<ffffffff813553ef>] device_del+0x12f/0x1e0 [<ffffffff8137454d>] __scsi_remove_device+0xbd/0xc0 [<ffffffff81374585>] scsi_remove_device+0x35/0x50 [<ffffffff813746a7>] __scsi_remove_target+0xe7/0x110 [<ffffffff81374730>] ? __remove_child+0x0/0x30 [<ffffffff81374753>] __remove_child+0x23/0x30 [<ffffffff81354a2c>] device_for_each_child+0x4c/0x80 [<ffffffff81374703>] scsi_remove_target+0x33/0x60 [<ffffffffa02622c6>] fc_starget_delete+0x26/0x30 [scsi_transport_fc] [<ffffffffa026271a>] fc_rport_final_delete+0xaa/0x200 [scsi_transport_fc] [<ffffffff8106585a>] process_one_work+0x1aa/0x540 [<ffffffff810657eb>] ? process_one_work+0x13b/0x540 [<ffffffffa0262670>] ? fc_rport_final_delete+0x0/0x200 [scsi_transport_fc] [<ffffffff81067ac9>] worker_thread+0x179/0x410 [<ffffffff81067950>] ? worker_thread+0x0/0x410 [<ffffffff8106c546>] kthread+0xb6/0xc0 [<ffffffff8103879b>] ? finish_task_switch+0x4b/0xe0 [<ffffffff81003ca4>] kernel_thread_helper+0x4/0x10 [<ffffffff814f7994>] ? restore_args+0x0/0x30 [<ffffffff8106c490>] ? kthread+0x0/0xc0 [<ffffffff81003ca0>] ? kernel_thread_helper+0x0/0x10 fc_remove_host() waits for flushing the workqueue, but it is stuck at flushing the first work. The first work doesnt complete, because it is waiting for async layer to complete the IOs. The async layer cannot complete the IO as the terminate_rport_io for the second work was not called, which will be called only when the first work completes. Hence the deadlock. To resolve this deadlock, the workqueue allocation has been modified from create_singlethread_workqueue() to alloc_workqueue(). In addition, fc_terminate_rport_io() should be called before the scsi_flush_work() to avoid the similar deadlock as above. scsi fc alloc queue. move terminate rport io before flush Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-19block: get rid of QUEUE_FLAG_REENTERJens Axboe
We are currently using this flag to check whether it's safe to call into ->request_fn(). If it is set, we punt to kblockd. But we get a lot of false positives and excessive punts to kblockd, which hurts performance. The only real abuser of this infrastructure is SCSI. So export the async queue run and convert SCSI over to use that. There's room for improvement in that SCSI need not always use the async call, but this fixes our performance issue and they can fix that up in due time. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-18block: add blk_run_queue_asyncChristoph Hellwig
Instead of overloading __blk_run_queue to force an offload to kblockd add a new blk_run_queue_async helper to do it explicitly. I've kept the blk_queue_stopped check for now, but I suspect it's not needed as the check we do when the workqueue items runs should be enough. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-10Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe
Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: remove per-queue pluggingJens Axboe
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-02block: add @force_kblockd to __blk_run_queue()Tejun Heo
__blk_run_queue() automatically either calls q->request_fn() directly or schedules kblockd depending on whether the function is recursed. blk-flush implementation needs to be able to explicitly choose kblockd. Add @force_kblockd. All the current users are converted to specify %false for the parameter and this patch doesn't introduce any behavior change. stable: This is prerequisite for fixing ide oops caused by the new blk-flush implementation. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-07[SCSI] fc class: add fc host dev loss sysfs fileMike Christie
This adds a fc host dev loss sysfs file. Instead of calling into the driver using the get_host_def_dev_loss_tmo callback, we allow drivers to init the dev loss like is done for other fc host params, and then the fc class will handle updating the value if the user writes to the new sysfs file. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-09[SCSI] scsi_transport_fc: fix blocked bsg request when fc object deletedJames Smart
When an rport is "blocked" and a bsg request is received, the bsg request gets placed on the queue but the queue stalls. If the fc object is then deleted - the bsg queue never restarts and keeps the reference on the object, and stops the overall teardown. This patch restarts the bsg queue on teardown and drains any pending requests, allowing the teardown to succeed. Signed-off-by: Carl Lajeunesse <carl.lajeunesse@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-05[SCSI] fc class: add fc host default default dev loss settingMike Christie
This patch adds a fc_host setting to store the default dev_loss_tmo. It is used if the driver has a callack to get the value from the LLD. If the callback is not set, then we use the fc class module default value. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-08-11drivers: scsi: use newly introduced hex_to_bin() methodAndy Shevchenko
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com> Cc: Adaptec OEM Raid Solutions <aacraid@adaptec.com> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Cc: James Smart <james.smart@emulex.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-16fix typos concerning "hierarchy"Uwe Kleine-König
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-04-11[SCSI] Allow FC LLD to fast-fail scsi eh by introducing new eh returnChristof Schmitt
If the scsi eh is running and then a FC LLD calls fc_remote_port_delete, the SCSI commands sent from the eh will fail. To prevent this, a FC LLD can call fc_block_scsi_eh from the eh callback, blocking the eh thread until the dev_loss_tmo fires or the remote port is available again. If (e.g. for a multipathing setup) the dev_loss_tmo is set to a very large value, thus preventing the scsi device removal , the scsi eh can block for a long time. For multipathing, the fast_io_fail_tmo is then set to a low value to detect path problems sooner. This patch introduces a new return code FAST_IO_FAIL. The function fc_block_scsi_eh now returns FAST_IO_FAIL when the fast_io_fail_tmo fires. This indicates that the LLD terminated all pending I/O requests and there are no more pending SCSI commands for the scsi eh to wait for. This return code can be passed back to the scsi eh to stop the escalation and finish the recovery process for this device. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-04-11[SCSI] scsi_transport_fc: Protect against overflow in dev_loss_tmoHannes Reinecke
The rport structure defines dev_loss_tmo as u32, which is later multiplied with HZ to get the actual timeout value. This might overflow for large dev_loss_tmo values. So we should be better using u64 as intermediate variables here to protect against overflow. Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-04-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] qla1280: retain firmware for error recovery [SCSI] attirbute_container: Initialize sysfs attributes with sysfs_attr_init [SCSI] advansys: fix regression with request_firmware change [SCSI] qla2xxx: Updated version number to 8.03.02-k2. [SCSI] qla2xxx: Prevent sending mbx commands from sysfs during isp reset. [SCSI] qla2xxx: Disable MSI on qla24xx chips other than QLA2432. [SCSI] qla2xxx: Check to make sure multique and CPU affinity support is not enabled at the same time. [SCSI] qla2xxx: Correct vp_idx checking during PORT_UPDATE processing. [SCSI] qla2xxx: Honour "Extended BB credits" bit for CNAs. [SCSI] scsi_transport_fc: Make sure commands are completed when rport is offline [SCSI] libiscsi: Fix recovery slowdown regression
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-27[SCSI] scsi_transport_fc: Make sure commands are completed when rport is offlineSarang Radke
blk_end_request doesn't complete a bidi request successfully The unfinished request eventually triggers a panic in timeout handling routine fc_bsg_job_timeout as req->special is NULL Use blk_end_request_all to end the request unconditionally Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-03-08[SCSI] scsi_transport_fc: Fix synchronization issue while deleting vportGal Rosen
The issue occur while deleting 60 virtual ports through the sys interface /sys/class/fc_vports/vport-X/vport_delete. It happen while in a mistake each request sent twice for the same vport. This interface is asynchronous, entering the delete request into a work queue, allowing more than one request to enter to the delete work queue. The result is a NULL pointer. The first request already delete the vport, while the second request got a pointer to the vport before the device destroyed. Re-create vport later cause system freeze. Solution: Check vport flags before entering the request to the work queue. [jejb: fixed int<->long problem on spinlock flags variable] Signed-off-by: Gal Rosen <galr@storwize.com> Acked-by: James Smart <james.smart@emulex.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-18[SCSI] scsi_transport_fc: Remove capping from dev_loss_tmoHannes Reinecke
Currently dev_loss_tmo is capped by SCSI_DEVICE_BLOCK_MAX_TIMEOUT. This causes problem with multipathing when the 'no_path_retry' setting exceeds the dev_loss_tmo setting, as then the system might run into a deadlock when all paths have been removed temporarily for longer than dev_loss_tmo. The principal reasons for the capping has been that we should not allow a remote port to remain in status 'blocked' indefinitely, so the capping is there to ensure that the port status is being reset eventually. However, the fast_io_fail_tmo will also move the remote port out of the 'blocked' state, so for any HBA driver implementing both the capping should really be on the fast_io_fail_tmo, and not on the dev_loss_tmo. This patch implements just that, ie the fast_io_fail_tmo is capped to SCSI_DEVICE_BLOCK_TIMEOUT and the capping is removed from dev_loss_tmo when fast_io_fail_tmo is set. This allows us to synchronize the dev_loss_tmo setting to the 'no_path_retry' setting from multipathing thus avoiding the deadlock. Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17[SCSI] scsi_transport_fc: Allow LLD to reset FC BSG timeoutSwen Schillig
The hardware used with zfcp cannot abort a currently pending CT or ELS request. Therefore we need the option to postpone the timeout triggered request abort within the fc layer, since there is nothing zfcp can do to stop the request at this point. Cc: James Smart <James.Smart@emulex.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-10[SCSI] fc class: fix fc_transport_init error handlingMike Christie
If transport_class_register fails we should unregister any registered classes, or we will leak memory or other resources. I did a quick modprobe of scsi_transport_fc to test the patch. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (222 commits) [SCSI] zfcp: Remove flag ZFCP_STATUS_FSFREQ_TMFUNCNOTSUPP [SCSI] zfcp: Activate fc4s attributes for zfcp in FC transport class [SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED [SCSI] zfcp: Update FSF error reporting [SCSI] zfcp: Improve ELS ADISC handling [SCSI] zfcp: Simplify handling of ct and els requests [SCSI] zfcp: Remove ZFCP_DID_MASK [SCSI] zfcp: Move WKA port to zfcp FC code [SCSI] zfcp: Use common code definitions for FC CT structs [SCSI] zfcp: Use common code definitions for FC ELS structs [SCSI] zfcp: Update FCP protocol related code [SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport [SCSI] zfcp: Assign scheduled work to driver queue [SCSI] zfcp: Remove STATUS_COMMON_REMOVE flag as it is not required anymore [SCSI] zfcp: Implement module unloading [SCSI] zfcp: Merge trace code for fsf requests in one function [SCSI] zfcp: Access ports and units with container_of in sysfs code [SCSI] zfcp: Remove suspend callback [SCSI] zfcp: Remove global config_mutex [SCSI] zfcp: Replace local reference counting with common kref ...
2009-12-04[SCSI] fc class: fail fast bsg requestsMike Christie
If the port state is blocked and the fast io fail tmo has fired then this patch will fail bsg requests immediately. This is needed if userspace is sending IOs to test the transport like with fcping, so it will not have to wait for the dev loss tmo. With this patch he bsg req fast io fail code behaves like the normal and sg io/passthrough fast io fail. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-By: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04[SCSI] scsi_transport_fc: Introduce helper function for blocking scsi_ehChristof Schmitt
Move the duplicated code from FC LLDs to SCSI FC transport class. Acked-by: James Smart <james.smart@emulex.com> Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Acked-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-11-06[SCSI] scsi_transport_fc: Fix WARN message for FC passthru failure pathsBrian King
There are three error paths in the FC passthru code where job->reply->reply_payload_rcv_len does not get initialized, resulting in the WARN_ON in fc_bsg_jobdone going off. This patch fixes this. An example of one of the WARN_ON messages seen: Badness at drivers/scsi/scsi_transport_fc.c:3424 NIP: d000000000bf21ac LR: d000000000bf2684 CTR: c0000000003f753c REGS: c00000004eb03430 TRAP: 0700 Not tainted (2.6.32-rc4-git) MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 24008444 XER: 00000012 TASK = c00000004c3fc9c0[3243] 'fcping' THREAD: c00000004eb00000 CPU: 0 GPR00: 0000000000000001 c00000004eb036b0 d000000000c01da0 000000004bf17fc0 GPR04: c00000004cd256a0 c00000007e011ce0 c00000007e011d00 c00000004e718000 GPR08: c00000004cd256a0 c00000004eb03ad0 c00000004cd25a90 0000000000000020 GPR12: d000000000bf7848 c000000000b62600 0000000000000060 fffffffffffffff4 GPR16: ffffffffffffffd6 c00000004c7a3060 ffffffff80000003 c00000004b0f0310 GPR20: c00000004e71b180 c00000004c7a3060 0000000000000004 0000000000000000 GPR24: c00000004e71b000 c00000004c7a3000 c00000004b0f0000 c00000004e718000 GPR28: c00000004cd256a0 c00000004cd25a90 d000000000c01db0 c00000004e01d680 NIP [d000000000bf21ac] .fc_bsg_jobdone+0x64/0x9c [scsi_transport_fc] LR [d000000000bf2684] .fc_bsg_request_handler+0x4a0/0x564 [scsi_transport_fc] Call Trace: [c00000004eb036b0] [c0000000003f755c] .get_device+0x20/0x38 (unreliable) [c00000004eb03720] [d000000000bf2684] .fc_bsg_request_handler+0x4a0/0x564 [scsi_transport_fc] [c00000004eb03820] [c0000000002c9b5c] .__generic_unplug_device+0x58/0x70 [c00000004eb038a0] [c0000000002ce9fc] .blk_execute_rq_nowait+0x70/0xf4 [c00000004eb03930] [c0000000002ceb2c] .blk_execute_rq+0xac/0x100 [c00000004eb03a60] [c0000000002d51b4] .bsg_ioctl+0x1fc/0x264 [c00000004eb03c10] [c00000000018a89c] .vfs_ioctl+0x54/0xec [c00000004eb03ca0] [c00000000018b01c] .do_vfs_ioctl+0x640/0x6a8 [c00000004eb03d80] [c00000000018b0fc] .SyS_ioctl+0x78/0xbc [c00000004eb03e30] [c0000000000085b4] syscall_exit+0x0/0x40 Instruction dump: 8003004c 2fa80000 90090104 38000000 900a0108 419e0038 e9230040 81680108 80690004 7f835840 7c101026 5400f7fe <0b000000> 7d605b78 7f8b1840 409d0008 Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-By: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-29[SCSI] scsi_transport_fc: remove invalid BUG_ONMichael Reed
I was doing some large lun count testing with 2.6.31 and hit a BUG_ON() in fc_timeout_deleted_rport(), and it seems like it should have been just a matter of time before someone did. It seems invalid to set port_state under lock, then expect it to remain set after releasing the lock. Another thread called fc_remote_port_add() when the lock was released, changing the port_state. This patch removes the BUG_ON and moves the test of the port_state to inside the host_lock. It's been running for several weeks now with no ill effect. Signed-off-by: Michael Reed <mdr@sgi.com> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-02[SCSI] scsi_transport_fc: fix missing kernel-docRandy Dunlap
Add missing kernel-doc notation in scsi_transport_fc.c: Warning(drivers/scsi/scsi_transport_fc.c:3593): No description found for parameter 'q' Warning(drivers/scsi/scsi_transport_fc.c:3700): No description found for parameter 'q' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-08-22[SCSI] fc_transport: Correct max fc_host attribute countJames Smart
Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-08-22[SCSI] scsi_transport_fc: fix kernel-doc param nameRandy Dunlap
Change function parameter name in kernel-doc to match the function's actual parameter name, to fix 2 kernel-doc warnings. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-06-26[SCSI] FC transport: Locking fix for common-code FC pass-through patchChristof Schmitt
Fix this: ------------[ cut here ]------------ Badness at block/blk-core.c:244 CPU: 0 Tainted: G W 2.6.31-rc1-00004-gd3a263a #3 Process zfcp_wq (pid: 901, task: 000000002fb7a038, ksp: 000000002f02bc78) Krnl PSW : 0704300180000000 00000000002141ba (blk_remove_plug+0xb2/0xb8) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3 Krnl GPRS: 0000000000000001 0000000000000001 0000000022811440 0000000022811798 000000000027ff4e 0000000000000000 0000000000000000 000000002f00f000 070000000006a0f4 000000002af70000 000000002af2a800 00000000228d1c00 0000000022811440 000000000050c708 000000002f02bca8 000000002f02bc80 Krnl Code: 00000000002141b0: b9140022 lgfr %r2,%r2 00000000002141b4: 07fe bcr 15,%r14 00000000002141b6: a7f40001 brc 15,2141b8 >00000000002141ba: a7f4ffbe brc 15,214136 00000000002141be: 0707 bcr 0,%r7 00000000002141c0: ebaff0680024 stmg %r10,%r15,104(%r15) 00000000002141c6: c0d00017c2a9 larl %r13,50c718 00000000002141cc: a7f13fc0 tmll %r15,16320 Call Trace: ([<000000000050e7d8>] C.272.16122+0x88/0x110) [<00000000002141ec>] __blk_run_queue+0x2c/0x154 [<000000000028013a>] fc_remote_port_add+0x85e/0x95c [<000000000037596e>] zfcp_scsi_rport_work+0xe6/0x148 [<000000000006908c>] worker_thread+0x25c/0x318 [<000000000006f10c>] kthread+0x94/0x9c [<000000000001c2b2>] kernel_thread_starter+0x6/0xc [<000000000001c2ac>] kernel_thread_starter+0x0/0xc INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000000002141b6>] blk_remove_plug+0xae/0xb8 The FC pass-through support triggers the WARN_ON(!irqs_disabled()) in blk_plug_device. Since blk_plug_device requires being called with disabled interrupts, use spin_lock_irqsave in fc_bsg_goose_queue to disable the interrupts before calling into the block layer. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21[SCSI] scsi_transport_fc: replace BUS_ID_SIZE by fixed countJames Bottomley
BUS_ID_SIZE is being removed from the kernel. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21fc_transport: Selective return value from BSG timeout functionGiridhar Malavali
The return value from BSG timout function should be based on the state of the BSG job. This helps block layer to take selective actions to clean up BSG job. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21fc_transport: The softirq_done function registration for BSG requestGiridhar Malavali
Registered the softirq_done function, since this is requried iby an request using block level request timeout functionality. This function will be called by the block layer as part of time out clean process to release the BSG request. Moved some of the BSG request completion activities to softirq_done routine to take care of both normal and timout completions. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-12[SCSI] FC Pass Thru supportJames Smart
Attached is the ELS/CT pass-thru patch for the FC Transport. The patch creates a generic framework that lays on top of bsg and the SGIO v4 ioctl in order to pass transaction requests to LLDD's. The interface supports the following operations: On an fc_host basis: Request login to the specified N_Port_ID, creating an fc_rport. Request logout of the specified N_Port_ID, deleting an fc_rport Send ELS request to specified N_Port_ID w/o requiring a login, and wait for ELS response. Send CT request to specified N_Port_ID and wait for CT response. Login is required, but LLDD is allowed to manage login and decide whether it stays in place after the request is satisfied. Vendor-Unique request. Allows a LLDD-specific request to be passed to the LLDD, and the passing of a response back to the application. On an fc_rport basis: Send ELS request to nport and wait for ELS response. Send CT request to nport and wait for CT response. The patch also exports several headers from include/scsi such that they can be available to user-space applications: include/scsi/scsi.h include/scsi/scsi_netlink.h include/scsi/scsi_netlink_fc.h include/scsi/scsi_bsg_fc.h For further information, refer to the last RFC: http://marc.info/?l=linux-scsi&m=123436574018579&w=2 Note: Documentation is still spotty and will be added later. [bharrosh@panasas.com: update for new block API] Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits) [SCSI] scsi_dh_rdac: Retry for NOT_READY check condition [SCSI] mpt2sas: make global symbols unique [SCSI] sd: Make revalidate less chatty [SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices [SCSI] sd: Refactor sd_read_capacity() [SCSI] mpt2sas v00.100.11.15 [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h [SCSI] ch: Add scsi type modalias [SCSI] 3w-9xxx: add power management support [SCSI] bsg: add linux/types.h include to bsg.h [SCSI] cxgb3i: fix function descriptions [SCSI] libiscsi: fix possbile null ptr session command cleanup [SCSI] iscsi class: remove host no argument from session creation callout [SCSI] libiscsi: pass session failure a session struct [SCSI] iscsi lib: remove qdepth param from iscsi host allocation [SCSI] iscsi lib: have lib create work queue for transmitting IO [SCSI] iscsi class: fix lock dep warning on logout [SCSI] libiscsi: don't cap queue depth in iscsi modules [SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging [SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging ...
2009-03-12[SCSI] scsi_transport_fc: Add missing parenthesis to Point-To-Point descriptionChristof Schmitt
Fix typo by adding closing parenthesis. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-05netlink: change return-value logic of netlink_broadcast()Pablo Neira Ayuso
Currently, netlink_broadcast() reports errors to the caller if no messages at all were delivered: 1) If, at least, one message has been delivered correctly, returns 0. 2) Otherwise, if no messages at all were delivered due to skb_clone() failure, return -ENOBUFS. 3) Otherwise, if there are no listeners, return -ESRCH. With this patch, the caller knows if the delivery of any of the messages to the listeners have failed: 1) If it fails to deliver any message (for whatever reason), return -ENOBUFS. 2) Otherwise, if all messages were delivered OK, returns 0. 3) Otherwise, if no listeners, return -ESRCH. In the current ctnetlink code and in Netfilter in general, we can add reliable logging and connection tracking event delivery by dropping the packets whose events were not successfully delivered over Netlink. Of course, this option would be settable via /proc as this approach reduces performance (in terms of filtered connections per seconds by a stateful firewall) but providing reliable logging and event delivery (for conntrackd) in return. This patch also changes some clients of netlink_broadcast() that may report ENOBUFS errors via printk. This error handling is not of any help. Instead, the userspace daemons that are listening to those netlink messages should resync themselves with the kernel-side if they hit ENOBUFS. BTW, netlink_broadcast() clients include those that call cn_netlink_send(), nlmsg_multicast() and genlmsg_multicast() since they internally call netlink_broadcast() and return its error value. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-06[SCSI] fc transport: restore missing dev_loss_tmo callback to LLDDJames Smart
When we reworked the transport for the rport lifetimes, in cases where the rport was reused as a container for tgt id bindings, we inadvertantly removed the callback to the driver indicating that dev_loss_tmo had fired. This patch restores that functionality. Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-01-02[SCSI] struct device - replace bus_id with dev_name(), dev_set_name()Kay Sievers
[jejb: limit ioctl to returning 20 characters to avoid overrun on long device names and add a few more conversions] Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29[SCSI] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeoutJames Smart
Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired. The desire is to terminate everything, so that the i/o is cleaned up prior to the sdev's being unblocked, thus any outstanding timeouts/aborts are avoided. Also, we do this early enough such that the rport's port_id field is still valid. FCOE libFC code needs this info to find the i/o's to terminate. Signed-off-by: James Smart <james.smart@emulex.com> [michaelc@cs.wisc.edu: remove extra scsi_target_unblock call] Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13[SCSI] fc class: unblock target after calling terminate callback (take 2)Mike Christie
When we block a rport and the driver implements the terminate callback we will fail IO that was running quickly. However IO that was in the scsi_device/block queue sits there until the dev_loss_tmo fires, and this can make it look like IO is lost because new IO will get executed but that IO stuck in the blocked queue sits there for some time longer. With this patch when the fast io fail tmo fires, we will fail the blocked IO and any new IO. This patch also allows all drivers to partially support the fast io fail tmo. If the terminate io callback is not implemented, we will still fail blocked IO and any new IO, so multipath can handle that. This patch also allows the fc and iscsi classes to implement the same behavior. The timers are just unfornately named differently. This patch also fixes the problem where drivers were unblocking the target in their terminate callback, which was needed for rport removal, but for fast io fail timeout it would cause IO to bounce arround the scsi/block layer and the LLD queuecommand. And it for drivers that could have IO stuck but did not have a terminate callback the unblock calls in the class will fix them. v2. - fix up bit setting style to meet JamesS's pref. - Broke out new host byte error changes to make it easier to read. - added JamesS's ack from list. v1 - initial patch Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits) [SCSI] zfcp: fix double dbf id usage [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev [SCSI] zfcp: fix erp list usage without using locks [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport [SCSI] zfcp: fix deadlock caused by shared work queue tasks [SCSI] zfcp: put threshold data in hba trace [SCSI] zfcp: Simplify zfcp data structures [SCSI] zfcp: Simplify get_adapter_by_busid [SCSI] zfcp: remove all typedefs and replace them with standards [SCSI] zfcp: attach and release SAN nameserver port on demand [SCSI] zfcp: remove unused references, declarations and flags [SCSI] zfcp: Update message with input from review [SCSI] zfcp: add queue_full sysfs attribute [SCSI] scsi_dh: suppress comparison warning [SCSI] scsi_dh: add Dell product information into rdac device handler [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option [SCSI] qla2xxx: fix printk format warnings [SCSI] qla2xxx: Update version number to 8.02.01-k8. [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing. [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling. ...