aboutsummaryrefslogtreecommitdiff
path: root/drivers/scsi/scsi.c
AgeCommit message (Collapse)Author
2013-12-19[SCSI] Update documentationHannes Reinecke
The documentation has gone out-of-sync, so update it to the current status. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19[SCSI] improved eh timeout handlerHannes Reinecke
When a command runs into a timeout we need to send an 'ABORT TASK' TMF. This is typically done by the 'eh_abort_handler' LLDD callback. Conceptually, however, this function is a normal SCSI command, so there is no need to enter the error handler. This patch implements a new scsi_abort_command() function which invokes an asynchronous function scsi_eh_abort_handler() to abort the commands via the usual 'eh_abort_handler'. If abort succeeds the command is either retried or terminated, depending on the number of allowed retries. However, 'eh_eflags' records the abort, so if the retry would fail again the command is pushed onto the error handler without trying to abort it (again); it'll be cleared up from SCSI EH. [hare: smatch detected stray switch fixed] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25[SCSI] remove check for 'resetting'Hannes Reinecke
Field is now unused, so this is dead code. [jejb: remove resetting and last_reset from Scsi_Host] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-02[SCSI] Don't attempt to send extended INQUIRY command if skip_vpd_pages is setMartin K. Petersen
If a device has the skip_vpd_pages flag set we should simply fail the scsi_get_vpd_page() call. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Stuart Foster <smf.linux@ntlworld.com> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-26[SCSI] sd: Update WRITE SAME heuristicsMartin K. Petersen
SATA drives located behind a SAS controller would incorrectly receive WRITE SAME commands. Tweak the heuristics so that: - If REPORT SUPPORTED OPERATION CODES is provided we will use that to choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also fixes an issue with the old code which would issue WRITE SAME(10) despite the command not being whitelisted in REPORT SUPPORTED OPERATION CODES. - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back to WRITE SAME(10) unless the device has an ATA Information VPD page. The assumption is that a SATL which is smart enough to implement WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES. To facilitate the new heuristics scsi_report_opcode() has been modified to so we can distinguish between "operation not supported" and "RSOC not supported". Reported-by: H. Peter Anvin <hpa@zytor.com> Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-11-13[SCSI] Add a report opcode helperMartin K. Petersen
The REPORT SUPPORTED OPERATION CODES command can be used to query whether a given opcode is supported by a device. Add a helper function that allows us to look up commands. We only issue RSOC if the device reports compliance with SPC-3 or later. But to err on the side of caution we disable the command for ATA, FireWire and USB. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] async: make async_synchronize_full() flush all work regardless of domainDan Williams
In response to an async related regression James noted: "My theory is that this is an init problem: The assumption in a lot of our code is that async_synchronize_full() waits for everything ... even the domain specific async schedules, which isn't true." ...so make this assumption true. Each domain, including the default one, registers itself on a global domain list when work is scheduled. Once all entries complete it exits that list. Waiting for the list to be empty syncs all in-flight work across all domains. Domains can opt-out of global syncing if they are declared as exclusive ASYNC_DOMAIN_EXCLUSIVE(). All stack-based domains have been declared exclusive since the domain may go out of scope as soon as the last work item completes. Statically declared domains are mostly ok, but async_unregister_domain() is there to close any theoretical races with pending async_synchronize_full waiters at module removal time. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Eldad Zack <eldadzack@gmail.com> Tested-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] async: introduce 'async_domain' typeDan Williams
This is in preparation for teaching async_synchronize_full() to sync all pending async work, and not just on the async_running domain. This conversion is functionally equivalent, just embedding the existing list in a new async_domain type. The .registered attribute is used in a later patch to distinguish between domains that want to be flushed by async_synchronize_full() versus those that only expect async_synchronize_{full|cookie}_domain to be used for flushing. [jejb: add async.h to scsi_priv.h for struct async_domain] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Tested-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-07[SCSI] Fix sd_probe_domain config problemJames Bottomley
With CONFIG_BLK_DEV_SD = n and CONFIG_PM = n, you get this compile failure: (.text+0x4f6c77): undefined reference to `scsi_sd_probe_domain' This was introduced by commit a7a20d103994fd760766e6c9d494daa569cbfe06 Author: Dan Williams <dan.j.williams@intel.com> Date: Thu Mar 22 17:05:11 2012 -0700 [SCSI] sd: limit the scope of the async probe domain And happens because scsi_sd_probe_domain is conditionally defined but unconditionally used. Fix this by making the symbol unconditionally defined. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Dan Williams <dan.j.williams@intel.com> Tested-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17[SCSI] sd: limit the scope of the async probe domainDan Williams
sd injects and synchronizes probe work on the global kernel-wide domain. This runs into conflict with PM that wants to perform resume actions in async context: [ 494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds. [ 494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.360809] kworker/u:3 D 0000000000000000 0 554 2 0x00000000 [ 494.420739] ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8 [ 494.484392] ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160 [ 494.548038] ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398 [ 494.611685] Call Trace: [ 494.632649] [<ffffffff8149dd25>] schedule+0x5a/0x5c [ 494.674687] [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112 [ 494.734177] [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50 [ 494.787134] [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48 [ 494.837900] [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17 [ 494.891567] [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70 <-- here we wait for async contexts to complete [ 494.943783] [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a [ 495.002547] [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod] [ 495.051861] [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf [ 495.104807] [<ffffffff812fe9bd>] device_release_driver+0x25/0x32 <-- here we take device_lock() [ 853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds. [ 853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 853.635119] kworker/u:4 D ffff88013097b5d0 0 549 2 0x00000000 [ 853.695129] ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8 [ 853.758990] ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000 [ 853.822796] 0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000 [ 853.886633] Call Trace: [ 853.907631] [<ffffffff8149dd25>] schedule+0x5a/0x5c [ 853.949670] [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351 [ 854.001225] [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4 [ 854.049082] [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4 [ 854.097011] [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36 <-- here we wait for device_lock() [ 854.145591] [<ffffffff81304bd7>] device_resume+0x58/0x1c4 [ 854.192066] [<ffffffff81304d61>] async_resume+0x1e/0x45 [ 854.237019] [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173 <-- ...while running in async context Provide a 'scsi_sd_probe_domain' so that async probe actions actions can be flushed without regard for the state of PM, and allow for the resume path to handle devices that have transitioned from SDEV_QUIESCE to SDEV_DEL prior to resume. Acked-by: Alan Stern <stern@rowland.harvard.edu> [alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume] Signed-off-by: Dan Williams <dan.j.williams@intel.com> [jejb: remove unneeded config guards in include file] Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] Handle disk devices which can not process medium access commandsMartin K. Petersen
We have experienced several devices which fail in a fashion we do not currently handle gracefully in SCSI. After a failure these devices will respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.) but any command accessing the storage medium will time out. The following patch adds an callback that can be used by upper level drivers to inspect the results of an error handling command. This in turn has been used to implement additional checking in the SCSI disk driver. If a medium access command fails twice but TEST UNIT READY succeeds both times in the subsequent error handling we will offline the device. The maximum number of failed commands required to take a device offline can be tweaked in sysfs. Also add a new error flag to scsi_debug which allows this scenario to be easily reproduced. [jejb: fix up integer parsing to use kstrtouint] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2010-11-16SCSI host lock push-downJeff Garzik
Move the mid-layer's ->queuecommand() invocation from being locked with the host lock to being unlocked to facilitate speeding up the critical path for drivers who don't need this lock taken anyway. The patch below presents a simple SCSI host lock push-down as an equivalent transformation. No locking or other behavior should change with this patch. All existing bugs and locking orders are preserved. Additionally, add one parameter to queuecommand, struct Scsi_Host * and remove one parameter from queuecommand, void (*done)(struct scsi_cmnd *) Scsi_Host* is a convenient pointer that most host drivers need anyway, and 'done' is redundant to struct scsi_cmnd->scsi_done. Minimal code disturbance was attempted with this change. Most drivers needed only two one-line modifications for their host lock push-down. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Acked-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-16[SCSI] Fix VPD inquiry page wrapperMartin K. Petersen
Fix two bugs in the VPD page wrapper: - Don't return failure if the user asked for page 0 - The end of buffer check failed to account for the page header size and consequently didn't work Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-04-30[SCSI] add scsi trace core functions and put trace pointsKei Tokunaga
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com> Signed-off-by: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-03-01scsi.c: add missing kernel-doc notation for new VPD parametersRandy Dunlap
Add missing kernel-doc notation for new function parameters: Warning(drivers/scsi/scsi.c:1031): No description found for parameter 'buf' Warning(drivers/scsi/scsi.c:1031): No description found for parameter 'buf_len' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-18[SCSI] eliminate potential kmalloc failure in scsi_get_vpd_page()James Bottomley
The best way to fix this is to eliminate the intenal kmalloc() and make the caller allocate the required amount of storage. Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04[SCSI] add queue_depth ramp up codeVasu Dev
Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-02[SCSI] Fix protection scsi_data_buffer leakMartin K. Petersen
We would leak a scsi_data_buffer if the free_list command was of the protected variety. Reported-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-08-22[SCSI] fix bugs in scsi_vpd_inquiry()James Bottomley
Universally, SCSI functions assume the lengths fed in are those of the buffer to DMA data to, not the lengths of the data minus the header. scsi_vpd_inquiry() assumed the latter and got it wrong, so fix up all the functions to use the correct assumption (and fix a bug where INQUIRY in SCSI-2 dcannot go over 255). [jejb: Matthew posted an identical version of this at the same time I did] Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-06-08[SCSI] fix documentation for two functionsBartlomiej Zolnierkiewicz
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-03[SCSI] use kmem_cache_zalloc instead of kmem_cache_alloc/memsetWei Yongjun
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12[SCSI] Add VPD helperMatthew Wilcox
Based on prior work by Martin Petersen and James Bottomley, this patch adds a generic helper for retrieving VPD pages from SCSI devices. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-01-13[SCSI] Skip deleted devices in __scsi_device_lookup_by_target()Hannes Reinecke
__scsi_device_lookup_by_target() will always return the first sdev with a matching LUN, regardless of the state. However, when this sdev is in SDEV_DEL scsi_device_lookup_by_target() will ignore this device and so any valid device on the list after the deleted device will never be found. So we have to modify __scsi_device_lookup_by_target() to skip any device in SDEV_DEL. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-01-02[SCSI] remove severly outdated comment in scsi_dispatch_cmdChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13[SCSI] Add helper code so transport classes/driver can control queueing (v3)Mike Christie
SCSI-ml manages the queueing limits for the device and host, but does not do so at the target level. However something something similar can come in userful when a driver is transitioning a transport object to the the blocked state, becuase at that time we do not want to queue io and we do not want the queuecommand to be called again. The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers. You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit a transport level queueing issue like the hw cannot allocate some resource at the iscsi session/connection level, or the target has temporarily closed or shrunk the queueing window, or if we are transitioning to the blocked state. bnx2i, when they rework their firmware according to netdev developers requests, will also need to be able to limit queueing at this level. bnx2i will hook into libiscsi, but will allocate a scsi host per netdevice/hba, so unlike pure software iscsi/iser which is allocating a host per session, it cannot set the scsi_host->can_queue and return SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport. The iscsi class/driver can also set a scsi_target->can_queue value which reflects the max commands the driver/class can support. For iscsi this reflects the number of commands we can support for each session due to session/connection hw limits, driver limits, and to also reflect the session/targets's queueing window. Changes: v1 - initial patch. v2 - Fix scsi_run_queue handling of multiple blocked targets. Previously we would break from the main loop if a device was added back on the starved list. We now run over the list and check if any target is blocked. v3 - Rediff for scsi-misc. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits) [SCSI] zfcp: fix double dbf id usage [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev [SCSI] zfcp: fix erp list usage without using locks [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport [SCSI] zfcp: fix deadlock caused by shared work queue tasks [SCSI] zfcp: put threshold data in hba trace [SCSI] zfcp: Simplify zfcp data structures [SCSI] zfcp: Simplify get_adapter_by_busid [SCSI] zfcp: remove all typedefs and replace them with standards [SCSI] zfcp: attach and release SAN nameserver port on demand [SCSI] zfcp: remove unused references, declarations and flags [SCSI] zfcp: Update message with input from review [SCSI] zfcp: add queue_full sysfs attribute [SCSI] scsi_dh: suppress comparison warning [SCSI] scsi_dh: add Dell product information into rdac device handler [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option [SCSI] qla2xxx: fix printk format warnings [SCSI] qla2xxx: Update version number to 8.02.01-k8. [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing. [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling. ...
2008-10-09block: unify request timeout handlingJens Axboe
Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-03[SCSI] add inline functions for recognising created and blocked statesJames Bottomley
The created and blocked states are very shortly going to correspond to mixed sdev_state states. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26[SCSI] Support devices with protection informationMartin K. Petersen
Implement support for DMA of protection information for devices that are data integrity capable. - Add support for mapping an extra scatter-gather list containing the protection information. - Allocate protection scsi_data_buffer if host is DIX (integrity DMA) capable. - Accessor function for checking whether a device has protection enabled. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26[SCSI] fix shared tag map setupMike Christie
Currently qla4xxx and stex pass in their can_queue values into scsi_activate_tcq because they wanted the tag map that large. The problem with this is that it ends up also setting the queue depth to that large value. All we want to do this in this case is set the device queue depth and the other device settings. We do not need to touch the tag map sizing because the drivers had setup that map according to their can_queue limits when the shared map was created. The scsi mid layer in request_fn will then handle the case where we have more requests than available tags when it checks the host queue ready function. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-06-05[SCSI] make use of the residue valueJames Bottomley
USB sometimes doesn't return an error but instead returns a residue value indicating part (or all) of the command wasn't completed. So if the driver _done() error processing indicates the command was fully processed, subtract off the residue so that this USB error gets propagated. Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-05-02[SCSI] add support for variable length extended commandsBoaz Harrosh
Add support for variable-length, extended, and vendor specific CDBs to scsi-ml. It is now possible for initiators and ULD's to issue these types of commands. LLDs need not change much. All they need is to raise the .max_cmd_len to the longest command they support (see iscsi patch). - clean-up some code paths that did not expect commands to be larger than 16, and change cmd_len members' type to short as char is not enough. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-29[SCSI] bug fix for free list handlingAlan D. Brunelle
commit: commit 542bd1377a963070bc4a03ff7d2690ddf3920596 Author: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Mon Apr 21 10:57:20 2008 -0500 [SCSI] fix SLUB WARN_ON Fixed another problem in free list handling by moving list allocation from scsi_host_alloc() to scsi_add_host(). Unfortunately it introduced a new failure mode in that hosts can pass straight from alloc to put without going through add, leaving the free list uninitialised. Fix by checking shost->cmd_pool on the release path to see if it got initialised. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-21block: move the padding adjustment to blk_rq_map_sgFUJITA Tomonori
blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule that req->data_len (the true data length) is equal to sum(bio). It broke the scsi command completion code. commit e97a294ef6938512b655b1abf17656cf2b26f709 was introduced to fix the above issue. However, the partial completion code doesn't work with it. The commit is also a layer violation (scsi mid-layer should not know about the block layer's padding). This patch moves the padding adjustment to blk_rq_map_sg (suggested by James). The padding works like the drain buffer. This patch breaks the rule that req->data_len is equal to sum(sg), however, the drain buffer already broke it. So this patch just restores the rule that req->data_len is equal to sub(bio) without breaking anything new. Now when a low level driver needs padding, blk_rq_map_user and blk_rq_map_user_iov guarantee there's enough room for padding. blk_rq_map_sg can safely extend the last entry of a scatter list. blk_rq_map_sg must extend the last entry of a scatter list only for a request that got through bio_copy_user_iov. This patches introduces new REQ_COPY_USER flag. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-07[SCSI] export command allocation and freeing functions independently of the hostJames Bottomley
This is needed by things like USB storage that want to set up static commands for later use at start of day. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07[SCSI] consolidate command allocation in a single placeJames Bottomley
Since the way we allocate commands with a separate sense buffer is getting complicated, we should isolate setup and teardown to a single routine so that if it gets even more complex, there's only one place in the code that needs to be altered. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-06scsi: fix sense_slab/bio swapping livelockHugh Dickins
Since 2.6.25-rc7, I've been seeing an occasional livelock on one x86_64 machine, copying kernel trees to tmpfs, paging out to swap. Signature: 6000 pages under writeback but never getting written; most tasks of interest trying to reclaim, but each get_swap_bio waiting for a bio in mempool_alloc's io_schedule_timeout(5*HZ); every five seconds an atomic page allocation failure report from kblockd failing to allocate a sense_buffer in __scsi_get_command. __scsi_get_command has a (one item) free_list to protect against this, but rc1's [SCSI] use dynamically allocated sense buffer de25deb18016f66dcdede165d07654559bb332bc upset that slightly. When it fails to allocate from the separate sense_slab, instead of giving up, it must fall back to the command free_list, which is sure to have a sense_buffer attached. Either my earlier -rc testing missed this, or there's some recent contributory factor. One very significant factor is SLUB, which merges slab caches when it can, and on 64-bit happens to merge both bio cache and sense_slab cache into kmalloc's 128-byte cache: so that under this swapping load, bios above are liable to gobble up all the slots needed for scsi_cmnd sense_buffers below. That's disturbing behaviour, and I tried a few things to fix it. Adding a no-op constructor to the sense_slab inhibits SLUB from merging it, and stops all the allocation failures I was seeing; but it's rather a hack, and perhaps in different configurations we have other caches on the swapout path which are ill-merged. Another alternative is to revert the separate sense_slab, using cache-line-aligned sense_buffer allocated beyond scsi_cmnd from the one kmem_cache; but that might waste more memory, and is only a way of diverting around the known problem. While I don't like seeing the allocation failures, and hate the idea of all those bios piled up above a scsi host working one by one, it does seem to emerge fairly soon with the livelock fix. So lacking better ideas, stick with that one clear fix for now. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.ziljstra@chello.nl> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04scsi: missing add of padded bytes to io completion byte countJens Axboe
Original patch from Tejun Heo <htejun@gmail.com> but should use ->extra_len and not ->data_len, as we would then overshoot the original request size. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-07[SCSI] kernel-doc: fix scsi docbookRandy Dunlap
Add missing function parameter descriptions. Make function short description fit on one line as required. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30[SCSI] implement scsi_data_bufferBoaz Harrosh
In preparation for bidi we abstract all IO members of scsi_cmnd, that will need to duplicate, into a substructure. - Group all IO members of scsi_cmnd into a scsi_data_buffer structure. - Adjust accessors to new members. - scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of scsi_cmnd. And work on it. - Adjust scsi_init_io() and scsi_release_buffers() for above change. - Fix other parts of scsi_lib/scsi.c to members migration. Use accessors where appropriate. - fix Documentation about scsi_cmnd in scsi_host.h - scsi_error.c * Changed needed members of struct scsi_eh_save. * Careful considerations in scsi_eh_prep/restore_cmnd. - sd.c and sr.c * sd and sr would adjust IO size to align on device's block size so code needs to change once we move to scsi_data_buff implementation. * Convert code to use scsi_for_each_sg * Use data accessors where appropriate. - tgt: convert libsrp to use scsi_data_buffer - isd200: This driver still bangs on scsi_cmnd IO members, so need changing [jejb: rebased on top of sg_table patches fixed up conflicts and used the synergy to eliminate use_sg and sg_count] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23[SCSI] don't use __GFP_DMA for sense buffers if not requiredJames Bottomley
Only hosts which actually have ISA DMA requirements need sense buffers coming out of ZONE_DMA, so only use the __GFP_DMA flag for that case to avoid allocating this scarce resource if it's not necessary. [tomo: fixed slab leak in failure case] Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23[SCSI] use dynamically allocated sense bufferFUJITA Tomonori
This removes static array sense_buffer in scsi_cmnd and uses dynamically allocated sense_buffer (with GFP_DMA). The reason for doing this is that some architectures need cacheline aligned buffer for DMA: http://lkml.org/lkml/2007/11/19/2 The problems are that scsi_eh_prep_cmnd puts scsi_cmnd::sense_buffer to sglist and some LLDs directly DMA to scsi_cmnd::sense_buffer. It's necessary to DMA to scsi_cmnd::sense_buffer safely. This patch solves these issues. __scsi_get_command allocates sense_buffer via kmem_cache_alloc and attaches it to a scsi_cmnd so everything just work as before. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11[SCSI] fix scsi_setup_command_freelist failure path raceFUJITA Tomonori
Looks like that host_cmd_pool_mutex are necessary here. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11[SCSI] docbook and kernel-doc updatesRandy Dunlap
- Change title to remove "Mid-Layer" since the doc is about all of the SCSI layers. - Use "SCSI" instead of "scsi" in docbook text. - Use "*/" to end kernel-doc notation blocks. - A few other minor typo fixes. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11[SCSI] Add Documentation and integrate into docbook buildRob Landley
Add Documentation/DocBook/scsi_midlayer.tmpl, add to Makefile, and update lots of kerneldoc comments in drivers/scsi/*. Updated with comments from Stefan Richter, Stephen M. Cameron, James Bottomley and Randy Dunlap. Signed-off-by: Rob Landley <rob@landley.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-06Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done""Linus Torvalds
This reverts commit ac40532ef0b8649e6f7f83859ea0de1c4ed08a19, which gets us back the original cleanup of 6f5391c283d7fdcf24bf40786ea79061919d1e1d. It turns out that the bug that was triggered by that commit was apparently not actually triggered by that commit at all, and just the testing conditions had changed enough to make it appear to be due to it. The real problem seems to have been found by Peter Osterlund: "pktcdvd sets it [block device size] when opening the /dev/pktcdvd device, but when the drive is later opened as /dev/scd0, there is nothing that sets it back. (Btw, 40944 is possible if the disk is a CDRW that was formatted with "cdrwtool -m 10236".) The problem is that pktcdvd opens the cd device in non-blocking mode when pktsetup is run, and doesn't close it again until pktsetup -d is run. The effect is that if you meanwhile open the cd device, blkdev.c:do_open() doesn't call bd_set_size() because bdev->bd_openers is non-zero." In particular, to repeat the bug (regardless of whether commit 6f5391c283d7fdcf24bf40786ea79061919d1e1d is applied or not): " 1. Start with an empty drive. 2. pktsetup 0 /dev/scd0 3. Insert a CD containing an isofs filesystem. 4. mount /dev/pktcdvd/0 /mnt/tmp 5. umount /mnt/tmp 6. Press the eject button. 7. Insert a DVD containing a non-writable filesystem. 8. mount /dev/scd0 /mnt/tmp 9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null 10. If the DVD contains data beyond the physical size of a CD, you get I/O errors in the terminal, and dmesg reports lots of "attempt to access beyond end of device" errors." which in turn is because the nested open after the media change won't cause the size to be set properly (because the original open still holds the block device, and we only do the bd_set_size() when we don't have other people holding the device open). The proper fix for that is probably to just do something like bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9; in fs/block_dev.c:do_open() even for the cases where we're not the original opener (but *not* call bd_set_size(), since that will also change the block size of the device). Cc: Peter Osterlund <petero2@telia.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-02scsi: revert "[SCSI] Get rid of scsi_cmnd->done"Ingo Molnar
This reverts commit 6f5391c283d7fdcf24bf40786ea79061919d1e1d ("[SCSI] Get rid of scsi_cmnd->done") that was supposed to be a cleanup commit, but apparently it causes regressions: Bug 9370 - v2.6.24-rc2-409-g9418d5d: attempt to access beyond end of device http://bugzilla.kernel.org/show_bug.cgi?id=9370 this patch should be reintroduced in a more split-up form to make testing of it easier. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matthew Wilcox <matthew@wil.cx> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-10esp_scsi: fix reset cleanup spinlock recursionMaciej W. Rozycki
The esp_reset_cleanup() function is called with the host lock held and invokes starget_for_each_device() which wants to take it too. Here is a fix along the lines of shost_for_each_device()/__shost_for_each_device() adding a __starget_for_each_device() counterpart which assumes the lock has already been taken. Eventually, I think the driver should get modified so that more work is done as a softirq rather than in the interrupt context, but for now it fixes a bug that causes the spinlock debugger to fire. While at it, it fixes a small number of cosmetic problems with starget_for_each_device() too. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12[SCSI] Get rid of scsi_cmnd->doneMatthew Wilcox
The ULD ->done callback moves into the scsi_driver. By moving the call to scsi_io_completion() from scsi_blk_pc_done() to scsi_finish_command(), we can eliminate the latter entirely. By returning 'good_bytes' from the ->done callback (rather than invoking scsi_io_completion()), we can stop exporting scsi_io_completion(). Also move the prototypes from sd.h to sd.c as they're all internal anyway. Rename sd_rw_intr to sd_done and rw_intr to sr_done. Inspired-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-10-12[SCSI] Remove ->pid field from scsi_cmndMatthew Wilcox
The pid field is a duplicate of the serial_number field and has been scheduled for removal for a long time. A few drivers were still using it, so just change them to use serial_number instead. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>