linux/drivers/firewire/sbp2.c, branch v3.0.73

firewire: sbp2: fix panic after rmmod with slow targets

2011-10-25T05:10:16Z

commit 0278ccd9d53e07c4e699432b2fed9de6c56f506c upstream.

If firewire-sbp2 starts a login to a target that doesn't complete ORBs in a timely manner (and has to retry the login), and the module is removed before the operation times out, you end up with a null-pointer dereference and a kernel panic.

[SR: This happens because sbp2_target_get/put() do not maintain module references. scsi_device_get/put() do, but on occasions like the one Chris describes, nobody holds a reference to an SBP-2 sdev.]

This patch cancels pending work for each unit in sbp2_remove(), which hopefully means there are no extra references around that prevent us from unloading. This fixes my crash.

Signed-off-by: Chris Boot
Signed-off-by: Stefan Richter
Signed-off-by: Greg Kroah-Hartman

firewire: sbp2: parallelize login, reconnect, logout

2011-05-10T20:53:46Z

The struct sbp2_logical_unit.work items can all be executed in parallel but are not reentrant. Furthermore, reconnect or re-login work must be executed in a WQ_MEM_RECLAIM workqueue. Hence replace the old single-threaded firewire-sbp2 workqueue by a concurrency-managed but non-reentrant workqueue with rescuer. firewire-core already maintains one, hence use this one.

In earlier versions of this change, I observed occasional failures of parallel INQUIRY to an Initio INIC-2430 FireWire 800 to dual IDE bridge. More testing indicates that parallel INQUIRY is not actually a problem, but too quick successions of logout and login + INQUIRY, e.g. a quick sequence of cable plugout and plugin, can result in failed INQUIRY. This does not seem to be something that should or could be addressed by serialization.

Another dual-LU device to which I currently have access, an OXUF924DSB FireWire 800 to dual SATA bridge with firmware from MacPower, has been successfully tested with this too.

This change is beneficial to environments with two or more FireWire storage devices, especially if they are located on the same bus. Management tasks that should be performed as soon and as quickly as possible, especially reconnect, are no longer held up by tasks on other devices that may take a long time, especially login with INQUIRY and sd or sr driver probe.

Signed-off-by: Stefan Richter

firewire: sbp2: octlet AT payloads can be stack-allocated

2011-05-10T20:53:46Z

We do not need slab allocations for ORB pointer write transactions anymore in order to satisfy streaming DMA mapping constraints, thanks to commit da28947e7e36 "firewire: ohci: avoid separate DMA mapping for small AT payloads".

(Besides, the slab-allocated buffers that firewire-sbp2 used to provide for 8-byte write requests were still not fully portable since they shared a cacheline with unrelated CPU-accessed data.)

Signed-off-by: Stefan Richter

firewire: sbp2: omit Scsi_Host lock from queuecommand

2011-05-10T20:53:45Z

firewire-sbp2 already takes care of internal serialization where required (ORB list accesses), and it does not use cmd->serial_number internally. Hence it is safe to not grab the shost lock around queuecommand.

While we are at housekeeping, drop a redundant struct member: sbp2_command_orb.done is set once in a hot path and dereferenced once in a hot path. We can as well dereference sbp2_command_orb.cmd->scsi_done instead.

Signed-off-by: Stefan Richter

firewire: sbp2: revert obsolete 'fix stall with "Unsolicited response"'

2011-03-20T15:45:24Z

Now that firewire-core sets the local node's SPLIT_TIMEOUT to 2 seconds per default, commit a481e97d3cdc40b9d58271675bd4f0abb79d4872 is no longer required.

Signed-off-by: Stefan Richter

SCSI host lock push-down

2010-11-16T21:33:23Z

Move the mid-layer's ->queuecommand() invocation from being locked with the host lock to being unlocked to facilitate speeding up the critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an equivalent transformation. No locking or other behavior should change with this patch. All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
	struct Scsi_Host *
and remove one parameter from queuecommand,
	void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway, and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change. Most drivers needed only two one-line modifications for their host lock push-down.

Signed-off-by: Jeff Garzik
Acked-by: James Bottomley
Signed-off-by: Linus Torvalds
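
The push-down described above can be modelled in plain userspace C. This is only a sketch under loud assumptions: struct model_host, struct model_cmd, model_lock()/model_unlock(), the MODEL_DEF_QCMD macro and both queuecommand functions are hypothetical stand-ins invented for illustration, and a boolean flag takes the place of the real host spinlock. It shows how an old-style (cmd, done) body keeps running under the lock through a generated wrapper, while a converted driver uses the new (host, cmd) signature without the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; a bool models the host lock, not a real spinlock. */
struct model_host { bool lock_held; };
struct model_cmd {
	struct model_host *host;
	void (*scsi_done)(struct model_cmd *);
	bool ran_locked;	/* records whether the body saw the lock held */
};

static int model_done_count;
static void model_done(struct model_cmd *cmd) { (void)cmd; model_done_count++; }

static void model_lock(struct model_host *h)   { assert(!h->lock_held); h->lock_held = true; }
static void model_unlock(struct model_host *h) { assert(h->lock_held);  h->lock_held = false; }

/* Old-style body: takes (cmd, done) and expects its caller to hold the lock. */
static int legacy_queuecommand_lck(struct model_cmd *cmd,
				   void (*done)(struct model_cmd *))
{
	cmd->ran_locked = cmd->host->lock_held;
	done(cmd);
	return 0;
}

/* Hypothetical wrapper generator: new signature outside, old locking inside. */
#define MODEL_DEF_QCMD(name)						\
	static int name(struct model_host *host, struct model_cmd *cmd)	\
	{								\
		int rc;							\
		model_lock(host);					\
		rc = name##_lck(cmd, cmd->scsi_done);			\
		model_unlock(host);					\
		return rc;						\
	}

MODEL_DEF_QCMD(legacy_queuecommand)

/* Converted driver: new signature, does its own serialization, no host lock. */
static int lockless_queuecommand(struct model_host *host, struct model_cmd *cmd)
{
	cmd->ran_locked = host->lock_held;
	cmd->scsi_done(cmd);
	return 0;
}
```

In this model the caller never takes the lock itself. Calling legacy_queuecommand(&host, &cmd) runs the old body with the lock held and releases it afterwards, whereas lockless_queuecommand(&host, &cmd) runs with the lock untouched.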

firewire: sbp2: fix stall with "Unsolicited response"

2010-08-19T18:28:25Z

Fix I/O stalls with some 4-bay RAID enclosures which are based on OXUF936QSE:
  - Onnto dataTale RSM4QO, old firmware (not anymore with current firmware),
  - inXtron Hydra Super-S LCM, old as well as current firmware when used in RAID-5 mode, perhaps also in other RAID modes.

The stalls happen during heavy or moderate disk traffic in periods that are a multiple of 5 minutes, roughly twice per hour. They are caused by the target responding too late to an ORB_Pointer register write: The target responds after Split_Timeout, hence firewire-core cancels the transaction, and firewire-sbp2 fails the SCSI request. The SCSI core retries the request, that fails again (and again), hence SCSI core calls firewire-sbp2's abort handler (and even the Management_Agent register write in the abort handler has the transaction timeout problem).

During all that, the process which issued the I/O is stalled in I/O wait state.

Meanwhile, the target actually acts on the first failed SCSI request: It responds to the ORB_Pointer write later (seen in the kernel log as "firewire_core: Unsolicited response") and also finishes the SCSI request with proper status (seen in the kernel log as "firewire_sbp2: status write for unknown orb").

So let's just ignore RCODE_CANCELLED in the transaction callback and wait for the target to complete the ORB nevertheless. This requires a small modification in sbp2_cancel_orbs(); it now needs to call orb->callback() regardless of whether fw_cancel_transaction() found the transaction unfinished or finished.

A different solution is to increase Split_Timeout on the local node. (Tested: 2000ms timeout; maybe 1000ms or something like that works too. 200ms is insufficient. Standard is 100ms.) However, I would rather not do this because any software on any node could change the Split_Timeout to something unsuitable. Or such a large Split_Timeout may be undesirable for other purposes.

Signed-off-by: Stefan Richter
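
The decision taken in the transaction callback can be sketched as a small userspace model. Everything named here is a hypothetical stand-in: enum model_rcode is not the kernel's RCODE_* numbering, struct model_orb is not struct sbp2_orb, and the assumption that every response code other than "complete" and "cancelled" fails the ORB is mine, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative response codes only; not the kernel's RCODE_* values. */
enum model_rcode {
	MODEL_RCODE_COMPLETE,
	MODEL_RCODE_CANCELLED,	/* transaction timed out locally */
	MODEL_RCODE_SEND_ERROR,
};

/* Hypothetical ORB state: pending until status arrives or the ORB is failed. */
struct model_orb {
	bool pending;
	bool failed;
};

/* Models the write-transaction callback after the fix. */
static void model_complete_transaction(struct model_orb *orb,
					enum model_rcode rcode)
{
	assert(orb != NULL);
	/* A late target may still finish the ORB, so a cancelled write is
	 * treated like a completed one: keep waiting for the status write. */
	if (rcode == MODEL_RCODE_COMPLETE || rcode == MODEL_RCODE_CANCELLED)
		return;
	/* Assumption: any other code means the ORB never reached the target. */
	orb->pending = false;
	orb->failed = true;
}

/* Models the target's status write; returns false for an unknown ORB. */
static bool model_status_write(struct model_orb *orb)
{
	assert(orb != NULL);
	if (!orb->pending)
		return false;	/* the "status write for unknown orb" case */
	orb->pending = false;
	return true;
}
```

With this logic, an ORB whose write transaction was cancelled stays pending, so the late status write from the target still finds it instead of being reported as belonging to an unknown ORB.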

firewire: sbp2: fix memory leak in sbp2_cancel_orbs or at send error

2010-08-19T18:28:25Z

When an ORB was canceled (Command ORB i.e. SCSI request timed out, or Management ORB timed out), or there was a send error in the initial transaction, we failed to drop one of the ORB's references and thus leaked memory.

Background: In total, we hold 3 references to each Operation Request Block:
  - 1 during sbp2_scsi_queuecommand() or sbp2_send_management_orb() respectively,
  - 1 for the duration of the write transaction to the ORB_Pointer or Management_Agent register of the target,
  - 1 for as long as the ORB stays within the lu->orb_list, until the ORB is unlinked from the list and the orb->callback was executed.

The last of these 3 references is released
  - normally by sbp2_status_write() when the target wrote status for a pending ORB,
  - or by sbp2_cancel_orbs() in case of an ORB time-out,
  - or by complete_transaction() in case of a send error.

Of them, the latter two lacked the kref_put.

Add the missing kref_put()s. Add comments to the gets and puts of references for transaction callbacks and ORB callbacks so that it is easier to see what is supposed to happen.

Signed-off-by: Stefan Richter
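
The reference accounting described above can be checked with a small userspace model. It is a sketch only: struct orb_model, orb_put(), enum orb_end and orb_lifetime() are hypothetical names, a plain integer stands in for struct kref, and the order in which the puts happen is simplified compared to the real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an ORB; refcount models struct kref. */
struct orb_model {
	int refcount;
	bool *freed;	/* set to true when the last reference is dropped */
};

/* Drop one reference and free the ORB when none remain (models kref_put). */
static void orb_put(struct orb_model *orb)
{
	assert(orb->refcount > 0);
	if (--orb->refcount == 0) {
		*orb->freed = true;
		free(orb);
	}
}

/* The three ways the list reference can end, per the commit message. */
enum orb_end { END_STATUS_WRITE, END_CANCEL, END_SEND_ERROR };

/* Walk one ORB through its lifetime; returns true if it ended up freed. */
static bool orb_lifetime(enum orb_end how)
{
	bool freed = false;
	struct orb_model *orb = malloc(sizeof(*orb));

	if (!orb)
		return false;
	orb->freed = &freed;
	orb->refcount = 1;	/* ref 1: held by the submitting function */
	orb->refcount++;	/* ref 2: write transaction in flight */
	orb->refcount++;	/* ref 3: ORB is linked into the ORB list */

	switch (how) {
	case END_SEND_ERROR:
		orb_put(orb);	/* list ref, dropped in the transaction callback */
		orb_put(orb);	/* transaction ref */
		break;
	case END_STATUS_WRITE:
	case END_CANCEL:
		orb_put(orb);	/* transaction ref, write has finished */
		orb_put(orb);	/* list ref, dropped by status or cancel path */
		break;
	}
	orb_put(orb);		/* submitter's ref, dropped last in this model */
	return freed;
}
```

Each of the three endings drops exactly three references, so orb_lifetime() returns true for all of them. Leaving out the list put in the cancel or send-error branch would leave the count at 1 and the ORB allocated, which is the leak the commit fixes.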

Merge firewire branches to be released post v2.6.35

2010-08-02T08:09:04Z

Conflicts:
	drivers/firewire/core-card.c
	drivers/firewire/core-cdev.c

and forgotten #include in drivers/firewire/ohci.c

Signed-off-by: Stefan Richter

firewire: remove an unused function argument

2010-06-20T21:11:55Z

The speed argument of void (*fw_address_callback_t)(..., int speed, ...) is the speed that a remote node chose to transmit a request to us. In case of split transactions, firewire-core will transmit the response at that speed.

Upper layer drivers on the other hand (firewire-net, -sbp2, firedtv, and userspace drivers) cannot do anything useful with that speed datum, except log it for debug purposes. But data that is merely potentially (not even actually) used for debug purposes does not belong in the API.

Signed-off-by: Stefan Richter