linux - Linux kernel source tree

Age	Commit message (Collapse)	Author
2010-03-01	Merge branch 'srp' into for-next	Roland Dreier

2010-03-01	Merge branch 'nes' into for-next	Roland Dreier

2010-03-01	Merge branch 'mlx4' into for-next	Roland Dreier

2010-03-01	Merge branch 'iser' into for-next	Roland Dreier

2010-03-01	Merge branch 'ipoib' into for-next	Roland Dreier

2010-03-01	Merge branch 'ehca' into for-next	Roland Dreier

2010-03-01	Merge branch 'cxgb3' into for-next	Roland Dreier

2010-03-01	IB/srp: Clean up error path in srp_create_target_ib()	Roland Dreier
	Instead of repeating the error unwinding steps in each place an error can be detected, use the common idiom of gotos into an error flow. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-01	IB/srp: Split send and recieve CQs to reduce number of interrupts	Bart Van Assche
	We can reduce the number of IB interrupts from two interrupts per srp_queuecommand() call to one by using separate CQs for send and receive completions and processing send completions by polling every time a TX IU is allocated. Receive completion events still trigger an interrupt. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-25	RDMA/nes: Add support for KR device id 0x0110	Chien Tung
	Add support for KR device id 0x0110. While at it, cleanup nes_init_phy() by splitting it into nes_init_1g_phy() and nes_init_2025_phy(). Remove support for NES_PHY_TYPE_IRIS, which was used on an XFP board that was only manufactured in small quantities and given out for evals in even smaller quantities. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing	Steve Wise
	If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device removal event, then the iwch device must be marked with CXIO_ERROR_FATAL since the device below us is going away. Otherwise, we can get stuck in a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc). So always mark the device with CXIO_ERROR_FATAL when removing. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Don't allocate the SW queue for user mode CQs	Steve Wise
	Only kernel mode CQs need the SW queue memory allocated. The SW queue for user mode CQs is allocated in userspace by libcxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Increase the max CQ depth	Steve Wise
	Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	RDMA/cxgb3: Doorbell overflow avoidance and recovery	Steve Wise
	T3 hardware doorbell FIFO overflows can cause application stalls due to lost doorbell ring events. This has been seen when running large NP IMB alltoall MPI jobs. The T3 hardware supports an xon/xoff-type flow control mechanism to help avoid overflowing the HW doorbell FIFO. This patch uses these interrupts to disable RDMA QP doorbell rings when we near an overflow condition, and then turn them back on (and ring all the active QP doorbells) when when the doorbell FIFO empties out. In addition if an doorbell ring is dropped by the hardware, the code will now recover. Design: cxgb3: - enable these DB interrupts - in the interrupt handler, schedule work tasks to call the ULPs event handlers with the new events. - ring all the qset txqs when an overflow is detected. iw_cxgb3: - disable db ringing on all active qps when we get the DB_FULL event - enable db ringing on all active qps and ring all active dbs when we get the DB_EMPTY event - On DB_DROP event: - disable db rings in the event handler - delay-schedule a work task which rings and enables the dbs on all active qps. - in post_send and post_recv logic, don't ring the db if it's disabled. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Clean whitespace errors	Alexander Chiang
	As shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Increase maximum devices supported	Alexander Chiang
	Some large systems may support more than IB_UCM_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. the first IB_UCM_MAX_DEVICES keep the same major/minor device numbers they've always had. If there are more than IB_UCM_MAX_DEVICES, then we dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Use stack variable 'base' in ib_ucm_add_one	Alexander Chiang
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UCM_MAX_DEVICES. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one	Alexander Chiang
	This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UCM_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Clean whitespace	Alexander Chiang
	Clean errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Increase maximum devices supported	Alexander Chiang
	Some large systems may support more than IB_UMAD_MAX_PORTS (currently 64). This change allows us to support more ports in a backwards-compatible manner. The first IB_UMAD_MAX_PORTS keep the same major/minor device numbers they've always had. If there are more than IB_UMAD_MAX_PORTS, we then dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Use stack variable 'base' in ib_umad_init_port	Alexander Chiang
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UMAD_MAX_PORTS in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Use stack variable 'devnum' in ib_umad_init_port	Alexander Chiang
	This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UMAD_MAX_PORTS in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Remove port_table[]	Alexander Chiang
	We no longer need this data structure, as it was used to associate an inode back to a struct ib_umad_port during ->open(). But now that we're embedding a struct cdev in struct ib_umad_port, we can use the container_of() macro to go from the inode back to the device instead. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/umad: Convert *cdev to cdev in struct ib_umad_port	Alexander Chiang
	Instead of storing pointers to cdev and sm_cdev, embed the full structures instead. This change allows us to use the container_of() macro in ib_umad_open() and ib_umad_sm_open() in a future patch. This change increases the size of struct ib_umad_port to 320 bytes from 128. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Whitespace cleanup	Alexander Chiang
	Clean up the errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Pack struct ib_uverbs_event_file tighter	Alexander Chiang
	Eliminate some padding in the structure by rearranging the members. sizeof(struct ib_uverbs_event_file) is now 72 bytes (from 80) and more members now fit in the first cacheline. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Increase maximum devices supported	Alexander Chiang
	Some large systems may support more than IB_UVERBS_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. The first IB_UVERBS_MAX_DEVICES keep the same major/minor device numbers that they've always had. If there are more than IB_UVERBS_MAX_DEVICES, we then dynamically request a new major device number (new minors start at 0). This change increases the maximum number of HCAs to 64 (from 32). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: use stack variable 'base' in ib_uverbs_add_one	Alexander Chiang
	This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UVERBS_MAX_DEVICES in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Use stack variable 'devnum' in ib_uverbs_add_one	Alexander Chiang
	This change is not useful by itself, but it sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UVERBS_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Remove dev_table	Alexander Chiang
	dev_table's raison d'etre was to associate an inode back to a struct ib_uverbs_device. However, now that we've converted ib_uverbs_device to contain an embedded cdev (instead of a *cdev), we can use the container_of() macro and cast back to the containing device. There's no longer any need for dev_table, so get rid of it. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/uverbs: Convert *cdev to cdev in struct ib_uverbs_device	Alexander Chiang
	Instead of storing a pointer to a cdev, embed the entire struct cdev. This change allows us to use the container_of() macro in ib_uverbs_open() in a future patch. This change increases the size of struct ib_uverbs_device to 168 bytes across 3 cachelines from 80 bytes in 2 cachelines. However, we rearrange the members so that everything fits into the first cacheline except for the struct cdev. Finally, we don't touch the cdev in any fastpaths, so this change shouldn't negatively affect performance. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove redundant locking from iser scsi command response flow	Or Gerlitz
	Currently the iSER receive completion flow takes the session lock twice. Optimize it to avoid the first one by letting iser_task_rdma_finalize() be called only from the cleanup_task callback invoked by iscsi_free_task, thus reducing the contention on the session lock between the scsi command submission to the scsi command completion flows. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use libiscsi passthrough mode	Or Gerlitz
	libiscsi passthrough mode invokes the transport xmit calls directly without first going through an internal queue, unlike the other mode, which uses a queue and a xmitworker thread. Now that the "cant_sleep" prerequisite of iscsi_host_alloc is met, move to use it. Handling xmit errors is now done by the passthrough flow of libiscsi. Since the queue/worker aren't used in this mode, the code that schedules the xmitworker is removed. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove unnecessary connection checks	Or Gerlitz
	Remove unnecessary checks for the IB connection state and for QP overflow, as conn state changes are reported by iSER to libiscsi and handled there. QP overflow is theoretically possible only when unsolicited data-outs are used; anyway it's being checked and handled by HW drivers. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use atomic allocations	Or Gerlitz
	Two minor flows in iSER's data path still use allocations; move them to be atomic as a preperation step towards moving to use libiscsi passthrough mode. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Simplify send flow/descriptors	Or Gerlitz
	Simplify and shrink the logic/code used for the send descriptors. Changes include removing struct iser_dto (an unnecessary abstraction), using struct iser_regd_buf only for handling SCSI commands, using dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Use different CQ for send completions	Or Gerlitz
	Use a different CQ for send completions, where send completions are polled by the interrupt-driven receive completion handler. Therefore, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Remove atomic counter for posted receive buffers	Or Gerlitz
	Now that both the posting and reaping of receive buffers is done in the completion path, the counter of outstanding buffers not be atomic. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: New receive buffer posting logic	Or Gerlitz
	Currently, the recv buffer posting logic is based on the transactional nature of iSER which allows for posting a buffer before sending a PDU. Change this to post only when the number of outstanding recv buffers is below a water mark and in a batched manner, thus simplifying and optimizing the data path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24	IB/iser: Revert commit bba7ebb "avoid recv buffer exhaustion"	Or Gerlitz
	We will make a major change in the recv buffer posting logic, after which the problem commit bba7ebb "avoid recv buffer exhaustion caused by unexpected PDUs" comes to solve doesn't exist any more, so revert it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Change WQ overflow return code	Or Gerlitz
	Change the nes driver to return -ENOMEM on SQ/RQ overflow to match the return code of other RDMA HW drivers (e.g cxgb3, ehca, mlx4, mthca). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Acked-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Multiple disconnects cause crash during AE handling	Faisal Latif
	There is a double disconnect during AE processing, causing crashes. While fixing the crash, also simplify the AE handling code. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Fix crash when listener destroyed during loopback setup	Faisal Latif
	When a listener is destroyed and there is an MPA response pending for loopback connection, the active side cm_node gets destroyed twice: once in cm_event_connect_error() and again in nes_accept()/nes_reject(). Increment the cm_node's refcount so it's not destroyed by cm_event_connect_error(). Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	RDMA/nes: Use atomic counters for CM listener create and destroy	Faisal Latif
	After running long iterative MPI tests, sometimes ethtool reports a "CM Destroy Listener" count more than the "CM Create Listener" count. This inconsistency is fixed by making counter variables atomic. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19	IB/ehca: Require in_wc in process_mad()	Alexander Schmidt
	If the caller does not pass a valid in_wc to process_mad(), return MAD failure status, as it is not possible to generate a valid MAD redirect response (and redirects are the only MAD responses ehca generates). Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12	IB/ehca: Allow access for ib_query_qp()	Alexander Schmidt
	The max_dest_rd_atomic and max_qp_rd_atomic values are properly returned by query_qp(), so there should not be an error returned when they are queried. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12	IB/ehca: Do not turn off irqs in tasklet context	Alexander Schmidt
	The irq_spinlock is only taken in tasklet context, so it is safe not to disable hardware interrupts. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12	IB/mlx4: Simplify retrieval of ib_device	Eli Cohen
	struct ib_qp already holds a pointer to the ib device. No need to dive to the hw device object to retrieve it. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-11	IPoIB: Remove TX moderation settings from ethtool support	Or Gerlitz
	As of commit f56bcd8 ("IPoIB: Use separate CQ for UD send completions"), there are no TX interrupts. Change the ethtool code not to report TX moderation settings, so users will not be misled to think they can control TX interrupt moderation. Pointed out by Alex Vainman <alexv@voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-11	RDMA/cxgb3: Remove BUG_ON() on CQ rearm failure	Steve Wise
	Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't kill the whole system with a BUG_ON() if this happens. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>