aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2014-03-17IB/iser: Suppress completions for fast registration work requestsSagi Grimberg
In case iSER uses fast registration method, it should not request for successful completions on fast registration nor local invalidate requests. We color wr_id with ISER_FRWR_LI_WRID in order to correctly consume error completions. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/mlx4: Fix a sparse endianness warningBart Van Assche
Fix the following warning for the mlx4 driver: $ make M=drivers/infiniband C=2 CF=-D__CHECK_ENDIAN__ drivers/infiniband/hw/mlx4/qp.c:1885:31: warning: restricted __be16 degrades to integer Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17RDMA/ocrdma: Fix compiler warningPrarit Bhargava
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function ‘_ocrdma_modify_qp’: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:1299:31: error: ‘old_qps’ may be used uninitialized in this function [-Werror=maybe-uninitialized] status = ocrdma_mbx_modify_qp(dev, qp, attr, attr_mask, old_qps); ocrdma_mbx_modify_qp() (and subsequent calls) doesn't appear to use old_qps so it doesn't need to be passed on. Removing the variable results in the warning going away. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Devesh Sharma (Devesh.sharma@emulex.com) Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17RDMA/nes: Clean up a conditionDan Carpenter
We don't need to test "ret" twice and also the white space is messed up. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Remove duplicate check in get_a_ctxt()Dan Carpenter
We already know "pusable" is non-zero, no need to check again. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/usnic: Remove '0x' when using %pa formatFabio Estevam
%pa format already prints in hexadecimal format, so remove the '0x' annotation to avoid a double '0x0x' pattern. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/nes: Return an error on ib_copy_from_udata() failure instead of NULLYann Droneaud
In case of error while accessing to userspace memory, function nes_create_qp() returns NULL instead of an error code wrapped through ERR_PTR(). But NULL is not expected by ib_uverbs_create_qp(), as it check for error with IS_ERR(). As page 0 is likely not mapped, it is going to trigger an Oops when the kernel will try to dereference NULL pointer to access to struct ib_qp's fields. In some rare cases, page 0 could be mapped by userspace, which could turn this bug to a vulnerability that could be exploited: the function pointers in struct ib_device will be under userspace total control. This was caught when using spatch (aka. coccinelle) to rewrite calls to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/ib-hw-nes-create-qp-null Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Fix memory leak of recv context when driver fails to initialize.Dennis Dalessandro
In qib_create_ctxts() we allocate an array to hold recv contexts. Then attempt to create data for those recv contexts. If that call to qib_create_ctxtdata() fails then an error is returned but the previously allocated memory is not freed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: fixup indentation in qib_ib_rcv()Yann Droneaud
Commit af061a644a0e4d4778 add some code in qib_ib_rcv() which trigger a warning from coccicheck (coccinelle/spatch): $ make C=2 CHECK=scripts/coccicheck drivers/infiniband/hw/qib/ CHECK drivers/infiniband/hw/qib/qib_verbs.c drivers/infiniband/hw/qib/qib_verbs.c:679:5-32: code aligned with following code on line 681 CC [M] drivers/infiniband/hw/qib/qib_verbs.o In fact, according to similar code in qib_kreceive(), qib_ib_rcv() code is correct but improperly indented. This patch fix indentation for the misaligned portion. Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: infinipath@intel.com Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: cocci@systeme.lip6.fr Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: add missing braces in do_qib_user_sdma_queue_create()Yann Droneaud
Commit c804f07248895ff9c moved qib_assign_ctxt() to do_qib_user_sdma_queue_create() but dropped the braces around the statements. This was spotted by coccicheck (coccinelle/spatch): $ make C=2 CHECK=scripts/coccicheck drivers/infiniband/hw/qib/ CHECK drivers/infiniband/hw/qib/qib_file_ops.c drivers/infiniband/hw/qib/qib_file_ops.c:1583:2-23: code aligned with following code on line 1587 This patch adds braces back. Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: infinipath@intel.com Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: cocci@systeme.lip6.fr Cc: stable@vger.kernel.org Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Modify software pma counters to use percpu variablesMike Marciniszyn
The counters, unicast_xmit, unicast_rcv, multicast_xmit, multicast_rcv are now maintained as percpu variables. The mad code is modified to add a z_ latch so that the percpu counters monotonically increase with appropriate adjustments in the reset, read logic to maintain the z_ latch. This patch also corrects the fact the unitcast_xmit wasn't handled at all for UC and RC QPs. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Add percpu counter replacing qib_devdata int_counterMike Marciniszyn
This patch replaces the dd->int_counter with a percpu counter. The maintanance of qib_stats.sps_ints and int_counter are combined into the new counter. There are two new functions added to read the counter: - qib_int_counter (for a particular qib_devdata) - qib_sps_ints (for all HCAs) A z_int_counter is added to allow the interrupt detection logic to determine if interrupts have occured since z_int_counter was "reset". Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Fix debugfs ordering issue with multiple HCAsMike Marciniszyn
The debugfs init code was incorrectly called before the idr mechanism is used to get the unit number, so the dd->unit hasn't been initialized. This caused the unit relative directory creation to fail after the first. This patch moves the init for the debugfs stuff until after all of the failures and after the unit number has been determined. A bug in unwind code in qib_alloc_devdata() is also fixed. Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/ipath: Fix potential buffer overrun in sending diag packet routineDennis Dalessandro
Guard against a potential buffer overrun. The size to read from the user is passed in, and due to the padding that needs to be taken into account, as well as the place holder for the ICRC it is possible to overflow the 32bit value which would cause more data to be copied from user space than is allocated in the buffer. Cc: <stable@vger.kernel.org> Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17IB/qib: Fix potential buffer overrun in sending diag packet routineDennis Dalessandro
Guard against a potential buffer overrun. Right now the qib driver is protected by the fact that the data structure in question is only 16 bits. Should that ever change the problem will be exposed. There is a similar defect in the ipath driver and this brings the two code paths into sync. Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17RDMA/nes: Fix for passing a valid QP pointer to the user space libraryTatyana Nikolova
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-17RDMA/nes: Fixes for IRD/ORD negotiation with MPA v2Tatyana Nikolova
Fixes to enable the negotiation of the supported IRD/ORD sizes with the peer when exchanging MPA v2 messages in connection establishment. Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-14cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug FixesSteve Wise
The current logic suffers from a slow response time to disable user DB usage, and also fails to avoid DB FIFO drops under heavy load. This commit fixes these deficiencies and makes the avoidance logic more optimal. This is done by more efficiently notifying the ULDs of potential DB problems, and implements a smoother flow control algorithm in iw_cxgb4, which is the ULD that puts the most load on the DB fifo. Design: cxgb4: Direct ULD callback from the DB FULL/DROP interrupt handler. This allows the ULD to stop doing user DB writes as quickly as possible. While user DB usage is disabled, the LLD will accumulate DB write events for its queues. Then once DB usage is reenabled, a single DB write is done for each queue with its accumulated write count. This reduces the load put on the DB fifo when reenabling. iw_cxgb4: Instead of marking each qp to indicate DB writes are disabled, we create a device-global status page that each user process maps. This allows iw_cxgb4 to only set this single bit to disable all DB writes for all user QPs vs traversing the idr of all the active QPs. If the libcxgb4 doesn't support this, then we fall back to the old approach of marking each QP. Thus we allow the new driver to work with an older libcxgb4. When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes via the status page and transition the DB state to STOPPED. As user processes see that DB writes are disabled, they call into iw_cxgb4 to submit their DB write events. Since the DB state is in STOPPED, the QP trying to write gets enqueued on a new DB "flow control" list. As subsequent DB writes are submitted for this flow controlled QP, the amount of writes are accumulated for each QP on the flow control list. So all the user QPs that are actively ringing the DB get put on this list and the number of writes they request are accumulated. When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq context, we change the DB state to FLOW_CONTROL, and begin resuming all the QPs that are on the flow control list. This logic runs on until the flow control list is empty or we exit FLOW_CONTROL mode (due to a DB DROP upcall, for example). QPs are removed from this list, and their accumulated DB write counts written to the DB FIFO. Sets of QPs, called chunks in the code, are removed at one time. The chunk size is 64. So 64 QPs are resumed at a time, and before the next chunk is resumed, the logic waits (blocks) for the DB FIFO to drain. This prevents resuming to quickly and overflowing the FIFO. Once the flow control list is empty, the db state transitions back to NORMAL and user QPs are again allowed to write directly to the user DB register. The algorithm is designed such that if the DB write load is high enough, then all the DB writes get submitted by the kernel using this flow controlled approach to avoid DB drops. As the load lightens though, we resume to normal DB writes directly by user applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14cxgb4/iw_cxgb4: Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative adviceSteve Wise
Based on original work by Anand Priyadarshee <anandp@chelsio.com>. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13Merge branch 'for-next' of ↵Nicholas Bellinger
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband into for-next
2014-03-12mlx4: Activate RoCE/SRIOVJack Morgenstein
To activate RoCE/SRIOV, need to remove the following: 1. In mlx4_ib_add, need to remove the error return preventing initialization of a RoCE port under SRIOV. 2. In update_vport_qp_params (in resource_tracker.c) need to remove the error return when a RoCE RC or UD qp is detected. This error return causes the INIT-to-RTR qp transition to fail in the wrapper function under RoCE/SRIOV. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4_ib: Fix SIDR support of for UD QPs under SRIOV/RoCEShani Michaelli
* Handle CM_SIDR_REQ_ATTR_ID and CM_SIDR_REP_ATTR_ID in multiplex_cm_handler and demux_cm_handler. * Handle Service ID Resolution messages and REQ messages separately, for their formats are different. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Implement IP based gids support for RoCE/SRIOVJack Morgenstein
Since there is no connection between the MAC/VLAN and the GID when using IP-based addressing, the proxy QP1 (running on the slave) must pass the source-mac, destination-mac, and vlan_id information separately from the GID. Additionally, the Host must pass the remote source-mac and vlan_id back to the slave, This is achieved as follows: Outgoing MADs: 1. Source MAC: obtained from the CQ completion structure (struct ib_wc, smac field). 2. Destination MAC: obtained from the tunnel header 3. vlan_id: obtained from the tunnel header. Incoming MADs 1. The source (i.e., remote) MAC and vlan_id are passed in the tunnel header to the proxy QP1. VST mode support: For outgoing MADs, the vlan_id obtained from the header is discarded, and the vlan_id specified by the Hypervisor is used instead. For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the "invalid" vlan (0xffff) is substituted when forwarding to the slave. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Add ref counting to port MAC table for RoCEJack Morgenstein
The IB side of RoCE requires the MAC table index of the MAC address used by its QPs. To obtain the real MAC index, the IB side registers the MAC (increasing its ref count, and also returning the real MAC index) during the modify-qp sequence. This protects against the ETH side deleting or modifying that MAC table entry while the QP is active. Note that until the modify-qp command returns success, the MAC and VLAN information only has "candidate" status. If the modify-qp succeeds, the "candidate" info is promoted to the operational MAC/VLAN info for the qp. If the modify fails, the candidate MAC/VLAN is unregistered, and the old qp info is preserved. The patch is a bit complex, because there are multiple qp transitions where the primary-path information may be modified: INIT-to-RTR, and SQD-to-SQD. Similarly for the alternate path information. Therefore the code must handle cases where path information has already been entered into the QP context by previous qp transitions. For the MAC address, the success logic is as follows: 1. If there was no previous MAC, simply move the candidate MAC information to the operational information, and reset the candidate MAC info. 2. If there was a previous MAC, unregister it. Then move the MAC information from candidate to operational, and reset the candidate info (as in 1. above). The MAC address failure logic is the same for all cases: - Unregister the candidate MAC, and reset the candidate MAC info. For Vlan registration, the logic is similar. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: In RoCE allow guests to have multiple GIDSJack Morgenstein
The GIDs are statically distributed, as follows: PF: gets 16 GIDs VFs: Remaining GIDS are divided evenly between VFs activated by the driver. If the division is not even, lower-numbered VFs get an extra GID. For an IB interface, the number of gids per guest remains as before: one gid per guest. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Adjust QP1 multiplexing for RoCE/SRIOVJack Morgenstein
This requires the following modifications: 1. Fix build_mlx4_header to properly fill in the ETH fields 2. Adjust mux and demux QP1 flow to support RoCE. This commit still assumes only one GID per slave for RoCE. The commit enabling multiple GIDs is a subsequent commit, and is done separately because of its complexity. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "This series addresses a number of outstanding issues wrt to active I/O shutdown using iser-target. This includes: - Fix a long standing tpg_state bug where a tpg could be referenced during explicit shutdown (v3.1+ stable) - Use list_del_init for iscsi_cmd->i_conn_node so list_empty checks work as expected (v3.10+ stable) - Fix a isert_conn->state related hung task bug + ensure outstanding I/O completes during session shutdown. (v3.10+ stable) - Fix isert_conn->post_send_buf_count accounting for RDMA READ/WRITEs (v3.10+ stable) - Ignore FRWR completions during active I/O shutdown (v3.12+ stable) - Fix command leakage for interrupt coalescing during active I/O shutdown (v3.13+ stable) Also included is another DIF emulation fix from Sagi specific to v3.14-rc code" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: Target/sbc: Fix sbc_copy_prot for offset scatters iser-target: Fix command leak for tx_desc->comp_llnode_batch iser-target: Ignore completions for FRWRs in isert_cq_tx_work iser-target: Fix post_send_buf_count for RDMA READ/WRITE iscsi/iser-target: Fix isert_conn->state hung shutdown issues iscsi/iser-target: Use list_del_init for ->i_conn_node iscsi-target: Fix iscsit_get_tpg_from_np tpg_state bug
2014-03-07IB/mlx5: Expose support for signature MR featureSagi Grimberg
Currently support only T10-DIF types of signature handover operations (types 1|2|3). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/mlx5: Collect signature error completionSagi Grimberg
This commit takes care of the generated signature error CQE generated by the HW (if happened). The underlying mlx5 driver will handle signature error completions and will mark the relevant memory region as dirty. Once the consumer gets the completion for the transaction, it must check for signature errors on signature memory region using a new lightweight verb ib_check_mr_status(). In case the user doesn't check for signature error (i.e. doesn't call ib_check_mr_status() with status check IB_MR_CHECK_SIG_STATUS), the memory region cannot be used for another signature operation (REG_SIG_MR work request will fail). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/mlx5: Support IB_WR_REG_SIG_MRSagi Grimberg
This patch implements IB_WR_REG_SIG_MR posted by the user. Baisically this WR involves 3 WQEs in order to prepare and properly register the signature layout: 1. post UMR WR to register the sig_mr in one of two possible ways: * In case the user registered a single MR for data so the UMR data segment consists of: - single klm (data MR) passed by the user - BSF with signature attributes requested by the user. * In case the user registered 2 MRs, one for data and one for protection, the UMR consists of: - strided block format which includes data and protection MRs and their repetitive block format. - BSF with signature attributes requested by the user. 2. post SET_PSV in order to set the memory domain initial signature parameters passed by the user. SET_PSV is not signaled and solicited CQE. 3. post SET_PSV in order to set the wire domain initial signature parameters passed by the user. SET_PSV is not signaled and solicited CQE. * After this compound WR we place a small fence for next WR to come. This patch also introduces some helper functions to set the BSF correctly and determining the signature format selectors. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/mlx5: Remove MTT access mode from umr flags helper functionSagi Grimberg
get_umr_flags helper function might be used for types of access modes other than ACCESS_MODE_MTT, such as ACCESS_MODE_KLM. So remove it from helper, and callers will add their own access mode flag. This commit does not add/change functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/mlx5: Break up wqe handling into begin & finish routinesSagi Grimberg
As a preliminary step for signature feature which will require posting multiple (3) WQEs for a single WR, we break post_send routine WQE indexing into begin and finish routines. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/mlx5: Initialize mlx5_ib_qp signature-related membersSagi Grimberg
If user requested signature enable we initialize relevant mlx5_ib_qp members. We mark the qp as sig_enable and we increase the effective SQ size, but still limit the user max_send_wr to original size computed. We also allow the create_qp routine to accept sig_enable create flag. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07mlx5: Implement create_mr and destroy_mrSagi Grimberg
Support create_mr and destroy_mr verbs. Creating ib_mr may be done for either ib_mr that will register regular page lists like alloc_fast_reg_mr routine, or indirect ib_mrs that can register other (pre-registered) ib_mrs in an indirect manner. In addition user may request signature enable, that will mean that the created ib_mr may be attached with signature attributes (BSF, PSVs). Currently we only allow direct/indirect registration modes. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/core: Introduce signature verbs APISagi Grimberg
Introduce a verbs interface for signature-related operations. A signature handover operation configures the layouts of data and protection attributes both in memory and wire domains. Signature operations are: - INSERT: Generate and insert protection information when handing over data from input space to output space. - validate and STRIP: Validate protection information and remove it when handing over data from input space to output space. - validate and PASS: Validate protection information and pass it when handing over data from input space to output space. Once the signature handover opration is done, the HCA will offload data integrity generation/validation while performing the actual data transfer. Additions: 1. HCA signature capabilities in device attributes Verbs provider supporting signature handover operations fills relevant fields in device attributes structure returned by ib_query_device. 2. QP creation flag IB_QP_CREATE_SIGNATURE_EN Creating a QP that will carry signature handover operations may require some special preparations from the verbs provider. So we add QP creation flag IB_QP_CREATE_SIGNATURE_EN to declare that the created QP may carry out signature handover operations. Expose signature support to verbs layer (no support for now). 3. New send work request IB_WR_REG_SIG_MR Signature handover work request. This WR will define the signature handover properties of the memory/wire domains as well as the domains layout. The purpose of this work request is to bind all the needed information for the signature operation: - data to be transferred: wr->sg_list (ib_sge). * The raw data, pre-registered to a single MR (normally, before signature, this MR would have been used directly for the data transfer) - data protection guards: sig_handover.prot (ib_sge). * The data protection buffer, pre-registered to a single MR, which contains the data integrity guards of the raw data blocks. Note that it may not always exist, only in cases where the user is interested in storing protection guards in memory. - signature operation attributes: sig_handover.sig_attrs. * Tells the HCA how to validate/generate the protection information. Once the work request is executed, the memory region that will describe the signature transaction will be the sig_mr. The application can now go ahead and send the sig_mr.rkey or use the sig_mr.lkey for data transfer. 4. New Verb ib_check_mr_status check_mr_status verb checks the status of the memory region post transaction. The first check that may be used is IB_MR_CHECK_SIG_STATUS, which will indicate if any signature errors are pending for a specific signature-enabled ib_mr. This verb is a lightwight check and is allowed to be taken from interrupt context. An application must call this verb after it is known that the actual data transfer has finished. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07IB/core: Introduce protected memory regionsSagi Grimberg
This commit introduces verbs for creating/destoying memory regions which will allow new types of memory key operations such as protected memory registration. Indirect memory registration is registering several (one of more) pre-registered memory regions in a specific layout. The Indirect region may potentialy describe several regions and some repitition format between them. Protected Memory registration is registering a memory region with various data integrity attributes which will describe protection schemes that will be handled by the HCA in an offloaded manner. These memory regions will be applicable for a new REG_SIG_MR work request introduced later in this patchset. In the future these routines may replace or implement current memory regions creation routines existing today: - ib_reg_user_mr - ib_alloc_fast_reg_mr - ib_get_dma_mr - ib_dereg_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-04iser-target: Fix command leak for tx_desc->comp_llnode_batchNicholas Bellinger
This patch addresses a number of active I/O shutdown issues related to isert_cmd descriptors being leaked that are part of a completion interrupt coalescing batch. This includes adding logic in isert_cq_tx_comp_err() to drain any associated tx_desc->comp_llnode_batch, as well as isert_cq_drain_comp_llist() to drain any associated isert_conn->conn_comp_llist. Also, set tx_desc->llnode_active in isert_init_send_wr() in order to determine when work requests need to be skipped in isert_cq_tx_work() exception path code. Finally, update isert_init_send_wr() to only allow interrupt coalescing when ISER_CONN_UP. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.13+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-04iser-target: Ignore completions for FRWRs in isert_cq_tx_workNicholas Bellinger
This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related work requests to include a ISER_FRWR_LI_WRID value in order to signal isert_cq_tx_work() that these requests should be ignored. This is necessary because even though IB_SEND_SIGNALED is not set for either work request, during a QP failure event the work requests will be returned with exception status from the TX completion queue. v2 changes: - Rename ISER_FRWR_LI_WRID -> ISER_FASTREG_LI_WRID (Sagi) Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-04iser-target: Fix post_send_buf_count for RDMA READ/WRITENicholas Bellinger
This patch fixes the incorrect setting of ->post_send_buf_count related to RDMA WRITEs + READs where isert_rdma_rw->send_wr_num was not being taken into account. This includes incrementing ->post_send_buf_count within isert_put_datain() + isert_get_dataout(), decrementing within __isert_send_completion() + isert_response_completion(), and clearing wr->send_wr_num within isert_completion_rdma_read() This is necessary because even though IB_SEND_SIGNALED is not set for RDMA WRITEs + READs, during a QP failure event the work requests will be returned with exception status from the TX completion queue. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-04iscsi/iser-target: Fix isert_conn->state hung shutdown issuesNicholas Bellinger
This patch addresses a couple of different hug shutdown issues related to wait_event() + isert_conn->state. First, it changes isert_conn->conn_wait + isert_conn->conn_wait_comp_err from waitqueues to completions, and sets ISER_CONN_TERMINATING from within isert_disconnect_work(). Second, it splits isert_free_conn() into isert_wait_conn() that is called earlier in iscsit_close_connection() to ensure that all outstanding commands have completed before continuing. Finally, it breaks isert_cq_comp_err() into seperate TX / RX related code, and adds logic in isert_cq_rx_comp_err() to wait for outstanding commands to complete before setting ISER_CONN_DOWN and calling complete(&isert_conn->conn_wait_comp_err). Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-04iscsi/iser-target: Use list_del_init for ->i_conn_nodeNicholas Bellinger
There are a handful of uses of list_empty() for cmd->i_conn_node within iser-target code that expect to return false once a cmd has been removed from the per connect list. This patch changes all uses of list_del -> list_del_init in order to ensure that list_empty() returns false as expected. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-04IB: Refactor umem to use linear SG tableYishai Hadas
This patch refactors the IB core umem code and vendor drivers to use a linear (chained) SG table instead of chunk list. With this change the relevant code becomes clearer—no need for nested loops to build and use umem. Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-25net,IB/mlx: Bump all Mellanox driver versionsAmir Vadai
Bump all Mellanox driver versions. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "Mostly minor fixes this time to v3.14-rc1 related changes. Also included is one fix for a free after use regression in persistent reservations UNREGISTER logic that is CC'ed to >= v3.11.y stable" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: Target/sbc: Fix protection copy routine IB/srpt: replace strict_strtoul() with kstrtoul() target: Simplify command completion by removing CMD_T_FAILED flag iser-target: Fix leak on failure in isert_conn_create_fastreg_pool iscsi-target: Fix SNACK Type 1 + BegRun=0 handling target: Fix missing length check in spc_emulate_evpd_83() qla2xxx: Remove last vestiges of qla_tgt_cmd.cmd_list target: Fix 32-bit + CONFIG_LBDAF=n link error w/ sector_div target: Fix free-after-use regression in PR unregister
2014-02-14Merge branches 'cma', 'cxgb4', 'iser', 'misc', 'mlx4', 'mlx5', 'nes', ↵Roland Dreier
'ocrdma', 'qib' and 'usnic' into for-next
2014-02-14RDMA/ocrdma: Fix load time panic during GID table initDevesh Sharma
We should use rdma_vlan_dev_real_dev() instead of using vlan_dev_real_dev() when building the GID table for a vlan interface. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14RDMA/ocrdma: Fix traffic class shiftDevesh Sharma
Use correct value for obtaining traffic class from device response for Query QP request. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/iser: Fix use after free in iser_snd_completion()Dan Carpenter
We use "tx_desc" again after we free it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/iser: Avoid dereferencing iscsi_iser conn object when not bound to iser ↵Roi Dayan
connection Fix a possible NULL pointer dereference in disconnection flow. This can happen if the target disconnected/rejected the connection request, e.g before the binding stage between iscsi connection to the transport connection. Signed-off-by: Alex Tabachnik <alext@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>