diff options
author | Vinod Koul <vinod.koul@intel.com> | 2013-11-16 11:54:17 +0530 |
---|---|---|
committer | Vinod Koul <vinod.koul@intel.com> | 2013-11-16 12:02:36 +0530 |
commit | df12a3178d340319b1955be6b973a4eb84aff754 (patch) | |
tree | 2b9c68f8a6c299d1e5a4026c60117b5c00d46008 | |
parent | 2f986ec6fa57a5dcf77f19f5f0d44b1f680a100f (diff) | |
parent | 82a1402eaee5dab1f3ab2d5aa4c316451374c5af (diff) |
Merge commit 'dmaengine-3.13-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine
Pull dmaengine changes from Dan
1/ Bartlomiej and Dan finalized a rework of the dma address unmap
implementation.
2/ In the course of testing 1/ a collection of enhancements to dmatest
fell out. Notably basic performance statistics, and fixed / enhanced
test control through new module parameters 'run', 'wait', 'noverify',
and 'verbose'. Thanks to Andriy and Linus for their review.
3/ Testing the raid related corner cases of 1/ triggered bugs in the
recently added 16-source operation support in the ioatdma driver.
4/ Some minor fixes / cleanups to mv_xor and ioatdma.
Conflicts:
drivers/dma/dmatest.c
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
39 files changed, 1018 insertions, 1911 deletions
diff --git a/Documentation/dmatest.txt b/Documentation/dmatest.txt index a2b5663eae2..dd77a81bdb8 100644 --- a/Documentation/dmatest.txt +++ b/Documentation/dmatest.txt @@ -15,39 +15,48 @@ be built as module or inside kernel. Let's consider those cases. Part 2 - When dmatest is built as a module... -After mounting debugfs and loading the module, the /sys/kernel/debug/dmatest -folder with nodes will be created. There are two important files located. First -is the 'run' node that controls run and stop phases of the test, and the second -one, 'results', is used to get the test case results. - -Note that in this case test will not run on load automatically. - Example of usage: + % modprobe dmatest channel=dma0chan0 timeout=2000 iterations=1 run=1 + +...or: + % modprobe dmatest % echo dma0chan0 > /sys/module/dmatest/parameters/channel % echo 2000 > /sys/module/dmatest/parameters/timeout % echo 1 > /sys/module/dmatest/parameters/iterations - % echo 1 > /sys/kernel/debug/dmatest/run + % echo 1 > /sys/module/dmatest/parameters/run + +...or on the kernel command line: + + dmatest.channel=dma0chan0 dmatest.timeout=2000 dmatest.iterations=1 dmatest.run=1 Hint: available channel list could be extracted by running the following command: % ls -1 /sys/class/dma/ -After a while you will start to get messages about current status or error like -in the original code. +Once started a message like "dmatest: Started 1 threads using dma0chan0" is +emitted. After that only test failure messages are reported until the test +stops. Note that running a new test will not stop any in progress test. -The following command should return actual state of the test. - % cat /sys/kernel/debug/dmatest/run - -To wait for test done the user may perform a busy loop that checks the state. - - % while [ $(cat /sys/kernel/debug/dmatest/run) = "Y" ] - > do - > echo -n "." - > sleep 1 - > done - > echo +The following command returns the state of the test. + % cat /sys/module/dmatest/parameters/run + +To wait for test completion userpace can poll 'run' until it is false, or use +the wait parameter. Specifying 'wait=1' when loading the module causes module +initialization to pause until a test run has completed, while reading +/sys/module/dmatest/parameters/wait waits for any running test to complete +before returning. For example, the following scripts wait for 42 tests +to complete before exiting. Note that if 'iterations' is set to 'infinite' then +waiting is disabled. + +Example: + % modprobe dmatest run=1 iterations=42 wait=1 + % modprobe -r dmatest +...or: + % modprobe dmatest run=1 iterations=42 + % cat /sys/module/dmatest/parameters/wait + % modprobe -r dmatest Part 3 - When built-in in the kernel... @@ -62,21 +71,22 @@ case. You always could check them at run-time by running Part 4 - Gathering the test results -The module provides a storage for the test results in the memory. The gathered -data could be used after test is done. +Test results are printed to the kernel log buffer with the format: -The special file 'results' in the debugfs represents gathered data of the in -progress test. The messages collected are printed to the kernel log as well. +"dmatest: result <channel>: <test id>: '<error msg>' with src_off=<val> dst_off=<val> len=<val> (<err code>)" Example of output: - % cat /sys/kernel/debug/dmatest/results - dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0) + % dmesg | tail -n 1 + dmatest: result dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0) The message format is unified across the different types of errors. A number in the parens represents additional information, e.g. error code, error counter, -or status. +or status. A test thread also emits a summary line at completion listing the +number of tests executed, number that failed, and a result code. -Comparison between buffers is stored to the dedicated structure. +Example: + % dmesg | tail -n 1 + dmatest: dma0chan0-copy0: summary 1 test, 0 failures 1000 iops 100000 KB/s (0) -Note that the verify result is now accessible only via file 'results' in the -debugfs. +The details of a data miscompare error are also emitted, but do not follow the +above format. diff --git a/arch/arm/include/asm/hardware/iop3xx-adma.h b/arch/arm/include/asm/hardware/iop3xx-adma.h index 9b28f1243bd..240b29ef17d 100644 --- a/arch/arm/include/asm/hardware/iop3xx-adma.h +++ b/arch/arm/include/asm/hardware/iop3xx-adma.h @@ -393,36 +393,6 @@ static inline int iop_chan_zero_sum_slot_count(size_t len, int src_cnt, return slot_cnt; } -static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc) -{ - return 0; -} - -static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc, - struct iop_adma_chan *chan) -{ - union iop3xx_desc hw_desc = { .ptr = desc->hw_desc, }; - - switch (chan->device->id) { - case DMA0_ID: - case DMA1_ID: - return hw_desc.dma->dest_addr; - case AAU_ID: - return hw_desc.aau->dest_addr; - default: - BUG(); - } - return 0; -} - - -static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc, - struct iop_adma_chan *chan) -{ - BUG(); - return 0; -} - static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc, struct iop_adma_chan *chan) { diff --git a/arch/arm/include/asm/hardware/iop_adma.h b/arch/arm/include/asm/hardware/iop_adma.h index 122f86d8c99..250760e0810 100644 --- a/arch/arm/include/asm/hardware/iop_adma.h +++ b/arch/arm/include/asm/hardware/iop_adma.h @@ -82,8 +82,6 @@ struct iop_adma_chan { * @slot_cnt: total slots used in an transaction (group of operations) * @slots_per_op: number of slots per operation * @idx: pool index - * @unmap_src_cnt: number of xor sources - * @unmap_len: transaction bytecount * @tx_list: list of descriptors that are associated with one operation * @async_tx: support for the async_tx api * @group_list: list of slots that make up a multi-descriptor transaction @@ -99,8 +97,6 @@ struct iop_adma_desc_slot { u16 slot_cnt; u16 slots_per_op; u16 idx; - u16 unmap_src_cnt; - size_t unmap_len; struct list_head tx_list; struct dma_async_tx_descriptor async_tx; union { diff --git a/arch/arm/mach-iop13xx/include/mach/adma.h b/arch/arm/mach-iop13xx/include/mach/adma.h index 6d3782d85a9..a86fd0ed775 100644 --- a/arch/arm/mach-iop13xx/include/mach/adma.h +++ b/arch/arm/mach-iop13xx/include/mach/adma.h @@ -218,20 +218,6 @@ iop_chan_xor_slot_count(size_t len, int src_cnt, int *slots_per_op) #define iop_chan_pq_slot_count iop_chan_xor_slot_count #define iop_chan_pq_zero_sum_slot_count iop_chan_xor_slot_count -static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc, - struct iop_adma_chan *chan) -{ - struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc; - return hw_desc->dest_addr; -} - -static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc, - struct iop_adma_chan *chan) -{ - struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc; - return hw_desc->q_dest_addr; -} - static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc, struct iop_adma_chan *chan) { @@ -350,18 +336,6 @@ iop_desc_init_pq(struct iop_adma_desc_slot *desc, int src_cnt, hw_desc->desc_ctrl = u_desc_ctrl.value; } -static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc) -{ - struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc; - union { - u32 value; - struct iop13xx_adma_desc_ctrl field; - } u_desc_ctrl; - - u_desc_ctrl.value = hw_desc->desc_ctrl; - return u_desc_ctrl.field.pq_xfer_en; -} - static inline void iop_desc_init_pq_zero_sum(struct iop_adma_desc_slot *desc, int src_cnt, unsigned long flags) diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c index 9e62feffb37..f8c0b8dbeb7 100644 --- a/crypto/async_tx/async_memcpy.c +++ b/crypto/async_tx/async_memcpy.c @@ -50,33 +50,36 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset, &dest, 1, &src, 1, len); struct dma_device *device = chan ? chan->device : NULL; struct dma_async_tx_descriptor *tx = NULL; + struct dmaengine_unmap_data *unmap = NULL; - if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) { - dma_addr_t dma_dest, dma_src; + if (device) + unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO); + + if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) { unsigned long dma_prep_flags = 0; if (submit->cb_fn) dma_prep_flags |= DMA_PREP_INTERRUPT; if (submit->flags & ASYNC_TX_FENCE) dma_prep_flags |= DMA_PREP_FENCE; - dma_dest = dma_map_page(device->dev, dest, dest_offset, len, - DMA_FROM_DEVICE); - - dma_src = dma_map_page(device->dev, src, src_offset, len, - DMA_TO_DEVICE); - - tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src, - len, dma_prep_flags); - if (!tx) { - dma_unmap_page(device->dev, dma_dest, len, - DMA_FROM_DEVICE); - dma_unmap_page(device->dev, dma_src, len, - DMA_TO_DEVICE); - } + + unmap->to_cnt = 1; + unmap->addr[0] = dma_map_page(device->dev, src, src_offset, len, + DMA_TO_DEVICE); + unmap->from_cnt = 1; + unmap->addr[1] = dma_map_page(device->dev, dest, dest_offset, len, + DMA_FROM_DEVICE); + unmap->len = len; + + tx = device->device_prep_dma_memcpy(chan, unmap->addr[1], + unmap->addr[0], len, + dma_prep_flags); } if (tx) { pr_debug("%s: (async) len: %zu\n", __func__, len); + + dma_set_unmap(tx, unmap); async_tx_submit(chan, tx, submit); } else { void *dest_buf, *src_buf; @@ -96,6 +99,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset, async_tx_sync_epilog(submit); } + dmaengine_unmap_put(unmap); + return tx; } EXPORT_SYMBOL_GPL(async_memcpy); diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c index 91d5d385899..d05327caf69 100644 --- a/crypto/async_tx/async_pq.c +++ b/crypto/async_tx/async_pq.c @@ -46,49 +46,24 @@ static struct page *pq_scribble_page; * do_async_gen_syndrome - asynchronously calculate P and/or Q */ static __async_inline struct dma_async_tx_descriptor * -do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, - const unsigned char *scfs, unsigned int offset, int disks, - size_t len, dma_addr_t *dma_src, +do_async_gen_syndrome(struct dma_chan *chan, + const unsigned char *scfs, int disks, + struct dmaengine_unmap_data *unmap, + enum dma_ctrl_flags dma_flags, struct async_submit_ctl *submit) { struct dma_async_tx_descriptor *tx = NULL; struct dma_device *dma = chan->device; - enum dma_ctrl_flags dma_flags = 0; enum async_tx_flags flags_orig = submit->flags; dma_async_tx_callback cb_fn_orig = submit->cb_fn; dma_async_tx_callback cb_param_orig = submit->cb_param; int src_cnt = disks - 2; - unsigned char coefs[src_cnt]; unsigned short pq_src_cnt; dma_addr_t dma_dest[2]; int src_off = 0; - int idx; - int i; - /* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */ - if (P(blocks, disks)) - dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset, - len, DMA_BIDIRECTIONAL); - else - dma_flags |= DMA_PREP_PQ_DISABLE_P; - if (Q(blocks, disks)) - dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset, - len, DMA_BIDIRECTIONAL); - else - dma_flags |= DMA_PREP_PQ_DISABLE_Q; - - /* convert source addresses being careful to collapse 'empty' - * sources and update the coefficients accordingly - */ - for (i = 0, idx = 0; i < src_cnt; i++) { - if (blocks[i] == NULL) - continue; - dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len, - DMA_TO_DEVICE); - coefs[idx] = scfs[i]; - idx++; - } - src_cnt = idx; + if (submit->flags & ASYNC_TX_FENCE) + dma_flags |= DMA_PREP_FENCE; while (src_cnt > 0) { submit->flags = flags_orig; @@ -100,28 +75,25 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, if (src_cnt > pq_src_cnt) { submit->flags &= ~ASYNC_TX_ACK; submit->flags |= ASYNC_TX_FENCE; - dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP; submit->cb_fn = NULL; submit->cb_param = NULL; } else { - dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP; submit->cb_fn = cb_fn_orig; submit->cb_param = cb_param_orig; if (cb_fn_orig) dma_flags |= DMA_PREP_INTERRUPT; } - if (submit->flags & ASYNC_TX_FENCE) - dma_flags |= DMA_PREP_FENCE; - /* Since we have clobbered the src_list we are committed - * to doing this asynchronously. Drivers force forward - * progress in case they can not provide a descriptor + /* Drivers force forward progress in case they can not provide + * a descriptor */ for (;;) { + dma_dest[0] = unmap->addr[disks - 2]; + dma_dest[1] = unmap->addr[disks - 1]; tx = dma->device_prep_dma_pq(chan, dma_dest, - &dma_src[src_off], + &unmap->addr[src_off], pq_src_cnt, - &coefs[src_off], len, + &scfs[src_off], unmap->len, dma_flags); if (likely(tx)) break; @@ -129,6 +101,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, dma_async_issue_pending(chan); } + dma_set_unmap(tx, unmap); async_tx_submit(chan, tx, submit); submit->depend_tx = tx; @@ -188,10 +161,6 @@ do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks, * set to NULL those buffers will be replaced with the raid6_zero_page * in the synchronous path and omitted in the hardware-asynchronous * path. - * - * 'blocks' note: if submit->scribble is NULL then the contents of - * 'blocks' may be overwritten to perform address conversions - * (dma_map_page() or page_address()). */ struct dma_async_tx_descriptor * async_gen_syndrome(struct page **blocks, unsigned int offset, int disks, @@ -202,26 +171,69 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks, &P(blocks, disks), 2, blocks, src_cnt, len); struct dma_device *device = chan ? chan->device : NULL; - dma_addr_t *dma_src = NULL; + struct dmaengine_unmap_data *unmap = NULL; BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks))); - if (submit->scribble) - dma_src = submit->scribble; - else if (sizeof(dma_addr_t) <= sizeof(struct page *)) - dma_src = (dma_addr_t *) blocks; + if (device) + unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO); - if (dma_src && device && + if (unmap && (src_cnt <= dma_maxpq(device, 0) || dma_maxpq(device, DMA_PREP_CONTINUE) > 0) && is_dma_pq_aligned(device, offset, 0, len)) { + struct dma_async_tx_descriptor *tx; + enum dma_ctrl_flags dma_flags = 0; + unsigned char coefs[src_cnt]; + int i, j; + /* run the p+q asynchronously */ pr_debug("%s: (async) disks: %d len: %zu\n", __func__, disks, len); - return do_async_gen_syndrome(chan, blocks, raid6_gfexp, offset, - disks, len, dma_src, submit); + + /* convert source addresses being careful to collapse 'empty' + * sources and update the coefficients accordingly + */ + unmap->len = len; + for (i = 0, j = 0; i < src_cnt; i++) { + if (blocks[i] == NULL) + continue; + unmap->addr[j] = dma_map_page(device->dev, blocks[i], offset, + len, DMA_TO_DEVICE); + coefs[j] = raid6_gfexp[i]; + unmap->to_cnt++; + j++; + } + + /* + * DMAs use destinations as sources, + * so use BIDIRECTIONAL mapping + */ + unmap->bidi_cnt++; + if (P(blocks, disks)) + unmap->addr[j++] = dma_map_page(device->dev, P(blocks, disks), + offset, len, DMA_BIDIRECTIONAL); + else { + unmap->addr[j++] = 0; + dma_flags |= DMA_PREP_PQ_DISABLE_P; + } + + unmap->bidi_cnt++; + if (Q(blocks, disks)) + unmap->addr[j++] = dma_map_page(device->dev, Q(blocks, disks), + offset, len, DMA_BIDIRECTIONAL); + else { + unmap->addr[j++] = 0; + dma_flags |= DMA_PREP_PQ_DISABLE_Q; + } + + tx = do_async_gen_syndrome(chan, coefs, j, unmap, dma_flags, submit); + dmaengine_unmap_put(unmap); + return tx; } + dmaengine_unmap_put(unmap); + /* run the pq synchronously */ pr_debug("%s: (sync) disks: %d len: %zu\n", __func__, disks, len); @@ -277,50 +289,60 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks, struct dma_async_tx_descriptor *tx; unsigned char coefs[disks-2]; enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0; - dma_addr_t *dma_src = NULL; - int src_cnt = 0; + struct dmaengine_unmap_data *unmap = NULL; BUG_ON(disks < 4); - if (submit->scribble) - dma_src = submit->scribble; - else if (sizeof(dma_addr_t) <= sizeof(struct page *)) - dma_src = (dma_addr_t *) blocks; + if (device) + unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO); - if (dma_src && device && disks <= dma_maxpq(device, 0) && + if (unmap && disks <= dma_maxpq(device, 0) && is_dma_pq_aligned(device, offset, 0, len)) { struct device *dev = device->dev; - dma_addr_t *pq = &dma_src[disks-2]; - int i; + dma_addr_t pq[2]; + int i, j = 0, src_cnt = 0; pr_debug("%s: (async) disks: %d len: %zu\n", __func__, disks, len); - if (!P(blocks, disks)) + + unmap->len = len; + for (i = 0; i < disks-2; i++) + if (likely(blocks[i])) { + unmap->addr[j] = dma_map_page(dev, blocks[i], + offset, len, + DMA_TO_DEVICE); + coefs[j] = raid6_gfexp[i]; + unmap->to_cnt++; + src_cnt++; + j++; + } + + if (!P(blocks, disks)) { + pq[0] = 0; dma_flags |= DMA_PREP_PQ_DISABLE_P; - else + } else { pq[0] = dma_map_page(dev, P(blocks, disks), offset, len, DMA_TO_DEVICE); - if (!Q(blocks, disks)) + unmap->addr[j++] = pq[0]; + unmap->to_cnt++; + } + if (!Q(blocks, disks)) { + pq[1] = 0; dma_flags |= DMA_PREP_PQ_DISABLE_Q; - else + } else { pq[1] = dma_map_page(dev, Q(blocks, disks), offset, len, DMA_TO_DEVICE); + unmap->addr[j++] = pq[1]; + unmap->to_cnt++; + } if (submit->flags & ASYNC_TX_FENCE) dma_flags |= DMA_PREP_FENCE; - for (i = 0; i < disks-2; i++) - if (likely(blocks[i])) { - dma_src[src_cnt] = dma_map_page(dev, blocks[i], - offset, len, - DMA_TO_DEVICE); - coefs[src_cnt] = raid6_gfexp[i]; - src_cnt++; - } - for (;;) { - tx = device->device_prep_dma_pq_val(chan, pq, dma_src, + tx = device->device_prep_dma_pq_val(chan, pq, + unmap->addr, src_cnt, coefs, len, pqres, @@ -330,6 +352,8 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks, async_tx_quiesce(&submit->depend_tx); dma_async_issue_pending(chan); } + + dma_set_unmap(tx, unmap); async_tx_submit(chan, tx, submit); return tx; diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c index a9f08a6a582..934a8498149 100644 --- a/crypto/async_tx/async_raid6_recov.c +++ b/crypto/async_tx/async_raid6_recov.c @@ -26,6 +26,7 @@ #include <linux/dma-mapping.h> #include <linux/raid/pq.h> #include <linux/async_tx.h> +#include <linux/dmaengine.h> static struct dma_async_tx_descriptor * async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef, @@ -34,35 +35,45 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef, struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ, &dest, 1, srcs, 2, len); struct dma_device *dma = chan ? chan->device : NULL; + struct dmaengine_unmap_data *unmap = NULL; const u8 *amul, *bmul; u8 ax, bx; u8 *a, *b, *c; - if (dma) { - dma_addr_t dma_dest[2]; - dma_addr_t dma_src[2]; + if (dma) + unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO); + + if (unmap) { struct device *dev = dma->dev; + dma_addr_t pq[2]; struct dma_async_tx_descriptor *tx; enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P; if (submit->flags & ASYNC_TX_FENCE) dma_flags |= DMA_PREP_FENCE; - dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); - dma_src[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE); - dma_src[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE); - tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef, + unmap->addr[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE); + unmap->addr[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE); + unmap->to_cnt = 2; + + unmap->addr[2] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); + unmap->bidi_cnt = 1; + /* engine only looks at Q, but expects it to follow P */ + pq[1] = unmap->addr[2]; + + unmap->len = len; + tx = dma->device_prep_dma_pq(chan, pq, unmap->addr, 2, coef, len, dma_flags); if (tx) { + dma_set_unmap(tx, unmap); async_tx_submit(chan, tx, submit); + dmaengine_unmap_put(unmap); return tx; } /* could not get a descriptor, unmap and fall through to * the synchronous path */ - dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL); - dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE); - dma_unmap_page(dev, dma_src[1], len, DMA_TO_DEVICE); + dmaengine_unmap_put(unmap); } /* run the operation synchronously */ @@ -89,23 +100,38 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len, struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ, &dest, 1, &src, 1, len); struct dma_device *dma = chan ? chan->device : NULL; + struct dmaengine_unmap_data *unmap = NULL; const u8 *qmul; /* Q multiplier table */ u8 *d, *s; - if (dma) { + if (dma) + unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO); + + if (unmap) { dma_addr_t dma_dest[2]; - dma_addr_t dma_src[1]; struct device *dev = dma->dev; struct dma_async_tx_descriptor *tx; enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P; if (submit->flags & ASYNC_TX_FENCE) dma_flags |= DMA_PREP_FENCE; - dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); - dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE); - tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef, - len, dma_flags); + unmap->addr[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE); + unmap->to_cnt++; + unmap->addr[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); + dma_dest[1] = unmap->addr[1]; + unmap->bidi_cnt++; + unmap->len = len; + + /* this looks funny, but the engine looks for Q at + * dma_dest[1] and ignores dma_dest[0] as a dest + * due to DMA_PREP_PQ_DISABLE_P + */ + tx = dma->device_prep_dma_pq(chan, dma_dest, unmap->addr, + 1, &coef, len, dma_flags); + if (tx) { + dma_set_unmap(tx, unmap); + dmaengine_unmap_put(unmap); async_tx_submit(chan, tx, submit); return tx; } @@ -113,8 +139,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len, /* could not get a descriptor, unmap and fall through to * the synchronous path */ - dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL); - dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE); + dmaengine_unmap_put(unmap); } /* no channel available, or failed to allocate a descriptor, so diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c index 8ade0a0481c..3c562f5a60b 100644 --- a/crypto/async_tx/async_xor.c +++ b/crypto/async_tx/async_xor.c @@ -33,48 +33,31 @@ /* do_async_xor - dma map the pages and perform the xor with an engine */ static __async_inline struct dma_async_tx_descriptor * -do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list, - unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src, +do_async_xor(struct dma_chan *chan, struct dmaengine_unmap_data *unmap, struct async_submit_ctl *submit) { struct dma_device *dma = chan->device; struct dma_async_tx_descriptor *tx = NULL; - int src_off = 0; - int i; dma_async_tx_callback cb_fn_orig = submit->cb_fn; void *cb_param_orig = submit->cb_param; enum async_tx_flags flags_orig = submit->flags; - enum dma_ctrl_flags dma_flags; - int xor_src_cnt = 0; - dma_addr_t dma_dest; - - /* map the dest bidrectional in case it is re-used as a source */ - dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL); - for (i = 0; i < src_cnt; i++) { - /* only map the dest once */ - if (!src_list[i]) - cont |