[RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

Robin Murphy robin.murphy at arm.com
Wed Jun 5 12:21:59 UTC 2019


On 05/06/2019 12:11, Yoshihiro Shimoda wrote:
> This patch adds a new capable IOMMU_CAP_MERGING to check whether
> the IOVA would be contiguous strictly if a device requires and
> the IOMMU driver has the capable.
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh at renesas.com>
> ---
>   drivers/iommu/dma-iommu.c | 26 ++++++++++++++++++++++++--
>   include/linux/iommu.h     |  1 +
>   2 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 034caae..ecf1a04 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -847,11 +847,16 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>   	dma_addr_t iova;
>   	size_t iova_len = 0;
>   	unsigned long mask = dma_get_seg_boundary(dev);
> -	int i;
> +	int i, ret;
> +	bool iova_contiguous = false;
>   
>   	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>   		iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
>   
> +	if (dma_get_iova_contiguous(dev) &&
> +	    iommu_capable(dev->bus, IOMMU_CAP_MERGING))
> +		iova_contiguous = true;
> +
>   	/*
>   	 * Work out how much IOVA space we need, and align the segments to
>   	 * IOVA granules for the IOMMU driver to handle. With some clever
> @@ -867,6 +872,13 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>   		sg_dma_len(s) = s_length;
>   		s->offset -= s_iova_off;
>   		s_length = iova_align(iovad, s_length + s_iova_off);
> +		/*
> +		 * Check whether the IOVA would be contiguous strictly if
> +		 * a device requires and the IOMMU driver has the capable.
> +		 */
> +		if (iova_contiguous && i > 0 &&
> +		    (s_iova_off || s->length != s_length))
> +			return 0;
>   		s->length = s_length;
>   
>   		/*
> @@ -902,8 +914,18 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>   	if (iommu_map_sg(domain, iova, sg, nents, prot) < iova_len)
>   		goto out_free_iova;
>   
> -	return __finalise_sg(dev, sg, nents, iova);
> +	ret = __finalise_sg(dev, sg, nents, iova);
> +	/*
> +	 * Check whether the sg entry is single if a device requires and
> +	 * the IOMMU driver has the capable.
> +	 */
> +	if (iova_contiguous && ret != 1)
> +		goto out_unmap_sg;

I don't see that just failing really gives this option any value. 
Clearly the MMC driver has to do *something* to handle the failure (plus 
presumably the case of not having IOMMU DMA ops at all), which begs the 
question of why it couldn't just do whatever that is anyway, without all 
this infrastructure. For starters, it would be a far simpler and less 
invasive patch:

	if (dma_map_sg(...) > 1) {
		dma_unmap_sg(...);
		/* split into multiple requests and try again */
	}

But then it would make even more sense to just have the driver be 
proactive about its special requirement in the first place, and simply 
validate the list before it even tries to map it:

	for_each_sg(sgl, sg, n, i)
		if ((i > 0 && sg->offset % PAGE_SIZE) ||
		    (i < n - 1 && sg->length % PAGE_SIZE))
			/* segment will not be mergeable */

For reference, I think v4l2 and possibly some areas of DRM already do 
something vaguely similar to judge whether they get contiguous buffers 
or not.

> +
> +	return ret;
>   
> +out_unmap_sg:
> +	iommu_dma_unmap_sg(dev, sg, nents, dir, attrs);
>   out_free_iova:
>   	iommu_dma_free_iova(cookie, iova, iova_len);
>   out_restore_sg:
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 91af22a..f971dd3 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -104,6 +104,7 @@ enum iommu_cap {
>   					   transactions */
>   	IOMMU_CAP_INTR_REMAP,		/* IOMMU supports interrupt isolation */
>   	IOMMU_CAP_NOEXEC,		/* IOMMU_NOEXEC flag */
> +	IOMMU_CAP_MERGING,		/* IOMMU supports segments merging */

This isn't a 'capability' of the IOMMU - "segment merging" equates to 
just remapping pages, and there's already a fundamental assumption that 
IOMMUs are capable of that. Plus it's very much a DMA API concept, so 
hardly belongs in the IOMMU API anyway.

All in all, I'm struggling to see the point of this. Although it's not a 
DMA API guarantee, iommu-dma already merges scatterlists as aggressively 
as it is allowed to, and will continue to do so for the foreseeable 
future (since it avoids considerable complication in the IOVA 
allocation), so if you want to make sure iommu_dma_map_sg() merges an 
entire list, just don't give it a non-mergeable list. And if you still 
really really want dma_map_sg() to have a behaviour of "merge to a 
single segment or fail", then that should at least be a DMA API 
attribute, which could in principle be honoured by bounce-buffering 
implementations as well.

And if the problem is really that you're not getting merging because of 
exposing the wrong parameters to the DMA API and/or the block layer, or 
that you just can't quite express your requirement to the block layer in 
the first place, then that should really be tackled at the source rather 
than worked around further down in the stack.

Robin.


More information about the iommu mailing list