How to resolve an issue in swiotlb environment?

Alan Stern stern at rowland.harvard.edu
Mon Jun 10 18:46:25 UTC 2019


On Mon, 10 Jun 2019, Christoph Hellwig wrote:

> Hi Yoshihiro,
> 
> sorry for not taking care of this earlier, today is a public holiday
> here and thus I'm not working much over the long weekend.
> 
> On Mon, Jun 10, 2019 at 11:13:07AM +0000, Yoshihiro Shimoda wrote:
> > I have another way to avoid the issue, though it doesn't seem like a good one...
> > According to the commit that added blk_queue_virt_boundary() [3],
> > this is needed for vhci_hcd as a workaround, so if we avoid calling it
> > for the xhci-hcd driver, the issue disappears. What do you think?
> > JFYI, I pasted a tentative patch at the end of this email [4].
> 
> Oh, I hadn't even looked at why USB uses blk_queue_virt_boundary, and it
> seems like the usage is wrong, as it doesn't follow the same rules as
> all the others.  I think your patch goes in the right direction,
> but instead of comparing an hcd name it needs to be keyed off a flag
> set by the driver (I suspect there is one indicating native SG support,
> but I can't quickly find it), and we need an alternative solution
> for drivers that don't set it, like vhci.  I suspect just limiting the
> entire transfer size to something that works for a single packet
> for them would be fine.

Christoph:

In most of the different kinds of USB host controllers, the hardware is
not capable of assembling a packet out of multiple buffers at arbitrary
addresses.  As a matter of fact, xHCI is the only kind that _can_ do 
this.

In some cases, the hardware can assemble packets provided each buffer
other than the last ends at a page boundary and each buffer other than
the first starts at a page boundary (Intel would say the buffers are
"virtually contiguous"), but this is a rather complex rule and we don't
want to rely on it.  Plus, in other cases the hardware _can't_ do this.
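
To make the "virtually contiguous" rule concrete, here is a minimal
user-space sketch of the check it implies.  The struct, the function
name, and the 4096-byte page size are all made up for illustration;
this is not kernel code and not what any HCD driver actually runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Hypothetical stand-in for one scatter-gather entry. */
struct sketch_buf {
	uint64_t addr;	/* start address of the buffer */
	size_t len;	/* length in bytes */
};

/*
 * True if every buffer other than the first starts on a page boundary
 * and every buffer other than the last ends on one.
 */
static bool sketch_virt_contiguous(const struct sketch_buf *b, size_t n)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;

	assert(b != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i > 0 && (b[i].addr & mask) != 0)
			return false;	/* start is not page-aligned */
		if (i + 1 < n && ((b[i].addr + b[i].len) & mask) != 0)
			return false;	/* end is not page-aligned */
	}
	return true;
}
```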

Instead, we want the SG buffers to be set up so that each one (except 
the last) is an exact multiple of the maximum packet size.  That way, 
each packet can be assembled from the contents of a single buffer and 
there's no problem.
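
As a rough illustration of that requirement (a sketch only; the
function name and the bare array of lengths are invented for this
example and do not correspond to any kernel interface):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if every buffer length except the last is an exact multiple of
 * maxpacket, so that no packet has to span two buffers.
 */
static bool sketch_sg_ok_for_maxpacket(const size_t *len, size_t n,
				       size_t maxpacket)
{
	assert(maxpacket > 0);
	for (size_t i = 0; i + 1 < n; i++) {
		if (len[i] % maxpacket != 0)
			return false;	/* a packet would straddle buffers */
	}
	return true;	/* the last buffer may have any length */
}
```

With maxpacket = 512, lengths of 1024, 2048, 100 pass, while 1024,
700, 512 do not, because the 700-byte buffer would leave a packet
split across two buffers.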

The maximum packet size depends on the type of USB connection.
Typical values are 1024, 512, or 64.  For the bulk endpoints at issue
here it's always a power of two and it's smaller than 4096.  Therefore
we simplify the problem even further
by requiring that each SG buffer in a scatterlist (except the last one)
be a multiple of the page size.  (It doesn't need to be aligned on a 
page boundary, as far as I remember.)
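
The reasoning behind the simplification can be sketched like this,
assuming a 4096-byte page (the names are hypothetical and exist only
for this example): a power of two no larger than the page size divides
the page size, so any multiple of the page size is also a multiple of
the maximum packet size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size, illustration only */

static bool demo_is_pow2(size_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * True if maxpacket is a power of two no larger than the page size,
 * in which case it divides DEMO_PAGE_SIZE exactly.
 */
static bool demo_page_rule_covers(size_t maxpacket)
{
	return demo_is_pow2(maxpacket) && maxpacket <= DEMO_PAGE_SIZE;
}

/* The simplified rule: every length but the last is a page multiple. */
static bool demo_sg_page_multiples(const size_t *len, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++) {
		if (len[i] % DEMO_PAGE_SIZE != 0)
			return false;
	}
	return true;
}
```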

That's why the blk_queue_virt_boundary usage was added to the USB code.  
Perhaps it's not the right way of doing this; I'm not an expert on the
inner workings of the block layer.  If you can suggest a better way to
express our requirement, that would be great.

Alan Stern

PS: There _is_ a flag saying whether an HCD supports SG.  But what it
means is that the driver can handle an SG list that meets the
requirement above; it doesn't mean that the driver can reassemble the
data from an SG list into a series of bounce buffers in order to meet
the requirement.  We very much do not want to do that, especially since
the block layer should already be capable of doing it for us.


