[GIT PULL] AMD IOMMU updates for 2.6.28-rc5

FUJITA Tomonori fujita.tomonori at lab.ntt.co.jp
Wed Nov 19 20:25:15 PST 2008


On Wed, 19 Nov 2008 10:25:44 +0100
Joerg Roedel <joro at 8bytes.org> wrote:

> On Wed, Nov 19, 2008 at 03:05:24PM +0900, FUJITA Tomonori wrote:
> > On Tue, 18 Nov 2008 16:43:22 +0100
> > Joerg Roedel <joerg.roedel at amd.com> wrote:
> > 
> > > Joerg Roedel (4):
> > >       AMD IOMMU: add parameter to disable device isolation
> > >       AMD IOMMU: enable device isolation per default
> > >       AMD IOMMU: fix fullflush comparison length
> > >       AMD IOMMU: check for next_bit also in unmapped area
> > > 
> > >  Documentation/kernel-parameters.txt |    4 +++-
> > >  arch/x86/kernel/amd_iommu.c         |    2 +-
> > >  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
> > >  3 files changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > The most important change in these patches is that they enable device
> > > isolation by default. Tests have shown that some drivers have bugs and
> > > double-free DMA memory.
> > 
> > What drivers? We need to fix them if they are mainline drivers.
> 
> I have found issues only in network drivers so far. With the in-kernel
> ixgbe driver I see IO_PAGE_FAULTS, and the ixgbe version from the Intel
> website has a double-free bug when unloading the driver or changing the
> device MTU. The same problem was found with the Broadcom NetXtreme II
> driver.

I see, thanks. Have you already reported the bugs to netdev?


> > > This can lead to data corruption with a
> > > hardware IOMMU when multiple devices share the same protection domain.
> > > Therefore device isolation should be enabled by default.
> > 
> > Hmm, is the change just a workaround for the bugs? If so, I'm not
> > sure it's a good idea. We need to fix the buggy drivers anyway. And
> > device isolation is not free; e.g. it uses more memory than sharing
> > a protection domain. I guess that more people prefer sharing a
> > protection domain by default. It had been the default option for AMD
> > IOMMU until you hit the bugs. IIRC, VT-d also shares a protection
> > domain by default. It would be nice to avoid surprising users by having
> > the two virtualization IOMMUs work in a similar way.
> 
> We can't test all drivers for those bugs before 2.6.28 is released.
> And these bugs can corrupt data, for example when a driver frees dma
> addresses allocated by another driver and these addresses are then
> reallocated.
> The only way to protect the drivers from each other is to isolate them
> in different protection domains. The AMD IOMMU driver prints a WARN_ON()
> if a driver frees dma addresses not yet mapped. This triggered with the
> bnx2 and the ixgbe driver.
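To make the failure mode above concrete, here is a minimal userspace sketch, not the AMD IOMMU code. It models a protection domain as a small bitmap of I/O pages and replays a double free against either one shared domain or two isolated ones. All names (struct domain, domain_map, domain_unmap, victim_survives_double_free) and the aperture size are made up for illustration.

```c
#include <stdbool.h>

#define DOMAIN_PAGES 8   /* tiny aperture, arbitrary for this sketch */

/* A protection domain modeled as a bitmap of allocated I/O pages. */
struct domain {
    bool mapped[DOMAIN_PAGES];
};

/* Allocate the lowest free I/O page; returns its index or -1 if full. */
int domain_map(struct domain *d)
{
    for (int i = 0; i < DOMAIN_PAGES; i++) {
        if (!d->mapped[i]) {
            d->mapped[i] = true;
            return i;
        }
    }
    return -1;
}

/* Free an I/O page. Returns false (the WARN_ON case) if it was not mapped. */
bool domain_unmap(struct domain *d, int page)
{
    if (page < 0 || page >= DOMAIN_PAGES || !d->mapped[page])
        return false;
    d->mapped[page] = false;
    return true;
}

/*
 * Replay the buggy sequence: driver A maps and unmaps a page, driver B
 * maps a page, then A unmaps its stale address a second time.
 * Returns true if B's mapping is still intact afterwards.
 */
bool victim_survives_double_free(struct domain *dom_a, struct domain *dom_b)
{
    int a = domain_map(dom_a);
    domain_unmap(dom_a, a);      /* legitimate free */
    int b = domain_map(dom_b);   /* may reuse A's old address */
    domain_unmap(dom_a, a);      /* the double free */
    return dom_b->mapped[b];
}
```

Passing the same domain for both drivers models the shared case: B is handed A's old address, so A's second unmap silently tears down B's mapping, and no warning fires because the address is mapped again at that point. With two separate domains the second unmap fails inside A's own domain and B is untouched.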

It would be better to add such a WARN_ON to VT-d as well. VT-d is
everywhere nowadays, so I think there are some developers who can test
these drivers with VT-d.


> And the data corruption is real; it ate the root-fs of my test box
> once.
> I agree that we need to fix the drivers. I plan to implement some debug
> code which allows driver developers to detect those bugs even if they
> have no IOMMU in the system.

It's not so hard to add such a debug feature to swiotlb, I guess.
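
As a rough illustration of what such a check could look like, here is a userspace sketch. It is not swiotlb or kernel code; the table layout and the names track_map and track_unmap are hypothetical. The idea is to record every live mapping together with its owner and to flag any unmap that does not match a live entry, which catches both double frees and frees of another device's address without needing an IOMMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_TRACKED 64   /* arbitrary table size for this sketch */

/* One live DMA mapping: which device owns it and what range it covers. */
struct dma_track_entry {
    bool     in_use;
    int      dev_id;     /* hypothetical device identifier */
    uint64_t dma_addr;
    size_t   size;
};

static struct dma_track_entry table[MAX_TRACKED];

/* Record a new mapping. Returns 0 on success, -1 if the table is full. */
int track_map(int dev_id, uint64_t dma_addr, size_t size)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (!table[i].in_use) {
            table[i] = (struct dma_track_entry){ true, dev_id, dma_addr, size };
            return 0;
        }
    }
    return -1;
}

/*
 * Check an unmap request against the table. Returns 0 and releases the
 * entry if dev_id owns a live mapping with exactly this address and size;
 * otherwise prints a warning and returns -1.
 */
int track_unmap(int dev_id, uint64_t dma_addr, size_t size)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        struct dma_track_entry *e = &table[i];

        if (e->in_use && e->dev_id == dev_id &&
            e->dma_addr == dma_addr && e->size == size) {
            e->in_use = false;
            return 0;
        }
    }
    fprintf(stderr, "dma-track: dev %d unmaps 0x%llx (+%zu) with no live mapping\n",
            dev_id, (unsigned long long)dma_addr, size);
    return -1;
}
```

A real implementation would need locking and a hash keyed on the address instead of a linear scan, but the check itself is the same.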

