[Bugme-new] [Bug 16911] New: Data corruption due to race between rpc-resend of O_DIRECT WRITE and server OK response.

bugzilla-daemon at bugzilla.kernel.org
Tue Aug 24 07:54:10 PDT 2010


https://bugzilla.kernel.org/show_bug.cgi?id=16911

           Summary: Data corruption due to race between rpc-resend of
                    O_DIRECT WRITE and server OK response.
           Product: File System
           Version: 2.5
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: NFS
        AssignedTo: trond.myklebust at fys.uio.no
        ReportedBy: iisaman at umich.edu
        Regression: No


Attached is a trace of a sequence that results in data corruption.  The
client sends a sync WRITE immediately before connection loss.  In this
trace, the server erroneously sends the NFS4_OK response immediately
upon reconnect, which causes the client to release the write buffer for
reuse.  The client application then overwrites the buffer while the
kernel is still in the middle of an RPC resend of the original WRITE,
so the resent data is corrupted.
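
The ordering described above can be illustrated with a toy model.  This
is only a sketch in Python, not kernel code: send_zero_copy() and
simulate() are hypothetical names, and the assumption is that the
transport reads the application's buffer directly at transmit time
(there is no page-cache copy with O_DIRECT), in two chunks.

```python
# Toy model of the race; not kernel code. All names are hypothetical.

def send_zero_copy(buf, wire, start, stop):
    """Append bytes [start, stop) of the live user buffer to the wire.

    Models the assumption that the transport reads the application's
    buffer at the moment each chunk is transmitted.
    """
    wire.extend(buf[start:stop])


def simulate(reply_arrives_mid_resend):
    buf = bytearray(b"A" * 8)   # application's O_DIRECT write buffer
    resend = bytearray()        # bytes the server sees for the resent WRITE

    # The original WRITE went out just before connection loss; after the
    # reconnect the client starts retransmitting the same request.
    send_zero_copy(buf, resend, 0, 4)

    if reply_arrives_mid_resend:
        # The reply to the first transmission completes the write, so the
        # application believes it owns the buffer again and reuses it.
        buf[:] = b"B" * 8

    # The transport finishes the retransmission from the same buffer.
    send_zero_copy(buf, resend, 4, 8)
    return bytes(resend)


if __name__ == "__main__":
    print(simulate(False))  # resend carries the original data
    print(simulate(True))   # resend mixes old and new data
```

In this model, simulate(True) returns a payload whose second half comes
from the application's new data, which corresponds to the corrupted
resend seen in the trace.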

Trond notes that this is a known, longstanding bug, so I am documenting
it here.  Although this instance was triggered by server misbehavior,
the race is a potential issue for O_DIRECT writes whenever an RPC resend
is triggered.  Trond also notes that the race would be hard to fix, as a
fix would involve communication between layers that is not currently
done.  One potential solution would be, in the case of O_DIRECT, to close
the connection in situations where the client would otherwise do an RPC
resend.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

