[Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-Linux servers

bugme-daemon at bugzilla.kernel.org bugme-daemon at bugzilla.kernel.org
Thu Aug 28 11:41:08 PDT 2008


http://bugzilla.kernel.org/show_bug.cgi?id=11448

           Summary: NFS client has inconsistent write flushing to non-Linux
                    servers
           Product: File System
           Version: 2.5
     KernelVersion: 2.6.22.15
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: NFS
        AssignedTo: trond.myklebust at fys.uio.no
        ReportedBy: doug at will.to


Latest working kernel version: N/A (works on 2.6.18 with a Linux NFS server,
but we cannot continue to use that kernel for various reasons)
Earliest failing kernel version: N/A (another party experiencing the same bug
against non-Linux NFS servers reports that 2.6.18, 2.6.24, and 2.6.25 also
fail). Not currently known to be reproducible against NetApp, but this is not
authoritative (not having seen the bug does not guarantee that it is absent)
Distribution: CentOS 4.6
Hardware Environment: Supermicro Twin, 2 quad-core Harpertown CPUs, 16 GB RAM.
Software Environment: CentOS 4.6
Problem Description: 

An NFS client writes to a Sun Solaris 10 U4 server.
At some point, a portion of the writer's output file is missing data (it shows
as NUL bytes to another NFS client running tail -f on the file being written).
We confirmed that the file as it exists on the NFS server is sparse, with
missing bytes. The gaps are not necessarily a multiple of 512 or 1024: one
sample is a gap of 3818 bytes, another is 1895 bytes, another is 423 bytes.
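
For anyone trying to confirm similar gaps, here is a minimal sketch of how the
NUL-filled regions could be located from a second client or on the server. It
is not part of the original report: the function name find_null_runs and its
parameters are hypothetical, and it assumes that legitimate data in the file
does not itself contain long runs of NUL bytes.

```python
def find_null_runs(path, min_len=1, chunk_size=1 << 20):
    """Return a list of (offset, length) for each run of NUL bytes in a file.

    Hypothetical helper for illustration only; min_len filters out short
    runs that may be legitimate data.
    """
    runs = []
    run_start = None  # file offset where the current NUL run began
    pos = 0           # file offset of the first byte of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, b in enumerate(chunk):
                if b == 0:
                    if run_start is None:
                        run_start = pos + i
                elif run_start is not None:
                    length = pos + i - run_start
                    if length >= min_len:
                        runs.append((run_start, length))
                    run_start = None
            pos += len(chunk)
    # A run that extends to the end of the file is reported as well.
    if run_start is not None and pos - run_start >= min_len:
        runs.append((run_start, pos - run_start))
    return runs
```

Calling find_null_runs on the output file with a min_len of a few hundred
bytes should list gaps of the sizes mentioned above, if they are present.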

If you read the entire file from the NFS client doing the writing, the
non-flushed writes are immediately flushed to the server, followed by an NFS3
commit operation. The data can then be seen on all other NFS clients.

If you only open the file, no flush.
If you open and close the file, no flush.
If you open the file and read at the beginning (far before the data that is
outstanding), *usually* no flush (there was one case where it did flush).
If you read at another position in the file, no flush (other than as
indicated above).
If you read at the offset where the bytes are null, the NFS client issues the
write and an NFS commit to the server (truss output available).
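
The last case above can be sketched as a read of a specific byte range, run on
the NFS client that is doing the writing. This is an illustration only, not
the procedure actually used in the report: the name read_range is
hypothetical, and whether such a read triggers the flush depends on the client
kernel, so it should not be taken as a confirmed workaround.

```python
import os


def read_range(path, offset, length):
    """Read up to `length` bytes starting at `offset`, without seeking.

    Hypothetical helper: uses os.pread so that only the requested range
    is read, matching the "read at the indicated offset" case.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        remaining = length
        while remaining > 0:
            data = os.pread(fd, remaining, offset)
            if not data:  # reached end of file
                break
            parts.append(data)
            offset += len(data)
            remaining -= len(data)
        return b"".join(parts)
    finally:
        os.close(fd)
```

For example, read_range(path, gap_offset, gap_length) reads exactly the region
that appears as NUL bytes to other clients, where gap_offset and gap_length
stand for a gap that has already been located.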

The missing blocks may flush themselves after an undefined period of time,
which can be hours. Our runs last days.

Steps to reproduce:

A chemist running NAMD sees frequent cases of this in his output trajectory
index files. We don't have an exact sequence of steps to reproduce. After I
file this ticket I will give the ticket number to another person I know at a
different company who is experiencing the same problem as described above (to
the best of my knowledge).



