Re: fsync() fails under NFS, right?


From: Eric Werme USG (werme@zk3.dec.com)
Date: 09/14/99-03:54:32 PM Z


From: Eric Werme USG <werme@zk3.dec.com>
Message-Id: <199909142054.QAA0000745099@anw.zk3.dec.com>
Subject: Re: fsync() fails under NFS, right? 
Date: Tue, 14 Sep 1999 16:54:32 -0400


   
   It occurred to me this is the meat of the
   argument.  You say it's all client business.  OK,
   my client wants to do fsync().  Now your server
   receives WRITE requests.  HOW does my client know,
   whether your server WROTE THROUGH to disk or
   CACHED my block?

If the server is properly written and the client sends V2 writes or V3
synchronous writes, the server WILL NOT REPLY UNTIL THE DATA IS ON THE DISK. 
Period.  If you are using a stock Linux server, you are using NFS V2 and I
believe the Linux server does not wait for writes to reach the disk before
replying.  That's why many people see much better performance on Linux
servers than from vendors you pay money to.  Fast, cheap, or reliable - pick
two.  If you're staking your business on Linux, you need a QA department
patterned after the major vendors' departments.  This mailing list is
definitely not a Linux support hotline!

   How can I be sure that if your
   server crashes, the thing written before an
   fsync() doesn't die asleep in that remote cache?

In the cases above (V2 writes or V3 synchronous writes), the client keeps
retransmitting the write until our server has rebooted, finally gotten the
data to disk, and replied.  Um, our servers don't crash.  :-)

If you're using V3 async writes, then the client has no way of telling
from the write reply whether the data is in the cache or on the disk.
However, fsync() will block until all the data is on the disk.  The client
will send a commit at some point (e.g. when the application calls
fsync()).  A "write verifier" is used to determine whether the server
rebooted between replying to the write and replying to the commit.  If the
server rebooted, then the client will retransmit the writes that the commit
may have missed.  The retransmission is done either with synchronous
writes, or with asynchronous writes followed by another commit, which
checks the write verifier again.
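
To make the write verifier logic above concrete, here is a toy model in
Python.  It is only a sketch, not real NFS code: ToyServer, client_fsync and
the "pending" dictionary are names made up for illustration, and the
verifier is assumed to be a simple boot counter (a real server picks its own
verifier value).

```python
class ToyServer:
    """Toy stand-in for a V3 server; async writes land in a volatile cache."""

    def __init__(self):
        self.disk = {}       # offset -> data that survives a reboot
        self.cache = {}      # offset -> data that is lost on a reboot
        self.verifier = 1    # assumption: changes every time the server boots

    def write_unstable(self, offset, data):
        """Async write: cache the data and reply with the current verifier."""
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        """Flush the cache to disk and reply with the current verifier."""
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def reboot(self):
        """Crash and restart: cached data is gone, the verifier changes."""
        self.cache.clear()
        self.verifier += 1


def client_fsync(server, pending):
    """Commit, then resend every write whose verifier no longer matches.

    pending maps offset -> (data, verifier seen in the write reply).
    """
    while True:
        commit_verifier = server.commit()
        stale = {off: data for off, (data, seen) in pending.items()
                 if seen != commit_verifier}
        if not stale:
            return
        # The server rebooted after replying to these writes, so the commit
        # may have missed them.  Send them again and commit once more.
        for off, data in stale.items():
            pending[off] = (data, server.write_unstable(off, data))
```

If the toy server reboots between the writes and the commit, the first
commit comes back with a different verifier, the client resends the data,
and the second commit puts it on disk.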

We've been through this many times now.  Please learn this; I'm not typing
it again, but I might mail you the V2 and V3 specs....

   It seems to us our NFS servers have huge caches
   and silently swallow our small 100 MB file.  :-)
   (All our machines, Linux or Sun, start at 0.5 GB
   RAM.)

IT'S TIME FOR SOME PROOF.  Take a Sun-Sun system and watch it write a
small file with snoop.  It will work: you will see substantial delays
between commit request and response.  A Linux client with a Sun server
will probably work if fsync() isn't a no-op.  A Linux server will probably
reply too quickly in all cases.
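
As a rough client-side complement to the snoop trace, the sketch below
times write() and fsync() separately.  The function name and the default
size are made up for illustration, and the timings only hint at what the
server is doing; the packet trace is still the real evidence.

```python
import os
import time


def time_write_and_fsync(path, size=1 << 20):
    """Return (seconds spent in write(), seconds spent in fsync())."""
    data = bytes(size)  # size zero bytes; the content does not matter here
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        t0 = time.perf_counter()
        view = memoryview(data)
        while view:                  # os.write() may accept only part
            written = os.write(fd, view)
            view = view[written:]
        t1 = time.perf_counter()
        os.fsync(fd)                 # should block until the data is stable
        t2 = time.perf_counter()
    finally:
        os.close(fd)
    return t1 - t0, t2 - t1
```

Point the path at a file on the NFS mount.  If fsync() on a file of several
megabytes returns almost instantly, that is a reason to go look at the
trace.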

If you want to prove a failure, start writing a big file, and turn off
the server.  Turn it back on and let it reboot.  Let the client finish,
then verify the file on the server.  Try again, but push reset during the
write.  Repeat until you have a corrupted file, then send it and
the snoop/tcpdump trace to your vendor's support department.
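
One way to make the "verify the file" step mechanical is to write a
predictable pattern and check it afterwards.  This is a sketch only: the
function names, the block size and the pattern (each block filled with its
own index) are assumptions made for illustration.

```python
import os
import struct


def block_bytes(index, block_size):
    """A block filled with its own index as repeated 8-byte integers."""
    return struct.pack(">Q", index) * (block_size // 8)


def write_pattern(path, nblocks, block_size=8192):
    """Run on the client: write the pattern, then fsync() before closing."""
    if block_size % 8:
        raise ValueError("block_size must be a multiple of 8")
    with open(path, "wb") as f:
        for i in range(nblocks):
            f.write(block_bytes(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, nblocks, block_size=8192):
    """Run on the server: list the blocks that are missing or wrong."""
    bad = []
    with open(path, "rb") as f:
        for i in range(nblocks):
            if f.read(block_size) != block_bytes(i, block_size):
                bad.append(i)
    return bad
```

Run write_pattern() on the client against the NFS mount while you power
cycle or reset the server, then run find_bad_blocks() on the server's local
copy once the client has finished.  A non-empty list after a successful
fsync() is the corrupted file to send in with the trace.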

	-Ric Werme


