From: Noveck, Dave (Dave.Noveck@netapp.com)
Date: 03/16/01-11:56:31 AM Z
Message-ID: <8C610D86AF6CD4119C9800B0D0499E331A6F34@red.nane.netapp.com>
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
Subject: RE: idea - atomic append for NFS writes
Date: Fri, 16 Mar 2001 09:56:31 -0800

> As Peter points out, atomic appends are probably (I say probably,
> because it hasn't been proven they can't be done, its just that every
> proposal has been found to have holes) impossible, although if the
> server has delegated write capability to the client, as long as the
> delegation is in force, appends come for free.

In that case, they are essentially local writes.

I'm not sure that I'm prepared to give up on append writes yet. The
fact that V4 has embraced statefulness may open up some other
possibilities. We have stuff that we do exactly-once through seqids
(i.e. locking requests) so there may be a possibility of building on
that approach. The problem area is server reboot since the current
sequencing mechanisms take advantage of the fact that for locking,
server reboot resets everything.

> In Mark's proposal, what I envision occuring when (1) there is no
> delegation in force, and (2) high load on the server such that the
> duplicate request cache is blown, is duplicate records from the same
> client written at different offsets. I think, and it has been the
> consensus of the NFS community during the long NFSv3 and NFSv4 processes,
> that this is worse than the current situation, where only if write
> sharing is going on does the file get corrupted during appends.

I think this is only worth doing if we can do it right (i.e. files
don't get corrupted). Users take such a dim view of that.

> Peter's notion of "reservations" at the end of the file is interesting,
> but (1) as he points out, persistent reservations are required, and (2)
> then there's the problem of what if the client doesn't follow through
> with the write. To me, those two issues make his append proposal less
> palatable than the status quo.
>
> The use of a COMPOUND with VERIFY (file size) followed by WRITE comes
> close, but because there's no atomicity guarantee between operations,
> it comes up short. Still, it might be a good idea for NFS client's
> simulating append to use VERIFY to reduce corruption during
> write sharing. But it's early (for me) this morning, and I'm
> probably missing something.

I may not be understanding exactly the form of COMPOUND you are
anticipating using, but it seems that COMPOUND with VERIFY is subject
to exactly the same issue that you identified with Mark's proposal:
duplicate records even in the absence of sharing.

If I do a GETATTR of the file size and then do a VERIFY-WRITE compound,
then if the verify fails I am going to loop back and do another GETATTR
followed by a VERIFY-WRITE assuming that somebody else got his record
in before me. But in the presence of retries, it may have been that
the WRITE that got in was an earlier instance of *my* VERIFY-WRITE
which succeeded, so the second retry (actually the first retry after
the initial try) fails in the VERIFY and I go off to write my record
again. Since, in this mode of operation I'm doing twice as many
requests as in Mark's proposal, I'm going to hit the problem even
earlier.

I know I'm being heretical here, but I'm beginning to think that NFS
may be at the point of exhausting the universe of things that can be
done effectively with at-least-once semantics. Unless we are prepared
to keep giving people things (like append writes) that work most of
the time and tell them to stop complaining, we may have to bite the
bullet and solve the general problem.

Exactly-once Semantics -- Accept No Substitutes
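
The retry scenario above can be made concrete with a toy in-memory
model. This is only a sketch under stated assumptions, not NFS client
or server code: ToyServer, verify_write, append_seq and the
lost_replies parameter are all hypothetical names invented for
illustration, the server is assumed to have no duplicate request
cache, and a "lost reply" is modeled as the client resending the same
request. The seqid variant is one possible shape of building on
seqids, not a worked-out proposal.

```python
class ToyServer:
    """In-memory stand-in for a file server (hypothetical, not NFS)."""

    def __init__(self):
        self.data = b""
        # client id -> (seqid, cached result); held in memory only,
        # so a server reboot would lose it.
        self.last_seqid = {}

    def getattr_size(self):
        return len(self.data)

    def verify_write(self, expected_size, record):
        """Model of VERIFY(size) followed by WRITE at that offset."""
        if len(self.data) != expected_size:
            return False  # VERIFY failed, WRITE not attempted
        self.data += record
        return True

    def append_seq(self, client, seqid, record):
        """Hypothetical append keyed by seqid: a replay returns the
        cached result instead of appending again."""
        cached = self.last_seqid.get(client)
        if cached is not None and cached[0] == seqid:
            return cached[1]
        offset = len(self.data)
        self.data += record
        self.last_seqid[client] = (seqid, offset)
        return offset


def append_with_verify(server, record, lost_replies):
    """GETATTR, then VERIFY+WRITE; resend while replies are lost."""
    while True:
        size = server.getattr_size()
        while True:
            ok = server.verify_write(size, record)
            if lost_replies > 0:
                lost_replies -= 1  # reply lost: resend same request
                continue
            break
        if ok:
            return
        # VERIFY failed: assume someone else appended, start over.


def append_with_seqid(server, client, seqid, record, lost_replies):
    """Resend the same (client, seqid) request while replies are lost."""
    while True:
        offset = server.append_seq(client, seqid, record)
        if lost_replies > 0:
            lost_replies -= 1
            continue
        return offset


if __name__ == "__main__":
    s = ToyServer()
    append_with_verify(s, b"rec", lost_replies=1)
    print(s.data)  # the record appears twice, with no sharing at all

    t = ToyServer()
    append_with_seqid(t, "client-1", 1, b"rec", lost_replies=1)
    print(t.data)  # the record appears once
```

With a single lost reply and no other writer, the VERIFY-WRITE loop
writes the record twice, because the retransmission fails its VERIFY
against the client's own earlier write. The seqid variant avoids that
only while the server still holds last_seqid, which is exactly the
server reboot problem noted earlier.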