RE: OPEN_CONFIRM implementation issues

From: Noveck, Dave (Dave.Noveck@netapp.com)
Date: 11/08/02-01:16:58 PM Z

Next message: Mike Eisler: "OPEN_CONFIRM implementation issues"
Previous message: Spencer Shepler: "latest (last) draft -05"
Maybe in reply to: Mike Eisler: "OPEN_CONFIRM implementation issues"
Next in thread: eric kustarz: "Re: OPEN_CONFIRM implementation issues"

Message-ID: <C8CF60CFC4D8A74E9945E32CF096548A0708E8@SILVER.nane.netapp.com>
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
Subject: RE: OPEN_CONFIRM implementation issues
Date: Fri, 8 Nov 2002 11:16:58 -0800 

First, some philosophy. I tend to lean to a more restrictive implementation 
than Spencer. We have to obey the spec, but where it isn't clear, my tendency 
is to read it relatively strictly. I think that doing that helps clients find 
their bugs faster. This is also affected by the phase of protocol implementation
one is in. When initial servers are being implemented, a strict interpretation 
helps establish more uniform client behavior.  Later servers are probably best
advised to lean to the liberal side since they will have to deal with clients 
that may have been developed in a more permissive server environment.

So to understand how I relate this to OPEN_CONFIRM, let's first consider a 
further case e), that we may even agree on.  Suppose someone does a LOCK and 
then passes the stateid returned by LOCK to OPEN_CONFIRM. Now the spec doesn't
explicitly say that you must not do this. However, the point of OPEN_CONFIRM 
is to confirm an OPEN, and the discussion of OPEN_CONFIRM is in those terms.
Also, the synopsis lists the parameter as open_stateid.  Given that, doing 
an OPEN_CONFIRM with the stateid from LOCK looks to me like using an 
inappropriate type of stateid for a given operation, which I always treat
as BAD_STATEID.  For example, this is what I do when you do a CLOSE on a 
lock stateid.  My impression had been that the definition of BAD_STATEID
included a clause to deal with such a case, but I don't see it in the 
current spec.  Still, the only alternative is INVAL, which seems really bogus.
What do you return in the CLOSE(lock_stateid) case?  I guess you could treat
it as a no-op and return OK, but that seems very troublesome as it is 
particularly prone to hide client bugs.
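
To make the strict reading concrete, here is a minimal sketch of a server-side
check that rejects a stateid of the wrong kind rather than treating the request
as a no-op. The StateidKind enum, the EXPECTED_KIND table and the
check_stateid_kind() helper are hypothetical names made up for illustration;
only the status names and their numeric values come from the NFSv4 spec.

```python
from enum import Enum

# Status codes as defined by the NFSv4 spec.
NFS4_OK = 0
NFS4ERR_BAD_STATEID = 10025


class StateidKind(Enum):
    OPEN = "open"   # stateid returned by OPEN
    LOCK = "lock"   # stateid returned by LOCK


# Hypothetical table: the kind of stateid each operation is meant to take.
EXPECTED_KIND = {
    "OPEN_CONFIRM": StateidKind.OPEN,
    "CLOSE": StateidKind.OPEN,
}


def check_stateid_kind(op, kind):
    """Strict policy: a stateid of the wrong kind is an error, not a no-op."""
    expected = EXPECTED_KIND.get(op)
    if expected is not None and kind is not expected:
        return NFS4ERR_BAD_STATEID
    return NFS4_OK
```

Under this sketch, both OPEN_CONFIRM and CLOSE on a lock stateid come back as
BAD_STATEID, so the client hears about its bug on the first request.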

From a practical point of view, the client who is doing this (i.e. 
OPEN_CONFIRM of a lock stateid) has a bug and probably would want to know 
about it, and getting BAD_STATEID tells him. Either he is pointlessly 
adding RPCs, which is not good for him in high-latency environments (a 
V4 goal is better performance in such environments), or he meant to 
confirm the OPEN but is sending the wrong stateid. In either case, being 
liberal and treating the operation as a no-op just makes it harder to find 
the problem.

I consider cases c) and d) below in exactly the same way. The spec doesn't 
forbid them, but the entire discussion of OPEN_CONFIRM deals with confirming an 
open as the next operation for that owner sequence and indeed the spec discusses 
what happens if something other than a confirm is done (getting rid of the OPEN).  
So again, doing an OPEN_CONFIRM past the appropriate time seems like using a 
stateid which is not appropriate to the operation being performed (i.e. because 
it is not the one returned by OPEN).  So I return BAD_STATEID in this case.  Also, 
the same practical considerations apply as for e). The client has some sort of 
bug and probably wants to find out about it sooner rather than later.

I think a) and b) are closer cases, but I would still follow the same line of 
reasoning. The spec says that OPEN_CONFIRM is "used to confirm the sequence id 
usage for the first time that a open_owner is used by a client."  This is 
unfortunately ambiguous. There are three viewpoints from which to interpret 
"first time": omniscient observer, client, and server. It is particularly 
unfortunate that the spec's wording seems to lean in the direction of the
client-seen view, while the design of OPEN_CONFIRM is based on it being from the
server's point of view.  That is, confirmation may be required when it appears to 
the server that this is the first use of an owner, either because it in fact is, 
or because the server has lost the knowledge of that owner, due to recycling 
information about that owner. Given that the client sends the OPEN_CONFIRM, but 
that the necessity for it is based on server knowledge, the protocol provides 
the open-confirm-bit in the open response. So the design of OPEN_CONFIRM is 
based on the client using the open-confirm-bit to determine whether a confirm 
is to be done. Certainly, if the open-confirm-bit is set, the client must do 
one. However, I would also argue that where a confirm is not required, the 
OPEN is precisely in the same state as in c) or d).  In those cases, because the 
open had been confirmed, I consider OPEN_CONFIRM as inappropriate for that stateid 
and I think it is reasonable to treat OPEN where confirmation is not required 
in exactly the same way. The server is saying this is not the initial sequence 
for a new owner, making OPEN_CONFIRM an inappropriate operation since the spec
discusses the use of OPEN_CONFIRM only in the context of such initial use of 
a lockowner and none of the discussion makes any sense in any other context.

Again, in practical terms, treating this as a no-op is liable to hide bugs.  If 
a client is sending confirm requests when the open-confirm-bit is not set, then 
it's pretty likely that he may not send them when the open-confirm-bit is set. 
While it is possible that he has coded something on the order of "if (open_confirm_bit || 
phase_of_the_moon == FULL) open_confirm()", the more likely cases are that he is 
always sending open_confirm after open (grossly inefficient) or even more likely,
that he misinterpreted "first use" in the spec to mean the *client's* first use.
If that's the case, then at some point in the future, when, after a period of
inactivity, the owner is recycled, he isn't going to do an open-confirm, even
when the server requires it, since from the client's point of view this is not
a new owner.  Since the logic of open-confirm is that the server's view prevails,
enforcing it from the beginning just seems better.  Of course, writing the 
server, I would think that :-)
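
The recycling scenario can be sketched as follows. This is a toy model under
stated assumptions, not any real client or server: ToyServer, its methods, and
the two confirm_per_*() helpers are hypothetical names invented here. Only
OPEN4_RESULT_CONFIRM and its value are taken from the NFSv4 spec. The point is
that the two rules agree on the very first OPEN and diverge once the server has
recycled the owner.

```python
# Flag bit in the OPEN result flags, as named in the NFSv4 spec.
OPEN4_RESULT_CONFIRM = 0x00000002


class ToyServer:
    """Hypothetical server that remembers which owners it has confirmed."""

    def __init__(self):
        self.confirmed_owners = set()

    def open(self, owner):
        # Ask for confirmation whenever the server has no record of the owner.
        return 0 if owner in self.confirmed_owners else OPEN4_RESULT_CONFIRM

    def open_confirm(self, owner):
        self.confirmed_owners.add(owner)

    def recycle(self, owner):
        # The server discards its state for an idle owner.
        self.confirmed_owners.discard(owner)


def confirm_per_server_view(rflags):
    """Client rule driven by the open-confirm-bit in the OPEN response."""
    return bool(rflags & OPEN4_RESULT_CONFIRM)


def confirm_per_client_view(owner, owners_used):
    """Mistaken client rule: confirm only on the client's own first use."""
    return owner not in owners_used


server, owners_used = ToyServer(), set()

# First OPEN for this owner: both rules say a confirm is needed.
rflags = server.open("owner-1")
first_use_agrees = (confirm_per_server_view(rflags)
                    == confirm_per_client_view("owner-1", owners_used))
server.open_confirm("owner-1")
owners_used.add("owner-1")

# After a period of inactivity the server recycles the owner, so the
# next OPEN carries the open-confirm-bit again.
server.recycle("owner-1")
rflags = server.open("owner-1")
server_wants_confirm = confirm_per_server_view(rflags)
client_would_confirm = confirm_per_client_view("owner-1", owners_used)
```

At the end of the run the server wants a confirm and the client-view rule
declines to send one, which is exactly the failure that a permissive server
would have let the client carry into the field.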

I think this is an area where the spec is not very clear and opinions will 
differ.  I'm not sure that it is so bad that servers differ as long as all of
the cases that they differ on are things that clients don't really want to
do.

-----Original Message-----
From: Eric Kustarz [mailto:Eric.Kustarz@eng.sun.com]
Sent: Thursday, November 07, 2002 5:53 PM
To: nfsv4-wg@sunroof.eng.sun.com
Subject: OPEN_CONFIRM implementation issues

There seems to be an unresolved issue with respect to OPEN_CONFIRM... what 
if the server doesn't ask for an OPEN_CONFIRM but the client sends one 
anyway? What should the server do?

Dave Noveck identified four possible cases:

a) The file was just opened and you were not told to open-confirm.

b) The file was just opened and you were told not to open-confirm and it is 
wrong to open-confirm (create or delegation present).

c) The file was opened and already the open was confirmed but no other IO 
was done.

d) The file was open and confirmed if necessary and IO was done.

Here's how Dave is handling those cases:
"
So I'm saying BAD_STATEID on all of a-d, with the exception that if the 
stateid takes you to the owner and the seqid indicates that this is a DUP 
then you would get OK.
"
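
Read as code, the handling quoted above might look like the sketch below. It is
an illustration only: the OwnerState record and open_confirm_strict() are
hypothetical names, the stateid lookup is assumed to have already led to the
owner, and seqid ordering checks other than the DUP test are left out.

```python
from dataclasses import dataclass

# Status codes as defined by the NFSv4 spec.
NFS4_OK = 0
NFS4ERR_BAD_STATEID = 10025


@dataclass
class OwnerState:
    """Hypothetical per-owner record kept by the server."""
    last_seqid: int         # seqid of the last request processed
    last_op: str            # name of that request
    last_status: int        # status returned for it
    awaiting_confirm: bool  # True only after an OPEN that asked for a confirm


def open_confirm_strict(owner, seqid):
    # DUP: a retransmission of the OPEN_CONFIRM just processed gets the
    # cached reply back.
    if seqid == owner.last_seqid and owner.last_op == "OPEN_CONFIRM":
        return owner.last_status
    # Cases a) through d): there is no OPEN waiting to be confirmed.
    if not owner.awaiting_confirm:
        return NFS4ERR_BAD_STATEID
    # The expected case: confirm the OPEN and remember the reply.
    owner.awaiting_confirm = False
    owner.last_seqid = seqid
    owner.last_op = "OPEN_CONFIRM"
    owner.last_status = NFS4_OK
    return NFS4_OK
```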

The official SUN stance is to apply the Shepler principle of the "server 
should be flexible in what it accepts".  So we are currently having the 
server return NFS4_OK for a) and c).  We would like to see why in b) and d) 
an OPEN_CONFIRM is considered "wrong" to occur. Dave?

Dave also had this side note:
"
One final point.  The spec lists OLD_STATEID in the error list, although 
I'm hard-pressed to figure out when you might actually return it.  If you 
are returning OK to c-d, then this opens the door to OLD_STATEID as well, 
although returning OK to c-d seems very weird to me.
"

eric


This archive was generated by hypermail 2.1.2 : 03/04/05-01:50:28 AM Z CST