From: Ted Anderson (TedAnderson@mindspring.com)
Date: 10/03/02-07:20:42 AM Z
Message-ID: <3D9C361A.9060007@mindspring.com>
Date: Thu, 03 Oct 2002 08:20:42 -0400
From: Ted Anderson <TedAnderson@mindspring.com>
Subject: Re: Replication/Migration issues list

This note addresses some of the harder questions about scope and goals.

Regarding file system state, it seems to me that there are three cases:
read-only replicas, writable replicas (e.g. a fully populated cache) and
migration. All these require addressing the question of file handles
(and file ids), since this involves issues of file system implementation
which may differ from one replica/host to another.

The first is distinguished because it has very limited (or absent)
requirements for managing opens, shares and delegations. For the latter
two, however, this management is important because of the need to
coordinate writers with other users. Consistency of writable replicas
depends upon communicating information about multiple readers and
writers with a central coordinating authority. On the other hand,
migration involves transferring this coordination from one authority to
another. These latter two activities seem to require different
mechanisms. Thus, I conclude there is a need for three mechanisms to
support the full range of file sharing functionality.

As mentioned in the previous message in connection with right-sizing
filesystems, I think we should move towards depending upon pathnames and
deprecating file ids, hardlinks, and ZLC files, though directory renames
remain an issue. This shift, already begun by the introduction of
volatile file handles, eases the job of transferring file handles and
file ids between heterogeneous servers.

For writable replicas or caches, the NFSv4 client/server protocol
contains a suite of mechanisms that are arguably adequate to consistency
management. I don't see any moves to add this capability to the repl-mig
protocol.
Is support for writable replicas (or even strongly consistent read-only
replicas which would benefit from something like delegation) out of
scope for the repl-mig protocol?

To support the migration of filesystems, there seem to be two minds on
the subject of this meta file system information. Does the repl-mig
protocol rely on clients to reestablish state with the new server, like
after a crash, or does the protocol need to help? I've heard opinions
both ways on the WG mailing list. Has a consensus developed?

The approach of relying on the server crash recovery logic seems
workable. I am not aware of any show stoppers with this approach, except
the grace period limit on high-availability[1]. Probably a few corner
cases that present some difficulties do exist. We should collect
problematic scenarios so they can be evaluated based on the application
transparency goal.

The grace period problem might admit a third option, namely to enhance
the client-server protocol to ease the transfer of client state.
Specifically, a callback could be issued by the source server which
tells clients to start transferring their state to the target server.
Then clients send a finish message to advise the target that they have
completed their state transfer, which would allow the target to exit its
grace period promptly. It does require that clients in an environment
requiring high-availability support callbacks, but this does not sound
onerous.

On 9/10/2002 13:25:51 MDT, Robert Thurlow wrote:
>>Updates can be "pulled" as well as "pushed".
>
> The current proposal deals only with making information flow
> from the master copy out to the replicas, and this flow is
> unidirectional (modulo rsync-like examination of files). The
> way a transfer is initiated is unspecified - earlier thoughts
> of also defining an third-party admin protocol to control this
> were abandoned since we have things like SNMP and WBEM/CIM.
> That means that the impetus to perform an update can come from
> a human being, from the destination server, or from a "cron"
> job, and happen out-of-band to cause the source to contact
> the destination server(s) via this protocol. We do need to
> also define the SNMP MIB or WBEM/CIM schema to do this.

Okay, clearly at the lowest level all updates have to be "pushed" from
where writes occur to other sites. The rest is "control". So if this
out-of-band mechanism is where all this control is located, then this
repl-mig protocol just ships the data when directed. I suppose with a
sufficiently sophisticated control protocol to determine what and where
to send data, one could implement any kind of replication.

For example, such a repl-mig protocol doesn't need support for
describing updates, such as incremental dumps. Instead, the contents of
an update are negotiated out-of-band and the protocol just ships the
selected data.

Replication specification and maintenance have several components:

  * Where are the replicas positioned - location - list of sites.
  * When are updates received at the replicas - consistency - guarantees
    made about data to readers regarding currency and writers regarding
    safety.
  * Which objects does the replica house - subset - lots of choices are
    possible here, including "recently used" and "all".
  * What name does the replica have - read-only - writable secondaries
    usually have the same name as the primary so they are interposed,
    while read-only secondaries have different names from the writable
    primary.

These are all control-like functions which I guess would be out of
scope. While I can see how SNMP could be used to manage the list of
replica sites, it seems less suitable for maintaining the consistency of
a replica.

If this protocol is targeted at shipping the filesystem data, then what
distinguishes its goals from those of the client-server protocol?
My tentative answer is that bulk transfers are optimized relative to
requests for information about individual files[2]. A protocol limited
to file transfer is likely to be significantly simpler than the
client-server protocol burdened with operational semantics (e.g. opens,
reads, etc). But is this the whole list of benefits? Is there any data
to support the assumption that repl-mig can perform better than "big
honking compounds ops"[3]?

Ted Anderson

[1] http://playground.sun.com/pub/nfsv4/nfsv4-wg-archive/2001/0463.html
[2] http://playground.sun.com/pub/nfsv4/nfsv4-wg-archive/2001/0476.html
[3] http://playground.sun.com/pub/nfsv4/nfsv4-wg-archive/2002/0246.html
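
P.S. To make the third option for the grace period problem a bit more
concrete, here is a minimal sketch in Python of the exchange described
above: the source server calls back each client, each client replays its
state to the target and then sends a finish message, and the target
leaves its grace period as soon as every expected client has finished
(or when the timer runs out). This is only a toy model under stated
assumptions. The class and method names (TargetServer, SourceServer,
Client, reclaim, finish, on_migrate_callback) are hypothetical and are
not NFSv4 or repl-mig operations, and the state items are just opaque
strings.

```python
# Toy model only: every class and method name below is hypothetical and
# does not correspond to an actual NFSv4 or repl-mig operation.

class TargetServer:
    """Migration target that can leave its grace period early."""

    def __init__(self, expected_clients, grace_period):
        self.pending = set(expected_clients)  # clients yet to send "finish"
        self.grace_period = grace_period      # upper bound, in seconds
        self.elapsed = 0                      # simulated clock, in seconds
        self.state = {}                       # client id -> transferred items

    def in_grace(self):
        # Grace ends once all expected clients have finished, or on timeout.
        return bool(self.pending) and self.elapsed < self.grace_period

    def reclaim(self, client_id, item):
        # Accept one piece of client state (an open, a lock, ...).
        if not self.in_grace():
            raise RuntimeError("grace period is over; reclaim refused")
        self.state.setdefault(client_id, []).append(item)

    def finish(self, client_id):
        # The client advises that its state transfer is complete.
        self.pending.discard(client_id)

    def tick(self, seconds):
        # Advance the simulated clock.
        self.elapsed += seconds


class Client:
    def __init__(self, client_id, held_state):
        self.client_id = client_id
        self.held_state = list(held_state)

    def on_migrate_callback(self, target):
        # Callback from the source: replay state to the target, then finish.
        for item in self.held_state:
            target.reclaim(self.client_id, item)
        target.finish(self.client_id)


class SourceServer:
    def __init__(self, clients):
        self.clients = list(clients)

    def migrate_to(self, target):
        # Tell every client to start moving its state to the target.
        for client in self.clients:
            client.on_migrate_callback(target)
```

With two clients and an assumed 90 second grace period, calling
SourceServer([a, b]).migrate_to(target) leaves the target out of grace
with no simulated time elapsed. A client that never answers the callback
simply leaves the target waiting until the timer expires, which is the
same behavior as relying on crash recovery alone.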