NFSv4 and Caching


From: Brent (brent@eng.sun.com)
Date: 02/04/99-09:13:59 PM Z


Message-ID: <36BA61F7.65769629@eng.sun.com>


Hello,

I'm concerned that so far we've had little discussion on what kind
of caching model we should use for NFSv4. If NFSv4 is to be a popular
protocol for access to files over the Internet, then good caching
is mandatory.  With effective caching, we can give Internet clients fast,
responsive access to data and allow servers to handle a lot more clients.

NFS v2/v3 Caching
-----------------
	No doubt, we're familiar with the caching that is used by current
	NFS implementations. Most clients cache a relatively small amount
	of data in memory, though a much larger, persistent cache can be
	maintained on disk with something like CacheFS.

	NFS cache consistency is achieved by having clients check for
	server changes to a file or directory before opening a cached
	copy. Changes to cached data are written back synchronously with
	a file close().  This close-to-open consistency guarantees that
	an application that opens a file will see the changes from the
	last close().  This level of cache consistency is sufficient for
	many applications; however, there are some downsides.

	A GETATTR request is generated for every file open.  It's not
	unusual to see rapid-fire streams of GETATTR requests coming
	from a client that's running an application that opens and
	closes a file repeatedly.  The GETATTR requests slow down
	the app to network speed, and beat up the server.

	Once the file (or cached directory) is open, the client may operate
	on stale data (server copy has changed) for between 3 and
	30 seconds (up to 60 seconds for directories). If you update
	a file on the server, clients that have the file open will
	not see the changes immediately.

	To summarize: Although NFS caching can preempt network reads
	and writes, it continues to require a high level of interaction
	with the server.  There is a time-bounded limit on cache
	inconsistency when a file is open.
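
	The close-to-open model can be sketched in a few lines of Python
	(a toy model; the class, methods, and 3-second timeout are invented
	for illustration and are not any real NFS client API):

```python
class NfsV3Client:
    """Toy model of NFS v2/v3 close-to-open consistency.  All names and
    the attribute timeout are invented for illustration only."""

    def __init__(self, server, attr_timeout=3.0):
        self.server = server            # stands in for the server: name -> (mtime, data)
        self.attr_timeout = attr_timeout
        self.cache = {}                 # name -> (mtime, data, last_checked)

    def open(self, name, now):
        """Every open sends a GETATTR to revalidate the cached copy."""
        mtime, data = self.server[name]          # the GETATTR (+ READ on a miss)
        cached = self.cache.get(name)
        if cached is not None and cached[0] == mtime:
            data = cached[1]                     # cache still valid
        self.cache[name] = (mtime, data, now)
        return data

    def read_while_open(self, name, now):
        """While the file is open, attributes are trusted for attr_timeout
        seconds, so reads may return stale data until the timeout expires."""
        mtime, data, last_checked = self.cache[name]
        if now - last_checked >= self.attr_timeout:
            return self.open(name, now)          # timeout expired: revalidate
        return data                              # possibly stale
```

	An open always revalidates (one GETATTR per open); between
	revalidations the client may serve stale data for up to the
	attribute timeout.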

Spritely NFS and NQNFS
----------------------
	About 10 years ago, V. Srinivasan and Jeff Mogul created Spritely NFS,
	a variant of the NFS protocol with a caching model borrowed from
	Berkeley's Sprite OS.  They added an OPEN call that clients would use
	when opening a file.  The open call allowed the server to track cached
	copies of files, and whether they were read-only or read-write caches.
	If the server detected a cache conflict, it notified clients
	with a CALLBACK RPC.  Since the client knows the server will notify
	it, there's no need for GETATTR cache checking on open and no time-bounded
	cache inconsistency. Clients are guaranteed that they'll always see the
	latest data in a file.  Recovery of the file open state held by the
	server in case of server crash was complex - it required the server
	to keep track of its clients in stable storage so that it could notify
	them to begin a recovery sequence - rather like the NLM "grace period".

	Rick Macklem's NQNFS (Not Quite NFS) retained the callbacks of Spritely NFS
	but time-bounded the per-client open file state on the server via the
	use of Leases.  Clients would periodically renew the lease on the file
	open state.  If a server crashed, it just delayed recovery until all
	leases had expired so that clients would begin their state-recovery cycle.

	To summarize: these variants showed that NFS could be made
	cache consistent via the use of server state and callbacks.
 	Some performance advantages accrued through more effective
	use of the cache and a reduction in client/server interaction.
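
	The lease mechanism can be sketched as follows (a toy model with
	invented names; NQNFS's actual lease terms and RPCs differ):

```python
class LeaseServer:
    """Toy model of NQNFS-style leases (all names invented).  The server
    grants a time-bounded lease per cached file; state for an expired
    lease can simply be discarded, so crash recovery only has to wait
    out the longest lease term."""

    LEASE_TERM = 30.0

    def __init__(self):
        self.leases = {}   # (client, path) -> expiry time

    def open(self, client, path, now):
        self.leases[(client, path)] = now + self.LEASE_TERM

    def renew(self, client, path, now):
        if self.leases.get((client, path), 0) <= now:
            return False                      # lease lapsed; client must re-open
        self.leases[(client, path)] = now + self.LEASE_TERM
        return True

    def clients_to_call_back(self, path, writer, now):
        """On a conflicting write, call back every client whose lease on
        the file is still live; expired leases need no callback at all."""
        return [c for (c, p), exp in self.leases.items()
                if p == path and c != writer and exp > now]
```

	The point of the lease bound: a client that goes silent costs the
	server nothing after one lease term, and a rebooted server need
	only wait that long before its state is authoritative again.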

AFS and DCE/DFS
---------------
	AFS takes caching to an extreme by copying the file to the client on file
	open and having the client access it entirely from the cache.  If the
	file is changed, it is copied back to the server on file close. Modern
	versions of AFS chunk the file so that the entire file need not be 
	cached.  When the AFS client copies the file down from the server,
	the server registers a callback promise.  The client caches the
	file indefinitely and need not interact with the server at all
	because it knows the server will callback if some other client
	needs to share access to the file.  AFS doesn't guarantee that
	clients will always see the latest data (writes are delayed until
	file close).  AFS demonstrates very good client-to-server ratios
	because once the clients' caches are populated, client/server
	interaction is low. Recovery of server file open state is dependent
	on clients polling the server at intervals to detect if a reboot
	has occurred or if callbacks have been missed.

	DCE/DFS improves on AFS caching by having clients request specific
	"tokens" for the type of file access (read, write, lock, etc).  The
	server issues a callback if there's a token conflict.  This allows the
	server to be more intelligent in issuing callbacks. 

	To summarize: AFS and DCE/DFS take caching to another level through
	their support of long-term caching with minimal server interaction.
	DCE/DFS improves the cache consistency of the long-term caching.
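
	Token-based conflict detection can be sketched like this (invented
	names; the real DCE/DFS token set is much richer than read/write/lock):

```python
# Toy DCE/DFS-style token manager (all names invented for illustration).
# A client requests a typed token; the server revokes - that is, issues
# callbacks for - only the outstanding tokens that actually conflict.

CONFLICTS = {
    ("read", "write"), ("write", "read"),
    ("write", "write"), ("lock", "lock"),
}

class TokenServer:
    def __init__(self):
        self.tokens = set()   # (client, path, kind)

    def request(self, client, path, kind):
        """Grant the token; return the clients the server must call back."""
        revoked = {t for t in self.tokens
                   if t[1] == path and t[0] != client
                   and (t[2], kind) in CONFLICTS}
        self.tokens -= revoked
        self.tokens.add((client, path, kind))
        return sorted(t[0] for t in revoked)
```

	Because read tokens don't conflict with each other, any number of
	read-only cachers coexist with no callback traffic at all.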

CIFS
----
	CIFS clients use opportunistic locks for long-term caching.
	Clients notify the server when they want to cache data for
	read or write and expect a callback from the server if another
	client needs to read or write share the file. CIFS clients
	always see the latest changes to data - caches are consistent.
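
	The oplock-break sequence can be sketched as follows (a toy model
	with invented names, not the actual CIFS message exchange):

```python
class OplockServer:
    """Toy model of CIFS opportunistic locks (all names invented).  A
    single opener gets an exclusive oplock and may cache reads and
    writes; a second opener forces an oplock break, so cached writes
    are flushed before the new opener sees the file - keeping every
    client's view of the data consistent."""

    def __init__(self):
        self.holder = None
        self.breaks_sent = []

    def open(self, client):
        if self.holder is None:
            self.holder = client
            return "oplock granted"      # client may cache reads and writes
        if self.holder != client:
            self.breaks_sent.append(self.holder)   # callback: flush, stop caching
            self.holder = None
        return "no oplock"               # all I/O now goes to the server
```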

NFS v4 ?
--------
	So what about NFSv4?  The NFS variants and AFS/DCE/DFS show that
	NFS could do better caching, though not without server state and
	callbacks for cache consistency.  It's interesting to note that
	although AFS continues to be a popular file access protocol, there
	has been no significant movement of users to its improved version:
	DCE/DFS.  Are AFS users happy with their effective, yet slightly
	inconsistent caches?  Is some measure of NFS cache inconsistency
	acceptable for v4 if it simplifies implementation and recovery?

	Callbacks are a great way to reduce client/server interaction.
	With a callback in place, there's no need for a client to "poll"
	the server - except to check that the network connection and
	server are up and running.  Yet callbacks raise sticky issues
	that complicate the protocol and its implementation:

		- Asynchrony in the protocol
		- Listener threads on clients
		- Callbacks blocked by proxies, firewalls or partitions
		- Scalability of callbacks (callback to 10,000 clients?)
		- Server needs to record lots of client file open state
		- Server needs to recover that state after crash or failover.

	Are there viable alternatives to using callbacks?  Is there a hybrid
	scheme that can merge the strengths of callbacks and leases?
	None of the above-mentioned caching schemes accounts for proxy caches.
	What are the issues with maintaining cached data on some intermediate
	machine?
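
	As one data point for the hybrid question, here is a sketch (all
	names invented, not a proposal for the actual protocol) of a
	callback-backed cache grant that is itself covered by a lease, so
	that server state for silent clients simply expires:

```python
class HybridServer:
    """Toy hybrid of callbacks and leases (all names invented).  A cache
    grant lets the client skip per-open revalidation; the server issues
    a callback on conflict; and the grant expires with its lease, so a
    crashed or partitioned client costs the server nothing after one
    lease term - no stable-storage client list, no grace-period dance."""

    LEASE_TERM = 30.0

    def __init__(self):
        self.grants = {}    # path -> (client, expiry)

    def live_grant(self, path, now):
        g = self.grants.get(path)
        return g if g and g[1] > now else None

    def open_cached(self, client, path, now):
        g = self.live_grant(path, now)
        if g is None:
            self.grants[path] = (client, now + self.LEASE_TERM)
            return "granted", None
        if g[0] == client:
            return "granted", None
        # Conflict with a live grant: recall it via callback.
        del self.grants[path]
        return "denied", g[0]            # caller must call back g[0]

    def renew(self, client, now):
        for path, (c, exp) in list(self.grants.items()):
            if c == client and exp > now:
                self.grants[path] = (c, now + self.LEASE_TERM)
```

	Callbacks go only to clients with live grants, which bounds the
	fan-out problem above, and server recovery reduces to waiting out
	one lease term, as in NQNFS.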

Comments/ideas appreciated ...

	Brent


