From: Juan Gomez (juang@us.ibm.com)
Date: 06/05/02-07:40:28 PM Z
Subject: RE: NFSv4 Replication and Migration: design team conference call/ Draft
Message-ID: <OFA839C313.5EE765AF-ON88256BD0.00033AD4@boulder.ibm.com>
Date: Wed, 5 Jun 2002 17:40:28 -0700

I do not agree with the second paragraph here. I think intersite replication spanning WANs is an important area that should be considered in NFSv4. But I guess it is up to the group to define the span of this work; count my vote for defining migration/replication that works efficiently over WANs.

Juan

"Noveck, Dave" <Dave.Noveck@netapp.com>
06/05/02 03:15 PM

To: "'Peter Staubach'" <Peter.Staubach@sun.com>, Robert.Thurlow@eng.sun.com, brent@eng.sun.com
cc: Juan Gomez/Almaden/IBM@IBMUS, nfsv4-wg@sunroof.eng.sun.com
Subject: RE: NFSv4 Replication and Migration: design team conference call/ Draft

What is worthwhile depends on the environment. I believe the environment that the paper was talking about was that of optimizing transmission over DSL. This is an environment in which the bandwidths are low, so the cost of processing all of the data you are sending is relatively low, while the value of optimizing the use of very scarce bandwidth is pretty high.

I think a considerable portion of the usage of the migration and replication protocols is going to be within a building, in which the considerations are quite different.
There will be a significant portion over distance, so I don't think we want to be profligate with bandwidth, but I don't think we will be seeing much use in very-low-bandwidth situations. I think the first version at least should be oriented toward optimizing TTWI (time-to-working-implementation), and we should avoid going for anything fancy in the bandwidth optimization area.

-----Original Message-----
From: Peter Staubach [mailto:Peter.Staubach@sun.com]
Sent: Wednesday, June 05, 2002 5:14 PM
To: Robert.Thurlow@eng.sun.com; brent@eng.sun.com
Cc: juang@us.ibm.com; nfsv4-wg@sunroof.eng.sun.com
Subject: Re: NFSv4 Replication and Migration: design team conference call/Draft

> > I think compression could be worthwhile to explore. It's not clear
> > that we can take advantage of similar blocks except for blocks of
> > zeros, which we should certainly optimize. What amount of block
> > replication would you expect, and under what circumstances?
>
> I'm not sure if I read it in the paper Juan referenced, but
> you could in theory avoid some data transfers if you kept
> track of each block of transferred data together with a
> checksum.
>
> Then when you need to transfer a block of new data in
> a file, you checksum the block and see if you've transferred
> it previously as part of another file. If you get a hit,
> then instead of transferring the block, you just tell the
> other end where it is, and where it needs to go.
>
> I've no idea whether this happens often enough to be
> worthwhile - maybe I should go back and read Maziere's paper
> again ...

Between this and the sparse file detection, the CPU overheads are possibly starting to become significant. Looking at a block to see whether it is all zeros is really expensive. There needs to be a cheaper way.

This also doesn't match real-life semantics. We will need to be able to replicate files created with mkfile(1M), for example. They tend to be populated with zeros, but are fully populated for good reasons.
A block of zeros does not mean that the block is not populated.

ps
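
The checksum idea quoted above (remember each transferred block by its checksum, and on a hit tell the other end where the block already is instead of sending it) could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything proposed in the thread: the 4096-byte block size, the use of SHA-256 as the checksum, the function name plan_transfer, and the "data"/"copy" operation tuples are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the thread does not specify one


def plan_transfer(name, data, seen, block_size=BLOCK_SIZE):
    """Plan how to send `data` as file `name` (hypothetical helper).

    `seen` maps a block digest to the (file, offset) where that block was
    first sent. It is updated in place so later files can reuse blocks.
    Returns a list of operations:
      ("data", offset, block)                       send the bytes
      ("copy", offset, src_file, src_offset, size)  reuse an earlier block
    """
    ops = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 stands in for "a checksum"; it is an assumption here.
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            src_name, src_offset = seen[digest]
            ops.append(("copy", offset, src_name, src_offset, len(block)))
        else:
            seen[digest] = (name, offset)
            ops.append(("data", offset, block))
    return ops


# Two files that share one block: the shared block is sent only once.
seen = {}
first = plan_transfer("a", b"x" * 4096 + b"y" * 4096, seen)
second = plan_transfer("b", b"y" * 4096 + b"z" * 4096, seen)
```

Note that every block is hashed whether or not it produces a hit, which is the kind of CPU overhead raised in the reply above. A block of zeros is treated like any other block here: it is still written at the destination, so a fully populated file stays fully populated.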
This archive was generated by hypermail 2.1.2 : 03/04/05-01:49:49 AM Z CST