From: Nicolas Williams (Nicolas.Williams@sun.com)
Date: 01/24/03-03:12:18 PM Z
Date: Fri, 24 Jan 2003 15:12:18 -0600
From: Nicolas Williams <Nicolas.Williams@sun.com>
Subject: Re: [Dan.Oscarsson@kiconsulting.se: Comments on NFSv4 rfc3010bis-05 draft]
Message-ID: <20030124151218.E16765@binky.central.sun.com>

On Fri, Jan 24, 2003 at 01:00:43PM -0800, Mike Eisler wrote:
> Nicolas Williams wrote:
> > This is not my reading of the draft.
> >
> > Section 11.1.1 clearly specifies what to do with respect to filenames
> > with equal names but different encoding [due to different normalization
> > forms used by the clients that create them]. Section 11.1.1 necessarily
> > requires that the server perform normalization of client inputs to some
> > form of the server's choosing.
>
> I don't see the text. Section 11.1.4 says no normalization
> form is specified. Stringprep has tables B1 and B2. B1 is required
> to be processed, but I didn't see anything in stringprep that said that
> normalization went with table B1. Table B2 explicitly says KC, and B2
> is for dealing with case insensitive stuff. Section 3 of stringprep (now
> RFC 3454) refers to Appendix B and the various tables. It says nothing
> about normalization for B1, whereas it does say something about normalization
> with respect to B2 and B3.

Yes, but the requirements from section 11.1.1 about duplicate filenames
imply normalization, if not for stored file names, then during
comparison of file names.

> Now I am admittedly quite ignorant on this topic, but it seems rather
> odd that B1 requires normalization (and according to RFC 3454, B1 is
> mandatory), and yet B3 is specified to allow applications to not use
> normalization when case matching. Section 2 of RFC 3454 distinguishes between
> the mapping phase and the normalization phase, and mapping is the topic
> of section 3 and appendix B. So if your reading is correct,
> then it seems to me RFC 3454 has a paradox.
>
> We apparently need an "RFC 3454 for Dummies" document.

I was only referring to section 11.1.1, specifically:

   "Two valid distinct UTF-8 strings might be the same after processing
   via the utf8str_cs profile. If the strings are two names inside a
   directory, the NFS version 4 server will need to either: ..."

If the server allows two (or more) files to have the same names but
different byte sequences (say, one UTF-8, NFD, the other UTF-8, NFC),
then we'll have the "how do I delete a file named '-i'?" problem
magnified to "I see a file named 'bár' but when I type 'rm bár' it
doesn't work".

Or am I missing some text about utf8str_cs comparisons having to be done
over normalized forms of the comparison inputs?

(yes, above I did not use UTF-8 for "bár" - my apologies)

> -mre

Cheers,

Nico
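
To make the 'bár' case concrete, here is a minimal sketch in Python using the standard unicodedata module. It shows two spellings of the same visible name that differ as UTF-8 byte sequences and only compare equal once both are normalized. The helper same_name and the choice of NFC are assumptions made for illustration; nothing here models the utf8str_cs profile or stringprep mapping, and which normalization form (if any) the draft requires is exactly the open question above.

```python
import unicodedata

# Two spellings of the same visible name "bár" (illustrative data).
name_nfc = "b\u00e1r"    # precomposed: U+00E1 LATIN SMALL LETTER A WITH ACUTE
name_nfd = "ba\u0301r"   # decomposed: "a" followed by U+0301 COMBINING ACUTE ACCENT

# What a client would put on the wire as UTF-8.
bytes_nfc = name_nfc.encode("utf-8")   # 4 bytes: 62 c3 a1 72
bytes_nfd = name_nfd.encode("utf-8")   # 5 bytes: 62 61 cc 81 72


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical comparison: normalize both inputs, then compare.

    The default form "NFC" is an assumption, not something the draft mandates.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A byte-wise comparison treats these as two different file names ...
byte_equal = bytes_nfc == bytes_nfd

# ... while a comparison over normalized forms treats them as one name.
normalized_equal = same_name(name_nfc, name_nfd)
```

With a byte-wise server, a directory could hold both entries, and "rm bár" from a client that sends the other form would not match the file the user sees. Comparing over normalized forms (or storing a normalized form) avoids that, at the cost of the server having to pick a form.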