Network Working Group D. Crocker (UCLA-NMC) Request for Comments: 615 MAR 74 NIC #21531 Proposed Network Standard Data Pathname Syntax There seems to be an increasing call for a Network Standard Data Pathname (NSDP); that is, a standardized means of referring to a specific location for/of a collection of bits somewhere on the Network. The reasons for a standard or virtual anything have been discussed, at length, elsewhere and will not be elaborated upon here. Rather than attack the entire issue of virtual pathnames, I wish only to propose a standardized SYNTAX for specifying pathnames. Such a standard will be useful for 1) users who are unfamiliar with a site or who use several different sites and do not want to have to remember each site's idiosynchracies, 2) programs accessing data at several other sites, and 3) documentation: The syntax allows the user to specify the necessary network, host, peripheral device, directory, file, type, and site-specific fields. Adding other fields, as needed, is expected to be quite simple. First the BNF: <NSDP> ::= % <bulk> <cr><lf> <bulk> ::= <field> / <field> <bulk> <field> ::= <key> <L-delim> <name> <R-delim> <key> ::= NETWORK / HOST / PERIPHERAL/ DIRECTORY / FILE / TYPE / SITEPARM / N / H / P / D / F / T / S <L-delim> ::= any printable character that is not in the succeeding <name> field and that is acceptable to the object site: For visual aesthetics and to facilitate human parsing, anytime <L-delim> is a left-bracket character (<, [, (, _), <R-delim> must be the complementary right-bracket character (>, ], ), |). <name> ::= any sequence of characters acceptable to the object site. This is the actual data field with the file, directory, device (or whatever) name. <R-delim> ::= Either 1) the same character as <L-delim> or 2) if the <L-delim> character is a left-bracket character (<, [, (, _) then its complementary right-bracket (>, ], ), |). -1-
<cr> ::= carriage-return <lf> ::= line-feed And some elaboration: The syntax allows <name> fields to be an arbitrary number of rs long. Case is irrelevant to the syntax, though some sites will care about case in <name> fields: <Key> indicates what part of the pathname the next <name> is going to refer to: The single-character keys are abbreviations for the respective full-word keys: <Fields> ARE order dependent, but defaulted ones may be omitted. The order is as indicated for <key>s: That is, Network, Host, ..: Siteparm: Fields may be repeated, as appropriate for the object site; that is, multiple Directory fields, etc: The validity of any combination of <field>s is entirely site-dependent: For example, if a site will accept it, an NSDP with a Host field, and nothing more, is permissible: <delim> is used to delimit the beginning and end of the <name> field: Explanation of <key>s: NETWORK or N: Currently, only ARPA is defined. HOST or H: Reference to host, by official name or nickname or number: The default radix is ten; a numeric string ending with "H" indicates hexadecimal, "O"(oh) indicates octal, and (gratuitously) "D" indicates decimal: PERIPHERAL or P: Peripheral device being referred to: DIRECTORY or D: Name of a directory which contains a pointer to the entity (directory or filename) specified in the following <field>: FILE or F: Basic name of the file or data set: TYPE or T: Optional modifier to filename: (Tenex calls it the extension.) SITEPARM or S: A parameter, such as an access specification or version number, peculiar to the object site. The content of the <name> field must serve to identify what Siteparm is involved. Each site will be responsible for defining the syntax of Siteparm <name>s it will accept. -2-
Some reserved PERIPHERAL <name>s: DISK or DSK: Immediately accessible, direct-access storage. ONLINE or ONL: Whatever immediately-accessible (measured in fractions of a second) storage the user accesses by default; usually disk: TAPE or TAP: Industry-compatible magnetic tape: TAPE7 or TP7: 7-Track industry compatible tape: TAPE9 or TP9: 9-Track industry compatible tape: DECTAPE or DEC: DEC Tape. OFFLINE or OFF: Any tertiary storage; usually tape, though "devices" like the Datacomputer are permissible: The user should expect to wait minutes or hours before being able to access OFFLINE files: PRINTER or PTR: Any available line-printer: DOCPRINTER or DOC:Upper-lower case line printer, preferably with 8 1/2" X 11" unlined paper. PAPER or PAP: Paper tape. PUNCH or PUN: Standard 8O-column card punch. READER or RDR: Standard 80-column card reader: OPERATOR or OPR: System Operator's console. CONSULTANT or CON: On-line consultant. Defaults: Defaults will generally be context dependent. Consequently, the following defaults are offered only as guidelines: Network: ARPA Host: The host interpreting the NVP Peripheral: ONLINE (DISK) Directory: The user's current "working" directory, usually set by the logon process: Filename: None. Type: None. Siteparm: None. -3-
General Comments The only field that must be considered in relation to any host's current syntax is the escape-to-NVP field (The per-cent sign as the first character of a pathname specification): It is not currently known to conflict with any host's syntax: Exclamation mark (!) is the only other character that seems permissible (on the assumption that the character should be a graphic): Its use would cause minor problems at Multics; but more importantly as a graphic, it is too similar to the numeral "1": The syntax is intended to be adequate for all hosts, so any given portion of it may be inappropriate for any given host. A site is expected to permit specifications in a given field iff that site already has a way of accepting the same information: I believe that any modifications to the syntax will be graceful additions, rather than wholesale redesign, and thus can be deferred for a while. Currently, any undefined attributes must be specified in a Siteparm field: Perhaps Version, Access protection and Accounting, as well as other types of information, should be made standard <key>s, rather than buried as Siteparms. I expect that the next version of the NSDP Syntax specification will include them as <key>s, but I would like to wait for some comments from the community. The syntax does not currently allow addressing any collection of bits smaller than a file: This can be remedied by adding PAGE, BYTE and other <key>s; but, again, I would like to solicit some comments first: Disclaimer A pathname specified in the proposed syntax is fairly easy to type but is quite ugly to read: So, at the expense of design cleanliness, the <L-delim>/<R-delim> syntax was modified in an attempt to remedy the problem somewhat: As you will see below, it is only partially successful. The first draft of this document had a syntax that was a mix of Tenex and Multics conventions: That is, (Network)[Host]Peripheral:Directory>Filename:Type;Siteparm Though visually more attractive and generally quicker to type, it lacks extensibility. For example, adding Version number or Access protection as standard fields would be difficult: It is suggested that human interfaces be built to translate to/from NSDP syntax and the user's standard environment. -4-
Some sample pathnames: %H[ISI]D<DCROCKER>F(MESSAGE)T/TXT/S(P77O4O4)<cr><lf> refers to my protected message file at ISI (<DCROCKER>MESSAGE:TXT;P77O4O4). %H/OFFICE-l/D>JOURNAL>F<l8659>T.NLS.<cr><lf> refers to NIC Journal document #18659 (Tenex file <JOURNAL>l8659:NLS): %H/65/D.ARP061.D.LAD:F.DOCUMENT.<cr><lf> refers to a file ARPO6l:LAD.DOCUMENT at UCLA-CCN. Note the use of multiple Directory fields. %H[540]D//D>udd>D>Comp=net>D>Map>F(Mail)<cr><lf> refers to file CompNet>Map>Mail at Mit-Multics. Note that the initial NSPD Directory <name> field is empty. This conforms to Multics' method of starting at the top of its directory structure: I would like to thank Jon Postel, Vint Cerf, Jim White, Charlie Kline, Ken Pogran, Jerry Burchfiel and Tom Boynton for their suggestions. -5-