Re: [PATCH v3] filename.7: new manual page
From: Florian Weimer <hidden>
Date: 2021-10-19 08:54:22
* Thaddeus H. Black:
+.TH FILENAME 7 2021-10-18 "Linux" "Linux Programmer's Manual"
+.SH NAME
+filename \- requirements and conventions for the naming of files
+.SH DESCRIPTION
+This manual page sets forth requirements for
+and delineates conventions regarding filenames
+on a Linux system,
+where a
+.I filename
+is either (as the word suggests) the name of a regular file
+or the name of another object held by the system's filesystem
+such as a directory, symbolic link, named pipe or device.
Maybe add: “A pathname contains zero or more filenames.”
+.SS Legal filenames
+A filename on a Linux system can consist
+of almost any sequence of UTF-8 characters
+or, indeed, almost any sequence of bytes.
+The exceptions are as follows.
+.TP
+.B Reserved characters
+.RS
+The following characters are reserved.
+.TP
+.B /
+The solidus is reserved to separate pathname components
+as for example in
+.IR /usr/share/doc ,
+each component being itself a filename.
+For this reason, no filename may include a solidus.
+More precisely,
+no filename may include the byte that,
+in ASCII and UTF-8,
+exclusively represents the solidus.
What does this mean? I think only byte 0x2f is reserved. The UTF-8 comment is misleading. A historic/overlong encoding of / in multiple UTF-8 bytes is *not* reserved.
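To illustrate the point about bytes versus encodings, here is a minimal Python sketch. The helper name has_reserved_byte is hypothetical and not part of the patch; it assumes that the only reserved bytes are 0x2f and 0x00, as argued above.

```python
# Hypothetical helper, not taken from the patch under review.
def has_reserved_byte(name: bytes) -> bool:
    """Return True if name contains byte 0x2f ('/') or byte 0x00 (NUL)."""
    return b"/" in name or b"\x00" in name

# Overlong two-byte form of U+002F. It is not valid UTF-8, and it
# contains no 0x2f byte, so a byte-level check does not flag it.
overlong_slash = b"\xc0\xaf"

print(has_reserved_byte(b"a/b"))           # True
print(has_reserved_byte(overlong_slash))   # False
```

The check is on raw bytes, not on decoded characters, which is why the overlong form passes.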
+.B \e0
+The null character is reserved for the filesystem to append
+to terminate a filename's representation in memory.
+For this reason, no filename may include a null character.
+More precisely,
+no filename may include the byte that,
+in ASCII and UTF-8,
+exclusively represents the null character.
See above.
+.B Reserved names
+.RS
+The following names are reserved.
+.TP
+.B .
+The filename consisting of a single full stop
+is reserved to represent the current directory.
+.TP
+.B ..
+The filename consisting of two full stops
+is reserved to represent the parent directory.
+.TP
+(empty)
+The empty filename,
+consisting of no bytes at all
+(except a terminating null byte),
+is not allowed.
This conflicts with the presentation of / as a separator in pathnames, I think: The pathname "/usr/" contains two empty filenames.
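A small Python sketch of what I mean, assuming one reads the patch text literally and treats every span between separators as a component (the variable names are only for illustration):

```python
# Split a pathname on the separator byte, as the patch text implies.
components = b"/usr/".split(b"/")
print(components)   # [b'', b'usr', b'']

# Count the empty components that this reading produces.
empty_count = sum(1 for c in components if c == b"")
print(empty_count)  # 2
```

So either empty components need to be described as something other than filenames, or the wording about the separator needs adjusting.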
+.TP
+.B Long names
+.RS
+No filename may exceed\~255 bytes in length,
+or\~256 bytes after counting the terminating null byte.
This is not correct for Linux. Despite the definition of NAME_MAX, filenames can be longer than 255 bytes. NTFS and CIFS have a limit of 255 UTF-16 characters, which translates to about 768 bytes in the UTF-8 encoding used by Linux.

Thanks,
Florian
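
For reference, the size arithmetic can be checked with a short Python sketch. It assumes a name made of 255 characters from the Basic Multilingual Plane that each need three bytes in UTF-8; the euro sign is just an example character.

```python
# 255 euro signs (U+20AC): one UTF-16 code unit each, three UTF-8 bytes each.
name = "\u20ac" * 255

utf16_units = len(name.encode("utf-16-le")) // 2
utf8_bytes = len(name.encode("utf-8"))

print(utf16_units)  # 255
print(utf8_bytes)   # 765
```

Such a name stays within a limit of 255 UTF-16 units while being roughly three times longer than 255 bytes in UTF-8.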