Re: [PATCH v7 10/10] convert: add filter.<driver>.process option

From: Jakub Narębski <hidden>
Date: 2016-09-29 09:38:10

On 29.09.2016 at 08:33, Torsten Bögershausen wrote:

On 15.09.16 22:04, Junio C Hamano wrote:

Lars Schneider [off-list ref] writes:

Wouldn't that complicate the pathname parsing on the filter side?
Can't we just define in our filter protocol documentation that our 
"pathname" packet _always_ has a trailing "\n"? That would mean the 
receiver would know a packet "pathname=ABC\n\n" encodes the path
"ABC\n" [1].

That's fine, too.  If you declare that pathname over the protocol is
a binary thing, you can also define that the packet does not have
the terminating \n, i.e. the example encodes the path "ABC\n\n",
which is also OK ;-)

As long as the rule is clearly documented, easy for filter
implementors to follow it, and hard for them to get it wrong, I'd be
perfectly happy.

(Sorry for the late reply)

In V8 the additional "\n" is clearly documented.

In the long run,
I would suggest being clearer about what BINARY is:

--- a/Documentation/technical/protocol-common.txt
+++ b/Documentation/technical/protocol-common.txt

@@ -61,6 +61,9 @@ the length's hexadecimal representation.
 A pkt-line MAY contain binary data, so implementors MUST ensure
 pkt-line parsing/formatting routines are 8-bit clean.
 
+Each pkt-line that may contain ASCII control characters should
+be treated as binary.
+

Well, it is not as clear-cut with pathnames.  Sane pathnames should
not contain control characters, even if they go outside US-ASCII,
assuming a sane filesystem pathname charset (like UTF-8).

One thing a pathname cannot include is the NUL ("\0") character.
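
To make the distinction concrete, here is a minimal sketch (in Python,
not from the patch) of a check for ASCII control characters in a
pathname given as raw bytes.  The helper name has_control_chars is
hypothetical.  It relies only on the fact that UTF-8 encodes every
non-ASCII character with bytes >= 0x80, so a sane UTF-8 pathname never
trips it:

```python
def has_control_chars(path: bytes) -> bool:
    """Return True if path contains an ASCII control byte.

    Hypothetical helper for illustration: flags bytes 0x00-0x1F
    (which includes NUL and "\n") and DEL (0x7F).  Bytes >= 0x80,
    as used by UTF-8 multi-byte sequences, are not flagged.
    """
    return any(b < 0x20 or b == 0x7F for b in path)


if __name__ == "__main__":
    print(has_control_chars(b"ABC"))                           # plain ASCII
    print(has_control_chars("Narębski.txt".encode("utf-8")))   # non-ASCII, sane
    print(has_control_chars(b"ABC\n"))                         # embedded newline
```

Note that this only looks at ASCII control bytes; it says nothing
about whether the bytes are valid UTF-8.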

So in most cases pathnames are ASCII, but they might not be.  Not that
pkt-line text packets are binary-unsafe... I think the trailing
"\n" is there for easier debugging.
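
The "always a trailing \n" rule discussed above can be sketched as
follows.  This is an illustration in Python under stated assumptions,
not the code from the patch: the function names are made up, and the
only protocol facts used are the pkt-line framing from
protocol-common.txt (four hex digits giving the total length, length
field included) and the "pathname=" key from the quoted example.

```python
PREFIX = b"pathname="
MAX_PKT_LEN = 65520  # maximum total pkt-line length, length field included


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % length + payload


def encode_pathname(path: bytes) -> bytes:
    """Sender side (assumed rule): always append exactly one "\n"."""
    return pkt_line(PREFIX + path + b"\n")


def decode_pathname(packet: bytes) -> bytes:
    """Receiver side: strip exactly one trailing "\n", never more."""
    length = int(packet[:4], 16)
    payload = packet[4:length]
    if not payload.startswith(PREFIX) or not payload.endswith(b"\n"):
        raise ValueError("malformed pathname packet")
    return payload[len(PREFIX):-1]


if __name__ == "__main__":
    packet = encode_pathname(b"ABC\n")
    print(packet)                   # b'0012pathname=ABC\n\n'
    print(decode_pathname(packet))  # b'ABC\n'
```

Because the receiver removes one and only one "\n", a path that itself
ends in a newline survives the round trip, which is the
"pathname=ABC\n\n" case from the quoted text.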

http://www.dwheeler.com/essays/filenames-in-shell.html
http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html

-- 
Jakub Narębski
