Thread: 4 messages, 2 authors, 2025-01-20

Re: [PATCH v7] man/man7/pathname.7: Add file documenting format of pathnames

From: Alejandro Colomar <alx@kernel.org>
Date: 2025-01-20 16:25:10

Hi Jason,

On Mon, Jan 20, 2025 at 10:54:38AM -0500, Jason Yundt wrote:
On Mon, Jan 20, 2025 at 03:22:05PM +0100, Alejandro Colomar wrote:
+    char *locale_pathname = malloc(locale_pathname_size);
+    if (locale_pathname == NULL) {
+	err(EXIT_FAILURE, "malloc");
+    }
+\&
+    iconv_t cd = iconv_open(nl_langinfo(CODESET), "UTF\-32");
+    if (cd == (iconv_t) \- 1) {
+        err(EXIT_FAILURE, "iconv_open");
+    }
+    char *inbuf = (char *) utf32_pathname;
+    size_t inbytesleft = sizeof utf32_pathname;
+    char *outbuf = locale_pathname;
+    size_t outbytesleft = locale_pathname_size;
+    size_t iconv_result =
+        iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft);
+    if (iconv_result == \-1) {
+        err(EXIT_FAILURE, "iconv");
+    }
+    // This ensures that the conversion is 100% complete.
+    // See iconv(3) for details.
+    iconv_result =
+        iconv(cd, NULL, &inbytesleft, &outbuf, &outbytesleft);
+    if (iconv_result == \-1) {
+        err(EXIT_FAILURE, "iconv");
+    }
Do we really need two calls?  Why?
iconv(3) says “In each series of calls to iconv(), the last should be
one with inbuf or *inbuf equal to NULL, in order to flush out any
partially converted input.”  To me, that quote makes it sound like you
should always call iconv() at least twice and that inbuf (or *inbuf)
should be NULL the last time that you call iconv().  I don’t know why
the man page says that you should always call iconv() at least twice.
I suspect that we can call it just once since we provided enough space.

  The conversion can stop for five reasons:

     •  An invalid multibyte sequence ...

     •  A multibyte sequence is encountered that is valid but that cannot be
        translated to the character encoding of the output.  ...

     •  The input byte sequence has been entirely converted, that is,
        *inbytesleft has gone down to 0.  In this case, iconv() returns the
        number of nonreversible conversions performed during this call.

     •  An incomplete multibyte sequence is encountered in the input, ...

     •  The output buffer has no more room ...
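
As a rough illustration of how those stop conditions surface to the
caller, here is a minimal sketch that maps iconv()'s return value and
errno to the cases above.  The enum and function names are hypothetical.
The errno values (EILSEQ, EINVAL, E2BIG) are the ones iconv(3) documents,
and the sketch assumes that an untranslatable sequence is also reported
as EILSEQ, as glibc does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical labels for the stop reasons listed in iconv(3). */
enum iconv_stop {
    STOP_INPUT_CONVERTED,   /* success: *inbytesleft went down to 0 */
    STOP_BAD_SEQUENCE,      /* EILSEQ: invalid input, or (assumed, as on
                               glibc) valid but untranslatable input */
    STOP_INCOMPLETE_INPUT,  /* EINVAL: input ends in the middle of a
                               multibyte sequence */
    STOP_OUTPUT_FULL,       /* E2BIG: output buffer has no more room */
    STOP_UNKNOWN            /* any other errno value */
};

/*
 * Classify the outcome of one iconv() call.
 * `result` is iconv()'s return value; `err` is errno read right after it.
 */
enum iconv_stop
classify_iconv_stop(size_t result, int err)
{
    if (result != (size_t) -1)
        return STOP_INPUT_CONVERTED;

    switch (err) {
    case EILSEQ:
        return STOP_BAD_SEQUENCE;
    case EINVAL:
        return STOP_INCOMPLETE_INPUT;
    case E2BIG:
        return STOP_OUTPUT_FULL;
    default:
        return STOP_UNKNOWN;
    }
}
```

Two of the five reasons share EILSEQ under that assumption, so the
sketch cannot tell them apart; a caller that needs to distinguish them
would have to inspect the input at *inbuf itself.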

I don't see anything listed that would justify a second call.
Maybe we should improve the wording in iconv(3), but we should be
careful.  For now I'll leave it untouched.  But please call it only
once.
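
For illustration, a single-call conversion could look like the sketch
below.  It is not the patch's code: convert_once() is a hypothetical
helper, and it assumes a stateless target encoding and a caller-supplied
output buffer that is large enough for the whole result.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert `in_size` bytes from `fromcode` to
 * `tocode` with a single iconv() call.
 * Returns the number of bytes written to `out`, or (size_t) -1 with
 * errno set (for example E2BIG if `out` is too small).
 */
size_t
convert_once(const char *tocode, const char *fromcode,
             const char *in, size_t in_size,
             char *out, size_t out_size)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    /* iconv() takes a non-const pointer but does not modify the input. */
    char   *inbuf = (char *) in;
    size_t  inbytesleft = in_size;
    char   *outbuf = out;
    size_t  outbytesleft = out_size;

    /* The one and only conversion call. */
    size_t  result = iconv(cd, &inbuf, &inbytesleft,
                           &outbuf, &outbytesleft);

    /* Preserve iconv()'s errno across iconv_close(). */
    int saved_errno = errno;
    iconv_close(cd);

    if (result == (size_t) -1) {
        errno = saved_errno;
        return (size_t) -1;
    }

    /* On success the whole input has been consumed. */
    assert(inbytesleft == 0);
    return out_size - outbytesleft;
}
```

Called as convert_once("UTF-8", "UTF-32LE", in, sizeof in, out, sizeof out)
with `in` holding U+0041 U+00E9 in little-endian byte order, it should
produce the three bytes 41 C3 A9.  The explicit "UTF-32LE" name is an
assumption made here so that the byte order does not depend on the
implementation's default for "UTF-32" input without a byte order mark.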


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>
