Re: Issue in man page pathname.7
From: Alejandro Colomar <alx@kernel.org>
Date: 2025-08-25 20:23:49
Hi Branden,

On Mon, Aug 25, 2025 at 03:10:00PM -0500, G. Branden Robinson wrote:
Hi Alex & Helge,

At 2025-08-25T20:57:22+0200, Alejandro Colomar wrote:
On Mon, Aug 25, 2025 at 04:17:32PM +0000, Helge Kreutzmann wrote:
On Sun, Aug 24, 2025 at 10:04:04PM +0200, Alejandro Colomar wrote:
On Sun, Aug 24, 2025 at 02:48:46PM +0000, Helge Kreutzmann wrote:
Without further ado, the following was found:

Issue: The URL is invalid

"For maximum interoperability, programs and users should also limit the "
"characters that they use for their own pathnames to characters in the POSIX "
"E<.UR https://pubs.opengroup.org/\\:onlinepubs/\\:9799919799/\\:basedefs/"
"\\:V1_chap03.html#tag_03_265> Portable Filename Character Set E<.UE .>"

Hi Helge,

That URI has '\\:' in it, which is correct in roff(7) (and in man(7)) source code. That is removed by troff(1) when formatting the page. If you read the formatted page that's not there.

Yes, then no URL is there :))

Hmmm, that depends on your terminal. If there's no URL or hyperlink, this might be an issue in either the terminal or groff(1).

I need clarification on what you're seeing, Helge.
I see the text between the UR/UE pair, but not the link itself, nor a hyperlink. See for yourself one of the last paragraphs of pathname(7) on tty1, and let me know if you can't reproduce it. (Or any page with a UR/UE pair with text in the middle.)

Essentially, I see the same as this:

    $ man pathname | tail -n 10
    ing shorter filenames, or restricting the allowed bytes in a filename.

    For maximum interoperability, programs and users should also limit the
    characters that they use for their own pathnames to characters in the
    POSIX Portable Filename Character Set.

    SEE ALSO
    limits.h(0p), open(2), fpathconf(3), path_resolution(7), mount(8)

    Linux man-pages 6.15-24-g3d3ffa...    2025-05-17    pathname(7)

But of course, without piping to cat(1).

On the other hand, I wonder... Shouldn't we get a proper URL if we're piping to cat? Of course there's no terminal with hyperlink support in a pipe!
The presence or absence of `\:` escape sequences should not make the
entire URL fail to format. The visibility of the URL is dependent on
the output device's ability to hyperlink it.
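
As a quick sanity check of the point that `\:` only marks possible break points and is not part of the URI, here is a minimal Python sketch. It takes the UR argument quoted earlier in this thread, drops every `\:`, and parses the result. The names SOURCE_URI and strip_break_points are made up for this illustration; this is not how troff(1) processes its input, only a text substitution that mimics the visible effect.

```python
from urllib.parse import urlsplit

# The UR argument as quoted in this thread. The PO file doubles the
# backslash (`\\:`); the roff source itself contains `\:`.
SOURCE_URI = (
    r"https://pubs.opengroup.org/\:onlinepubs/\:9799919799/\:basedefs/"
    r"\:V1_chap03.html#tag_03_265"
)


def strip_break_points(uri: str) -> str:
    """Drop every `\\:` escape (hypothetical helper, plain text replace)."""
    return uri.replace("\\:", "")


formatted = strip_break_points(SOURCE_URI)
parts = urlsplit(formatted)

print(formatted)
print(parts.scheme, parts.netloc, parts.path, parts.fragment)
```

The stripped string parses into an https scheme, the host pubs.opengroup.org, a path, and the fragment tag_03_265, which agrees with the observation in this thread that the URL is valid once the `\:` escapes are gone.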
groff_man(7):
.UR uri
.UE [trailing‐text]
Identify uri as an RFC 3986 URI hyperlink with the text
between the two macro calls as the link text. An argument
to UE is placed after the link text without intervening
space. uri may not be visible in the rendered document if
hyperlinks are enabled and supported by the output driver.
If they are not, uri is set in angle brackets after the link
text and before trailing‐text. If hyperlinking is enabled
but there is no link text, uri is formatted and hyperlinked
without angle brackets.
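
The three cases in that paragraph can be modeled with a small Python sketch. This is a toy model of the rules as quoted above, not groff's implementation: the function name render_link and its parameters are invented for illustration, it returns only the visible text, it uses ASCII angle brackets, and the single space before the bracketed URI is an assumption about how the link text and URI end up separated.

```python
def render_link(uri: str, link_text: str = "", trailing_text: str = "",
                hyperlinks: bool = False) -> str:
    """Toy model of the UR/UE rules quoted from groff_man(7).

    Returns the visible text only. A real output driver would also attach
    the hyperlink target when `hyperlinks` is true.
    """
    if hyperlinks:
        # The link text carries the hyperlink and the URI stays hidden.
        # With no link text, the URI is shown without angle brackets.
        visible = link_text if link_text else uri
    else:
        # No hyperlinks: the URI goes in angle brackets after the link
        # text (the separating space is an assumption of this sketch).
        visible = f"{link_text} <{uri}>" if link_text else f"<{uri}>"
    # The UE argument follows without intervening space.
    return visible + trailing_text


with_links = render_link("https://my.example.com", "my awesome website",
                         ".", hyperlinks=True)
without_links = render_link("https://my.example.com", "my awesome website",
                            ".", hyperlinks=False)
bare = render_link("https://my.example.com", trailing_text=".",
                   hyperlinks=True)

print(with_links)     # my awesome website.
print(without_links)  # my awesome website <https://my.example.com>.
print(bare)           # https://my.example.com.
```

Under this model, the first case looks like the pathname(7) output shown earlier in the thread, where only the link text appears; whether that is what actually happens on tty1 is the open question here.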
As far as I can tell, groff man's `UR` and `UE` extension macros were
designed to degrade well on systems that don't implement them; recall
that the man(7) macro language was designed in 1979 and did not
anticipate hypertext. (mdoc(7), sometimes touted as an alternative, was
designed in about 1990 and had a similar lacuna--but like man(7), later
saw a groff extension to fill the gap.)
Since the link text itself is not in the arguments to a (possibly
undefined) macro, it should get formatted in the page. A _man_
formatter that doesn't implement `UE` might leave off some trailing text
(usually punctuation), but that too can be worked around portably[1] if
one cares to.
.TH foo 1 2024-08-25 "groff test suite"
.SH Name
foo \- frobnicate a bar
.SH Description
Visit
.UR https://my.example.com
my awesome website\c
.if \n(.g \~
.UE \c
\&.

This should work as a reproducer; I expect you wouldn't see the URI in tty1 (unless it's a problem caused by some Debian patch that you lack).

Have a lovely night!
Alex

Admittedly, the supply of man page maintainers concerned about portability to DWB, Solaris 10, or Plan 9 troffs seems to be dwindling. I've never seen any page go to the foregoing trouble.

The effect of '\\:' is telling troff(1) that those are good points to break the line if needed.

Thanks for the explanation. After removing the \\:, the URL checks out as valid.

It's worth noting that `\:` is also a groff extension; this time to the formatter, and dating back to about 1990.

\:  Insert a non‐printing break point. A word can break at such a point, but a hyphen glyph is not written to the output if it does. The remainder of the word is subject to hyphenation as normal. You can use \: and \% in combination to control breaking of a file name or URI or to permit hyphenation only after certain explicit hyphens within a word. See subsection “Hyperlink macros” above for an example.

    \: is a GNU extension also supported by Heirloom Doctools troff 050915 (September 2005), mandoc 1.13.1 (2014‐08‐10), and neatroff (commit 399a4936, 2014‐02‐17), but not by DWB, Plan 9, or Solaris 10 troffs.

There's a portability workaround for that, too. Here's a real-world example.[2]

I mention these issues because Helge's project intakes a huge variety of man pages.

Regards,
Branden

[1] except to po4a: https://github.com/mquinson/po4a/issues/527
[2] https://github.com/ThomasDickey/ncurses-snapshots/blob/ec918320a42c0dd57c1ea8481419bcaf862d16fd/man/curs_getch.3x#L46
    https://github.com/ThomasDickey/ncurses-snapshots/blob/ec918320a42c0dd57c1ea8481419bcaf862d16fd/man/curs_getch.3x#L783

-- <https://www.alejandro-colomar.es/>