Thread: 9 messages, 4 authors, 2021-03-07

Re: Escaping hyphens ("real" minus signs in groff)

From: G. Branden Robinson <hidden>
Date: 2021-01-22 03:56:47

Hi Michael!

At 2021-01-21T12:03:13+0100, Michael Kerrisk (man-pages) wrote:
> I appreciate your long answer *very* much. But, I'm glad you started
> with the short answer :-).

Cool!  But beware, from such pressures is the practice of top-replying
born...  ;-)

> > Another issue to consider is that as PDF rendering technology has
> > improved on Linux, it has become possible to copy and paste from PDF
> > documents into a terminal window.  In my opinion we should make this
> > work as well as we can.  Expert Linux users may not ever do this,
> > wondering why anyone would ever try; new Linux users will quite
> > reasonably expect to be able to do it.
> [...]
> > And I mean copy-and-paste not just from PDF but from a terminal
> > window.
>
> Yes, but I have a question: "\-1" renders in PDF as a long dash
> followed by a "1". This looks okay in PDF, but if I copy and paste
> into a terminal, I don't get an ASCII 45. This seems to contradict
> what you are saying about cut-and-paste above. What am I missing?

The gap between aspiration and implementation.  I don't think the
"copy-and-paste from PDF to terminal window" matter is completely sorted
out yet.
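
A minimal Python sketch of why such a paste fails. It assumes the character that arrives from the PDF is U+2212 MINUS SIGN; the report above only establishes that it is not ASCII 45. The helper name is hypothetical.

```python
# Assumption: the pasted glyph is U+2212; the report only says "not ASCII 45".
PASTED = "\u22121"   # what might arrive from the PDF for "\-1"
TYPED = "-1"         # what a user would type at the keyboard


def starts_with_ascii_hyphen_minus(arg: str) -> bool:
    """Hypothetical helper: True if arg begins with code point 45."""
    return arg[:1] == "-"


# The two strings look alike on screen but differ at the byte level.
print(TYPED.encode("utf-8"))
print(PASTED.encode("utf-8"))

# Software that compares code points only recognizes the typed form.
print(starts_with_ascii_hyphen_minus(TYPED))
print(starts_with_ascii_hyphen_minus(PASTED))
```

Anything that tests for a leading ASCII 45, as option parsers typically do, treats the pasted string as an ordinary word rather than as an option or a negative number.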

I'm a strident prescriptionist about preserving the distinction between
"-" and "\-" in roff documents, notably including man pages, in part
because it affords us more room to design around this problem.

ASCII and ISO 8859 unified the hyphen and minus characters.  AT&T troff
and all of its descendants distinguished them.  Unicode also
distinguishes them.  But Unix has a habit of calling ASCII 055 (45
decimal) a "dash", and moreover, to much software, only the numerical
value of the code point is important.
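
To make the code points concrete, here is a small Python sketch that prints the Unicode names of the three characters usually involved. The dictionary labels and the function name are illustrative choices, not terminology from this thread.

```python
import unicodedata

# Labels are illustrative; the code points are the standard ones.
CANDIDATES = {
    "ASCII 055 (45 decimal)": "\u002d",
    "Unicode hyphen": "\u2010",
    "Unicode minus sign": "\u2212",
}


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in CANDIDATES.items():
    print(f"{label}: {describe(ch)}")

# Octal 055 and decimal 45 are the same number.
print(0o55 == 45)
```

Unicode's own name for code point 45, HYPHEN-MINUS, records the unification that ASCII made.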

It's quite possible that for man(7) documents rendering to PDF, we
should perform the following mapping (in the man macros).

.if '\*[.T]'pdf' \
.  char \- \N'45'

This didn't come up in my argument with (mostly?) BSD people because (1)
the immediate issue that raised concern had to do with the grave accent
and apostrophe instead and (2) everybody in that camp who spoke up on
the matter said they seldom, if ever, render man pages to PostScript or
PDF.  By that token, the above 2-liner may not be a controversial matter
to the people I was arguing with.  :)

Consider what would happen to the appearance of PDF-rendered man pages
if we encouraged all \- escaped hyphens to be rewritten as plain hyphens
in the source first, and did the following to mandate uniformity.

.if '\*[.T]'pdf' \{\
.  char \- \N'45'
.  char - \N'45'
.\}

...just as is currently done for the 'utf8' output driver, whose second
line I want to kill off.
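
The difference between the one-line and two-line mappings can be modeled with a toy Python renderer. This is only a sketch: the default outputs assumed here ("-" as U+2010, "\-" as U+2212) and all names are illustrative assumptions, not groff's implementation.

```python
HYPHEN_MINUS = "\u002d"  # ASCII 45

# Assumed defaults, for illustration only: with no remapping, "-" comes
# out as a hyphen and "\-" as a minus sign.
DEFAULT_OUTPUT = {"-": "\u2010", "\\-": "\u2212"}


def render(token: str, remapped: set) -> str:
    """Toy model: return the character emitted for a roff input token."""
    if token in remapped:
        return HYPHEN_MINUS
    return DEFAULT_OUTPUT[token]


one_line = {"\\-"}        # models remapping only \- to \N'45'
two_line = {"\\-", "-"}   # models remapping both \- and - to \N'45'

for name, mapping in (("one-line", one_line), ("two-line", two_line)):
    same = render("-", mapping) == render("\\-", mapping)
    print(f"{name}: '-' and '\\-' render identically: {same}")
```

In this model the one-line mapping makes "\-" pasteable as ASCII 45 while "-" stays a typographic hyphen. The two-line mapping erases the distinction in the output, whatever the source says.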

I feel that responsible stewardship of the groff man macro
implementation means considering the needs of diverse audiences.

> I don't really have any other questions, but I have tried to distill
> the above into some text in man-pages(7) to remind myself for the
> future:
>
> [[
> .PP
> The use of real minus signs serves the following purposes:
> .IP * 3
> To provide better renderings on various targets other than
> ASCII terminals,
> notably in PDF and on Unicode/UTF\-8-capable terminals.
> .IP *
> To generate glyphs that when copied from rendered pages will
> produce real minus signs when pasted into a terminal.
> ]]
>
> Seem okay?

What a "real minus sign" is is a fraught issue[1], but if for the
purposes of man-pages(7) it means the ASCII/ISO hyphen-minus, then yes,
I think it's good enough.

Regards,
Branden

[1] especially in light of the \[mi] special character escape and the
    existence of U+2212 :-/
