Thread: 87 messages, 19 authors, 2021-05-12

Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Date: 2021-05-12 08:29:10
Also in: lkml

On Tue, 11 May 2021 07:35:29 +0800
David Gow [off-list ref] wrote:
On Mon, May 10, 2021 at 6:27 PM Mauro Carvalho Chehab
[off-list ref] wrote:
While UTF-8 characters can be used in the Linux documentation,
it is best to use them only when ASCII doesn't offer a good replacement.
So, replace the occurrences of the following UTF-8 characters:

        - U+2014 ('—'): EM DASH

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---  
Oh dear, I do have a habit of overusing em-dashes. I've no problem in
theory with exchanging them for an ASCII approximation.
I suppose there's a reason it's the one dash to rule them all: :-)
https://twitter.com/FakeUnicode/status/727888721312260096/photo/1
 Documentation/dev-tools/testing-overview.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
index b5b46709969c..8adffc26a2ec 100644
--- a/Documentation/dev-tools/testing-overview.rst
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -18,8 +18,8 @@ frameworks. These both provide infrastructure to help make running tests and
 groups of tests easier, as well as providing helpers to aid in writing new
 tests.

-If you're looking to verify the behaviour of the Kernel — particularly specific
-parts of the kernel — then you'll want to use KUnit or kselftest.
+If you're looking to verify the behaviour of the Kernel - particularly specific
+parts of the kernel - then you'll want to use KUnit or kselftest.  
As Marco pointed out, having multiple HYPHEN-MINUS symbols in a row is
probably a better replacement, as it does distinguish the em-dash from
smaller dashes better.
However, I need three for sphinx to output an em-dash here (2 hyphens
only gives me an en-dash).

So, if we want to get rid of the UTF-8 em-dash, my preferences would
be (in descending order):
1. Three hyphens: '---' (sphinx generates an em-dash)
2. Two hyphens: '--' (worst case, an en-dash surrounded by spaces --
as sphinx generates for me -- is still readable, and it's still
readable as an em-dash in plain text)
3. One hyphen as in this patch (which I don't like as much, but will
no doubt learn to live with)

But it looks like you've got several similar comments on other patches
in this series, so I'm happy for you to use whatever ends up being
agreed upon generally.
Yeah, from the comments I have received so far, it seems that most developers
want to use '---' for EM DASH and '--' for EN DASH, typing them as ASCII
instead of using U+<number>, as this is easier in most editors.

Yet, my understanding is that we don't have a consensus in that
regard, as some patches I sent using a single hyphen were
accepted/reviewed/acked.

So, I sent (and it was already applied) a small patch series (/5)
fixing the cases where UTF-8 chars (including DASH) were added
by mistake (probably due to some conversion tool). 
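
As a rough illustration (not part of the series itself), stray non-ASCII characters of this kind can be located with a few lines of Python. The function name `non_ascii_report` is hypothetical, and the sketch assumes the documentation text has already been read into a string:

```python
import unicodedata
from collections import Counter


def non_ascii_report(text: str) -> Counter:
    """Count each non-ASCII character, keyed as 'U+XXXX NAME'."""
    counts = Counter()
    for ch in text:
        if ord(ch) > 127:
            # Fall back to "UNKNOWN" for code points without a Unicode name.
            name = unicodedata.name(ch, "UNKNOWN")
            counts[f"U+{ord(ch):04X} {name}"] += 1
    return counts


if __name__ == "__main__":
    sample = "the Kernel \u2014 particularly specific parts \u2014 then"
    for key, count in non_ascii_report(sample).items():
        print(count, key)
```

The report only lists what is present; deciding which characters were added by mistake still needs a human review.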

For the remaining issues, my plan is to split this series into two
parts:

The first one with uncontroversial UTF-8 changes, and a second one with
just EM/EN DASH, using '---' to replace EM DASH and '--' to replace
EN DASH, as this way the produced HTML/LaTeX/PDF docs won't change.
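
The substitution described above could be sketched as follows. This is only an assumption about how one might script it, not the tool actually used for the series, and `asciify_dashes` is a hypothetical name:

```python
# Hypothetical helper, not taken from the patch series.
EM_DASH = "\u2014"  # U+2014 EM DASH
EN_DASH = "\u2013"  # U+2013 EN DASH


def asciify_dashes(text: str) -> str:
    """Replace U+2014 with '---' and U+2013 with '--'."""
    return text.replace(EM_DASH, "---").replace(EN_DASH, "--")


if __name__ == "__main__":
    line = "behaviour of the Kernel \u2014 particularly specific"
    print(asciify_dashes(line))
```

A blind replacement like this would also touch literal blocks and code samples, so each hunk would still need to be reviewed by hand.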

This should make it easier to discuss the EM/EN DASH changes in
each patch, and to see whether the above default is the best fit for a
particular use case.

Thanks,
Mauro