Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

[PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 11/53] docs: trace: coresight: coresight-etm4x-reference.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 11/53] docs: trace: coresight: coresight-etm4x-reference.rst: avoid using UTF-8 chars · Mathieu Poirier <mathieu.poirier@linaro.org> · 2021-05-10
[PATCH 07/53] docs: admin-guide: media: ipu3.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 04/53] docs: index.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 14/53] docs: driver-api: iio: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 05/53] docs: hwmon: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 05/53] docs: hwmon: avoid using UTF-8 chars · Guenter Roeck <linux@roeck-us.net> · 2021-05-10
[PATCH 09/53] docs: admin-guide: perf: imx-ddr.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 18/53] docs: driver-api: nvdimm: btt.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 24/53] docs: userspace-api: media: v4l: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 03/53] docs: ABI: remove some spurious characters · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 08/53] docs: admin-guide: sysctl: kernel.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 23/53] docs: userspace-api: media: fdl-appendix.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 10/53] docs: admin-guide: pm: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 22/53] docs: block: data-integrity.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 19/53] docs: fault-injection: nvme-fault-injection.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 01/53] docs: cdrom-standard.rst: get rid of uneeded UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 02/53] docs: ABI: remove a meaningless UTF-8 character · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 12/53] docs: driver-api: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 20/53] docs: usb: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 06/53] docs: admin-guide: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 27/53] docs: filesystems: f2fs.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [f2fs-dev] [PATCH 27/53] docs: filesystems: f2fs.rst: avoid using UTF-8 chars · Chao Yu <hidden> · 2021-05-11
[PATCH 21/53] docs: process: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 15/53] docs: driver-api: thermal: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 33/53] docs: riscv: vm-layout.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 17/53] docs: driver-api: firmware: other_interfaces.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 25/53] docs: userspace-api: media: dvb: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 31/53] docs: security: tpm: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 34/53] docs: networking: scaling.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 32/53] docs: security: keys: trusted-encrypted.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 16/53] docs: driver-api: media: drivers: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 13/53] docs: driver-api: fpga: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 13/53] docs: driver-api: fpga: avoid using UTF-8 chars · Moritz Fischer <mdf@kernel.org> · 2021-05-10
[PATCH 28/53] docs: filesystems: ext4: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 28/53] docs: filesystems: ext4: avoid using UTF-8 chars · "Theodore Ts'o" <tytso@mit.edu> · 2021-05-10
[PATCH 35/53] docs: networking: devlink: devlink-dpipe.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 38/53] docs: scheduler: sched-deadline.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · Marco Elver <elver@google.com> · 2021-05-10
Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-12
Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · David Gow <hidden> · 2021-05-10
Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-12
Re: [PATCH 39/53] docs: dev-tools: testing-overview.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-12
[PATCH 37/53] docs: x86: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 43/53] docs: PCI: acpi-info.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 43/53] docs: PCI: acpi-info.rst: avoid using UTF-8 chars · Krzysztof Wilczyński <hidden> · 2021-05-10
[PATCH 41/53] docs: ABI: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 41/53] docs: ABI: avoid using UTF-8 chars · Guenter Roeck <hidden> · 2021-05-10
[PATCH 42/53] docs: doc-guide: contributing.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 26/53] docs: vm: zswap.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 40/53] docs: power: powercap: powercap.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 36/53] docs: networking: device_drivers: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 30/53] docs: hid: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 29/53] docs: kernel-hacking: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 49/53] docs: misc-devices: ibmvmc.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 52/53] docs: virt: kvm: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 46/53] docs: arm64: arm-acpi.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 47/53] docs: infiniband: tag_matching.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 50/53] docs: firmware-guide: acpi: lpit.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 48/53] docs: timers: no_hz.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 44/53] docs: gpu: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 44/53] docs: gpu: avoid using UTF-8 chars · Jani Nikula <jani.nikula@linux.intel.com> · 2021-05-10
Re: [PATCH 44/53] docs: gpu: avoid using UTF-8 chars · Liviu Dudau <liviu.dudau@arm.com> · 2021-05-10
[PATCH 51/53] docs: firmware-guide: acpi: dsd: graph.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 45/53] docs: sound: kernel-api: writing-an-alsa-driver.rst: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
[PATCH 53/53] docs: RCU: avoid using UTF-8 chars · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 53/53] docs: RCU: avoid using UTF-8 chars · "Paul E. McKenney" <paulmck@kernel.org> · 2021-05-11
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Thorsten Leemhuis <linux@leemhuis.info> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · David Woodhouse <dwmw2@infradead.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Edward Cree <ecree.xilinx@gmail.com> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Edward Cree <ecree.xilinx@gmail.com> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Matthew Wilcox <willy@infradead.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Edward Cree <ecree.xilinx@gmail.com> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-11
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · David Woodhouse <dwmw2@infradead.org> · 2021-05-11
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Ben Boeckel <hidden> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · David Woodhouse <dwmw2@infradead.org> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · "Theodore Ts'o" <tytso@mit.edu> · 2021-05-10
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-11
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2021-05-11
Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII · Adam Borowski <hidden> · 2021-05-10

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Date: 2021-05-11 09:37:44
Also in: alsa-devel, dri-devel, intel-gfx, intel-wired-lan, keyrings, kvm, linux-acpi, linux-arm-kernel, linux-edac, linux-ext4, linux-f2fs-devel, linux-fpga, linux-hwmon, linux-iio, linux-input, linux-integrity, linux-media, linux-pci, linux-pm, linux-rdma, linux-riscv, linux-usb, lkml, netdev, rcu

On Mon, 10 May 2021 15:22:02 -0400
"Theodore Ts'o" [off-list ref] wrote:

On Mon, May 10, 2021 at 02:49:44PM +0100, David Woodhouse wrote:

quoted

On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote:

quoted

This patch series is doing conversion only when using ASCII makes
more sense than using UTF-8. 

See, a number of converted documents ended with weird characters
like ZERO WIDTH NO-BREAK SPACE (U+FEFF) character. This specific
character doesn't do any good.

Others use NO-BREAK SPACE (U+00A0) instead of 0x20. Harmless, until
someone tries to use grep[1].

Replacing those makes sense. But replacing emdashes — which are a
distinct character that has no direct replacement in ASCII and which
people do *deliberately* use instead of hyphen-minus — does not.

I regularly use --- for em-dashes and -- for en-dashes.  Markdown will
automatically translate 3 ASCII hyphens to em-dashes, and 2 ASCII
hyphens to en-dashes.  It's much, much easier for me to type 2 or 3
hyphens into my text editor of choice than trying to enter the UTF-8
characters.

Yeah, typing those UTF-8 chars is a lot harder than typing -- and ---
in several text editors ;-)

Here, I only type UTF-8 chars for accents (my US-layout keyboards are
all set to US international, so typing those is easy).

If we can make sphinx do this translation, maybe that's
the best way of dealing with these two characters?

Sphinx already does that by default[1], using smartquotes:

	https://docutils.sourceforge.io/docs/user/smartquotes.html

Those are the conversions that are done there:

      - Straight quotes (" and ') turned into "curly" quote characters;
      - dashes (-- and ---) turned into en- and em-dash entities;
      - three consecutive dots (... or . . .) turned into an ellipsis char.
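
To make the three conversions above concrete, here is a minimal Python
sketch. It is my own simplification for illustration, not the docutils
implementation: the function name smarten() is hypothetical, and the
real smartquotes code handles many more corner cases (nested quotes,
language-specific quote styles, escapes).

```python
import re


def smarten(text: str) -> str:
    """Illustrative, simplified take on the smartquotes conversions."""
    # Dashes: replace the three-hyphen form first, otherwise "---"
    # would be consumed as "--" followed by a stray "-".
    text = text.replace("---", "\u2014")  # em dash
    text = text.replace("--", "\u2013")   # en dash

    # Ellipsis: "..." or ". . ." becomes a single ellipsis character.
    text = re.sub(r"\.\.\.|\. \. \.", "\u2026", text)

    # Quotes: a quote at the start of the text, or right after
    # whitespace or an opening bracket, is treated as an opening
    # quote; every remaining quote is treated as a closing one.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    text = re.sub(r"(^|[\s(\[{])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text


print(smarten('He said "wait..." -- then left --- quietly.'))
```

The point is only that plain ASCII typed in the source is enough to get
the typographic characters in the rendered output.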

So, we can simply type straight single/double quotes, hyphens and dots,
and let Sphinx produce the curly quotes, dashes and ellipses.

[1] There's a way to disable it in conf.py, but in the kernel this is
    kept at its default, which is to do such conversions automatically.
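
    For reference, a sketch of what disabling it would look like. This
    assumes a reasonably recent Sphinx, where the option is named
    "smartquotes"; it is NOT what the kernel's conf.py does, since the
    kernel leaves the default enabled.

```python
# conf.py fragment (illustrative only; the kernel keeps the default).
# Setting this to False stops Sphinx from converting straight quotes,
# "--"/"---" and "..." into their typographic equivalents.
smartquotes = False
```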

Thanks,
Mauro
