Thread (41 messages) 41 messages, 5 authors, 2021-05-17

Re: Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)

From: Randy Dunlap <hidden>
Date: 2021-05-08 17:46:52

On 5/8/21 10:09 AM, Michal Suchánek wrote:
On Sat, May 08, 2021 at 08:55:11AM -0700, Randy Dunlap wrote:
quoted
Hi Mauro,

On 5/8/21 7:41 AM, Mauro Carvalho Chehab wrote:
quoted
Em Sat, 8 May 2021 12:41:57 +0200
Michal Suchánek [off-list ref] escreveu:
quoted
On Sat, May 08, 2021 at 11:22:05AM +0200, Mauro Carvalho Chehab wrote:
quoted
Em Fri, 7 May 2021 08:39:24 +0200
Mauro Carvalho Chehab [off-list ref] escreveu:
  
quoted
Em Thu, 6 May 2021 14:21:01 -0700
Randy Dunlap [off-list ref] escreveu:
  
  
quoted
I'll prepare a patch fixing it. Some care should be taken, however, as
it has two places where UTF-8 chars should be used[2].  
Ok, I did a small script in order to check what special chars we
currently have (next-20210507) at Documentation/ excluding the
translations.

Based on my script results, we have those groups:

1. Latin accented characters:
	- U+00c7 (LATIN CAPITAL LETTER C WITH CEDILLA) (Ç)
	- U+00df (LATIN SMALL LETTER SHARP S) (ß)
	- U+00e1 (LATIN SMALL LETTER A WITH ACUTE) (á)
	- U+00e4 (LATIN SMALL LETTER A WITH DIAERESIS) (ä)
	- U+00e6 (LATIN SMALL LETTER AE) (æ)
	- U+00e7 (LATIN SMALL LETTER C WITH CEDILLA) (ç)
	- U+00e9 (LATIN SMALL LETTER E WITH ACUTE) (é)
	- U+00ea (LATIN SMALL LETTER E WITH CIRCUMFLEX) (ê)
	- U+00eb (LATIN SMALL LETTER E WITH DIAERESIS) (ë)
	- U+00f3 (LATIN SMALL LETTER O WITH ACUTE) (ó)
	- U+00f4 (LATIN SMALL LETTER O WITH CIRCUMFLEX) (ô)
	- U+00f6 (LATIN SMALL LETTER O WITH DIAERESIS) (ö)
	- U+00f8 (LATIN SMALL LETTER O WITH STROKE) (ø)
	- U+00fc (LATIN SMALL LETTER U WITH DIAERESIS) (ü)
	- U+011f (LATIN SMALL LETTER G WITH BREVE) (ğ)
	- U+0142 (LATIN SMALL LETTER L WITH STROKE) (ł)

2. symbols:
	- U+00a9 (COPYRIGHT SIGN) (©)
	- U+2122 (TRADE MARK SIGN) (™)
	- U+00ae (REGISTERED SIGN) (®)
	- U+00b0 (DEGREE SIGN) (°)
	- U+00b1 (PLUS-MINUS SIGN) (±)
	- U+00b2 (SUPERSCRIPT TWO) (²)
	- U+00b5 (MICRO SIGN) (µ)
	- U+00bd (VULGAR FRACTION ONE HALF) (½)
	- U+2026 (HORIZONTAL ELLIPSIS) (…)

3. arrows:
	- U+2191 (UPWARDS ARROW) (↑)
	- U+2192 (RIGHTWARDS ARROW) (→)
	- U+2193 (DOWNWARDS ARROW) (↓)
	- U+2b0d (UP DOWN BLACK ARROW) (⬍)

4. box drawings:
	- U+2500 (BOX DRAWINGS LIGHT HORIZONTAL) (─)
	- U+2502 (BOX DRAWINGS LIGHT VERTICAL) (│)
	- U+2514 (BOX DRAWINGS LIGHT UP AND RIGHT) (└)
	- U+251c (BOX DRAWINGS LIGHT VERTICAL AND RIGHT) (├)

5. math symbols:
	- U+00b7 (MIDDLE DOT) (·)
	- U+00d7 (MULTIPLICATION SIGN) (×)
	- U+2212 (MINUS SIGN) (−)
	- U+2217 (ASTERISK OPERATOR) (∗)
	- U+223c (TILDE OPERATOR) (∼)
	- U+2264 (LESS-THAN OR EQUAL TO) (≤)
	- U+2265 (GREATER-THAN OR EQUAL TO) (≥)
	- U+27e8 (MATHEMATICAL LEFT ANGLE BRACKET) (⟨)
	- U+27e9 (MATHEMATICAL RIGHT ANGLE BRACKET) (⟩)
	- U+00ac (NOT SIGN) (¬)  
quoted
Use of ¬ is also very dubious in documentation (in fonts it is understandable):
Documentation/ABI/obsolete/sysfs-kernel-fadump_registered:This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.¬
Documentation/ABI/obsolete/sysfs-kernel-fadump_release_mem:This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.¬
quoted
Documentation/powerpc/transactional_memory.rst:  if (MSR 29:31 ¬ = 0b010 | SRR1 29:31 ¬ = 0b000) then
Yeah, this should probably be better written as:

  if (MSR 29:31 == 0b010 | SRR1 29:31 == 0b000) then
If the original with the 'NOT SIGN' was correct, then this
version can't be correct. Or do you suspect that the "original"
was corrupted somehow?
This does not make sense however you look at it. Using | between logical
expressions ...
To my eyes/brain, it looks like classic (IBM) symbolic logic notation.
In that context, I don't see anything wrong with it.
It sounds like it is some pseudocode in no language in particular so
it's hard to tell what it actually means and the document does not have
enough context to be able to tell. I suppose there is some comment
somewhere in the kernel code that would clarify this - at least what the
bit patterns mean.
Yeah, I have been looking thru the arch/powerpc/ source code for this,
but I haven't found it yet.

-- 
~Randy
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help