Thread: 8 messages, 4 authors, 2022-12-01

Re: [bug] git diff --word-diff gives wrong result for utf-8 chinese

From: Phillip Wood <hidden>
Date: 2022-12-01 14:51:37

Hi Ping

On 01/12/2022 07:33, Ping Yin wrote:
>> If the rule is "break on ascii whitespace",
> Is there a way to achieve this: break english by word, and break
> chinese by utf-8 character
You could extend your current regex so that it matches whole multi-byte
UTF-8 sequences (one per code point), which is what git does for the
builtin userdiff regexes. I've not tested it, but I think

git config --global diff.wordregex "[[:alnum:]_]+|[^[:space:]]|$(printf '[\xc0-\xff][\x80-\xbf]+')"

should work. The downside is that you end up with a .gitconfig that is 
not valid utf-8. Perhaps someone else has a clever idea to get around that.
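
To get a feel for how that pattern would split a line, here is a minimal
sketch that applies the same three alternatives with grep in the C locale
instead of going through git. The split_words helper is hypothetical, grep
only approximates the way git applies a word regex, and the octal escapes
are my own substitution because POSIX printf does not define \x. Like the
command above, this is not a tested recipe for git itself.

```shell
# Hypothetical helper: print each "word" the pattern would match, one per line.
# LC_ALL=C makes grep work on raw bytes, so the last alternative can match a
# UTF-8 lead byte (0xC0-0xFF) followed by its continuation bytes (0x80-0xBF).
split_words() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -oE "[[:alnum:]_]+|[^[:space:]]|$(printf '[\300-\377][\200-\277]+')"
}

# English is split by word, Chinese by character.
split_words 'hello 你好'
```

Each Chinese character comes out as its own token because a POSIX regex
takes the longest match at a given position, so the multi-byte alternative
wins over the single-byte [^[:space:]] one.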

Best Wishes

Phillip