Re: Feature request: better error messages when UTF-8 bites
From: Johannes Sixt <hidden>
Date: 2022-07-28 05:42:40
On 27.07.22 at 22:21, CH wrote:
Somehow when copying and pasting a commit from a website to the command line, a UTF-8 Byte Order Mark (BOM) [https://en.wikipedia.org/wiki/Byte_order_mark] was appended to one of the commit ids. BOMs are invisible, as are many other UTF-8 code points. The upshot was that Git didn't like it, and complained bitterly:
$ strace -etrace=execve -s 200 git diff 038179704f0066aa815d5429221cf381ff4ef289 47346a462d8ba40b9a8b073e351c362522c46aa6
execve("/usr/bin/git", ["git", "diff", "038179704f0066aa815d5429221cf381ff4ef289\357\273\277", "47346a462d8ba40b9a8b073e351c362522c46aa6"], 0x7fffec3c4bb0 /* 80 vars */) = 0
fatal: ambiguous argument '038179704f0066aa815d5429221cf381ff4ef289': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+++ exited with 128 +++

Feature request:
================
When printing the "fatal: ambiguous argument '......': ....", perhaps escape (url or otherwise) the ambiguous argument when printing it in the error message, or maybe add a sentence about non-ASCII characters being found.
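
For illustration only, the kind of escaping requested above might look roughly like the following Python sketch. It is not Git's code and not a proposed patch; the helper name `escape_non_ascii` is hypothetical, and the octal style is simply borrowed from the strace output shown above.

```python
def escape_non_ascii(arg: str) -> str:
    """Return arg with every byte outside printable ASCII (and the
    backslash itself) written as a backslash followed by three octal
    digits, similar to how strace prints execve arguments.

    Hypothetical helper for illustration; nothing like it is claimed
    to exist in Git.
    """
    out = []
    # surrogateescape round-trips undecodable bytes that Python may have
    # smuggled into a command-line argument.
    for byte in arg.encode("utf-8", errors="surrogateescape"):
        if 0x20 <= byte <= 0x7E and byte != 0x5C:
            out.append(chr(byte))          # printable ASCII stays as-is
        else:
            out.append("\\%03o" % byte)    # e.g. 0xEF becomes \357
    return "".join(out)


# A commit id with a trailing BOM (U+FEFF), as in the report above.
rev = "038179704f0066aa815d5429221cf381ff4ef289\ufeff"
print("fatal: ambiguous argument '%s'" % escape_non_ascii(rev))
# prints: fatal: ambiguous argument '038179704f0066aa815d5429221cf381ff4ef289\357\273\277'
```

With this, the three BOM bytes become visible as \357\273\277, but note that the sketch escapes every non-ASCII byte without distinguishing invisible characters from ordinary letters.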
That's not going to fly, IMHO, because when I type

    git diff todo/René

I would not want to see

    fatal: ambiguous argument 'todo/Ren\303\251': unknown ...

I'm convinced that there are thousands of users who use non-ASCII branch and file names that they also frequently mis-type. They'd all be greeted with unintelligible nerdy gibberish.

I may be able to change my mind if ambiguous input (in the sense of "is not what it seems to be") leads to a security hazard that is unique to Git.

-- Hannes