Re: [PATCH 1/1] http: don't send C or POSIX in Accept-Language
From: Collin Funk <hidden>
Date: 2025-07-11 17:02:05
Justin Tobler [off-list ref] writes:
> From my understanding, each language is expected to be defined in the following form:
>
>     language[_territory][.codeset][@modifier]
>
> When we parse the list of languages we only care about the `language[_territory]` part though.
>
> From looking at ISO 639 language codes, only codes with two or three characters are valid. If we wanted to be a bit more strict, we could check the length of the language code (everything before the first '_') and filter out anything outside of those limits. This would naturally filter out "C" and "POSIX" without having to mention them explicitly.
>
> Not sure if being more strict adds much more value here in practice though. So it may be fine to keep it as-is. :)
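
To make the suggested check concrete, here is a rough sketch of what it could look like. This is only an illustration, not code from the patch: the name is_plausible_language() is hypothetical, and stopping at '.' and '@' as well as '_' is my own assumption, made so that a value like "C.UTF-8" (which has no '_') is also rejected.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch under review.
 *
 * Returns 1 if the language part of a locale name consists of two or
 * three ASCII letters, and 0 otherwise. The language part is taken to
 * be everything before the first '_', '.' or '@' (assumption: the mail
 * above only mentions '_').
 */
static int is_plausible_language(const char *locale)
{
	size_t len = strcspn(locale, "_.@");
	size_t i;

	if (len < 2 || len > 3)
		return 0;

	for (i = 0; i < len; i++) {
		char c = locale[i];
		int is_letter = (c >= 'a' && c <= 'z') ||
				(c >= 'A' && c <= 'Z');
		if (!is_letter)
			return 0;
	}
	return 1;
}
```

With this, "en_US.UTF-8", "de" and "fil_PH" would be accepted, while "C", "C.UTF-8", "POSIX" and the empty string would be rejected without naming any of them.
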
Filtering out anything that isn't 2-3 letters seems like a good heuristic to me. It seems better than filtering out only "C" and "POSIX" and allowing anything else. And it saves us from having to maintain an up-to-date list of BCP 47 language tags.

Collin