Re: [RFC] Introducing AI Agents to Git Localization

From: brian m. carlson <hidden>
Date: 2026-02-05 01:53:59

On 2026-02-05 at 01:04:51, Jiang Xin wrote:

> To clarify, the intention is not to enforce automated translations via
> a central bot. Instead, each l10n team should retain full control over
> whether or not to use AI assistance in their workflow. The recent
> commits in the git-po next branch only add optional guidance in
> po/README.md to help AI agents (if a team chooses to use them) perform
> specific tasks more effectively—such as recognizing glossary terms
> from the .po header, locating untranslated or fuzzy strings, and
> splitting large files for easier handling.
>
> We fully acknowledge that AI translation quality varies significantly
> across languages, and for some—like Swedish—it may not yet be reliable
> enough for direct use. The goal is to provide tools that teams can
> optionally leverage, not to replace human judgment or community
> oversight.

My experience with AI translations is that they tend to be of
poor quality.  Certainly, I'm only a native speaker of English, but my
reading and writing skills in Spanish and French are somewhere around B2
or C1 and I've seen some AI translations that are frankly just wrong,
making errors that no human would ever make.  And Spanish and French are
two of the most spoken languages on the planet, with hundreds of
millions of speakers each.

I will also share with you the experience of a colleague of mine who is
a European Portuguese speaker.  Most of the AI models produce Brazilian
Portuguese, which is much more common (since there are more people in
Brazil than in Portugal), but can vary substantially from European
Portuguese.  (Most FLOSS I've seen has separate pt_BR and pt_PT
translations for this reason.)  This means that these tools are going to
produce bad translations in such a case.

I strongly feel that we should provide people good quality software,
which includes good quality translations, to the best of our ability.  I
realize that this demands extra effort from humans to do good quality
translations, but I feel really positively about the quality of the
translations we have in Git: they are overall excellent and it is only
extremely infrequently that I've found a problem (which is usually a
typo of some sort that anyone could have made).  Considering that most
people on the planet do not speak English and that even those that do
may not speak it fluently, it's of the utmost importance that we produce
the best quality translations we can.  I don't feel using AI-generated
translations would be honouring our users in that way.

Finally, we have this text in SubmittingPatches:

    The Developer's Certificate of Origin requires contributors to certify
    that they know the origin of their contributions to the project and
    that they have the right to submit it under the project's license.
    It's not yet clear that this can be legally satisfied when submitting
    significant amount of content that has been generated by AI tools.

    Another issue with AI generated content is that AIs still often
    hallucinate or just produce bad code, commit messages, documentation
    or output, even when you point out their mistakes.

    To avoid these issues, we will reject anything that looks AI
    generated, that sounds overly formal or bloated, that looks like AI
    slop, that looks good on the surface but makes no sense, or that
    senders don’t understand or cannot explain.

It's my understanding that copyright attaches to translations, at least
under U.S. and Canadian law, and so the sign-off requirements would need
to be met here.  So I'm afraid that, because of the need for sign-off
with the DCO, we wouldn't be able to accept such contributions if they
were made.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA