Thread (8 messages), 4 authors, 2026-02-24

Re: [RFC] send-email: UTF-8 encoding in subject line

From: Shreyansh Paliwal <hidden>
Date: 2026-02-22 15:56:11

On Sun, Feb 22, 2026 at 9:07 AM Shreyansh Paliwal
[off-list ref] wrote:
That makes sense, I tried it below.
I also wondered whether, in addition to this, it might be helpful to warn on
an invalid charset, and/or possibly fall back to UTF-8.
Agreed on the first half of the statement, if we have an easy and
portable way to tell if a given random string names a valid charset.
I do not recommend to "fall back" to anything, if we are asking an
input from the user.
Following up on this, I tried adding a warning when the provided charset
does not appear to be valid. The current flow is:

  Which 8bit encoding should I declare [UTF-8]? y
  Are you sure you want to use <y> [y/N]? y

With the additional check, it becomes:

  Which 8bit encoding should I declare [default: UTF-8]? y
  warning: 'y' does not appear to be a valid charset name.
  Are you sure you want to use <y> [y/N]?

This uses find_encoding() from Perl’s Encode module to detect any
unrecognized charset names.
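
For reference, here is a minimal standalone sketch of the lookup itself,
separate from the patch. charset_name() is a hypothetical helper used only
for illustration and is not part of git-send-email.perl; the only real API
used is Encode's find_encoding() and the name() method on the object it
returns.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper (not in git-send-email.perl): return the name Encode
# uses for a charset, or undef when find_encoding() does not recognize it.
sub charset_name {
    my ($input) = @_;
    my $enc = find_encoding($input);
    return defined $enc ? $enc->name : undef;
}

for my $input ("UTF-8", "iso-8859-1", "y") {
    my $name = charset_name($input);
    printf "%-12s %s\n", $input,
        defined $name ? "recognized as $name" : "not recognized";
}
```

Note that the name reported by Encode may differ from what was typed, since
find_encoding() resolves aliases and ignores case.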

Let me know what you think.
Also, is there any new test that should be added for this change?

Signed-off-by: Shreyansh Paliwal <redacted>
---
 git-send-email.perl | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index cd4b316ddc..e62fa259ba 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -23,6 +23,7 @@
 use Git::LoadCPAN::Error qw(:try);
 use Git;
 use Git::I18N;
+use Encode qw(find_encoding);

 Getopt::Long::Configure qw/ pass_through /;
@@ -1044,9 +1045,25 @@ sub file_declares_8bit_cte {
        foreach my $f (sort keys %broken_encoding) {
                print "    $f\n";
        }
-       $auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
-                                 valid_re => qr/.{4}/, confirm_only => 1,
-                                 default => "UTF-8");
+       while (1) {
+               my $encoding = ask(__("Which 8bit encoding should I declare [default: UTF-8]? "),
+                       valid_re => qr/^\S+$/,
+                       default  => "UTF-8");
Here we change things, right?

- The original validation is "at least 4 characters", the new
validation is "at least one non-blank." I'm not sure why we'd prefer
one or the other, frankly. The original goes to 852a15d748
(send-email: ask confirmation if given encoding name is very short,
2015-02-13), which is motivated by the same problem we're discussing
here!
I see.
My understanding of the earlier change (852a15d748) is that the
length check was intended as a heuristic to catch obviously invalid
inputs like "y" and trigger an extra confirmation, on the assumption that
charset names are at least 4 characters long.

With the additional find_encoding() check, the validation becomes semantic
rather than length-based: recognized charset names are accepted directly,
while unrecognized ones trigger a warning and still require explicit
confirmation. The relaxed regex (^\S+$, i.e. one or more non-whitespace
characters) is only meant to ensure we receive some non-empty input before
passing it to find_encoding().
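
To make the difference concrete, here is a rough side-by-side sketch of the
two checks. classify() is a hypothetical helper for illustration only; the
/.{4}/ pattern is the old valid_re from the hunk above, and "asdf" is just
an arbitrary string that is long enough but is assumed not to name any
encoding known to Encode.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: report what each check concludes for one input.
sub classify {
    my ($input) = @_;
    return {
        # Old heuristic: at least four characters.
        long_enough => ($input =~ /.{4}/ ? 1 : 0),
        # New check: does Encode know an encoding by this name?
        recognized  => (defined find_encoding($input) ? 1 : 0),
    };
}

for my $input ("UTF-8", "y", "asdf") {
    my $r = classify($input);
    printf "%-6s long_enough=%d recognized=%d\n",
        $input, $r->{long_enough}, $r->{recognized};
}
```

The "asdf" case is the interesting one: it passes the length heuristic
without any confirmation, whereas the lookup would flag it.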
- We get rid of confirm_only, since we're about to roll our own
confirmation below:
+               next unless defined $encoding;
+               if (find_encoding($encoding)) {
+                       $auto_8bit_encoding = $encoding;
+                       last;
+               }
+               printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
+               my $yesno = ask(
+                       sprintf(__("Are you sure you want to use <%s> [y/N]? "), $encoding),
+                       valid_re => qr/^(?:y|n)/i,
+                       default  => 'n');
…which might want refactored a bit so it can stay close to the original? idk.
Actually, the flow needed to change slightly to insert the validity warning
before the final confirmation step. Since ask() handles confirmation internally
using confirm_only and is used in multiple places, it seemed simpler to keep the
additional confirmation local here rather than modifying ask() itself.
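
As a self-contained sketch of that flow (prompt, look up, warn, then
confirm), something along these lines can be run outside of git.
choose_encoding() and the $prompt callback are hypothetical stand-ins: the
callback replaces ask(), whose real behavior is not reproduced here, and the
attempt limit and the handling of the y/N answer are assumptions, since that
part of the hunk is not shown above.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical stand-in for the patched loop. $prompt is a code ref taking
# (question, default) and returning the user's answer.
sub choose_encoding {
    my ($prompt) = @_;
    for my $attempt (1 .. 5) {    # assumed retry limit, to keep the sketch finite
        my $encoding = $prompt->(
            "Which 8bit encoding should I declare [default: UTF-8]? ", "UTF-8");
        next unless defined $encoding && $encoding =~ /^\S+$/;

        # Recognized names are accepted without further questions.
        return $encoding if defined find_encoding($encoding);

        # Unrecognized names: warn first, then ask for explicit confirmation.
        warn "warning: '$encoding' does not appear to be a valid charset name.\n";
        my $yesno = $prompt->(
            "Are you sure you want to use <$encoding> [y/N]? ", "n");
        return $encoding if defined $yesno && $yesno =~ /^y/i;
    }
    return undef;
}

# Scripted answers: "y" is declined at the confirmation, then "UTF-8" is given.
my @answers = ("y", "n", "UTF-8");
my $chosen = choose_encoding(sub { shift @answers });
print "$chosen\n";
```

With the scripted answers above, the warning for 'y' goes to stderr and the
sketch prints UTF-8. Answering "y" at the confirmation instead would return
the unrecognized name as-is.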

Let me know what you think.

Best,
Shreyansh