Re: [RFC] send-email: UTF-8 encoding in subject line
From: D. Ben Knoble <hidden>
Date: 2026-02-22 15:00:40
On Sun, Feb 22, 2026 at 9:07 AM Shreyansh Paliwal [off-list ref] wrote:
> > > That makes sense, I tried it below. I also wondered whether, in
> > > addition to this, it might be helpful to warn on an invalid charset,
> > > and/or possibly fall back to UTF-8.
> >
> > Agreed on the first half of the statement, if we have an easy and
> > portable way to tell if a given random string names a valid charset.
> > I do not recommend to "fall back" to anything, if we are asking an
> > input from the user.
>
> Following up on this, I tried adding a warning when the provided
> charset does not appear to be valid.
>
> Current flow is,
>
>     Which 8bit encoding should I declare [UTF-8]? y
>     Are you sure you want to use <y> [y/N]? y
>
> With the additional check, it becomes,
>
>     Which 8bit encoding should I declare [default: UTF-8]? y
>     warning: 'y' does not appear to be a valid charset name.
>     Are you sure you want to use <y> [y/N]?
>
> This uses find_encoding() from Perl’s Encode module to detect any
> unrecognized charset names.
>
> Let me know what you think. Also, is there any new test that should
> be added for this change?
>
> Signed-off-by: Shreyansh Paliwal <redacted>
> ---
>  git-send-email.perl | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index cd4b316ddc..e62fa259ba 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -23,6 +23,7 @@
>  use Git::LoadCPAN::Error qw(:try);
>  use Git;
>  use Git::I18N;
> +use Encode qw(find_encoding);
>  Getopt::Long::Configure qw/ pass_through /;
> @@ -1044,9 +1045,25 @@ sub file_declares_8bit_cte {
>  		foreach my $f (sort keys %broken_encoding) {
>  			print " $f\n";
>  		}
> -		$auto_8bit_encoding = ask(__("Which 8bit encoding should I declare [UTF-8]? "),
> -					  valid_re => qr/.{4}/, confirm_only => 1,
> -					  default => "UTF-8");
> +		while (1) {
> +			my $encoding = ask(__("Which 8bit encoding should I declare [default: UTF-8]? "),
> +				valid_re => qr/^\S+$/,
> +				default => "UTF-8");

Here we change things, right?

- The original validation is "at least 4 characters"; the new
  validation is "one or more characters, none of them whitespace."
  I'm not sure why we'd prefer one or the other, frankly. The original
  goes back to 852a15d748 (send-email: ask confirmation if given
  encoding name is very short, 2015-02-13), which is motivated by the
  same problem we're discussing here!
- We get rid of confirm_only, since we're about to roll our own
  confirmation below:

> +			next unless defined $encoding;
> +			if (find_encoding($encoding)) {
> +				$auto_8bit_encoding = $encoding;
> +				last;
> +			}
> +			printf STDERR __("warning: '%s' does not appear to be a valid charset name.\n"), $encoding;
> +			my $yesno = ask(
> +				sprintf(__("Are you sure you want to use <%s> [y/N]? "), $encoding),
> +				valid_re => qr/^(?:y|n)/i,
> +				default => 'n');

…which might want refactoring a bit so it can stay close to the original? idk.

> +			if (defined $yesno && $yesno =~ /^y/i) {
> +				$auto_8bit_encoding = $encoding;
> +				last;
> +			}
> +		}
>  	}
>  	if (!$force) {
> --
> 2.53.0

-- 
D. Ben Knoble