Re: [Linux-kernel-mentees] [PATCH v2] checkpatch: add new exception to repeated word check
From: Joe Perches <joe@perches.com>
Date: 2020-10-17 06:33:23
Also in: lkml
Subsystem: checkpatch, the rest
Maintainers: Andy Whitcroft, Joe Perches, Linus Torvalds
On Wed, 2020-10-14 at 11:35 -0700, Joe Perches wrote:
> On Wed, 2020-10-14 at 23:42 +0530, Dwaipayan Ray wrote:
> > On Wed, Oct 14, 2020 at 11:33 PM Joe Perches [off-list ref] wrote:
> > > On Wed, 2020-10-14 at 22:07 +0530, Dwaipayan Ray wrote:
> > > > Recently, commit 4f6ad8aa1eac ("checkpatch: move repeated word test")
> > > > moved the repeated word test to check for more file types. But after
> > > > this, if checkpatch.pl is run on MAINTAINERS, it generates several
> > > > new warnings of the type:
> > >
> > > Perhaps instead of adding more content checks so that
> > > word boundaries are not something like \S but also
> > > not punctuation so that content like
> > >
> > >	git git://
> > >	@size size
> > >
> > > does not match?
> >
> > Hi,
> > So currently the words are trimmed of non alphabets before the check:
> >
> > while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
> >	my $first = $1;
> >	my $second = $2;
> >
> > where, the word_pattern is:
> > my $word_pattern = '\b[A-Z]?[a-z]{2,}\b';
>
> I'm familiar.
>
> > So do you perhaps recommend modifying this word pattern to
> > include the punctuation as well rather than trimming them off?
>
> Not really, perhaps use the capture group position
> markers @- @+ or $-[1] $+[1] and $-[2] $+[2] with the
> substr could be used to see what characters are
> before and after the word matches.

Perhaps something like:
---
 scripts/checkpatch.pl | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index fab38b493cef..a65eb40a5539 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3054,15 +3054,25 @@ sub process {
 
 				my $first = $1;
 				my $second = $2;
+				my $start_pos = $-[1];
+				my $end_pos = $+[2];
 
 				if ($first =~ /(?:struct|union|enum)/) {
 					pos($rawline) += length($first) + length($second) + 1;
 					next;
 				}
 
-				next if ($first ne $second);
+				next if (lc($first) ne lc($second));
 				next if ($first eq 'long');
 
+				my $start_char = "";
+				my $end_char = "";
+				$start_char = substr($rawline, $start_pos - 1, 1) if ($start_pos > 0);
+				$end_char = substr($rawline, $end_pos, 1) if (length($rawline) > $end_pos);
+
+				next if ($start_char =~ /^\S$/);
+				next if ($end_char !~ /^[\.\,\s]?$/);
+
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
 				    $fix) {
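
To try the idea outside of checkpatch.pl, the filtering logic from the hunk above can be lifted into a small standalone script. This is only a rough sketch: the helper name repeated_words() and the sample input lines are made up for illustration, and WARN(), $herecurr and $fix are left out, so it returns the flagged words instead of emitting warnings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Word pattern as quoted earlier in this thread from checkpatch.pl.
my $word_pattern = '\b[A-Z]?[a-z]{2,}\b';

# Hypothetical helper (not part of checkpatch.pl): return the words that
# the proposed check would flag as repeated in a single line.
sub repeated_words {
	my ($rawline) = @_;
	my @flagged;

	while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
		my $first = $1;
		my $second = $2;
		# Save the offsets right away: a later successful match
		# would overwrite @- and @+.
		my $start_pos = $-[1];	# offset where the first word starts
		my $end_pos = $+[2];	# offset just past the second word

		if ($first =~ /(?:struct|union|enum)/) {
			pos($rawline) += length($first) + length($second) + 1;
			next;
		}

		next if (lc($first) ne lc($second));
		next if ($first eq 'long');

		# Characters immediately before and after the word pair,
		# or "" at the start/end of the line.
		my $start_char = "";
		my $end_char = "";
		$start_char = substr($rawline, $start_pos - 1, 1) if ($start_pos > 0);
		$end_char = substr($rawline, $end_pos, 1) if (length($rawline) > $end_pos);

		# Skip pairs glued to other non-space content, e.g. "@size size".
		next if ($start_char =~ /^\S$/);
		# Skip pairs followed by anything but '.', ',' or whitespace,
		# e.g. "git git://".
		next if ($end_char !~ /^[\.\,\s]?$/);

		push @flagged, $first;
	}
	return @flagged;
}

# Illustrative input lines only.
for my $line ("git git://host/repo", "\@size size of buffer",
	      "this is the the end", "it was done done.") {
	my @words = repeated_words($line);
	printf "%-24s => %s\n", $line, @words ? join(", ", @words) : "(nothing flagged)";
}
```

With these inputs, the first two lines should come back empty (the pair is followed by ':' in one case and preceded by '@' in the other), while "the" and "done" should still be flagged.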
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees