Thread: 8 messages, 2 authors, 2020-10-17

Re: [Linux-kernel-mentees] [PATCH v2] checkpatch: add new exception to repeated word check

From: Joe Perches <joe@perches.com>
Date: 2020-10-17 06:33:23
Also in: lkml
Subsystem: checkpatch, the rest · Maintainers: Andy Whitcroft, Joe Perches, Linus Torvalds

On Wed, 2020-10-14 at 11:35 -0700, Joe Perches wrote:
On Wed, 2020-10-14 at 23:42 +0530, Dwaipayan Ray wrote:
On Wed, Oct 14, 2020 at 11:33 PM Joe Perches [off-list ref] wrote:
On Wed, 2020-10-14 at 22:07 +0530, Dwaipayan Ray wrote:
Recently, commit 4f6ad8aa1eac ("checkpatch: move repeated word test")
moved the repeated word test to check for more file types. But after
this, if checkpatch.pl is run on MAINTAINERS, it generates several
new warnings of the type:
Perhaps instead add more content checks so that the
characters around the words are not something like \S
and also not punctuation, so that content like

        git git://
        @size size

does not match?
Hi,
So currently the words are trimmed of non-alphabetic characters before the check:

        while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
                my $first = $1;
                my $second = $2;

where the word_pattern is:

        my $word_pattern = '\b[A-Z]?[a-z]{2,}\b';
I'm familiar.
So do you perhaps recommend modifying this word pattern to
include the punctuation as well, rather than trimming it off?
Not really. Perhaps the capture group position
markers @- and @+ (i.e. $-[1], $+[1] and $-[2], $+[2])
could be used with substr to see what characters are
before and after the word matches.
Perhaps something like:
---
 scripts/checkpatch.pl | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index fab38b493cef..a65eb40a5539 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3054,15 +3054,25 @@ sub process {
 
 				my $first = $1;
 				my $second = $2;
+				my $start_pos = $-[1];
+				my $end_pos = $+[2];
 
 				if ($first =~ /(?:struct|union|enum)/) {
 					pos($rawline) += length($first) + length($second) + 1;
 					next;
 				}
 
-				next if ($first ne $second);
+				next if (lc($first) ne lc($second));
 				next if ($first eq 'long');
 
+				my $start_char = "";
+				my $end_char = "";
+				$start_char = substr($rawline, $start_pos - 1, 1) if ($start_pos > 0);
+				$end_char = substr($rawline, $end_pos, 1) if (length($rawline) > $end_pos);
+
+				next if ($start_char =~ /^\S$/);
+				next if ($end_char !~ /^[\.\,\s]?$/);
+
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
 				    $fix) {
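
For anyone who wants to try the boundary checks outside checkpatch.pl, here
is a minimal standalone sketch of the same idea. The helper name
repeated_words, the sample lines and the example.invalid URL are made up
for illustration. Only $word_pattern, the match loop and the two
"next if" tests are taken from the thread, and the struct/union/enum and
'long' exceptions are left out to keep it short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same word pattern as quoted earlier in the thread.
my $word_pattern = '\b[A-Z]?[a-z]{2,}\b';

# Hypothetical helper (not part of checkpatch.pl): returns the repeated
# words in $line that survive the boundary-character checks.
sub repeated_words {
	my ($line) = @_;
	my @found;

	while ($line =~ /\b($word_pattern) (?=($word_pattern))/g) {
		my ($first, $second) = ($1, $2);

		# $-[1] is the offset where group 1 starts; $+[2] is the
		# offset just past group 2, which sits inside the lookahead.
		my ($start_pos, $end_pos) = ($-[1], $+[2]);

		next if (lc($first) ne lc($second));

		my $start_char = "";
		my $end_char = "";
		$start_char = substr($line, $start_pos - 1, 1) if ($start_pos > 0);
		$end_char = substr($line, $end_pos, 1) if (length($line) > $end_pos);

		# Skip when a non-space precedes the pair, e.g. "@size size".
		next if ($start_char =~ /^\S$/);
		# Skip unless the pair is followed by '.', ',', whitespace or
		# end of line, which rules out e.g. "git git://".
		next if ($end_char !~ /^[\.\,\s]?$/);

		push(@found, $first);
	}
	return @found;
}

# Sample input lines (assumptions, not taken from MAINTAINERS).
for my $line ("this is is a test",
	      "T: git git://example.invalid/x.git",
	      " * \@size size of the buffer") {
	my @words = repeated_words($line);
	printf("%-40s => %s\n", $line,
	       @words ? "repeated: @words" : "no warning");
}
```

With these inputs only the first line should be reported. The other two are
rejected by the $end_char and $start_char tests respectively.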

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees