Thread: 18 messages, 5 authors, 2020-07-27

Re: [PATCH 0/9] powerpc: delete duplicated words

From: Randy Dunlap <hidden>
Date: 2020-07-26 19:08:18
Also in: lkml

On 7/26/20 10:49 AM, Joe Perches wrote:
> On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
>> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>>> Randy Dunlap [off-list ref] wrote:
>>>> Drop duplicated words in arch/powerpc/ header files.
>>> How did you detect them? Do you have some script for that, or did you just read all the comments?
>> Yes, it's a script that finds lots of false positives, so I have to check
>> each and every one of them for validity.
> And it's a lot of work too. (thanks Randy)
>
> It could be something like:
>
> $ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
>   grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
>   grep -vP '\blong\s+long\b' | \
>   grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'

Hi Joe,

(What is grep-2.5.4?)

It looks like you tried a few iterations of this -- since it drops things
like "long long".  There are lots of data types that are repeated & valid.
And many struct names, like "struct kref kref", "struct completion completion",
and "struct mutex mutex".  I handle (ignore) those manually, although that
could be added to the Perl script.
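
As an illustration only, the manual filtering of valid repeats described above could be automated along these lines. This is a sketch in Python rather than the Perl script mentioned in this message, and the names VALID_REPEATS and is_valid_repeat are hypothetical. The two patterns are modeled on the struct/enum/union and "long long" exclusions in the grep pipeline quoted above, and they would likely need more entries for other repeated data types.

```python
import re

# Hypothetical filter: patterns where a repeated word is valid C, not a typo.
VALID_REPEATS = [
    # "struct kref kref", "struct mutex *mutex", and the enum/union equivalents
    re.compile(r"\b(?:struct|enum|union)\s+(\w+)\s+\*?\s*\1\b"),
    # the "long long" integer type
    re.compile(r"\blong\s+long\b"),
]


def is_valid_repeat(line):
    """Return True if the line contains a repeat that should be ignored."""
    return any(pattern.search(line) for pattern in VALID_REPEATS)
```

A line for which is_valid_repeat() returns True would be dropped before the remaining hits are reviewed by hand.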

v0.1 of this script also found lots of repeated numbers and strings of
special characters (ASCII art etc.), so now it ignores duplicated numbers
or special characters -- since it is really looking for duplicate words.
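
A minimal sketch of that words-only rule, in Python rather than Perl and not the attached script: the pattern DUP_WORD and the function find_duplicate_words are hypothetical names. Matching only runs of letters means repeated numbers and strings of punctuation never register as duplicates. The sketch works one line at a time, so it does not catch a word repeated across a line break.

```python
import re

# Letters only, so repeated numbers ("0 0") and ASCII art ("---- ----")
# can never match. Words must be at least two letters long.
DUP_WORD = re.compile(r"\b([A-Za-z]{2,})\s+\1\b")


def find_duplicate_words(text):
    """Return (line_number, word) for each word repeated back to back."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in DUP_WORD.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Every hit still needs a manual check, since valid repeats such as "long long" match this pattern as well.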

Anyway, I might as well attach it. It's no big deal.
And if someone else wants to tackle using it, go for it.

-- 
~Randy
