Re: [RFC 1/9] lib/string: Add function to trim duplicate WS

From: Tony Asleson <hidden>
Date: 2020-01-03 14:30:21
Also in: linux-fsdevel, linux-scsi

On 1/2/20 4:52 PM, Tony Asleson wrote:

On 12/23/19 5:28 PM, Matthew Wilcox wrote:

On Mon, Dec 23, 2019 at 04:55:50PM -0600, Tony Asleson wrote:

+/**
+ * Removes leading and trailing whitespace and removes duplicate
+ * adjacent whitespace in a string, modifies string in place.
+ * @s The %NUL-terminated string to have spaces removed
+ * Returns the new length
+ */

This isn't good kernel-doc.  See Documentation/doc-guide/kernel-doc.rst
Compile with W=1 to get the format checked.

Indeed, I'll correct it.
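
For reference, a comment in the layout that Documentation/doc-guide/kernel-doc.rst describes might look roughly like the sketch below. Only the structure (function name with a short description, `@arg:` lines, a `Return:` section) follows the guide; the wording of each description is an assumption and not the text of any later revision of this patch.

```c
/**
 * strim_dupe - Remove leading, trailing and duplicate adjacent whitespace
 * @s: The %NUL-terminated string to trim; it is modified in place.
 *
 * Return: The length of the trimmed string.
 */
size_t strim_dupe(char *s);
```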

+size_t strim_dupe(char *s)
+{
+	size_t ret = 0;
+	char *w = s;
+	char *p;
+
+	/*
+	 * This will remove all leading and duplicate adjacent, but leave
+	 * 1 space at the end if one or more are present.
+	 */
+	for (p = s; *p != '\0'; ++p) {
+		if (!isspace(*p) || (p != s && !isspace(*(p - 1)))) {
+			*w = *p;
+			++w;
+			ret += 1;
+		}
+	}

I'd be tempted to do ...

	size_t ret = 0;
	char *w = s;
	bool last_space = false;

	do {
		bool this_space = isspace(*s);

		if (!this_space || !last_space) {
			*w++ = *s;
			ret++;
		}
		s++;
		last_space = this_space;
	} while (s[-1] != '\0');

That leaves a starting and trailing WS, how about something like this?

size_t strim_dupe(char *s)
{
	size_t ret = 0;
	char *w = s;
	bool last_space = false;

	do {
		bool this_space = isspace(*s);
		if (!this_space || (!last_space && ret)) {
			*w++ = *s;
			ret++;
		}
		s++;
		last_space = this_space;
	} while (s[-1] != '\0');

	if (ret > 1 && isspace(w[-2])) {
		w[-2] = '\0';
		ret--;
	}

	ret--;
	return ret;
}

This function was added so I could strip out extra spaces in the vpd
0x83 string representation, to shorten them before they get added to the
structured syslog message.  I'm starting to think this is a bad idea as
anyone that might want to write some code to use the kernel sysfs entry
for a device and search for it in the syslog would have to perturb the
id string the same way.
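
To make that concern concrete, here is a minimal userspace sketch of what such a consumer would have to carry. The names `normalize_id()` and `id_matches()`, the 256-byte buffer, and the sample ID strings are all hypothetical; the trimming rule simply mirrors the function proposed above (drop leading and trailing whitespace, collapse runs to a single character).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace mirror of the strim_dupe() proposed above. */
static size_t normalize_id(char *s)
{
	size_t len = 0;
	char *w = s;
	bool last_space = false;

	for (; *s != '\0'; s++) {
		bool this_space = isspace((unsigned char)*s);

		/* Keep non-space; keep one space only after kept text. */
		if (!this_space || (!last_space && len)) {
			*w++ = *s;
			len++;
		}
		last_space = this_space;
	}

	/* Drop the single trailing space the loop may have kept. */
	if (len && isspace((unsigned char)w[-1])) {
		w--;
		len--;
	}
	*w = '\0';
	return len;
}

/*
 * Hypothetical helper: compare an ID as read from sysfs against the
 * trimmed form that would appear in the log. The buffer size is an
 * arbitrary assumption for this sketch.
 */
static bool id_matches(const char *sysfs_id, const char *log_id)
{
	char buf[256];

	if (strlen(sysfs_id) >= sizeof(buf))
		return false;
	strcpy(buf, sysfs_id);
	normalize_id(buf);
	return strcmp(buf, log_id) == 0;
}
```

A plain `strcmp()` of the sysfs string against the logged string would fail whenever the ID contains padding, which is exactly the extra step every consumer would need to know about and reproduce.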

I think this change should just be dropped from the patch series,
leaving the IDs as they are.

If we really wanted a shorter ID, a better approach would be to use a hash
of the ID string, but that introduces another level of complexity that
isn't helpful to end users.

-Tony
