[PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v
From: наб <hidden>
Date: 2024-12-16 01:00:47
On Sun, Dec 15, 2024 at 10:44:26PM +0100, Alejandro Colomar wrote:
> On Sun, Dec 15, 2024 at 10:02:42PM +0100, наб wrote:
> > > Should we file a bug against glibc strverscmp(3)? We probably should. And the reference to sort(1), I'd put it in BUGS, saying that this API is broken, and does not sort properly. Sounds good?
> > No, this API works as-documented, and the implementation is useful.
> What does useful mean?
There are applications where a lexicographical-except-numeric comparison like this is what you want (it's most of them). Calling it a "version sort" is silly + goofy but, whatever.
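As a rough illustration of "lexicographical-except-numeric", the sketch below contrasts strcmp(3) with strverscmp(3) on the inputs mentioned in this thread. The helper names (sign, plain_order, numeric_order, check_orderings) are hypothetical and exist only for this example; it assumes glibc or another libc that provides strverscmp().

```c
#define _GNU_SOURCE            /* strverscmp() is a GNU extension */
#include <assert.h>
#include <string.h>

/* Reduce any comparison result to -1, 0 or 1. */
static int sign(int x) { return (x > 0) - (x < 0); }

/* Hypothetical helper names, for illustration only. */
int plain_order(const char *a, const char *b)   { return sign(strcmp(a, b)); }
int numeric_order(const char *a, const char *b) { return sign(strverscmp(a, b)); }

/* Checks the orderings discussed in this thread. */
void check_orderings(void) {
	/* Byte-wise: '1' < '2', so "jan10" sorts before "jan2". */
	assert(plain_order("jan10", "jan2") < 0);
	/* Digit runs are compared as numbers: 2 < 10. */
	assert(numeric_order("jan2", "jan10") < 0);
	/* Outside digit runs the comparison is byte-wise: '.' < 'a'. */
	assert(numeric_order("a-1.0.1a", "a-1.0a") < 0);
}
```

Calling check_orderings() from a main() should pass silently; the last assertion corresponds to the a-1.0.1a / a-1.0a ordering shown in the patch description below.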
> > It's just not what ls -v does.
> While version sort isn't something standard, I think GNU should be self-consistent.
It is, ls -v and sort -V are consistent.
Having just implemented the /actual/ algorithm they use for voreutils,
that is by far /not/ universally applicable, much hairier, and hard-tuned for
"versions that are kinda like debian describes and sorts them (but not actually)
AND ALSO we put them in filenames where we can assume the format a little bit
AND ALSO {4 special cases to make ls -v work}".
Replacing this well-defined lexicographical-except-numeric sorter with... that,
isn't really applicable.
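For contrast, here is a minimal sketch of a Debian-style comparison, modeled on the general shape of dpkg's version-string comparison. It is NOT the coreutils code: it has none of the suffix processing or ls -v special cases described above, and the names char_order and deb_style_cmp are made up for this example. It is only meant to show why a-1.0a can sort before a-1.0.1a under such a scheme: in non-digit runs, letters sort before other characters.

```c
#include <ctype.h>

/* Rank of a character inside a non-digit run (hypothetical helper):
 * digits and end-of-string rank 0, '~' sorts before everything,
 * letters sort before all other punctuation. */
static int char_order(unsigned char c) {
	if (isdigit(c)) return 0;
	if (isalpha(c)) return c;
	if (c == '~')   return -1;
	if (c)          return c + 256;
	return 0;
}

/* Debian-style comparison sketch: alternate non-digit and digit runs. */
int deb_style_cmp(const char *a, const char *b) {
	while (*a || *b) {
		int first_diff = 0;

		/* Non-digit run: compare by rank, not by byte value. */
		while ((*a && !isdigit((unsigned char)*a)) ||
		       (*b && !isdigit((unsigned char)*b))) {
			int ac = char_order((unsigned char)*a);
			int bc = char_order((unsigned char)*b);
			if (ac != bc) return ac - bc;
			a++; b++;
		}

		/* Digit run: skip leading zeros, then the longer run wins,
		 * otherwise the first differing digit decides. */
		while (*a == '0') a++;
		while (*b == '0') b++;
		while (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
			if (!first_diff) first_diff = *a - *b;
			a++; b++;
		}
		if (isdigit((unsigned char)*a)) return 1;
		if (isdigit((unsigned char)*b)) return -1;
		if (first_diff) return first_diff;
	}
	return 0;
}
```

With this sketch, deb_style_cmp("a-1.0a", "a-1.0.1a") is negative, because after the common prefix the letter 'a' ranks below '.', whereas strverscmp() compares those two bytes directly and orders them the other way round.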
Best,
-- >8 --
From: Ahelenia Ziemiańska
Subject: [PATCH v3] strverscmp.3: this is NOT the ordering used by ls -v
Compare, given:
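#define _GNU_SOURCE  /* strverscmp() is a GNU extension */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* qsort() comparator: each element is a char * entry of argv */
static int compar(const void *l, const void *r) {
	return strverscmp(*(char *const *)l, *(char *const *)r);
}
int main(int argc, char **argv) {
	qsort(argv + 1, argc - 1, sizeof(*argv), compar);
	for (int i = 1; i < argc; ++i)
		puts(argv[i]);
}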
yields:
$ /bin/ls -v1 a* # coreutils ls
a-1.0a
a-1.0.1a
$ ../vers a* # as above
a-1.0.1a
a-1.0a
$ ls -v1 a* # voreutils ls @ 5781698 with strverscmp()-equivalent sorting
a-1.0.1a
a-1.0a
compare also the results for real data like
netstat-nat-1.{0,1{,.1},2,3.1,4{,.{1,2,3,4,5,6,7,8,9,10}}}.tar.gz
Thus, coreutils ls -v does NOT use strverscmp(3);
it uses a modified Debian version comparison algorithm with additional
suffix processing and ls -v-specific exceptions.
Signed-off-by: Ahelenia Ziemiańska <redacted>
---
man/man3/strverscmp.3 | 23 ++++++++---------------
1 file changed, 8 insertions(+), 15 deletions(-)
diff --git a/man/man3/strverscmp.3 b/man/man3/strverscmp.3
index 41bc1ddbd..e028d6788 100644
--- a/man/man3/strverscmp.3
+++ b/man/man3/strverscmp.3
@@ -18,25 +18,14 @@ .SH SYNOPSIS
 .BI "int strverscmp(const char *" s1 ", const char *" s2 );
 .fi
 .SH DESCRIPTION
-Often one has files
+For a dataset like
 .IR jan1 ", " jan2 ", ..., " jan9 ", " jan10 ", ..."
-and it feels wrong when
-.BR ls (1)
-orders them
+sorting it lexicographically yields
 .IR jan1 ", " jan10 ", ..., " jan2 ", ..., " jan9 .
 .\" classical solution: "rename jan jan0 jan?"
-In order to rectify this, GNU introduced the
-.I \-v
-option to
-.BR ls (1),
-which is implemented using
-.BR versionsort (3),
-which again uses
-.BR strverscmp ().
-.P
-Thus, the task of
+The task of
 .BR strverscmp ()
-is to compare two strings and find the "right" order, while
+is to compare two strings yielding the former order, while
 .BR strcmp (3)
 finds only the lexicographic order.
 This function does not use
@@ -44,6 +33,10 @@ .SH DESCRIPTION
 .BR LC_COLLATE ,
 so is meant mostly for situations
 where the strings are expected to be in ASCII.
+This is different from the ordering produced by
+.BR sort (1)
+.BR -V .
+.\" sort -V sorts a-1.0a < a-1.0.1a; strverscmp() does not
 .P
 What this function does is the following.
 If both strings are equal, return 0.
--
2.39.5