Thread: 23 messages, 8 authors, 2025-06-02

Re: Paragraphs formatted differently depending on previous ones

From: Alejandro Colomar <alx@kernel.org>
Date: 2025-05-02 15:30:05
Subsystem: the rest · Maintainer: Linus Torvalds

Hi Branden,

On Fri, May 02, 2025 at 09:29:07AM -0500, G. Branden Robinson wrote:
At 2025-05-02T14:42:12+0200, Alejandro Colomar wrote:
By default, I prefer keeping adjustment.  Often, I want to see changes
in adjustment too as part of the diff.  Maybe I should add an option to
disable adjustment, which could be useful in those cases
where the diff is a bit hard to understand.
For myself, I found that editorial changes to recast wording or
otherwise add and remove material led to cascading reports of
differences _only_ to spaces in adjusted lines, which usually aren't of
interest to me.
I've changed my mind.  I think it's better to disable it by default in
diffman-git(1), and I can enable it easily anyway.  I've applied the
following patch:

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=637b0aa571b61d98c717e7ab7490df8a3d9e4841>

commit 637b0aa571b61d98c717e7ab7490df8a3d9e4841
Author: Alejandro Colomar [off-list ref]
Date:   Fri May 2 17:08:20 2025 +0200

    src/bin/diffman-git: Disable adjustment by default
    
    One can still enable it by setting an empty MANROFFOPT.
    
    Suggested-by: "G. Branden Robinson" [off-list ref]
    Signed-off-by: Alejandro Colomar [off-list ref]
diff --git a/src/bin/diffman-git b/src/bin/diffman-git
index ede506c91..25c0a98b6 100755
--- a/src/bin/diffman-git
+++ b/src/bin/diffman-git
@@ -31,6 +31,7 @@ git rev-parse --show-toplevel | read -r dir;
 cd "$dir";
 
 test -v MAN_KEEP_FORMATTING || export MAN_KEEP_FORMATTING=1;
+test -v MANROFFOPT          || export MANROFFOPT='-d AD=l';
 
 # shellcheck disable=SC2206  # We want only non-empty variables in the array.
 opts=($s $w $u);
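
The commit message says an empty MANROFFOPT still enables adjustment.  That follows from `test -v`, which checks whether a variable is set at all, not whether it is non-empty.  A minimal sketch of that behaviour, assuming bash 4.2 or later; `apply_default` is a hypothetical helper written for this illustration and is not part of diffman-git(1):

```shell
#!/bin/bash
# Sketch: how 'test -v VAR || export VAR=default' treats unset vs. empty.
# apply_default is a hypothetical helper, not code from diffman-git(1).
apply_default()
{
    test -v MANROFFOPT || export MANROFFOPT='-d AD=l';
    printf '[%s]\n' "$MANROFFOPT";
}

# Unset: the default (adjustment disabled) is applied.
unset_result="$(unset MANROFFOPT; apply_default)";

# Set but empty: the variable is left alone, so no '-d AD=l' is passed.
empty_result="$(MANROFFOPT=''; apply_default)";

# Set to some other value: also left alone.
custom_result="$(MANROFFOPT='-d AD=b'; apply_default)";

printf 'unset:  %s\nempty:  %s\ncustom: %s\n' \
    "$unset_result" "$empty_result" "$custom_result";
```

Under these assumptions, running something like `MANROFFOPT= diffman-git HEAD^` would keep the formatter's own adjustment default.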
for P in *.[157]
do
    if [ "$P" = groff_mmse.7 ]
    then
      LOCALE=-msv
    else
      LOCALE=
    fi
What's -msv?
groff_tmac(5):

   Localization packages
     For Western languages, an auxiliary package for localization sets
     the hyphenation mode and loads hyphenation patterns and exceptions.
     Localization files can also adjust the date format and provide
     translations of strings used by some of the full‐service macro
     packages; alter the input encoding (see the next section); and
     change the amount of additional inter‐sentence space.  For Eastern
     languages, the localization file defines character classes and sets
     flags on them.  By default, troffrc loads the localization file for
     English.
...
     sv     Swedish; localizes man, me, mm, mom, and ms.  Sets the input
            encoding to Latin‐1 by loading latin1.tmac.  Some of the
            localization of the mm package is handled separately; see
            groff_mmse(7).
Hmmm.
    echo $0: $P >&2
    echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
    groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
...
done
Would you mind sharing the entire script?  I might get ideas for
improving diffman-git(1).
Sure; it's crude and dumb (like its author?)--I don't generally spend a
lot of software engineering effort on stuff I produce only for my own
consumption.  I've attached it.  The script name is revealing of some of
my music listening habits.
(And maybe you can drop your script if
diffman-git(1) would be good-enough for you.)
If it stops working for the limited purpose I require it, I may look
into alternatives.  :)
I suggest you try it.  It has some nice features, like specifying the
number of context lines, or ignoring white space changes (which is
useful to confirm that some change only affects spacing but nothing
else).  It also allows you to diff arbitrary commits, without having to
store a copy of the formatted output.
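
The white-space check mentioned above can be sketched with plain diff(1).  This message only shows `-U20` being passed to diffman-git(1), so the sketch calls diff(1) directly with `-w` rather than guessing at the wrapper's option names; the two sample texts are made up and differ only in spacing, as adjusted lines would:

```shell
#!/bin/bash
# Two renderings of the same text; only the spacing differs.
old=$'Select  word-by-word,  expanding the\nselection outwards.\n';
new=$'Select word-by-word, expanding  the\nselection outwards.\n';

# A plain unified diff reports a difference ...
if diff -u <(printf '%s' "$old") <(printf '%s' "$new") >/dev/null; then
    plain=same;
else
    plain=differ;
fi;

# ... while -w (ignore all white space) reports none, which confirms
# that the change affects spacing and nothing else.
if diff -w -u <(printf '%s' "$old") <(printf '%s' "$new") >/dev/null; then
    ws=same;
else
    ws=differ;
fi;

printf 'plain: %s\nignoring white space: %s\n' "$plain" "$ws";
```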
The RE movement is intended to indent the "Since Linux 6.7," para.
I'd need to look at more context, and haven't, but `IP` already does
that.
That para was a continuation of a TP, and now is changed to be a
continuation of a nested TP (thus the RS).  See the diff with some more
context, which might clarify:

$ MANWIDTH=72 diffman-git -U20 HEAD^
--- HEAD^^:man/man2const/TIOCLINUX.2const
+++ HEAD^:man/man2const/TIOCLINUX.2const
@@ -24,75 +24,84 @@
             Get task information.  Disappeared in Linux 1.1.92.
 
      subcode=TIOCL_SETSEL
             Set selection.  argp points to a
 
                 struct {
                     char  subcode;
                     short xs, ys, xe, ye;
                     short sel_mode;
                 };
 
             xs and ys are the starting column and row.  xe and ye are
             the ending column and row.  (Upper left corner is row=col‐
             umn=1.)  sel_mode may be one of the following operations:
 
             TIOCL_SELCHAR
                    Select character‐by‐character.  The indicated screen
                    characters are highlighted and saved in a kernel
                    buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELWORD
                    Select word‐by‐word, expanding the selection out‐
                    wards to align with word boundaries.  The indicated
                    screen characters are highlighted and saved in a
                    kernel buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELLINE
                    Select line‐by‐line, expanding the selection out‐
                    wards to select full lines.  The indicated screen
                    characters are highlighted and saved in a kernel
                    buffer.
 
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
+
             TIOCL_SELPOINTER
                    Show the pointer at position (xs, ys) or (xe, ye),
                    whichever is later in text flow order.
 
             TIOCL_SELCLEAR
                    Remove the current selection highlight, if any, from
                    the console holding the selection.
 
                    This does not affect the stored selected text.
 
             TIOCL_SELMOUSEREPORT
                    Make the terminal report (xs, ys) as the current
                    mouse location using the xterm(1) mouse tracking
                    protocol (see console_codes(4)).  The lower 4 bits
                    of sel_mode (TIOCL_SELBUTTONMASK) indicate the de‐
                    sired button press and modifier key information for
                    the mouse event.
 
                    If mouse reporting is not enabled for the terminal,
                    this operation yields an EINVAL error.
 
-            Since Linux 6.7, using this subcode requires the
-            CAP_SYS_ADMIN capability.
+                   Since Linux 6.7, using this selection mode requires
+                   the CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_PASTESEL
             Paste selection.  The characters in the selection buffer
             are written to fd.
 
             Since Linux 6.7, using this subcode requires the
             CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_UNBLANKSCREEN
             Unblank the screen.
 
      subcode=TIOCL_SELLOADLUT
             Sets contents of a 256‐bit look up table defining charac‐
             ters in a "word", for word‐by‐word selection.  (Since Linux
             1.1.32.)
 
             Since Linux 6.7, using this subcode requires the
             CAP_SYS_ADMIN capability.
 
      subcode=TIOCL_GETSHIFTSTATE

#!/bin/bash

set -e

if [ $# -ne 1 ]
then
    echo "need a directory argument (e.g., \"old\", \"new\")" >&2
    exit 1
fi
Being diffman-git(1), it uses the git(1) repository to find the old
pages and the new ones.  No need to specify paths.
if ! [ -x ./build/test-groff ]
then
    echo "./build/test-groff does not exist or is not executable" >&2
    exit 2
fi

groff () {
    ../build/test-groff "$@"
}
I use man(1), so it would be a matter of passing an appropriate PATH to
run your development groff version.
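
A rough sketch of the PATH approach, under the assumption that the development build directory contains an executable named `groff` (or a wrapper of that name around test-groff).  The temporary directory and the fake `groff` below are stand-ins for illustration only:

```shell
#!/bin/bash
# Stand-in for a development build directory; a real one would hold the
# freshly built groff, or a wrapper named groff around test-groff.
devbin="$(mktemp -d)";
printf '%s\n' '#!/bin/sh' 'echo "development groff"' > "$devbin/groff";
chmod +x "$devbin/groff";

# Prepending the directory for a single command leaves the rest of the
# session's PATH untouched.
found="$(PATH="$devbin:$PATH" command -v groff)";
ran="$(PATH="$devbin:$PATH" groff)";
printf 'resolved: %s\noutput:   %s\n' "$found" "$ran";

rm -r "$devbin";
```

The same prefix form would apply to a whole diffman-git(1) run, since man(1) and the formatter it starts inherit the environment.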
BFLAG=
#BFLAG=-b
DIR=$1

MANS=(
./src/utils/lkbib/lkbib.1.man
./src/utils/tfmtodit/tfmtodit.1.man
./src/utils/hpftodit/hpftodit.1.man
./src/utils/pfbtops/pfbtops.1.man
./src/utils/afmtodit/afmtodit.1.man
./src/utils/lookbib/lookbib.1.man
./src/utils/addftinfo/addftinfo.1.man
./src/utils/xtotroff/xtotroff.1.man
./src/utils/indxbib/indxbib.1.man
./src/roff/nroff/nroff.1.man
./src/roff/troff/troff.1.man
./src/roff/groff/groff.1.man
./src/utils/grog/grog.1.man
./src/devices/grodvi/grodvi.1.man
./src/devices/grolbp/grolbp.1.man
./src/devices/grops/grops.1.man
./src/devices/grohtml/grohtml.1.man
./src/devices/grolj4/grolj4.1.man
./src/devices/grotty/grotty.1.man
./src/devices/gropdf/gropdf.1.man
./src/devices/gropdf/pdfmom.1.man
./src/devices/xditview/gxditview.1.man
./src/preproc/preconv/preconv.1.man
./src/preproc/tbl/tbl.1.man
./src/preproc/soelim/soelim.1.man
./src/preproc/eqn/eqn.1.man
./src/preproc/eqn/neqn.1.man
./src/preproc/pic/pic.1.man
./src/preproc/refer/refer.1.man
./src/preproc/grn/grn.1.man
./contrib/pic2graph/pic2graph.1.man
./contrib/hdtbl/groff_hdtbl.7.man
./contrib/mm/groff_mm.7.man
./contrib/mm/mmroff.1.man
./contrib/grap2graph/grap2graph.1.man
./contrib/rfc1345/groff_rfc1345.7.man
./contrib/eqn2graph/eqn2graph.1.man
./contrib/gpinyin/gpinyin.1.man
./contrib/mom/groff_mom.7.man
./contrib/gdiffmk/gdiffmk.1.man
./contrib/glilypond/glilypond.1.man
./contrib/chem/chem.1.man
./contrib/gperl/gperl.1.man
./man/groff_tmac.5.man
./man/groff_out.5.man
./man/groff_diff.7.man
./man/groff_char.7.man
./man/groff.7.man
./man/roff.7.man
./man/groff_font.5.man
./tmac/groff_trace.7.man
./tmac/groff_me.7.man
./tmac/groff_ms.7.man
./tmac/groff_man.7.man
./tmac/groff_man_style.7.man
./tmac/groff_mdoc.7.man
./tmac/groff_www.7.man
)
I calculate the MANS dynamically with a regex:

	case $# in
	0)  git diff --name-only;		;;
	1)  git diff --name-only "$1^..$1";	;;
	*)  git diff --name-only "$1..$2";	;;
	esac \
	| grep -E '(\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>|\.man)+(\.man|\.in)*$' \
	| sortman \
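
To see what that pattern selects, here is a small sketch that feeds it a few sample paths.  The regex is copied from the snippet above; the list of paths is only illustrative, and `\>` (end of word) assumes a grep that supports it in `-E` mode, such as GNU grep:

```shell
#!/bin/bash
# Pattern copied from the message; the sample paths are illustrative only.
re='(\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>|\.man)+(\.man|\.in)*$';

# Keep only the paths that look like manual page sources: a section
# suffix such as .3 or .2const, optionally followed by .man and/or .in.
matched="$(printf '%s\n' \
    man/man2const/TIOCLINUX.2const \
    man/man3/strtol.3 \
    src/roff/groff/groff.1.man \
    src/bin/diffman-git \
    share/mk/build/man.mk \
    README \
    | grep -E "$re")";

printf '%s\n' "$matched";
```

Only the first three paths pass the filter; the script, the makefile fragment, and README are dropped.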
MANS_SV=(
./contrib/mm/groff_mmse.7.man
)

mkdir "$DIR"
pushd "$DIR" >/dev/null

# the change logs, so we know approximately where we are
cp ../ChangeLog .

for d in chem gdiffmk glilypond gperl gpinyin hdtbl mm mom rfc1345 sboxes
do
	cp ../contrib/$d/ChangeLog ./ChangeLog.$d
done

# our Texinfo manual
cp ../build/doc/groff.txt .

# our Texinfo manual via HTML
cp ../build/doc/groff.html .
lynx -dump groff.html > groff.html.txt

# our ms manuals
groff $BFLAG -ww -Tutf8 -ept -ms ../doc/ms.ms > ms.txt

# our me manuals
#groff $BFLAG -ww -Tutf8 -me ../doc/meintro.me > meintro.txt
#groff $BFLAG -ww -Tutf8 -kt -me -mfr ../doc/meintro_fr.me > meintro_fr.txt
#groff $BFLAG -ww -Tutf8 -me ../doc/meref.me > meref.txt
me_pre=../ATTIC/my.me
groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meintro.me > meintro.txt
groff $BFLAG -ww -Tutf8 -kt -me -mfr $me_pre ../build/doc/meintro_fr.me \
    > meintro_fr.txt
groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meref.me > meref.txt

for F in ${MANS[*]} ${MANS_SV[*]}
do
    G=../build/${F%.man}
    if [ -f "$G" ]
    then
        cp "$G" .
    else
        echo "warning: \"$G\" missing" >&2
    fi
done

: ${AD:=l}

ARGS="$BFLAG -ww -dAD=$AD -rCHECKSTYLE=3 -rU1 -Tutf8 -e -t -mandoc"
NOCR=-rcR=0
LOCALE=
ARGS_HTML="$BFLAG -ww -rCHECKSTYLE=3 -Thtml -e -t -mandoc -P-C -P-G"

for P in *.[157]
do
    if [ "$P" = groff_mmse.7 ]
    then
      LOCALE=-msv
    else
      LOCALE=
    fi

    echo $0: $P >&2
    echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
    groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
    echo "groff $ARGS $LOCALE $NOCR $P" > "$P.no-cR.txt"
    groff $ARGS $LOCALE $NOCR "$P" >> "$P.no-cR.txt"
    echo "<!-- groff $ARGS_HTML $LOCALE -P-I$P $P -->" > "$P.html"
    groff $ARGS_HTML $LOCALE -P-I$P $P >> "$P.html"
    rm "$P"
done
Hmmm, my script is dumber; it only calls man(1).  But I guess that's
enough.  You can always tell man(1) to pass stuff to groff(1).

	| sortman \
	| while read -r f; do \
		case $# in
		0)  old="HEAD:$f";  new="./$f";   ;;
		1)  old="$1^:$f";   new="$1:$f";  ;;
		*)  old="$1:$f";    new="$2:$f";  ;;
		esac;

		case $# in
		0)  cat "$new";       ;;
		*)  git show "$new";  ;;
		esac \
		| man /dev/stdin \
		| diff --label "$old" --label "$new" "${opts[@]}" \
			<(git show "$old" | man /dev/stdin) \
			/dev/stdin \
		|| true;
	done;
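
The shape of that pipeline (new version on standard input, old version through process substitution, labels instead of temporary file names) can be tried without git(1) or man(1).  In this sketch `render` is a hypothetical stand-in for `man /dev/stdin`, and the page names and texts are made up:

```shell
#!/bin/bash
# render is a hypothetical stand-in for 'man /dev/stdin'; it only indents.
render()
{
    sed 's/^/     /';
}

old_src=$'Set selection.\nRequires no capability.\n';
new_src=$'Set selection.\nRequires CAP_SYS_ADMIN.\n';

# The new version arrives on stdin; the old one via process substitution.
# --label replaces the /dev/fd/N and /dev/stdin names in the diff header.
# diff(1) exits with status 1 when the inputs differ, hence '|| true'.
out="$(printf '%s' "$new_src" \
    | render \
    | diff --label 'HEAD:page.2' --label './page.2' -u \
        <(printf '%s' "$old_src" | render) \
        /dev/stdin \
    || true)";

printf '%s\n' "$out";
```

Because nothing is written to disk, the same structure works for any pair of revisions that `git show` can produce.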
popd >/dev/null
popd(1) at the end of a script is not useful.  Or do you source the
script?
# vim:set ai et sw=4 ts=4 tw=80:

Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>
