Re: Paragraphs formatted differently depending on previous ones
From: Alejandro Colomar <alx@kernel.org>
Date: 2025-05-02 15:30:05

Hi Branden,

On Fri, May 02, 2025 at 09:29:07AM -0500, G. Branden Robinson wrote:
> At 2025-05-02T14:42:12+0200, Alejandro Colomar wrote:
> > By default, I prefer keeping adjustment. Often, I want to see
> > changes in adjustment too as part of the diff. Maybe I should add
> > an option to disable adjustment, which could be useful in those
> > cases where the diff is a bit hard to understand.
>
> For myself, I found that editorial changes to recast wording or
> otherwise add and remove material led to cascading reports of
> differences _only_ to spaces in adjusted lines, which usually aren't
> of interest to me.

I've changed my mind. I think it's better to disable it by default in
diffman-git(1), and I can enable it easily anyway. I've applied the
following patch:

<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=637b0aa571b61d98c717e7ab7490df8a3d9e4841>

commit 637b0aa571b61d98c717e7ab7490df8a3d9e4841
Author: Alejandro Colomar [off-list ref]
Date:   Fri May 2 17:08:20 2025 +0200

    src/bin/diffman-git: Disable adjustment by default

    One can still enable it by setting an empty MANROFFOPT.

    Suggested-by: "G. Branden Robinson" [off-list ref]
    Signed-off-by: Alejandro Colomar [off-list ref]

diff --git a/src/bin/diffman-git b/src/bin/diffman-git
index ede506c91..25c0a98b6 100755
--- a/src/bin/diffman-git
+++ b/src/bin/diffman-git
@@ -31,6 +31,7 @@ git rev-parse --show-toplevel | read -r dir;
 cd "$dir";

 test -v MAN_KEEP_FORMATTING || export MAN_KEEP_FORMATTING=1;
+test -v MANROFFOPT || export MANROFFOPT='-d AD=l';

 # shellcheck disable=SC2206 # We want only non-empty variables in the array.
 opts=($s $w $u);

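A minimal sketch of the distinction the patch relies on: the default is applied only when the variable is unset, so an explicitly empty MANROFFOPT survives. The function name default_roffopt is made up for illustration, and the check is spelled with the POSIX ${VAR+set} expansion, which is assumed here to be an adequate stand-in for bash's "test -v".

```sh
#!/bin/sh
# Hypothetical helper, not part of diffman-git(1).
default_roffopt() {
    # ${MANROFFOPT+set} is non-empty whenever the variable is set,
    # even to the empty string; bash could use "test -v MANROFFOPT".
    [ -n "${MANROFFOPT+set}" ] || MANROFFOPT='-d AD=l'
    printf '[%s]\n' "$MANROFFOPT"
}

# Each case runs in its own subshell so the settings do not leak.
unset_result=$(unset MANROFFOPT; default_roffopt)
empty_result=$(MANROFFOPT=; default_roffopt)
custom_result=$(MANROFFOPT='-d AD=b'; default_roffopt)

printf 'unset:  %s\n' "$unset_result"
printf 'empty:  %s\n' "$empty_result"
printf 'custom: %s\n' "$custom_result"
```

Only the unset case picks up '-d AD=l'; the empty and custom values pass through unchanged.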
> > > for P in *.[157]
> > > do
> > >     if [ "$P" = groff_mmse.7 ]
> > >     then
> > >         LOCALE=-msv
> > >     else
> > >         LOCALE=
> > >     fi
> >
> > What's -msv?
>
> groff_tmac(5):
>   Localization packages
>     For Western languages, an auxiliary package for localization sets
>     the hyphenation mode and loads hyphenation patterns and
>     exceptions. Localization files can also adjust the date format
>     and provide translations of strings used by some of the
>     full‐service macro packages; alter the input encoding (see the
>     next section); and change the amount of additional inter‐sentence
>     space. For Eastern languages, the localization file defines
>     character classes and sets flags on them. By default, troffrc
>     loads the localization file for English.
>     ...
>     sv    Swedish; localizes man, me, mm, mom, and ms. Sets the input
>           encoding to Latin‐1 by loading latin1.tmac. Some of the
>           localization of the mm package is handled separately; see
>           groff_mmse(7).

Hmmm.
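
A small sketch of the per-page choice made in the quoted loop, written as a function so it can be checked without running groff(1). The function name locale_flag_for is hypothetical; only the groff_mmse.7 case and the -msv flag come from the script.

```sh
#!/bin/sh
# Hypothetical helper: pick the localization macro package for a page.
locale_flag_for() {
    case $1 in
    groff_mmse.7) printf '%s\n' '-msv' ;;  # Swedish mm page: load sv
    *)            printf '\n' ;;           # everything else: default
    esac
}

swedish=$(locale_flag_for groff_mmse.7)
other=$(locale_flag_for groff_mm.7)
printf 'groff_mmse.7 -> "%s"\n' "$swedish"
printf 'groff_mm.7   -> "%s"\n' "$other"
```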

> > >     echo $0: $P >&2
> > >     echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
> > >     groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
> > >     ...
> > > done
> >
> > Would you mind sharing the entire script? I might get ideas for
> > improving diffman-git(1).
>
> Sure; it's crude and dumb (like its author?)--I don't generally spend
> a lot of software engineering effort on stuff I produce only for my
> own consumption. I've attached it. The script name is revealing of
> some of my music listening habits.
>
> > (And maybe you can drop your script if diffman-git(1) would be
> > good-enough for you.)
>
> If it stops working for the limited purpose I require it, I may look
> into alternatives. :)

I suggest you try it. It has some nice features, like specifying the
number of context lines, or ignoring white space changes (which is
useful to confirm that some change only affects spacing but nothing
else). It also allows you to diff arbitrary commits, without having to
store a copy of the formatted output.
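
To illustrate those two features without depending on diffman-git(1) itself, here is a rough sketch using plain diff(1) on throwaway files. It assumes a diff that accepts -w and -U (GNU and BSD diff do); the file names and contents are invented.

```sh
#!/bin/sh
tmp=$(mktemp -d)

# Two renderings that differ only in spacing.
printf 'requires  the CAP_SYS_ADMIN\n' > "$tmp/old"
printf 'requires the CAP_SYS_ADMIN\n'  > "$tmp/new"

if diff "$tmp/old" "$tmp/new" >/dev/null
then plain=same; else plain=differ; fi

# -w ignores white space, so a spacing-only change compares equal.
if diff -w "$tmp/old" "$tmp/new" >/dev/null
then ignoring_ws=same; else ignoring_ws=differ; fi

# One changed line out of five; -U1 keeps one context line per side.
printf '%s\n' a b c d e > "$tmp/five.old"
printf '%s\n' a b C d e > "$tmp/five.new"
context_lines=$(diff -U1 "$tmp/five.old" "$tmp/five.new" | grep -c '^ ')

rm -r "$tmp"
printf 'plain: %s, with -w: %s, context lines: %s\n' \
    "$plain" "$ignoring_ws" "$context_lines"
```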

> > The RE movement is intended to indent the "Since Linux 6.7," para.
>
> I'd need to look at more context, and haven't, but `IP` already does
> that.

That para was a continuation of a TP, and now is changed to be a
continuation of a nested TP (thus the RS). See the diff with some more
context, which might clarify:

	$ MANWIDTH=72 diffman-git -U20 HEAD^
--- HEAD^^:man/man2const/TIOCLINUX.2const
+++ HEAD^:man/man2const/TIOCLINUX.2const
@@ -24,75 +24,84 @@
               Get task information.  Disappeared in Linux 1.1.92.

        subcode=TIOCL_SETSEL
               Set selection.  argp points to a

                   struct {
                       char subcode;
                       short xs, ys, xe, ye;
                       short sel_mode;
                   };

               xs and ys are the starting column and row.  xe and ye are
               the ending column and row.  (Upper left corner is row=col‐
               umn=1.)  sel_mode may be one of the following operations:

               TIOCL_SELCHAR
                      Select character‐by‐character.  The indicated
                      screen characters are highlighted and saved in a
                      kernel buffer.

+                     Since Linux 6.7, using this selection mode requires
+                     the CAP_SYS_ADMIN capability.
+
               TIOCL_SELWORD
                      Select word‐by‐word, expanding the selection out‐
                      wards to align with word boundaries.  The indicated
                      screen characters are highlighted and saved in a
                      kernel buffer.

+                     Since Linux 6.7, using this selection mode requires
+                     the CAP_SYS_ADMIN capability.
+
               TIOCL_SELLINE
                      Select line‐by‐line, expanding the selection out‐
                      wards to select full lines.  The indicated screen
                      characters are highlighted and saved in a kernel
                      buffer.

+                     Since Linux 6.7, using this selection mode requires
+                     the CAP_SYS_ADMIN capability.
+
               TIOCL_SELPOINTER
                      Show the pointer at position (xs, ys) or (xe, ye),
                      whichever is later in text flow order.

               TIOCL_SELCLEAR
                      Remove the current selection highlight, if any,
                      from the console holding the selection.  This does
                      not affect the stored selected text.

               TIOCL_SELMOUSEREPORT
                      Make the terminal report (xs, ys) as the current
                      mouse location using the xterm(1) mouse tracking
                      protocol (see console_codes(4)).  The lower 4 bits
                      of sel_mode (TIOCL_SELBUTTONMASK) indicate the de‐
                      sired button press and modifier key information for
                      the mouse event.  If mouse reporting is not enabled
                      for the terminal, this operation yields an EINVAL
                      error.

-                     Since Linux 6.7, using this subcode requires the
-                     CAP_SYS_ADMIN capability.
+                     Since Linux 6.7, using this selection mode requires
+                     the CAP_SYS_ADMIN capability.

        subcode=TIOCL_PASTESEL
               Paste selection.  The characters in the selection buffer
               are written to fd.

               Since Linux 6.7, using this subcode requires the
               CAP_SYS_ADMIN capability.

        subcode=TIOCL_UNBLANKSCREEN
               Unblank the screen.

        subcode=TIOCL_SELLOADLUT
               Sets contents of a 256‐bit look up table defining charac‐
               ters in a "word", for word‐by‐word selection.  (Since
               Linux 1.1.32.)

               Since Linux 6.7, using this subcode requires the
               CAP_SYS_ADMIN capability.

        subcode=TIOCL_GETSHIFTSTATE

> #!/bin/bash
>
> set -e
>
> if [ $# -ne 1 ]
> then
>     echo "need a directory argument (e.g., \"old\", \"new\")" >&2
>     exit 1
> fi

Being diffman-git(1), it uses the git(1) repository to find the old
pages and the new ones. No need to specify paths.

> if ! [ -x ./build/test-groff ]
> then
>     echo "./build/test-groff does not exist or is not executable" >&2
>     exit 2
> fi
>
> groff () {
>     ../build/test-groff "$@"
> }

I use man(1), so it would be a matter of passing an appropriate PATH to
run your development groff version.
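
A rough sketch of the PATH approach, using a stand-in executable rather than a real groff build. The temporary directory and the fake groff script are invented for illustration; with a real tree one would presumably put the build directory first in PATH instead.

```sh
#!/bin/sh
# Stand-in for a development build directory (hypothetical).
devbin=$(mktemp -d)
printf '#!/bin/sh\necho "development groff"\n' > "$devbin/groff"
chmod +x "$devbin/groff"

# Inside the subshells, the development directory is searched first,
# so anything that runs "groff" (man(1) included) would find it.
resolved=$(PATH="$devbin:$PATH"; command -v groff)
output=$(PATH="$devbin:$PATH"; groff)

rm -r "$devbin"
printf 'resolved to: %s\n' "$resolved"
printf 'it printed:  %s\n' "$output"
```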

> BFLAG=
> #BFLAG=-b
> DIR=$1
>
> MANS=(
>     ./src/utils/lkbib/lkbib.1.man
>     ./src/utils/tfmtodit/tfmtodit.1.man
>     ./src/utils/hpftodit/hpftodit.1.man
>     ./src/utils/pfbtops/pfbtops.1.man
>     ./src/utils/afmtodit/afmtodit.1.man
>     ./src/utils/lookbib/lookbib.1.man
>     ./src/utils/addftinfo/addftinfo.1.man
>     ./src/utils/xtotroff/xtotroff.1.man
>     ./src/utils/indxbib/indxbib.1.man
>     ./src/roff/nroff/nroff.1.man
>     ./src/roff/troff/troff.1.man
>     ./src/roff/groff/groff.1.man
>     ./src/utils/grog/grog.1.man
>     ./src/devices/grodvi/grodvi.1.man
>     ./src/devices/grolbp/grolbp.1.man
>     ./src/devices/grops/grops.1.man
>     ./src/devices/grohtml/grohtml.1.man
>     ./src/devices/grolj4/grolj4.1.man
>     ./src/devices/grotty/grotty.1.man
>     ./src/devices/gropdf/gropdf.1.man
>     ./src/devices/gropdf/pdfmom.1.man
>     ./src/devices/xditview/gxditview.1.man
>     ./src/preproc/preconv/preconv.1.man
>     ./src/preproc/tbl/tbl.1.man
>     ./src/preproc/soelim/soelim.1.man
>     ./src/preproc/eqn/eqn.1.man
>     ./src/preproc/eqn/neqn.1.man
>     ./src/preproc/pic/pic.1.man
>     ./src/preproc/refer/refer.1.man
>     ./src/preproc/grn/grn.1.man
>     ./contrib/pic2graph/pic2graph.1.man
>     ./contrib/hdtbl/groff_hdtbl.7.man
>     ./contrib/mm/groff_mm.7.man
>     ./contrib/mm/mmroff.1.man
>     ./contrib/grap2graph/grap2graph.1.man
>     ./contrib/rfc1345/groff_rfc1345.7.man
>     ./contrib/eqn2graph/eqn2graph.1.man
>     ./contrib/gpinyin/gpinyin.1.man
>     ./contrib/mom/groff_mom.7.man
>     ./contrib/gdiffmk/gdiffmk.1.man
>     ./contrib/glilypond/glilypond.1.man
>     ./contrib/chem/chem.1.man
>     ./contrib/gperl/gperl.1.man
>     ./man/groff_tmac.5.man
>     ./man/groff_out.5.man
>     ./man/groff_diff.7.man
>     ./man/groff_char.7.man
>     ./man/groff.7.man
>     ./man/roff.7.man
>     ./man/groff_font.5.man
>     ./tmac/groff_trace.7.man
>     ./tmac/groff_me.7.man
>     ./tmac/groff_ms.7.man
>     ./tmac/groff_man.7.man
>     ./tmac/groff_man_style.7.man
>     ./tmac/groff_mdoc.7.man
>     ./tmac/groff_www.7.man
> )

I calculate the MANS dynamically with a regex:

	case $# in
	0) git diff --name-only; ;;
	1) git diff --name-only "$1^..$1"; ;;
	*) git diff --name-only "$1..$2"; ;;
	esac \
	| grep -E '(\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>|\.man)+(\.man|\.in)*$' \
	| sortman \
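
For a feel of what that regex lets through, here is a small sketch that feeds it a hand-written list instead of "git diff --name-only" output. The wrapper name filter_manpages and the sample list are invented; the pattern is copied verbatim and relies on grep's \> word-end extension (GNU grep is assumed).

```sh
#!/bin/sh
# Hypothetical wrapper around the pattern quoted above.
filter_manpages() {
    grep -E '(\.[[:digit:]]([[:alpha:]][[:alnum:]]*)?\>|\.man)+(\.man|\.in)*$'
}

# Only names ending in a section suffix (optionally followed by
# .man/.in) or in .man should survive.
matches=$(printf '%s\n' \
    man/man2const/TIOCLINUX.2const \
    src/roff/groff/groff.1.man \
    src/bin/diffman-git \
    README \
    ChangeLog | filter_manpages)

printf '%s\n' "$matches"
```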

> MANS_SV=(
>     ./contrib/mm/groff_mmse.7.man
> )
>
> mkdir "$DIR"
> pushd "$DIR" >/dev/null
>
> # the change logs, so we know approximately where we are
> cp ../ChangeLog .
> for d in chem gdiffmk glilypond gperl gpinyin hdtbl mm mom rfc1345 sboxes
> do
>     cp ../contrib/$d/ChangeLog ./ChangeLog.$d
> done
>
> # our Texinfo manual
> cp ../build/doc/groff.txt .
>
> # our Texinfo manual via HTML
> cp ../build/doc/groff.html .
> lynx -dump groff.html > groff.html.txt
>
> # our ms manuals
> groff $BFLAG -ww -Tutf8 -ept -ms ../doc/ms.ms > ms.txt
>
> # our me manuals
> #groff $BFLAG -ww -Tutf8 -me ../doc/meintro.me > meintro.txt
> #groff $BFLAG -ww -Tutf8 -kt -me -mfr ../doc/meintro_fr.me > meintro_fr.txt
> #groff $BFLAG -ww -Tutf8 -me ../doc/meref.me > meref.txt
> me_pre=../ATTIC/my.me
> groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meintro.me > meintro.txt
> groff $BFLAG -ww -Tutf8 -kt -me -mfr $me_pre ../build/doc/meintro_fr.me \
>     > meintro_fr.txt
> groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meref.me > meref.txt
>
> for F in ${MANS[*]} ${MANS_SV[*]}
> do
>     G=../build/${F%.man}
>     if [ -f "$G" ]
>     then
>         cp "$G" .
>     else
>         echo "warning: \"$G\" missing" >&2
>     fi
> done
>
> : ${AD:=l}
> ARGS="$BFLAG -ww -dAD=$AD -rCHECKSTYLE=3 -rU1 -Tutf8 -e -t -mandoc"
> NOCR=-rcR=0
> LOCALE=
> ARGS_HTML="$BFLAG -ww -rCHECKSTYLE=3 -Thtml -e -t -mandoc -P-C -P-G"
>
> for P in *.[157]
> do
>     if [ "$P" = groff_mmse.7 ]
>     then
>         LOCALE=-msv
>     else
>         LOCALE=
>     fi
>     echo $0: $P >&2
>     echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
>     groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
>     echo "groff $ARGS $LOCALE $NOCR $P" > "$P.no-cR.txt"
>     groff $ARGS $LOCALE $NOCR "$P" >> "$P.no-cR.txt"
>     echo "<!-- groff $ARGS_HTML $LOCALE -P-I$P $P -->" > "$P.html"
>     groff $ARGS_HTML $LOCALE -P-I$P $P >> "$P.html"
>     rm "$P"
> done

Hmmm, my script is dumber; it only calls man(1). But I guess that's
enough. You can always tell man(1) to pass stuff to groff(1).

	| sortman \
	| while read -r f; do \
		case $# in
		0) old="HEAD:$f"; new="./$f"; ;;
		1) old="$1^:$f"; new="$1:$f"; ;;
		*) old="$1:$f"; new="$2:$f"; ;;
		esac;
		case $# in
		0) cat "$new"; ;;
		*) git show "$new"; ;;
		esac \
		| man /dev/stdin \
		| diff --label "$old" --label "$new" "${opts[@]}" \
			<(git show "$old" | man /dev/stdin) \
			/dev/stdin \
		|| true;
	done;
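
The first case statement above can be tried in isolation. This sketch wraps it in a function, revs_for, a name invented here, which takes the file name followed by zero, one, or two revisions and prints the old and new specifiers that end up as diff labels.

```sh
#!/bin/sh
# Hypothetical wrapper around the old/new selection shown above.
revs_for() {
    f=$1
    shift
    case $# in
    0) old="HEAD:$f"; new="./$f" ;;    # work tree against HEAD
    1) old="$1^:$f";  new="$1:$f" ;;   # a commit against its parent
    *) old="$1:$f";   new="$2:$f" ;;   # two arbitrary commits
    esac
    printf '%s %s\n' "$old" "$new"
}

page=man/man2const/TIOCLINUX.2const
no_rev=$(revs_for "$page")
one_rev=$(revs_for "$page" 'HEAD^')
two_revs=$(revs_for "$page" v1 v2)
printf '%s\n' "$no_rev" "$one_rev" "$two_revs"
```

With the single revision HEAD^ this gives the HEAD^^: and HEAD^: labels seen in the diff earlier in this message.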

> popd >/dev/null

popd(1) at the end of a script is not useful. Or do you source the
script?

> # vim:set ai et sw=4 ts=4 tw=80:
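
A small sketch of why the question matters: a directory change made by an executed script stays in the child process, while the same script sourced with "." changes the caller's directory. The temporary script chdir.sh is made up for the demonstration.

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf 'cd /\n' > "$tmp/chdir.sh"

start=$PWD

sh "$tmp/chdir.sh"       # executed: the cd happens in a child process
after_exec=$PWD

. "$tmp/chdir.sh"        # sourced: the cd happens in this very shell
after_source=$PWD

cd "$start"
rm -r "$tmp"
printf 'after executing: %s\n' "$after_exec"
printf 'after sourcing:  %s\n' "$after_source"
```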

Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>