Re: Migration of git-scm.com to a static web site: ready for review/testing
From: Johannes Schindelin <hidden>
Date: 2024-09-11 22:18:59
Me again,

On Fri, 17 Nov 2023, Johannes Schindelin wrote:

> the idea of migrating https://git-scm.com/ from a Rails app to a static
> site has been discussed several times on this list in the past. Thanks
> to the heroic, multi-year efforts of Matt Burke, Victoria Dye and
> Matthias Aßhauer, there is now a Pull Request, ready for review:
>
> https://github.com/git/git-scm.com/pull/1804
>
> This Pull Request is not for the faint of heart, mainly because of the
> sheer amount of generated pages that are committed to the repository
> (such as the book, the manual pages, etc, a design decision necessary
> to run this as a static website). These pages are generated by GitHub
> workflows that are intended to run on a schedule, and the scripts that
> generate them are part of the Pull Request. For that reason, I do not
> consider it necessary to review those generated pages, those reviews
> have been done in the upstream sources from which the pages were
> generated.
>
> At this point, the patches are fairly robust and I am mainly hoping for
> help with verifying that the static site works as intended, that
> existing links will continue to work with the new site (essentially,
> find obscure references to the existing website, then insert
> `git.github.io/` in the URL and verify that it works as intended).
>
> To that end, I deployed this branch to GitHub Pages so that anyone
> interested (hopefully many!) can have a look at
> https://git.github.io/git-scm.com/ and compare to the existing
> https://git-scm.com/.

It's time for the next update, after working with members of the Pagefind,
lychee and Playwright projects to improve on this effort. Here is a focused
list of changes since November 17th 2023 (when I sent the first RFC), in
descending, quite subjective order of importance:
- A couple of patches were upstreamed via
https://github.com/git/git-scm.com/pull/1855
- All other upstream changes were incorporated by rebasing the branch
(keeping the merge commits as structuring elements). A couple of
times. Okay, I won't lie, I must have rebased it about ten dozen
times, even if I "only" force-pushed 22 times since my first update on
this mailing list.
- A rather huge change is that the pre-rendered HTML files are no longer
stored directly in `content/`, but instead in `external/` (using Hugo
mounts), and each and every pre-rendered file is marked clearly as
such, to avoid accidental PRs that want to change those files.
- Another important change is that the link checker lychee is now used
in CI builds to verify that there are no broken internal links (it
does not try to follow external links; that would be too fragile and
too expensive).
- Yet another important change is that there are now UI tests (using
Playwright) that verify e.g. that the current book section is marked
as active in the Chapters drop-down.
- The Playwright tests also pass successfully on the current Rails-backed
git-scm.com, with a few bugs of that Rails app clearly documented in
the tests.
- The `README.md` is now updated in logical steps, reflecting how the
commit history builds up the Hugo site, then adds Pagefind, etc.
- Upgraded to the newest Hugo and Pagefind versions.
- I've re-done the way git-scm.com/docs works: it is no longer a
complex `_index.html` file, and now uses a much more powerful partial
template instead of awkward shortcodes.
- The pages' sections are now determined more carefully (and correctly).
- Since my last update, a couple of GUIs were added and others modified,
therefore I scripted the migration from the single
`resources/guis.yml` file to the separate `data/guis/*.yml` files, and
documented it in the commit message.
- The code using Pagefind was fixed (there were missing `await`s as well
as racy search result order).
- Translated manual pages are now searchable (when on a page in the
corresponding language).
- The `.html` file extension was dropped from the search results' URLs.
- Book sections containing question marks are now handled correctly,
with a fall-back for when those question marks are specified
unescaped in the URL (and are then interpreted as the separator
between the path and the query part of the URL).
- Missing files in the ProGit book or its translations no longer fail
the deployment (and yes, there was one update that resulted in such
missing files, and the speed with which this was fixed provided the
motivation for this change of behavior).
- The Hugo and Pagefind versions are no longer hard-coded in the
workflows, but instead parsed out from `hugo.yml` (which is now the
only location where these versions are specified).
- Some Pagefind changes that accidentally slipped into earlier commits
have been moved into their correct spots in the commit history.
- The CI builds now verify that the search works as intended, by
verifying that searching for a couple of subcommand names will find
the respective manual pages as top result.
- Commits that were created by automated GitHub workflows report in
their commit message which workflow was responsible for the update.
- Obviously, the automation updated the Git version, download URLs,
manual pages and book (including translations) many times, as designed.
- Workflows that can be triggered manually now really respect the
`force-rebuild` flag.
- The GitHub workflows were updated to use the current versions of the
GitHub Actions they rely on (e.g. `actions/checkout@v4`).
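
To illustrate the question-mark fall-back mentioned in the list above, here
is a minimal sketch in Python. It is not the code from the Pull Request; the
function name `normalize_book_path` is hypothetical, and the sketch assumes
that book pages never take real query parameters, so anything after a raw
`?` can safely be folded back into the path:

```python
from urllib.parse import quote, urlsplit


def normalize_book_path(url: str) -> str:
    """Return the path of a book section with question marks escaped.

    A raw `?` in the URL is parsed as the start of the query part, so the
    section name gets cut short. The fall-back re-attaches whatever was
    split off, then percent-encodes the question mark as `%3F`.
    """
    parts = urlsplit(url)
    path = parts.path
    without_fragment = url.split("#", 1)[0]
    if "?" in without_fragment:
        # Assumption: book URLs carry no real query parameters, so the
        # text after the first raw `?` belongs to the section name.
        path += "?" + without_fragment.split("?", 1)[1]
    # Keep `/` and already-encoded `%XX` sequences untouched.
    return quote(path, safe="/%")


if __name__ == "__main__":
    # Illustrative URL with an unescaped trailing question mark.
    print(normalize_book_path(
        "https://git-scm.com/book/en/v2/Getting-Started-What-is-Git?"
    ))
```

Both the escaped and the unescaped form of such a URL end up with the same
path, which is the behavior the fall-back is meant to provide.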
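
The idea of keeping the Hugo and Pagefind versions in a single place and
parsing them out from there can be sketched as follows. This is only an
illustration in Python, not what the workflows actually run; the key names
`hugo_version` and `pagefind_version`, the layout of the sample
configuration and the version numbers are all assumptions:

```python
import re


def read_version(config_text: str, key: str) -> str:
    """Extract the value of `key: <version>` from YAML-like text.

    A deliberately simple line-based lookup (no YAML parser), which is
    enough when the key appears exactly once in the file.
    """
    pattern = rf"^[ \t]*{re.escape(key)}:[ \t]*(\S+)"
    match = re.search(pattern, config_text, re.MULTILINE)
    if match is None:
        raise KeyError(f"{key} not found in configuration")
    # Drop optional quotes around the version string.
    return match.group(1).strip("\"'")


# Hypothetical excerpt; key names and versions are placeholders.
SAMPLE_CONFIG = """\
params:
  hugo_version: 1.2.3
  pagefind_version: "4.5.6"
"""

if __name__ == "__main__":
    print(read_version(SAMPLE_CONFIG, "hugo_version"))
    print(read_version(SAMPLE_CONFIG, "pagefind_version"))
```

With a single source of truth like this, bumping a version means touching
only one file, and every workflow picks up the change.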
I will try to reply to this mail with a range-diff that is slightly edited
to leave out the changes made by GitHub workflows (i.e. book and manual
pages). It is still over 100kB, therefore it might well bounce. Wish me
luck.
Ciao,
Johannes