Thread (6 messages), 2 authors, 2026-03-04

[GSOC] Discuss: Refactoring in order to reduce Git’s global state

From: Shreyansh Paliwal <hidden>
Date: 2026-02-19 18:12:17

Hi everyone,

I have been around Git for some time and am interested in the “Refactoring
in order to reduce Git’s global state” project for GSoC 2026.

So far I have built Git from source, completed a microproject, and explored
some related areas in worktree and wt-status. I have also gone through the
blog posts by Ayush and Bello Olamide, which were very helpful for
understanding the ongoing and previous work on this. From what I gathered:

- In Outreachy, recent work has focused on moving core.attributesfile and
  core.sparseCheckout into local structs and on handling the issue of
  lazy loading, but it is still a work in progress.

- In last year’s GSoC, the work included removing uses of the_repository
  and other globals in areas such as preload-index (core_preload_index),
  builtin/prune (repository_format_precious_objects), and
  builtin/fmt-merge-msg (merge_log_config).
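
To make the "local struct plus lazy loading" idea concrete, here is a
minimal sketch in plain C. Every name in it (repo_sketch,
repo_settings_sketch, repo_attributes_file) is hypothetical and is not
Git's actual API; it only shows a per-repository value that is loaded on
first use instead of living in a file-scope global.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-repository settings; not Git's real struct. */
struct repo_settings_sketch {
	int initialized;             /* 0 until the first lookup */
	const char *attributes_file; /* cached core.attributesfile-like value */
	int load_count;              /* only here to show loading happens once */
};

/* Hypothetical stand-in for a repository handle. */
struct repo_sketch {
	const char *configured_attributes_file; /* pretend this came from config */
	struct repo_settings_sketch settings;
};

/* Load the settings on first use instead of eagerly at startup. */
void prepare_settings_sketch(struct repo_sketch *r)
{
	if (r->settings.initialized)
		return;
	r->settings.attributes_file = r->configured_attributes_file;
	r->settings.load_count++;
	r->settings.initialized = 1;
}

/* Accessor that replaces reading a global variable directly. */
const char *repo_attributes_file(struct repo_sketch *r)
{
	assert(r != NULL);
	prepare_settings_sketch(r);
	return r->settings.attributes_file;
}
```

With this shape, two repository handles in the same process each keep
their own value, and repeated lookups on one handle do not reload it.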

I still have a few questions about the project that I would like to clarify:

- Should the primary focus be on core library code rather than builtin?
  (ref. [1])

- Is it preferable to approach the project file-wise (e.g. clean up one
  file until it is completely free of the_repository) or variable-wise
  (e.g. identify one global from environment.c and eliminate it across
  the codebase)?

- Are there any globals that are best left in place for now?

For example, editor.c mainly involves two globals:

- editor_program, which appears to be used only within the file and is not
  dependent on the repository. Would it be preferable to remove it from
  environment.c and localize it within editor.c, move it into struct
  repository_settings / repo_config_values, or keep it as is?

- the_repository, whose only use in editor.c is in git_sequence_editor().
  That function could be changed to take a struct repository passed down
  from its callers, but it is also called from builtin/var.c, where a
  local repository instance is not available. In that case, would it be
  acceptable to pass the_repository, or is there a better way?
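
As a rough illustration of the second option, here is a sketch of a
library-style function that takes the repository explicitly, with the
global passed in only by a top-level caller. All names are hypothetical
stand-ins (the_repo_sketch for the_repository, sequence_editor_for for a
repository-aware git_sequence_editor()); this is not how editor.c or
builtin/var.c are written today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct repository. */
struct repo_sketch {
	const char *sequence_editor; /* per-repository sequence editor, or NULL */
};

/* Hypothetical stand-in for the process-wide the_repository. */
struct repo_sketch *the_repo_sketch;

/* Library-style function: the repository is an explicit parameter. */
const char *sequence_editor_for(struct repo_sketch *r, const char *fallback)
{
	assert(r != NULL);
	return r->sequence_editor ? r->sequence_editor : fallback;
}

/*
 * Builtin-style caller without a local repository: the global is
 * referenced here, at the top level, and nowhere in the library code.
 */
const char *builtin_var_sequence_editor(const char *fallback)
{
	return sequence_editor_for(the_repo_sketch, fallback);
}
```

The point of the sketch is only that the dependency on the global moves
out of the library function and into the one caller that has no other
repository to pass.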

I have also surveyed the files that use #define USE_THE_REPOSITORY_VARIABLE
to roughly analyse the usage of globals, and as far as I can tell much of
the library code is still dependent on the_repository. Could reducing the
use of the_repository throughout the codebase be taken up as a priority?

Thanks,
Shreyansh

[1] https://lore.kernel.org/git/7b5dd0c4-0ca0-458e-89db-621a70dac9ae@gmail.com/