[GSOC] Discuss: Refactoring in order to reduce Git’s global state
From: Shreyansh Paliwal <hidden>
Date: 2026-02-19 18:12:17
Hi everyone,

I have been around Git for some time and am interested in the “Refactoring in order to reduce Git’s global state” project for GSoC 2026. So far I have built Git from source, completed a microproject, and explored some related areas in worktree and wt-status.

I have also gone through the blog posts by Ayush and Bello Olamide, which were very helpful in getting to know the ongoing and previous work related to this project. From what I gathered,

- In Outreachy, recent work has focused on moving core.attributesfile and core.sparseCheckout into local structs and on handling the issue of lazy loading, but it is still a work in progress.

- In last year’s GSoC work, the focus included removing uses of the_repository and other globals across areas such as preload-index (core_preload_index), builtin/prune (repository_format_precious_objects), and builtin/fmt-merge-msg (merge_log_config).

I still have a few questions about the project, for better clarity,

- Should the primary focus be on core library code rather than builtin? (ref. [1])

- Is it preferable to approach the project file-wise (e.g. clean up one file so that it is completely free of the_repository) or variable-wise (e.g. identify one piece of global state in environment.c and eliminate it across the codebase)?

- Are there any globals which are best left in place for now?

For example, in editor.c there are mainly two globals,

- editor_program, which appears to be used only within the file and is not dependent on the repository. So would it be preferable to remove it from environment.c and localize it within editor.c, move it into struct repository_settings / repo_config_values, or keep it as is?

- the_repository, of which there is only one instance, in the function git_sequence_editor(). That function is called within editor.c, where the callers can be modified to pass a struct repository down, but it is also used in builtin/var.c, where a local repository instance is not available. In that case, would it be feasible to pass the_repository, or is there any other way?

I have also surveyed the files that use #define USE_THE_REPOSITORY_VARIABLE to roughly analyse the usage of globals, and I could make out that much of the library code is still dependent on the_repository. Could that be taken up as a priority, to reduce the usage of the_repository throughout the codebase?

Thanks,
Shreyansh

[1] https://lore.kernel.org/git/7b5dd0c4-0ca0-458e-89db-621a70dac9ae@gmail.com/
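
To make the editor.c question above more concrete, here is a rough, self-contained sketch of the pattern I have in mind. It is not Git's actual code: struct repo, the_repo, editor_program_local, sequence_editor_for() and var_sequence_editor() are all hypothetical stand-ins, and the fallback logic is a simplifying assumption. The idea is that the library function takes the repository as a parameter, the editor default becomes file-local, and a caller without a local repository hands over the global only at the boundary.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct repository; NOT Git's real definition. */
struct repo {
	const char *sequence_editor; /* assumed per-repository setting, may be NULL */
};

/* Hypothetical stand-in for the global the_repository. */
static struct repo the_repo_instance = { NULL };
static struct repo *the_repo = &the_repo_instance;

/*
 * Option "localize": the default editor lives in this file only,
 * instead of being exported from an environment.c-like global.
 */
static const char *editor_program_local = "vi";

/*
 * Library-side function after the refactor: the repository is an
 * explicit parameter, so no global is touched in here.
 */
const char *sequence_editor_for(const struct repo *r)
{
	if (r && r->sequence_editor)
		return r->sequence_editor;
	return editor_program_local; /* assumed fallback, for illustration */
}

/*
 * Builtin-side caller with no local repository at hand: the global is
 * passed in at the outermost layer rather than read inside the library.
 */
const char *var_sequence_editor(void)
{
	return sequence_editor_for(the_repo);
}
```

With this shape, callers that already have a repository pass it directly, and the remaining use of the global is confined to the builtin layer, which is the part I am unsure is acceptable.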