Re: Question relate to collaboration on git monorepo
From: ZheNing Hu <hidden>
Date: 2022-09-21 15:42:38
Elijah Newren [off-list ref] wrote on Wed, Sep 21, 2022 at 09:48:
On Tue, Sep 20, 2022 at 5:42 AM ZheNing Hu [off-list ref] wrote:
Hey guys, if two users of a git monorepo are working on different subprojects, /project1 and /project2, using partial clone and sparse-checkout, and user one pushes first, then when user two wants to push too, he must pull some blobs that were pushed by user one.
This is not true. While user two must pull the new commit and any new trees pushed by user one (which will mean knowing the hashes of the new files), there is no need to download the actual content of the new files unless and until some git command is run that attempts to view the file's contents.
Yeah, now I understand that git fetch will not download blobs outside the sparse-checkout patterns, but git merge will. So git pull will download some missing blobs here.
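One way to check this locally is to count the objects a partial clone has not downloaded yet. The helper below is only a minimal sketch: `count_missing` is a hypothetical name, and it relies on `git rev-list --missing=print`, which lists absent objects with a leading `?` instead of fetching them.

```shell
#!/bin/sh
# count_missing [<rev>]  (hypothetical helper, not a git command)
# Prints how many objects reachable from <rev> (default: HEAD) are absent
# from the local object store. In a blob:none partial clone these are the
# blobs that have not been downloaded yet.
count_missing() {
    # --missing=print reports absent objects as "?<oid>" rather than
    # fetching them; grep -c counts those lines.
    # "|| true" keeps the exit status at 0 when the count is zero.
    git rev-list --objects --missing=print "${1:-HEAD}" | grep -c '^?' || true
}
```

Comparing `count_missing FETCH_HEAD` right after `git fetch` with `count_missing HEAD` after the merge should show how many blobs, if any, the merge had to download.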
The large number of interruptions in git push may be another problem: if thousands of projects are in one monorepo, and no one else has any code that would conflict with mine in any way, do I still need to pull every time? Is there a way to make improvements here?
No, you only need to pull when attempting to push back to the server. Further, if you're worried that the second push will fail, you could easily script it and put "pull --rebase && push" in a loop until it succeeds (I mean, you did say no one would have any conflicts). In fact, you could just make that a common script distributed to your users and tell them to run that instead of "git push" if they don't want to worry about manually updating.
Ah, this method looks a little funny, but maybe it can work. This issue may also apply to some code review tools, which may need a "pull --rebase && git cr" loop.
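The "pull --rebase && push" loop suggested above could look roughly like the following. This is a sketch, not a tested production script: the function name `sync_and_push` and the attempt limit are assumptions added here, and the loop gives up as soon as the rebase fails, since a conflict needs a human.

```shell
#!/bin/sh
# sync_and_push [<max_attempts>]  (hypothetical helper)
# Repeats "git pull --rebase && git push" until the push is accepted
# or the attempt limit (default: 5) is reached.
sync_and_push() {
    max_attempts=${1:-5}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # A failed rebase (e.g. a real conflict) will not fix itself,
        # so stop here and let the user resolve it.
        git pull --rebase || return 1
        # The push only fails if someone else pushed in the meantime.
        git push && return 0
        attempt=$((attempt + 1))
    done
    echo "sync_and_push: giving up after $max_attempts attempts" >&2
    return 1
}
```

Users would run `sync_and_push` in place of a bare `git push`; the current branch needs an upstream configured, as it is after a normal clone.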
Now, if you have thousands of nearly fully independent subprojects and lots of developers for each subproject and they all commit & push *very* frequently, I guess you might be able to eventually get to the scale where you are worried there will be so much contention that the script will take too long. I'd be surprised if you got that far, but even if you did, you could easily adopt a lieutenant-like workflow (somewhat like the linux kernel, but even simpler given the independence of your projects). In such a workflow, you'd let people in subprojects push to their subproject fork (instead of to the "main" or "central" repository), and the lieutenants of the subprojects then periodically push work from that subproject to the main project in batches.
Makes sense. When this monorepo really reaches that kind of scale, splitting the workflow might be the right thing to do.
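For illustration, the batching step a lieutenant would run might be sketched as below. Everything named here is an assumption rather than something from the thread: the function `forward_batch`, the remote names `central` and `fork`, the scratch branch `batch-<branch>`, and the choice to merge the fork's work instead of rebasing it.

```shell
#!/bin/sh
# forward_batch <central-remote> <fork-remote> <branch>  (hypothetical helper)
# Run in the lieutenant's clone, which has both remotes configured.
# Takes everything developers pushed to the subproject fork and
# forwards it to the central repository in a single push.
forward_batch() {
    central=$1
    fork=$2
    branch=$3
    # Refresh both remote-tracking refs.
    git fetch -q "$central" "$branch" || return 1
    git fetch -q "$fork" "$branch" || return 1
    # Start a scratch branch at the central tip, then merge the batch.
    git checkout -q -B "batch-$branch" "$central/$branch" || return 1
    git merge -q --no-edit "$fork/$branch" || return 1
    # One push to the central repository per batch.
    git push -q "$central" "batch-$branch:$branch"
}
```

A call such as `forward_batch central fork main` could then run on a schedule, so that only the lieutenant contends for pushes to the central repository.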
I don't really see much need here for improvements, myself.
Here's an example of how two users constrain each other when running git push.
Did you pay attention to warnings you got along the way? In particular...
    git clone --bare mono-repo
You missed the following command right after your clone:
    git -C mono-repo.git config uploadpack.allowFilter true
    # user1
    rm -rf m1
    git clone --filter="blob:none" --no-checkout --no-local ./mono-repo.git m1
Since you forgot to set the important config I mentioned above, your command here generates the following line of output, among others:
    warning: filtering not recognized by server, ignoring
This warning means you weren't testing partial clones, but regular full clones. Perhaps that was the cause of your confusion?
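A test script can guard against this silent fallback by checking the clone output for that warning. The wrapper below is a minimal sketch; `clone_partial_checked` and its return codes are hypothetical, and it assumes the source is given as a `file://` URL (or another real transport) so that the clone goes through upload-pack.

```shell
#!/bin/sh
# clone_partial_checked <url> <dir>  (hypothetical helper)
# Returns 0 for a genuine blob:none partial clone, 1 if the clone itself
# failed, and 2 if the server ignored the filter (a full clone was made).
# The server side needs, as noted in the thread:
#   git -C mono-repo.git config uploadpack.allowFilter true
clone_partial_checked() {
    status=0
    # LC_ALL=C keeps git's messages untranslated so the grep below matches.
    log=$(LC_ALL=C git clone --filter=blob:none --no-checkout "$1" "$2" 2>&1) || status=$?
    printf '%s\n' "$log" >&2
    [ "$status" -eq 0 ] || return 1
    if printf '%s\n' "$log" | grep -q 'filtering not recognized by server'; then
        echo "error: server ignored --filter; '$2' is a full clone" >&2
        return 2
    fi
    return 0
}
```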
Oh, sorry, I forgot to record this. I have configured them globally:
    uploadpack.allowanysha1inwant=true
    uploadpack.allowfilter=true
Thanks for the answer,
ZheNing Hu