Thread: 10 messages, 4 authors, 2022-09-23

Re: Question relate to collaboration on git monorepo

From: ZheNing Hu <hidden>
Date: 2022-09-21 15:42:38

Elijah Newren [off-list ref] wrote on Wed, Sep 21, 2022 at 09:48:
On Tue, Sep 20, 2022 at 5:42 AM ZheNing Hu [off-list ref] wrote:
quoted
Hey, guys,

If two users of a git monorepo are working on different subprojects,
/project1 and /project2, using partial clone and sparse-checkout,
and user one pushes first, then when user two wants to push as well,
he must first pull some blobs that were pushed by user one.
This is not true.  While user two must pull the new commit and any new
trees pushed by user one (which will mean knowing the hashes of the
new files), there is no need to download the actual content of the new
files unless and until some git command is run that attempts to view
the file's contents.
Yeah, now I understand that git fetch will not download blobs outside
the sparse-checkout patterns, but git merge will. So git pull will
download some missing blobs here.
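
One way to observe the fetch half of this is to count the objects that are still missing locally before and after a fetch. The following is a minimal sketch, not the setup used in this thread: the repository layout, file names, and identities are made up for illustration, and it assumes a git recent enough to support `clone --sparse` (2.25 or later).

```shell
#!/bin/sh
# Sketch: show that "git fetch" in a blob:none partial clone brings in
# commits and trees but not the new blobs. All names are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"

# Seed repository with two independent subprojects.
git init -q src
git -C src config user.email "dev@example.com"
git -C src config user.name "Dev"
mkdir src/project1 src/project2
echo one > src/project1/a.txt
echo two > src/project2/b.txt
git -C src add .
git -C src commit -q -m "initial"

# Server side: bare repository that accepts filtered (partial) clones.
git clone -q --bare src mono-repo.git
git -C mono-repo.git config uploadpack.allowFilter true
git -C mono-repo.git config uploadpack.allowAnySHA1InWant true

# User two: partial clone, sparse checkout limited to project2.
git clone -q --filter=blob:none --sparse --no-local ./mono-repo.git m2
git -C m2 sparse-checkout set project2

# Missing objects are printed with a leading "?" by --missing=print.
before=$(git -C m2 rev-list --objects --missing=print HEAD | grep -c '^?' || true)

# User one (here simply the seed repository) pushes a project1 change.
echo more >> src/project1/a.txt
git -C src commit -q -am "project1 change"
git -C src push -q "$work/mono-repo.git" HEAD

# User two fetches; the new project1 blob should stay on the server.
git -C m2 fetch -q origin
after=$(git -C m2 rev-list --objects --missing=print origin/HEAD | grep -c '^?' || true)

echo "missing before fetch: $before, after fetch: $after"
```

If the count of missing objects grows after the fetch, the new blob was announced by its hash but not downloaded. Whether a later merge needs that blob depends on what the merge has to look at, so this sketch deliberately stops at the fetch.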
quoted
The large number of interruptions in git push may be another
problem: if thousands of projects are in one monorepo, and
no one else has any code that would conflict with mine in any way,
do I still need to pull every time? Is there a way to make
improvements here?
No, you only need to pull when attempting to push back to the server.

Further, if you're worried that the second push will fail, you could
easily script it and put "pull --rebase && push" in a loop until it
succeeds (I mean, you did say no one would have any conflicts).  In
fact, you could just make that a common script distributed to your
users and tell them to run that instead of "git push" if they don't
want to worry about manually updating.
Ah, this method looks a little funny, but it may work. This
issue may also apply to some code review tools, which may need
a "pull --rebase && git cr" loop.
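
The retry loop suggested above could be sketched roughly as follows. The function names `retry` and `sync_and_push`, the `RETRY_DELAY` variable, and the attempt limit are illustrative choices, not anything prescribed in the thread. The sketch also relies on the stated assumption that rebases never conflict; a conflicting rebase would stop mid-way and every later attempt would fail until it is resolved by hand.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or a maximum number of
# attempts is reached. Names and defaults here are hypothetical.

# retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
    max_attempts=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1            # give up after the last allowed attempt
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"   # brief pause before trying again
    done
    return 0
}

# One round of "update, then publish". If someone else pushed in
# between, "git push" fails and the caller can simply try again.
sync_and_push() {
    git pull --rebase && git push
}

# Intended use inside a work tree:
#   retry 5 sync_and_push
```

Capping the number of attempts keeps the script from spinning forever when the failure is not a simple race, for example a rejected push due to permissions.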
Now, if you have thousands of nearly fully independent subprojects and
lots of developers for each subproject and they all commit & push
*very* frequently, I guess you might be able to eventually get to the
scale where you are worried there will be so much contention that the
script will take too long.  I'd be surprised if you got that far, but
even if you did, you could easily adopt a lieutenant-like workflow
(somewhat like the linux kernel, but even simpler given the
independence of your projects).  In such a workflow, you'd let people
in subprojects push to their subproject fork (instead of to the "main"
or "central" repository), and the lieutenants of the subprojects then
periodically push work from that subproject to the main project in
batches.
Makes sense. When this monorepo really reaches this kind of scale,
splitting the workflow might be the right thing to do.
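
As a rough illustration of the lieutenant-like workflow described above, the sketch below uses throwaway local repositories. The names `central.git`, `project1-fork.git`, `dev`, `lieutenant`, and the branch name `main` are all hypothetical, and the central repository starts out empty, so the batch push is a plain fast-forward. In a real setup the lieutenant would also have to integrate whatever has landed in the central repository since the previous batch.

```shell
#!/bin/sh
# Sketch: developers push to a subproject fork; a lieutenant forwards
# the accumulated work to the central repository in one batch.
set -eu
work=$(mktemp -d)
cd "$work"

git init -q --bare central.git          # the "main" repository
git init -q --bare project1-fork.git    # fork used by project1 developers

# A developer commits twice and pushes each commit to the fork only.
git init -q dev
git -C dev config user.email "dev@example.com"
git -C dev config user.name "Dev"
mkdir dev/project1
echo one > dev/project1/a.txt
git -C dev add .
git -C dev commit -q -m "project1: first change"
git -C dev push -q "$work/project1-fork.git" HEAD:refs/heads/main
echo two >> dev/project1/a.txt
git -C dev commit -q -am "project1: second change"
git -C dev push -q "$work/project1-fork.git" HEAD:refs/heads/main

# The lieutenant periodically fetches the fork and pushes the whole
# batch to the central repository with a single push.
git init -q lieutenant
git -C lieutenant fetch -q "$work/project1-fork.git" main
git -C lieutenant push -q "$work/central.git" FETCH_HEAD:refs/heads/main

central_tip=$(git -C central.git rev-parse refs/heads/main)
dev_tip=$(git -C dev rev-parse HEAD)
echo "central: $central_tip"
echo "dev:     $dev_tip"
```

With this split, developers in one subproject only contend with each other on their fork, and the central repository sees one push per batch instead of one per commit.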
I don't really see much need here for improvements, myself.
quoted
Here's an example of how two users constrain each other when running git push.
Did you pay attention to warnings you got along the way?  In particular...
quoted
git clone --bare mono-repo
You missed the following command right after your clone:

   git -C mono-repo.git config uploadpack.allowFilter true
quoted
# user1
rm -rf m1
git clone --filter="blob:none" --no-checkout --no-local ./mono-repo.git m1
Since you forgot to set the important config I mentioned above, your
command here generates the following line of output, among others:

    warning: filtering not recognized by server, ignoring

This warning means you weren't testing partial clones, but regular
full clones.  Perhaps that was the cause of your confusion?
Oh, sorry, I forgot to record this: I have configured them globally:

uploadpack.allowanysha1inwant=true
uploadpack.allowfilter=true
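
Besides watching for the warning, it is possible to check after the fact whether a clone ended up partial by inspecting the remote's promisor settings. The sketch below is self-contained and uses a throwaway repository with made-up contents. It sets the option on the bare repository itself rather than globally, and it assumes a reasonably recent git that records the filter under `remote.origin.partialclonefilter`.

```shell
#!/bin/sh
# Sketch: verify that a clone really is a partial clone.
# Repository contents and identities are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"

git init -q src
git -C src config user.email "dev@example.com"
git -C src config user.name "Dev"
mkdir src/project1 src/project2
echo one > src/project1/a.txt
echo two > src/project2/b.txt
git -C src add .
git -C src commit -q -m "initial"

git clone -q --bare src mono-repo.git
# Without this setting the server ignores --filter and the clone is full.
git -C mono-repo.git config uploadpack.allowFilter true

git clone -q --filter=blob:none --no-checkout --no-local ./mono-repo.git m1

# A partial clone marks its remote as a promisor and remembers the filter.
promisor=$(git -C m1 config remote.origin.promisor || true)
filter=$(git -C m1 config remote.origin.partialclonefilter || true)
echo "promisor=$promisor filter=$filter"
```

If both values come back empty, the filter was ignored and the clone is a regular full clone.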

Thanks for the answer,
ZheNing Hu