Thread: 13 messages, 7 authors, 2022-08-01

Re: Question: What's the best way to implement directory permission control in git?

From: ZheNing Hu <hidden>
Date: 2022-07-31 16:16:01

On Sat, Jul 30, 2022 at 07:50, Emily Shaffer [off-list ref] wrote:
On Wed, Jul 27, 2022 at 1:56 AM ZheNing Hu [off-list ref] wrote:
if there is a monorepo such as
git@github.com:derrickstolee/sparse-checkout-example.git

There are many files and directories:

client/
    android/
    electron/
    iOS/
service/
    common/
    identity/
    list/
    photos/
web/
    browser/
    editor/
    friends/
bootstrap.sh
LICENSE.md
README.md

Now we can use partial clone + sparse-checkout to reduce
network overhead and disk usage, which is good.

But I also need an ACL to control which directories or files people can fetch/push.
E.g. I don't want a client to fetch the code in "service" or "web".

Now, if the user runs "git log -p", "git sparse-checkout add service",
or another git command, git will automatically download the missing objects via
"git fetch --filter=blob:none --stdin <oid>".

This means that the git client and server exchange git objects
(and don't care about paths), so we cannot simply ban someone from
downloading a "path" on the server side.

What should I do? You may recommend that I use submodules,
but due to their complexity, I don't really want to use them :-(
As a quick note, there is some effort on making submodules less
complex, at least from the user perspective. My team and I have been
actively working on improvements in that area for the past year or so.
Please feel free to read and examine the design doc[1] to see if the
future looks brighter in that direction than you thought - or, even
better, if there's something missing from that design that would be
compelling in allowing you to use submodules to solve your use case.
Thanks, I think the submodule improvements may change my mind.
But the problem I have is whether I need permission control
for all "subdirectories" (if I find that this is not necessary,
then submodules might be an option).
As for differing ACLs within a single repository... Google has had
some attempts at it and has only found pain, at least where Git is
involved. As others have mentioned elsewhere downthread, it doesn't
really match Git's data model.
That's so sad :(
Gerrit has tried to support something sort of similar to this -
per-branch read permissions. They were really painful! So much so that
our Gerrit team is actively discouraging their use, and in the process
of deprecating them. It turns out that on the server side, calculating
permissions for which commit should be visible is very expensive,
because you are not just saying "is commit abcdef on
forbidden-branch?" but rather are saying "is commit abcdef on
forbidden-branch *and not on any branches $user is allowed to see*?"
The same calculation woes would be true of per-object or per-tree
permissions, because Git will treat 'everyone/can/see/.linter.config'
and 'very/secret/dir/.linter.config' as a single object with a single
ID if the contents of each '.linter.config' are identical. It is still
very expensive for the server to decide whether or not it's okay to
send a certain object. Part of the reason the branch ACL calculation
is so painful is that we have some repositories with many many
branches (100,000+); if you're using a very large monorepo you will
probably find similarly expensive and complex calculations even in a
single repository.
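To make the branch calculation described above concrete, here is a minimal sketch of the visibility check. It is only an illustration, not how Gerrit or any Git server implements it; the commit graph, branch names, and function names are all hypothetical.

```python
from collections import deque

# Hypothetical toy commit graph: commit id -> list of parent ids.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # tip of "main"
    "d": ["b"],
    "e": ["d"],  # tip of "forbidden-branch"
}

# Hypothetical branch tips.
BRANCHES = {"main": "c", "forbidden-branch": "e"}


def reachable_from(tip, parents):
    """Return the set of commits reachable from tip via parent links."""
    seen = set()
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents[commit])
    return seen


def is_visible(commit, allowed_branches, branches, parents):
    """A commit is visible if at least one allowed branch can reach it."""
    return any(
        commit in reachable_from(branches[name], parents)
        for name in allowed_branches
    )


# "b" is on forbidden-branch but also on main, so a main-only user may see it.
print(is_visible("b", ["main"], BRANCHES, PARENTS))
# "d" is only reachable from forbidden-branch, so it must stay hidden.
print(is_visible("d", ["main"], BRANCHES, PARENTS))
```

In this naive form every check walks history from each allowed tip, which hints at why the cost grows so quickly on repositories with 100,000+ branches.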
Agreed. As Avar said, there is delta data too (so data cannot easily
be hidden).
Generally, this isn't something I'd like to see Git support - I think
it would by necessity be kludgey and has some very pointy edge cases
for the user (what if I'm trying to merge from another branch and
there is a conflict in very/secret/dir/, but I'm not allowed to see
it?). But of course Git is open source, and my opinion is only one of
many; I just wanted to share some past pain that we've had in this
area.
To summarize (your ideas and those from other replies), I have reason to believe
that git itself cannot easily solve this directory permissions problem:
1. Files with the same object id can be in different directories
(data cannot be isolated).
2. Deltas can share data between multiple objects
(data cannot be isolated).
3. Permission management is very cumbersome and time consuming,
especially on large repositories.
4. Whether merge conflicts in inaccessible directories should be
visible to the user is a big problem.
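
Point 1 can be checked with a small sketch. It assumes a repository using the default SHA-1 object format, where a blob ID is the SHA-1 of a "blob <size>\0" header followed by the file contents. The file contents and the helper name git_blob_id are made up for illustration; the two paths are the ones from Emily's example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute a blob object ID the way SHA-1 based git repositories do."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical files: same contents, different directories.
files = {
    "everyone/can/see/.linter.config": b"indent = 4\n",
    "very/secret/dir/.linter.config": b"indent = 4\n",
}

ids = {path: git_blob_id(data) for path, data in files.items()}
for path, oid in ids.items():
    print(oid, path)

# The path is not part of the hash input, so both entries share one ID.
print(len(set(ids.values())) == 1)
```

Since the path never enters the hash, a server that is asked for this object ID cannot tell which of the two directories the request is "for".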
 - Emily

1: https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/
Thanks.

ZheNing Hu