Thread: 115 messages, 6 authors, 2019-07-11

Re: [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal

From: Matheus Tavares Bernardino <hidden>
Date: 2019-02-27 17:40:31

On Tue, Feb 26, 2019 at 9:32 AM Duy Nguyen [off-list ref] wrote:
On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares
[off-list ref] wrote:
quoted
Replace the opendir/readdir/closedir calls that traverse directories
recursively in copy_or_link_directory() with the dir-iterator API. This
simplifies the code and avoids recursive calls to
copy_or_link_directory().

This change also makes copy_or_link_directory call die() on a readdir or
stat error inside dir_iterator_advance. Previously it would only print a
warning for stat errors and ignore readdir errors, which isn't nice
because a local git clone would finish successfully even though the
.git/objects copy didn't fully succeed.

Signed-off-by: Matheus Tavares <redacted>
---
I can also make the change described in the last paragraph in a separate
patch before this one, but I would have to undo it in this patch because
dir-iterator already implements it. So, IMHO, it would be just noise
and not worth it.

 builtin/clone.c | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index fd580fa98d..b23ba64c94 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -23,6 +23,8 @@
 #include "transport.h"
 #include "strbuf.h"
 #include "dir.h"
+#include "dir-iterator.h"
+#include "iterator.h"
 #include "sigchain.h"
 #include "branch.h"
 #include "remote.h"
@@ -411,42 +413,37 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
 }

 static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
-                                  const char *src_repo, int src_baselen)
+                                  const char *src_repo)
 {
-       struct dirent *de;
-       struct stat buf;
        int src_len, dest_len;
-       DIR *dir;
-
-       dir = opendir(src->buf);
-       if (!dir)
-               die_errno(_("failed to open '%s'"), src->buf);
+       struct dir_iterator *iter;
+       int iter_status;
+       struct stat st;
+       unsigned flags;

        mkdir_if_missing(dest->buf, 0777);

+       flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
+       iter = dir_iterator_begin(src->buf, flags);
+
        strbuf_addch(src, '/');
        src_len = src->len;
        strbuf_addch(dest, '/');
        dest_len = dest->len;

-       while ((de = readdir(dir)) != NULL) {
+       while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
                strbuf_setlen(src, src_len);
-               strbuf_addstr(src, de->d_name);
+               strbuf_addstr(src, iter->relative_path);
                strbuf_setlen(dest, dest_len);
-               strbuf_addstr(dest, de->d_name);
-               if (stat(src->buf, &buf)) {
-                       warning (_("failed to stat %s\n"), src->buf);
-                       continue;
-               }
-               if (S_ISDIR(buf.st_mode)) {
-                       if (!is_dot_or_dotdot(de->d_name))
-                               copy_or_link_directory(src, dest,
-                                                      src_repo, src_baselen);
+               strbuf_addstr(dest, iter->relative_path);
+
+               if (S_ISDIR(iter->st.st_mode)) {
+                       mkdir_if_missing(dest->buf, 0777);
I wonder if this mkdir_if_missing is sufficient. What if you have to
create multiple directories?

Let's say on the first advance we hit "a". Then on the second advance we
hit directory "b/b/b/b"; we would need to mkdir recursively, and something
like safe_create_leading_directories() would be a better fit.

I'm not sure if it can happen though. I haven't re-read dir-iterator
code carefully.
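
To make the concern concrete, here is a minimal standalone sketch of what
"mkdir recursively" means. mkdir_recursive() is a hypothetical helper written
for illustration only; it is not git's safe_create_leading_directories(), and
it assumes a POSIX system. It treats EEXIST as success without checking that
the existing entry is really a directory.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: create every missing component of `path`,
 * similar to `mkdir -p`. Returns 0 on success, -1 on failure.
 */
static int mkdir_recursive(const char *path, mode_t mode)
{
	size_t len = strlen(path);
	char *buf, *p;
	int ret = -1;

	if (!len) {
		errno = ENOENT;
		return -1;
	}
	buf = malloc(len + 1);
	if (!buf)
		return -1;
	memcpy(buf, path, len + 1);

	/* Create each leading component; start at 1 to skip a leading '/'. */
	for (p = buf + 1; *p; p++) {
		if (*p != '/')
			continue;
		*p = '\0';
		if (mkdir(buf, mode) && errno != EEXIST)
			goto out;
		*p = '/';
	}
	/* Finally create the full path itself. */
	if (mkdir(buf, mode) && errno != EEXIST)
		goto out;
	ret = 0;
out:
	free(buf);
	return ret;
}
```

If the iterator always yields a directory before anything inside it, a single
mkdir per directory entry is enough and a helper like this is unnecessary;
whether dir-iterator guarantees that order is exactly the open question here.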
quoted
                        continue;
                }

                /* Files that cannot be copied bit-for-bit... */
-               if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
+               if (!strcmp(iter->relative_path, "info/alternates")) {
While we're here, this should be fspathcmp to be friendlier to
case-insensitive filesystems. You probably should fix it in a separate
patch though.
Nice! I will make this change in a separate patch in the series. Thanks!
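
For readers unfamiliar with the suggestion, the idea can be sketched as
follows. path_cmp() and is_alternates() are hypothetical names, and the
ignore_case parameter is an assumption made for this example; how git itself
decides whether the filesystem is case-insensitive is not shown in this
thread.

```c
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/*
 * Hypothetical stand-in for a filesystem-aware path comparison:
 * compare case-insensitively only when the caller says the
 * filesystem ignores case.
 */
static int path_cmp(const char *a, const char *b, int ignore_case)
{
	return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

/* Returns 1 if `relative_path` names the alternates file, else 0. */
static int is_alternates(const char *relative_path, int ignore_case)
{
	return !path_cmp(relative_path, "info/alternates", ignore_case);
}
```

With a plain strcmp(), a path spelled "INFO/Alternates" on a case-insensitive
filesystem would be copied bit-for-bit instead of going through
copy_alternates().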
quoted
                        copy_alternates(src, dest, src_repo);
                        continue;
                }
@@ -463,7 +460,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
                if (copy_file_with_time(dest->buf, src->buf, 0666))
                        die_errno(_("failed to copy file to '%s'"), dest->buf);
        }
-       closedir(dir);
+
+       if (iter_status != ITER_DONE) {
+               strbuf_setlen(src, src_len);
+               die(_("failed to iterate over '%s'"), src->buf);
+       }
I think you need to abort the iterator even when it returns ITER_DONE.
At least that's how the first caller in files-backend.c does it.
Hm, I don't think so, since dir_iterator_advance() already frees the
resources before returning ITER_DONE. Also, I may be wrong, but it
doesn't seem to me that files-backend.c does it. The function
files_reflog_iterator_advance(), which calls dir_iterator_advance(),
even sets the dir-iterator pointer to NULL as soon as ITER_DONE is
returned.
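
The ownership contract described in this reply can be illustrated with a toy
iterator. Everything below (toy_iter, toy_begin, toy_advance, toy_count, the
TOY_* values and the live counter) is hypothetical and only mirrors the
behavior claimed above; it is not the dir-iterator implementation.

```c
#include <stdlib.h>

enum { TOY_OK = 0, TOY_DONE = -1 };

struct toy_iter {
	int remaining;
};

/* Number of currently allocated iterators, tracked for illustration. */
static int toy_live_iters;

static struct toy_iter *toy_begin(int n)
{
	struct toy_iter *it = malloc(sizeof(*it));

	if (!it)
		return NULL;
	it->remaining = n;
	toy_live_iters++;
	return it;
}

/*
 * Frees the iterator itself once it is exhausted, so the caller
 * must not touch (or abort) it after TOY_DONE is returned.
 */
static int toy_advance(struct toy_iter *it)
{
	if (it->remaining > 0) {
		it->remaining--;
		return TOY_OK;
	}
	free(it);
	toy_live_iters--;
	return TOY_DONE;
}

/* Walk all entries and return how many were visited (-1 on allocation failure). */
static int toy_count(int n)
{
	struct toy_iter *it = toy_begin(n);
	int visited = 0;

	if (!it)
		return -1;
	while (toy_advance(it) == TOY_OK)
		visited++;
	/* Already freed by toy_advance(); drop the pointer, no abort call. */
	it = NULL;
	(void)it;
	return visited;
}
```

Under such a contract, calling an abort function after the DONE status would
operate on freed memory, which is why the caller only clears its pointer.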

quoted
 }

 static void clone_local(const char *src_repo, const char *dest_repo)
@@ -481,7 +482,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
                get_common_dir(&dest, dest_repo);
                strbuf_addstr(&src, "/objects");
                strbuf_addstr(&dest, "/objects");
-               copy_or_link_directory(&src, &dest, src_repo, src.len);
+               copy_or_link_directory(&src, &dest, src_repo);
                strbuf_release(&src);
                strbuf_release(&dest);
        }
--
2.20.1

--
Duy