Re: [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal
From: Matheus Tavares Bernardino <hidden>
Date: 2019-02-27 17:40:31
On Tue, Feb 26, 2019 at 9:32 AM Duy Nguyen [off-list ref] wrote:
On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares [off-list ref] wrote:quoted
Replace usage of opendir/readdir/closedir API to traverse directories recursively, at copy_or_link_directory function, by the dir-iterator API. This simplifies the code and avoid recursive calls to copy_or_link_directory. This process also makes copy_or_link_directory call die() in case of an error on readdir or stat, inside dir_iterator_advance. Previously it would just print a warning for errors on stat and ignore errors on readdir, which isn't nice because a local git clone would end up successfully even though the .git/objects copy didn't fully succeeded. Signed-off-by: Matheus Tavares <redacted> --- I can also make the change described in the last paragraph in a separate patch before this one, but I would have to undo it in this patch because dir-iterator already implements it. So, IMHO, it would be just noise and not worthy. builtin/clone.c | 45 +++++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 22 deletions(-)diff --git a/builtin/clone.c b/builtin/clone.c index fd580fa98d..b23ba64c94 100644 --- a/builtin/clone.c +++ b/builtin/clone.c@@ -23,6 +23,8 @@ #include "transport.h" #include "strbuf.h" #include "dir.h" +#include "dir-iterator.h" +#include "iterator.h" #include "sigchain.h" #include "branch.h" #include "remote.h"@@ -411,42 +413,37 @@ static void mkdir_if_missing(const char *pathname, mode_t mode) } static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, - const char *src_repo, int src_baselen) + const char *src_repo) { - struct dirent *de; - struct stat buf; int src_len, dest_len; - DIR *dir; - - dir = opendir(src->buf); - if (!dir) - die_errno(_("failed to open '%s'"), src->buf); + struct dir_iterator *iter; + int iter_status; + struct stat st; + unsigned flags; mkdir_if_missing(dest->buf, 0777); + flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS; + iter = dir_iterator_begin(src->buf, flags); + strbuf_addch(src, '/'); src_len = src->len; strbuf_addch(dest, '/'); dest_len = dest->len; - while ((de = readdir(dir)) != NULL) { + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) { strbuf_setlen(src, src_len); - strbuf_addstr(src, de->d_name); + strbuf_addstr(src, iter->relative_path); strbuf_setlen(dest, dest_len); - strbuf_addstr(dest, de->d_name); - if (stat(src->buf, &buf)) { - warning (_("failed to stat %s\n"), src->buf); - continue; - } - if (S_ISDIR(buf.st_mode)) { - if (!is_dot_or_dotdot(de->d_name)) - copy_or_link_directory(src, dest, - src_repo, src_baselen); + strbuf_addstr(dest, iter->relative_path); + + if (S_ISDIR(iter->st.st_mode)) { + mkdir_if_missing(dest->buf, 0777);I wonder if this mkdir_if_missing is sufficient. What if you have to create multiple directories? Let's say the first advance, we hit "a". The the second advance we hit directory "b/b/b/b", we would need to mkdir recursively and something like safe_create_leading_directories() would be a better fit. I'm not sure if it can happen though. I haven't re-read dir-iterator code carefully.quoted
continue; } /* Files that cannot be copied bit-for-bit... */ - if (!strcmp(src->buf + src_baselen, "/info/alternates")) { + if (!strcmp(iter->relative_path, "info/alternates")) {While we're here, this should be fspathcmp to be friendlier to case-insensitive filesystems. You probably should fix it in a separate patch though.
Nice! I will make this change in a separate patch in the series. Thanks!
quoted
copy_alternates(src, dest, src_repo); continue; }@@ -463,7 +460,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, if (copy_file_with_time(dest->buf, src->buf, 0666)) die_errno(_("failed to copy file to '%s'"), dest->buf); } - closedir(dir); + + if (iter_status != ITER_DONE) { + strbuf_setlen(src, src_len); + die(_("failed to iterate over '%s'"), src->buf); + }I think you need to abort the iterator even when it returns ITER_DONE. At least that's how the first caller in files-backend.c does it.
Hm, I don't think so, since dir_iterator_advance() already frees the resources before returning ITER_DONE. Also, I may be wrong, but it doesn't seem to me, that files-backend.c does it. The function files_reflog_iterator_advance() that calls dir_iterator_advance() even sets the dir-iterator pointer to NULL as soon as ITER_DONE is returned.
quoted
} static void clone_local(const char *src_repo, const char *dest_repo)@@ -481,7 +482,7 @@ static void clone_local(const char *src_repo, const char *dest_repo) get_common_dir(&dest, dest_repo); strbuf_addstr(&src, "/objects"); strbuf_addstr(&dest, "/objects"); - copy_or_link_directory(&src, &dest, src_repo, src.len); + copy_or_link_directory(&src, &dest, src_repo); strbuf_release(&src); strbuf_release(&dest); } --2.20.1-- Duy