Thread (104 messages) 104 messages, 4 authors, 2018-02-19

Re: [PATCH v3 16/23] ref-filter: make cat_file_info independent

From: Jeff King <hidden>
Date: 2018-02-16 21:54:51

On Thu, Feb 15, 2018 at 01:33:25PM +0300, Оля Тележная wrote:
2018-02-15 8:53 GMT+03:00 Jeff King [off-list ref]:
quoted
On Mon, Feb 12, 2018 at 08:08:54AM +0000, Olga Telezhnaya wrote:
quoted
Remove connection between expand_data variable
in cat-file and in ref-filter.
It will help further to get rid of using expand_data in cat-file.
I have to admit I'm confused at this point about what is_cat_file is
for, or even why we need cat_file_data. Shouldn't these items be handled
by their matching ref-filter atoms at this point?
We discussed that earlier outside of mailing list, and I even tried to
implement that idea and spent a couple of days to prove that it's not
possible.
The problem is that the list of atoms is made dynamically, and we
can't store pointers to any values in each atom. That's why we need
separate cat_file_info variable that is outside of main atom list.
We also need is_cat_file because we still have some part of logic that
is different for cat-file and for all other commands, and sometimes we
need to know that information.
OK, right, I forgot about the pointers thing. This is definitely the
sort of design decision that should go into the commit message.

I went over our off-list discussion again to refresh my memory. Let me
try to summarize a bit (for myself, or for other readerss). Naively, one
would hope that you could do something like:

  1. While parsing the %(objectsize) atom, add a pointer like:

       the_object_info->sizep = &atom.size;

  2. Later when fulfilling the request for a specific object, if
     the_object_info is not empty, then call sha1_object_info() with it
     to hold the results.

  3. Read the value out of atom.size when assembling the formatted
     string.

But that has two problems:

  a. During parsing, we're resizing the atom array, so the &atom.size
     pointer is subject to change.  This is the issue you mentioned
     above. I think we could solve it by taking a pass over the atoms
     after parsing and filling in the pointers then. But...

  b. Another problem is that the same atom may appear multiple times. So
     parsing "%(objectsize) %(objectsize)", we can't point
     the_object_info at _both_ of them. We need our call to
     sha1_object_info() to store the result in a single location, and
     then as we format each atom they can both use that (which works
     even if they might format it differently, say if you had
     "%(deltabase) %(deltabase:abbrev)".

So we do need some kind of global storage to hold these results. This is
sort of the same problem we have with parsing the objects at all. There
we get by without a global because all of the logic is crammed into
populate_value(). In theory we could do the same thing here, but there
are just a lot of different items we might be asking for. So instead of
"need_tagged", you'd have a multitude of need_objectsize_disk,
need_deltabase, etc, which gets unwieldy. We may as well fill in a
"struct object_info".

So OK, I agree that something like "struct expand_data" is needed.

But what seems non-ideal to me is the way that ref-filter depends on it
in this cat-file-specific way. If it needs those values it should be
calling sha1_object_info(). And if it doesn't, then it can skip the
call. It shouldn't need to know about cat-file at all.

And likewise cat-file should not know about expand_data. And I think
that's true by the end of the series, though the steps in the middle
left me a little confused.

-Peff
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help