Thread: 20 messages, 7 authors, 2016-09-26

RE: [PATCH 2/2] radix-tree: Fix optimisation problem

From: Matthew Wilcox <hidden>
Date: 2016-09-26 21:28:00
Also in: linux-fsdevel, lkml

From: linus971@gmail.com [mailto:linus971@gmail.com] On Behalf Of Linus Torvalds
On Sun, Sep 25, 2016 at 12:04 PM, Linus Torvalds
[off-list ref] wrote:
quoted
It gets rid of the ad-hoc arithmetic in radix_tree_descend(), and just
makes all that be inside the is_sibling_entry() logic instead. Which got
renamed and made to actually return the main sibling.
Sadly, it looks like gcc generates bad code for this approach. Looks
like it ends up testing the resulting sibling pointer twice (because
we explicitly pass -fno-delete-null-pointer-checks in the kernel,
and we have no way to say "look, I know this pointer I'm returning is
non-null").

So a smaller patch that keeps the old boolean "is_sibling_entry()" but
then actually *uses* that inside radix_tree_descend() and then tries
to make the nasty cast to "void **" more legible by making it use a
temporary variable seems to be a reasonable balance.

At least I feel like I can still read the code, but admittedly by now
that may be because I've stared at those few lines so much that I feel
like I know what's going on. So maybe the code isn't actually any more
legible after all.

.. and unlike my previous patch, it actually generates better code
than the original (while still passing the fixed test-suite, of
course). The reason seems to be exactly that temporary variable,
allowing us to just do

        entry = rcu_dereference_raw(*sibentry);

rather than doing

        entry = rcu_dereference_raw(parent->slots[offset]);

with the re-computed offset.
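
For readers following along, here is a minimal userspace sketch of the shape described above. It is not the kernel code: struct node, MAP_SHIFT, INTERNAL_NODE, untag(), make_sibling() and descend() are hypothetical stand-ins, and the sketch assumes a sibling slot holds a tagged pointer to the canonical slot in the same node. The point is only that the temporary sibentry is dereferenced directly, so the slot address is not rebuilt from the recomputed offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SHIFT     6                      /* assumed fan-out of 64 slots */
#define MAP_SIZE      (1U << MAP_SHIFT)
#define MAP_MASK      (MAP_SIZE - 1)
#define INTERNAL_NODE ((uintptr_t)1)         /* assumed tag in the low bits */

struct node {
    unsigned int shift;
    void *slots[MAP_SIZE];
};

static inline bool is_internal(const void *entry)
{
    return ((uintptr_t)entry & 3) == INTERNAL_NODE;
}

static inline void *untag(void *entry)
{
    return (void *)((uintptr_t)entry & ~INTERNAL_NODE);
}

/* Build a sibling entry: a tagged pointer to the canonical slot. */
static inline void *make_sibling(struct node *parent, unsigned int canonical)
{
    assert(canonical < MAP_SIZE);
    return (void *)((uintptr_t)&parent->slots[canonical] | INTERNAL_NODE);
}

/* Boolean test: does the entry point back into the parent's own slots? */
static inline bool is_sibling_entry(const struct node *parent, void *entry)
{
    uintptr_t p  = (uintptr_t)untag(entry);
    uintptr_t lo = (uintptr_t)&parent->slots[0];
    uintptr_t hi = (uintptr_t)&parent->slots[MAP_SIZE - 1];

    return p >= lo && p <= hi;
}

/* Returns the canonical slot offset and stores the entry found there. */
static unsigned int descend(struct node *parent, void **entryp,
                            unsigned long index)
{
    unsigned int offset = (index >> parent->shift) & MAP_MASK;
    void *entry = parent->slots[offset];

    if (is_internal(entry) && is_sibling_entry(parent, entry)) {
        void **sibentry = untag(entry);      /* the temporary variable */

        offset = (unsigned int)(sibentry - parent->slots);
        entry = *sibentry;                   /* no parent->slots[offset] reload */
    }

    *entryp = entry;
    return offset;
}
```

Looking up an index that lands on a sibling slot should return the canonical slot's offset and entry, the same result as looking up the canonical index itself.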

So I think I'll commit this unless somebody screams.
Acked-by: Matthew Wilcox <redacted>

I don't love it.  But I think it's a reasonable fix for this point in the release cycle, and I have an idea for changing the representation of sibling slots that will make this moot.

(Basically adopting Konstantin's idea for using the *last* entry instead of the *first*, and then using entries of the form (offset << 2 | RADIX_TREE_INTERNAL_NODE), so we can identify sibling entries without knowing the parent pointer, and we can go straight from sibling entry to slot offset as a shift rather than as a pointer subtraction).
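
A rough sketch of what that encoding could look like, in userspace C. This is only an illustration of the idea in the paragraph above, not a proposed patch: MAP_SHIFT, the value of INTERNAL_NODE, and the helper names are assumptions, and it relies on real node pointers never being small enough to fall inside the encoded range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAP_SHIFT     6                      /* assumed fan-out of 64 slots */
#define MAP_SIZE      (1U << MAP_SHIFT)
#define INTERNAL_NODE ((uintptr_t)1)         /* assumed tag in the low bits */

/* Encode the canonical slot offset as (offset << 2 | INTERNAL_NODE). */
static inline void *make_sibling_entry(unsigned int offset)
{
    assert(offset < MAP_SIZE);
    return (void *)(((uintptr_t)offset << 2) | INTERNAL_NODE);
}

/*
 * No parent pointer needed: a sibling entry is an internal-tagged value
 * too small to be a real node address. This sketch does not special-case
 * offset 0.
 */
static inline bool is_sibling_entry(const void *entry)
{
    uintptr_t v = (uintptr_t)entry;

    return (v & 3) == INTERNAL_NODE && v < ((uintptr_t)MAP_SIZE << 2);
}

/* Sibling entry to slot offset is a shift, not a pointer subtraction. */
static inline unsigned int sibling_entry_to_offset(const void *entry)
{
    return (unsigned int)((uintptr_t)entry >> 2);
}
```

With these definitions, a sibling pointing at slot 5 is the value (5 << 2) | 1, and recovering the offset needs neither the parent node nor its slots array.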