Thread (67 messages) 67 messages, 9 authors, 2012-09-06

Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: 2012-08-28 23:00:56
Also in: dm-devel, linux-mm, lkml, netdev

* Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote:
* Sasha Levin (levinsasha928@gmail.com) wrote:
quoted
On 08/28/2012 12:11 PM, Mathieu Desnoyers wrote:
quoted
* Sasha Levin (levinsasha928@gmail.com) wrote:
quoted
On 08/25/2012 06:24 AM, Mathieu Desnoyers wrote:
quoted
* Tejun Heo (tj@kernel.org) wrote:
quoted
Hello,

On Sat, Aug 25, 2012 at 12:59:25AM +0200, Sasha Levin wrote:
quoted
Thats the thing, the amount of things of things you can do with a given bucket
is very limited. You can't add entries to any point besides the head (without
walking the entire list).
Kinda my point.  We already have all the hlist*() interface to deal
with such cases.  Having something which is evidently the trivial
hlist hashtable and advertises as such in the interface can be
helpful.  I think we need that more than we need anything fancy.

Heh, this is a debate about which one is less insignificant.  I can
see your point.  I'd really like to hear what others think on this.

Guys, do we want something which is evidently trivial hlist hashtable
which can use hlist_*() API directly or do we want something better
encapsulated?
My 2 cents, FWIW: I think this specific effort should target a trivially
understandable API and implementation, for use-cases where one would be
tempted to reimplement his own trivial hash table anyway. So here
exposing hlist internals, with which kernel developers are already
familiar, seems like a good approach in my opinion, because hiding stuff
behind new abstraction might make the target users go away.

Then, as we see the need, we can eventually merge a more elaborate hash
table with poneys and whatnot, but I would expect that the trivial hash
table implementation would still be useful. There are of course very
compelling reasons to use a more featureful hash table: automatic
resize, RT-aware updates, scalable updates, etc... but I see a purpose
for a trivial implementation. Its primary strong points being:

- it's trivially understandable, so anyone how want to be really sure
  they won't end up debugging the hash table instead of their
  work-in-progress code can have a full understanding of it,
- it has few dependencies, which makes it easier to understand and
  easier to use in some contexts (e.g. early boot).

So I'm in favor of not overdoing the abstraction for this trivial hash
table, and honestly I would rather prefer that this trivial hash table
stays trivial. A more elaborate hash table should probably come as a
separate API.

Thanks,

Mathieu
Alright, let's keep it simple then.

I do want to keep the hash_for_each[rcu,safe] family though.
Just a thought: if the API offered by the simple hash table focus on
providing a mechanism to find the hash bucket to which belongs the hash
chain containing the key looked up, and then expects the user to use the
hlist API to iterate on the chain (with or without the hlist _rcu
variant), then it might seem consistent that a helper providing
iteration over the entire table would actually just provide iteration on
all buckets, and let the user call the hlist for each iterator for each
node within the bucket, e.g.:

struct hlist_head *head;
struct hlist_node *pos;

hash_for_each_bucket(ht, head) {
        hlist_for_each(pos, head) {
                ...
        }
}

That way you only have to provide one single macro
(hash_for_each_bucket), and rely on the already existing:

- hlist_for_each_entry
- hlist_for_each_safe
- hlist_for_each_entry_rcu
- hlist_for_each_safe_rcu
  .....

and various flavors that can appear in the future without duplicating
this API. So you won't even have to create _rcu, _safe, nor _safe_rcu
versions of the hash_for_each_bucket macro.

Thoughts ?
In my opinion, the downside here is that it'll require 2 function calls and 2
levels of nesting for a simple hash iteration.
Those are macros, not functions. No function call is required. But I see
your point about nesting.
quoted
hash_for_each_bucket() will always be followed by an iteration of that
bucket, so splitting a hash_for_each() which does both into 2
different functions which will almost always must be called in that
given order sounds unintuitive to me.

It's also just 3 different possible iterators:

 - hlist_for_each_entry
 - hlist_for_each_entry_safe
 - hlist_for_each_entry_rcu

So I think that it's a good price to pay - 2 extra macro definitions
in the header to save a macro call + nesting level in each place that
uses a hashtable.
I must admin I don't care that much one way or another.
Looking again at:

+#define hash_for_each_size(name, bits, bkt, node, obj, member)                 \
+       for (bkt = 0; bkt < HASH_SIZE(bits); bkt++)                             \
+               hlist_for_each_entry(obj, node, &name[bkt], member)

you will notice that a "break" or "continue" in the inner loop will not
affect the outer loop, which is certainly not what the programmer would
expect!

I advise strongly against creating such error-prone construct.

Thanks,

Mathieu


Thanks,

Mathieu
quoted

Thanks,
Sasha
quoted
Thanks,

Mathieu
-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help