[PATCH 00/13] reftable: prepare for re-seekable iterators
From: Patrick Steinhardt <hidden>
Date: 2024-05-08 11:03:40
Hi,
the reftable library uses iterators both to iterate through a set of
records and to look up a single record. In past patch series, I have
focussed quite a lot on optimizing the case where we iterate through a
large set of records. But looking up a record is still quite
inefficient when doing multiple lookups. This is because whenever we
want to look up a record, we need to create a new iterator, including
all of its internal data structures.
To address this inefficiency, the patch series at hand refactors the
reftable library such that creation of iterators and seeking on an
iterator are separate steps. This refactoring prepares us for reusing
iterators to perform multiple seeks, which in turn will allow us to
reuse internal data structures for subsequent seeks.
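To illustrate the idea, here is a minimal toy sketch of an iterator
where creation and seeking are separate steps. It is not the reftable
API: `struct toy_iter` and all `toy_iter_*()` functions are
hypothetical names made up for this example, and the "internal data
structure" is just a scratch buffer holding the key we seek to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy iterator over a sorted array of record names. */
struct toy_iter {
	const char **records;   /* sorted record names, not owned */
	size_t nr;              /* number of records */
	size_t pos;             /* index of the next record to yield */
	char *scratch;          /* internal buffer, kept across seeks */
	size_t scratch_alloc;   /* allocated size of `scratch` */
	unsigned allocations;   /* how often `scratch` had to grow */
};

/* Step 1: create the iterator. It is not positioned anywhere yet. */
static void toy_iter_init(struct toy_iter *it, const char **records, size_t nr)
{
	memset(it, 0, sizeof(*it));
	it->records = records;
	it->nr = nr;
	it->pos = nr;
}

/*
 * Step 2: seek to the first record that is not smaller than `want`.
 * This may be called repeatedly; the scratch buffer only grows when
 * the new key does not fit into what earlier seeks have allocated.
 */
static void toy_iter_seek(struct toy_iter *it, const char *want)
{
	size_t len = strlen(want) + 1;
	size_t lo = 0, hi = it->nr;

	assert(it->records || !it->nr);

	if (len > it->scratch_alloc) {
		char *p = realloc(it->scratch, len);
		if (!p)
			abort();
		it->scratch = p;
		it->scratch_alloc = len;
		it->allocations++;
	}
	memcpy(it->scratch, want, len);

	/* Binary search for the lower bound of the wanted key. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(it->records[mid], it->scratch) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	it->pos = lo;
}

/* Yield the next record, or NULL once the iterator is exhausted. */
static const char *toy_iter_next(struct toy_iter *it)
{
	if (it->pos >= it->nr)
		return NULL;
	return it->records[it->pos++];
}

/* Free the internal state once, after all seeks are done. */
static void toy_iter_release(struct toy_iter *it)
{
	free(it->scratch);
	memset(it, 0, sizeof(*it));
}
```

A caller would invoke `toy_iter_init()` once, then `toy_iter_seek()`
and `toy_iter_next()` as often as needed, and `toy_iter_release()` at
the end. With a combined "create and seek" function, the allocation
would instead have to happen on every lookup.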
The patch series is structured as follows:
- Patches 1 to 5 perform some general cleanups to make the reftable
iterators easier to understand.
- Patches 6 to 9 refactor the iterators internally such that creation
of the iterator and seeking on it are clearly separated.
- Patches 10 to 13 adapt the external interfaces such that they allow
for reuse of iterators.
Note: this series does not yet go all the way to re-seekable iterators,
and there are no users yet. The patch series is complex enough as it
is, so I decided to defer that to the next iteration. Thus, the
whole refactoring here should essentially be a large no-op that prepares
the infrastructure for re-seekable iterators.
The series depends on pks/reftable-write-optim at fa74f32291
(reftable/block: reuse compressed array, 2024-04-08).
Thanks!
Patrick
Patrick Steinhardt (13):
reftable/block: use `size_t` to track restart point index
reftable/reader: avoid copying index iterator
reftable/reader: unify indexed and linear seeking
reftable/reader: separate concerns of table iter and reftable reader
reftable/reader: inline `reader_seek_internal()`
reftable/reader: set up the reader when initializing table iterator
reftable/merged: split up initialization and seeking of records
reftable/merged: simplify indices for subiterators
reftable/generic: move seeking of records into the iterator
reftable/generic: adapt interface to allow reuse of iterators
reftable/reader: adapt interface to allow reuse of iterators
reftable/stack: provide convenience functions to create iterators
reftable/merged: adapt interface to allow reuse of iterators
refs/reftable-backend.c | 48 ++++----
reftable/block.c | 4 +-
reftable/generic.c | 94 +++++++++++----
reftable/generic.h | 9 +-
reftable/iter.c | 23 +++-
reftable/merged.c | 148 ++++++++----------------
reftable/merged.h | 6 +
reftable/merged_test.c | 19 ++-
reftable/reader.c | 218 +++++++++++++++--------------------
reftable/readwrite_test.c | 35 ++++--
reftable/reftable-generic.h | 8 +-
reftable/reftable-iterator.h | 21 ++++
reftable/reftable-merged.h | 15 ---
reftable/reftable-reader.h | 45 ++------
reftable/reftable-stack.h | 18 +++
reftable/stack.c | 29 ++++-
16 files changed, 378 insertions(+), 362 deletions(-)
--
2.45.0