[PATCH v2 0/7] reftable: improve ref iteration performance
From: Patrick Steinhardt <hidden>
Date: 2024-02-12 08:32:26
Hi,
this is the second version of my patch series that improves ref
iteration performance. There are no code changes compared to v1, but a
few improvements to the commit messages.
Thanks!
Patrick
Patrick Steinhardt (7):
reftable/record: introduce function to compare records by key
reftable/merged: allocation-less dropping of shadowed records
reftable/merged: skip comparison for records of the same subiter
reftable/pq: allocation-less comparison of entry keys
reftable/block: swap buffers instead of copying
reftable/record: don't try to reallocate ref record name
reftable/reader: add comments to `table_iter_next()`
reftable/block.c | 3 +--
reftable/merged.c | 19 +++++++-------
reftable/merged.h | 2 --
reftable/pq.c | 13 +--------
reftable/reader.c | 26 +++++++++++-------
reftable/record.c | 67 ++++++++++++++++++++++++++++++++++++++++++++---
reftable/record.h | 7 +++++
7 files changed, 100 insertions(+), 37 deletions(-)
Range-diff against v1:
1: fadabec696 ! 1: bcdb5a2bf0 reftable/record: introduce function to compare records by key
@@ Commit message
In some places we need to sort reftable records by their keys to
determine their ordering. This is done by first formatting the keys into
a `struct strbuf` and then using `strbuf_cmp()` to compare them. This
- logic is needlessly roundabout and can end up costing quite a bit fo CPU
+ logic is needlessly roundabout and can end up costing quite a bit of CPU
cycles, both due to the allocation and formatting logic.
- Introduce a new `reftable_record_cmp()` function that knows to compare
- two records with each other without requiring allocations.
+ Introduce a new `reftable_record_cmp()` function that knows how to
+ compare two records with each other without requiring allocations.
Signed-off-by: Patrick Steinhardt [off-list ref]
2: 576a96f2e5 = 2: 2364a0fa33 reftable/merged: allocation-less dropping of shadowed records
3: 0ca86eba71 ! 3: 205e6b488b reftable/merged: skip comparison for records of the same subiter
@@ Commit message
to return.
There is an edge case here where we can skip that comparison: when the
- record in the priority queue comes from the same subiterator than the
+ record in the priority queue comes from the same subiterator as the
record we are about to return then we know that its key must be larger
than the key of the record we are about to return. This property is
guaranteed by the sub-iterators, and if it didn't hold then the whole
4: 1c9c19a3b3 = 4: fd09ba70fe reftable/pq: allocation-less comparison of entry keys
5: ac3d43c001 = 5: 2317aa43b9 reftable/block: swap buffers instead of copying
6: 41dff8731c = 6: 8c67491504 reftable/record: don't try to reallocate ref record name
7: 2f1f1dd95e ! 7: 167f67fad8 reftable/reader: add comments to `table_iter_next()`
@@ Commit message
more obvious. While at it, touch up the code to conform to our code
style better.
+ Note that one of the refactorings merges two conditional blocks into
+ one. Before, we had the following code:
+
+ ```
+ err = table_iter_next_block(&next, ti
+ if (err != 0) {
+ ti->is_finished = 1;
+ }
+ table_iter_block_done(ti);
+ if (err != 0) {
+ return err;
+ }
+ ```
+
+ As `table_iter_block_done()` does not care about `is_finished`, the
+ conditional blocks can be merged into one block:
+
+ ```
+ err = table_iter_next_block(&next, ti
+ table_iter_block_done(ti);
+ if (err != 0) {
+ ti->is_finished = 1;
+ return err;
+ }
+ ```
+
+ This is both easier to reason about and more performant because we have
+ one branch less.
+
Signed-off-by: Patrick Steinhardt [off-list ref]
## reftable/reader.c ##
base-commit: c875e0b8e036c12cfbf6531962108a063c7a821c
--
2.43.GIT
Attachments
- signature.asc [application/pgp-signature] 833 bytes