Thread (2 messages) 2 messages, 2 authors, 2024-09-26

Re: [PATCH v2 04/22] reftable/basics: handle allocation failures in `reftable_calloc()`

From: Patrick Steinhardt <hidden>
Date: 2024-09-26 12:12:00

On Tue, Sep 24, 2024 at 09:59:24AM -0700, Junio C Hamano wrote:
Patrick Steinhardt [off-list ref] writes:
quoted
 void *reftable_calloc(size_t nelem, size_t elsize)
 {
-	size_t sz = st_mult(nelem, elsize);
-	void *p = reftable_malloc(sz);
-	memset(p, 0, sz);
+	void *p;
+
+	if (nelem && elsize > SIZE_MAX / nelem)
+		return NULL;
Now it is open coded, it strikes me that the check is a bit overly
conservative.

If we are trying to allocate slightly than half of SIZE_MAX by
asking elsize==1 and nelem==(SIZE_MAX / 2 + 10), we'd say that
(elsize * nelem) would not fit size_t and fail the allocation.

For the purpose of this caller, it is not a practical issue, as it
is likely that you'd not be able to obtain slightly more than half
your address space out of a single allocation anyway.

But it illustrates why open coding is not necessarily an excellent
idea in the longer term, doesn't it?  When unsigned_mult_overflows()
is updated to avoid such a false positive, how would we remember
that we need to update this copy we?
I agree in general, but with the reftable library I'm stuck between a
rock and a hard place. My goal is to make it fully reusable by other
projects without first having to do surgery on their side. While having
something like `st_mult()` is simple enough to port over, the biggest
problem I have is that over time we sneak in more and more code from the
Git codebase. The result is death by a thousand cuts.

Now if we had a single header that exposes a small set of functions
without _anything_ else it could work alright. But I rather doubt that
we really want to have a standalone header for `st_mult()` that we can
include. But without such a standalone header it is simply too easy to
start building on top of Git features by accident.

So I'm instead aiming to go a different way and fully cut the reftable
code loose from Git. So even if we e.g. eventually want `struct strbuf`
return errors on failure, it would only address one part of my problem.

A couple months ago I said that I'll try to write something like shims
that wrap our types in reftable types such that other projects can
provide implementations for such shims themselves. I tried to make that
work, but the result was an unholy mess that really didn't make any
sense whatsoever. Most of the features that I need from the Git codebase
can be provided in a couple of lines of code (`struct strbuf` is roughly
50 lines for example), plus maybe a callback function that can be
provided to wire things up on the user side (`register_tempfiles()` for
example). So once I saw that those wrappers are harder to use and result
in roughly the same lines of code I decided to scrap that approach and
instead try to convert it fully.

So yeah, overall we shouldn't open-code things like this. But I don't
really see another way to do this for the reftable library.

Patrick
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help