Thread: 50 messages, 19 authors, 2017-10-02

Re: Which hash function to use, was Re: RFC: Another proposed hash function transition plan

From: Johannes Schindelin <hidden>
Date: 2017-06-15 19:35:31

Hi,

On Thu, 15 Jun 2017, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jun 15 2017, Jeff King jotted:
>
> > On Thu, Jun 15, 2017 at 08:05:18PM +0900, Mike Hommey wrote:
> >
> > > On Thu, Jun 15, 2017 at 12:30:46PM +0200, Johannes Schindelin wrote:
> > >
> > > > Footnote *1*: SHA-256, like all hash functions whose output is
> > > > essentially the entire internal state, is susceptible to a
> > > > so-called "length extension attack", where the hash of a
> > > > secret+message can be used to generate the hash of
> > > > secret+message+piggyback without knowing the secret.  This is not
> > > > the case for Git: only visible data are hashed. The type of attacks
> > > > Git has to worry about is very different from the length extension
> > > > attacks, and it is highly unlikely that that weakness of SHA-256
> > > > leads to, say, a collision attack.
> > >
> > > What do the experts think of SHA512/256, which completely removes the
> > > concerns over length extension attack? (which I'd argue is better than
> > > sweeping them under the carpet)
> >
> > I don't think it's sweeping them under the carpet. Git does not use the
> > hash as a MAC, so length extension attacks aren't a thing (and even if
> > we later wanted to use the same algorithm as a MAC, the HMAC
> > construction is a well-studied technique for dealing with it).

I really tried to drive that point home, as it had been made very clear to
me that the length extension attack is something that Git need not concern
itself with.

The length extension attack *only* comes into play when there are secrets
that are hashed. In that case, one would not want others to be able to
produce a valid hash *without* knowing the secrets. But SHA-256 allows one
to "reconstruct" the internal state (which is the hash value) and continue
hashing from that point, i.e. if the hash for secret+message is known, it
is easy to calculate the hash for secret+message+addition, without knowing
the secret at all.
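
To make the mechanics concrete, here is a minimal sketch in Python (not
anything from Git's code base). It reimplements the SHA-256 compression
function so that a published digest can be loaded back in as the internal
state. The names `compress`, `md_padding` and `extend`, as well as the
example secret and messages, are made up for illustration, and the forger
is assumed to know (or guess) the length of secret+message. Note that the
forged input also contains the original padding ("glue") between the
message and the addition.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _icbrt(n):
    """Exact integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants: the first 32 fractional bits of the cube roots of the
# first 64 primes, as specified for SHA-256.
_PRIMES = [p for p in range(2, 312)
           if all(p % d for d in range(2, int(p ** 0.5) + 1))]
K = [_icbrt(p << 96) & MASK for p in _PRIMES]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    """Apply the SHA-256 compression function to one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def md_padding(length):
    """Padding SHA-256 appends to a message of `length` bytes."""
    return b"\x80" + b"\x00" * ((55 - length) % 64) + struct.pack(">Q", length * 8)

def extend(known_digest, known_length, addition):
    """Forge the digest of original+glue+addition from the digest alone."""
    state = list(struct.unpack(">8I", known_digest))  # digest == internal state
    glue = md_padding(known_length)
    total = known_length + len(glue) + len(addition)
    tail = addition + md_padding(total)
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)

# Hypothetical scenario: the forger sees only `digest` and `message`.
secret = b"hypothetical-secret"
message = b"amount=10"
digest = hashlib.sha256(secret + message).digest()

glue, forged = extend(digest, len(secret + message), b"&amount=1000000")

# Only someone who knows the secret can compute this reference value.
real = hashlib.sha256(secret + message + glue + b"&amount=1000000").digest()
assert forged == real
```

The forger never touches `secret`; only its length enters the computation.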

That is exactly *not* the case with Git. In Git, what we want to hash is
known in its entirety. Even if the hash value were not identical to the
internal state, that state would be easy enough to reconstruct, because
*there are no secrets*.
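
As a small illustration of "known in its entirety", here is a sketch in
Python of how a blob's object name is derived: a short header followed by
the file content, all of it public. The helper name `git_blob_id` is made
up for this example, and it uses SHA-1 because that is what Git uses at
the time of this thread.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object name of a blob: SHA-1 over "blob <size>\\0" plus the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Same value that `git hash-object` reports for a file containing
# "hello world\n".
print(git_blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Nothing in that input is hidden from an attacker, so there is no secret
prefix whose hash could be "extended".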

So please understand that even the direction that the length extension
attack takes is completely different from the direction that any attack
weakening SHA-256 for Git's purposes would have to take. As far as Git's
usage is concerned, SHA-256 has no known weaknesses.

It is *really, really, really* important to understand this before going
on to suggest another hash function such as SHA-512/256 (i.e. SHA-512
with different initial values, truncated to 256 bits), based only on that
perceived weakness of SHA-256.

> > That said, SHA-512 is typically a little faster than SHA-256 on 64-bit
> > platforms. I don't know if that will change with the advent of
> > hardware instructions oriented towards SHA-256.
>
> Quoting my own
> CACBZZX7JRA2niwt9wsGAxnzS+gWS8hTUgzWm8NaY1gs87o8xVQ@mail.gmail.com sent
> ~2 weeks ago to the list:
>
>     On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder [off-list ref]
>     wrote:
>     [...]
>     > 4. When choosing a hash function, people may argue about performance.
>     >    It would be useful to run some benchmarks for git (running
>     >    the test suite, t/perf tests, etc) using a variety of hash
>     >    functions as input to such a discussion.
>
>     To the extent that such benchmarks matter, it seems prudent to heavily
>     weigh them in favor of whatever seems to be likely to be the more
>     common hash function going forward, since those are likely to get
>     faster through future hardware acceleration.
>
>     E.g. Intel announced Goldmont last year which according to one SHA-1
>     implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They
>     only have acceleration for SHA-1 and SHA-256[2].
>
>     1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385
>
>     2. https://en.wikipedia.org/wiki/Goldmont
>
> Maybe someone else knows of better numbers / benchmarks, but such a
> reduction in cpb likely makes it faster than SHA-512.

Very, very likely faster than SHA-512.

I'd like to stress explicitly that the Intel SHA extensions do *not* cover
SHA-512:

	https://en.wikipedia.org/wiki/Intel_SHA_extensions

In other words, once those extensions become commonplace, SHA-256 will be
faster than SHA-512, hands down.
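
For anyone who wants to compare on their own machine, a rough sketch in
Python follows. It only measures whatever implementation the local
`hashlib` is linked against, so the numbers depend entirely on the CPU and
the crypto library, and it is no substitute for benchmarking Git itself.
The function name `throughput_mb_s` and the buffer sizes are arbitrary
choices for this example.

```python
import hashlib
import time

def throughput_mb_s(name, size=1 << 20, rounds=20):
    """Hash `rounds` buffers of `size` zero bytes; return MB/s."""
    data = b"\x00" * size
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return size * rounds / elapsed / 1e6

for name in ("sha1", "sha256", "sha512"):
    print("%-7s %10.1f MB/s" % (name, throughput_mb_s(name)))
```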

Ciao,
Dscho