CRYPTOGRAPHY - Hashing
Contents
Hashing introduction
Hashing types
Discussion
Merkle–Damgård
The Merkle–Damgård construction is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions. This construction was used in the design of many popular hash algorithms such as MD5, SHA-1 and SHA-2. It was described in Ralph Merkle's Ph.D. thesis in 1979. Ralph Merkle and Ivan Damgård independently proved that the structure is sound: that is, if an appropriate padding scheme is used and the compression function is collision-resistant, then the hash function will also be collision-resistant.
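The chaining structure can be sketched in a few lines. The compression function below is a made-up toy (it is not collision-resistant); only the padding and block-by-block chaining illustrate the construction:

```python
import struct

BLOCK_SIZE = 8  # bytes per message block (toy size; real designs use 64 or 128)

def compress(state: int, block: bytes) -> int:
    """Toy compression function: (64-bit state, 8-byte block) -> 64-bit state.
    Illustrative only -- NOT collision-resistant."""
    (m,) = struct.unpack(">Q", block)
    state = ((state ^ m) * 0x100000001B3) % 2**64   # mix block into state
    return ((state << 13) | (state >> 51)) % 2**64  # rotate left by 13

def md_pad(msg: bytes) -> bytes:
    """Merkle-Damgard strengthening: append 0x80, zeros, then the bit length,
    so the padded message is a multiple of BLOCK_SIZE."""
    length = len(msg)
    msg += b"\x80"
    while (len(msg) + 8) % BLOCK_SIZE:
        msg += b"\x00"
    return msg + struct.pack(">Q", length * 8)

def md_hash(msg: bytes, iv: int = 0xCBF29CE484222325) -> int:
    """Chain the compression function over all padded blocks."""
    state = iv
    padded = md_pad(msg)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state
```

The inclusion of the message length in the padding (Merkle–Damgård strengthening) is what makes the collision-resistance proof go through.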
Sponge
At a glance: the sponge construction builds a hash by absorbing message blocks into a fixed-size internal state via a permutation, then squeezing out an arbitrary-length digest. It underlies SHA-3 (Keccak).
Other
At a glance:
Extractable functions are functions where any adversary that outputs a point in the range of the function is guaranteed to “know” a corresponding preimage. Here, knowledge is captured by the existence of an efficient extractor that recovers the preimage from the internal state of the adversary. Extractability of functions was defined by Canetti (ICALP’08) in the context of perfectly one-way functions. It can be regarded as an abstraction from specific knowledge assumptions, such as the Knowledge of Exponent assumption (Hada and Tanaka, Crypto 1998).
Algorithms
- Argon2
- Argon2 is a key derivation function that was selected as the winner of the Password Hashing Competition in July 2015.
- Designed by Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich from the University of Luxembourg.
- Poseidon - ZK-friendly hash
Standards
Refer also to crypto standards.
Bear in mind that NIST publishes the Secure Hash Standard (SHS) as NIST FIPS 180-4, which includes SHA-1 and SHA-2.
RIPEMD
- RIPEMD - Wikipedia - Dobbertin, Bosselaers, Preneel 1992/1996
- RIPEMD - ISO/IEC 10118-3:2018(en), IT Security techniques — Hash-functions — Part 3: Dedicated hash-functions
SHA-1, SHA-2, SHA-3
- SHA-1 Wikipedia - US NSA, 1995, 160 bits, all major web browser vendors ceased acceptance of SHA-1 SSL certificates in 2017
- SHA-1 is a 160-bit hash standard, published in 1995 by NIST as FIPS 180-1 and developed as part of the US Capstone project. The original specification was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by NIST; this version is often named SHA-0.
- It is no longer considered as cryptographically safe, see e.g. https://csrc.nist.gov/projects/hash-functions/nist-policy-on-hash-functions
- SHA-2 Wikipedia - US NSA 2001, family: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256
- Built using the Merkle–Damgård construction, from a one-way compression function itself built using the Davies–Meyer structure from a specialized block cipher
- First published by the NIST as a US FIPS
- SHA-3 Wikipedia - US NIST 2015, based on the Keccak sponge construction; family: SHA3-224, SHA3-256, SHA3-384, SHA3-512, plus the extendable-output functions SHAKE128 and SHAKE256
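All three generations are available in Python's standard `hashlib`, which makes it easy to compare their digest sizes side by side:

```python
import hashlib

msg = b"The quick brown fox jumps over the lazy dog"

# Digest sizes (in bits) for members of the SHA family available in hashlib.
for name in ("sha1", "sha256", "sha512", "sha3_256", "sha3_512"):
    h = hashlib.new(name, msg)
    print(f"{name:10s} {h.digest_size * 8:4d} bits  {h.hexdigest()[:16]}...")
```

Note that despite the naming, SHA-3 is not a drop-in internal redesign of SHA-2: SHA-2 is Merkle–Damgård-based, while SHA-3 is a sponge.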
Multihash
Multihash is a protocol for differentiating outputs from various well-established hash functions,
addressing size and encoding considerations.
It is useful for writing applications that future-proof their use of hashes and allow multiple hash functions to coexist.
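A multihash value is simply a self-describing prefix (varint function code, varint digest length) followed by the raw digest. The sketch below assumes the multicodec table's code `0x12` for sha2-256:

```python
import hashlib

SHA2_256 = 0x12   # multicodec code for sha2-256 (from the multiformats table)

def varint(n: int) -> bytes:
    """Unsigned LEB128 varint, as used by the multihash spec."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def multihash_sha256(data: bytes) -> bytes:
    """<fn code><digest length><digest> -- self-describing hash output."""
    digest = hashlib.sha256(data).digest()
    return varint(SHA2_256) + varint(len(digest)) + digest
```

A reader can then dispatch on the leading code byte(s) to know which hash function produced the digest, without out-of-band agreement.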
Hash into elliptic curve
Used in Boneh–Franklin IBE (identity-based encryption), as well as in SPEKE (Simple Password Exponential Key Exchange) and PAK (Password Authenticated Key Exchange).
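The simplest (non-constant-time) approach is "try and increment": hash the message to a candidate x-coordinate and bump a counter until the curve equation has a solution. The toy curve parameters below are illustrative, not a standardized curve:

```python
import hashlib

# Toy short-Weierstrass curve y^2 = x^3 + A*x + B over GF(P).
# Illustrative parameters only, not a standardized curve.
P = 2**127 - 1          # prime with P % 4 == 3, so sqrt is one exponentiation
A, B = 0, 7

def sqrt_mod(c):
    """Square root mod P (valid because P % 4 == 3), or None for a non-residue."""
    r = pow(c, (P + 1) // 4, P)
    return r if r * r % P == c else None

def hash_to_curve(msg: bytes):
    """Try-and-increment: hash to a candidate x, bump a counter until
    x^3 + A*x + B is a quadratic residue. Simple, but not constant-time,
    so unsuitable where timing side channels matter."""
    counter = 0
    while True:
        h = hashlib.sha256(msg + counter.to_bytes(4, "big")).digest()
        x = int.from_bytes(h, "big") % P
        y = sqrt_mod((pow(x, 3, P) + A * x + B) % P)
        if y is not None:
            return x, y
        counter += 1
```

Since roughly half of all field elements are quadratic residues, the loop terminates after about two iterations on average.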
Accumulators
One view
You can think of an accumulator as the product of multiplying numbers together. In the equation a * b * c * d = e, the accumulator would be e. We can plug in numbers: if a=2, b=3, c=5 and d=7, then e=210. We say that 3 is “in” the accumulator because it is a factor. If we want to take 3 out, we divide 210 by 3, giving e=70, and 3 is removed.
You can produce e by multiplying any single factor such as a by the product of all the other factors (b * c * d).
The product of all other factors contributing to the accumulator (all factors except the private one for this credential) is called a witness. So you can tell someone the value of a and the product of all the other inputs to the accumulator, (but not the other inputs themselves), and they can produce the output e.
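The worked example from the paragraphs above, written out directly:

```python
from math import prod

members = [2, 3, 5, 7]           # the accumulated values
acc = prod(members)              # the accumulator: 2 * 3 * 5 * 7 = 210

# Membership: 3 is "in" because it divides the accumulator.
assert acc % 3 == 0

# Witness for 3: the product of all the *other* members.
witness = prod(m for m in members if m != 3)   # 2 * 5 * 7 = 70
assert witness * 3 == acc        # witness + element reproduce the accumulator

# Removal: divide the element out.
acc_without_3 = acc // 3         # 70
```

Note that plain integer multiplication is only a mental model; it leaks the factorization. Real accumulators hide the members behind a hard problem such as RSA-modulus exponentiation.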
Another view
An accumulator is a one-way membership hash function. It allows users to certify that potential candidates are members of a certain set without revealing the individual members of the set. This concept was formally introduced by Josh Benaloh and Michael de Mare in 1993.
They introduced the notion of quasi-commutativity: h(h(x, y1), y2) = h(h(x, y2), y1).
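Modular exponentiation, h(x, y) = x^y mod n, is quasi-commutative, which is why it works as an accumulator: the accumulated value does not depend on the order in which members are added. A sketch with a deliberately tiny modulus (a real accumulator uses an RSA modulus whose factorization nobody knows):

```python
# Benaloh-de Mare style accumulator: h(x, y) = x^y mod n.
n = 3233          # toy RSA modulus (61 * 53), for illustration only
g = 5             # starting value of the accumulator

def h(x: int, y: int) -> int:
    return pow(x, y, n)

y1, y2 = 7, 11
# Quasi-commutativity: insertion order does not matter.
assert h(h(g, y1), y2) == h(h(g, y2), y1)
```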
Key derivation
A key derivation function (KDF) is a cryptographic hash function that derives one or more secret keys from a secret value such as a main key, a password, or a passphrase using a pseudorandom function.
KDFs can be used
- to stretch keys into longer keys or
- to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES
Keyed cryptographic hash functions are examples of pseudorandom functions used for key derivation.
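The Diffie–Hellman use case above is exactly what HKDF (RFC 5869) is for. A minimal sketch built from the standard library's `hmac` and `hashlib` (the shared-secret bytes are a placeholder, not a real group element):

```python
import hashlib
import hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """Minimal HKDF (RFC 5869) with SHA-256: extract, then expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    okm, t, counter = b"", b"", 1
    while len(okm) < length:                                    # expand
        t = hmac.new(prk, t + info + bytes([counter]), hashlib.sha256).digest()
        okm += t
        counter += 1
    return okm[:length]

# Stretch a (placeholder) Diffie-Hellman shared secret into an AES-256 key.
shared_secret = (123456789).to_bytes(16, "big")
aes_key = hkdf_sha256(shared_secret, salt=b"\x00" * 32,
                      info=b"aes key", length=32)
```

For password-based derivation, prefer a deliberately slow, memory-hard KDF such as Argon2 or scrypt rather than a fast HKDF.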
Bloom filters
An empty Bloom filter is a bit array of m bits, all set to 0. There must be k different hash functions defined, each of which hashes some set element to one of the m array positions, generating a uniform random distribution. Typically, k is a small constant which depends on the desired false positive rate ε, while m is proportional to k and the number of elements to be added.
To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.
To query for an element (test whether it is in the set), feed it to each of the k hash functions to get k array positions.
- If any of the bits at these positions is 0, the element is definitely not in the set; if it were, then all the bits would have been set to 1 when it was inserted.
- If all are 1, then either the element is in the set, or the bits have by chance been set to 1 during the insertion of other elements, resulting in a false positive.
In a simple Bloom filter, there is no way to distinguish between the two cases, but more advanced techniques can address this problem.
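The add/query logic above in a minimal sketch. Deriving the k positions from k seeded SHA-256 hashes is one simple choice; real implementations often use faster non-cryptographic hashes or double hashing:

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)   # one byte per bit, for clarity

    def _positions(self, item: bytes):
        # k array positions from k seeded hashes of the item.
        for seed in range(self.k):
            h = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(h, "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        # False => definitely absent; True => present OR a false positive.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=1024, k=3)
bf.add(b"alice")
assert bf.might_contain(b"alice")   # no false negatives, ever
```

Note there is no `remove`: clearing a bit could affect other elements that set the same position, which is why deletion needs variants such as counting Bloom filters.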
Other
- MinHash - the min-wise independent permutations locality-sensitive hashing scheme, a technique for quickly estimating how similar two sets are
- FeatureHashing - a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix
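A MinHash sketch: for each of several seeded hash functions, keep only the minimum hash value over the set; the fraction of positions where two signatures agree estimates the Jaccard similarity. A minimal version (seeded SHA-256 stands in for a family of random permutations):

```python
import hashlib

def minhash_signature(items, num_hashes: int = 64):
    """One minimum per seeded hash function, over all elements of the set."""
    return [
        min(int.from_bytes(
                hashlib.sha256(bytes([seed]) + s.encode()).digest()[:8], "big")
            for s in items)
        for seed in range(num_hashes)
    ]

def estimate_jaccard(a, b, num_hashes: int = 64) -> float:
    """Fraction of matching signature positions ~ |a & b| / |a | b|."""
    sa = minhash_signature(a, num_hashes)
    sb = minhash_signature(b, num_hashes)
    return sum(x == y for x, y in zip(sa, sb)) / num_hashes
```

The estimate's accuracy improves with `num_hashes`: the standard error shrinks roughly as 1/sqrt(num_hashes).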