Content deleted Content added
No edit summary Tags: Reverted Visual edit Mobile edit Mobile web edit |
Mindmatrix (talk | contribs) m Reverted edits by 2001:44C8:48E0:CDEC:21A5:46E4:5204:891B (talk) to last version by MacaroniPizzaHotDog |
||
Line 1:
{{short description|Hash function that is suitable for use in cryptography}}
{{More citations needed|date=May 2016}}
[[Image:Cryptographic Hash Function.svg|thumb|375px|right|A cryptographic hash function (specifically [[SHA-1]]) at work. A small change in the input (in the word "over") drastically changes the output (digest). This is called the [[avalanche effect]].]]
{{SHA-box}} A '''cryptographic hash function''' ('''CHF''') is a [[hash algorithm]] (a [[map (mathematics)|map]] of an arbitrary binary string to a binary string with a fixed size of <math>n</math> bits) that has special properties desirable for a [[cryptography|cryptographic]] application:{{sfn|Menezes|van Oorschot|Vanstone|2018|p=33}} * the probability of a particular <math>n</math>-bit output result ([[hash value]]) for a random input string ("message") is <math>2^{-n}</math> (as for any good hash), so the hash value can be used as a representative of the message;
* finding an input string that matches a given hash value (a ''pre-image'') is infeasible, ''assuming all input strings are equally likely.'' The ''resistance'' to such search is quantified as [[security strength]]: a cryptographic hash with <math>n</math> bits of hash value is expected to have a ''preimage resistance'' strength of <math>n</math> bits, unless the space of possible input values is significantly smaller than <math>2^{n}</math> (a practical example can be found in {{section link||Attacks on hashed passwords}});
* a ''second preimage'' resistance strength, with the same expectations, refers to a similar problem of finding a second message that matches the given hash value when one message is already known;
* finding any pair of different messages that yield the same hash value (a ''collision'') is also infeasible: a cryptographic hash is expected to have a ''collision resistance'' strength of <math>n/2</math> bits (lower due to the [[birthday paradox]]).
Cryptographic hash functions have many [[information security|information-security]] applications, notably in [[digital
[[Non-cryptographic hash
== Properties ==
Most cryptographic hash functions are designed to take a [[string (computer science)|string]] of any length as input and produce a fixed-length hash value.
A cryptographic hash function must be able to withstand all known [[Cryptanalysis#Types of cryptanalytic attack|types of cryptanalytic attack]]. In theoretical cryptography, the security level of a cryptographic hash function has been defined using the following properties:
; Pre-image resistance : Given a hash value {{math|''h''}}, it should be difficult to find any message {{math|''m''}} such that {{math|1=''h'' = hash(''m'')}}. This concept is related to that of a [[one-way function]]. Functions that lack this property are vulnerable to [[preimage
; Second pre-image resistance : Given an input {{math|''m''{{sub|1}}}}, it should be difficult to find a different input {{math|''m''{{sub|2}}}} such that {{math|1=hash(''m''{{sub|1}}) = hash(''m''{{sub|2}})}}. This property is sometimes referred to as ''weak collision resistance''. Functions that lack this property are vulnerable to [[second-preimage
; [[Collision resistance]] : It should be difficult to find two different messages {{math|''m''{{sub|1}}}} and {{math|''m''{{sub|2}}}} such that {{math|1=hash(''m''{{sub|1}}) = hash(''m''{{sub|2}})}}. Such a pair is called a cryptographic [[hash collision]]. This property is sometimes referred to as ''strong collision resistance''. It requires a hash value at least twice as long as that required for pre-image resistance; otherwise, collisions may be found by a [[birthday attack]].{{sfn|Katz|Lindell|2014|pp=155–157, 190, 232}}
Collision resistance implies second pre-image resistance but does not imply pre-image resistance.{{sfn|Rogaway|Shrimpton|2004|loc=in Sec. 5. Implications}} The weaker assumption is always preferred in theoretical cryptography, but in practice, a hash-function that is only second pre-image resistant is considered insecure and is therefore not recommended for real applications.
Informally, these properties mean that a [[adversary (cryptography)|malicious adversary]] cannot replace or modify the input data without changing its digest. Thus, if two strings have the same digest, one can be very confident that they are identical. Second pre-image resistance prevents an attacker from crafting a document with the same hash as a document the attacker cannot control. Collision resistance prevents an attacker from creating two distinct documents with the same hash.
A function meeting these criteria may still have undesirable properties. Currently, popular cryptographic hash functions are vulnerable to [[Length extension attack|''length-extension'' attacks]]: given {{math|hash(''m'')}} and {{math|len(''m'')}} but not {{math|''m''}}, by choosing a suitable {{math|{{′|''m''}}}} an attacker can calculate {{math|hash(''m'' ∥ {{′|''m''}})}}, where {{math|∥}} denotes [[concatenation]].<ref name="Y0rF6">{{cite web|url=http://vnhacker.blogspot.com/2009/09/flickrs-api-signature-forgery.html|title=Flickr's API Signature Forgery Vulnerability|first1=Thai|last1=Duong|first2=Juliano|last2=Rizzo|access-date=2012-12-07|archive-date=2013-08-15|archive-url=https://web.archive.org/web/20130815164303/http://vnhacker.blogspot.com/2009/09/flickrs-api-signature-forgery.html|url-status=live}}</ref> This property can be used to break naive authentication schemes based on hash functions. The [[HMAC]] construction works around these problems.
In practice, collision resistance is insufficient for many practical uses. In addition to collision resistance, it should be impossible for an adversary to find two messages with substantially similar digests; or to infer any useful information about the data, given only its digest. In particular, a hash function should behave as much as possible like a [[random function]] (often called a [[random oracle]] in proofs of security) while still being deterministic and efficiently computable. This rules out functions like the [[SWIFFT]] function, which can be rigorously proven to be collision-resistant assuming that certain problems on ideal lattices are computationally difficult, but, as a linear function, does not satisfy these additional properties.{{sfn|Lyubashevsky|Micciancio|Peikert|Rosen|2008| pp=54–72}}
Checksum algorithms, such as [[CRC32]] and other [[cyclic redundancy
=== Degree of difficulty ===
In cryptographic practice, "difficult" generally means "almost certainly beyond the reach of any adversary who must be prevented from breaking the system for as long as the security of the system is deemed important". The meaning of the term is therefore somewhat dependent on the application since the effort that a malicious agent may put into the task is usually proportional to their expected gain. However, since the needed effort usually multiplies with the digest length, even a thousand-fold advantage in processing power can be neutralized by adding a dozen bits to the latter.
For messages selected from a limited set of messages, for example
In some [[Computational complexity theory|theoretical analyses]] "difficult" has a specific mathematical meaning, such as "not solvable in [[asymptotic computational complexity|asymptotic]] [[polynomial time]]". Such interpretations of ''difficulty'' are important in the study of [[provably secure cryptographic hash
== Illustration ==
Line 97 ⟶ 100:
=== Wide pipe versus narrow pipe <span class="anchor" id="wide pipe"></span><span class="anchor" id="narrow pipe"></span> ===
A straightforward application of the Merkle–Damgård construction, where the size of hash output is equal to the internal state size (between each compression step), results in a '''narrow-pipe''' hash design. This design causes many inherent flaws, including [[Length extension attack|length-extension]], multicollisions,<ref name="LkIref">{{Cite journal|last=Lucks|first=Stefan|date=2004|title=Design Principles for Iterated Hash Functions|url=https://eprint.iacr.org/2004/253|journal=Cryptology ePrint Archive|id=Report 2004/253|access-date=2017-07-18|archive-date=2017-05-21|archive-url=https://web.archive.org/web/20170521181207/http://eprint.iacr.org/2004/253|url-status=live}}</ref> long message attacks,{{sfn|Kelsey|Schneier|2005|pp=474–490}} generate-and-paste attacks,{{Citation needed|date=July 2017}} and also cannot be parallelized. As a result, modern hash functions are built on '''wide-pipe''' constructions that have a larger internal state size – which range from tweaks of the Merkle–Damgård construction<ref name="LkIref" /> to new constructions such as the [[sponge construction]] and [[HAIFA construction]].<ref name="EjaBK">{{Cite
Meanwhile, truncating the output of a longer hash, such as used in SHA-512/256, also defeats many of these attacks.<ref name="ZY8I9">{{Cite report|last1=Dobraunig|first1=Christoph|last2=Eichlseder|first2=Maria|last3=Mendel|first3=Florian|date=February 2015|title=Security Evaluation of SHA-224, SHA-512/224, and SHA-512/256|url=http://www.cryptrec.go.jp/estimation/techrep_id2401.pdf|access-date=2017-07-18|archive-date=2016-12-27|archive-url=https://web.archive.org/web/20161227161240/http://cryptrec.go.jp/estimation/techrep_id2401.pdf|url-status=live}}</ref>
== Use in building other cryptographic primitives ==
Line 120 ⟶ 123:
== Cryptographic hash algorithms ==
There are many cryptographic hash algorithms; this section lists a few algorithms that are referenced relatively often. A more extensive list can be found on the page containing a [[comparison of cryptographic hash functions]].
=== MD5 ===
{{ Main | MD5 }}
MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function, MD4, and was specified in 1992 as RFC 1321. Collisions against MD5 can be calculated within seconds, which makes the algorithm unsuitable for most use cases where a cryptographic hash is required. MD5 produces a digest of 128 bits (16 bytes).▼
▲MD5 was designed by [[Ronald Rivest]] in 1991 to replace an earlier hash function, MD4, and was specified in 1992 as RFC 1321. Collisions against MD5 can be calculated within seconds, which makes the algorithm unsuitable for most use cases where a cryptographic hash is required. MD5 produces a digest of 128 bits (16 bytes).
=== SHA-1 ===
{{ Main | SHA-1 }}
SHA-1 was developed as part of the U.S. Government's Capstone project. The original specification – now commonly called SHA-0 – of the algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology). It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. Collisions against the full SHA-1 algorithm can be produced using the shattered attack and the hash function should be considered broken. SHA-1 produces a hash digest of 160 bits (20 bytes).▼
▲SHA-1 was developed as part of the U.S. Government's [[Capstone (cryptography)|Capstone]] project. The original specification – now commonly called SHA-0 – of the algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology). It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. Collisions against the full SHA-1 algorithm can be produced using the [[SHA-1#SHAttered – first public collision|shattered attack]] and the hash function should be considered broken. SHA-1 produces a hash digest of 160 bits (20 bytes).
Documents may refer to SHA-1 as just "SHA", even though this may conflict with the other Secure Hash Algorithms such as SHA-0, SHA-2, and SHA-3.
=== RIPEMD-160 ===
{{ Main | RIPEMD-160 }}
RIPEMD (RACE Integrity Primitives Evaluation Message Digest) is a family of cryptographic hash functions developed in Leuven, Belgium, by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel at the COSIC research group at the Katholieke Universiteit Leuven, and first published in 1996. RIPEMD was based upon the design principles used in MD4 and is similar in performance to the more popular SHA-1. RIPEMD-160 has, however, not been broken. As the name implies, RIPEMD-160 produces a hash digest of 160 bits (20 bytes).
=== Whirlpool ===
{{ Main | Whirlpool (hash function) }}
Whirlpool is a cryptographic hash function designed by Vincent Rijmen and Paulo S. L. M. Barreto, who first described it in 2000. Whirlpool is based on a substantially modified version of the Advanced Encryption Standard (AES). Whirlpool produces a hash digest of 512 bits (64 bytes).
=== SHA-2 ===
{{ Main | SHA-2 }}
SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA), first published in 2001. They are built using the Merkle–Damgård structure, from a one-way compression function itself built using the Davies–Meyer structure from a (classified) specialized block cipher.
SHA-2 basically consists of two hash algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different starting values and truncated output. SHA-384 and the lesser-known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and is commonly faster than SHA-256 on 64-bit machines such as [[X86-64|AMD64]].
The output size in bits is given by the extension to the "SHA" name, so SHA-224 has an output size of 224 bits (28 bytes); SHA-256, 32 bytes; SHA-384, 48 bytes; and SHA-512, 64 bytes.
Line 148 ⟶ 161:
SHA-3 (Secure Hash Algorithm 3) was released by NIST on August 5, 2015. SHA-3 is a subset of the broader cryptographic primitive family Keccak. The Keccak algorithm is the work of Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. Keccak is based on a sponge construction, which can also be used to build other cryptographic primitives such as a stream cipher. SHA-3 provides the same output sizes as SHA-2: 224, 256, 384, and 512 bits.
Configurable output sizes can also be obtained using the SHAKE-128 and SHAKE-256 functions. Here the -128 and -256 extensions to the name imply the [[security strength]] of the function rather than the output size in bits.
=== BLAKE2 ===
{{ Main | BLAKE2 }}
BLAKE2, an improved version of BLAKE, was announced on December 21, 2012. It was created by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O'Hearn, and Christian Winnerlein with the goal of replacing the widely used but broken MD5 and SHA-1 algorithms. When run on 64-bit x64 and ARM architectures, BLAKE2b is faster than SHA-3, SHA-2, SHA-1, and MD5. Although BLAKE and BLAKE2 have not been standardized as SHA-3 has, BLAKE2 has been used in many protocols including the Argon2 password hash, for the high efficiency that it offers on modern CPUs. As BLAKE was a candidate for SHA-3, BLAKE and BLAKE2 both offer the same output sizes as SHA-3 – including a configurable output size.▼
▲BLAKE2, an improved version of BLAKE, was announced on December 21, 2012. It was created by Jean-Philippe Aumasson, Samuel Neves, [[Zooko Wilcox-O'Hearn]], and Christian Winnerlein with the goal of replacing the widely used but broken MD5 and SHA-1 algorithms. When run on 64-bit x64 and ARM architectures, BLAKE2b is faster than SHA-3, SHA-2, SHA-1, and MD5. Although BLAKE and BLAKE2 have not been standardized as SHA-3 has, BLAKE2 has been used in many protocols including the [[Argon2]] password hash, for the high efficiency that it offers on modern CPUs. As BLAKE was a candidate for SHA-3, BLAKE and BLAKE2 both offer the same output sizes as SHA-3 – including a configurable output size.
=== BLAKE3 ===
{{ Main | BLAKE3 }}
BLAKE3, an improved version of BLAKE2, was announced on January 9, 2020. It was created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn. BLAKE3 is a single algorithm, in contrast to BLAKE and BLAKE2, which are algorithm families with multiple variants. The BLAKE3 compression function is closely based on that of BLAKE2s, with the biggest difference being that the number of rounds is reduced from 10 to 7. Internally, BLAKE3 is a Merkle tree, and it supports higher degrees of parallelism than BLAKE2.▼
▲BLAKE3, an improved version of BLAKE2, was announced on January 9, 2020. It was created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn. BLAKE3 is a single algorithm, in contrast to BLAKE and BLAKE2, which are algorithm families with multiple variants. The BLAKE3 compression function is closely based on that of BLAKE2s, with the biggest difference being that the number of rounds is reduced from 10 to 7. Internally, BLAKE3 is a [[Merkle tree]], and it supports higher degrees of parallelism than BLAKE2.
== Attacks on cryptographic hash algorithms ==
Line 221 ⟶ 238:
== External links ==
* {{cite book | first1 = Christof | last1 = Paar | first2 = Jan | last2 = Pelzl | chapter-url = http://wiki.crypto.rub.de/Buch/movies.php | chapter = 11: Hash Functions | title = Understanding Cryptography, A Textbook for Students and Practitioners | publisher = [[Springer Science+Business Media|Springer]] | date = 2009 | url-status = dead | archive-url = https://archive.today/20121208212741/http://wiki.crypto.rub.de/Buch/movies.php | archive-date = 2012-12-08 }} (companion web site contains online cryptography course that covers hash functions)
* {{cite web | url = http://ehash.iaik.tugraz.at/wiki/The_eHash_Main_Page | title = The ECRYPT Hash Function Website }}
* {{cite web | url = http://www.guardtime.com/educational-series-on-hashes/ | title = Series of mini-lectures about cryptographic hash functions | first = A. | last = Buldas | date = 2011 | url-status = dead | archive-url = https://archive.today/20121206020054/http://www.guardtime.com/educational-series-on-hashes/ | archive-date = 2012-12-06 }}
* [https://github.com/CRPrinzler/HASH-verify Open source python based application with GUI used to verify downloads.]
{{Cryptography navbox|hash}}
{{Cryptocurrencies}}
{{Cryptographic software}}
{{DEFAULTSORT:Cryptographic Hash Function}}
|