Talk:SHA-1/Archive 1: Difference between revisions

Content deleted Content added
Further F3 optimizations: These optimisations have been known for years.
Legobot (talk | contribs)
m Bot: Fixing lint errors, replacing obsolete HTML tags: <tt> (8x)
 
(132 intermediate revisions by 54 users not shown)
Line 1:
{{talkarchive}}
{{CryptographyProject}}
 
{{CryptographyReader}}
== rot? shr? ==
{{todo}}
 
In the "comparison" section, the primitive operations listed include "rot" and "shr". If this is rotation (i.e. left bit shift) and shift-right, then these abbreviations should, I think, be clearer. If they're something else ... then that should be clearer, too. --[[User:Jim Mahoney|Jim Mahoney]] ([[User talk:Jim Mahoney|talk]]) 14:42, 28 April 2009 (UTC)
 
== 80 000 CPU hours?! ==
 
==80 000 CPU hours?!==
(80 000/24)/7 = 476.190476190476.. Slightly more than a year.
 
Line 10 ⟶ 13:
:The computer had 256 processors running in parallel, so you'll have to divide the time by that number. [[User:Fredrik|Fredrik]] | [[User talk:Fredrik|talk]] 23:38, 7 Apr 2005 (UTC)
 
::Which gives a number slightly less than two days of continuous running assuming that such a multi-core network has no loss of efficiency. That said, the cost of such a supercomputer would be rather large. --[[User:Bob Hu|I]] ([[User talk:Bob Hu|talk]]) 06:15, 28 January 2009 (UTC)
==Patent status?==
 
== Patent status? ==
 
But primary question is, is this method unencumbered by patents or other protections?
~ender 2003-04-19 01:50 MST
Line 24 ⟶ 30:
::: Patent, yes, but there's no copyright in the USA, since works of the United States Government inherently are public ___domain.
 
== Naming ==
 
This article is about five or six different variants of SHA, which is fine, but we could probably do with a slightly more general name. Suggestions include:
 
Line 40 ⟶ 47:
::::::: OK, I've gone with [[SHA family]], and made a few redirects from the other suggestions. [[User:Matt Crypto|&mdash; Matt]] 13:26, 13 Aug 2004 (UTC)
 
==pseudocode Pseudocode error ==
 
There seems to be an error in the Pseudocode:
Line 54 ⟶ 61:
:You're exactly right... I fixed it. You can fix it too, you know =^_^= -- [[User:Myria|Myria]] 05:38, 2 Nov 2004 (UTC)
 
== SHA-2 hashes ==
 
Hi folks, do you find it useful to have additional testvectors for the SHA-2 variants on the page?
I don't want to add them until I have heard some opinions.
Line 75 ⟶ 83:
::: That answers my request for a citation quite well. [[User:Nahaj|Nahaj]] 17:37, 28 October 2005 (UTC)
** ... I would like to cite them but I don't know how and I don't know what we would use as the source, since all I've seen in the source is this. By the way, you can prove that the functions are identical by combining induction with a truth table. -- [[User:Myria|Myria]] 06:43, 28 October 2005 (UTC)
 
 
Hi guys. JulesH slabbed a "citation needed" tag on these f computing "equivalent expressions" and I too wasn't so sure they were equivalent. So I spent some minutes making truth tables with them. And the result is that all of them are correct. Even the "+" one worked right! Here are the tables:
Line 92 ⟶ 99:
1 0 0 (1 and 0) or (0 and 0) = 0 or 0 = 0
1 0 1 (1 and 0) or (0 and 1) = 0 or 0 = 0
1 1 0 (1 and 1) or (0 and 0) = 1 or 0 = 1
1 1 1 (1 and 1) or (0 and 1) = 1 or 0 = 1
 
Line 105 ⟶ 113:
1 0 0 0 xor (1 and (0 xor 0)) = 0
1 0 1 1 xor (1 and (0 xor 1)) = 0
1 1 0 0 xor (1 and (1 xor 0)) = 1
1 1 1 1 xor (1 and (1 xor 1)) = 1
 
 
Both functions do give 0 1 0 1 0 0 1 1 thus all is ok.
</pre>
 
Line 124 ⟶ 133:
1 0 0 (1 and 0) or (1 and 0) or (0 and 0) = 0
1 0 1 (1 and 0) or (1 and 1) or (0 and 1) = 1
1 1 0 (1 and 1) or (1 and 0) or (1 and 0) = 1
1 1 1 (1 and 1) or (1 and 1) or (1 and 1) = 1
 
Line 137 ⟶ 147:
1 0 0 (1 and 0) or (0 and (1 or 0)) = 0
1 0 1 (1 and 0) or (1 and (1 or 0)) = 1
1 1 0 (1 and 1) or (0 and (1 or 1)) = 1
1 1 1 (1 and 1) or (1 and (1 or 1)) = 1
 
Line 150 ⟶ 161:
1 0 0 (1 and 0) or (0 and (1 xor 0)) = 0
1 0 1 (1 and 0) or (1 and (1 xor 0)) = 1
1 1 0 (1 and 1) or (0 and (1 xor 1)) = 1
1 1 1 (1 and 1) or (1 and (1 xor 1)) = 1
 
Line 163 ⟶ 175:
1 0 0 (1 and 0) + (0 and (1 xor 0)) = 0 + 0 = 0
1 0 1 (1 and 0) + (1 and (1 xor 0)) = 0 + 1 = 1
1 1 0 (1 and 1) + (0 and (1 xor 1)) = 1 + 0 = 1
1 1 1 (1 and 1) + (1 and (1 xor 1)) = 1 + 0 = 1
 
All four functions do give 0 0 0 1 0 1 1 1 thus all is ok.
 
Note, we never did get 1+1 thus no bit did overflow
Line 179 ⟶ 192:
::But yeah, since those variants do look a bit weird some kind of link to an explanation would probably be good but I don't find it that necessary. By the way, I just checked several very old SHA1 source codes I have lying around and I found all but one of the variants in those source codes. So apparently these optimisations have been known for years.
::--[[User:Davidgothberg|David Göthberg]] 14:08, 27 January 2007 (UTC)
 
:::Well, strictly, both are original research as long as they don't come from a source. But I didn't really give them a thought earlier, and they do appear quite straightforward boolean function optimizations. So I agree that they probably don't need to be sourced.
:::And by the way, why did you omit the 7th combination (b=1, c=1, d=0) from your proof? Although all these alternatives also produce consistent results in this case. I was first confused by this, as I assumed their equivalence was somehow data-dependent. -- [[User:Intgr|intgr]] 15:43, 27 January 2007 (UTC)
 
::::Oops! Slightly embarrassing for me... Ah well, just goes to show that "proofs" are way too susceptible to the human factor to really be proofs. But thankfully this is Wikipedia where many eyeballs make all bugs shallow. Now I updated the truth table with the 1 1 0 case. And yes, thankfully the expressions are still equivalent.
::::And yeah, if those expressions didn't come from a source then they would in the strict sense be original research. But I have found all but one of them in different SHA1 libs that I have laying around. Thus at least most of them are well known and there are sources. Question is if we should bother to state which crypto libs use which variant? That is, to state some sources? Seems unnecessary to me since this is as you said "straight forward boolean function optimizations". And any one implementing SHA1 should anyway test their implementation carefully by comparing with test vectors and other crypto libs and should then discover any bugs in their implementation.
::::--[[User:Davidgothberg|David Göthberg]] 05:37, 28 January 2007 (UTC)
 
== Explanation ==
Line 189 ⟶ 209:
Does anyone have any insight into why one of the arguments of the rotations in the compression function is a factor of the other, and why the 30-bit left rotation (taken as a 2-bit right rotation) divides the 32-bit word size? I don't have time to analyze it in depth at the moment, but it seems under casual inspection that you would get better mixing if the arguments were coprime... [[User:Inkling]] 10:51, 17 Jan 2005 (UTC)
 
== Extra SHA-pseudocode ==
 
 
 
==Extra SHA-pseudocode==
An anonymous user has kindly contributed pseudocode for SHA-256, in addition to the pseudocode for SHA-1. This is great to have, but I'm concerned that the article is a little over-burdened with pseudo-code at present. We don't want to lose this information, so perhaps we could move the SHA-256 description elsewhere: maybe a [[SHA-256 pseudocode]] page, or onto WikiSource...what do people think? [[User:Matt Crypto|&mdash; Matt <small>Crypto</small>]] 19:18, 14 Mar 2005 (UTC)
 
== External links ==
 
Hi, I recently added hash-it.net to the list of external links, and it was removed and marked as "spam". I would like to discuss this because I feel it is a valuable resource to people who would like to hash relatively short strings. I think that hash 'em all is fantastic because of it's wide support, but there is no need for a page request every time you want to hash something. Also, it's kind of weird to have "Online Hash Calculators" as a large heading and with just one link underneath it... <small>—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/98.216.116.86|98.216.116.86]] ([[User talk:98.216.116.86|talk]]) 05:00, 7 September 2008 (UTC)</small><!-- Template:UnsignedIP --> <!--Autosigned by SineBot-->
:Hi -- I replied to your comments [[User talk:98.216.116.86|here]]. Feel free to re-add the link. [[User:Feezo|Feezo]] <span style="font-size:x-small;">[[User_talk:Feezo|(Talk)]]</span> 05:48, 7 September 2008 (UTC)
 
It doesn't matter much, but I think a good rule of thumb is to have only one good link for a certain type of thing -- "Wikipedia is not a link repository", and all that. If we already have one link to a Javascript SHA calculator, why do we need another? Eventually, the External Links section starts to clutter up as developers inevitably add more links to their particular SHA implementations. To me, it seems easier just to say, "we only need one of those, thanks" at the outset. [[User:Matt Crypto|&mdash; Matt <small>Crypto</small>]] 01:07, 18 Mar 2005 (UTC)
Line 222 ⟶ 243:
:How about the [[cryptographic hash function]] article, and the [[hash function]] article linked from there? [[User:Lunkwill|Lunkwill]] 07:38, 8 September 2005 (UTC)
 
== Pseudocode clarification. ==
 
This line in the code (both versions) is unclear.
Line 235 ⟶ 256:
 
== Numbers, cant figure them out. ==
 
Taken from the article 25/09/05:
"SHA-0 and SHA-1 produce a 160-bit digest"
Line 312 ⟶ 334:
I agree if we talk about very popular pages like SHA and MD5, because those algorihms are well known to the public already and too many links (both free and nonfree, both trivial and complex) have been posted in the past to those topics. Furthermore it is easy to find SHA and MD5 software by a seach engine easily today rather than using Wikipedia. However, for not so famous pages, a link where Wikipedia users can find source and/or software, make absolutely sense in my opinion. BTW, I absolutely disagree with your argument called "we don't need to link to because they are bundled with Perl, Python, Java", because actually in most cases those software is open source, platform independent and run also on entirely free platforms rather than just only on one particular OS. [[User:Jonelo|Jonelo]] 18:02, 29 March 2006 (UTC)
 
== Licensing of the psuedo codepseudocode ==
 
Can I derive proprietary code from the psuedo code on this page? I am concerned that the Free Document License won't allow it, but what is psuedo code for if not to be derived from? [[User:168.215.177.66|168.215.177.66]] 21:37, 6 July 2006 (UTC)
 
Line 379 ⟶ 402:
would anyone object to delinking all year-only dates in the article? I don't think the links add anything relevant in this context (see [[Wikipedia:Manual of Style (dates and numbers)#Partial dates|Partial dates]]). &mdash;[[User:Gennaro Prota|<span style="color: #000080; font-weight: bold">Gennaro Prota</span>]][[User talk:Gennaro Prota|<sup style="color: #006400">&#8226;Talk</sup>]] 06:27, 2 December 2006 (UTC)
 
== "Applications" section ==
 
The "applications" section looks like it's starting to get out of hand &ndash; it already mentions some specific products where it's used, such as Xbox and git. If we were going to link all the products that SHA functions are used then the list would be incomprehensible. Thus, I'm removing these for now. -- [[User:Intgr|intgr]] 04:00, 17 December 2006 (UTC)
 
I support the idea of not listing all the products supporting a specific feature. Being SHA a very common algo, the list would be huge. I believe this is not a problem right now and I think the actual few examples provide useful information, especially to the casual reader. '''I would consider adding back the notes''': a few real world examples are important. The inclusion of another one should be considered on a case by case basis.<br />
''[[User:MaxDZ8|<fontspan colorstyle="color:navy;">[[User:MaxDZ8|MaxDZ8]]</fontspan>]] <small>[[User_talk:MaxDZ8|talk]]</small>'' 10:18, 17 December 2006 (UTC)
 
:I feel the same. The section didn't seem out of hand <em>yet</em>, though we should consider the long term: what about creating a list page? &mdash;[[User:Gennaro Prota|<span style="color: #000080; font-weight: bold">Gennaro Prota</span>]][[User talk:Gennaro Prota|<sup style="color: #006400">&#8226;Talk</sup>]] 10:50, 17 December 2006 (UTC)
Line 394 ⟶ 418:
::::All good points, IMHO. I think we need further discussion. What I can notice is: a) the "Applications of hash functions" section gives (as you say, too) a general overview of hash functions applications. The section in this article instead was specifically about "notable usages" of SHA. So they had different intents. In practice a mention like the Xbox one is pretty much useless but a mention of Git like the one we had may well "open a whole new world" to the reader (that's something that has happened often to me: I reached some article only because of a -strictly non necessary- link from some other page and become passioned about the new subject... it's one of the strength of Wikipedias over more traditional encyclopedias) b) linking to a specific section of another article (or even of the same article) is a bit dangerous in that there's no effective way to detect if the link breaks (if the target article's section is renamed, for instance). &mdash;[[User:Gennaro Prota|<span style="color: #000080; font-weight: bold">Gennaro Prota</span>]][[User talk:Gennaro Prota|<sup style="color: #006400">&#8226;Talk</sup>]] 16:32, 17 December 2006 (UTC)
 
::::The statment about how it is used by Linus in git is totally out of place and adds nothing to the section IMHO.
== Proununciation ==
 
== Pronunciation ==
 
For all these various tech acryonyms, providing a clue as to the common pronounciation would be helpful. Eg, for SHA-1, it's /shah-one/ or /shaw-one/ or the like. Perhaps providing the [[IPA]] or whatever the standard is for pronunciation. (I'll defer this to those more knowledgeable of wikipedia standards on the issue.) - [[User:MickScott|Michael]] ([[User talk:MickScott|talk]]|[[Special:Contributions/MickScott|contrib]]) 02:08, 19 January 2007 (UTC)
 
:Good question. How do you guys really pronounce SHA-1 etc in English? I know how we pronounce it in Swedish but not really what the most common way is in English. (In Swedish we spell it out so translated to English that would mean this pronunciation: Ess Eich Ei One. But it is much harder to say that in English.) --[[User:Davidgothberg|David Göthberg]] 12:40, 28 January 2007 (UTC)
 
The only person I've heard pronounce it like shaw-1 was from Canada. You don't want to get me started on that... The best, most logical in my opinion comes from Paolo Perrotta who has Git training on Pluralsight. He says SHA-1 as if it were SHAONE, or like the word 'shown'. Makes so much more sense than adding a 'w' out of nowhere.
 
 
== New SHA-1 vulnerabilities ==
Line 402 ⟶ 434:
: There is an (English) [http://en.epochtimes.com/news/7-1-11/50336.html article] about the subject. --[[User:JVersteeg|JVersteeg]] 10:43, 21 January 2007 (UTC)
::This article is useless and misleading, it doesn't mention a single relevant detail about the attack itself. -- [[User:Intgr|intgr]] 15:23, 21 January 2007 (UTC)
: The paper that you refer to, does not even talk about SHA-1. Rather it shows attaks against MD5, RIPEMD and SHA-0. How is this supposed to show a new SHA-1 vulnerability? The article in epochtimes is so badly written that it is not possible to determine whether they are talking about a new attack or one that has already been published. If there are no other reports of a new attack, then it is probably safe to assume that epochtime is just recycling an old story. [[User:85.2.45.127|85.2.45.127]] 21:59, 27 January 2007 (UTC)
 
: I've updated the page with my paper explaining the details of Wang's 2^63 attack on SHA-1. After over 2 years of lack of details about the methods, I decided to figure it out myself from her slides and type it up. [[User:Martin.cochran|Martin.cochran]] ([[User talk:Martin.cochran|talk]]) 23:49, 6 February 2008 (UTC)
 
== VB implementation ==
 
I have trawled through John Taylor's VB implementation quite a bit, and I think it's unnecessarily complicated. I'm currently adapting it for my own purposes, and it quickly gets shorter, more efficient, and more readable as well. [[User:Gerbrant|Shinobu]] 07:24, 19 April 2007 (UTC)
: I'd agree that this code is too complicated and not efficient. Especially, the rotates implemented using loops must be killing performance. Even tutorial code should take advantage of existing language features and not neglect them. Real code is very often more educational than tutorial code, because the authors have spend some time optimizing the code. [[User:85.2.38.56|85.2.38.56]] 10:00, 19 April 2007 (UTC)
That's not all :-) I've got it working, and I changed ''a lot''. Of course I added my own bit of iffyness: since only a few rotates are used, I used three or so special functions like U32RotateLeft30 etc. The generic rotate needs a 2^n function, and although I did write a reasonably fast implementation for that, I decided in the end that it's a waste if not really used. The addition bit was shortened and I changed datatypes and such around. Partly because it's conventional to use Byte arrays for data, not Strings, and partly for efficiency and versatility (didn't tie myself to strings containing hex numbers). Perhaps I should post it somewhere. Thing is I'm half afraid I changed so much, that it might go wonky in some obscure situation. Anyway, it works for me, and in any case, people shouldn't trust random code grabbed from the web - and I would like the feedback. I'll think about it. [[User:Gerbrant|Shinobu]] 12:03, 19 April 2007 (UTC)
: I'm not sure if I understood you. A rotate left by n bits of a Uint32 variable x is usually implemented as (x<<n)|(x>>(32-n)). Decent compilers recognize this construction, optimize it and generate just one assembly instruction: a rotate. I'd expect to learn such implementation details from a good tutorial code. [[User:85.2.114.9|85.2.114.9]] 13:05, 19 April 2007 (UTC)
You probably understood me... but what you perhaps don't know is that VB doesn't have native rotates and shifts, or unsigned operations. I'm not sure, but it probably was a design decision, like in Java. VB numbers are all signed (except bytes, which are unsigned, unlike in Java). It's a bit of a dilemma, really. Leave them out and you get people like us complaining and working around it, but leave them in and you get the rest of the world doing utterly wrong things with it. *sigh* [[User:Gerbrant|Shinobu]] 22:32, 19 April 2007 (UTC)
: Are you sure? I did check the online documentation before making my comments. VB does seem to support both unsigned int types and shifts since Visual Studio 2005 .net 2.0. I can't say how I'd implement these functions without unsigned support, because the VB documentation lacks details. But as you say, the sample code by Taylor can be improved in a number of places. Assuming, there are many people that use VB versions without unsigned support, it might indeed be interesting to see what the most efficient work arounds look like. [[User:85.2.127.155|85.2.127.155]] 09:29, 20 April 2007 (UTC)
I'm pretty sure unsigneds were added in .NET, which is incompatible with old-style VB in a number of ways. For various reasons people sometimes stick with older versions of software. Reasons like compatibility, size, memory usage, etc. As for "the best"... how do you ascertain what the best is? I mean
Function U32Add(ByVal A As Long, ByVal B As Long) As Long
If (A Xor B) < 0 Then
U32Add = A + B
Else
U32Add = (A Xor &H80000000) + B Xor &H80000000
End If
End Function
looks okay, and certainly a lot better than
Public Function i32add(operanda As Long, operandb As Long) As Long
'//////////////////////////////////////////////////////////////////////////////////////
'does 32 bit add of two 32 bit numbers as if they were unsigned int32's
'this has to be done this way because of quirk in VB where an add overflow
'into the sign bit is kicked out as an error. This would not be a problem if
'an unsigned int32 were allowed in VB!
'
'result=operanda + operandb
'
'use this function for (a+b)
'
'Not, And, Or, Xor all work the same for signed and unsigned int32 because there are
'no carries or borrows for VB to deal with
'//////////////////////////////////////////////////////////////////////////////////////
Dim operand_ax As Long
Dim operand_bx As Long
Dim upper_a As Integer
Dim upper_b As Integer
Dim result As Long
Dim topbits As Integer
'//////////////////////////////////////////////////////////////////////////////////////
'trim off offending bits
operand_ax = operanda And &H3FFFFFFF
operand_bx = operandb And &H3FFFFFFF
upper_a = ((operanda And &HC0000000) / &H40000000) And 3
upper_b = ((operandb And &HC0000000) / &H40000000) And 3
'do math on lower order bits
result = operand_ax + operand_bx
'do math on upper order bits
topbits = upper_a + upper_b
'if there was an overflow into upper 2 bits, increment the accumulator
If result And &H40000000 Then
topbits = topbits + 1
End If
'get rid of an overflow into upper 2 bits in lieu of separate math below
result = result And &H3FFFFFFF
'now adjust the upper bits for the side calculation results
If topbits And 1 Then
result = result Or &H40000000
End If
If topbits And 2 Then
result = result Or &H80000000
End If
i32add = result
End Function
but how can you tell it's best? [[User:Gerbrant|Shinobu]] 03:34, 21 April 2007 (UTC)
: At some point it becomes necessary to know the constructs that can be compiled into efficient code by the compiler. E.g., your proposal, which btw is a really nice trick, could also be written as
Function U32Add(ByVal A As Long, ByVal B As Long) As Long
Dim HiBit as Long
HiBit = (A Xor B Xor &H8000000) And &H80000000
U32Add = ((A Xor HiBit) + B) Xor HiBit
End Function
: Depending on how good the compiler is this should take 1-2 operations more on average, but it avoids an unpredictable jump and hence might be faster in some cases. However, I don't think all code has to be optimized completely. I'm still looking for some guidelines on what kind of code should be linked to in wikipedia and what should better be avoided. I'd think that if a reader generally could find improvements by comparing his code with some sample code then this sample code would be worth to be listed here. Otherwise there is no point of having a link. Taylor's code doesn't contain any non-obvious tricks, your code does. Hence your code would seem to be better qualified to be listed here. [[User:85.2.69.234|85.2.69.234]] 18:36, 22 April 2007 (UTC)
I'd put it in my user space, but there might be a policy against linking from article space to user space. I'll ask around about what to do. [[User:Gerbrant|Shinobu]] 10:18, 23 April 2007 (UTC)
 
Okay, please have a look here: http://vb.wikia.com/wiki/SHA-1.bas
 
Apparently they don't mind it being there. Before you add a link here, please read it through a few times. It works for me, but I'd rather be sure it works for others as well before the link is added here. Try a few test vectors too. You know what it's like... a little cut 'n paste error can cause horrible bugs. And sorry for the lack of comments. I don't like excessive commenting and most of the code speaks for itself. [[User:Gerbrant|Shinobu]] 14:31, 9 May 2007 (UTC)
: Your code looks definitively better to me than Taylor's code. I'd support to replace Taylor's code with this implementation. [[User:84.73.231.90|84.73.231.90]] 21:26, 23 May 2007 (UTC)
 
== SHA-1 Expansion==
 
The crypto analysis is not exactly a cogent explanation of how the algorithm works. This article says nothing about the actual working of the SHA-1. I think this needs to be included. -[[User:Shreyasjoshis|Weedrat]] 09:40, 21 May 2007 (UTC)
: Did you notice the pseudocode close to the end of the article? It looks good enough to me to actually implement SHA-1. [[User:84.73.231.90|84.73.231.90]] 10:25, 21 May 2007 (UTC)
 
== Basic language question ==
 
The introduction describes SHA-1 as "a condensed digital representation...that is, to a high degree of probability, unique for a given input data sequence." Shouldn't it say "unique *to* a given input..." (i.e. there is likely only a single input that goes with that output) rather than "unique for" (i.e. there likely is only a single output for a given input)? The current use of "for" says simply that there's just one output for each input, which is true of many mathematical functions...
[[User:MacScoop|MacScoop]] 18:27, 23 May 2007 (UTC)
 
:Agreed, sounds better to me, though I am not a native English speaker. -- [[user:intgr|intgr]] <sup>[[user talk:intgr|#%@!]]</sup> 21:11, 23 May 2007 (UTC)
 
:: Sounds incorrect to me. Since there are many more possible inputs than outputs it is very likely that for every possible output there exists many inputs mapping to that output. The whole first sentence is incomprehensible. Hash functions are defined as functions for which it is computationally infeasible to invert them and to find collisions (as described later). [[User:84.73.231.90|84.73.231.90]] 21:34, 23 May 2007 (UTC)
 
I copy-edited the intro to make things clearer, I hope.--[[User:ArnoldReinhold|agr]] 14:06, 30 July 2007 (UTC)
 
== FIPS 180-3 Draft ==
 
The draft version of FIPS 180-3 is available at http://csrc.nist.gov/publications/drafts/fips_180-3/draft_fips-180-3_June-08-2007.pdf. he comment period closes on September 10, 2007. Reid Hochstedler 19:11, 2 July 2007 (UTC)
 
== Question ==
 
How do commitment schemes work? They seem handy, but complicated. How do I get a committed identity? [[User:Ionas68224|<code>ionas68224</code>]]|[[User talk:Ionas68224|<code>talk</code>]]|[[Special:Contributions/Ionas68224|<code>contribs</code>]]|[[Special:Emailuser/Ionas68224|<code>email</code>]] 11:03, 21 July 2007 (UTC)
* Alice takes a random input x, and computes h(x) where h() represents a cryptographic hash function. Alice then asks Bob to guess the first bit of x. Because h(x) is computationally too hard to invert, Bob has only 50% chance of guessing this bit correctly. After Bob gives Alice his guess, Alice reveals x. If Bob guessed the bit right, he wins this "coin flip", otherwise he loses. Note that if Alice can find a collision in h, Alice can always win by giving Bob messages x or x' depending on his guess. If the cryptographic hash function compresses the input and represents a random mapping, Bob cannot gain any information on the first bit of x due to information theory.
<small>—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:193.190.253.144|193.190.253.144]] ([[User talk:193.190.253.144|talk]] • [[Special:Contributions/193.190.253.144|contribs]]) </small><!-- Template:Unsigned -->
 
**I wanted to know how to make h(x) to protect my identity. That was helpful, but that is how these are used. I want to know how to make one. [[User:Ionas68224|<code>ionas68224</code>]]|[[User talk:Ionas68224|<code>talk</code>]]|[[Special:Contributions/Ionas68224|<code>contribs</code>]]|[[Special:Emailuser/Ionas68224|<code>email</code>]] 07:10, 2 August 2007 (UTC)
 
***Maybe you're thinking of [[Zero-knowledge proof|zero-knowledge proofs]] instead of [[Commitment scheme|bit commitment schemes]]? The latter are for commitment to values, but zero-knowledge proofs can be used to prove your identity. Or are you thinking about the use of hash functions in [[Digital signature|digital signatures]]? Your question is unclear to me, but I think I've given you enough links now to find the answer.
 
I think the questioner is referring to the technique described at [[Template:User committed identity]].--[[User:ArnoldReinhold|agr]] 11:41, 2 August 2007 (UTC)
 
== Where are the links of the different SHA websites? ==
 
Hi. There are probably different links for SHA-1, SHA-2, SHA-512, etc. The problem is, what exactly are those links? The [[Template:User committed identity]] gives the link to SHA-1, and the article gives the links to a few, but where are the links to all the SHA websites, say for example where is the link for calculating SHA-512? Can someone list me the links of those websites, and where you got it from? I don't want links to sites other than those used to directly calculate a string into a hash code, so I don't want any sites with info ''about'' the SHA websites, I want the actual links themselves. I will also ask this question on the refernce desk. Thanks. ~[[User:AstroHurricane001/A|<span style="color:blue;">A</span>]][[User:AstroHurricane001|<span style="color:blue;">H</span>]][[User:AstroHurricane001/D|<span style="color:blue;">1</span>]]<sup>([[User:AstroHurricane001/T|T]][[Special:Contributions/AstroHurricane001|C]][[User:AstroHurricane001/U|U]])</sup> 20:41, 4 October 2007 (UTC)
 
== TOC on the right? ==
 
Is there a specific reason why the TOC is on the right?
 
It could be considered a minor issue, though when I opened the article, for a second I was like "errrrm... Where's the TOC?" (spatial memory anyone?), then I saw it was on the right, and it just seemed odd.
 
It's also not underneath the intro, but rather beside it. It seems inconsistent with about every other article I ever saw in WP (and I've been here for years) --[[User:W2bh|W2bh]] 20:40, 15 November 2007 (UTC)
 
:You're right—It should be on the left: default. The [[WP:MOS|manual of style]] suggests right TOC placement for articles where a photo appears at the left (suggested for biographical articles with a right facing headshot). But there is no such consideration in this article. —[[user:EncMstr|EncMstr]] 20:47, 15 November 2007 (UTC)
 
::I agree. --[[User:Morten LJ|Morten LJ]] 09:31, 16 November 2007 (UTC)
 
== This might be it... ==
 
The sentence "Gilbert and Handschuh (2003) have studied the newer variants and found no weaknesses" has a fact tag. It looks like it's referring to a paper, I did a quick search, and I think this might be it:
:Henri Gilbert, Helena Handschuh: Security Analysis of SHA-256 and Sisters. ''Selected Areas in Cryptography'' 2003: 175-193.
 
Anyway, I did a quick google search and didn't find any other 2003 papers by them. Can someone with more knowledge on the topic who maybe is able to access the paper take a look and see? It'd be nice to get rid of that fact tag. Thanks, [[Special:Contributions/4.21.209.231|4.21.209.231]] ([[User talk:4.21.209.231|talk]]) 04:19, 1 January 2008 (UTC)
 
== Merger proposal ==
 
I suggest that [[sha1sum]] be merged into this article, under an "implementations" section, or similar.
<br/>--[[User:Jerome Charles Potts|Jerome Potts]] ([[User talk:Jerome Charles Potts|talk]]) 19:01, 22 March 2008 (UTC)
 
:I'm not opposed to this in theory, but the content should be seriously revised upon the merge. most of the content is meaningless uninformative or redundant, especially in the context of a subsection of this article. The revised content should also make an effort to establish that this is one of the most common tools used for calculating sha1 hashes at the execution level (a presumption on my part). Similar merges could be possibly proposed for [[md5sum]] and [[chksum]]. Those are my thoughts anyway. -[[User:Verdatum|Verdatum]] ([[User talk:Verdatum|talk]]) 20:37, 23 April 2008 (UTC)
 
::I don't think a merger is appropriate. This article discusses a Unix utility for calculating SHA1 check sums. There are similar utilities for other operating systems and many other programs that calculate SHA hashes. The main article on SHA is not a suitable place for all this information.--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 17:25, 20 August 2008 (UTC)
 
== Operations ==
 
Is "rotfl" a binary operation? I'd suggest to revise the alg/operation table :( <small>—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/81.182.160.76|81.182.160.76]] ([[User talk:81.182.160.76|talk]]) 07:08, 7 April 2008 (UTC)</small><!-- Template:UnsignedIP --> <!--Autosigned by SineBot-->
 
== Unclear/incorrect verbage ==
 
I'm only vaguely familiar with the subject matter so I hesitate to edit, but I think the first sentence under Cryptanalysis, "For a hash function which violates the first criterion listed above..." should have read, "For a hash function that does NOT violate the first criterion above..."
 
Here is where I'm on shakier ground, but does that statement not apply to all hashes, whether they violate the first criterion or not? I think 2^L is an upper bound, isn't it? A hash function that violates the first criterion (do they have names? if so we should use them instead of "first criterion" etc...) allows you to find a message for a given hash value in ''less than'' 2^L.
 
If that's correct, then the first sentence should read, "For a hash function which meets the first criterion listed above, finding a message that corresponds to a given message digest requires a brute force search in 2^L..."
 
[[User:Eshafto|Eshafto]] ([[User talk:Eshafto|talk]]) 15:34, 23 May 2008 (UTC)
 
== Naming redundancy ==
 
The current name for this article is "SHA hash functions", which means "Secure Hash Algorithm hash functions". The name is obviously redundant, and should be renamed something more appropriate, such as "SHA functions" or "Secure Hash Algorithm functions" (although I prefer the former simply because it is shorter, and the acronym is more common). miro modo 00:42, 4 August 2008 (UTC)
 
: In a cryptography book this title would indeed be a little odd. But here on wikipedia the title has some disambiguation purposes. It should be clear from the the title that the article is about the hash function SHA and not the airport SHA or anything else that uses the same acronym. Hence the redundancy in the title may actually be necessary. [[Special:Contributions/85.2.17.222|85.2.17.222]] ([[User talk:85.2.17.222|talk]]) 05:28, 4 August 2008 (UTC)
 
::Yeah, but I don't see a reason not to use the title Secure Hash Algorithm functions, or Secure Hash Algorithm family or just Secure Hash Algorithm. The full name is preferable to an acronym per [[Wikipedia:NAME#Prefer_spelled-out_phrases_to_abbreviations]].--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 11:17, 4 August 2008 (UTC)
 
:::Naming conventions <i>do</i> say to use the spelled-out name "unless the term you are naming is almost exclusively known only by its abbreviation and is widely known and used in that form", so I retract my former statement and concede that the full title is preferable. "Secure Hash Algorithm" (or the other suggestions of ArnoldReinhold all seem fine). By not using the acronym, we should be able to avoid confusion or need for disambiguation. miro modo 13:10, 4 August 2008 (UTC) <small>—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Miro modo|Miro modo]] ([[User talk:Miro modo|talk]] • [[Special:Contributions/Miro modo|contribs]]) </small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
::::But I claim that the SHA hash functions ''are'' almost exclusively known only by the abbreviation. Seriously, how often have you ever seen ''Secure Hash Algorithm'' written out? [[User:Ntsimp|Ntsimp]] ([[User talk:Ntsimp|talk]]) 14:36, 4 August 2008 (UTC)
 
:::::A Google search on "Secure Hash Algorithm" (quoted string) gives 110,000 hits. I've advertised this discussion on the WikiProject Cryptography discussion page to get some more voices.--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 15:07, 4 August 2008 (UTC)
 
:I agree with Ntsimp; [[WP:COMMONNAME]] suggests that "Use the most common name of a person or thing that does not conflict with the names of other people or things"; as far as I can tell, the only thing that "SHA" conflicts with is the Shanghai airport IATA code, which is adequately served by the note at the beginning of the article. And there is no question about it that the abbreviation "SHA" is used nearly exclusively.
 
:However, I think the best "common name" for the article would actually be SHA-1 and SHA-2; given that most of the content is specific to one of these algorithms, does anyone think that maybe a split would be worthwhile? I think that the unified article may be more confusing. Compare this with [[MD4]]/[[MD5]], or [[RC4]]/[[RC5]] for example. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 15:15, 4 August 2008 (UTC)
 
::The policy also says "Avoid the use of abbreviations, including acronyms, in page naming unless the term you are naming is almost exclusively known only by its abbreviation and is widely known and used in that form." I would agree with you if the article was about a single algorithm, such as SHA-1. But it isn't and the full name is widley used when refering to the family as a whole. Even if we had articles on the individual algorithms, we would still want an article on the family, since much of the history, etc. is common.--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 08:41, 5 August 2008 (UTC)
 
:::Well, I did the Google test: "Secure Hash Algorithm" gets 107,000 hits. While 'SHA hash' gets 4,120,000 hits. (I added 'hash' to avoid the hits for other things that is named SHA.) So using the short form "SHA" is about 40 times more common than using the full name "Secure Hash Algorithm". Also, now that we know that some of the SHA hashes are not secure anymore the full name is kind of a misnomer. To me the name is "SHA", and to make the article name clear we need to add the "hash functions" part, so I find the current article name "SHA hash functions" is the best. (And I find it better to use the plural "functions" instead of the confusing word "family".) And we already have redirects from a bunch of other names and links from other hash related articles so should be no problem for readers to find the article. And take a look at the interwiki links, almost all the other language Wikipedias use the short form "SHA".
:::--[[User:Davidgothberg|David Göthberg]] ([[User talk:Davidgothberg|talk]]) 13:12, 5 August 2008 (UTC)
 
::::When I did a Google search on"SHA hash" with the two words quoted to look for occurrences of the phrase, I got 29,100 hits. The [[WP:NAME]] preference for the spelled out title does not depend on frequency. But this is not the most pressing issue of the day and if there isn't more support for the change, I'm willing to drop it.--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 19:04, 5 August 2008 (UTC)
 
:::::I should have explained better. You should search for "SHA hash" without the quotes, since then you get pages with the word "SHA" and the word "hash" but not necessarily right next to each other. Thus you avoid most hits for other SHA things that are not SHA hashes, but still get pretty much all SHA hash related pages. If you search like that then you get 4,120,000 hits.
:::::--[[User:Davidgothberg|David Göthberg]] ([[User talk:Davidgothberg|talk]]) 21:24, 5 August 2008 (UTC)
 
== Comparison of SHA functions table ==
 
The "Collision" column is not clear at all; what is this supposed to represent? Surely not whether the hashing function has collisions or not, since that is a given from /being/ a hashing function. Some more commentary or a better header would be good. [[User:W|W]] ([[User talk:W|talk]]) 10:35, 2 September 2008 (UTC)
 
:It means whether an attack or an actual collision sample has ever been published. I renamed the column header to "Collisions found", is this any better? -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 16:57, 3 September 2008 (UTC)
 
== Question about reversibility ==
 
I'm probably missing something here, but I was wondering, what makes the SHA algorithm difficult to reverse? In the "Cryptographic hash functions" article, it says that good cryptographic hash functions are computationally difficult to reverse. I.e., given an output of a hash function, it would take a tremendous of computing power to find an input that corresponds to that output. But it seems to me that the only way to create such a hash function would be to use a task for which no polynomial algorithm is known. For example, consider this hash function:
* Take the input and map it to two large prime numbers
* Multiply the two prime numbers together; this product of the two primes is the output of the hash function.
To me, the above algorithm actually does seem difficult to reverse, because reversing it would involve factoring the integer, and there is no known polynomial algorithm for integer factorization. But the SHA algorithms contain no multiplication of large primes, or NP-complete problems, or anything like that. They only contain AND, XOR, OR, and leftrotate operations. It seems like all four of those operations would be easy to reverse. If one can reverse every step of the algorithm, then one can reverse the whole thing. Is that right? Again, I feel like I must be missing something, because SHA is widely believed to be secure. Navigatr85 17:34, 8 September 2008 (UTC)
 
:[[Confusion and diffusion]]. While the article talks about block ciphers, all symmetric cryptography primitives share this in their design. The basic idea is to just interleave a large amount of linear operations (addition, XOR, bitwise rotation etc) and non-linear substitutions (S-boxes) &mdash; both very fast on conventional CPUs &mdash; so that no useful reverse function can be derived.
:Problems like integer factorization are commonly used in [[asymmetric cryptography]], but they are very slow &mdash; so oftentimes, instead of digitally signing a long message, instead the application calculates a hash of the longer message, and then signs signs the resulting hash. The same is done with encryption.
:Another quality of symmetric cryptography is that its security increases exponentially with the size; for a block cipher with a 256-bit key it will take 2<sup>256</sup> operations to search the whole key space &mdash; if a faster attack is found, the cipher is denounced broken. In comparison, the complexity function of most asymmetric attacks (like [[GNFS]]) is sub-exponential and very elaborate. (Yes, SHA-1 is considered broken now that it has a 2<sup>63</sup> collision attack &mdash; much less than its natural [[birthday bound]], 2<sup>80</sup>). -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 07:38, 9 September 2008 (UTC)
 
::Navigatr85: I guess it is time for me to bring out my trusty old beginners example. XOR is usually considered a very reversible operation. But consider this case:
:::'''a''' xor '''b''' = '''c'''
::If '''a''' and '''b''' is part of the secret internal state and '''c''' is the output that the attacker gets to see, then the attacker can not figure out any of '''a''' or '''b''' based on just knowing '''c'''.
::Some find it easier to understand if we use addition and actual numbers. Say we use numbers between 0 and 9. Then consider this case:
:::'''5''' + '''3''' = '''8'''
::If '''5''' and '''3''' is part of the secret internal state and '''8''' is the output that the attacker gets to see, then the attacker can not figure out any of '''5''' or '''3''' based on just knowing '''8'''. Since 0 + 8 = 8 and 1 + 7 = 8 and 2 + 6 = 8 and so on.
::--[[User:Davidgothberg|David Göthberg]] ([[User talk:Davidgothberg|talk]]) 08:14, 9 September 2008 (UTC)
 
:::Thanks David, that helps a little bit, but I'm still confused. Isn't a hash considered to be insecure when one is able to find ANY input that hashes to a given output? Using your example, "53" is the secret internal state, and the hash function is to add the digits. So I, the attacker, see the output "8". Then, knowing the hash function, I can easily determine, without searching through the whole keyspace, that the input "17" will have the hash output value 8. So, assume the secret internal state is a password. Then I just use "17" as my password, and then the system hashes it. The system will think it's a valid password because it hashes to the correct value, even though the real password is "53".
:::The example above would be like finding a "collision," if I understand correctly. And the ability to find collisions makes a hash function insecure. More formally, a collision is two different messages with the same digest. Using your addition example, we can take the output "8" and easily reverse it in two different ways to get two different inputs with the same output, like 1 + 7 = 8 and 2 + 6 = 8. It seems to me like one could do the same thing for any algorithm that is simply a combination of addition, AND, XOR, etc. Navigatr85 22:30, 13 September 2008 (UTC)
 
::::That is indeed a collision. With such a trivial hash algorithm, it's easy to exhaustively guess passwords and find a collision rather quickly. That's why the number of bits in the hash is the leading indicator of the strength of the hash. If 16 bit values are used in David's example, there are 65,536 combinations of ''a'' and ''b'' giving a particular hash value (assuming overflow is ignored). The purpose of addition is for that kind of irreversibility. Imagine a multi-stage obfuscation function which adds some bits, rotates others into play and adds different bits. After not many stages, it's very difficult to analyze the permutations, let alone decode any of them. —[[user:EncMstr|EncMstr]] ([[user talk:EncMstr|talk]]) 22:58, 13 September 2008 (UTC)
 
:::::I wasn't talking about exhaustively guessing passwords, I was talking about using another algorithm to reverse the hash function. It still seems to me like any algorithm that uses simple functions would be relatively easy to reverse. For example, you said to "add some bits, rotate others into play, and add." Here's an example of an algorithm that does that:
 
:::::1. Add zeroes to the end of the input until the number of bits is 48. Assign this to the variable x.
:::::2. Split x into three 16-bit chunks, x[0], x[1], x[2]
:::::3. b = x[0] leftrotate 3
:::::4. y = b + x[1]
:::::5. h = y XOR x[2]
:::::6. h is the output value.
 
:::::Now here's an algorithm that reverses the above algorithm and finds a collision:
:::::1. x[2] = 0x0000
:::::2. x[1] = 0x0000
:::::3. b = h
:::::4. x[0] = b rightrotate 3
:::::5. f = x[0] append x[1] append x[2]
 
:::::I guess one could say that the above algorithm is also trivial. So, I guess what I'm asking is, what makes the SHA algorithm nontrivial? The operations AND, XOR, left rotation, addition, etc. are all easy to reverse. And SHA is just made up of many reversible steps, so why isn't the whole thing reversible? Could someone give an example of a short hash function that's not reversible like the one above? Navigatr85 00:01, 21 September 2008 (UTC)
 
::::::No one has responded after a week, whereas my other posts got responses within one day. Does anyone have any thoughts? Navigatr85 19:22, 29 September 2008 (UTC) <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Navigatr85|Navigatr85]] ([[User talk:Navigatr85|talk]] • [[Special:Contributions/Navigatr85|contribs]]) </span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
:::::::Consider the difficulty of reversing the function '''a + b''': The result is 197383812. What is '''a''', where '''b''' is a secret internal value? This is the essence of the irreversibility of functions like SHA-1 and SHA-2. All that is known is '''a''' and '''b''' have the same number of bits. —[[user:EncMstr|EncMstr]] ([[user talk:EncMstr|talk]]) 22:20, 29 September 2008 (UTC)
 
::::::::OK, you're saying the function is '''a + b''', and the input consists of both '''a''' and '''b'''. In that case, you're right, we can't get either '''a''' or '''b'''. But we can easily find a collision. To find a collision, we simply set '''a''' to be any value we want and then subtract '''a''' from the result. We won't find your secret internal value, but we'll find another input that hashes to the given output. For your example of 197383812, I could set '''a'''=111111111, and then '''b''' = 197383812 - 111111111 = 86272701. Navigatr85 16:38, 3 October 2008 (UTC) <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Navigatr85|Navigatr85]] ([[User talk:Navigatr85|talk]] • [[Special:Contributions/Navigatr85|contribs]]) </span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
:Navigatr85: Okay, brace yourself, you made me bring out the [[One-way compression function#Davies-Meyer|Davies-Meyer one-way compression function]]. Let's look at a slightly simplified version of it, a "one-way" function but without the "compression" part. Here goes:
:Assume you have a decent mixer function. It takes a 128 bit input, mixes it with good [[avalanche effect]] so the output looks random and all output bits depend on all input bits, and the output is also 128 bits. Like this:
::<code>output = mix( input )</code>
:Say this mixer function is fully reversible. It can for instance be a block cipher with a fixed key that everyone knows about. So you have an unmix function that everyone knows about, like this:
::<code>input = unmix( output )</code>
:Now to make a one-way function of this mixer function do like this:
::<code>output = mix( input ) XOR input</code>
:Or you can do like this:
::<code>output = mix( input ) + input</code>
:Where "+" here means addition with modular arithmetic. That is, values that become larger than 128 bits are wrapped around and start over from 0.
:Now that is much harder to reverse. Since if you know the output you can of course split it into two values, a and b, that when added becomes the output. But one of the two values must be unmixed, and then it has to become equal to the other value.
::output = a + b
::unmix( a ) == b ?
:But finding two values that equals each other after one of them is unmixed is hard. If the mix function is complex enough then the easiest way to find a working value is to try all 2<sup>128</sup> possible inputs. This makes it hard to find out what the input was. And even if you know the input it is hard to find another input that gives the same output. That is, it is hard to find collisions.
:After you have pondered the above for a while I suggest you take a look at the [[one-way compression function]] article. Worth noting is that SHA hash functions internally do use the "Davies-Meyer one-way compression function". (I just double checked, it is there in the pseudo code, just not so easy to see.)
:--[[User:Davidgothberg|David Göthberg]] ([[User talk:Davidgothberg|talk]]) 20:17, 3 October 2008 (UTC)
 
::Thanks, David, that clears things up. After reading your post, I invented a short function, to help convince myself. I considered the following function, with a four-bit input:
:::output = input + (0101 XOR input)
::There really does seem to be no algorithm to reverse that, without searching through all 16 possible inputs. Incidentally, one possibility you suggested was using a block cipher for the "mix" function, and also doing output = mix(input) XOR input. If I understand correctly, a block cipher involves XORing the input with some known key. So for that example, you would have:
:::output = mix(input) XOR input = (key XOR input) XOR input
::That actually wouldn't work, because the output would always be the key, regardless of the input. So I guess the combination of different operations, like XOR combined with addition, is what makes these algorithms hard to reverse.
::The reason this was bugging me was that reversing the SHA function doesn't involve solving an NP-complete problem or anything like that. Since the NP-complete problems are the ones most likely to be outside of P, it would seem that a hash function based on an NP-complete problem would be the most secure. It seems to me that the reversal of the SHA algorithm is in NP, and believed to be outside of P, but not proven to be outside of P, and also not proven to be NP-complete. Is that right? Navigatr85 18:29, 21 October 2008 (UTC) <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Navigatr85|Navigatr85]] ([[User talk:Navigatr85|talk]] • [[Special:Contributions/Navigatr85|contribs]]) </span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
:Navigatr85: You really should learn to sign your comments. If you look above the edit window there is a special button for it if you have javascript enabled in your browser. Otherwise just add four tildes: <code><nowiki>~~~~</nowiki></code>, which gets translated to your username and time and date when you save your edit.
:Nice four bit example. And right, the mixer function must be different from the extra adding or xoring in of the input, otherwise you just undo the job of the mixer function.
:And regarding block ciphers: No, they do much more than just xoring in the key. They do a lot of mixing. Usually every output bit depends on every input bit. (Read about that in [[avalanche effect]].) You can build the mixer function from a block cipher like this:
::<code>output = mix( input ) = encrypt( key, input)</code>
::<code>input = unmix( input ) = decrypt( key, output)</code>
:So the full one-way function can look like this:
::<code>output = mix( input ) + input = encrypt( key, input) + input</code>
:And as I said, the key can be well known by everyone. The key can even be standardised and published. The key isn't really needed, just that the block cipher expects a key. For most decent block ciphers you can even simply just feed "0000" as the key in this case. (Although some block ciphers do bad mixing if they get such a key, so usually a more random looking key is used.)
:And regarding the NP an N: I don't know much about that, since I am not a mathematician. But I think I know this: Sure, building cryptographic primitives by using the (believed to be) NP-complete problems probably gives an extra security guarantee. Although as you say the algorithms used in the cryptographic building blocks of today perhaps are just as strong, only that we don't know that for sure. But the problem is that as far as I understand the known NP-complete problems are very slow to handle in computers and usually need huge keys. That's why the NP-complete problems normally are not used. (Well, they are used for [[public-key cryptography]], which indeed is very slow and needs huge keys...) But don't quote me on anything NP and N related, since I really don't know much about that.
:--[[User:Davidgothberg|David Göthberg]] ([[User talk:Davidgothberg|talk]]) 13:17, 23 October 2008 (UTC)
 
== Bit-rotted text ==
 
Under "Cryptanalysis and Validation," the phrase "the first criterion listed above" refers to some text which is no longer present in the article (see the intro paragraphs in this revision: http://en.wikipedia.org/w/index.php?title=SHA_hash_functions&oldid=213893308). The paragraph should be rewritten in light of this missing antecedent.
[[Special:Contributions/131.179.192.130|131.179.192.130]] ([[User talk:131.179.192.130|talk]]) 20:12, 14 October 2008 (UTC)
 
== Your opinion: is this useless fancruft? ==
 
I see this article was tagged as "fancruft" briefly -- as containing details almost nobody cares about. Interesting -- the most detailed part I can see is the pseudocode, which matches what we have for other important algorithms (e.g., RC4).
 
The history, analysis info, and so forth seem relevant to many real-world engineers using crypto, and isn't collected in one place under a free-content license. It's also useful as context for anyone trying to understand the current NIST hash competition, the 2008 MD5 crack, and future changes in the SSL protocol. We could perhaps improve the article by removing redundancy, but most of what's here is useful to real people. It's a little analogous to the math articles describing obscure (to non-mathematicians) functions, theorems, and equations: not for everyone but clearly valuable to a real audience.
 
And the most boring sentence probably provides as much value to humanity as the ''Lost'' episode descriptions that seemingly *aren't* fancruft. :)
 
(Incidentally, I'd be for deleting articles about some ciphers that were published and then forgotten. "Published" doesn't mean "notable.")
 
I'm certainly a crypto fanboy, though. Any other perspectives? Particular sections people think don't help? [[Special:Contributions/67.119.195.43|67.119.195.43]] ([[User talk:67.119.195.43|talk]]) 22:10, 2 February 2009 (UTC)
 
== sha-1 collision down to 2^52 now claims Macquarie University and Qualcomm, Australia ==
see paper at: http://eurocrypt2009rump.cr.yp.to/837a0a8086fa6ca714249409ddfae43d.pdf <span style="font-size: smaller;" class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/87.145.199.71|87.145.199.71]] ([[User talk:87.145.199.71|talk]]) 19:15, 29 April 2009 (UTC)</span><!-- Template:UnsignedIP --> <!--Autosigned by SineBot-->
 
:This paper has since been withdrawn because the authors state that their 2<sup>52</sup> estimation was wrong. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 17:47, 31 March 2010 (UTC)
 
== Critique removal ==
 
I removed the following unsourced "critique": "It is well known that if a hash of bit length m is produced from messages of bit length n, then on average $2^(n-m)$ messages will share the same hash key, and consequently for a small fixed value of m, say, 1024, and messages with length of more than 1024/8=128 characters, collisions are going to be extremely likely." It is certainly true that the set of all 1024 bit messages, if SHA1 hashed, would produce large numbers of collisions. All cryptographic hashes promise is that finding two different messages with the same hash is very difficult. Also note that if every human on Earth sends one message a second, that amounts to about 2^58 messages a year, not enough to raise the probability of a single 128-bit hash collision above 50%.--[[User:ArnoldReinhold|agr]] ([[User talk:ArnoldReinhold|talk]]) 23:15, 27 October 2009 (UTC)
 
== Pseudocode formatting ==
 
In the 11:23, 2 October 2009 update, the pseudo code was reformatted in such a way that the loops are no longer clear when they occur. The previous version used indentation to show when the loops occur, now it's left up to the reader to determine. I would like to change it back, but want to see if others agree that it is a problem. [[User:Xylaan|Xylaan]] ([[User talk:Xylaan|talk]]) 19:12, 6 November 2009 (UTC)
 
== New competition dates ==
 
<blockquote>
A new hash standard, SHA-3, is currently under development – the function will be selected via an open competition running between fall 2008 and 2012.
</blockquote>
 
"fall", is that autumn? Of which hemisphere? Southern or Northern? Is it during the months of "fall 2008", and then again during the months of "fall 2012"? [[User:Jesselong|Jesselong]] ([[User talk:Jesselong|talk]]) 16:47, 12 January 2010 (UTC)
 
:Good catch. Too many people forget about the whole round-planet thing! Also distracting is the use of future tense for something that started in the past. [[User:CosineKitty|CosineKitty]] ([[User talk:CosineKitty|talk]]) 20:48, 12 January 2010 (UTC)
 
:I just made a revision based on your comments. What do you think about it? [[User:CosineKitty|CosineKitty]] ([[User talk:CosineKitty|talk]]) 21:07, 12 January 2010 (UTC)
 
::Awesome, thanks for that, looks great! <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Jesselong|Jesselong]] ([[User talk:Jesselong|talk]] • [[Special:Contributions/Jesselong|contribs]]) 11:10, 14 January 2010 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
== SHA-1 and SHA-2 split ==
 
I find that the article is needlessly confusing because most of the information actually only applies to SHA-1. I also think that SHA-0 shouldn't be listed as a hash function in its own right, but rather just a preliminary version of SHA-1, because it was never actually employed in any public standards (as far as I know).
 
Reasons for splitting the article:
* The two algorithms aren't really very similar.
* SHA-2 is currently considered "''the''" replacement for the insecure SHA-1, so it helps to distinguish the broken function from the secure one.
* We can now add hash infoboxes to the article without needing to cover both functions.
* SHA-1 article can be listed in [[:Category: Broken hash functions]] without implying that SHA-2 is broken.
* Is consistent with other crypto articles where versions have separate articles, like [[MD4]] and [[MD5]] (the two are quite similar, moreso than SHA-1 and SHA-2), or [[Tiny Encryption Algorithm|TEA]] and [[XTEA]], or [[Blowfish]] and [[Twofish]].
* [[SHA-3]] will be chosen in a competition and will have '''no''' resemblance nor relation to SHA-1 or SHA-2, so it will need a separate article anyway.
 
I have posted initial drafts in my user space for review:
* Original version, 37969 bytes, 3858 words, 424 lines
* [[User:Intgr/SHA-1]] ([http://en.wikipedia.org/w/index.php?title=User%3AIntgr%2FSHA-1&action=historysubmit&diff=353183401&oldid=353181500 diff from current]) 27805 bytes, 3575 words, 289 lines
* [[User:Intgr/SHA-2]] ([http://en.wikipedia.org/w/index.php?title=User%3AIntgr%2FSHA-2&action=historysubmit&diff=353188030&oldid=353183655 diff from current]) 19453 bytes, 2488 words, 223 lines
 
Currently I haven't done too many changes to the either of the forks, just created it to get a rough idea. Obviously, if the split will be carried out, the articles will diverge further (I'm willing to undertake a modest amount of work). But I want somene else's opinion first. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 18:41, 31 March 2010 (UTC)
 
: '''Agree''' It's a worthwhile split. I would, however, mover some of the SHA-1/SHA2/SHA-3 discussion to a "Secure Hash Algorithm" pseudo-disambiguation page which explains the differences nd summarizes the history. In particular, there was originally "SHA". Very quickly, "SHA-1" was announced, with no reason given. The original was given the [[retronym]] SHA-0. May I edit your drafts? [[Special:Contributions/71.41.210.146|71.41.210.146]] ([[User talk:71.41.210.146|talk]]) 05:48, 1 April 2010 (UTC)
::I was thinking that the SHA-1 lead section can server as such a "pseudo disambiguation". IMO the naming history isn't relevant for SHA-2 because it was published years later.
::Sure, you can edit the drafts, but I'd hold off any larger edits until the new articles are in place. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 07:20, 1 April 2010 (UTC)
 
Well nobody has opposed so far and I'm really itching to finish this, so I'm going ahead with the split. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small>
 
:I completed the split today (sorry, wasn't around for the weekend) -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 17:45, 4 April 2010 (UTC)
 
=== Disambiguation text (proposed) ===
The '''Secure Hash Algorithm''' is one of a number of [[cryptographic hash function]]s published by the [[National Institute of Standards and Technology]] as a U.S. [[Federal Information Processing Standard]]. There are currently three generations of Secure Hash Algorithm:
* [[SHA-1]] is the original 160-bit hash function. Resembling the earlier [[MD5]] algorithm, this was designed by the [[National Security Agency]] (NSA) to be part of the [[Digital Signature Algorithm]]. Originally just called "SHA", it was withdrawn shortly after publication due to an undisclosed "significant flaw" and replaced by the slightly revised version SHA-1. The original withdrawn algorithm is now known by the [[retronym]] "SHA-0".
* [[SHA-2]] is a family of two similar hash functions, with different block sizes, known as SHA-256 and SHA-512. They differ in the word size; SHA-256 uses 32-bit words where SHA-512 uses 64-bit words. There are also truncated versions of each standardized, known as SHA-224 and SHA-384. These were also designed by the NSA.
* '''SHA-3''' is a future hash function standard still in development. This is being chosen in a public review process from non-government designers. An ongoing [[NIST hash function competition]] is scheduled to end with the selection of a winning function, which will be given the name SHA-3, in 2012.
 
I'd title the overview article "Secure Hash Algorithm" and make "SHA hash functions" a redirect to that. [[Special:Contributions/71.41.210.146|71.41.210.146]] ([[User talk:71.41.210.146|talk]]) 05:29, 2 April 2010 (UTC)
 
:Sounds good. However, previous discussions have preferred using "SHA" in the name, rather than "Secure Hash Algorithm" &mdash; because the acronym SHA is ubiquitous, but people often don't even know the expansion. See prior discussions: [[#Naming]] and [[#Naming redundancy]].
:How about creating the "disambiguation" at [[SHA hash functions]]? Or maybe even merge with [[Sha (disambiguation)]]? -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 17:51, 4 April 2010 (UTC)
 
:: Er... that's what redirects are for. I strongly prefer (and am about to implement with edits) having the "real" article title being the full formal name, even if that is unfamiliar, and the familiar names can be redirects. [[Special:Contributions/71.41.210.146|71.41.210.146]] ([[User talk:71.41.210.146|talk]]) 17:52, 5 April 2010 (UTC)
 
:::Well that's against established Wikipedia practice, see [[WP:COMMONNAME]]. -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 19:00, 5 April 2010 (UTC)
 
:::: You're quite right. Still, given that [[SHA]] is out of bounds because it's too short, the copetition is actually between "SHA hash algorithms" and "Secure Hash Algorithm". I don't see how the former is better; It's only marginally shorter, and less obvious if you ''do'' know the initialism. (One question is whether the primary page should be [[Secure Hash Algorithm]] or [[Secure hash algorithm]]; it ''is'' a proper name...) [[Special:Contributions/71.41.210.146|71.41.210.146]] ([[User talk:71.41.210.146|talk]]) 21:21, 5 April 2010 (UTC)
 
:::::True, I have changed my mind. Let's leave it there and see if anyone else notices. ;)
:::::Most articles now link to [[SHA-1]] and [[SHA-2]] directly, so I think it doesn't even matter much.
:::::Upper-cased "Secure Hash Algorithm" is correct (like [[Advanced Encryption Standard]], [[Data Encryption Standard]] etc) -- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 21:34, 5 April 2010 (UTC)
 
----
 
{{talkarchive}}