Running key cipher: Difference between revisions

Content deleted Content added
Capi (talk | contribs)
Example: small clarification on meaning of indicator block
m gnoming at 4am
 
(46 intermediate revisions by 37 users not shown)
Line 1:
{{short description|Type of polyalphabetic substitution cipher}}
{{unreferenced|date=June 2009}}
{{refimprove|date=March 2016}}
In classical [[cryptography]], the '''running key cipher''' is a type of [[polyalphabetic substitution|polyalphabetic]] [[substitution cipher|substitution]] [[cipher]] in which a text, typically from a book, is used to provide a very long [[keystream]]. The earliest description of such a cipher was given in 1892 by French mathematician Arthur Joseph Hermann (better known for founding [[Éditions Hermann]]). Usually, the book to be used would be agreed ahead of time, while the passage to usebe used would be chosen [[random]]ly for each message and secretly indicated somewhere in the message.
 
== Example ==
 
SupposeThe wekey text haveused agreedis toa useportion of ''[[The C Programming Language (book)|The C Programming Language]]'' (1978 edition) as our text, and we are using the ordinary ''[[tabula recta]]'' asis ourthe tableau. WeThe needplaintext tohere sendis the message '"Flee at once'".
 
Page 63, line 1 is selected as the running key:
First, we choose a starting point. Let us choose page 63, line 1:
 
:<blockquote>errors can occur in several places. A label has...</blockquote>
 
WeThe writerunning outkey theis runningthen keywritten under ourthe plaintext:
 
{| class="wikitable" style="text-align:center"
<table>
|-
<tr>
! align="left" | Plaintext
<td>'''Plaintext:'''</td><td>f</td><td>l</td><td>e</td><td>e</td><td>a</td>
| f || l || e || e || a || t || o || n || c || e
<td>t</td><td>o</td><td>n</td><td>c</td><td>e</td>
|-
</tr><tr>
! align="left" | Running key
<td>'''Running key:'''</td><td>E</td><td>R</td><td>R</td><td>O</td><td>R</td>
| E || R || R || O || R || S || C || A || N || O
<td>S</td><td>C</td><td>A</td><td>N</td><td>O</td>
|-
</tr><tr>
! align="left" | Ciphertext
<td>'''Ciphertext:'''</td><td>J</td><td>C</td><td>V</td><td>S</td><td>R</td>
| J || C || V || S || R || L || Q || N || P || S
<td>L</td><td>Q</td><td>N</td><td>P</td><td>S</td>
|}
</tr>
</table>
 
AndThe sendmessage theis messagethen 'sent as "JCVSR LQNPS'". However, unlike a [[Vigenère cipher]], if wethe havemessage tois extendextended, ourthe message,key weis don'tnot repeatrepeated; the key; wetext justitself continue(the ontext from the ''[[The C Programming Language]]'') is used as the key text.and Socan supposebe weextended needfor aany longerarbitrary length. If the message is extended, like:such 'as, "Flee at once. We are discovered'.", Thenthen wethe justrunning key continuecontinues as before:
 
{| class="wikitable" style="text-align:center"
<table>
|-
<tr>
! align="left" | Plaintext
<td>'''Plaintext:'''</td><td>f</td><td>l</td><td>e</td><td>e</td><td>a</td>
|f || l || e || e || a || t || o || n || c || e || w || e || a || r || e || d || i || s || c || o || v || e || r || e || d
<td>t</td><td>o</td><td>n</td><td>c</td><td>e</td><td>w</td><td>e</td>
|-
<td>a</td><td>r</td><td>e</td><td>d</td><td>i</td><td>s</td><td>c</td>
! align="left" | Running key
<td>o</td><td>v</td><td>e</td><td>r</td><td>e</td><td>d</td>
|E || R || R || O || R || S || C || A || N || O || C || C || U || R || I || N || S || E || V || E || R || A || L || P || L
</tr><tr>
|-
<td>'''Running key:'''</td><td>E</td><td>R</td><td>R</td><td>O</td><td>R</td>
! align="left" | Ciphertext
<td>S</td><td>C</td><td>A</td><td>N</td><td>O</td><td>C</td><td>C</td>
|J || C || V || S || R || L || Q || N || P || S || Y || G || U || I || M || Q || A || W || X || S || M || E || C || T || O
<td>U</td><td>R</td><td>I</td><td>N</td><td>S</td><td>E</td><td>V</td>
|}
<td>E</td><td>R</td><td>A</td><td>L</td><td>P</td><td>L</td>
</tr><tr>
<td>'''Ciphertext:'''</td><td>J</td><td>C</td><td>V</td><td>S</td><td>R</td>
<td>L</td><td>Q</td><td>N</td><td>P</td><td>S</td><td>Y</td><td>G</td>
<td>U</td><td>I</td><td>M</td><td>Q</td><td>A</td><td>W</td><td>X</td>
<td>S</td><td>M</td><td>E</td><td>C</td><td>T</td><td>O</td>
</tr>
</table>
 
NextTo we need to tell the recipientdetermine where to find the running key for this message. In this case, we've decided to make up a fake block of five ciphertext characters is subsequently added, with three denoting the page number, and two the line number, using A=0, B=1 etc. to encode digits. Such a block is called an '''indicator block'''. The indicator block will be inserted as the second last of each message. (Of course, manyMany other schemes are possible for hiding indicator blocks).) Thus page 63, line 1 encodes as '"AGDAB'" (06301).
 
FinallyThis weyields cana send thefinal message 'of "JCVSR LQNPS YGUIM QAWXS AGDAB MECTO'".
 
== Variants ==
 
Modern variants of the running key cipher often replace the traditional ''tabula recta'' with bitwise [[exclusive or]], operate on whole [[byte]]s rather than alphabetic letters, and derive their running keys from large files. Apart from possibly greater entropy density of the files, and the ease of automation, there is little practical difference between such variants and traditional methods. If the running key is random, never reused, and kept secret, the result is a [[one-time pad]], a method that provides perfect secrecy (reveals no information about the plaintext).
 
=== Permutation generated running keys ===
== Security ==
 
A more compact running key can be used if one combinatorially generates text using several start pointers (or combination rules). For example, rather than start at one place (a single pointer), one could use several start pointers and xor together the streams to form a new running key, similarly skip rules can be used. What is exchanged then is a series of pointers to the running key book and/or a series of rules for generating the new permuted running key from the initial key text. (These may be exchanged via [[public key]] encryption or in person. They may also be changed frequently without changing the running key book.)
Perhaps surprisingly, the security is usually fairly poor. This is because the [[information entropy|entropy]] per character of both plaintext and running key is low, and the combining operation is easily inverted. This means a [[cryptanalysis|cryptanalyst]] can run guessed probable plaintexts along the ciphertext, subtracting them out from each possible position. When the result is a chunk of something intelligible, there is a high probability that the guessed plain text is correct for that position (as either actual plaintext, or part of the running key). The 'chunk of something intelligible' can then often be extended at either end, thus providing even more probable plaintext - which can in turn be extended, and so on. Eventually it is likely that the running key will be recognised, and the jig is up. This process is sometimes performed as a simple puzzle, for recreation.
 
=== Ciphertext appearing to be plaintext ===
There are several ways to improve the security. The first and most obvious is to use a secret mixed alphabet tableau instead of a ''tabula recta''. This does indeed greatly complicate matters but it is not a complete solution. Pairs of plaintext and running key characters are far more likely to be high frequency pairs such as 'EE' rather than, say, 'QQ'. This skew this causes to the output [[frequency distribution]] is smeared by the fact that it is quite possible that 'EE' and 'QQ' map to the same ciphertext character, but nevertheless the distribution is not flat. This may enable the cryptanalyst to deduce part of the tableau, then proceed as before (but with gaps where there are sections missing from the reconstructed tableau).
 
Traditional ciphertext appears to be quite different from plaintext. To address this problem, one variant outputs "plaintext" words instead of "plaintext" letters as the ciphertext output. This is done by creating an "alphabet" of words (in practice multiple words can correspond to each ciphertext output character). The result is a ciphertext output which looks like a long sequence of plaintext words (the process can be nested). Theoretically, this is no different from using standard ciphertext characters as output. However, plaintext-looking ciphertext may result in a "human in the loop" to try to mistakenly interpret it as decoded plaintext.
 
An example would be BDA (Berkhoff deflater algorithm){{Citation needed|reason=Can't find any explanation about this out there|date=October 2019}}, each ciphertext output character has at least one noun, verb, adjective and adverb associated with it. (E.g. (at least) one of each for every [[ASCII]] character). Grammatically plausible sentences are generated as ciphertext output. Decryption requires mapping the words back to ASCII, and then decrypting the characters to the real plaintext using the running key. Nested-BDA will run the output through the reencryption process several times, producing several layers of "plaintext-looking" ciphertext - each one potentially requiring "human-in-the-loop" to try to interpret its non-existent [[semantic]] meaning.
 
=== Gromark cipher ===
 
The "Gromark cipher" ("[[Gronsfeld cipher]] with mixed alphabet and running key") uses a running numerical key formed by adding successive pairs of digits.<ref>American Cryptogram Association.
[http://cryptogram.org/docs/acayou16.pdf "The ACA and You"] {{Webarchive|url=https://web.archive.org/web/20160403131720/http://cryptogram.org/docs/acayou16.pdf |date=2016-04-03 }}.
2016.</ref>
The [[VIC cipher]] uses a similar [[lagged Fibonacci generator]].
 
== Security and cryptanalysis ==
 
If the running key is truly random, never reused, and kept secret, the result is a [[one-time pad]], a method that provides [[perfect secrecy]] (reveals no information about the plaintext). However, if (as usual) the running key is a block of text in a [[natural language]], security actually becomes fairly poor, since that text will have non-random characteristics which can be used to aid cryptanalysis: for example, [[William F. Friedman]] suggested a [[ciphertext-only attack]] during WWI against most frequent letters encoded by other most frequent letters.<ref>{{Cite web |title=Cryptology: Running-Text Ciphers – Cryptanalysis According to Friedman |url=https://www.staff.uni-mainz.de/pommeren/Cryptology/Classic/7_Aperiodic/AnalFR.html |access-date=2024-01-10 |website=www.staff.uni-mainz.de}}</ref> As a result, the [[information entropy|entropy]] per character of both plaintext and running key is low, and the combining operation is easily inverted.
 
PerhapsTo surprisingly,attack the security is usually fairly poor. This is because the [[information entropy|entropy]] per character of both plaintext and running key is lowcipher, and the combining operation is easily inverted. This means a [[cryptanalysis|cryptanalyst]] canmay run [[Known-plaintext attack|guessed probable plaintexts]] along the ciphertext, subtracting them out from each possible position. When the result is a chunk of something intelligible, there is a high probability that the guessed plain text is correct for that position (as either actual plaintext, or part of the running key). The 'chunk of something intelligible' can then often be extended at either end, thus providing even more probable plaintext -, which can in turn be extended, and so on (for more detailed explanation refer to [[Autokey cipher#Cryptanalysis|Autokey cipher]]). Eventually it is likely that the source of the running key will be recognisedidentified, and the jig is up. This process is sometimes performed as a simple puzzle, for recreation.
 
There are several ways to improve the security. The first and most obvious is to use a secret mixed alphabet tableau instead of a ''tabula recta''. This does indeed greatly complicate matters but it is not a complete solution. PairsAs exploited in Friedman's method, pairs of plaintext and running key characters are far more likely to be high frequency pairs such as 'EE' rather than, say, 'QQ'. ThisThe skew this causes to the output [[frequency distribution]] is smeared by the fact that it is quite possible that 'EE' and 'QQ' map to the same ciphertext character, but nevertheless the distribution is not flat. This may enable the cryptanalyst to deduce part of the tableau, then proceed as before (but with gaps where there are sections missing from the reconstructed tableau).
 
Another possibility is to use a key text that has more entropy per character than typical English. For this purpose, the [[KGB]] advised agents to use documents like [[almanac]]s and trade reports, which often contain long lists of random-looking numbers.
 
Another problem is that the keyspace is surprisingly small. Suppose that there are 100 million key texts that might plausibly be used, and that on average each has 11 thousand possible starting positions. To an opponent with a massive collection of possible key texts, this leaves possible a brute force search of the order of <math>2^{40}</math>, which by computer cryptography standards is a relatively easy target. (See permutation generated running keys above for an approach to this problem).
 
== Confusion ==
 
Because both ciphers classically employed [[novel]]s or [[Bible]]s as part of their key material, many sources confuse the [[book cipher]] and the running key cipher. They are really only very distantly related. The running key cipher is a polyalphabetic substitution, the book cipher is a homophonic substitution. Perhaps the distinction is most clearly made by the fact that a running cipher would work best of all with a book of random numbers, whereas such a book (containing no text) would be useless for a book cipher.
 
== See also ==
Line 75 ⟶ 87:
* [[Topics in cryptography]]
 
{{CryptoCryptography navbox | classical}}
 
== References ==
{{Reflist}}
 
[[Category:Stream ciphers]]