Enigma Machine image from Hackaday

Taxonomy of Secret Writing

(Simon Singh, 1999)

Cryptograms

A cryptogram is a puzzle in which a message is encoded by replacing each letter in the original text with some other letter. The substitution pattern remains the same throughout the message. Your job in solving a cryptogram is to figure out this correspondence.
In this story, Poe describes the technique of assuming that the most common letters in the coded message correspond to the most common letters in English, which are E, T, A, O, I, N, S, H, R, D, L, and U.
One of the most famous cryptograms was written by Edgar Allan Poe in his short story “The Gold Bug.” Edgar Allan Poe (1809-1849)

Edgar Allan Poe (1809-1849)

(This image has a secret message embedded in it!)

In “The Gold Bug,”, Poe describes the technique of assuming that the most common letters in the coded message correspond to the most common letters in English, which are E, T, A, O, I, N, S, H, R, D, L, and U.

Poe’s Cryptogram Puzzle

53‡‡†305))6*;4826)4‡*)4‡);806*;48†8¶
60))85;1‡(;:‡*8†83(88)5*†;46(;88*96*
?;8)*‡(;485);5*†2:*‡(;4956*2(5*–4)8¶
8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡
1;48†85;4)485†528806*81(‡9;48;(88;4(
‡?34;48)4‡;161;:188;‡?;

The frequencies for the characters in the above message are as follows:

Character	Frequency
8	33
;	26
4	19
‡	16
)	16
*	13
5	12
6	11
(	10
†	8
1	8
0	6
9	5
2	5
:	4
3	4
?	3
¶	2
-	1
*	1

The frequency of letters in English text is as follows:

Character	Frequency
E	12.70%
T	9.10%
A	8.20%
O	7.50%
I	7.00%
N	6.70%
S	6.30%
H	6.10%
R	6.00%
D	4.30%
L	4.00%
C	2.80%
U	2.80%
M	2.40%
W	2.40%
F	2.20%
G	2.00%
Y	2.00%
P	1.90%
B	1.50%
V	0.98%
K	0.77%
J	0.15%
X	0.15%
Q	0.10%
Z	0.07%

If we combine the two frequency charts for the first few characters, we get this:

Poe Character	Poe Count	English Character	English Frequency
8	33	E	12.70%
;	26	T	9.10%
4	19	A	8.20%
‡	16	O	7.50%
)	16	I	7.00%

Because the 8 character is found 33 times, we can assume that it is the E plaintext:

53‡‡†305))6*;4E26)4‡*)4‡);E06*;4E†E¶
60))E5;1‡(;:‡*E†E3(EE)5*†;46(;EE*96*
?;E)*‡(;4E5);5*†2:*‡(;4956*2(5*–4)E¶
E*;40692E5);)6†E)4‡‡;1(‡9;4E0E1;E:E‡
1;4E†E5;4)4E5†52EE06*E1(‡9;4E;(EE;4(
‡?34;4E)4‡;161;:1EE;‡?;

We can assume that ; is the T:

53‡‡†305))6*T4E26)4‡*)4‡)TE06*T4E†E¶
60))E5T1‡(T:‡*E†E3(EE)5*†T46(TEE*96*
?TE)*‡(T4E5)T5*†2:*‡(T4956*2(5*–4)E¶
E*T40692E5)T)6†E)4‡‡T1(‡9T4E0E1TE:E‡
1T4E†E5T4)4E5†52EE06*E1(‡9T4ET(EET4(
‡?34T4E)4‡T161T:1EET‡?T

Because we see a number of T4E examples, we can then assume that 4 is H:

53‡‡†305))6*THE26)H‡*)H‡)TE06*THE†E¶
60))E5T1‡(T:‡*E†E3(EE)5*†TH6(TEE*96*
?TE)*‡(THE5)T5*†2:*‡(TH956*2(5*–H)E¶
E*TH0692E5)T)6†E)H‡‡T1(‡9THE0E1TE:E‡
1THE†E5TH)HE5†52EE06*E1(‡9THET(EETH(
‡?3HTHE)H‡T161T:1EET‡?T

We can use these techniques to decrypt the entire message (there are no spaces):

AGOODGLASSINTHEBISHOPSHOSTELINTHEDEV
ILSSEATFORTYONEDEGREESANDTHIRTEENMIN
UTESNORTHEASTANDBYNORTHMAINBRANCHSEV
ENTHLIMBEASTSIDESHOOTFROMTHELEFTEYEO
FTHEDEATHSHEADABEELINEFROMTHETREETHR
OUGHTHESHOTFIFTYFEETOUT

So, the final decryption would be (with punctuation added by the reader):

A good glass in the bishop’s hostel in the devil’s seat twenty-one degrees and thirteen minutes northeast and by north main branch seventh limb east side shoot from the left eye of the death’s-head a bee line from the tree through the shot fifty feet out.

Shannon and Weaver’s “Mathematical Model of Communication”

Shannon’s Model of Cryptography

The Enigma Machine

Relevant parts for this seminar: 37:01 – 39:25

Important Properties of the Enigma Code

The decryption team at Bletchley was able to exploit the following facts about the Enigma machine:
- The encoding is symmetrical.
- The Enigma machine can never map a character into itself.
- The steckerboard does not affect the transformation pattern of the rotors, but only the characters to which the outputs of that rotor are assigned.
We can test the Enigma machine with the Enigma Machine Simulator.
The codebreakers were also helped by the fact that the Germans were often both careless and overconfident. In believing they had an unbreakable encoding machine, they failed to take adequate measures to safeguard the integrity of their communications.

Breaking the Enigma Code

The most common technique used at Bletchley Park was the known-plaintext attack, in which the codebreakers guess that a particular sequence of characters exists somewhere in the decoded message. A sequence of characters that you guess is part of the plaintext is called a crib.

Breaking an Enigma message required the following steps:
- Align the crib with the ciphertext to eliminate crashes in which a letter appears to map to itself.
- Create a menu recording the links between letter pairs in the crib and ciphertext.
- Identify loops in the menu at which a chain of letter pairs cycles back to the original letter.
- Use the loops in the menu to create a wiring pattern for an electromechanical device called a Bombe that searches for settings of the Enigma rotors that produce the observed pattern.

Step 1: Align the Crib and Ciphertext

Step 3: Find the Loops

Other Ciphers

Let’s discuss a very simple cipher, called a Caesar Cipher. Julius Caesar used this cipher to communicate with his military generals, and although it was simple, it was supposedly enough to keep secret messages a secret, even from parties that captured the messages.
A Caesar cipher works as follows: it is a substitution cipher that substitues one letter with another, in a similar way to Edgar Allan Poe’s cryptogram. The alphabet is simply shifted some number of places, wrapping around at the end. In English, a Caesar Cipher could be offset by three letters, as follows:

XYZABCDEFGHIJKLMNOPQRSTUVW

With this shift, A in a plaintext message would become X in a ciphertext message, B would become Y, etc., as follows:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎↕︎
XYZABCDEFGHIJKLMNOPQRSTUVW

The plaintext WE LEAVE AT DAWN would be encrypted to ciphertext as TB IBXSB XQ AXTK.
The problem with this cipher is exactly the same as with Poe’s cryptogram: it is easily broken with the use of simple frequency analysis. It is even more vulnerable if you know it is a simple shift – once you know one letter, you instantly have all the other letters.
A very slightly better cipher is a Monoalphabetic substitution cipher, which randomly assigns letters to the alphabet, and this is exactly what Poe’s cipher was.
One example of a more secure cipher is a transposition cipher.

Rectangular Transposition Cipher Example

A transposition cipher rearranges the letters of a message rather than substituting them. The rectangular (or columnar) transposition is one of the most common types.

How It Works

Choose a keyword (this becomes your key)
Write the message in rows under the keyword
Number the columns based on alphabetical order of the keyword letters
Read out the columns in numerical order to create the ciphertext

Step-by-Step Example

Message: MEET ME AT THE LIBRARY

Keyword: CRYPTO

Step 1: Set up the grid with the keyword

C R Y P T O

Step 2: Number columns by alphabetical order of keyword letters

C = 1st in alphabet → column 1
O = 2nd in alphabet → column 2
P = 3rd in alphabet → column 3
R = 4th in alphabet → column 4
T = 5th in alphabet → column 5
Y = 6th in alphabet → column 6

C R Y P T O
1 4 6 3 5 2

Step 3: Write the message in rows (remove spaces)

C R Y P T O
1 4 6 3 5 2
-----------
M E E T M E
A T T H E L
I B R A R Y

Step 4: Read columns in numerical order (1, 2, 3, 4, 5, 6)

Column 1 (C): M, A, I → MAI
Column 2 (O): E, L, Y → ELY
Column 3 (P): T, H, A → THA
Column 4 (R): E, T, B → ETB
Column 5 (T): M, E, R → MER
Column 6 (Y): E, T, R → ETR

Ciphertext: MAIELY THAETB MERETR

(Spaces added for readability - in practice, remove all spaces)

Final ciphertext: MAIELYTHAETBMERETR

Decryption Process

To decrypt, reverse the process:

Know the keyword length and calculate grid dimensions
Recreate the numbered column headers
Fill in columns in numerical order
Read across the rows

Security Notes

Weakness: Letter frequencies remain unchanged
Strength: Disrupts common letter patterns and bigrams
Key security: Longer keywords and irregular message lengths increase security

Try It Yourself!

Message: MEET FOR TEA AT TEN

Keyword: ZEBRA

Can you encrypt this message using the same method?

The Vigenère Cipher

An even harder cipher is a Vigenère cipher, which also uses a keyword, and combines it with a Ceasar cipher.
In a Vigenère cipher, the letters of the keyword determine how many places to shift the alphabet. So, for the keyword KEY, the K means that the first letters is shifted 11 places (with wrapping), the E shifts the alphabet 5 places, and the Y shifts the alphabet 25 places. Then, the letters are used again (K, then E, then Y, etc.). This makes it much harder to apply frequency analysis, becuase ciphertext letters will not map to the same plaintext letters each time.

Perfectly Secure Cryptography

It is possible in theory to have a perfectly secure way of transmitting messages between two parties, as long as the parties can guarantee that the key to the message is kept secure.
The idea, called a *one-time-pad`, extends the Vigenère cipher by making the key long enough so that it never repeats. Additionally, complete randomization of letters (i.e., no words or other guessable or discoverable text) makes it provably impossible to break.
Why is it impossible to break without access to the key? Becuase there exists a key to decrypt into any and all possible messages that could have been sent.
Here is an example:

Ciphertext: JQKQEQYBNWFWEB

If the key was ZQBAUZQZFIZIUZ, this would decrypt to DRIVETOTHEWEST or DRIVE TO THE WEST.
However, if the key was REYESEMIVENESR, the message would decrypt to SWIMINTHERIVER or SWIM IN THE RIVER.
Which is it? It’s impossible to tell!
One time pads are difficult to use in practice for a few reasons:
- The amount of key characters must be the same as the amount of plaintext characters, which could be a lot
- There has to be some way to pass the key between the parties securely prior to messages being sent.
- Any reuse of the key creates opportunities to break the code.
In practice, two parties might use a one-time-pad to encrypt a key, which could then be used for some other type of encryption method that is easier to use.

Do we need to pass keys between parties?

The need for two parties to pass a private between them can be difficult. This is known as The Key Distribution Problem, and its solution is both beautiful and effective.
The problem of distributing encryption keys—no matter whether they are complete codebooks used in one-time pads or shorter keys used in other encryption strategies—is hard only if the keys must be kept secret.
The idea that keys can be made public at first seems crazy. After all, if a key is public than anyone can use it.
That fact, however, causes problems only if knowing the public key makes it possible for an eavesdropper to read messages to which they should not have access. If knowing the public key lets anyone encrypt a message but allows only the intended recipient to decrypt that message, the problem goes away. There is still a problem of trusting the public key’s origin, but we will cover that later.
A coding strategy that allows encryption keys to be shared but protects decryption keys is called public-key encryption.

GCHQ Invents Public-Key Encryption

Public-key cryptography was invented in England by the Government Communications Headquarters (GCHQ), which succeeded the Government Code and Cipher School (GCCS) that broke the Enigma code in World War II.
The basic ideas of public-key encryption were developed in the early 1970s by the GCHQ cryptographers James Ellis, Clifford Cocks, and Malcolm Williamson. Their work went well beyond the fundamental concepts to anticipate many of the practical cryptographic protocols that were subsequently rediscovered in the United States.
Unfortunately, none of the later researchers knew about the British cryptographic work. At GCHQ, everything relating to public-key cryptography remained classified until late in the 1990s, long after this technology was commercialized.

Rediscovery of the idea in the U.S.

The first unclassified announcement of public-key encryption appeared in 1976 in a paper by Stanford graduate student Whit Diffie and his advisor Martin Hellman. The two appear in the picture to the right along with another early collaborator, Ralph Merkle. Diffie and Hellman won the 2016 Turing Award.
The Stanford researchers not only laid out the basic structure of public-key encryption but also proposed an implementation based on a variant of the subset-sum problem, which is an NP-complete problem.
Unfortunately, the specific variant of the problem they chose was considerably easier to solve.

How Public-Key Encryption Works

Assume two people, Bob and Alice, want to exchange encrypted messages.
First, Bob creates a pair of keys labeled $D_B$ and $E_B$.
Then, Bob publishes $E_B$ as his public key. Anyone in the world can access this key, including Alice, and including any eavesdropper.
Bob keeps $D_B$ hidden as his private key.
Alice performs an encryption algorithm with $E_B$ (Bob’s public key) to encrypt her message.
Only someone with $D_B$ can decipher the message, meaning that only Bob can decipher the message.
The encryption with the public key is a one-way function, meaning that the plaintext can be encrypted to ciphertext using $E_B$, but $E_B$ cannot be used to decipher the ciphertext back to plaintext.

RSA Makes the Idea Practical

In 1977, a team consisting of Ron Rivest, Adi Shamir, and Len Adleman—all then at MIT—developed a practical implementation of public-key encryption, which they called RSA after their initials.
The RSA scheme relies on the difficulty of factoring large numbers. As long as the number is large enough—typically many hundreds of digits—revealing the product of two prime numbers $p$ and $q$ does not allow an eavesdropper to recover the original values.

Key Generation in RSA

As in any public-key encryption scheme, RSA requires each potential recipient to generate two keys, a public key that allows anyone to send an encrypted message and a private key that ensures that only the recipient can decrypt that message.
Generating an RSA key pair requires the following steps:
1. Choose two prime numbers $p$ and $q$.
2. Define the variable $n$ as $pq$ and the variable $t$ as $(p - 1)(q - 1)$.
3. Choose an integer $d < n$ so that $d$ is relatively prime to $t$. Two integers are relatively prime if they share no common factors.
4. Set $e$ to be the modular inverse of $d$ with respect to $t$, which is the integer for which the product $de$ divided by $t$ leaves a remainder of $1$. The fact that $d$ and $t$ are relatively prime means that this value is unique.
5. Release $n$ and $e$ as the public key and use $d$ as the private key.

Encryption and Decryption in RSA

Given the public-key parameters $n$ and $e$, the sender creates an encrypted message $c$ using the following steps:
1. Convert the message into a binary integer $m$ by using the internal character codes. If the message length exceeds the size of $n$, the sender must break it into smaller pieces and encrypt each one individually.
2. The sender then computes the ciphertext $c$ as follows: $$c = m^e\text{ mod }n$$
3. The recipient restores the original message by calculating $$m' = c^d\text{ mod }n$$

A small key generation example

Choose two prime numbers, $p$ and $q$.

$p = 11$

$q = 23$
Define the variable $n$ as $pq$ and the variable $t$ as $(p-1)(q-1)$.

$n=253$

$t=220$
Choose an integer $d < n$ so that $d$ is relatively prime to $t$.

$d = 17$
Set $e$ to be the modular inverse of $d$ with respect to $t$.

$e=13$, or $(17 \times 13 = 221; 221\text{ mod }220 = 1$
Release $n$ and $e$ as the public key and use $d$ as the private key.

A small encryption example

$$ n = 253 \newline e = 13 \newline \overline{d = 17} $$

Convert the message into an integer $m$.

$m=\text{‘A’}=65$
Create the ciphertext $c$ by calculating $m^e\text{ mod }n$.

$m^e = 369720589101871337890625$

$c = 76$
Validate the encryption by calculating $c^d\text{ mod }n$.

$c^d=94152329294455713577749264203776$

$m = 65$

The Magic of Modular Arithmetic

The very small example on the preceding slide requires the calculation of $m^e\text{ mod }n$ for $m = 65$, $e = 13$, and $n = 253$.

The numbers stay much smaller if you apply the $\text{mod}$ operator as you go along.

Without mod	With mod
$m^1 = 65$	$m^1\text{ mod }n = 65$
$m^2 = 4225$	$m^2\text{ mod }n = 177$
$m^3 = 274625$	$m^3\text{ mod }n = 120$
$m^4 = 17850625$	$m^4\text{ mod }n = 210$
$m^5 = 1160290625$	$m^5\text{ mod }n = 241$
$m^6 = 75418890625$	$m^6\text{ mod }n = 232$
$m^7 = 4902227890625$	$m^7\text{ mod }n = 153$
$m^8 = 318644812890625$	$m^8\text{ mod }n = 78$
$m^9 = 20711912837890625$	$m^9\text{ mod }n = 10$
$m^{10} = 1346274334462890625$	$m^{10}\text{ mod }n = 144$
$m^{11} = 87507831740087890625$	$m^{11}\text{ mod }n = 252$
$m^{12} = 5688009063105712890625$	$m^{12}\text{ mod }n = 188$
$m^{13} = 369720589101871337890625$	$m^{13}\text{ mod }n = 76$

A large key generation example

Choose two prime numbers, $p$ and $q$.

$p = 1728361$

$q = 7943851$
Define the variable $n$ as $pq$ and the variable $t$ as $(p-1)(q-1)$.

$n=13729842258211$

$t=13729832586000$
Choose an integer $d < n$ so that $d$ is relatively prime to $t$.

$d = 4576614086081$
Set $e$ to be the modular inverse of $d$ with respect to $t$.

$e=6689545226321$
Release $n$ and $e$ as the public key and use $d$ as the private key.

A large encryption example

$$ n = 13729842258211 \newline e = 6689545226321 \newline \overline{d = 4576614086081} $$

Convert the message into an integer $m$.

Text = Hello

In binary: 01001000 01100101 01101100 01101100 01101111

In decimal: $m=310939249775$
Create the ciphertext $c$ by calculating $m^e\text{ mod }n$.

$c = 6571163089559$
Validate the encryption by calculating $c^d\text{ mod }n$.

$m = 310939249775$

Another cool use for Public Key Encryption: Digital Signatures

Because of the nature of one-way functions, we can reverse the process to create a digital signature that demonstrates that a document was created by a particular person or entity.
It works as follows: instead of using a public key to encrypt a document, the owner of the keys uses their private key to encrypt the document. What this means is that anyone with the person’s public key can decrypt the document, but only that public key decripts it. Therefore, the only person that could have encrypted the document is the original person, so therefore it is guaranteed that the user is the originator of the document.
Note that this does not provide any security for the documents – if someone wants to encrypt a document as well as digitally sign it, they should first digitally sign it, and then encrypt it with someone else’s public key.

The Key Distribution Problem

While Public Key Cryptography is a tremendously useful and widely used encryption scheme, it is not without its faults. One fault is that in order to encrypt a document with a public key, there has to be a trust that the public key belongs to the user who is going to decrypt it.
For example, if you make an online purchase at, say, Taobao, this is what happens:
When You Navigate to Taobao
1. DNS Resolution
  - Your browser queries DNS to find Taobao’s IP address
  - Trust: You’re trusting your DNS provider (ISP) isn’t lying about Taobao’s address
2. TLS Handshake
Taobao presents an SSL/TLS certificate containing their public key
- Public Key Use: Your browser uses this to establish an encrypted connection
- Trust Chain:
  - Taobao’s certificate is signed by a Certificate Authority (like DigiCert)
  - Your browser trusts that CA because it’s in your browser’s root certificate store
  - You’re trusting Mozilla/Google/Apple decided this CA is trustworthy
During Your Session
1. Browsing and Login
  - All communication encrypted using keys derived from that initial handshake
  - Trust: That the encrypted connection actually goes to Taobao, not an imposter
2. Payment Processing
  - When you enter credit card info, it’s encrypted for transmission
  - Often there’s another layer - Taobao forwards this to payment processors (Stripe, their own payment system, etc.)
  - Additional Trust: Taobao’s relationship with payment processors, your bank’s validation systems
What You’re Really Trusting
- Certificate Authority System:
  - A handful of organizations (DigiCert, Let’s Encrypt, etc.) that your browser automatically trusts
  - If any of these CAs are compromised or malicious, they could issue fake certificates for Taobao
- Browser Vendors:
  - Microsoft, Google, Mozilla decide which CAs to trust
  - They push updates to root certificate stores
- Taobao Itself:
  - That they’re properly securing their private keys
  - That their servers haven’t been compromised
- Your Device:
  - That your computer/phone isn’t compromised
  - That the browser software itself is trustworthy
Potential Attack Scenarios
- Rogue CA: A certificate authority issues a fake Taobao certificate to an attacker
- DNS Hijacking: You’re directed to a fake Taobao site with its own valid certificate
- Government Interception: State actors with access to CAs or infrastructure
- Compromised Root Store: Malicious updates to your browser’s trusted CA list

Conclusion

Digital encryption can take many forms, and it is truly a great idea in computer science, because it enables secure communications around the world.