Cryptography Overview
Cryptography is the practice of disguising information so that only the intended reader can understand it. Most classical ciphers are built on one or both of two main ideas:
- Substitution — swapping each letter for a different letter or symbol.
- Transposition — keeping the letters but changing their order.
This page explains the most well-known ciphers in plain language and shows you how to break them yourself using The Cipher Lab’s interactive Tools and Cipher Encoder / Decoder.
Substitution Cipher
A substitution cipher replaces every letter in the alphabet with another. The mapping stays the same for the entire message — that consistency is exactly what you exploit to break it.
How to Break It (fast)
Use frequency + word patterns. In English, E/T/A/O/I are common. If spaces are kept, word lengths and repeats give huge clues.
What the frequency looks like: Monogram frequencies keep the same overall “English shape” (one letter is still the most common, a few are medium, many are rare), but the labels are swapped. Bigrams/trigrams also look English-ish but with letters renamed.
What leaks (weakness)
- Letter frequency: ciphertext inherits English distribution.
- Repeats: double letters (LL, EE) and repeated short words.
- Word shapes: patterns like _H_ can hint “THE”.
Practical breaking steps
- Run frequency: guess E/T/A/O/I candidates.
- Find “THE”, “AND”, “OF”, “TO” using patterns.
- Lock confirmed letters, then iterate (it snowballs).
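The steps above can be sketched in a few lines of Python. This is only an illustration, not The Cipher Lab's own tooling: the helper names `letter_frequencies` and `apply_partial_key` are made up for this example, and the partial key is whatever guesses you have locked in so far.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs for A-Z letters, most common first."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter(letters).most_common()

def apply_partial_key(ciphertext, key):
    """Show the text with guessed letters filled in and '_' for unknown ones.

    `key` maps ciphertext letters to guessed plaintext letters (hypothetical
    helper for illustration; spaces and punctuation pass through unchanged).
    """
    out = []
    for c in ciphertext.upper():
        if "A" <= c <= "Z":
            out.append(key.get(c, "_"))
        else:
            out.append(c)
    return "".join(out)
```

For example, `apply_partial_key("XYZ XY", {"X": "T", "Y": "H"})` gives `TH_ TH`, which makes the remaining gaps easy to guess. Each new guess goes into the key and the text is redrawn.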
Caesar Cipher
The Caesar cipher shifts every letter by the same number of positions in the alphabet. After Z, it wraps back to A.
How to Break It (fast)
There are only 25 possible shifts. Brute force them and pick the one that produces readable English.
What the frequency looks like: Exactly like normal English, just shifted: the frequency curve is unchanged. A chi‑square match against English over all 25 shifts usually pops the right one immediately.
How it works (math)
Convert letters to numbers A=0..Z=25. Encrypt: (x + s) mod 26. Decrypt: (x − s) mod 26.
Fast sanity checks
- Common hits: “THE”, “AND”, “ING”.
- ROT13 is special: shifting by 13 twice returns the original text, so the same operation both encrypts and decrypts.
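A brute-force attack with chi-square scoring might look like the sketch below. The function names are invented for this example, and the `ENGLISH_FREQ` table holds commonly quoted approximate English letter percentages, so treat the exact numbers as an assumption. Shift 0 is included only to keep the loop simple.

```python
# Approximate English letter frequencies in percent (assumed reference values).
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702, "F": 2.228,
    "G": 2.015, "H": 6.094, "I": 6.966, "J": 0.153, "K": 0.772, "L": 4.025,
    "M": 2.406, "N": 6.749, "O": 7.507, "P": 1.929, "Q": 0.095, "R": 5.987,
    "S": 6.327, "T": 9.056, "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150,
    "Y": 1.974, "Z": 0.074,
}

def caesar_shift(text, s):
    """Shift every A-Z letter by s positions (negative s decrypts)."""
    out = []
    for c in text.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - 65 + s) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)

def chi_square(text):
    """Lower score = letter distribution closer to the English table."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n == 0:
        return float("inf")
    total = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = n * pct / 100
        total += (letters.count(letter) - expected) ** 2 / expected
    return total

def crack_caesar(ciphertext):
    """Return (score, shift, candidate plaintext) for every shift, best first."""
    candidates = []
    for s in range(26):
        plain = caesar_shift(ciphertext, -s)
        candidates.append((chi_square(plain), s, plain))
    return sorted(candidates)
```

On a sentence or more of ordinary English the true shift usually lands at the top of the list. On very short messages it is safer to read the first few candidates.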
Vigenère Cipher
Vigenère uses a keyword to apply multiple Caesar shifts. The shift changes through the message based on the key letters.
How to Break It (fast)
Find the key length first (Kasiski / IoC), then solve each column as a Caesar cipher.
What the frequency looks like: Frequencies get “flattened” compared to English (lower IoC, less obvious E/T/A peaks). If you split the text by key position, each slice looks like a Caesar‑shifted English distribution.
Key length tools
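- Kasiski: repeated ciphertext chunks usually sit a multiple of the key length apart, so the common factors of those distances point to the key length.
- IoC: split into k columns; for the correct k, each column’s IoC rises toward the English value and the columns look “English-like”.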
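The IoC test for key length can be sketched as follows. The function names and the `max_len=12` default are assumptions made for this example, not settings taken from the lab's tools.

```python
def index_of_coincidence(text):
    """Chance that two letters picked at random from the text are the same."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = [letters.count(chr(65 + i)) for i in range(26)]
    return sum(k * (k - 1) for k in counts) / (n * (n - 1))

def average_column_ioc(ciphertext, k):
    """Split the letters into k columns and average the IoC of the columns."""
    letters = "".join(c for c in ciphertext.upper() if "A" <= c <= "Z")
    columns = [letters[i::k] for i in range(k)]
    return sum(index_of_coincidence(col) for col in columns) / k

def rank_key_lengths(ciphertext, max_len=12):
    """Return (average IoC, k) pairs, most English-like split first."""
    scores = [(average_column_ioc(ciphertext, k), k) for k in range(1, max_len + 1)]
    return sorted(scores, reverse=True)
```

Multiples of the true key length also score well, so when 3, 6 and 9 all rank near the top, the smallest is usually the one to try first.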
Then solve
- For each column, run frequency like Caesar.
- Combine shifts → keyword.
- Decrypt and refine (errors stand out quickly).
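Once a key length is chosen, the column-by-column solve can be sketched like this. As a deliberately crude stand-in for a full frequency match, `guess_key` assumes the most common letter in each column is an encrypted E. That is a shortcut chosen for this example, and a chi-square score per column is more reliable. Both function names are made up here.

```python
from collections import Counter

def vigenere(text, key, decrypt=False):
    """Encrypt (or decrypt) A-Z letters with a repeating, non-empty keyword."""
    shifts = [ord(k) - 65 for k in key.upper()]
    out, i = [], 0
    for c in text.upper():
        if "A" <= c <= "Z":
            s = shifts[i % len(shifts)]
            if decrypt:
                s = -s
            out.append(chr((ord(c) - 65 + s) % 26 + 65))
            i += 1  # only letters advance the key position
        else:
            out.append(c)
    return "".join(out)

def guess_key(ciphertext, key_len):
    """Guess each key letter by assuming the top letter per column is E.

    Assumes the ciphertext has at least key_len letters.
    """
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    key = []
    for i in range(key_len):
        column = letters[i::key_len]
        top = Counter(column).most_common(1)[0][0]
        key.append(chr((ord(top) - ord("E")) % 26 + 65))
    return "".join(key)
```

`vigenere("ATTACKATDAWN", "LEMON")` gives `LXFOPVEFRNHR`. If a guessed key decrypts to something that is nearly readable, the wrong key letters show up as garbled text at regular intervals and can be fixed one at a time.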
Affine Cipher
Affine encrypts letters using multiplication and addition mod 26. It’s stronger than Caesar, but with only 12 × 26 = 312 possible (a,b) pairs it is still small enough to brute force.
How to Break It (fast)
Brute force all valid (a,b) pairs and score the outputs for English.
What the frequency looks like: Like Caesar/Substitution: monogram frequencies keep the English-like curve but permuted. A chi‑square test over valid (a,b) pairs is typically very effective.
Formula
With A=0..Z=25: E(x)=(a·x+b) mod 26. Decrypt: D(y)=a⁻¹·(y−b) mod 26.
Valid keys
You need gcd(a,26)=1 so an inverse exists. Valid a values: 1,3,5,7,9,11,15,17,19,21,23,25. b can be 0–25.
Breaking
- Try all valid a and all b.
- Pick the best-looking plaintext.
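Here is a minimal sketch of that brute force. Instead of a full English scorer it filters candidates with a crib (a word you expect in the plaintext), which is a simplification chosen for this example. The function names are illustrative, and `pow(a, -1, 26)` needs Python 3.8 or newer.

```python
from math import gcd

# The 12 multipliers that have an inverse mod 26.
VALID_A = [a for a in range(1, 26) if gcd(a, 26) == 1]

def affine_encrypt(text, a, b):
    """E(x) = (a*x + b) mod 26 on A-Z letters; other characters pass through."""
    return "".join(
        chr((a * (ord(c) - 65) + b) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )

def affine_decrypt(text, a, b):
    """D(y) = a_inv * (y - b) mod 26."""
    a_inv = pow(a, -1, 26)  # modular inverse of a
    return "".join(
        chr(a_inv * (ord(c) - 65 - b) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )

def crack_affine(ciphertext, crib):
    """Try every valid (a, b) and keep decryptions that contain the crib."""
    hits = []
    for a in VALID_A:
        for b in range(26):
            plain = affine_decrypt(ciphertext, a, b)
            if crib.upper() in plain:
                hits.append((a, b, plain))
    return hits
```

With a=5 and b=8, `affine_encrypt("AFFINE CIPHER", 5, 8)` gives `IHHWVC SWFRCP`. Without a crib, the same loop can feed every candidate to a frequency scorer and keep the best few.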
Playfair Cipher
Playfair encrypts pairs of letters using a 5×5 key square (I/J combined). It hides single-letter frequencies — but digraph patterns leak.
How to Break It (fast)
Use digraph frequency + cribs. For serious cracking, heuristic search (hill-climbing) is common.
What the frequency looks like: Single-letter frequencies are less helpful because encryption works on digraphs. You’ll often see fewer obvious English bigrams, and no ciphertext pair ever contains the same letter twice (like “LL”), so digraph patterns dominate instead.
Rules
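- Same row: replace each letter with the one to its right (wrap around at the end of the row).
- Same column: replace each letter with the one below it (wrap around at the bottom).
- Rectangle: replace each letter with the one in its own row and the other letter’s column.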
Plaintext prep
- Split into pairs.
- If both letters of a pair are the same (LL), insert a filler (often X) between them: HE LX LO. A leftover single letter at the end is padded the same way.
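The rules and the plaintext prep can be combined into a short sketch. The function names are made up for this example, the filler is assumed to be X, and edge cases such as a doubled filler letter (“XX”) are not handled.

```python
def build_square(keyword):
    """Return the 25-letter key square in row-major order (J merged into I)."""
    seen = []
    for c in (keyword.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ").replace("J", "I"):
        if "A" <= c <= "Z" and c not in seen:
            seen.append(c)
    return seen

def prepare(plaintext, filler="X"):
    """Split into pairs, inserting a filler inside doubled letters and at the end."""
    letters = [c for c in plaintext.upper().replace("J", "I") if "A" <= c <= "Z"]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else filler
        if a == b:
            pairs.append(a + filler)
            i += 1  # the second letter starts the next pair
        else:
            pairs.append(a + b)
            i += 2
    return pairs

def transform_pair(square, pair, step=1):
    """Apply the three rules to one pair; step=-1 reverses them (decrypt)."""
    ra, ca = divmod(square.index(pair[0]), 5)
    rb, cb = divmod(square.index(pair[1]), 5)
    if ra == rb:            # same row: move right (or left when decrypting)
        ca, cb = (ca + step) % 5, (cb + step) % 5
    elif ca == cb:          # same column: move down (or up when decrypting)
        ra, rb = (ra + step) % 5, (rb + step) % 5
    else:                   # rectangle: keep the row, take the other column
        ca, cb = cb, ca
    return square[ra * 5 + ca] + square[rb * 5 + cb]

def playfair_encrypt(plaintext, keyword):
    square = build_square(keyword)
    return "".join(transform_pair(square, p) for p in prepare(plaintext))
```

With the keyword `PLAYFAIR EXAMPLE`, the message `HIDE THE GOLD IN THE TREE STUMP` encrypts to `BMODZBXDNABEKUDMUIXMMOUVIF`. Decrypting pair by pair with `step=-1` returns the prepared text, fillers included.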
Breaking
- Look at digraphs like TH/HE/IN/ER.
- Try cribs and test quickly with tools.
Hill Cipher
The Hill cipher encrypts blocks of letters using matrix multiplication mod 26. It’s a classical cipher that directly uses linear algebra.
How to Break It (fast)
If you know matching plaintext/ciphertext blocks, you can solve for the matrix. Otherwise, brute force only works for tiny keys.
How it works
- Choose block size n (e.g. 2).
- Convert letters to numbers.
- Compute C = K·P mod 26.
Key requirement
K must be invertible mod 26: det(K) must have an inverse mod 26, which means gcd(det(K), 26) = 1.
Breaking
- Known plaintext: enough pairs → solve for K.
- Otherwise: use automated search with strong scoring.
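For block size 2, the known-plaintext attack comes down to one matrix inversion: put two plaintext blocks in the columns of P and the matching ciphertext blocks in the columns of C, then K = C·P⁻¹ mod 26. The sketch below assumes 2×2 keys only and uses made-up helper names. It also needs the plaintext matrix to be invertible mod 26; if it is not, `pow` raises `ValueError` and a different pair of blocks has to be tried.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices mod 26."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % 26 for j in range(2)]
            for i in range(2)]

def mat_inv(M):
    """Invert a 2x2 matrix mod 26 (ValueError if the determinant has no inverse)."""
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % 26
    det_inv = pow(det, -1, 26)  # Python 3.8+
    return [[M[1][1] * det_inv % 26, -M[0][1] * det_inv % 26],
            [-M[1][0] * det_inv % 26, M[0][0] * det_inv % 26]]

def hill_encrypt(text, K):
    """Compute C = K.P mod 26 block by block; pads an odd length with X."""
    nums = [ord(c) - 65 for c in text.upper() if "A" <= c <= "Z"]
    if len(nums) % 2:
        nums.append(23)  # X
    out = []
    for i in range(0, len(nums), 2):
        p0, p1 = nums[i], nums[i + 1]
        out.append((K[0][0] * p0 + K[0][1] * p1) % 26)
        out.append((K[1][0] * p0 + K[1][1] * p1) % 26)
    return "".join(chr(n + 65) for n in out)

def recover_key(plain, cipher):
    """Known-plaintext attack from the first two blocks (four letters) of each."""
    p = [ord(ch) - 65 for ch in plain.upper()[:4]]
    c = [ord(ch) - 65 for ch in cipher.upper()[:4]]
    P = [[p[0], p[2]], [p[1], p[3]]]  # blocks as columns
    C = [[c[0], c[2]], [c[1], c[3]]]
    return mat_mul(C, mat_inv(P))
```

With K = [[3, 3], [2, 5]], `HELP` encrypts to `HIAT`, and `recover_key("HELP", "HIAT")` returns the same matrix. Decryption is just `hill_encrypt` with `mat_inv(K)` as the key.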
Autokey Cipher
Autokey is a Vigenère variant where the key extends using the plaintext itself, reducing repetition and hiding key-length patterns.
How to Break It (fast)
Use cribs (“DEAR”, “ATTACK”, known headers). Once a chunk of plaintext is guessed, the key stream can unravel.
What changes vs Vigenère
After the initial keyword, the key becomes the plaintext stream. Because the key never cycles, the periodic repeats that give away a Vigenère key length disappear.
Breaking
- Guess a plausible plaintext fragment (crib).
- Use it to derive key letters for that region.
- Because the key becomes plaintext, recovering plaintext grows the key too.
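The crib idea can be sketched as below. The function names are illustrative, and the crib position is assumed to be known or guessed. In practice you slide the crib along the ciphertext and keep the positions where the derived key letters look like words.

```python
def autokey_encrypt(plaintext, keyword):
    """Key stream = keyword followed by the plaintext itself."""
    plain = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    key = [c for c in keyword.upper() if "A" <= c <= "Z"] + plain
    return "".join(chr((ord(p) + ord(k) - 130) % 26 + 65) for p, k in zip(plain, key))

def autokey_decrypt(ciphertext, keyword):
    """Each recovered plaintext letter is appended to the key stream."""
    key = [c for c in keyword.upper() if "A" <= c <= "Z"]
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    plain = []
    for i, c in enumerate(letters):
        p = chr((ord(c) - ord(key[i])) % 26 + 65)
        plain.append(p)
        key.append(p)
    return "".join(plain)

def key_from_crib(ciphertext, crib, pos):
    """Key letters implied by placing `crib` at letter offset `pos`."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    return "".join(
        chr((ord(letters[pos + i]) - ord(p)) % 26 + 65)
        for i, p in enumerate(crib.upper())
    )
```

Encrypting `ATTACK AT DAWN` with the keyword `QUEENLY` gives `QNXEPVYTWTWP`. Placing the crib `DAWN` at offset 8 yields the key letters `TTAC`, which are themselves plaintext from earlier in the message (the middle of ATTACK), so each correct guess exposes more text.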
Transposition Ciphers
Transposition ciphers rearrange letters instead of replacing them. The letter counts are exactly those of the plaintext, so the frequencies still look English, but the words are scrambled.
How to Break It (fast)
Try likely widths (column counts), look for word fragments, and use cribs if you suspect certain words appear.
How to spot
- Frequency looks “normal” but nothing decodes like substitution.
- Vowels appear at normal rates.
Breaking workflow
- Test a range of widths (for example 2 to 20) and score each output for English.
- For columnar: test short keys or use heuristic search for longer keys.
- Cribs: try placing a suspected word into a grid.
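Here is a sketch of the width test for the simplest case: text written into rows of a fixed width and read off column by column, left to right, with no keyword reordering. That layout is an assumption made for this example, as are the function names and the crib-based filter.

```python
def columnar_encrypt(text, width):
    """Write the text in rows of `width`, then read the columns left to right."""
    return "".join(text[i::width] for i in range(width))

def columnar_decrypt(cipher, width):
    """Rebuild the columns (the first few may be one letter longer) and read rows."""
    n = len(cipher)
    rows, extra = divmod(n, width)
    cols, pos = [], 0
    for i in range(width):
        length = rows + (1 if i < extra else 0)
        cols.append(cipher[pos:pos + length])
        pos += length
    return "".join(cols[i % width][i // width] for i in range(n))

def try_widths(cipher, crib, max_width=20):
    """Return (width, plaintext) for every width whose output contains the crib."""
    hits = []
    for w in range(2, max_width + 1):
        plain = columnar_decrypt(cipher, w)
        if crib in plain:
            hits.append((w, plain))
    return hits
```

`columnar_encrypt("WEAREDISCOVERED", 4)` gives `WECREDOEAIVDRSE`, and trying widths with the crib `DISCOVER` recovers the message at width 4. A keyed columnar cipher adds a column permutation on top of this, which is where short-key testing or heuristic search comes in.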
Bacon Cipher
Bacon’s cipher hides letters as patterns of As and Bs — often disguised using two text styles (case, font, bold/normal).
How to Break It (fast)
Identify the two styles, map them to A/B, group into 5s, and decode.
Decoding steps
- Pick a rule: Style 1 = A, Style 2 = B.
- Read off the A/B stream.
- Group into chunks of 5.
- Convert each chunk into a letter (variant-dependent).
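The decoding steps can be sketched as follows. Two things here are assumptions for the sake of the example: the two styles are lowercase (A) and uppercase (B), and the alphabet is the 26-letter variant in which each group of five is the letter's position written in binary (AAAAA = A, AAAAB = B, and so on). The older 24-letter variant, where I/J and U/V share codes, needs a different table.

```python
def bacon_decode(carrier):
    """Decode a message hidden in letter case (lowercase = A, uppercase = B)."""
    # Non-letters carry no information and are skipped.
    stream = "".join("B" if c.isupper() else "A" for c in carrier if c.isalpha())
    out = []
    # Ignore any leftover symbols that do not fill a complete group of five.
    for i in range(0, len(stream) - len(stream) % 5, 5):
        group = stream[i:i + 5]
        value = int(group.replace("A", "0").replace("B", "1"), 2)
        out.append(chr(65 + value) if value < 26 else "?")
    return "".join(out)
```

For instance, `bacon_decode("heLLO wOrld")` reads the case pattern as AABBB ABAAA and returns `HI`. If the result is gibberish, swapping which style counts as A is the first thing to try.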
Common gotcha
Copy/paste often destroys formatting. If styles disappear, use screenshots or inspect the HTML.
Morse & Enigma
Morse is an encoding (no secret key), while Enigma is a rotor-based cipher that changes substitution with every key press.
Morse basics
- Not a cipher: it’s reversible without a secret.
- Spacing matters: letter gaps vs word gaps.
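Because Morse needs no key, decoding is a table lookup once the gaps are known. The sketch below assumes a common text convention, with one space between letters and ` / ` between words. Other sources mark the gaps differently, and only the letters A to Z are covered here.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}
DECODE = {code: letter for letter, code in MORSE.items()}

def morse_decode(signal):
    """Decode dots and dashes; unknown symbols become '?'."""
    words = signal.strip().split(" / ")
    return " ".join(
        "".join(DECODE.get(symbol, "?") for symbol in word.split())
        for word in words
    )
```

If the letter gaps are lost, the same run of dots and dashes can be split in many ways, which is why spacing matters so much.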
Enigma overview
Enigma uses rotors + plugboard. The “key” is rotor order, ring settings, start positions, and plugboard pairs.
How it was broken (high level)
- Cribs (guessed plaintext fragments).
- Constraints + automation (bombes).
Frequency analysis
Frequency analysis is the classic first step in breaking many “pen-and-paper” ciphers. It doesn’t guess the key. It measures the shape of the text: which letters and patterns repeat, and how closely that resembles real language.
What it looks at
- Letter counts (A–Z): In English, E/T/A/O/I/N are common. Substitution ciphers keep these counts (just relabeled).
- Digraphs & trigraphs: Common pairs like TH, HE, IN and triples like THE, AND.
- Index of Coincidence (IoC): A quick “how clumpy are letters?” signal. Uniformly random letters score about 0.038; typical English text scores about 0.067.
- Chi‑square vs English: A score of how far the letter distribution deviates from expected English.
- Repeats: Repeated n‑grams can hint at transposition vs substitution, and can help key‑length guesses in polyalphabetic ciphers.
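For reference, the two scores are usually defined as shown below. These are the standard textbook forms, and the lab's tools may scale them differently (some tools multiply the IoC by 26, for example). Here n_i is the count of letter i in the text, N is the total number of letters, and p_i is the expected proportion of letter i in English.

```latex
\mathrm{IoC} = \frac{\sum_{i=A}^{Z} n_i\,(n_i - 1)}{N\,(N - 1)}
\qquad
\chi^2 = \sum_{i=A}^{Z} \frac{(n_i - N p_i)^2}{N p_i}
```

A lower chi-square means the text is closer to English. For letters drawn uniformly at random the IoC tends toward 1/26.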
What it tells you (fast triage)
- Substitution vs transposition: transposition preserves letter counts exactly (so E/T/A are still the common letters), while substitution keeps the shape of the distribution but changes which letters are common.
- Is it polyalphabetic (e.g. Vigenère)? IoC often drops compared to plain English when multiple alphabets are used, and periodic structure can appear.
- Is it probably not a classic cipher? Very high entropy / lots of non‑letters can suggest encoding (Base64/Hex), compression, or binary data.
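A rough version of this triage can be sketched in code. The cutoffs (`ioc_cutoff=0.055` and `max_non_letter_ratio=0.5`) are guesses picked for this example, not values taken from the lab's tools, and short texts give noisy IoC values. The non-letter check is especially crude: it flags hex or binary dumps, but Base64 is mostly letters and will slip past it.

```python
from collections import Counter

def only_letters(text):
    return "".join(c for c in text.upper() if "A" <= c <= "Z")

def index_of_coincidence(letters):
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

def top_ngrams(text, n, k=5):
    """Most common n-letter sequences (spaces and punctuation ignored)."""
    letters = only_letters(text)
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    return Counter(grams).most_common(k)

def triage(text, ioc_cutoff=0.055, max_non_letter_ratio=0.5):
    """Very rough first guess at the cipher family (assumed thresholds)."""
    stripped = "".join(text.split())  # whitespace does not count either way
    if not stripped:
        return "empty input"
    letters = only_letters(stripped)
    if len(letters) / len(stripped) < 1 - max_non_letter_ratio:
        return "probably an encoding or binary data"
    if index_of_coincidence(letters) >= ioc_cutoff:
        return "monoalphabetic substitution or transposition"
    return "polyalphabetic or random-looking"
```

When the answer is “monoalphabetic substitution or transposition”, the top letters settle it: if E, T and A are already the most common, suspect transposition; if other letters take their place, suspect substitution.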
How to use it in the lab
- Paste text into Encoder / Decoder to strip/normalize if needed (remove spacing, keep letters only, etc.).
- Run analysis: check IoC + chi‑square first, then scan the top bigrams/trigrams.
- Use the result to pick the right tool:
- Looks monoalphabetic: try Caesar/Affine/Substitution.
- Looks transposition‑ish: try Railfence/Columnar/Permutation.
- Looks polyalphabetic: try Vigenère and test likely key lengths.
Frequency analysis won’t magically “solve” everything, but it’s a powerful filter: it narrows your search space so you don’t waste time on the wrong family of ciphers.