How Code Breakers Work


While there are hundreds of different codes and cipher systems in the world, there are some universal traits and techniques cryptanalysts use to solve them. Patience and perseverance are two of the most important qualities in a cryptanalyst. Solving a cipher can take a lot of time, sometimes requiring you to retrace your steps or start over. It is tempting to give up when you are faced with a particularly challenging cipher.

Another important skill to have is a strong familiarity with the language in which the plaintext is written. Trying to solve a coded message written in an unfamiliar language is almost impossible.

Navajo Code Talkers
During World War II, the United States employed Navajo Native Americans to encode messages. The Navajos used a code system based on how their language translated into English. They assigned terms like "airplane" to code words such as "Da-he-tih-hi," which means "Hummingbird." To encipher words that didn't have a corresponding code word, they used an encoded alphabet. This encoded alphabet used Navajo translations of English words to represent letters; for instance, the Navajo word "wol-la-chee" meant "ant," so "wol-la-chee" could stand for the letter "a." Some letters were represented by multiple Navajo words. The Navajo language was so foreign to the Japanese that they never broke the code [source: Kahn].

A strong familiarity with a language includes a grasp of the language's redundancy.

Redundancy means that every language contains more characters or words than are actually needed to convey information. The rules of the English language create redundancy -- for example, no English word will begin with the letters "ng." English also relies heavily on a small number of words. Words like "the," "of," "and," "to," "a," "in," "that," "it," "is," and "I" account for more than one quarter of the text of an average message written in English [source: Kahn].
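
To make the point about common words concrete, here is a minimal Python sketch that measures what share of a text consists of the ten words listed above. The function name `common_word_share` and the sample sentence are made up for this illustration, and a single short sentence will not necessarily match the one-quarter figure, which describes average English text.

```python
import re

# The ten common words listed in the paragraph above.
COMMON_WORDS = {"the", "of", "and", "to", "a", "in", "that", "it", "is", "i"}

def common_word_share(text):
    """Return the fraction of words in `text` that appear in COMMON_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in COMMON_WORDS)
    return hits / len(words)

# Hypothetical sample sentence, chosen only for demonstration.
sample = "It is tempting to give up when the cipher is hard and the text is long."
print(f"{common_word_share(sample):.0%} of the words are among the ten listed")
```

Running the same function over a longer passage gives a better feel for how heavily English leans on this small vocabulary.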

Knowing the redundant qualities of a language makes a cryptanalyst's task much easier. No matter how convoluted the cipher is, the message beneath it follows some language's rules so that the recipient can understand it. Cryptanalysts look for patterns within ciphers to find common words and letter pairings.

One basic technique in cryptanalysis is frequency analysis. Every language uses certain letters more often than others. In English, the letter "e" is the most common letter. By counting up the characters in a text, a cryptanalyst can see very quickly what sort of cipher he has. If the letter frequencies of the ciphertext follow the same uneven pattern as ordinary English, with a few very common symbols and many rare ones, the cryptanalyst may conclude that he's dealing with a monoalphabetic cipher, in which each plaintext letter is always replaced by the same cipher symbol.
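
The counting step can be sketched in a few lines of Python. This is an illustration only: the simple shift cipher (`caesar_shift`), the sample sentence and all function names are assumptions introduced here, not something the article describes. The sketch counts the letters in a ciphertext, assumes the most common one stands for "e," and uses that guess to undo the shift.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, relative frequency) pairs, most common first."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, n / total) for letter, n in counts.most_common()]

def caesar_shift(text, shift):
    """Shift each letter a-z by `shift` places; leave other characters alone."""
    out = []
    for c in text.lower():
        if "a" <= c <= "z":
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(c)
    return "".join(out)

# Hypothetical message, deliberately rich in the letter "e".
plaintext = "meet me here when the seven elders sense the secret"
ciphertext = caesar_shift(plaintext, 3)

top_letter, share = letter_frequencies(ciphertext)[0]
# Assumption: the most common ciphertext letter stands for "e".
guessed_shift = (ord(top_letter) - ord("e")) % 26

print(ciphertext)
print(caesar_shift(ciphertext, -guessed_shift))
```

The guess works here because the sample was chosen so that "e" dominates. On short or unusual messages the most common letter may be something else, and a general monoalphabetic cipher needs the whole frequency ranking, not just the top letter.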

[Figure: Frequency Table. This chart shows the frequency with which each letter in the English language is used. ©HowStuffWorks 2007]

In the next section, we'll look at more complex cryptanalysis and the role luck plays in breaking a cipher.

Tricks of the Trade
Cryptographers use many methods to confuse cryptanalysts. Acrophony is a method that encodes a letter by using a word that starts with that letter's sound. "Bat" might stand for "b," while "cunning" could stand for "k." A polyphone is a symbol that represents more than one letter of plaintext -- a "%" might represent both an "r" and a "j," for example. Homophonic substitution does the reverse, using different cipher symbols to represent the same plaintext letter -- "%" and "&" could both represent the letter "c." Some cryptographers even throw in null symbols that don't mean anything at all.