# Secrecy in Plain SIght

Security is a vastly important facet of our day to day lives.  It is so ubiquitous, few people recognize that they use it every day. However, if you ever visit a webpage that has “https” in the URL, you are using secure technology.  Most of the early security protocols require that two parties have a secret exchanged between them.  Unfortunately, if someone is listening to the key exchange, they can easily decrypt all of the messages sent between the two parties.  You may find yourself asking, “But isn’t there some better way to exchange secrets?” You’d be right!

The method is called Diffie-Hellman key exchange, and it relies heavily on modular arithmetic.  The amazing thing about this protocol is that the two parties, Alice and Bob (in keeping with the cryptographic tradition of keeping names simple) can even publish their communications in the newspaper.  Anyone can be listening to these two exchange their keys.

Here’s how it works:

First, Alice and Bob pick two numbers, p and g.  P is a prime number, and g is a primitive root mod p.  A primitive root mod p has a very special property, namely that every number that is coprime with p is congruent to some power of g.  For example, if we pick p=5, 3 is a primitive root mod p because:
1,2,3,4 are relatively prime to 5.
30 ≡ 1 mod 5
31 ≡ 3 mod 5
32 ≡ 4 mod 5
33 ≡ 2 mod 5

As you can see, each number that is relatively prime to 5 is represented by some power of 3.  Therefore, 3 is a suitable primitive root mod 5.

For this example, Alice and Bob will decide on p=17 and g = 7.  They can pick these numbers from a well-known list.  What’s more, they can even display these numbers in public.  These will form the foundation of their key.  Next, Alice and Bob both pick their own numbers and keep them secret.  We’ll call Alice’s secret number a, and Bob’s number b. (Original, right?)

Next, Alice determines ga mod p, and Bob determines gb mod p.  We’ll call these c and d, respectively.  Now, Alice and Bob will exchange their values.  Again, this can take place in the open.
Now, Alice has d, and Bob has c.  Next, Alice performs da mod p, and Bob performs ca mod p.  This has the net effect of giving both Alice and Bob the same number.  They are now both in possession of gab mod p.  What’s more, any attacker could have watched the entire exchange, and wouldn’t have gotten anything out of it.

Back to the example:

Alice chooses 3 as her secret number.
Bob chooses 5 as his.
Alice’s transport number is 73 mod 17 ≡ 3
Bob’s transport number is 75 mod 17 ≡ 11
Alice gets 11, and performs 113 mod 17 ≡ 5
Bob gets 3 and performs 35 mod 17 ≡ 5

Alice and Bob now have the same number, which they can use for regular cryptography.  Obviously, in a real situation, Alice and Bob would have chosen much larger numbers, but this suffices for an example.

This is the foundation of cryptography – relying on operations that are easy to perform, but are nearly impossible to reverse.  In this case, exponentiation can be performed on a computer in a reasonable amount of time.  It’s not the fastest algorithm, but it’s far faster than performing the reverse of a modulo operation.  (Incidentally, that problem is called a discrete logarithm, and there is currently no way that it can be done in a reasonable amount of time.)

You may wonder, “I really want to steal nuclear secrets!  Is there some way that I can still eavesdrop on people using Diffie-Hellman?”  You’d be right!  Diffie-Hellman is vulnerable to what is known as a “man-in-the-middle” attack.  This is difficult, as it requires being present at the time the keys are exchanged.  However, the attack is simple.  All that needs to be done is that some attacker Eve (short for eavesdropper) intercepts Alice and Bob’s messages.  When Alice sends Bob her key, Eve steals it, and substitutes her own key.  Bob then responds, and Eve steals his response.  She then gives her own key to Alice.  Now, there are two unbreakable passwords, and Eve has both of them.  However, there is a limitation: Eve needs to be present for the entire conversation, or Alice and Bob will immediately know something is wrong.  This is because the key that Alice has is not compatible with the key Bob has.  Suddenly, all of their communication will be reduced to gibberish, and they will know that they are compromised.  However, it is possible for Eve to hijack their conversation, leaving none the wiser.

We may ask, why not use RSA?  Well, we do!  If we combine RSA with Diffie-Hellman, we gain protection against eavesdroppers and hijackers.  Diffie-Hellman is faster than RSA and can encode larger messages than RSA.  (The messages sent by RSA are limited in size, lest it become easy to crack.)  In addition, using these two approaches combined gives us the ultimate goal of cryptography:  Perfect Forward Secrecy!  If the secret numbers used by Alice and Bob are discarded, nobody can read the messages exchanged between the two, ever.  That’s right, 30 years down the line, when someone digs up Alice’s old machine out of some attic in Nebraska, they won’t be able to read the messages she shared with Bob.  This is the real advantage of Diffie-Hellman, and the reason that it will likely remain in use for a very long time.

Sources:

# Cryptography – Keeping Our Online Secrets Safe Since the 90s A breakdown of time Americans spend with electronic media. Image: Courtesy of http://www.statista.com/chart/1971/electronic-media-use, Felix Richter

We live in an era where the internet is king. Between our cellphones, tablets, game consoles, laptops, and other devices, the average American adult (18+) spends 11 hours per day ingesting electronic media in some way, shape, or form.  I’m sure we can all admit that on a weekly basis we access or create data that we don’t necessarily want the public to see. Whether it be our bank account or credit card information, our Facebook interactions, our emails, our tweets, our PayPal activity, or even our browsing history. That being said, I’m sure some of us take our internet privacy for granted; but how exactly does are internet privacy remain… private? The answer is simple: modular arithmetic. More specifically, cryptographic algorithms.

A History of Cryptography

Cryptography dates back to Egyptian scribes in 1900 B.C., and it was first used in their hieroglyphs. The Egyptians presumably wanted hide the content of their hieroglyphs from others, and they used very basic cryptography to do so. As you can imagine, this whole “keeping a message’s content safe” idea would become widely popular as mankind become more and more intelligent. The Romans, specifically Julius Caesar himself, created the first truly math-oriented cryptography. He used it primarily to protect messages of military significance. Caesar’s cryptographical ideas would later be used to build out modern day cryptography.

There are two main types of cryptography widely used across the web today: symmetric-key encryption, and asymmetric-key encryption (we’ll go into details later, I promise!). Both of these types of encryption rely on modular arithmetic. We must give credit where credit is due. Friedrich Gauss (1777-1855), birthed modular arithmetic in 1801. Believe it or not, this famous mathematician made most of his breakthroughs in his twenties! For those that aren’t familiar with modular arithmetic, here’s a timeless example (pun intended, wait for it…).  The length of a linear line can have a start and end point, or it can go on to infinity in either direction. In modular arithmetic, the length of a “circular” number line is called the modulus. To actually do the arithmetic, consider this example: Take a regular clock (see, here’s the pun!), consisting of the numbers 1-12 . Clocks measure time on a 12 hour time table before starting back over at 1. The modulus for a 12-hour clock is 12 because it has 12 different numbers for the number of hours. To actually do the arithmetic, take this for example: It’s 8PM and we want to add 9 hours (8 + 9 mod 12). 8 +9 equals 17, however when using a modulus of 12, our number line wraps back around after counting to 12. For this we would count forward from 8 – ie. 8, 9 ,10, 11, 12, 1, 2, 3, 4, 5. So, (8 + 9 mod 12) = 5 AM in this case.

Caesar Cipher A basic Caesar Cipher using a left shift of 3. Image: Matt_Crypto, via Wikimedia Commons.

As I said above, the Caesar cipher has acted as a building block for some of our modern day cryptography. Caesar’s main encryption step is incorporated in some of the more complex schemes we still rely on today. However, the Caesar cipher can be easily broken, or decrypted (more on this soon!). This particular cipher is concerned with the alphabet. The theory behind it is replacing each letter in the alphabet with a different letter some fixed number of positions down the alphabet (this is reffered to as the shift). For instance, with a shift of 3, A would replace D, and B would replace E.

Original: ABCDEFGHIJKLMNOPQRSTUVWXYZ

Cipher:   XYZABCDEFGHIJKLMNOPQRSTUVW

This can be represented mathematically using modular arithmetic. The encryption of any letter ‘x” by a shift ‘n’ can be described as follows:

Encryption:

E(x) = (x + n) mod 26

Decryption:

D(x) = (x – n) mod 26

Brute-Force Attacking:

This cipher is extremely easy to break. There are only 26 possible shifts (26 different english letters). When taking a brute-force approach, it’s only a matter of varying through the different shifts until the message is decrypted. In fact, this process could be optimized by analyzing the encrypted string, finding frequently used letters and associating them with common vowels. That way, you could brute force using intelligent shifts. However, this approach would have to be modified when switching between languages.

Cryptography Online

As promised, I will explain the two types of internet cryptography. First, we have symmetric-key cryptography. This is based on the concept that both communicating parties share the same key for encryption as well as decryption. This key is mathematically applied to a numerical equivalent of the data each party is encrypting/decrypting. It is imperative that this key is kept secret. If another party finds out what the key is, none of the encrypted data is safe anymore. Symmetric-key cryptography uses either stream ciphers (encrypt the numerical representation of the data one digit at a time.), or block ciphers (taking blocks of digits, and encrypting them as a whole). Symmetric-key algorithms have an advantage over asymmetric in that they require less computational power. Asymmetric-Key encryption. Anyone can encrypt data using the public key, but the data only be decrypted with the private key. Image: Dave Gothenburg, via Wikimedia Commons.

As for Asymmetric-key cryptography (aka public-key cryptography) we use a slightly different approach. This cryptosystem implements both a private and public key. The public key is used to do the encryption (just like symmetric key cryptography), but the private key is used to do the decryption. The word “asymmetric” stems from the different keys performing opposite functions. This type of cryptosystem is more user friendly, and requires less administration. This is why public-key cryptography is widely implemented across the web.

The RSA Cryptosystem

The RSA cryptosystem is one of the most practical applications of modular mathematics we see today. In fact, if you look at your browser’s address bar right now and you see an “https” at the beginning of your URL, you’re more than likely relying on an RSA encryption to keep your data secure. RSA was created was created in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman at MIT. As far as cryptosystems are concerned, RSA in particular is one of the most straightforward to visualize mathematically. This algorithm consists of three parts: key generation, encryption, and decryption. I will be walking through a widely-used example using 3, and 11. Not only does this process use Gauss’s modular arithmetic, it also uses Euler’s totient function φ(n). (A function that counts the totatives of n – the positive integers less than or equal to n that are relatively prime to n.)

Generating the key is the most confusing part, but here’s a somewhat simplified version (don’t get nauseous!):

1. Randomly pick 2 prime numbers p and q : p=3 q=11
2. Calculate the modulus : n = p * q  ->   3 * 11 = 33
3. Calculate the totient φ(n)  : z = (p – 1) * (q – 1) -> ( 3 – 1) * (11 – 1) = 20
4. Choose a prime number k, such that k is co-prime to z : k=7
5. n and k become the public key
6. Calculate the private key : k * j = 1 mod z | 7 * j = 1 mod 20
7. In the previous step, we’re only interested in the remainder. Since we’re working with small numbers here, we can say –> 21/20 gives us “some number” with a remainder of 1. Therefore – 7 * j = 21 -> j = 3
8. j becomes the private key

After the public and private keys are generated, encryption and decryption become easy!  Given P is the data we’d like to encrypt and E is the encrypted message we want to generate:

P^k = E (mod n)  – When P (data we’d like to encrypt) = “14”  We get: 14^7 = E mod 33  So E=20

Given E is the encrypted data we’ve received, and P is the data we want to decrypt:

E^j = P ( mod n) – 20^3 = P mod 33  So P = 14

This proves RSA works!

Does the RSA Cryptosystem Really Keep Me Safe?

Theoretically, a hacker could factor the modulus “n”  in the steps above. Given the ability to recover the prime factors p and q an attacker can compute the “secret exponent” “d” from the public key (n, e). Once the hacker has this “secret exponent”, they can decrypt all data sent with its matching public key. RSA keeps us safe from hackers because there is no known algorithm (The NSA probably has one!) that can factor these large integers in a timely manner. In fact, the largest known number ever factored was 768 bits (232 digits!) long, and this was done with a supercomputer using a state-of-the-art implementation. If that doesn’t make you feel safe enough, RSA keys are typically 1024 to 2048 bits (617 digits!) long, so we don’t need to worry about our data getting hijacked. However it is recommended that we use a value of n that is at least 2048 bits long to ensure the encryption is never cracked.

Sources:

http://en.wikipedia.org/wiki/RSA_(cryptosystem)

http://www.studentpulse.com/articles/41/a-brief-history-of-cryptography

http://www.ti89.com/cryptotut/mod_arithmetic.htm

http://en.wikipedia.org/wiki/Modular_arithmetic

http://cunymathblog.commons.gc.cuny.edu/

How and Why RSA works:

# Code Breaking: Bletchley Park and Bill Tutte

While brainstorming ideas for a blog post, I found myself wondering if math has ever directly saved lives. After looking into many options, I ran into a story about a place called Bletchley Park. It was known as the Fortress of Secrets and was said to have saved millions of lives yet didn’t even appear on any map. Nicknamed ‘Station X,’ it was solely designated for breaking codes, specifically ciphers. In World War II, direct communication between leaders and various units around the world was a big problem. These orders/war plans were coded and broadcast via wireless radio, but because they could be so easily intercepted, they became increasingly vulnerable. The solution to this problem was the cipher machine. Adopted in 1926, the Germans’ answer was called the Enigma.

The Enigma was thought by the Germans to be unbreakable and safe for them to use. Its code was especially hard to crack because each time a key was pressed, its internal wiring was changed. In light of this, the British started to recruit brilliant mathematicians to engage in a battle to learn the enemy’s secrets. The Enigma machine required 6 people to operate, so Hitler ordered even more security. Thus, the Tunny cipher machine was born (also known as the Lorenz Machine). This machine generated code with its 12 wheels and only required two operators to send and receive information. To function, it would first apply two keys, encoding the message twice. The first cipher used 5 wheels with, the second used another 5, and then 2 additional wheels would cause a stutter of random letters that would try to throw off unauthorized decoders. Each wheel had a different number of spokes or choices on it which resulted in 23*26*29*31*37*41*43*47*51*53*59*61 = 1.6 million billion possible combinations!

Here is an example of coding one letter into another: The initial letter is “A,” and the cipher code is “K.” They would be “added up,” and if the corresponding symbol was different, then you would mark down an “x.” Inversely, if it was the same, then you would mark down a *. Here we can see A being coded into the letter N:

A= x x * * *

K= x x x x *

———————–

* * x x * = N

The code’s downfall began with the Germans’ overconfidence in the Tunny machine. A 4000 character message was sent, but the receiver didn’t quite get it, so a re-send was requested. The sender failed to change the wheel settings and re-sent the 4000 character message but it was slightly different. This provided Bletchley with a data set with which he could attempt to crack the code. John Tiltman, a mathematician who led the research department at Bletchley, initially worked on this break but passed it onto Bill Tutte.

Bill began by putting the 4000 word message into columns and made a rectangle out of it. He then looked for repetitions/patterns. Every 23 characters there was a rotation, but he then thought maybe it was 25. So he tried to multiply 23*25 to see if the pattern extends along that but it was inconclusive. But the pattern did extend along 574. He then thought maybe it was 41 as that is the prime number of 574. Resonance occurring after 41 strokes made him deduce that the first wheel had 41 spokes. He then went to the next wheel and so forth. Bill Tutte managed to diagnose how the machine worked without ever actually seeing the machine.

He also worked out a statistical mathematical method of breaking the Tunny code, called the “1+2 – break in method.” To use this decoding method required a massive amount of number-crunching and checking. This is where his co-worker Thomas Flower’s brilliance came into play. He conceptualized Bill Tutte’s method and produced one of the first computers, The Colossus, the world’s first semi-programmable computer completely invented from scratch. With this cracked code—and the computer to help crack it—huge battles were won. It is widely credited with turning the tide of the entire war.

World War II is estimated to have cost 10 million lives per year. Cracking the Tunny code was said to have shortened the war by at least two full years! However, everything involving these machines and ciphers had to be kept secret. The brilliant men involved could not publicly get credit for their achievements for quite some time. Eventually, the secrets were declassified, and Bill Tutte was awarded a membership of the Royal Society. In 1987, he finally signed his name in the Royal Society book where his signature lies alongside those of Isaac Newton, Charles Darwin, Winston Churchill.

Sources:

BBC Documentary: http://vimeo.com/31185786

http://www.bletchleypark.org.uk/content/hist/worldwartwo/enigma.rhtm

http://www.bbc.com/news/uk-england-suffolk-29064159

http://www.english-heritage.org.uk/bletchleypark

# Cryptography: A modern use for modular arithmetic

The common analogy used to describe modular arithmetic is fairly simple. All one has to do is look at an analog clock. For example, if it’s 11 AM and you want to know what time it will be in four hours, we instinctively know the answer is 3 PM. This is modular arithmetic, i.e. 11+4 = 3 mod 12. This is an important concept in the technology driven world we live in. Any time a product is purchased on the internet, cryptography comes into play. The remainder of this paper (pun most definitely intended) will describe how ancient modular arithmetic plays a very important role in today’s society.

History of modular arithmetic

The first known publication of modular arithmetic was in the 3rd century B.C.E, in the book Elements, written by Euclid. Within his book, he not only formalized the fundamentals of arithmetic, but also proved it. In what is known as Euclids Lemma, he states that if a prime number divides the product of two different numbers (x and y), then the prime number must also divide one of the numbers (either x or y), but it could also be both. Between the 3rd and 5th centuries a paper publish by Sun Tzu describes a modular arithmetic process known as the Chinese remainder theorem. This theorem is essentially the basis for modern RSA encryption schemes that are present on every banking/e-commerce website. It uses a congruent set of keys to produce the same numerical value. Imagine if there was a lock on a door that two differently cut keys could unlock and open, this is essentially how Chinese remainder theorem works.

Modern modular arithmetic Oil painting of mathematician and philosopher Carl Friedrich Gauss by G. Biermann (1824-1908). Public Domain.

The modular arithmetic that we use today was discovered by Carl Friedrich Gauss in 1801.

Gauss is famous for numerous discoveries across a wide variety of fields in science and mathematics. Gauss’s proposition, from his book Disquisitiones Arithmeticae, defines modular arithmetic by saying that any integer N belongs to a single residue-class when divided by a number M. The residue-class is represented by the remainder, which can be from 0 to M-1. The remainder is obtained by dividing N by M. Given this fact, Gauss notices that two numbers that differ by a multiple of M are in the same residue-class. He then discusses the role of negative numbers in modular arithmetic. The following is an excerpt from his book:

“The modulus m is usually positive, but there’s no great difficulty in allowing negative moduli  (classes modulo m and -m are the same).  For a zero modulus, there would be infinitely many residue classes, each containing only one element.  [This need not be disallowed.]”

Modular Arithmetic’s Role Today

RSA encryption is named after those who invented it, Ron Rivest, Adi Shamir, and Leonard Adleman (RSA is obtained from their last names). RSA is the process by which information can be passed between two parties without another individual being able to intercept the message. Burt Kaliski has been one of the major contributors to RSA encryption since the 1980’s. I would like to start off with a passage from Burt Kaliski’s paper titled “The Mathematics of the RSA Public-Key Cryptosystem”:

“Sensitive data exchanged between a user and a Web site needs to be encrypted to prevent it from being disclosed to or modified by unauthorized parties. The encryption must be done in such a way that decryption is only possible with knowledge of a secret decryption key. The decryption key should only be known by authorized parties.”

This is a high level description of how RSA encryption works. It is also called public-key encryption, because anyone can obtain a copy of the encryption key it is publically available, but the decryption key cannot be obtained. This makes RSA encryption a secure way of passing data between an individual and a web site.

Performing this calculation (encrypting and decrypting text) is fairly simple. With a basic understanding of modular arithmetic it can be accomplished. First a public and private key must be produced by following the steps below:

1. Generate large prime numbers, p and q (these should be hundreds of digits)
2. Compute the modulus n, n = p×q
3. Compute the totient, totient = (p-1)×(q-1)
4. Choose an “e” > 1 that is co-prime to the totient
5. Choose a “d” such that d×e = 1 mod totient

Once those steps have been completed, a public key (n, e) and a private key (n, d) have been generated. The public key can be distributed to anyone, but the private key must be kept safe. It’s easy to see that without the modular arithmetic this algorithm would be easy to discern. One could generate pairs of random numbers until a pair is found that when multiplied together, would equal the modulus n found in step two above. From there, it would be easy to find all numbers co-prime to the totient in step three. Modular arithmetic then comes into play, because it allows infinite pairs of numbers to satisfy the constraint listed in step five, but it would not allow the user to decrypt the message. In other words, 11+4 = 3 mod 12, but also 11+16 = 3 mod 12. This makes it impossible to determine what the original number was (it could be 4 or it could be 16, or any other multiple of 12).

Once the keys have been generated it is easy to encrypt and decrypt text. To encrypt a message “m,” given the public key (n,e) generated above:

C = me mod n

“C” is then the encrypted message that gets passed to the other party.

To decrypt the message “C” created above, all that is required is the inverse of the operation to encrypt:

M = cd mod n

Let’s do an example to illustrate the instructions listed above (note: we will be using small prime factors because the math is simpler).

1. Select a p and q that are prime
1. P = 11
2. Q = 3
2. The modulus n is then equal to P×Q = 11×3 = 33
3. Computing the totient to be equal to (p-1)×(q-1) = (11-1)(3-1) = 20
4. To select an “e” we must find a number that is coprime to 20
1. The smallest value that is coprime to 20 is 3 because 3 is the smallest number that cannot divide 20 evenly, so “e” = 3
5. Now we need to find “d”, d=e^(-1) mod [(p-1)×(q-1)]
1. Using the Euclidian Algorithm we get d = 7

Now let’s say we want to encrypt the message “4.” To do this we need to know the public key, which in our case is (n=33, e=3).  All we have to do is compute:

C = 43 mod 33 = 31

We can pass 31 (c=31) along to the website, which will then decrypt it using the private key (33, 7):

M = 317 mod 33 = 4

Our message has been successfully “passed” from one place to another.

Thoughts

Without the work from previous mathematicians, this process would not be possible. Modular arithmetic plays a crucial role in our everyday lives and we don’t even notice it. I think it’s an amazing mathematical concept and provides a deep insight into the world of number theory. Even today there are computers constantly trying to figure out how to factor large prime numbers without success. I don’t know if RSA encryption will stand the test of time, but for now it’s the best we’ve got.

References

http://en.wikipedia.org/wiki/Cryptography#History_of_cryptography_and_cryptanalysis

http://www.britannica.com/EBchecked/topic/920687/modular-arithmetic

http://mathworld.wolfram.com/ChineseRemainderTheorem.html