Skip to main content

RSA Encryption System


THE NEED OF ENCRYPTION

In today's world where a lot of secured information like our credit card number, passwords etc., travel around the web, the presence of a secure encryption system is almost inevitable.

We want a method to encrypt a message, send it over the insecure internet connection and allow only the receiver to be able to decrypt and read the original message. This exact problem was solved by Rivest, Shamir and Adleman(RSA) in the year 1978, in their paper A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.
In this post, I will try to explain the method they adopted to create a secure encryption system.

PUBLIC KEY CRYPTOSYSTEM

A public key cryptosystem has the following properties:

1) D(E(M)) = M, if we decrypt an encrypted message, we should get back the message.

2) Both E and D should be easy to compute, if it takes days to encrypt or decrypt the message, then the cryptosystem is not very useful.

3) By publicly revealing E the user does not reveal an easy way to compute D. The encryption key of the users is revealed publicly so that any person can send an encrypted message to the user. However if by using E, someone can obtain D, then the encryption system will fail. So this should not happen.

4) E(D(M ) = M, i.e it is a trap-door one-way permutation. Trap-door, as there is a method to decrypt the message. One-way as by revealing E, we do not reveal D. Permutation as we get back the message by decrypting an encrypted message or encrypting a decrypted message. This property is useful in digital signatures.





img7
                                                        Figure 1

In the above image, we can observe how data is transferred between the sender and the receiver. The sender uses the recipients encryption key to encrypt the message. This transforms the plaintext message to ciphertext. This ciphertext is then sent over the insecure connection to the recipient, who can decrypt the message using his own decryption key, which is exclusive to him.

DIGITAL SIGNATURES

Digital signatures have become commonplace nowadays. They present a very efficient method of saving resources as well as time. However, we require a secure system in place as the signed documents have to be sent over an insecure connection and digital forgery is fairly easy.
Rivest, Shamir and Adleman presented a very elegant method to solve the problem of digital signature in their paper. Here we describe the method.

Bob has to send a signed document(M) to Alice

1)  S = DB(M), Bob uses his decryption key on the message(i,e the document) to create a unique signature.

2) He sends EA(S) to Alice. He encrypts the message using Alice's encryption key to ensure secure communication.

3) Alice decrypts the received ciphertext with DA to obtain S.

4) Now Alice uses Bob's encryption key on S, i.e EB(S), and if the original document is obtained then it ensures that Bob was the one who signed the document as DB is present only with Bob.

Hence the above method provides a way to legally ensure that a signature can be linked to a particular person.
However, we still have to talk about digital forgery. What if someone copy pastes Bob's signature to another document, say M'.
We have nothing to worry about because Bob's signature created in the first step, not only links it to Bob, but also to the document (M) itself. So if the document changes from M to M', Bob's signature will also change from S to S'. Hence a single signature cannot be used on multiple documents and hence forgery of the signatures is not possible.

THE IDEA

The main idea behind the RSA Encryption System is that factorization is a hard problem.
Specifically, if we are given the product of two prime numbers and are asked to find the individual primes which consist of the product, then the best known algorithm to do this is Number Field Sieve (NFS). 
The running time of NFS is O(n1/3) for an integer n.
It is easy to create a product of primes as long as 10^2048. However we can contemplate, how long it will take to obtain the two primes whose product is given.

There was a challenge held till 2007, in which the participants were provided with the number n = p*q, p and q are prime and the participants had to find p and q. The number n was called the RSA number.
RSA-220(220 digits long) was solved last year!

EULER TOTIENT FUNCTION

To understand the math that will follow we have to understand what is the euler totient function. 
φ(n) is the euler totient function, it gives the number of positive integers less than n which are relatively prime to n, i.e gcd(number, n) = 1.
So for example φ(4) = 2 ({1, 3}, gcd(1, 4) = 1 and gcd(3, 4) = 1 but gcd(2, 4) ≠ 1.)
Similarly φ(5) = ({1, 2, 3, 4}, gcd(i, 5) = 1 ∀ i ∈ {1, 2, 3, 4})

We can observe that for every prime p, φ(p) = p-1 (as gcd(i, p) = 1 ∀ i ∈ {1, 2, 3...p-1})

Lets try to calculate the totient function for product of primes, say pq

φ(pq) = (pq − 1) − (p − 1) − (q − 1) .... (1)
pq-1 is the number of numbers less than pq. However there are (p-1) multiples of q among those which will not have a gcd of 1 with pq, similarly there are (q-1) multiples of p which will not have a gcd of 1 with pq. So by subtracting these two quantities from all possible numbers, we get the numbers which have a gcd 1 with pq, which is exactly the value of Euler Totient Function
Simplifying the above equation we get
φ(pq) = pq − p − q + 1 ....... (2)
φ(pq) = (p − 1)(q − 1) ....... (3)

THE METHOD



  • Bob chooses two primes p,q and compute n=pq


  • Bob chooses d with gcd(d , (p-1)(q-1)) = gcd(d , φ(n))=1


  • Bob solves de ≡ 1 (mod φ(n)) and gets e


  • Bob makes (e,n) public and keeps (p,q,d) private


  • Alice encrypts M as C ≡ Me(mod n)


  • Bob decrypts by computing M ≡ Cd(mod n) 

  • THE UNDERLYING MATHEMATICS

    We have to show that the above mentioned method of encrypting and decrypting the message will provide us a way to get D(E(M)) = M and E(D(M) = M
    Now,
    D(E(M)) ≡ (E(M))d ≡ (Me)d(mod n) = Me·d(mod n) ..... (4)


    E(D(M)) ≡ (D(M))e ≡ (Md)e(mod n) = Me·d(mod n) ..... (5)


    So we observe that both the quantities reduce to the same expression.
    If we somehow prove that  Me·d(mod n) ≡ M(mod n), we will be done.

    Me·d ≡ Mk·φ(n)+1(mod n)  (for some integer k) ........ (6)
    This follows from the fact that de ≡ 1 (mod φ(n))

    Now Fermat's little theorem states that Mφ(p) ≡ 1 (mod p) ...... (7)

    From (6) and (7), we get,


    Mk·φ(n)+1 ≡ (Mφ(n))k .M ≡ 1k.M (mod p) ≡ M (mod p) ..... (8)

    Mk·φ(n)+1 ≡ (Mφ(n))k .M ≡ 1k.M (mod q) ≡ M (mod q) ..... (9)

    Now we have to somehow combine (8) and (9) to obtain the result (mod n).

    Simplifying (8) and (9), we have something of the form

    a ≡ b (mod p)   => p | (a-b)  (p divides a-b)  
    a ≡ b (mod q)   => q | (a-b)  (q divides a-b)

    We claim that we can combine the above two equations to give:

    a ≡ b (mod n)   => n | (a-b)  (n divides a-b) (where n = p*q)

    This follows easily because p and q are prime and they divide (a-b).
    Hence if we express the prime factorization of (a-b), it will contain at least two primes which are p and q respectively.
    i,e (a-b) = (p*q)*p1*p2....
    It is clear from the above argument that p*q= n occurs in the prime factorization of (a-b) => n (= p*q) | (a-b) thereby proving the combination property.

    Using the above proved fact we can combine (8) and (9) to give:

    Me·d ≡ Mk·φ(n)+1 ≡ M (mod n) ........ (10)

    Hence we have proved that the RSA system of encryption works and due to the permutation property((4) and (5)), the problem of digital signatures is also resolved.

    A SMALL EXAMPLE

    Consider the case:

    p = 47, q = 59, n = p · q = 47 · 59 = 2773, φ(2773) = 46 · 58 = 2668 and d = 157 e = 17, as 17.157 ≡ 1(mod 2668)

    Let the message be:
      
    ITS ALL GREEK TO ME

    We convert this by substituting a two digit number for each letter, blank = 00, A = 01, B = 02, . . . , Z = 26

    The converted string is:
    0920 1900 0112 1200 0718 0505 1100 2015 0013 0500

    The first block(M=920) can be enciphered as:

    M17 = ((((M)2)2)2)2 ·M = 948 (mod 2773)

    The whole message is enciphered as:
    0948 2342 1084 1444 2663 2390 0778 0774 0219 1655

    Deciphering the first block gives:

    948157 ≡ 920 (mod 2773)

    In the same way we can decipher all the blocks and apply reverse substitution to get back the original message.

    THE LEGEND OF RSA

    The RSA algorithm has remained a secure scheme for sending encrypted messages for almost 40 years, earning Rivest, Shamir, and Adleman the Association for Computing Machinery’s 2002 Alan Turing Award, among one of the highest honors in computer science!

     




    Comments

    1. Brief, point to point explanation. The blog totally explained the RSA encryption in very precise way. Recommended for those who want to understand the encryption in very short time.

      ReplyDelete

    Post a Comment

    Popular posts from this blog

    Streaming Algorithms - Lossy Counting

    We had a discussion recently in our office on one of the famous  streaming algorithms  known as lossy count. I decided to write down my understanding of it focusing on the space complexity discussion and proving why it works in logarithmic space. This  is the original paper introducing this algorithm and most of the information discussed in this blog is (hopefully) a simplified version of this paper relating to lossy count. Problem Statement There are use cases where you may be receiving a continuous stream of data and you want to know which items in that data stream exceed a certain threshold. The data stream could be really big so you can't store it in memory. Moreover you want to answer queries in real time. So here are the challenges this problem imposes: 1. Need to do it in single pass (time constraint) 2. Limited memory 3. Large volume of data in real-time The above issues warrant the need of a smart counting algorithm. Data stream mining to identify events & patterns can

    Consistent Hashing

     I have been meaning to write this blog for some time now but somehow kept on postponing (read procrastinating...). I read about this technique of Consistent Hashing a while back and was mesmerized by its elegance. Recently read it again in this brilliant blog and thought of expressing it in my own words for posterity. So let us begin. Hashing I won't talk too much about hashing since it is a very basic computer science concept. In a word, it means mapping an object to another object. Or more generally, mapping a key to a value where their types don't matter. Mostly the mapping is from a string to int.  There could be multiple different hash functions that can exist which randomize the how the keys are hashed to the values. We are going to consider a simple use case where let's say you have N different servers and there are multiple keys that you want to distribute among those servers. How do you do it? Well, a simple strategy is that you have a function which applies some