What is Encryption and How Does It Work?
The Problem with Open Channels
Imagine a standard online chat room where users exchange text messages. How would we build a secure version of it with encrypted messages? The first implementation is a simple TCP-based communications channel. Since there is no security, every message users send is open to attack. When Alice and Bob text each other, an attacker can sit between them and eavesdrop. An attacker in that position can also alter the messages and reroute them, which is known as a man-in-the-middle attack. This is possible because the default communication channel passes messages in plaintext. The same is true of all HTTP communication over open Wi-Fi networks. Clearly, we need a better system.
Symmetric encryption
Symmetric encryption uses an algorithm that converts the original plaintext message into an encrypted ciphertext message using an encryption key. The recipient uses the same key to convert the ciphertext back to plaintext. Let’s apply this to our application. When Alice wants to send Bob a message, she encrypts it with a symmetric key. When Bob receives it, he uses the same key to decrypt the message. Without the key, attackers cannot read the encrypted communication between the two users, so it stays confidential. Typically, a symmetric key is generated per session and is invalid for subsequent communication. We call it a session key.
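To make the idea concrete, here is a minimal sketch of a symmetric round trip using only the Python standard library. The cipher is a toy assumed for illustration (an XOR against a keystream derived with HMAC-SHA256); the names `keystream`, `encrypt`, and `decrypt` are hypothetical, and a real application would use a vetted algorithm such as AES from an audited library.

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce + ciphertext; a fresh random nonce is drawn for every message."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, regenerate the same keystream, and XOR it away."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Alice and Bob share this key for the session; an attacker does not have it.
session_key = secrets.token_bytes(32)
blob = encrypt(session_key, b"Hi Bob, it's Alice")
recovered = decrypt(session_key, blob)

# Decrypting with any other key yields unrelated bytes, not the message.
wrong_key_output = decrypt(secrets.token_bytes(32), blob)
```

The same `session_key` drives both directions, which is exactly what makes the scheme symmetric.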
This approach has two problems:
- Scalability: Our solution does not scale. If 1,000 users want to communicate with each other, each of them needs 999 different keys to establish secure channels, which comes to 1,000 × 999 / 2 = 499,500 keys across the whole system.
- Key Distribution: We assumed that both parties have access to the symmetric key, but how do they get this key in the first place? If Alice generates a symmetric key (session key) and sends it over to Bob, an attacker could intercept it and decrypt all further communication.
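Asymmetric encryption
Asymmetric encryption uses two keys: a private key and a public one. When the plaintext is encrypted with the public key, it can only be decrypted with the corresponding private key, and vice versa. This gets around the two problems with symmetric keys: each user needs only one key pair, and the public key can be shared openly. Asymmetric encryption is slower than symmetric encryption, so the two are typically used in tandem. Alice generates a session key, encrypts it with Bob’s public key, and sends it to Bob. Only Bob’s private key can decrypt it, and from then on both sides use the session key to encrypt the conversation symmetrically.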
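The key-pair relationship can be sketched with textbook RSA and deliberately tiny numbers. This is an illustration under loud assumptions: the primes, the `apply_key` helper, and the integer standing in for a session key are all hypothetical, and textbook RSA without padding is not secure in practice.

```python
# Toy RSA key pair for Bob built from tiny primes (insecure, illustration only).
p, q = 61, 53
n = p * q                      # modulus shared by both keys: 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

bob_public = (e, n)
bob_private = (d, n)


def apply_key(key, value: int) -> int:
    """Textbook RSA: raise the value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


session_key = 1234             # stand-in for a real session key; must be < n here

# Alice encrypts the session key with Bob's public key ...
wrapped = apply_key(bob_public, session_key)
# ... and only Bob's private key turns it back into the original value.
unwrapped = apply_key(bob_private, wrapped)

# The reverse direction also round-trips, which is what digital signatures rely on.
signed = apply_key(bob_private, session_key)
checked = apply_key(bob_public, signed)
```

Real systems use key sizes of 2048 bits or more together with a padding scheme, but the round trip has the same shape.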
This scheme still leaves two problems:
- Authentication: We are using Bob’s public key as a starting point, but how did we get it? The public key we received could have come either from Bob or from an impersonator, an attacker. In that case we would be communicating securely, but with the wrong person.
- Data Integrity: The message could be altered during the transfer of data. We’ll want to make sure that the data has not been tampered with.
Certificates and Digital Signatures
Authentication requires a trust system. A trusted certificate authority (CA) vouches that a public key belongs to a specific person. Each user of the system obtains a digital certificate from the CA. It contains the owner’s identity information and a public key. So, when Alice wants to communicate with Bob, she can verify, via the CA’s signature on his certificate, that the public key she received does indeed belong to Bob. This is also how HTTPS works on the Internet. One root certificate is linked to various child certificates with digital signatures (described below). So, how do we know that the certificate received is from the root CA and not from the attacker? Typically, the root certificates of trusted CAs ship with the browser or operating system, which gives us a trusted baseline.
The problem of data integrity can be solved using digital signatures (not to be confused with digital certificates). When Alice wants to send a message to Bob, she first creates a session key and encrypts it with Bob’s public key. Let’s call this data packet PART1. Then, she creates a message hash using a hashing algorithm such as SHA-256 (older algorithms such as MD5 are no longer considered secure). A message hash is a one-way conversion from a variable-length byte sequence to a fixed-length one. You can’t get the original message from the hash value, and it is computationally infeasible to find two messages with the same hash value. After creating the hash, Alice encrypts it with her private key. This is called a digital signature, since it can be used to check that the message has come from Alice and has not been tampered with. The digital signature and the original message are then encrypted with the session key. Let’s call this one PART2. Here is what we have now:
PART1 = BOB’S_PUBLIC_KEY -> (SESSION_KEY)
PART2 = SESSION_KEY -> (MESSAGE + DIGITAL_SIGNATURE)
Alice sends both PART1 and PART2 to Bob. Since he owns the matching private key, only Bob can decrypt PART1 and access the SESSION_KEY. Next, he uses this session key to decrypt PART2 and retrieve the message and the digital signature. He then uses Alice’s public key to decrypt the digital signature and retrieve the message hash. Finally, Bob calculates the hash of the MESSAGE himself and compares it to the one recovered from the signature. If the two hashes match, the message came from Alice and has not been tampered with.
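The whole exchange can be sketched end to end with the standard library. Everything here is an assumption made for illustration: the RSA keys are textbook RSA built from Mersenne primes (trivially breakable, chosen only to avoid external dependencies), the symmetric cipher is a toy XOR keystream, and the function names `alice_send` and `bob_receive` are hypothetical.

```python
import hashlib
import hmac
import secrets

E = 65537  # public exponent used for both toy key pairs


def make_keypair(p: int, q: int):
    """Build a textbook RSA key pair from two primes (insecure, illustration only)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))
    return (E, n), (d, n)  # (public key, private key)


alice_public, alice_private = make_keypair(2**521 - 1, 2**607 - 1)
bob_public, bob_private = make_keypair(2**1279 - 1, 2**2203 - 1)
SIG_LEN = (alice_public[1].bit_length() + 7) // 8  # bytes needed for a signature


def rsa(key, value: int) -> int:
    exponent, modulus = key
    return pow(value, exponent, modulus)


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher; the same call encrypts and decrypts."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def alice_send(message: bytes):
    session_key = secrets.token_bytes(32)
    # PART1 = BOB'S_PUBLIC_KEY -> (SESSION_KEY)
    part1 = rsa(bob_public, int.from_bytes(session_key, "big"))
    # Digital signature: the message hash encrypted with Alice's private key.
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    signature = rsa(alice_private, digest).to_bytes(SIG_LEN, "big")
    # PART2 = SESSION_KEY -> (MESSAGE + DIGITAL_SIGNATURE)
    part2 = xor_stream(session_key, message + signature)
    return part1, part2


def bob_receive(part1: int, part2: bytes):
    session_key = rsa(bob_private, part1).to_bytes(32, "big")
    plain = xor_stream(session_key, part2)
    message, signature = plain[:-SIG_LEN], plain[-SIG_LEN:]
    # Recover the hash Alice signed and compare it with a freshly computed one.
    signed_digest = rsa(alice_public, int.from_bytes(signature, "big"))
    actual_digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return message, signed_digest == actual_digest


part1, part2 = alice_send(b"Meet at noon")
message, ok = bob_receive(part1, part2)

# Flip one bit of the encrypted message to simulate tampering in transit.
tampered = bytes([part2[0] ^ 1]) + part2[1:]
_, tampered_ok = bob_receive(part1, tampered)
```

With the untouched packets the hashes match, while the flipped bit changes the decrypted message and the comparison fails. Production code would rely on an audited library and a standard signature scheme rather than raw exponentiation.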