What is IND-CPA Security?
IND-CPA security stands for "Indistinguishability of ciphertexts under chosen-plaintext attack". An encryption scheme is said to be IND-CPA secure if an attacker cannot distinguish between two ciphertexts encrypted under the same key. In simpler terms, we can break it down into two main components, Indistinguishability and CPA (Chosen-plaintext attack).
Indistinguishability - According to its Cambridge Dictionary definition, this term refers to the characteristic where it is impossible to judge something as being different when compared to another similar thing.
Chosen-Plaintext Attack - This is an attack in which we assume that the attacker can choose any arbitrary plaintext to be encrypted and the attacker then receives its ciphertext. In this type of attack, the attacker tries to acquire the secret key or alternatively creates an algorithm that decrypts ciphertext that was encrypted using this secret key, all without knowing the secret key itself.
Although the success rate is low, the attacker would have the freedom to obtain more information about the secret key as well as the system being attacked, due to the ability to choose any plaintext to be encrypted by the cipher. This allows the attacker to understand and learn more about the system's behaviour as well as check for patterns in its output ciphertext, based on the given input data.
If we define it in the context of a game, we have an attacker attempting to demonstrate that they can break the encryption scheme with a non-negligible advantage, all while adhering to the rules. The rules of the IND-CPA game are the following:
For this game, we have Mynar (Attacker) and Oliver (Encryptor).
In this game, Mynar is crowned the winner if the advantage is considered non-negligible. This means that the attackers ability to distinguish between the two ciphertexts is significant enough to be practically exploitable which signifies a weakness in the encryption scheme.
When can we know that a system is secure in terms of indistinguishability?
A system is considered secure if there are no adversary that can identify the message choice or win the game with significantly greater probability than an adversary who is simply guessing randomly. In the case of a two-element message, the adversary must not have a probability of identifying the message greater than
or 50%.
IND-CPA Secure Ciphers | !IND-CPA Secure Ciphers |
---|---|
Counter (CTR) | Electronic Code Book (ECB) |
Cipher Block Chaining (CBC) |
One thing to note is that an IND-CPA secure algorithm cannot be deterministic. This means that given an encryption
What is Electronic Code Book (ECB)?
Encryption algorithms are divided into categories depending on the input type, and those two categories are block ciphers and stream ciphers. Block cipher is a cryptographic algorithm that takes a plaintext block of a fixed size and key as input, and then outputs a ciphertext of the same size. Within block ciphers, there are two primary modes of operation which are Electronic Codebook (ECB) and Cipher Block Chaining (CBC).
Among these two primary operations, Electronic Codebook is the easiest due to its direct encryption where each block of input plaintext is directlly encrypted into blocks of output ciphertext. If the message is larger than the predetermined bit size, the message can then be broken down into a bunch of blocks wherein each block is encrypted independently using the same key and then concatenated at the end to receive the final ciphertext output. An example of its operation is shown below.
However, using this operation comes with a certain risk due to identical plaintext blocks resulting in identical ciphertext blocks. This happens because the encryption algorithm behaves deterministically.
When we say an encryption algorithm behaves deterministically, this means that its output is completely determined by its input. For a particular input, the encryption would always give the same output. A figure for better understanding is shown below.
Given a plaintext 'atla', we have:
Character | ASCII Code | Binary |
---|---|---|
a | 97 | 01100001 |
t | 116 | 01110100 |
l | 108 | 01101100 |
a | 97 | 01100001 |
If you look at the figure shown above, the resulting ciphertext for the plaintext '01100001' which represents 'a' in the first block and the last block are the same. This is a risk to the system as it can reveal patterns that attackers can exploit.
The Problem with Electronic Code Book (ECB)
ECB or Electronic Code Book is NOT IND-CPA Secure for the simple reason that it reveals a pattern in your data. In fact, most deterministic encryption algorithms are not IND-CPA secure due to its characteristic of mapping the same initial message to the same ciphertext.
In ECB, each block of plaintext will always result to a defined corresponding ciphertext, while the ciphertext will also always result to a defined corresponding plaintext.
In the IND-CPA game, the attacker can always ask for the encryption of any message. This means that the attacker can just simply ask for the encryption of the attacker's chosen messages for the game. The attacker can then compare the choices of the IND-CPA game with the encryption of the message that the attacker asked to be encrypted. This way, the attacker will always win simply due to the fact that same messages output the same ciphertexts after encryption.
What can we get from this information? Well, since the outputs in ECB are deterministic, patterns may start to show. Since it uses a simple substitution instead of chaining, two identical plaintext blocks will result in two correspondingly identical ciphertext blocks as well. Say, for example, we have the plaintext "hehe" and each character is given one block and encrypted. The encryption of the first 'h' character is the same as the encryption of the second 'h' character. The same is true for the 'e' characters. This makes it easier to spot patterns which may partly (or completely) reveal the message.
Let's have another example. Suppose we have the plaintext "text" wherein each character is divided into blocks and goes through a cipher function that just simply flips the value (e.g. the value 1100 will result to 0011).
First, let's convert each character into its binary equivalent based on its ASCII code. We will have the following:
Character | ASCII Code | Binary |
---|---|---|
t | 116 | 01110100 |
e | 101 | 01100101 |
x | 120 | 01111000 |
t | 116 | 01110100 |
Now, each character has 8 bits. Assuming each block can only accept a maximum of 8 bits, then each block can accept one character from the plaintext. Let's have the diagram below to illustrate it better:
Since we assumed that each block can accomodate an 8-bit input and each character has an 8-bit binary equivalent, then each block contains one character. Each binary value will then go through a cipher function together with a key. Since we also assumed that the cipher function here will just flip the binary value, then the output will just be the reverse of the binary value of the plaintext.
As seen on the diagram, the plaintext 't' has a binary equivalent of 01110100 which became the input for the first cipher block. The output of such cipher block is the reverse, which is 00101110.
However, we can observe that the last cipher block also yielded the same output as the first cipher block: 00101110. Since ECB is deterministic, this can only mean one thing–this cipher block has the same input as the first cipher block.
Oh, no! Due to the deterministic property of ECB, a pattern of the plaintext has been leaked. This will make the message less confidential since we are now aware that the first and last character of the plaintext are just the same characters. This, in a way, makes it easier to get to the plaintext without even originally knowing what the plaintext was.
Supposedly, if we assume that certain information about the plaintext is already known (such as having only 4 letters of the alphabet), then there will still be
Based on our example, originally, there were more than 450,000 possible letter combinations–which is hard to guess–but simply due to the fact that the pattern of the message was leaked, the possible letter combinations went down to a mere 15,600. The pattern leak made it significantly easier to obtain the plaintext based on the ciphertext.
Such is the problem of ECB and all of the deterministic block cipher operations. Being deterministic prevents them from being IND-CPA secure since same inputs produce the same outputs which may compromise the confidentiality of the message by possibly leaking the pattern of the plaintext through the ciphertext.
Alternative Block Cipher Operations
Since we have already seen that Electronic Code Books are not IND-CPA secure, it is probably not a good idea to continue using them. There are alternative Block Cipher operations that are IND-CPA secure, which includes:
CBC (Cipher Block Chaining)
One of the Block Cipher opetations that adheres to IND-CPA security is the Cipher Block Chaining (CBC).
CRT (Counter)
Similar to the concept of CBC, the CTR also utilizes functions such as IV, keys and block cipher encryption, and XOR operation. While CTR has some differences with CBC, it also offers IND-CPA security. The main feature that provide IND-CPA security to this encryption algorithm is the ctr value. The ctr value is generated incrementally meaning there is absolutely no way two identical plaintext blocks can generate identical ciphertext blocks. This is because the increment generation provides an injective function where each increment can only generate a unique ciphertext block.
It can be noticed that this feature solves the infinitesimally small and "… a billion-to-one cosmic fluke." possibility to break CBC's indistinguishability. While this solves CBC's tiny vulnerability, it also opens up new ones which is some form of trade-off.
References:
Cambridge University Press. (n.d.). Indistinguishable. In Cambridge English Dictionary. https://dictionary.cambridge.org/dictionary/english/indistinguishable
Crypto-IT. (2020). Chosen Plaintext Attack. Crypto-IT. https://www.crypto-it.net/eng/attacks/chosen-plaintext.html
CrySyS. (2021). Deterministic, Stateless Encryption Is Not IND-CPA Secure. CrySyS Blog. https://crysec.dev/2021/02/17/deterministic-stateless-encryption-is-not-ind-cpa-secure.html
GeeksforGeeks. (2023). Block Cipher Modes of Operation. GeeksforGeeks. https://www.geeksforgeeks.org/block-cipher-modes-of-operation/
GeeksforGeeks. (2023). Difference between Deterministic and Non-Deterministic Algorithms. GeeksforGeeks. https://www.geeksforgeeks.org/difference-between-deterministic-and-non-deterministic-algorithms/
George Mason University. (n.d.). Lecture 2 ISA 562: Information Security, Theory and Practice. Retrieved from https://cs.gmu.edu/~gordon/teaching/isa562/notes/lecture_2.pdf
Heninger, N. (n.d.). Hash Functions [Slides]. CSE 107A course - UCSD. https://cseweb.ucsd.edu/classes/wi24/cse107-a/slides/6-hash.pdf
Open Oregon State. (n.d.). Chapter 8: Security Against Chosen Plaintext Attacks. Open Oregon State. https://open.oregonstate.education/cryptographyOEfirst/chapter/chapter-8-security-against-chosen-plaintext-attacks
Wikipedia. (2022). Ciphertext indistinguishability. In Wikipedia. https://en.wikipedia.org/wiki/Ciphertext_indistinguishability