What are Initialization Vectors and how should they be exchanged?
tl;dr: The Initialization Vector (IV) is an unpredictable random number used to make sure that when the same message is encrypted twice, the ciphertext is always different. It should be exchanged, in public, as part of the ciphertext.
We don't want the same message to encrypt to the same ciphertext every time, because that leaks information. Suppose Alice is asking Bob a series of "yes or no" questions, and that the crypto software they're using makes the mistake of always encrypting the same message to the same ciphertext.
Here is their encrypted conversation:
Alice: DCF0C50A96DC2D9B05F2AC4C24CB9B93
Bob: E0B554B341FF5632DE241FBF4B1DBB37
Alice: 6742FA9512C8C2ACE6942974C8C848FC
Bob: 824783C3B272FF7129F9E153EC10D1AE
Alice: C90BD345639368D951A8B5E267427514
Bob: E0B554B341FF5632DE241FBF4B1DBB37
Alice: AC797939F55E87C361A8F02B4CEA1A08
Bob: 824783C3B272FF7129F9E153EC10D1AE
As you can see, some of Bob's ciphertexts repeat: his first and third responses are identical, and so are his second and fourth. Since we know that Alice is asking Bob "yes or no" questions, we can assume that one of the ciphertexts corresponds to "yes" and the other to "no."
Now, suppose we ask Alice what questions she asked Bob. Alice tells us she asked the following questions (in order):
- Are you Bob?
- Are you Eve?
- Are you Smart?
- Are you Tired?
If Bob answered the questions truthfully, then his first response, the ciphertext "E0B55...", is the encryption of "yes", and his second response, the ciphertext "82478...", is the encryption of "no". Now we know what Bob's answers to the last two questions were:
Alice: DCF0C50A96DC2D9B05F2AC4C24CB9B93 <- Are you Bob?
Bob: E0B554B341FF5632DE241FBF4B1DBB37 <- yes
Alice: 6742FA9512C8C2ACE6942974C8C848FC <- Are you Eve?
Bob: 824783C3B272FF7129F9E153EC10D1AE <- no
Alice: C90BD345639368D951A8B5E267427514 <- Are you Smart?
Bob: E0B554B341FF5632DE241FBF4B1DBB37 <- yes
Alice: AC797939F55E87C361A8F02B4CEA1A08 <- Are you Tired?
Bob: 824783C3B272FF7129F9E153EC10D1AE <- no
So Bob is smart, but not tired. We can probably assume he's angry that we cracked his encryption, too.
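The attack above is nothing more than building a codebook. Here is a minimal sketch of it in Python, using Bob's ciphertexts from the conversation. The variable names are mine, and the only "knowledge" the attacker needs is Bob's truthful answers to the first two questions.

```python
# Bob's four ciphertexts, in order, copied from the conversation above.
bob_ciphertexts = [
    "E0B554B341FF5632DE241FBF4B1DBB37",
    "824783C3B272FF7129F9E153EC10D1AE",
    "E0B554B341FF5632DE241FBF4B1DBB37",
    "824783C3B272FF7129F9E153EC10D1AE",
]

questions = ["Are you Bob?", "Are you Eve?", "Are you Smart?", "Are you Tired?"]

# What we can guess without any key: Bob is Bob, and Bob is not Eve.
known_answers = {0: "yes", 1: "no"}

# Build a codebook from ciphertext to plaintext using the known answers.
codebook = {bob_ciphertexts[i]: answer for i, answer in known_answers.items()}

# Because encryption is deterministic, the codebook decodes everything else.
decoded = [codebook.get(ciphertext, "unknown") for ciphertext in bob_ciphertexts]

for question, answer in zip(questions, decoded):
    print(f"{question} -> {answer}")
```

Notice that the secret key never appears anywhere in this code. Repetition alone is enough to give the answers away.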
To fix this problem, we have to make sure that encrypting the same message always results in a different ciphertext. We have to make sure that when Bob answers "yes", it encrypts to something different than the last time he answered "yes".
This is accomplished using either an Initialization Vector (IV) or a nonce.
An Initialization Vector is an unpredictable random number used to "initialize" an encryption function. It has to be random, and an adversary shouldn't be able to predict it before the message is encrypted. See here to learn why it needs to be unpredictable.
The term "nonce" comes from "number used once", which is exactly what it is. It's a unique number. It can be random, but it doesn't have to be, as long as it's unique (never used before with the same key).
IVs and nonces are used by encryption modes like CBC and CTR to make sure that repeated plaintexts encrypt differently. The ciphertext is a function of the plaintext and the IV or nonce. When the same plaintext is encrypted many times, the IV or nonce will be different each time, so the ciphertext will be different each time.
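To see the effect without installing anything, here is a rough sketch using only Python's standard library. It is a toy, not CBC or CTR: I'm assuming a made-up stream cipher that XORs the message with HMAC-SHA256(key, nonce || counter). The names keystream, toy_encrypt, and NONCE_SIZE are hypothetical. Please don't use this to protect real data.

```python
import hashlib
import hmac
import secrets

NONCE_SIZE = 16  # bytes; an arbitrary choice for this sketch


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter), block by block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out.extend(hmac.new(key, block_input, hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with a keystream that depends on key and nonce.

    XOR is its own inverse, so the same function also decrypts.
    """
    stream = keystream(key, nonce, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


key = secrets.token_bytes(32)
message = b"yes".ljust(16, b"\x00")  # padded to 16 bytes for the demo

# Reusing one nonce reproduces the flaw: "yes" always looks the same.
fixed_nonce = bytes(NONCE_SIZE)
same_1 = toy_encrypt(key, fixed_nonce, message)
same_2 = toy_encrypt(key, fixed_nonce, message)

# A fresh nonce per message gives a different ciphertext each time.
nonce_a = secrets.token_bytes(NONCE_SIZE)
nonce_b = secrets.token_bytes(NONCE_SIZE)
fresh_1 = toy_encrypt(key, nonce_a, message)
fresh_2 = toy_encrypt(key, nonce_b, message)

print(same_1 == same_2)    # True
print(fresh_1 == fresh_2)  # False, with overwhelming probability
```

The key and the message are identical in all four calls. The only thing that changes is the nonce, and that is enough to make the ciphertexts unrelated.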
IVs and nonces do not have to be kept secret. They are usually prefixed to the ciphertext and transmitted in full public view.
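In practice, "prefixed to the ciphertext" can be as simple as concatenating bytes. The sketch below shows one possible wire format, nonce followed by ciphertext, again on top of a toy HMAC-based stream cipher that I'm assuming purely for illustration. The function names (seal, open_message, _xor_stream) are hypothetical, and the sketch leaves out authentication, which a real protocol also needs.

```python
import hashlib
import hmac
import secrets

NONCE_SIZE = 16  # bytes; both sides must agree on this length


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher (illustration only): XOR data with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(c ^ b for c, b in zip(chunk, block))
    return bytes(out)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Pick a fresh nonce and send it, in the clear, in front of the ciphertext."""
    nonce = secrets.token_bytes(NONCE_SIZE)
    return nonce + _xor_stream(key, nonce, plaintext)


def open_message(key: bytes, message: bytes) -> bytes:
    """Split off the public nonce, then decrypt the rest with the shared key."""
    nonce, ciphertext = message[:NONCE_SIZE], message[NONCE_SIZE:]
    return _xor_stream(key, nonce, ciphertext)


shared_key = secrets.token_bytes(32)
wire_message = seal(shared_key, b"yes")

print(wire_message.hex())                      # nonce, then ciphertext
print(open_message(shared_key, wire_message))  # b'yes'
```

An eavesdropper sees the nonce, but without the key it doesn't help them. The receiver needs it to decrypt, which is why it travels with the message.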
IVs should be generated by a cryptographically secure random number generator, and not derived from the secret key. CakePHP made the mistake of using the key as the IV, making its encryption very easy to crack.
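As a small sketch of the difference, Python's standard secrets module draws bytes from the operating system's cryptographically secure generator. The helper names fresh_iv and iv_from_key are hypothetical, and iv_from_key is only my illustration of the general anti-pattern, not a reproduction of any particular library's code.

```python
import secrets

IV_SIZE = 16  # bytes; matches the AES block size


def fresh_iv() -> bytes:
    """Draw a new IV from the operating system's CSPRNG."""
    return secrets.token_bytes(IV_SIZE)


def iv_from_key(key: bytes) -> bytes:
    """Anti-pattern: an IV derived from the key never changes."""
    return key[:IV_SIZE]


key = secrets.token_bytes(32)

print(iv_from_key(key) == iv_from_key(key))  # True: same IV for every message
print(fresh_iv() == fresh_iv())              # False, with overwhelming probability
```

An IV that is the same for every message is no better than having no IV at all, so the repeated-ciphertext problem comes right back.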
What is "Crypto Noobs"?
"Crypto Noobs" is a new series of posts where I answer your crypto questions. Each month, I will select one question and answer it in a new "Crypto Noobs" post.
Please send me your questions. All question askers will remain anonymous, so don't be afraid to ask dumb questions.