This article is not a copy-paste of the AES specification. I wanted to write an introduction that helps the reader understand the basics of the AES algorithm. To read this article you should have at least some vague knowledge of what encryption is and you should be comfortable with words like bits, bytes and XOR. At the end of this article you should know enough to read the actual AES specification with ease and implement the encryption algorithm yourself. To keep the article short I focus only on the encryption algorithm.
If you're interested in seeing an implementation close to the specification, check out the one I coded for the cryptopals challenge. For a real-world implementation I recommend Go's crypto/aes package.
Let's start with a quick recap. The Advanced Encryption Standard is a symmetric cipher, which means that you need a secret key to encrypt a plaintext and the same key to decrypt the ciphertext. The key can be 128, 192 or 256 bits. In the rest of this article I assume that we are working with AES-128, which uses a 128-bit key.
AES is also a block cipher, which means that it encrypts a 128-bit input over multiple rounds before producing the final output: a 128-bit ciphertext. Each round gets a different key called a round key. The round keys are derived from the cipher key. Take a look at my previous post for more information about round keys and how they are generated.
We need a 128-bit plaintext because AES is a block cipher. The only thing it can do is encrypt blocks. To do so the block goes through a series of rounds. Each round performs some substitutions and permutations.
AES is a substitution-permutation network: the substitution is provided by SubBytes and the permutation by ShiftRows and MixColumns.
With AES-128 we have 10 rounds.
The AES state is the current value of the bytes being encrypted (more on this later). After the first AddRoundKey the state is equal to the plaintext XOR'd with the first round key.
Each round is made up of 4 functions, applied in this order: SubBytes, ShiftRows, MixColumns and AddRoundKey.
Before the first round, the plaintext is XOR'd with the first round key. The last round does not apply MixColumns since it has no security relevance there (more on that later).
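The round structure can be sketched in a few lines of Python. This is only an illustration of the ordering, not a full cipher: the function name `encrypt_block` and its parameters are my own, and the four step functions are passed in as arguments so that the sketch stays self-contained. Here they are replaced by stubs that simply record which step ran.

```python
def encrypt_block(state, round_keys, sub_bytes, shift_rows, mix_columns, add_round_key):
    """Apply the AES round structure to one block.

    round_keys is expected to hold 11 keys for AES-128 (hypothetical layout:
    one key for the initial XOR plus one per round).
    """
    rounds = len(round_keys) - 1                  # 10 for AES-128
    state = add_round_key(state, round_keys[0])   # initial XOR with the first round key
    for i in range(1, rounds):                    # rounds 1 to 9: all four steps
        state = sub_bytes(state)
        state = shift_rows(state)
        state = mix_columns(state)
        state = add_round_key(state, round_keys[i])
    # last round: same steps without MixColumns
    state = sub_bytes(state)
    state = shift_rows(state)
    state = add_round_key(state, round_keys[rounds])
    return state


# Stub steps that leave the state untouched and only log their name.
trace = []

def make_step(name):
    def step(state, *_round_key):
        trace.append(name)
        return state
    return step

encrypt_block(
    b"\x00" * 16,
    [None] * 11,
    make_step("SubBytes"),
    make_step("ShiftRows"),
    make_step("MixColumns"),
    make_step("AddRoundKey"),
)
print(len(trace))  # 40 steps: 1 + 9 * 4 + 3
```

Swapping the stubs for real implementations of the four steps would turn this skeleton into an actual block encryption.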
The AES state is the array of bytes on which the cryptographic operations are being performed. At the very beginning of the algorithm, the state is equal to the 128-bit plaintext block. At the end of the algorithm, it is equal to the ciphertext.
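The state is often represented as a matrix. Let $p_0, p_1, \dots, p_{15}$ be the 16 bytes of the plaintext. Then we represent the state as a $4 \times 4$ matrix filled column by column:

$$
\begin{pmatrix}
p_0 & p_4 & p_8 & p_{12} \\
p_1 & p_5 & p_9 & p_{13} \\
p_2 & p_6 & p_{10} & p_{14} \\
p_3 & p_7 & p_{11} & p_{15}
\end{pmatrix}
$$

We write $N_b$ for the number of columns comprising the state: $N_b = 4$. The state is always 128 bits, independently of the key length.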
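To make the column-by-column layout concrete, here is a small Python sketch. The helper names `bytes_to_state` and `state_to_bytes` are mine, and representing the state as a list of rows is just one possible choice.

```python
def bytes_to_state(block):
    """Arrange a 16-byte block as a 4x4 matrix, filled column by column."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    # state[r][c] is the byte in row r, column c
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Flatten the 4x4 matrix back into 16 bytes, reading column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


state = bytes_to_state(bytes(range(16)))
for row in state:
    print(row)
# [0, 4, 8, 12]
# [1, 5, 9, 13]
# [2, 6, 10, 14]
# [3, 7, 11, 15]
```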
SubBytes applies the S-box to each byte of the state (see below).
The AES S-box is a lookup table that maps a byte to another byte. For instance b3 is mapped to 6d.
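Let $b$ be a byte. To compute its mapping $b'$, first replace $b$ by its multiplicative inverse in $\mathrm{GF}(2^8)$ (00 is mapped to itself). Then, with $b_i$ the $i^{th}$ bit of that inverse ($b_0$ being the least significant bit), compute for $0 \le i < 8$:

$$
b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i
$$

Where $c_i$ is the $i^{th}$ bit of {01100011}.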
There are other ways to compute the mapping. You can use a matrix form, or you can use a precomputed lookup table. The latter is probably the most convenient. Here is the S-box in all its glory (taken from Wikipedia):
[Table: the AES S-box, a 16 × 16 lookup table indexed by the high and low nibbles of the input byte]
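If you would rather generate the table than copy it, the following Python sketch builds it from the definition: take the multiplicative inverse in GF(2^8), then XOR each bit with four rotated neighbours and the matching bit of the constant {01100011} (0x63). The function names are mine, and the inverse is computed by naive exponentiation, which is slow but easy to check.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:       # reduce as soon as the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a):
    """Multiplicative inverse in GF(2^8); 0 has no inverse and is mapped to 0."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):    # a^254 == a^-1 since a^255 == 1 for every non-zero a
        result = gf_mul(result, a)
    return result


def sbox_entry(byte):
    """Compute one S-box entry: field inverse followed by the bitwise formula."""
    b = gf_inverse(byte)
    out = 0
    for i in range(8):
        bit = (
            (b >> i)
            ^ (b >> ((i + 4) % 8))
            ^ (b >> ((i + 5) % 8))
            ^ (b >> ((i + 6) % 8))
            ^ (b >> ((i + 7) % 8))
            ^ (0x63 >> i)
        ) & 1
        out |= bit << i
    return out


SBOX = [sbox_entry(x) for x in range(256)]
print(f"{SBOX[0xb3]:02x}")  # 6d
```

In practice you would compute the table once (or hard-code it) and then do SubBytes as a plain list lookup.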
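ShiftRows performs a cyclical rotation of the bytes of the state by rows. With $s_{r,c}$ the byte in row $r$ and column $c$, row $r$ is rotated left by $r$ bytes, so the first row is left unchanged:

$$
\begin{pmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,1} & s_{1,2} & s_{1,3} & s_{1,0} \\
s_{2,2} & s_{2,3} & s_{2,0} & s_{2,1} \\
s_{3,3} & s_{3,0} & s_{3,1} & s_{3,2}
\end{pmatrix}
$$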
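As a quick illustration, here is how the rotation could look in Python, assuming the state is stored as a list of four rows (my own representation, with `shift_rows` as a hypothetical name):

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


state = [
    [0, 4, 8, 12],
    [1, 5, 9, 13],
    [2, 6, 10, 14],
    [3, 7, 11, 15],
]
for row in shift_rows(state):
    print(row)
# [0, 4, 8, 12]
# [5, 9, 13, 1]
# [10, 14, 2, 6]
# [15, 3, 7, 11]
```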
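MixColumns applies a transformation on each column of the state. Let $s_{r,c}$ be the bytes of the state as described earlier (row $r$, column $c$) and $s'_{r,c}$ the bytes of the state after the MixColumns transformation. For the column $c$ we have:

$$
\begin{pmatrix} s'_{0,c} \\ s'_{1,c} \\ s'_{2,c} \\ s'_{3,c} \end{pmatrix}
=
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\begin{pmatrix} s_{0,c} \\ s_{1,c} \\ s_{2,c} \\ s_{3,c} \end{pmatrix}
$$

The matrix entries are bytes in hexadecimal, the products are computed in $\mathrm{GF}(2^8)$ and the sums are XORs.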
MixColumns is a Hill cipher.
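Here is a minimal Python sketch of MixColumns, assuming the state is a list of four rows. The names `xtime`, `mix_column` and `mix_columns` are mine. Multiplication by 03 is written as multiplication by 02 followed by an XOR with the original byte, which avoids a general field multiplication.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8) (modulo x^8 + x^4 + x^3 + x + 1)."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a


def mix_column(col):
    """Multiply one 4-byte column by the fixed MixColumns matrix."""
    s0, s1, s2, s3 = col
    return [
        xtime(s0) ^ (xtime(s1) ^ s1) ^ s2 ^ s3,   # 02*s0 + 03*s1 + s2 + s3
        s0 ^ xtime(s1) ^ (xtime(s2) ^ s2) ^ s3,   # s0 + 02*s1 + 03*s2 + s3
        s0 ^ s1 ^ xtime(s2) ^ (xtime(s3) ^ s3),   # s0 + s1 + 02*s2 + 03*s3
        (xtime(s0) ^ s0) ^ s1 ^ s2 ^ xtime(s3),   # 03*s0 + s1 + s2 + 02*s3
    ]


def mix_columns(state):
    """Apply mix_column to each of the four columns of the state."""
    cols = [mix_column([state[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]


print([f"{b:02x}" for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
# ['8e', '4d', 'a1', 'bc']
```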
AddRoundKey computes a bitwise XOR between the state and the round key.
Remember: the size of a Round Key is always the size of the state.
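Since operations are done on 4-byte (i.e. 32-bit) words, let's split the round key $K_i$ into 4-byte words ($i$ is the round number):

$$
K_i = (w_{4i}, w_{4i+1}, w_{4i+2}, w_{4i+3})
$$

We have seen that $N_b$ is the number of columns comprising the state, so $N_b = 4$. Then, the words of $K_i$ are the $w_{i N_b + c}$ for $0 \le c < N_b$. And as a reminder the state can be represented as four columns, each column defined by $s_c = (s_{0,c}, s_{1,c}, s_{2,c}, s_{3,c})$.
Here $w_{i N_b + c}$ and $s_c$ are 4-byte words and $s_{r,c}$ is one byte.
The result of AddRoundKey is the bitwise XOR between the state and the round key, where the 4-byte word $s'_c$ is defined by:

$$
s'_c = s_c \oplus w_{i N_b + c}, \quad 0 \le c < N_b
$$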
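In code, AddRoundKey boils down to a byte-by-byte XOR. The Python sketch below assumes the state is a list of four rows and the round key is given as 16 bytes, so that column `c` of the state is XOR'd with bytes `4c` to `4c + 3` of the key. The function name and this layout are my own choices.

```python
def add_round_key(state, round_key):
    """XOR each column of the 4x4 state with the matching 4-byte word of the round key."""
    return [[state[r][c] ^ round_key[4 * c + r] for c in range(4)] for r in range(4)]


plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")

# state[r][c] holds byte r + 4c of the block (column-by-column layout)
state = [[plaintext[r + 4 * c] for c in range(4)] for r in range(4)]
state = add_round_key(state, key)

print(bytes(state[r][c] for c in range(4) for r in range(4)).hex())
# 00102030405060708090a0b0c0d0e0f0
```

Because XOR is its own inverse, applying `add_round_key` a second time with the same key gives back the original state.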
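Claude Shannon identified two properties of a secure cipher: confusion, which makes the relationship between the key and the ciphertext as complex as possible, and diffusion, which spreads the influence of each plaintext bit over many bits of the ciphertext.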
The steps of AES are designed to add confusion and diffusion:
- AddRoundKey adds some dependency on the key, and as such, some confusion.
- With ShiftRows, a modification of one bit in one column of the state affects other columns of the state, and with MixColumns, changing one byte of the state affects other bytes of the state. These two steps add diffusion.
- SubBytes adds nonlinearity and confusion. Without it, you could encrypt a bunch of plaintexts, get the corresponding ciphertexts and use Gaussian elimination to retrieve the key.

The designers of AES also defined a new term: dispersion. Here is their definition from their book The Design of Rijndael:
By dispersion we mean the operation by which bits or bytes that are close to each other in the context of θ are moved to positions that are distant.
Basically, dispersion means separating bits that are close together. This step is provided by ShiftRows.
This concludes this article. You should now have a fair understanding of how the AES encryption algorithm works, whether you need to code it or to break it. If you're interested in the decryption algorithm, go over to my next post.