Cache-Collision Timing Attacks Against AES. Joseph Bonneau, Stanford University; Ilya Mironov, Microsoft Research

Jan 13, 2016

Page 1: Cache-Collision Timing Attacks Against AES

Cache-Collision Timing Attacks Against AES

Joseph Bonneau, Stanford University

Ilya Mironov, Microsoft Research

Page 2: Cache-Collision Timing Attacks Against AES

Side Channel Cryptanalysis

Definition: Any attack on a cryptosystem using information leaked as a byproduct of the physical implementation of the cryptosystem, rather than a theoretical weakness.

Exploitable side-channels:

–Power usage

–Cache accesses

–Noise

–Heat

–Time

Page 3: Cache-Collision Timing Attacks Against AES

Brief History of Timing Attacks

Timing attacks exploit variations in the time taken to perform an encryption that depend on secret data.

•Paul Kocher demonstrated timing attacks against Diffie-Hellman, RSA, DSS, etc. at CRYPTO ’96

•Dan Boneh and David Brumley demonstrate the first remote timing attack against RSA in 2003

•Public-key systems are vulnerable due to their use of lengthy mathematical operations

Page 4: Cache-Collision Timing Attacks Against AES

Brief History of Timing Attacks

•During AES competition, timing attacks were only believed to be possible against branch statements or data-dependent rotations.

•Rijndael has a mathematical formulation in the field GF(2^8)

•Optimized software implementations of Rijndael use only table lookup, shift, and exclusive-or operations

•NIST declared Rijndael “not vulnerable to timing attacks” in its final evaluation in 2000; Rijndael wins the competition.

Page 5: Cache-Collision Timing Attacks Against AES

Brief History of Timing Attacks

•Daniel Bernstein announces successful timing attacks against AES in April 2005, exploiting timing characteristics of table lookups

•Osvik, Shamir, and Tromer follow up in November 2005 with very powerful attacks, requiring direct observation of the cache before and after encryption

Page 6: Cache-Collision Timing Attacks Against AES

Implementation details of AES, part I

The textbook description of an AES round as a function from (X_i, K_i) to X_{i+1}:
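For reference, a sketch of the standard textbook round written in the (X_i, K_i) notation used here. The operation names follow the AES specification; the exact indexing of the round key is an assumption.

```latex
X_{i+1} = \mathrm{MixColumns}\bigl(\mathrm{ShiftRows}(\mathrm{SubBytes}(X_i))\bigr) \oplus K_i
```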

Page 7: Cache-Collision Timing Attacks Against AES

Implementation details of AES, part I

The actual round computation in software, as proposed with Rijndael and now widely used:

All three operations are combined into pre-computed tables. A round of encryption requires just 16 table lookups, 16 xor’s, and 12 shifts.
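A minimal sketch of how the combined tables can reproduce the textbook round. The function names (`textbook_round`, `table_round`, `build_tables`) are hypothetical, and each table entry is kept as a tuple of four bytes rather than a packed 32-bit word, so the shifts used to extract index bytes in a real implementation do not appear here.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):              # successive powers of the generator 3
        exp[i], log[x] = x, i
        x ^= xtime(x)                 # x * 3 = x * 2 + x
    sbox = []
    for a in range(256):
        b = exp[(255 - log[a]) % 255] if a else 0
        s = b
        for r in range(1, 5):         # b ^ rotl(b, 1) ^ ... ^ rotl(b, 4)
            s ^= ((b << r) | (b >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def mix_column(col):
    """Multiply one column by the MixColumns matrix."""
    a0, a1, a2, a3 = col
    m2 = xtime
    m3 = lambda v: xtime(v) ^ v
    return [m2(a0) ^ m3(a1) ^ a2 ^ a3,
            a0 ^ m2(a1) ^ m3(a2) ^ a3,
            a0 ^ a1 ^ m2(a2) ^ m3(a3),
            m3(a0) ^ a1 ^ a2 ^ m2(a3)]

def textbook_round(state, key):
    """state, key: 4 columns of 4 bytes. SubBytes, ShiftRows, MixColumns, AddRoundKey."""
    sub = [[SBOX[b] for b in col] for col in state]
    shifted = [[sub[(c + r) % 4][r] for r in range(4)] for c in range(4)]
    return [[m ^ k for m, k in zip(mix_column(shifted[c]), key[c])]
            for c in range(4)]

def build_tables():
    """T[r][a] is the MixColumns contribution of S[a] sitting in row r."""
    t = [[], [], [], []]
    for a in range(256):
        s = SBOX[a]
        s2, s3 = xtime(s), xtime(s) ^ s
        t[0].append((s2, s, s, s3))
        t[1].append((s3, s2, s, s))
        t[2].append((s, s3, s2, s))
        t[3].append((s, s, s3, s2))
    return t

T = build_tables()

def table_round(state, key):
    """Same round using only 16 table lookups and XORs."""
    out = []
    for c in range(4):
        col = list(key[c])
        for r in range(4):
            entry = T[r][state[(c + r) % 4][r]]   # index choice encodes ShiftRows
            col = [x ^ y for x, y in zip(col, entry)]
        out.append(col)
    return out
```

The point relevant to the attacks that follow is that every lookup index in `table_round` is a raw state byte, so which table entries are touched depends directly on secret data.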

Page 8: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

Notice that for the first round, the table lookup indices are each related to only one key byte and one plaintext byte:
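In notation assumed here (p_i a plaintext byte, k_i the matching byte of the raw key, x_i^0 the index of the corresponding first-round lookup), the relation follows from the initial key addition:

```latex
x_i^{0} = p_i \oplus k_i, \qquad i = 0, \dots, 15
```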

Remarkably, the entire encryption time will be affected by just the value of

Page 9: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

To prepare for the attack, collect a large body of reference timing data for each

Page 10: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

Next, collect a large body of timing data from a target machine for the plaintext byte

Page 11: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

The target machine’s timing data should be shifted from the reference data by exactly
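The matching step can be sketched as follows, assuming a toy model in which the average time depends only on the first-round index p XOR k. This is a simplified correlation, not Bernstein's actual statistical procedure, and the names `recover_key_byte` and `synthetic_profiles` are hypothetical.

```python
import random

def recover_key_byte(ref_times, ref_key, target_times):
    """Find the key byte whose XOR shift best aligns the two timing profiles.

    ref_times[p], target_times[p]: average encryption time observed when the
    chosen plaintext byte equals p, on the reference and target machines.
    """
    ref_mean = sum(ref_times) / 256
    tgt_mean = sum(target_times) / 256
    best_key, best_score = None, float("-inf")
    for k in range(256):
        shift = k ^ ref_key
        score = sum((target_times[p] - tgt_mean) * (ref_times[p ^ shift] - ref_mean)
                    for p in range(256))
        if score > best_score:
            best_key, best_score = k, score
    return best_key

def synthetic_profiles(ref_key, target_key, noise=0.3, seed=0):
    """Toy data: time is a fixed random function of the table index, plus noise."""
    rng = random.Random(seed)
    by_index = [rng.gauss(0.0, 1.0) for _ in range(256)]
    ref = [by_index[p ^ ref_key] for p in range(256)]
    tgt = [by_index[p ^ target_key] + rng.gauss(0.0, noise) for p in range(256)]
    return ref, tgt
```

Calling `recover_key_byte(ref, 0x3C, tgt)` on profiles built with `synthetic_profiles(0x3C, 0xA7)` searches all 256 candidates for the target key byte and keeps the one with the highest correlation.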

Page 13: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

Problems:

•The reference machine must be identical to the target

•Requires known plaintext as well as timing data

•Plaintexts must be sufficiently random

•High number of samples required; the best case reported by Bernstein is around 2^27.5

Page 14: Cache-Collision Timing Attacks Against AES

Bernstein’s timing attack

•Overall, a very general statistical method for constructing a timing attack.

•Getting code to run in constant time on a machine with cache is very difficult, meaning most cryptosystems are theoretically vulnerable.

•Bernstein’s attack doesn’t exploit any specific features of Rijndael, yet the attack does not seem to work against other AES finalists (Serpent, Twofish)

Page 16: Cache-Collision Timing Attacks Against AES

Cache-collision timing attacks

What is Rijndael’s weakness?

•Heavy use of table lookups which dominate the running time

•Table lookup indices are easily related to single plaintext and key bytes

Page 21: Cache-Collision Timing Attacks Against AES

Cache collisions

Rijndael is just a sequence of table lookups.

•What happens when x_i = x_j?

The access to T[x_j] will hit in cache.

•What happens when x_i ≠ x_j?

The access to T[x_j] may or may not hit in cache, depending on the rest of the sequence and the prior cache contents.

… T[x] T[x] T[x_i] T[x] T[x] T[x_j] T[x] …

Page 22: Cache-Collision Timing Attacks Against AES

Cache collisions

A cache-collision occurs when we know that x_i = x_j.

For a large number of samples, the average encryption time will be lower when x_i = x_j than when x_i ≠ x_j.

This is all we need to build an attack.

… T[x] T[x] T[x_i] T[x] T[x] T[x_j] T[x] …
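The effect can be illustrated with a toy cache model. It assumes the table starts out of cache, every distinct cache line touched costs one miss, and nothing else affects timing. The line size, miss cost, and function names are all assumptions made for illustration.

```python
import random

LINE_ENTRIES = 16   # assumed: 64-byte cache lines holding 4-byte table entries
MISS_COST = 10      # assumed extra cycles per cache miss

def encryption_time(indices):
    """Toy timing: one miss for each distinct cache line the lookups touch."""
    lines = {x // LINE_ENTRIES for x in indices}
    return MISS_COST * len(lines)

def average_times(i, j, samples=40000, lookups=16, seed=0):
    """Average time when lookups i and j share a cache line, and when they do not."""
    rng = random.Random(seed)
    collide, differ = [], []
    for _ in range(samples):
        xs = [rng.randrange(256) for _ in range(lookups)]
        same_line = xs[i] // LINE_ENTRIES == xs[j] // LINE_ENTRIES
        (collide if same_line else differ).append(encryption_time(xs))
    return sum(collide) / len(collide), sum(differ) / len(differ)
```

In this model `average_times(2, 5)` returns a lower first value than second: a collision removes one potential miss, while every other lookup behaves the same on average.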

Page 23: Cache-Collision Timing Attacks Against AES

Cache collisions

[Figure: timing deviation (cycles) versus # of cache collisions (0 to 20). Actual results, Pentium III.]

Page 25: Cache-Collision Timing Attacks Against AES

First Round Attack

Pick two lookups in the first round of encryption:

Solve for the collision constraint:
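A sketch of the algebra, in notation assumed here (p_i, p_j plaintext bytes; k_i, k_j key bytes; x_i^0, x_j^0 the two first-round lookup indices into the same table):

```latex
x_i^{0} = x_j^{0}
\;\Longleftrightarrow\; p_i \oplus k_i = p_j \oplus k_j
\;\Longleftrightarrow\; p_i \oplus p_j = k_i \oplus k_j
```

The left side of the last equality is known to the attacker, so the plaintext difference that yields the lowest average time reveals the key difference.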

Page 27: Cache-Collision Timing Attacks Against AES

First Round Attack

Result: A working attack!

There is an easily identifiable low average encryption time whenever

However, there are some complications…

Page 28: Cache-Collision Timing Attacks Against AES

Complication #1: Table families

Notice four separate tables are used:

Each “family” of four bytes is isolated.

Page 29: Cache-Collision Timing Attacks Against AES

Complication #2: Cache Lines

Modern memory is cached in lines.

[Figure labels: Table Lookup, Cache]

Page 32: Cache-Collision Timing Attacks Against AES

Complication #2: Cache Lines

So, we can only tell if two lookups hit the same line in memory, not if they are identical. We denote:

Most CPUs use 32- or 64-byte cache lines.

With 4-byte table entries, this means we are forced to ignore the 3 or 4 low-order bits.
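The arithmetic behind the 3 or 4 lost bits, as a small sketch. It assumes the table is aligned to a cache-line boundary and that line and entry sizes are powers of two; the function names are hypothetical.

```python
def unobservable_low_bits(line_bytes, entry_bytes=4):
    """Number of low-order index bits that select an entry within one cache line."""
    entries_per_line = line_bytes // entry_bytes
    return entries_per_line.bit_length() - 1   # log2 for powers of two

def line_index(x, line_bytes, entry_bytes=4):
    """The part of table index x that a cache observation can distinguish."""
    return x >> unobservable_low_bits(line_bytes, entry_bytes)
```

With 32-byte lines there are 8 entries per line, so 3 bits are hidden; with 64-byte lines there are 16 entries per line, so 4 bits are hidden. Two indices such as 0x50 and 0x5A then map to the same line on a 64-byte-line machine and cannot be told apart.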

Page 33: Cache-Collision Timing Attacks Against AES

First Round Attack: The bad news

We gain a set of equations in each family, such as:

This leaves 68 or 80 bits of key to search.

This limitation was also problematic for Osvik et al. Their solution: examine the second round as well. This can fix some of the problems but is difficult for timing attacks (see paper).
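One way to arrive at the 68 and 80 figures, assuming each family of four key bytes yields three independent XOR relations and each relation reveals only the 8 - b high-order bits, where b is the number of low-order bits hidden by the cache line:

```latex
\begin{aligned}
\text{bits learned} &= 4 \text{ families} \times 3 \text{ relations} \times (8 - b) \\
b = 3 \text{ (32-byte lines)}: &\quad 128 - 4 \cdot 3 \cdot 5 = 68 \\
b = 4 \text{ (64-byte lines)}: &\quad 128 - 4 \cdot 3 \cdot 4 = 80
\end{aligned}
```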

Page 34: Cache-Collision Timing Attacks Against AES

First Round Attack: The good news

•Cache-collisions are a strong method.

•The timing variability is much better than the random effects previously used.

•The attack requires ~2^15 samples, compared to 2^27.5.

Can we recover the full key with this efficiency?

Page 35: Cache-Collision Timing Attacks Against AES

Implementation details of AES, part II

The final round of encryption is special

[Figure: rounds 1, 2, …, 8, 9, 10, with round 10 marked “special!”]

Page 36: Cache-Collision Timing Attacks Against AES

Implementation details of AES, part II

The final round of encryption is special:

•No MixColumns operation is performed, as it would add no additional security

•In software, this requires a new table to be used only for the final round. This table is just the S-box

Page 37: Cache-Collision Timing Attacks Against AES

Implementation details of AES, part II

The final round also uses expanded key bytes

However, the AES key schedule is invertible. Finding the final 16 bytes is equivalent to finding the raw key. This design was intentional.
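The invertibility claim can be checked with a short sketch for AES-128. The function names are hypothetical, and the S-box is derived from its algebraic definition to keep the example self-contained.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):              # successive powers of the generator 3
        exp[i], log[x] = x, i
        x ^= xtime(x)
    sbox = []
    for a in range(256):
        b = exp[(255 - log[a]) % 255] if a else 0
        s = b
        for r in range(1, 5):
            s ^= ((b << r) | (b >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def round_constants():
    """Rcon values 1, 2, 4, ... for rounds 1 to 10."""
    rc, out = 1, []
    for _ in range(10):
        out.append(rc)
        rc = xtime(rc)
    return out

RCON = round_constants()

def g(word, rnd):
    """RotWord, SubWord, then XOR the round constant into the first byte."""
    rotated = word[1:] + word[:1]
    sub = [SBOX[b] for b in rotated]
    sub[0] ^= RCON[rnd - 1]
    return sub

def xor_words(a, b):
    return [x ^ y for x, y in zip(a, b)]

def expand_key(key):
    """AES-128 key schedule: 16-byte key to 44 four-byte words."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = g(w[i - 1], i // 4) if i % 4 == 0 else w[i - 1]
        w.append(xor_words(w[i - 4], temp))
    return w

def invert_from_last_round_key(last):
    """Walk the schedule backwards from the round-10 key to the raw key."""
    w = [None] * 40 + [list(last[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(43, 3, -1):
        # w[i] = w[i-4] ^ temp(w[i-1]), so w[i-4] = w[i] ^ temp(w[i-1])
        temp = g(w[i - 1], i // 4) if i % 4 == 0 else w[i - 1]
        w[i - 4] = xor_words(w[i], temp)
    return bytes(b for word in w[:4] for b in word)
```

Each schedule word is the XOR of two earlier words, one of them possibly transformed, so any four consecutive words determine all the others in both directions.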

Page 38: Cache-Collision Timing Attacks Against AES

Final Round Attack

Again, we consider a cache-collision for two bytes

When do these bytes collide in the table?
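A sketch of the relation this question leads to, in notation assumed here: c_i is a ciphertext byte, k_i^{10} the matching byte of the last round key, S the S-box, and x_i^{10} the index of the final-round lookup that produces c_i (the byte reordering from ShiftRows is absorbed into the labelling).

```latex
\begin{aligned}
c_i &= S[x_i^{10}] \oplus k_i^{10}, \qquad c_j = S[x_j^{10}] \oplus k_j^{10} \\
x_i^{10} = x_j^{10} &\;\Longrightarrow\; S[x_i^{10}] = S[x_j^{10}]
\;\Longrightarrow\; c_i \oplus c_j = k_i^{10} \oplus k_j^{10}
\end{aligned}
```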

Page 42: Cache-Collision Timing Attacks Against AES

Final Round Attack

We want to solve for

We assume that , leaving

Page 45: Cache-Collision Timing Attacks Against AES

Final Round Attack

So, guarantees a collision

What happens if ?

We get a fixed offset

Surprise: the non-linearity of the S-box enables the attack to succeed.

Page 46: Cache-Collision Timing Attacks Against AES

Final Round Attack

Why does this happen?

Because α and β are the results of S-box lookups, a fixed offset between them says nothing about the indices used to look them up. A small offset such as γ = 1 does not imply a collision on the same cache line.

Thus, the cache-line issue is gone.
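A toy simulation of this effect, modelling the final round in isolation. It assumes 16 uniformly random lookups into the S-box table, one miss per distinct cache line, 16 entries per line, no measurement noise, and no ShiftRows reordering. All names and parameters are hypothetical.

```python
import random

def xtime(a):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    exp, log = [0] * 255, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= xtime(x)
    sbox = []
    for a in range(256):
        b = exp[(255 - log[a]) % 255] if a else 0
        s = b
        for r in range(1, 5):
            s ^= ((b << r) | (b >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
LINE_ENTRIES = 16   # assumed: 64-byte lines, 4-byte table entries

def simulate(last_key, samples, rng):
    """Yield (ciphertext bytes, toy time) pairs, one per simulated encryption."""
    for _ in range(samples):
        xs = [rng.getrandbits(8) for _ in range(16)]          # final-round indices
        cipher = [SBOX[x] ^ k for x, k in zip(xs, last_key)]
        time = len({x // LINE_ENTRIES for x in xs})           # misses on distinct lines
        yield cipher, time

def recover_difference(observations, i, j):
    """Return the value of c_i ^ c_j with the lowest average time."""
    total, count = [0] * 256, [0] * 256
    for cipher, time in observations:
        d = cipher[i] ^ cipher[j]
        total[d] += time
        count[d] += 1
    return min((d for d in range(256) if count[d]),
               key=lambda d: total[d] / count[d])
```

The attacker side (`recover_difference`) sees only ciphertext and timing, yet the minimizing value matches all 8 bits of k_i XOR k_j even though the cache model hides the low 4 bits of every index. With about 2^18 simulated samples the gap is large enough to stand out in this model.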

Page 47: Cache-Collision Timing Attacks Against AES

Final Round Attack

•Collect timing data, compute average time for each value of for all i, j. Low times will occur at the values

•Attack data produces a likelihood estimate for different values for each k_i, k_j.

•Need to find k_0, …, k_15 minimizing the global cost function: Σ_{i,j} C_{ij}(k_i, k_j)

•Use standard AI algorithms (Local Optimization, Belief Propagation).

Page 48: Cache-Collision Timing Attacks Against AES

Final Round Attack: Results

•Huge improvement over the original 2^27.5.

•“Offline” complexity is low; the attack takes seconds. This can be increased to further lower the number of samples required.

CPU              L2 cache eviction   L1 cache eviction
Pentium III      2^15                2^16
Pentium IV       2^16                2^19.9
UltraSPARC-III   2^15                2^18.7

Page 49: Cache-Collision Timing Attacks Against AES

Expanded Final Round Attack

•Produce cost estimate for specific values of key bytes, instead of simply their difference

•Requires more attacker time and memory, but the attack still finishes in ~10 minutes

CPU              L2 cache eviction   L1 cache eviction
Pentium III      2^13                2^14
Pentium IV       2^13.6              2^18.6
UltraSPARC-III   2^14.3              2^17.3

Page 50: Cache-Collision Timing Attacks Against AES

Final Round Attack: Results

•Bonuses from attacking the final round:

–Attack requires only ciphertext and timing.

–Related plaintexts produce essentially random cipher state by the 9th round.

•Attack is oblivious to the target platform

•Attack works well against decryption

Page 51: Cache-Collision Timing Attacks Against AES

Final Round Attack: Results

The attack should be widely applicable:

•Most CPUs use a similar cache structure

•Most standard crypto libraries use the original Rijndael implementation of AES. Attacks are implemented against OpenSSL.

Page 52: Cache-Collision Timing Attacks Against AES

Final Round Attack: Complications

•The attacks assume the AES tables are out of cache before encryption. This means a target machine must be made to do some unrelated work in between encryptions.

•Recent CPUs (e.g., Pentium IV) are more complicated than the model: hardware prefetch, out-of-order execution, etc.

•Larger cache line sizes are also a problem.

Page 53: Cache-Collision Timing Attacks Against AES

Countermeasures

•Solutions requiring special hardware support are probably not practical

•Cannot guarantee the encryption will take constant time without crippling performance.

•It is possible to greatly increase resistance of the common AES implementation to final round attacks with no performance penalty by eliminating the special lookup table

Page 54: Cache-Collision Timing Attacks Against AES

Conclusions

•AES is vulnerable to timing attacks due to its use of table lookups. Better attacks are still possible.

•Real-world use of timing attacks is questionable, as they require cycle-count level data, but these attacks tolerate much more noise than before

•Applications?

–Process-to-Process attacks

–Virtual Machines

–Against a “secure” CPU on a multiprocessor machine

–Against a remote server: the holy grail

Page 55: Cache-Collision Timing Attacks Against AES

Conclusions

•Table lookups into cached memory are dangerous for cryptographic software.

•Information leaked through many side channels:

–Time

–Cache contents

–Power usage

•AES selection largely ignored this problem. Runner-up cipher Serpent avoids lookup tables, but this was not seen as an advantage.

Page 56: Cache-Collision Timing Attacks Against AES

Thank you

Questions?

Joseph Bonneau

Current version of paper available at: www.stanford.edu/~jbonneau/AES_timing.pdf