This paper is included in the Proceedings of the 28th USENIX
Security Symposium.
August 14–16, 2019 • Santa Clara, CA, USA
ISBN 978-1-939133-06-9
Open access to the Proceedings of the 28th USENIX Security
Symposium
is sponsored by USENIX.
IODINE: Verifying Constant-Time Execution of Hardware
Klaus v. Gleissenthall, Rami Gökhan Kıcı, Deian Stefan, and
Ranjit Jhala, University of California, San Diego
https://www.usenix.org/conference/usenixsecurity19/presentation/von-gleissenthall
IODINE: Verifying Constant-Time Execution of Hardware
Klaus v. Gleissenthall, University of California, San Diego
Rami Gökhan Kıcı, University of California, San Diego
Deian Stefan, University of California, San Diego
Ranjit Jhala, University of California, San Diego
Abstract. To be secure, cryptographic algorithms crucially rely on the underlying hardware to avoid inadvertent leakage of secrets through timing side channels. Unfortunately, such timing channels are ubiquitous in modern hardware, due to its labyrinthine fast-paths and optimizations. A promising way to avoid timing vulnerabilities is to devise—and verify—conditions under which a hardware design is free of timing variability, i.e., executes in constant-time. In this paper, we present IODINE: a clock-precise, constant-time approach to eliminating timing side channels in hardware. IODINE succeeds in verifying various open source hardware designs in seconds and with little developer effort. IODINE also discovered two constant-time violations: one in a floating-point unit and another in an RSA encryption module.
1 Introduction

Trust in software systems is always rooted in the underlying hardware. This trust is apparent when using hardware security features like enclaves (e.g., SGX and TrustZone), crypto units (e.g., AES-NI and the TPM), or MMUs. But our trust goes deeper. Even for simple ADD or MUL instructions, we expect the processor to avoid leaking any of the operands via timing side channels, e.g., by varying the execution time of the operation according to the data. Indeed, even algorithms specifically designed to be resilient to such timing side-channel attacks crucially rely on these assumptions [23–25]. Alas, recently discovered vulnerabilities have shown that the labyrinthine fast-paths and optimizations ubiquitous in modern hardware expose a plethora of side channels that undermine many of our deeply held beliefs [34, 36, 42].
A promising way to ensure that trust in hardware is properly earned is to formally specify our expectations, and then, to verify—through mathematical proof—that the units used in security critical contexts do not exhibit any timing variability, i.e., are constant-time. For instance, by verifying that certain parts of an arithmetic logic unit (ALU) are constant-time, we can provide a foundation for implementing secure crypto algorithms in software [16, 20, 22]. Dually, if timing variability is unavoidable, e.g., in SIMD or floating-point units, making this variability explicit can better inform mechanisms that attempt to mitigate timing channels at the software level [18, 46, 54] in order to avoid vulnerabilities due to gaps in the hardware-software contract [17, 18].
In this paper, we introduce IODINE: a clock-precise, constant-time approach to eliminating timing side channels in hardware. Given a hardware circuit described in Verilog, a specification comprising a set of sources and sinks (e.g., an FPU pipeline start and end) and a set of usage assumptions (e.g., no division is performed), IODINE allows developers to automatically synthesize proofs which ensure that the hardware runs in constant-time, i.e., under the given usage assumptions, the time taken to flow from source to sink is independent of operands, processor flags, and interference by concurrent computations.
Using IODINE, a crypto hardware designer can be certain that their encryption core does not leak secret keys or messages by taking a different number of cycles depending on the secret values. Similarly, a CPU designer can guarantee that programs (e.g., cryptographic algorithms, SVG filters) will run in constant-time when properly structured (e.g., when they do not branch or access memory depending on secrets [20]).
IODINE is clock-precise in that it enforces constant-time execution directly as a semantic property of the circuit rather than through indirect means like information flow control [55]. As a result, IODINE neither requires the constant-time property to hold unconditionally nor
USENIX Association 28th USENIX Security Symposium 1411
demands the circuit be partitioned between different security levels (e.g., as in SecVerilog [55]). This makes IODINE particularly suited for verifying existing hardware designs. For example, we envision IODINE to be useful in verifying ARM's recent set of data independent timing (DIT) instructions, which should execute in constant-time if the PSTATE.DIT processor state flag is set [2, 41].
While there have been significant strides in verifying the constant-time execution of software [14–16, 18, 20–22, 53], IODINE unfortunately cannot directly reuse these efforts. Constant-time methods for software focus on straight-line, sequential—often cryptographic—code.

Hardware designs, however, are inherently concurrent and long-lived: circuits can be viewed as collections of processes that run forever, performing parallel computations that update registers and memory in every clock cycle. As a result, in hardware, even the definition of constant-time execution becomes problematic: how can we measure the timing of a hardware design that never stops and performs multiple concurrent computations that mutually influence each other?
In IODINE, we address these challenges through the following contributions.
1. Definition. First, we define a notion of constant-time execution for concurrent, long-lived computations. In order to reason about the timing of values flowing between sources and sinks, we introduce the notion of influence set. The influence set of a value contains all cycles t such that an input (i.e., a source value) at t was used in its computation. We say that a hardware design is constant-time if all its computation paths (that satisfy usage assumptions) produce the same sequence of influence sets for sinks.
2. Verification. To enable its efficient verification, we show how to reduce the problem of checking constant-time execution—as defined through influence sets—to the standard problem of checking assertion validity. For this, we first eschew the complexity of reasoning about several concurrent computations at once, by focusing on a single computation starting (i.e., inputs issued) at some cycle t. We say that a value is live for cycle t (t-live) if it was influenced by the computation started at t, i.e., t is in the value's influence set. This allows us to reduce the problem of checking equality of influence sets to checking the equivalence of membership for their elements. We say that a hardware design is liveness equivalent if, for any two executions (that satisfy usage assumptions) and any t, t-live values are assigned to sinks in the same way, i.e., whenever a t-live value is assigned to a sink in one execution, a t-live value must also be assigned to a sink in the other.
To check a hardware design for liveness equivalence, we mark source data as live in some arbitrarily chosen start cycle t, and track the flow of t-live values through the circuit using a simple standard taint tracking monitor [44]; the problem of checking liveness equivalence then reduces to checking a simple assertion stating that sinks are always tainted in the same way. Reducing constant-time execution to the standard problem of checking assertion validity allows us to rely on off-the-shelf, mature verification technology, which explains IODINE's effectiveness.
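Concretely, once liveness has been tracked, the final check is just a comparison of two Boolean sequences. A sketch of our own (with hypothetical per-cycle sink liveness bits standing in for the monitor's output):

```python
# Sketch: liveness equivalence for one pair of runs reduces to an
# assertion that the two runs mark the sink t-live at the same cycles.

def liveness_equivalent(sink_live_left, sink_live_right):
    """True iff both runs taint the sink identically, cycle by cycle."""
    return sink_live_left == sink_live_right

# hypothetical monitor outputs: fast path is live once; the slow path
# makes the sink live again k cycles later
fast = [False, True, False, False, False]
slow = [False, True, False, False, True]
print(liveness_equivalent(fast, fast))  # True
print(liveness_equivalent(fast, slow))  # False -> constant-time violation
```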
3. Evaluation. Our final contribution is an implementation and evaluation of IODINE on seven open source VERILOG projects—CPU cores, an ALU, crypto-cores, and floating-point units (FPUs). We find that IODINE succeeds in verifying different kinds of hardware designs in a matter of seconds, with modest developer effort (§ 6). Many of our benchmarks are constant-time for intricate reasons (§ 6.3): e.g., whether or not a circuit is constant-time depends on its execution history, circuits are constant-time despite triggering different control flow paths depending on secrets, and circuits require a carefully chosen set of assumptions to be shown constant-time. In our experience, these characteristics—combined with the circuit size—make determining whether a hardware design is constant-time by code inspection near impossible.
IODINE also revealed two constant-time violations: one in the division unit of an FPU design, another in the modular exponentiation module of an RSA encryption module. The second violation—a classical timing side channel—can be abused to leak secret keys [27, 35].
In summary, this paper makes the following contributions.

▶ First, we give a definition for constant-time execution of hardware, based on the notion of influence sets (§ 2). We formalize the semantics of VERILOG programs with influence sets (§ 3), and use this formalization to define constant-time execution with respect to usage assumptions (§ 4).

▶ Our second contribution is a reduction of constant-time execution to the easy-to-verify problem of liveness equivalence. We formalize this property (§ 4), prove its equivalence to our original notion of constant-time execution (§ 4.3), and show how to verify it using standard methods (§ 5).
// source(x); source(y); sink(out);
// assume(ct = 1);

reg flp_res, x, y, ct, out, out_ready, ...;
wire iszero, isNaN, ...;

assign iszero = (x == 0) || (y == 0);

always @(posedge clk) begin
  ...
  flp_res
P ::= Program
    | [s]id             process
    | P ‖ P             parallel composition
    | repeat P          sync. iteration
    | ε                 empty process

s ::= Command
    | skip              no-op
    | v = e             blocking
    | v ⇐ e             non-blocking
    | v := e            continuous
    | ite(e, s, s)      conditional
    | s1 ; . . . ; sk   sequence
    | a                 annotation

e ::= Expression
    | v                 variables
    | n                 constants
    | f(e1, . . . , ek) function literal

Figure 2: Syntax for intermediate language VINTER.
repeat [iszero := (x == 0 || y == 0)]
‖ repeat [. . . ; flp_res ⇐ . . .]
‖ repeat
    ite(ct,
        out ⇐ flp_res,
        ite(iszero,
            out ⇐ 0,
            out ⇐ flp_res))

Figure 3: EX1 written in VINTER.
fixed-rate clock, allows us to translate VERILOG programs, such as our FPU multiplier, to the more concise representation shown in Figure 3.

Intermediate Language. In this language—called VINTER—VERILOG always- and assign-blocks are represented as concurrent processes, wrapped inside an infinite repeat-loop. As Fig. 2 shows, each process sequentially executes a series of VERILOG-like statements. (Each process also has a unique identifier id ∈ PIDs, which we sometimes omit, for brevity.) Most of these are standard; we only note that VINTER—like VERILOG—supports three types of assignment statements: blocking (v = e), non-blocking (v ⇐ e) and continuous (v := e). Blocking assignments take effect immediately, within the current cycle; non-blocking assignments are deferred until the next cycle. Finally, continuous assignments enforce directed equalities between registers or wires: whenever the right-hand side of an equality changes, the left-hand side is updated by re-running the assignment. Note that VINTER focuses only on the synthesizable fragment of VERILOG, i.e., it does not model delays, etc., which are only relevant for simulation.
VINTER processes are composed in parallel using the (‖) operator. Unlike concurrent software processes, they are, however, synchronized using a single (implicit) fixed-rate clock: each process waits for all other (parallel) processes to finish executing before moving on to the next iteration, i.e., the next clock cycle. Moreover, unlike software, these programs are usually data-race free, in order to be synthesizable to hardware.

VINTER processes run forever; they perform computations and update registers (e.g., out in our multiplier) on every clock cycle. For example, pipelined hardware units execute multiple, different computations simultaneously.
From Software to Hardware. This execution model, together with the fact that software operates at a higher level of abstraction than hardware, makes it difficult for us to use existing verification tools for constant-time software (e.g., [16, 20]).

First, constant-time verification for software only considers straight-line, sequential code. This makes it ill-suited for the concurrent, long-lived execution model of hardware.

Second, software constant-time models are necessarily conservative. They deliberately abstract over hardware details—i.e., they don't rely on a precise hardware model (e.g., of caches or branch predictors)—and instead use leakage models that make control flow and memory access patterns observable to the attacker. This makes constant-time software portable across hardware. But it also makes the programming model restrictive: the model disallows any branching to protect against hidden microarchitectural state (e.g., the branch predictor).
Since we operate on VERILOG, where all state is explicit and visible, we can instead directly track the influence of secret values on the timing of attacker-observable outputs. This allows us to be more permissive than software constant-time models. For instance, if we can show that the execution of two branches of a hardware design takes the same amount of time, independent of secret inputs, we can safely allow branches on secrets. However, this still leaves the problem of pipelining: hardware ingests inputs and produces outputs at every clock cycle: how then do we know (if and) which secret inputs influenced a particular output?
Influence Sets. This motivates our definition of influence sets. In order to define a notion of constant-time execution that is suitable for hardware, we first add annotations marking inputs (i.e., x and y in our example) as sources and outputs (i.e., out) as sinks. For a given cycle, we then associate with each register x its influence-
Cycle  x  y  ct  fr  out   θ(x)    θ(y)    θ(ct)  θ(fr)  θ(out)
0      0  1  F   X   X     {0}     {0}     ∅      ∅      ∅
1      0  1  F   X   0     {1}     {1}     ∅      ∅      {0}
...
k−1    0  1  F   0   0     {k−1}   {k−1}   ∅      {0}    {k−2}
k      0  1  F   0   0     {k}     {k}     ∅      {1}    {k−1}

Figure 4: Execution of EX1, where x = 0 and y = 1, and ct is unset. For each variable and cycle, we show its current value and influence set. We assume that it takes k cycles to compute the output along the slow path, and abbreviate flp_res as fr. X denotes an unknown/irrelevant value. Register out is only influenced by values from the last cycle. Highlighted cells are the difference with Figure 5. Values that stayed the same in the next cycle are shaded.
Cycle  x  y  ct  fr  out   θ(x)    θ(y)    θ(ct)  θ(fr)  θ(out)
0      1  1  F   X   X     {0}     {0}     ∅      ∅      ∅
1      1  1  F   X   X     {1}     {1}     ∅      ∅      {0}
...
k−1    1  1  F   1   X     {k−1}   {k−1}   ∅      {0}    {k−2}
k      1  1  F   1   1     {k}     {k}     ∅      {1}    {0, k−1}

Figure 5: Execution of EX1, where both x = 1 and y = 1, and ct is unset. The execution produces the same influence sets as the execution in Fig. 4, except for cycle k, where out's influence set contains the additional value 0, thereby violating our definition of constant-time execution.
set θ(x). The influence set of a register x contains all cycles t such that an input at t was used in the computation of x's current value. This allows us to define constant-time execution for hardware: we say that a hardware design is constant-time if any two executions (that satisfy usage assumptions) produce the same sequence of influence sets for their sinks.
Example. We now illustrate this definition using our running example EX1, by showing that EX1 violates our definition of constant-time if the ct flag is unset. For this, consider Fig. 4 and Fig. 5, which show the state of registers and wires, as well as their respective influence sets, for two executions. In both executions, we let y = 1, but vary the value of the x register: in Fig. 4, we set x to 0 to trigger the fast path; in Fig. 5, we set it to 1. In both executions, sources x and y are only influenced by the current cycle, the constant-time flag ct is set independently of inputs, and the temporary register flp_res is influenced by the inputs that were issued k−1 cycles ago, as it takes k−1 cycles to compute flp_res along the slow path.
The two executions differ in the influence sets of out. In Fig. 4, out is only influenced by the input issued in the last cycle, through a control dependency on iszero. In the execution in Fig. 5, its value at cycle k is, however, also influenced by the input at 0. This reflects the propagation of the computation result through the slow path. Crucially, it also shows that the multiplier is not constant-time—the sets θ(out) differing between two runs reflects the influence of data on the duration of the computation.
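The influence sets of Figs. 4 and 5 can be reproduced with a small Python model of this simplified multiplier (our own sketch; the function name, the slow-path depth parameter k, and the cycle bookkeeping are illustrative assumptions):

```python
# Sketch: θ(out) per cycle for a simplified EX1 with ct unset. Sources
# re-issue inputs every cycle; flp_res is ready k-1 cycles after its
# inputs were issued; out written at cycle t becomes visible at t+1.

def theta_out_sequence(x, y, k, cycles):
    """Return the sequence of influence sets of `out`."""
    iszero = (x == 0) or (y == 0)
    seq = [frozenset()]                   # cycle 0: out not yet written
    for t in range(cycles - 1):
        ctrl = {t}                        # θ(iszero) = θ(x) ∪ θ(y) = {t}
        fr = {t - (k - 1)} if t >= k - 1 else set()  # θ(flp_res) at t
        # fast path: out <= 0 (control dep only); slow: out <= flp_res
        seq.append(frozenset(ctrl if iszero else fr | ctrl))
    return seq

fast = theta_out_sequence(0, 1, k=4, cycles=5)   # fast path (x == 0)
slow = theta_out_sequence(1, 1, k=4, cycles=5)   # slow path
# the runs agree up to cycle k-1 but differ at cycle k, witnessing the
# constant-time violation of Figs. 4 and 5
print(fast[:4] == slow[:4], fast[4] == slow[4])  # True False
```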
2.2 Liveness Equivalence

We now show how to reduce verifying whether a given hardware design is constant-time to an easy-to-check, yet equivalent problem called liveness equivalence. Intuitively, liveness equivalence reduces the problem of checking equality of influence sets to checking the equivalence of membership, for arbitrary elements.
Liveness Equivalence. Our reduction focuses on a single computation started at some cycle t. We say that register x is live for cycle t (t-live) if its current value is influenced by an input issued in cycle t, i.e., if t ∈ θ(x). Two executions are t-liveness equivalent if, whenever a t-live value is assigned to a sink in one execution, a t-live value must also be assigned in the other. Finally, a hardware design is liveness equivalent if any two executions that satisfy usage assumptions are t-liveness equivalent, for any t.
Live Value Propagation. To track t-liveness for a fixed t, IODINE internally transforms VINTER programs as follows. For each register or wire (e.g., x in our multiplier), we introduce a new shadow variable (e.g., x•) that represents its liveness; a shadow variable x• is set to L if x is live and D (dead) otherwise.2 We then propagate live-
2 For liveness-bits x• and y•, we define a join operator ∨, such
that
repeat [
    iszero := (x == 0 || y == 0) ;
    iszero• := (x• ∨ y•)
]
‖ repeat [
    . . . ; flp_res ⇐ . . . ;
    . . . ; flp_res• ⇐ . . . // (x• ∨ y•)
]
‖ repeat
    ite(ct,
        out ⇐ flp_res ; out• ⇐ (flp_res• ∨ ct•),
        ite(iszero,
            out ⇐ 0 ; out• ⇐ (ct• ∨ iszero•),
            out ⇐ flp_res ; out• ⇐ (flp_res• ∨ ct• ∨ iszero•)))

Figure 6: EX1, after we propagate liveness using a standard taint-tracking inline monitor.
Cycle  x  y  ct  fr  out   x•  y•  ct•  fr•  out•
0      0  1  F   X   X     L   L   D    D    D
1      0  1  F   X   0     D   D   D    D    L
...
k−1    0  1  F   0   0     D   D   D    L    D
k      0  1  F   0   0     D   D   D    D    D

Figure 7: Execution of EX1•, where x = 0 and y = 1. We show the current value and liveness bit for each register and cycle. Register out is live in cycle one, due to the fast path, and dead otherwise. Highlights are the differences with Figure 8. Values that stayed the same in the next cycle are shaded.
ness using a standard taint-tracking inline monitor [44], shown in Figure 6. Intuitively, our monitor ensures that registers and wires that depend on a live value—directly or indirectly, via control flow—are marked live.
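As an illustration, the monitored out process of Figure 6 might be modeled in Python as follows (a hedged sketch with booleans standing in for L/D; `step_out` and the `live` dictionary are our own names):

```python
# Sketch: one cycle of the `out` process plus its inline taint monitor.
# Liveness bits are booleans: True = L (live), False = D (dead).

def step_out(ct, iszero, flp_res, live):
    """Return (out, out_live) following Figure 6's monitored branches."""
    if ct:
        return flp_res, live["flp_res"] or live["ct"]
    if iszero:
        # even the constant 0 is marked live when the branch condition is:
        # control dependencies propagate taint, not just data flow
        return 0, live["ct"] or live["iszero"]
    return flp_res, live["flp_res"] or live["ct"] or live["iszero"]

out, out_live = step_out(ct=0, iszero=True, flp_res=7,
                         live={"flp_res": False, "ct": False, "iszero": True})
print(out, out_live)  # 0 True
```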
Example. By tracking liveness, we can again see that our floating-point multiplier is not constant-time when the ct flag is unset. To this end, we “inject” live values at sources (x and y) at time t = 0 for two runs; as before, we set y = 1, and vary the value of x: in one execution, we set x to 0 to trigger the fast path; in the other execution, we set it to 1. Fig. 7 and 8 show the state of the different registers and wires for these runs. In both runs, out is live at cycle 1—due to a control dependency in Fig. 7, due to a direct assignment in Fig. 8. But, in the latter, out is also live at the kth cycle. This reflects the fact that the influence sets of out at cycle k differ in the membership of 0, and therefore witnesses the constant-time violation.
2.3 Verifying Liveness Equivalence

Using our reduction to liveness equivalence, we can verify that a VERILOG program executes in constant-time using standard methods. For this, we mark source data as
x• ∨ y• is L, if x• or y• is L, and D, otherwise.
Cycle  x  y  ct  fr  out   x•  y•  ct•  fr•  out•
0      1  1  F   X   X     L   L   D    D    D
1      1  1  F   X   X     D   D   D    D    L
...
k−1    1  1  F   1   X     D   D   D    L    D
k      1  1  F   1   1     D   D   D    D    L

Figure 8: Execution of EX1•, where both x = 1 and y = 1. The liveness bits are the same as in Fig. 7, except for cycle k, where out is now live. This reflects the propagation of the output value through the slow path and shows the constant-time violation.
live in some arbitrarily chosen start cycle t. We then verify that any two executions that satisfy usage assumptions assign t-live values to sinks in the same way.
Product Programs. Like previous work on verifying constant-time software [16], IODINE reduces the problem of verifying properties of two executions of some program P to proving a property about a single execution of a new program Q. This program—the so-called product program [22]—consists of two disjoint copies of the original program.
Race-Freedom. Our product construction exploits the fact that VERILOG programs are race-free, i.e., the order in which always-blocks are scheduled within a cycle does not matter. While races in software often serve a purpose (e.g., a task distribution service may allow races between equivalent worker threads to increase throughput), races in VERILOG are always artifacts of poorly designed code: any synthesized circuit is, by its nature, race-free, i.e., the scheduling of processes within a cycle does not affect the computation outcome. Indeed, races in VERILOG represent an under-specification of the intended design.
Per-Process Product. We leverage this insight to compose the two copies of a program in lock-step. Specifically, we merge each process of the two program copies and execute the “left” (L) and “right” (R) copies together. For example, IODINE transforms the VINTER multiplier code from Figure 6 into the per-process product program shown in Figure 9.
Merging two copies of a program as such is sound: since the program is race-free—any ordering of process transitions within a cycle yields the same results—we are free to pick an arbitrary schedule.3 Hence, IODINE takes a simple ordering approach and schedules the left and right copies of the same process at the same time.
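A sketch of the per-process merge, over a toy statement representation of our own devising (not IODINE's internal form), where a statement is a (target, sources) pair of variable names:

```python
# Sketch: per-process product construction. Renaming produces the L and
# R copies of each statement; merging interleaves them so both copies
# of a process execute in lock-step within one product process.

def rename(stmt, tag):
    """Suffix every variable in a (target, sources) statement with tag."""
    target, sources = stmt
    return (target + tag, [s + tag for s in sources])

def product(process):
    """Merge the L and R copies of one process into a single process."""
    merged = []
    for stmt in process:
        merged.append(rename(stmt, "_L"))  # left copy of the statement
        merged.append(rename(stmt, "_R"))  # right copy, scheduled together
    return merged

p = [("iszero", ["x", "y"])]
print(product(p))
# [('iszero_L', ['x_L', 'y_L']), ('iszero_R', ['x_R', 'y_R'])]
```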
Constant-Time Assertion. Given such a product program, we can now frame the constant-time verification
3 To ensure that hardware designs are indeed race-free, our implementation performs a light-weight static analysis to check for races.
repeat
    iszeroL := (xL == 0 || yL == 0) ;
    iszeroR := (xR == 0 || yR == 0) ;
    iszero•L := (x•L ∨ y•L) ;
    iszero•R := (x•R ∨ y•R)
‖ repeat
    . . . ; flp_resL ⇐ . . . ;
    . . . ; flp_resR ⇐ . . . ;
    flp_res•L ⇐ . . . // (x•L ∨ y•L) ;
    flp_res•R ⇐ . . . // (x•R ∨ y•R)
‖ repeat . . .

Figure 9: Per-process product form of EX1.
challenge as a simple assertion: the liveness of the left and right program sink-variables must be the same (regardless of when the computation started). In our example, this assertion is simply out•L = out•R. This assertion can be verified using standard methods. In particular, IODINE synthesizes process-modular invariants [45] that imply the constant-time assertion (§ 5).

The following two sections formalize the material presented in this overview.
3 Syntax and Semantics

Since VERILOG's execution model can be subtle [12], we formally define the syntax and semantics of the VERILOG fragment considered in this paper.
3.1 Preliminaries

For a function f, we write dom f to denote f's domain and ran f for its co-domain. For a set S ⊆ dom f, we let f[S ← b] denote the function that behaves the same as f except on S, where it returns b, i.e., f[S ← b](x) evaluates to b if x ∈ S, and to f(x), otherwise. We use f[a ← b] as a shorthand for f[{a} ← b]. Sometimes, we want to update a function by setting the function values of some subset S of its domain to a non-deterministically chosen value. For S ⊆ dom f, we write f[S ← ∗](x) to denote the function that evaluates to some y with y ∈ ran f, if x ∈ S, and to f(x) otherwise.
3.2 Syntax

We restrict ourselves to the synthesizable fragment of VERILOG, i.e., we do not include commands like initial blocks that only affect simulation, and we implement a normalization step [32] in which the program is “flattened” by removing module instantiation through in-lining. We provide VERILOG syntax and a translation to VINTER in Appendix A.2, but define semantics in VINTER (Fig. 2).
Annotations. We define annotations in Figure 10. Let Regs denote the set of registers and Wires the set of wires, and let VARS denote their disjoint union, i.e.,
a ::= Annotation
    | source(v)   source       (In/Out)
    | sink(v)     sink         (In/Out)
    | init(ϕ)     initially ϕ  (Assump.)
    | □(ϕ)        always ϕ     (Assump.)

Figure 10: Annotation syntax.
Config   Meaning            Trace         Meaning
σ        store              Σ             configuration
τ        liveness map       l             label
θ        influence map      b             liveness bit
µ        assign. buffer     π             trace
ev       event set          store(π, i)   σi
P        current program    live(π, i)    τi
I        initial program    inf(π, i)     θi
c        clock cycle        clk(π, i)     ci
                            reset(π, i)   bi

Figure 11: Configuration and trace syntax.
VARS ≜ Regs ⊎ Wires. For a register v ∈ Regs, annotations source(v) and sink(v) designate v as source or sink, respectively.4 We let IO ≜ (Src, Sink) denote the set of input/output assumptions, where Src denotes the set of all sources and Sink denotes the set of all sinks. Let ϕ be a first-order formula over some background theory that refers to two disjoint sets of variables VARSL and VARSR. Then, annotations init(ϕ) and □(ϕ) indicate that formula ϕ holds initially or throughout the execution. The assumptions are collected in A ≜ (INIT, ALL), such that INIT contains all formulas under init and ALL all formulas under □.
3.3 Semantics

Values. The set of values VALS ≜ Z ⊎ {X} consists of the disjoint union of the integers and the special value X, which represents an irrelevant value. A function application that contains X as an argument evaluates to X.
Configurations. The program state is represented by a configuration Σ ∈ Configs. Figure 11 shows the components of a configuration. A store σ ∈ STORES ≜ (VARS → VALS) is a map from registers and wires to values. A liveness map τ ∈ LIVEMAP ≜ (VARS → {L, D}) is a map from registers and wires to liveness bits. An influence map θ ∈ INFMAPS ≜ (VARS → P(Z)) is a map from registers and wires to influence sets. Assignment buffers serve to model non-blocking assignments. Let PIDs denote a set of process identifiers. An assignment buffer µ ∈ PIDs → (VARS × VALS × {L, D} × P(Z))∗ is a map from pro-

4 To use wires as source/sink, one has to define an auxiliary register.
[VAR]
    v, σ, τ, θ ⇢ σ(v), τ(v), θ(v)

[CONST]
    n, σ, τ, θ ⇢ n, D, ∅

[FUN]
    e1, σ, τ, θ ⇢ v1, t1, i1   . . .   ek, σ, τ, θ ⇢ vk, tk, ik
    t = (t1 ∨ · · · ∨ tk)    i = (i1 ∪ · · · ∪ ik)
    ─────────────────────────────────────────────
    f(e1, . . . , ek), σ, τ, θ ⇢ f(v1, . . . , vk), t, i

Figure 12: Expression evaluation.
cess identifiers to sequences of variable/value/liveness-bit/influence-set tuples. An event set ev ∈ P(VARS) is a set of variables, where we use v ∈ ev to indicate that variable v has been changed in the current cycle. Finally, I ∈ Progs contains the initial program. Intuitively, the initial program is used to activate all processes when a new clock cycle begins.
Evaluating Expressions. We define an evaluation relation ⇢ ∈ (EXPR × STORES × LIVEMAP × INFMAPS) → (VALS × {L, D} × P(Z)) that computes the value, liveness-bit, and influence set of an expression. We define the relation through the inference rules shown in Fig. 12. An evaluation step (below the line) can be taken if the preconditions (above the line) are met. Rule [VAR] evaluates a variable to its current value under the store, its current liveness-bit, and its influence set. A numerical constant evaluates to itself, is dead, and is not influenced by any cycle. To evaluate a function literal, we evaluate its arguments and apply the function to the resulting values. A function value is live if any of its arguments are, and its influence set is the union of its arguments' influence sets.
Transition Relations. We define our semantics in terms of four separate transition relations of type (Configs × Labels × Configs). We now discuss the individual relations and then describe how to combine them into an overall transition relation →.
Per-process transition →P. The per-process transition relation →P describes how to step along individual processes. It is defined in Fig. 13. Rules [SEQ-STEP] and [PAR-STEP] are standard and describe sequential and parallel composition. Rule [B-ASN] reduces a blocking update x = e to skip, by first evaluating e to yield a value v, liveness bit t and influence set i, updating store σ, liveness map τ and influence map θ, and finally adding x to the set of modified variables. Rule [NB-ASN] defers a non-blocking assignment. In order to reduce an assignment (x ⇐ e)id for process id to skip, the rule evaluates expression e to value v, liveness bit t and influence set i, and defers the assignment by appending the tuple (x, v, t, i) to the back of id's buffer. We omit rules for conditionals and structural equivalence. Structural equivalence allows transitions between trivially equivalent programs such as P ‖ Q and Q ‖ P.

Non-blocking Transition →N. Transition relation →N applies deferred non-blocking assignments. It is defined by a single rule [NB-APP] shown in Fig. 13. The rule first picks a tuple (x, v, t, i) from the front of the buffer of some process id, and, like [B-ASN], updates store σ, liveness map τ and influence map θ, and finally adds x to the set of updated variables.
Continuous Transition →C. Relation →C specifies how to execute continuous assignments. It is described by rule [C-ASN] in Fig. 13, which reduces a continuous assignment x := e to skip under the condition that some variable y occurring in e has changed, i.e., y ∈ ev. To apply the assignment, it evaluates e to a value, liveness bit and influence set, and updates the store, liveness map and influence map. Importantly, variable y is not removed from the set of events, i.e., a single assignment can enable several continuous assignments.
Global Transition →G. Finally, global transition relation →G is defined by rules [NEWCYCLE] and [NEWCYCLE-ISSUE] shown in Fig. 13. [NEWCYCLE] starts a new clock cycle by discarding the current program and event set, emptying the assignment buffer, resetting the wires to some non-deterministically chosen state (as wires only hold their value within a cycle), and rescheduling and activating a new set of processes, extracted from initial program I. For a program P, let REPEAT(P) ∈ P(Progs) denote the set of processes that occur under repeat. For a set of programs S, we let ⊓S denote their parallel composition. [NEWCYCLE] uses these constructs to reschedule all processes that appear under repeat in I. Both sources and wires are set to D. The influence map is updated by mapping all wires to the empty set, and each source to the set containing only the current cycle.

[NEWCYCLE-ISSUE] performs the same step, but additionally updates the liveness map by issuing new live bits for the source variables. Both rules increment the cycle counter c. The rules issue a label l ∈ Labels ≜ ((STORES × LIVEMAP × INFMAPS × N × {L, D}) ⊎ {ε}), which is written above the arrow (all previous rules issue the empty label ε). Labels are used to construct the trace of an execution, as
we will discuss later.
Overall Transition. We define the overall transition relation ⊆ Configs × Labels × Configs by fixing an order in which to apply the relations. Whenever a continuous assignment step (relation C) can be applied, that step is taken. If no continuous assignment step can be applied but a per-process step (relation P) can be, a P step is taken. If neither a continuous assignment nor a process-local step can be applied but a non-blocking assignment step (relation N) is applicable, an N step is taken. Finally, if no continuous assignment, per-process, or non-blocking step can be applied, the program moves to a new clock cycle by applying a global step (relation G). Our overall transition relation closely follows the Verilog simulation reference model from Section 11.4 of the standard [12].
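The priority order above amounts to a simple scheduler loop: try each relation in order C, P, N, and fall back to the global step G. The following Python sketch illustrates this under the assumption that each relation is modeled as a partial function that returns the successor configuration, or None when it does not apply; the function names are illustrative, not from the paper.

```python
# Sketch of the overall transition relation: try the relations in
# priority order C > P > N, taking the first applicable step; if none
# applies, start a new clock cycle with a global G step.
# step_c/step_p/step_n/step_g are hypothetical stand-ins, each mapping
# a configuration to its successor or to None when inapplicable.
def overall_step(cfg, step_c, step_p, step_n, step_g):
    for rel in (step_c, step_p, step_n):
        nxt = rel(cfg)
        if nxt is not None:
            return nxt
    # No C, P, or N step applies: apply the global restart relation G.
    return step_g(cfg)
```

This mirrors the stratified scheduling of the Verilog simulation reference model: combinational updates settle before non-blocking assignments are applied, and the clock only advances once the cycle has quiesced.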
Executions and Traces. An execution is a finite sequence of configurations and transition labels r ≜ Σ0 l0 Σ1 . . . Σm−1 lm−1 Σm such that Σi →li Σi+1 for i ∈ {0, . . . , m−1}. We call Σ0 the initial state and require that all taint bits are set to D, the influence map maps each variable to the empty set, the assignment buffer is empty, the current program is the empty program ε, and the clock is set to 0. The trace of an execution is the sequence of its (non-empty) labels. For a trace π ≜ (σ0, τ0, θ0, c0, b0) . . . (σn−1, τn−1, θn−1, cn−1, bn−1) ∈ Labels∗ and for i ∈ {0, . . . , n−1} we let store(π, i) ≜ σi, live(π, i) ≜ τi, inf(π, i) ≜ θi, clk(π, i) ≜ ci, and reset(π, i) ≜ bi, and say the trace has length n. For a program P we use TRACES(P) ∈ P(Labels∗) to denote the set of its traces, i.e., all traces with initial program P.
4 Constant-Time Execution

We first define constant-time execution with respect to a set of assumptions. We then define liveness equivalence and show that the two notions are equivalent.
4.1 Constant-Time Execution
Assumptions. For a formula ϕ that ranges over two disjoint sets of variables VARSL and VARSR, and stores σL and σR such that dom σL = VARSL and dom σR = VARSR, we write σL, σR |= ϕ to denote that formula ϕ holds when evaluated on σL and σR. For a program P and a set of assumptions A ≜ (INIT, ALL), we say that two traces πL, πR ∈ TRACES(P) of length n satisfy A if (i) each formula ϕI ∈ INIT holds initially, and (ii) each formula ϕA ∈ ALL holds throughout, i.e., store(πL, 0), store(πR, 0) |= ϕI and store(πL, i), store(πR, i) |= ϕA for 0 ≤ i ≤ n−1. Intuitively, pairs of traces that satisfy the assumptions are “low” or “input” equivalent.
Constant-Time Execution. For a program P, assumptions A, and traces πL, πR ∈ TRACES(P) of length n that satisfy A, πL and πR are constant-time with respect to A if they produce the same influence sets for all sinks, i.e., inf(πL, i)(v) = inf(πR, i)(v) for 0 ≤ i ≤ n−1 and all v ∈ Sink, where two sets are equal if they contain the same elements. A program is constant-time with respect to A if all pairs of its traces that satisfy A are constant-time.
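The definition can be checked directly on a pair of traces. The following Python sketch assumes a trace is represented as a list of labels, each a tuple (store, live, inf, clk, reset), with the influence map inf modeled as a dict from variables to sets of clock cycles; this encoding is ours, not the paper's.

```python
# Sketch: two traces of equal length are constant-time if the influence
# sets of every sink agree at every index. A label is modeled as the
# tuple (store, live, inf, clk, reset); position 2 is the influence map.
def constant_time(trace_l, trace_r, sinks):
    assert len(trace_l) == len(trace_r)
    for lbl_l, lbl_r in zip(trace_l, trace_r):
        inf_l, inf_r = lbl_l[2], lbl_r[2]
        for v in sinks:
            # The influence set records which cycles' inputs can affect v,
            # so any disagreement reflects input-dependent timing.
            if inf_l.get(v, set()) != inf_r.get(v, set()):
                return False
    return True
```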
4.2 Liveness Equivalence

t-Trace. For a trace π, we say that π is a t-trace if reset(π, t) = L and reset(π, i) = D for i ≠ t.

Liveness Equivalence. For a program P, let πL, πR ∈ TRACES(P) such that both πL and πR are of length n. We say that πL and πR are t-liveness equivalent if both are t-traces and live(πL, i)(v) = live(πR, i)(v) for 0 ≤ i ≤ n−1 and all v ∈ Sink. A program is t-liveness equivalent with respect to a set of assumptions A if all pairs of t-traces that satisfy A are t-liveness equivalent. Finally, a program is liveness equivalent with respect to A if it is t-liveness equivalent with respect to A for all t.
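Both conditions of the definition are decidable checks over a pair of traces. The sketch below uses the same illustrative label encoding as before, a tuple (store, live, inf, clk, reset) with live a dict from variables to the taint bits "L"/"D"; the encoding is an assumption of this sketch, not the paper's representation.

```python
# Sketch: a t-trace issues live bits exactly once, at index t
# (position 4 of a label is the reset bit, position 1 the liveness map).
def is_t_trace(trace, t):
    return all((lbl[4] == "L") == (i == t) for i, lbl in enumerate(trace))

# Two t-traces are t-liveness equivalent if the liveness bit of every
# sink agrees at every index.
def t_liveness_equivalent(trace_l, trace_r, sinks, t):
    if not (is_t_trace(trace_l, t) and is_t_trace(trace_r, t)):
        return False
    return all(lbl_l[1].get(v) == lbl_r[1].get(v)
               for lbl_l, lbl_r in zip(trace_l, trace_r)
               for v in sinks)
```

Note that, unlike the constant-time check over influence sets, this check compares only a single bit per sink and index, which is what makes it amenable to the two-run verification of § 5.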
4.3 Equivalence

We can now state our equivalence theorem.

Theorem 1. For all programs P and assumptions A, P executes in constant-time with respect to A if and only if it is liveness equivalent with respect to A.
We first give a lemma which states that, if a register is t-live, then t is in its influence set.

Lemma 1. For any t-trace π of length n, index 0 ≤ i ≤ n−1, and variable v, if v is t-live, i.e., live(π, i)(v) = L, then t is in v’s influence map, i.e., t ∈ inf(π, i)(v).

We can now give the proof of Theorem 1.
Proof of Theorem 1. The interesting direction is right-to-left, i.e., we want to show that a liveness-equivalent program is also constant-time. We prove the contrapositive: if a program violates constant-time execution, it must also violate liveness equivalence. For a proof by contradiction, assume that P violates constant-time execution but satisfies liveness equivalence. If P violates constant-time execution, then there must be a sink v∗, two traces π∗L, π∗R ∈ TRACES(P) that satisfy A, and some
[SEQ-STEP]
⟨σ, µ, θ, ev, τ, s1, I, c⟩ →P ⟨σ′, µ′, θ′, ev′, τ′, s′1, I, c⟩
────────────────────────────────────────────────
⟨σ, µ, θ, ev, τ, [s1; s2], I, c⟩ →P ⟨σ′, µ′, θ′, ev′, τ′, [s′1; s2], I, c⟩

[PAR-STEP]
⟨σ, µ, θ, ev, τ, P, I, c⟩ →P ⟨σ′, µ′, θ′, ev′, τ′, P′, I, c⟩
────────────────────────────────────────────────
⟨σ, µ, θ, ev, τ, P ‖ Q, I, c⟩ →P ⟨σ′, µ′, θ′, ev′, τ′, P′ ‖ Q, I, c⟩

[B-ASN]
e, σ, τ, θ ⇢ v, t, i    σ′ = σ[x ← v]    τ′ = τ[x ← t]    θ′ = θ[x ← i]
────────────────────────────────────────────────
⟨σ, µ, θ, ev, τ, x = e, I, c⟩ →P ⟨σ′, µ, θ′, ev ∪ {x}, τ′, skip, I, c⟩

[NB-ASN]
e, σ, τ, θ ⇢ v, t, i    µ′ = µ[id ← (x, v, t, i) · q]
────────────────────────────────────────────────
⟨σ, µ[id ← q], θ, ev, τ, (x ⇐ e)id, I, c⟩ →P ⟨σ, µ′, θ, ev, τ, skip, I, c⟩

[NB-APP]
σ′ = σ[x ← v]    µ′ = µ[id ← q]    θ′ = θ[x ← i]    τ′ = τ[x ← t]    ev′ = ev ∪ {x}
────────────────────────────────────────────────
⟨σ, µ[id ← q · (x, v, t, i)], θ, ev, τ, P, I, c⟩ →N ⟨σ′, µ′, θ′, ev′, τ′, P, I, c⟩

[C-ASN]
e, σ, τ, θ ⇢ v, t, i    y ∈ VARS(e)    σ′ = σ[x ← v]    τ′ = τ[x ← t]    θ′ = θ[x ← i]
────────────────────────────────────────────────
⟨σ, µ, θ, ev ∪ {y}, τ, x := e, I, c⟩ →C ⟨σ′, µ, θ′, ev ∪ {x, y}, τ′, skip, I, c⟩

[NEWCYCLE]
σ′ ≜ σ[Wires ← ∗]    τ′ ≜ τ[Src ← D][Wires ← D]    θ′ ≜ θ[Wires ← ∅][Src ← {c+1}]    µ′ ≜ µ[PIDs ← ε]
────────────────────────────────────────────────
⟨σ, µ, θ, ev, τ, P, I, c⟩ →G ⟨σ′, µ′, θ′, ∅, τ′, ⊓REPEAT(I), I, c+1⟩   (label: (σ, τ, θ, c, D))

[NEWCYCLE-ISSUE]
σ′ ≜ σ[Wires ← ∗]    τ′ ≜ τ[Src ← L][(VARS − Src) ← D]    θ′ ≜ θ[Wires ← ∅][Src ← {c+1}]    µ′ ≜ µ[PIDs ← ε]
────────────────────────────────────────────────
⟨σ, µ, θ, ev, τ, P, I, c⟩ →G ⟨σ′, µ′, θ′, ∅, τ′, ⊓REPEAT(I), I, c+1⟩   (label: (σ, τ, θ, c, L))

Figure 13: Per-thread transition relation →P, non-blocking transition relation →N, continuous transition relation →C, and global restart relation →G.
index i∗ such that inf(π∗L, i∗)(v∗) ≠ inf(π∗R, i∗)(v∗), and therefore, without loss of generality, there is a cycle t∗ such that t∗ ∈ inf(π∗L, i∗)(v∗) and t∗ ∉ inf(π∗R, i∗)(v∗). We can find two t∗-traces π̂L and π̂R that only differ from π∗L and π∗R in their liveness maps. But then, since the traces are t∗-liveness equivalent, by definition, at index i∗ both π̂L and π̂R are t∗-live, i.e., live(π̂L, i∗)(v∗) = live(π̂R, i∗)(v∗) = L and, by Lemma 1, t∗ ∈ inf(π̂R, i∗)(v∗). Since π̂R and π∗R only differ in their liveness map, this implies t∗ ∈ inf(π∗R, i∗)(v∗), from which the contradiction follows.
5 Verifying Constant-Time Execution

In this section, we describe how IODINE verifies liveness equivalence by using standard techniques.

Algorithm IODINE. Given a VINTER program P, a set of input/output specifications IO, and a set of assumptions A, IODINE checks that P executes in constant time with respect to A. For this, IODINE first checks for race-freedom. If a race is detected, IODINE returns a witness describing the violation. If no race is detected, IODINE takes the following four steps: (1) It builds a set of Horn clause constraints hs [26, 33] whose solution characterizes the set of all configurations that are reachable by the per-process product and satisfy A. (2) Next, it builds a set of constraints cs whose solutions characterize the set of liveness-equivalent states. (3) It then computes a solution Sol to hs and checks whether the solution satisfies cs. To find a more precise solution, the user can supply additional hints in the form of a set of predicates, which we describe later. (4) If the check succeeds, P executes in constant time with respect to A; otherwise, P can potentially exhibit timing variations.
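The four steps compose into a small top-level pipeline. The Python sketch below shows only the control flow; every injected function (check_races, build_hs, build_cs, solve, entails) is a hypothetical stand-in for the corresponding IODINE component, not its actual API.

```python
# Sketch of IODINE's top-level check. The five callables are
# hypothetical stand-ins: a race checker returning a witness or None,
# constraint builders for hs (reachability) and cs (liveness
# equivalence), a Horn-clause solver, and an entailment check.
def iodine_check(program, io_spec, assumptions,
                 check_races, build_hs, build_cs, solve, entails):
    race = check_races(program)
    if race is not None:
        return ("race", race)                    # witness for the race
    hs = build_hs(program, io_spec, assumptions)  # step (1)
    cs = build_cs(program)                        # step (2)
    sol = solve(hs)                               # step (3)
    if entails(sol, cs):                          # step (4)
        return ("constant-time", sol)  # sol doubles as the proof
    return ("possible-violation", sol) # sol helps pinpoint the failure
```

Returning the solution in every non-race outcome matches the back-end described in § 6.1, where the generated invariants serve either as the proof of correctness or as a debugging aid.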
Constraint Solving. IODINE solves the reachability constraints using Liquid Fixpoint [10], which computes the strongest solution that can be expressed as a conjunction of elements of a set of logical formulas. These formulas are composed of a set of base predicates. We use base predicates that track equalities between the liveness bits and values of each variable between the two runs. In addition to these base predicates, we use hints that are defined by the user. We discuss in § 6 which predicates were used in our benchmarks.
6 Implementation and Evaluation

In this section, we describe our implementation and evaluate IODINE on several open-source VERILOG projects, spanning RISC processors, floating-point units, and crypto cores. We find that IODINE is able to show that a piece of code is not constant-time, or otherwise verify that the hardware is constant-time, in a matter of seconds. Except for our processor use cases, we found the annotation burden to be lightweight: often less than 10 lines of code. All the source code and data are available on GitHub, under an open source license.5
6.1 Implementation

IODINE consists of a front-end pass, which takes annotated hardware descriptions and compiles them to VINTER, and a back-end that verifies the constant-time execution of these VINTER programs. We think this modular design will make it easy to extend IODINE to support hardware description languages beyond VERILOG (e.g., VHDL or Chisel [19]).
Our front-end extends the Icarus Verilog parser [9] and consists of 2000 lines of C++. Since VINTER shares many similarities with VERILOG, this pass is relatively straightforward; however, IODINE does not distinguish between clock edges (positive or negative) and, thus, removes them during compilation. Moreover, our prototype does not support the whole VERILOG language (e.g., we do not support assignments to multiple variables).
IODINE’s back-end takes a VINTER program and, following § 5, generates and checks a set of verification conditions. We implement the back-end in 4000 lines of Haskell. Internally, this Haskell back-end generates Horn clauses and solves them using the liquid-fixpoint library, which wraps the Z3 [29] SMT solver. Our back-end outputs the generated invariants, which (1) serve as the proof of correctness when the verification succeeds, or (2) help pinpoint why verification fails.
Tool Correctness. The IODINE implementation and the Z3 SMT solver [29] are part of our trusted computing base. This is similar to other constant-time and information flow tools (e.g., SecVerilog [55] and ct-verif [16]). As such, the formal guarantees of IODINE can be undermined by implementation bugs. We perform several tests to catch such bugs early; in particular, we validate: (1) our translation into VINTER against the original VERILOG code; (2) our translation from VINTER into Horn clauses against our semantics; and (3) the generated invariants against both the VINTER and VERILOG code.

5 https://iodine.programming.systems
6.2 Evaluation

Our evaluation seeks to answer three questions: (Q1) Can IODINE be easily applied to existing hardware designs? (Q2) How efficient is IODINE? (Q3) What is the annotation burden on developers?
(Q1) Applicability. To evaluate its applicability, we run IODINE on several open-source hardware modules from GitHub and OpenCores. We chose VERILOG programs that fit into three categories: processors, crypto cores, and floating-point units (FPUs); these have previously been shown to expose timing side channels. In particular, our benchmarks consist of:
▶ MIPS- and RISCV-32I-based pipelined CPU cores with a single-level memory hierarchy.
▶ Crypto cores implementing the SHA-256 hash function and RSA 4096-bit encryption.
▶ Two FPUs that implement core operations (+, −, ×, ÷) according to the IEEE-754 standard.
▶ An ALU [1] that implements (+, −, ×, . . .).
In our benchmarks, following our attacker model from § 2.1, we annotated all the inputs to the computation. For example, this includes the sequence of instructions for the benchmarks with a pipeline (i.e., MIPS, RISC-V, FPU, and FPU2), in addition to other control inputs, and all the top-level VERILOG inputs for the rest (i.e., SHA-256, ALU, and RSA). Similarly, we annotated as sinks all the outputs of the computation. In the case of benchmarks with a pipeline, this includes the output from the last stage and other results (e.g., whether the result is NaN in FPU), and all the top-level VERILOG outputs for the rest. The modifications we had to perform to run IODINE on these benchmarks were minimal and due to parser restrictions (e.g., desugaring assignments to multiple variables into individual assignments, unrolling the code generated by the loop inside the generate blocks).
(Q2) Efficiency. To evaluate its efficiency, we run IODINE on the annotated programs. As highlighted in Table 1, IODINE can successfully verify different VERILOG programs of modest size (up to 1.1K lines of code) relatively quickly (
                          #Assum
Name          #LOC   #flush  #always   CT   Check (s)
MIPS [5]       434       31        2    ✓       1.329
RISC-V [7]     745       50       19    ✓       1.787
SHA-256 [8]    651        5        3    ✓       2.739
FPU [6]       1182        0        0    ✓      12.013
ALU [1]        913        1        5    ✓       1.595
FPU2 [3]       272        3        4    ✗       0.705
RSA [4]        870        4        0    ✗       1.061
Total         5067       94       33    –      21.163

Table 1: #LOC is the number of lines of Verilog code; #Assum is the number of assumptions (excluding source and sink); #flush and #always are annotations of the form init and □, respectively; CT shows whether the program is constant-time; and Check is the time IODINE took to check the program. All experiments were run on an Intel Core i7 processor with 16 GB RAM.
Discovered Timing Variability. Running IODINE revealed that two of our use cases are not constant-time: one of the FPU implementations and the RSA crypto core. The division module of the FPU exhibits timing variability depending on the values of the operands. In particular, similar to the example from § 2, the module triggers a fast path if the operands are special values.
The RSA encryption core similarly exhibited timing variability. In particular, the internal modular exponentiation algorithm performs a Montgomery multiplication depending on the value of a source bit ei: if ei = 1 then c := ModPro(c, m). Since e is a secret, this timing variability can be exploited to reveal the secret key [27, 35].
(Q3) Annotation burden. While IODINE automatically discovers proofs, the user has to provide a set of assumptions A under which the hardware design executes in constant time. To evaluate the burden this places on developers, we count the number and kinds of assumptions we had to add to each of our use cases. Table 1 summarizes our results: except for the CPU cores, most of our other benchmarks required only a handful of assumptions. Beyond declaring sinks and sources, we rely on two other kinds of annotations. First, we find it useful to specify that the initial state of an input variable x is equal in any pair of runs, i.e., init(xL = xR). This assumption essentially specifies that register x is flushed, i.e., is set to a constant value, to remove any effects of a previous execution from our initial state. Second, we find it useful to specify that the state of an input variable x is equal, throughout any pair of runs, i.e., □(xL = xR). This assumption is important when certain behavior is expected to be the same in both runs. We now describe these assumptions for our benchmarks.
▶ MIPS: We specify that the values of the fetched instructions and the reset bit are the same.
▶ RISC-V: In addition to the assumptions required by the MIPS core, we also specify that both runs take the same conditional branch, and that the type of memory access (read or write) is the same in both runs (however, the actual values remain unrestricted). This corresponds to the assumption that programs running on the CPU do not branch or access memory based on secret values. Finally, CSR registers must not be accessed illegally (see § 6.3).
▶ ALU: Both runs execute the same type of operations (e.g., bitwise, arithmetic), operands have the same bit width, instructions are valid, and reset pins are the same.
▶ SHA-256 and FPU (division): We specify that the reset and input-ready bits are the same.
In all cases, we start with no assumptions and add the assumptions incrementally by manually investigating the constant-time “violations” flagged by IODINE.
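The two annotation forms used above have a simple reading over a pair of runs: init(xL = xR) constrains only the initial stores, while □(xL = xR) constrains every pair of stores. A Python sketch under the assumption that each run is a list of stores (dicts from variable names to values); the function names are ours:

```python
# Sketch: the two assumption forms as checks over a pair of runs,
# where each run is modeled as a list of stores (variable -> value).
def holds_init(run_l, run_r, x):
    # init(x_L = x_R): x agrees in the initial stores of the two runs.
    return run_l[0][x] == run_r[0][x]

def holds_always(run_l, run_r, x):
    # □(x_L = x_R): x agrees in every pair of stores along the runs.
    return all(sl[x] == sr[x] for sl, sr in zip(run_l, run_r))
```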
Identifying Assumptions. In our experience, the assumptions that a user needs to specify fall into three categories. The first are straightforward assumptions, e.g., that any two runs execute the same code. The second class of assumptions specify that certain registers need to be flushed, i.e., they need to be initially the same (flushed) for any two runs. To identify these, we first flush large parts of circuits and then, in a minimization step, remove all unnecessary assumptions. The last, and most challenging, are implicit invariants on data and control, e.g., the constraints on CSR registers. IODINE performs delta debugging to help pinpoint violations but, ultimately, these assumptions require user intervention to be resolved. Indeed, specifying these assumptions requires a deep understanding of the circuit and its intended usage. In our experience, though, only a small fraction of assumptions fall into this third category.
User Hints. For one of our benchmarks (FPU), we needed to supply a small number of user hints (
-
always @(*) begin
  if (...)
    Stall = 1;
  else
    Stall = 0;
end
always @(posedge clk) begin
  if (Stall)
    ID_instr
-
wire de_illegal_csr_access =
  de_valid &&
  de_inst`opcode == `SYSTEM &&
  de_inst`funct3 != `PRIV &&
  ( csr_mstatus`PRV < de_inst[29:28] ||
    ... );
always @(posedge clk) begin
  if (de_illegal_csr_access) begin
    ex_restart
-
Combining Hardware & Software Mitigations. HyperFlow [31] and GhostRider [43] take a hardware/software co-design approach to eliminating timing channels. Zhang et al. [54] present a method for mitigating timing side channels in software and give conditions on hardware that ensure the validity of mitigations is preserved. Instead of eliminating timing flows altogether, they specify quantitative bounds on leakage and offer primitives to mitigate timing leaks through padding. Many other tools [11, 13, 30, 38, 48] automatically quantify leakage through timing and cache side channels. Our approach is complementary and focuses on clock-precise analysis of existing hardware. However, the explicit assumptions that IODINE needs to verify constant-time behavior can be used to inform software mitigation techniques.
References

[1] https://github.com/scarv/xcrypto-ref.
[2] ARM A64 instruction set architecture. https://static.docs.arm.com.
[3] https://github.com/dawsonjon/fpu.
[4] https://github.com/fatestudio/RSA4096.
[5] https://github.com/gokhankici/iodine.
[6] https://github.com/monajalal/fpga_mc/tree/master/fpu.
[7] https://github.com/tommythorn/yarvi.
[8] https://opencores.org/project/sha_core.
[9] Icarus Verilog. http://iverilog.icarus.com/.
[10] Liquid Fixpoint. https://github.com/ucsd-progsys/liquid-fixpoint.
[11] TIS-CT. http://trust-in-soft.com/tis-ct/.
[12] IEEE Standard for Verilog Hardware Description Language. IEEE Std 1364-2005, 2005.
[13] J. Bacelar Almeida, Manuel Barbosa, Jorge S. Pinto, and Bárbara Vieira. Formal verification of side-channel countermeasures using self-composition. In Science of Computer Programming, 2013.
[14] José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, Arthur Blot, Benjamin Grégoire, Vincent Laporte, Tiago Oliveira, Hugo Pacheco, Benedikt Schmidt, and Pierre-Yves Strub. Jasmin: High-assurance and high-speed cryptography. In CCS, 2017.
[15] José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, and François Dupressoir. Verifiable side-channel security of cryptographic implementations: Constant-time MEE-CBC. In FSE, 2016.
[16] José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, François Dupressoir, and Michael Emmi. Verifying constant-time implementations. In USENIX Security, 2016.
[17] Marc Andrysco, David Kohlbrenner, Keaton Mowery, Ranjit Jhala, Sorin Lerner, and Hovav Shacham. On subnormal floating point and abnormal timing. In S&P, 2015.
[18] Marc Andrysco, Andres Noetzli, Fraser Brown, Ranjit Jhala, and Deian Stefan. Towards verified, constant-time floating point operations. In CCS, 2018.
[19] Jonathan Bachrach, Huy Vo, Brian C. Richards, Yunsup Lee, Andrew Waterman, Rimas Avizienis, John Wawrzynek, and Krste Asanovic. Chisel: Constructing hardware in a Scala embedded language. In DAC, 2012.
[20] Gilles Barthe, Gustavo Betarte, Juan Diego Campo, Carlos Daniel Luna, and David Pichardie. System-level non-interference for constant-time cryptography. In CCS, 2014.
[21] Gilles Barthe, Juan Manuel Crespo, and Cesar Kunz. Relational verification using product programs. In FM, 2011.
[22] Gilles Barthe, Pedro R. D'Argenio, and Tamara Rezk. Secure information flow by self-composition. In CSF, 2004.
[23] Daniel J. Bernstein. The Poly1305-AES message-authentication code. In Fast Software Encryption, 2005.
[24] Daniel J. Bernstein. Curve25519: New Diffie-Hellman speed records. In Public Key Cryptography, 2006.
[25] Daniel J. Bernstein. The Salsa20 family of stream ciphers. In New Stream Cipher Designs. Springer, 2008.
[26] Nikolaj Bjørner, Arie Gurfinkel, Ken McMillan, and Andrey Rybalchenko. Horn clause solvers for program verification. In Fields of Logic and Computation, 2015.
[27] David Brumley and Dan Boneh. Remote timing attacks are practical. Computer Networks, 2005.
[28] Michael R. Clarkson and Fred B. Schneider. Hyperproperties. Journal of Computer Security, 2010.
[29] Leonardo de Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In TACAS, 2008.
[30] Goran Doychev, Dominik Feld, Boris Köpf, Laurent Mauborgne, and Jan Reineke. CacheAudit: A tool for the static analysis of cache side channels. In USENIX Security, 2013.
[31] Andrew Ferraiuolo, Mark Zhao, Andrew C. Myers, and G. Edward Suh. HyperFlow: A processor architecture for nonmalleable, timing-safe information flow security. In SIGSAC, 2018.
[32] Michael J. C. Gordon. The semantic challenge of Verilog HDL. In LICS, 1995.
[33] Sergey Grebenshchikov, Nuno P. Lopes, Corneliu Popeea, and Andrey Rybalchenko. Synthesizing software verifiers from proof rules. In PLDI, 2012.
[34] Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. Spectre attacks: Exploiting speculative execution. CoRR, 2018.
[35] Paul C. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In CRYPTO, 1996.
[36] David Kohlbrenner and Hovav Shacham. On the effectiveness of mitigations against floating-point timing channels. In USENIX Security, 2017.
[37] Hyoukjun Kwon, William Harris, and Hadi Esameilzadeh. Proving flow security of sequential logic via automatically-synthesized relational invariants. In CSF, 2017.
[38] Adam Langley. ctgrind: Checking that functions are constant time with Valgrind. https://github.com/agl/ctgrind/.
[39] Xavier Leroy. Formal certification of a compiler back-end, or: Programming a compiler with a proof assistant. In POPL, 2006.
[40] Xun Li, Mohit Tiwari, Jason K. Oberg, Vineeth Kashyap, Frederic T. Chong, Timothy Sherwood, and Ben Hardekopf. Caisson: A hardware description language for secure information flow. In PLDI, 2011.
[41] Linux on ARM. ARM64 prepping ARM v8.4 features, KPTI improvements for Linux 4.17. https://www.linux-arm.info/.
[42] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Anders Fogh, Jann Horn, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. Meltdown: Reading kernel memory from user space. In USENIX Security, 2018.
[43] Chang Liu, Austin Harris, Martin Maas, Michael Hicks, Mohit Tiwari, and Elaine Shi. GhostRider: A hardware-software system for memory trace oblivious computation. SIGPLAN Notices, 2015.
[44] Jonas Magazinius, Alejandro Russo, and Andrei Sabelfeld. On-the-fly inlining of dynamic security monitors. In IFIP, 2010.
[45] Susan Owicki and David Gries. Verifying properties of parallel programs: An axiomatic approach. Communications of the ACM, 1976.
[46] Ashay Rane, Calvin Lin, and Mohit Tiwari. Secure, precise, and fast floating-point operations on x86 processors. In USENIX Security, 2016.
[47] Oscar Reparaz, Joseph Balasch, and Ingrid Verbauwhede. Dude, is my code constant time? In DATE, 2017.
[48] Bruno Rodrigues, Fernando Magno Quintão Pereira, and Diego F. Aranha. Sparse representation of implicit flows with applications to side-channel detection. In CC, 2016.
[49] Marcelo Sousa and Isil Dillig. Cartesian Hoare logic for verifying k-safety properties. In PLDI, 2016.
[50] Tachio Terauchi and Alex Aiken. Secure information flow as a safety problem. In SAS, 2005.
[51] Mohit Tiwari, Jason K. Oberg, Xun Li, Jonathan Valamehr, Timothy Levin, Ben Hardekopf, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood. Crafting a usable microkernel, processor, and I/O system with strict and provable information flow security. In ISCA, 2011.
[52] Mohit Tiwari, Hassan M. G. Wassel, Bita Mazloom, Shashidhar Mysore, Frederic T. Chong, and Timothy Sherwood. Complete information flow tracking from the gates up. In SIGPLAN Notices, 2009.
[53] Conrad Watt, John Renner, Natalie Popescu, Sunjay Cauligi, and Deian Stefan. CT-Wasm: Type-driven secure cryptography for the web ecosystem. 2019.
[54] Danfeng Zhang, Aslan Askarov, and Andrew C. Myers. Language-based control and mitigation of timing channels. In PLDI, 2012.
[55] Danfeng Zhang, Yao Wang, G. Edward Suh, and Andrew C. Myers. A hardware design language for timing-sensitive information-flow security. In ASPLOS, 2015.
A Appendix

A.1 Comparison to Information Flow

In this section, we discuss the relationship between constant-time execution and information flow checking. Information flow safety (IFS) and constant-time execution (CTE) are incomparable, i.e., IFS does not imply CTE, and vice versa. We illustrate this using two examples: one is information flow safe but does not execute in constant time, and one executes in constant time but is not information flow safe.

Figure 17 contains the example program EX2, which is information flow safe but not constant-time. The example contains three registers that are typed high, as indicated by the annotation H, and one register that is typed low, as indicated by the annotation L. The program is information flow safe, as there are no flows from high to low. Indeed, SecVerilog [55] type checks this program.

This program, however, is not constant-time when slowL ≠ slowR. This does not mean that the program leaks high data to low sinks (indeed, it does not). Instead, it means that the high computation takes a variable amount of time depending on the secret input values. In cases like crypto cores, where the attacker has a stopwatch and can measure the duration of the sensitive computation, it is not enough to be information flow safe: we must ensure the core is constant-time.
Next, consider Figure 18, which contains program EX3; it executes in constant time but is not information flow safe. EX3 violates information flow safety by assigning high input sec to low output out. The example, however, executes in constant time with source in and sink out un-
// source(in_low); source(in_high);
// sink(out_low); sink(out_high);
module test(input {L} clk,
            input {L} in_low,
            input {H} in_high,
            output {L} out_low,
            output {H} out_high);
  reg {H} flp_res;
  reg {H} slow;
  reg {L} out_low;
  reg {H} out_high;
  always @(posedge clk) begin
    out_low
-
P ⇒ P′    Q ⇒ Q′
──────────────────
P · Q ⇒ P′ ‖ Q′

s1 ⇒ s′1  . . .  sn ⇒ s′n
──────────────────
begin s1; . . . ; sn; end ⇒ s′1; . . . ; s′n

s ⇒ s′    id fresh
──────────────────
always @(_) s ⇒ repeat [s′]id

id fresh
──────────────────
assign v = e ⇒ repeat [v := e]id

s1 ⇒ s′1    s2 ⇒ s′2
──────────────────
if (e) s1 else s2 end ⇒ ite(e, s′1, s′2)
Figure 19: Translation from VERILOG to VINTER.
always @(*)
  case ({opa[31], opb[31]})
    2'b0_0: sign_mul_r