
Formal Methods in System Design manuscript No.(will be inserted by the editor)

Verification of Galois Field Based Circuits by Formal Reasoning Based on Computational Algebraic Geometry

Alexey Lvov · Luis A. Lastras-Montano · Barry Trager · Viresh Paruthi · Robert Shadowen · Ali El-Zein

Received: date / Accepted: date

Abstract Algebraic error correcting codes (ECC) are widely used to implement reliability features in modern servers and systems and pose a formidable verification challenge. We present a novel methodology and techniques for provably correct design of ECC logics. The methodology comprises a design specification method that directly exposes the ECC algorithm’s underlying math to a verification layer, encapsulated in a tool “BLUEVERI”, which establishes the correctness of the design conclusively by using an apparatus of computational algebraic geometry (Buchberger’s algorithm for Gröbner basis construction). We present results from its application to example circuits to demonstrate the effectiveness of the approach. The methodology has been successfully applied to prove correctness of large error correcting circuits on IBM’s POWER systems to protect memory storage and processor to memory communication, as well as a host of smaller error correcting circuits.

Keywords Galois finite fields · error correcting circuits · formal verification · Buchberger algorithm

1 Introduction

ECCs are widely used in practice to protect data against random errors that inevitably occur during transmission as well as during prolonged storage. As semiconductor technology is scaling down to the nanometer regime and tens of gigabits per second transmission rates, error-free data handling requires larger and more sophisticated error correcting circuits, with the code construction and encoding/decoding algorithms almost always going beyond the templates found in classical literature due to feature set requirements. For example, the IBM z196 systems feature “RAIM” (Redundant Array of Independent Memory, [1], [2]) with a 90 byte ECC that allows the system to recover instantaneously from a full DIMM failure even in the presence of additional chip failures. Each such error correcting circuit has to be individually designed and programmed by a human designer. The resulting implementation complexity in hardware can lead to design errors which can cause costly re-spins of the silicon and derail schedules. Establishing the correctness of such complex hardware is of critical importance, though it poses formidable challenges.

Alexey Lvov, Luis A. Lastras-Montano, Barry Trager
IBM T.J. Watson Research Center, Yorktown Heights, NY, 10598
E-mail: lvov, lastrasl, [email protected]

Viresh Paruthi, Robert Shadowen, Ali El-Zein
IBM Systems and Technology Group, Austin, TX, 78758
E-mail: vparuthi, shadowen, [email protected]

Traditional verification methods such as software simulation, hardware accelerated simulation or post-silicon debug offer insufficient coverage given the difficult nature of the logic and the large solution space to be investigated. State-of-the-art formal verification algorithms (which inherently check circuit behavior against all possible legal combinations of inputs) offering high capacity have been found lacking in proving correctness because of their inability to exploit the specifics of the underlying algebra - Galois field arithmetic.

We propose a solution to the problem of complete symbolic verification of logical circuits which substantially rely on arithmetic over Galois fields. Most of the error correcting circuits fall in the above category, as well as some of the circuits for data encryption and arithmetic logic unit (ALU).

The verification technique is encapsulated in a reasoning tool “Blue Code Verifier” - “BLUEVERI” - and applies algebraic geometry methods (e.g. checks on the consistency of polynomial systems of equations using the concept of Gröbner basis and the associated Buchberger’s algorithm) to the problem of verifying circuits defined over Galois fields in order to establish correctness of the logic circuit against a mathematical specification. The methodology has been successfully applied to verify real life error correcting codes at IBM resulting in substantially improved verification quality, by providing full proof of the correctness of the design which was otherwise unobtainable, and in improved productivity, via significantly reduced verification time and effort. We expect the improvements to accumulate as the methodology gets applied “out-of-the-box” to future processor chips employing even stronger ECC designs, and will be key to integrate commodity memories in products as well as in the design of communication link transceivers. The techniques involved are applicable to other types of logic circuitry based on Galois field arithmetic such as Elliptic Curve Cryptography.

2 Previous Art

Simulation-based methods such as software simulation or hardware-accelerated simulation are inapplicable to the problem of complex ECC verification. The large number of inputs precludes an exhaustive exploration that covers all possible combinations of input bit strings and injected errors (within the claimed error correction capability of the code) and checks whether in each case the decoded bit string is equal to the original one. Directed simulation to cover the vast majority, if not all, of “corner cases” again requires a careful analysis of the code to enumerate correction capability and features - a process which is inherently subject to human limitations and errors. Systematic methods such as SAT or graph-based canonical representations of the logic with Decision Diagrams (DD) such as BDDs [3], BMDs [4], FDDs [5] run out of steam quickly due to the large input space and the complexity of the underlying logic employing exclusive-ORs. Our experience suggests that these existing decision procedures have difficulty scaling to circuits with more than 24-bit inputs. Enhanced verification techniques leveraging Transformation-based Verification (TBV) [12] concepts to simplify then prove the designs become capacity gated for 32-bit Galois field algorithms and beyond. Satisfiability Modulo Theory (SMT) solvers which utilize specialized theories to address specific problem domains (e.g. bit-vectors) do not address polynomial equation solving over Galois fields. Our approach addresses this niche and proposes a methodology to solve such systems of polynomial equations over Galois fields efficiently.

A search for verification of Galois field circuits reveals the following applicable references: [6] and [7]. [6] defines a formal first-order logic language for symbolic arithmetic over an arbitrary binary Galois field along with a set of rules for manipulation of formal sentences (such as transformation of the sentence into prenex normal form, usage of De Morgan’s law, elimination of variables etc.). The correctness criterion for parts of some ECC circuits can be formally expressed in this language, e.g. finding the error locator polynomial from the value of the syndrome for Reed-Solomon codes. Formal reasoning in the language is then applied to prove or disprove the correctness statement. The method is only applicable to verification of algorithms which are correct in any $GF(2^k)$ independently of the value of k. In our method the size of the field is specified; in particular this allows the use of constants of the field other than ‘0’ and ‘1’ in the circuit. The method of [6] does not employ any of the computational algebraic geometry machinery; that bounds it to purely $GF(2^k)$ circuits (with no bit operations allowed), while our method works on circuits with mixed bit and GF signals (the Boolean result of test value operations on GF signals is computed by building a Gröbner basis of a polynomial algebraic system).

The latter [7] applies Gröbner basis techniques to the very narrow problem of verifying multipliers over a large Galois field. The class of the multipliers is further limited to those based on representation of the large field as an extension of degree m of a smaller field of degree n. The paper reports practical results of verifying multipliers up to a maximum field size of $GF(2^{1024})$ (m = 32, n = 32), but it does not make any attempts to verify circuits other than this multiplier circuit with a fixed structure parameterized with only two integers m and n. In contrast our method is capable of verifying virtually any circuit built with GF, Boolean and mixed operations, with the runtime and memory being the only limiting factors for large circuits.

3 Galois Fields and Error Correcting Codes

In this section we give a concise background on finite fields and their applications to error-correcting codes. Refer to [10] pp. 1-286 and [9] pp. 1-146 for a detailed presentation.


3.1 Finite Fields

A Galois (finite) field is a finite set GF of elements together with two binary operations “addition” and “multiplication” from GF × GF to GF which satisfy all the common laws of addition and multiplication for rational numbers. See [9] pp. 11-12 Def. 1.29 for a formal definition.

For any given positive integer k and any prime number p there exists a Galois field of order $p^k$, unique up to isomorphism ([9] p. 49 Theorem 2.5).

Fields of the form $GF(2^k)$ are called binary Galois fields. Each binary Galois field contains a subfield with just two elements ‘0’ and ‘1’ and is a vector space of dimension k over this subfield. Given a basis $b_1, b_2, \ldots, b_k \in GF(2^k)$ the elements of $GF(2^k)$ can be represented as k-tuples of coordinates in this basis or just bit strings of length k. The operation of addition becomes a bitwise XOR in this notation, e.g. “1011” + “0101” = “1110”. Due to distributivity of multiplication the product of any two bit strings can be computed by using only the multiplication table for the basis elements $b_1, b_2, \ldots, b_k$ (often called the table of structural constants). Here is an example of structural constants for GF(4) and their usage for multiplication:

    ×                $b_1$ = “10”    $b_2$ = “01”
    $b_1$ = “10”     “01”            “11”
    $b_2$ = “01”     “11”            “10”

\[
\text{“11”} \times \text{“11”} = (b_1 + b_2) \times (b_1 + b_2) = b_1^2 + b_1 b_2 + b_2 b_1 + b_2^2 = \text{“01” xor “11” xor “11” xor “10”} = \text{“11”}.
\]
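The table-driven multiplication above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TABLE`, `coords`, `gf_add` and `gf_mul` are hypothetical, bit strings are stored as integers, and the leftmost bit is taken to be the coordinate of $b_1$.

```python
# Structural constants for GF(4) in the basis b1 = "10", b2 = "01",
# copied from the table in the text: TABLE[(i, j)] is the bit string b_i * b_j.
# Index 0 stands for b1 and index 1 for b2 (a naming assumption of this sketch).
TABLE = {
    (0, 0): 0b01,  # b1 * b1 = "01"
    (0, 1): 0b11,  # b1 * b2 = "11"
    (1, 0): 0b11,  # b2 * b1 = "11"
    (1, 1): 0b10,  # b2 * b2 = "10"
}
K = 2  # the field is GF(2^K)


def coords(x):
    """Coordinates of x in the basis; the leftmost bit belongs to b1."""
    return [(x >> (K - 1 - i)) & 1 for i in range(K)]


def gf_add(x, y):
    """Addition of bit strings is a bitwise XOR."""
    return x ^ y


def gf_mul(x, y):
    """Expand the product by distributivity and look up each b_i * b_j."""
    result = 0
    for i, xi in enumerate(coords(x)):
        for j, yj in enumerate(coords(y)):
            if xi and yj:
                result ^= TABLE[(i, j)]
    return result


print(format(gf_mul(0b11, 0b11), "02b"))  # the worked example "11" x "11"
```

In this basis the string “11” acts as the multiplicative identity, which is why the worked example returns “11” again.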

Note that there are multiple different ways of choosing a basis in $GF(2^k)$, hence multiple different ways of representing multiplication by structural constants. On the other hand, not every structural constant table yields a binary operation that is associative and reversible, so given a Galois field (in whatever abstract form), one must do some work in order to generate a valid table of structural constants. There exist several methods for obtaining some valid table of structural constants for a Galois field of a given size, e.g. [9] pp. 66-67 Example 2.51; however, we do not need to consider them in detail in this paper.

For the rest of this paper we will work only with binary Galois fields, assuming that their elements are represented by bit strings of length k and that the operation of multiplication is given by a table of structural constants.

3.2 Block Data Transmission Model and Error Correction

Data is stored by blocks of length r (Fig. 1a). During one transmission up to t errors (bit flips) may occur.

The idea behind block error correction is mapping of the original r-bit strings to a subset S of the set of all n-bit strings (n > r) prior to transmission. Set S is called the set of code words and must have the properties
i) $|S| = 2^r$ and
ii) the Hamming distance, that is the number of positions in which bit strings differ, between any two bit strings from S is ≥ 2t + 1.
The mapping (encoding) of data blocks into code words must be a one-to-one mapping. Generally this mapping is assumed to be given by a table; however, for most particular codes short encoding algorithms exist.


[Figure 1. Panel a): an original r = 7 bit block of data is encoded into an 11-bit code word; transmission introduces errors and yields a corrupted word; decoding restores the original code word by moving to the Hamming-closest code word, then restores the original data block by encoding table lookup. Panel b): the space of all 11-bit strings shown as a Hamming metric space; code words are at distance 2t + 1, so code word neighborhoods of Hamming radius t do not intersect.]

Fig. 1 General schema of block error correction.

During transmission at most t bits of a code word may change their values (Fig. 1b). The received corrupted word is now at a distance ≤ t from the transmitted code word and at a distance ≥ t + 1 from any other code word. This property allows the transmitted code word to be uniquely reconstructed from the received corrupted word by simply checking all code words in the encoding table and finding the one at a distance ≤ t from the received word. It remains to convert the transmitted code word back to the original block of data by using the same encoding table.
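The table-based decoding just described can be sketched as follows. The four-word code below is a hypothetical toy example (r = 2, n = 5, minimum distance 3, hence t = 1) chosen only for illustration; it is not one of the codes discussed in this paper.

```python
# Hypothetical toy encoding table: r = 2 data bits, n = 5 code bits.
# Any two code words differ in at least 3 positions, so t = 1.
ENCODING_TABLE = {
    "00": "00000",
    "01": "01011",
    "10": "10101",
    "11": "11110",
}
T = 1  # number of correctable bit flips


def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def decode(received, table=ENCODING_TABLE, t=T):
    """Scan the encoding table for the code word within distance t."""
    for data, codeword in table.items():
        if hamming(received, codeword) <= t:
            return data
    return None  # more than t errors occurred


print(decode("01111"))  # "01011" with one flipped bit decodes to "01"
```

The linear scan over the whole table is what makes this approach impractical for large r, which is why real decoders rely on algebraic structure instead.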

Note that our code also can be used for detection of up to 2t errors instead of correction of up to t errors. In fact, if more than zero but no more than 2t bits change their values during transmission then the corrupted word is different from the original code word and also could not reach (by Hamming distance) any other code word because our code has a minimum Hamming distance of 2t + 1 by construction. So our code in 2t error detection decoding mode should raise an “uncorrectable error” flag if and only if the received word is not a code word.

[Figure 2. Schematic of the generator matrix M, the matrix C, some non-degenerate matrix X, a check matrix, and the zero matrix, annotated with the block dimensions r, n and n − r.]

Fig. 2 Generator and check matrices of a systematic linear code.

In general a block code with minimum distance 2t + 1 can be used in any of the following decoding modes:
- detection of up to 2t errors,
- correction of up to 1 error and detection of 2 to 2t − 1 errors,
- correction of up to 2 errors and detection of 3 to 2t − 2 errors,
- . . . . . .,
- correction of up to t − 1 errors and detection of t or t + 1 errors,
- correction of up to t errors.

3.3 Systematic Linear Codes, Generator Matrix, Syndrome

Consider the space of data blocks as a vector space of dimension r over Galois field GF(2). A code is called a linear error correcting code if the encoding mapping is a linear mapping from $(GF(2))^r$ to $(GF(2))^n$. In other words a code is linear if the encoding mapping has the form x → Mx where M is a fixed r-by-n matrix of GF(2) elements (throughout this section “a-by-b” denotes a matrix with a columns and b rows). Matrix M is called the generator matrix of the code.

A linear code is called systematic if the first r rows of matrix M form a unit r-by-r matrix. In terms of the encoding that means that the first r bits of any code word coincide with the r bits of the encoded data block. Denote the lower n − r rows of M by G; G is an r-by-(n − r) matrix. Denote the result of horizontal concatenation of the r-by-(n − r) matrix G and the (n − r)-by-(n − r) unit matrix by C. Then CM = 0 (see Figure 2). Let X be an arbitrary non-degenerate (n − r)-by-(n − r) matrix. Any matrix of the form XC is called a check matrix of the systematic linear code; different decoding procedures may use different check matrices.

Note that a check matrix of a systematic linear code is always a full rank matrix whose null space is the set of the code words S. Moreover, this is a characteristic property of check matrices: any full rank n-by-(n − r) matrix whose null space is the set of the code words can be expressed as XC. Any check matrix of a systematic linear code uniquely defines the generator matrix, and very often it is more convenient to describe a systematic linear code by specifying one of its check matrices rather than by the generator matrix.
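The relations between M, G, C and XC can be checked numerically on a small example. The sketch below assumes a (7, 4) Hamming-style code with a hypothetical choice of G and X; matrices are stored as lists of rows, so M has n = 7 rows and r = 4 columns.

```python
def identity(size):
    """Unit matrix as a list of rows."""
    return [[int(i == j) for j in range(size)] for i in range(size)]


def matmul_gf2(a, b):
    """Matrix product over GF(2): AND for multiplication, sum mod 2."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))]
            for i in range(len(a))]


R, N = 4, 7
# Lower n - r rows of the generator matrix (an assumed example).
G = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]

M = identity(R) + G                                   # unit block stacked on G
C = [g + u for g, u in zip(G, identity(N - R))]       # G next to a unit block
X = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]                                       # some non-degenerate matrix
CHECK = matmul_gf2(X, C)                              # another check matrix XC


def encode(data):
    """Systematic encoding x -> Mx; the first r bits repeat the data."""
    return [row[0] for row in matmul_gf2(M, [[bit] for bit in data])]


print(matmul_gf2(C, M))   # all zeros: CM = GI + IG = G + G = 0 over GF(2)
print(encode([1, 0, 1, 1]))
```

Because CM = 0, every code word Mx has a zero syndrome under C and under any XC.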


Theorem 1 ([10] p. 59, Theorem 4.11, extension): A code with check matrix Y has minimum Hamming distance of at least d + 1 if and only if any d columns of Y are linearly independent.

All decoding procedures for all systematic linear codes have the following common steps:

1/3) Denote the corrupted code word received by the decoder by v. Compute the (n − r)-dimensional vector $s \stackrel{\mathrm{def}}{=} XCv$ (the check matrix XC is a fixed property of the decoder). Vector s is called the syndrome of v and carries full information about error locations and the presence of uncorrectable errors in the corrupted word. In particular, under the assumption that the number of errors does not exceed the error correcting/detecting capability of the code, the syndrome is zero if and only if no errors occurred during transmission.

2/3) Extract the error location information and the UE (uncorrectable error) flag from the syndrome. This step varies greatly from code to code and from one decoding algorithm to another. For codes with large ranks and large error correcting capabilities this step usually heavily relies on Galois field algebra. However, given unlimited memory, this step can always be done by brute force, that is by simply constructing a full syndrome to <error locations and UE flag> lookup table.

3/3) Once the error locations are known it remains only to flip the corresponding bits of the corrupted code word to obtain the transmitted code word and then to take the first r bits of this code word which (in systematic codes) form the original data block. In multiple-bits-per-symbol codes that we will discuss in the next subsection an additional step of finding error magnitudes is required. This procedure does not depend on the code or decoding algorithm, involves only linear algebra and is very simple in comparison to finding the error locations.
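The three steps can be illustrated with a brute-force lookup table on a small single-error-correcting code. This is a sketch under assumptions: the (7, 4) code and its matrix G are a hypothetical example, X is taken to be the unit matrix, and the code has no spare syndromes, so no UE flag is produced.

```python
R, N = 4, 7
# Lower n - r rows of the generator matrix (an assumed example).
G = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]
# Check matrix C = G next to a unit block, stored as n - r rows of n bits.
C = [G[i] + [int(i == j) for j in range(N - R)] for i in range(N - R)]


def encode(data):
    """Systematic encoding: data bits followed by n - r check bits."""
    return list(data) + [sum(g & d for g, d in zip(row, data)) % 2 for row in G]


def syndrome(word):
    """Step 1/3: s = Cv over GF(2)."""
    return tuple(sum(c & w for c, w in zip(row, word)) % 2 for row in C)


# Step 2/3 by brute force: a full syndrome -> error location table.
SYNDROME_TABLE = {syndrome([0] * N): None}
for pos in range(N):
    error = [0] * N
    error[pos] = 1
    SYNDROME_TABLE[syndrome(error)] = pos


def decode(word):
    pos = SYNDROME_TABLE[syndrome(word)]
    corrected = list(word)
    if pos is not None:
        corrected[pos] ^= 1      # step 3/3: flip the located bit
    return corrected[:R]         # systematic code: first r bits are the data


received = encode([1, 0, 1, 1])
received[5] ^= 1                 # inject a single bit error
print(decode(received))
```

The table has one entry per syndrome value, which is why this approach needs memory exponential in n − r and is replaced by Galois field algebra in larger codes.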

3.4 Multiple Bits per Symbol Codes, Reed-Solomon Codes

All the definitions and statements of the previous subsections remain correct if we replace the 2-symbol alphabet GF(2) by a bigger binary Galois field $GF(2^k)$. For example, data may be stored as blocks of bytes instead of blocks of bits and there may be strong grounds to assume that if a single bit in a byte is corrupted then the other bits of that byte are also very likely to be corrupted. An error correcting code over $GF(2^8)$ with minimum Hamming distance (that is the minimum number of different bytes in any two code words) of 2t + 1 will be able to correct any number of bit errors provided that they occurred in no more than t different bytes of a transmitted block. Such error correcting codes are called burst error correcting codes.

Possibly the best-known family of systematic linear burst error correcting codes are Reed-Solomon codes; in particular Reed-Solomon ECC are used for data protection in all CD and DVD disks. All examples in our Experimental Results section also address Reed-Solomon codes.

Let g be a multiplicative generator of $GF(2^k)$, see [9] p. 50 Theorem 2.8. Reed-Solomon ECC is a systematic linear burst ECC with code word length $2^k - 1$, minimum distance d + 1, number of check symbols (syndrome coordinates) d and data block size $2^k - d - 1$.


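R.S.ECC(k, d) can be defined by explicitly giving one of its check matrices:

\[
\begin{pmatrix}
1 & 1 & \ldots & 1 & 1 & 1 \\
g^{2^k-2} & g^{2^k-3} & \ldots & g^{2} & g & 1 \\
g^{2\cdot(2^k-2)} & g^{2\cdot(2^k-3)} & \ldots & g^{4} & g^{2} & 1 \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
g^{(d-1)\cdot(2^k-2)} & g^{(d-1)\cdot(2^k-3)} & \ldots & g^{(d-1)\cdot 2} & g^{(d-1)} & 1
\end{pmatrix}
\]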

The generator matrix of R.S.ECC(k, d) can be computed numerically as follows. Its first $2^k - d - 1$ rows form a unit $(2^k - d - 1)$-by-$(2^k - d - 1)$ matrix. The last d rows can be computed by bringing the rightmost d columns of the check matrix to a unit d-by-d matrix by elementary transformations of rows (this computation requires $GF(2^k)$ arithmetic). Unfortunately the generator matrix does not have any visible regularity in its structure.

The most notable property of R.S.ECC(k, d) is that it achieves a very large minimum distance of d + 1 by using very few check symbols d.
Proof: Any d columns of the check matrix form a Vandermonde matrix [10] pp. 90-92. So any d columns are linearly independent and by Theorem 1 R.S.ECC(k, d) has a minimum distance of at least d + 1.
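The column-independence argument can be checked by brute force on a small instance. The sketch below assumes k = 3 with GF(8) built modulo $x^3 + x + 1$ and g = x (the integer 2); these choices, and all function names, are illustrative assumptions rather than parameters from the paper.

```python
from itertools import combinations, permutations

K, POLY = 3, 0b1011  # GF(8) modulo x^3 + x + 1 (assumed representation)


def gf_mul(a, b):
    """Shift-and-add multiplication with reduction modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> K:
            a ^= POLY
        b >>= 1
    return result


def gf_pow(a, e):
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def rs_check_matrix(d, g=2):
    """Row i holds g^(i*j) for j = 2^k - 2 down to 0, as in the text."""
    n = 2 ** K - 1
    return [[gf_pow(g, i * j) for j in range(n - 1, -1, -1)] for i in range(d)]


def det_char2(m):
    """Determinant by permutation expansion; signs vanish in characteristic 2."""
    total = 0
    for perm in permutations(range(len(m))):
        term = 1
        for row, col in enumerate(perm):
            term = gf_mul(term, m[row][col])
        total ^= term
    return total


def any_d_columns_independent(check):
    """Condition of Theorem 1: every choice of d columns is non-singular."""
    d, n = len(check), len(check[0])
    return all(det_char2([[row[c] for c in cols] for row in check]) != 0
               for cols in combinations(range(n), d))


print(any_d_columns_independent(rs_check_matrix(3)))
```

The permutation expansion is only practical for tiny d; it is used here because it needs no division and keeps the sketch short.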

4 Proposed Method

Our method was first inspired by the need to verify a large 1024-bit input error correction circuit responsible for protecting the memory store as well as the communication between a POWER processor and memory. A traditional formal verification approach to verify the circuitry quickly became intractable given the vast search space.

The main idea is to use the fact that algebraic ECCs operate mostly on the elements of finite fields, and there are powerful techniques for symbolic reasoning in this domain. The process of verification of such circuits reduces to the verification of a number of algebraic statements of the type “A certain system of multivariate polynomials over a finite field implies some other system of multivariate polynomials over a finite field”. The latter problem relates to computational algebraic geometry and can be solved by building Gröbner bases for certain sets of polynomials by using Buchberger’s algorithm ([8], pp. 77, 82-87).

4.1 Verification Setup

The verification set-up consists of two parts: the circuit to be verified, and a check file containing information about the set of legal inputs and the expected values for some set of “crucial” signals; an example of the latter would be an uncorrectable error flag (see subsection 5.1) or a signal that tests the equality between two bit vectors (see subsection 5.2). The verification task at hand is to formally prove (or disprove) that for any legal combination of inputs, the values of the crucial signals match their expected values.

In a standard processing methodology, the circuit is generally represented by a directed graph where the edges are wires carrying only Boolean signals, and nodes are gates performing only basic Boolean operations. Since we assume that a large portion of the operations in the circuit are operations in $GF(2^k)$ arithmetic, we modify this representation by “gluing” together wires which represent the same $GF(2^k)$ elements and putting “black boxes” around the pieces of the circuit which represent basic $GF(2^k)$ arithmetic operations. Practically this is done by passing a special option to the HDL compiler, telling it to not synthesize functions from a given list. The circuit in our representation typically looks similar to the example in Fig. 3.

[Figure 3. Example circuit with a bit input b and GF inputs x, y, z, built from the gates XOR, CONST 1, MULT, ADD, SQUARE, WHEN_ELSE and IS_ZERO, with a Boolean output signal “crucial”.]

Fig. 3 Example of BLUEVERI circuit representation.

After this transformation, each wire carries either a Boolean signal or a $GF(2^k)$ signal. For this reason, we generalize the concept of “gate” so that now each gate performs one of the following operations:

– Basic binary arithmetic operations on $GF(2^k)$: ADD (both x + y and x − y), MULT (xy), DIV ($x y^{2^k-2}$).
– Any fixed set of unary operations on $GF(2^k)$ which are linear over GF(2), e.g. Frobenius automorphism (square), projections on elements of a fixed basis, square root, bit permutations etc.
– Any fixed set of $GF(2^k)$ constants (functions without arguments).
– WHEN_ELSE(b, x, y) function which returns $GF(2^k)$ element x when bit b is 1 and $GF(2^k)$ element y otherwise.
– $GF(2^k)$ value test functions whose return value is a bit: IS_ZERO(x), IS_NONZERO(x).
– Boolean functions: NOT, AND, OR, XOR.
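To make the gate semantics concrete, here is a minimal sketch that evaluates them on concrete field values instead of symbolic expressions. It assumes k = 4 with GF(16) represented modulo $x^4 + x + 1$; this representation and the Python function bodies are illustrative assumptions, not the tool's implementation.

```python
K, POLY = 4, 0b10011  # GF(16) modulo x^4 + x + 1 (assumed representation)


def MULT(x, y):
    """Shift-and-add multiplication with reduction modulo POLY."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x >> K:
            x ^= POLY
        y >>= 1
    return result


def ADD(x, y):
    """x + y and x - y coincide in characteristic 2: both are XOR."""
    return x ^ y


def DIV(x, y):
    """x * y^(2^k - 2); y^(2^k - 2) is the inverse of y for nonzero y."""
    power = 1
    for _ in range(2 ** K - 2):
        power = MULT(power, y)
    return MULT(x, power)


def SQUARE(x):
    """Frobenius automorphism, a unary operation linear over GF(2)."""
    return MULT(x, x)


def WHEN_ELSE(b, x, y):
    return x if b == 1 else y


def IS_ZERO(x):
    return int(x == 0)


def IS_NONZERO(x):
    return int(x != 0)


print(MULT(DIV(7, 5), 5))  # dividing and multiplying back recovers 7
```

With this definition DIV(x, 0) evaluates to 0, since $0^{2^k-2} = 0$.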

The check file contains algebraic constraints on the $GF(2^k)$ inputs, optionally initial values for some Boolean and $GF(2^k)$ inputs, and the expected values for the crucial Boolean signals testing the desired behavior for the circuit. The crucial signals are restricted to Boolean because any condition on $GF(2^k)$ signals can be expressed as a condition on Boolean signals by adding just a few gates to the circuit. For example, if one wants to state that a GF signal x is equal to a given constant const, then one may alternatively assert that we expect IS_ZERO(ADD(x, const)) to be equal to 1.

The algebraic constraints are specified in conjunctive normal form (CNF) whose literals are multivariate polynomial equalities or inequalities on the free variables associated with each of the $GF(2^k)$ inputs.

Here is an example of a check file for the circuit in Fig. 3:

    BEGIN_CHECK;
      IN_BITS_SETTINGS;
        b <= '0';
      EXPLICIT_EXPRESSIONS_FOR_SOME_GF_INPUTS;
        x <= "8F3A";
      ALGEBRAIC_CONSTRAINTS_ON_GF_INPUTS;
        [ (y^3 + z^5 == 0) or (y^2 + z != 0) ]
        and
        [ (y == 0) or (z == 0) or (y + z != 0) ]
      BIT_EXPECTED_VALUES;
        crucial must be '1';
    END_CHECK;

We support multiple checks in one check file, in which case our tool verifies them independently one by one. We also support appending new checks at the end of the file during verification (a necessary feature for the “fork on unresolved bits” mechanism outlined later).

4.2 Verification Flow

The process starts by assigning a free variable (e.g. the symbolic string identifier used in the HDL file) to each of the $GF(2^k)$ inputs. Next the values of the crucial bit signals are computed one by one by applying the following recursive procedure. The procedures for “. . . execute the operation . . . ” will be explained for each type of operation subsequently.

    COMPUTE_OUTPUT_OF_GATE(signal g)
      // case g is Boolean : Attempt to compute to const. ‘0’ or ‘1’.
      // case g is GF(2^k) : Compute as a symbolic rational expression in the free variables.
      for all inputs g_i of g
        COMPUTE_OUTPUT_OF_GATE(g_i)
      switch (type of g)
        ADD:  . . . Execute the operation . . .
        MULT: . . . Execute the operation . . .
        · · · · · ·
        XOR:  . . . Execute the operation . . .
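The traversal order of the recursive procedure (inputs first, then the gate itself) can be sketched as below. Note the simplification: this sketch substitutes concrete GF(4) values for the symbolic rational expressions used by the actual method, and the circuit `CIRCUIT`, the gate table `OPS` and the field representation (modulo $x^2 + x + 1$) are hypothetical.

```python
K, POLY = 2, 0b111  # GF(4) modulo x^2 + x + 1 (assumed representation)


def gf_mul(a, b):
    """Shift-and-add multiplication with reduction modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> K:
            a ^= POLY
        b >>= 1
    return result


# One evaluation rule per gate type (concrete values, not symbolic ones).
OPS = {
    "ADD": lambda x, y: x ^ y,
    "MULT": gf_mul,
    "WHEN_ELSE": lambda b, x, y: x if b else y,
    "IS_ZERO": lambda x: int(x == 0),
}

# Hypothetical circuit: signal name -> (gate type, input signal names).
CIRCUIT = {
    "s": ("ADD", ["x", "y"]),
    "p": ("MULT", ["x", "y"]),
    "w": ("WHEN_ELSE", ["b", "s", "p"]),
    "crucial": ("IS_ZERO", ["w"]),
}


def compute_output_of_gate(signal, inputs, cache=None):
    """Recursively evaluate all inputs of a gate, then the gate itself."""
    cache = {} if cache is None else cache
    if signal in inputs:                 # primary input of the circuit
        return inputs[signal]
    if signal not in cache:
        op, args = CIRCUIT[signal]
        values = [compute_output_of_gate(a, inputs, cache) for a in args]
        cache[signal] = OPS[op](*values)
    return cache[signal]


print(compute_output_of_gate("crucial", {"b": 1, "x": 2, "y": 2}))
```

The cache ensures each gate is evaluated once even when several gates share an input, which matters on circuits with heavy fan-out.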


Given unlimited time and memory and assuming that all recursive sub-calls successfully compute the values of $g_1, g_2, \ldots$, a call to COMPUTE_OUTPUT_OF_GATE(g) always succeeds if g is a $GF(2^k)$ signal. However, it may fail for Boolean signals because Boolean signals are (generally) not constants but depend on the inputs. If a Boolean signal cannot be computed to ‘0’ or ‘1’ we skip to the next check and add two new checks at the end of the check file assuming values ‘0’ and ‘1’ for that bit by applying the “fork on unresolved bit” procedure described later in this subsection. Note that although it may seem that this would fork on nearly every bit in the circuit, in our experience for ECCs the situation is typically just the opposite: given a restricted set of inputs (e.g. exactly one injected error) most of the Boolean signals in the circuit do not depend on the inputs; an example of this can be seen in subsection 5.1 in the computation of the uncorrectable error flag of a decoder (see footnote 1). Furthermore, BLUEVERI performs signal dependency checks that result in the value of many Boolean signals in the circuit not being needed; such Booleans never cause a fork as described above.

Given $g_1, g_2, \ldots$, we compute g depending on the type of operation as follows:

ADD and MULT: Perform the operation on the multivariate rational expressions. E.g.
\[
\mathrm{ADD}\!\left(\frac{x}{y+z}, \frac{y}{x+z}\right) = \frac{x^2 + xz + y^2 + yz}{xy + xz + yz + z^2}, \qquad \mathrm{MULT}(x + 1, y + 1) = xy + x + y + 1
\]
etc.
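As a worked check of the ADD example, bringing the two fractions to a common denominator gives the stated result term by term (no sign bookkeeping is needed, and in characteristic 2 addition and subtraction coincide anyway):

\[
\frac{x}{y+z} + \frac{y}{x+z}
= \frac{x(x+z) + y(y+z)}{(y+z)(x+z)}
= \frac{x^2 + xz + y^2 + yz}{xy + xz + yz + z^2}.
\]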

UNARY_LINEAR_i : Any operation on $GF(2^k)$ which is linear over GF(2) can be given by a linearized polynomial (a polynomial containing only terms of the form $c x^{2^t}$, see [9] pp. 107-124). Substitute the input rational expression into the linearized polynomial. E.g. in GF(16)
\[
\mathrm{Tr}(x) \stackrel{\mathrm{def}}{=} x^8 + x^4 + x^2 + x, \qquad \mathrm{Tr}(y + z^3) = y^8 + y^4 + y^2 + y + z^{24} + z^{12} + z^6 + z^3.
\]
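The linearity that justifies the term-by-term substitution can be confirmed exhaustively in GF(16). The sketch assumes the field is represented modulo $x^4 + x + 1$; the representation and the function names are assumptions made for illustration.

```python
K, POLY = 4, 0b10011  # GF(16) modulo x^4 + x + 1 (assumed representation)


def gf_mul(a, b):
    """Shift-and-add multiplication with reduction modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> K:
            a ^= POLY
        b >>= 1
    return result


def gf_pow(a, e):
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def trace(x):
    """Linearized polynomial Tr(x) = x^8 + x^4 + x^2 + x."""
    return gf_pow(x, 8) ^ gf_pow(x, 4) ^ gf_pow(x, 2) ^ x


def trace_expanded(y, z):
    """Right-hand side of the example: Tr(y + z^3) expanded term by term."""
    return (gf_pow(y, 8) ^ gf_pow(y, 4) ^ gf_pow(y, 2) ^ y
            ^ gf_pow(z, 24) ^ gf_pow(z, 12) ^ gf_pow(z, 6) ^ gf_pow(z, 3))


# The substitution rule holds for every pair of field elements.
print(all(trace(y ^ gf_pow(z, 3)) == trace_expanded(y, z)
          for y in range(16) for z in range(16)))
```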

CONST_i : Set signal g to the constant (a rational expression containing no free variables).

WHEN_ELSE(b, X, Y) : Set rational expression g to rational expression X if b is ‘1’ and to rational expression Y otherwise.

IS_ZERO, IS_NONZERO, NOT, AND, OR, XOR : Computation of values of gates with Boolean output constitutes the most complex part of our algorithm.

To compute the value of g we first find the maximal connected island of ancestors of g with Boolean outputs, that is the subgraph consisting of all gates $h_j$ such that there exists a directed path from $h_j$ to g and all gates on this path except for $h_j$ itself are elementary Boolean gates (NOT, AND, OR or XOR). An example is shown in Fig. 4. Note that any value test function (IS_ZERO or IS_NONZERO) in the subgraph must be a topmost gate. The input signals $g_i$ of the subgraph are either $GF(2^k)$ inputs of value test functions or Boolean inputs of the whole circuit.

By the inductive hypothesis for our recursive function COMPUTE_OUTPUT_OF_GATE(g), all $GF(2^k)$-type $g_i$ have already been assigned some rational expression in the free variables, and all Boolean-type $g_i$ have been computed to constant ‘0’ or ‘1’ (this is possible for all Boolean inputs to the circuit due to an explicit assignment in the “In bits settings” section of the check which may be set either by the user or as a result of forking on unresolved bits).

Footnote 1: Very often the uncorrectable error signal is both an internal signal upon which further things depend and also an output by itself.


[Figure 4. Example subgraph for signal g with subgraph inputs g1, g2, g3, g4; the gates shown are OR, NOT, AND, XOR, IS_ZERO, IS_NONZERO, ADD, DIV, CONST 1, WHEN_ELSE and an input bit.]

Fig. 4 Example of maximal “algebraic system” subgraph for signal g.

The Boolean function given by the subgraph can be written as a conjunctive normal form whose literals are $g_i = 0$ or $g_i \neq 0$, where $g_i$ are rational expressions. As we will show in the description of the DIV operation, we always make sure the denominators of our rational expressions cannot be zero. This allows replacing the $g_i = 0$ and $g_i \neq 0$ literals by $\mathrm{numerator}(g_i) = 0$ and $\mathrm{numerator}(g_i) \neq 0$ polynomial equalities/inequalities and expressing g as an algebraic system of the form

\[
\begin{aligned}
&\left[ P_{11}(x_0, x_1, \ldots) \mathrel{=,\neq} 0 \right] \text{ or } \ldots \text{ or } \left[ P_{1 r_1}(x_0, x_1, \ldots) \mathrel{=,\neq} 0 \right], \\
&\ldots \ldots \\
&\left[ P_{s,1}(x_0, x_1, \ldots) \mathrel{=,\neq} 0 \right] \text{ or } \ldots \text{ or } \left[ P_{s, r_s}(x_0, x_1, \ldots) \mathrel{=,\neq} 0 \right],
\end{aligned}
\tag{1}
\]

where $P_{ij}$ denote arbitrary polynomials in the free variables $x_0, x_1, x_2, \ldots$ associated with the $GF(2^k)$ inputs of the circuit.

The algebraic constraints on the inputs are also given as CNF, and form an algebraic system of the same type.

g is constant ‘0’ if and only if
    (input constraints CNF) AND (g-subgraph CNF)    (2)
is unsatisfiable.

g is constant ‘1’ if and only if
    (input constraints CNF) AND NOT (g-subgraph CNF)    (3)
is unsatisfiable.

Each of the expressions (2) and (3) can be converted to a single CNF of the form (1). It suffices to show how to check whether a system of the form (1) is unsatisfiable.

<BEGIN Satisfiability checking algorithm>.
The first step is to get rid of inequalities in the system. For each i, j for which we have the inequality $P_{ij}(x_0, x_1, \ldots) \neq 0$ we introduce an auxiliary free variable $t_{ij}$ and replace the inequality by

\[
t_{ij} \cdot P_{ij}(x_0, x_1, \ldots) - 1 = 0.
\]

One can easily check that if the system before replacement is satisfiable in variables

    $x_0, x_1, \ldots$, <all previously added auxiliary variables>

then the system after replacement is satisfiable in variables

    $x_0, x_1, \ldots$, <all previously added auxiliary variables> ∪ $t_{ij}$

and vice versa.

All CNF-literals of the new system are polynomial equalities. Denote them by

\[
Q_{ij}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) = 0.
\]

Next we replace all OR operations with multiplication:

\[
\begin{aligned}
& Q_{11}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) \cdot \ldots \cdot Q_{1 r_1}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) = 0, \\
& \ldots \ldots \\
& Q_{s,1}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) \cdot \ldots \cdot Q_{s, r_s}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) = 0.
\end{aligned}
\tag{4}
\]

System (4) is a regular algebraic system of multivariate polynomials over $GF(2^k)$.

By Hilbert’s Weak Nullstellensatz a system of multivariate polynomials is unsatisfiable over an algebraically closed field if and only if the ideal generated by the polynomials of the system coincides with the whole ring (i.e. contains 1) (refer to [8], pp. 169-173).

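The field $GF(2^k)$ is not algebraically closed, but its elements are singled out inside its algebraic closure by one polynomial equation:

\[
x \in GF(2^k) \iff \left[\, x \in \operatorname{alg\,closure}\bigl(GF(2^k)\bigr) \ \text{AND} \ x^{2^k} - x = 0 \,\right].
\]

For each variable $x_i$ add the equation $x_i^{2^k} - x_i = 0$: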

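\[
\begin{array}{l}
Q_{1,1}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) \cdot \ldots \cdot Q_{1,r_1}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) = 0, \\
\ldots \ldots \\
Q_{s,1}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) \cdot \ldots \cdot Q_{s,r_s}(x_0, x_1, \ldots, t_{i_0 j_0}, t_{i_1 j_1}, \ldots) = 0, \\
x_0^{2^k} - x_0 = 0, \\
\ldots \ldots \\
x_{\mathrm{last}}^{2^k} - x_{\mathrm{last}} = 0.
\end{array}
\tag{5}
\]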

System (5) is satisfiable in the algebraic closure of $GF(2^k)$ if and only if the original system (1) is satisfiable in $GF(2^k)$.
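The role of the added equations can be illustrated on a small field tower. In the sketch below GF(16) plays the part of a larger field containing GF(4); this is an assumption made only so that everything can be enumerated (the actual argument uses the full algebraic closure). The equation $x^{2^2} - x = 0$ then selects exactly the four elements of the subfield GF(4).

```python
K = 4            # work inside GF(2^4), an illustrative stand-in for a larger field
MOD = 0b10011    # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

# Every element of GF(16) satisfies x^(2^4) = x ...
all_satisfy_field_equation = all(gf_pow(x, 16) == x for x in range(16))

# ... while x^(2^2) = x holds only on the subfield GF(4).
gf4 = [x for x in range(16) if gf_pow(x, 4) == x]
print(all_satisfy_field_equation, gf4)
```

The selected elements are closed under addition (XOR) and multiplication, as expected of a subfield.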

It remains to build a Gröbner basis of the ideal generated by the polynomials of system (5). This can be done by Buchberger’s algorithm ([8], pp. 77, 82-87). The original system (1) is unsatisfiable in $GF(2^k)$ if and only if this Gröbner basis contains 1.

<END Satisfiability checking algorithm>.
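As a small worked illustration of the criterion (a hand-made example, not one of the systems generated in the experiments), take the input constraint $x + y = 0$ and ask whether the literal $x + y \neq 0$ can hold. After the inequality is removed, the ideal contains 1, so the system is unsatisfiable and the literal is constant false:

```latex
% Input constraint:  x + y = 0
% Literal tested:    x + y \neq 0, rewritten with an auxiliary variable t
%                    (in characteristic 2 we have -1 = +1)
\[
  I = \bigl\langle\, x + y,\;\; t\,(x + y) + 1 \,\bigr\rangle
\]
% An explicit combination of the generators that equals 1:
\[
  1 = t \cdot (x + y) + \bigl( t\,(x + y) + 1 \bigr) \in I
\]
% Hence the reduced Groebner basis of I is {1}; adding the field
% equations x^{2^k} - x and y^{2^k} - y does not change this.
```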

If the value of g is proved to be a constant ‘0’ or ‘1’, assign this value to g (computation successful). Otherwise fork on the unresolved Boolean signal g as follows:

Add two copies of the current check at the end of the check file as given below.


– If g is an input Boolean signal, add g <= ’0’ to the “In bits settings” section of copy 1 and g <= ’1’ to the “In bits settings” section of copy 2.

– Otherwise add NOT( System (1) ) to the conjunctive normal form in the “Algebraic constraints on GF inputs” section of copy 1 and System (1) to the CNF in the “Algebraic constraints on GF inputs” section of copy 2.

Skip the current check and continue to the next one, with the two additional checks added at the end of the queue. As a side note, the two examples in subsections 5.1 and 5.2 do not require branching of this type for completion.
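The queue discipline of the forking step can be sketched on a toy Boolean signal. Everything below is hypothetical and only mirrors the input-bit case described above: a "check" is modelled as a dictionary of input bit settings, `None` stands for an unresolved value, and the names `run_checks`, `g` and `INPUTS` are illustrative, not part of BLUEVERI.

```python
from collections import deque

def k_not(a):
    """Three-valued NOT; None means 'unresolved'."""
    return None if a is None else 1 - a

def k_and(a, b):
    if a == 0 or b == 0:
        return 0
    return None if a is None or b is None else 1

def k_or(a, b):
    if a == 1 or b == 1:
        return 1
    return None if a is None or b is None else 0

INPUTS = ("a", "b")

def g(check):
    """Toy signal (a and b) or (not a) or (not b); it is constant '1'."""
    a, b = check.get("a"), check.get("b")
    return k_or(k_and(a, b), k_or(k_not(a), k_not(b)))

def run_checks(signal, expected):
    """Process checks in order; fork an unresolved check into two copies."""
    queue = deque([{}])          # the initial check fixes no input bits
    processed = 0
    while queue:
        check = queue.popleft()
        processed += 1
        value = signal(check)
        if value is None:
            # Fork on the first unresolved input bit: copy 1 gets '0', copy 2 gets '1'.
            bit = next(name for name in INPUTS if name not in check)
            queue.append({**check, bit: 0})
            queue.append({**check, bit: 1})
        elif value != expected:
            return False, processed
    return True, processed

result = run_checks(g, expected=1)
print(result)
```

In this toy run the initial check and one of its copies are unresolved, so five checks are processed in total before the signal is confirmed to be constant ‘1’.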

The only operation we have not explained yet is division.

DIV: In logical circuits division is usually implemented as if y ≠ 0 return x/y; else return 0; (which is equivalent to $x y^{2^k-2}$). To compute the result of a division we first attempt to prove that the constraints on the inputs imply that the divisor is either always = 0 or always ≠ 0, by the same algebraic method as for the gates with Boolean output. If successful, we simply assign 0 or the rational expression x/y to g. Otherwise we fork on the test [denominator = 0] in the same way as shown above for non-input Boolean signals.
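The equivalence between the guarded division and $x y^{2^k-2}$ can be verified numerically. The sketch below assumes k = 8 and the modulus $x^8 + x^4 + x^3 + x + 1$; both are illustrative choices rather than parameters of any circuit in this paper, and `gf_div` is a hypothetical helper name.

```python
K = 8
MOD = 0x11B      # x^8 + x^4 + x^3 + x + 1, an illustrative irreducible polynomial

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    """Raise a to the n-th power by square-and-multiply."""
    result = 1
    while n:
        if n & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        n >>= 1
    return result

def gf_div(x, y):
    """Circuit-style division: x * y^(2^K - 2), which is x/y for y != 0 and 0 for y = 0."""
    return gf_mul(x, gf_pow(y, (1 << K) - 2))

# Division by zero yields 0, and for y != 0 the result times y gives x back.
zero_case = all(gf_div(x, 0) == 0 for x in range(1 << K))
inverse_case = all(gf_mul(gf_div(x, y), y) == x
                   for x in (1, 2, 0x53, 0xFF) for y in range(1, 1 << K))
print(zero_case, inverse_case)
```

The identity holds because every nonzero y satisfies $y^{2^k-1} = 1$, so $y^{2^k-2}$ is its inverse, while the power of 0 stays 0.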

We have shown how to compute the value of any gate given the values of its inputs. $GF(2^k)$ signals are computed as symbolic rational expressions in the input signals, and Boolean signals must compute to constant ‘0’ or ‘1’, creating new branches with additional algebraic constraints on the inputs if necessary. This completes the description of our algorithm.

Our actual C implementation contains many more features than described above. The most important ones include:

– Careful manipulation of conjunctive normal form systems: A brute force manipulation of CNFs, and expanding the parentheses in polynomial products that come from large OR-clauses, would cause an immediate exponential explosion of the size of the system. However, special care is taken of systems of the form (1) which most commonly appear in algebraic circuits. This prevents a rapid increase of the size of the system, at least for typical cases. In particular, if the g-CNF has only one OR clause of length ≥ 2, i.e. has the form

\[
\bigl( [P_{1,1} \;{=},{\neq}\; 0] \ \text{OR} \ \ldots \ \text{OR} \ [P_{1,r} \;{=},{\neq}\; 0] \bigr) \ \text{AND} \ [P_{2,1} \;{=},{\neq}\; 0] \ \text{AND} \ \ldots \ \text{AND} \ [P_{s,1} \;{=},{\neq}\; 0],
\]

our implementation ensures that the size of any system for which we build a Gröbner basis is simply equal to the sum of the sizes of the input constraints system and the g-CNF system.

– “Lazy” signal computation method: In order to find the values of expressions such as (‘1’ or x), (‘0’ and x), (when ‘1’ : const else x) etc., we do not compute x. This gives a significant speed-up, especially when the signals whose values we need to verify are localized in a relatively small part of a large circuit.

– Verification flow control: The user can control a number of verification process options, such as whether to spend more time on the Gröbner basis computation of a given bit vs. forking; whether to attempt to save time by skipping the $x \in GF(2^k)$ constraints, which makes false negatives (but not false positives) possible; etc.
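The “lazy” computation listed above can be sketched with deferred signals. This is a minimal illustration under assumed names (`Signal`, `lazy_or`, `lazy_and`, `lazy_when` are hypothetical and do not reflect the tool's C data structures): a signal is only evaluated when its value can still affect the result.

```python
class Signal:
    """A signal whose value is computed on demand; counts how often it is evaluated."""
    def __init__(self, compute):
        self._compute = compute
        self.evaluations = 0

    def value(self):
        self.evaluations += 1
        return self._compute()

def lazy_or(a, b):
    """('1' or x) is '1' without looking at x."""
    return 1 if a.value() == 1 else b.value()

def lazy_and(a, b):
    """('0' and x) is '0' without looking at x."""
    return 0 if a.value() == 0 else b.value()

def lazy_when(cond, then_sig, else_sig):
    """(when cond : then else other) evaluates only the selected branch."""
    return then_sig.value() if cond.value() == 1 else else_sig.value()

one = Signal(lambda: 1)
zero = Signal(lambda: 0)
const = Signal(lambda: 7)
# Stand-in for a signal buried deep in a large circuit.
expensive = Signal(lambda: sum(range(10**6)) % 2)

results = (lazy_or(one, expensive), lazy_and(zero, expensive),
           lazy_when(one, const, expensive))
print(results, expensive.evaluations)
```

None of the three expressions touches the expensive signal, which is the source of the speed-up when the verified signals depend on a small part of the circuit.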

4.3 Verification Result

The verification process can have three possible outcomes:


1. For all checks, all crucial bit values are computed and match the expected values.

2. One of the checks (including checks added by “fork on unresolved bit”) fails because the value of one of the crucial bits is opposite to the expected value specified in the check file.

3. One of the checks (including checks added by “fork on unresolved bit”) fails to compute one of the crucial bit values due to insufficient time or memory.

“BLUEVERI” is primarily targeted at giving a formal proof of the correctness of the design. In case of a failure (situation 2 or 3) it does not automatically generate a counterexample but instead provides symbolic values for as many signals of the circuit as the time and memory constraints permitted it to compute. An interactive bug tracing interface then allows the user to browse the graph of signals and view their values in the form of symbolic rational expressions and algebraic systems. ECC algorithms are usually described and proved by human designers in a very similar form: algebraic manipulations with variables over finite fields. So in most cases the ability to see the algebraic expressions for intermediate steps of the computation allows the programmer to compare the execution of the circuit directly to the original human-readable description of the ECC given in a book or a paper, and to find the bug.

However, if an explicit counterexample is necessary, it can be found manually in situation 2 (the opposite statement proved explicitly) by a sequence of binary subdivisions of the set of admissible input signals. That is: fix bit 1 of input signal 1 to ’0’ and run the test, then fix bit 1 of input signal 1 to ’1’ and run the test. One of these two subtests must explicitly fail, giving an explicit value of bit 1 of signal 1 of the counterexample. Set bit1-sig1 to this value and do a fork test for bit2-sig1 in the same fashion. Continue until the values of all bits of all input signals of the counterexample are computed.
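The binary subdivision procedure can be sketched as follows. The toy specification, the planted bug, and the brute-force oracle `subtest_fails` are assumptions made for the illustration; in practice each oracle call would correspond to one run of the tool with some input bits fixed.

```python
from itertools import product

N_BITS = 4

def spec(bits):
    """Intended behaviour of the toy circuit: parity of the input bits."""
    return sum(bits) % 2

def circuit(bits):
    """Toy implementation with a planted bug on a single input pattern."""
    return 1 - spec(bits) if bits == (1, 0, 1, 1) else spec(bits)

def subtest_fails(prefix):
    """Stand-in for one verification run with the leading input bits fixed to prefix."""
    free = N_BITS - len(prefix)
    return any(circuit(prefix + rest) != spec(prefix + rest)
               for rest in product((0, 1), repeat=free))

def find_counterexample():
    """Fix one bit at a time, always keeping a value for which the subtest fails."""
    prefix = ()
    if not subtest_fails(prefix):
        return None                      # nothing to find: the check passes
    for _ in range(N_BITS):
        prefix += (0,) if subtest_fails(prefix + (0,)) else (1,)
    return prefix

counterexample = find_counterexample()
print(counterexample)
```

The search needs at most one subtest per input bit after the initial failing run, since a failure with the bit set to ’0’ makes the ’1’ subtest unnecessary.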

In situation 3 (timed out / out of memory) nothing can be said about the circuit for sure. However, if the structure of the circuit reflects the original algebraic ideas of the ECC well enough, it is very likely that an excessive runtime is due to the presence of a bug in the implementation. In that case a counterexample cannot be found by either binary subdivision or the Gröbner basis approach, because finding a solution to an algebraic system is always more time and memory consuming than simply showing that the set of solutions is non-empty. One practical attempt at resolving this problem is to perform the following two mutually complementary computations in parallel until one of them succeeds:

1) Attempt to obtain a formal proof of correctness by running BLUEVERI with gradually increasing time and memory limits.

2) Attempt to obtain a counterexample by continuously running a “good old” random sampling test.

5 Experimental Results

If there is no restriction on time and memory, the verification process is guaranteed to prove or disprove the specification in the check file. In what follows we give two simple examples (subsections 5.1 and 5.2) where this is accomplished within a reasonable amount of time, demonstrating the power of reasoning at the Galois field level as opposed to the Boolean level. For complex, real-life designs (as exemplified in subsection 5.4) we have found it useful to help BLUEVERI by manually partitioning the search space, resulting in very little use of the “forking” feature described earlier. In addition, in some instances care is taken to specify the circuit in otherwise equivalent forms to aid BLUEVERI in keeping down the size of its internal rational expressions and the complexity of the algebraic systems it generates; this was not necessary in the two examples below.

5.1 The Uncorrectable Error Flag of a Sample Reed-Solomon Decoder

As a first example, we consider a Reed-Solomon code with symbols belonging to a finite field $GF(q)$ with $q = 2^k$ elements for some integer k. We shall assume that the length of this code is $n = 2^k - 1$. Let d denote the number of check symbols of the Reed-Solomon code. We assume that this Reed-Solomon code has been furnished with a decoder that is capable of correcting any one symbol error, and can detect up to d − 1 different errors. This decoder has a number of different components, one of which is responsible for the computation of the uncorrectable error flag. This flag is a single Boolean output that is raised whenever the decoder has detected 2, 3, or up to d − 1 errors, and kept low whenever the error scenario corresponds to a single error, or alternatively whenever there is no error.

For our choice of Reed-Solomon code, the d-symbol syndrome of this Reed-Solomon code can be computed from a (potentially corrupted) encoded vector $v \in GF(q)^n$ using the formula

\[
S_i = \sum_{j=0}^{n-1} v_j \omega^{ij}
\]

for $i \in \{0, \ldots, d-1\}$, where ω denotes a primitive element of the field and $\omega^{ij}$ means ω raised to the power ij. Denote the error vector affecting v by e, so that $e \in F_q^n$ and $v = e + x$, where $x \in F_q^n$ is the uncorrupted codeword. The vector x has zero syndrome, therefore v can be replaced by e in the syndrome computation:

\[
S_i = \sum_{j=0}^{n-1} e_j \omega^{ij} \tag{4}
\]

The design of the uncorrectable error flag for this scenario is a well understood problem; for the sake of demonstration we deduce what might be a reasonable method to test it directly through formal methods. It can easily be seen from (4) that if there is only one error in e then the syndromes have the form

\[
S_i = \mathrm{const}_1 \cdot (\mathrm{const}_2)^i
\]

and satisfy the following condition: $S_i S_{i+2} = S_{i+1}^2$ for $i = 0, \ldots, d-3$. Furthermore, it is also known that whenever e has at least one error and at most d errors, one or more of the $\{S_i\}_{i=0}^{d-1}$ is nonzero. Therefore one can compute the uncorrectable error flag through the following code, written using BLUEVERI VHDL style semantics:


    t_comp : for i in 0 to d-3 generate
      t(i) <= add( mult(s(i),s(i+2)) , square(s(i+1)) );
    end t_comp;

    snz <= is_nonzero(s(0)) or ... or is_nonzero(s(d-1));
    tnz <= is_zero(s(0)) or ... or is_zero(s(d-1))
           or
           is_nonzero(t(0)) or ... or is_nonzero(t(d-3));

    UE <= snz and tnz;

    8 bit symbols
    symbol errors   expected UE   BLUEVERI               input bits   SXS
    1               false         Success after 0.1 s.   16           Success after 14 s.
    2               true          Success after 1 s.     32           Gives up after 24 h.
    3               true          Success after 1 s.     48           N/A
    4               true          Success after 33 m.    64           N/A
    5               true          Gives up after 6 h.    80           N/A

    4 bit symbols
    symbol errors   expected UE   BLUEVERI               input bits   SXS
    1               false         Success after 0.1 s.   8            Success after 0.7 s.
    2               true          Success after 1 s.     16           Success after 3 s.
    3               true          Success after 1 s.     24           Success after 55 m.
    4               true          Success after 33 m.    32           Gives up after 24 h.
    5               true          Gives up after 6 h.    40           N/A

Table 1 Experimental results for the formal verification of the uncorrectable error flag of a single error correct, multiple error detect Reed-Solomon decoder. SXS refers to SixthSense, a bit-level formal verification tool set developed at IBM.

As written above, snz and tnz represent two distinct systems of equations which BLUEVERI will treat independently of each other. On the other hand, BLUEVERI will attempt to establish whether tnz (for example) is true or false by examining the properties of s(0), ... s(d-1), t(0), ... t(d-3) simultaneously, as opposed to testing whether each s(i), t(i) is zero or not individually.

In order to test the ability of a model checker to prove the correctness of this implementation of the uncorrectable error flag, we assume that the syndrome generation portion of the decoder has been proved correct separately; this task is in fact generally computationally simpler than the one currently at hand. We then build a module that accepts inputs e_m(0)...e_m(t-1) (for the error magnitudes) and inputs l(0)...l(t-1) (for the error locations), where t is the maximum number of errors one can inject into the decoder during the test; in this particular example, for the uncorrectable error flag to be correct, it is known that t = d − 1. This module emulates the syndrome generator and computes s(0)...s(d-1) using the equation $s(i) = \sum_{j=0}^{t-1} l(j)^i \, \mathtt{e\_m}(j)$ (as per Equation 4), and then passes the resulting syndromes to a module that computes the uncorrectable error flag as previously described.
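The test setup described here (a syndrome emulator feeding the uncorrectable error flag) can be mimicked by brute force on a small field. The sketch below assumes GF(16) with modulus $x^4 + x + 1$ and d = 4; it uses the single-error condition $S_i S_{i+2} = S_{i+1}^2$ for the t(i) signals and simply enumerates all one- and two-error patterns, which is exactly the kind of exhaustive reasoning that the symbolic approach avoids.

```python
from itertools import combinations, product

K, MOD = 4, 0b10011     # GF(16) via x^4 + x + 1; an illustrative choice
D = 4                   # number of syndromes (check symbols) in this toy setup

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def syndromes(e_m, l):
    """Emulated syndrome generator: s(i) = sum_j e_m(j) * l(j)^i (sum is XOR)."""
    s = []
    for i in range(D):
        acc = 0
        for mag, loc in zip(e_m, l):
            acc ^= gf_mul(mag, gf_pow(loc, i))
        s.append(acc)
    return s

def ue_flag(s):
    """UE = snz and tnz, with t(i) = s(i)*s(i+2) + s(i+1)^2."""
    t = [gf_mul(s[i], s[i + 2]) ^ gf_mul(s[i + 1], s[i + 1]) for i in range(D - 2)]
    snz = any(v != 0 for v in s)
    tnz = any(v == 0 for v in s) or any(v != 0 for v in t)
    return snz and tnz

NONZERO = range(1, 1 << K)
# Every single error keeps the flag low ...
single_ok = all(not ue_flag(syndromes([m], [loc]))
                for m, loc in product(NONZERO, NONZERO))
# ... and every double error (distinct nonzero locations) raises it.
double_ok = all(ue_flag(syndromes([m0, m1], [l0, l1]))
                for l0, l1 in combinations(NONZERO, 2)
                for m0, m1 in product(NONZERO, NONZERO))
print(single_ok, double_ok)
```

For two errors the first t signal equals $e_0 e_1 (l_0 + l_1)^2$, which is nonzero whenever the magnitudes are nonzero and the locations distinct, so the enumeration merely confirms what the algebra already shows.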

In order to test a variety of error scenarios, we can place constraints on e_m(i) and l(i). For example, one can restrict the test to have exactly two errors by specifying the following constraints:

    e_m(0) != 0, e_m(1) != 0, l(0) != 0, l(1) != 0
    add(l(0),l(1)) != 0, e_m(2) = ... = e_m(t-1) = 0

Note that in a field of characteristic 2, addition is equivalent to subtraction, and hence the addition constraint effectively enforces l(0) != l(1). These constraints can be specified in a BLUEVERI check file as equal/not equal to zero conditions on multivariate polynomial expressions. When BLUEVERI examines the dependencies of the UE signal, it finds that it depends on snz and tnz. BLUEVERI must either resolve that both are true, or that at least one of them is false. As described earlier, this is accomplished by attempting to compute the Gröbner basis of various systems of equations related to the constraints and the expressions defining snz and tnz. Similar experiments can be conducted by updating the constraints to specify “at least two, but not more than y errors” where y is a number between 2 and d − 1.

In order to test the capability of BLUEVERI as applied to this problem and contrast it with that of a formal prover (we chose SixthSense, IBM’s state-of-the-art formal and semi-formal verification tool set, for that purpose), we set up a test with d = 8, k = 8 (using GF(256)) and with the capability to inject from 2 up to 7 errors at arbitrary locations, since the corresponding Reed-Solomon decoder is supposed to be able to detect all those errors. We also set up a parallel test with k = 4, which is a considerably simpler problem for a Boolean oriented formal verification system such as SixthSense [12]. Neither the SixthSense nor the BLUEVERI experiments involved any special tuning of the VHDL or of the tool to improve the outcomes.

We refer the reader to Table 1. The experiments were performed on a single processor (POWER6 processor @ 5GHz running AIX), and SixthSense was run as a single software thread mainly orchestrating redundancy removal and SAT algorithms. In this set of experiments, BLUEVERI was configured to reason about the circuit with the variables (due to inputs or constraints) belonging to the algebraic closure of the fields. In essence this means that we did not constrain the variables to belong to the field GF(256) (resp. GF(16)) depending on whether the symbols used were 8 bit (resp. 4 bit) symbols. The consequence of this is that although the BLUEVERI results are listed under the 8-bit column, they in fact hold for any field size, including larger field sizes which would be even harder for a bit-level verification system to handle. Both formal systems were able to prove the correctness of the uncorrectable error flag under the single error scenario quite easily, but SixthSense was not able to prove the correctness of this flag in the double error case in the amount of time indicated in the table. In order to test the sensitivity of SXS to the field size, we performed a similar experiment for a Reed-Solomon code defined over GF(16). In this case we saw better results from SixthSense, since we were able to prove the correctness of the double and triple error detect cases but not of the four error case. It is worth noting that the field size determines many important properties of an error control code, including the total codeword length, and thus it cannot be modified for the purposes of formal verification, since the resulting code is entirely different and, in all likelihood, not applicable to the original problem.

5.2 Computing Error Magnitudes in a Reed-Solomon Code

One of the tasks that an error control decoder for a code defined over multibit (q > 2) symbols must perform is to compute the locations of the symbols in error and then to compute the multibit pattern that one must add to those locations in order to correct the codeword. This multibit pattern is called the error magnitude. Suppose that there are t errors in a codeword, and let s(0), ..., s(t-1) be the first t syndromes (note that this example is for a different setting than the example in the previous subsection). From (4), we can derive that the error magnitude computation can be carried out using the equation

\[
\begin{pmatrix} \mathtt{e\_m}(0) \\ \vdots \\ \mathtt{e\_m}(t-1) \end{pmatrix}
=
\begin{pmatrix}
1 & \cdots & 1 \\
l(0) & \cdots & l(t-1) \\
\vdots & \ddots & \vdots \\
l(0)^{t-1} & \cdots & l(t-1)^{t-1}
\end{pmatrix}^{-1}
\begin{pmatrix} s(0) \\ \vdots \\ s(t-1) \end{pmatrix}
\]

The inverse matrix above can be derived analytically. It is well known that the matrix being inverted is nonsingular if and only if the locations l(i) are all distinct from each other. This restriction can be specified through $\binom{t}{2}$ constraints, each of which is a polynomial with two monomials. We refer the reader to Table 2, where we show that in this case BLUEVERI was able to show the correctness of the corresponding circuit with up to 8 errors, while SixthSense was unable to finish the double error case within the time allocated. As in the previous subsection, in this particular example the result for BLUEVERI is actually field size independent, since it exploits only the algebraic properties of the symbols. It is worth noting that the Gröbner basis machinery in BLUEVERI does get involved in proving the correctness of this circuit. This is because the inversion of the Vandermonde matrix results in rational expressions (as opposed to plain polynomial expressions) whose denominator could be zero. The task of the Gröbner basis machinery here is then to show that the denominator is not zero given the assumptions on the inputs, so that BLUEVERI can proceed with the corresponding algebraic simplifications leading to the desired result.

              8 bit symbols                                    4 bit symbols
    errors    BLUEVERI        input bits   SXS                   input bits   SXS
    2         Succ. 2 s.      32           Gives up after 24h    16           Succ. 0.6s
    3         Succ. 2.1 s.    48           N/A                   24           Succ. 16m
    4         Succ. 2.1 s.    64           N/A                   32           Gives up after 24h
    5         Succ. 2.3 s.    80           N/A                   40           N/A
    6         Succ. 3.1 s.    96           N/A                   48           N/A
    7         Succ. 49.4 s.   112          N/A                   56           N/A
    8         Succ. 8m        128          N/A                   64           N/A
    9         Succ. 53m       144          N/A                   72           N/A

Table 2 Experimental results for the formal verification of the error magnitude computation stage of a Reed-Solomon code.
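For concrete field elements, the error magnitude equation amounts to solving a Vandermonde system. The sketch below does this numerically in GF(16) by Gauss-Jordan elimination; the field, the sample locations and magnitudes, and the helper names are assumptions for illustration, whereas the verified circuit inverts the matrix analytically and BLUEVERI reasons about it symbolically.

```python
K, MOD = 4, 0b10011     # GF(16) via x^4 + x + 1; an illustrative choice

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Inverse of a nonzero element: a^(2^K - 2)."""
    return gf_pow(a, (1 << K) - 2)

def solve(matrix, rhs):
    """Gauss-Jordan elimination over GF(2^K); assumes the matrix is nonsingular."""
    n = len(rhs)
    rows = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p) for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def syndromes(e_m, l):
    """s(i) = sum_j e_m(j) * l(j)^i for i = 0..t-1."""
    out = []
    for i in range(len(l)):
        acc = 0
        for mag, loc in zip(e_m, l):
            acc ^= gf_mul(mag, gf_pow(loc, i))
        out.append(acc)
    return out

def error_magnitudes(l, s):
    """Solve the Vandermonde system whose row i holds l(j)^i."""
    vandermonde = [[gf_pow(loc, i) for loc in l] for i in range(len(l))]
    return solve(vandermonde, s)

locations = [2, 5, 11]       # distinct nonzero locations (sample values)
magnitudes = [7, 1, 9]       # sample error magnitudes
recovered = error_magnitudes(locations, syndromes(magnitudes, locations))
print(recovered)
```

If two locations coincide the matrix becomes singular and the pivot search fails, which is the numerical counterpart of the zero denominator that the distinctness constraints rule out.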

5.3 Solving the Key Equation for a Reed-Solomon Code

The standard decoding procedure for correcting random errors in Reed-Solomon codes consists of 3 stages. In the first stage syndromes are computed from the potentially corrupted encoded vector. This has been described in previous sections and, as noted, verification of syndrome generators is very easy using BLUEVERI technology, even for very long codewords and large numbers of syndromes. This is essentially due to the fact that these circuits have no branching and are linear functions of the data, which is the easiest case for BLUEVERI. The next stage involves using the syndromes to compute a pair of polynomials which can be used by the final stage for locating and correcting the errors. One polynomial, Λ(x), is called the error locator polynomial since its roots can be used to determine the locations of the errors in the codeword. The other polynomial, Ω(x), is called the error evaluator polynomial since it is used along with the error locator polynomial to compute the values of the errors at each error location. This pair of polynomials is a minimal solution to a particular congruence equation called the “Key Equation”. If we have r syndromes, $S_0, \ldots, S_{r-1}$, then we first form the syndrome polynomial

\[
S(x) = \sum_{j=0}^{r-1} S_j x^j .
\]

Λ(x) and Ω(x) are the least degree solution to the equation:

\[
S(x) \Lambda(x) \equiv \Omega(x) \pmod{x^r}
\]

There are several algorithms for solving this equation, but a recent algorithm [13] due to Sarwate and Shanbhag has the advantages of not requiring divisions and of reducing the latency to only 1 Galois field multiplication followed by 1 addition. We decided to implement this algorithm as part of a high performance Reed-Solomon decoder operating over the field GF(1024) (i.e. k = 10) and with r = 14 syndromes. In this situation we can correct up to 7 random errors. Although we were implementing a published algorithm, we needed to be sure our implementation was correct, so we attempted to verify it using BLUEVERI.
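The Key Equation itself can be checked numerically for a known error pattern. The sketch below does not implement the Sarwate-Shanbhag algorithm; it only builds the locator directly from assumed error locations, under the assumed convention $\Lambda(x) = \prod_j (1 + X_j x)$, and confirms that $S(x)\Lambda(x) \bmod x^r$ has degree below the number of errors. The field GF(16), r = 6, and the sample errors are illustrative choices.

```python
K, MOD = 4, 0b10011     # GF(16) via x^4 + x + 1; an illustrative choice
R = 6                   # number of syndromes in this toy example

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def poly_mul_mod(a, b, r):
    """Product of polynomials a and b (coefficients, lowest degree first) modulo x^r."""
    out = [0] * r
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < r:
                out[i + j] ^= gf_mul(ai, bj)
    return out

locations = [3, 7]      # X_j: distinct nonzero field elements (sample values)
magnitudes = [5, 12]    # e_j: nonzero error values (sample values)

# Syndromes S_i = sum_j e_j * X_j^i for i = 0..R-1.
S = []
for i in range(R):
    acc = 0
    for e, X in zip(magnitudes, locations):
        acc ^= gf_mul(e, gf_pow(X, i))
    S.append(acc)

# Assumed locator convention: Lambda(x) = prod_j (1 + X_j x).
Lam = [1]
for X in locations:
    Lam = poly_mul_mod(Lam, [1, X], R)

Omega = poly_mul_mod(S, Lam, R)
print(Omega)
```

With two errors, all coefficients of Ω(x) from degree 2 upward vanish, and the constant coefficient equals the sum of the error magnitudes.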

Our implementation was a single latched module which iterates 14 cycles in order to produce the locator and evaluator polynomials. Since the current implementation of BLUEVERI requires pure combinatorial logic (no latches), we needed to transform our implementation into a latch-free version. Our circuit operates in a simple feedback manner and always iterates exactly 14 times, therefore the transformation simply consisted of cascading 14 copies of our circuit with the latches removed, feeding the output of each stage as input to the next stage. Even though our circuit has no divisions, it does have one branch point during each iteration which depends strongly on the input syndromes. Thus, without constraining the syndromes, BLUEVERI would be unable to decide which branch to take and would need to use its forking mechanism to explore all branches. Since verification of this circuit is an extremely difficult task, we decided to avoid using the forking mechanism, and instead we added constraints to the input syndromes to force the algorithm to take a single path through the circuit. Since there is only 1 branch point per iteration and we iterate 14 times, this would mean at most $2^{14} = 16384$ cases to be handled.

In general it can be a difficult problem to find the constraints on the inputs in order to force an algorithm to take a particular branch; this is why the forking mechanism is needed to handle some situations. For Sarwate’s algorithm the branch points are determined by the sequence of “discrepancies”, which are inner products between the candidate current locator polynomial coefficients and sliding sequences of syndromes. However, one can show that the branching is also controlled by particular determinants of a matrix built from the syndromes. For any k ≤ 7, the locator polynomial for exactly k errors can also be determined from the (14 − k) × (k + 1) syndrome matrix constructed in the following way. In row 0 we put syndromes 0 to k, in row 1 syndromes 1 to k + 1, and generally in row i we put syndromes i to k + i. As an example, the syndrome matrix for 5 errors would be:

\[
\begin{pmatrix}
s_0 & s_1 & s_2 & s_3 & s_4 & s_5 \\
s_1 & s_2 & s_3 & s_4 & s_5 & s_6 \\
s_2 & s_3 & s_4 & s_5 & s_6 & s_7 \\
s_3 & s_4 & s_5 & s_6 & s_7 & s_8 \\
s_4 & s_5 & s_6 & s_7 & s_8 & s_9 \\
s_5 & s_6 & s_7 & s_8 & s_9 & s_{10} \\
s_6 & s_7 & s_8 & s_9 & s_{10} & s_{11} \\
s_7 & s_8 & s_9 & s_{10} & s_{11} & s_{12} \\
s_8 & s_9 & s_{10} & s_{11} & s_{12} & s_{13}
\end{pmatrix}
\]

If there are exactly 5 errors then this matrix would have rank 5, and a generator of its 1 dimensional nullspace gives the coefficients of an error locator polynomial. In general, for k errors the determinant conditions on the (14 − k) × (k + 1) syndrome matrix are that it has rank k and that the principal k × k subdeterminant is nonzero. Since the principal k × k matrix has full rank, the condition of rank k can be verified by testing that the determinants of all (k + 1) × (k + 1) submatrices formed from k + 1 sequential rows of this matrix are zero. In light of this, we generated input constraints for BLUEVERI by asserting the appropriate symbolic determinants to be zero or nonzero. The situation of k errors does not uniquely determine the branching in Sarwate’s algorithm, since it also depends on the determinants of the k − 1 principal minors of size less than k. Each of these can be zero or nonzero, and there are thus $2^{k-1}$ subcases for k errors. So the input constraints for each subcase of k errors contain the k determinants of principal minors and the 14 − 2k determinants of all (k + 1) × (k + 1) sliding blocks. Thus the input constraints for k errors are determinants of submatrices of size at most k + 1.
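The bookkeeping behind these constraints (which submatrices have their determinants asserted) can be sketched with symbolic syndrome names. The function names below are hypothetical, and the code only enumerates the submatrices; it does not compute determinants.

```python
R = 14   # number of syndromes, as in the decoder described above

def syndrome_matrix(k, r=R):
    """(r - k) x (k + 1) matrix whose row i holds syndromes i .. k + i."""
    return [[f"s{i + j}" for j in range(k + 1)] for i in range(r - k)]

def sliding_blocks(k, r=R):
    """All (k+1) x (k+1) submatrices formed from k + 1 sequential rows."""
    m = syndrome_matrix(k, r)
    return [m[i:i + k + 1] for i in range(len(m) - k)]

def principal_minors(k, r=R):
    """Leading principal submatrices of sizes 1 .. k."""
    m = syndrome_matrix(k, r)
    return [[row[:size] for row in m[:size]] for size in range(1, k + 1)]

k = 5
matrix = syndrome_matrix(k)
blocks = sliding_blocks(k)       # determinants asserted to be zero
minors = principal_minors(k)     # the size-k one is nonzero, smaller ones vary by subcase
subcases = 2 ** (k - 1)
print(len(matrix), len(matrix[0]), len(blocks), len(minors), subcases)
```

For k = 5 this reproduces the 9 × 6 matrix shown above, the 14 − 2k = 4 sliding blocks, the k = 5 principal minors, and the 16 subcases.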

During our verification exercise, we were able to verify many of the cases described previously. Most cases we were unable to handle were due to size constraints: on some occasions the intermediate expressions exceeded address space limitations, since some of the tools we were using were only available as 32 bit applications and not as 64 bit applications. For cases when we could not completely verify k errors using all 14 syndromes, we reduced the problem by either reducing the number of syndromes or setting some of the syndromes to zero, since the number of syndromes is constrained to be at least twice the number of errors. We summarize below what we were able to demonstrate:

1. All cases of 1, 2, or 3 errors with all 14 syndromes.
2. All cases of 4 errors with 12 syndromes (last subcase took 2 days to verify).
3. All cases of 5 errors with 10 syndromes.
4. Most subcases of 6 errors with 12 syndromes and first syndrome set to zero.
5. Most subcases of 7 errors with 14 syndromes and first 2 syndromes set to zero.

The reader may notice that cases with more syndromes than twice the number of errors require more effort on the part of BLUEVERI. This is because even after finding the correct error locator, on successive iterations the locator is scaled in Sarwate’s algorithm by a computed constant in $GF(2^k)$. Since the actual value of this constant depends on the syndromes, performing these successive iterations in BLUEVERI involves multiplying the coefficients of the locator polynomial by a symbolic expression in the syndromes, which causes their size to keep growing and makes subsequent computations more expensive.


This helps explain why, for example, proving the case for 4 errors and 12 syndromes just barely worked, even though only the first 8 syndromes are needed to get the locator polynomial. The difficulty comes in subsequent iterations involving the remaining syndromes, which continue to increase the size of the locator polynomial, making IS_ZERO questions much more difficult. Even these partial verification results are remarkable, since we are not aware of any other verification tools which could be successfully applied to a circuit of this complexity.

5.4 A Note on a Real Life Application of BLUEVERI

The examples in the previous subsections are meant to illustrate the capabilities of a formal verification system such as BLUEVERI when compared to Boolean oriented systems. In our experience, the implementation of a real-life encoder/decoder employs many custom algorithm variants as one tries to address problems that are specific to the application at hand. In the most significant application of BLUEVERI so far, we have succeeded in proving the correctness of an ECC of a POWER microprocessor that is based on the mathematics of Reed-Solomon codes. The correctness criteria included all correctable and uncorrectable cases for which we had given guaranteed behavior (e.g. recovery from complete chip failures and detection of multiple errors). The ECC, from the decoder’s perspective, had over 1000 bits of input, including several tens of bits worth of configuration parameters. The number of syndrome bits produced by the decoder was over 100. Although our testing did include testing the behavior of the encoder with analytically generated symbolic syndromes, it was not limited to it: approximately half of the total testing time exercised the more than 1000 bits of input of the circuit directly. The number of Galois field and Boolean elements in the corresponding graph is over 100,000 (compared to at most a few hundred in the previous experiments). Because of the complexity of the problem, we had to case-split to create 1M different tests, each of which formally exercised a particular region of the test space. It took about 2 weeks to prove the correctness of the entire design on a 10 machine Linux (x86) cluster.

6 Technical Solutions

The BLUEVERI tool leverages IBM’s existing front-end and simulation tools and flows. For language processing we are using Portals, IBM’s HDL compiler, which accepts the synthesizable subset of the standard VHDL and Verilog languages. Portals performs behavioral synthesis on procedural HDL and produces an elaborated netlist; for BLUEVERI this is stored in the DADB logic database. DADB is a box-pin-net logic database used for verification flows, such as topology checking and simulator model build, which supports client transforms via a dynamically loaded plugin architecture.

Portals was modified for BLUEVERI to support the blackboxing of function calls, enabling the logic to be represented in a form amenable to analysis by BLUEVERI. High level language constructs which are output by Portals into the netlist, such as case statements, can be synthesized into lower level representations by the use of DADB client transforms.


[Figure 5: block diagram. Recoverable labels: "My circuit", compiled with PORTALS and then processed in DA-DB with a dynamically linked set of our functions; a set of black-boxed functions, each of which is a Galois field operation or a bit operation as described in Section II.a; library of GF arithmetic functions; GF arithmetic function verifier (by exhaustive checking); check file (a set of algebraic systems of polynomials on Galois field input variables and expected values of crucial signals); "Settings for checks" script parser; C library for manipulations with rational expressions over Galois field; Buchberger's algorithm for Gröbner basis computation; user interface for bug tracing; report pass/fail; log file.]

Fig. 5 General schema of BLUEVERI tool.

The BLUEVERI analysis tool has its own custom input netlist format. A netlist translator was built as a DADB client to enable the tool flow from Portals into BLUEVERI.

The MESA logic simulator is a high performance cycle simulator used for functional verification within IBM. MESA simulation models are built from logic netlists in DADB by using model build clients.

The BLUEVERI code is written in C. For the computation of Gröbner bases we use “SINGULAR” [11], a powerful program for algebraic geometry computations distributed under the general public license. BLUEVERI runs SINGULAR as a child process and uses “expect.h” (a standard C library) for sending queries to and receiving results from SINGULAR’s Gröbner basis engine. The monomial ordering used in the Gröbner basis computation is the graded reverse lexicographic ordering.

7 Future Research Directions

7.1 Converting Branching to Polynomial Operations

The current design of BLUEVERI requires that the predicates associated with essential branch points in the algorithm be decidable from the input constraints. When this is not the case, a forking algorithm allows one to explore alternative paths automatically, including additional constraint hypotheses. A possible alternative design would be to convert IS_ZERO operations into symbolic polynomials. This is only possible if we restrict ourselves to operating over a fixed finite field $GF(2^k)$ (recall that currently BLUEVERI provides the option of making this restriction). Once k is fixed, since we know that every element of the field satisfies $x^{2^k} = x$, we have that $x^{2^k-1}$ is 1 if $x \neq 0$ and 0 otherwise. So we have:

\[
\begin{aligned}
\text{IS\_NONZERO}(x) &\mapsto x^{2^k-1} \\
\text{IS\_ZERO}(x) &\mapsto x^{2^k-1} + 1
\end{aligned}
\]

We view the boolean values 0, 1 as elements of our finite field $GF(2^k)$. The boolean operations AND, OR, NOT, XOR can also be described by polynomial operations:

\[
\begin{aligned}
\text{AND}(x, y) &\mapsto xy \\
\text{NOT}(x) &\mapsto x + 1 \\
\text{OR}(x, y) &\mapsto xy + x + y \\
\text{XOR}(x, y) &\mapsto x + y
\end{aligned}
\]

Given that IS_ZERO and boolean operations on the elements 0, 1 can be described by polynomials, we can finally give a polynomial definition for the WHEN_ELSE(b, x, y) branching construct, which yields x when b = 1 and y when b = 0.

\[
\text{WHEN\_ELSE}(b, x, y) \mapsto bx + (1 + b)y = b(x + y) + y
\]
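These polynomial encodings can be checked exhaustively for a small fixed field. The sketch below assumes GF(16) with modulus $x^4 + x + 1$, an illustrative choice; the Python function names are hypothetical and only mirror the operations named in the text.

```python
K, MOD = 4, 0b10011     # GF(16) via x^4 + x + 1; an illustrative choice

def gf_mul(a, b):
    """Multiply a and b in GF(2^K): shift-and-add with reduction by MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def is_nonzero(x):
    return gf_pow(x, (1 << K) - 1)          # x^(2^k - 1)

def is_zero(x):
    return is_nonzero(x) ^ 1                # x^(2^k - 1) + 1

def p_and(x, y):
    return gf_mul(x, y)                     # xy

def p_not(x):
    return x ^ 1                            # x + 1

def p_or(x, y):
    return gf_mul(x, y) ^ x ^ y             # xy + x + y

def p_xor(x, y):
    return x ^ y                            # x + y

def when_else(b, x, y):
    return gf_mul(b, x ^ y) ^ y             # b(x + y) + y

FIELD = range(1 << K)
predicates_ok = all(is_zero(x) == int(x == 0) and is_nonzero(x) == int(x != 0)
                    for x in FIELD)
booleans_ok = all(p_and(a, b) == (a & b) and p_or(a, b) == (a | b)
                  and p_xor(a, b) == (a ^ b) and p_not(a) == 1 - a
                  for a in (0, 1) for b in (0, 1))
branching_ok = all(when_else(1, x, y) == x and when_else(0, x, y) == y
                   for x in FIELD for y in FIELD)
print(predicates_ok, booleans_ok, branching_ok)
```

The check also makes the cost visible: IS_ZERO raises its argument to the power $2^k - 1$, which is cheap on concrete elements but can enlarge symbolic expressions considerably.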

Among the previous polynomial definitions, the most costly transformation is the IS_ZERO function, which can greatly increase the size of expressions. The current mechanism in BLUEVERI was intended to help control the size of intermediate expressions, and clearly should be used in cases when the boolean predicates to WHEN_ELSE can be decided. However, the tradeoff between using the forking mechanism and trying to directly encode boolean operations as polynomials should be explored. Particularly in situations where the degree k of the finite field is small, the polynomial version could be very effective.

7.2 Avoiding Gröbner Bases

The transformations from boolean and conditional operations described in the previous section are special cases of a more general phenomenon. Any purely combinatorial function which takes inputs from a finite field and produces an output in the same field can be described by a polynomial function of its inputs. With the changes described in the previous section for booleans and conditionals, BLUEVERI can generate a polynomial describing any such combinatorial circuit. If we assign the resulting polynomial to a new variable, and use a lexicographic ordering among variables, then as observed in [7] the set of equations we generate is automatically a Gröbner basis, assuming we have no constraints on the inputs. If we have constrained inputs, then we first compute a Gröbner basis for the constraints. Then for each combinatorial block in the circuit, we produce polynomials representing each of its outputs as functions of its inputs. We choose a term ordering where each output variable ranks higher than any of its inputs. The set of polynomial equations generated in this fashion is automatically a Gröbner basis, because the leading term of each circuit polynomial is a distinct output variable and is therefore coprime to every other leading term. Since we require that the input variables range over $GF(2^k)$, we have relations of the form $x^{2^k} - x$ for each input variable. This guarantees that our input constraints, and consequently all our output polynomials, generate what is called a radical ideal. A radical ideal is an ideal with the property that if some power of an element is in the ideal, then the element itself is in the ideal. In this situation, with very little work we can produce the generators for the radical ideal describing our circuit. We can defer the expression blowup described in the previous section by the use of new variables. For example, if the input of the IS_ZERO function was a large expression, we would assign the input to a new variable $V$, and express the output of the IS_ZERO function in terms of the new symbol $V$, i.e. $V^{2^k-1} + 1$.

In this way we are deferring the problem of taking large powers of multiterm polynomials. Now, after generating the equations describing the circuit, if we have some output signal that we need to verify is always 0 for any input satisfying our constraints, we need to verify that the output signal is contained in our ideal of equations. Since the generators form a Gröbner basis, this becomes the problem of reducing a polynomial modulo the set of generators. For generators coming from the circuit, reduction is the same as substitution for the output variable. For generators of the constraint ideal, reduction is a multivariate version of division with remainder. This reduction of the output signal modulo the generators of our ideal is the only computationally expensive step. If we can find an efficient way to implement this reduction operation, then we have a potentially efficient way to verify any combinatorial circuit operating on elements of a finite field.
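A small worked instance may make the reduction step concrete. The circuit, the signal names $a$, $b$, $V$, $Z$ and the constraint below are hypothetical and chosen only for illustration; the variable ordering is assumed to be lexicographic with $Z > V > a > b$, and all arithmetic is in characteristic 2.

```latex
% Hypothetical circuit over GF(2^k); requires amsmath.
% Input constraint:   a + b = 0
% Circuit:            V = a + b,   Z = IS_ZERO(V)
% Property to verify: the signal Z + 1 is 0 for every admissible input.
\begin{align*}
g_0 &= a + b               && \text{(constraint generator)} \\
g_1 &= V + a + b           && \text{(circuit generator, leading term } V\text{)} \\
g_2 &= Z + V^{2^k-1} + 1   && \text{(circuit generator, leading term } Z\text{)}
\end{align*}
% The field equations a^{2^k} + a and b^{2^k} + b also belong to the
% ideal but are not needed for this particular reduction.
\begin{align*}
Z + 1
  &\xrightarrow{\;g_2\;} V^{2^k-1} + 1 + 1 = V^{2^k-1}
     && \text{(substitute for } Z\text{)} \\
  &\xrightarrow{\;g_1\;} (a + b)^{2^k-1}
     && \text{(substitute for } V\text{)} \\
  &\xrightarrow{\;g_0\;} (b + b)^{2^k-1} = 0
     && \text{(replace } a \text{ by } b\text{)}
\end{align*}
```

The remainder is 0, so $Z + 1$ lies in the ideal and the property holds. The middle step shows where the cost arises: expanding $(a + b)^{2^k-1}$ term by term is expensive, whereas first reducing $g_1$ by $g_0$ gives $V \equiv 0$ and avoids the expansion altogether. The order in which reductions are applied is therefore one place where an efficient implementation could save work.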

8 Conclusions

In this article we presented a novel technique for designing and verifying circuits based on the mathematics of Galois fields. At the heart of our approach is the idea of exposing operations on Galois fields directly to a verification layer (encapsulated in a tool called BLUEVERI) which leverages powerful techniques from algebraic geometry to reason about the properties of the abstract Galois field rational expressions generated in the circuit. Our circuits are specified using a subset of existing Hardware Description Languages and, as such, remain fully synthesizable, an important attribute for reducing the possibility of human error in the design process.

We demonstrated the value of the ideas we proposed in the context of two problems representative of the type of situations encountered when designing error correcting codes. In both instances, we showed that BLUEVERI can significantly outperform conventional bit-level formal verification. We outlined a successful application of the BLUEVERI system to prove correctness of a complex production error correcting code implemented on a POWER microprocessor, which otherwise could not be verified conclusively with traditional verification methods.

Acknowledgements The authors would like to thank Shmuel Winograd and Geert Janseen of IBM Research for insightful discussions that helped shape the solution.


References

1. P. J. Meaney, L. A. Lastras-Montano, V. K. Papazova, E. Stephens, J. S. Johnson, L. C. Alves, J. A. O'Connor and W. J. Clarke, IBM zEnterprise redundant array of independent memory subsystem, IBM Journal of Research and Development, Jan-Feb, Vol. 56, 2012.

2. L. A. Lastras-Montano, P. J. Meaney, E. Stephens, B. M. Trager, J. O'Connor and L. C. Alves, A new class of array codes for memory storage, Information Theory and Applications Workshop (ITA), 2011, pp. 1–10, 6–11 Feb. 2011.

3. R. E. Bryant, Graph Based Algorithms for Boolean Function Manipulation, IEEE Trans. on Computers, vol. C-35, pp. 677–691, August 1986.

4. R. E. Bryant and Y-A. Chen, Verification of Arithmetic Functions with Binary Moment Diagrams, Design Automation Conference 1995.

5. U. Kebschull and W. Rosenstiel, Efficient graph-based computation and manipulation of functional decision diagrams, European Conference on Design Automation, pp. 278–282, 1993.

6. S. Morioka, Y. Katayama and T. Yamane, Towards Efficient Verification of Arithmetic Algorithms over Galois Fields, Proc. Computer Aided Verification 2001, vol. 2102, pp. 465–477.

7. Jinpeng Lv, Priyank Kalla and Florian Enescu, Verification of Composite Galois Field Multipliers over $GF((2^m)^n)$ Using Computer Algebra Techniques, Proc. IEEE International High Level Design Validation and Test Workshop 2011, pp. 136–143.

8. David Cox, John Little and Donald O'Shea, Ideals, Varieties and Algorithms. Undergraduate Texts in Mathematics. Springer, 2010. ISBN: 0-387-35650-9.

9. Rudolf Lidl and Harald Niederreiter, Finite Fields. Encyclopedia of Mathematics and Its Applications, Volume 20. Cambridge University Press, 1997. ISBN: 0-521-39231-4.

10. Oliver Pretzel, Error-Correcting Codes and Finite Fields. Oxford Applied Mathematics and CS Series. Oxford University Press, 1992. ISBN: 0-198-59678-2.

11. W. Decker, G.-M. Greuel, G. Pfister and H. Schönemann, Singular 3-1-3: A computer algebra system for polynomial computations. http://www.singular.uni-kl.de (2011).

12. H. Mony, J. Baumgartner, V. Paruthi, R. Kanzelman and A. Kuehlmann, Scalable automated verification via expert-system guided transformations, Formal Methods in Computer-Aided Design, 2004, pp. 159–173.

13. D. Sarwate and N. Shanbhag, High-Speed Architectures for Reed-Solomon Decoders, IEEE Trans. on VLSI Systems, Vol. 9, No. 5, pp. 641–655, Oct. 2001.
