Master’s thesis Deterministic Test Vector Compression/Decompression Using an Embedded Processor and Facsimile Coding by Jon Persson LITH-IDA-EX–05/033–SE 2005-03-21



Master’s thesis

Deterministic Test Vector Compression/Decompression Using an Embedded Processor and Facsimile Coding

by Jon Persson

LITH-IDA-EX–05/033–SE

2005-03-21

Supervisor and Examiner: Erik Larsson, Department of Computer and Information Science at University of Linköping


Abstract

Modern semiconductor design methods make it possible to design increasingly complex systems-on-a-chip (SOCs). Testing such SOCs becomes highly expensive because test data volumes grow rapidly, which in turn lengthens test times. Several approaches exist that compress the test stimuli and add hardware for decompression. This master’s thesis presents a test data compression method based on a modified facsimile code. An embedded processor on the SOC is used to decompress the data and apply it to the cores of the SOC. Using hardware that already exists reduces the need for additional hardware.

Test data may be rearranged in ways that affect the compression ratio. Several modifications are discussed and tested. To be realistic, a decompression algorithm has to be able to run on a system with limited resources. An assembler implementation shows that the proposed method can be effectively realized in such environments. Experimental results where the proposed method is applied to benchmark circuits show that the method compares well with similar methods.

A method of including the response vector is also presented. This approach makes it possible to abort a test as soon as an error is discovered, while still compressing the data used. To correctly compare the test response with the expected one, the data needs to include don’t care bits. The technique uses a mask vector to mark the don’t care bits. The test vector, response vector and mask vector are merged in four different ways to find the best one.

Keywords: System-on-a-chip (SOC) testing, test data compression/decompression, processor-based testing, variable-to-variable-length codes, facsimile coding, deterministic testing.


Acknowledgements

Many thanks to Erik Larsson, my supervisor and examiner at IDA (Department of Computer and Information Science at University of Linköping), who helped me a lot, not only by explaining how SOCs are tested but also with all practical issues and, last but not least, with a lot of reasoning about upcoming ideas and problems.

I would also like to thank Kedarnath Balakrishnan, University of Texas, for sending me ISCAS’89 test vectors and Syed Irtiyaz Gilani at IDA for the D695 test and response vectors. Without the ability to test the method with realistic data I would know nothing about the quality of the method.

Thanks to all my friends who have discussed the subject with me, and the biggest thanks to Louise, who encouraged me all the way during this work. I love you!

Thanks,
Jon


Abbreviations

ATE     Automatic Test Equipment
ATPG    Automatic Test Pattern Generator
BIST    Built-In Self-Test
CPU     Central Processing Unit
CUT     Core Under Test
DSP     Digital Signal Processor (or Processing)
FDR     Frequency-Directed Run-Length
I/O     Input/Output
MISR    Multi-Input Signature (or Shift) Register
NOP     Dummy Instruction in Assembler
SOC     System-on-a-Chip
TAM     Test Access Mechanism
X       Don’t Care Bit
XOR     Exclusive Or


Contents

1 Introduction
  1.1 System-on-a-Chip (SOC)
  1.2 Testing
    1.2.1 Don’t Care Bits (X’s)
  1.3 Examine the Response
    1.3.1 MISR

2 The Problem
  2.1 High Test Data Volume
    2.1.1 Solution
    2.1.2 Using an Embedded Processor
    2.1.3 What is Given

3 Related Work
  3.1 Decompressing Using on-chip Circuitry
  3.2 Built-In Self-Test (BIST)
  3.3 Decompressing Using Processor
    3.3.1 Decompression Using Linear Operations

4 Design and Implementation
  4.1 Facsimile Standard
    4.1.1 One-Dimensional
    4.1.2 Two-Dimensional
  4.2 Compressing Test Vectors
    4.2.1 Plain Facsimile, No Reorder
    4.2.2 Greedy Sort
    4.2.3 Frequency-Directed Run-Length (FDR)
    4.2.4 Modifying Facsimile Codewords
    4.2.5 Local Search
    4.2.6 The Complete Proposed Method
    4.2.7 Example
  4.3 Decompression
  4.4 Decompression in Assembler
  4.5 Including Response Vectors
    4.5.1 Using Mask
    4.5.2 Two Bits Each
    4.5.3 Merged Test and Response Vector

5 Experimental Results
  5.1 Compressing Test Vectors Only
    5.1.1 Unix gzip utility
    5.1.2 Local Search Heuristic
  5.2 Including Response Vectors

6 Discussion
  6.1 Proposed Method
    6.1.1 Local Search
    6.1.2 Discarded Techniques
  6.2 Storing Previous Vector
    6.2.1 Store in Memory
    6.2.2 Core Feedback
  6.3 Synchronization
  6.4 Comparing the Result
  6.5 Including Response
  6.6 Complex Methods

7 Conclusions and Further Work
  7.1 Conclusions
  7.2 Further Work

A Assembler Code

Chapter 1

Introduction

This chapter gives an introduction to System-on-a-Chip (SOC) and to testing.

1.1 System-on-a-Chip (SOC)

“System-on-a-chip (SoC or SOC) is an idea of integrating all components of a computer system into a single chip. It may contain digital, analogue, mixed-signal, and often radio-frequency functions – all on one chip.”

Wikipedia (http://en.wikipedia.org)

Modern semiconductor design methods and manufacturing technologies enable the creation of a complete system on one single die, the so-called system chip or SOC [4]. Such system chips are typically very large Integrated Circuits (ICs), consisting of millions of transistors and containing a variety of hardware modules [4]. These modules, called cores, are reusable, predesigned silicon circuit blocks. Embedded cores incorporated into system chips cover a very wide range of functions such as processors, MPEG coding/decoding, memory etc.

Figure 1.1: Example SOC (blocks shown: Processor, Memory, Core A, Core B)

Throughout this report we will look at the simple example SOC shown in Figure 1.1. It contains a processor, a memory and two small cores, the ones that will be tested.

1.2 Testing

When testing a core (referred to as the core-under-test or CUT), the core is set to a starting state and the system clock is applied, bringing the core to its next state, the response, which is examined. If the response is the expected one, the test has passed. A core has a number of such tests to pass, each checking for different modelled faults that can arise.

To easily set the starting state, the core is equipped with scan chains, shift registers connected to the inner parts of the core. The scan chains are first filled with a test vector by shifting it in, then the system clock is applied and the response is captured into the scan chains. The response is shifted out and compared with the expected response. The data bus used to transfer the test data, called the Test Access Mechanism (TAM), is dedicated to testing only. Often the width of the TAM differs from the number of scan chains. To handle the interface between the scan chains and the TAM, every core is surrounded by a wrapper, which applies the incoming bits to the right scan chain. As mentioned, a number of test vectors are applied when testing a core; together these test vectors constitute the test data, sometimes referred to as a test cube. The example SOC has n scan chains per core and the TAM is four bits wide (Figure 1.2).

Figure 1.2: Example SOC with TAM, wrappers and scan chains (blocks shown: ATE, TAM, Processor, Memory, and Core A and Core B, each inside a wrapper with scan chains 1 to n)

Where do the test vectors come from? Given the specification of the core, an automatic test pattern generator (ATPG) can produce the test sets and responses. If the core is delivered as a black box, where the buyers have no information about its internals, the vendor of the core will deliver test sets and the corresponding responses. The test vectors are then usually stored in automatic test equipment (ATE), which is connected to the SOC when testing it and sends over the vectors one by one.


1.2.1 Don’t Care Bits (X’s)

Each test vector is designed to test the SOC for one or more modelled faults. Every such fault involves only some of the input bits, leaving other bits that can be either 0 or 1. These are called don’t care bits and are represented with X’s in the test and response vectors. For the test vectors used in this report, don’t care bits can make up as much as 95% of the total number of bits [2]. A good compression algorithm should maximize the compression ratio by carefully assigning each don’t care bit to either 0 or 1.
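As a rough illustration of why the assignment matters, the sketch below fills every X with the most recently specified bit, which tends to produce long runs of equal bits. This repeat-last-bit rule is only an assumption made for illustration, not the assignment strategy developed later in this report, and the function names are hypothetical.

```python
def fill_repeat_last(vector, default="0"):
    """Replace each X with the latest specified bit (or `default` before any bit is seen)."""
    filled = []
    last = default
    for bit in vector:
        if bit != "X":
            last = bit
        filled.append(last)
    return "".join(filled)


def count_runs(bits):
    """Count maximal runs of equal bits."""
    return sum(1 for i, b in enumerate(bits) if i == 0 or b != bits[i - 1])


vector = "000000XXXXXXX101XXXXXXXX0"
filled = fill_repeat_last(vector)
# A deliberately poor assignment for comparison: alternate 0 and 1 in the X positions.
alternating = "".join(b if b != "X" else "01"[i % 2] for i, b in enumerate(vector))

print(filled, count_runs(filled))
print(alternating, count_runs(alternating))
```

Both assignments are valid tests, since the specified bits are unchanged, but the first one has far fewer runs and is therefore easier for a run-length based code to compress.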

1.3 Examine the Response

There are mainly two ways to examine the response. The first is to compare every bit of the response with the expected, modelled response. This approach detects all possible errors and can also be used to terminate a test as soon as the first error is detected, so-called abort-on-fail. This way less time is spent testing faulty SOCs.

The second approach is to compress the response before it is compared with an equally compressed expected response. The response can be compressed without keeping all the information as long as the probability of accepting a faulty SOC is low. One straightforward compression algorithm would be to count the 1’s in the responses; if the sum differs from the expected one, the SOC is faulty. If several faults occur, there is a possibility that the sum ends up at the correct value and the SOC wrongly passes the test. Today the most commonly used approach is to place a multi-input signature register (MISR) at the outputs of each core.
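The weakness of the simple 1’s-count can be seen in a few lines. The sketch below is only an illustration with made-up response vectors; the function name is hypothetical.

```python
def ones_count(responses):
    """Compact a list of response vectors into one number: the total count of 1's."""
    return sum(vector.count("1") for vector in responses)


expected = ["1010", "0110"]
single_fault = ["1011", "0110"]  # one bit flipped: the count changes
double_fault = ["1001", "0110"]  # two bits flipped in opposite directions: the count is unchanged

print(ones_count(expected), ones_count(single_fault), ones_count(double_fault))
```

The double fault yields the same count as the fault-free response, so a faulty SOC would wrongly pass. This is the kind of masking that a MISR makes much less likely.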

1.3.1 MISR

A multi-input signature register (MISR) is a small circuit designed to create a signature of the data sent to its inputs. When all the tests are completed, the signature is compared with the desired signature; if they are equal the MISR signals that the tests have passed, otherwise a fail is signalled. The desired signature is small enough to be stored inside the MISR itself. In Figure 1.3 an example MISR is shown together with a signature calculated from some example inputs. The ⊕-symbols represent modulo-2 adders: an odd number of 1’s at the inputs sets the output to 1, an even number of 1’s sets the output to 0. As seen in the example MISR, the output from register 3 is connected to the modulo-2 adders in front of registers 1 and 2. Which modulo-2 adders are connected to the output of the last register can be changed to give the MISR other characteristics. Differently connected MISRs produce signatures of different quality. [9]

Time       Inputs         Registers
           D1  D2  D3     reg1  reg2  reg3
0           0   1   1       0     0     0
1           1   1   0       0     1     1
2           0   0   0       0     0     1
3           1   0   1       1     1     0
4           1   1   0       1     1     0
5 (end)                     1     0     1    (signature)

Figure 1.3: Example MISR and signature calculations

Due to its cyclical behaviour, a MISR distributes faults evenly over all its registers. This makes multiple faults less likely to produce the correct signature. It can be shown that the probability of erroneous inputs generating the correct signature is nearly 2^-n, where n is the number of registers in the MISR. Figure 1.4 shows where the MISRs are added to the SOC. [9]
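The signature calculation of Figure 1.3 can be reproduced with a short simulation. The sketch below assumes that each register takes the XOR of its own input bit and the previous register’s output, and that the last register’s output is additionally fed to the adders in front of registers 1 and 2; this wiring is inferred from the description and the register values in the figure. The function and parameter names are hypothetical.

```python
def misr_signature(inputs, n_regs=3, taps=(0, 1)):
    """Simulate a MISR and return the final register contents.

    inputs: one tuple of n_regs bits per clock cycle.
    taps:   0-based indices of the registers whose input adder also
            receives the output of the last register (assumed wiring).
    """
    regs = [0] * n_regs
    for word in inputs:
        feedback = regs[-1]
        nxt = []
        for i in range(n_regs):
            bit = word[i]
            if i > 0:
                bit ^= regs[i - 1]  # output of the previous register
            if i in taps:
                bit ^= feedback     # output of the last register
            nxt.append(bit)
        regs = nxt
    return regs


# Inputs (D1, D2, D3) at times 0 to 4, as in Figure 1.3.
inputs = [(0, 1, 1), (1, 1, 0), (0, 0, 0), (1, 0, 1), (1, 1, 0)]
print(misr_signature(inputs))

# The same inputs with D1 flipped at time 2 give a different signature.
faulty = [(0, 1, 1), (1, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0)]
print(misr_signature(faulty))
```

Changing `taps` corresponds to connecting the feedback to other modulo-2 adders, which is the degree of freedom mentioned above.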

Figure 1.4: Example SOC with MISRs added (as Figure 1.2, with a MISR added to each core)


Chapter 2

The Problem

2.1 High Test Data Volume

With rapidly increasing complexity in SOCs, the test data grows just as fast. This brings two problems: the ATE needs more memory to store the test data, and the tests take longer to perform. The longer test times in particular are a huge bottleneck in the production of SOCs and increase the production cost.

2.1.1 Solution

What can be done to reduce the size of the test data? One popular approach is to use compression techniques. The test data for a particular SOC is compressed and stored in the ATE. This requires less memory than the original data, which solves the first problem. When testing a SOC, the compressed data is sent to the SOC, where a decompressor restores the original data. The decompressor is usually some extra circuitry added to the SOC. The decompressed, original data is then sent to the CUT as if the ATE had sent the original data directly.

The same amount of data still has to be applied to each core, even if it was compressed when sent to the SOC. How can the second problem, long test time, be solved? Luckily the technique described above helps in this matter as well. ATEs are usually built with slower electronics than SOCs, so a SOC has to operate at very slow speed during test. When an ATE sends compressed data, only the parts of the SOC receiving this data need to operate at the same clock speed as the ATE. The decompressor and the rest of the SOC can operate at a higher clock speed, applying the test vectors in less time.

Figure 2.1: Example SOC with processor connected to TAM (blocks shown: ATE, I/O, Processor, Memory, TAM, and Core A and Core B with wrappers, scan chains and MISRs)

2.1.2 Using an Embedded Processor

Many of today’s SOCs have embedded processors to perform calculations specific to the operation of the SOC. Is it possible to use the embedded processor to decompress a compact version of the test data? This question was the starting point for this thesis. The idea is illustrated in Figure 2.1. The ATE sends precomputed, compressed test vectors to the SOC. The embedded processor then restores the original test vectors using a decompression algorithm and applies them to the cores.

It turned out that this approach had already been tested with good results, but there exist more compression algorithms that have not been tested yet.

2.1.3 What is Given

Figure 2.1 shows the layout of the example SOC that is to be tested. The following requirements are fulfilled for this SOC:

• The ATE is capable of using the I/O module to send data to the right place in memory. Not only can it send the compressed data; the decompression program can also be transferred and executed.

• The memory is of sufficient size to hold the decompression program, a buffer for the incoming data and one copy of the longest test vector.

• There exists controlling circuitry which synchronizes the data flow from the ATE and also sends enable signals to the right parts of the system.

• Test vectors are available and come, one set for each core, in the following format:

2
25
000000XXXXXXX101XXXXXXXX0
0000000011111XXX111100XXX

The first two rows specify how many vectors there are and how long each vector is. Don’t care bits are represented with X’s.
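A reader for this format might look like the sketch below. It assumes that each vector occupies one row after the two controlling rows, which is how the example is understood here; the function name and the error handling are hypothetical.

```python
def parse_test_set(text):
    """Parse a test set: row 1 = number of vectors, row 2 = bits per vector,
    then (assumed) one vector per row, using the characters 0, 1 and X."""
    rows = [row.strip() for row in text.splitlines() if row.strip()]
    count, length = int(rows[0]), int(rows[1])
    vectors = rows[2:]
    if len(vectors) != count:
        raise ValueError(f"expected {count} vectors, found {len(vectors)}")
    for vector in vectors:
        if len(vector) != length or set(vector) - set("01X"):
            raise ValueError(f"malformed vector: {vector!r}")
    return vectors


example = """2
25
000000XXXXXXX101XXXXXXXX0
0000000011111XXX111100XXX
"""

vectors = parse_test_set(example)
specified = sum(len(v) - v.count("X") for v in vectors)
print(len(vectors), "vectors,", specified, "specified bits")
```

Counting the specified bits is a useful side product, since the don’t care bits are the ones a compression algorithm is free to assign.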

What is left to be done is the compression algorithm and the decompression program. The compressed data only deals with the vectors. The first two controlling rows may be transferred as they are, telling the processor how many vectors to decompress and how many bits each of them contains. The output from the compression program is a stream of bits which, when decompressed, yields the same vectors as in the original data with one exception: each X is replaced with either 0 or 1. Since each vector is a stand-alone test, the vectors produced from the compressed data may be in a different order than the original vectors. What matters is that the response vectors are reordered in exactly the same way.

This report presents a technique that compresses the test data above into this:

2
25
111100101101101001111011010110

The two vectors are represented by 30 bits instead of the 50 bits in the original data.


Chapter 3

Related Work

This chapter discusses some of the different solutions to the problem of reducing test data volume. Decompression techniques using both hardware and software are represented.

3.1 Decompressing Using on-chip Circuitry

As long as the decompression scheme is not too difficult, decompression can be done in hardware using additional circuitry inside the SOC. The main advantage is that these techniques can be used in any SOC without requiring an embedded processor and/or memory. The cost is the area overhead inside the SOC needed to fit the decompression circuitry.

Frequency-Directed Run-Length (FDR) code (described in Section 4.2.3) is used by Chandra and Chakrabarty [3]. Their report shows that FDR code outperforms other compression schemes for the special case of test vector compression. They also apply the technique to difference vectors, where every vector only represents the difference from the previous one. This way longer runs of 0’s, and thus better compression, are achieved. The result is also compared with more complex methods such as gzip and compress, two Unix utilities for compressing data files.
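The idea of difference vectors can be sketched as follows. The example assumes fully specified vectors (all X’s already assigned) and uses made-up data; the function names are hypothetical and the sketch does not include the FDR coding itself.

```python
def xor_bits(a, b):
    """Bitwise XOR of two equally long bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def difference_vectors(vectors):
    """Keep the first vector; replace each later one by its XOR with the previous vector."""
    return [vectors[0]] + [xor_bits(prev, cur) for prev, cur in zip(vectors, vectors[1:])]


def restore(diffs):
    """Inverse operation: XOR each difference with the previously restored vector."""
    vectors = [diffs[0]]
    for diff in diffs[1:]:
        vectors.append(xor_bits(vectors[-1], diff))
    return vectors


vectors = ["11010011", "11010111", "11000111"]
diffs = difference_vectors(vectors)
print(diffs)
```

When consecutive vectors are similar, the difference vectors consist mostly of 0’s, which suits a code designed for long runs of 0’s. The price is that the decompressor must keep the previous vector available.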


Gonciari and Al-Hashimi [5] propose a Huffman-coding algorithm using patterns of variable lengths. The method aims to solve three problems in SOC testing: on-chip area overhead, high test data volume and test application time.

3.2 Built-In Self-Test (BIST)

A BIST technique is only applicable when the interior of a module is known. The idea is to create the test vectors somewhat randomly and see which modelled faults these random vectors cover. It is important that the randomizing algorithm produces exactly the same vectors each time. Such an algorithm is called a pseudo-random generator. Those faults not covered by the random vectors are tested with ordinary, deterministic test vectors.

Hwang and Abraham [6] suggest a BIST technique where each pseudo-random pattern is shifted cyclically to cover more simple faults. To avoid testing the circuit with a high number of unnecessary vectors, the distance to the next good vector is sent for each test. For the deterministic part of the method, they encode the difference between each deterministic vector and one of the random ones. The probability that a similar random vector exists is high.

3.3 Decompressing Using Processor

A few other methods where an embedded processor is used for decompression already exist. Compressed data is sent to the memory. A decompression program running on the embedded processor decompresses the vectors and applies them to the CUT.

Jas and Touba [7] present an approach where only the difference from the previous vector is sent. The vectors are divided into blocks of a certain length and only blocks with changed bits are sent. The compressed data consists of a list of blocks. For each block the position must be saved, together with one bit that tells whether the block is the last one for a vector. The vectors in the test set are reordered to achieve less difference between the vectors.

Balakrishnan and Touba [1] use matrix operations to compress the test data. For a number n, the first n^2 bits form an n × n matrix. A set of equations is then solved to find two vectors which, together with a XORing algorithm, can reproduce the original matrix. XORing two or more bits works like this: if an odd number of the bits are 1 the result is 1; otherwise the result is 0. If the equations can’t be solved, the first n bits are sent uncompacted.

3.3.1 Decompression Using Linear Operations

The method proposed by Balakrishnan and Touba [2], where linear operations are used to decompress the test set, is presented here in more detail. The scheme for testing a SOC with this method is based on a word-based XOR operation. The length of the words is usually chosen to be the word length of the processor; 32 is the most common today. The method basically works like this:

1. All the words from the compressed data are sent to the embedded memory.

2. A pseudo-random number generator inside the SOC creates a number of integers smaller than or equal to the number of words in the compressed data.

3. The integers point out words in the compressed data, which are XORed together bitwise.

4. The resulting word is sent to the CUT.

5. Unless all tests are done, repeat from step 2.

A pseudo-random number generator gives what seems to be a series of random numbers, but the important thing is that each time it is restarted it produces exactly the same series. This way it is known which words from the compressed data will be XORed together to create a certain word in the decompressed data. The compressed data needs to be created in such a way that, when decompressed, it corresponds to the original data. This is done by creating linear equations using all that is known from above.
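The five steps above can be sketched as a short loop. This is only an illustration under several assumptions: Python’s `random.Random` stands in for the on-chip pseudo-random generator (which is not specified here), three words are combined per output word, indices are 0-based, and the compressed words are arbitrary 2-bit values. The function and parameter names are hypothetical.

```python
import random


def decompress(words, n_outputs, words_per_output=3, seed=1):
    """Rebuild n_outputs words by XORing pseudo-randomly selected compressed words."""
    rng = random.Random(seed)  # restarting with the same seed gives the same index series
    outputs = []
    for _ in range(n_outputs):
        word = 0
        for _ in range(words_per_output):
            index = rng.randrange(len(words))  # step 2: pick a word in the compressed data
            word ^= words[index]               # step 3: bitwise XOR of the selected words
        outputs.append(word)                   # step 4: this word would be sent to the CUT
    return outputs


compressed = [0b00, 0b11, 0b10, 0b01, 0b10, 0b10, 0b11, 0b00]  # step 1: words in memory
first = decompress(compressed, n_outputs=20)
second = decompress(compressed, n_outputs=20)
print(first == second)
```

Because the index series is repeatable, the compressor can know in advance which compressed words contribute to each output word, which is what makes it possible to set up the linear equations.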

Compressed data: W1 W2 W3 W4 W5 W6 W7 W8

Test Vector 1    W1⊕W5⊕W8    W2⊕W6⊕W7    W3⊕W4⊕W5    W1⊕W2⊕W7
Test Vector 2    W2⊕W5⊕W6    W3⊕W6⊕W8    W4⊕W1⊕W6    W1⊕W7⊕W8
Test Vector 3    W2⊕W3⊕W4    W1⊕W3⊕W6    W5⊕W7⊕W2    W8⊕W4⊕W2
Test Vector 4    W5⊕W4⊕W8    W2⊕W1⊕W4    W7⊕W6⊕W3    W4⊕W5⊕W6
Test Vector 5    W8⊕W2⊕W6    W1⊕W5⊕W7    W2⊕W4⊕W6    W6⊕W7⊕W8

Original test set:

Test Vector 1    10 XX 0X XX
Test Vector 2    X1 XX 1X 1X
Test Vector 3    X0 XX 1X XX
Test Vector 4    11 XX X1 0X
Test Vector 5    01 XX X0 0X

Figure 3.1: Forming example test set from compressed bits

The method is illustrated with an example where the situation is as in Figure 3.1. W1-W8 refer to words, usually of length 32. To reduce the size of this example, the word length is set to 2.

In this case the pseudo-random generator produces the series 1-5-8-2-6-7-3-4-5-1-2-7-2-5-6-3-6-8..., and the words corresponding to these numbers are XORed together three at a time inside the box in the middle of Figure 3.1. Setting the XORed expressions equal to the original data, found at the bottom of Figure 3.1, gives us the following equations.

W1⊕W5⊕W8 = 10        W5⊕W7⊕W2 = 1X
W2⊕W6⊕W7 = XX        W8⊕W4⊕W2 = XX
W3⊕W4⊕W5 = 0X        W5⊕W4⊕W8 = 11
W1⊕W2⊕W7 = XX        W2⊕W1⊕W4 = XX
W2⊕W5⊕W6 = X1        W7⊕W6⊕W3 = X1
W3⊕W6⊕W8 = XX        W4⊕W5⊕W6 = 0X
W4⊕W1⊕W6 = 1X        W8⊕W2⊕W6 = 01
W1⊕W7⊕W8 = 1X        W1⊕W5⊕W7 = XX
W2⊕W3⊕W4 = X0        W2⊕W4⊕W6 = X0
W1⊕W3⊕W6 = XX        W6⊕W7⊕W8 = 0X

All equations are then split to handle one bit each. Those where the right-hand side is X can be removed, since whatever the bits of the left-hand side are, they will always satisfy a don’t care bit. W1(1) refers to the first bit of W1 and W1(2) to the second. This gives the following equations.

W1(1)⊕W5(1)⊕W8(1) = 1
W1(2)⊕W5(2)⊕W8(2) = 0
W3(1)⊕W4(1)⊕W5(1) = 0
W2(2)⊕W5(2)⊕W6(2) = 1
W4(1)⊕W1(1)⊕W6(1) = 1
W1(1)⊕W7(1)⊕W8(1) = 1
W2(2)⊕W3(2)⊕W4(2) = 0
W5(1)⊕W7(1)⊕W2(1) = 1
W5(1)⊕W4(1)⊕W8(1) = 1
W5(2)⊕W4(2)⊕W8(2) = 1
W7(2)⊕W6(2)⊕W3(2) = 1
W4(1)⊕W5(1)⊕W6(1) = 0
W8(1)⊕W2(1)⊕W6(1) = 0
W8(2)⊕W2(2)⊕W6(2) = 1
W2(2)⊕W4(2)⊕W6(2) = 0
W6(1)⊕W7(1)⊕W8(1) = 0

Solving this system of equations is the major task in this method. Balakrishnan and Touba show that every such system of equations can be made solvable by increasing the size of the compressed data. This small example has one solution in the following values of the compressed data:


W1 = 00
W2 = 11
W3 = 10
W4 = 01
W5 = 10
W6 = 10
W7 = 11
W8 = 00
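A system of single-bit XOR equations like the one in this example can be solved mechanically. The sketch below uses Gaussian elimination over GF(2), with each equation stored as a bit mask; it is an illustration written for this example, not the solver used in [2], and all names are hypothetical. Free variables are set to 0, so the assignment it finds may differ from the values listed above while still satisfying every equation.

```python
# Index triples and target words from Figure 3.1 (X = don't care).
TRIPLES = [
    (1, 5, 8), (2, 6, 7), (3, 4, 5), (1, 2, 7),
    (2, 5, 6), (3, 6, 8), (4, 1, 6), (1, 7, 8),
    (2, 3, 4), (1, 3, 6), (5, 7, 2), (8, 4, 2),
    (5, 4, 8), (2, 1, 4), (7, 6, 3), (4, 5, 6),
    (8, 2, 6), (1, 5, 7), (2, 4, 6), (6, 7, 8),
]
TARGETS = ["10", "XX", "0X", "XX", "X1", "XX", "1X", "1X", "X0", "XX",
           "1X", "XX", "11", "XX", "X1", "0X", "01", "XX", "X0", "0X"]
WORD_LEN, N_WORDS = 2, 8


def build_equations():
    """One (mask, rhs) pair per specified bit; bit k of mask selects variable k."""
    equations = []
    for triple, target in zip(TRIPLES, TARGETS):
        for bit, value in enumerate(target):
            if value != "X":
                mask = 0
                for word in triple:
                    mask ^= 1 << ((word - 1) * WORD_LEN + bit)
                equations.append((mask, int(value)))
    return equations


def solve_gf2(equations, n_vars):
    """Gaussian elimination over GF(2); returns a bit list, or None if unsolvable."""
    pivots = {}  # pivot column -> (mask, rhs), kept in fully reduced form
    for mask, rhs in equations:
        for col, (pmask, prhs) in pivots.items():
            if (mask >> col) & 1:
                mask ^= pmask
                rhs ^= prhs
        if mask == 0:
            if rhs:
                return None  # contradiction: 0 = 1
            continue
        col = mask.bit_length() - 1
        for other, (pmask, prhs) in list(pivots.items()):
            if (pmask >> col) & 1:
                pivots[other] = (pmask ^ mask, prhs ^ rhs)
        pivots[col] = (mask, rhs)
    solution = [0] * n_vars  # free variables stay 0
    for col, (_, rhs) in pivots.items():
        solution[col] = rhs
    return solution


def satisfies(solution, equations):
    """Check every equation: XOR of the selected variables must equal rhs."""
    return all(
        sum(solution[i] for i in range(len(solution)) if (mask >> i) & 1) % 2 == rhs
        for mask, rhs in equations
    )


equations = build_equations()
solution = solve_gf2(equations, WORD_LEN * N_WORDS)
listed = [int(b) for b in "0011100110101100"]  # W1..W8 as listed above
print(len(equations), satisfies(solution, equations), satisfies(listed, equations))
```

The equations are built directly from the specified bits of the test set, so don’t care positions never constrain the compressed words.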

With this method we have compressed the test set from 40 bits (the original data at the bottom of Figure 3.1) to 16 bits (8 words of 2 bits each). 16 also happens to be the number of specified bits, s_tot, in the original test set. Most often this method only needs a few more bits than s_tot to get solvable equations [2].

The major disadvantage of this method is the amount of memory it requires. For every word it decompresses, the method needs to look up words from different parts of the compressed data; hence all of the compressed data needs to be sent to the system’s memory before decompression can take place. There can also be a problem when solving an enormous system of equations, as it can be too large to solve in a reasonable amount of time. If these factors become an issue, the test set can simply be partitioned and each partition processed one at a time [2]. Partitioning the test set will reduce the overall compression slightly (the larger the partitions, the better the overall compression) [2].


Chapter 4

Design and Implementation

This chapter begins with a description of the facsimile standard. The method is then designed through a number of stages, each adding new features. A decompression algorithm, written in assembler and tested with an emulator for the 8086 processor, is also presented.

4.1 Facsimile Standard

The facsimile coding standard used in this report is the ITU-T Group 3 standard. The idea behind this facsimile coding is that many lines of a printed page are similar to the line just above. Every dot on the paper is coded as either white or black, which is known as a bi-level image. The sender compares the next runs of equally colored dots with the dots right above on the previous line. If they are sufficiently similar, special codewords are sent to the receiver. The receiver, which already has the previous line, can then calculate the length of the runs. The facsimile standard is described in more detail below, following Sayood [8].

In the recommendations for Group 3 facsimile the code is divided into two schemes. The first is a one-dimensional scheme in which the data is coded independently of any other data. The other is two-dimensional, where special codewords are sent using the line-to-line correlations.

Figure 4.1: Two rows of an image. The transition pixels are marked.

4.1.1 One-Dimensional

The one-dimensional coding scheme is a run-length coding scheme in which the next block of data is represented as a series of alternating white runs and black runs. If this scheme is used at the beginning of a line, the first run is always a white run. If the first pixel is a black pixel, then a white run of length zero is sent first.

The run-length code used is a Huffman code, a way of choosing the best-fitting codeword for each situation based on how frequently the situation occurs. Each line of an A4-size document is represented by 1728 pixels. Creating 1728 different Huffman codes is not very practical; instead the code is divided into two parts, m and t, and a run of length ri is expressed as

ri = 64 × m + t for t = 0, 1, . . . , 63; and m = 1, 2, . . . , 27.

The codes for t are called the terminating codes and the codes for m are called the make-up codes. Black and white run lengths also have separate codes. If ri ≤ 63, only a terminating code is used. Otherwise, both a make-up code and a terminating code are used. This coding scheme is generally referred to as a Modified Huffman (MH) scheme.
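The split into a make-up part and a terminating part is a plain integer division. The sketch below only computes m and t; it does not contain the actual Huffman code tables of the standard. The function name split_run is hypothetical, and m = 0 is used here as an assumed convention for "no make-up code".

```python
def split_run(run_length, line_width=1728):
    """Split a run length into (m, t) with run_length = 64 * m + t.

    m selects the make-up code (m == 0 means none is needed) and
    t, in the range 0..63, selects the terminating code.
    """
    if not 0 <= run_length <= line_width:
        raise ValueError("run length must lie within one line")
    return divmod(run_length, 64)
```

For example, split_run(200) gives (3, 8), since 200 = 64 × 3 + 8, and a full white line gives split_run(1728) == (27, 0).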

4.1.2 Two-Dimensional

In the two-dimensional scheme, the key is the transition pixels. A transition pixel is a pixel of a different color than the pixel to the left of it. In Figure 4.1 the transition pixels are marked with dots. Even the leftmost pixel on a row can be a transition pixel: one can think of each row as extended with an imaginary white pixel to the left of the row, so if the first pixel is black it is also a transition pixel. In most documents a row is very similar to its neighbours and the transition pixels will be close to each other. The idea is to encode the position of a transition pixel in relation to a transition pixel on the previous line. This is a modification of a coding scheme called Relative Element Address Designate (READ) code and is often called Modified READ (MR).

Some definitions are needed to explain the coding scheme:

a0: The last pixel encoded so far on the row currently being encoded. Its position and color are known to both encoder and decoder. At the beginning of each line, a0 refers to the imaginary white pixel to the left of the first actual pixel. Often this pixel is a transition pixel, but not always.

a1: The first transition pixel on the same row and to the right of a0. The location of this pixel is known only to the encoder.

a2: The second transition pixel on the same row and to the right of a0. As with a1, its location is known only to the encoder.

b1: The first transition pixel with the opposite color of a0 on the line above and to the right of a0. As the line above is known to both encoder and decoder, as is the value of a0, the location of b1 is also known to both encoder and decoder.

b2: The second transition pixel on the line above and more than one pixel to the right of a0. Also known to both encoder and decoder.

For the implementation of the facsimile standard used in this report, b1 and b2 may be placed to the right of the entire row. If only b2 is to the right, it is placed one pixel to the right. If both are outside, b1 is placed one pixel and b2 two pixels to the right. This is slightly different from Sayood [8], where an additional codeword is mentioned, representing the situation where all the remaining pixels of a row are equally colored.

In Figure 4.2 the example rows are labelled. In this situation the second row is the one currently being encoded and the encoder has encoded the pixels up to the second pixel (marked with a0). The pixel assignments for a slightly different arrangement of black and white pixels are shown in Figure 4.3.

Figure 4.2: The transition pixels are labelled.

If a1 is to the right of b2, we call the coding mode used the pass mode. This mode is coded with 0001. When the decoder receives this code it knows that all the pixels from the last one decoded to the pixel straight below b2 have the same color. For the next round this pixel below b2 is the last pixel known to both encoder and decoder. This is the only case where the last known pixel is not a transition pixel.

If a1 is to the left of or straight below b2, one of two things can happen. The vertical mode is used if the number of pixels from a1 to right under b1 is less than or equal to three. Seven different codes tell the location of a1 in relation to b1. These are:

1: a1 is straight below b1.

011: a1 is to the right of b1 by one pixel.

000011: a1 is to the right of b1 by two pixels.

0000011: a1 is to the right of b1 by three pixels.

010: a1 is to the left of b1 by one pixel.

000010: a1 is to the left of b1 by two pixels.

0000010: a1 is to the left of b1 by three pixels.

Figure 4.3: Two slightly different rows with transition pixels labelled.

After the decoder has received and decoded one of these codes, the pixel at a1 is the last one known to both encoder and decoder and the coding process is continued.

In the case where a1 is to the left of or straight below b2 and the distance to b1 is greater than three, the one-dimensional technique described in Section 4.1.1 is used. To inform the decoder about this mode the code 001 is sent, followed by two sets of Modified Huffman codewords. The first run-length is of the same color as the last decoded pixel and the second of the opposite. These are in fact the runs from a0 to a1 and from a1 to a2. The decoder then adds one pixel with the same color as the first run, and this is the last known pixel for the next round.
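The choice between the three modes described above can be summarised in a few lines. This is a sketch under stated assumptions: positions are integer pixel indices that grow to the right, and the names VERTICAL_CODES and choose_mode are hypothetical. The codewords are the ones listed in this section.

```python
# Offset of a1 relative to b1 (positive = a1 to the right of b1) -> codeword.
VERTICAL_CODES = {
    0: "1",
    1: "011", 2: "000011", 3: "0000011",
    -1: "010", -2: "000010", -3: "0000010",
}


def choose_mode(a1, b1, b2):
    """Return (mode name, mode codeword) for the given pixel positions."""
    if a1 > b2:
        return "pass", "0001"
    offset = a1 - b1
    if abs(offset) <= 3:
        return "vertical", VERTICAL_CODES[offset]
    # Otherwise fall back to run-length coding; the two Modified Huffman
    # codewords that follow the 001 prefix are not produced here.
    return "horizontal", "001"
```

For instance, choose_mode(7, 5, 9) selects the vertical mode with codeword 000011, because a1 lies two pixels to the right of b1.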

4.2 Compressing Test Vectors

4.2.1 Plain Facsimile, No Reorder

This first solution uses plain facsimile code to compress the vectors in the order given in the test cube. Later we will see that reordering the vectors improves the compression ratio. The first line to be coded also needs a previous vector. The algorithm uses an imaginary first vector containing only 0’s, which forces the first vector to be coded with run-length codes only.


A first look at the test data makes clear that the X’s (don’t care bits) need to be assigned 0 or 1 carefully. When the algorithm comes across don’t care bits it tries to set a1 after b2 (see Section 4.1). If it fails it will try to place a1 as close to b1 as possible. If a1 cannot be placed within three steps of b1, the horizontal mode is used, sending run-length codes. In the facsimile standard the run-length codes are compressed. This compression technique is based on the length of one row of pixels for a paper copy, which is fixed. This is not applicable to test vectors, whose lengths differ. Instead of creating a new compression technique for each circuit, the run-length case is not compressed at all. In Section 4.2.3 a better solution is presented.

4.2.2 Greedy Sort

Since each test vector is a separate test, it does not matter in which order the test vectors are applied as long as all of them are applied. The vectors are therefore reordered to achieve better compression. A test data set with n vectors can be reordered in n! ways. With conventional computers it is impossible to test all n! orderings unless n is very small, so a heuristic is necessary.

Even with a heuristic, reordering the test set is a difficult problem, because when one test vector is moved inside the test cube it affects the facsimile code for many other vectors. To start with, the vector that is moved needs to get all its don’t care bits reassigned to achieve better compression. Then it will be coded in relation to its new previous vector. This vector will also force the next vector to be recalculated in the same way, and this will propagate downwards. Only when a vector happens to have its don’t care bits assigned in the same way as before can this chain reaction be broken. Otherwise all following vectors need to be recalculated.

The greedy sort heuristic starts with the imaginary first vector of 0’s and compresses every vector in the test cube with this as the previous vector. The vector with the shortest facsimile code is chosen and acts as the previous vector in the next round. This way the algorithm always chooses the next vector that extends the compressed data the least, until all vectors are included. The biggest disadvantage is that the last remaining vectors are not very well suited to be compressed in relation to each other.
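The greedy selection can be sketched as follows. The facsimile coder itself is not reproduced here: the function stand_in_encode is a deliberately simple placeholder (an assumption of this sketch, not the cost used in the thesis) that copies the bit above into every don’t care position and counts the remaining differences. Any function returning a filled vector and a code length could be passed in its place; all names are hypothetical.

```python
def stand_in_encode(vector, previous):
    """Placeholder for the facsimile coder (NOT the thesis's cost).

    Fills each 'X' with the bit above it and returns the filled vector
    together with the number of bits that differ from `previous`.
    """
    filled = "".join(p if v == "X" else v for v, p in zip(vector, previous))
    cost = sum(f != p for f, p in zip(filled, previous))
    return filled, cost


def greedy_sort(test_cube, encode=stand_in_encode):
    """Repeatedly pick the vector that is cheapest after `previous`."""
    remaining = list(test_cube)
    previous = "0" * len(remaining[0])      # imaginary first vector of 0's
    ordered = []
    while remaining:
        best = min(remaining, key=lambda v: encode(v, previous)[1])
        remaining.remove(best)
        previous, _ = encode(best, previous)  # filled vector becomes the reference
        ordered.append(previous)
    return ordered
```

With the three vectors 11XX, 0X1X and 000X the sketch returns the order 0000, 0010, 1110, each with its don’t care bits filled in.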


4.2.3 Frequency-Directed Run-Length (FDR)

As mentioned in Section 4.2.1, the run-length code used in the facsimile standard is not very suitable for test vector compression. Chandra and Chakrabarty [3] show that FDR codes are easy to decompress and compress test data very well. Their finest characteristic is the ability to code runs of any length.

The FDR code is constructed to give short codewords for short runs and works like this: a codeword consists of two parts, the group prefix and a tail. The group prefix tells which group of run-lengths the codeword belongs to. The first group, A1, has a single 0 as its group prefix, group A2 has 10 as prefix and A3 has 110. This way every next group gets one more leading 1. Given a complete FDR codeword, the group is determined by seeking the first occurrence of the bit 0. If this is found at the kth position, the group is Ak.

The next part is the tail, which points out one of the run-lengths in the group. It consists of the same number of bits as the group prefix: one for group A1, two for A2 and so on. With k bits available, the group Ak includes 2^k different run-lengths: 0 and 1 for group A1, 2-5 for group A2, etc. The first 14 run-lengths are shown in Table 4.1. The right-most column shows the codeword (the prefix and tail concatenated) used for each run-length. The FDR code has the following properties:

• It is easy to extract the prefix and the tail. The prefix is all bits from the beginning up to and including the first 0. The tail is of equal length as the prefix.

• For any codeword the sum of the binary representation of the prefix and the tail equals the run-length that is coded.

• Short run-lengths are coded with shorter codewords.

This next modification uses FDR wherever the original run-length code would have been used.


Group   Run-length   Group prefix   Tail   Codeword
A1      0            0              0      00
A1      1            0              1      01
A2      2            10             00     1000
A2      3            10             01     1001
A2      4            10             10     1010
A2      5            10             11     1011
A3      6            110            000    110000
A3      7            110            001    110001
A3      8            110            010    110010
A3      9            110            011    110011
A3      10           110            100    110100
A3      11           110            101    110101
A3      12           110            110    110110
A3      13           110            111    110111
···     ···          ···            ···    ···

Table 4.1: The first 14 run-lengths and their codewords
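The construction of Table 4.1 can be reproduced with a short encoder and decoder. This is a minimal sketch, assuming codewords are handled as Python strings of '0' and '1'; the function names are hypothetical. It relies on the fact that group Ak starts at run-length 2^k − 2, which is also the binary value of its prefix.

```python
def fdr_encode(run_length):
    """Return the FDR codeword (prefix + tail) for a run length >= 0."""
    k = 1
    while run_length > 2 ** (k + 1) - 3:    # A_k covers 2^k - 2 .. 2^(k+1) - 3
        k += 1
    prefix = "1" * (k - 1) + "0"
    tail = format(run_length - (2 ** k - 2), "0{}b".format(k))
    return prefix + tail


def fdr_decode(bits, start=0):
    """Decode one codeword at bits[start:]; return (run_length, next index)."""
    i = start
    k = 1
    while bits[i] == "1":                   # leading 1s of the group prefix
        k += 1
        i += 1
    i += 1                                  # skip the 0 that ends the prefix
    tail = int(bits[i:i + k], 2)
    return (2 ** k - 2) + tail, i + k
```

For example, fdr_encode(13) yields 110111, the last row of Table 4.1, and fdr_decode("110111") returns the run length 13 together with the index 6 of the next unread bit.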

4.2.4 Modifying Facsimile Codewords

The choice of codewords in the facsimile standard is based on characteristics of paper copies. For this next modification, statistics were gathered on how many times each codeword was used in the compressed data. The ordering algorithm described in Section 4.2.2 favours short codewords, so for this measurement all the codewords needed to be made equally long; otherwise the shorter ones would be used more often than the longer ones only because they are shorter. The statistics are the sum over all six circuits used in the experiments in Chapter 5.

Statistics show that four of the codewords are rarely used. They correspond to the cases where a1 is placed two or three bits to the left or right of b1. One by one these codewords were removed from the method. For all of them the removal reduced the size of the compressed set. The remaining codewords can be changed further to enhance the compression even more. The new codewords can be found in the last column of Table 4.2. Not only do these changes reduce the size of the compressed set, they also make the decompression algorithm simpler and faster.

Situation              Org. code   Times used   New codeword
run-length             001         7250         11
a1 > b2                0001        6990         10
a1 right under b1      1           12403        01
a1 one right of b1     011         2386         001
a1 two right of b1     000011      546          not used
a1 three right of b1   0000011     253          not used
a1 one left of b1      010         2349         000
a1 two left of b1      000010      1120         not used
a1 three left of b1    0000010     877          not used

Table 4.2: Statistics for codewords

50 bits has become 30!

4.2.5 Local Search

As mentioned in Section 4.2.2, ordering the vectors is difficult. Local search is a heuristic, a looping algorithm that works like this: the algorithm starts with a given starting solution, in this case a test set with a specific order. Given this starting solution, the facsimile coding algorithm compresses the data and reports the size of the compressed data. This size is what the heuristic tries to minimize. In each loop the heuristic tries a set of different orderings and calculates the size of the compressed data for each. The one change that gives the best solution, provided it is better than the current one, is taken as the starting point for the next run of the loop. The set of orderings that are tested is determined by a rule. In every loop the algorithm checks all the solutions that can be reached with the rule, called the surroundings, to find a better one. Usually it is a good idea to keep the surroundings very small, hence the name ’local search’. Examples of suitable rules defining the surroundings could be:

• Moving one vector to another place

• Switch place for two adjacent vectors

• Switch place for two arbitrary vectors

The heuristic was added to the modifications described earlier. The last rule in the list above was chosen to define the surroundings. As we will see, the result is not much better than greedy sort. Because of the long execution time of the heuristic, even for these small example cores, it is not suitable, and this modification is therefore not part of the proposed method.
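The loop described in this section, with surroundings defined by swapping two arbitrary vectors, might look like the sketch below. It is an illustration under simplifying assumptions: the vectors are taken as fully specified (no don’t care bits), and the cost of coding a vector after its predecessor is supplied by the caller. The Hamming distance used in the usage note is only a stand-in for the size of the facsimile code; all names are hypothetical.

```python
from itertools import combinations


def hamming(vector, previous):
    """Stand-in cost: number of positions where the two vectors differ."""
    return sum(a != b for a, b in zip(vector, previous))


def total_cost(order, cost):
    """Sum of the costs of coding each vector after its predecessor."""
    previous = "0" * len(order[0])          # imaginary first vector of 0's
    total = 0
    for vector in order:
        total += cost(vector, previous)
        previous = vector
    return total


def local_search(order, cost):
    """Apply the best improving swap until no swap improves the total cost."""
    order = list(order)
    best = total_cost(order, cost)
    while True:
        best_swap = None
        for i, j in combinations(range(len(order)), 2):
            order[i], order[j] = order[j], order[i]      # try the swap
            candidate = total_cost(order, cost)
            order[i], order[j] = order[j], order[i]      # undo it
            if candidate < best:
                best, best_swap = candidate, (i, j)
        if best_swap is None:
            return order, best
        i, j = best_swap
        order[i], order[j] = order[j], order[i]
```

A call such as local_search(["1111", "0000", "1110", "0001"], hamming) returns a reordered list together with its total cost. Every loop evaluates all n(n − 1)/2 swaps, and each evaluation recodes the whole set, which illustrates why the execution time grows quickly.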

4.2.6 The Complete Proposed Method

The modifications mentioned above bring us to one complete algorithm for test vector compression. The local search heuristic is not part of the proposed method.

To encode a test vector, the algorithm uses the previous vector and sets the don’t care bits to get the best position of a1, preferably after b2, otherwise as close to b1 as possible. The different cases are encoded with the following codes:

• 10: a1 is to the right of b2

• 01: a1 is right under b1

• 001: a1 is placed one to the right of b1

• 000: a1 is placed one to the left of b1

If none of the above is applicable, the code 11 is used, followed by two FDR codewords. These depend on the bit at a0: the first FDR codeword gives the length of the run with the same value as a0 and the second gives the length of the run of the opposite value. The code also implies one final bit with the value at a0. For example, 000001110 is encoded as 1110111001 if the preceding bit is 0: 11 (run-length mode) + 1011 (run-length 5) + 1001 (run-length 3).
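The run-length case can be reproduced with a few lines of code. The sketch below assumes that the segment to be coded consists of exactly one run with the value of a0, one run of the opposite value and the single closing bit; the function names are hypothetical, and the FDR encoder is repeated so that the block is self-contained.

```python
def fdr_encode(run_length):
    """FDR codeword for a run length >= 0 (group prefix followed by tail)."""
    k = 1
    while run_length > 2 ** (k + 1) - 3:
        k += 1
    tail = format(run_length - (2 ** k - 2), "0{}b".format(k))
    return "1" * (k - 1) + "0" + tail


def encode_run_length_mode(segment, a0):
    """Encode `segment` as 11 + FDR(same run) + FDR(opposite run).

    `a0` is the value ('0' or '1') of the last bit known to the decoder.
    """
    other = "1" if a0 == "0" else "0"
    same = len(segment) - len(segment.lstrip(a0))
    rest = segment[same:]
    opposite = len(rest) - len(rest.lstrip(other))
    if rest[opposite:] != a0:
        raise ValueError("segment must end with one closing bit equal to a0")
    return "11" + fdr_encode(same) + fdr_encode(opposite)
```

With these definitions, encode_run_length_mode("000001110", "0") returns 1110111001, the codeword given in the text.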

As described in Section 4.2.2 the vectors are sorted. Figure 4.4 showspseudo-code that illustrates the process.


void GreedySort(testCube) {
    int lengthOfVectors;
    string previous = 000...;   // length = lengthOfVectors
    string tempfax;
    int shortest;

    for each vector in testCube {
        shortest = FindShortestCode();
        MarkAsCoded(shortest);
        tempfax = EncodeVector(shortest, previous);
        previous = DecodeVector(tempfax, previous);
        Write(tempfax);
    }
}

Figure 4.4: Pseudo-code for Greedy Sort


4.2.7 Example

The small test set from Section 2.1.3 is here encoded with the algorithm described above.

Vector1 - 000000XXXXXXX101XXXXXXXX0
Vector2 - 0000000011111XXX111100XXX

The two vectors are first encoded with the imaginary first vector of 0’s as the previous vector, forcing them to start with run-length codes. The vectors get the following compressed codes (the decompressed data is shown to the right of each codeword):

Vector1:
  mode         codeword            decompressed
  run-length   11 110111 01        0000000000000 | 1 | 0
  run-length   11 00 110011        (no 0's) | 111111111 | 0

Vector2:
  mode         codeword            decompressed
  run-length   11 110010 110110    00000000 | 111111111111 | 0
  a1 > b2      10                  0000

Since Vector2 is encoded with a shorter code, it is included first in the compressed data. Vector1 is then encoded with the decompressed Vector2 as previous vector.

0000000011111111111100000 //Vector2 decompressed
000000XXXXXXX101XXXXXXXX0 //Vector1 with don’t-cares

Vector1:
  mode         codeword      decompressed
  a1 = b1      01            000000001
  run-length   11 1011 01    11111 | 0 | 1
  a1 = b1      01            11110
  a1 > b2      10            0000

Compressed data
Vector2 - 1111001011011010
Vector1 - 01111011010110


4.3 Decompression

When decompressing the compressed data, the algorithm needs to know how long the vectors are and it also requires access to the previous vector. It is sufficient to treat the previous vector as an input stream, since it will only be read in sequence from the beginning. The decompression algorithm consumes each codeword in the compressed data and outputs the original data with the help of the previous vector. For any codeword to be decompressed, the current bit denotes the value of the last bit that was decompressed. At the beginning of a new vector the current bit is set to 0. Different actions are taken for different codewords:

10: a1 is placed after b2. Keep producing bits with the same value as the current bit until b2 is reached, i.e. when the value in the previous vector input stream has changed two times. The current bit keeps the same value.

01: a1 is right under b1. Produce bits with the same value as the current bit until there is a change in the previous vector. Add one bit with the opposite value and change the current bit.

001: a1 is one bit right of b1. Same as with 01, except produce one extra bit before the last opposite bit.

000: a1 is one bit left of b1. Same as with 01, except produce one bit less before the last opposite bit.

11: FDR run-length code. The decompression algorithm should consume two FDR codewords. The first tells how many bits with the value of the current bit will be produced, the second tells how many bits of the opposite value. Finally one bit with the value of the current bit is produced. The current bit keeps the same value.

After each codeword is taken care of, the algorithm should check whether all the bits of one vector have been produced; otherwise it continues with the next codeword. In each turn the algorithm should consume the same number of bits from the previous vector as it produced itself. Codeword 10 is a bit special since b2 can be placed to the right of all bits in the vector. When decompressing a 10 codeword the algorithm should stop producing bits when the right side is reached.
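The five cases above can be put together in a compact decoder. This is a sketch and not the assembler program of Appendix A. It assumes bit strings are Python strings, that b1 and b2 are the first two positions at or after the current position where the previous vector changes value (following the wording of this section), that a missing transition is placed just to the right of the row, and that any bit falling outside the row is dropped. The names transitions, read_fdr and decode_vector are hypothetical.

```python
def read_fdr(bits, i):
    """Decode one FDR codeword at bits[i:]; return (run_length, next index)."""
    k = 1
    while bits[i] == "1":
        k += 1
        i += 1
    i += 1
    return (2 ** k - 2) + int(bits[i:i + k], 2), i + k


def transitions(prev, pos):
    """Return (b1, b2): first two value changes in `prev` at index >= pos."""
    n = len(prev)
    found = []
    for j in range(pos, n):
        left = prev[j - 1] if j > 0 else "0"    # imaginary 0 left of the row
        if prev[j] != left:
            found.append(j)
            if len(found) == 2:
                return found[0], found[1]
    if len(found) == 1:
        return found[0], n                      # only b2 is outside the row
    return n, n + 1                             # both are outside the row


def decode_vector(bits, i, prev):
    """Decode one vector of len(prev) bits starting at bits[i]."""
    n = len(prev)
    flip = {"0": "1", "1": "0"}
    out, cur = [], "0"
    while len(out) < n:
        pos = len(out)
        b1, b2 = transitions(prev, pos)
        if bits[i:i + 2] == "10":               # a1 after b2 (pass)
            i += 2
            out += cur * (b2 + 1 - pos)
        elif bits[i:i + 2] == "11":             # FDR run-length code
            same, i = read_fdr(bits, i + 2)
            opposite, i = read_fdr(bits, i)
            out += cur * same + flip[cur] * opposite + cur
        else:                                   # 01, 001 or 000
            if bits[i:i + 2] == "01":
                offset, i = 0, i + 2
            elif bits[i:i + 3] == "001":
                offset, i = 1, i + 3
            else:
                offset, i = -1, i + 3
            out += cur * (b1 + offset - pos) + flip[cur]
            cur = flip[cur]
        del out[n:]                             # stop at the right side
    return "".join(out), i
```

Applied to the example in Section 4.2.7, decoding 1111001011011010 against a previous vector of 25 zeros gives 0000000011111111111100000, and decoding 01111011010110 against that result gives 0000000011111101111100000, which agrees with every specified bit of Vector1.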

4.4 Decompression in Assembler

The decompression algorithm is easily implemented in a high-level language such as C or Java. But can it run on a simple processor with a small amount of memory? Since none of the tested compilers, together with a disassembler, could generate a small program, a facsimile decoder was implemented directly in assembler. The Emu8086 emulator was used to test the code. Without any SOC-specific programming the size of the assembler code is 88 instructions. This is similar in size to implementations of other methods. The full code can be seen in Appendix A.

Instead of sending the output to the screen, a real implementation would send the output to the CUT. There would also be some I/O instructions to read the input stream and the previous vector.

4.5 Including Response Vectors

In most testing applications the response from a core is inserted into a MISR (multi-input signature register). The MISR only reports a signature of all its inputs at the end of the test. If the signature does not match the expected one, the chip is faulty and will be destroyed. An alternative is to compare every bit of each response with the expected one. The main advantage is that a test can be stopped as soon as a fault is discovered (abort-on-fail). With a MISR there is also a risk that a response with multiple faults still generates the correct signature. Usually the response is sent back to the ATE, where the comparison is made. This transfer is done without any compression technique, and that is why the MISR has become so popular: it decreases the test application time considerably.

A new approach to response examination is presented here. The idea is to send the responses in compressed form and let the embedded processor do the comparison with the actual response. The compression is done using the same facsimile technique as in previous sections, and the test vectors can simply be extended to include the responses.

The responses are very similar to the test vectors in that they consist of many don’t care bits, but we need to be careful. A don’t care bit in the response cannot simply be set to 0 or 1. It must match the expected response when the corresponding test vector is used, a test vector in which many don’t care bits have been assigned to either 0 or 1. This can only be done with an ATPG, which would need to be incorporated in the proposed method. This approach would probably produce response vectors that are not well suited to compression with the facsimile method.

A better solution is to send the response vectors with the don’t care bits left untouched. For a don’t care bit in the expected response, the comparison program should then accept any bit in the actual response. Additional data needs to be sent to represent the don’t care bits. Four ways of coding the response vector are presented here. For all four methods an example is shown with the test vector X1XXXXX001X and the response vector XXXXX10XX0X. The complete vector is encoded to the representation that is sent to the facsimile coding algorithm.

4.5.1 Using Mask

The don’t care bits are chosen freely as 0 or 1 in the same way as in the test vector. To determine which ones are don’t care bits, a mask is added at the end of each vector. A 0 in the mask indicates that the corresponding bit in the response vector is a don’t care; a 1 indicates that the bit is specified and should be compared with the bit in the actual response.

Orig. test:      X1XXXXX001X
Orig. response:  XXXXX10XX0X
The mask:        00000110010
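How the mask is derived and later used for the comparison can be sketched as follows. The function names are hypothetical, and filling the don’t care bits with 0 is only one possible assignment chosen for the illustration; in the method they are chosen to suit the facsimile coding.

```python
def build_mask(expected):
    """Mask bit is 1 where the expected response is specified, 0 for 'X'."""
    return "".join("0" if bit == "X" else "1" for bit in expected)


def fill_dont_cares(expected, fill="0"):
    """Assign every don't care bit a value (here simply `fill`)."""
    return "".join(fill if bit == "X" else bit for bit in expected)


def response_matches(actual, expected_filled, mask):
    """Compare only the positions that the mask marks as specified."""
    return all(m == "0" or a == e
               for a, e, m in zip(actual, expected_filled, mask))
```

For the example response, build_mask("XXXXX10XX0X") returns 00000110010, and an actual response passes as long as it has 1, 0 and 0 in the three specified positions.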

4.5.2 Two Bits Each

If each bit in the response vector is coded with two bits, there is no need for a mask. Only two of the four different combinations of two bits are needed for the bits 1 and 0, leaving two combinations to code the don’t care bit. One solution would be to code 1 as 11, 0 as 10 and X as either 00 or 01. To code an X as either 00 or 01 we simply code it as 0X.

Orig. test:  X1XXXXX001X
Response:    0X 0X 0X 0X 0X 11 10 0X 0X 10 0X
(codes for)  X  X  X  X  X  1  0  X  X  0  X

One may ask why the X is coded as 0X and not simply as a single 0. The reason is that the vectors would then differ in length, and the facsimile coding algorithm requires a previous vector of the same length.
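The two-bit coding and the corresponding comparison can be sketched as below. The names are hypothetical. After facsimile decompression the second bit of a 0X pair will have been set to 0 or 1, so the comparison only looks at the first bit of a pair to decide whether the position is specified.

```python
PAIR = {"1": "11", "0": "10", "X": "0X"}


def encode_response(response):
    """Code every response bit with two bits: 1 -> 11, 0 -> 10, X -> 0X."""
    return "".join(PAIR[bit] for bit in response)


def response_matches(actual, decoded_pairs):
    """Check an actual response against decompressed two-bit pairs.

    A pair starting with 0 is a don't care; a pair starting with 1
    carries the expected value in its second bit.
    """
    for k, bit in enumerate(actual):
        specified, value = decoded_pairs[2 * k], decoded_pairs[2 * k + 1]
        if specified == "1" and value != bit:
            return False
    return True
```

For the example, encode_response("XXXXX10XX0X") gives 0X0X0X0X0X11100X0X100X, which has twice the length of the response vector.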

4.5.3 Merged Test and Response Vector

When running the test application there is a matter of timing not previously discussed. In Section 1.2 it is described how a test vector is shifted into the core, the clock is applied and the response is shifted out. However, at the same time as the response is shifted out it is possible to shift in the next test vector. This is called pipelining and saves a great amount of time. In order to use pipelining, the application should compare the response vector with the expected one at the same time as it shifts in the next vector.

The last two approaches merge the response vector with the next test vector. For each bit that is shifted into the core, the decompression program also decompresses one bit of the response, compares it with the actual response shifted out and continues only if they match (or the decompressed bit is a don’t care). The first approach uses a mask that is placed at the beginning of the vector. The mask has to be decompressed and saved before comparison can take place. The second uses two bits for each bit in the response vector, in the same way as the method above. In both these methods a last empty test vector needs to be added to include the last response and mask. In the examples below the response vector is the same as before, but here it refers to the response vector of the preceding vector. A t denotes a bit from the test vector. An r denotes a bit from the response.

The mask:       00000110010
Merged vector:  Xt Xr 1t Xr Xt Xr Xt Xr Xt Xr Xt 1r Xt 0r 0t Xr 0t Xr 1t 0r Xt Xr

Merged vector:  Xt 0Xr 1t 0Xr Xt 0Xr Xt 0Xr Xt 0Xr Xt 11r Xt 10r 0t 0Xr 0t 0Xr 1t 10r Xt 0Xr


Chapter 5

Experimental Results

With a set of experiments this chapter shows the efficiency of the proposed method. The experiments are made on real test data, since real test data has special properties that affect the results. Results from the different stages show which modification is the most valuable one, and the results are also compared with results from other methods. For all tests including only test vectors, test vectors for some of the ISCAS’89 circuits were used. For the algorithms that also include the response vectors, test data for the circuit D695 was used. These circuits are small circuits released to the public for development purposes.

5.1 Compressing Test Vectors Only

The compression algorithm was implemented in Java and tested on a SunBlade100 (500 MHz). In Table 5.1 the results for the different stages are shown. The second column shows the size of the uncompressed set, TD. For every stage both the number of compressed bits and the percentage compression are shown. The percentage data compression was computed as:

Percentage Data Compression = (Original Bits − Compressed Bits) / Original Bits × 100
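As a quick check of the formula, the following sketch recomputes two of the percentages reported for the proposed scheme from the bit counts in Table 5.1. The function name is hypothetical.

```python
def percentage_compression(original_bits, compressed_bits):
    """Percentage data compression as defined above."""
    return (original_bits - compressed_bits) / original_bits * 100
```

For s13207, percentage_compression(178500, 14356) is approximately 91.96, and for s9234, percentage_compression(36062, 14074) is approximately 60.97, matching the table.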

                       No Reorder        Greedy Sort       With FDR          Mod. Codew. (Prop. Scheme)
Circuit   Size of TD   Comp.    %        Comp.    %        Comp.    %        Comp.    %
          (bits)       bits     Comp.    bits     Comp.    bits     Comp.    bits     Comp.
s13207    178500       22903    87.17    17028    90.46    14648    91.79    14356    91.96
s15850    90428        29040    67.89    22791    74.80    17426    80.73    16816    81.40
s38417    174720       66922    61.70    58542    66.49    42762    75.53    41452    76.28
s38584    191784       115329   39.87    107087   44.16    69388    63.82    63789    66.74
s5378     30602        16211    47.03    11619    62.03    10134    66.88    10000    67.32
s9234     36062        23777    34.07    19037    47.21    14890    58.71    14074    60.97

Table 5.1: Compression obtained for different stages

          FDR [3]           Matrix [1]        Linear [2]        Prop. Scheme
Circuit   Comp.    %        Comp.    %        Comp.    %        Comp.    %
          bits     Comp.    bits     Comp.    bits     Comp.    bits     Comp.
s13207    30880    81.30    33470    79.99    9920     94.44    14356    91.96
s15850    26000    66.22    23552    67.88    11168    87.65    16816    81.40
s38417    93466    43.26    69556    56.00    30432    82.58    41452    76.28
s38584    77812    60.91    66838    65.15    30208    84.25    63789    66.74
s5378     12346    48.02    10390    59.20    5696     81.39    10000    67.32
s9234     22152    43.59    16888    53.49    9280     74.27    14074    60.97

Table 5.2: Comparison with other methods

As seen, every modification gives better compression than the previous one for all circuits. The compression with the last modification is close to that without it: the results with Modified Codewords are only 3-4% better than the ones with FDR. The main advantage of this modification is not its compression ratio but its simplified decompression algorithm, due to fewer codewords in use. This makes the decompression program run faster. In Table 5.2 the proposed method is compared with results taken from reports proposing other techniques. The original data used in the different reports are not the same.


          gzip                                    Prop. Scheme
          0-mapped            Facs. map
          Comp.     Comp.     Comp.     Comp.     Comp.     Comp.
Circuit   bits      %         bits      %         bits      %
s13207    22600     87.34     23072     87.07     14356     91.96
s15850    22816     74.77     23600     73.90     16816     81.40
s38417    48864     72.03     46792     73.22     41452     76.28
s38584    68040     64.52     66872     65.13     63789     66.74
s5378     14040     54.12     14576     52.37     10000     67.32
s9234     19360     46.31     17920     50.31     14074     60.97

Table 5.3: Comparison with Unix gzip utility

The proposed scheme achieves better compression than the methods in Chandra and Chakrabarty [3] and in Balakrishnan and Touba [1] for all circuits. The method in Balakrishnan and Touba [2] is still better; for circuit s38584 its compressed size is less than half the size achieved with the proposed method.

5.1.1 Unix gzip utility

To compare the result with the Unix gzip utility, the test data cannot simply be sent to the utility; it needs to be changed to get a better comparison. All the don't cares need to be assigned to either 0 or 1, and the vectors may be reordered. Which mapping fits gzip best is not known, so two different test sets were explored for each circuit. The first test set was not reordered and all don't cares were assigned a 0. In Table 5.3, columns two and three present the compressed size and ratio for this mapping. The second test set was the one calculated by the proposed scheme, with reordered vectors and don't cares assigned to fit facsimile coding. Results are found in columns four and five. For three of the circuits the first mapping was the best and for the other three the second was the best. Also note that the difference in size between the two mappings is small.

Compressing an ASCII text file similar to the ones usually used in these experiments would give gzip an advantage: all the 0's are coded with the same eight bits, which can be used to compress the file. The same holds for all the 1's. Instead, the test data files were transformed into a binary format consisting only of the bits in the test sets. We can see that the proposed method outperforms gzip on all circuits.

          With FDR            Local Search        Reduction
          Comp.     Comp.     Comp.     Comp.
Circuit   bits      %         bits      %         %
s13207    14648     91.79     13898     92.21     5.12
s15850    17426     80.73     16541     81.71     5.07
s38417    42762     75.53     40509     76.81     5.27
s38584    69388     63.82     66423     65.37     4.27
s5378     10134     66.88     9694      68.36     4.34
s9234     14890     58.71     14090     60.93     5.37

Table 5.4: Comparison with Local Search
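The point made in Section 5.1.1 about ASCII versus binary input to gzip can be illustrated with a small Python sketch. It maps don't care bits (written as `X`, an assumed notation) to 0, packs the resulting bit string eight bits per byte, and compresses both forms with the standard `gzip` module. The helper names and the toy vectors are hypothetical and are not the test sets used in the experiments.

```python
import gzip


def map_dont_cares_to_zero(vector):
    """Replace every don't care symbol 'X' with '0' (the 0-mapping)."""
    return vector.replace("X", "0")


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, zero-padding the last byte."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


# Toy test set; real test sets are far larger.
vectors = ["0X1X00XX", "XX10X0X1", "1XXX0X00"] * 50
bit_string = "".join(map_dont_cares_to_zero(v) for v in vectors)

ascii_form = bit_string.encode("ascii")   # one byte per test bit
binary_form = pack_bits(bit_string)       # one bit per test bit

# Compressed sizes in bits. A fair ratio relates them to len(bit_string),
# the true number of test data bits, not to the size of the ASCII file.
print(len(bit_string))
print(8 * len(gzip.compress(ascii_form)))
print(8 * len(gzip.compress(binary_form)))
```

Relating the gzip output to the size of the ASCII file would overstate the compression by roughly a factor of eight, which is why the packed form is the relevant input.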

5.1.2 Local Search Heuristic

Even though all test sets used in this report are smaller than 200,000 bits, the completion of this heuristic took a very long time. Days were needed to finish the calculations for the largest circuits. The main reason is, as mentioned in Section 4.2.2, that for each change in the test set, big parts of the compressed data need to be recalculated.

The local search heuristic was implemented before the modified codewords were applied. In Table 5.4 the heuristic is therefore compared not to the proposed scheme, but to the scheme without the modified codewords, called 'With FDR' in Table 5.1. Since the run time for this heuristic is extremely long, I did not redo this experiment with the final scheme; still, we can see that the gain in compression is small (a reduction of between 4.27% and 5.37% in compressed size).

5.2 Including Response Vectors

For this part, test vectors for the D695 SOC were used. It consists of ten circuits, some of them used in the previous section but with different test sets.

First, the test and response vectors were compressed separately in order to compare the compression ratio with the ones including the response. The result can be found in Table 5.5. Column two gives the original size of the test set, and columns three and four give the compressed size and the percentage. Columns five through seven show the same things for the response vectors.

          Test only                     Resp. only
          Size      Comp.     Comp.     Size      Comp.     Comp.
Circuit   (bits)    bits      %         (bits)    bits      %
c6288     448       697       -55.58    448       742       -65.63
c7552     15525     11802     23.98     8100      7521      7.15
s838      5092      2218      56.44     2584      789       69.47
s9234     27417     14209     48.17     27750     12815     53.82
s38584    166896    72103     56.80     166896    72095     56.80
s13207    164500    25204     84.68     185650    28936     84.41
s15850    59267     19832     66.54     66348     23676     64.32
s5378     21400     11005     48.57     22800     10860     52.37
s35932    21156     3502      83.45     24576     3553      85.55
s38417    144768    66428     54.11     151554    71640     52.73

Table 5.5: Compression for D695 test and response vectors separate

When compressing the test and response vectors separately, the compression is good for all but the first two circuits. The reason is that there are far fewer don't care bits in their vectors. In fact, the first circuit has no don't care bits at all.

The results from compressing test and response vectors together are shown in Table 5.6. The second column shows the size of the test vectors, the size of the response vectors and the sum of these two. For each of the four different ways of coding the don't care bits, both the size of the compressed data and the compression percentage are shown. The results are not as good as in Table 5.5, mainly because the response data is twice the size in these experiments.

It is clear that the use of a mask is much better than coding each don't care bit with two bits. The difference between the two methods using a mask is not very big. For all but one of the circuits it is better to place the response after the test, but for only three of the circuits is the difference somewhat big. As described in Section 4.5, the scheme for a decompression program benefits when the test and response vectors are in mixed mode.


          Size (bits)             Response after                          Mixed
                                  Mask                Two bits            Mask                Two bits
                                  Comp.     Comp.     Comp.     Comp.     Comp.     Comp.     Comp.     Comp.
Circuit   Test    Resp.   Total   bits      %         bits      %         bits      %         bits      %
c6288     448     448     896     1451      -61.94    2062      -130.13   1455      -62.39    1972      -120.09
c7552     15525   8100    23625   24463     -3.55     30975     -31.11    28636     -21.21    39800     -68.47
s838      5092    2584    7676    3979      48.16     10851     -41.36    5346      30.35     12659     -64.92
s9234     27417   27750   55167   40754     26.13     48702     11.72     44163     19.95     58148     -5.40
s38584    166896  166896  333792  255166    23.56     259684    22.20     238922    28.42     266203    20.25
s13207    164500  185650  350150  117236    65.52     125179    64.25     135147    61.40     154607    55.85
s15850    59267   66348   125615  72680     42.14     103476    17.62     83152     33.80     125817    -0.16
s5378     21400   22800   44200   36556     17.29     47278     -6.96     40049     9.39      52423     -18.60
s35932    21156   24576   45732   10434     77.18     56930     -24.49    34405     24.77     85425     -86.79
s38417    144768  151554  296322  265498    10.40     290632    1.92      283693    4.26      340239    -14.82

Table 5.6: Test and response vectors compressed together


Chapter 6

Discussion

Here some thoughts about compression of test vectors are discussed. What are the benefits and disadvantages of the various techniques? Is there some lower limit for how small a compacted set of test vectors can be (entropy bounds)? Is it possible to use a more complex method like zip and, if so, what is gained?

6.1 Proposed Method

When the work was initially started, the facsimile approach was chosen for three reasons:

• Facsimile code compresses bi-level images (black-white) - the test set is a square of 1's and 0's.

• Facsimile code uses the fact that each line of dots is often very similar to the one above - the vectors (lines) in the test set can be reordered to minimize the difference.

• A short facsimile code can produce a very long output if the bits are the right ones - all the don't care bits in the test set can be assigned 1 or 0 in order to maximize this.


There is not much to comment on the first one; the other two are more interesting. To illustrate how many different correct test sets we can find for a circuit, we can study the smallest of the circuits in this report.

Circuit s5378 has 143 test vectors with 214 bits each, 30602 bits in total. The second reason above is that we can reorder the vectors. This gives us $143!$ ($\approx 3.8 \times 10^{247}$) different test sets to evaluate. The test set also contains 25500 don't care bits, which for every ordering have to be set to either 0 or 1, giving $2^{25500} \approx 10^{7676}$ assignments. Hence more than $10^{7923}$ different test sets are correct. Which one of these will have the best compression ratio when sent to the compression algorithm? No conventional computer will ever be able to test all combinations, even for this small example. Imagine what will happen when circuits with billions of bits in their test sets are used.
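The orders of magnitude quoted for s5378 can be reproduced with a few lines of Python. This is only a sanity check of the arithmetic; the variable names are illustrative.

```python
import math

num_vectors = 143        # test vectors for s5378
dont_care_bits = 25500   # don't care bits in the s5378 test set

# log10 of the number of orderings (143!) and of the don't care assignments (2^25500).
log10_orderings = math.log10(math.factorial(num_vectors))
log10_assignments = dont_care_bits * math.log10(2)
log10_total = log10_orderings + log10_assignments

print(f"143!     ~ 10^{log10_orderings:.2f}")
print(f"2^25500  ~ 10^{log10_assignments:.2f}")
print(f"combined ~ 10^{log10_total:.2f}")
```

Working with logarithms avoids building the full product, which would have nearly eight thousand decimal digits.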

The proposed method in fact uses two different heuristics to find a good test set. The first heuristic tries to make a short facsimile code given a previous vector and the next vector including don't care bits. It is possible to find the very best assignment of don't care bits and produce the shortest facsimile code. Still, it is a heuristic because it does not care about what is coming after. A shorter code may force all following vectors to be encoded with longer codes than before. The second heuristic is Greedy-Sort, explained in Section 4.2.2.
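To make the ordering problem more concrete, the sketch below shows a greedy nearest-neighbour ordering in Python. It is only an illustration: Section 4.2.2 is not reproduced here, so the cost function (the number of positions where two vectors have conflicting specified bits) is an assumed stand-in for the real cost, which would be the length of the resulting facsimile code. The names `conflicts` and `greedy_order` are hypothetical, and `X` is an assumed notation for a don't care bit.

```python
def conflicts(a, b):
    """Count positions where both vectors are specified and differ ('X' = don't care)."""
    return sum(1 for x, y in zip(a, b) if x != "X" and y != "X" and x != y)


def greedy_order(vectors):
    """Start with the first vector and repeatedly append the cheapest remaining one."""
    remaining = list(vectors)
    ordered = [remaining.pop(0)]
    while remaining:
        # Only the most recently placed vector is considered, never what comes later.
        best = min(remaining, key=lambda v: conflicts(ordered[-1], v))
        remaining.remove(best)
        ordered.append(best)
    return ordered


test_set = ["0X10", "1X01", "0X1X", "1101"]
print(greedy_order(test_set))
```

Like the heuristics described above, each step looks only at the previous vector, so a locally cheap choice can make later vectors more expensive.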

If these two heuristics could be joined into one, or made to cooperate more, a lot would be won. One idea is to also incorporate an ATPG (automatic test pattern generator) in the heuristic. The ATPG creates the test vectors by analyzing the circuit's specification. Different ATPGs create different sets of vectors, which may differ in size. With the ATPG incorporated into the heuristic, a bigger test set could be generated if the compression algorithm would gain much from it. This may not be possible, and it is not certain that the gain would be high.

6.1.1 Local Search

There were big hopes when the work with the Local Search heuristic was started. Earlier work has shown that this heuristic can solve very hard problems and create better solutions for other problems. These problems do not differ very much from the problem of ordering test vectors, but there is one major difference. For all other problems, a change in the data only forced the algorithm to recalculate a small part of the solution. In this problem a lot of vectors have to be compressed again because every change is propagated downwards.

If a better heuristic for sorting the vectors could be found, further techniques like hill-climbing or simulated annealing could be evaluated. When the implementation of Local Search had found the first local minimum for each of the circuits, it was clear that something had to be changed. The time it took to find the first local minimum was far too long and the improvement in compression was very low.

6.1.2 Discarded Techniques

In the stages presented in Design and Results two techniques were left out. Experiments using the difference vectors, $T_{diff}$, when compressing made almost no difference in compressed size. With a difference vector, the bits that are changed from the previous vector are marked with a 1 and all other bits get a 0. The idea is that the complete test set will be mostly 0's and that the facsimile code could do a better job. The poor result is probably due to the fact that facsimile code already uses the difference between the vectors to compress the data.
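The construction of difference vectors can be sketched in Python as follows. The sketch assumes fully specified vectors (don't cares already assigned) and keeps the first vector unchanged; both choices, as well as the function names, are assumptions made for illustration.

```python
def difference_vectors(vectors):
    """Keep the first vector; mark every changed bit of each later vector with '1'."""
    result = [vectors[0]]
    for prev, cur in zip(vectors, vectors[1:]):
        result.append("".join("1" if p != c else "0" for p, c in zip(prev, cur)))
    return result


def restore_vectors(diffs):
    """Invert difference_vectors by XOR-ing each difference onto the previous vector."""
    result = [diffs[0]]
    for diff in diffs[1:]:
        prev = result[-1]
        result.append("".join(str(int(p) ^ int(d)) for p, d in zip(prev, diff)))
    return result


test_set = ["0011", "0111", "0110"]
diffs = difference_vectors(test_set)
print(diffs)                    # ['0011', '0100', '0001']
print(restore_vectors(diffs))   # ['0011', '0111', '0110']
```

Similar neighbouring vectors give difference vectors that are mostly 0's, which is the property the experiment tried to exploit.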

The second discarded technique was modifying the codewords inside the FDR run-length code. As seen in Table 4.1, the group prefix is 10 and 110 for group A2 and A3 respectively. If the run-lengths in group A3 are more common than the ones in A2, a gain could be achieved if the prefixes were switched. Statistics from the experiments showed that a small gain could be made for some of the circuits. However, the gain in size does not justify the more complex decompression algorithm. A lookup table would have to be used, assigning each group a prefix.
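The effect of switching the two prefixes can be estimated with a short Python sketch. It assumes the standard FDR code of [3], where group A_k has a prefix of k-1 ones followed by a zero and a k-bit tail, and covers run-lengths 2^k - 2 to 2^(k+1) - 3. Table 4.1 is not reproduced here, so the codewords actually used in this report may differ. The function names, the `prefix_table` parameter and the run-length data are hypothetical.

```python
def fdr_group(run_length):
    """Group index k of a run-length: A_k covers 2**k - 2 .. 2**(k + 1) - 3."""
    k = 1
    while run_length > 2 ** (k + 1) - 3:
        k += 1
    return k


def fdr_codeword(run_length, prefix_table=None):
    """FDR codeword; prefix_table optionally overrides the prefix of some groups."""
    k = fdr_group(run_length)
    prefix = "1" * (k - 1) + "0"
    if prefix_table and k in prefix_table:
        prefix = prefix_table[k]
    tail = format(run_length - (2 ** k - 2), "0{}b".format(k))
    return prefix + tail


def encoded_bits(run_lengths, prefix_table=None):
    """Total code length in bits for a list of run-lengths."""
    return sum(len(fdr_codeword(r, prefix_table)) for r in run_lengths)


swapped = {2: "110", 3: "10"}   # A2 and A3 exchange prefixes (the lookup table)
runs = [7, 9, 12, 3]            # hypothetical data: three A3 runs, one A2 run
print(encoded_bits(runs), encoded_bits(runs, swapped))   # 22 20
```

Under these assumptions each A3 run saves one bit and each A2 run costs one bit, so the switch only pays off when A3 runs outnumber A2 runs, and the decoder needs the lookup table to map prefixes to tail lengths.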

6.2 Storing Previous Vector

The facsimile coding approach uses the previous test vector when decompressing the next one. The SOC needs to be designed in such a way that it is possible to store a test vector without destroying it. Two solutions are discussed here.


6.2.1 Store in Memory

As assumed in this report, a SOC with a built-in memory of sufficient size can store the previous vector. The memory also needs to store the decompression program and possibly a buffer for the input stream. This approach is straightforward: the decompression program only needs a pointer to the memory where it will read and write the previous vector. The processor already has the ability to read and write to memory, so no extra circuitry will be needed.

6.2.2 Core Feedback

If the test vectors are longer than the available memory, we can use other, inactive cores. The processor then shifts out a test vector not only to the CUT, but also to one or more temporary cores. These cores should not get the system clock applied when the CUT gets it, since in that case the previous vector would be destroyed. To access the vector saved in a temporary core, some extra circuitry is needed. The MISRs could be used for this, by rebuilding them to function as serializers when their corresponding core is used as temporary memory. The design is presented in Figure 6.1.

The decompression of one vector would start by shifting out the first bits from the temporary cores. With the MISR as a serializer, the bits are sent one by one to the processor, where they are used to decompress the next vector. As bits are decompressed, they will be shifted into the CUT and also into the temporary cores. This way new bits will be shifted out to the MISR and can be sent to the processor.

The assignment of temporary cores to each core, and the way the system knows which cores to use, may be a little tricky. The difference in vector size also needs to be considered. When storing a small vector in a bigger core, the system should know the difference in order to shift out unwanted bits before receiving the previous vector. In that case extra time will be spent doing practically nothing.

[Block diagram with the following labelled components: ATE, TAM, Processor, Memory, I/O, and Core A and Core B, each with a Wrapper, Scan Chains 1 to n, and a MISR/serializer.]

Figure 6.1: Example SOC with core feedback


6.3 Synchronization

This report has not gone into detail about how the compressed data is sent to the SOC. Working solutions already exist and are used in reports that use techniques similar to this one. One issue that could be a problem is synchronization. When the ATE sends some data, the onboard processor needs to decompress everything before the ATE can send more. The other way around is also important: when the processor has finished one block, it wants to start with another one right away. In many cases synchronization is achieved by inserting NOP instructions (instructions that do nothing) into the decompression algorithm or by slowing down the operating speed of the ATE.

Since facsimile coding is a variable-to-variable-length coding scheme, the synchronization can be tricky. For example, the short pass mode code 10 can produce hundreds of bits. A long horizontal mode code, on the other hand, can produce very few bits; the code 110001, which is 6 bits long, produces only 2 bits. When a code can be decompressed either to data a hundred times its length or to data one third of its length, synchronization cannot be achieved just by slowing one of the components down. One idea is to prepare two buffers inside the memory. While the ATE writes to one of them, the processor reads from the other one. When the ATE or the processor comes to the end of its buffer, it gives a signal. When both have given the signal, they switch buffers. There exist more effective solutions where less time is spent waiting. A bounded buffer is one example, but remember that there are limited resources inside a SOC. Such solutions require some sort of counters, the ability to read these, and controlling circuitry to stop reading/writing when the buffer is empty/full.
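The two-buffer idea can be sketched as a small sequential simulation in Python. It is not a model of real ATE or processor timing: each loop iteration stands for one phase in which the ATE fills one buffer while the processor drains the other, and the switch at the end of the iteration stands for the moment when both have signalled. The function name `ping_pong_transfer` and its parameters are hypothetical.

```python
def ping_pong_transfer(stream, buffer_size):
    """Simulate two buffers: the ATE fills one while the processor drains the other."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    buffers = [[], []]
    write_idx = 0    # buffer the ATE currently writes to
    position = 0     # next symbol the ATE will send
    consumed = []    # symbols handed to the decompressor, in order
    swaps = 0
    while position < len(stream) or buffers[1 - write_idx]:
        # ATE side: fill the write buffer (with nothing once the stream has ended).
        chunk = stream[position:position + buffer_size]
        buffers[write_idx] = list(chunk)
        position += len(chunk)
        # Processor side: drain the other buffer.
        read_idx = 1 - write_idx
        consumed.extend(buffers[read_idx])
        buffers[read_idx] = []
        # Both sides have signalled that they are done: switch buffers.
        write_idx = read_idx
        swaps += 1
    return "".join(consumed), swaps


data, swaps = ping_pong_transfer("abcdefg", 3)
print(data, swaps)   # abcdefg 4
```

The first and last phases each leave one side idle, which is the waiting time that a bounded buffer with counters would reduce.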

6.4 Comparing the Result

The comparison with other methods was made with results from their corresponding reports. The test data used in those experiments differ in size from the test data used in this report. In order to compare the methods correctly, all tests should be made with the same test sets. With more time to complete this work, each method should have been implemented in order to compare the results using the same data. Hopefully the data used in other reports are similar to the data used here.

What about entropy bounds, that is, a lower limit on the compressed size using the proposed scheme? With such a measurement, we could see how well the proposed method compresses the data. In Balakrishnan and Touba [2] the result is compared with the number of specified bits. The ratio of the specified bits to the compressed bits is called the encoding efficiency. Still, there is a chance that the compressed size can be made smaller than the number of specified bits, which will result in an encoding efficiency over 1. This cannot happen with an entropy bound. Since the proposed method does not have a correlation to the number of specified bits, no such comparison is made in this report, nor could a way to calculate the correct entropy bounds be found.

6.5 Including Response

The idea of also compressing and sending the response vectors is new and cannot be compared with other results. Even if the compression ratio is not very good in the experiments, this technique could be useful in environments where abort-on-fail is wanted. Situations with extremely high demands, where no faulty SOC may pass the test, may also benefit from this technique. With a MISR there is still a chance that a faulty SOC will pass the tests. Including the response vectors is not limited to this compression method; earlier methods could also be extended to handle the response vectors. In order to fully see the efficiency of the proposed method, this is something that should be done.

Four different methods of representing the don't care bits and including the response in the test set were tried in this report. The use of a mask gives much better compression than representing each bit in the response with two bits. The reason is that the two parts will be much more correlated to other vectors. The mask consists mainly of zeros and is often similar to the masks of other vectors. With two bits for each bit in the response, the data is cluttered with specified bits that the algorithm cannot choose itself, which leads to a worse compression ratio. The fact that most of the circuits gave almost the same result when mixing response and test as when adding the response after the test is pleasant. The implementation can be made much more efficient if the test can be shifted in at the same time as the response is compared, without storing the expected response temporarily. However, this may not be the case with bigger designs for real systems.

6.6 Complex Methods

Complex methods here means the standard compression utilities that most people think of when asked about compression. There are a number of such utilities: zip, gzip, compress and bzip, to mention only some of them. The most common technique used in these utilities is Lempel-Ziv compression. This is a dictionary-based technique; the compressed data contains references to blocks of data in a dictionary. The main advantage of Lempel-Ziv is that the dictionary is built up from the original data, so the dictionary does not need to be saved as additional data. Several of the utilities also combine Lempel-Ziv with Huffman codes or other techniques; gzip, for example, uses the DEFLATE format, which combines LZ77 with Huffman coding.
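To illustrate why the dictionary does not have to be transmitted, here is a minimal LZ78-style sketch in Python. LZ78 is one member of the Lempel-Ziv family, chosen here for brevity; it is not the variant used inside gzip and it is not part of the proposed method. The function names are illustrative.

```python
def lz78_encode(bits):
    """Encode a string as (dictionary index, next symbol) pairs."""
    dictionary = {"": 0}
    phrase = ""
    pairs = []
    for symbol in bits:
        if phrase + symbol in dictionary:
            phrase += symbol                 # keep extending a known phrase
        else:
            pairs.append((dictionary[phrase], symbol))
            dictionary[phrase + symbol] = len(dictionary)
            phrase = ""
    if phrase:
        pairs.append((dictionary[phrase], ""))   # flush a trailing known phrase
    return pairs


def lz78_decode(pairs):
    """Rebuild the same dictionary while decoding; nothing extra is transmitted."""
    phrases = [""]
    out = []
    for index, symbol in pairs:
        phrase = phrases[index] + symbol
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


vector = "0000000011110000"
pairs = lz78_encode(vector)
print(pairs)
print(lz78_decode(pairs) == vector)
```

The decoder adds each decoded phrase to its own dictionary in the same order as the encoder, which is what makes a stored dictionary unnecessary.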

The results in Table 5.3 show that gzip does not compress the test data as well as the proposed method. The reason is that, with gzip, the don't care bits are not assigned 0 or 1 in order to optimize gzip's compression, nor are the vectors reordered for it. An implementation of Lempel-Ziv adapted to test data compression should do much better and would be an interesting thing to do.


Chapter 7

Conclusions and Further Work

In which situations is this method well suited, and what can be made better?

7.1 Conclusions

The proposed method outperforms the previous methods it was compared with, except the one presented by Balakrishnan and Touba in [2] (briefly presented in Section 3.3.1). The compression ratio of that method may be reduced in two situations. The first is when the memory available on the SOC is too small to hold all the compressed data at once; then the test set needs to be partitioned, and this reduces the compression ratio. The second is the situation where the linear equations cannot be solved in a reasonable amount of time. Here too the test set needs to be partitioned. When this occurs, that method should be compared with the proposed method to find the better one. With a sufficient amount of memory and with linear equations that can be solved, my suggestion is to use the method by Balakrishnan and Touba.

The solution with response vectors included showed that it can be done. Whether the result is good enough is still to be explored, as no other method has included the response.


7.2 Further Work

All the experiments in this report used very small test data for benchmark circuits. It would be interesting to evaluate the method with bigger test data sets from real circuits. The proposed method should also be evaluated in terms of the given prerequisites. If no real SOC has a sufficient amount of available memory, and the core feedback method proposed in Section 6.2.2 is not applicable for some reason, this method may not be realizable.

Work on test data compression focusing on optimization methods could give better heuristics. The idea is to adapt and merge an ATPG with the assignment of don't care bits and the ordering problem to create one heuristic. The heuristic would include all parts of the test data creation that can be changed. The key is to do this in a reasonable amount of time.

As with the method that only compresses test data, the method including the response vectors should be tested on bigger test data sets from real circuits. It would also be interesting to extend other methods to include the response vectors in order to make some comparisons.

Even if some claim that complex compression techniques like Lempel-Ziv are not suitable for test data compression, time will surely make them suitable. The SOCs of tomorrow will most certainly contain enough resources to decompress test data compressed with any algorithm. Lempel-Ziv is a highly studied technique with many variations. A method adapted to test data compression may provide great results.


Bibliography

[1] K. J. Balakrishnan and N. A. Touba. Matrix-Based Test Vector Decompression Using an Embedded Processor. In Defect and Fault Tolerance in VLSI Systems, 2002, pages 159–165, Nov 2002.

[2] K. J. Balakrishnan and N. A. Touba. Deterministic Test Vector Decompression in Software Using Linear Operations. In VLSI Test Symposium, 2003, pages 225–231, April-May 2003.

[3] A. Chandra and K. Chakrabarty. Test Data Compression and Test Resource Partitioning for System-on-a-Chip Using Frequency-Directed Run-Length (FDR) Codes. IEEE Transactions on Computers, 52(8):1076–1088, Aug 2003.

[4] S. K. Goel and E. J. Marinissen. SOC Test Architecture Design for Efficient Utilization of Test Bandwidth. ACM Transactions on Design Automation of Electronic Systems, 8(4):399–429, 2003.

[5] P. T. Gonciari, B. M. Al-Hashimi, and N. Nicolici. Improving Compression Ratio, Area Overhead, and Test Application Time for System-on-a-Chip Test Data Compression/Decompression. In Design, Automation and Test in Europe Conference and Exhibition, 2002, pages 604–611, March 2002.

[6] S. Hwang and J. A. Abraham. Test Data Compression and Test Time Reduction Using an Embedded Microprocessor. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 11(5):853–862, Oct 2003.


[7] A. Jas and N. A. Touba. Deterministic Test Vector Compression/Decompression for Systems-on-a-Chip Using an Embedded Processor. Journal of Electronic Testing: Theory and Applications (JETTA), 18(4/5):503–513, Aug 2002.

[8] K. Sayood. Introduction to Data Compression. Morgan Kaufmann Publishers, 2nd edition, 2000.

[9] M. Abramovici, M. A. Breuer, and A. D. Friedman. Digital Systems Testing and Testable Design. IEEE Press, 1990.


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Jon Persson
Linköping, 8th April 2005


Appendix A

Assembler Code

            ORG 100h
            JMP start
prev        DB '00000001111111111110010000'
prev_size   DW 25
fax         DB '00100110000001110001110101'

start:      MOV AH, 0Eh             ;Sub-function for printing
            MOV DI, 0               ;Counts number of symbols produced
            MOV DX, prev_size       ;Number of symbols to produce
            MOV SI, 0               ;Faxpos
            MOV CX, 0               ;Runlength variable
            MOV AL, '0'             ;CurChar

next_word:  CMP DI, DX
            JNE cont                ;End of one row
            JMP stop

cont:       MOV BP, 0               ;a1 in relation to b1
            CMP fax[SI], '1'
            JE one                  ;Run-length or a1 > b2
            INC SI                  ;0
            CMP fax[SI], '1'
            JE print                ;01, a1 right under b1

zerozero:   INC SI
            CMP fax[SI], '1'
            JE zerozeroone
            MOV BP, 1               ;000, a1 one left of b1
            JMP print

zerozeroone:
            INC DI                  ;Needs to print at least one
            INT 10h
            MOV BP, -1
            JMP print

one:        INC SI
            CMP fax[SI], '1'
            JNE onezero

oneone:     INC SI                  ;11, run-length
            INC BP                  ;Counter for tail
            CMP fax[SI], '0'
            JE totail
            STC                     ;Set carry
            RCL CX, 1               ;Leftshift in 1 (carry)
            JMP oneone

totail:     SHL CX, 1               ;Leftshift in 0
            MOV BX, 0               ;Empty in case of run-length

tail:       INC SI
            CMP fax[SI], '1'        ;Sets carry to inverted fax[SI]
            CMC                     ;Inverts CF
            RCL BX, 1               ;Leftshift next bit
            DEC BP
            CMP BP, 0
            JNE tail
            ADD CX, BX
            CMP CX, 0
            JE noloop

print_rl:   INC DI                  ;Print CX bits
            INT 10h
            LOOP print_rl

noloop:     XOR AL, 00000001b
            XOR DI, 1000000000000000b
            CMP DI, 0
            JS oneone
            CMP DI, DX
            JE stop                 ;End of one row
            INC DI
            INT 10h
            INC SI
            JMP next_word

onezero:    INC DI                  ;Same all the way to b2 or end of row
            INT 10h
            CMP DI, DX
            JE stop                 ;End of one row
            CMP prev[DI-1], AL
            JE onezero

b1b2:       INC DI
            INT 10h
            CMP DI, DX
            JE stop                 ;End of one row
            CMP prev[DI-1], AL
            JNE b1b2
            INC SI
            JMP next_word           ;End of onezero

print:      INC DI
            CMP prev[DI+BP-1], AL
            JNE finish
            INT 10h
            JMP print

finish:     XOR AL, 00000001b
            INT 10h
            INC SI
            JMP next_word

stop:       RET