LOW POWER TEST PATTERN GENERATION FOR SYSTEM ON CHIP DEVICES
by
AFTAB FAROOQI, B.S.E.E., M.B.A.
A THESIS IN ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE IN ELECTRICAL ENGINEERING
Approved: Richard Gale, Chairperson of the Committee; Tim Dallas
Accepted: John Borrelli, Dean of the Graduate School
May, 2006
ACKNOWLEDGEMENTS
I would like to thank Dr. Gale, my mentor for the entire MSEE Program, especially for his leadership in guiding me to define the scope of the thesis and helping me identify key milestones towards the completion of the project. Many thanks to Dr. Dallas for helping me break down the overall thesis project into smaller, manageable sub-projects. Thanks to Dr. Nutter for his assistance in helping me experiment with the project using Xilinx tools. Dr. Karp always made herself available to help me decipher the signal processing attributes included in the many IEEE papers I had to read for a deeper understanding of the issues. Dr. Parten was always there to help me break down complex issues in the IEEE papers into simple algorithms for better conceptual understanding. I thank Dr. Mitra for her guidance in helping me understand the complex mathematical concepts behind signal processing algorithms.
I would like to honor Dr. Chris Monico of the mathematics department for helping me better understand random number generation theory, especially the correlation between complex polynomials, matrices and Linear Feedback Shift Registers (LFSRs).
Dr. Monico’s assistance was pivotal in helping me grasp the fundamental mathematical concepts behind the very complex subject of low power pattern generation.
I would also like to thank Dr. Temkin for candid and sincere steering that helped me focus on the fundamental semiconductor manufacturing concepts for a stronger technical foundation.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER
A Technique to Produce Low Power Pattern for BIST
Benchmark Design Circuits
V. RESULTS AND DISCUSSION
Simulation Using Standard LFSR Pattern
Simulation Using LP-LFSR
Power Consumption Using Standard LFSR
Power Consumption Using LP-LFSR
Power Consumption Comparison (Standard LFSR versus LP-LFSR)
Summary and Conclusion
VI. RECOMMENDATIONS FOR FUTURE WORK
SELECTED BIBLIOGRAPHY
APPENDICES
A. PATTERN GENERATION CONTROLLER
B. VERILOG TESTBENCH
C. XILINX REPORTS
D. C432 VERILOG CODE
ABSTRACT
State of the art developments in semiconductor manufacturing processes, integrated chip design methodology, the availability of thousand-plus-pin integrated circuit (IC) packaging options and efficient IC test techniques have contributed immensely towards the integration of an entire system on a chip.
These System-On-Chip (SOC) devices can include multiple microprocessors, various types of memories such as SRAM, Flash and ROM, Digital Signal Processor(s), dozens of IP blocks and user defined logic.
Various SOC test techniques have been developed in the last decade to test complex mixed signal systems on a chip in a cost effective manner. The test industry has made great strides in developing new automated test equipment which can test the logic, memory and analog components of the chip via an external interface to the IC. Advances in Built-In-Self-Test (BIST) techniques have enabled IC testing using a combination of external automated test equipment and a BIST controller on the chip.
The power consumption of the chip during manufacturing test can
be significantly higher than the power consumption of the chip in its target system. This increase in power consumption can be attributed primarily to the highly random test patterns generated on the chip.
This thesis probes into various IC test approaches such as
external, internal and embedded, with specific investigation into low power test stimulus generation. A new low power pattern generation technique is implemented. Conventional and low power test patterns are applied to an industry standard ISCAS-85 c432 27-channel interrupt controller circuit and the average power consumption is measured. The results indicate that the circuit draws about 60% of the power with the new approach (10 mW versus 16 mW, a reduction of roughly 38%) for an identical fault coverage of 98% in both cases.
LIST OF TABLES
2.1 ITRS Roadmap by Product
3.1 Present/Next State of the Flip-Flops
5.1 Power Consumption Analysis
The interrupt controller has three interrupt request buses A, B and C,
each having nine bits or channels, and one channel-enable bus E. The
following priority rules apply: A[i] > B[j] > C[k], for any i, j, k; i.e., bus A
has the highest priority and bus C the lowest. Within each bus, a channel
with a higher index has priority over one with a lower index; for example,
A[i] > A[j], if i > j. If E[i] = 0, then the A[i], B[i], and C[i] inputs are
disregarded.
The seven outputs PA, PB, PC and Chan[3:0] specify which channels
have acknowledged interrupt requests. Only the channel of highest priority
in the requesting bus of highest priority is acknowledged. One exception is
that if two or more interrupts produce requests on the channel that is
acknowledged, each bus is acknowledged. For example, if A[4], A[2], B[6]
and C[4] have requests pending, A[4] and C[4] are acknowledged. Figure
4.9 is a 9-line-to-4-line priority encoder.
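The priority rules described above can be sketched as a small behavioral model. This is an illustrative assumption only, not the gate-level c432 netlist from Appendix D: the function name `c432_behavior`, the helper `bits`, the active-high outputs, and returning the channel as a plain index (rather than the benchmark's Chan[3:0] encoding) are all choices made for this sketch.

```python
def bits(*channels):
    """Hypothetical helper: build a 9-bit bus value from channel indices."""
    value = 0
    for ch in channels:
        value |= 1 << ch
    return value


def c432_behavior(a, b, c, e):
    """Behavioral sketch of the 27-channel interrupt controller.

    a, b, c, e are 9-bit integers; bit i corresponds to channel i.
    Returns (pa, pb, pc, channel); channel is None when nothing is pending.
    Active-high outputs and a plain channel index are assumptions.
    """
    mask = 0x1FF
    # A request on channel i is considered only when E[i] = 1.
    buses = [x & e & mask for x in (a, b, c)]
    # Bus priority is A > B > C: take the first bus with any enabled request.
    winner = next((bus for bus in buses if bus), 0)
    if winner == 0:
        return (0, 0, 0, None)
    # Within the winning bus, the highest channel index has priority.
    channel = winner.bit_length() - 1
    # Every bus requesting on the acknowledged channel is acknowledged.
    acks = tuple((bus >> channel) & 1 for bus in buses)
    return (*acks, channel)


# Example from the text: A[4], A[2], B[6] and C[4] pending, all channels enabled.
print(c432_behavior(bits(4, 2), bits(6), bits(4), 0x1FF))  # (1, 0, 1, 4)
```

With channel 4 disabled (E[4] = 0), the same requests fall back to A[2], since the A[4] and C[4] inputs are then disregarded.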
Figure 4.5 ISCAS-85 c432 M1
Figure 4.6 ISCAS-85 c432 M2
Figure 4.7 ISCAS-85 c432 M3
Figure 4.8 ISCAS-85 c432 M4
Figure 4.9 ISCAS-85 c432
CHAPTER V
RESULTS AND DISCUSSION
Simulation using standard LFSR pattern
The standard 36-bit pattern is generated using the LFSR configuration
shown in figure 5.1 below. For conventional pattern generation, the
schematic consists of 36 flip-flops connected in series. The design is
modified as indicated in figure 5.2 below with feedback taps to form a
maximal length pattern generator whose sequence includes the all-0’s and
all-1’s vectors. The number of vectors expected in this case is 2^36. The
outputs of the 36-bit LFSR are used as the inputs to the c432 ISCAS-85
interrupt controller design circuit. A common clock is supplied to all
flip-flops. A seed value is assigned to the output of each flip-flop. Each
clock pulse thereafter shifts the logic value present at the input of each
flip-flop to its output.
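The shift-and-feedback behavior described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions: the 36-bit tap positions are not given in this section, so the sketch uses small 4-bit and 8-bit registers with commonly tabulated maximal-length taps, and the names `lfsr_step` and `sequence_length` are hypothetical. The `include_zero` option models one well-known way of adding the all-0's state to the sequence (inverting the feedback when all bits below the most significant bit are 0); it is not necessarily the modification drawn in figure 5.2.

```python
def lfsr_step(state, width, taps, include_zero=False):
    """Advance a shift-left Fibonacci LFSR by one clock.

    taps are 1-indexed bit positions XORed into the feedback;
    the most significant position (width) must be one of them.
    """
    feedback = 0
    for tap in taps:
        feedback ^= (state >> (tap - 1)) & 1
    if include_zero:
        # Invert the feedback when every bit below the MSB is 0.
        # This splices the all-0's state into the sequence.
        lower_bits = state & ((1 << (width - 1)) - 1)
        if lower_bits == 0:
            feedback ^= 1
    return ((state << 1) | feedback) & ((1 << width) - 1)


def sequence_length(width, taps, seed=1, include_zero=False):
    """Count clocks until the register returns to its seed value."""
    state, count = seed, 0
    while True:
        state = lfsr_step(state, width, taps, include_zero)
        count += 1
        if state == seed:
            return count


print(sequence_length(4, (4, 3)))                          # 15
print(sequence_length(4, (4, 3), include_zero=True))       # 16
print(sequence_length(8, (8, 6, 5, 4)))                    # 255
print(sequence_length(8, (8, 6, 5, 4), include_zero=True)) # 256
```

A plain maximal-length LFSR of n bits cycles through 2^n - 1 states, and the modified version reaches 2^n. Enumerating the full 36-bit sequence this way would be impractical, which is one reason the number of simulated vectors is restricted later in the chapter.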
Figure 5.1 8-bit LFSR
Figure 5.2 Maximal 8-bit LFSR
Simulation Using LP-LFSR
The LP-LFSR pattern is generated as shown in Figure 5.3 below. The
simulation report confirms that the number of signal transitions between
the bits of successive vectors is the same for both patterns, namely
conventional and LP-LFSR.
Figure 5.3 LP-LFSR Pattern Simulation
Power consumption using standard conventional pattern
The methodology used to estimate the power consumption is similar to
the one used for the low power pattern generator. As shown in figure 5.4, the
design circuit is simulated in the Xilinx ISE development environment using
Mentor Graphics’ ModelSim. The number of test vectors is restricted in
order to keep the Value Change Dump (VCD) file at a manageable size for
power consumption analysis.
Figure 5.4 Power Estimation Flow
The VCD file contains the switching activity of the design circuit for the
applied test vectors. The number of test vectors is obtained from a fault
simulation tool called TetraMax from Synopsys. This tool takes the VCD
file as input, along with the c432 interrupt controller design file, and
produces the number of test vectors required for the desired fault coverage.
Another way of generating a specific number of vectors is by using the
clock period and simulation time. For instance, if the clock period is 60 ns
and the simulation time is 60 us, the number of vectors produced will be
60 us / 60 ns = 1000.
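The same arithmetic can be written as a short check. This is a minimal sketch assuming one vector per clock period; the function name `vector_count` is hypothetical, and the 6 us case is taken from the simulation time suggested in the test bench comments in Appendix B.

```python
def vector_count(sim_time_ns, clock_period_ns):
    """Number of whole clock periods, and hence vectors, in the simulation."""
    return sim_time_ns // clock_period_ns


print(vector_count(60_000, 60))  # 60 us at 60 ns per vector -> 1000
print(vector_count(6_000, 60))   # 6 us at 60 ns per vector  -> 100
```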
Using the standard pattern, the ATPG tool generates 330 vectors for
98% fault coverage, which translates to approximately 16 mW power
consumption by the c432 circuit.
Power consumption using low power pattern
The key to achieving low power consumption in System-On-Chip
devices is reducing the switching activity in the device under test. The
low power technique described in chapter 4 improves the correlation
between the signals of successive vectors (i.e. the input stimulus to the
circuit under test), resulting in fewer transitions at the primary inputs
and hence reduced switching activity inside the circuit under test.
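To make the link between vector correlation and input transitions concrete, the sketch below counts bit transitions between successive vectors. It is an illustration under stated assumptions, not the chapter 4 technique itself: `insert_half_steps` is a hypothetical scheme that places one intermediate vector between each pair so that only half of the bits may change on any clock, and all function names are invented for this example.

```python
def transitions(prev, curr):
    """Number of bit positions that differ between two vectors."""
    return bin(prev ^ curr).count("1")


def average_transitions(vectors):
    """Average number of input transitions per applied vector."""
    pairs = zip(vectors, vectors[1:])
    return sum(transitions(p, c) for p, c in pairs) / (len(vectors) - 1)


def insert_half_steps(vectors, width):
    """Hypothetical scheme: update the upper half first, then the lower half."""
    lower_mask = (1 << (width // 2)) - 1
    upper_mask = ((1 << width) - 1) ^ lower_mask
    out = [vectors[0]]
    for prev, curr in zip(vectors, vectors[1:]):
        # Intermediate vector: new upper half, old lower half.
        out.append((curr & upper_mask) | (prev & lower_mask))
        out.append(curr)
    return out


original = [0b0000, 0b1111, 0b0000]
stepped = insert_half_steps(original, 4)

print(average_transitions(original))  # 4.0
print(average_transitions(stepped))   # 2.0
```

In this toy case the total number of transitions is unchanged, but they are spread over twice as many clocks, so fewer inputs toggle per vector. The cost is a longer sequence, which is the kind of trade-off the vector counts reported in this chapter reflect.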
The methodology used in estimating the power consumption [16, 17] of
the device under test includes the generation of the 36-bit low power pattern,
synthesizing the c432 circuit using generic libraries, running the 36-bit
pattern on the c432 circuit and computing the power consumption using a
power estimation EDA tool.
Circuit Simulation is implemented using Mentor’s ModelSim tool in
the Xilinx ISE development environment. It is important to restrict the
simulation time (i.e. number of test vectors) to a few microseconds in order
to contain the VCD file to a manageable size for the purpose of evaluating
the switching activity. Xilinx xPower tool is used to read in the VCD file for
power consumption estimation. TetraMax was used to determine the number
of test vectors required for the desired fault coverage.
Power consumed by the c432 circuit is observed to be 10 mW using
370 vectors for 98% fault coverage. Detailed reports on synthesis,
simulation and power estimation are included in Appendix C.
Power Consumption Comparison (Standard LFSR vs LP-LFSR)
Two test benches are designed using Verilog as labeled in Appendix B.
The first test bench uses a conventional test pattern generator and the
second test bench uses a low power pattern generator. Both test benches are
used to simulate a common design circuit which in this case is an industry
standard 27-channel interrupt controller benchmark circuit. Verilog code for
c432 is included in Appendix D.
Both test benches are designed to use the same pre-defined clock
period as well as identical simulation time. This ensures that the same
number of test vectors is generated by both test benches. The number of
gates used by the interrupt controller is 250, as indicated by the synthesis
reports from the Xilinx development environment. The conventional test
bench uses 60 logic gates and the low power test bench uses 135.
TetraMax ATPG and Fault simulation tool is used to estimate the
number of test vectors required for 98% fault coverage of the interrupt
controller. The tool generated 330 vectors for the conventional test bench
and 370 vectors for the low power test bench. Both test benches produced
almost the same number of test vectors for the desired fault coverage thus
demonstrating about the same test time used in both cases.
The two VCD files (for conventional and low power pattern)
containing the interrupt controller’s switching activity were used for power
consumption estimation by Xilinx xPower power analysis tool.
xPower calculates the average power consumed by the circuit for each
test vector applied by observing the logic value at each internal and external
node of the circuit. Each transition in the logic value at a node (1 → 0 or
0 → 1) results in dynamic power consumption by the corresponding gate of
the Xilinx Spartan 2 device. Total power consumed by the circuit is the sum
of the power consumption by the circuit for each test vector.
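The relationship between node transitions and dynamic power can be illustrated with the standard CMOS approximation that each transition at a node of capacitance C dissipates roughly C·V²/2. The sketch below is a rough model under stated assumptions, not a description of how xPower computes its estimates: the node names, toggle counts and capacitance values are hypothetical, and only the 2.5 V core supply and the 6 us window are taken from the reports and test bench comments in the appendices.

```python
def dynamic_power(toggle_counts, node_capacitance, vdd, duration_s):
    """Average dynamic power in watts over the simulated window.

    toggle_counts:    node name -> number of 0->1 or 1->0 transitions
    node_capacitance: node name -> switched capacitance in farads (assumed)
    """
    energy_j = sum(
        0.5 * node_capacitance[node] * vdd ** 2 * toggles
        for node, toggles in toggle_counts.items()
    )
    return energy_j / duration_s


# Hypothetical toggle counts, as might be extracted from a VCD file.
toggles = {"n1": 100, "n2": 50}
caps = {"n1": 1e-12, "n2": 1e-12}  # 1 pF per node, assumed for illustration

power_w = dynamic_power(toggles, caps, vdd=2.5, duration_s=6e-6)
print(f"{power_w * 1e3:.4f} mW")
```

Because the energy term is proportional to the toggle count, halving the number of transitions in the same time window halves the estimated dynamic power in this model.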
The reported power consumption estimates indicated above show that the
interrupt controller draws about 60% of the power with the low power test
bench compared with the conventional pattern (10 mW versus 16 mW, a
reduction of roughly 38%). This result reflects a lower number of logic
transitions at the internal and external nodes of the test circuit. The
difference in the power consumption of the test logic between the two
approaches (65 gates versus 135 gates) is negligible.
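The comparison can be checked directly from the reported figures (16 mW with 330 vectors for the conventional pattern, 10 mW with 370 vectors for the low power pattern). The sketch below is only arithmetic on those numbers; the energy comparison additionally assumes the same clock period for both test benches and treats the reported average power as constant over the test, which is an assumption of this example rather than a measured result.

```python
conventional = {"vectors": 330, "power_mw": 16}
low_power = {"vectors": 370, "power_mw": 10}

# Ratio and reduction in average power.
power_ratio = low_power["power_mw"] / conventional["power_mw"]
power_reduction = 1 - power_ratio

# Extra vectors (and hence test time) needed by the low power pattern.
vector_overhead = low_power["vectors"] / conventional["vectors"] - 1

# Relative test energy, assuming energy ~ average power x number of vectors.
energy_ratio = (low_power["power_mw"] * low_power["vectors"]) / (
    conventional["power_mw"] * conventional["vectors"]
)

print(f"power ratio:     {power_ratio:.3f}")
print(f"power reduction: {power_reduction:.3f}")
print(f"vector overhead: {vector_overhead:.3f}")
print(f"energy ratio:    {energy_ratio:.3f}")
```

Under these assumptions the low power pattern still comes out ahead on total test energy despite the longer test, because the drop in average power outweighs the additional vectors.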
Table 5.1 Power Consumption Comparison

Pattern             Fault Coverage   # of Test Vectors   # of Gates in the   # of Gates in the   Average Power
                                                         Test Circuit        Test Controller     Consumption
Conventional LFSR   0.98             330                 250                 65                  16 mW
LP-LFSR             0.98             370                 250                 130                 10 mW

The configurable logic blocks and input/output blocks used in most
field programmable gate arrays such as the Spartan 2 device are typically not
optimized for lowest power consumption compared with some options of the
gate array and standard cell products. Therefore it is possible to achieve
even lower power consumption by the circuit in an ASIC implementation
compared with an FPGA.
Summary and Conclusion
The System on a chip revolution challenges both design and test
engineers, especially in the area of power dissipation. Generally the chip
consumes more power in the manufacturing test mode than in normal
operation mode in its targeted system. The increase in power
consumption can result in irreparable damage to the chip, directly
impacting the overall yield and cost.
This thesis investigates the fundamental process used for IC Design, Test
and Manufacturing including design entry, tool flow methodology and hand-
off to Manufacturing. Specific detailed attention is focused on IC
verification and test.
Design-For-Test and Design-For-Manufacturing are the mainstream
approach today for IC design. This approach requires the entire SOC
development team to collaborate closely in clearly articulating adequate
test requirements and methodology as well as ensuring the
manufacturability of the chip.
Various test methodologies such as external (ATE based), DFT-
SCAN/BIST and embedded (combination of low cost external tester and
SCAN/BIST) approaches are studied. The embedded approach is found to be
prevalent in SOC testing.
It may be easier for large semiconductor component companies than for
smaller fabless companies to justify the cost of expensive external ATE
systems, due to the higher utilization rate achieved by the former and the
inherent flexibility of expensive ATEs to integrate test-specific electronics
(e.g. memory, mixed-signal, etc.). Other trade-off factors, such as the
number of pins required in the device for test and external versus internal
pattern generation, need to be carefully evaluated to reach the optimum
cost versus performance test solution.
Including SCAN/BIST on the chip tends to increase the die size by a
small percentage and the power consumption of the chip. The increase in the
power consumption is attributed primarily to the increase in the circuit’s
switching activity. Random pattern generation theory is investigated along
with the correlation of the Linear Feedback Shift Register based PRPG,
matrix theory, characteristic polynomial and the generator polynomial.
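The connection between an LFSR and matrix theory can be illustrated with a transition (companion) matrix over GF(2): one clock of the register is one multiplication of the state vector by that matrix, and the sequence length equals the smallest power of the matrix that returns the identity. The sketch below is a minimal illustration assuming a shift-left register with feedback into bit 0; the tap sets are commonly tabulated maximal-length choices and the function names are hypothetical, none of them taken from the thesis chapters.

```python
def companion_matrix(width, taps):
    """Transition matrix T so that next_state = T * state over GF(2).

    Row 0 forms the feedback bit from the tapped positions (1-indexed);
    row i copies bit i-1, which models the shift.
    """
    t = [[0] * width for _ in range(width)]
    for tap in taps:
        t[0][tap - 1] = 1
    for i in range(1, width):
        t[i][i - 1] = 1
    return t


def mat_mul_gf2(a, b):
    """Matrix product with addition and multiplication modulo 2."""
    n = len(a)
    return [
        [sum(a[i][k] & b[k][j] for k in range(n)) & 1 for j in range(n)]
        for i in range(n)
    ]


def matrix_order(t):
    """Smallest k >= 1 with T^k equal to the identity matrix."""
    n = len(t)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power, k = t, 1
    while power != identity:
        power = mat_mul_gf2(power, t)
        k += 1
    return k


print(matrix_order(companion_matrix(4, (4, 3))))        # 15
print(matrix_order(companion_matrix(8, (8, 6, 5, 4))))  # 255
```

When the characteristic polynomial of the matrix is primitive, the order is 2^n - 1, which is the maximal length of a plain n-bit LFSR.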
A technique to generate a low power PRPG pattern is implemented and
applied to an industry standard benchmark circuit for power consumption
estimation. The comparison shows that the circuit draws about 60% of the
power when the low power pattern is used as the input stimulus, compared
with the input stimulus generated by the conventional LFSR based PRPG
(10 mW versus 16 mW, a reduction of roughly 38%).
CHAPTER VI
RECOMMENDATIONS FOR FUTURE WORK
SOC designs are making a rapid shift from mostly digital to mixed
signal, including millions of user defined logic gates and dozens of IP
blocks (core as well as I/O based). IC verification and test strategy needs
to include advanced controllers and pattern generators for testing digital as
well as analog components of the chip. Pattern generation inside the chip is
well known to increase the power consumption of the IC during
manufacturing test. New design and test techniques need to be investigated
to keep this increase in power consumption as small as possible.
The availability of advanced manufacturing process rules in the
design/verification libraries and tool flow methodologies requires IC
front-end designers to verify the manufacturability of the chip much
earlier in the design process. Therefore the development of new SOC DFT
techniques needs to be compliant with the advanced DFM rules [18].
SELECTED BIBLIOGRAPHY
1. Patrick Girard, “Survey of Low-Power Testing of VLSI Circuits”, IEEE Design and Test of Computers, May-June 2002, Volume 19, Issue 3, pages 80-90, ISSN 0740-7475
2. S. Zhang et al., “Cost driven optimization of fault coverage in combined Built-In-Self-Test/Automated Test Equipment Testing”, IEEE Instrumentation and Measurement Technology Conference, May 18-20, 2004
3. L. Ungar and T. Ambler, “Economics of Built-In-Self-Test”, IEEE Design and Test of Computers, Sept.-Oct. 2001, Volume 18, Issue 5, pages 70-79, ISSN 0740-7475
4. Benoit Nadeau-Dostie, “Design for AT-SPEED TEST, DIAGNOSIS and MEASUREMENT”, ISBN 0-7923-8669-8
5. T. Moon and W. Stirling, “Mathematical Methods and Algorithms for Signal Processing”, ISBN 0-201-36186-8
6. A. J. van de Goor, “TESTING SEMICONDUCTOR MEMORIES theory and practice”, ISBN 90-80 4276-1-6
7. N. Ahmed, M. H. Tehranipour, M. Nourani, “Low Power Pattern Generation for BIST Architecture”, IEEE Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium, 23-26 May 2004, Vol. 2, pages 689-92
8. X. Zhang and K. Roy, “Peak Power reduction in low power BIST”, Quality Electronic Design, 2000. ISQED 2000. Proceedings. 20-22 March 2000, pages 425-432
9. G. Marsaglia and A. Zaman, “A New Class of Random Number Generators”, The Annals of Applied Probability, 1991, Vol. 1, No. 3, pages 462-480
10. G. Marsaglia and L. Tsay, “Matrices and the Structure of Random Number Sequences”, Linear Algebra and its Applications 67:147-156 (1985)
11. F. Nekoogar, “From ASICs to SOCs, A practical Approach”, ISBN 0-13-033857-5
12. Barabara Chappel, “The fine art of IC design”, IEEE Spectrum, July 1999, Volume 36, Issue 7, pages 30-34, ISSN 0018-9235
13. C. Wang and K. Roy, “Maximum Power Estimation for CMOS circuits using deterministic and statistic approaches”, 9th International Conference on VLSI Design, Jan 1996
14. E. Larson et al., “Efficient Test Solutions for Core-Based Designs”, IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, vol. 23, May 2004
15. M. L. Mehta, “Some remarks on Random Number Generators”, Number Theory and Physics, 1990, Springer Proceedings in Physics, Vol. 47, pages 253-259
16. F. Najm, “A survey of power estimation techniques in VLSI circuits”, IEEE Very Large Scale Integration (VLSI) Systems, Dec. 1994, Volume 2, Issue 4, pages 446-455
17. F. Najm, “Estimating power dissipation in VLSI circuits”, IEEE Circuits and Devices Magazine, July 1994, Volume 10, Issue 4, pages 11-19
18. M. Schrader and R. McConnell, “SOC Design and Test considerations”, Design, Automation and Test in Europe Conference and Exhibition, 2003, pages 202-207, ISSN 1530-1591
`timescale 1ns / 1ps
////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 17:08:47 12/08/2005
// Design Name: main
// Module Name: testdec08.v
// Project Name: projecta
// Target Device:
// Tool versions:
// Description:
//
// Verilog Test Fixture created by ISE for module: main
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////////////////////
module testdec08_v;

    // Inputs
    reg Clk;
    reg Reset;

    // Outputs
    wire PA;
    wire PB;
    wire PC;
    wire [3:0] Chan;

    // Instantiate the Unit Under Test (UUT)
    main uut (
        .PA(PA),
        .PB(PB),
        .PC(PC),
        .Chan(Chan),
        .Clk(Clk),
        .Reset(Reset)
    );

    initial begin
        // Initialize Inputs
        Clk = 0;
        Reset = 0;

        // Wait 100 ns for global reset to finish
        #100;

        // Add stimulus here
        // Create a 60ns/16.7MHZ clock and run it for a few us
        Reset = 1;
    end

    always #30 Clk = ~ Clk;

    //Need to set sim time in modelsim for the VCD file - try atleast 6us
    // for sufficient switching activity. For 6us VCD file size is 26KB

        .PA(PA),
        .PB(PB),
        .PC(PC),
        .Chan(Chan),
        .Clk(Clk),
        .Reset(Reset),
        .TE(TE)
    );

    initial begin
        // Initialize Inputs
        Clk = 0;
        Reset = 0;
        TE = 0;

        // Wait 100 ns for global reset to finish
        #100;

        // Add stimulus here
        Reset = 1;
        TE = 1;
    end

    //Clock is 16.7MHZ (60ns), simulate in modelsim for 6us for the VCD file
    always #30 Clk = ~ Clk;

    initial begin
        $dumpfile ("mod.vcd");
        $dumpvars(1, modpatt_v.uut);
Set property "resynthesize = true" for unit <top>.
Analyzing module <lfsr_fsm>.
Module <lfsr_fsm> is correct for synthesis.
Analyzing module <lfsr_andor_mux>.
Module <lfsr_andor_mux> is correct for synthesis.
* HDL Synthesis *
Synthesizing Unit <lfsr_andor_mux>.
Related source file is "lfsr_andor_mux.v".
Unit <lfsr_andor_mux> synthesized.
Synthesizing Unit <lfsr_fsm>.
Related source file is "lfsr_fsm.v".
Found 1-bit register for signal <en1>.
Found 1-bit register for signal <en2>.
Found 1-bit register for signal <sel1>.
Found 1-bit register for signal <sel2>.
Found 3-bit comparator greatequal for signal <$n0000> created at line 54.
Found 3-bit up counter for signal <count>.
Summary:
inferred 1 Counter(s).
inferred 4 D-type flip-flop(s).
inferred 1 Comparator(s).
Unit <lfsr_fsm> synthesized.
Synthesizing Unit <top>.
Related source file is "top.v".
WARNING:Xst:1780 - Signal <anor> is never used or assigned.
Found 1-bit xor2 for signal <$n0000> created at line 92.
Found 4-bit register for signal <q_lower>.
Found 1-bit register for signal <q_mid>.
Found 4-bit register for signal <q_upper>.
Summary:
inferred 1 D-type flip-flop(s).
Unit <top> synthesized.
* Advanced HDL Synthesis *
=================================================
Advanced RAM inference ...
Advanced multiplier inference ...
Advanced Registered AddSub inference ...
Dynamic shift register inference ...
HDL Synthesis Report
Macro Statistics
# Counters : 1
3-bit up counter : 1
# Registers : 7
1-bit register : 5
4-bit register : 2
# Comparators : 1
3-bit comparator greatequal : 1
# Xors : 1
1-bit xor2 : 1
* Low Level Synthesis *
Optimizing unit <top> ...
Optimizing unit <lfsr_fsm> ...
Optimizing unit <lfsr_andor_mux> ...
Loading device for application Rf_Device from file 'v200.nph' in
environment C:/Xilinx.
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block top, actual ratio is 0.
* Final Report *
Final Results
RTL Top Level Output File Name : top.ngr
Top Level Output File Name : top
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO
Design Statistics
# IOs : 11
Macro Statistics :
# Registers : 17
# 1-bit register : 17
# Comparators : 1
# 3-bit comparator greatequal : 1
Cell Usage :
# BELS : 16
# INV : 1
# LUT2_L : 2
# LUT3 : 1
# LUT3_L : 5
# LUT4 : 6
# LUT4_L : 1
# FlipFlops/Latches : 16
# FDC : 4
# FDE : 4
# FDP : 5
# FDR : 2
# FDS : 1
# Clock Buffers : 1
# BUFGP : 1
# IO Buffers : 10
# IBUF : 2
# OBUF : 8
Device utilization summary:
---------------------------
Selected Device : 2s200pq208-6
Number of Slices: 9 out of 2352 0%
Number of Slice Flip Flops: 16 out of 4704 0%
Number of 4 input LUTs: 15 out of 4704 0%
Number of bonded IOBs: 11 out of 144 7%
Number of GCLKs: 1 out of 4 25%
Total memory usage is 86756 kilobytes
Number of errors : 0 ( 0 filtered)
Number of warnings : 1 ( 0 filtered)
Number of infos : 1 ( 0 filtered)
Release 7.1.02i - XPower SoftwareVersion:H.40
Copyright (c) 1995-2005 Xilinx, Inc. All rights reserved.
Design: main.ncd
Preferences: main.pcf
VCD File: C:\Xilinx\bin\Design with inputs from low power lfsr\mod.vcd
Part: 2s200pq208-6
Data version: PRELIMINARY,v1.0,07-31-02
XPower and Datasheet may have some Quiescent Current differences. This is due to the fact that the quiescent numbers in XPower are based on measurements of real designs with active functional elements reflecting real world design scenarios.
Power summary:                        I(mA)   P(mW)
----------------------------------------------------------------
Total estimated power consumption:            10
---
Vccint 2.50V:                         1       2
Vcco33 3.30V:                         2       8
---
Clocks:                               0       0
Inputs:                               1       2
Logic:                                0       1
Outputs:
In presenting this thesis in partial fulfillment of the requirements for a master’s
degree at Texas Tech University or Texas Tech University Health Sciences Center, I
agree that the Library and my major department shall make it freely available for
research purposes. Permission to copy this thesis for scholarly purposes may be granted
by the Director of the Library or my major professor. It is understood that any copying
or publication of this thesis for financial gain shall not be allowed without my further
written permission and that any user may be liable for copyright infringement.
Agree (Permission is granted.)
________Aftab Farooqi______________________________    __04-29-06_______
Student Signature                                      Date
Disagree (Permission is not granted.)
_______________________________________________    _________________
Student Signature                                      Date