VERILOG DESIGN OF INPUT/OUTPUT PROCESSOR
WITH BUILT-IN-SELF-TEST
GOH KENG HOO
UNIVERSITI TEKNOLOGI MALAYSIA
Specially dedicated to my beloved parents, younger brother, supervisor, lecturers,
fellow friends and those who have guided and inspired me throughout my journey of
education.
ACKNOWLEDGEMENT
This project could not have been completed without the help and guidance of
others. I would therefore like to take this opportunity to acknowledge the people below.
First and most importantly, I would like to express my gratitude to my project
supervisor, Prof. Dr. Mohamed Khalil Hani. He guided me along the journey and
always gave constructive suggestions and opinions. With his help, I did not lose
sight of the project scope and objectives. Throughout the whole year of working
under him, I gained a lot of knowledge in the technical area and also prepared
myself to be a better person.
On top of that, I would like to thank my manager and Intel Penang Design
Centre for supporting and sponsoring me to sign up for this part-time course. Thanks
to my manager for being so thoughtful and for always allowing me to take
examination leave to complete my project.
On a personal level, my deepest thanks go to my parents, friends and peers for
their moral support throughout my academic years.
ABSTRACT
The goal of this project is to design an I/O processor (IOP) with embedded
built-in-self-test (BIST) capability. The IOP core, originally modeled in VHDL, has
been migrated to Verilog HDL in this project. BIST is one of the most popular test
techniques in use today. The BIST capability embedded in the IOP has two
objectives: to satisfy specified testability requirements and to yield the lowest-cost,
highest-performance implementation. A Linear Feedback Shift Register (LFSR)
replaces expensive testers by generating pseudo-random test patterns for the IOP,
while a Multiple Input Signature Register (MISR) compacts the IOP output
responses into a signature of manageable size. The design is coded in Verilog
hardware description language at the register transfer level (RTL) and synthesized
using Altera Quartus II for an FPGA device from the APEX20KE family. RTL
compilation and simulation use Modelsim v6.1b, and gate-level timing simulation
uses Modelsim-Altera v6.1g. The project was scheduled over two semesters: the
study and determination of hardware specifications, requirements and
functionality, together with the Verilog HDL migration, were done in the first
semester, whereas design, synthesis, compilation, simulation and validation were
carried out in the second semester. The BIST capability adds an additional 30%
hardware overhead to the IOP, which is considered reasonable given the test
performance obtained and the high fault coverage that the BIST block provides.
ABSTRAK
This project aims to design an input/output processor (IOP) with built-in
self-test (BIST) capability. The IOP, originally modeled in VHDL, has been
converted to a Verilog HDL design. BIST is one of the most widely used techniques
today. The BIST capability incorporated into the IOP is intended to meet specified
testability requirements and to yield the lowest cost with the highest performance.
A Linear Feedback Shift Register (LFSR) is used to replace expensive testers and to
generate pseudo-random tests for the IOP, while a Multiple Input Signature
Register (MISR) can be used to reduce the output responses to a signature of
manageable size. In this project, the IOP is modeled in Verilog at the register
transfer level (RTL), synthesized using Altera Quartus II with an FPGA from the
APEX20KE family, compiled at RTL level and simulated using Modelsim v6.1b,
with gate-level simulation using Modelsim-Altera v6.1g. The project was planned
over 2 semesters: determining the design specifications, requirements and
functions and converting the IOP to Verilog were carried out in semester 1, while
design, synthesis, compilation and simulation were carried out in semester 2. The
built-in BIST capability enlarges the IOP by 30%, but this is considered tolerable
and acceptable in view of the excellent test performance and the ability of BIST to
provide very high fault coverage.
TABLE OF CONTENTS
CHAPTER TITLE
DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF APPENDICES
CHAPTER 1 INTRODUCTION
1.1 Overview and Problem Statement
1.2 Objectives
1.3 Scope of Work
1.4 Project Contributions
1.5 Thesis Outline
CHAPTER 2 BACKGROUND AND THEORY
2.1 An Overview of Serial Communication
2.2 UART
2.3 BIST – An Overview
2.3.1 BIST Process
2.3.2 BIST Implementation
2.3.3 BIST Pattern Generation
2.3.3.1 Standard LFSR
2.3.4 BIST Response Compaction
2.3.5 Built-In Logic Block Observers
CHAPTER 3 METHODOLOGY AND DESIGN TOOLS
3.1 Project Design and Implementation Flow
3.2 Verilog HDL
3.3 Mentor Graphics Modelsim v6.1
3.4 FPGA Synthesis
3.5 Altera Quartus II 6.1
3.6 Modelsim-Altera v6.1g Web Edition
CHAPTER 4 I/O PROCESSOR CORE DESIGN
4.1 IOP – An Overview
4.2 IOP Architecture
4.3 Message Format
4.4 IOP Operation
CHAPTER 5 DESIGN OF IOP WITH EMBEDDED BIST CAPABILITY
5.1 Top Level Architecture
5.2 BIST Module
5.2.1 BIST DPU Module
5.2.1.1 Command Select Mux
5.2.1.2 Functional Block
5.2.1.3 BILBO
5.2.1.4 Decoder Block
5.2.1.5 Command Decoder
5.2.1.6 NextCmd Select
5.2.2 BIST CU Module
5.3 IOP BIST Operation
CHAPTER 6 SIMULATION, VERIFICATION AND RESULTS ANALYSIS
6.1 Verilog HDL Code Efficiency Gain
6.2 Design Compilation, Synthesis and Timing Analysis
6.2.1 Design Synthesis and Analysis
6.2.2 Fitter and Assembler
6.2.3 Design Timing Analysis
6.3 Design Test bench
6.4 IOP Simulations and Discussion
6.4.1 Functional Mode
6.4.2 BIST Mode
6.4.2.1 Fault Free Circuit Simulation
6.4.2.2 Faulty Circuit Simulation
CHAPTER 7 CONCLUSION AND FUTURE WORK
7.1 Conclusion
7.2 Recommendation for Future Works
LIST OF REFERENCES
APPENDIX A-B
LIST OF TABLES
TABLE NO TITLE
Table 2.1 Control modes for the BILBO of Figure 2.6
Table 4.1 (a) Table of Control Bytes
Table 4.1 (b) Table of Command Bytes
Table 5.1 BIST module Input, Output and Internal signal
Table 5.2 BIST DPU Input, Output signal
Table 5.3 Command Select Mux output table
Table 5.4 BILBO block in BIST DPU operation mode
Table 5.5 Input and output signals for BIST CU module
Table 5.6 BIST CU RTL control sequences table
Table 6.1 Verilog HDL migration code efficiency gain
Table 6.2 Analysis and Synthesis - IOP Resource Usage Summary
Table 6.3 Summary of IOP timing analysis
LIST OF FIGURES
FIGURE NO TITLE
Figure 2.1 UART data frames format
Figure 2.2 BIST hierarchy
Figure 2.3 BIST architecture
Figure 2.4 Standard n-stage LFSR circuit
Figure 2.5 Multiple input signature register
Figure 2.6 BILBO circuit
Figure 2.7 BILBO in serial scan mode
Figure 2.8 BILBO in LFSR mode
Figure 2.9 BILBO in normal D flip-flop mode
Figure 2.10 BILBO in MISR mode
Figure 3.1 Project design flow
Figure 3.2 Multiple abstractions for digital system design
Figure 3.3 Digital design flow
Figure 3.4 Project design and implementation flow
Figure 3.5 General structure of an FPGA
Figure 3.6 Quartus II design flow
Figure 4.1 Overall System Architecture of PC-based Design and Test System for digital design in FPGA
Figure 4.2 IOP Architecture
Figure 4.3 Data Packet Format
Figure 5.1 IOP block diagram with embedded BIST
Figure 5.2 IOP design with embedded BIST capability architecture
Figure 5.3 IOP with embedded BIST Design Hierarchy
Figure 5.4 High level BIST operation flow
Figure 5.5 BIST module block diagram
Figure 5.6 BIST DPU module block diagram
Figure 5.7 Command Select Mux block diagram
Figure 5.8 Functional Block diagram
Figure 5.9 BILBO circuit
Figure 5.10 BILBO LFSR block diagram
Figure 5.11 BILBO LFSR mode operation waveform
Figure 5.12 BILBO scan mode operation waveform
Figure 5.13 BILBO D flip-flop mode operation waveform
Figure 5.14 BILBO MISR block diagram
Figure 5.15 BILBO D MISR mode operation waveform
Figure 5.16 Decoder Block diagram
Figure 5.17 Command Decoder block diagram
Figure 5.18 NextCmd Select block diagram
Figure 5.19 BIST CU block diagram
Figure 5.20 BIST CU RTL Code
Figure 5.21 IOP BIST operation flow chart
Figure 6.1 Full compilation of the design by Quartus II
Figure 6.2 Design test bench
Figure 6.3 Waveform for functional mode simulation 1 result
Figure 6.4 Waveform for functional mode simulation 2 result
Figure 6.5 Waveform for functional mode simulation 3 result
Figure 6.6 Waveform for functional mode simulation 4 result
Figure 6.7 Waveform for functional mode simulation 5 result
Figure 6.8 Simulation waveform 1 for fault free IOP in BIST mode
Figure 6.9 Simulation waveform 2 for fault free IOP in BIST mode
Figure 6.10 Simulation waveform 3 for fault free IOP in BIST mode
Figure 6.11 Simulation waveform 1 for faulty IOP in BIST mode
Figure 6.12 Simulation waveform 2 for faulty IOP in BIST mode
LIST OF ABBREVIATIONS
ASIC - Application-Specific Integrated Circuit
ATPG - Automatic Test-Pattern Generation
BILBO - Built-In Logic Block Observers
BIST - Built-In-Self-Test
CAD - Computer Aided Design
CU - Control Unit
CUT - Circuit Under Test
DPU - Data Path Unit
DUT - Design Under Test
FES - Front End Subsystem
FPGA - Field Programmable Gate Array
HDL - Hardware Description Language
IC - Integrated Circuit
IP - Intellectual Property
LFSR - Linear Feedback Shift Register
LSB - Least Significant Bit
MISR - Multiple Input Signature Register
I/O - Input/Output
PC - Personal Computer
ROM - Read-Only-Memory
RTL - Register Transfer Level
UART - Universal Asynchronous Receiver/Transmitter
SDF - Standard Delay Format
XOR - Exclusive OR
LIST OF APPENDICES
APPENDIX TITLE
APPENDIX A - Module Block Diagram and Verilog HDL Code
A1 Block diagram and VHDL code for Testbench module
A2 Block diagram and Verilog HDL code for IOP module
A3 Block diagram and Verilog HDL code for clk_1mhz module
A4 Block diagram and Verilog HDL code for bist module
A5 Block diagram and Verilog HDL code for bis_cu module
A6 Block diagram and Verilog HDL code for bis_dpu module
A7 Block diagram and Verilog HDL code for bilbo module
A8 Block diagram and Verilog HDL code for DFFlop module
A9 Block diagram and Verilog HDL code for IOP_CU module
A10 Block diagram and Verilog HDL code for IOP_DPU module
A11 Block diagram and Verilog HDL code for IO_Interface module
A12 Verilog HDL code for Interface module
A13 Block diagram and Verilog HDL code for Buffers module
A14 Verilog HDL code for Register module
A15 Verilog HDL code for T_output_reg module
A16 Block diagram and Verilog HDL code for Input_card module
A17 Verilog HDL code for Trans_comparator module
A18 Verilog HDL code for Counter module
A19 Block diagram and Verilog HDL code for Output_card module
A20 Verilog HDL code for Poll_comparator module
A21 Block diagram and Verilog HDL code for UART module
A22 Block diagram and Verilog HDL code for C_transmit module
A23 Block diagram and Verilog HDL code for C_receive module
A24 Block diagram and Verilog HDL code for Status_reg module
A25 Block diagram and Verilog HDL code for UART_Comm module
A26 Verilog HDL code for Clock_generator module
A27 Block diagram and Verilog HDL code for UART_Receiver module
A28 Verilog HDL code for Receiver_CU module
A29 Block diagram and Verilog HDL code for Receiver_DPU module
A30 Verilog HDL code for Bit_counter1 module
A31 Verilog HDL code for Rcv_shftreg module
A32 Verilog HDL code for Sample_counter module
A33 Block diagram and Verilog HDL code for UART_Transmitter module
A34 Block diagram and Verilog HDL code for Transmitter_CU module
A35 Block diagram and Verilog HDL code for Transmitter_DPU module
A36 Verilog HDL code for Datareg module
A37 Verilog HDL code for Bit_counter module
A38 Verilog HDL code for Shftreg module
APPENDIX B
B1 VHDL to Verilog HDL Conversion Table
CHAPTER 1
INTRODUCTION
This chapter gives an overview of the whole project. It starts with the project
background and problem statement, followed by the project objectives, scope,
contributions and thesis outline.
1.1 Overview and Problem Statement
The UTM PC-based FPGA Prototyping System consists of the I/O Processor
(IOP) core and the Front-End Subsystem. For more detail of the system design,
please refer to the theses titled “PC-Based FPGA Prototyping System – Front-End
Subsystem Design” (2005) by Ng, Shuh Jiuan and “PC-Based FPGA Prototyping
System – I/O Processor Design” (2005) by Chua, Lee Ping. The system is designed
to enable users to send input data to a Design-Under-Test (DUT) and to display the
DUT outputs on the Personal Computer (PC) screen. This project focuses on the
design of the IOP with embedded Built-In-Self-Test capability.
The IOP is a soft core that handles data communication between the PC and
the DUT on the FPGA-based Prototyping Board. The UART module in the IOP is a
soft core that conducts serial I/O communication. A universal asynchronous
receiver/transmitter (UART) is a piece of computer hardware that translates data
between parallel and serial interfaces. It is commonly used for serial data
telecommunication. A UART converts bytes of data to and from asynchronous
start-stop bit streams represented as binary electrical impulses. It is mainly used in
broadband modem, base station, cell phone, and PDA designs.
It is crucial that the IOP functions correctly and is fault free in real silicon
or on the FPGA board, to ensure that the data sent to the DUT inputs are correct
and that the outputs sent from the DUT back to the Front End Subsystem are reliable.
The growth of sub-micron technology has made testing increasingly
difficult. Manufacturing processes are extremely complex, leading manufacturers
to treat testability as a requirement to assure the reliability and functionality of
each circuit they design. Built-In-Self-Test (BIST) is one of the most popular test
techniques in use. For design and test development, BIST significantly reduces the
cost of automatic test-pattern generation (ATPG) and also reduces the likelihood of
disastrous product introduction delays caused by a fully designed system that
cannot be tested.
A UART with BIST capability has two objectives: first, to satisfy specified
testability requirements, and second, to yield the lowest-cost, highest-performance
implementation. BIST slightly increases the cost of design and test development,
because of the BIST hardware overhead, the added design time, and the added
pattern generators, response compactors and testability hardware. However, it is
normally less costly than test development with ATPG.
For the reasons discussed above, this project focuses on the design of an
embedded BIST architecture for an IOP. The design is implemented in Verilog
Hardware Description Language (HDL) at the Register Transfer Level (RTL) of
abstraction. The BIST technique is incorporated into the IOP design before the
overall design is synthesized, by reconfiguring the existing design to match
testability requirements.
1.2 Objectives
The main objective of this project is to design a serial I/O Processor (IOP)
logic core with the Built-in-Self-Test (BIST) capability using Verilog HDL.
This entails the following sub-objectives:
1. To migrate the UTM serial I/O processor (IOP) from VHDL to Verilog
HDL modeling.
2. To upgrade the I/O Processor with Built-In-Self-Test capability.
3. To explore the possibility of implementing the I/O module design
on an Altera APEX20 family Field Programmable Gate Array (FPGA).
1.3 Scope of Work
(a) The IOP was originally designed in VHDL. It is converted to a Verilog
HDL design.
(b) The IOP is upgraded with embedded BIST capability.
(c) The design is modeled in Verilog HDL at the RTL abstraction level.
(d) This project is limited to designing, simulating, validating and verifying
the design at RTL level using Mentor Graphics’ Modelsim v6.1b.
(e) The design is synthesized into a gate-level netlist for the Altera
EP20K200EFC484-2X FPGA board using Altera’s Quartus II 6.1.
(f) Gate-level timing simulation, validation and verification are
performed using Modelsim-Altera 6.1g.
(g) Implementation on Altera’s EP20K200EFC484-2X FPGA board is
targeted on an opportunistic basis.
1.4 Project Contributions
Migrating the IOP to Verilog HDL brings several advantages. Verilog HDL
allows different levels of abstraction to be mixed in the same model, so a hardware
model can be defined in terms of switches, gates, RTL, or behavioral code. Besides,
most popular logic synthesis tools support Verilog HDL, which makes it the
language of choice for many ASIC companies. More importantly, all fabrication
vendors provide Verilog HDL libraries for post-logic-synthesis simulation. Thus,
designing a chip in Verilog HDL allows the widest choice of vendors. On top of
that, compared to VHDL, Verilog HDL provides better code efficiency and is easier
to learn. Nowadays, most IC design companies, such as Intel, Altera and Avago,
have migrated their HDL designs from VHDL to Verilog.
BIST is becoming more and more important. New ASIC (Application-Specific
Integrated Circuit) designs nowadays have embedded BIST. An IOP with BIST
capability helps the UTM Faculty of Electrical Engineering (FKE) in future research
in the area of IC testing. This project also serves as a starting point and proof of
concept. It can be a soft core IP (intellectual property) for UTM FKE and helps FKE
to own its own IC design with BIST.
With BIST implemented, the IOP can be tested automatically through
self-generated tests with exhaustive data values. This very high fault coverage
testing helps ensure that the IOP chip is fault free. With embedded BIST, expensive
tester requirements and testing procedures, from circuit or logic level to field level
testing, are minimized, which reduces the chip or system test cost. The reduction
in test cost in turn reduces the overall production cost.
An IOP with BIST capability at the same time ensures a fault-free circuit,
and thus ensures that the input data passing from the Front End Subsystem through
the IOP to the DUT are correct and that the DUT output values are reliable.
1.5 Thesis Outline
This thesis concentrates on the theory and design of the IOP with embedded
BIST capability and its functionality. The thesis is organized into 7 chapters.
Chapter 1 gives an overview of the project, its objectives, scope and
contributions, and the thesis outline. Chapter 2 discusses the project background,
literature survey and theory, which includes the theory of serial communication,
the UART, and an overview of Built-In-Self-Test covering the BIST process,
implementation, architecture and design.
Chapter 3 elaborates on the project methodology and the CAD design tools. It
discusses the project design and implementation flow and the CAD tools used in
carrying out this project. Chapter 4 gives an overview of the UTM IOP design.
This chapter covers the architecture of the IOP, the message format and the IOP
operations.
Chapter 5 is the core chapter of this thesis. It describes the BIST module
design in more detail, including the BIST architecture, operations, control unit and
data path unit (DPU). Chapter 6 elaborates on the simulation, results and analysis
of the IOP with embedded BIST capability. Finally, Chapter 7 concludes this
project with the conclusion, project limitations, recommendations and suggested
future work.
CHAPTER 2
BACKGROUND AND THEORY
This chapter discusses the theory of serial communication, the UART, and an
overview of Built-In-Self-Test, which covers the BIST process, implementation,
architecture and design.
2.1 An Overview of Serial Communication
Serial communication is the process of sending and receiving data one bit at
a time, sequentially, over a communication channel or computer bus. Parallel
communication, on the other hand, is a process where all the bits of each symbol
are sent together. In general, serial communication is used for all long-haul
communication and most computer networks, where the cost of cable and the
difficulty of synchronization make parallel communication impractical. Computer
buses and network links using serial communication are becoming more common
as improved technology enables them to transfer data at higher speeds.
There are 2 types of serial communication, full duplex and half duplex. A
full-duplex device can send and receive data at the same time. Thus, full-duplex
communication needs 2 different ports, one for serial-in data and another for
serial-out data. Half-duplex serial devices, on the other hand, support
communication in only one direction at a time and are therefore able either to
receive or to transmit data at any given time, but not both. Normally half-duplex
devices share the same port for both serial in and serial out. Although the IOP
designed in this project has 2 dedicated ports, serial in and serial out, for receiving
and transmitting data, it is considered a half-duplex device because it has only one
control unit, which manages either the receive or the transmit traffic at a time.
2.2 UART
A universal asynchronous receiver/transmitter (UART) is an asynchronous
serial receiver/transmitter. It is a piece of computer hardware that is commonly
used in the PC serial port to translate data between parallel and serial interfaces.
The UART takes bytes of data and transmits the individual bits in a sequential
fashion. At the receiving end, a UART re-assembles the bits into complete bytes.
Asynchronous transmission allows data to be transmitted without having to
send a clock signal to the receiver. Instead, the sender and receiver must agree on
timing parameters in advance, and special bits are added to each word to
synchronize the sending and receiving units. In general, a UART consists of two
main blocks, the transmitter and the receiver. The transmitter sends a byte of data
out of the UART serially, bit by bit, while the receiver receives the serial-in data
bit by bit and converts it into a byte of data.
The UART starts a data transmission by adding a bit called the "Start Bit" to
the beginning of each data word that is to be transmitted. The Start Bit informs the
receiver that a byte of data is about to be sent. After the Start Bit, the individual
bits of the byte of data are sent, with the Least Significant Bit (LSB) being sent
first. Each bit in the transmission is transmitted for exactly the same amount of
time as all of the other bits. The UART receiver, on the other hand, samples the
logic value being received at approximately halfway through the period assigned
to each bit to determine whether it is logic 1 or logic 0.
When a byte of data has been sent, the transmitter may add a Parity Bit. The
Parity Bit may be used by the receiver to perform simple error checking. In this
project, the parity bit is not implemented. After this, a Stop Bit is sent by the
transmitter to indicate that it has completed the data transmission. If another
byte of data is to be transmitted, the Start Bit for the new data can be sent as soon
as the Stop Bit for the previous word has been sent. Figure 2.1 below shows the
UART data frame format used by the IOP UART module in this project.
Figure 2.1 UART data frames format
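The frame format described above can be sketched as a small behavioral model. The Python below is only an illustration, not the Verilog RTL of the IOP UART module. The function names frame_byte and deframe_bits are hypothetical, and the line levels (start bit 0, stop bit 1) follow the usual UART convention, which is assumed here rather than stated in this chapter.

```python
def frame_byte(value):
    """Return the 10 line bits for one byte: start bit, 8 data bits LSB first, stop bit."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    data_bits = [(value >> i) & 1 for i in range(8)]  # bit 0 (LSB) goes out first
    # Assumed convention: start bit is logic 0, stop bit is logic 1, no parity bit.
    return [0] + data_bits + [1]


def deframe_bits(bits):
    """Rebuild the byte from a 10-bit frame, rejecting bad start or stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    # The i-th data bit received carries weight 2**i because the LSB arrives first.
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


if __name__ == "__main__":
    frame = frame_byte(0x41)
    print(frame)                     # the ten bits in transmission order
    print(hex(deframe_bits(frame)))  # the byte recovered by the receiver
```

Framing and deframing are exact inverses here, which mirrors the transmitter and receiver blocks converting between one parallel byte and ten serial bits.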
The speed of a serial connection is measured in bits per second, normally
expressed as the "baud rate". The duration of a bit depends on the baud rate. The
baud rate is the number of times the signal can switch states in one second. Thus, if
the line is operating at 9600 baud, the line can switch states 9,600 times per second.
This means each bit has a duration of 1/9600 of a second, or about 104
microseconds. In this project the baud rate of the UART module in the IOP is set to
9600. As shown in Figure 2.1, each character or byte requires 10 bits to be
transmitted. Thus, the IOP is able to transfer up to 960 bytes of data per second.
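The figures quoted above follow directly from the baud rate and the frame length. A short check is given below, assuming frames are sent back to back with no idle time between them; the constant names are illustrative only.

```python
BAUD_RATE = 9600      # bits per second, as set for the IOP UART module
BITS_PER_FRAME = 10   # 1 start bit + 8 data bits + 1 stop bit, no parity

# Duration of a single bit on the line, in microseconds.
bit_time_us = 1_000_000 / BAUD_RATE

# Upper bound on throughput when frames follow each other without gaps.
bytes_per_second = BAUD_RATE // BITS_PER_FRAME

if __name__ == "__main__":
    print(f"bit time: {bit_time_us:.2f} us")
    print(f"throughput: {bytes_per_second} bytes/s")
```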
2.3 BIST – An Overview
A digital system is tested and diagnosed on numerous occasions during its
lifetime. It is critical that such testing be quick and have very high fault coverage.
One common approach to ensure this, widely used in the semiconductor industry
for IC chip testing, is to specify test as one of the system functions, so that it
becomes self-test. A system designed without an integrated test strategy covering
all levels, from the entire system down to components, is described as chip-wise
and system-foolish. A properly designed Built-In-Self-Test (BIST) is able to offset
the cost of the added test hardware while at the same time ensuring reliability and
testability and reducing maintenance cost.
2.3.1 BIST Process
Figure 2.2 [1] shows the BIST system hierarchy for the 3 levels of packaging,
which are the system level, board level and chip level. The system consists of
several PCBs (or boards). Each PCB has multiple chips. The system Test Controller
can activate self-test simultaneously on all PCBs. The Test Controller on each PCB
can activate self-test on all the chips on the board. The chip Test Controller
runs the self-test on the chip and transmits the result to the board Test Controller.
The board Test Controller accumulates test results from all chips on the PCB and
sends the results to the system Test Controller. The system Test Controller uses all
of these results to determine whether the chips and boards are faulty.
Figure 2.2 BIST hierarchy
2.3.2 BIST Implementation
Figure 2.3 [2] shows the BIST hardware architecture in more detail. In this
project, the BIST module in the IOP is developed based on the architecture in
Figure 2.3. Basically, a design with embedded BIST consists of a test controller, a
hardware pattern generator, an input multiplexer, the circuit under test (CUT),
which in this project is the IOP, and an output response compactor. Optionally, a
design with BIST capability may also include a comparator and a Read-Only-Memory
(ROM).
As shown in Figure 2.3, the test controller controls the test patterns and test
generation during BIST mode. The hardware pattern generator generates the input
patterns for the CUT. Normally, the pattern generator generates exhaustive input
test patterns for the CUT to ensure high fault coverage. For example, a CUT with
10 inputs requires 2^10 = 1024 test patterns. The primary inputs are the inputs to
the CUT during non-BIST mode, in other words, functional mode. The input
multiplexer selects the correct inputs for the CUT in each mode.
During BIST mode, the input multiplexer selects the inputs from the hardware
pattern generator, while during functional mode it selects the primary inputs. The
output response compactor reduces the circuit responses to a manageable size that
can be used as the signature and stored in the ROM. The implementation of the
pattern generator and the response compactor is discussed in more detail in the
sections below.
[1], [2] Michael L. Bushnell, “Essentials of Electronic Testing for Digital, Memory and
Mixed-Signal VLSI Circuits”, Springer, pp. 496-497
Figure 2.3 BIST architecture
As mentioned earlier, a BIST block can optionally include a ROM and a
comparator. The ROM stores the golden signature obtained from simulation at the
pre-silicon phase. The comparator compares the signature obtained during BIST
mode with the golden signature. If the signature matches the golden signature, the
chip is considered fault free. If it does not match, the chip is considered faulty.
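The pass/fail decision described above can be illustrated with a toy behavioral model. Everything in this Python sketch is an assumption made for illustration: a 1-bit full adder stands in for the CUT (the real CUT is the IOP), an exhaustive 3-bit counter stands in for the hardware pattern generator, and a hypothetical 4-bit rotate-and-XOR register stands in for the output response compactor. The stored constant plays the role of the golden signature in ROM.

```python
def full_adder(a, b, c, carry_stuck_at_0=False):
    """Toy CUT: returns a 2-bit response (carry, sum). Optionally injects a stuck-at-0 fault."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    if carry_stuck_at_0:
        carry = 0  # model a carry output that is stuck at logic 0
    return (carry << 1) | s


def compact(signature, response, width=4):
    """Hypothetical compactor: rotate the signature left by one bit, then XOR in the response."""
    mask = (1 << width) - 1
    rotated = ((signature << 1) | (signature >> (width - 1))) & mask
    return rotated ^ response


def run_bist(cut):
    """Apply all 8 input patterns to the CUT and return the final signature."""
    signature = 0
    for pattern in range(8):
        a, b, c = pattern & 1, (pattern >> 1) & 1, (pattern >> 2) & 1
        signature = compact(signature, cut(a, b, c))
    return signature


# "Golden" signature, obtained here by simulating the fault-free model.
GOLDEN_SIGNATURE = run_bist(full_adder)


def chip_is_fault_free(cut):
    """Comparator: the chip passes only if its signature equals the golden one."""
    return run_bist(cut) == GOLDEN_SIGNATURE


def faulty_adder(a, b, c):
    return full_adder(a, b, c, carry_stuck_at_0=True)


if __name__ == "__main__":
    print(chip_is_fault_free(full_adder))    # fault-free model
    print(chip_is_fault_free(faulty_adder))  # model with the injected fault
```

In this toy case the injected fault changes the final signature, so the comparator flags the circuit as faulty. A real compactor can in principle map a faulty response stream to the golden signature, which is one reason the choice of compactor matters.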
As Figure 2.3 shows, the wires from the primary inputs to the input
multiplexer and the wires from the circuit output P to the primary outputs cannot be
tested by BIST. These wires require another testing method, such as external ATE
or JTAG Boundary Scan hardware.
2.3.3 BIST Pattern Generation
Various methods and approaches have been used to generate test patterns
during BIST. They are described in brief below:
(i) LFSR. A Linear Feedback Shift Register is used to generate pseudo-
random test patterns. This normally requires a sequence of one
million or more test patterns in order to achieve high fault coverage.
One advantage of the LFSR is that it uses very little hardware, and it
is thus currently the preferred BIST pattern generation method. In
this project, the LFSR is chosen as the test pattern generation method.
(ii) Binary Counters. A binary counter can generate an exhaustive but not
randomized test sequence. The drawback of a binary counter as the
pattern generator is that it requires more hardware than a typical
LFSR pattern generator.
(iii) Modified Counters. Modified counters have also been used
successfully as test-pattern generators. However, they also require
long test sequences.
(iv) ROM. This method stores a good test-pattern set from an ATPG
program in a ROM on the chip. However, the drawback of this
approach is that it is relatively expensive in chip area.
(v) Cellular Automaton. In this method, each pattern generator cell has a
few logic gates, a flip-flop, and connections only to neighboring gates.
The cell is replicated to produce the cellular automaton.
2.3.3.1 Standard LFSR
The standard LFSR is used in this project as the test pattern generator for
the BIST. This section discusses the implementation of the LFSR in detail. An
LFSR is a shift register whose input is a linear function of two or more of its bits
(taps), as shown in Figure 2.4 [3]. It consists of D flip-flops and linear exclusive-OR
(XOR) gates. It is considered an external exclusive-OR LFSR because the feedback
network of XOR gates feeds externally from X_0 to X_{n-1}.
Figure 2.4 Standard n-stage LFSR circuit
[3] Michael L. Bushnell, “Essentials of Electronic Testing for Digital, Memory and
Mixed-Signal VLSI Circuits”, Springer, p. 503
One of the two main parts of an LFSR is the shift register. A shift register
shifts its contents into adjacent positions within the register or, in the case of the
position on the end, out of the register. The position on the other end is left
empty unless some new content is shifted into the register. The contents of a shift
register are usually thought of as binary, that is, ones and zeroes. If a shift
register contains the bit pattern 1101, a shift (to the right in this case) results in
the contents being 0110; another shift yields 0011.
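The 1101 example above can be reproduced with a few lines of Python. This is a minimal sketch with a hypothetical function name; it assumes that the emptied leftmost position is filled with 0 when nothing new is shifted in.

```python
def shift_right(bits, shift_in="0"):
    """Shift a bit string one place to the right.

    Returns the new register contents and the bit that fell off the right end.
    The vacated leftmost position takes shift_in (assumed 0 when nothing is fed in).
    """
    return shift_in + bits[:-1], bits[-1]


if __name__ == "__main__":
    contents = "1101"
    for _ in range(2):
        contents, shifted_out = shift_right(contents)
        print(contents, "shifted out:", shifted_out)
```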
In an LFSR, the bits contained in selected positions in the shift register are
combined by a linear function, typically XOR, and the result is fed back into the
register's input bit. By definition, the selected bit values are collected before the
register is clocked, and the result of the feedback function is inserted into the shift
register during the shift, filling the position that is emptied as a result of the shift.
The bit positions selected for use in the feedback function are called "taps".
The list of the taps is known as the "tap sequence". By convention, the output bit of
an LFSR that is n bits long is the nth bit; the input bit of an LFSR is bit 1. The state of
an LFSR that is n bits long can be any one of 2^n different values. The largest state
space possible for such an LFSR is 2^n - 1, that is, all possible values except the zero
state. The all-zero state is not allowed in an LFSR because the feedback then always
produces 0, no matter how many clock cycles elapse. Because each state can have only
one succeeding state, an LFSR with a maximal-length tap sequence will pass through
every non-zero state once and only once before repeating a state.
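The behaviour described above can be illustrated with a short behavioural model. The
sketch below is written in Python rather than in the Verilog HDL used for the project,
and the 4-bit width, the tap sequence (4, 3), the seed and the function names are
assumptions chosen only for illustration; they are not taken from the IOP design.

```python
def lfsr_step(state, taps):
    """Advance an external-XOR LFSR by one clock.

    state -- tuple of bits; state[0] is bit 1 (input end), state[-1] is bit n (output end)
    taps  -- 1-based bit positions whose values are XORed to form the feedback
    """
    feedback = 0
    for t in taps:
        feedback ^= state[t - 1]      # tap values are sampled before the shift
    return (feedback,) + state[:-1]   # shift towards bit n; feedback enters at bit 1


def lfsr_sequence(seed, taps):
    """Return the states visited, starting at seed, until the seed state recurs."""
    states = [seed]
    state = lfsr_step(seed, taps)
    while state != seed:
        states.append(state)
        state = lfsr_step(state, taps)
    return states


TAPS = (4, 3)                         # assumed tap sequence for a 4-bit example
patterns = lfsr_sequence((1, 0, 0, 0), TAPS)
print(len(patterns))                  # 15 = 2^4 - 1 distinct non-zero states
print(lfsr_step((0, 0, 0, 0), TAPS))  # (0, 0, 0, 0): the all-zero state never changes
```

With this tap sequence the model visits all 15 non-zero states before returning to the
seed, while the all-zero state maps onto itself, which is why it must not be used as
the initial value.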
During BIST, it is important that the circuit be excited once and only once
with a particular pattern. This is because a given pattern causes an error vector to
appear at the faulty circuit outputs, which are read by the BIST response compactor,
and repeating the pattern later causes the same error vector to appear again. Since
the response compactor is an XOR-ing system as well, the two erroneous responses
from that error vector will cancel and leave the BIST system with only the
good-machine response. As a result, the testing hardware accepts a faulty
circuit as a good circuit. Thus, it is critical to avoid repeating any of the LFSR
patterns. In addition, as discussed, initializing the LFSR to all zeros is
strictly prohibited as this will hang the LFSR indefinitely in the all-zero state.
2.3.4 BIST Response Compaction
During BIST, the CUT produces a set of output values for every test pattern that is
generated. To ensure that the chip is fault free, the output values from the CUT for
each test pattern need to be compared with the correct output values obtained from
simulation. This is a tedious and time-consuming process. Thus, it is necessary to
reduce the enormous volume of circuit responses to a manageable size that can either
be stored on the chip or be easily compared with the golden response values. For
example, a BIST pattern generator in a chip can produce 1 million test patterns. If the
chip has a total of 100 primary outputs, then at the end of the BIST process it will
have generated 1 million output vectors, or 1,000,000 x 100 = 100 million bits of
output values. Such a large amount of data is very costly and almost impossible to
store in on-chip storage or ROM. Thus, the circuit response must be compacted.
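The arithmetic in this example can be checked with a few lines of Python. The
signature width used for the comparison (one signature bit per primary output) is an
assumption made only to give a sense of scale; it is not a figure from this project.

```python
test_patterns = 1_000_000      # patterns produced by the BIST pattern generator
primary_outputs = 100          # output bits captured per pattern

raw_bits = test_patterns * primary_outputs
print(raw_bits)                # 100000000 bits of uncompacted response

# Assumption for illustration only: the response is compacted into a signature
# with one bit per primary output.
signature_bits = primary_outputs
print(raw_bits // signature_bits)   # 1000000 times fewer bits to store
```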
The discussion of BIST response compaction relies on the following
terminology.
• Signature – A statistical property of a circuit, normally a
number computed for the circuit from its responses during testing. A
fault in the circuit should cause its signature to differ from
the signature of a good circuit.
• Signature Analysis – The golden signature is obtained from simulation
using the circuit response compaction method. During testing, the
actual signature of the CUT is generated. The process of comparing
the actual CUT signature with the golden signature is called signature
analysis. Signature analysis is used to determine whether the CUT is
faulty.
• Aliasing – A case where the signature of a faulty CUT obtained during
testing matches the golden signature because of the information
lost during compaction. When aliasing happens, a faulty circuit
still passes the test since its signature after compaction is
the same as the golden signature.
• Compaction – An approach that drastically reduces the number of bits
in the original circuit response during testing, with some information
lost. Regenerating the original circuit output response is not possible.
• Compression – A method of reducing the number of bits in the
original circuit response during testing where no information is lost.
With this, the original output sequence can be fully regenerated from
the compressed sequence.
Compared to compaction, compression schemes are at present impractical for
BIST response analysis because they do not reduce the huge volume of data enough.
Thus, in this project, compaction is used instead of compression.
Several methods can be used for response compaction, such as transition count
response compaction, LFSR response compaction, modular LFSR response compaction
and the multiple input signature register (MISR). In this project, a multiple input
signature register, as shown in Figure 2.5⁴, is used as the response compactor.
4 Michael L. Bushnell, “Essentials of Electronic Testing for Digital, Memory and Mixed-Signal
VLSI Circuits”, Springer, p. 517
Figure 2.5 Multiple input signature register
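The idea of compacting many response vectors into one signature can be sketched with a
behavioural model of a multiple input signature register. The Python sketch below is an
illustration only: it assumes a 4-bit register with external XOR feedback and the tap
sequence (4, 3), and the response vectors are made-up values; it is not the circuit of
Figure 2.5 nor the compactor used in the IOP.

```python
def misr_step(state, inputs, taps):
    """One clock of a simple MISR: LFSR shift, then XOR the parallel inputs in."""
    feedback = 0
    for t in taps:
        feedback ^= state[t - 1]
    shifted = (feedback,) + state[:-1]
    return tuple(s ^ d for s, d in zip(shifted, inputs))


def misr_signature(responses, taps, width):
    """Compact a list of response vectors into one width-bit signature."""
    state = (0,) * width
    for response in responses:
        state = misr_step(state, response, taps)
    return state


TAPS = (4, 3)                                            # assumed tap sequence
good = [(1, 0, 1, 1), (0, 1, 1, 0), (1, 1, 0, 0)]        # made-up fault-free responses
faulty = [(1, 0, 1, 1), (0, 0, 1, 0), (1, 1, 0, 0)]      # one bit flipped in the 2nd vector

golden_signature = misr_signature(good, TAPS, 4)
actual_signature = misr_signature(faulty, TAPS, 4)
print(golden_signature)                       # (1, 1, 0, 1)
print(actual_signature)                       # (1, 1, 1, 1)
print(golden_signature == actual_signature)   # False: the fault is detected
```

Signature analysis then reduces to a single comparison of the two signatures. Because
twelve response bits are folded into four, other error patterns could map to the golden
signature, which is the aliasing case defined above.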
2.3.5 Built-In Logic Block Observers
A Built-In Logic Block Observer (BILBO) is a circuit that combines the
functionality of a D flip-flop, a standard LFSR hardware test pattern generator
(for the circuit portion driven by the BILBO Q outputs), a test response compactor
(for the circuit portion driven by the BILBO D inputs) and a scan chain. By
shifting an all-zero pattern into the BILBO in serial scan mode, the scan chain
can be reset to zero. Figure 2.6⁵ shows the circuit of the BILBO, while Table 2.1
shows the control modes for the BILBO of Figure 2.6. The BILBO in Figure 2.6 uses
NAND gates to improve speed over an implementation with AND and OR
gates.
Figure 2.7⁶ illustrates the effective BILBO hardware in serial scan mode,
when B1 and B2 equal “00”. Figure 2.8⁷ shows the hardware in LFSR mode, with B1
and B2 equal to “01”. Figure 2.9⁸ shows the hardware in D flip-flop mode, with
B1 and B2 equal to “10”, and Figure 2.10⁹ shows the hardware in MISR mode, with B1 and
B2 equal to “11”. The bold lines show the enabled data path.
Figure 2.6 BILBO circuit
Table 2.1 Control modes for the BILBO of Figure 2.6
B1  B2  Operation Mode
0   0   BILBO serial scan mode
0   1   BILBO LFSR pattern generator mode
1   0   D flip-flop mode
1   1   MISR mode
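The four control modes of Table 2.1 can be summarised in a small behavioural model.
The Python sketch below follows the table only at the functional level; the 4-bit
width, the tap sequence (4, 3) and all names are assumptions for illustration and do
not reproduce the gate-level circuit of Figure 2.6.

```python
MODES = {
    (0, 0): "serial scan",
    (0, 1): "LFSR pattern generator",
    (1, 0): "D flip-flop",
    (1, 1): "MISR",
}


def bilbo_step(state, b1, b2, d_inputs, scan_in=0, taps=(4, 3)):
    """Return the next register contents for one clock in the selected mode."""
    feedback = 0
    for t in taps:
        feedback ^= state[t - 1]
    mode = MODES[(b1, b2)]
    if mode == "serial scan":
        return (scan_in,) + state[:-1]          # shift the scan input in
    if mode == "LFSR pattern generator":
        return (feedback,) + state[:-1]         # autonomous pattern generation
    if mode == "D flip-flop":
        return tuple(d_inputs)                  # normal parallel load
    shifted = (feedback,) + state[:-1]          # MISR: shift and fold in D inputs
    return tuple(s ^ d for s, d in zip(shifted, d_inputs))


# Resetting the register by scanning in zeros for as many clocks as there are stages.
state = (1, 0, 1, 1)
for _ in range(4):
    state = bilbo_step(state, 0, 0, d_inputs=(0, 0, 0, 0), scan_in=0)
print(state)                                            # (0, 0, 0, 0)
print(bilbo_step((1, 0, 0, 0), 0, 1, (0, 0, 0, 0)))     # (0, 1, 0, 0)
print(bilbo_step((0, 0, 0, 0), 1, 0, (1, 1, 0, 0)))     # (1, 1, 0, 0)
print(bilbo_step((1, 0, 1, 1), 1, 1, (0, 1, 1, 0)))     # (0, 0, 1, 1)
```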
Figure 2.7 BILBO in serial scan mode
Figure 2.8 BILBO in LFSR mode
Figure 2.9 BILBO in normal D flip-flop mode
Figure 2.10 BILBO in MISR mode
5,6,7,8,9 Michael L. Bushnell, “Essentials of Electronic Testing for Digital, Memory and
Mixed-Signal VLSI Circuits”, Springer, pp. 520-523
CHAPTER 3
METHODOLOGY AND DESIGN TOOLS
This chapter discusses the methodology and design tools that were used to develop and
implement this project. The contents of this chapter include the project design and
implementation flow, Verilog HDL, FPGA synthesis, and CAD tools such as Modelsim v6.1,
Quartus II 6.1 and Modelsim-Altera 6.1g Web Edition.
3.1 Project Design and Implementation Flow
Figure 3.1 shows the design flow of the IOP with BIST capability. The process
can be divided into 3 phases. The first phase begins with the study of the
UTM IOP design. Once the design is understood, the IOP in VHDL is migrated to
Verilog HDL. After the migration, verification and validation are
performed to ensure that the new IOP in Verilog HDL functions correctly. If it
does not, modifications are made to both the Verilog HDL and the VHDL versions of the
IOP when the errors affect both designs. On the other hand, if the errors are caused
by the Verilog migration, only the IOP in Verilog is corrected. After the design
is verified, phase 1 is considered complete.
Figure 3.1 Project design flow
Phase 2 starts with the BIST module design. The BIST module is
designed in Verilog HDL at RTL level. As in phase 1, the BIST module is
validated and corrected. The process repeats until the BIST module functions
correctly according to the design specification. Phase 3 is the integration of the
IOP with the BIST module to form a new IOP with embedded BIST capability.
The design is considered complete after the new IOP has been verified.
In this project, the design is modeled using Verilog HDL at RTL level. As
shown in Figure 3.2, a digital system design can be modeled using HDL at different
abstraction levels. Behavioral abstraction is the highest level of HDL modeling.
A design modeled at this level is less accurate but has the fastest simulation speed.
In most cases, a design at behavioral level is not synthesizable by the design
synthesis tools. Behavioral modeling is widely used for test bench modeling and for bus
functional models that model the components around the DUT during the validation phase.
Figure 3.2 Multiple abstractions for digital system design
The next level of modeling below behavioral modeling is RTL modeling.
RTL modeling has been used in this project. It is more accurate than
behavioral modeling but simulates more slowly because of the additional detail
at this abstraction level. A design at RTL level can be synthesized to a gate
level netlist by most synthesis tools. RTL design is widely
used in the ASIC design industry.
The lowest levels of digital hardware modeling are gate level and transistor
level modeling. These 2 levels provide the most accurate hardware
modeling, including timing information, and are able to model the design as close to
real silicon as possible. However, the simulation speed for a design at gate or
transistor level is extremely slow. On top of that, design at gate or transistor level
is normally performed with the help of CAD synthesis tools, by synthesizing the RTL
level design to gate level and from gate level to transistor level.
Figure 3.3 Digital design flow
[Figure content: specifications (input-output interface descriptions, circuit functions);
behavioral algorithm and top-level architecture (flow chart, block diagrams, functional
block diagrams); RTL control algorithm (RTL code, RTL control sequence table), divided
into the data path unit (processing unit, storage/registers architecture, steering logic,
bussing, interconnection, control signals) and the control unit (data transfer as FSM
output, FSM transition); Verilog code and simulation; design synthesis into gate and
transistor level.]
With the advancement in CAD tools, digital hardware is rarely modeled at gate
level or transistor level by the designer. In most if not all cases, the
design is modeled at RTL level and synthesized to gate and transistor level by the
CAD tools.
Figure 3.4 shows the detailed design and implementation flow of the IOP with
BIST capability. This project applies design modularization and a divide and
conquer approach. During the design process, the design is partitioned into
smaller functional blocks. Each functional block is designed and verified separately.
The design is modeled using Verilog HDL at RTL level. Simulation and validation are
then performed using Modelsim v6.1 from Mentor Graphics.
Once the design at RTL level is completed and validated, each design module
is synthesized into a gate level netlist using Altera Quartus II 6.1. The Altera
APEX20KE family FPGA device is selected in this project. After the synthesis, gate
level timing simulation is performed using Modelsim-Altera v6.1g. Modelsim-Altera
reads in the gate level netlist as well as the timing information from the
Standard Delay Format (SDF) file generated by Quartus II 6.1 during the synthesis
and performs the timing simulation.
During the simulation, not only is the functionality of the design verified, but
the timing of the design, such as setup and hold times, interconnect delay and
input/output path delay, is also checked by the simulator. At this phase, the design
with timing information is modeled as close to the actual hardware
implementation as possible. Simulation at this phase is much slower than simulation
at RTL level.
Figure 3.4 Project design and implementation flow
[Flowchart content: the flow is organized in three stages. The steps shown are: IOP with
BIST design specifications and definitions; partitioning; modeling of each partitioned
design in Verilog HDL; RTL simulation using Modelsim v6.1, with debugging and correction
of the RTL if required; synthesis using Quartus II 6.1 and gate level timing simulation
with the APEX20KE FPGA family using Modelsim-Altera v6.1g; integration of all the
partitioned designs into a complete design, followed by synthesis of the entire IOP with
BIST using Quartus II and full chip gate level timing simulation using Modelsim-Altera
v6.1g; hardware testing by implementing the design in the EP20K200EFC484 FPGA chip on the
Altera Nios APEX20KE board, with debugging and modification until the result is correct.]
After the partitioned designs have been verified, all the functional blocks are integrated
to form the complete design. Full chip RTL simulation and validation are
then performed. Once the validation at RTL level is complete, the entire IOP is
synthesized and gate level timing simulation is performed using Quartus II and
Modelsim-Altera.
Finally, the design is implemented in hardware by programming it
into the Altera EP20K200EFC484 FPGA on the board.
3.2 Verilog HDL
Verilog is used to model the design in this project. It is a hardware
description language (HDL) used to model electronic systems. It is sometimes called
Verilog HDL, which supports the design, verification, and implementation of analog,
digital, and mixed-signal circuits at various levels of abstraction.
Verilog was originally developed and owned by Gateway Design Automation in 1984.
Cadence Design Systems later purchased Gateway and, in 1990, continued selling
Verilog-XL as a Verilog HDL simulator with PLI support. In 1995 the Verilog HDL
specification was accepted as the IEEE 1364 standard,
which included the PLI 1.0 (TF/ACC) routines as a standard for all Verilog
simulators. The PLI 2.0 (VPI) routines were released as a standard by OVI in 1993, and
a vote on updating the 1364 standard to include PLI 2.0 followed in 1999. In 2001
the IEEE accepted the updated Verilog standard, commonly known as Verilog 2001, and
today there are a dozen simulators that simulate Verilog HDL.
Verilog was created as a language for industry rather than academia. It
has a C-like programming style that closely represents hardware. VHDL (with std_logic)
supports nine-valued logic, whereas Verilog supports four logic values (0, 1, x and z)
combined with several drive strength levels. Compared to Verilog,
VHDL offers more programming constructs, whereas Verilog is closer to hardware.
Basically, there are various reasons for converting the IOP VHDL design to
Verilog. Verilog HDL is a general purpose hardware description language that is
easy to learn and easy to use. It is similar to C programming language. Designers
with C programming experience will find it easy to learn Verilog HDL, and will be
comfortable with its syntax.
In addition, Verilog allows different levels of abstraction to be mixed in the
same model. Thus, a designer can define a hardware model in terms of switches,
gates, RTL, or behavioral code. Also, a designer needs to learn only one language
for stimulus and hierarchical design. On top of that, most popular logic synthesis
tools support Verilog HDL. This makes it the language of choice for many ASIC
companies. More importantly, all fabrication vendors provide Verilog HDL libraries
for post logic synthesis simulation. Thus, designing a chip in Verilog HDL allows
the widest choice of vendors.
A VHDL to Verilog HDL conversion table is given in Appendix B.
3.3 Mentor Graphics Modelsim v6.1
Modelsim v6.1 from Mentor Graphics is used to compile and simulate
the IOP design at RTL level. ModelSim v6.1 is a UNIX, Linux,
and Windows-based simulation environment that combines high performance with
advanced debugging capabilities. ModelSim offers flexibility
by supporting 32 and 64 bit UNIX and Linux and 32 bit Windows based platforms.
It enables transparent mixing of VHDL, Verilog, and SystemC in one design,
using a common, intuitive graphical interface for development and debug at any level,
regardless of the language.
The combination of industry-leading performance and capacity with the best
integrated debug and analysis environment make ModelSim the simulator of choice
for both ASIC and FPGA design as well as this project. The best standards and
platform support in the industry make it easy to adopt in the majority of process and
tool flows. ModelSim has a unified kernel that provides a true, native
mixed-language environment for Verilog 95, 2001, 2005; VHDL 87, 93, 2000;
SystemVerilog 2005 for design; and SystemC 1666 2005.
Modelsim provides several benefits compared to other simulators, such as a
strong mixed-language environment and high simulation performance.
In addition, the intuitive GUI makes it easy to view and access the many
capabilities of ModelSim. The debug environment for Modelsim is common across
all languages and thus requires no additional learning when the design language
changes. More importantly, all ModelSim products are 100% standards based.
Compared with Altera Quartus II, Modelsim provides much better debug
capability via dataflow signal tracing and the drivers command. It also offers
visibility of the internal module signals and wires, and this feature
dramatically speeds up the debugging process. On top of that, Modelsim provides
better simulation speed and a better test bench environment than Quartus II.
3.4 FPGA Synthesis
In this project, RTL synthesis is performed for an Altera FPGA device. A
field-programmable gate array (FPGA) is a programmable device that is used to
implement relatively large logic circuits. FPGAs provide logic blocks for
implementation of the required functions. The general architecture of an FPGA is
shown in Figure 3.5. It consists of 3 main types of block: logic blocks, I/O blocks for
connecting to the pins of the package, and interconnection wires and switches. The
interconnection wires are organized as horizontal and vertical routing channels
between rows and columns of logic blocks. The logic blocks in the FPGA are
organized in a two-dimensional array. There are wires and programmable switches
in the routing channels that allow the logic blocks to be interconnected in many ways.
Connections between the I/O blocks and the interconnection wires are also
programmable.
Figure 3.5 General structure of an FPGA
FPGA synthesis begins with the user providing a design in a hardware description
language (HDL) such as VHDL or Verilog, or a schematic design. During
synthesis, a technology-mapped netlist is generated. The netlist can then be fitted to
the actual FPGA architecture using place-and-route, usually performed by the FPGA
company's proprietary place-and-route software, such as Quartus II used in this
project. Normally the CAD tools validate the map, place and route results via
timing analysis, simulation, and other verification methodologies. Once the design
and validation process is complete, the binary file generated by the tool is used
to configure the FPGA device.
The number of programmable switches and wires differs from one FPGA device
to another. Some FPGA devices on the market today
are able to implement logic circuits of more than a million equivalent gates. FPGAs
are suitable for implementing circuits over a large range of sizes, from a small logic
circuit of 1000 gates to a large logic circuit of a million equivalent logic gates.
3.5 Altera Quartus II 6.1
Altera Quartus II 6.1 is used as the FPGA synthesis tool in this project to
synthesize the RTL design to a gate level netlist and to program the netlist
into the APEX20KE family FPGA on the board. It provides high levels of productivity and
a fast path to design completion for high-density FPGA design, which
improves productivity compared to traditional high-density FPGA design flows.
Figure 3.6 illustrates the Quartus II design flow and the tasks performed by the tool.
Figure 3.6 Quartus II design flow
The design flow starts with the design entry which in this project is the
Verilog RTL IOP design. After the design entry, the Analysis & Synthesis module
of the Compiler is used to analyze the design files and create the project database.
Analysis & Synthesis uses Quartus II Integrated Synthesis to synthesize the Verilog
Design Files (.v) and then generate an EDIF netlist file (.edf) or a Verilog Quartus
Mapping File (.vqm) that can be used with the Quartus II software.
Synthesis is followed by Place & Route, in which the Quartus II
Fitter places and routes the design. This process is also referred to as “fitting” in
the Quartus II software. Using the database that has been created by Analysis &
Synthesis, the Fitter matches the logic and timing requirements of the project with
the available resources of an FPGA device. In this project the FPGA device is
APEX20KE. The Fitter assigns each logic function to the best logic cell location for
routing and timing, and selects appropriate interconnection paths and pin assignments.
The process is then followed by timing analysis using Quartus II TimeQuest
Timing Analyzer and Classic Timing Analyzer to analyze the performance of all
logic in the design and help to guide the Fitter to meet timing requirements. The
information generated by the timing analyzers can be used to analyze, debug, and
validate the timing performance of the design.
After timing closure is achieved, that is, when the entire design is free of
timing violations, simulation can be performed. In this project, gate level timing
simulation is performed using Modelsim-Altera, which is discussed below.
Once the design has been successfully compiled and validated, it can be
programmed or configured into an Altera device. In this project the device used is
EP20K200EFC484-2X. The Assembler module of the Quartus II Compiler
generates programming files that the Quartus II Programmer can use to program or
configure a device with Altera programming hardware.
3.6 Modelsim-Altera v6.1g Web Edition
After the synthesis, Quartus II writes out the gate level netlist as well
as the timing delay SDF file. At this stage, Modelsim-Altera v6.1g is used as the
simulator to perform gate level timing simulation. The ModelSim-Altera tool
supports behavioral simulation and VHDL or Verilog test benches for all Altera
devices. ModelSim-Altera Edition software is available with Altera software for
the PC, Solaris, HP-UX, or Linux platforms to support either VHDL or Verilog HDL
simulation.
Compared to Modelsim 6.1b, ModelSim-Altera Edition software has the
limitations that it is licensed for a single language, either VHDL or Verilog HDL,
and only supports Altera gate-level libraries. The ModelSim-Altera software includes
all ModelSim v6.1 features, including behavioral simulation, HDL test benches, and tool
command language (Tcl) scripting. However, the simulation performance of the
ModelSim-Altera software is slower than that of the ModelSim v6.1 software.
ModelSim-Altera Web Edition is further restricted, as it has a line limit of 10,000
lines compared to the unlimited number of lines allowed in ModelSim v6.1.
Modelsim-Altera is selected as the gate level timing simulator because it
provides the capability to read in the Altera FPGA device libraries as well as the
netlist and SDF file. Compared to Quartus II 6.1, Modelsim-Altera is able to run
simulations at a higher speed, with better dataflow debug capability and visibility
of internal signals.
CHAPTER 4
I/O PROCESSOR CORE DESIGN
This chapter discusses the design of the UTM I/O Processor (IOP) core. This
includes the design specification and architecture of the IOP, the IOP Control Unit,
the UART module and the communication protocol. This chapter is written with reference
to the original IOP design thesis titled “PC-Based FPGA Prototyping System – I/O
Processor Design” (2005) by Chua, Lee Ping.
4.1 IOP – An Overview
The IOP was originally designed as one of the 2 main modules in the PC-based
FPGA Prototyping System. Figure 4.1 illustrates the system architecture
of the PC-based FPGA Prototyping System. This system consists of two main blocks:
the IOP core and the Front-End Subsystem. The DUT is connected to the IOP in the FPGA
board. The UART module within the IOP core handles serial communication.
Signals serial_in and serial_out of the IOP are connected to the RS232 Level
Converter Circuit, which consists of a MAX232N chip. The chip converts
between RS232 signal levels and CMOS logic signal levels. A serial cross cable
connects the male DB9 connector attached to the RS232 Level Converter Circuit
with the male DB9 connector at the PC serial port. In the Front-End Subsystem (FES)
design at the PC Host, a Visual Basic GUI serves as the interface between the
user and the processing module of the FES. The FES processes input data from the
user and data received at the PC serial port. It sends the input data from the user
to the DUT via the IOP and receives the output values of the DUT via the IOP
serial_out signal. The FES processes the serial output from the IOP and displays the
DUT output data values in the FES GUI. The PC Serial API is used for serial
communication at the PC Host. In this project, the discussion focuses on the IOP design.
Figure 4.1 Overall System Architecture of PC-based Design and Test System for digital design in FPGA
4.2 IOP Architecture
The IOP core consists of two main modules, the Control Unit (CU) and the Data Path
Unit (DPU), as shown in Figure 4.2. The DPU is further partitioned into the UART module
and the I/O Interface module. The UART module contains a UART_Comm block with
a UART Transmitter and a UART Receiver, while the I/O Interface module contains
an Interface module and a Buffers module.
State sampling for all control units in IOP core happens at negative clock
edge, while all registers and DPU in IOP sample at positive clock edge.
The IOP core is designed to be connected to a DUT so that the DUT can receive its
input from, and send its output to, the FES at the PC Host via serial communication.
outputpin, inputpin, busy and ack are the interconnection signals between the IOP and
the DUT. outputpin is an n-bit data bus and inputpin is an m-bit data bus, where
the valid values of m and n are in the range 2 to 64. Since m and n have a minimum
value of 2, these signals exist with at least 2 bits. The IOP core has 8 8-bit input
buffers and 8 8-bit output buffers, which can store a total of 8
bytes of input data and 8 bytes of output data at a time. Thus, the maximum width of
outputpin and inputpin is 64 bits.
In addition, the clock generator in the UART module is specified to derive the clock
for 9600 baud serial communication from the 25 MHz internal clock of the FPGA;
hence, the communication baud rate is set to 9600 baud. The UART module
contains a transmit buffer and a receive buffer. Transactions should be controlled
so that the buffers do not overflow and data is not overwritten.
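The relationship between the 25 MHz clock and the 9600 baud rate can be worked out with
a short calculation. The Python sketch below assumes a single integer divider that
produces one tick per bit period; the actual clock generator in the IOP may divide in
several stages or use oversampling, so the numbers only indicate the order of magnitude
involved.

```python
def baud_divisor(clock_hz, baud):
    """Return (divisor, actual baud rate, error in percent) for an integer divider."""
    divisor = round(clock_hz / baud)
    actual_baud = clock_hz / divisor
    error_percent = (actual_baud - baud) / baud * 100.0
    return divisor, actual_baud, error_percent


divisor, actual_baud, error_percent = baud_divisor(25_000_000, 9600)
print(divisor)                      # 2604 clock cycles per bit period
print(round(actual_baud, 2))        # 9600.61
print(abs(error_percent) < 0.01)    # True: rounding the divisor costs under 0.01 %
```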
busy and ack are two handshaking signals between the IOP and the DUT. Signal
busy is an active low signal indicating the operation status of the DUT. If this signal
is de-asserted by the DUT, it indicates that the DUT has halted its operation and the
IOP can read its current output value. When the IOP sees that signal busy is
de-asserted, it reads the current output value of the DUT and asserts signal ack as
soon as it finishes reading the output value. The ack signal informs the DUT that the
IOP has fetched the output values and that the DUT is allowed to continue with a new
operation to produce a new output value. The connection of signals ack and busy is
optional. If the DUT does not implement signal busy or
signal ack, both of these signals exist in the IOP as non-connected internal signals.
Apart from the data I/O signals and handshaking signals discussed above,
there are also other signals, which include clk_25mhz, reset_b, serial_in and
serial_out. clk_25mhz is an internal clock signal; reset_b is a reset signal to the IOP;
serial_in is the serial input signal from the FES. All data and commands from the
FES are sent serially to the IOP at a baud rate of 9600 via serial_in. serial_out, on the
other hand, is the serial output signal from the IOP to the FES. The IOP transmits its
data serially to the FES through serial_out, also at a baud rate of 9600.
Figure 4.2 IOP Architecture
4.3 Message Format
The IOP needs to communicate via a communication protocol. The protocol
implemented is based on the computer serial interface software protocol with some
modifications. In this protocol, data is sent and received in a specific data packet
format, called the message format, as shown in Figure 4.3. Each data packet must start
with an STX control byte (02h), followed by a command byte and data bytes, and end with
an ETX control byte (03h). A DLE control byte (10h) is used to send data with
the same value as a control or command byte, by inserting the DLE byte before
that particular data byte. The control and command (CMD) bytes are listed in Table 4.1.
Figure 4.3 Data Packet Format
Table 4.1 (a) Table of Control Bytes
Control   Hex format   Description
STX       02h          Start of text – start of transmission packet
DLE       10h          Attached before a data byte with same value as
                       control bytes or command bytes
ETX       03h          End of text – end of transmission packet
Table 4.1 (b) Table of Command Bytes
Command byte   Hex format   Description
INIT           69h          Initialize – used to tell the IOP the total number of
                            input and output bytes of the DUT connected to it. The
                            INIT command byte is followed by 2 data bytes. One of
                            them indicates the number of input bytes and the other
                            indicates the number of output bytes. If bit (7) of a
                            data byte = 0, bit (6..0) is the number of input bytes.
                            If bit (7) of a data byte = 1, bit (6..0) is the number
                            of output bytes.
TERMINATE      78h          Terminate the communication. The INIT command byte is
                            needed to reinitialize the communication.
TRANS          74h          Transmit – indicates that all data bytes following this
                            command byte are data sent either from the PC serial
                            port to the DUT or from the DUT to the PC serial port.
POLL           72h          Poll – indicates readiness for reception of data. Used
                            to tell the IOP to send the DUT output data back to the
                            PC Host.
IRR_POLL       70h          Interrupt poll – functions in about the same way as
                            POLL, but this command byte is used for a DUT with busy
                            and ack signals (the DUT stops its operation at a
                            particular point and activates a handshaking signal,
                            busy, to the IOP for the IOP to load the DUT output
                            data). When the IOP receives the IRR_POLL command byte,
                            it waits until the DUT de-activates signal busy (active
                            low), and it then activates signal ack to tell the DUT
                            to continue its operation.
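The message format and the byte values of Table 4.1 can be exercised with a small
framing routine. The Python sketch below is an illustration of the byte stuffing rule as
described in the text; the function names are hypothetical, and the assumption that a
data byte equal to DLE itself is also prefixed with DLE follows from DLE being listed as
a control byte.

```python
STX, ETX, DLE = 0x02, 0x03, 0x10
COMMANDS = {"INIT": 0x69, "TERMINATE": 0x78, "TRANS": 0x74, "POLL": 0x72, "IRR_POLL": 0x70}
RESERVED = {STX, ETX, DLE} | set(COMMANDS.values())


def build_packet(command, data=()):
    """Frame a command and its data bytes as STX, CMD, data..., ETX."""
    packet = [STX, COMMANDS[command]]
    for byte in data:
        if byte in RESERVED:
            packet.append(DLE)        # prefix data that looks like a control/command byte
        packet.append(byte)
    packet.append(ETX)
    return bytes(packet)


def parse_packet(packet):
    """Return (command byte, data bytes) from a framed packet, removing DLE prefixes."""
    if packet[0] != STX or packet[-1] != ETX:
        raise ValueError("packet must start with STX and end with ETX")
    data, i = [], 2
    while i < len(packet) - 1:
        if packet[i] == DLE:
            i += 1                    # skip the prefix, keep the byte that follows
        data.append(packet[i])
        i += 1
    return packet[1], data


packet = build_packet("TRANS", [0x41, 0x02, 0x10])
print(packet.hex(" "))                # 02 74 41 10 02 10 10 03
print(parse_packet(packet))           # (116, [65, 2, 16])
```

In this example the data bytes 02h and 10h collide with STX and DLE, so each is sent
with a DLE prefix, and the receiver recovers the original three data bytes.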
4.4 IOP Operation
The communication must always start with an INIT message, that is, STX followed by
the INIT command byte, in order to tell the IOP how many bytes of data the FES will
transmit to the IOP in a transmission and how many bytes of data the IOP will transmit
back to the FES. The message then ends with ETX. After this, the FES can transmit data
to the IOP by sending the TRANS message followed by the data. The number of
data bytes the IOP receives from the FES must match the number of bytes
set by the INIT message. If the FES sends more data than the number of
bytes indicated to the IOP by the INIT message, the extra bytes of data are
discarded. To read data back, the FES sends POLL or IRR_POLL to ask the IOP
to transmit data to the FES. The IOP then transmits the data in the output buffers
serially to the FES via the serial_out signal. For more details of the IOP operation
and design, please refer to the thesis titled “PC-Based FPGA Prototyping System – I/O
Processor Design” (2005) by Chua, Lee Ping.
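The INIT data bytes and the handling of surplus TRANS data can be sketched as follows.
The Python code is a hedged illustration of the rules stated in Table 4.1 (b) and in the
paragraph above; the function names are hypothetical, and sending the input count before
the output count is an assumption, since bit 7 identifies each byte regardless of order.

```python
def init_data_bytes(num_input_bytes, num_output_bytes):
    """Encode the two INIT data bytes: bit 7 = 0 for inputs, bit 7 = 1 for outputs."""
    for count in (num_input_bytes, num_output_bytes):
        if not 0 <= count <= 0x7F:
            raise ValueError("count must fit in bits 6..0")
    return [num_input_bytes, 0x80 | num_output_bytes]


def decode_init_byte(byte):
    """Return ('input' or 'output', count) for one INIT data byte."""
    direction = "output" if byte & 0x80 else "input"
    return direction, byte & 0x7F


def accept_trans_data(data, num_input_bytes):
    """Keep only as many TRANS data bytes as INIT announced; extra bytes are discarded."""
    return list(data[:num_input_bytes])


print([hex(b) for b in init_data_bytes(8, 8)])     # ['0x8', '0x88']
print(decode_init_byte(0x88))                      # ('output', 8)
print(accept_trans_data([1, 2, 3, 4, 5], 3))       # [1, 2, 3]
```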
CHAPTER 5
DESIGN OF IOP WITH EMBEDDED BIST CAPABILITY
This chapter is the core chapter of this thesis and discusses the embedded
BIST module design for the IOP. It starts with the top level architecture
of the IOP with embedded BIST, followed by the BIST module design, which
consists of a control unit and a data path unit, and ends with the operation of the
IOP with BIST.
5.1 Top Level Architecture
Figure 5.1 shows the IOP block diagram with the embedded BIST module.
Compared to the original IOP without BIST capability, the new IOP adds 2 input
ports, B1 and B2, to control the BIST operation.
Figure 5.2 illustrates the internal blocks needed to construct the IOP. Please
note that, to keep the figure clean, not all the interconnections between the modules
are drawn in Figure 5.2. For the detailed connections, please refer to the IOP module
Verilog HDL code in the appendix section. Compared to the IOP discussed in
Chapter 4, the new IOP design keeps the 3 main blocks, which are the IOP CU, the
IOP DPU and the 1 MHz clock generator block, and adds two new functional blocks.
One is the BIST module and the other is the input/output multiplexer, which selects
the primary inputs and outputs according to the operation mode.
Figure 5.1 IOP block diagram with embedded BIST
In BIST mode, the numbers of inputpin and outputpin are both set to the
maximum of 64, to ease the design implementation and to obtain the highest fault
coverage. The BIST module in the IOP generates random data and shifts the data out
serially, together with the command and control bytes, to form message packets that
act as the serial in to the UART module. The UART module translates the serial in
into parallel command and data bytes, and the IOP CU module decodes the commands
and data received and stores the data in the input buffers.
As shown in Figure 5.4, during BIST mode the input buffers are fed back and
connected to the output buffers. After the IOP receives the random data, the BIST
module sends command bytes to request the IOP to transmit the data in the output
buffers serially. Because the input buffers are fed back to the output buffers in BIST
mode, the IOP transmits back the data it has just received. This serial out is the IOP
output response during BIST mode and is used by the MISR in the BIST module, which
compacts the response and produces a signature. With the received data fed back as
the transmit data, the IOP can be tested for both receive and transmit transactions.
Figure 5.2 IOP design with embedded BIST capability architecture
Figure 5.3 IOP with embedded BIST Design Hierarchy
Figure 5.4 High level BIST operation flow.
The BIST module needs to send the data generated by the pattern generator to the
IOP serially and must comply with the message packet protocol. The sequence of
command and data bytes that the BIST module sends to the IOP during BIST mode is
as follows. BIST first sends ‘STX’ to indicate start of text, followed by ‘INIT’ to
initialize the IOP input and output pin numbers. ‘INIT’ is followed by 2 bytes of data
that indicate the numbers of input and output pins. In BIST mode, in order to have the
highest coverage, both are set to the maximum of 64. After this, the command byte
‘ETX’ is sent to indicate end of text.
STX INIT InputNum OutputNum ETX
TRANS DATA DATA DLE DATA ... DATA ETX
POLL DATA DATA ... ETX
TRANS DATA ...
After this, BIST sends the command byte ‘TRANS’ to transmit the data generated
by the pattern generator to the IOP. BIST sends 8 bytes of data at a time, as the IOP
only has 8 8-bit input buffers to store the input data. If a generated data byte has the
same value as a command or control byte, the BIST module inserts ‘DLE’ in front of
the data byte before transmitting it. After transmitting 8 bytes of data, BIST sends the
command byte ‘POLL’ to request the IOP to transmit the data back serially. Because
in BIST mode the data stored in the input buffers are fed back to the output buffers,
the IOP transmits the 8 bytes of random data it has just received. The serial out of the
IOP is the response that is used to determine whether the IOP is faulty or fault free.
Thus, the serial out response from the IOP is compressed into a signature by the
MISR in the BIST module. The operation continues until BIST has sent all 256
random data generated to the IOP.
As discussed in Chapter 3, a BIST architecture contains input and output
multiplexers that select the primary inputs and outputs during functional mode and
select the test patterns from the hardware test pattern generator in the BIST module
during BIST mode. The Verilog RTL code below describes the function of the
multiplexer.
To operate in BIST mode, B1 needs to be set to ‘0’ while B2 needs to be set to
‘1’. With this setting, the IOP operates in BIST mode, where the hardware test pattern
generator in the BIST module generates the test patterns. In non-BIST mode, or
functional mode, the input and output multiplexer selects the primary inputs
outputpin and serial_in as the inputs to the IOP. SerialIn is the output of the
multiplexer that connects to the UART module in the IOP as its serial in signal.
DUTOutput is the signal that connects to the output buffers. In functional mode,
serial_out is connected to func_serial_out, which is the serial out signal from the
UART module, while inputpin, the output ports of the IOP that connect to the input
ports of the DUT, is connected to funcinputpin from the input buffers.
In BIST mode, SerialIn, the serial in signal of the UART module, is connected to
BilboSerialOut. BilboSerialOut is the serial out signal from the BIST module, which
acts as the serial in signal to the UART module during BIST mode. During BIST
mode, the design makes use of inputpin, the 64-bit output of the IOP, to output the
final signature. Thus, in BIST mode, inputpin is connected to BilboSignature, which
is the signature generated by the BILBO in the BIST module. serial_out is always 1 in
BIST mode. This is because, to ease the output response analysis, the serial out from
the UART module is routed to the MISR in the BIST module for response compaction
instead of to the serial_out pin. DUTOutput, the 64-bit signal connected to the output
buffers, is connected to funcinputpin to transfer the data stored in the input buffers to
the output buffers.
As there are no changes to the functionality of the IOP CU, IOP DPU and clock
generator modules, this chapter focuses on the BIST module, the input/output
multiplexer and the IOP operation during BIST and functional mode.
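// Input/output multiplexer: BIST mode is selected when B1 = 0 and B2 = 1.
case ((B1 == 1'b0) & (B2 == 1'b1))
  1'b1 : begin // BIST mode
    DUTOutput  = funcinputpin;   // input buffers fed back to output buffers
    inputpin   = BilboSignature; // 64-bit signature on the inputpin port
    SerialIn   = BilboSerialOut; // BIST module drives the UART serial in
    serial_out = 1'b1;           // external serial line held idle
  end
  1'b0 : begin // functional mode
    DUTOutput  = outputpin;
    inputpin   = funcinputpin;
    SerialIn   = serial_in;
    serial_out = func_serial_out;
  end
endcase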
5.2 BIST Module
The BIST module implements the built-in-self-test function, which generates its
own tests to test the IOP. The block diagram of the BIST module is shown in
Figure 5.5. The BIST module consists of 2 main blocks, which are the BIST CU
(control unit) and the BIST DPU (data path unit), and 1 small functional block. The
small functional block acts as a selector, which selects the signature from the BIST
DPU when the signal loadSignature from the BIST CU is asserted. This happens when
the BIST operation is completed and the BIST module loads the final signature as the
output of the IOP. When loadSignature is 0, the signature holds its present value.
When loadSignature is 1, the signature generated by the MISR in the BIST DPU
module is output as the final signature. The signature is reset by the reset signal and
has the value 0 when reset is active.
The DPU module processes and manipulates data during BIST operation, while
the control unit module, or BIST CU module, controls the DPU. The operation of each
module is described in detail in the sections below. Table 5.1 shows the input/output
and internal signals of the BIST module together with a description of each signal’s
function.
if (reset == 1'b1) begin
  FinalMISRSignature = 64'h0000000000000000;
end
else if (loadSignature == 1'b1) begin
  FinalMISRSignature = MISRSignature;
end
else begin
  FinalMISRSignature = FinalMISRSignature;
end
Figure 5.5 BIST module block diagram
Table 5.1 BIST module Input, Output and Internal signal
Signal Name Mode Description
B1 In BIST mode operation pin from external
B2 In BIST mode operation pin from external
CLK In Clock signal for BIST module
D[7:0] In Input pin to BILBO block during D flip-flop
mode
func_serial_out In Serial out signal from UART module. Used by
MISR for response value compression.
Reset In Reset signal from external.
SI In SI pin used during SCAN mode. Can be used to
initialize the LFSR and MISR seed value.
trans_bit_count In Input from UART module. Used for debug
purposes.
TransmitDone In Input from UART module to control unit to
indicate that the UART transmitter has finished
transmitting serially all the data in the output
buffers. The BIST module can now send another
8 bytes of data to the UART module for testing.
BilboSerialOut Out Serial out from BIST DPU to UART block. Act
as the serial in signal for UART module during
BIST mode.
FinalMISRSignature Out Final BIST signature from the DPU MISR
block. This is the signature that is compared
with the golden signature to determine if the
IOP passes the BIST test.
b1 Internal Flopped signal of B1. B1 is flopped to
make b1 synchronous with the BIST
module clock.
b2 Internal Flopped signal of B2. B2 is flopped to
make b2 synchronous with the BIST
module clock.
BistDone Internal Output signal from DPU to CU module to
indicate that the LFSR has finished generating all
256 random numbers and the BIST operation is
completed.
BufferFull Internal Output signal from DPU to CU module to
indicate that the 8 8-bit input buffers in the UART
module are full and should be shifted out.
Clr Internal Used to clear the register value in DPU to all 0.
CmdFound Internal Output signal from DPU to CU module to
indicate that the random number generated
by the LFSR is the same as a command.
ctrWord[8:0] Internal Control word from control unit. Each bit of the
control word is used to control a different DPU
signal.
Incr Internal From BIST control unit. Used to increase the
counter value by 1. Active high signal.
Load Internal Signal from BIST control unit. Used to load the
data into the BIST block shift register.
loadSignature Internal From CU module. Used to load the signature
from the DPU MISR when the BIST operation is
completed.
MISRSignature[63:0] Internal Signature from BILBO MISR from DPU.
NextCmd Internal Output signal from DPU to CU module to
inform the control unit to generate the next command.
5.2.1 BIST DPU Module
The DPU module mainly generates random data and manipulates the data
during BIST mode. Figure 5.6 below illustrates the block diagram of the BIST DPU.
The DPU module consists of several functional unit blocks, which include the shift
register and counter, MISR BILBO, decoder block, NextCmd Select, Command
Decoder, CMD Select Mux, LFSR BILBO and MISRDin Mux. The operation of each
functional unit block is described in detail below. Table 5.2 shows the input and
output signals of the DPU module.
Figure 5.6 BIST DPU module block diagram
Table 5.2 BIST DPU Input, Output signal
Signal Name Mode Description
B1 In BIST mode operation pin
B2 In BIST mode operation pin
CLK In Clock signal for BIST DPU
Clr In Clear the counter and shift register value to all ‘0’.
Active high signal.
func_serial_out In Serial out signal from UART module. Used by MISR
for signature value compression.
Incr In From BIST control unit. Used to increase the counter
value by 1. Active high signal.
LFSRIn[7:0] In Input for the BILBO LFSR.
LFSRLoad In Signal from BIST control unit. Used to load the LFSR.
The output of LFSR will maintain old value if this
signal is low. Active high signal.
MuxSel[3:0] In Signal from BIST control unit. Used to select which
command is being generated and shifted out serially
from the BIST block to UART block. Refer to the
control unit RTL code for the details.
RegLoad In Signal from BIST control unit. Used to load the data
into the BIST block shift register.
Reset In Reset signal from external. Used to reset the logic.
Shift In Signal from BIST control unit. Used by BIST block
shift register to shift out the value inside the register
serially.
SI In SI pin used during SCAN mode. Mainly to initialize
the LFSR and MISR seed value.
trans_bit_count In Input from UART module. Used by MISRDin Mux to
assign the value to MISR input accordingly.
BilboSerialOut Out Serial out from BIST shift register to UART block.
Acts as the serial in signal for UART module during
BIST mode.
BistDone Out Input to BIST control unit to indicate that the
LFSR has finished generating all 255 random numbers
and the BIST operation is completed.
BufferFull Out Input to BIST control unit to indicate that the 8
8-bit input buffers in the UART module are full and
should be shifted out.
CmdFound Out Input to BIST control unit to indicate that the
random number generated by the LFSR is the same as
a command.
MISRSignature Out The BIST signature generated by the MISR block.
NextCmd Out Input to BIST control unit to inform the
control unit to generate the next command.
5.2.1.1 Command Select Mux
The Command Select Mux, shown in Figure 5.7, is a multiplexer that uses the
MuxSel signal from the BIST control unit as the selector to choose the command or
the random data from the LFSR pattern generator to be loaded into the shift register.
The output of the multiplexer is a 10-bit signal that consists of a first bit ‘0’ to
indicate the ‘START’ bit, 1 byte of data or command (8 bits) and 1 bit of logic ‘1’ to
indicate the ‘STOP’ bit. The CU assigns a suitable value to MuxSel to output the
correct data or command to the functional block, which consists of a shift register and
a counter. The outputs of the multiplexer for the different MuxSel values are shown
in Table 5.3 below.
Figure 5.7 Command Select Mux block diagram
`init`, `trans`, `poll`, `etx`, `dle`, `stx` are IOP commands that have been
discussed in Chapter 4. `InputNum` and `OutputNum` are used to tell the IOP the
numbers of input and output pins of the DUT it connects to. In BIST mode, both of
them are hardwired to the maximum value of 64 so that all the buffers are fully
tested. `LFSR` is the random data generated by the LFSR pattern generator.
Table 5.3 Command Select Mux output table
MuxSel Output
0000 1111111111
0001 0, init, 1
0010 0, trans, 1
0011 0, poll, 1
0100 0, etx, 1
0101 0, InputNum, 1
0110 0, OutputNum, 1
0111 0, LFSR, 1
1000 0, dle, 1
1001 0, stx, 1
Others 1111111111
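The mapping in Table 5.3 can be sketched in Verilog as shown below. This is only a minimal illustration and not the thesis source code: the module name and port list are assumptions, the command values are taken from the Command Decoder case statement later in this chapter, and the bit order of the byte inside the 10-bit frame is assumed to be as written. InputNum and OutputNum are left as ports because their encoding is not shown here.

```verilog
// Sketch of the Command Select Mux of Table 5.3.
// Module and port names are illustrative assumptions.
module cmd_select_mux_sketch (
    input  wire [3:0] MuxSel,     // select code from the BIST CU
    input  wire [7:0] InputNum,   // INIT data byte: number of input pins
    input  wire [7:0] OutputNum,  // INIT data byte: number of output pins
    input  wire [7:0] LFSR,       // random byte from the pattern generator
    output reg  [9:0] DInput      // {START, byte, STOP} for the shift register
);
    // Command and control byte values (same values as the Command Decoder).
    localparam [7:0] STX   = 8'h02,
                     ETX   = 8'h03,
                     DLE   = 8'h10,
                     INIT  = 8'h69,
                     TRANS = 8'h74,
                     POLL  = 8'h72;

    always @(*) begin
        case (MuxSel)
            4'b0001 : DInput = {1'b0, INIT,      1'b1};
            4'b0010 : DInput = {1'b0, TRANS,     1'b1};
            4'b0011 : DInput = {1'b0, POLL,      1'b1};
            4'b0100 : DInput = {1'b0, ETX,       1'b1};
            4'b0101 : DInput = {1'b0, InputNum,  1'b1};
            4'b0110 : DInput = {1'b0, OutputNum, 1'b1};
            4'b0111 : DInput = {1'b0, LFSR,      1'b1};
            4'b1000 : DInput = {1'b0, DLE,       1'b1};
            4'b1001 : DInput = {1'b0, STX,       1'b1};
            default : DInput = 10'b1111111111;   // idle line, includes 4'b0000
        endcase
    end
endmodule
```

The all-ones default keeps the serial line at its idle level whenever the CU does not select a command or data byte.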
5.2.1.2 Functional Block (Shift Register and Counter)
This functional unit block, shown in Figure 5.8, consists of a 10-bit shift
register and a 4-bit counter and is used to process the output commands or data from
the Command Select Mux. The output of the Command Select Mux is the input to the
shift register. When RegLoad from the control unit is ‘1’, the shift register loads the
input into the register output and, at the same time, the counter value is cleared to
‘0’. After the input has been loaded, if the shift signal from the CU is active, the most
significant bit (MSB) of the register output is shifted out. The shift register is needed
because the IOP is a serial communicator which only accepts serial input, while the
command or control bytes and the random data generated by the pattern generator are
in parallel byte form.
The shift register is a left shift register: when the MSB is shifted out, the rest of
the bits move to the left and logic ‘1’ is appended at the least significant bit (LSB) of
the register. Logic ‘1’ is appended because, in serial communication such as that of
the IOP, the serial in or serial out line is always 1 when idle. The first bit shifted out is
always ‘0’ to indicate the ‘START’ bit, followed by the 1 byte (8 bits) of random data
generated by the LFSR pattern generator or the command byte. The last bit is always
logic ‘1’ to indicate the ‘STOP’ bit. This framing is already taken care of by the
Command Select Mux; the shift register just needs to shift DInput out serially.
Figure 5.8 Functional Block diagram
For every bit that is shifted out, the counter increases by 1. The counter counts
the number of bits that have been shifted out. When the counter value reaches 1010
(decimal 10), the shift register has shifted out all the 10 bits of DInput. With this, the
CU knows that the shift register has finished shifting the register output value and can
continue with the next command or data. This functional block operates at the
positive edge of CLK and is reset by the reset signal. The serial out from the shift
register connects to the UART module in the IOP as the serial in and thus starts the
BIST operation. As discussed previously, the input multiplexer selects the serial out
from the shift register during BIST mode as the serial in signal to the IOP instead of
the primary input signal, serial_in.
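The behaviour of the shift register and counter described above can be sketched as follows. This is a hedged illustration rather than the thesis implementation: the module name, the active high asynchronous reset, the all-ones reset value and the choice to drive the serial output directly from the register MSB are all assumptions.

```verilog
// Sketch of the 10-bit left shift register and 4-bit bit counter.
// Names, reset polarity and reset values are illustrative assumptions.
module shift_reg_counter_sketch (
    input  wire       CLK,
    input  wire       reset,           // assumed active high
    input  wire       RegLoad,         // load DInput and clear the counter
    input  wire       shift,           // shift one bit out per clock
    input  wire [9:0] DInput,          // {START, byte, STOP} from the mux
    output wire       BilboSerialOut,  // serial stream towards the UART
    output reg  [3:0] BitCount         // number of bits shifted so far
);
    reg [9:0] ShiftReg;

    // Assumption: the MSB of the register is the serial output.
    assign BilboSerialOut = ShiftReg[9];

    always @(posedge CLK or posedge reset) begin
        if (reset) begin
            ShiftReg <= 10'b1111111111;          // assumption: line idles at 1
            BitCount <= 4'b0000;
        end else if (RegLoad) begin
            ShiftReg <= DInput;                  // load the framed byte
            BitCount <= 4'b0000;                 // restart the bit count
        end else if (shift) begin
            ShiftReg <= {ShiftReg[8:0], 1'b1};   // shift left, append idle 1
            BitCount <= BitCount + 4'b0001;
        end
    end
endmodule
```

After one load followed by ten shift cycles, BitCount holds 1010 and the register contains only ones, so the serial line is back at its idle level.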
5.2.1.3 BILBO
As described in Chapter 2, a BILBO (Built In Logic Block Observer)
combines the functionality of a D flip-flop, a pattern generator, a response compactor
and a scan chain. The BILBO is used in the BIST DPU module in four different
modes, controlled by the input signals B1 and B2. The operation of the BILBO for the
different values of B1 B2 is shown in Table 5.4 below.
Table 5.4 BILBO block in BIST DPU operation mode
Mode (B1 B2) Operation
00 BILBO Serial SCAN Mode
01 BILBO LFSR Pattern Generator Mode
10 D flip-flop mode
11 MISR mode
A total of 8 D flip-flops are used to construct the BILBO block in this
design. 2 BILBO blocks are used in the design: one is the LFSR pattern generator
while the other acts as the MISR response compactor. The BILBO for the MISR is
configured by connecting its B1 input to the XOR of B1 and B2 of the BIST DPU. B2
of the MISR still connects directly to the signal B2 of the BIST DPU. With this, when
in BIST mode, B1 and B2 from the external pins are set to “01”, thus the BILBO
LFSR operates to generate random data, while the XOR of B1 and B2 produces 1 and
thus the BILBO MISR operates as the response compactor.
When in LFSR mode, as shown in Figure 5.10, the BILBO operates as the pattern
generator to generate random values. Since 8 flip-flops are used to construct the
LFSR, the random data generated are 8 bits in length. In the BILBO, a new value is
generated at every positive edge of the BILBO clock. The BILBO clock is the inverse
of the DPU CLK, so the BILBO operates at the negative edge of CLK while the other
logic in the DPU operates at the positive edge of CLK. Some outputs of other
functional blocks in the DPU are used by the BILBO and, at the same time, the
outputs of the BILBO are used by other functional blocks in the DPU before being
sent to the CU. The BILBO is therefore designed to operate at the negative edge of
CLK to avoid race conditions and potential hold timing violations and to speed up the
logic by half a clock cycle.
Figure 5.9 BILBO circuit
assign MISRb1 = B1 ^ B2;
assign MISRb2 = B2;
Figure 5.10 BILBO LFSR block diagram
Since the BILBO is constructed using 8 flip-flops, it is able to generate a total
of 255 different random data bytes (the value 0 is excluded). As the IOP receives and
transmits 1 byte, or 8 bits, of data at a time, excluding the START and STOP bits,
these 255 data values cover every non-zero data value which the IOP can receive and
transmit. The waveform in Figure 5.11 below shows the operation of the BILBO in
LFSR mode.
Figure 5.11 BILBO LFSR mode operation waveform
As discussed in Chapter 3, the value of the LFSR pattern generator can never
be ‘0’, so it is crucial to initialize the BILBO with a seed value before it operates as
the pattern generator. In this design, there are 2 ways to do so. One is to set the
BILBO into scan mode and load the seed value through the serial in signal. When in
scan mode, the BILBO shifts the serial in signal SI to the output bit by bit, as shown
in Figure 5.12.
Figure 5.12 BILBO scan mode operation waveform
Another way to initialize the BILBO register value is to set the BILBO into
D flip-flop mode. In this mode, the output of the BILBO always follows the input
D values, as shown in Figure 5.13. In this project, the D inputs of the BILBO LFSR
block are hardwired to 5Ah to ease the implementation and usage. However, other
values can still be used through the scan mode initialization.
Figure 5.13 BILBO D flip-flop mode operation waveform
In this project, when in BIST mode with B1 B2 equal to 01, the B1 and B2 of the
MISR BILBO are set to 11 so that it operates as the response compactor. The block
diagram of the BILBO MISR is shown in Figure 5.14. It has the same logic and
implementation as the LFSR BILBO. However, the 8 input pins of the D flip-flops are
connected to func_serial_out, which is the serial out signal from the IOP, in order to
compact the response from the IOP. Bits 0, 2, 4 and 6 are connected directly to the
serial out of the IOP during BIST mode, while bits 1, 3, 5 and 7 are connected to the
inverse of the serial out signal from the IOP. The inputs are inverted to ensure that the
input to the MISR always contains both logic 1 and logic 0, although this is not a
requirement for a MISR. MISRDout is the 8-bit signature produced by the MISR. This
signature is then appended to the 64-bit MISR signature. MISRLoad is used to load
MISRDin into MISRDout. The operation waveform for the BILBO MISR is shown in
Figure 5.15.
Figure 5.14 BILBO MISR block diagram
assign MISRDin[0] = func_serial_out;
assign MISRDin[1] = ~func_serial_out;
assign MISRDin[2] = func_serial_out;
assign MISRDin[3] = ~func_serial_out;
assign MISRDin[4] = func_serial_out;
assign MISRDin[5] = ~func_serial_out;
assign MISRDin[6] = func_serial_out;
assign MISRDin[7] = ~func_serial_out;
Figure 5.15 BILBO MISR mode operation waveform
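The four BILBO modes of Table 5.4 can be summarised in a single 8-bit register, sketched below. This is an illustration under stated assumptions and not the thesis source code: the module and port names are hypothetical, the feedback polynomial x^8 + x^6 + x^5 + x^4 + 1 is an assumed maximal length choice (the polynomial actually used is not given in this chapter), and the Load enable is assumed to gate all four modes.

```verilog
// Sketch of an 8-bit BILBO with the mode encoding of Table 5.4.
// The feedback taps and the enable behaviour are assumptions.
module bilbo8_sketch (
    input  wire       CLK,    // DPU clock; the BILBO uses its falling edge
    input  wire       Reset,  // assumed active high
    input  wire       B1,
    input  wire       B2,
    input  wire       Load,   // register holds its value while Load is low
    input  wire       SI,     // serial scan input
    input  wire [7:0] D,      // parallel input (D flip-flop and MISR modes)
    output reg  [7:0] Q
);
    // Assumed feedback for x^8 + x^6 + x^5 + x^4 + 1.
    wire feedback = Q[7] ^ Q[5] ^ Q[4] ^ Q[3];

    always @(negedge CLK or posedge Reset) begin
        if (Reset)
            Q <= 8'h00;                               // must be seeded before LFSR mode
        else if (Load) begin
            case ({B1, B2})
                2'b00 : Q <= {Q[6:0], SI};            // serial scan mode
                2'b01 : Q <= {Q[6:0], feedback};      // LFSR pattern generator mode
                2'b10 : Q <= D;                       // D flip-flop mode
                2'b11 : Q <= {Q[6:0], feedback} ^ D;  // MISR mode
            endcase
        end
    end
endmodule
```

With this arrangement, a register seeded with a non-zero value such as 5Ah in D flip-flop mode never reaches 0 in LFSR mode, which is why the seed has to be loaded before pattern generation starts.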
5.2.1.4 Decoder Block
The decoder block shown in Figure 5.16 consists of one multiplexer and one
8-bit counter. The counter increases by 1 for every random data byte generated by the
LFSR pattern generator. This is controlled by the incr signal from the CU, which
keeps track of what is happening in the DPU. The decoder block assigns the correct
values to the output signals BufferFull, BistDone, MISRLoad and MISRSignature. For
every 8 bytes of data generated by the LFSR, the BufferFull signal is assigned ‘1’, as
all the 8 8-bit buffers are full, loaded with the data generated by the LFSR. Please
refer to the bist_dpu module Verilog HDL source code for more details.
When the counter value is set to ‘0’, it means that the LFSR has completed the
generation of all 255 random data, thus the BIST operation is considered complete and
BistDone, the signal that indicates to the CU that the BIST operation is completed, is
assigned ‘1’. MISRLoad is the signal used to load the BILBO MISR register.
MISRSignature is the signature of the MISR. It is connected to the 64-bit
inputpin, which connects to the input pins of the DUT of the IOP. Since inputpin is an
output port of the IOP, the BIST module makes use of these pins to output the 64-bit
signature generated by the MISR during BIST operation. With this, users are able to
obtain the signature of the IOP during BIST mode and compare it with the golden
signature obtained from simulation to determine if the chip is faulty. This
implementation saves the total pin count of the IOP, as no additional output port has
to be created to output the BIST signature.
Figure 5.16 Decoder Block diagram
The BILBO MISR only produces an 8-bit signature, while MISRSignature is a
64-bit signal. To achieve this, the signature from the MISR is appended to
MISRSignature based on the count signal value. Please refer to the Verilog HDL
code in the appendix section for more details.
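One plausible arrangement of the counter and its two status outputs is sketched below. The actual conditions are in the bist_dpu source code, which is not reproduced here, so everything in this block is an assumption: the module name, the reset polarity, and the decoding of BufferFull from the three low counter bits and of BistDone from the counter wrapping to 0. The MISRLoad and MISRSignature logic is not modelled.

```verilog
// Hypothetical sketch of the decoder block counter.
// The BufferFull and BistDone conditions are assumptions for illustration.
module decoder_counter_sketch (
    input  wire       CLK,
    input  wire       reset,      // assumed active high
    input  wire       Clr,        // clears the counter, active high
    input  wire       incr,       // one pulse per byte taken from the LFSR
    output reg  [7:0] count,
    output wire       BufferFull, // assumed: a multiple of 8 bytes generated
    output wire       BistDone    // assumed: counter has wrapped back to 0
);
    always @(posedge CLK or posedge reset) begin
        if (reset)
            count <= 8'h00;
        else if (Clr)
            count <= 8'h00;
        else if (incr)
            count <= count + 8'h01;
    end

    assign BufferFull = (count[2:0] == 3'b000);
    assign BistDone   = (count == 8'h00);
endmodule
```

In this sketch both outputs are also high directly after reset, so a control unit would have to sample them only after at least one byte has been generated.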
5.2.1.5 Command Decoder
As the LFSR pattern generator is used to generate random data, there is a
possibility that a generated data byte has the same value as a command byte. As
stated earlier, `DLE` with value 10h must be attached before any data byte that has
the same value as a control byte or command byte. Thus, the BIST control logic needs
to shift out the control byte `DLE` first when the data generated by the LFSR is found
to be the same as a control or command byte. The command decoder block shown in
Figure 5.17 generates a signal called CmdFound to inform the BIST control unit that
the next data byte has the same value as a control or command byte, so that the
control unit attaches the control byte `DLE` before shifting out the data.
case(LFSR)
  8'b00000010 : CmdFound = 1'b1; // STX
  8'b00010000 : CmdFound = 1'b1; // DLE
  8'b00000011 : CmdFound = 1'b1; // ETX
  8'b01101001 : CmdFound = 1'b1; // INIT
  8'b01111000 : CmdFound = 1'b1; // TERMINATE
  8'b01110100 : CmdFound = 1'b1; // TRANS
  8'b01110000 : CmdFound = 1'b1; // IRR_POLL
  8'b01110010 : CmdFound = 1'b1; // POLL
  default     : CmdFound = 1'b0; // Others
endcase
Figure 5.17 Command Decoder block diagram
5.2.1.6 NextCmd Select
NextCmd Select in Figure 5.18 is a multiplexer which selects logic ‘1’ for the
`NextCmd` signal when the `BitCount` value is 10. As the shift register shifts out
the data serially, `BitCount` increases by one for every bit of data shifted out. When
the `BitCount` value is 10, the shift register has finished shifting out the data, and
`NextCmd` requests the control unit to load the next data into the shift register.
Figure 5.18 NextCmd Select block diagram
case(BitCount)
  4'b1010 : begin // all 10 bits have been shifted out
    NextCmd = 1'b1;
  end
  default : begin
    NextCmd = 1'b0;
  end
endcase
5.2.2 BIST CU Module
The BIST CU is the control unit that controls the BIST operation based on the
inputs from the DPU. Figure 5.19 below shows the block diagram of the BIST CU,
while Table 5.5 lists its input and output signals with their descriptions. Figure
5.20 is the RTL code of the BIST CU, which shows the sequence of operation of the
BIST. Table 5.6 shows the RTL control sequences. The CU module is reset by the
reset signal and operates at the positive edge of CLK.
Figure 5.19 BIST CU block diagram
Table 5.5 Input and output signals for BIST CU module
Signal Name Mode Description
B1 In BIST mode operation pin
B2 In BIST mode operation pin
CLK In Clock signal for BIST CU
Reset In Reset signal from external. Used to reset the logic.
BistDone In Input to BIST control unit from DPU to indicate
that the LFSR has finished generating all 255 random
numbers and the BIST operation is completed.
BufferFull In Input to BIST control unit from DPU to indicate
that the 8 8-bit input buffers in the UART module are full
and should be shifted out.
CmdFound In Input to BIST control unit from DPU to indicate
that the random number generated by the LFSR
is the same as a command.
NextCmd In Input to BIST control unit from DPU to inform the
control unit to generate the next command.
TransmitDone In Input from UART module to control unit to indicate
that the UART transmitter has finished transmitting
serially all the data in the output buffers. The
BIST module can now send another 8 bytes of data to
the UART module.
ctrWord[8:0] Out Control word from control unit. Each bit of the
control word is used to control a different DPU signal
as below:
ctrWord[3:0] = MuxSel
ctrWord[4] = shift
ctrWord[5] = RegLoad
ctrWord[6] = load
ctrWord[7] = incr
ctrWord[8] = loadSignature
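The bit assignment of the control word listed above can be written as a small set of continuous assignments. The sketch below is illustrative only: the module name is hypothetical, and in the thesis design these connections may simply be made inside the BIST module rather than in a separate module.

```verilog
// Sketch: splitting the 9-bit control word into the individual DPU controls.
// The module name is an illustrative assumption.
module ctrword_unpack_sketch (
    input  wire [8:0] ctrWord,
    output wire [3:0] MuxSel,
    output wire       shift,
    output wire       RegLoad,
    output wire       load,
    output wire       incr,
    output wire       loadSignature
);
    assign MuxSel        = ctrWord[3:0];
    assign shift         = ctrWord[4];
    assign RegLoad       = ctrWord[5];
    assign load          = ctrWord[6];
    assign incr          = ctrWord[7];
    assign loadSignature = ctrWord[8];
endmodule
```

For example, the control word 000101001 sets RegLoad to 1 and MuxSel to 1001, which loads the framed `stx` byte into the shift register.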
S0: (bist_mode)/shift register ← ‘STX’
(bist_mode)/ S1
(!bist_mode)/ S0;
(!bist_mode)/load LFSR
S1: (bist_mode)/shift ← 1
(bist_mode)/ S2;
(!bist_mode)/ S0;
S2: (bist_mode * NextCmd)/shift register ← ‘INIT’
(bist_mode * NextCmd)/ S3
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S2
(!bist_mode)/ S0;
S3: (bist_mode)/shift ← 1
(bist_mode)/ S4
(!bist_mode)/ S0
S4: (bist_mode * NextCmd)/ shift register ← ‘INPUTNUM’
(bist_mode * NextCmd)/ S5
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S4
(!bist_mode)/ S0
S5: (bist_mode)/shift ←1
(bist_mode)/ S6
(!bist_mode)/ S0
S6: (bist_mode * NextCmd)/ shift register ← ‘OUTPUTNUM’
(bist_mode * NextCmd)/ S7
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S6
(!bist_mode)/ S0
S7: (bist_mode)/shift ← 1
(bist_mode)/ S8
(!bist_mode)/ S0
S8: (bist_mode * NextCmd)/shift register ← ‘ETX’
(bist_mode * NextCmd)/ S9
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S8
S9: (bist_mode)/shift ← 1
(bist_mode)/ S10
(!bist_mode)/ S0
S10: (bist_mode * NextCmd)/shift register ← ‘TRANS’
(bist_mode * NextCmd)/ S11
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S10
(!bist_mode)/ S0
S11: (bist_mode)/shift ← 1
(bist_mode)/ S12
(!bist_mode)/ S0
S12: (bist_mode * NextCmd)/ counter ← +1 , load LFSR
(bist_mode * NextCmd)/ S13
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S12
(!bist_mode)/ S0
S13: (bist_mode * CmdFound)/ shift register ← ‘DLE’
(bist_mode * CmdFound)/ S14
(bist_mode * !CmdFound)/ shift register ← ‘LFSR’
(bist_mode * !CmdFound)/ S15
(!bist_mode)/ S0
S14: (bist_mode)/shift ← 1
(bist_mode)/ S16
(!bist_mode)/ S0
S15: (bist_mode)/shift ← 1
(bist_mode)/ S17
(!bist_mode)/ S0
S16: (bist_mode * NextCmd)/ shift register ← ‘LFSR’
(bist_mode * NextCmd)/ S15
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S16
(!bist_mode)/ S0
S17: (bist_mode * NextCmd * BufferFull)/ shift register ← ‘ETX’
(bist_mode * NextCmd * BufferFull)/ S18
(bist_mode * NextCmd * !BufferFull)/counter ←+1, load LFSR
(bist_mode * NextCmd * !BufferFull)/ S13
(bist_mode * !NextCmd)/ shift ←1
(bist_mode * !NextCmd)/ S17
(!bist_mode)/ S0
S18: (bist_mode)/shift ← 1
(bist_mode)/ S19
(!bist_mode)/ S0
S19: (bist_mode * NextCmd)/ shift register ← ‘POLL’
(bist_mode * NextCmd)/ S20
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S19
(!bist_mode)/ S0
S20: (bist_mode)/shift ← 1
(bist_mode)/ S21
(!bist_mode)/ S0
S21: (bist_mode * NextCmd * BistDone)/ S23
(bist_mode * NextCmd * !BistDone)/ S22
(bist_mode * !NextCmd)/ shift ← 1
(bist_mode * !NextCmd)/ S21
(!bist_mode)/ S0
S22: (bist_mode * TransmitDone)/ S10
(bist_mode * !TransmitDone)/ S22
(!bist_mode)/ S0
S23: (bist_mode)/ shift ← 1
(bist_mode)/ S24
(!bist_mode)/ S0
S24: (bist_mode * TransmitDone)/ shift ← 1, loadSignature ← 1
(bist_mode * TransmitDone)/ S25
(bist_mode * !TransmitDone)/ S24
(!bist_mode)/ S0
S25: (bist_mode)/ shift ← 1
(bist_mode)/ S25
(!bist_mode)/ S0
Figure 5.20 BIST CU RTL Code
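The RTL code of Figure 5.20 repeats one pattern for every byte: load the byte into the shift register, then keep shifting until NextCmd is asserted. The sketch below shows this pattern for states S0 to S3 only. It is a minimal illustration, not the thesis CU: the module name, the state encoding, the active high reset, the bist_mode input and the combinational generation of ctrWord are assumptions, and the sketch parks in S3 instead of continuing to S4.

```verilog
// Sketch of the first four BIST CU states (S0 to S3) of Figure 5.20.
// Control word bits: {loadSignature, incr, load, RegLoad, shift, MuxSel[3:0]}.
module bist_cu_sketch (
    input  wire       CLK,
    input  wire       reset,      // assumed active high
    input  wire       bist_mode,  // assumed to be 1 when B1 B2 = 01
    input  wire       NextCmd,    // from the DPU: 10 bits have been shifted
    output reg  [8:0] ctrWord
);
    localparam [1:0] S0 = 2'd0, S1 = 2'd1, S2 = 2'd2, S3 = 2'd3;
    reg [1:0] state, next_state;

    always @(posedge CLK or posedge reset) begin
        if (reset) state <= S0;
        else       state <= next_state;
    end

    always @(*) begin
        ctrWord    = 9'b000000000;
        next_state = S0;                       // (!bist_mode) always returns to S0
        case (state)
            S0: if (bist_mode) begin
                    ctrWord    = 9'b000101001; // RegLoad, MuxSel = stx
                    next_state = S1;
                end else begin
                    ctrWord    = 9'b001000000; // load LFSR
                end
            S1: if (bist_mode) begin
                    ctrWord    = 9'b000010000; // shift
                    next_state = S2;
                end
            S2: if (bist_mode) begin
                    if (NextCmd) begin
                        ctrWord    = 9'b000100001; // RegLoad, MuxSel = init
                        next_state = S3;
                    end else begin
                        ctrWord    = 9'b000010000; // keep shifting STX
                        next_state = S2;
                    end
                end
            S3: if (bist_mode) begin
                    ctrWord    = 9'b000010000; // shift
                    next_state = S3;           // sketch stops here; S4 onward omitted
                end
        endcase
    end
endmodule
```

The remaining states follow the same load and shift pattern, with BufferFull, CmdFound, BistDone and TransmitDone deciding which byte is loaded next.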
Control word (8..0)
State: RTL operations
Load
Sig
nature
incr
load
Reg
Load
shift
MuxSel[3
]
MuxSel[2
]
MuxSel[1
]
MuxSel[0
]
S0: if bist_mode, load command byte ‘STX’ into shift register and go to S1
0 0 0 1 0 1 0 0 1 S0: (bist_mode)/shift register ← ‘STX’
(bist_mode)/ S1
(!bist_mode)/ S0;
(!bist_mode)/load LFSR
0 0 1 0 0 0 0 0 0
S1: Assert shift to shift out the ‘STX’ command byte in the shift register serially.
0 0 0 0 1 0 0 0 0 S1 (bist_mode)/shift ← 1
(bist_mode)/ S2;
(!bist_mode)/ S0;
0 0 0 0 0 0 0 0 0
S2: if NextCmd is 1, load command byte ‘INIT’ into shift register and go to S3, else stay at S2 continue shifting ‘STX’ byte
0 0 0 1 0 0 0 0 1
0 0 0 0 1 0 0 0 0
S2: (bist_mode * NextCmd)/shift register ← ‘INIT’
(bist_mode * NextCmd)/ S3
(bist_mode * !NextCmd)/shift ← 1
(bist_mode * !NextCmd)/ S2
(!bist_mode)/ S0;
0 0 0 0 0 0 0 0 0
75
Control word (8..0)
State: RTL operations
Load
Sig
nature
incr
load
Reg
Load
shift
MuxSel[3
]
MuxSel[2
]
MuxSel[1
]
MuxSel[0
]
S3: Assert shift to shift out the ‘INIT’ command byte in the shift register serially.
  0 0 0 0 1 0 0 0 0  S3: (bist_mode)/ shift ← 1; (bist_mode)/ S4
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S4: If NextCmd is 1, load input pin number ‘INPUTNUM’ into the shift register and go to S5; otherwise stay at S4 and continue shifting the ‘INIT’ byte.
  0 0 0 1 0 0 1 0 1  S4: (bist_mode * NextCmd)/ shift register ← ‘INPUTNUM’; (bist_mode * NextCmd)/ S5
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S4
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S5: Assert shift to shift out the ‘INPUTNUM’ input pin number in the shift register serially.
  0 0 0 0 1 0 0 0 0  S5: (bist_mode)/ shift ← 1; (bist_mode)/ S6
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S6: If NextCmd is 1, load output pin number ‘OUTPUTNUM’ into the shift register and go to S7; otherwise stay at S6 and continue shifting the ‘INPUTNUM’ byte.
  0 0 0 1 0 0 1 1 0  S6: (bist_mode * NextCmd)/ shift register ← ‘OUTPUTNUM’; (bist_mode * NextCmd)/ S7
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S6
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S7: Assert shift to shift out the ‘OUTPUTNUM’ output pin number in the shift register serially.
  0 0 0 0 1 0 0 0 0  S7: (bist_mode)/ shift ← 1; (bist_mode)/ S8
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S8: If NextCmd is 1, load command byte ‘ETX’ into the shift register and go to S9; otherwise stay at S8 and continue shifting the ‘OUTPUTNUM’ byte.
  0 0 0 1 0 0 1 0 0  S8: (bist_mode * NextCmd)/ shift register ← ‘ETX’; (bist_mode * NextCmd)/ S9
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S8
S9: Assert shift to shift out the ‘ETX’ command byte in the shift register serially.
  0 0 0 0 1 0 0 0 0  S9: (bist_mode)/ shift ← 1; (bist_mode)/ S10
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S10: If NextCmd is 1, load command byte ‘TRANS’ into the shift register and go to S11; otherwise stay at S10 and continue shifting the ‘ETX’ byte.
  0 0 0 1 0 0 0 1 0  S10: (bist_mode * NextCmd)/ shift register ← ‘TRANS’; (bist_mode * NextCmd)/ S11
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S10
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S11: Assert shift to shift out the ‘TRANS’ command byte in the shift register serially.
  0 0 0 0 1 0 0 0 0  S11: (bist_mode)/ shift ← 1; (bist_mode)/ S12
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S12: If NextCmd is 1, increment the counter, load the LFSR and go to S13; otherwise stay at S12 and continue shifting the ‘TRANS’ byte.
  0 1 1 0 0 0 0 0 0  S12: (bist_mode * NextCmd)/ counter ← counter + 1, load LFSR; (bist_mode * NextCmd)/ S13
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S12
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S13: If CmdFound is 1, load command byte ‘DLE’ into the shift register and go to S14; otherwise load the ‘LFSR’ data into the shift register and go to S15.
  0 0 0 1 0 1 0 0 0  S13: (bist_mode * CmdFound)/ shift register ← ‘DLE’; (bist_mode * CmdFound)/ S14
  0 0 0 1 0 0 1 1 1  (bist_mode * !CmdFound)/ shift register ← ‘LFSR’; (bist_mode * !CmdFound)/ S15
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S14: Assert shift to shift out the ‘DLE’ command byte in the shift register serially and go to S16.
  0 0 0 0 1 0 0 0 0  S14: (bist_mode)/ shift ← 1; (bist_mode)/ S16
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S15: Assert shift to shift out the LFSR data in the shift register serially and go to S17.
  0 0 0 0 1 0 0 0 0  S15: (bist_mode)/ shift ← 1; (bist_mode)/ S17
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S16: If NextCmd is 1, load random data from the LFSR into the shift register and go to S15; otherwise stay at S16 and continue shifting the ‘DLE’ byte.
  0 0 0 1 0 0 1 1 1  S16: (bist_mode * NextCmd)/ shift register ← ‘LFSR’; (bist_mode * NextCmd)/ S15
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S16
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S17: If NextCmd and BufferFull are both 1, load command byte ‘ETX’ into the shift register and go to S18; if NextCmd is 1 and BufferFull is 0, increment the counter, load the LFSR and go to S13; otherwise stay at S17 and continue shifting the LFSR data.
  0 0 0 1 0 0 1 0 0  S17: (bist_mode * NextCmd * BufferFull)/ shift register ← ‘ETX’; (bist_mode * NextCmd * BufferFull)/ S18
  0 1 1 0 0 0 0 0 0  (bist_mode * NextCmd * !BufferFull)/ counter ← counter + 1, load LFSR; (bist_mode * NextCmd * !BufferFull)/ S13
  0 0 0 0 1 0 0 0 0  S17: (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S17
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S18: Assert shift to shift out the ‘ETX’ command byte in the shift register serially and go to S19.
  0 0 0 0 1 0 0 0 0  S18: (bist_mode)/ shift ← 1; (bist_mode)/ S19
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S19: If NextCmd is 1, load command byte ‘POLL’ into the shift register and go to S20; otherwise stay at S19 and continue shifting the ‘ETX’ byte.
  0 0 0 1 0 0 0 1 1  S19: (bist_mode * NextCmd)/ shift register ← ‘POLL’; (bist_mode * NextCmd)/ S20
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S19
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S20: Assert shift to shift out the ‘POLL’ command byte in the shift register serially and go to S21.
  0 0 0 0 1 0 0 0 0  S20: (bist_mode)/ shift ← 1; (bist_mode)/ S21
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S21: If NextCmd and BistDone are both 1, go to S23; if NextCmd is 1 and BistDone is 0, go to S22; otherwise stay at S21 and continue shifting the ‘POLL’ byte.
  0 0 0 0 0 0 0 0 0  S21: (bist_mode * NextCmd * BistDone)/ S23; (bist_mode * NextCmd * !BistDone)/ S22
  0 0 0 0 1 0 0 0 0  (bist_mode * !NextCmd)/ shift ← 1; (bist_mode * !NextCmd)/ S21
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S22: If TransmitDone is 1, go to S10; if TransmitDone is 0, stay at S22.
  0 0 0 0 0 0 0 0 0  S22: (bist_mode * TransmitDone)/ S10; (bist_mode * !TransmitDone)/ S22; (!bist_mode)/ S0
S23: Assert shift and go to S24.
                     S23: (bist_mode)/ shift ← 1; (bist_mode)/ S24
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S24: If TransmitDone is 1, assert shift and loadSignature and go to S25. If TransmitDone is 0, stay at S24.
  1 0 0 0 1 0 0 0 0  S24: (bist_mode * TransmitDone)/ shift ← 1, loadSignature ← 1; (bist_mode * TransmitDone)/ S25
                     (bist_mode * !TransmitDone)/ S24
  0 0 0 0 0 0 0 0 0  (!bist_mode)/ S0
S25: Assert shift and stay at S25. If bist_mode = 0, go to S0.
  0 0 0 0 1 0 0 0 0  S25: (bist_mode)/ shift ← 1; (bist_mode)/ S25
                     (!bist_mode)/ S0
Table 5.6 BIST CU RTL control sequences table
5.3 IOP BIST Operation
The IOP BIST operation is described in Table 5.6 and Figure 5.21. BIST operation
starts when the CU detects that B1 and B2 are 01. When this happens, the CU selects
the command ‘stx’, asserts RegLoad and moves to stage 1, S1; otherwise it stays at
stage 0, S0. RegLoad is the signal from the CU to the DPU that requests the shift
register in the functional block to load the data value ‘stx’. At S1, the shift signal
is asserted to shift out the command ‘stx’ that has been loaded into the shift
register, and the CU moves to stage 2, S2, on the next clock edge. The CU and DPU use
the same clock as the transmitter of the UART module, so that the serial input is
produced at a baud rate of 9600. The serial output of the shift register in the DPU,
BilboSerialOut, is connected to the serial input of the UART module.
At S2, the CU checks whether NextCmd is high. NextCmd is the signal from the DPU
that tells the CU the shift register has finished shifting all 10 bits of a data or
command frame (1 START bit, 8 data or command bits, 1 STOP bit) and is ready to
shift out the next command. If NextCmd is high, the CU selects the command ‘init’
using MuxSel to initialize the IOP, asserts RegLoad to load the command into the
shift register and moves to stage 3, S3, in the next cycle. If NextCmd is low, the
shift register in the DPU is still shifting the command, and the CU stays at S2 until
NextCmd goes high.
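The load/shift handshake described for the first states can be sketched as a small behavioural model. This is an illustrative Python model only, not the Verilog of this project: the function name `next_state`, the integer state encoding and the returned signal sets are assumptions, and only S0 to S4 of Table 5.6 are covered.

```python
def next_state(state, bist_mode, next_cmd):
    """Return (next state, asserted control signals) for states S0..S4."""
    if not bist_mode:
        # Any state falls back to S0; at S0 itself the LFSR is loaded.
        return 0, ({"load"} if state == 0 else set())
    if state == 0:
        return 1, {"RegLoad"}            # load 'stx' into the shift register
    if state in (1, 3):
        return state + 1, {"shift"}      # start shifting the loaded byte
    if state in (2, 4):
        if next_cmd:                     # all 10 bits have been shifted out
            return state + 1, {"RegLoad"}
        return state, {"shift"}          # keep shifting the current byte
    raise ValueError("state outside the sketched range S0..S4")


# Walk from S0 while the DPU reports NextCmd only every third call.
state = 0
for cycle in range(8):
    state, signals = next_state(state, bist_mode=True, next_cmd=(cycle % 3 == 2))
    if state == 4:
        break
```

The even-numbered states wait on NextCmd while the odd-numbered states always advance, which is the pattern the remaining command bytes follow.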
During S3, the CU drives shift to logic ‘1’ to shift out the command byte ‘init’
serially and moves to stage 4, S4. At S4 the same process repeats: the CU checks
NextCmd to determine whether to proceed to the next command or to continue shifting
the current one. If NextCmd is high, the CU selects ‘InputNum’ as the next control
byte to be shifted out and, at the same time, asserts RegLoad to load it into the
shift register. ‘InputNum’ tells the IOP the number of data bits that will be
transmitted to the IOP. For full testability, the CU sets this number to the maximum
of 64 bits. At stage 5, S5, the CU asserts shift to shift out ‘InputNum’. At stage 6,
S6, the CU selects ‘OutputNum’ as the next byte to shift out. Like ‘InputNum’,
‘OutputNum’ gives a number of data bits, in this case the number to be transmitted by
the IOP, and it is also set to the maximum of 64 bits. The CU then moves to stage 7,
S7, to shift out ‘OutputNum’. At stage 8, S8, the CU selects the command byte ‘etx’
and asserts RegLoad to load it into the shift register. The CU then moves to stage 9,
S9, to shift out the contents of the shift register.
Figure 5.21 (a) IOP BIST operation flow chart 1
Figure 5.21 (b) IOP BIST operation flow chart 2
Figure 5.21 (c) IOP BIST operation flow chart 3
During stage 10, S10, the CU selects the command byte ‘trans’, which announces that
the data generated by the LFSR will be transmitted to the IOP. RegLoad is also
asserted to load ‘trans’ into the shift register. The CU moves to stage 11, S11, to
shift out the command serially. At stage 12, S12, the CU starts asking the DPU to
shift out the random data generated by the LFSR. At this stage, the CU asserts the
incr and load signals to the DPU. The incr signal increments the LFSR counter, which
is incremented for every data byte loaded into the LFSR. This lets the CU and DPU
keep track of the number of data bytes that have been generated by the LFSR and
shifted out. The load signal, on the other hand, loads the input value of the LFSR
into the LFSR. The CU then moves to stage 13, S13.
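How an 8-bit LFSR produces a pseudo random byte sequence can be sketched as below. The feedback taps of the LFSR in this design are not stated in this section, so the taps used here (8, 6, 5, 4, a commonly tabulated maximal-length choice) and the function names are assumptions made purely for illustration.

```python
def lfsr8_step(state):
    """Advance an 8-bit Fibonacci LFSR by one step (assumed taps 8, 6, 5, 4)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def lfsr8_sequence(seed, count):
    """Return `count` successive LFSR values, starting with the seed."""
    values = []
    state = seed
    for _ in range(count):
        values.append(state)
        state = lfsr8_step(state)
    return values


def lfsr8_period(seed=0x01):
    """Count the steps until the LFSR returns to its seed."""
    state = lfsr8_step(seed)
    steps = 1
    while state != seed:
        state = lfsr8_step(state)
        steps += 1
    return steps


first_bytes = lfsr8_sequence(0x01, 8)   # one 8-byte burst for the IOP buffers
```

Under this assumption the register visits all 255 non-zero values before repeating, so a counter that counts loaded values, like the LFSR counter above, is what marks the end of the test rather than the LFSR state itself.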
As discussed before, the random data generated by the LFSR may happen to be the same
as a command or control byte of the message packets. Thus, at stage 13, S13, the CU
checks the LFSR data using the CmdFound signal. CmdFound is the signal from the DPU
to the CU that indicates that the current data generated by the LFSR is the same as
one of the command or control bytes, so the CU must shift out ‘DLE’ before shifting
out the LFSR data. If CmdFound is low, the CU selects the data from the LFSR as the
input to the shift register using the MuxSel signal. RegLoad is also asserted by the
CU to load the data into the shift register. The CU then moves to stage 15, S15, to
shift out the data serially.
If CmdFound is high, the CU cannot yet select the LFSR data as the input to the
shift register; it must shift out the control byte ‘DLE’ first. It does so by driving
MuxSel to select ‘DLE’ as the input to the shift register and asserting RegLoad to
load ‘DLE’ into the shift register. The CU then moves to stage 14, S14, where it
asserts shift to shift out ‘DLE’ serially.
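The effect of the CmdFound check on the transmitted byte stream can be sketched as a byte-stuffing routine. This is a software illustration, not the hardware implementation: the byte values are the ones quoted in the simulations of Section 6.4, the set `COMMAND_BYTES` is assumed to contain only the command bytes named in this document (the IOP may define more), and the name `stuff` is hypothetical.

```python
# Command and control bytes named in this document; the real IOP may define more.
STX, ETX, DLE, INIT, POLL, TRANS = 0x02, 0x03, 0x10, 0x69, 0x72, 0x74
COMMAND_BYTES = {STX, ETX, DLE, INIT, POLL, TRANS}


def stuff(data):
    """Insert DLE before every data byte that equals a command byte."""
    out = []
    for byte in data:
        if byte in COMMAND_BYTES:   # plays the role of the CmdFound signal
            out.append(DLE)         # S13 -> S14: send 'DLE' first
        out.append(byte)            # S15: send the data byte itself
    return out


# Data bytes of functional simulation 1: only 69h collides with a command.
stuffed = stuff([0xF8, 0x0F, 0xA5, 0xB7, 0x23, 0x11, 0x48, 0x69])
```

With the eight data bytes of simulation 1, the stuffed stream is nine bytes long because 10h is inserted before 69h.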
After S14, the CU moves to stage 16, S16. At this stage, the CU is allowed to select
the LFSR data using the MuxSel signal and asserts RegLoad to load the data into the
shift register. After this, the CU moves back to S15 and asserts shift to shift out
the data. The CU then moves to stage 17, S17. At S17, the CU checks the BufferFull
signal to determine whether it has shifted out 8 bytes of data. Recall that the IOP
contains eight 8-bit input buffers and can therefore receive 8 bytes of data at a
time. For this reason, the BIST CU is allowed to transmit a total of 8 bytes of data
to the IOP at a time. BufferFull is the signal from the DPU that tells the CU that 8
bytes of data have been shifted into the IOP and that it should now ask the IOP to
transmit serially the data it has received.
At S17, if BufferFull is high, the CU selects the command byte ‘etx’ as the next
byte to be shifted out to the IOP to indicate end of text. RegLoad is also asserted
to load it into the shift register. The CU then moves to stage 18, S18, where it
asserts shift to shift out the command byte ‘etx’ serially, and then moves to stage
19, S19. If BufferFull is low, the CU has not finished sending the 8 bytes of data
and moves to S13 to continue sending the LFSR data as discussed above.
At S19, the CU again checks NextCmd to determine whether the shift register has
finished shifting the current data or command and is ready to proceed with new data.
If NextCmd is active, the CU selects the command byte ‘poll’ and asserts RegLoad to
load it into the shift register. ‘poll’ is the command byte that asks the IOP to
transmit the data in its output buffers serially. During BIST mode, the data shifted
out by the BIST module and received by the IOP are stored in the input buffers. The
data are then fed back internally to the output buffers of the IOP to become the
data that the IOP transmits serially. To achieve this, after the BIST module sends 8
bytes of random data to the IOP, the BIST CU sends the command byte ‘poll’ to ask
the IOP to transmit serially the 8 bytes of data it has just received. With this
implementation, both the receiver and the transmitter of the IOP are tested without
generating separate sets of data or testing the receiver and transmitter separately.
This is done at S19. The CU then moves to stage 20, S20.
In S20, the CU selects and loads ALL1 as the data to be shifted out. This gives the
IOP a few clock cycles to transmit all 8 bytes of data. The CU then moves to stage
21, S21. During S21, the CU checks the logic value of BistDone. BistDone is the
signal from the DPU that informs the CU that the BIST operation is complete because
the LFSR has generated 256 random data values. If BistDone is low, the CU moves to
stage 22, S22, to continue sending data. At S22, the CU checks the value of
TransmitDone. TransmitDone is the signal from the UART transmitter that indicates
that the transmitter has completed the data transmission. If TransmitDone is low, the
CU stays at S22 until the transmission completes and TransmitDone goes high. The CU
then moves to S10 and repeats the same process.
Back at S21, if BistDone is high, the BIST operation is almost done: the LFSR has
generated all 256 random data values. The CU then moves to stage 23, S23. At this
stage, the CU halts for one clock cycle to ensure that the IOP completes the data
transmission before the TransmitDone signal is checked. The CU then moves to stage
24, S24. At S24, the CU checks TransmitDone to determine whether the IOP has finished
transmitting the last 8 bytes of data. If TransmitDone is low, the CU stays at S24;
otherwise, it selects ALL1, asserts RegLoad to load ALL1 into the shift register and
asserts loadSignature to load the final signature from the MISR in the DPU. After
this, the CU moves to the last stage, stage 25, S25, and asserts shift to shift out
ALL1. The CU stays at S25 unless the B1 and B2 value changes, in which case it goes
back to S0. During BIST mode, B1 and B2 should remain static at 01 for the entire
BIST operation. A change in the B1 and B2 value at any stage of the BIST moves the CU
back to S0, and the BIST operation must be rerun. When the CU reaches S25, the BIST
operation is considered complete.
CHAPTER 6
SIMULATION, VERIFICATION AND RESULTS ANALYSIS
This chapter discusses the code efficiency gain of the IOP Verilog HDL migration,
design compilation, design synthesis and analysis, and the simulation and
verification of the IOP with embedded BIST capability. Simulation waveforms, design
performance analysis and simulation results are also included in this chapter.
6.1 Verilog HDL Code Efficiency Gain
One of the objectives of this project is to migrate the IOP design from VHDL
modeling to Verilog HDL modeling. The conversion has been completed, and the result
of the HDL migration is compared in Table 6.1 below.
Overall, Verilog HDL gives better code efficiency than VHDL, as shown in Table 6.1.
The comparison uses the original IOP and excludes the BIST module, since the BIST
module was designed directly in Verilog HDL. The total number of lines is counted
excluding empty lines and comments. The original VHDL design consists of 3035 lines
of code, while the IOP in Verilog HDL has a total of 1844 lines. Thus, Verilog HDL
modeling requires about 39% fewer lines than VHDL modeling to model the same design.
This is because Verilog HDL, unlike VHDL, does not require component declarations
when instantiating designs; direct port mapping can be performed in Verilog HDL.
Table 6.1 Verilog HDL migration code efficiency gain
                         VHDL    Verilog    Gain
Total lines of code*     3035    1844       39.24%
Total entities           36      32         11.11%
The author also managed to reduce the number of design entities from 36 to 32, an
11% reduction. However, the drop in the total number of entities is not really due
to Verilog modeling efficiency. During the migration to Verilog HDL, the author
found several modules with different names but identical ports and functionality,
which means those entities were duplicates and could be removed. Thus, the number of
entities was reduced by 4 after the Verilog HDL conversion.
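The gain figures in Table 6.1 follow directly from the line and entity counts, as the short calculation below shows. The helper name `reduction_percent` is introduced here only for illustration.

```python
def reduction_percent(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0


line_gain = reduction_percent(3035, 1844)    # lines of code, VHDL -> Verilog
entity_gain = reduction_percent(36, 32)      # design entities, VHDL -> Verilog
print(f"lines: {line_gain:.2f}%  entities: {entity_gain:.2f}%")
# prints "lines: 39.24%  entities: 11.11%"
```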
In short, Verilog HDL gives better code efficiency in terms of number of lines
because it provides a much easier way to perform design or component instantiation.
6.2 Design Compilations, Synthesis and Timing Analysis
After the design verification of the IOP at RTL level, the RTL design needs to be
compiled and synthesized into a gate-level netlist using the Altera Quartus II 6.1
CAD tool. Quartus II reads in the RTL design and performs the full compilation,
which includes design analysis and synthesis, the fitter, the assembler, the classic
timing analyzer and the EDA netlist writer.
6.2.1 Design Synthesis and Analysis
The first step performed by Quartus II is design analysis and synthesis. During
this process, the tool builds a single project database that integrates all the
design files, which in this project are the IOP RTL Verilog HDL files, in a design
entity or project hierarchy.
During the Analysis stage of Analysis & Synthesis, the tool examines the logical
completeness and consistency of the project, and checks for boundary connectivity
and syntax errors. Analysis & Synthesis also synthesizes and performs technology
mapping on the logic in the design entities.
Quartus II infers flip-flops, latches, and state machines from Verilog HDL. It
creates state assignments for state machines and makes choices that minimize the
number of resources used. In addition, it replaces operators such as + or - with
modules from the Altera library of parameterized modules functions, which are
optimized for the Altera APEX20KE device.
During Analysis & Synthesis, several algorithms are used to minimize gate count,
remove redundant logic, and utilize the device architecture. The tool applies logic
synthesis techniques to help implement the timing requirements of the project and
optimizes the design to meet these requirements.
Table 6.2 below shows the Analysis and Synthesis resource usage summary for the
IOP. In brief, the IOP is synthesized for the FPGA device EP20K200EFC484-2X from the
APEX20KE family. The IOP uses a total of 979 logic elements of this device, which is
around 12% utilization. Broken down by entity, the IOP module itself, excluding the
components it instantiates, consumes 130 logic elements, while the IOP control unit
(CU) uses only 46 logic elements. The IOP DPU, which contains most of the data path
units and registers, consumes a total of 501 logic elements. The clock generator
uses 12 logic elements.
The newly added embedded BIST block uses 290 logic elements, only around 30% of the
total logic elements of the IOP. To reduce the hardware overhead of the embedded
BIST, the BIST module was designed to minimize logic gate usage, at the cost of a
more complex control unit and lower BIST throughput and performance, which are not
important when the IOP is in BIST mode.
The IOP uses a total of 136 I/O pins and has a total of 547 registers. reset_b, the
global reset signal of the IOP, has the highest fan-out, while the average fan-out
for the IOP is 3.81.
Table 6.2 Analysis and Synthesis - IOP Resource Usage Summary
Device family                        APEX20KE
Device                               EP20K200EFC484-2X
Total logic elements                 979/8320 (12%)
Logic elements usage by entity
  ⇒ IOP                              130
  ⇒ IOP CU                           46
  ⇒ IOP DPU                          501
  ⇒ BIST                             290
  ⇒ Clock generator                  12
  Total                              979
Total combinational functions        800
  ⇒ Total 4-input functions          462
  ⇒ Total 3-input functions          237
  ⇒ Total 2-input functions          52
  ⇒ Total 1-input functions          48
  ⇒ Total 0-input functions          1
Total registers                      547
Total logic cells in carry chains    50
I/O pins                             136
Maximum fan-out node                 reset_b
Maximum fan-out                      462
Total fan-out                        4244
Average fan-out                      3.81
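The percentages quoted for Table 6.2 can be cross-checked from the per-entity counts. The snippet below is only an arithmetic check on the numbers in the table; the variable names are chosen here for illustration.

```python
# Logic elements per entity, copied from Table 6.2.
logic_elements = {
    "IOP": 130,
    "IOP CU": 46,
    "IOP DPU": 501,
    "BIST": 290,
    "Clock generator": 12,
}
DEVICE_LOGIC_ELEMENTS = 8320   # capacity reported for the EP20K200EFC484-2X

total = sum(logic_elements.values())
utilization = total / DEVICE_LOGIC_ELEMENTS * 100.0   # share of the device
bist_share = logic_elements["BIST"] / total * 100.0   # BIST share of the IOP

# Combinational functions by number of inputs, also from Table 6.2.
functions_by_inputs = {4: 462, 3: 237, 2: 52, 1: 48, 0: 1}

print(f"total={total}, utilization={utilization:.1f}%, BIST={bist_share:.1f}%")
```

The entity counts add up to the reported 979 logic elements, which is about 11.8% of the device, and the BIST block accounts for about 29.6% of the IOP.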
6.2.2 Fitter and Assembler
After design synthesis, the Fitter places and routes the IOP design using the
database created by Analysis and Synthesis. The Fitter matches the logic and timing
requirements of the IOP with the available resources of the device. It assigns each
logic function to the best logic cell location for routing and timing, and selects
appropriate interconnection paths and pin assignments.
After place and route, the Assembler module of the Quartus II Compiler generates
programming files that the Quartus II Programmer can use to program or configure a
device with Altera programming hardware. Because of limited resources, the IOP was
not programmed into an FPGA device in this project; thus, the fitter and assembler
are not discussed further.
6.2.3 Design Timing Analysis
After design analysis and synthesis and place and route, timing analysis is
required to check whether the design meets the timing requirements and whether there
are any setup or hold timing violations. The Quartus II Classic Timing Analyzer is
used to analyze the timing performance of all logic in the IOP and to help guide the
Fitter to meet the timing requirements. The timing analysis results for the IOP are
summarized in Table 6.3.
Clock setup time (tsu) is the length of time for which data that feeds a register
via its data or enable input must be present at an input pin before the clock signal
that clocks the register is asserted at the clock pin. Every register or flip-flop
has its own clock setup time. The worst-case clock setup time for the entire IOP is
8.566ns.
On the other hand, clock hold time (th) is the length of time for which data that
feeds a register via its data or enable input must be retained at an input pin after
the clock signal that clocks the register is asserted at the clock pin. In the IOP,
the worst-case clock hold time is 7.734ns.
Table 6.3 Summary of IOP timing analysis
Type                                      Actual Time
Worst case tsu (clock setup time)         8.566ns
Worst case tco (clock to output delay)    18.218ns
Worst case th (clock hold time)           7.734ns
Clock setup                               21.286ns
Clock hold                                N/A (clock skew > data delay)
fmax (maximum frequency)                  46.98MHz
Total number of failed paths              0
Clock to output delay (tco) is the time required to obtain a valid output at an
output pin that is fed by a register after a clock signal transition on an input pin
that clocks the register. In the IOP, the worst-case clock to output delay is
18.218ns. In most designs, it is desirable for the clock to output delay to be as
small as possible so that the design can reach its maximum clock frequency.
Clock setup is the minimum clock period required by the slowest register-to-register
path in the design; it accounts for the clock to output delay of the source
register, the data path delay and the setup time of the destination register. Clock
setup is therefore used to calculate the maximum clock frequency that can be applied
to the design. The clock period of the design must be larger than the clock setup in
order to meet setup timing. For the IOP design, the clock setup is 21.286ns, which
means the maximum clock frequency allowed is 46.98MHz.
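The relation between the reported clock setup and fmax is simply a reciprocal, as the short calculation below illustrates. The helper names are introduced here for illustration, and the 25MHz figure is the external clock supplied to the IOP.

```python
def fmax_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a minimum clock period in ns."""
    return 1000.0 / min_period_ns


def setup_slack_ns(clock_mhz, min_period_ns):
    """Positive slack means the clock period is longer than the clock setup."""
    return 1000.0 / clock_mhz - min_period_ns


fmax = fmax_mhz(21.286)                   # clock setup from Table 6.3
slack_25mhz = setup_slack_ns(25.0, 21.286)
print(f"fmax = {fmax:.2f} MHz, slack at 25 MHz = {slack_25mhz:.3f} ns")
# prints "fmax = 46.98 MHz, slack at 25 MHz = 18.714 ns"
```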
From Table 6.3, the IOP does not face any clock hold issue: the analyzer reports no
clock hold requirement for the IOP, with the note that the clock skew is greater
than the data delay, and no path fails timing.
From the discussion above, since the IOP can run at up to 46.98MHz, there is no
timing issue in using a 25MHz clock as the supply clock for the IOP. In fact,
although the clock supplied externally to the IOP is 25MHz, the IOP does not use the
25MHz clock directly but derives from it a control unit clock that runs at 1MHz and
a UART transmitter clock that runs at about 10kHz. This is because the IOP is a
low-speed serial communication device designed to run at a baud rate of 9600. Thus,
on the same FPGA device, the IOP could be enhanced to support higher baud rates.
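The divide ratios implied by these clock frequencies can be estimated as follows. This is only a sketch: it assumes plain integer dividers from the 25MHz clock, which this section does not confirm, and the helper name `divider_for` is hypothetical.

```python
def divider_for(source_hz, target_hz):
    """Nearest integer divide ratio and the frequency it actually yields."""
    ratio = round(source_hz / target_hz)
    return ratio, source_hz / ratio


ratio_cu, cu_hz = divider_for(25_000_000, 1_000_000)   # control unit clock
ratio_baud, baud_hz = divider_for(25_000_000, 9600)    # serial bit clock

baud_error_percent = (baud_hz - 9600) / 9600 * 100.0
frame_time_ms = 10 / 9600 * 1000.0                     # one 10-bit frame

print(ratio_cu, ratio_baud, f"{baud_error_percent:.4f}%", f"{frame_time_ms:.3f} ms")
```

Under this assumption a ratio of 25 gives exactly 1MHz, a ratio of 2604 gives a bit clock within 0.01% of 9600 baud, and each 10-bit frame takes about 1.04ms on the serial line.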
After the timing analysis, Quartus II writes the netlist and the standard delay
format (SDF) file. The SDF file contains all the timing information of the IOP,
including timing checks of the design such as setup and hold checks. Timing
simulation can later be performed by compiling the netlist in the simulator and
reading in the SDF file to back-annotate the delay information.
Figure 6.1 below shows the summary of the full IOP design compilation by Quartus II.
Figure 6.1 Full compilation of the design by Quartus II
6.3 Design Test bench
In any RTL design, the test bench plays an important role in design validation and
verification. A test bench is normally created to model or emulate the components
around the design. The design needs to “communicate” and handshake with the other
components around it to form a complete system or platform.
Because the data input to the IOP and UART is in serial format, verification would
be very troublesome without tool support, as the input vector to the IOP is 1 bit for
every clock cycle. Thus a suitable test bench is needed for the design verification.
This section discusses the test bench that has been designed to verify the IOP in
both functional and BIST mode. The test bench is written as a behavioral model to
emulate the FES and to provide a fast and simple verification platform.
In general, the test bench consists of 3 main functional blocks, as illustrated in
Figure 6.2. The first block is the register file, which stores all the input vectors
from the user. The register file consists of 100 10-bit registers, which support up
to 100 command and data bytes in a single simulation. By default, all of these
registers are set to all ‘1’. The default values can be overwritten before or during
the simulation to perform the verification. All the commands and data programmed
into the register file are concatenated to form the input data bus.
The second block of the test bench is the clock generator. The IOP uses a 25MHz
clock, but the serial input runs at a baud rate of 9600. Thus, the clock generator
in the test bench generates the 25MHz clock for the IOP and derives from it a clock
at the serial input frequency in order to shift the serial data into the IOP bit by
bit at a baud rate of 9600.
Figure 6.2 Design test bench
The third functional block in the test bench is the input data shifter. The test
bench shifts the data programmed in the register file serially to the IOP to perform
the verification. This block consists of a shift register that shifts the data bus
bit by bit into the IOP. Quartus II allows users to specify the test bench file to
be used during simulation, and thus the test bench is not synthesized into the FPGA
device. With the test bench, verification and validation of the IOP functionality
become easier and faster.
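The register file and input data shifter can be modelled in a few lines to show how the serial input stream is formed. This is a behavioural sketch in Python rather than the Verilog test bench itself; the function names are hypothetical, and the order in which registers and bits are shifted out is assumed to be first register first, leftmost bit first.

```python
IDLE = "1" * 10          # default register value: all ones (idle line)
NUM_REGISTERS = 100      # the register file holds 100 10-bit registers


def build_input_bus(frames):
    """Program 10-bit frames into the register file and concatenate it."""
    if len(frames) > NUM_REGISTERS:
        raise ValueError("register file holds at most 100 frames")
    registers = [IDLE] * NUM_REGISTERS
    for index, bits in enumerate(frames):
        if len(bits) != 10 or set(bits) - {"0", "1"}:
            raise ValueError("each register holds exactly 10 bits")
        registers[index] = bits
    return "".join(registers)


def shift_out(bus):
    """Yield the bus one bit per serial clock tick."""
    for bit in bus:
        yield int(bit)


# Program the frames for 02h (stx) and 69h (init); the rest stay idle.
bus = build_input_bus(["0000000101", "0011010011"])
serial_in = list(shift_out(bus))
```

Leaving unused registers at all ones keeps the serial line at its idle level once the programmed frames have been shifted out.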
[Figure 6.2 block labels: Test bench (Register file, Input data shifter, Clock Generator), IOP, DUT]
6.4 IOP Simulations and Discussions
This section discusses the simulation results together with the waveforms and
consists of 2 parts. The first is the simulation of the IOP in functional mode and
the second is the simulation of the IOP in BIST mode. The IOP is set to normal
functional mode by setting the signals B1 and B2 to “00”, while for BIST mode, B1
and B2 are set to “01”. The simulations are performed using ModelSim-Altera 6.1g as
the simulator.
6.4.1 Functional Mode
Simulation 1: Data Transmission to IOP
This simulation is performed to verify that the IOP is able to receive serial data
through the UART. As discussed, the message packet to the IOP should always start
with 02h (stx), followed by 69h (init), then the number of data bits the test bench
wants to transmit to the IOP and the number of data bits it wants to receive from
the IOP. Each command and data item is 1 byte long; each is then appended with 1
start bit, which is always 0, and 1 stop bit, which is always 1, making a total of 10
bits for each data or command byte.
Message and data packet
02         69         40         03         74         f8         0f
0000000101 0011010011 0010000001 0000000111 0011101001 0111110001 0000011111
a5         b7         23         11         48         10         69
0101001011 0101101111 0001000111 0000100011 0010010001 0000100001 0011010011
03
0000000111
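The 10-bit patterns listed for the message packet can be reproduced with a small helper. The sketch below writes each frame as the packet listing prints it, that is start bit, then the 8 data bits with the most significant bit on the left, then the stop bit; the function names are hypothetical, and the order in which the bits appear on the serial line is not implied by this sketch.

```python
def frame(byte):
    """Return the 10-bit frame for one byte: start bit 0, 8 data bits, stop bit 1."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a frame carries exactly one byte")
    return "0" + format(byte, "08b") + "1"


def unframe(bits):
    """Recover the byte from a 10-bit frame, checking the start and stop bits."""
    if len(bits) != 10 or bits[0] != "0" or bits[-1] != "1":
        raise ValueError("not a valid 10-bit frame")
    return int(bits[1:9], 2)


header = [frame(b) for b in (0x02, 0x69, 0x40, 0x03, 0x74)]
# header == ['0000000101', '0011010011', '0010000001', '0000000111', '0011101001']
```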
After the test bench has told the IOP the number of data bits it would like to
transmit, the test bench sends the command 74h (trans) to indicate that all bytes
following this command byte are data sent to the IOP. In this case, the IOP was told
to receive a total of 64 bits of data (8 bytes). From the waveform in Figure 6.3,
T_input_reg, the register that stores the number of data bits to be transmitted to
the IOP, is updated to 64.
Figure 6.3 Waveform for functional mode simulation 1 result
9 bytes are then sent to the IOP following the command byte 74h, in this case F8h,
0Fh, A5h, B7h, 23h, 11h, 48h, 10h and 69h. The command byte 10h (dle) is sent
because the last data byte, 69h, has the same value as one of the command bytes;
thus 10h is inserted before 69h.
From the waveform in Figure 6.3, all the command and data bytes are shifted
serially, bit by bit, into the IOP. The IOP decodes the commands and data. In this
case, the 8 bytes of data sent to the IOP are stored in the input pin buffers. After
the last byte of data, 69h, is stored, recv is asserted; recv is the signal that
indicates that the IOP has finished receiving the data from the test bench and is
ready to send the data out in parallel. The IOP then sends the 8 bytes of data out
in parallel to the DUT via the 64-bit signal input_pin.
In this simulation, no data is transmitted serially out of the IOP; thus, the serial
output stays at ‘1’ all the time.
Simulation 2: Data Transmission from IOP
Simulation 2 is performed to verify that the IOP is able to transmit data out
serially. As usual, the message packet to the IOP should always start with 02h
(stx), followed by 69h (init) and then the number of data bits the test bench wants
to receive from the IOP. In this case, the test bench wants the IOP to transmit 48
bits (6 bytes) of data from the output pins of the DUT. The DUT output pins are
always in ready mode. The output pin values are 98h, 76h, 54h, 32h, 5Ah and A5h.
Message and data packet
02         69         b0         03         72
0000000101 0011010011 0101100001 0000000111 0011100101
From the waveform in Figure 6.4, the IOP stores the 6 bytes of data in 6 output
buffers, each of which can store 1 byte of data. After this, the IOP transmits the
data out serially, starting with 02h (stx) to indicate the start of the transmission
packet, followed by the 6 bytes of data, starting from the least significant byte:
A5h, 5Ah, 32h, 54h, 76h and 98h. Finally, the IOP appends the command byte 03h (etx)
at the end of the packet to indicate that it has completed the data transmission.
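The packet that the IOP sends back in this simulation can be written down as a simple list operation. This is an illustrative sketch only: the function name is hypothetical, the output pin values are assumed to be listed from the most significant byte to the least significant byte, and any DLE insertion on the outgoing data is left out because none of these data bytes matches a command byte.

```python
STX, ETX = 0x02, 0x03


def response_packet(output_pins):
    """Build stx, the data bytes from least significant byte first, then etx."""
    return [STX] + list(reversed(output_pins)) + [ETX]


packet = response_packet([0x98, 0x76, 0x54, 0x32, 0x5A, 0xA5])
print(" ".join(f"{byte:02X}" for byte in packet))
# prints "02 A5 5A 32 54 76 98 03"
```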
Figure 6.4 Waveform for functional mode simulation 2 result
Simulation 3: Back-to-back Data Transmission to IOP
Simulation 3 verifies the back-to-back data receive scenario for the IOP. The
number of data bytes received by the IOP is set to 3 in each transmission. The IOP
starts to receive data when the command byte trans (74h) is received. The test
bench then sends the data to the IOP. In the first transmission, the data sent by the test
bench is AAh, 10h, 69h and 0Fh. From the waveform in Figure 6.5, only 3 bytes of
data are saved in the input buffers: AAh, 69h and 0Fh. 10h (dle) is not saved,
as it is the command byte that tells the IOP that the next data byte has the same value as one
of the command bytes, which in this simulation is 69h (init). The first transmission
ends when the IOP receives the command byte 03h (etx), which indicates the end of text
transmission.
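The handling of the 10h byte described above can be sketched as a small behavioural model. This is an illustration only, not the IOP control unit: the array names and the loop are hypothetical, and it assumes that the byte after 10h is always stored as data.

```verilog
`timescale 1ns/1ns
// Sketch only: drop each dle (10h) and store the byte after it as plain data.
module dle_sketch;
  localparam DLE = 8'h10;
  reg [7:0] rx     [0:3];  // bytes seen between trans (74h) and etx (03h)
  reg [7:0] in_buf [0:7];  // hypothetical stand-in for the input buffers
  reg       escaped;       // set when the previous byte was an unescaped dle
  integer   i, n;

  initial begin
    rx[0] = 8'hAA; rx[1] = 8'h10; rx[2] = 8'h69; rx[3] = 8'h0F;
    escaped = 1'b0;
    n = 0;
    for (i = 0; i < 4; i = i + 1) begin
      if (!escaped && rx[i] == DLE) begin
        escaped = 1'b1;          // dle itself is not stored
      end else begin
        in_buf[n] = rx[i];       // ordinary or escaped byte is stored
        n = n + 1;
        escaped = 1'b0;
      end
    end
    for (i = 0; i < n; i = i + 1)
      $display("in_buf[%0d] = %h", i, in_buf[i]);
    $finish;
  end
endmodule
```

For the first transmission of Simulation 3 the model stores AAh, 69h and 0Fh, in that order.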
The second transmission starts as soon as the IOP receives the second trans
(74h) command byte. In the second transmission, 4 bytes instead of 3 bytes are sent to
the IOP to verify how the IOP handles the extra data byte. Since the IOP has been told to
receive only 3 data bytes in each transmission, the last data byte, 00h, is
discarded by the IOP. Even though the 4th data byte is still stored in the input
buffer, only 3 bytes of data are sent to the DUT input pins, as shown in the
waveform in Figure 6.5. In a real usage model, however, the test bench should send
exactly the number of data bytes that has been indicated to
the IOP, and should not send fewer or more.
Message and data packet
02         69         18         03         74         aa         10
0000000101 0011010011 0000110001 0000000111 0011101001 0101010101 0000100001
69         0f         03         74         99         3b         88
0011010011 0000011111 0000000111 0011101001 0100110011 0001110111 0100010001
00         03
0000000001 0000000111
Figure 6.5 Waveform for functional mode simulation 3 result
Simulation 4: Back-to-back Data Transmission from IOP
Message and data packet
02         69         a0         03         72         ff         ff
0000000101 0011010011 0101000001 0000000111 0011100101 1111111111 1111111111
ff         ff         ff         ff         ff         ff         ff
1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111
ff         72
1111111111 0011100101
Simulation 4 verifies that the IOP is able to transmit data back-to-back. The number
of data bits transmitted in one transmission is set to 32 (4 bytes). The transmission
starts when the IOP receives the first poll command byte (72h). The 4 bytes of data at the
output pins are transferred to the output buffers and transmitted out serially by the IOP.
Note that the IOP is a half-duplex device and thus cannot transmit and receive at the
same time. While the IOP is transmitting, the test bench should be receiving
the data and should not send any data or command byte to the IOP. This is
done by sending 10 bytes of FFh to the IOP, which holds the serial input at '1' so that the IOP can complete the
transmission first.
After the transmission is completed, another poll (72h) command byte is sent to
the IOP to ask it to send the new data back. Assume that by this time the output
pin values have changed. As shown in the waveform
in Figure 6.6, the IOP sends another 4 bytes of data serially out to the test bench.
Figure 6.6 Waveform for functional mode simulation 4 result
Simulation 5: Data Transmission to and from IOP
Simulation 5 verifies that the IOP is able to send data to the DUT and transmit
DUT data out serially. As discussed, every command and data packet
should start with 02h, 69h and then the numbers of DUT input and output pins. In this
simulation, the number of DUT input pins is set to 32 and the number of DUT output
pins is set to 20, as shown by the U_T_input_reg and U_T_output_reg registers
in the simulation waveform in Figure 6.7. After this, command 74h is sent to the IOP to
ask it to receive the data. A total of 5 bytes of data (78h, 69h, 69h,
6Ah, 6Ah) are sent to the IOP. However, as the IOP was told by the init
command that the number of bits to receive is 32, which is 4 bytes,
only the first 4 bytes (78h, 69h, 69h, 6Ah)
are sent to the DUT as the input vector, as the waveform shows. The 5th byte is discarded.
After receiving the data, the IOP is also told by the message packet to transmit the
data in the buffer from the output pins of the DUT back to the test bench. As shown
in the waveform, a total of 3 bytes of data (24 bits) are transmitted, because the
message packet indicated to the IOP that the number of DUT outputs is 20 bits, which rounds up to 3 whole bytes.
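The rounding from pin count to byte count can be sketched as below. The interpretation of the pin-count bytes is an assumption inferred from the packets in this chapter (18h for 24 input bits, b0h for 48 output bits, 94h and 20h for 20 outputs and 32 inputs) and is not stated in the text: bit 7 is taken as an output-pin flag and bits 6 to 0 as the count in bits. The module, function and task names are hypothetical.

```verilog
`timescale 1ns/1ns
// Sketch only: decode a pin-count byte and round the bit count up to bytes.
module pincount_sketch;
  // Whole bytes needed to carry nbits bits (rounds up).
  function [3:0] bytes_needed;
    input [6:0] nbits;
    begin
      bytes_needed = (nbits + 7) >> 3;
    end
  endfunction

  // Assumption: bit 7 set means output pins, clear means input pins.
  task show;
    input [7:0] b;
    begin
      if (b[7])
        $display("%h: %0d output bits -> %0d bytes", b, b[6:0], bytes_needed(b[6:0]));
      else
        $display("%h: %0d input bits -> %0d bytes", b, b[6:0], bytes_needed(b[6:0]));
    end
  endtask

  initial begin
    show(8'h94); // 20 output bits -> 3 bytes
    show(8'h20); // 32 input bits -> 4 bytes
    $finish;
  end
endmodule
```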
Message and data packet
02         69         94         20         03         74         78
0000000101 0011010011 0100101001 0001000001 0000000111 0011101001 0011110001
69         69         6A         6A         03         72
0011010011 0011010011 0011010101 0011010101 0000000111 0011100101
Figure 6.7 Waveform for functional mode simulation 5 result
6.4.2 BIST Mode
The BIST operation for the IOP is performed by setting B1 and B2 to
logic 0 and logic 1 respectively. In this mode, the BILBO works as a pattern
generator. Before this, however, a 'seed' value must be loaded into the LFSR.
This is done by setting B1 and B2 to 1 and 0 respectively, as shown
in Figure 6.9. In this case, the LFSR is initialized with a seed value of 5Ah, which
means the first value generated by the LFSR is 5Ah. Please refer to Chapter 5 of this
thesis for a detailed description of the IOP BIST operation. Two simulations are
performed for the IOP in BIST mode: the first for a fault-free
circuit, and the second for a faulty circuit.
6.4.2.1 Fault Free Circuit Simulation
In Figure 6.9, data_fsm is the register in the IOP CU that stores the
commands and data received by the IOP for processing. BilboSerialOut is the serial
input signal to the IOP from the BIST module. This signal takes the place of the serial_in signal in
BIST mode, carrying the commands and data to the IOP for testing. LFSR is the
data generated by the LFSR pattern generator in BIST mode, while
count is the signal in the BIST DPU that reflects the number of data bytes the LFSR has
generated in BIST mode.
As shown in Figure 6.9, the BIST module sends the serial commands and data
through BilboSerialOut to the IOP. The IOP receives the commands and data and stores them
in data_fsm for processing. From the waveform, we can observe that the first
command the IOP receives is stx (02h), followed by init (69h), and then 2
bytes of data, C0h and 40h, which indicate to the IOP the numbers of DUT input and output
pins it is connected to. In BIST mode, they are set to the maximum number of
pins the IOP supports, 64 for both input and output.
After this, the BIST module sends the command byte etx (03h) to the IOP. The
IOP then receives the command byte trans (74h), which requests it to receive data. 
In BIST mode, the data are the pseudo-random data generated by the LFSR. Since the LFSR
seed value is initialized to 5Ah, the first data byte the IOP receives is 5Ah, followed
by 64h, 68h, D1h, A3h, 46h, 8Ch and 19h, a total of 8 bytes of data. The IOP then
receives the command byte 03h to indicate the end of text.
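A seeded 8-bit LFSR of the kind described here can be sketched as follows. The feedback taps (bits 7, 3, 2 and 1, shifting left) are an assumption and are not taken from the thesis design; they were chosen because, from the seed 5Ah, they give 5Ah, B4h, 68h, D1h, A3h, 46h, 8Ch and 19h, which agrees with the bytes listed above except for the second one (64h in the text). The module and port names are hypothetical.

```verilog
`timescale 1ns/1ns
// Sketch only: 8-bit shift-left LFSR with a loadable seed.
module lfsr8_sketch (input clk, input load, input [7:0] seed, output reg [7:0] q);
  // Assumed taps; the real BIST DPU may use a different polynomial.
  wire fb = q[7] ^ q[3] ^ q[2] ^ q[1];
  always @(posedge clk)
    if (load) q <= seed;            // seed phase
    else      q <= {q[6:0], fb};    // pattern generation phase
endmodule

module lfsr8_tb;
  reg clk = 0, load = 1;
  wire [7:0] q;
  integer i;
  lfsr8_sketch dut (.clk(clk), .load(load), .seed(8'h5A), .q(q));
  always #5 clk = ~clk;
  initial begin
    @(posedge clk); #1 load = 0;    // seed is loaded on the first clock edge
    for (i = 0; i < 8; i = i + 1) begin
      $display("byte %0d = %h", i, q);
      @(posedge clk); #1;
    end
    $finish;
  end
endmodule
```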
Figure 6.8 Simulation waveform 1 for fault free IOP in BIST mode
As shown in Figure 6.8, the signal count increases by 1 for every data byte generated
by the LFSR. So far in this simulation, the LFSR has generated 8 bytes of data for the IOP, and
thus the count value is 8. The IOP receives the data and stores them in the eight 8-bit input
buffers. Input_reg_out_0 .... Input_reg_out_7 are the 8 registers that store the input
data received by the IOP.
After all the input data have been received, the IOP loads the data onto inputpin,
which is the DUT input port. In BIST mode, however, inputpin is
fed back into outputpin, which is otherwise the DUT output pins
connected to the IOP. Thus, from the waveform, we can observe that all the input data
that the IOP received are now in out_reg_0 .. out_reg_7, which are the registers that store
the outputpin values.
After this, the BIST module sends the command byte poll (72h) to request the IOP to
transmit back the data it has just received. Thus, as can be seen in
Figure 6.8, the IOP transmits the data in out_reg serially through the func_serial_out
signal. func_serial_out carries the same data as the serial_out signal does in functional mode, but in
BIST mode it is observed by the BIST module instead. func_serial_out is connected to the inputs
of the MISR in the BIST DPU, which compacts the IOP response and produces the signature.
In this way, both the receiver and the transmitter can be tested via BIST.
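Response compaction can be illustrated with a much smaller register than the 64-bit MISR of the design. The sketch below is a single-input signature register, an assumed simplification: the 16-bit width, the feedback taps (1021h) and all names are illustrative choices, not the thesis implementation. Two copies compact the same bit stream, except that one response bit is corrupted in the second copy.

```verilog
`timescale 1ns/1ns
// Sketch only: serial-input signature register (CRC-style feedback).
module sisr_sketch #(parameter W = 16, parameter [W-1:0] TAPS = 16'h1021)
  (input clk, input reset, input din, output reg [W-1:0] sig);
  wire fb = sig[W-1] ^ din;   // incoming bit folded into the feedback
  always @(posedge clk)
    if (reset) sig <= {W{1'b0}};
    else       sig <= {sig[W-2:0], 1'b0} ^ ({W{fb}} & TAPS);
endmodule

module sisr_tb;
  reg clk = 0, reset = 1;
  reg good_bit = 0, bad_bit = 0;
  reg [31:0] stream;
  wire [15:0] sig_good, sig_bad;
  integer i;
  sisr_sketch g (.clk(clk), .reset(reset), .din(good_bit), .sig(sig_good));
  sisr_sketch b (.clk(clk), .reset(reset), .din(bad_bit),  .sig(sig_bad));
  always #5 clk = ~clk;
  initial begin
    stream = 32'h5AB468D1;           // arbitrary response bits
    @(posedge clk); #1 reset = 0;
    for (i = 31; i >= 0; i = i - 1) begin
      good_bit = stream[i];
      bad_bit  = (i == 7) ? ~stream[i] : stream[i]; // one corrupted bit
      @(posedge clk); #1;
    end
    $display("good = %h, faulty = %h, match = %b",
             sig_good, sig_bad, sig_good == sig_bad);
    $finish;
  end
endmodule
```

Because the register is linear and the tap constant has its lowest bit set, a single corrupted bit always yields a different signature in this sketch, so match prints 0. Errors in several bits can, with small probability, cancel out and alias to the fault-free signature.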
When the transmission is completed, the BIST module sends the
command byte trans (74h) again to ask the IOP to receive another 8 bytes of data
generated by the LFSR. After sending the 8 bytes of data, the BIST module requests the IOP
to transmit the received data out serially by sending the command byte
poll (72h). The BIST module continues to send new data generated by the LFSR
to the IOP and to request the IOP to transmit the received data back.
Figure 6.9 Simulation waveform 2 for fault free IOP in BIST mode
As discussed before, when a data value is the same as a command or
control byte, the message and data packet protocol requires the dle (10h) command byte to
be inserted before the data. The BIST CU supports this: as shown in Figure 6.9,
when the data value is the same as a command byte, for example 78h, the BIST CU
inserts a dle byte before it, and the value of count does not increase for the dle byte.
The testing process continues until the LFSR has completed the generation of
256 data bytes, which means the LFSR repeats the value 5Ah and the count value
overflows from 255 to 0. When this happens, the BIST operation is considered complete.
As shown in Figure 6.10, the count value has become 0 and the LFSR is generating
5Ah again. The BIST module sends the last command byte poll to the
IOP to transmit the last 8 bytes of data out serially. Once the transmission of the
last 8 bytes is completed, the BIST module loads the final MISR signature, which is
16377CF22E996CF9h. This signature is the golden signature.
Figure 6.10 Simulation waveform 3 for fault free IOP in BIST mode
6.4.2.2 Faulty Circuit Simulation
The simulation for a faulty circuit is performed by injecting a stuck-at-1 or
stuck-at-0 fault on any signal inside the IOP UART module, but not the BIST module.
During BIST mode, the BIST module itself must be fault free. A stuck-at-1 or stuck-at-0
fault holds the logic value of the signal at 1 or 0 at all times. In this
simulation, a stuck-at-1 fault is injected on 1 bit of the input register in the IOP
DPU, and the BIST operation is performed as in the fault-free simulation.
As shown in Figure 6.11, the BIST module sends the test patterns
to the IOP exactly as in the fault-free simulation. The BIST operation continues
until the LFSR has completed the test pattern generation and the IOP outputs the final
signature.
Figure 6.11 Simulation waveform 1 for faulty IOP in BIST mode
From the waveform in Figure 6.12, the final signature for the faulty IOP is
4DD90234E15D2B43h, which is different from that of the fault-free IOP. By comparing the
2 signatures, a faulty IOP chip can thus be detected using BIST.
Figure 6.12 Simulation waveform 2 for faulty IOP in BIST mode
CHAPTER 7
CONCLUSION AND FUTURE WORK
This chapter concludes the design of the IOP with BIST capability and proposes
future work.
7.1 Conclusion
In conclusion, an I/O processor with built-in-self-test has been
successfully designed. The simulated waveforms and results presented in Chapter 6
show that the Verilog HDL implementation models the
characteristics and the architecture of the designed IOP with embedded BIST at RTL
and gate level.
The results and simulation waveforms in Chapter 6 also show how BIST
can be performed on the IOP. Although the embedded BIST capability introduces an
additional 30% hardware overhead, this is reasonable
considering the test performance obtained and the ability of the BIST block to provide
high fault coverage.
This project has shown that implementing BIST in a design effectively
provides on-chip test generation and evaluation. It offers an easy way to test the chip
without having to buy costly test equipment. Although the technique was
implemented on a low-end, low-speed device, its usefulness as a testing process
and proof of concept has been demonstrated. With the implementation of BIST,
expensive tester requirements and testing procedures can be minimized.
The LFSR in the BIST block replaces expensive testers in
generating pseudo-random test patterns for the IOP, while the MISR compacts the
IOP output response into a signature of manageable size. The design makes use of the
output pins of the IOP to shift out the signature at the end of the BIST operation
without having to create extra pins.
Although the BIST operation and control unit in this project are specific to the IOP
design, the BIST architecture, concept and implementation used in
this project are design independent and can be applied to any other design.
7.2 Recommendation for Future Works
Although the IOP was not originally designed in this research, the
recommended future work below covers both the IOP and the BIST
module.
1. The IOP design has only 8 input and 8 output buffers, which can store up to
8 bytes of input and output data. Future work can increase
the input and output buffer sizes in order to store more data at a time.
2. The current IOP does not have a parity bit check. Future work can
add a parity bit for error checking.
3. Support for different and higher baud rates. The current IOP has a
fixed baud rate of 9600, which makes the IOP a very slow device. Future
work can upgrade the IOP to support different (controllable) and higher
baud rates.
4. Although the current BIST implementation provides high fault coverage,
it is still not 100%. For example, the wires from the primary
inputs to the input multiplexer and the wires from the circuit outputs to the
primary outputs cannot be tested by BIST. These wires require
another testing method, such as an external ATE or JTAG Boundary
Scan hardware.
5. BIST is able to detect a faulty circuit by comparing the signature
with the golden signature. However, BIST is not able to tell which
part of the circuit is faulty. Future work can enhance the
IOP with other design-for-testability (DFT) techniques, such as JTAG
Boundary Scan.
6. Due to time constraints and resource limitations, the design in this project was not
programmed into actual FPGA hardware.
Future work can implement the design on an
actual FPGA device to prove that the design works in
hardware.
7. A power-efficient IOP. Power has become a hot topic in IC design
nowadays, and a good designer must design devices with low power
consumption. Although the IOP is considered a low-end, low-speed
device, several power-saving methods can be implemented in
the design to make the IOP more power efficient, such as dynamic clock
gating, which turns off the clocks when the device is idle, or
reducing the clock frequency during the idle state, to reduce the switching
power of the device.
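As an illustration of item 2, a parity bit could be generated and checked as sketched below. This is not part of the current IOP: the 11-bit frame (start bit, 8 data bits, even-parity bit, stop bit), the position of the parity bit and the function names are all assumptions made for the example.

```verilog
`timescale 1ns/1ns
// Sketch only: even parity added to the 10-bit frame (hypothetical format).
module parity_sketch;
  // {start 0, data[7:0], even-parity bit, stop 1}
  function [10:0] frame11;
    input [7:0] data;
    begin
      frame11 = {1'b0, data, ^data, 1'b1};
    end
  endfunction

  // Data bits plus parity bit must contain an even number of 1s.
  function parity_ok;
    input [10:0] f;
    begin
      parity_ok = ((^f[9:1]) == 1'b0);
    end
  endfunction

  initial begin
    $display("frame = %b ok = %b", frame11(8'h69), parity_ok(frame11(8'h69)));
    // Flip one data bit to emulate a transmission error.
    $display("corrupted ok = %b", parity_ok(frame11(8'h69) ^ 11'b00000100000));
    $finish;
  end
endmodule
```

A single flipped bit is detected (ok becomes 0), while an even number of flipped bits would pass unnoticed, which is the usual limitation of a single parity bit.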
APPENDIX A
Module Block Diagram and Verilog HDL Code
A1: Block diagram and VHDL code for Testbench module
Testbench module block diagram (blocks shown: Test bench, IOP, DUT, Register file, Input data shifter, Clock Generator)
library IEEE; use IEEE.std_logic_1164.all; use IEEE.std_logic_arith.all; use IEEE.std_logic_unsigned.all; entity Testbench is generic ( data1 : std_logic_vector(9 downto 0) := "1111111111"; data2 : std_logic_vector(9 downto 0) := "1111111111"; data3 : std_logic_vector(9 downto 0) := "1111111111"; data4 : std_logic_vector(9 downto 0) := "1111111111"; data5 : std_logic_vector(9 downto 0) := "1111111111"; data6 : std_logic_vector(9 downto 0) := "1111111111"; data7 : std_logic_vector(9 downto 0) := "1111111111"; data8 : std_logic_vector(9 downto 0) := "1111111111"; data9 : std_logic_vector(9 downto 0) := "1111111111"; data10 : std_logic_vector(9 downto 0) := "1111111111"; data11 : std_logic_vector(9 downto 0) := "1111111111"; data12 : std_logic_vector(9 downto 0) := "1111111111"; data13 : std_logic_vector(9 downto 0) := "1111111111"; data14 : std_logic_vector(9 downto 0) := "1111111111"; data15 : std_logic_vector(9 downto 0) := "1111111111"; data16 : std_logic_vector(9 downto 0) := "1111111111"; data17 : std_logic_vector(9 downto 0) := "1111111111"; data18 : std_logic_vector(9 downto 0) := "1111111111"; data19 : std_logic_vector(9 downto 0) := "1111111111"; data20 : std_logic_vector(9 downto 0) := "1111111111"; data21 : std_logic_vector(9 downto 0) := "1111111111"; data22 : std_logic_vector(9 downto 0) := "1111111111"; data23 : std_logic_vector(9 downto 0) := "1111111111"; data24 : std_logic_vector(9 downto 0) := "1111111111"; data25 : std_logic_vector(9 downto 0) := "1111111111"; data26 : std_logic_vector(9 downto 0) := "1111111111"; data27 : std_logic_vector(9 downto 0) := "1111111111"; data28 : std_logic_vector(9 downto 0) := "1111111111"; data29 : std_logic_vector(9 downto 0) := "1111111111"; data30 : std_logic_vector(9 downto 0) := "1111111111"; data31 : std_logic_vector(9 downto 0) := "1111111111"; data32 : std_logic_vector(9 downto 0) := "1111111111"; data33 : std_logic_vector(9 downto 0) := "1111111111"; data34 : std_logic_vector(9 downto 0) := "1111111111"; 
data35 : std_logic_vector(9 downto 0) := "1111111111"; data36 : std_logic_vector(9 downto 0) := "1111111111"; data37 : std_logic_vector(9 downto 0) := "1111111111"; data38 : std_logic_vector(9 downto 0) := "1111111111"; data39 : std_logic_vector(9 downto 0) := "1111111111"; data40 : std_logic_vector(9 downto 0) := "1111111111"; data41 : std_logic_vector(9 downto 0) := "1111111111"; data42 : std_logic_vector(9 downto 0) := "1111111111"; data43 : std_logic_vector(9 downto 0) := "1111111111"; data44 : std_logic_vector(9 downto 0) := "1111111111"; data45 : std_logic_vector(9 downto 0) := "1111111111"; data46 : std_logic_vector(9 downto 0) := "1111111111"; data47 : std_logic_vector(9 downto 0) := "1111111111"; data48 : std_logic_vector(9 downto 0) := "1111111111"; data49 : std_logic_vector(9 downto 0) := "1111111111"; data50 : std_logic_vector(9 downto 0) := "1111111111"; data51 : std_logic_vector(9 downto 0) := "1111111111"; data52 : std_logic_vector(9 downto 0) := "1111111111"; data53 : std_logic_vector(9 downto 0) := "1111111111"; data54 : std_logic_vector(9 downto 0) := "1111111111"; data55 : std_logic_vector(9 downto 0) := "1111111111"; data56 : std_logic_vector(9 downto 0) := "1111111111"; data57 : std_logic_vector(9 downto 0) := "1111111111"; data58 : std_logic_vector(9 downto 0) := "1111111111"; data59 : std_logic_vector(9 downto 0) := "1111111111"; data60 : std_logic_vector(9 downto 0) := "1111111111"; data61 : std_logic_vector(9 downto 0) := "1111111111"; data62 : std_logic_vector(9 downto 0) := "1111111111"; data63 : std_logic_vector(9 downto 0) := "1111111111"; data64 : std_logic_vector(9 downto 0) := "1111111111"; data65 : std_logic_vector(9 downto 0) := "1111111111"; data66 : std_logic_vector(9 downto 0) := "1111111111";
data67 : std_logic_vector(9 downto 0) := "1111111111"; data68 : std_logic_vector(9 downto 0) := "1111111111"; data69 : std_logic_vector(9 downto 0) := "1111111111"; data70 : std_logic_vector(9 downto 0) := "1111111111"; data71 : std_logic_vector(9 downto 0) := "1111111111"; data72 : std_logic_vector(9 downto 0) := "1111111111"; data73 : std_logic_vector(9 downto 0) := "1111111111"; data74 : std_logic_vector(9 downto 0) := "1111111111"; data75 : std_logic_vector(9 downto 0) := "1111111111"; data76 : std_logic_vector(9 downto 0) := "1111111111"; data77 : std_logic_vector(9 downto 0) := "1111111111"; data78 : std_logic_vector(9 downto 0) := "1111111111"; data79 : std_logic_vector(9 downto 0) := "1111111111"; data80 : std_logic_vector(9 downto 0) := "1111111111"; data81 : std_logic_vector(9 downto 0) := "1111111111"; data82 : std_logic_vector(9 downto 0) := "1111111111"; data83 : std_logic_vector(9 downto 0) := "1111111111"; data84 : std_logic_vector(9 downto 0) := "1111111111"; data85 : std_logic_vector(9 downto 0) := "1111111111"; data86 : std_logic_vector(9 downto 0) := "1111111111"; data87 : std_logic_vector(9 downto 0) := "1111111111"; data88 : std_logic_vector(9 downto 0) := "1111111111"; data89 : std_logic_vector(9 downto 0) := "1111111111"; data90 : std_logic_vector(9 downto 0) := "1111111111"; data91 : std_logic_vector(9 downto 0) := "1111111111"; data92 : std_logic_vector(9 downto 0) := "1111111111"; data93 : std_logic_vector(9 downto 0) := "1111111111"; data94 : std_logic_vector(9 downto 0) := "1111111111"; data95 : std_logic_vector(9 downto 0) := "1111111111"; data96 : std_logic_vector(9 downto 0) := "1111111111"; data97 : std_logic_vector(9 downto 0) := "1111111111"; data98 : std_logic_vector(9 downto 0) := "1111111111"; data99 : std_logic_vector(9 downto 0) := "1111111111"; data100 : std_logic_vector(9 downto 0) := "1111111111" ); end Testbench; architecture Testbench_arch of Testbench is signal data_to_IOP : std_logic_vector(999 downto 0); signal CLK, 
load, ack, busy, shift, serial_in : std_logic;
signal clk_25Mhz_count : std_logic_vector(9 downto 0);
signal count_receiver : std_logic_vector(3 downto 0);
signal inputpin : std_logic_vector(63 downto 0);
signal outputpin : std_logic_vector(63 downto 0);
-- Test input vector. Holds up to 100 command and data frames of 10 bits each
signal data_fr_user : std_logic_vector(999 downto 0);
signal clk25Mhz, reset, reset_b, serial_out, input_eq, clk_receiver_int, clk_transmitter : std_logic;
signal B1, B2 : std_logic;

component IOP
  port (
    -- testing_cout_out : out std_logic_vector(7 downto 0);
    reset_b    : in STD_LOGIC;
    serial_out : out std_logic;
    serial_in  : in std_logic;
    clk_25mhz  : in STD_LOGIC;
    inputpin   : buffer std_logic_vector(63 downto 0);
    outputpin  : in std_logic_vector(63 downto 0);
    ack        : out std_logic; -- signal to indicate that CPU execution can continue
    busy       : in std_logic;  -- signal to indicate transfer data
    B1         : in std_logic;
    B2         : in std_logic
  );
end component;

begin
  data_fr_user <= data1 & data2 & data3 & data4 & data5 & data6 & data7 & data8 & data9 & data10 &
                  data11 & data12 & data13 & data14 & data15 & data16 & data17 & data18 & data19 & data20 &
                  data21 & data22 & data23 & data24 & data25 & data26 & data27 & data28 & data29 & data30 &
                  data31 & data32 & data33 & data34 & data35 & data36 & data37 & data38 & data39 & data40 &
                  data41 & data42 & data43 & data44 & data45 & data46 & data47 & data48 & data49 & data50 &
                  data51 & data52 & data53 & data54 & data55 & data56 & data57 & data58 & data59 & data60 &
                  data61 & data62 & data63 & data64 & data65 & data66 & data67 & data68 & data69 & data70 &
                  data71 & data72 & data73 & data74 & data75 & data76 & data77 & data78 & data79 & data80 &
                  data81 & data82 & data83 & data84 & data85 & data86 & data87 & data88 & data89 & data90 &
                  data91 & data92 & data93 & data94 & data95 & data96 & data97 & data98 & data99 & data100;
  reset_b <= NOT reset;

  -- Enter concurrent statements here
  -- to generate the clock of serial in
  receiver_clock: process (clk25Mhz)
  begin
    if clk25Mhz'event and clk25Mhz = '1' then
      if clk_25Mhz_count < 327 then
        clk_25Mhz_count <= clk_25Mhz_count + 1;
      else
        clk_25Mhz_count <= (others => '0');
      end if;
      if clk_25Mhz_count < 163 then
        clk_receiver_int <= '0';
      else
        clk_receiver_int <= '1';
      end if;
    end if;
  end process;

  transmitter_clock: process (clk_receiver_int)
  begin
    if clk_receiver_int'event and clk_receiver_int = '1' then
      if count_receiver < 7 then
        count_receiver <= count_receiver + 1;
      else
        count_receiver <= (others => '0');
      end if;
      if count_receiver < 3 then
        clk_transmitter <= '0';
      elsif count_receiver = 7 then
        clk_transmitter <= '0';
      else
        clk_transmitter <= '1';
      end if;
    end if;
  end process;

  CLK <= clk_transmitter;

  -- Input data shifter: sends the frames to the IOP one bit per CLK, leftmost bit first
  process (reset, CLK)
  begin
    if reset = '1' then
      data_to_IOP <= (others => '1');
      serial_in <= '1';
    elsif (CLK'event and CLK = '1') then
      if load = '1' then
        data_to_IOP <= data_fr_user;
      elsif shift = '1' then
        serial_in <= data_to_IOP(999);
        for i in 1 to 999 loop
          data_to_IOP(i) <= data_to_IOP(i-1);
        end loop;
        data_to_IOP(0) <= '1';
      else
        data_to_IOP <= data_to_IOP;
        serial_in <= '1';
      end if;
    end if;
  end process;

  IOP_inst : IOP
    port map (reset_b, serial_out, serial_in, clk25Mhz, inputpin, outputpin, ack, busy, B1, B2);
end Testbench_arch;

A2: Block diagram and Verilog HDL code for IOP module
`resetall
`timescale 1ns/1ns
module IOP (reset_b, serial_out, serial_in, clk_25mhz, inputpin, outputpin, ack, busy, B1, B2);
  parameter n = 64;
  parameter m = 64;
  input reset_b, serial_in, clk_25mhz, busy, B1, B2;
  input [n-1:0] outputpin;
  output serial_out, ack;
  output [m-1:0] inputpin;
  reg SerialIn, serial_out;
  reg [m-1:0] inputpin;
  wire clk_1mhz_int, sig_idle, ready_in, t_out_eq, eq_zero, input_eq, reset;
  wire get_data, fsm_data_load, t_input_load, t_output_load, i_load, i_inc;
  wire i_clr, o_load, o_inc, o_clr, transfer, data_selected_load, trans, recv, load_command;
  wire [7:0] data_selected, data_selected_reg, command;
  wire [1:0] end_data;
  wire [8:0] data_out;
  wire [16:0] control;
  wire [4:0] trans_bit_count;   // declared explicitly; width taken from the bist module port
  wire [m-1:0] funcinputpin, BilboSignature;
  reg [m-1:0] DUTOutput;
  parameter Bilbo_in = 8'h5A;   // seed value for the LFSR

  assign reset = ~reset_b;

  // Control word from the IOP CU, split into individual control signals
  assign load_command = control[16];
  assign get_data = control[15];
  assign fsm_data_load = control[14];
  assign data_selected_load = control[13];
  assign t_output_load = control[12];
  assign t_input_load = control[11];
  assign i_load = control[10];
  assign i_inc = control[9];
  assign i_clr = control[8];
  assign recv = input_eq;
  assign trans = control[6];
  assign o_load = control[5];
  assign o_inc = control[4];
  assign o_clr = control[3];
  assign end_data[1] = control[2];
  assign end_data[0] = control[1];
  assign transfer = control[0];

  // Mode multiplexer: BIST mode when B1 = 0 and B2 = 1, functional mode otherwise
  always @(posedge clk_trans)
  begin
    case ((B1 == 1'b0) & (B2 == 1'b1))
      1'b1 : begin               // BIST mode: loop inputpin data back and take serial input from BIST
        DUTOutput <= funcinputpin;
        inputpin <= BilboSignature;
        SerialIn <= BilboSerialOut;
        serial_out <= 1'b1;
      end
      1'b0 : begin               // functional mode
        DUTOutput <= outputpin;
        inputpin <= funcinputpin;
        SerialIn <= serial_in;
        serial_out <= func_serial_out;
      end
      default : begin
        DUTOutput <= outputpin;
        inputpin <= funcinputpin;
        SerialIn <= serial_in;
        serial_out <= func_serial_out;
      end
    endcase
  end

  clk_1mhz U_clk_1mhz (reset_b, clk_25mhz, clk_1mhz_int);

  IOP_CU U_IOP_CU (reset_b, clk_1mhz_int, sig_idle, ready_in, t_out_eq, eq_zero,
                   data_selected, data_fsm, control, command, ack, busy);

  IOP_DPU U_IOP_DPU (transfer, get_data, reset_b, clk_25mhz, clk_1mhz_int, func_serial_out,
                     sig_idle, SerialIn, ready_in, i_load, i_inc, i_clr, input_eq, o_load,
                     end_data[1:0], o_inc, o_clr, t_out_eq, eq_zero, data_selected, data_fsm,
                     t_input_load, t_output_load, fsm_data_load, data_selected_load, trans, recv,
                     funcinputpin, DUTOutput, load_command, command, clk_trans, trans_bit_count);

  bist U_BIST (clk_trans, reset, serial_in, B1, B2, Bilbo_in, BilboSerialOut, eq_zero,
               trans_bit_count, func_serial_out, BilboSignature);
endmodule

A3: Block diagram and Verilog HDL code for clk_1mhz module
`resetall
`timescale 1ns/1ns
module clk_1mhz (reset, clk_25mhz, clk_1mhz);
  input reset, clk_25mhz;
  output clk_1mhz;
  reg clk_1mhz;
  reg [4:0] clk_25mhz_count;

  // Divide the 25 MHz clock by 25 (count 0 to 24); reset is active low
  always @(negedge reset or posedge clk_25mhz)
  begin
    if (reset == 1'b0) begin
      clk_25mhz_count = 5'b00000;
      clk_1mhz = 1'b0;
    end
    else begin
      if (clk_25mhz_count < 24) begin
        clk_25mhz_count = clk_25mhz_count + 1;
      end
      else begin
        clk_25mhz_count = 5'b00000;
      end
      // The output is derived from the updated count
      if (clk_25mhz_count < 12) begin
        clk_1mhz = 1'b0;
      end
      else begin
        clk_1mhz = 1'b1;
      end
    end
  end
endmodule

A4: Block diagram and Verilog HDL code for bist module

`resetall
`timescale 1ns/1ns
module bist (CLK, reset, SI, B1, B2, d, BilboSerialOut, TransmitDone, trans_bit_count,
             func_serial_out, FinalMISRSignature);
  input CLK, reset, SI, B1, B2, TransmitDone, func_serial_out;
  input [7:0] d;
  input [4:0] trans_bit_count;
  output BilboSerialOut;
  output [63:0] FinalMISRSignature;
  wire [8:0] ctrWord;
  wire NextCmd, BistDone;
  wire loadSignature = ctrWord[8];
  wire load = ctrWord[6];
  wire incr = ctrWord[7];
  wire BufferFull, CmdFound;
  wire [63:0] MISRSignature;
  reg [63:0] FinalMISRSignature;
  wire clr = 1'b0;
  reg b1, b2;

  // Register the mode pins while load is asserted
  always @(posedge CLK)
  begin
    if (load == 1'b1) begin
      b1 <= B1;
      b2 <= B2;
    end
    else begin
      b1 <= b1;
      b2 <= b2;
    end
  end

  // Hold the final MISR signature once loadSignature is asserted
  always @(loadSignature or MISRSignature or reset)
  begin
    if (reset == 1'b1) begin
      FinalMISRSignature = 64'h00000000;
    end
    else if (loadSignature == 1'b1) begin
      FinalMISRSignature = MISRSignature;
    end
    else begin
      FinalMISRSignature = FinalMISRSignature;
    end
  end

  bist_cu U_Bist_CU (B1, B2, CLK, reset, NextCmd, ctrWord, TransmitDone, BufferFull,
                     BistDone, CmdFound);

  bist_dpu U_Bist_DPU (CLK, reset, d, load, ctrWord[5], ctrWord[4], ctrWord[3:0], NextCmd,
                       BilboSerialOut, BufferFull, BistDone, CmdFound, trans_bit_count,
                       func_serial_out, b1, b2, SI, MISRSignature, incr, clr);
endmodule

A5: Block diagram and Verilog HDL code for bist_cu module
`resetall
`timescale 1ns/1ns
module bist_cu (B1, B2, CLK, reset, NextCmd, ctrWord, TransmitDone, BufferFull, BistDone, CmdFound);
  input B1, B2, CLK, reset, NextCmd, TransmitDone, BufferFull, BistDone, CmdFound;
  output [8:0] ctrWord;
  wire [1:0] bist_mode = {B1, B2};
  reg [4:0] ps, ns;
  reg [8:0] ctrWord;

  // State encoding
  parameter s0  = 5'd0,  s1  = 5'd1,  s2  = 5'd2,  s3  = 5'd3,
            s4  = 5'd4,  s5  = 5'd5,  s6  = 5'd6,  s7  = 5'd7,
            s8  = 5'd8,  s9  = 5'd9,  s10 = 5'd10, s11 = 5'd11,
            s12 = 5'd12, s13 = 5'd13, s14 = 5'd14, s15 = 5'd15,
            s16 = 5'd16, s17 = 5'd17, s18 = 5'd18, s19 = 5'd19,
            s20 = 5'd20, s21 = 5'd21, s22 = 5'd22, s23 = 5'd23,
            s24 = 5'd24, s25 = 5'd25, s26 = 5'd26, s27 = 5'd27,
            s28 = 5'd28, s29 = 5'd29, s30 = 5'd30, s31 = 5'd31;

  // State register
  always @(posedge CLK or posedge reset)
  begin : state_memory
    if (reset == 1'b1) begin
      ps <= s0;
    end
    else begin
      ps <= ns;
    end
  end

  // Next-state and output logic
  //always @(negedge CLK)
  always @(ps or bist_mode or BistDone or BufferFull or CmdFound or NextCmd or TransmitDone)
  begin
    ctrWord = 9'b000000000;
    case (ps)
      s0: begin
        if (bist_mode == 2'b01) begin
          ns = s1;
          ctrWord = 9'b000101001;
        end else begin
          ns = s0;
          ctrWord = 9'b001000000;
        end
      end
      s1: begin
        if (bist_mode == 2'b01) begin
          ns = s2;
          ctrWord = 9'b000010000;
        end else begin
          ns = s0;
        end
      end
      s2: begin
        if (bist_mode == 2'b01) begin
          if (NextCmd == 1'b1) begin
            ctrWord = 9'b000100001;
            ns = s3;
          end else begin
            ctrWord = 9'b000010000;
            ns = s2;
          end
        end else begin
          ns = s0;
        end
      end
      s3: begin
        if (bist_mode == 2'b01) begin
          ns = s4;
          ctrWord = 9'b000010000;
        end else begin
          ns = s0;
        end
      end
      s4: begin
        if (bist_mode == 2'b01) begin
          if (NextCmd == 1'b1) begin
            ctrWord = 9'b000100101;
            ns = s5;
          end else begin
            ctrWord = 9'b000010000;
            ns = s4;
          end
        end else begin
          ns = s0;
        end
      end
134
s5: begin if (bist_mode == 2'b01) begin ns = s6; ctrWord = 9'b000001000; end else begin ns = s0; end end s6: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin ctrWord = 9'b000100110; ns = s7; end else begin ctrWord = 9'b000010000; ns = s6; end end else begin ns = s0; end end s7: begin if (bist_mode == 2'b01) begin ns = s8; ctrWord = 9'b000010000; end else begin ns = s0; end end s8: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin ctrWord = 9'b000100100; ns = s9; end else begin ctrWord = 9'b000010000; ns = s8; end end else begin ns = s0; end end s9: begin if (bist_mode == 2'b01) begin ns = s10; ctrWord = 9'b000010000; end else begin ns = s0; end end
s10:
begin
if (bist_mode == 2'b01)
begin
if (NextCmd == 1'b1)
begin ctrWord = 9'b000100010;
ns = s11;
end
else
begin
ctrWord = 9'b000010000;
ns = s10;
end
end
else
begin ns = s0;
end
end
s11:
begin
if (bist_mode == 2'b01) begin
ns = s12;
ctrWord = 9'b000010000;
end
else
begin
ns = s0;
end
end
s12:
begin
if (bist_mode == 2'b01)
begin
if (NextCmd == 1'b1)
begin ctrWord = 9'b011000000;
ns = s13;
end
else
begin
ctrWord = 9'b000010000;
ns = s12;
end
end else
begin
ns = s0;
end
end
s13: begin
if (bist_mode == 2'b01)
begin
if (CmdFound == 1'b1)
begin
ctrWord = 9'b000101000; ns = s14;
end
else
begin
ctrWord = 9'b000100111;
ns = s15;
end
end
else
begin ns = s0;
end
end
s14: begin if (bist_mode == 2'b01) begin ns = s16; ctrWord = 9'b000010000; end else begin ns = s0; end end s15: begin if (bist_mode == 2'b01) begin ns = s17; ctrWord = 9'b000010000; end else begin ns = s0; end end s16: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin ns = s15; ctrWord = 9'b000100111; end else begin ctrWord = 9'b000010000; ns = s16; end end else begin ns = s0; end end s17: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin if (BufferFull == 1'b1) begin ns = s18; ctrWord = 9'b000100100; end else begin ns = s13; ctrWord = 9'b011000000; end end else begin ctrWord = 9'b000010000; ns = s17; end end else begin ns = s0; end end
s18: begin if (bist_mode == 2'b01) begin ns = s19; ctrWord = 9'b000010000; end else begin ns = s0; end end s19: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin ctrWord = 9'b000100011; ns = s20; end else begin ctrWord = 9'b000010000; ns = s19; end end else begin ns = s0; end end s20: begin if (bist_mode == 2'b01) begin ns = s21; ctrWord = 9'b000010000; end else begin ns = s0; end end s21: begin if (bist_mode == 2'b01) begin if (NextCmd == 1'b1) begin if (BistDone == 1'b1) begin ns = s23; end else begin ns = s22; end end else begin ctrWord = 9'b000010000; ns = s21; end end else begin ns = s0; end end
s22: begin if (bist_mode == 2'b01) begin if (TransmitDone == 1'b1) begin ns = s10; end else begin ns = s22; end end else begin ns = s0; end end s23: begin if (bist_mode == 2'b01) begin ctrWord = 9'b000010000; ns = s24; end else begin ns = s20; end end s24: begin if (bist_mode == 2'b01) begin if (TransmitDone == 1'b1) begin ctrWord = 9'b100010000; ns = s25; end else begin ns = s24; end end else begin ns = s0; end end s25: begin if (bist_mode == 2'b01) begin ctrWord = 9'b000010000; ns = s25; end else begin ns = s0; end end endcase end endmodule
A6: Block diagram and Verilog HDL code for bist_dpu module
`resetall `timescale 1ns/1ns module bist_dpu (CLK, reset, LFSRIn, LFSRLoad, RegLoad, shift, MuxSel, NextCmd, BilboSerialOut, BufferFull, BistDone, CmdFound, trans_bit_count, func_serial_out, b1, b2, SI, MISRSignature, incr, clr); input CLK, reset, RegLoad, shift, LFSRLoad, func_serial_out, b1, b2, SI, incr, clr; input [7:0] LFSRIn; input [3:0] MuxSel; input [4:0] trans_bit_count; output BilboSerialOut, NextCmd, BufferFull, BistDone, CmdFound; output [63:0] MISRSignature; reg BilboSerialOut, NextCmd; reg [9:0] DInput, Dout; reg [4:0] BitCount; reg BufferFull, BistDone, CmdFound; reg MISRLoad; reg [7:0] count; reg [63:0] MISRSignature; wire [7:0] MISRDin; wire LFSRCLK = ~CLK; wire [7:0] MISRDout, LFSR; wire MISRb1; wire [7:0] stx, dle, init, trans, poll, etx, InputNum, OutputNum; assign stx = 8'b00000010; assign dle = 8'b00010000; assign etx = 8'b00000011; assign init = 8'b01101001; assign trans = 8'b01110100; assign poll = 8'b01110010; assign InputNum = 8'b11000000; assign OutputNum = 8'b01000000; assign MISRb1 = b1 ^ b2; always @(LFSR) begin : DLEData case(LFSR) 8'b00000010 : begin CmdFound = 1'b1; end 8'b00010000 : begin CmdFound = 1'b1; end 8'b00000011 : begin CmdFound = 1'b1; end 8'b01101001 : begin CmdFound = 1'b1; end 8'b01111000 : begin CmdFound = 1'b1; end 8'b01110100 : begin CmdFound = 1'b1; end 8'b01110000: begin CmdFound = 1'b1; end
        8'b01110010 : begin CmdFound = 1'b1; end
        default : begin CmdFound = 1'b0; end
    endcase
end

always @ (MuxSel or init or trans or poll or etx or InputNum or OutputNum or LFSR or dle or stx)
begin : CmdSelect
    case(MuxSel)
        4'b0000 : begin DInput[9:0] = {10'b1111111111}; end
        4'b0001 : begin DInput[9:0] = {1'b0, init, 1'b1}; end
        4'b0010 : begin DInput[9:0] = {1'b0, trans, 1'b1}; end
        4'b0011 : begin DInput[9:0] = {1'b0, poll, 1'b1}; end
        4'b0100 : begin DInput[9:0] = {1'b0, etx, 1'b1}; end
        4'b0101 : begin DInput[9:0] = {1'b0, InputNum, 1'b1}; end
        4'b0110 : begin DInput[9:0] = {1'b0, OutputNum, 1'b1}; end
        4'b0111 : begin DInput[9:0] = {1'b0, LFSR, 1'b1}; end
        4'b1000 : begin DInput[9:0] = {1'b0, dle, 1'b1}; end
        4'b1001 : begin DInput[9:0] = {1'b0, stx, 1'b1}; end
        default : begin DInput[9:0] = {10'b1111111111}; end
    endcase
end

always @(posedge reset or posedge CLK)
begin
    if (reset == 1'b1) begin
        Dout = 10'b1111111111;
        BilboSerialOut = 1'b1;
        BitCount = 4'b0000;
    end
else begin if (RegLoad == 1'b1) begin Dout = DInput; BitCount = 4'b0000; end else if (shift == 1'b1) begin BilboSerialOut = Dout[9]; Dout = {Dout[8:0], 1'b1}; BitCount = BitCount + 1; end else begin Dout = Dout; BilboSerialOut = 1'b1; BitCount = 4'b0000; end end end always @ (BitCount) begin : NextCmdSel case(BitCount) 4'b1010 : begin NextCmd = 1'b1; end default : begin NextCmd = 1'b0; end endcase end always @(count or MISRDout or MISRSignature) begin : Buffer case (count) 8'd0 : begin BufferFull = 1'b1; BistDone = 1'b1; MISRLoad = 1'b1; MISRSignature[7:0] = MISRDout; end 8'd8 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[15:8] = MISRDout; end 8'd16 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[23:16] = MISRDout; end 8'd24 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[31:24] = MISRDout; end 8'd32 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[39:32] = MISRDout;
end 8'd40 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[47:40] = MISRDout; end 8'd48 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[55:48] = MISRDout; end 8'd56 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[63:56] = MISRDout; end 8'd64 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[7:0] = MISRDout; end 8'd72 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[15:8] = MISRDout; end 8'd80 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[23:16] = MISRDout; end 8'd88 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[31:24] = MISRDout; end 8'd96 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[39:32] = MISRDout; end 8'd104 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[47:40] = MISRDout; end 8'd112 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[55:48] = MISRDout; end
8'd120 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[63:56] = MISRDout; end 8'd128 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[7:0] = MISRDout; end 8'd136 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[15:8] = MISRDout; end 8'd144 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[23:16] = MISRDout; end 8'd152 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[31:24] = MISRDout; end 8'd160 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[39:32] = MISRDout; end 8'd168 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[47:40] = MISRDout; end 8'd176 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[55:48] = MISRDout; end 8'd184 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[63:56] = MISRDout; end 8'd192 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[7:0] = MISRDout; end
8'd200 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[15:8] = MISRDout; end 8'd208 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[23:16] = MISRDout; end 8'd216 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[31:24] = MISRDout; end 8'd224 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[39:32] = MISRDout; end 8'd232 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[47:40] = MISRDout; end 8'd240 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[55:48] = MISRDout; end 8'd248 : begin BufferFull = 1'b1; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature[63:56] = MISRDout; end default : begin BufferFull = 1'b0; BistDone = 1'b0; MISRLoad = 1'b1; MISRSignature = MISRSignature; end endcase end assign MISRDin[0] = func_serial_out; assign MISRDin[1] = ~func_serial_out; assign MISRDin[2] = func_serial_out; assign MISRDin[3] = ~func_serial_out; assign MISRDin[4] = func_serial_out; assign MISRDin[5] = ~func_serial_out; assign MISRDin[6] = func_serial_out; assign MISRDin[7] = ~func_serial_out;
always @(posedge reset or posedge CLK)
begin
    if (reset == 1'b1) begin
        count = 8'b00000000;
    end
    else begin
        if (incr == 1'b1) begin
            count = count + 1;
        end
        else if (clr == 1'b1) begin
            count = 8'b00000000;
        end
        else begin
            count = count;
        end
    end
end

bilbo U_LFSR (LFSRCLK, reset, SI, b1, b2, LFSRLoad, LFSRIn, LFSR);
bilbo U_MISR (~CLK, reset, SI, MISRb1, b2, MISRLoad, MISRDin, MISRDout);
endmodule
A7: Block diagram and Verilog HDL code for bilbo module
`resetall `timescale 1ns/1ns module bilbo(CLK, reset, SI, b1, b2, load, d, q); input CLK, reset, SI, b1, b2, load; input [7:0] d; output [7:0] q; wire [7:0] Din, Dout; wire notb1orb2; reg mux_out; assign notb1orb2 = ~(b1) | b2; always @ ( SI or Dout or b2) begin : mux case (b2) 1'b1 : begin mux_out = (Dout[1] ^ Dout[2] ^ Dout[3] ^ Dout[7]); end 1'b0 : begin mux_out = SI; end default : begin mux_out = SI; end endcase end assign q = Dout; assign Din[0] = ~(mux_out & notb1orb2) ^ ~(d[0] & b1); assign Din[1] = ~(Dout[0] & notb1orb2) ^ ~(d[1] & b1); assign Din[2] = ~(Dout[1] & notb1orb2) ^ ~(d[2] & b1); assign Din[3] = ~(Dout[2] & notb1orb2) ^ ~(d[3] & b1); assign Din[4] = ~(Dout[3] & notb1orb2) ^ ~(d[4] & b1); assign Din[5] = ~(Dout[4] & notb1orb2) ^ ~(d[5] & b1); assign Din[6] = ~(Dout[5] & notb1orb2) ^ ~(d[6] & b1); assign Din[7] = ~(Dout[6] & notb1orb2) ^ ~(d[7] & b1); DFFlop biblo_reg_0 (CLK, reset, load, Din[0], Dout[0]); DFFlop biblo_reg_1 (CLK, reset, load, Din[1], Dout[1]); DFFlop biblo_reg_2 (CLK, reset, load, Din[2], Dout[2]); DFFlop biblo_reg_3 (CLK, reset, load, Din[3], Dout[3]); DFFlop biblo_reg_4 (CLK, reset, load, Din[4], Dout[4]); DFFlop biblo_reg_5 (CLK, reset, load, Din[5], Dout[5]); DFFlop biblo_reg_6 (CLK, reset, load, Din[6], Dout[6]); DFFlop biblo_reg_7 (CLK, reset, load, Din[7], Dout[7]); endmodule
A8: Block diagram and Verilog HDL code for DFFlop module
`resetall
`timescale 1ns/1ns
module DFFlop (CLK, reset, load, d, q);
input CLK, d, reset, load;
output q;
reg q;
always @(posedge reset, posedge CLK)
begin
    if (reset == 1'b1) begin
        q = 1'b0;
    end
    else begin
        if (load == 1'b1) begin
            q = d;
        end
        else begin
            q = q;
        end
    end
end
endmodule
A9: Block diagram and Verilog HDL code for IOP_CU module
`resetall `timescale 1ns/1ns module IOP_CU (reset, clk_1mhz, sig_idle, ready_in, t_out_eq, eq_zero, data_selected, data_fsm, control, command, ack, busy); input reset, clk_1mhz, sig_idle, ready_in, t_out_eq, eq_zero, busy; input [7:0] data_selected, data_fsm, command; output ack; output[16:0] control; reg[16:0] control; parameter s0 = 6'b000000; parameter s1 = 6'b000001; parameter s2 = 6'b000010; parameter s3 = 6'b000011; parameter s4 = 6'b000100; parameter s5 = 6'b000101; parameter s6 = 6'b000110; parameter s7 = 6'b000111; parameter s8 = 6'b001000; parameter s9 = 6'b001001; parameter s10 = 6'b001010; parameter s11 = 6'b001011; parameter s12 = 6'b001100; parameter s13 = 6'b001101; parameter s14 = 6'b001110; parameter s15 = 6'b001111; parameter s16 = 6'b010000; parameter s17 = 6'b010001; parameter s18 = 6'b010010; parameter s19 = 6'b010011; parameter s20 = 6'b010100; parameter s21 = 6'b010101; parameter s22 = 6'b010110; parameter s23 = 6'b010111; parameter s24 = 6'b011000; parameter s25 = 6'b011001; parameter s26 = 6'b011010; parameter s27 = 6'b011011; parameter s28 = 6'b011100; parameter s29 = 6'b011101; parameter s30 = 6'b011110; parameter s31 = 6'b011111; parameter s32 = 6'b100000; parameter s33 = 6'b100001; parameter s34 = 6'b100010; parameter s35 = 6'b100011; parameter s36 = 6'b100100; parameter s37 = 6'b100101; parameter s38 = 6'b100110; parameter s39 = 6'b100111; parameter stx = 8'b00000010; parameter dle = 8'b00010000; parameter etx = 8'b00000011; parameter init = 8'b01101001; parameter terminate = 8'b01111000; parameter trans = 8'b01110100; parameter irr_poll = 8'b01110000; parameter poll = 8'b01110010; reg [5:0] ps, ns; reg dle_true, c_true, ack; wire [7:0] data;
assign data[7] = data_fsm[7]; assign data[6] = data_fsm[6]; assign data[5] = data_fsm[5]; assign data[4] = data_fsm[4]; assign data[3] = data_fsm[3]; assign data[2] = data_fsm[2]; assign data[1] = data_fsm[1]; assign data[0] = data_fsm[0]; always @(data) begin if (data == dle) begin dle_true = 1'b1; end else begin dle_true = 1'b0; end end always @(data_selected) begin if ( (data_selected == dle) | (data_selected == stx) | (data_selected == etx)) begin c_true =1'b1; end else begin c_true =1'b0; end end always @(negedge reset or negedge clk_1mhz) begin : state_trans if (reset == 1'b0) begin ps = s0; end else begin ps = ns; end end always @(ps or c_true or sig_idle or ready_in or t_out_eq or eq_zero or data_selected or data or busy or dle_true) begin : comb_logic ack = 1'b0; case (ps) s0: begin if (ready_in == 1'b0) begin ns = s0; control = 17'b01000000100001000; end else begin ns = s1; control = 17'b01000000100001000; end end s1: begin control = 17'b00100000000000000; ns = s2; end
s2: begin control = 17'b00000000100001000; if (data == stx) begin ns = s3; end else begin ns = s0; end end s3: begin if (ready_in == 1'b0) begin ns = s3; control = 17'b01000000000000000; end else begin ns = s4; control = 17'b01000000000000000; end end s4: begin control = 17'b00100000000000000; ns = s5; end s5: begin control = 17'b00000000000000000; if (data == init) begin ns = s6; end else begin ns = s12; end end s6: begin if (ready_in == 1'b0) begin ns = s6; control = 17'b01000000000000000; end else begin ns = s7; control = 17'b01000000000000000; end end s7: begin control = 17'b00100000000000000; ns = s8; end s8: begin if (dle_true == 1'b1) begin ns = s10; control = 17'b01000000000000000; end
else begin if (data == etx) begin ns = s17; control = 17'b00000000000000000; end else begin ns = s9; control = 17'b00000000000000000; end end end s9: begin if (data[7] == 1'b1) begin control = 17'b00001000000000000; end else begin control = 17'b00000100000000000; end ns = s6; end s10: begin if (ready_in == 1'b0) begin ns = s10; control = 17'b01000000000000000; end else begin ns = s11; control = 17'b01000000000000000; end end s11: begin ns = s9; control = 17'b00100000000000000; end s12: begin if (ready_in == 1'b0) begin ns = s12; control = 17'b01000000000000000; end else begin ns = s13; control = 17'b01000000000000000; end end s13: begin ns = s14; control = 17'b00100000000000000; end
s14: begin
    if (dle_true == 1'b1) begin
        ns = s15; control = 17'b01000000000000000;
    end
    else begin
        if (data == etx) begin
            ns = s0; control = 17'b01000000000000000;
        end
        else begin
            ns = s12; control = 17'b01000000000000000;
        end
    end
end
s15: begin
    if (ready_in == 1'b0) begin
        ns = s15; control = 17'b01000000000000000;
    end
    else begin
        ns = s16; control = 17'b01000000000000000;
    end
end
s16: begin
    control = 17'b00000000000000000; ns = s12;
end
s17: begin
    if (ready_in == 1'b0) begin
        ns = s17; control = 17'b01000000100001000;
    end
    else begin
        ns = s18; control = 17'b01000000100001000;
    end
end
s18: begin
    ns = s19; control = 17'b00100000000000000;
end
s19: begin if (data == stx) begin ns = s17; control = 17'b00000000000000000; end else begin case (data) init : begin ns = s6; control = 17'b10000000000000000; end terminate: begin ns = s12; control = 17'b10000000000000000; end trans: begin ns = s20; control = 17'b10000000000000000; end poll: begin ns = s29; control = 17'b10000000000000000; end irr_poll: begin if (busy == 1'b0) begin ns = s38; end else begin ns = ps; end end default: begin ns = s0; control = 17'b00000000100001000; end endcase end end s20: begin if (ready_in == 1'b0) begin ns = s20; control = 17'b01000000000000000; end else begin ns = s21; control = 17'b01000000000000000; end end s21: begin ns = s22; control = 17'b00100000000000000; end
s22: begin if (dle_true == 1'b1) begin ns = s24; control = 17'b00000000000000000; end else begin if (data == etx) begin ns = s34; control = 17'b00000000000000000; end else begin ns = s23; control = 17'b00000000000000000; end end end s23: begin ns = s20; control = 17'b00000011000000000; end s24: begin if (ready_in == 1'b0) begin ns = s24; control = 17'b01000000000000000; end else begin ns = s25; control = 17'b01000000000000000; end end s25: begin ns = s23; control = 17'b00100000000000000; end s26: begin if (eq_zero == 1'b1) begin ns = s36; control = 17'b00000000000100000; end else begin if (t_out_eq == 1'b1) begin ns = s33; control = 17'b00000000000100100; end else begin if (c_true == 1'b1) begin ns = s28; control = 17'b00000000000100010; end else begin ns = s30; control = 17'b00000000000110000; end end end end
s27: begin if (sig_idle == 1'b0) begin ns = s27; control = 17'b00000000000000101; end else begin ns = s17; control = 17'b00000000000000000; end end s28: begin if (sig_idle == 1'b0) begin ns = s28; control = 17'b00000000000000011; end else begin ns = s29; control = 17'b00000000000000000; end end s29: begin ns = s30; control = 17'b00000000000110000; end s30: begin if (sig_idle == 1'b0) begin ns = s30; control = 17'b00010000001000001; end else begin ns = s26; control = 17'b00000000000000000; end end s31: begin if (ready_in == 1'b0) begin ns = s31; control = 17'b01010000000000000; end else begin ns = s32; control = 17'b01010000000000000; end end s32: begin ns = s35; control = 17'b00000000000000000; end s33: begin ns = s27; control = 17'b00000000000100100; end
s34: begin ns = s17; control = 17'b00000000010000000; end s35: begin ns = s26; control = 17'b00000000001000000; ack = 1'b1; end s36: begin if (sig_idle == 1'b0) begin ns = s36; control = 17'b00010000000000001; end else begin ns = s37; control = 17'b00000000000100110; end end s37: begin ns = s30; control = 17'b00000000000110110; end s38: begin ns = s39; control = 17'b00000000000110000; end s39: begin if (sig_idle == 1'b0) begin ns = s39; control = 17'b00010000001000001; end else begin ns = s26; control = 17'b00000000000000000; ack = 1'b1; end end default: begin ns = s0; control = 17'b00000000000000000; end endcase end endmodule
A10: Block diagram and Verilog HDL code for IOP_DPU module
`resetall `timescale 1ns/1ns module IOP_DPU (transfer, get_data, reset_b, clk25mhz, clk_1mhz, serial_out, sig_idle, serial_in, ready_in, i_load, i_inc, i_clr, input_eq, o_load, end_data, o_inc, o_clr, t_out_eq, eq_zero, data_selected_reg, data_fsm, t_input_load, t_output_load, fsm_data_load, data_selected_load, trans, recv, inputpin, outputpin, load_command, command, clk_trans, trans_bit_count); parameter n = 64; parameter m = 64; input transfer, get_data, reset_b, clk25mhz, clk_1mhz, serial_in, i_load, i_inc, o_load, i_clr; input o_inc, o_clr, t_input_load, t_output_load, fsm_data_load, data_selected_load, trans, recv, load_command; input [1:0] end_data; input [n-1:0] outputpin; output [m-1:0] inputpin; output serial_out, ready_in, input_eq, t_out_eq, eq_zero, sig_idle, clk_trans; output [7:0] data_selected_reg, data_fsm, command; output [4:0] trans_bit_count; wire[0:7] data; wire [7:0] data_transfer; IO_Interface U_IO_Interface (reset_b, clk_1mhz, inputpin, outputpin, data, t_out_eq, eq_zero, data_selected_reg, data_transfer, i_load, i_inc, i_clr, input_eq, o_load, end_data, o_inc, o_clr,t_input_load, t_output_load, fsm_data_load, data_selected_load, trans, recv, data_fsm, load_command, command); UART U_UART (transfer, get_data, reset_b, clk_1mhz, clk25mhz,data_transfer,serial_out, sig_idle, serial_in, data, ready_in, clk_trans, trans_bit_count); endmodule
[Block diagram of IOP_DPU: I/O Interface and UART blocks]
A11: Block diagram and Verilog HDL code for IO_Interface module
`resetall `timescale 1ns/1ns module IO_Interface (reset_b, clk_1mhz, inputpin, outputpin, data_input, t_out_eq, eq_zero, data_selected_reg, data_transfer, i_load, i_inc, i_clr, input_eq, o_load, end_data, o_inc, o_clr, t_input_load, t_output_load, fsm_data_load, data_selected_load, trans, recv, data_fsm, load_command, command); parameter n = 64; parameter m = 64; input reset_b, clk_1mhz, i_load, i_inc, i_clr, o_load, o_inc, o_clr, t_input_load; input t_output_load, fsm_data_load, data_selected_load, trans, recv, load_command; input [n-1:0] outputpin; input [1:0] end_data; input [0:7] data_input; output [m-1:0] inputpin; output [7:0] data_selected_reg, data_transfer, data_fsm, command; output t_out_eq, eq_zero, input_eq; wire [7:0] inreg0, inreg1, inreg2, inreg3, inreg4, inreg5, inreg6, inreg7; wire [7:0] outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7; wire [7:0] t_input, t_output, data_selected; wire [0:7] data; wire reset; assign reset= ~reset_b; Buffers U_Buffers (reset, clk_1mhz, trans, outputpin, outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7, recv, inputpin, inreg0,inreg1,inreg2,inreg3,inreg4,inreg5,inreg6,inreg7); Interface Interface (inreg0, inreg1, inreg2, inreg3, inreg4, inreg5, inreg6, inreg7, outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7, reset, clk_1mhz, data_input, t_out_eq, eq_zero, data_selected_reg, data_transfer, i_load, i_inc, i_clr, input_eq, o_load, end_data, o_inc, o_clr, t_input_load, t_output_load, fsm_data_load, data_selected_load, data_fsm, load_command, command); endmodule
[Block diagram of IO_Interface: Buffers and Interface blocks]
A12: Verilog HDL code for Interface module
`resetall `timescale 1ns/1ns module Interface (inreg0, inreg1, inreg2, inreg3, inreg4, inreg5, inreg6, inreg7, outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7, reset, clk_1mhz, data_input, t_out_eq, eq_zero, data_selected_reg, data_transfer, i_load, i_inc, i_clr, input_eq, o_load, end_data, o_inc, o_clr, t_input_load, t_output_load, fsm_data_load, data_selected_load, data_fsm, load_command, command); input [0:7] data_input; input reset, clk_1mhz,i_load, i_inc, i_clr, o_load, o_inc, o_clr; input t_input_load, t_output_load, fsm_data_load, data_selected_load, load_command; input[1:0] end_data; input[7:0] outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7; output[7:0] inreg0, inreg1, inreg2, inreg3, inreg4, inreg5, inreg6, inreg7; output[7:0] data_selected_reg, data_transfer, data_fsm, command; output t_out_eq, eq_zero, input_eq; reg[7:0] data_selected_reg; wire[7:0] t_input, t_output, data_selected, data; assign data_fsm = data; always @(posedge clk_1mhz or posedge reset) begin : data_selected_proc if (reset == 1'b1) begin data_selected_reg = 8'h00; end else begin if (data_selected_load == 1'b1) begin data_selected_reg = data_selected; end else begin data_selected_reg = data_selected_reg; end end end Input_card U_Input_card (reset, clk_1mhz, i_load, i_inc, i_clr, t_input, data, input_eq, inreg0, inreg1, inreg2, inreg3, inreg4, inreg5, inreg6, inreg7); Output_card U_Output_card (reset, clk_1mhz, t_output, o_load, end_data, o_inc, o_clr, outreg0, outreg1, outreg2, outreg3, outreg4, outreg5, outreg6, outreg7, t_out_eq, eq_zero, data_selected, data_transfer); Register U_T_input_reg (clk_1mhz, reset, t_input_load, data, t_input); T_output_reg U_T_output_reg (clk_1mhz, reset, t_output_load, data, t_output); Register U_Fsm_datareg (clk_1mhz, reset, fsm_data_load, data_input, data); Register U_Command_reg (clk_1mhz, reset, load_command, data, command); endmodule
A13: Block diagram and Verilog HDL code for Buffers module
`resetall `timescale 1ns/1ns module Buffers (reset, clk_1mhz, trans, out_pin, output_0, output_1, output_2, output_3, output_4, output_5, output_6, output_7, recv, input_pin, input_0, input_1, input_2, input_3, input_4, input_5, input_6, input_7); parameter n = 64; parameter m = 64; input reset, clk_1mhz, trans, recv; input[n-1:0] out_pin; input[7:0] input_0,input_1, input_2, input_3, input_4, input_5, input_6, input_7; output[m-1:0] input_pin; output[7:0] output_0, output_1, output_2, output_3, output_4, output_5, output_6, output_7; reg[7:0] output_0, output_1, output_2, output_3, output_4, output_5, output_6, output_7; reg[m-1:0] input_int; parameter[7:0] ALL0 = {8{1'b0}}; parameter[m-1:0] ALL0_63_0 = {m-1{1'b0}}; always @(posedge reset or posedge clk_1mhz) begin if (reset == 1'b1) begin output_0 = ALL0; output_1 = ALL0; output_2 = ALL0; output_3 = ALL0; output_4 = ALL0; output_5 = ALL0; output_6 = ALL0; output_7 = ALL0; end else if (trans == 1'b1) begin output_0 = out_pin[7:0]; output_1 = out_pin[15:8]; output_2 = out_pin[23:16]; output_3 = out_pin[31:24]; output_4 = out_pin[39:32]; output_5 = out_pin[47:40]; output_6 = out_pin[55:48]; output_7 = out_pin[63:56]; end else begin output_0 = output_0; output_1 = output_1; output_2 = output_2; output_3 = output_3; output_4 = output_4; output_5 = output_5; output_6 = output_6; output_7 = output_7; end end always @(posedge reset or posedge clk_1mhz) begin if (reset == 1'b1) begin input_int = ALL0_63_0;
    end
    else begin
        if (recv == 1'b1) begin
            input_int[7:0] = input_0;
            input_int[15:8] = input_1;
            input_int[23:16] = input_2;
            input_int[31:24] = input_3;
            input_int[39:32] = input_4;
            input_int[47:40] = input_5;
            input_int[55:48] = input_6;
            input_int[63:56] = input_7;
        end
        else begin
            input_int = input_int;
        end
    end
end
assign input_pin = input_int;
endmodule
A14: Verilog HDL code for Register module
`resetall
`timescale 1ns/1ns
module Register (CLK, reset, load, d, q);
input CLK, reset, load;
input[7:0] d;
output[7:0] q;
reg[7:0] q;
always @(posedge reset, posedge CLK)
begin
    if (reset == 1'b1) begin
        q = 8'h00;
    end
    else begin
        if (load == 1'b1) begin
            q = d;
        end
        else begin
            q = q;
        end
    end
end
endmodule
A15: Verilog HDL code for T_output_reg module
`resetall
`timescale 1ns/1ns
module T_output_reg (CLK, reset, load, d, q);
input CLK, reset, load;
input[7:0] d;
output[7:0] q;
reg[7:0] q;
always @(posedge reset or posedge CLK)
begin
    if (reset == 1'b1) begin
        q = 8'h00;
    end
    else begin
        if (load == 1'b1) begin
            q = {1'b0, d[6:0]};
        end
        else begin
            q = q;
        end
    end
end
endmodule
A16: Block diagram and Verilog HDL code for Input_card module
`resetall `timescale 1ns/1ns module Input_card (reset, clk_1mhz, load, inc, clear, t_input, data_input, input_eq, input_reg_out_0, input_reg_out_1, input_reg_out_2, input_reg_out_3, input_reg_out_4, input_reg_out_5, input_reg_out_6, input_reg_out_7); input reset, clk_1mhz, load, inc, clear; input[7:0] t_input, data_input; output input_eq; output[7:0] input_reg_out_0, input_reg_out_1, input_reg_out_2, input_reg_out_3; output[7:0] input_reg_out_4, input_reg_out_5, input_reg_out_6, input_reg_out_7; wire[7:0] count_int_x8, count_int; reg [7:0] input_reg [0:7]; reg [7:0] count_int_dec; parameter[7:0] ALL0 = {8{1'b0}}; integer reg_num, i; assign count_int_x8 = {count_int[4:0], 3'b000}; always @ (count_int_dec or count_int) begin if (count_int == 8'b00001000) begin count_int_dec = count_int - 1; end else begin count_int_dec = count_int; end
end always @ (count_int_dec or reg_num) case (count_int_dec) 8'b00000000 : reg_num = 0; 8'b00000001 : reg_num = 1; 8'b00000010 : reg_num = 2; 8'b00000011 : reg_num = 3; 8'b00000100 : reg_num = 4; 8'b00000101 : reg_num = 5; 8'b00000110 : reg_num = 6; 8'b00000111 : reg_num = 7; default: reg_num = 1'bx; endcase always @ (posedge clk_1mhz or posedge reset) begin if (reset == 1'b1) begin for ( i = 0; i <= 7; i = i + 1) begin input_reg[i] = ALL0; end end else begin if (load == 1'b1) begin input_reg[reg_num] = data_input; end else begin input_reg[reg_num] = input_reg[reg_num]; end end end
assign input_reg_out_0 =input_reg[0];
assign input_reg_out_1 =input_reg[1];
assign input_reg_out_2 =input_reg[2];
assign input_reg_out_3 =input_reg[3];
assign input_reg_out_4 =input_reg[4];
assign input_reg_out_5 =input_reg[5];
assign input_reg_out_6 =input_reg[6];
assign input_reg_out_7 =input_reg[7];
Trans_comparator U_Trans_comparator (t_input, count_int_x8, input_eq);
Counter U_Trans_counter (inc, clear, count_int, reset, clk_1mhz);
endmodule
A17: Verilog HDL code for Trans_comparator module
`resetall
`timescale 1ns/1ns
module Trans_comparator (data1, data2, eq);
parameter data_length =8;
input [data_length-1:0] data1, data2;
output eq;
reg eq;
always @(data1, data2)
begin : Trans_Comparator
    if (data1 == data2) begin
        eq = 1'b1;
    end
    else if ((data1 < data2) && (data2 < data1 + 8)) begin
        eq = 1'b1;
    end
    else begin
        eq = 1'b0;
    end
end //Trans_Comparator
endmodule
A18: Verilog HDL code for Counter module
`resetall
`timescale 1ns/1ns
module Counter (inc, clr, count, reset, clk);
parameter count_width = 8;
input inc, clr, reset, clk;
output[count_width-1:0] count;
reg[count_width-1:0] count;
always @ (posedge reset or posedge clk)
begin : Counter
    if (reset == 1'b1) begin
        count = 8'b00000000;
    end
    else begin
        if (inc == 1'b1) begin
            count = count + 1;
        end
        else if (clr == 1'b1) begin
            count = 8'b00000000;
        end
        else begin
            count = count;
        end
    end
end //Counter
endmodule
A19: Block diagram and Verilog HDL code for Output_card module
`resetall
`timescale 1ns/1ns
module Output_card (reset, clk_1mhz, t_output, load, end_data, inc, clr, out_reg_0, out_reg_1, out_reg_2, out_reg_3,
out_reg_4, out_reg_5, out_reg_6, out_reg_7, t_out_eq, eq_zero, data_selected, data_transfer);
input reset, clk_1mhz, load, inc, clr;
input[1:0] end_data;
input[7:0] t_output, out_reg_0, out_reg_1, out_reg_2, out_reg_3, out_reg_4, out_reg_5, out_reg_6, out_reg_7;
output t_out_eq, eq_zero;
output[7:0] data_selected, data_transfer;
wire[7:0] count_int, count_int_x8, count_output, count_output_16, t_output_int;
wire[7:0] add_signal= 8'b00010000;
wire[3:0] mux_sel;
reg[7:0] data_transfer_int, data_transfer, data_selected_int;
assign t_output_int[7] = t_output[7];
assign t_output_int[6] = t_output[6];
assign t_output_int[5] = t_output[5];
assign t_output_int[4] = t_output[4];
assign t_output_int[3] = t_output[3];
assign t_output_int[2] = t_output[2];
assign t_output_int[1] = t_output[1];
assign t_output_int[0] = t_output[0];
assign count_output_16 = count_output + add_signal;
assign count_int_x8 = {count_int[4:0], 3'b000};
assign mux_sel = count_int[3:0];
assign count_output = t_output_int;
assign data_selected = data_selected_int;
always @ (mux_sel or out_reg_0 or out_reg_1 or out_reg_2 or out_reg_3 or out_reg_4 or out_reg_5 or out_reg_6 or out_reg_7)
case(mux_sel)
4'b0000 : data_transfer_int = 8'h02;
4'b0001 : data_transfer_int = out_reg_0;
4'b0010 : data_transfer_int = out_reg_1;
4'b0011 : data_transfer_int = out_reg_2;
4'b0100 : data_transfer_int = out_reg_3;
4'b0101 : data_transfer_int = out_reg_4;
4'b0110 : data_transfer_int = out_reg_5;
4'b0111 : data_transfer_int = out_reg_6;
4'b1000 : data_transfer_int = out_reg_7;
default : data_transfer_int = 8'h00;
endcase
always @ (end_data or data_transfer_int)
case(end_data)
2'b00 : data_selected_int = data_transfer_int;
2'b01 : data_selected_int = 8'h10;
2'b10 : data_selected_int = 8'h03;
default : data_selected_int = 8'h74;
endcase
always @ (posedge clk_1mhz, posedge reset)
begin if (reset == 1'b1)
begin
data_transfer = 8'h00;
end
else
begin if (load == 1'b1)
begin
data_transfer = data_selected_int;
end
else
begin
data_transfer = data_transfer;
end
end
end
Counter U_Poll_counter (inc, clr, count_int, reset, clk_1mhz);
Poll_comparator U_Poll_Comparator (count_output_16, count_int_x8, t_out_eq,eq_zero);
endmodule
A20: Verilog HDL code for Poll_comparator module
`resetall
`timescale 1ns/1ns
module Poll_comparator (data1, data2, eq, eq_zero);
  parameter data_length = 8;
  input [data_length-1:0] data1, data2;
  output eq, eq_zero;
  reg eq, eq_zero;
  always @(data1, data2)
  begin : Poll_Comparator
    if (data1 == data2) begin
      eq = 1'b1;
    end else if (data1 < data2) begin
      eq = 1'b1;
    end else begin
      eq = 1'b0;
    end
    if (data2 == 8'h00) begin
      eq_zero = 1'b1;
    end else begin
      eq_zero = 1'b0;
    end
  end
endmodule
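The comparator above asserts eq whenever data1 is less than or equal to data2, and eq_zero whenever data2 is zero. A minimal self-checking testbench sketch for that behaviour follows. The testbench name tb_Poll_comparator, the task check and the chosen stimulus values are assumptions made for illustration only; they are not part of the original design. It is meant to be compiled together with the Poll_comparator listing.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench (not part of the original design).
// Compile together with the Poll_comparator listing (A20).
module tb_Poll_comparator;
  reg  [7:0] data1, data2;
  wire       eq, eq_zero;

  // Positional connection, same port order as the A20 listing.
  Poll_comparator dut (data1, data2, eq, eq_zero);

  // Apply one stimulus pair and compare against the expected flags.
  task check;
    input [7:0] a, b;
    input       exp_eq, exp_zero;
    begin
      data1 = a;
      data2 = b;
      #1; // let the combinational block settle
      if (eq !== exp_eq || eq_zero !== exp_zero)
        $display("FAIL: data1=%h data2=%h eq=%b eq_zero=%b", a, b, eq, eq_zero);
      else
        $display("ok:   data1=%h data2=%h eq=%b eq_zero=%b", a, b, eq, eq_zero);
    end
  endtask

  initial begin
    check(8'h10, 8'h10, 1'b1, 1'b0); // data1 == data2
    check(8'h10, 8'h18, 1'b1, 1'b0); // data1 <  data2
    check(8'h20, 8'h18, 1'b0, 1'b0); // data1 >  data2
    check(8'h10, 8'h00, 1'b0, 1'b1); // data2 is zero, data1 is larger
    check(8'h00, 8'h00, 1'b1, 1'b1); // both zero
    $finish;
  end
endmodule
```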
A21: Block diagram and Verilog HDL code for UART module
`resetall
`timescale 1ns/1ns
module UART (transfer, get_data, reset_b, clk_1mhz, clk25mhz, data_in, serial_out, sig_idle,
             serial_in, data_output, ready_in, clk_trans, trans_bit_count);
  input transfer, get_data, reset_b, clk_1mhz, clk25mhz, serial_in;
  input [7:0] data_in;
  output serial_out, sig_idle, ready_in, clk_trans;
  output [0:7] data_output;
  output [4:0] trans_bit_count;
  wire tdre_clear_int, tdre_set_int, rdrf_set_int, rdrf_clear_int;
  wire idle, byte_ready, t_byte, load, read_not_ready_in, error1, error2;
  wire [1:0] status_int;
  C_transmit U_C_transmit (clk_1mhz, reset_b, transfer, status_int[1], idle, byte_ready,
                           t_byte, load, sig_idle);
  C_receive U_C_receive (clk_1mhz, reset_b, get_data, status_int[0], read_not_ready_in,
                         ready_in);
  UART_Comm U_UART_Comm (reset_b, clk25mhz, byte_ready, t_byte, load, data_in,
                         tdre_clear_int, tdre_set_int, serial_out, idle, serial_in,
                         read_not_ready_in, rdrf_set_int, rdrf_clear_int, error1, error2,
                         data_output, clk_trans, trans_bit_count);
  Status_reg U_Status_reg (clk_1mhz, tdre_set_int, tdre_clear_int, rdrf_set_int,
                           rdrf_clear_int, status_int);
endmodule
A22: Block diagram and Verilog HDL code for C_transmit module
`resetall
`timescale 1ns/1ns
module C_transmit (clk_1mhz, reset, transfer, tdre, idle, byte_ready, t_byte, load, sig_idle);
  input clk_1mhz, reset, transfer, tdre, idle;
  output byte_ready, t_byte, load, sig_idle;
  parameter s0 = 3'b000;
  parameter s1 = 3'b001;
  parameter s2 = 3'b010;
  parameter s3 = 3'b011;
  parameter s4 = 3'b100;
  reg byte_ready, t_byte, load, sig_idle;
  reg [3:0] ps, ns;
  always @(negedge reset or negedge clk_1mhz)
  begin : state_transition
    if (reset == 1'b0) begin
      ps = s0;
    end else begin
      ps = ns;
    end
  end
  always @(ps or transfer or tdre or idle)
  begin : comb_logic
    case (ps)
      s0: begin
        byte_ready = 1'b0;
        t_byte = 1'b0;
        sig_idle = 1'b0;
        load = 1'b0;
        if (transfer == 1'b0) begin
          ns = s0;
        end else begin
          ns = s1;
        end
      end
      s1: begin
        load = 1'b1;
        byte_ready = 1'b0;
        t_byte = 1'b0;
        sig_idle = 1'b0;
        if (tdre == 1'b1) begin
          ns = s1;
        end else begin
          ns = s2;
        end
      end
      s2: begin
        load = 1'b0;
        byte_ready = 1'b1;
        t_byte = 1'b0;
        sig_idle = 1'b0;
        if (tdre == 1'b0) begin
          ns = s2;
        end else begin
          ns = s3;
        end
      end
      s3: begin
        load = 1'b0;
        byte_ready = 1'b0;
        t_byte = 1'b1;
        if (idle == 1'b0) begin
          ns = s3;
          sig_idle = 1'b0;
        end else begin
          ns = s4;
          sig_idle = 1'b1;
        end
      end
      s4: begin
        load = 1'b0;
        byte_ready = 1'b0;
        t_byte = 1'b0;
        if (transfer == 1'b1) begin
          ns = s4;
          sig_idle = 1'b1;
        end else begin
          ns = s0;
          sig_idle = 1'b0;
        end
      end
      default: begin
        ns = s0;
        load = 1'b0;
        byte_ready = 1'b0;
        t_byte = 1'b0;
        sig_idle = 1'b0;
      end
    endcase
  end
endmodule
A23: Block diagram and Verilog HDL code for C_receive module
`resetall
`timescale 1ns/1ns
module C_receive (clk_1mhz, reset, get_data, rdrf, read_not_ready_in, ready_in);
  input clk_1mhz, reset, get_data, rdrf;
  output read_not_ready_in, ready_in;
  parameter s0 = 2'b00;
  parameter s1 = 2'b01;
  parameter s2 = 2'b10;
  parameter s3 = 2'b11;
  reg [1:0] ps, ns;
  reg read_not_ready_in, ready_in;
  always @(negedge reset or negedge clk_1mhz)
  begin : state_transition
    if (reset == 1'b0) begin
      ps = s0;
    end else begin
      ps = ns;
    end
  end
  always @(ps or get_data or rdrf)
  begin : comb_logic
    case (ps)
      s0: begin
        read_not_ready_in = 1'b1;
        ready_in = 1'b0;
        if (rdrf == 1'b1) begin
          ns = s0;
        end else if (get_data == 1'b1) begin
          ns = s1;
        end else begin
          ns = s0;
        end
      end
      s1: begin
        read_not_ready_in = 1'b0;
        ready_in = 1'b0;
        if (rdrf == 1'b1) begin
          ns = s2;
        end else begin
          ns = s1;
        end
      end
      s2: begin
        if (rdrf == 1'b1) begin
          ns = s2;
          ready_in = 1'b0;
          read_not_ready_in = 1'b0;
        end else begin
          ns = s3;
          read_not_ready_in = 1'b0;
          ready_in = 1'b1;
        end
      end
      s3: begin
        if (get_data == 1'b1) begin
          ns = s3;
          read_not_ready_in = 1'b0;
          ready_in = 1'b1;
        end else begin
          ns = s0;
          read_not_ready_in = 1'b1;
          ready_in = 1'b0;
        end
      end
      default: begin
        ns = s0;
        read_not_ready_in = 1'b1;
        ready_in = 1'b0;
      end
    endcase
  end
endmodule
A24: Block diagram and Verilog HDL code for Status_reg module
`resetall
`timescale 1ns/1ns
module Status_reg (clk_1mhz, tdre_set, tdre_clr, rdrf_set, rdrf_clr, status);
  input clk_1mhz, tdre_set, tdre_clr, rdrf_set, rdrf_clr;
  output[1:0] status;
  reg[1:0] status;
  always @(posedge clk_1mhz)
  begin
    if (tdre_clr == 1'b1 && tdre_set == 1'b0) begin
      status[1] = 1'b0;
    end else if (tdre_clr == 1'b0 && tdre_set == 1'b1) begin
      status[1] = 1'b1;
    end else if (tdre_clr == 1'b0 && tdre_set == 1'b0) begin
      status[1] = 1'b1;
    end else begin
      status[1] = status[1];
    end
    if (rdrf_clr == 1'b1) begin
      status[0] = 1'b0;
    end else if (rdrf_clr == 1'b0 && rdrf_set == 1'b1) begin
      status[0] = 1'b1;
    end else begin
      status[0] = status[0];
    end
  end
endmodule
A25: Block diagram and Verilog HDL code for UART_Comm module
`resetall
`timescale 1ns/1ns
module UART_Comm (reset_b, clk25mhz, byte_ready, t_byte, load_datareg, data_in, tdre_clear,
                  tdre_set, serial_out, sig_idle, serial_in, read_not_ready_in, rdrf_set,
                  rdrf_clear, error1, error2, data_output, clk_trans, trans_bit_count);
  input reset_b, clk25mhz, byte_ready, t_byte, load_datareg;
  input serial_in, read_not_ready_in;
  input [7:0] data_in;
  output tdre_clear, tdre_set, serial_out, sig_idle, rdrf_set, clk_trans;
  output rdrf_clear, error1, error2;
  output [0:7] data_output;
  output [4:0] trans_bit_count;
  wire reset, clk_transmitter, clk_receiver;
  assign clk_trans = clk_transmitter;
  assign reset = ~reset_b;
  Clock_generator U_Clock_generator (clk25mhz, clk_transmitter, clk_receiver);
  UART_Receiver U_UART_Receiver (clk_receiver, reset, serial_in, read_not_ready_in, rdrf_set,
                                 rdrf_clear, error1, error2, data_output);
  UART_Transmitter U_UART_Transmitter (clk_transmitter, reset, byte_ready, t_byte,
                                       load_datareg, data_in, tdre_clear, tdre_set,
                                       serial_out, sig_idle, trans_bit_count);
endmodule
A26: Verilog HDL code for Clock_generator module
`resetall
`timescale 1ns/1ns
module Clock_generator (clk25Mhz, clk_transmitter, clk_receiver);
  input clk25Mhz;
  output clk_transmitter, clk_receiver;
  reg [9:0] clk_25Mhz_count;
  reg [3:0] count_receiver;
  reg clk_transmitter;
  reg clk_receiver_int, clk_transmitter_int;
  always @ (posedge clk25Mhz)
  begin : receiver_clock
    if (clk_25Mhz_count < 327) begin
      clk_25Mhz_count = clk_25Mhz_count + 1;
    end else begin
      clk_25Mhz_count = 0;
    end
    if (clk_25Mhz_count < 163) begin
      clk_receiver_int = 1'b0;
    end else begin
      clk_receiver_int = 1'b1;
    end
  end
  always @ (negedge clk_receiver_int)
  begin : transmitter_clock
    if (count_receiver < 7) begin
      count_receiver = count_receiver + 1;
    end else begin
      count_receiver = 0;
    end
    if (count_receiver < 3) begin
      clk_transmitter = 1'b0;
    end else if (count_receiver == 7) begin
      clk_transmitter = 1'b0;
    end else begin
      clk_transmitter = 1'b1;
    end
  end
  assign clk_receiver = clk_receiver_int;
endmodule
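The two counters in Clock_generator divide the input clock by 328 (count values 0 to 327) for the receiver clock and by a further 8 (count values 0 to 7) for the transmitter clock. The following testbench sketch measures both periods in simulation. The testbench name tb_Clock_generator and its internal signal names are assumptions introduced for illustration, and the 40 ns input period assumes the input really is a 25 MHz clock, as its name suggests. It is meant to be compiled together with the Clock_generator listing.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench (not part of the original design).
// Compile together with the Clock_generator listing (A26).
module tb_Clock_generator;
  reg  clk25mhz;
  wire clk_tx, clk_rx;
  time last_rx, last_tx;   // time of the previous rising edge
  time rx_period;          // most recent receiver clock period
  reg  seen_rx, seen_tx;   // set once a first rising edge was recorded

  // Positional connection, same port order as the A26 listing.
  Clock_generator dut (clk25mhz, clk_tx, clk_rx);

  initial begin
    clk25mhz  = 1'b0;
    seen_rx   = 1'b0;
    seen_tx   = 1'b0;
    last_rx   = 0;
    last_tx   = 0;
    rx_period = 0;
    #1000000 $finish;      // safety timeout (1 ms)
  end

  always #20 clk25mhz = ~clk25mhz;   // 40 ns period, assumed 25 MHz input

  // Record the distance between consecutive rising edges of clk_rx.
  always @(posedge clk_rx) begin
    if (seen_rx) rx_period = $time - last_rx;
    last_rx = $time;
    seen_rx = 1'b1;
  end

  // On the second rising edge of clk_tx, report both periods and stop.
  always @(posedge clk_tx) begin
    if (seen_tx) begin
      $display("clk_receiver period    = %0d ns", rx_period);
      $display("clk_transmitter period = %0d ns", $time - last_tx);
      $finish;
    end
    last_tx = $time;
    seen_tx = 1'b1;
  end
endmodule
```

Working from the counter limits in the listing, the receiver period should be 328 x 40 ns = 13120 ns (about 76.2 kHz) and the transmitter period 8 x 13120 ns = 104960 ns (about 9527 Hz). That is close to the common 9600 baud rate, although the intended baud rate is not stated in this listing and is assumed here.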
A27: Block diagram and Verilog HDL code for UART_Receiver module
`resetall
`timescale 1ns/1ns
module UART_Receiver (clk_receiver, reset, serial_in, read_not_ready_in, rdrf_set, rdrf_clear,
                      error1, error2, data_out);
  input clk_receiver, reset, serial_in, read_not_ready_in;
  output rdrf_set, rdrf_clear, error1, error2;
  output[0:7] data_out;
  wire clr_sample, inc_sample, clr_bit, inc_bit;
  wire shift, load, sel_comp, sam_eq_int, bit_eq_int;
  Receiver_DPU U_Receiver_DPU (clr_sample, inc_sample, clr_bit, inc_bit, shift, load, sel_comp,
                               serial_in, reset, clk_receiver, sam_eq_int, bit_eq_int, data_out);
  Receiver_CU U_Receiver_CU (serial_in, read_not_ready_in, sam_eq_int, bit_eq_int, reset,
                             clk_receiver, inc_sample, clr_sample, inc_bit, clr_bit, sel_comp,
                             shift, load, rdrf_set, rdrf_clear, error1, error2);
endmodule
A28: Verilog HDL code for Receiver_CU module
`resetall
`timescale 1ns/1ns
module Receiver_CU (serial_in, read_not_ready_in, sample_eq, bit_eq, reset, CLK, inc_sample,
                    clr_sample, inc_bit, clr_bit, sel_comp, shift, load, rdrf_set, rdrf_clear,
                    error1, error2);
  input serial_in, read_not_ready_in, sample_eq, bit_eq, reset, CLK;
  output inc_sample, clr_sample, inc_bit, clr_bit, sel_comp, shift, load;
  output rdrf_set, rdrf_clear, error1, error2;
  reg error1, error2, rdrf_set, rdrf_clear;
  reg [2:0] ps, ns;
  reg [6:0] control;
  parameter s0 = 3'b000;
  parameter s1 = 3'b001;
  parameter s2 = 3'b010;
  parameter s3 = 3'b011;
  parameter s4 = 3'b100;
  always @(negedge CLK or posedge reset)
  begin : state_memory
    if (reset == 1'b1) begin
      ps = s0;
    end else begin
      ps = ns;
    end
  end
  always @(ps or sample_eq or bit_eq or read_not_ready_in or serial_in)
  begin
    control = 7'b0101000;
    error1 = 1'b0;
    error2 = 1'b0;
    rdrf_set = 1'b0;
    rdrf_clear = 1'b1;
    case (ps)
      s0: begin
        control = 7'b0101000;
        error1 = 1'b0;
        error2 = 1'b0;
        rdrf_set = 1'b0;
        rdrf_clear = 1'b1;
        if (serial_in == 1'b0) begin
          ns = s1;
        end else begin
          ns = s0;
        end
      end
      s1: begin
        if (serial_in == 1'b1) begin
          ns = s0;
          control = 7'b0101000;
        end else begin
          if (sample_eq == 1'b0) begin
            control = 7'b1001000;
            ns = s1;
          end else begin
            control = 7'b0101000;
            ns = s2;
          end
        end
      end
      s2: begin
        control = 7'b0101000;
        ns = s3;
      end
      s3: begin
        if (sample_eq == 1'b0) begin
          ns = s3;
          control = 7'b1000100;
        end else begin
          if (bit_eq == 1'b0) begin
            control = 7'b0110110;
            ns = s3;
          end else begin
            error1 = 1'b0;
            error2 = 1'b0;
            if (read_not_ready_in == 1'b1) begin
              control = 7'b0101100;
              error1 = 1'b1;
              ns = s0;
            end else if (serial_in == 1'b0) begin
              control = 7'b0101100;
              error2 = 1'b1;
              ns = s0;
            end else begin
              control = 7'b0101101;
              ns = s4;
            end
          end
        end
      end
      s4: begin
        control = 7'b0101101;
        rdrf_set = 1'b1;
        rdrf_clear = 1'b0;
        ns = s0;
      end
      default: begin
        ns = s0;
      end
    endcase
  end
  assign inc_sample = control[6];
  assign clr_sample = control[5];
  assign inc_bit = control[4];
  assign clr_bit = control[3];
  assign sel_comp = control[2];
  assign shift = control[1];
  assign load = control[0];
endmodule
A29: Block diagram and Verilog HDL code for Receiver_DPU module
`resetall
`timescale 1ns/1ns
module Receiver_DPU (clr_sample, inc_sample, clr_bit, inc_bit, shift, load, sel_comp, serial_in,
                     reset, CLK, sample_eq, bit_eq, data_out);
  parameter data_width = 8;
  input clr_sample, inc_sample, clr_bit, inc_bit, shift, load, sel_comp, serial_in, reset, CLK;
  output[data_width-1:0] data_out;
  output sample_eq, bit_eq;
  reg sample_eq, bit_eq;
  wire[7:0] data;
  wire [3:0] sample, dpu_bit;
  always @ (sample or sel_comp)
  begin : comp1
    if (sel_comp==1'b0 && sample==3) begin
      sample_eq = 1'b1;
    end else if (sel_comp==1'b1 && sample==7) begin
      sample_eq = 1'b1;
    end else begin
      sample_eq = 1'b0;
    end
  end
  always @ (dpu_bit)
  begin : comp2
    if (dpu_bit == 8) begin
      bit_eq = 1'b1;
    end else begin
      bit_eq = 1'b0;
    end
  end
  Register U_Rcv_datareg (CLK, reset, load, data, data_out);
  Rcv_shftreg U_Rcv_shftreg (serial_in, shift, reset, data, CLK);
  Sample_counter U_Sample_counter (inc_sample, clr_sample, sample, reset, CLK);
  Bit_counter1 U_Bit_counter1 (inc_bit, clr_bit, dpu_bit, reset, CLK);
endmodule
A30: Verilog HDL code for Bit_counter1 module
`resetall
`timescale 1ns/1ns
module Bit_counter1 (inc, clr, count, reset, CLK);
  parameter count_width = 4;
  input inc, clr, reset, CLK;
  output[count_width-1:0] count;
  reg[count_width-1:0] count;
  always @ (posedge reset or posedge CLK)
  begin : Counter
    if (reset == 1'b1) begin
      count = 2'b00;
    end else begin
      if (inc == 1'b1) begin
        count = count + 1;
      end else if (clr == 1'b1) begin
        count = 2'b00;
      end else begin
        count = count;
      end
    end
  end //Counter
endmodule
A31: Verilog HDL code for Rcv_shftreg module
`resetall
`timescale 1ns/1ns
module Rcv_shftreg (d, shift, reset, q, CLK);
  parameter width = 8;
  input d, shift, reset, CLK;
  output[width-1:0] q;
  integer i;
  reg [width-1:0] q;
  always @(posedge CLK or posedge reset)
  begin
    if (reset == 1'b1) begin
      q = 8'h00;
    end else begin
      if (shift == 1'b1) begin
        for (i = 7; i >= 1; i = i - 1) begin
          q[i] = q[i-1];
        end
        q[0] = d;
      end else begin
        q = q;
      end
    end
  end
endmodule
A32: Verilog HDL code for Sample_counter module
`resetall
`timescale 1ns/1ns
module Sample_counter(inc, clr, count, reset, CLK);
  parameter count_width = 4;
  input inc, clr, reset, CLK;
  output [count_width-1:0] count;
  reg [count_width-1:0] count;
  always @(posedge reset or posedge CLK)
  begin
    if (reset == 1'b1) begin
      count = 4'b0000;
    end else begin
      if (inc == 1'b1) begin
        count = count + 1;
      end else if (clr == 1'b1) begin
        count = 4'b0000;
      end else begin
        count = count;
      end
    end
  end
endmodule
A33: Block diagram and Verilog HDL code for UART_Transmitter module
`resetall
`timescale 1ns/1ns
module UART_Transmitter(clk_transmitter, reset, byte_ready, t_byte, load_datareg, data_in,
                        tdre_clear, tdre_set, serial_out, sig_idle, trans_bit_count);
  input clk_transmitter, reset, byte_ready, t_byte, load_datareg;
  input [7:0] data_in;
  output tdre_clear, tdre_set, serial_out, sig_idle;
  output [4:0] trans_bit_count;
  wire load_shftreg, clear, shift;
  wire [4:0] bit_count;
  assign trans_bit_count = bit_count;
  Transmitter_DPU U_Transmitter_DPU (clk_transmitter, reset, data_in, load_datareg, tdre_clear,
                                     tdre_set, load_shftreg, shift, clear, bit_count, serial_out);
  Transmitter_CU U_Transmitter_CU (clk_transmitter, reset, byte_ready, t_byte, bit_count,
                                   load_shftreg, clear, shift, sig_idle);
endmodule
A34: Block diagram and Verilog HDL code for Transmitter_CU module
`resetall
`timescale 1ns/1ns
module Transmitter_CU (CLK, reset, byte_ready, t_byte, bit_count, load_shftreg, clear, shift,
                       sig_idle);
  input CLK, reset, byte_ready, t_byte;
  input[4:0] bit_count;
  output load_shftreg, clear, shift, sig_idle;
  parameter idle = 2'b00;
  parameter waiting = 2'b01;
  parameter sending = 2'b10;
  reg sig_idle;
  reg [1:0] ps, ns;
  reg [2:0] control;
  always @(posedge reset or negedge CLK)
  begin : state_trans
    if (reset == 1'b1) begin
      ps = idle;
    end else begin
      ps = ns;
    end
  end
  always @(ps or byte_ready or t_byte or bit_count)
  begin : output_logic
    control = 3'b000;
    case (ps)
      idle: begin
        control = 3'b100;
        sig_idle = 1'b1;
        if (byte_ready == 1'b1) begin
          ns = waiting;
          control = 3'b001;
        end else begin
          ns = idle;
        end
      end
      waiting: begin
        control = 3'b001;
        sig_idle = 1'b0;
        if (t_byte == 1'b1) begin
          ns = sending;
        end else begin
          ns = waiting;
        end
      end
      sending: begin
        sig_idle = 1'b0;
        if (bit_count < 10) begin
          control = 3'b010;
          ns = sending;
        end else begin
          control = 3'b100;
          ns = idle;
        end
      end
      default: begin
        control = 3'b100;
        ns = idle;
        sig_idle = 0;
      end
    endcase
  end
  assign clear = control[2];
  assign shift = control[1];
  assign load_shftreg = control[0];
endmodule
A35: Block diagram and Verilog HDL code for Transmitter_DPU module
`resetall
`timescale 1ns/1ns
module Transmitter_DPU (CLK, reset, data_in, load_datareg, tdre_clear, tdre_set, load_shftreg,
                        shift, clear, bit_count, serial_out);
  input CLK, reset, load_datareg, load_shftreg, shift, clear;
  input [7:0] data_in;
  output tdre_clear, tdre_set, serial_out;
  output [4:0] bit_count;
  wire [7:0] datareg_signal;
  Datareg U_Datareg (CLK, reset, data_in, load_datareg, tdre_clear, datareg_signal);
  Bit_counter U_Bit_counter (shift, clear, bit_count, reset, CLK);
  Shftreg U_Shftreg (CLK, reset, load_shftreg, datareg_signal, shift, tdre_set, serial_out);
endmodule
A36: Verilog HDL code for Datareg module
`resetall
`timescale 1ns/1ns
module Datareg (CLK, reset, data_in, load_datareg, tdre_clear, data_out);
  input CLK, reset, load_datareg;
  input[7:0] data_in;
  output[7:0] data_out;
  output tdre_clear;
  reg[7:0] data_out;
  reg tdre_clear;
  always @(posedge reset, posedge CLK)
  begin
    if (reset == 1'b1) begin
      data_out = 8'h00;
    end else begin
      if (load_datareg == 1'b1) begin
        data_out = data_in;
        tdre_clear = 1'b1;
      end else begin
        data_out = data_out;
        tdre_clear = 1'b0;
      end
    end
  end
endmodule
A37: Verilog HDL code for Bit_counter module
`resetall
`timescale 1ns/1ns
module Bit_counter (shift, clr, bit_count, reset, CLK);
  parameter count_width = 5;
  input shift, clr, reset, CLK;
  output[count_width-1:0] bit_count;
  reg[count_width-1:0] bit_count;
  always @ (posedge reset or posedge CLK)
  begin : Counter
    if (reset == 1'b1) begin
      bit_count = 5'b00000;
    end else begin
      if (shift == 1'b1) begin
        bit_count = bit_count + 1;
      end else if (clr == 1'b1) begin
        bit_count = 5'b00000;
      end else begin
        bit_count = bit_count;
      end
    end
  end //Counter
endmodule
A38: Verilog HDL code for Shftreg module
`resetall
`timescale 1ns/1ns
module Shftreg(CLK, reset, load, data_in, shift, tdre_set, serial_out);
  input CLK, reset, load, shift;
  input [7:0] data_in;
  output tdre_set, serial_out;
  reg [8:0] data_out;
  reg serial_out, tdre_set;
  always @(posedge reset or posedge CLK)
  begin
    if (reset == 1'b1) begin
      data_out = 9'b111111111;
      serial_out = 1'b1;
      tdre_set = 1'b1;
    end else begin
      if (load == 1'b1) begin
        data_out = {data_in, 1'b0};
        tdre_set = 1'b1;
      end else if (shift == 1'b1) begin
        serial_out = data_out[0];
        data_out = {1'b1, data_out[8:1]};
        tdre_set = 1'b0;
      end else begin
        data_out = data_out;
        serial_out = 1'b1;
        tdre_set = 1'b0;
      end
    end
  end
endmodule
APPENDIX B
VHDL to Verilog HDL Conversion Table
Each entry below gives the function, the VHDL construct, and the equivalent Verilog construct.
Function: Modules/Entities
VHDL:
    entity AND_GATE is
      port ( x : out std_ulogic;
             y : in std_ulogic;
             z : in std_ulogic );
    end AND_GATE;
Verilog:
    module AND_GATE (X, Y, Z);
      output X;
      input Y;
      input Z;
Function: Internal signal/wire
VHDL:
    signal A: STD_LOGIC;
    signal B: STD_LOGIC_VECTOR (15 downto 0);
Verilog:
    wire A;
    wire [15:0] B;
Function: Component/Module instantiation
VHDL:
    component FF
      port ( q : in std_ulogic;
             clk : in std_ulogic;
             reset : in std_ulogic );
    end component;
    begin
      ff1: FF port map(q, clk, reset);
Verilog:
    FF ff1(q, clk, reset);
Function: Concurrent assignment
VHDL:
    Sig_out <= Sig_In;
Verilog:
    assign Sig_out = Sig_In;
Function: Concurrent conditional statements
VHDL:
    Sig_A <= '1' WHEN (((Sig_B = '1') AND (Sig_C = '0')) OR
                       (Sig_D = '1') OR
                       ((Sig_E = '1') AND (Sig_F = '0')))
             ELSE '0';
Verilog:
    assign Sig_A = (((Sig_B == 1'b1) & (Sig_C == 1'b0)) |
                    (Sig_D == 1'b1) |
                    ((Sig_E == 1'b1) & (Sig_F == 1'b0))) ? 1'b1 : 1'b0;
Function: Concatenation operator
VHDL:
    Sig_A(7 DOWNTO 0) <= Sig_B(3 DOWNTO 0) & Sig_C(2 DOWNTO 0) & Sig_D;
Verilog:
    assign Sig_A[7:0] = {Sig_B[3:0], Sig_C[2:0], Sig_D};
Function: Process/always block
VHDL:
    gate_clk: process(CLK, cken_int)
    begin
      if (CLK = '0') then
        latched_cken <= cken_int;
      end if;
    end process gate_clk;
Verilog:
    always @(CLK or cken_int)
    begin : gate_clk
      if (CLK == 1'b0) begin
        latched_cken <= cken_int;
      end
    end
Function: Logical operations
VHDL:
    Sig_C <= Sig_A XOR Sig_B;
    Output <= CC AND DD;
Verilog:
    assign Sig_C = Sig_A ^ Sig_B;
    assign Output = CC & DD;
Function: Case block
VHDL:
    case ps is
      when s1 => ...;
      when s2 => ...;
      when others => ...;
    end case;
Verilog:
    case (ps)
      s1 : ...;
      s2 : ...;
      default: ...;
    endcase
Function: Generic/Parameter
VHDL:
    entity Bit_counter1 is
      generic ( count_width: integer := 2);
      port ( inc : in STD_LOGIC;
             count : buffer unsigned(count_width-1 downto 0);
Verilog:
    module Bit_counter1 (inc, count);
      parameter count_width = 2;
      input inc;
      output[count_width-1:0] count;
      reg[count_width-1:0] count;