EMBEDDED DSP PROCESSOR DESIGN USING COWARE PROCESSOR DESIGNER
AND MAGMA LAYOUT TOOL
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
Bachelor of Technology In
Electronics and Communication Engineering and
Electronics and Instrumentation Engineering
By Dodani Vicky Rameshlal
and Nikhil Kumar
Department of Electronics and Communication Engineering
National Institute of Technology Rourkela
May, 2010
EMBEDDED DSP PROCESSOR DESIGN USING COWARE PROCESSOR DESIGNER
AND MAGMA LAYOUT TOOL
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Bachelor of Technology In
Electronics and Communication Engineering and
Electronics and Instrumentation Engineering
By Dodani Vicky Rameshlal
and Nikhil Kumar
Under the guidance of
Prof. K. K. Mahapatra
Department of Electronics and Communication Engineering
National Institute of Technology Rourkela
May, 2010
National Institute of Technology Rourkela
CERTIFICATE This is to certify that the thesis entitled “Embedded DSP Processor Design using CoWare Processor Designer and Magma Layout Tool” submitted by Dodani Vicky Rameshlal (Roll No.-10609008) and Nikhil Kumar (Roll No.-10607026) in partial fulfillment of the requirements for the award of Bachelor of Technology Degree in Electronics and Communication Engineering and Electronics and Instrumentation Engineering respectively at National Institute of Technology, Rourkela is an authentic work carried out by them under my supervision and guidance.
To the best of my knowledge, the matter embodied in the thesis has not been submitted to any other university/institute for the award of any degree or diploma.
Prof. K. K. Mahapatra
Date:
Department of E.C.E
National Institute of Technology
Rourkela-769008
ACKNOWLEDGEMENT
This is a research project, and the fact that we have been able to complete it successfully
owes a lot to a number of people associated with us during the project.
First of all, we would like to thank Prof. K. K. Mahapatra for giving us a golden
opportunity to work on such an interesting topic and for providing a thoroughly professional
and research-oriented environment. He guided us throughout the project
period and helped us from time to time with his vast experience and innovative ideas.
We wish to extend our sincere thanks to Prof. S. K. Patra, Head of our Department, for
approving our project work with great interest.
We also thank Mr. Sudeendra Kumar K., Mr. Ayaskanta Swain, Mr. Jagannath
Mohanty and Mr. Sushant Kr. Pattanayak, whose contribution to this project has been
quite significant.
Finally, a word of thanks to all those who have been associated with us and have directly or
indirectly helped us during this project.
Dodani Vicky Rameshlal
Roll No.-10609008
Nikhil Kumar
Roll No.-10607026
Department of Electronics and Communication Engineering
National Institute of Technology Rourkela
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
1.1. Implementation of DSP Application
1.1.1. Implementation on General Purpose Processor (GPP)
1.1.2. Implementation on General Purpose DSP Processor
1.1.3. Implementation on Application Specific Integrated Circuit (ASIC)
1.1.4. Implementation on Application Specific Instruction Set Processor (ASIP)
1.2. DSP Processor Architecture
1.3. Embedded System Overview
1.4. DSP in an Embedded System
1.5. ASIP Design Flow
APPLICATION DESCRIPTION LANGUAGE
2.1. Introduction
2.2. LISA Modelling Fundamentals
2.3. Modelling Processor Resources
2.4. Modelling Instructions
2.4.1. The Instruction Behavior
2.4.2. The Instruction Syntax
2.4.3. The Instruction Coding
PROCESSOR DESIGN PLATFORM
3.1. Traditional Embedded System Design Flow
3.2. Hardware Software Co-Design Flow
3.3. CoWare Design Flow
3.4. CoWare Processor Designer
3.5. The Instruction Set Designer
3.6. CoWare Processor Debugger
MAGMA LAYOUT TOOL
4.1. Magma Design Flow
4.2. Introduction to Magma Blast Create Tool
4.3. Introduction to Magma Blast Fusion Tool
Virtex-II Pro FPGA
5.1. Introduction
5.2. XUP Virtex-II Pro Development System
5.3. Chipscope Pro Tools
IMPLEMENTATION OF PROCESSOR
6.1. Implementation of General Purpose Processor
6.2. Architecture Profiling and Debugging
SIMULATION & SYNTHESIS RESULTS
7.1. Introduction
7.2. Simulation Results
7.3. Synthesis Results
LAYOUT & FPGA IMPLEMENTATION OF THE PROCESSOR
8.1. Layout using Magma Tool
8.2. Configuring the Virtex-II Pro FPGA
CONCLUSION
References
ABSTRACT
A Digital Signal Processing (DSP) application can be implemented in a variety of ways.
The objective of this project is to design an embedded DSP processor. The desired
processor is driven by an application-specific instruction set; such a processor is called an
Application Specific Instruction Set Processor (ASIP). ASIPs are becoming essential to
convergent System on Chip (SoC) design. There are usually two approaches to designing an
ASIP: one works at the Register Transfer Level (RTL), and the other works at a level just
above RTL, known as the Electronic System Level (ESL). Architecture Description
Languages (ADLs) have recently become popular because they enable quick and optimal
design convergence during the design of ASIPs.
In this project we first concentrate on the implementation and optimization of an ASIP
using an ADL known as Language for Instruction Set Architecture (LISA) and the CoWare
Processor Designer environment. We have written a LISA 2.0 description of the
processor. Given the LISA code, CoWare Processor Designer (PD) generates
software development tools such as an assembler, disassembler, linker, and compiler. An
assembly-language application that computes a convolution using an FIR filter is
then run on the processor. Once the functionality of the processor has been verified,
synthesizable RTL for the processor is generated using CoWare Processor Generator.
Using the generated RTL, we implemented our processor in the following IC design
technologies:
• Semi-Custom IC Design Technology: the RTL is synthesized using the Magma Blast
Create tool, and the final layout is produced using the Magma Blast Fusion tool.
• Programmable Logic Device IC Design Technology: the processor is downloaded to a
Field Programmable Gate Array (FPGA). The FPGA used for this purpose is the Xilinx
Virtex-II Pro.
LIST OF FIGURES
Figure 1.1 DSP Processor Architecture
Figure 1.2 DSP Processor in an Embedded System
Figure 1.3 ASIP Design Flow
Figure 3.1 Traditional Embedded System Design Flow
Figure 3.2 Hardware Software Co-Design Flow
Figure 3.3 CoWare Design Flow
Figure 3.4 CoWare Processor Designer Main Window
Figure 3.5 Instruction Set Designer Window
Figure 3.6 Processor Debugger Window
Figure 4.1 Magma Design Flow
Figure 4.2 Magma Blast Create Flow and Commands
Figure 4.3 Floorplanning within Magma Blast Fusion Flow
Figure 5.1 FPGA Block Structure
Figure 5.2 XUP Virtex-II Pro Development System Board Photo
Figure 5.3 XUP Virtex-II Pro Development System Block Diagram
Figure 5.4 Chipscope Pro Tools Design Flow
Figure 6.1 Operation Hierarchy of Implemented GPP
Figure 6.2 Processor Debugger for the Implemented GPP
Figure 6.3 Operation Profiling for the Desired Application
Figure 6.4 Instruction Set Designer Window of the Designed ASIP
Figure 7.1 The Generated HDL Code Structure
Figure 7.2 Simulation Results
Figure 8.1 Layout of the Processor
Figure 8.2 FPGA Design Flow
Figure 8.3 Chipscope Analyzer Waveform for the Current Design
LIST OF TABLES
Table 5.1 XC2VP30 Device Features
Table 6.1 Modes of Implementation in CoWare Design
Table 6.2 Instructions of Implemented GPP
Table 7.1 Synthesis Results
Table 8.1 Target Device (XC2VP30) Utilization
CHAPTER 1 INTRODUCTION
1.1. Implementation of DSP Application
There are various ways of implementing a DSP application. They are:
1.1.1. Implementation on General Purpose Processor (GPP)
Many DSP applications, with or without real-time requirements, can be implemented on a
general-purpose processor (GPP). There are two reasons for implementing a DSP
application on a general-purpose processor:
• To supply the application to the final user in the shortest possible time.
• To use this implementation as a reference model for the design of an embedded
system.
1.1.2. Implementation on General Purpose DSP Processor
Many DSP applications are implemented using a general-purpose DSP (an off-the-shelf
processor). Here, general-purpose DSP stands for a DSP that is available from a
semiconductor supplier and is not targeted at a specific class of DSP applications. A
general-purpose DSP has a general assembly instruction set that provides good flexibility
for many applications. However, high flexibility usually means fewer application-specific
features and less acceleration of both arithmetic and control operations. Therefore, a
general-purpose DSP is not suitable for applications with very high performance
requirements. High flexibility also means that the chip area will be large. A general-purpose
DSP processor can be used for the initial version of a product because the system design
time will be short. When the production volume has gone up, a DSP ASIP could replace the
general-purpose processor in order to reduce the component cost.
1.1.3. Implementation on Application Specific Integrated Circuit (ASIC)
There are two cases in which an ASIC is needed for digital signal processing. The first is to
meet extreme performance requirements. In this case, a programmable device would not
be able to handle the processing load. The second is to achieve ultra-low power or
ultra-low silicon area when the algorithm is stable and simple. In this case, there is no
requirement on flexibility, and a programmable solution is not needed.
An ASIC implementation maps algorithms directly to an integrated circuit. Compared with
a programmable device, which can change its function at every clock cycle, an ASIC has very
limited flexibility. It can be made configurable to some extent in order to accommodate very
similar algorithms, but typically it cannot be updated in every clock cycle.
1.1.4. Implementation on Application Specific Instruction Set Processor (ASIP)
A DSP ASIP has an instruction set optimized for a single application or a class of
applications. On the one hand, a DSP ASIP is a programmable machine with a certain level
of flexibility, which allows it to run different software programs. On the other hand, its
instruction set is designed around specific application requirements, making the
processor very suitable for these applications. Low power consumption, high
performance, and, through high-volume manufacturing, low cost can be achieved. The
specialization of an ASIP provides a trade-off between the flexibility of a general-purpose
CPU and the performance of an ASIC. ADLs such as LISA, EXPRESSION, and MIMOLA can be
used to describe and design these processors.
A DSP ASIP has a dedicated instruction set and dedicated data types. When designing
a DSP ASIP, functions are mapped to subroutines consisting of assembly instructions.
When designing an ASIC, the algorithms are mapped directly to circuits. However, most
DSP applications are so complicated that mapping functions to circuits is becoming
increasingly difficult. Mapping DSP functions to an instruction set, on the other hand, is
becoming more popular because the complexity is divided between software
and hardware and conquered separately in each.
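The effect of tailoring an instruction set to an application can be illustrated with a small sketch. The Python fragment below is only an illustration under stated assumptions: the mnemonics (CLR, LD, MUL, ADD, MAC) and the two hypothetical instruction sets are invented here and are not taken from the processor designed in this thesis. It lists the instructions needed for an n-term dot product, first on a generic instruction set and then on one that offers a single multiply-accumulate instruction with operand fetch.

```python
# Hypothetical instruction traces; the mnemonics are illustrative assumptions,
# not the instruction set of the processor described in this thesis.

def dot_product_generic(n):
    """Instruction trace for an n-term dot product on a generic instruction set."""
    trace = ["CLR acc"]
    for _ in range(n):
        # Two loads, one multiply and one add per term.
        trace += ["LD a", "LD b", "MUL t, a, b", "ADD acc, acc, t"]
    return trace


def dot_product_asip(n):
    """Same function on an instruction set with a MAC that also fetches operands."""
    return ["CLR acc"] + ["MAC acc, (a)+, (b)+"] * n


if __name__ == "__main__":
    n = 8
    print(len(dot_product_generic(n)), "instructions on the generic set")
    print(len(dot_product_asip(n)), "instructions with a dedicated MAC")
```

Under these assumptions the generic version needs 1 + 4n instructions and the specialized version 1 + n, which is the kind of saving an application-specific instruction is meant to provide.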
1.2. DSP Processor Architecture
Figure 1.1 shows a simplified block diagram of a DSP processor architecture.
Figure 1.1 DSP Processor Architecture
As shown in the figure, a DSP processor contains five key components:
• Program memory (PM): PM is used for storing programs (in binary machine
code). PM is part of the control path.
• Programmable FSM: It is a programmable finite state machine consisting of a
program counter (PC) and an instruction decoder (ID). It supplies addresses to the
program memory for fetching instructions. Meanwhile, it also performs
instruction decoding and supplies control signals to the data processing unit and
data addressing unit.
• Data memory and data memory addressing: DM stores information to be
processed. Three types of data are stored in DM: input/output data, intermediate
data in a computing buffer (a part of the data memory), and parameters or
coefficients. The data memory addressing unit is controlled by programmable
FSM and supplies addresses to data memories.
• Data processing unit (DU): The data processing unit, or datapath, performs
arithmetic and logic computing. A DU includes at least a register file (RF), a
multiplication and accumulation unit (MAC), and an arithmetic logic unit (ALU).
A data processing unit may also include some special or accelerated functions.
• Input/output unit (I/O): I/O serves as an interface for functional units connected
to the outside world. I/O also handles the synchronization of external signals.
Memory buses and peripherals are also included.
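To make the role of the MAC unit and the data memory concrete, the following Python sketch computes an FIR convolution as a sequence of multiply-accumulate steps. It is a minimal illustration, not the assembly program used in this thesis: the function names mac and fir_convolve are hypothetical, and plain Python lists stand in for the data memory that holds the input samples, the coefficients, and the output.

```python
# Illustrative sketch only: the names mac and fir_convolve are assumptions
# introduced here, not part of the processor or tools described in the thesis.

def mac(acc, a, b):
    """One multiply-accumulate step, as performed by a MAC unit."""
    return acc + a * b


def fir_convolve(x, h):
    """Full convolution of input samples x with FIR coefficients h."""
    n_out = len(x) + len(h) - 1
    y = [0] * n_out
    for n in range(n_out):
        acc = 0  # accumulator register, cleared for each output sample
        for k in range(len(h)):
            i = n - k  # data address of the input sample paired with h[k]
            if 0 <= i < len(x):
                acc = mac(acc, h[k], x[i])
        y[n] = acc
    return y


if __name__ == "__main__":
    print(fir_convolve([1, 2, 3], [1, 1]))  # prints [1, 3, 5, 3]
```

The inner loop corresponds to the work of the data processing unit, while the index computation plays the part of the data memory addressing unit.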
1.3. Embedded System Overview
An embedded system is a special-purpose computer system designed to perform one
dedicated function or a class of dedicated functions. In contrast, a general-purpose
computer, such as a personal computer, can do many different tasks, depending on its
programming. An embedded system could be a component of a personal computer, such as
a keyboard controller, mouse controller, or wireless modem. An embedded system could
also be a digital subsystem inside a mobile phone, a digital camera, a digital TV, or medical
equipment. Except for general-purpose computers, most microelectronic systems are
embedded systems. Within its specific application domain, an embedded system may have
much higher performance or much lower power consumption than a general-purpose
computer system.
1.4. DSP in an Embedded System
DSP processors are essential components in many embedded systems. One or several
DSP processors constitute the DSP subsystem of an embedded system. A general embedded
system, including a DSP subsystem, is shown in Figure 1.2. Such a system is also called a
system on a chip (SoC) platform for embedded applications.
Figure 1.2 DSP Processor in an Embedded System
The system in Figure 1.2 can be divided into four parts:
• The first part is the microcontroller (MCU), which is the master of the chip or the
system. The MCU is responsible for handling miscellaneous tasks, except
computing for real-time algorithms.
• The second part is the ASIP DSP subsystem, which is the main computing engine
of the system. All heavy computing tasks should be allocated to this subsystem.
• The third part is the memory subsystem, which supports data and program storage
for the DSP subsystem and the MCU.
• The fourth part consists of peripherals, including high-speed and low-speed I/Os.
1.5. ASIP Design Flow
The first and most important step in the design of a processor is the instruction set design.
The instruction set design is a trade-off among a multitude of parameters including
performance, functional coverage, flexibility, power consumption, silicon cost, and
design time. Figure 1.3 shows the general ASIP design flow.
The ASIP design flow starts from the requirement specification and finishes after the
microarchitecture design. The design of an ASIP is based mostly on experience, and it is
essential to minimize the cost of design iteration.
Figure 1.3 ASIP Design Flow
CHAPTER 2
APPLICATION DESCRIPTION LANGUAGE
2.1. Introduction
Architecture description languages (ADLs) offer promising avenues for
fast design-space exploration while leaving enough room to optimize target-specific
architectures. The advantages offered by ADL-based design are as follows:
• Faster design-space exploration;
• Seamless integration of the components through the automatic generation of the
software tool-chain (simulator, high-level language compiler, assembler, etc.) as
well as the RTL description of the processor;
• A higher abstraction level, which hides the details of the
implementation and thereby lets the designer manage increasingly complex
processor designs.
These merits, coupled with the narrowing application domains of processors,
encourage the modelling of application specific instruction-set processors (ASIPs) using
ADLs. Examples of ADLs include LISA, EXPRESSION, MIMOLA, and nML. The ADL
used in this project is LISA.
2.2. LISA Modelling Fundamentals
The scope of LISA is reflected in the expansion of its acronym, that is,
"Language for Instruction-Set Architectures". LISA is suited to model any architecture
that is driven by an instruction set, in other words, any architecture whose behavior is
steered by the content of a dedicated resource, which we call the instruction resource. The
language elements are generic enough to cover any kind of target architecture, such as
general-purpose (GP) processors, RISC processors, DSPs, ASIPs, and special-purpose
co-processors.
Resources and operations are the basic objects of a LISA processor model. Resources
describe the storage elements of the processor. Operations are the basic language
elements that describe the complete transition functions of the processor, including both
the instructions and the instruction-independent functions such as the fetch mechanism.
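The split between resources and operations can be sketched outside LISA as well. The Python fragment below is a conceptual illustration only and is not LISA syntax: the resource names (PC, IR, R, prog_mem), the three instructions (LDI, ADD, HALT), and all function names are assumptions made for this example. Resources are modelled as a dictionary of storage elements, and each operation is a function that updates them; the fetch operation is instruction-independent, while the remaining operations are selected by the content of the instruction resource.

```python
# Conceptual sketch of resources and operations; this is NOT LISA syntax.
# All resource, instruction and function names are hypothetical.

def make_resources():
    """Resources: the storage elements of the toy processor."""
    return {
        "PC": 0,           # program counter
        "IR": None,        # instruction resource that steers the behavior
        "R": [0] * 4,      # register file
        "prog_mem": [],    # program memory
        "halted": False,
    }


def op_fetch(res):
    """Instruction-independent operation: load IR and advance PC."""
    res["IR"] = res["prog_mem"][res["PC"]]
    res["PC"] += 1


def op_ldi(res, rd, imm):
    """Load an immediate value into register rd."""
    res["R"][rd] = imm


def op_add(res, rd, rs):
    """Add register rs to register rd."""
    res["R"][rd] += res["R"][rs]


def op_halt(res):
    """Stop the simulation loop."""
    res["halted"] = True


INSTRUCTIONS = {"LDI": op_ldi, "ADD": op_add, "HALT": op_halt}


def op_main(res):
    """One processor step: fetch, then dispatch on the instruction resource."""
    op_fetch(res)
    name, *operands = res["IR"]
    INSTRUCTIONS[name](res, *operands)


def run(program):
    """Execute a program until a HALT instruction is reached."""
    res = make_resources()
    res["prog_mem"] = program
    while not res["halted"]:
        op_main(res)
    return res


if __name__ == "__main__":
    final = run([("LDI", 0, 5), ("LDI", 1, 7), ("ADD", 0, 1), ("HALT",)])
    print(final["R"][0])  # prints 12
```

Keeping the fetch step separate from the per-instruction functions mirrors the distinction drawn above between instruction-independent functions and the instructions themselves.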
2.3. Modelling Processor Resources.
Processor resources include the internal storage elements of the processor as well as
dedicated input/output pins and global variables. The internal storage elements of the
processor are represented by its registers and its internal memories.
Although cycle-accurate models have other types of processor resources, such as pipeline
registers and interconnect signals, this chapter confines itself to registers, memories, and
pins.
Processor resources are declared in the resource section, which is indicated by the
keyword RESOURCE, followed by the section body limited by braces as shown below: