ASIC SYSTEM LAB./AJOU UNIV.
Contents
● Digital Signal Processing
● Basic Architectures for DSP Algorithms
● Comparison with Microprocessors
● Fixed-Point DSP Chips : DSP56100 (Motorola)
● Multimedia DSP Chips
  ◆ MediaProcessor
  ◆ TriMedia
● Trends of Future DSPs
● VLSI Architectures for Communications
  ◆ Fast Fourier Transform
  ◆ Viterbi Decoder
  ◆ Reed-Solomon Decoder
  ◆ Equalizer
What is Digital Signal Processing?
● Analog Signal vs. Digital Signal
  ◆ Analog Signal : Continuous Time and Continuous Amplitude
  ◆ Discrete Time Signal : Discrete Time and Continuous Amplitude
  ◆ Digital Signal : Discrete Time and Discrete Amplitude
● Advantages of Digital Signal Processing
  ◆ Guaranteed Accuracy
    → Determined by the Specified Sampling Rate, Word Length and Algorithm
    → Independent of Time, Temperature, Humidity
  ◆ Low Sensitivity to Noise; Errors are Correctable
  ◆ Digital System : Smaller, Cheaper, Lower Power because of VLSI
  ◆ Flexibility of System : Reprogrammable
  ◆ Reliable & Predictable
● Disadvantages
  ◆ Finite Sampling Rate & Word Length Problem
  ◆ Wide Bandwidth for Data Transfer
Why Digital Signal Processor?
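[Figure: Analog Domain vs. Digital Domain. In the analog domain, each function (Low-pass Filter, High-pass Filter, Amplifier, Convolver, Fourier Transform) needs its own analog system between the analog input and the analog output. In the digital domain, the analog signal passes through an A/D converter, a single DSP runs many algorithms on the digital signal, and a D/A converter returns an analog signal.]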
DSP Algorithms
● Convolution
  $$y[n] = \sum_{k=0}^{\infty} h[k]\,x[n-k]$$
  ◆ Basic Output Sequence of LTI Digital Systems
● Correlation
  $$y[k] = \sum_{n=0}^{N-1} x_1[n]\,x_2[n+k]$$
  ◆ Signal Matching
● Discrete Fourier Transform (DFT)
  $$X[k] = \sum_{n=0}^{N-1} x[n]\exp(-j2\pi kn/N)$$
  ◆ Spectral Analysis of Signals
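The three sums on this slide can be written out directly. The following is a minimal Python sketch, not an optimized implementation; the function names `convolve`, `correlate` and `dft` are illustrative, and the sequences are assumed to be finite and zero outside their stored samples.

```python
import cmath

def convolve(h, x):
    """y[n] = sum_k h[k] x[n-k] for finite-length h and x."""
    y = [0.0] * (len(h) + len(x) - 1)
    for k, hk in enumerate(h):
        for m, xm in enumerate(x):
            y[k + m] += hk * xm   # x[m] contributes to output index n = k + m
    return y

def correlate(x1, x2, k):
    """sum_n x1[n] x2[n+k]; x2 is treated as zero outside its samples."""
    return sum(x1[n] * x2[n + k]
               for n in range(len(x1)) if 0 <= n + k < len(x2))

def dft(x):
    """X[k] = sum_n x[n] exp(-j 2 pi k n / N), computed term by term."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]
```

Every inner step is a multiply followed by an accumulate, which is the operation the DSP data path is built around.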
DSP Algorithms (cont.)
● Z-Transform
  $$X(z) = \sum_{n=0}^{\infty} x[n]\,z^{-n}$$
  ◆ System and Signal Analysis
● Finite Impulse Response (FIR) Filtering
  $$y[n] = \sum_{k=0}^{N-1} h[k]\,x[n-k]$$
  ◆ Linear Phase and Stable Response Filtering
● Infinite Impulse Response (IIR) Filtering
  $$y[n] = \sum_{k=0}^{N} a_k\,x[n-k] - \sum_{k=1}^{M} b_k\,y[n-k]$$
  ◆ Sharper Cutoff Filtering than FIR with the Same Number of Taps
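The FIR and IIR difference equations can be sketched sample by sample as below. This is an illustrative Python sketch under stated assumptions: the names `fir` and `iir` are hypothetical, samples before n = 0 are taken as zero, and `b[0]` is a placeholder that is never read because the feedback sum starts at k = 1.

```python
def fir(h, x):
    """y[n] = sum_{k=0}^{N-1} h[k] x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        y.append(sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0))
    return y

def iir(a, b, x):
    """y[n] = sum_{k=0}^{N} a[k] x[n-k] - sum_{k=1}^{M} b[k] y[n-k].

    b[0] is unused; the feedback sum starts at k = 1.
    """
    y = []
    for n in range(len(x)):
        acc = sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        acc -= sum(b[k] * y[n - k] for k in range(1, len(b)) if n - k >= 0)
        y.append(acc)
    return y
```

For example, `iir([1.0], [0.0, -0.5], [1, 0, 0, 0])` gives the decaying impulse response 1, 0.5, 0.25, 0.125: one feedback tap already produces an infinitely long response, which a FIR filter can only approximate with many taps.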
Basic Architecture for DSP Algorithms
[Figure: basic DSP architecture. Blocks: Program Control Unit, Address Generation Unit, Instruction Memory, X Data Memory, Y Data Memory, Multiplier (X*Y), Adder & Accumulator (A = X*Y + A). Buses: Instruction bus, X Data bus, Y Data bus, X Address bus, Y Address bus.]
Microprocessor System Block Diagram
CPU Block Diagram
MCU Block Diagram
Micro-Instruction Format
Comparisons with Microprocessors
● Harvard Architecture
  ◆ X & Y Data Memories, Instruction Memory
● Multi-Bus Structure
  ◆ Minimize Bottleneck Problem
● Three Separate Parallel Units
  ◆ Data Calculation Unit
  ◆ Program Control Unit
  ◆ Address Generation Unit
  <Example> MAC x1, y1, A    X:(R0)+, y1    X:(R3)+, x1
● Program Address Generation
● Instruction Decoding
● Hardware Do Loop Control
● Interrupt Control
● Components
  ◆ Program Counter (PC)
  ◆ Loop Address (LA) : Address of the End of the Loop
  ◆ Loop Counter (LC) : Number of Iterations
  ◆ Status Register (SR)
  ◆ Operating Mode Register (OMR)
  ◆ Stack Pointer (SP)
  ◆ System Stack : Store PC and SR for Subroutine Call and
    → Poor Code Density (VLIW) vs. Good Code Density (RISC-SS)
    → High Power (VLIW) vs. Low Power (RISC-SS)
    → Difficult (VLIW) vs. Easy (RISC-SS) to Program by Hand
  ◆ High Level Languages suitable for Parallel Architectures
● Architecture Driven Algorithms for Multimedia Functions
● Hardware / Software Co-design Approach should be used for
● Future DSP Chips should be
  ◆ Low Price
  ◆ Programmable
  ◆ 20 GOPS DSP Chip in the Year 2000
MPU History
[Figure: microprocessor performance in MIPS (logarithmic axis, ticks from 0.1 to 1000) versus year (1980 to 1995). Labels on the chart: Cache, 32 bit, RISC, Superscaler, 64 bit, out-of-order, DSP, Ultrasparc, "10GIPS @2000", and the note "Break-through would be VLIW/Multiprocessor".]
Programmable DSP Chips
[Figure: performance of programmable DSP chips in MOPS (logarithmic axis, ticks at 10, 100, 1000) and Clock versus year (1980 to 2000, with ticks at 1984, 1988, 1992, 1996). Curves: DSP for Audio & Speech, DSP for Video, DSP for Multimedia. Labels on the chart: NEC, TI, ATT, SIMD, MPEG-2 Dec, "SIMD, VLIW, RISC-SS".]
Fast Fourier Transform
● Fast Fourier Transform (FFT) Algorithm
  ◆ A Fast Algorithm for Computing the Discrete Fourier Transform (DFT)
  ◆ Reduces Computation
  ◆ FFT Method : Radix-2, Radix-4
● Example : Orthogonal Frequency Division Multiplexing (OFDM)
[Figure: OFDM system block diagram. Tx data, Serial to Parallel, IFFT, Parallel to Serial, Transmit Filter, Channel (signal x(t), noise n(t), and a frequency/phase offset term in 2πΔf, t and φ), Serial to Parallel, FFT, Parallel to Serial, Rx data.]
Reference : JCCI’98 pp 879~883
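FFT Algorithm
● Radix-2 Butterfly Algorithm
  $$\mathrm{OUT}_0 = \mathrm{IN}_0 + \mathrm{IN}_1$$
  $$\mathrm{OUT}_1 = (\mathrm{IN}_0 - \mathrm{IN}_1)\,W_N^{k}$$
● Radix-4 Butterfly Algorithm
  $$\mathrm{OUT}_0 = (\mathrm{IN}_0 + \mathrm{IN}_2) + (\mathrm{IN}_1 + \mathrm{IN}_3)$$
  $$\mathrm{OUT}_1 = [(\mathrm{IN}_0 - \mathrm{IN}_2) - j(\mathrm{IN}_1 - \mathrm{IN}_3)]\,W_N^{k}$$
  $$\mathrm{OUT}_2 = [(\mathrm{IN}_0 + \mathrm{IN}_2) - (\mathrm{IN}_1 + \mathrm{IN}_3)]\,W_N^{2k}$$
  $$\mathrm{OUT}_3 = [(\mathrm{IN}_0 - \mathrm{IN}_2) + j(\mathrm{IN}_1 - \mathrm{IN}_3)]\,W_N^{3k}$$
  where $W_N^{k} = e^{-j2\pi k/N}$
[Figure: signal-flow graphs of the radix-2 butterfly (inputs IN0, IN1; outputs OUT0, OUT1; twiddle factor W_N^k) and the radix-4 butterfly (inputs IN0 to IN3; outputs OUT0 to OUT3; branch gains including 1, -1 and j; twiddle factors W_N^k, W_N^2k, W_N^3k).]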
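The two butterflies can be checked numerically. The sketch below is an illustrative Python version: the function names are hypothetical, and the recursive decimation-in-frequency arrangement in `fft_radix2_dif` is one common way to combine radix-2 butterflies, assumed here rather than taken from the slides.

```python
import cmath

def twiddle(N, k):
    """W_N^k = exp(-j 2 pi k / N)."""
    return cmath.exp(-2j * cmath.pi * k / N)

def radix2_butterfly(in0, in1, w):
    """OUT0 = IN0 + IN1, OUT1 = (IN0 - IN1) * W."""
    return in0 + in1, (in0 - in1) * w

def radix4_butterfly(in0, in1, in2, in3, w1, w2, w3):
    """Radix-4 butterfly; w1, w2, w3 stand for W_N^k, W_N^2k, W_N^3k."""
    out0 = (in0 + in2) + (in1 + in3)
    out1 = ((in0 - in2) - 1j * (in1 - in3)) * w1
    out2 = ((in0 + in2) - (in1 + in3)) * w2
    out3 = ((in0 - in2) + 1j * (in1 - in3)) * w3
    return out0, out1, out2, out3

def fft_radix2_dif(x):
    """Decimation-in-frequency FFT built from radix-2 butterflies.

    len(x) must be a power of two. The result is in natural order.
    """
    N = len(x)
    if N == 1:
        return list(x)
    half = N // 2
    top, bottom = [], []
    for k in range(half):
        o0, o1 = radix2_butterfly(x[k], x[k + half], twiddle(N, k))
        top.append(o0)       # feeds the even-indexed outputs
        bottom.append(o1)    # feeds the odd-indexed outputs
    out = [0j] * N
    out[0::2] = fft_radix2_dif(top)
    out[1::2] = fft_radix2_dif(bottom)
    return out
```

With N = 4 and k = 0 all twiddle factors equal 1, so a single radix-4 butterfly is already a complete 4-point DFT of its inputs.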
Radix-2 FFT Architecture
● Butterfly Architecture
[Figure: radix-2 butterfly data path. Inputs IN0 and IN1; Latches; PROM; SUB and ADD units forming Re(P-Q) and Im(P-Q); multipliers MUL1 and MUL2 followed by an ADD; outputs Out0 and Out1.]
● Number of butterflies (N-point) : N/2(log2N-1)
● Number of complex adders : N(log2N-1)
● Number of complex multipliers : N/2(log2N-1)
Radix-4 FFT Architecture
● Butterfly Architecture
[Figure: radix-4 butterfly data path. Inputs Input0 to Input3; CSA and CLA adder blocks; Multipliers fed by the Real Coeff and the Imaginary Coeff; Real Part and Imaginary Part outputs; output label OUT1.]
● Number of butterflies (N-point) : N/2(log4N-1)
● Number of complex adders : N(log4N-1)
● Number of complex multipliers : 3N/4(log4N-1)
Comparison between Radix-4 and Radix-2
● Algorithm comparisons

            Number of stages   Number of Additions   Number of Multiplications
  Radix-2   log2N-1            N(log2N-1)            N/2(log2N-1)
  Radix-4   log4N-1            N(log4N-1)            3N/4(log4N-1)

  * data : Complex number
  - Radix-4 reduces the number of additions and multiplications compared with radix-2
● Architecture comparisons
  - Butterfly architecture of radix-4 is more complex than that of radix-2
  - However, as N increases, the gate count of radix-2 increases more sharply than that of radix-4
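To see the size of the difference, the expressions in the comparison table can be evaluated for a concrete N. The Python sketch below takes the slide's expressions at face value (including the "-1" in the stage count) and is only an illustration; the helper names are hypothetical, and N is assumed to be a power of 4 so that both radices apply.

```python
def ilog(n, base):
    """Exact integer logarithm; raises ValueError if n is not a power of base."""
    count = 0
    while n > 1:
        if n % base:
            raise ValueError(f"{n} is not a power of {base}")
        n //= base
        count += 1
    return count

def radix2_counts(n):
    """Radix-2 row of the table: log2N-1, N(log2N-1), N/2(log2N-1)."""
    s = ilog(n, 2) - 1
    return {"stages": s, "additions": n * s, "multiplications": n // 2 * s}

def radix4_counts(n):
    """Radix-4 row of the table: log4N-1, N(log4N-1), 3N/4(log4N-1)."""
    s = ilog(n, 4) - 1
    return {"stages": s, "additions": n * s, "multiplications": 3 * n // 4 * s}

for n in (64, 256, 1024):
    print(n, radix2_counts(n), radix4_counts(n))
```

For N = 64 the table's expressions give 320 complex additions and 160 complex multiplications for radix-2, against 128 and 96 for radix-4.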
Convolutional Encoder for VITERBI Algorithm
● Convolutional Encoder consists of Two Components
  ◆ Shift Register : Holds K-1 Bits (Number of Shift Register Stages)
  ◆ v Modulo-2 Adders : v Bits are Output
  ◆ Example : K = 3, r = 1/2 Convolutional Encoder
[Figure: K = 3, r = 1/2 convolutional encoder. The Input Bit feeds shift register stages m1 and m2; two Modulo 2 Adders (XOR) produce the Coded Bit Streams c0 and c1, which go to the Punctured Logic. Generators (MSB to LSB): Gc0 = 101 (binary) = 5 (octal), Gc1 = 111 (binary) = 7 (octal).]
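The K = 3, r = 1/2 example with generators 101 and 111 can be sketched in a few lines of Python. This is a minimal illustration: the function name `conv_encode` is hypothetical, the encoder is assumed to start in the all-zero state, the newest bit is assumed to sit at the MSB of the register, and each output pair is returned as (c0, c1).

```python
def conv_encode(bits, g0=0b101, g1=0b111, K=3):
    """Rate-1/2 convolutional encoder.

    The register holds the current input bit (MSB) followed by the K-1
    previous bits. Each output bit is the modulo-2 sum (XOR) of the
    register taps selected by a generator.
    """
    state = 0                               # K-1 stored bits, all zero at start
    out = []
    for u in bits:
        reg = (u << (K - 1)) | state        # [u, m1, m2]
        c0 = bin(reg & g0).count("1") % 2   # modulo-2 adder for Gc0
        c1 = bin(reg & g1).count("1") % 2   # modulo-2 adder for Gc1
        out.append((c0, c1))
        state = reg >> 1                    # shift: the oldest bit drops out
    return out

# Input 1 0 1 1 followed by two zero tail bits that return the encoder to state 0.
print(conv_encode([1, 0, 1, 1, 0, 0]))
```

Two output bits per input bit give the code rate r = 1/2.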
VITERBI Decoding
  ◆ Trace-Back (TB)
    → The Length of the Trace-Back is defined as the TB Depth
    → Usually, TB Depth = K x 5 or 6
    → After the TB Depth is filled, Trace Back through the TB Memory and Decode the Received Code
Punctured Code
● Punctured Code : A Modified Coding Scheme
  ◆ Increases Code Rate
  ◆ Decreases Coding Gain (c.f. Coding Gain is $10\log(P_{\text{without FEC}}/P_{\text{with FEC}})$)
  ◆ Example : r = 3/4 Punctured Convolutional Code
[Figure: r = 3/4 puncturing. A Rate 1/2 Convolutional Encoder produces two coded streams, stream 0 and stream 1, shown for time indices (1) to (6); the Rate 3/4 Puncture block removes the symbols marked "Symbol Deleted (Punctured)" according to the Deleting Matrix with rows 011 and 101. The surviving symbols are labeled 0(1), 1(1), 1(2), 0(3), 0(4), 1(4), 1(5), 0(6).]
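Puncturing and the matching depuncturing step can be sketched as follows. This Python sketch rests on an assumption: the keep patterns `PATTERN_C0 = (1, 0, 1)` and `PATTERN_C1 = (1, 1, 0)` are one reading of the slide's deleting matrix and surviving-symbol labels, and the function names are hypothetical. Any pattern that keeps 4 of every 6 coded bits yields rate 3/4.

```python
# Assumed keep patterns over a period of 3 input bits (1 = keep, 0 = delete).
PATTERN_C0 = (1, 0, 1)
PATTERN_C1 = (1, 1, 0)

def puncture(pairs):
    """Drop coded bits according to the keep patterns; returns a flat list."""
    out = []
    for i, (c0, c1) in enumerate(pairs):
        if PATTERN_C0[i % 3]:
            out.append(c0)
        if PATTERN_C1[i % 3]:
            out.append(c1)
    return out

def depuncture(symbols, n_pairs, erasure=None):
    """Rebuild (c0, c1) pairs, marking every deleted position as an erasure."""
    it = iter(symbols)
    pairs = []
    for i in range(n_pairs):
        c0 = next(it) if PATTERN_C0[i % 3] else erasure
        c1 = next(it) if PATTERN_C1[i % 3] else erasure
        pairs.append((c0, c1))
    return pairs

coded = [(1, 1), (0, 1), (0, 0), (1, 0), (1, 0), (1, 1)]   # 6 input bits, 12 coded bits
sent = puncture(coded)                                     # 8 bits: rate 6/8 = 3/4
print(sent, depuncture(sent, len(coded)))
```

A decoder would typically give an erased position zero weight in the branch metric, which is why puncturing raises the code rate at the cost of coding gain.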
Trellis Diagram
● Trellis Diagram for PMC (previous BM)
  ◆ Example : K = 3, r = 1/2 Convolutional Code
  ◆ Branch Metric is Hamming Distance (Hard decision, # of different bits) or Euclidean Distance (Soft decision, difference of decimal code) between Received Code and Branch Word
[Figure: trellis diagram over the states 00, 10, 01, 11 for eight time steps, with branch words on the branches and the accumulated metric at each node. Legend: Path Metric, Branch Metric, Trace-Back. Rows below the trellis: Input signal, Coded signal, Received signal, Corrected signal.]
VLSI Architectures for VITERBI Algorithm
● Viterbi Decoder Architecture
  ◆ Depunctured Logic : Used if the Received Code is a Punctured Code
  ◆ BMC : Hard/Soft Decision
  ◆ ACS : After ACS, the Result is Stored in the PM Memory
  ◆ TB : Trace-Back
[Figure: Viterbi decoder block diagram. Blocks: Depunctured Logic, Branch Metric Calculate, Add Compare Select, Path Metric Memory, Trace Back Memory. Signals: Received Code (x-bit inputs), branch metrics BM00, BM10, BM01, BM11, selection bits 0/1, Decoding Bit.]
If Hard decision, x is 1-bit
If Soft decision, x is 3-bit
If upper path is smaller, TB stores 0
If lower path is smaller, TB stores 1
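The BMC, ACS and TB steps can be put together in a small hard-decision decoder. The Python sketch below is an illustration under stated assumptions, not the architecture on the slide: it stores the whole trace-back history instead of a fixed TB depth, the state convention (newest bit at the MSB) is a choice made here, and the decision bit 0/1 identifies the predecessor whose oldest register bit is 0/1, which is assumed to correspond to the slide's upper/lower path.

```python
def parity(v):
    return bin(v).count("1") % 2

def viterbi_decode(received, g0=0b101, g1=0b111, K=3):
    """Hard-decision Viterbi decoder for a rate-1/2 code.

    received: list of (r0, r1) bit pairs. The encoder is assumed to
    start in state 0.
    """
    n_states = 1 << (K - 1)
    INF = float("inf")
    pm = [0] + [INF] * (n_states - 1)        # path metric memory
    tb = []                                  # trace-back memory: one row per step
    for r0, r1 in received:
        new_pm = [INF] * n_states
        decisions = [0] * n_states
        for ns in range(n_states):           # next state
            u = ns >> (K - 2)                # input bit that leads into ns
            for b in (0, 1):                 # the two candidate predecessors
                s = ((ns << 1) & (n_states - 1)) | b
                reg = (u << (K - 1)) | s
                # BMC: Hamming distance between branch word and received pair
                bm = (parity(reg & g0) != r0) + (parity(reg & g1) != r1)
                cand = pm[s] + bm            # Add
                if cand < new_pm[ns]:        # Compare, Select
                    new_pm[ns] = cand
                    decisions[ns] = b
        pm = new_pm
        tb.append(decisions)
    # TB: start from the best final state and walk the stored decisions backwards.
    state = min(range(n_states), key=pm.__getitem__)
    bits = []
    for decisions in reversed(tb):
        bits.append(state >> (K - 2))
        state = ((state << 1) & (n_states - 1)) | decisions[state]
    bits.reverse()
    return bits

# Codeword for input 1 0 1 1 0 0, with one bit error injected in the third pair.
rx = [(1, 1), (0, 1), (1, 0), (1, 0), (1, 0), (1, 1)]
print(viterbi_decode(rx))
```

In hardware the trace-back would run over a sliding window of the TB depth (K x 5 or 6) rather than over the whole received block.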