RICE UNIVERSITY
DSP architectures for wireless communications
Sridhar Rajagopal
Department of Electrical and Computer Engineering
Rice University, Houston TX
ECE Pizza Talk March 28, 2003
This work has been supported in part by Nokia, TI, TATP and NSF
Future wireless devices:
High data rate mobile devices with multimedia
Multiple antennas with complex algorithms, GOPs of computation
Area-Time-Power constraints
Seamless connection across environments and standards
Use the fastest and cheapest available service
Bluetooth/Home Networks
Wireless Cellular
Wireless LAN
Aim of the talk
Design me
Trends
             Past                  Current                Future
Year         1990’s                2002-2005              2006+
Function     Voice                 Data                   Multimedia
Data rates   10’s of Kbps          100’s of Kbps (10x)    10’s of Mbps (10-100x)
Complexity   KOPs                  MOPs (1000x)           GOPs (1000x)
Power        < 500 mW              < 500 mW               < 500 mW
Antennas     Single                Single                 Multiple
Standard     GSM (Europe),         GSM/TDMA/CDMA          GSM/TDMA/CDMA/EDGE/
             CDMA (Qualcomm),      on same device         Wireless LAN/Bluetooth
             TDMA (Nokia)                                 on same device
             (different devices)
FLEXIBILITY
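The Complexity and Power rows above imply that the energy available per operation shrinks by roughly three orders of magnitude per generation. A rough sketch of that arithmetic, assuming KOPs, MOPs and GOPs mean about 10^3, 10^6 and 10^9 operations per second and that the whole 500 mW budget goes to computation (both are illustrative assumptions, not figures from the talk):

```python
# Order-of-magnitude assumptions: the slide gives only KOPs / MOPs / GOPs.
OPS_PER_SECOND = {"past": 1e3, "current": 1e6, "future": 1e9}
POWER_BUDGET_W = 0.5  # "< 500 mW" in every column of the table


def energy_per_op_joules(ops_per_second: float,
                         power_w: float = POWER_BUDGET_W) -> float:
    """Upper bound on energy per operation if all power went to computation."""
    return power_w / ops_per_second


for era, rate in OPS_PER_SECOND.items():
    print(f"{era:8s} {energy_per_op_joules(rate):.1e} J per operation")
```

Under these assumptions the budget falls from about 0.5 mJ to about 0.5 nJ per operation, which is one way to read why flexibility has to come with much better area-time-power efficiency.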
Change in flexibility requirements
Physical Layer
MAC Layer
Network Layer
Application Layer
No change (already flexible)
Maximum change (needs to support multiple environments, algorithms and standards)
Architecture trade-offs
Past: more DSP + less ASIC; Current: less DSP + more ASIC
Reason: need less flexibility OR DSPs not powerful enough?
Can’t we build better DSPs? How much flexibility do we need?
ASICs
Intermediate
Programmable
Area-Time-Power benefits
Flexibility
Time-to-market
Software updates
Problems with current DSPs
Current DSPs
Not enough functional units (FUs) for GOPs of computation
Need 100’s of FUs
Not low power enough!!
Cannot extend to more FUs
Limited Instruction Level Parallelism (ILP)
Limited Subword Parallelism (such as MMX)
Cannot support more registers (area, ports)
Compilers: difficult to find ILP as FUs increase
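The "100’s of FUs" estimate follows from dividing the required operation rate by the clock rate. A minimal sketch of that calculation; the workload and clock values in the example call are hypothetical and chosen only for illustration, not measurements from the talk:

```python
import math


def functional_units_needed(ops_per_second: float, clock_hz: float,
                            ops_per_fu_per_cycle: float = 1.0) -> int:
    """Minimum number of FUs that must be busy every cycle to sustain the load."""
    return math.ceil(ops_per_second / (clock_hz * ops_per_fu_per_cycle))


# Hypothetical example: 10 GOPs of computation at a 100 MHz clock.
print(functional_units_needed(10e9, 100e6))  # 100
```

The estimate assumes every FU does useful work on every cycle, so the real requirement is higher whenever the compiler cannot find enough ILP to keep the units busy.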
Ideal C64x (w/o co-proc) needs ~200 MHz for real-time
SWAPs: Salient features
1-2 orders of magnitude better than a single-processor DSP
Any constraint length: 10 MHz at 128 Kbps
Same code for all constraint lengths
No need to re-compile or load another code
As long as the parallelism/cluster ratio is constant
Power savings due to dynamic cluster scaling
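The "10 MHz at 128 Kbps" figure can be read as a cycle budget per decoded bit. A small sketch of that arithmetic, assuming Kbps means 1000 bits per second (an assumption; the talk does not say):

```python
def cycles_per_bit(clock_hz: float, bits_per_second: float) -> float:
    """Clock cycles available for each bit at a given clock and data rate."""
    return clock_hz / bits_per_second


# 10 MHz clock, 128 Kbps target, both taken from the slide.
print(cycles_per_bit(10e6, 128e3))  # 78.125
```

So a real-time decoder at this operating point has roughly 78 cycles to produce each bit, whatever the constraint length.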
Expected SWAP power consumption
64 clusters and 1 multiplier per cluster:
0.13 micron, 1.2 V
Peak active power: ~9 mW at 1 MHz
Area: ~53.7 mm²
10 MHz, 128 Kbps with reconfiguration
*Exploring the VLSI Scalability of Stream Processors, Brucek Khailany et al, Proceedings of the Ninth Symposium on High Performance Computer Architecture, February 8-12, 2003, Anaheim, California, USA, pp. 153-164
[Figure: power (in mW) vs. active clusters (max 64)]

Viterbi    Clusters used   Peak power
K = 9      64              ~90 mW
K = 7      16              ~28.57 mW
K = 5      4               ~13.8 mW
overhead   0               ~8.1 mW
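The cluster counts in the table are consistent with a Viterbi decoder of constraint length K having 2^(K-1) trellis states, mapped at four states per cluster. The sketch below encodes that reading. The four-states-per-cluster ratio is inferred from the table rather than stated in the talk, and the function names are hypothetical:

```python
STATES_PER_CLUSTER = 4  # inferred: 2**(9 - 1) = 256 states on 64 clusters

# Peak power figures copied from the table above (in mW).
PEAK_POWER_MW = {9: 90.0, 7: 28.57, 5: 13.8}


def viterbi_states(constraint_length: int) -> int:
    """Number of trellis states for constraint length K."""
    return 2 ** (constraint_length - 1)


def clusters_used(constraint_length: int, max_clusters: int = 64) -> int:
    """Active clusters when the states-per-cluster ratio is held constant."""
    clusters = viterbi_states(constraint_length) // STATES_PER_CLUSTER
    return min(max(clusters, 1), max_clusters)


for k in (9, 7, 5):
    print(f"K = {k}: {viterbi_states(k)} states, "
          f"{clusters_used(k)} clusters, ~{PEAK_POWER_MW[k]} mW")
```

Holding the ratio fixed is what lets the same code serve every constraint length, and switching off the unused clusters is where the power savings in the table come from.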
Flexibility vs. performance
Suitable for mobile devices?
SWAPs: Real-time at ~10-100 mW
Maybe; but can we do better?
ASICs: Real-time at ~10-100 µW
No special customization for the application
No application-specific units
Generic inter-cluster communication network
Overhead for extracting parallelism
SWAPs suitable for base-stations?
Why not? – power is not a primary constraint!
Multiuser Estimation-Detection+Decoding
Real-time target: 128 Kbps per user
[Figure: frequency needed to attain real-time (in MHz, 10 to 100000) vs. number of clusters (1 to 100); curves: FAST, MEDIUM, SLOW; annotations: 32-user base-station, Mobile, DSP]
Ideal C64x (w/o co-proc) needs ~15 GHz for real-time
Current research
SWAPs: Completely flexible and general
How do we trade off flexibility for better performance?
Handset SWAPs (H-SWAPs)
H-SWAPs: Potential advantages
[Figure: execution time, SWAPs vs. H-SWAPs; labels below]
SWAPs:
DSP (RE)
SWAP
ASIC/FPGA – Real-time performance
DP
Task Pipelining
Dedicated interconnect
H-SWAPs:
DSP (RE)
H-SWAP
Partial DP + Task Pipelining
Application-specific units
ASIC/FPGA – Real-time performance
Dedicated interconnect
Conclusions
Need flexible architectures for future wireless devices
Higher data rates, lower power, more complex algorithms
Design methodology (SWAPs, H-SWAPs, ASICs)
Flexibility vs. performance trade-offs
Blurs distinction between ASICs and programmable solutions
Also need parallel, low-precision algorithms for efficient mapping