Tutorial, Hot Chips 01
Jan M. Rabaey
BWRC
University of California @ Berkeley
http://www.eecs.berkeley.edu/~jan
Silicon Architectures for Wireless Systems – Part 1
The Fibonacci Law on Wireless Growth
Source: Goldman-Sachs
"The number of world-wide wireless subscribers (in tens of millions) grows as a Fibonacci series"
[Chart: subscribers per year, 1999 to 2003; values shown: 46, 57, 67, ?]
From Handsets to Mobile Devices
Internet access the most important driver (text, graphics, multimedia)
Berkeley Infopad, one of the first wireless internet appliances, 1990-1996
A Smorgasbord of Choices
Axes: NARROWBAND to WIDEBAND, CIRCUIT to PACKET, VOICE to DATA
• 2nd generation (9.6 Kbps): GSM, IS136, IS95, PCS1900, DCS1800, IS54B
• Generation 2.5 (64-384 Kbps): HSCSD, IS95-B, GPRS, IS136+
• 3rd generation (384-2000 Kbps): WCDMA, Edge, cdma2000, HDR, W-TDMA, ARIB-WCDMA, WCDMA-FDD
And the Alternatives
• Metropolitan Wireless Networks
– Various proprietary solutions
• Ricochet (up to 138 Kb/sec)
• Flarion Flash-OFDM (> 384 kBits/sec)
• Wireless LANs
– 802.11 (b,a); HiperLAN
– from 1 to 56 Mbit/sec
– Restricted to the 50 meter range (at present)
(Projected) Growth in 802.11 WLAN
[Chart: units in thousands (0 to 18000) per year, 2001 to 2005; series: 1-2 Mbps, 10-11 Mbps, 20 Mbps+, 54 Mbps, Total]
Source: Cahners In-Stat 2001
The New Internet
The Post-PC Era: The Distributed Approach to Information Processing
[Figure labels: Clusters, Massive Cluster, Gigabit Ethernet]
• Pre-1990: Client-Server Systems
• The 1990s: Conquering the World. The Network revolution
• The 2000s: Extending toward the Small. Enabled by integration and wireless connectivity
The emergence of ad-hoc wireless networks – the wire replacement
• Bluetooth, HomeRF
– up to 800 kBit/sec
• Sensor networks
– Low data-rates
The Evolving Wireless Scene
[Chart: data rate (1 Kb to 100 Mb) versus range (1 m to 10 km); regions: Cellular (WAN), 2.5G Cellular, 3G Cellular, Metropolitan, 802.11 (LAN), 802.11a, Bluetooth (PAN), Sensor networks; arrow label: "More bit/sec"]
Compelling Issues in Wireless (1)
Ubiquitous services put wireless spectrum at a premium
• Effective use of aether hampered by standardization and fragmentation
• Current spectral efficiency far below theoretical limits
• Emerging Solutions
– Adoption of better spectrum utilization techniques (interference cancellation, multi-path fading mitigation and exploitation)
– multi-functional, adaptive systems
• But … huge appetite for computations
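The "theoretical limits" on spectral efficiency can be made concrete with Shannon's capacity formula for an additive white Gaussian noise channel, C/B = log2(1 + SNR). The following is a minimal sketch for illustration only; the function name and the SNR operating points are assumptions and do not come from the tutorial.

```python
import math

def shannon_spectral_efficiency(snr_db: float) -> float:
    """Upper bound on spectral efficiency (bits/s/Hz) of an AWGN channel."""
    snr_linear = 10 ** (snr_db / 10)  # convert dB to a linear power ratio
    return math.log2(1 + snr_linear)

# Hypothetical operating points, chosen only to show the trend.
for snr_db in (0, 5, 10, 20):
    bound = shannon_spectral_efficiency(snr_db)
    print(f"{snr_db:>2} dB -> at most {bound:.2f} bits/s/Hz")
```

Any real system operating at a given SNR delivers fewer bits/s/Hz than this bound; closing the gap is what drives the computational appetite noted above.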
Evolution of MOPS Requirements in Cellular
Single 384 kbps UTRA W-CDMA Channel

Function                  MOPS
Digital RRC Channel       3600
Searcher                  2100
RAKE                      2050
Maximal Ratio Combiner      24
Channel Estimator           12
AGC, AFC                    10
Deinterleaver               15
Turbo Coder                 90
TOTAL                     7901

Source: IEEE Comm Theory Workshop, May 1999
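As a quick check of the MOPS budget, the sketch below re-adds the per-function figures from the table and reports how much of the load sits in the three largest functions. The numbers are copied from the table; the variable names are illustrative.

```python
# MOPS per function, copied from the table for a single 384 kbps UTRA W-CDMA channel.
mops = {
    "Digital RRC Channel": 3600,
    "Searcher": 2100,
    "RAKE": 2050,
    "Maximal Ratio Combiner": 24,
    "Channel Estimator": 12,
    "AGC, AFC": 10,
    "Deinterleaver": 15,
    "Turbo Coder": 90,
}

total = sum(mops.values())

# Share of the total taken by the three most demanding functions.
top_three = sorted(mops.values(), reverse=True)[:3]
top_three_share = sum(top_three) / total

print(f"Total: {total} MOPS")
print(f"Three largest functions: {top_three_share:.1%} of the total")
```

The total matches the 7901 MOPS quoted in the table, and the filtering, searching, and RAKE functions dominate the budget.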
[Chart: DSP MIPS (log scale, 1 to 10000) for GSM, HSCSD, GPRS, and 3G; series: Comm, Application]
The Bliss and Challenge of Error Coding
The Cost of Approaching Shannon's Bound
[Chart: relative complexity (log scale, 1 to 100000) versus SNR (dB, 0 to 12)]
Courtesy Engling Yeo, UCB
Curves and markers:
• 8/9 Turbo, =4, N=4k
• 2/3 Turbo, =4, N=64k; 1, 2, and 3 iterations
• 1/2 Turbo, =4, N=64k; 1, 2, and 3 iterations
• 8/9 LDPC, N=4k; 1, 3, and 5 iterations
• 8/9 Conv. Code, =3, N=4k
• 1/2 Conv. Code, =4, N=64k
• 2/3 Conv. Code, =4, N=64k
• 8/9 Capacity Bound
• 2/3 Capacity Bound
• 1/2 Capacity Bound
• 1/2 LDPC, N=10^7, 1100 iterations
for BER of 10^-5
Dealing with Non-ideal Channels (e.g., fading)
• Multi-antenna approach exploits multi-path fading by sending data along good channels
• Results in large theoretical improvements in bandwidth efficiency for fading channels
• But … computationally hungry
[Figure: signal x(t) arriving at an antenna array (Array Processing) over two paths; labels: "1st path, 1 = 1" and "2nd path, 2 = 0.6"]
[Chart: capacity (bits/s/Hz, 0 to 30) versus SNR (dB, 0 to 25); curves: (4,4) With Feedback, (4,4) No Feedback, (4,1) Orthogonal Design, (1,1) Baseline]
The Cost of Dealing with Non-ideal Channels
[Chart: performance in MIPS (0 to 7000) for five receiver architectures]
Architectures: Matched Filter; Multi-User Detector; OFDM + Coding; OFDM + Multi-Antenna; OFDM + Multi-Antenna + Coding
Data rate per user / spectral efficiency:
• 0.8 Mb/s / 0.9 b/s/Hz
• 1.6 Mb/s / 1.8 b/s/Hz
• 1.9 Mb/s / 2.1 b/s/Hz
• 3.8 Mb/s / 4.2 b/s/Hz
• 5.6 Mb/s / 6.3 b/s/Hz
* Assume 25 MHz bandwidth and 28 users
Source: Ning Zhang, UCB
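The quoted spectral efficiencies follow from the per-user data rates under the stated assumption of 25 MHz bandwidth and 28 users: efficiency = users × rate / bandwidth. The sketch below recomputes them; the rates and quoted values are copied from the slide, while the function and variable names are illustrative.

```python
BANDWIDTH_HZ = 25e6   # stated assumption: 25 MHz
USERS = 28            # stated assumption: 28 users

# (data rate per user in Mb/s, spectral efficiency quoted on the slide in b/s/Hz)
quoted = [(0.8, 0.9), (1.6, 1.8), (1.9, 2.1), (3.8, 4.2), (5.6, 6.3)]

def spectral_efficiency(rate_mbps: float) -> float:
    """Aggregate spectral efficiency in b/s/Hz for USERS users sharing BANDWIDTH_HZ."""
    return USERS * rate_mbps * 1e6 / BANDWIDTH_HZ

for rate, slide_value in quoted:
    computed = spectral_efficiency(rate)
    print(f"{rate} Mb/s per user -> {computed:.2f} b/s/Hz (slide: {slide_value})")
```

The computed values agree with the slide to within rounding of the per-user rates.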
Shannon beats Moore's law
[Chart: log scale (1 to 10000000) versus year, 1980 to 2020; curves: Algorithmic Complexity (Shannon's Law), Processor Performance (~Moore's Law); markers: 1G, 2G, 3G]
Courtesy: Ravi Subramanian (Morphics)
Single-Chip DSPs are Lagging ...
[Chart: DSP performance in Megamacs (log scale, 1 to 10000) versus year, 1980 to 2000]
• DSP Trend: x 1.4/year
• Moore's law: x 1.58/year
Source: TI
While algorithms are beating Moore's law!
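To see how quickly the two growth rates diverge, the sketch below compounds the quoted factors (x 1.4/year for DSPs, x 1.58/year for Moore's law). The 20-year horizon matches the chart's 1980 to 2000 axis; the function name is illustrative.

```python
DSP_GROWTH_PER_YEAR = 1.4     # quoted DSP trend
MOORE_GROWTH_PER_YEAR = 1.58  # quoted Moore's law rate

def lag_factor(years: int) -> float:
    """How far DSP performance falls behind the Moore's-law curve after `years`."""
    return (MOORE_GROWTH_PER_YEAR / DSP_GROWTH_PER_YEAR) ** years

for years in (5, 10, 20):
    print(f"After {years:>2} years the gap is about {lag_factor(years):.1f}x")
```

Over the two decades covered by the chart, the compounded gap is roughly an order of magnitude.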
Digital Processor Performance
The Law of Diminishing Returns
• More transistors are being thrown at improving general-purpose CPU and DSP performance
• Fundamental bounds are being pushed
– limits on instruction-level parallelism
– limits on memory system performance
• Returns per transistor are diminishing
– new architectures realizing only 2-3 instructions/clock
– increasingly large caches to hide DRAM latency
Some observations
• Von Neumann style instruction set architectures were conceived when switching devices and interconnections were extraordinarily expensive, and multiplexing-in-time provided the most economical solution
– Intel 4004: 2000 transistors, 1 MHz clock frequency, 1 metal layer
• This led to the "clock-speed" fixation, which in fact is only a secondary measure of performance
• Power is rapidly becoming a limiting factor
– Newest processors are including thermal sensors and automatic slow-down (throttling) using pipeline bubbles and nop's to combat overheating and meltdown
Compelling Issues in Wireless (2)
"The Last Meter Problem"
Ubiquitous wireless networking requires steep reduction in cost and energy dissipation
• To be acceptable, radio cost has to be below $1
• Frequent battery replacement on 100's of devices unacceptable
• Technology not likely to be of major help
Energy to Play a Major Role
[Chart: log scale (1 to 10000000) versus year, 1980 to 2020; curves: Algorithmic Complexity (Shannon's Law), Processor Performance (~Moore's Law), Battery Capacity; markers: 1G, 2G, 3G]
Courtesy: Ravi Subramanian (Morphics)
Energy Trends in DSPs
[Chart: DSP power in mW/MIPS (log scale, 0.001 to 1000) versus year, 1982 to 2008, with the "Gene's Law" trend line]
Data points: C15 @ 3 V, C52 @ 3 V, 32010 @ 5 V, C25 @ 5 V, ATT16xx @ 2.7 V, 1 V DSP, C52 @ 5 V, 0.5 V DSP, 2 V DSP, C5x @ 2 V, C15 @ 5 V
Factor 1.6 reduction per year
Source: TI
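The "factor 1.6 reduction per year" trend can be turned into a rule of thumb for how long each tenfold drop in mW/MIPS takes. This is a minimal sketch; the starting value of 100 mW/MIPS is a hypothetical placeholder, not a figure read off the chart.

```python
import math

REDUCTION_PER_YEAR = 1.6  # quoted trend: mW/MIPS divides by 1.6 each year

def mw_per_mips(start_value: float, years: float) -> float:
    """Projected mW/MIPS after `years`, starting from `start_value`."""
    return start_value / REDUCTION_PER_YEAR ** years

def years_for_reduction(factor: float) -> float:
    """Years needed for mW/MIPS to drop by `factor` at the quoted rate."""
    return math.log(factor) / math.log(REDUCTION_PER_YEAR)

print(f"10x reduction takes about {years_for_reduction(10):.1f} years")
print(f"1000x reduction takes about {years_for_reduction(1000):.1f} years")

# Hypothetical starting point of 100 mW/MIPS, used only for illustration.
print(f"After 10 years: {mw_per_mips(100.0, 10):.2f} mW/MIPS")
```

At this rate each decade of the chart's log axis corresponds to a little under five years.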
Energy Trends in DSPs: Gene (Frantz)'s Law
[Chart: mW/MIPS (log scale, 0.00001 to 1,000) versus year, 1982 to 2008; curves: Gene's Law, DSP Power]
Source: Gene Frantz (TI)
Energy to Play a Major Role
A holistic perspective
Energy = upper bound on the amount of available computation
– Total energy of the Milky Way galaxy: 10^59 J
– Minimum switching energy for a digital gate (1 electron @ 100 mV): 1.6 × 10^-20 J (limited by thermal noise)
– Upper bound on number of digital operations: 6 × 10^78
– Operations/year performed by 1 billion 100 MOPS computers: 3 × 10^24
– All of that energy is consumed in 180 years, assuming a doubling of computational requirements every year
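The arithmetic behind these figures can be reproduced in a few lines. The sketch below uses only the numbers stated on the slide; treating the cumulative workload as a geometric series (first-year operations times 2^n - 1 after n years) is an assumption about how the 180-year figure was derived.

```python
import math

GALAXY_ENERGY_J = 1e59      # total energy of the Milky Way, as stated
SWITCH_ENERGY_J = 1.6e-20   # one electron charge (1.6e-19 C) across 100 mV

# Upper bound on the number of digital operations.
max_operations = GALAXY_ENERGY_J / SWITCH_ENERGY_J

# Operations per year for 1 billion computers at 100 MOPS each.
SECONDS_PER_YEAR = 365 * 24 * 3600
ops_first_year = 1e9 * 100e6 * SECONDS_PER_YEAR

# With the yearly workload doubling, the cumulative operations after n years are
# ops_first_year * (2**n - 1); solve for the n that exhausts the bound.
years_to_exhaust = math.log2(max_operations / ops_first_year + 1)

print(f"Upper bound on operations: {max_operations:.2e}")
print(f"Operations in the first year: {ops_first_year:.2e}")
print(f"Years until the bound is reached: {years_to_exhaust:.0f}")
```

The results line up with the slide: about 6 × 10^78 operations, about 3 × 10^24 operations per year, and roughly 180 years.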
Putting energy in perspective
• Energy cost of digital computation
– 1999 (0.25µm): 1pJ/op (custom) … 1nJ/op (µproc)
• A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
• New platforms will be defined at the architecture-micro-architecture boundary
• They will be component-based, and will provide a range of choices from structured-custom to fully programmable implementations
• Key to such approaches is the representation of communication in the platform model
"Only the consumer gets freedom of choice; designers need freedom from choice" (Orfali, et al, 1996, p.522)
Source: R. Newton
Hardware Platforms
Hardware Platform: not only a fully specified SoC but also a family of architectures that share some common feature:
A Hardware Platform is a family of architectures that satisfy a set of architectural constraints imposed to allow the re-use of hardware and software components.
The stronger the constraints the more component re-use, but stronger constraints imply fewer architectures to choose from!
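The trade-off in this definition can be illustrated by modelling a platform as a set of constraints that filters a space of candidate architectures. The following is a toy sketch only: the architecture parameters (core type, bus width, accelerator) and the constraints are hypothetical and are not taken from any platform discussed in the tutorial.

```python
from itertools import product

# Hypothetical architecture space; every parameter and value here is illustrative.
CORES = ("RISC", "VLIW", "DSP")
BUS_WIDTHS = (32, 64, 128)
ACCELERATORS = ("none", "FPGA", "ASIC")

architectures = [
    {"core": core, "bus_width": width, "accelerator": accel}
    for core, width, accel in product(CORES, BUS_WIDTHS, ACCELERATORS)
]

def platform_family(constraints):
    """Return the architectures that satisfy every constraint of the platform."""
    return [a for a in architectures if all(check(a) for check in constraints)]

# A weakly constrained platform, then a stronger one that adds two constraints.
weak_constraints = [lambda a: a["core"] != "DSP"]
strong_constraints = weak_constraints + [
    lambda a: a["bus_width"] >= 64,
    lambda a: a["accelerator"] == "none",
]

weak_family = platform_family(weak_constraints)
strong_family = platform_family(strong_constraints)

print(f"All architectures:      {len(architectures)}")
print(f"Weak platform family:   {len(weak_family)}")
print(f"Strong platform family: {len(strong_family)}")
```

Adding constraints can only shrink the family, which is the price paid for the components that every remaining member is guaranteed to share.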
Hardware Platforms Not Enough!
• Hardware platform has to be abstracted
• Interface to the application software is API
• Software layer performs abstraction:
– Programmable cores and memory subsystem with RTOS
– I/O subsystem via Device Drivers
Software Platforms
[Figure: layered stack. Application Software; Platform API; Software Platform (RTOS, BIOS, Device Drivers, Network Communication); Hardware Platform (input devices, output devices, I/O, network); boundary marked between Hardware and Software]
The Platform Tension
[Figure: Application Space and Architectural Space]
The Platform Approach
[Figure: Application Space with an Application Instance; Architectural Space with a Platform Instance; System Platform between them; labels: Platform Specification, Platform Design Space Exploration]
Source: Alberto Sangiovanni-Vincentelli
[Figure: Nexperia DVP System Silicon block diagram. TriMedia CPU (TM-xxxx) with I$ and D$; MIPS CPU (PRxxxx) with I$ and D$; device I/P blocks attached to PI buses; SDRAM and MMI on the DVP memory bus]
VLIW Media Processor:
• 100 to 300+ MHz
• 32-bit or 64-bit
Nexperia System Busses
• PI bus
• Memory bus
• 32-128 bit
General Purpose RISC Processor
• 50 to 300+ MHz
• 32-bit or 64-bit
Library of Device Blocks
• Image
An Integrated Radio Processor (TCI)
[Figure blocks: Embedded Processor, Memory Sub-system, Baseband Processing, Fixed Protocol Stack, Programmable Protocol Stack]
The "Network-on-a-Chip"
[Figure: interconnect options. Multi-Bus (N inputs, B buses, M outputs); Hierarchical Mesh (clusters); Mesh (modules)]
Platform Exploration
The Y-Chart Approach
[Figure: Applications and an Architecture Instance feed Mapping, followed by Performance Analysis, which produces Performance Numbers]
Source: B. Kienhuis
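The flow in the Y-chart (applications plus an architecture instance, then mapping, then performance analysis yielding performance numbers) can be sketched as a simple exploration loop. Everything below is a hypothetical toy model: the workloads, the capacities, and the utilization-based analysis are invented for illustration and are not taken from the tutorial or from Kienhuis's tools.

```python
# Hypothetical application workloads, in MOPS.
applications = {"baseband": 2000, "protocol": 50}

# Hypothetical architecture instances: capacity of each processing unit, in MOPS.
architecture_instances = [
    {"name": "A", "units": {"dsp": 1500, "cpu": 100}},
    {"name": "B", "units": {"dsp": 2500, "cpu": 100}},
]

# Mapping: which unit runs which application.
mapping = {"baseband": "dsp", "protocol": "cpu"}

def analyze(apps, app_to_unit, instance):
    """Performance analysis: utilization of each unit under the given mapping."""
    load = {unit: 0 for unit in instance["units"]}
    for app, required_mops in apps.items():
        load[app_to_unit[app]] += required_mops
    return {unit: load[unit] / cap for unit, cap in instance["units"].items()}

def explore(apps, app_to_unit, instances):
    """Return the first instance whose units are all at or below full utilization."""
    for instance in instances:
        numbers = analyze(apps, app_to_unit, instance)
        if max(numbers.values()) <= 1.0:
            return instance["name"], numbers
    return None

result = explore(applications, mapping, architecture_instances)
print(result)
```

In a real flow the performance numbers would feed back into changes to the architecture instance, the mapping, or the applications themselves, rather than only selecting among fixed candidates.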
• Technology scaling is redefining the term "complexity"
• System-on-a-Chip fosters renaissance in processor architecture, opening the door for new models and combinations thereof: Platform and Communication Based Design
• SOC for wireless driven by new set of metrics: how to simultaneously optimize flexibility, cost, energy, and performance?
• Application-Architecture Exploration is Focal Part of Implementation Methodology