Improving Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection
Hamid Noori †, Maziar Goudarzi ‡, Koji Inoue ‡, and Kazuaki Murakami ‡
Speaker: Tohru Ishihara ‡
† Institute of Systems & Information Technologies/KYUSHU, Japan
‡ Kyushu University, Japan
ISVLSI 2008 @ Montpellier, France | Kyushu University
Outline
Background
Motivation
Problem Definition
Proposed Approach
  Architecture
  Reconfiguration Flow
Experimental Results
Conclusions
Background (1/2)
Vdd: 180nm = 1.66 V, 100nm = 1.125 V, 70nm = 0.9 V
[Figure: The dynamic energy per cache access. Dynamic energy (nJ) vs. technology (180nm, 100nm, 70nm) for cache sizes 32K, 16K, 8K, 4K, 2K, and 1K. Dynamic energy is temperature independent.]
[Figure: The leakage power of a cache memory. Leakage power (mW) vs. technology (180nm, 100nm, 70nm) for cache sizes 32K, 16K, 8K, 4K, 2K, and 1K. Temperature: 100°C.]
Background (2/2)
Vdd: 180nm = 1.66 V, 100nm = 1.125 V, 70nm = 0.9 V
Cache size: 32KB
[Figure: Leakage power (mW) of a 32KB cache vs. temperature (0°C, 20°C, 40°C, 60°C, 80°C, 100°C) for 180nm, 100nm, and 70nm technologies.]
Motivational Example (1/3)
Execution time is technology and temperature independent.
[Figure: Number of execution clock cycles (K) vs. instruction cache size (128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K) for qsort.]
Motivational Example (2/3)
Technology: 70nm, Vdd: 0.9 V
[Figure: Total static energy for executing a program. Static energy (mJ) vs. cache size (128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K) at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C.]
[Figure: Total dynamic energy for executing a program. Dynamic energy (mJ) vs. cache size (128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K). Dynamic energy is temperature independent.]
Motivational Example (3/3)
Technology: 70nm, Vdd: 0.9 V
[Figure: Total energy (mJ) vs. instruction cache size (128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K) for qsort at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C. Annotation: minimum-energy cache size.]
Problem Definition (1/3)
Objective function: total memory energy
  Cache dynamic energy
  Cache static energy
  Off-chip memory access energy
  Energy consumption during processor stall
[Diagram: CPU, I-$, D-$, main memory]
Problem Definition (2/3)
energy_memory(C, Temp, Tech) =
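The right-hand side of energy_memory(C, Temp, Tech) is not reproduced above. As a minimal sketch only, assuming the total is the plain sum of the four components listed under Problem Definition (1/3), the model could look like the following. The `RunProfile` fields, the function name `energy_memory_sketch`, and the way each component is computed are illustrative assumptions, not the authors' formulation. The 200 MHz clock and the 100-cycle miss penalty are taken from the experiment setup.

```python
from dataclasses import dataclass


@dataclass
class RunProfile:
    """Hypothetical per-configuration profile; all field names are assumptions."""
    accesses: int                      # total cache accesses
    misses: int                        # cache misses (each assumed to go off-chip)
    cycles: int                        # execution time in clock cycles
    dyn_energy_per_access_j: float     # cache dynamic energy per access (J)
    offchip_energy_per_miss_j: float   # off-chip access energy per miss (J)
    stall_power_w: float               # processor power while stalled (W)


def energy_memory_sketch(profile: RunProfile,
                         leakage_power_w: float,
                         clock_hz: float = 200e6,
                         miss_penalty_cycles: int = 100) -> float:
    """Sum the four energy components (in joules) for one cache configuration.

    leakage_power_w is the cache leakage power at the temperature of interest.
    It is the only temperature-dependent input in this sketch.
    """
    exec_time_s = profile.cycles / clock_hz
    stall_time_s = profile.misses * miss_penalty_cycles / clock_hz

    dynamic = profile.accesses * profile.dyn_energy_per_access_j
    static = leakage_power_w * exec_time_s
    offchip = profile.misses * profile.offchip_energy_per_miss_j
    stall = profile.stall_power_w * stall_time_s
    return dynamic + static + offchip + stall
```

Because only the static term depends on temperature here, raising the leakage power shifts the balance toward configurations with lower leakage, which is the effect the motivational example illustrates.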
Problem Definition (3/3)
“For a given application, processor architecture, technology, and valid configurations of the configurable cache, find a valid cache configuration that results in minimum energy consumption in a specific temperature over the entire execution of the given application.”
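Stated this way, the problem is a minimum search over the valid configurations at one temperature. A minimal sketch, assuming the energy of every valid configuration at that temperature has already been computed; the function name and the dictionary layout are hypothetical:

```python
from typing import Dict, Hashable


def select_min_energy_config(energy_by_config: Dict[Hashable, float]) -> Hashable:
    """Return the configuration whose energy is lowest.

    energy_by_config maps each valid cache configuration to its total energy
    at one specific temperature (assumed to be precomputed elsewhere).
    """
    if not energy_by_config:
        raise ValueError("no valid cache configurations given")
    return min(energy_by_config, key=energy_by_config.get)
```

Calling the function once per temperature, each time with the energies evaluated at that temperature, yields one selected configuration per temperature.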
Architecture
TACC
  BCC (proposed by Zhang et al. [1])
    Cache size (way shutdown)
    Number of ways (way concatenation)
    Line size
  Thermal sensor
  Accessible port for reading the thermal sensor
[1] C. Zhang, F. Vahid, and W. Najjar, “A Highly Configurable Cache Architecture for Embedded Systems,” ACM Trans. on Embedded Computing Systems, vol. 4, no. 2, May 2005.
Reconfiguration Flow
Evaluation phase (offline)
  Inputs:
    Static and dynamic power for different cache configurations and temperatures for the target technology
    Execution time and number of hits and misses for different cache configurations, obtained by running the application on an ISS
  Determine the lowest-energy cache configuration for different target temperatures
  Fill the lookup table of the configurable cache with the proper configuration for each temperature
Reconfiguration phase (online)
  Detect the current temperature
  Use the lookup table and load the proper configuration for the current temperature
  Execute the application
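The two phases can be sketched as a table build followed by a table lookup. This is an illustration under stated assumptions, not the hardware mechanism: `energy_fn` stands in for the offline energy evaluation, and picking the nearest tabulated temperature at run time is an assumed policy that the slides do not specify. All names are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable


def build_lookup_table(valid_configs: Iterable[Hashable],
                       target_temps_c: Iterable[float],
                       energy_fn: Callable[[Hashable, float], float]
                       ) -> Dict[float, Hashable]:
    """Evaluation phase (offline): lowest-energy configuration per temperature."""
    configs = list(valid_configs)
    if not configs:
        raise ValueError("no valid cache configurations given")
    table = {}
    for temp in target_temps_c:
        # energy_fn(config, temp) is assumed to return the total energy
        # of running the application with that configuration at temp.
        table[temp] = min(configs, key=lambda c: energy_fn(c, temp))
    return table


def lookup_config(table: Dict[float, Hashable], sensed_temp_c: float) -> Hashable:
    """Reconfiguration phase (online): pick the entry for the sensed temperature.

    Assumption: the entry whose temperature is closest to the sensor
    reading is used.
    """
    nearest = min(table, key=lambda t: abs(t - sensed_temp_c))
    return table[nearest]
```

The expensive work (simulation and energy evaluation) stays offline, so the online step reduces to one sensor read and one table lookup before the application starts.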
Experimental Setup (1/2)
MiBench
SimpleScalar
  Cache hit: one clock cycle
  Cache miss: 100 clock cycles
  Clock frequency of the base processor: 200 MHz
CACTI 4.2
  Target technology: 70nm (Vdd = 0.9 V)
BCC (16KB)
  16KB (4-, 2-, 1-way)
  8KB (2- and 1-way)
  4KB (1-way)
  The line size for each of the configurations can be 8, 16, or 32 bytes.
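The configuration space listed above is small enough to enumerate directly. A short sketch, with the tuple layout (size in KB, ways, line size in bytes) chosen here purely for illustration:

```python
# Allowed associativities per cache size (KB), as listed for the 16KB BCC.
SIZE_KB_TO_WAYS = {16: (4, 2, 1), 8: (2, 1), 4: (1,)}

# Allowed line sizes in bytes, available for every size/ways combination.
LINE_SIZES_BYTES = (8, 16, 32)


def valid_configs():
    """Return every (size_kb, ways, line_bytes) combination of the BCC."""
    return [
        (size_kb, ways, line_bytes)
        for size_kb, ways_options in SIZE_KB_TO_WAYS.items()
        for ways in ways_options
        for line_bytes in LINE_SIZES_BYTES
    ]
```

Six size/ways combinations times three line sizes give 18 configurations, so an exhaustive search per temperature is feasible.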
Experimental Setup (2/2)
Base Configurable Cache (BCC)
  It has the same architecture as proposed by Zhang et al. [1]
  It supports a limited set of configurations
  It is configured for each application for the corner case (i.e., leakage at 100°C)
Temperature-Aware Configurable Cache (TACC)
  TACC is configured for each execution of an application, considering the chip temperature at that time
Energy & Performance Evaluation
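\[ \text{Energy Saving} = \frac{\text{energy\_BCC\_temp100} - \text{energy\_TACC}}{\text{energy\_BCC\_temp100}} \times 100 \]
\[ \text{Performance Enhancement} = \frac{\text{exec\_time\_BCC} - \text{exec\_time\_TACC}}{\text{exec\_time\_BCC}} \times 100 \]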
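Both metrics are relative improvements of the TACC over the BCC, expressed in percent. A small sketch of the arithmetic, with hypothetical function and argument names:

```python
def energy_saving_percent(energy_bcc: float, energy_tacc: float) -> float:
    """Energy saved by TACC relative to BCC, in percent."""
    if energy_bcc <= 0:
        raise ValueError("BCC energy must be positive")
    return (energy_bcc - energy_tacc) / energy_bcc * 100.0


def performance_enhancement_percent(exec_time_bcc: float,
                                    exec_time_tacc: float) -> float:
    """Execution-time reduction of TACC relative to BCC, in percent."""
    if exec_time_bcc <= 0:
        raise ValueError("BCC execution time must be positive")
    return (exec_time_bcc - exec_time_tacc) / exec_time_bcc * 100.0
```

Either value is negative when the TACC is worse than the BCC for the given run.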
Data and Instruction Cache
D$ benchmarks: qsort, djpeg, lame, dijkstra, patricia, sha, adpcm, crc, fft
Conclusions
1. Temperature-aware configurable caches become more important in finer technologies. In 70nm technology, up to 61% (17% on average) of energy consumption can be saved for the instruction cache.
2. The data cache is more easily affected by temperature than the instruction cache. Using a configurable data cache, up to 77% (36% on average) of energy can be saved in 70nm technology.
3. The TACC improves performance by up to 28% (5% on average) for the instruction cache and by up to 17% (8.1% on average) for the data cache.