A One-Shot Configurable-Cache Tuner for Improved Energy and Performance
Ann Gordon-Ross 1, Pablo Viana 2, Frank Vahid 1, Walid Najjar 1, and Edna Barros 3
1 Dept of Computer Science & Engineering - University of California, Riverside, USA
2 Campus Arapiraca – Federal University of Alagoas, Brazil
3 Centro de Informática - Federal University of Pernambuco, Brazil
This work was supported by the U.S. National Science Foundation, and by the Semiconductor Research Corporation
Ann Gordon-Ross, Univ of Ca, Riverside
Introduction
• Memory access: 50% of embedded processor’s system power
• Caches are power hungry
• ARM920T (Segars 01)
• M*CORE (Lee/Moyer/Arends 99)
• Thus, caches are a good candidate for optimizations
[Figure: Processor, L1 I Cache, L1 D Cache, and Main Mem, with a 53% label]
Introduction
• Different applications have vastly different cache requirements
• Total size, line size, and associativity
• Cache parameters that don’t match an application’s behavior can waste over 60% of energy (Gordon-Ross 05)
• Cache tuning is the process of determining the appropriate cache parameters for an application
[Figure: three example cache configurations: 4KB, 16 byte line, 2-way; 2KB, 32 byte line, direct-mapped; 8KB, 64 byte line, 4-way]
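To make the three parameters concrete, here is a minimal sketch that counts misses for the three example configurations above on one address trace. It is not from the slides: the LRU replacement policy, the function name `count_misses`, and the synthetic trace (four sweeps over a 3 KB array with 4-byte words) are all assumptions for illustration.

```python
from collections import OrderedDict


def count_misses(trace, size_bytes, line_bytes, ways):
    """Count misses of a set-associative cache (LRU assumed) on a byte-address trace."""
    num_sets = size_bytes // (line_bytes * ways)
    # One OrderedDict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line_bytes          # block (line) address
        cache_set = sets[block % num_sets]  # set selected by the low block bits
        if block in cache_set:
            cache_set.move_to_end(block)    # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) == ways:
                cache_set.popitem(last=False)  # evict the least recently used line
            cache_set[block] = True
    return misses


# Hypothetical workload: sweep a 3 KB array four times, one 4-byte word at a time.
trace = [a for _ in range(4) for a in range(0, 3072, 4)]

# (total size in bytes, line size in bytes, associativity) from the slide's examples.
configs = [(4096, 16, 2), (2048, 32, 1), (8192, 64, 4)]
results = {cfg: count_misses(trace, *cfg) for cfg in configs}

for cfg, misses in results.items():
    print(cfg, misses)
```

Miss counts are only one input to tuning. Choosing a configuration for energy would also need per-access and per-miss energy figures, which the slides do not give, so the sketch stops at miss counts.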
Runtime Cache Tuning
• Best cache configuration can be determined by searching the design space during runtime
• Runtime cache tuning is transparent to the designer and end user, but incurs runtime overhead in terms of energy and performance
[Figure: energy during runtime cache tuning. Labels: Download application; Executing in base configuration; Cache Tuning; Tunable cache; Tuning hw; TC]
Contribution
• We introduce specialized hardware for non-intrusive runtime cache evaluation
• Temporary energy overhead and no performance overhead
• Single-pass multi-cache evaluation - SPCE
• Special hardware simultaneously evaluates all cache configurations
• Enables switching to the best configuration in one shot
[Figure: energy with SPCE. Labels: Download application; Executing in base configuration; Tunable cache; SPCE; TC. Annotations: SPCE causes an increase in energy but no performance overhead; switch to best config in “one-shot”]
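The single-pass idea can be sketched in software with the classic LRU stack-distance technique: one walk over the address stream yields hit counts for several cache sizes at once. This is an illustrative stand-in, not the SPCE hardware. It assumes fully associative LRU caches with a single line size, and the name `single_pass_hits` and the synthetic trace are hypothetical.

```python
def single_pass_hits(trace, line_bytes, sizes_in_lines):
    """Return hit counts for several fully associative LRU caches from one pass.

    An access hits in a cache of n lines exactly when the block's depth in
    the LRU stack (0 = most recently used) is less than n.
    """
    stack = []  # block addresses, most recently used first
    hits = {n: 0 for n in sizes_in_lines}
    for addr in trace:
        block = addr // line_bytes
        if block in stack:
            depth = stack.index(block)  # stack distance of this access
            stack.pop(depth)
            for n in sizes_in_lines:
                if depth < n:
                    hits[n] += 1
        # Cold misses fall through; either way the block becomes most recent.
        stack.insert(0, block)
    return hits


# Hypothetical workload: sweep a 3 KB array four times, one 4-byte word at a time.
trace = [a for _ in range(4) for a in range(0, 3072, 4)]

# Evaluate a 64-line and a 128-line cache (32-byte lines) in the same pass.
hits = single_pass_hits(trace, line_bytes=32, sizes_in_lines=[64, 128])
print(hits)
```

Both sizes are evaluated without re-running the trace once per configuration, which is the property that allows a switch to the best configuration in one shot. Covering line size and associativity as well, as the slides describe, needs more than this single stack.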
SPCE Key Points
• Contributions compared to previous methods
• Evaluates a highly configurable cache
– Previous methods offer little configurability
• Little hardware overhead
– Simple data structures
– Elementary operations
SPCE
• Monitors address stream to extract cache hit