Integrated Management of Power Aware Computing & Communication Technologies. Kickoff review meeting. Nader Bagherzadeh, Pai H. Chou, Fadi Kurdahi, University of California, Irvine, ECE Dept. DARPA Contract F33615-00-1-1719. September 27, 2000.
Page 1: Integrated Management of Power Aware Computing & Communication Technologies

1

Integrated Management of Power Aware Computing & Communication Technologies

Kickoff review meeting

Nader Bagherzadeh, Pai H. Chou, Fadi Kurdahi

University of California, Irvine, ECE Dept.

DARPA Contract F33615-00-1-1719

September 27, 2000

Page 2: Integrated Management of Power Aware Computing & Communication Technologies

2

Agenda

Introduction and overview

Management status, financial, milestones, schedule.

Technical presentation

Task progress: architecture, applications, CAD

Lessons learned, challenges, issues.

Questions + action items review.

Page 3: Integrated Management of Power Aware Computing & Communication Technologies

3

Outline

Introduction: program goals, project overview

Management status: personnel and teaming plans, plans and milestones, financial information

Technical presentation: background, technical approach, status and accomplishments, current detailed schedule

Program impact and anticipated transitions

Page 4: Integrated Management of Power Aware Computing & Communication Technologies

4

Introduction

Page 5: Integrated Management of Power Aware Computing & Communication Technologies

5

Program Goals

Power-aware system-level design: enhance mission success (time, task); rapid customization for different missions

Design tool: exploration & evaluation; optimization & specialization; technique integration

System architecture: statically configurable; dynamically adaptive; use COTS parts & protocols

Page 6: Integrated Management of Power Aware Computing & Communication Technologies

6

Technical approach

High-level specification: separate behavior from architecture; explicit constraints (timing, power); library characterization

System synthesis tool: source-aware power usage scheduling; bus topology transformation and communication scheduling

Configurable architecture: task migration & selective shutdown; bus segmentation and voltage scaling

Domain knowledge: encompass mechanical / thermal power; aware of power supply model

Page 7: Integrated Management of Power Aware Computing & Communication Technologies

7

Quad Chart

Innovations

Component-based power-aware design: exploit off-the-shelf components & protocols; best price/performance, reliable, cheap to replace

CAD tool for global power policy optimization: optimal partitioning, scheduling, configuration; manage entire system, including mechanical & thermal

Power-aware reconfigurable architectures: reusable platform for many missions; bus segmentation, voltage / frequency scaling

Impact

Enhanced mission success: more tasks for the same power; dramatic reduction in mission completion time

Cost saving over a variety of missions: reusable platform & design techniques; fast turnaround time by configuration, not redesign

Confidence in complex design points: provably correct functional/power constraints; retargetable optimization to eliminate overdesign; power protocol for massive scale

[Figure: design flow diagram. Labels: Behavior, Architecture, high-level simulation, functional partitioning & scheduling, composition operators, high-level components, behavioral system model, busses and protocols, system architecture, mapping, system integration & synthesis, static configuration, dynamic power management, parameterizable components]

Schedule

Year 1 (kickoff 2Q 00 to 2Q 01): static & hybrid optimizations (partitioning / allocation, scheduling, bus segmentation, voltage scaling); COTS component library; FireWire and I2C bus models; static composition authoring; architecture definition; high-level simulation; benchmark identification

Year 2 (2Q 01 to 2Q 02): dynamic optimizations (task migration, processor shutdown, bus segmentation, frequency scaling); parameterizable components library; generalized bus models; dynamic reconfiguration authoring; architecture reconfiguration; low-level simulation; system benchmarking

Page 8: Integrated Management of Power Aware Computing & Communication Technologies

8

Innovations

Component-based power-aware design: exploit off-the-shelf components & protocols; COTS offer best price/performance, are reliable, and are cheap to replace

CAD tool for global power policy optimization: optimal partitioning, scheduling, configuration; manage entire system, including mechanical & thermal

Power-aware reconfigurable architectures: reusable platform for many missions; bus segmentation, voltage / frequency scaling

Page 9: Integrated Management of Power Aware Computing & Communication Technologies

9

Impact

Enhanced mission success: more tasks for the same power; dramatic reduction in mission completion time

Cost saving over a variety of missions: reusable platform & design techniques; fast turnaround time by configuration, not redesign

Confidence in complex design points: provably correct functional/power constraints; retargetable optimization to eliminate overdesign; power protocol for massive scale

Page 10: Integrated Management of Power Aware Computing & Communication Technologies

10

Management Status

Page 11: Integrated Management of Power Aware Computing & Communication Technologies

11

Personnel & teaming plans

UC Irvine, co-PIs (design tools): Nader Bagherzadeh, Pai Chou, Fadi Kurdahi

UC Irvine, research assistants: Dexin Li, Jinfeng Liu, Afshin Niktash

USC (component power optimization): Jean-Luc Gaudiot, Seong-Won Lee

JPL (applications and benchmarking): Nazeeh Aranki, Nikzad “Benny” Toomarian

Page 12: Integrated Management of Power Aware Computing & Communication Technologies

12

Previous work

Design tools: system-level, the Chinook HW/SW codesign tool; architectural synthesis (with physical design considerations)

Components: reconfigurable computing, the MorphoSys chip; parameterizable components, PCL; Simultaneous MultiThreading vs. Chip MultiProcessing

Architectural platform: segmented bus; X-2000, Mars Pathfinder; configurable SMP

Page 13: Integrated Management of Power Aware Computing & Communication Technologies

13

Responsibilities

Bagherzadeh, Chou, Kurdahi (co-PIs): oversee project operation; integration into curriculum and related research efforts

Li, Liu, Niktash (RAs): development of CAD tools; modeling of demonstrator examples; authoring of component / protocol library

JPL: furnish example specifications; co-develop optimization techniques

USC: supporting link to low-level technologies

Page 14: Integrated Management of Power Aware Computing & Communication Technologies

14

External collaborations

JPL: X-2000 multi-mission architecture; Mars Pathfinder as baseline; JPL to provide COTS testbed; JPL to evaluate IMPACCT optimizations

USC: parameterizable components; low-level power estimation

Consystant Design Technologies (Seattle, WA): framework for component-based design; IMPACCT plugins to support power management

Page 15: Integrated Management of Power Aware Computing & Communication Technologies

15

Technical Background

Page 16: Integrated Management of Power Aware Computing & Communication Technologies

16

Background: MorphoSys project

Reconfigurable processor array

MIPS-like RISC processor

High-bandwidth data interface

100 MHz clock

0.35 µm, 4-metal CMOS

Software support

Platform for dynamic power management

[Figure: MorphoSys block diagram. Labels: advanced RISC processor, instruction/data cache (L1), reconfigurable processor array, high-bandwidth data interface, system bus, external memory (e.g. SDRAM, RDRAM)]

Page 17: Integrated Management of Power Aware Computing & Communication Technologies

17

RC Array and Context Memory

[Figure: RC Array, a grid of reconfigurable cells (RC), with the context memory's row block and column block; connections are labeled 16]

Context Memory

• 2 blocks

• 8 sets in each block

• A set controls 1 row or column (SIMD)

• 16 contexts in 1 set.

• Possible to overlap ctx broadcast with ctx reloading

Page 18: Integrated Management of Power Aware Computing & Communication Technologies

18

The M1 chip layout

Page 19: Integrated Management of Power Aware Computing & Communication Technologies

19

M1 chip test fixture

Page 20: Integrated Management of Power Aware Computing & Communication Technologies

20

[Figure: MorphoSys software flow. Labels: application (C code) with TinyRISC fragments (TR_app: a = b + c, p = a + 1) and RC Array calls (Z=RC_F(X), W=RC_F(Y)); mcc; mLoad; context library; mSched; executable; RC Array functions; configuration context; MuLate and MorphoSim (C++, VHDL); mView; TinyRISC; RC Array; MorphoSys chip]

Software environment

Page 21: Integrated Management of Power Aware Computing & Communication Technologies

21

Background on USC's SMT work

High performance processors: superscalar processor (SSP), single chip multiprocessor (CMP), very long instruction word (VLIW), simultaneous multithreading (SMT)

Performance and power dissipation: high performance requires high power consumption

Recent applications need low-power, high-performance processors

Page 22: Integrated Management of Power Aware Computing & Communication Technologies

22

Microarchitectural tradeoffs

Power tradeoffs between different architectures

SMT vs. SSP: SMT has more modules than SSP; SMT has better performance and consumes more power

SMT vs. CMP: SMT has better utilization; they have similar performance, but SMT consumes less power

SMT vs. VLIW: SMT consumes more power; SMT is compatible with conventional architectures

Design of simple SMT: a simplified SMT may consume less power and still have the advantage of TLP

Analysis of architectural features: power drain of modern processors (control vs. data path)

Page 23: Integrated Management of Power Aware Computing & Communication Technologies

23

SMT design methodology

Measuring power consumption of a processor: checking transitions of signals and module operations; hardware implementation of the processor simulator

Measuring performance of modules: the contribution of each module to the total performance; performance-power ratio of each module

Comparison between architectures

Design of a low power processor

Page 24: Integrated Management of Power Aware Computing & Communication Technologies

24

Measuring performance

Finding the performance per power of each module: simulate and measure the performance without a module; calculate the performance per power for each module; classify modules together if more than two modules cooperate with each other

Find the solution for a low-power, high-performance processor
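The per-module measurement above can be sketched as a small ablation calculation. This is an illustration only: the module names, IPC figures, and power figures are hypothetical placeholders, not results from the USC work, and the function name is an assumption.

```python
# Ablation-style "performance per power" ranking.
# All numbers and module names below are hypothetical.

def performance_per_power(baseline_perf, perf_without, module_power):
    """Score each module by (performance lost when removed) / (module power)."""
    scores = {}
    for name, ablated in perf_without.items():
        contribution = baseline_perf - ablated
        scores[name] = contribution / module_power[name]
    return scores

baseline_ipc = 2.0                                    # full processor
ipc_without = {"branch_predictor": 1.5, "second_fpu": 1.9}
power_watts = {"branch_predictor": 0.5, "second_fpu": 1.0}

scores = performance_per_power(baseline_ipc, ipc_without, power_watts)
# Modules at the bottom of this ranking are candidates for removal
# or simplification in a low-power design.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Modules that only pay off in combination would need to be ablated as a group, which matches the slide's remark about classifying cooperating modules.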

Page 25: Integrated Management of Power Aware Computing & Communication Technologies

25

Background: Chinook project

Component-based HW/SW codesign framework: specification, simulation, synthesis; motivated by IP reuse, system integration

Problem: IP reuse forces modification. Reason: components have hardwired coordination protocols

Approach: adaptable components; separate coordination protocols from components

Benefits: reuse without modification; enable system-level optimizations

Page 26: Integrated Management of Power Aware Computing & Communication Technologies

26

Example protocol: Subsumption

Must handle three cases: subsuming, yielding, idle. Hardwired protocol

Generalization: adaptable components (by mode mapping); separate protocols & components

[Figure: subsumption example. Sensors (joystick, bumper, sonar) and actuators (wheels); decision modules (escape, avoid, override) combined by decision composition through subsumption (s) links. Interface modes are idle (i), subsuming (s), and yielding (y). A state diagram of the bumper process (states labeled W, B, F, T; events "bump" and "release"; annotations 2s and 45d) is mapped onto the subsumption interface modes]

Page 27: Integrated Management of Power Aware Computing & Communication Technologies

27

Architectural mapping

Single processor or multiple processors

Multiple mappings to an architecture

[Figure: mode manager and modal processes]

Page 28: Integrated Management of Power Aware Computing & Communication Technologies

28

Distributed mode managers

Automatically partitioned among processors; synthesized control communication; communication tradeoffs: synchronization, replication

[Figure: mode manager and modal processes]

Page 29: Integrated Management of Power Aware Computing & Communication Technologies

29

Technical Presentation

Page 30: Integrated Management of Power Aware Computing & Communication Technologies

30

[Figure: “Sojourner”, the Mars Pathfinder Microrover Flight Experiment. Labeled parts: solar panel, UHF antenna, cameras/lasers, Warm Electronics Box (WEB), Alpha Proton X-ray Spectrometer (APXS). Dimensions shown: 30 cm, 65 cm, 48 cm]

Past missions – Mars Pathfinder

Page 31: Integrated Management of Power Aware Computing & Communication Technologies

31

Application requirements

System specification: 6 wheel motors; 4 steering motors; system health check; hazard detection

Power supply: battery (non-rechargeable); solar panel

Power consumption: digital (computation, imaging, communication, control); mechanical (driving, steering); thermal (motors must be heated in low-temperature environment)

Page 32: Integrated Management of Power Aware Computing & Communication Technologies

32

Function                                        Time and calculation   Energy required
Motor heating: 1 motor at a time                7.51 W x 1 hr          7.51 W-hr
Motor heating: 2 motors at a time               11.26 W x 0.5 hr       5.63 W-hr
Driving (extreme terrain @ -80 degC)            13.85 W x 0.5 hr       6.92 W-hr
Hazard detection                                7.33 W x 0.25 hr       1.83 W-hr
Imaging (3 images @ 2 min/image)                4.5 W x 0.1 hr         0.45 W-hr
Image compression (3 images @ 6 min/image)      3.7 W x 0.3 hr         1.2 W-hr
6 Mbit communication @ 50 min/sol               6.27 W x 0.8 hr        5.2 W-hr
42 health checks of 10 sec each during day      6.27 W x 0.1 hr        0.63 W-hr
Remainder of 7 hr daytime CPU operation         3.7 W x 4 hr           15.0 W-hr
WEB heating (as needed)                         50 W-hr                50 W-hr
Total                                                                  95 W-hr

System-level power budget
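As a cross-check, the budget can be recomputed as power times duration for each function. The sketch below uses only the power and time figures from the slide; the variable names are illustrative. The recomputed products come to roughly 93.9 W-hr, slightly under the slide's 95 W-hr, because the slide rounds several entries up.

```python
# Recompute the Sojourner power budget from the slide's (power, hours) pairs.
budget = [
    ("motor heating, 1 motor at a time",  7.51, 1.0),
    ("motor heating, 2 motors at a time", 11.26, 0.5),
    ("driving, extreme terrain",          13.85, 0.5),
    ("hazard detection",                  7.33, 0.25),
    ("imaging, 3 images",                 4.5, 0.1),
    ("image compression, 3 images",       3.7, 0.3),
    ("6 Mbit communication",              6.27, 0.8),
    ("42 health checks",                  6.27, 0.1),
    ("remaining daytime CPU operation",   3.7, 4.0),
]
web_heating_wh = 50.0  # given directly as energy on the slide

energy_wh = {name: watts * hours for name, watts, hours in budget}
total_wh = sum(energy_wh.values()) + web_heating_wh

for name, wh in energy_wh.items():
    print(f"{name:36s} {wh:6.2f} W-hr")
print(f"{'total incl. WEB heating':36s} {total_wh:6.2f} W-hr")
```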

Page 33: Integrated Management of Power Aware Computing & Communication Technologies

33

Design issues

Timing constraints: system health check, 10 s every 10 min; heat each motor for 5 s, 50 s prior to driving; hazard detection 10 s, then steering 5 s, then driving 10 s

Power management: low-power electronics alone cannot deliver significant power savings; no system-level management tool is available

Conservative hand-crafted schedule: serializes all operations to avoid power surges; long execution time; solar power wasted

Page 34: Integrated Management of Power Aware Computing & Communication Technologies

34

Pancam/Mini-TES

Mini-Corer

Instrument Arm Cluster : Raman Spectrometer Alpha-Proton-X-Ray Spectrometer (APXS) Mössbauer Spectrometer Microscopic Imager

Present missions – Athena/Mars ’03 Rover configuration

Page 35: Integrated Management of Power Aware Computing & Communication Technologies

35

Athena/Mars ‘03 Rovers - power subsystem

Power utilization:

38 W = 19 W (CPU & I/O) + 9 W (accel and gyro) + 10 W (wheel motors) for driving

75 W = 19 W (CPU & I/O) + 55 W (transmission) for orbiter communication

30 W = 19 W (CPU & I/O) + 10 W (transmission) for lander relay communication

55 W = 19 W (CPU & I/O) + 33 W (peak motor) for drilling

29 W = 23 W (CPU & I/O) + 6 W (cameras) required for imaging

11 W Raman, 1.4 W APXS and 2.3 W for nighttime spectrometer operation

141 Whr daily for housekeeping engineering

75 Whr limit for nighttime operations

Page 36: Integrated Management of Power Aware Computing & Communication Technologies

36

Present missions – MUSES-CN Asteroid NanoRover

Completely solar powered: requires only 1 watt, including an RF telecommunications system for communications between the rover and a lander or small-body orbiter for relay to Earth.

Power source: 500 grams of commercial, non-rechargeable, replaceable lithium batteries, with energy density of 750 joules per gram.

Page 37: Integrated Management of Power Aware Computing & Communication Technologies

37

Power-aware designs

Subsume low power as a special case: low power minimizes power consumption; uses minimal application-specific knowledge and a limited reconfiguration space; conservative

Make best use of available power: use MAX solar power while it's available; increase parallelism, perform more tasks, reduce mission time; both MIN and MAX power constraints

Application-specific knowledge: multiple mission requirements; adapt to run-time power supply and operating environment

Page 38: Integrated Management of Power Aware Computing & Communication Technologies

38

System-level power management

Amdahl's law, extended to power: component-level improvements must be scaled by their % contributions; synergy between inter-component interactions

Scope of system power model: digital, mechanical, thermal; battery model (control power surge); renewable source (solar panel, etc.)

Mission-driven tradeoffs: execution time vs. power saving; adapt to operating environment
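The "Amdahl's law extended to power" point can be illustrated with a one-line calculation: a component's saving only counts in proportion to its share of system power. The sketch below uses the Athena driving breakdown quoted earlier (38 W total, of which 19 W is CPU & I/O); the 50% component saving is a hypothetical figure, and the function name is an assumption.

```python
def system_power_saving(component_fraction, component_saving):
    """Fraction of total system power saved when one component, drawing
    `component_fraction` of the total, reduces its power by `component_saving`."""
    return component_fraction * component_saving

# Athena driving mode: 19 W of 38 W goes to CPU & I/O.
cpu_fraction = 19 / 38
# Hypothetical: low-power electronics halve CPU & I/O power.
saving = system_power_saving(cpu_fraction, 0.5)
new_total_w = 38 * (1 - saving)
print(f"system saving: {saving:.0%}, new total: {new_total_w:.1f} W")
```

Even a 50% improvement in the digital electronics removes only a quarter of the driving-mode power here, which is the motivation for including mechanical and thermal loads in the system power model.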

Page 39: Integrated Management of Power Aware Computing & Communication Technologies

39

What's needed?

Reconfigurable system architecture: statically configurable for different missions; reconfiguration for dynamic power management; support state-of-the-art power management policies

System-level design tool: support design space exploration; take full advantage of COTS components; optimize mission-specific system configuration; synthesize system-level power manager; support simulation for early validation

Page 40: Integrated Management of Power Aware Computing & Communication Technologies

40

X2000 avionics system architecture

Symmetric COTS multiprocessors: low-cost components with strong commercial support; widely accepted specification, design, application and testing; reduced development cost

Dual system bus architecture: high-speed data rate with moderate power; low-speed control with low power

Industry standard bus protocols: FireWire (IEEE 1394) bus; I2C bus; reconfigurable bus topology

Page 41: Integrated Management of Power Aware Computing & Communication Technologies

41

PA system architecture

The NASA X2000 Avionics System

[Figure: block diagram. Blocks: high-rate input (camera), communication module (CDMA), bus power controller, symmetric multiprocessor modules, reconfigurable hardware blocks, altimeter subnet, and a microcontroller-directed subnet (power regulation & control, analog telemetry sensors, safety inhibits, valve & pyro drive). Buses: high-speed bus (e.g. IEEE 1394) and low-speed bus (e.g. I2C)]

Page 42: Integrated Management of Power Aware Computing & Communication Technologies

42

Applicable power optimizations

Application level: scheduling under timing and power constraints; task partitioning, allocation, migration; algorithm selection

Architecture level: bus segmentation / clustering; communication scheduling

Component level: voltage / frequency scaling; power down

(Both static & dynamic versions of these optimizations)

X-2000 goals: digital electronics power, 10x decrease; analog electronics power, 2x decrease; computer performance, 10 to 20x increase

Page 43: Integrated Management of Power Aware Computing & Communication Technologies

43

The need for a system-level CAD tool

Avoid pitfalls with manual design: overdesign (too conservative); hardwired assumptions in implementation (hard to change/adapt); system integration (bottleneck in projects)

Scalable methodology. Specification, separation of concerns: behavior vs. architecture; policy vs. mechanism; constraint vs. implementation

Exploration: framework for technique integration; rapid feedback

Manage complexity: knowledge base for component/bus details; consistent knowledge propagation through design stages

Page 44: Integrated Management of Power Aware Computing & Communication Technologies

44

Design tool

Library: components and bus protocols; provides power estimation; defines configuration space

Authoring: behavioral description, architecture description; mapping from behavior to architecture

Synthesis: scheduling, partitioning; bus segmentation, voltage scaling; synthesis of power manager with task scheduler

Simulation: high-level, to explore the design space; detailed-level, for power/performance at a given design point

Page 45: Integrated Management of Power Aware Computing & Communication Technologies

45

IMPAC2T overview

[Figure: design flow diagram. Labels: Behavior, Architecture, high-level simulation, functional partitioning & scheduling, composition operators, high-level components, behavioral system model, busses and protocols, system architecture, mapping, system integration & synthesis, static configuration, dynamic power management, parameterizable components]

Page 46: Integrated Management of Power Aware Computing & Communication Technologies

46

Library: low-level components

Supported components: COTS; parameterizable

Levels of abstraction: parameterizable; simulatable; synthesizable; reconfigurable

[Figure: VHDL code for a parameterizable component, instantiated with bus width = 8 and bus width = 16]

Page 47: Integrated Management of Power Aware Computing & Communication Technologies

47

Library: component definition

Component interface: physical (pin interface); functional (data and control interface); power, current, voltage

Power/mode characterization: mode governs power usage; restrictions on mode changes allowed; high-level yet refined power estimation

Aggregation: smaller components combined into larger ones; new external parameters, interfaces, modes

Page 48: Integrated Management of Power Aware Computing & Communication Technologies

48

Example components

Processor: PowerPC, ARM, Pentium, MIPS

Microcontroller: StrongARM, Intel 8051, Motorola 68HC11, 68332

Bus controller/transceiver: FireWire controller & transceiver; I2C bus controller; GPIB

Memory: SRAM, DRAM, Flash memory

Page 49: Integrated Management of Power Aware Computing & Communication Technologies

49

Example component definition

FireWire bus transceiver: National Semi CS4103. Working voltage: 3.3 V. Power modes: Full-on (400 mW), PHY-on (150 mW), Standby (50 mW), CLK-disable (21 mW), Crystal-disable (16 mW)

FireWire bus controller: National Semi CS4210. Working voltage: 3.3 V. Power modes: Full-on (300 mW), Standby (17 mW)

Aggregated bus transceiver/controller: up to ten working modes to play with; flexibility in power management
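A minimal sketch of how such a component definition and its aggregation might be represented, assuming a simple mode-to-power table. The `Component` class and `aggregate` function are hypothetical names, not part of the IMPACCT library; the mode names and milliwatt figures are the ones listed on the slide. Summing the two parts' powers for each mode pair is also an assumption, and a real library would likely rule out some pairs.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Component:
    name: str
    modes_mw: dict  # power mode name -> power in mW

def aggregate(a, b):
    """Combine two components; each pair of modes becomes one aggregate mode
    whose power is assumed to be the sum of the parts."""
    modes = {}
    for (mode_a, p_a), (mode_b, p_b) in product(a.modes_mw.items(),
                                                b.modes_mw.items()):
        modes[f"{mode_a}+{mode_b}"] = p_a + p_b
    return Component(f"{a.name}+{b.name}", modes)

transceiver = Component("CS4103", {
    "Full-on": 400, "PHY-on": 150, "Standby": 50,
    "CLK-disable": 21, "Crystal-disable": 16,
})
controller = Component("CS4210", {"Full-on": 300, "Standby": 17})

node = aggregate(transceiver, controller)
print(len(node.modes_mw), "aggregate modes")  # 5 x 2 combinations
for mode, mw in sorted(node.modes_mw.items(), key=lambda kv: kv[1]):
    print(f"{mode:28s} {mw:4d} mW")
```

The five transceiver modes and two controller modes give the "up to ten working modes" mentioned on the slide.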

Page 50: Integrated Management of Power Aware Computing & Communication Technologies

50

Library: bus protocols

Architecture: parallelism (parallel or serial); topology (serial, tree, ring); service layers (physical, link, transaction, application)

Communication: data transfer mode (asynchronous, isochronous); data transfer speed; response mode (acknowledgement needed or not); arbitration mode

Configuration: configuration process (deterministic or random); reconfigurability (static, hybrid, dynamic)

Power: power mode (full-on, standby, deep-sleep, shutdown); media (cable, wireless, backplane)

Page 51: Integrated Management of Power Aware Computing & Communication Technologies

51

Bus protocols exploration

Explore bus protocol dimensions

Protocol simulation: input is a bus protocol model; output is a sequence of events

Map events into relative power quantities; compare and trade off between different design points

Example: simulating FireWire bus configuration with an event-driven simulator; compare two designs with different topology: pure tree topology (acyclic) vs. tree topology with bus segmentation

Tree-ID process, 9 nodes: tree, 37 events; segmented tree, 24 events

Page 52: Integrated Management of Power Aware Computing & Communication Technologies

52

Bus optimization

Bus: a significant power consumer. Up to 30% - 50% of the total system power consumption [Mehra97]

Bus power consumption is determined by: capacitance (load C and bus C, proportional to bus length); voltage (bus supply voltage and swing voltage); bus access frequency; bus signal switching activity

Why bus power optimization? System performance requirements; power constraints; adapt to execution time variations; bus segmentation for increased bandwidth; enable other novel power management techniques
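The four factors listed above are the terms of the standard first-order CMOS dynamic power estimate, P = a * C * V^2 * f. The sketch below is a textbook approximation rather than the model used in the project; it treats supply and swing voltage as equal, and the capacitance, voltage, and frequency values are hypothetical.

```python
def bus_dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order dynamic power: switching activity x C x V^2 x f (watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz

# Hypothetical bus: 50 pF total capacitance, 3.3 V, 100 MHz, 25% activity.
p_full = bus_dynamic_power(0.25, 50e-12, 3.3, 100e6)

# Segmentation: a shorter segment with (hypothetically) half the capacitance.
p_segment = bus_dynamic_power(0.25, 25e-12, 3.3, 100e6)

# Voltage scaling: same bus at a hypothetical 2.5 V.
p_scaled = bus_dynamic_power(0.25, 50e-12, 2.5, 100e6)

print(f"full bus:       {p_full * 1e3:.2f} mW")
print(f"half-C segment: {p_segment * 1e3:.2f} mW")
print(f"2.5 V supply:   {p_scaled * 1e3:.2f} mW")
```

Capacitance enters linearly and voltage quadratically, which is why segmentation (less capacitance per segment) and voltage scaling are the main levers in the following slides.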

Page 53: Integrated Management of Power Aware Computing & Communication Technologies

53

Bus-level optimizations

Bus encoding [Shin98][Benini97][Nakase98]: minimize switching activity on bus; makes sense mostly for parallel bus; Gray code, bus-invert code, T0 code and Beach code; bus driver design

Bus clustering (segmentation) [Mehra97][Zhang98]: optimize bus topology by grouping components; divide the global bus into multiple segments. Benefits: reduced bus capacitance (power saving); shorter bus latency, higher throughput, increased flexibility

Partitioning [Hauck95][Yang94][Cong93]: divide tasks among components; minimize inter-cluster traffic; clustering before partitioning
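As an illustration of one of the encodings named above, here is a simplified sketch of bus-invert coding: when more than half of the data lines would toggle, the inverted word is sent instead and an extra "invert" line is raised. This version assumes an 8-bit bus that starts at zero and ignores the invert line when deciding; the function names are illustrative.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def bus_invert_encode(words, width=8):
    """Return (bus_value, invert_bit) pairs for a sequence of data words."""
    mask = (1 << width) - 1
    prev = 0  # assumed initial state of the bus lines
    encoded = []
    for word in words:
        word &= mask
        if hamming(prev, word) > width // 2:
            sent, invert = ~word & mask, 1   # fewer toggles if inverted
        else:
            sent, invert = word, 0
        encoded.append((sent, invert))
        prev = sent
    return encoded

def bus_invert_decode(encoded, width=8):
    mask = (1 << width) - 1
    return [(~sent & mask) if invert else sent for sent, invert in encoded]

def count_toggles(values):
    """Total data-line transitions, starting from an all-zero bus."""
    prev, total = 0, 0
    for value in values:
        total += hamming(prev, value)
        prev = value
    return total

raw = [0x00, 0xFF, 0x00, 0xFF]
encoded = bus_invert_encode(raw)
raw_toggles = count_toggles(raw)
encoded_toggles = count_toggles([sent for sent, _ in encoded])
print(raw_toggles, "toggles unencoded,", encoded_toggles, "on the data lines encoded")
```

The invert line itself also toggles, so the net saving is somewhat smaller than the data-line count alone suggests.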

Page 54: Integrated Management of Power Aware Computing & Communication Technologies

54

FireWire (IEEE 1394)

High-speed serial bus: 100, 200, 400 Mbps in 1394a; 800 Mbps, 1.6 Gbps in 1394b

Advantages: low power; real-time bandwidth guarantee (important for media apps); isochronous and asynchronous transfer modes; hot-pluggable, self-reconfiguring; supports bus segmentation

Page 55: Integrated Management of Power Aware Computing & Communication Technologies

55

Legend: CAM: camera; MC: microcontroller; HD: hard drive; NVM: non-volatile memory; SCI: scientific equipment; RF modem: radio frequency modem. The I2C bus is omitted from this diagram.

[Figure: FireWire 1394 bus connecting CPU1, CPU2 (bus controller), HD/NVM, RF modem, CAM, MC1, MC2, MC3, SCI1, and SCI2]

Tasks:

• MCs are responsible for sensing, drive control, steering control

• Capture picture, compress in CPU1, and send data to RF modem

• SCIs carry out scientific experiments, sending data to CPU2

• After analysis, CPU2 stores data in HD/NVM

X2000 architecture mapping

Map Mars Rover application onto X2000 architecture

Page 56: Integrated Management of Power Aware Computing & Communication Technologies

56

Bottlenecks in an unsegmented architecture

Contention for bus bandwidth: camera, RF, hard disk; forces serialization of communication globally

All nodes must be kept awake: prevents component shutdown; global overhead for bus reconfiguration

Long routing path: power overhead on routing controllers

Page 57: Integrated Management of Power Aware Computing & Communication Technologies

57

Segmentation example

Three bus segments

[Figure: segmented bus connecting SCI1, SCI2, RF modem, CAM, HD, MC1, MC2, MC3, CPU1/DSP, and CPU2/bus controller]

MC: sensing, drive control, steering control

SCI: scientific experiment

CAM: picture capture, image compression

RF: transmission

Suppose bus bandwidth is 100 Mbps, image size is 20 Mb each with 20 pictures to work on, and SCI data volume is 16 kbps x 10 Ks x 2 (4 hrs a day)

Power numbers: CPU1: 4.0 W; CPU2: 240 mW; RF modem: 1.7 W; camera: 2.6 W; SCI1: 0.8 W; SCI2: 3.2 W

Power number details

Page 58: Integrated Management of Power Aware Computing & Communication Technologies

58

Bus segmentation with FireWire

Without segmentation: blue nodes can't be disabled. All nodes’ PHY layers must remain active. Request packets are broadcast to all nodes.

With segmentation: gray nodes can be safely disabled. They are in different segments from the active ones. Request packets are broadcast to only active nodes.

Page 59: Integrated Management of Power Aware Computing & Communication Technologies

59

Throughput improvement

[Figure: unsegmented FireWire 1394 bus (CPU1, CPU2 (bus controller), HD/NVM, RF modem, CAM, MC1 to MC3, SCI1, SCI2) compared with the segmented bus (SCI1, SCI2, RF modem, CAM, HD, MC1 to MC3, CPU1/DSP, CPU2/bus controller); some links are marked "No useful traffic"]

Unsegmented: 100 Mbps bandwidth, 9 s transfer time

Segmented: 300 Mbps, 5 s transfer time

Bus segmentation helps improve bus bandwidth.

Page 60: Integrated Management of Power Aware Computing & Communication Technologies

60

Bandwidth-enabled voltage scaling

[Figure: segmented bus with SCI1, SCI2, RF modem, CAM, HD, MC1 to MC3, CPU1/DSP, and CPU2/bus controller]

Use voltage scaling and clock scaling to decrease component power.

Bandwidth 100 Mbps: power consumption = 12.3 W

Could be 300 Mbps, keep it at 100 Mbps: power consumption after voltage scaling = 9.2 W

Page 61: Integrated Management of Power Aware Computing & Communication Technologies

61

Power/latency reduction

Before: power consumption = 12.3 W; data transfer time = 9 s; energy consumption = 111 J

After voltage scaling: power consumption = 9.2 W; data transfer time = 5 s; energy consumption = 46 J

Energy saving 58%; power saving 25%

Note: bus configuration power not counted
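The slide's percentages follow from energy = power x time. A quick check using only the figures above (variable names are illustrative):

```python
def energy_j(power_w, seconds):
    """Energy in joules for a constant power draw."""
    return power_w * seconds

before_j = energy_j(12.3, 9)   # about 111 J
after_j = energy_j(9.2, 5)     # 46 J

energy_saving = 1 - after_j / before_j
power_saving = 1 - 9.2 / 12.3

print(f"before: {before_j:.1f} J, after: {after_j:.1f} J")
print(f"energy saving: {energy_saving:.1%}, power saving: {power_saving:.1%}")
```

The energy saving is larger than the power saving because the transfer is both lower power and shorter.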

Page 62: Integrated Management of Power Aware Computing & Communication Technologies

62

Segmentation-enabled shutdown

All components’ bus interfaces are active. Entire bus is hot.

Non-operating bus segments are disabled. Non-operating components are disabled. Bus power is saved.

Task sequence: drive control (10 min), drive control (20 min), picture capture (6 min), science experiment (20 min)

Page 63: Integrated Management of Power Aware Computing & Communication Technologies

63

Combined energy savings from static techniques

Shutting down inactive nodes: 27 times of global bus configs.

Only 11 bus configurations

Config energy << 165 J

Transceiver energy: 1962 J

Config energy + transceiver energy < 1962 + 165 = 2127 J

Not shutting down inactive nodes: bus transceiver active all the time.

Transceiver energy: 150 mW x 10 x 3360 s = 5040 J

Transceiver: National Semi CS4103, PHY-active only mode.

2.4 X energy reduction!
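The 2.4x figure can be reproduced from the numbers on the slide. The sketch below takes the 1962 J transceiver energy and the 165 J configuration bound as given, and recomputes only the always-on case and the ratio; the 3360 s duration matches the 56 minutes of tasks (10 + 20 + 6 + 20 min) on the preceding slide.

```python
TRANSCEIVER_PHY_ON_W = 0.150   # CS4103 in PHY-on mode
NODES = 10
DURATION_S = (10 + 20 + 6 + 20) * 60   # 3360 s of tasks

# Without shutdown: every transceiver stays active the whole time.
always_on_j = TRANSCEIVER_PHY_ON_W * NODES * DURATION_S

# With shutdown: figures taken directly from the slide (upper bound).
with_shutdown_j = 1962 + 165

reduction = always_on_j / with_shutdown_j
print(f"{always_on_j:.0f} J vs < {with_shutdown_j} J: {reduction:.1f}x reduction")
```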

Page 64: Integrated Management of Power Aware Computing & Communication Technologies

64

Dynamic bus reconfiguration

[Figure: bus topology (SCI1, SCI2, RF modem, CAM, HD, MCS1 to MCS3, CPU1/DSP, CPU2/bus controller) during science experiments and radio frequency data transfer, SCI+RF (20+60 min)]

New task: send data from HD to RF modem! (continued from previous task)

Solution: dynamically change bus topology

[Figure: reconfigured bus topology (SCI1, SCI2, RF modem, CAM, HD, MC1 to MC3, CPU1, CPU2/bus controller) for the same science experiments and radio frequency data transfer, SCI+RF (20+60 min)]

Page 65: Integrated Management of Power Aware Computing & Communication Technologies

65

Energy savings from dynamic bus reconfiguration

Local configuration: 3; global configuration: none; re-segmentation: none; active transceivers: 7; active bus segments: 2

Energy: 12.7 x 3 x 1 + 0.15 x 7 x 4800 = 5078 J

Local configuration: none; global configuration: 1; re-segmentation: 1; active transceivers: 3+2; active bus segments: 1

Energy: 23.7 x 1 x 1 + 0.15 x 3 x 4800 + 0.05 x 2 x 4800 = 2664 J

Power number list: local config: 12.7 W; global config: 23.7 W; active transceiver: 150 mW; segmentation: software support; bus segment: proportional to bus length

1.9 X energy reduction!
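Both energy totals can be recomputed with a small helper that adds configuration energy to transceiver energy. The helper name and its argument layout are illustrative; the power figures and the 4800 s (20 + 60 min) duration are from the slides, and the 0.05 W figure matches the transceiver's 50 mW Standby mode.

```python
def bus_energy_j(config_w, config_count, config_s, transceivers, duration_s):
    """Configuration energy plus transceiver energy.
    `transceivers` is a list of (power_w, count) pairs."""
    config = config_w * config_count * config_s
    running = sum(p * n for p, n in transceivers) * duration_s
    return config + running

DURATION_S = (20 + 60) * 60  # SCI + RF period, 4800 s

# Fixed topology: 3 local configurations, 7 active transceivers.
fixed_j = bus_energy_j(12.7, 3, 1, [(0.15, 7)], DURATION_S)

# Dynamic reconfiguration: 1 global configuration,
# 3 active transceivers plus 2 in standby.
dynamic_j = bus_energy_j(23.7, 1, 1, [(0.15, 3), (0.05, 2)], DURATION_S)

print(f"fixed: {fixed_j:.0f} J, dynamic: {dynamic_j:.0f} J, "
      f"ratio: {fixed_j / dynamic_j:.1f}x")
```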

Page 66: Integrated Management of Power Aware Computing & Communication Technologies

66

Summary of architecture optimization

Towards loose coupling: reduced bus contention; increased parallel bandwidth; enabling voltage/frequency scaling

Application-driven clustering: communication bandwidth requirements between processes; knowledge from high-level behavioral model

Static optimization, 2.4x energy reduction: bus segmentation; cluster shutdown

Dynamic reclustering: 1.9x energy reduction

Page 67: Integrated Management of Power Aware Computing & Communication Technologies

67

Power management & optimization

Behavioral modeling: extract power-related attributes of all objects

Architecture modeling: use low-power devices or devices that can operate in a low-power mode

Partitioning: migration (merge computations from under-utilized processors onto one processor to improve utilization); segmentation (separate tightly coupled computations into clusters to localize communication)

Scheduling: arrange operation sequences on multiple processors / multiple power consumers to meet both performance and power requirements

Page 68: Integrated Management of Power Aware Computing & Communication Technologies

68

Behavioral model

Application-specific knowledge
  Input, output and function
  Dependency and precedence
  Control and data flow
  Timing and sequence

Software architecture
  Operating system features – real-time, centralized, distributed, etc.
  Execution model – event driven, interrupt, distributed agent, client-server, etc.
  Communication model – protocol stack and specification

Power-related attributes
  Data rate, execution time, CPU speed, memory size, communication path, etc.

Page 69: Integrated Management of Power Aware Computing & Communication Technologies

69

Allocation

Map behavioral objects to hardware
  Group related OS, communication, control and application objects into processing nodes
  Extract data objects into storage nodes
  Allocate components/packages for each processing node
  Arrange data storage for data nodes and optimize storage location to reduce communication
  Map communication paths to busses
  Set up the working mode of each component/package to fit the behavioral requirement

Extract attributes of each structure
  Function – computation, control, communication
  CPU utilization
  Bus traffic
  Power consumption

Page 70: Integrated Management of Power Aware Computing & Communication Technologies

70

Scheduling

Mapping of tasks to time slots
  Computation
  Communication

Mapping of power usage to time slots
  Mechanical devices
  Thermal subsystems
  Other electronics subsystems

Constraints
  Real-time deadlines, periods, min/max separation
  Power budget, power surge (min/max)
  Potentially scenario-driven

Page 71: Integrated Management of Power Aware Computing & Communication Technologies

71

Scheduling techniques

Deadline based real-time scheduling on multiprocessors

Rate-monotonic scheduling – extend existing RM scheduling to multiprocessors

Timing constraint graph scheduling – multiple serializable sequences in a single heart beat
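As background for the rate-monotonic item, the classic single-processor schedulability test (the Liu and Layland utilization bound) can be sketched as follows. This is the textbook uniprocessor test only. The multiprocessor extension mentioned on the slide is not described in the source and is not attempted here, and the task sets are made up for illustration.

```python
def rm_utilization_bound(n):
    """Liu-Layland bound n * (2**(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def rm_schedulable(tasks):
    """Sufficient (not necessary) rate-monotonic test on one processor.

    tasks: list of (execution_time, period) pairs in the same time unit.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


light = [(1, 4), (1, 5), (2, 10)]   # utilization 0.65
heavy = [(2, 4), (2, 5), (2, 10)]   # utilization 1.10
print(rm_schedulable(light), rm_schedulable(heavy))
```

A task set that fails this test may still be schedulable; an exact answer needs a response-time analysis.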

Page 72: Integrated Management of Power Aware Computing & Communication Technologies

72

Novel IMPACCT scheduler

A novel graphical tool
  Timing and power constraint visualization
  Transforms them into graph problems
  Gives designers a view of the power surge at run-time

Complete system-level model
  All power sources
  All power consumers

Power-aware scheduling
  Schedule operations based on power source output
  Both performance requirement and power constraint
  Regulate power surge
  Optimize for power efficiency and reduce execution time

Page 73: Integrated Management of Power Aware Computing & Communication Technologies

73

IMPACCT scheduler

[Demo figure: power vs. time chart; a bin is annotated with its starting time, ending time, power level and energy consumption]

Extended Gantt chart from real-time scheduling for a single processor
  Event – bins
    Timing – horizontal size
    Power – vertical size
    Energy – area of the bin
  Power surge – compacting bins downward
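The bin metaphor maps directly onto a small data structure: width is duration, height is power, area is energy, and stacking bins that overlap in time gives the instantaneous power draw. The sketch below is illustrative only; the class and function names are invented here and the sample bins are not from the demo.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    name: str
    start: float      # seconds
    duration: float   # seconds (horizontal size)
    power: float      # watts (vertical size)

    @property
    def end(self):
        return self.start + self.duration

    @property
    def energy(self):
        """Area of the bin, in joules."""
        return self.power * self.duration


def total_power_at(bins, t):
    """Sum of the power of every bin active at time t."""
    return sum(b.power for b in bins if b.start <= t < b.end)


def peak_power(bins):
    """Total power is piecewise constant and can only rise at a bin start."""
    return max((total_power_at(bins, b.start) for b in bins), default=0.0)


bins = [Bin("A", 0, 10, 2.0), Bin("B", 0, 4, 3.0), Bin("C", 2, 4, 1.5)]
print([b.energy for b in bins], peak_power(bins))
```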

Page 74: Integrated Management of Power Aware Computing & Communication Technologies

74

IMPACCT scheduler

[Demo figure: power vs. time chart showing constant task A, periodic task B, periodic task C, and task D, which follows B]

Scheduling chart for multi-processor and multiple power consumers
  Events can overlap vertically
    Multi-processor
    Multiple power consumers – electronics, mechanical, thermal
  Power awareness – min and max power supply

Page 75: Integrated Management of Power Aware Computing & Communication Technologies

75

IMPACCT scheduler

[Demo figure: power vs. time chart of tasks A, B, C and D, annotated with the deadline and scheduling space of B, the deadline and scheduling space of C, the min and max timing constraints of D, and the scheduling space of D. Two operations are illustrated: slide a bin within its timing space; squeeze/extend a bin to the available time slot]

Timing constraints – bin packing problem to satisfy horizontal constraints
  Independent tasks – moving bins horizontally
  Dependent tasks – moving grouped bins horizontally
  Power/voltage/clock scaling – extending/squeezing bins

Page 76: Integrated Management of Power Aware Computing & Communication Technologies

76

IMPACCT scheduler

[Demo figure: two power vs. time charts of tasks A, B, C and D. First chart: manual scheduling while monitoring power surge. Second chart: automated global scheduling to meet min-max power, with Max and Min levels marked and the annotations "Attack spike" and "Improve utilization"]

Power constraints – bin packing problem to satisfy vertical constraints
  Automatic optimization – let the tool do everything
  Manual optimization – visualizing power in manual scheduling
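One simple way to respect a vertical (peak power) constraint is greedy first-fit placement: take tasks in order and start each at the earliest slot in its window where the running total stays under the cap. This is a generic heuristic shown for illustration, not the IMPACCT algorithm, which the slides do not specify. Time is discretized into one-second slots and all task data is made up.

```python
def first_fit_schedule(tasks, max_power, horizon):
    """Greedy placement under a peak-power cap.

    tasks: list of (name, duration, power, release, deadline), with times
    in whole seconds and power in watts. Returns (start times, load per slot).
    """
    load = [0.0] * horizon
    starts = {}
    for name, duration, power, release, deadline in tasks:
        last_start = min(deadline, horizon) - duration
        for s in range(release, last_start + 1):
            slots = range(s, s + duration)
            if all(load[t] + power <= max_power for t in slots):
                for t in slots:
                    load[t] += power
                starts[name] = s
                break
        else:
            raise ValueError(f"no feasible slot for task {name}")
    return starts, load


tasks = [("A", 10, 2.0, 0, 10), ("B", 4, 3.0, 0, 10), ("C", 4, 3.0, 0, 10)]
starts, load = first_fit_schedule(tasks, max_power=6.0, horizon=10)
print(starts, max(load))
```

Here C cannot run alongside A and B without exceeding 6 W, so it slides right until B has finished. First-fit depends on task order and can fail on instances that a smarter search would solve.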

Page 77: Integrated Management of Power Aware Computing & Communication Technologies

77

Example revisited – Mars Rover

System specification
  6 wheel motors
  4 steering motors
  System health check
  Hazard detection

Power supply
  Battery (non-rechargeable)
  Solar panel

Power consumption
  Digital – computation, imaging, communication, control
  Mechanical – driving, steering
  Thermal – motors must be heated in low-temperature environment

Page 78: Integrated Management of Power Aware Computing & Communication Technologies

78

Timing constraints – Mars Rover

Operation | Duration | Timing constraints
Health check | 10 s | Once in every 10-minute interval
Heating steering motors | 5 s |
Heating wheel motors | 5 s |
Hazard detection | 10 s | Before steering
Steering | 5 s | Before driving. ALL four steering motors must be heated during the 50-second period prior to steering.
Driving | 10 s | ALL six wheel motors must be heated during the 50-second period prior to driving.
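The timing constraints in this table can be turned into a small schedule checker. The sketch below makes two assumptions that the slide does not spell out: "before" means the earlier operation finishes before the later one starts, and "heated during the 50-second period" means the whole 5 s heating operation lies inside that window. The periodic health check is left out, and all names are illustrative.

```python
HEAT_WINDOW_S = 50
DURATION_S = {"hazard": 10, "steer": 5, "drive": 10, "heat": 5}


def violations(starts, heat_steer, heat_wheel):
    """Return a list of violated constraints (empty if the schedule is valid).

    starts: start times in seconds for "hazard", "steer" and "drive".
    heat_steer, heat_wheel: start times of the 4 steering-motor and
    6 wheel-motor heating operations.
    """
    found = []
    if starts["hazard"] + DURATION_S["hazard"] > starts["steer"]:
        found.append("hazard detection must finish before steering")
    if starts["steer"] + DURATION_S["steer"] > starts["drive"]:
        found.append("steering must finish before driving")
    for label, heats, count, motion in (
        ("steering", heat_steer, 4, "steer"),
        ("wheel", heat_wheel, 6, "drive"),
    ):
        t0 = starts[motion]
        in_window = [h for h in heats
                     if t0 - HEAT_WINDOW_S <= h and h + DURATION_S["heat"] <= t0]
        if len(heats) != count or len(in_window) != count:
            found.append(f"all {count} {label} motors must be heated in the "
                         f"{HEAT_WINDOW_S} s before {motion}")
    return found


ok = violations({"hazard": 10, "steer": 20, "drive": 25},
                heat_steer=[0, 0, 5, 5],
                heat_wheel=[0, 0, 5, 5, 10, 10])
late = violations({"hazard": 10, "steer": 20, "drive": 70},
                  heat_steer=[0, 0, 5, 5],
                  heat_wheel=[0, 0, 5, 5, 10, 10])
print(ok, late)
```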

Page 79: Integrated Management of Power Aware Computing & Communication Technologies

79

Scheduling method

Constraint graph construction
  Nodes: operations
  Edges: precedence relationship between operations

Resource specification
  Resource: an executing unit that can perform operations independently
    Six thermal resources for wheel heating
    Four thermal resources for steer motor heating
    One mechanical resource for driving
    One mechanical resource for steering
    One computation resource for control
  Operations on one resource must be serialized

Scheduling
  Primary resource selection
  Schedule primary resource by applying graph algorithms
  Auxiliary resources and power requirement are considered as scheduling constraints
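A common graph algorithm for this kind of constraint graph is longest-path relaxation (Bellman-Ford style). An edge (u, v, w) means "v starts at least w seconds after u", and a maximum separation becomes a backward edge with negative weight, like the -ths and -thw labels in the constraint graph. Whether IMPACCT uses exactly this algorithm is not stated in the slides, so treat the following as a generic sketch with made-up node names and a reduced example.

```python
def earliest_starts(nodes, edges, source):
    """Earliest start times satisfying t[v] - t[u] >= w for each edge (u, v, w).

    Raises ValueError if the constraints are inconsistent (positive cycle).
    """
    neg_inf = float("-inf")
    t = {n: neg_inf for n in nodes}
    t[source] = 0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if t[u] != neg_inf and t[u] + w > t[v]:
                t[v] = t[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in edges:
        if t[u] != neg_inf and t[u] + w > t[v]:
            raise ValueError("timing constraints are inconsistent")
    return t


nodes = ["start", "heat", "hazard", "steer", "drive"]
edges = [
    ("start", "heat", 0),
    ("start", "hazard", 0),
    ("hazard", "steer", 10),   # hazard detection (10 s) precedes steering
    ("heat", "steer", 5),      # heating (5 s) precedes steering
    ("steer", "heat", -50),    # heating starts at most 50 s before steering
    ("steer", "drive", 5),     # steering (5 s) precedes driving
]
print(earliest_starts(nodes, edges, "start"))
```

Resource serialization and power limits are not modeled here; on the slide they enter as additional scheduling constraints on top of the graph.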

Page 80: Integrated Management of Power Aware Computing & Communication Technologies

80

Constraint graph

[Figure: constraint graph. Nodes, labeled operation / duration: System health check / Thc (two instances), Heat wheel 1 to 6 / Thw, Heat steer 1 to 4 / Ths, Hazard detection / Thd, Steer / Ts, Drive / Td. Edge weights shown: thc, -(thc + Thc), -thw, -ths]

Page 81: Integrated Management of Power Aware Computing & Communication Technologies

81

Resource specification

[Figure: constraint graph with each operation split across resource lanes (Computation, Mechanical, Thermal). Node labels as shown: Health check (C) / Thc / Phc_C; Hazard detection (C) / Thc / Phc_C; Heat steer i (C) / Ths_C / Phs_C; Heat steer i (T) / Ths_T / Phs_T; Heat wheel j (C) / Thw_C / Phw_C; Heat wheel j (T) / Thw_T / Phw_T; Steer (C) / Ts_C / Ps_C; Steer (M) / Ts_M / Ps_M; Drive (C) / Td_C / Pd_C; Drive (M) / Td_M / Pd_M. Edge weights shown: -ths + Ths_E, -thw + Thw_E, thc, -(thc + Thc)]

Page 82: Integrated Management of Power Aware Computing & Communication Technologies

82

Scheduling graph

[Figure: scheduling graph. Primary resource: Computation. Auxiliary resources: Mechanical, Thermal. Node labels as shown: Health check (C) / Thc / Phc_C; Hazard detection (C) / Thc / Phc_C; Heat steer i (C) / Ths_E / Phs_E; Heat steer i (T) / Ths_T / Phs_T; Heat wheel j (C) / Thw_E / Phw_E; Heat wheel j (T) / Thw_T / Phw_T; Steer (C) / Ts_C / Ps_C; Steer (M) / Ts_M / Ps_M; Drive (C) / Td_C / Pd_C; Drive (M) / Td_M / Pd_M. Edge weights shown: thc, -(thc + Thc), -ths, -ths + Ths_E, -thw, -thw + Thw_E, -Ts_C + Ts_M]

Page 83: Integrated Management of Power Aware Computing & Communication Technologies

83

Example – Mars Rover

Power constraints
  Different solar power supply over time
  Different power consumption over temperature/time

Resource | Duration (s) | Power (W) at -40 degC | Power (W) at -60 degC | Power (W) at -80 degC
Solar power | | 14.9 | 12 | 9
Battery | | 10 max | 10 max | 10 max
Heat one motor | 5 | 5.1 | 6.2 | 7.5
Heat two motors | 5 | 7.6 | 9.5 | 11.3
Drive | 10 | 7.5 | 10.9 | 13.8
Steer | 5 | 4.3 | 6.5 | 8.1
Hazard detection | 10 | 5.1 | 6.7 | 7.3
Health check | 10 | 4.7 | 5.7 | 6.3
CPU | Constant | 2.5 | 3.5 | 3.7
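The power table lends itself to a quick feasibility check: given a set of operations running at the same time, how much power must the battery supply on top of the solar panel, and does that stay within the 10 W battery limit? The sketch below assumes the CPU is always on, that the battery tops up whatever solar power cannot cover, and that the solar values pair with the temperature columns as laid out in the table. The dictionary keys and function names are invented for illustration.

```python
TEMPS_C = (-40, -60, -80)
SOLAR_W = dict(zip(TEMPS_C, (14.9, 12.0, 9.0)))
BATTERY_MAX_W = 10.0

# Power in watts per resource, one value per temperature column.
POWER_W = {
    "heat_one_motor": (5.1, 6.2, 7.5),
    "heat_two_motors": (7.6, 9.5, 11.3),
    "drive": (7.5, 10.9, 13.8),
    "steer": (4.3, 6.5, 8.1),
    "hazard_detection": (5.1, 6.7, 7.3),
    "health_check": (4.7, 5.7, 6.3),
    "cpu": (2.5, 3.5, 3.7),
}


def battery_draw_w(concurrent_ops, temp_c):
    """Battery power needed when the given operations run alongside the CPU."""
    col = TEMPS_C.index(temp_c)
    demand = POWER_W["cpu"][col] + sum(POWER_W[op][col] for op in concurrent_ops)
    return max(0.0, demand - SOLAR_W[temp_c])


def feasible(concurrent_ops, temp_c):
    return battery_draw_w(concurrent_ops, temp_c) <= BATTERY_MAX_W


ops = ["drive", "heat_two_motors"]
print(feasible(ops, -40), feasible(ops, -80))
```

Driving while heating two motors fits at -40 degC but not at -80 degC, where demand rises and solar supply falls. This is the kind of condition that forces operations to be serialized.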

Page 84: Integrated Management of Power Aware Computing & Communication Technologies

84

Previous solution by JPL

[Figure: "JPL Solution - High Solar Power (a)", a stacked power chart (axis 0 to 25) over time (5 to 85) with series Heat steer, Heat wheel, Drive, Steer, Hazard detection, Health check and CPU. System heart-beat: moving two steps; (a) begin with health check, (b) no health check]

Over-constrained, conservative
  Serialize every operation to satisfy power constraint
  Longer execution time and under-utilization of solar power

No scheduling tool is used – manual scheduling

Not power-aware
  Scheduling without considering power sources and consumers

Page 85: Integrated Management of Power Aware Computing & Communication Technologies

85

Solution 1: high solar power (14.9W)

[Figure: system heart-beat, moving two steps; (a) begin with health check, (b) no health check]

Max solar power: 14.9W at noon
  Improved utilization of solar power
  Automated scheduling – use scheduling tools

Aggressive – do as much as possible
  Heating motors while doing other operations
  Fastest moving speed – no waiting on heating

Page 86: Integrated Management of Power Aware Computing & Communication Technologies

86

Solution 2: typical solar power (12W)

[Figure: system heart-beat, moving two steps; (a) begin with health check, (b) no health check]

Moderate solar power output – 12W
  Improved utilization of solar power
  Automated scheduling – use scheduling tools

Moderately aggressive – avoid exceeding power limit
  Relaxed constraint – heating motors while doing other operations
  Faster moving speed – some waiting time on heating

Page 87: Integrated Management of Power Aware Computing & Communication Technologies

87

Solution 3: low solar power (9W)

[Figure: system heart-beat, moving two steps; (a) begin with health check, (b) no health check]

Minimum solar power output – 9W
  Restricted constraint – serialize operations
  Automated scheduling – use scheduling tools

Conservative – same as JPL solution
  Slow moving speed
  Full utilization of low solar power

Page 88: Integrated Management of Power Aware Computing & Communication Technologies

88

Comparison

Solar power output | Battery energy | Solar energy | % of solar energy | Time | Moving distance
14.9 | 0 | 672.5 | 60% | 75 | 2 steps - 14cm
12 | 55 | 817 | 91% | 75 | 2 steps - 14cm
9 | 388 | 675 | 100% | 75 | 2 steps - 14cm

Solar power output | Battery energy | Solar energy | % of solar energy | Time | Moving distance
14.9 | 6 | 604 | 81% | 50 | 2 steps - 14cm
12 | 149 | 720 | 100% | 60 | 2 steps - 14cm
9 | 388 | 675 | 100% | 75 | 2 steps - 14cm

JPL's previous solution
  Conservative – long execution time, low solar power utilization
  Not power-aware – same schedule for all cases
  Not intended to use battery energy

Our solution
  Adaptive – speeds up when solar power supply is high
  Power-aware – smart scheduling on different power supply/consumption
  Uses battery energy when necessary
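The "% of solar energy" column in the two tables can be cross-checked by dividing the solar energy used by the solar energy available over the heart-beat (solar power times time). The sketch assumes watts, seconds and joules, which the slide does not state but which reproduces every percentage. The table labels are an inference: the first table uses 75 for every case, which matches "same schedule for all cases".

```python
# (solar_power_w, solar_energy_used_j, time_s) for each row of the slide's tables.
FIRST_TABLE = [(14.9, 672.5, 75), (12, 817, 75), (9, 675, 75)]   # presumably JPL
SECOND_TABLE = [(14.9, 604, 50), (12, 720, 60), (9, 675, 75)]    # presumably IMPACCT


def solar_utilization_pct(rows):
    """Percent of the available solar energy that each schedule actually uses."""
    return [round(100 * used / (power * time)) for power, used, time in rows]


print(solar_utilization_pct(FIRST_TABLE))
print(solar_utilization_pct(SECOND_TABLE))
```

The shorter heart-beats in the second table are what raise utilization: less solar energy is offered in 50 s than in 75 s, and the schedule uses more of it.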

Page 89: Integrated Management of Power Aware Computing & Communication Technologies

89

Application-level evaluation

Mission description
  Target location – 48 (distance-) steps away from current location

Power condition
  14.9W solar power for first 10 minutes, 12W for next 10 minutes, 9W thereafter

Metrics
  Execution time
  Total energy drawn from battery

Time frame | Solar power (W) | JPL travel distance | JPL time | JPL energy cost | IMPACCT travel distance | IMPACCT time | IMPACCT energy cost
0-610 | 14.9 | 16 | 610 | 0 | 24 | 610 | 72
611-1220 | 12 | 16 | 610 | 440 | 20 | 610 | 1569.5
1221- | 9 | 16 | 610 | 3114 | 4 | 160 | 786
Total | | 48 | 1830 | 3554 | 48 | 1380 | 2427.5
Improvement | | | | | | 24.6% | 31.7%
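The two improvement figures follow directly from the totals in the evaluation table: each is the relative reduction of the IMPACCT total against the JPL total. A minimal check, with variable names chosen here for illustration:

```python
# Totals from the slide's table: (time, energy cost drawn from battery).
JPL_TOTAL = (1830, 3554)
IMPACCT_TOTAL = (1380, 2427.5)


def improvement_pct(baseline, improved):
    """Relative reduction of `improved` against `baseline`, in percent."""
    return round(100 * (baseline - improved) / baseline, 1)


time_gain = improvement_pct(JPL_TOTAL[0], IMPACCT_TOTAL[0])
energy_gain = improvement_pct(JPL_TOTAL[1], IMPACCT_TOTAL[1])
print(time_gain, energy_gain)
```

The IMPACCT energy total is also the sum of its three per-interval costs (72 + 1569.5 + 786 = 2427.5), and the JPL total is 0 + 440 + 3114 = 3554.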

Page 90: Integrated Management of Power Aware Computing & Communication Technologies

90

Application-level evaluation

Power-awareness
  Execution speed scales with power condition adaptively

Smart schedule
  Maximize best case
  Avoid worst case

Tradeoff
  Power vs. performance
  Energy renewability

Application-specific
  Application-level knowledge
  Working mode parameters of components

Page 91: Integrated Management of Power Aware Computing & Communication Technologies

91

Program plans and milestones

Page 92: Integrated Management of Power Aware Computing & Communication Technologies

92

Development plans

Web-based CAD tool
  Perl/CGI scripts for configuration
  Java applets for interactive scheduling UI
  Interface with database engine

Interface with commercial CAD backend
  Detailed power estimation tools
  Functional simulation with proprietary models

Rationale
  No software installation needed by end user
  Ready to use by everyone on the Internet
  Open source with all publicly available development tools

Page 93: Integrated Management of Power Aware Computing & Communication Technologies

93

Status & accomplishments to date

Architecture | Component | Busses | Software | Example
Mars Pathfinder | completed | completed | completed | completed
X-2000 | completed | completed | in progress | planned

Specification | Low-level | High-level | UI | Integration
Library | in progress | in progress | in progress | planned
Authoring | in progress | in progress | in progress | planned

Synthesis | Static | Hybrid | Dynamic | Integration
Partitioning | in progress | planned | planned | planned
Scheduling | in progress | planned | planned | planned
Bus segmentation | in progress | planned | planned | planned
Voltage scaling | planned | planned | planned | planned

Output | Low-level | High-level | UI | Integration
Simulation | planned | planned | planned | planned
Configuration | planned | planned | planned | planned

Page 94: Integrated Management of Power Aware Computing & Communication Technologies

94

IMPACCT schedule

[Figure: Gantt chart from July 2000 to Jan 2001 (monthly columns) with rows core tool, UI, Library, Authoring, Partitioning, Scheduling, Segmentation, Volt. Scaling and Simulation; legend: planned, in progress]

Page 95: Integrated Management of Power Aware Computing & Communication Technologies

95

Original schedule

[Figure: timeline from kickoff (2Q 00) through 2Q 01 to 2Q 02, with a network option. Task groups shown, in the order extracted:]

System modeling; Coordination synthesis; Architecture definition; Static partitioning; Component partitioning

Component simulator; PCL benchmarking; Synthesizable components; System benchmarking

Power aware design techniques; PCL definition; Simulatable components; Benchmark identification

Authoring tool v1.0; Dynamic partitioning; Simulator v1.0; Component partitioning

Page 96: Integrated Management of Power Aware Computing & Communication Technologies

96

Updated schedule

[Figure: timeline from kickoff (2Q 00) through 2Q 01 to 2Q 02, labeled Year 1, Year 2 and option. Task groups shown, in the order extracted:]

Static & hybrid optimizations
  Partitioning / allocation
  Scheduling
  Bus segmentation
  Voltage scaling
Library
  COTS components
  FireWire and I2C bus models
Static composition authoring
High-level simulation
Benchmark identification
Architecture definition

Dynamic optimizations
  Task migration
  Processor shutdown
  Bus segmentation
  Frequency scaling
Library
  Parameterizable components
  Parameterizable bus models
Reconfiguration authoring
Architecture reconfiguration
Low-level simulation
System benchmarking

Page 97: Integrated Management of Power Aware Computing & Communication Technologies

97

Quarterly schedule

[Figure: quarter-by-quarter timeline from 2000 through 2Q 2002; the assignment of tasks to quarters is not recoverable from the extracted text. Tasks shown, in the order extracted:]

FireWire and I2C bus models
Static bus segmentation
Architecture definition
Low-level simulation
System benchmarking
Frequency scaling
High-level simulation
Hybrid partitioning / allocation
Voltage scaling
Parameterizable components
Dynamic scheduling
Parameterizable bus models
COTS components library
Static scheduling
Benchmark identification
Static partitioning / allocation
Hybrid scheduling
Static composition authoring
Dynamic processor shutdown
Dynamic bus segmentation
Dynamic reconfig. authoring
Hybrid bus segmentation
Architecture reconfiguration
Dynamic task migration

Page 98: Integrated Management of Power Aware Computing & Communication Technologies

98

Financial information

Page 99: Integrated Management of Power Aware Computing & Communication Technologies

99

IMPACCT budget

Months 1-6: $180,000
Months 7-12: $180,000
Second year: $400,000

[Figure: bar chart "IMPACCT Budget" showing funding (axis 0 to 450,000) for months 1-6, months 7-12 and second year]

Page 100: Integrated Management of Power Aware Computing & Communication Technologies

100

Budget distribution

[Figure: chart "Budget Distribution" with categories Salaries, Benefits, Equipment, Travel, Other, Indirect, USC and JPL]

Page 101: Integrated Management of Power Aware Computing & Communication Technologies

101

http://www.ece.uci.edu/impacct/

Page 102: Integrated Management of Power Aware Computing & Communication Technologies

102

Bibliography

[Mehra97] R. Mehra et al., "A partitioning scheme for optimizing interconnect power", IEEE Journal of Solid-State Circuits, vol. 32, no. 3, March 1997.

[Shin98] Y. Shin et al., "Reduction of bus transitions with partial bus-invert coding", Electronics Letters, vol. 34, no. 7, IEE, 2 April 1998, pp. 642-3.

[Benini97] L. Benini et al., "Asymptotic zero-transition activity encoding for address buses in low-power microprocessor-based systems", Proceedings Great Lakes Symposium on VLSI, Los Alamitos, CA, USA: IEEE Comput. Soc. Press, 1997, pp. 77-82.

[Nakase98] Y. Nakase et al., "Complementary half-swing bus architecture and its application for wide band SRAM macros", IEE Proceedings - Circuits, Devices and Systems, vol. 145, no. 5, IEE, Oct. 1998, pp. 337-42.

[Zhang98] Y. Zhang et al., "An alternative architecture for on-chip global interconnect: segmented bus power modeling", Thirty-Second Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 1-4 Nov. 1998.

[Kernighan70] B. Kernighan et al., "An Efficient Heuristic Procedure for Partitioning Graphs", Bell System Technical Journal, vol. 49, no. 2, Feb. 1970, pp. 291-307.

[Hauck95] S. Hauck et al., "Logic Partition Orderings for Multi-FPGA Systems", International Symposium on Field-Programmable Gate Arrays, 1995.

Page 103: Integrated Management of Power Aware Computing & Communication Technologies

103

Program Goals

Evaluation, exploration
  Power usage, performance, cost
  Alternative configurations, algorithms

Optimization
  Achieve most effective power usage
  High-level, global knowledge

Tool integration
  Many point tools, independent techniques

Specialization
  Configurable platform

Reuse
  Take advantage of rich collection of COTS
  Not to re-design from scratch

Page 104: Integrated Management of Power Aware Computing & Communication Technologies

104

Technical approach

High-level abstraction
  Component vs. composition
  Separate models for architecture and behavior

Synthesis and optimization of power manager
  Architecture reconfiguration
  Scheduling for optimal power usage
  Adaptable to different power management policies

Aggressive, domain-knowledge
  Encompass mechanical / thermal power
  Aware of power supply model

Page 105: Integrated Management of Power Aware Computing & Communication Technologies

105

System level modeling

Architectural modeling
  COTS components
  Component encapsulation
  Bus architecture
  System interconnect

Behavioral modeling
  Application-specific knowledge
  Software architecture
  Mission goals
  High-level constraints

Page 106: Integrated Management of Power Aware Computing & Communication Technologies

106

Power-aware coordination

Protocols
  Coordinate power usage, e.g. peak power, resource arbitration
  Multiple versions of given algorithm

Components
  Adaptable to different power management policies, not hardwired
  Usable in new applications even if not designed to be power aware

Synthesis
  Coordination controller ("mode manager")
  Optimization to minimize control dependency
  Optimality depends on architectural mapping

Page 107: Integrated Management of Power Aware Computing & Communication Technologies

107

Measuring power consumption (1)

Different levels of analysis

By # of operations:
  (+) easy to implement
  (-) neglects the different sizes of modules
  Appropriate for comparing two different architectures with similar modules

# of lines of code:
  (+) approximates the size of the hardware to be implemented
  (-) may be too simple to estimate power consumption
  Together with the number of operations, gives an indication of the power consumption of each module

# of F/F:
  (+) more accurate measure
  (-) requires finding the relationship between # of F/F and # of lines of code
  The number of F/F is the lowest-level hardware characteristic in the high-level simulator
  Control unit and data path have different power dissipation patterns even with the same number of gates

Page 108: Integrated Management of Power Aware Computing & Communication Technologies

108

Measuring power consumption (2)

# of gates:
  (+) makes accurate power estimation possible
  (-) needs register transfer level (RTL) description and power analysis tools
  To get accurate hardware information, we have to implement RTL modules
  Input/output statistics of each module are also necessary

Page 109: Integrated Management of Power Aware Computing & Communication Technologies

109

USC's Work in Progress

Select a processor simulator

Analyze the hardware description of each module

Estimate the power consumption of each module

Find performance-power ratio

Design a minimum power processor model

Page 110: Integrated Management of Power Aware Computing & Communication Technologies

110

Program impact & transitions

Productivity
  Fully exploit off-the-shelf components
  Rapid turnaround time to architecture

Massive scalability
  Protocol-based power management
  System architecture platform

Robust methodology
  Unified functional/power correctness
  Confidence in complex design points

Page 111: Integrated Management of Power Aware Computing & Communication Technologies

111

Bus Architecture Perspectives (X)

Parallelism
  Parallel: high cost, high throughput, enable design exploration
  Serial: low cost, constrained throughput, simple bus interface

Locality
  Functional
  Spatial

Adaptivity
  Adaptive
  Deterministic

Page 112: Integrated Management of Power Aware Computing & Communication Technologies

112

FireWire (IEEE 1394) bus

Communication model
  Asynchronous transfer
  Isochronous transfer

Arbitration model
  Fair gap arbitration
  Priority arbitration

Configuration model
  Bus initialization
  Tree identification
  Self identification

Service model
  Physical layer
  Link layer
  Transaction layer

Page 113: Integrated Management of Power Aware Computing & Communication Technologies

113

Architectural Model

Component – parameterized COTS
  Type – processor, memory, I/O, DSP, bus, etc.
  Interface – how the components can be connected to each other
  Modes – operation mode parameters: voltage, clock speed, bandwidth, power consumption, etc.

Package – a bundle of connected components that performs a certain operation
  A set of connected components
  Internal/external interface – how components are connected
  Modes – configuration space of the collected components, specified by each component's working mode and collective attributes, e.g., voltage, speed, power, etc.

Page 114: Integrated Management of Power Aware Computing & Communication Technologies

114

Approach: system-level modeling

High-level abstractions
  Employ application-specific knowledge in system models
  Encompass multiple domains – electronics, mechanical, thermal

System modeling
  Behavioral modeling – software architecture, application-specific knowledge
  Architectural modeling – hardware platform built on top of parameterized components
  Partitioning – mapping behavioral objects to architectural structures
  Scheduling – a valid sequence of concurrent/parallel operations on multiple processors that satisfies real-time requirements

Page 115: Integrated Management of Power Aware Computing & Communication Technologies

115

Example – Mars Rover

System specification
  6 wheel motors
  4 steering motors
  System health check
  Hazard detection

Power supply
  Battery (non-rechargeable)
  Solar panel

Power consumption
  Digital – computation, imaging, communication, control
  Mechanical – driving, steering
  Thermal – motors must be heated in low-temperature environment

Page 116: Integrated Management of Power Aware Computing & Communication Technologies

116

Scheduling example – Mars Rover

Power constraints
  Solar panel: 14.9W peak power @ noon, 11W for 6hr/sol
  Battery: 10W max power output, 150W-hr energy storage
  CPU: 3.7W, constant for 4h/sol
  Health check: 6.3W, 10s
  Hazard detection: 7.3W, 10s
  Heating: 7.5W (1 motor) or 11.3W (2 motors), 5s
  Steering: 6.8W, 5s (7º/s)
  Driving: 12.4W, 10s (7cm)

Existing solution
  Serialize each operation to satisfy power constraint
  Conservative – longer execution time and under-utilization of solar power
  No scheduling tool is used

Page 117: Integrated Management of Power Aware Computing & Communication Technologies

117

Scheduling techniques

Constraint logic solving
  Transfer all constraints into a pure mathematical form
  Use tools to solve the problem in mathematical domain

Example – CLPR
  Constraints
    C1 > 3, C1 < 5, C2 > 2, C2 < 4    # two power consumers
    C1 + C2 < S, S > 6, S < 12        # one power source
  Inputs
    C1 = 4.5, S = 7
  Results
    C2 < 2.5
    2 < C2
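The result of the example can be reproduced by hand-propagating the bounds: with C1 and S fixed, C1 + C2 < S tightens the upper bound on C2 to S - C1, and the lower bound 2 is unchanged. The sketch below hard-codes this one constraint set for illustration; it is not a general constraint logic solver and does not use any CLP(R) system.

```python
def c2_range(c1, s):
    """Open interval (lo, hi) of feasible C2 values, or None if infeasible.

    Encodes the slide's constraints:
      3 < C1 < 5,  2 < C2 < 4,  6 < S < 12,  C1 + C2 < S
    All bounds are strict, so the returned endpoints are excluded.
    """
    if not (3 < c1 < 5 and 6 < s < 12):
        return None
    lo = 2.0
    hi = min(4.0, s - c1)   # C1 + C2 < S  implies  C2 < S - C1
    return (lo, hi) if lo < hi else None


print(c2_range(4.5, 7))
```

For the slide's inputs this gives 2 < C2 < 2.5, matching the listed results.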

Page 118: Integrated Management of Power Aware Computing & Communication Technologies

118

Evaluation

Application-level evaluation
  Metrics based on overall mission objectives
  Constraint-driven solutions

Power-related scenario
  Various power constraints (supply/consumption) over different stages of application
  Power-aware adaptive scheduling for different stages