
INDEX [link.springer.com]978-1-4757-6… ·  · 2017-08-29INDEX ARM instruction set, software energy profiling, StrongARM SA-HOO, ... hardware-software interaction, 200-208 code

Mar 07, 2018


duonghanh

INDEX

Abstract Syntax Tree (AST), 172-173, 174 Access time, sandwich/spin tunneling cell

architecture, 23-24 Accumulator registers, 81 ACPI (Advanced Configuration and Power

Interface), 118-119, 128, 154-155, 286 Active power: see Flip-flops Activity reduction, pipeline gating as, 60-61 Adaptive bias logic, queue design, 50 Adaptive CAM array read simulation, 44-46 Adaptive filter, FORTE, 250, 252, 253 Adaptive issue queues: see Queue; Queue design Adaptive logic and processing

circuit-level evaluation of power and performance overhead, 38

dynamic: see Dynamic adaptation satellite-based parallel signal processing, 251-

252 application partitioning, 251-252 architecture, 251, 252

Address bus, transition count reduction, 213, 164 Addressing

energy exposed instruction sets: see Direct addressing, stores with

and StrongARM SA-1100 current consumption, 344-345

Admission control code, API, 161 Advanced Configuration and Power Interface

(ACPI), 118-119, 128, 154-155, 286 Agilent/HP 0.25 µm CMOS, 8-13


Aircraft, design-time optimization for, 228-233 aircraft examples and analysis, 231-233 domain considerations, 229-230 endurance, 230 endurance as function of energy conservation,

230-231 Algorithmic transformations, 212-213 Alpha 21264, 36 alpha-queue, 144-145 Annapolis Micro Systems, WILDSTAR VHDL

templates, 182, 183 Annotations to AST nodes, SUIF, 174 API: see Application programming interface,

power-aware Application-level power awareness, 227-242

design-time optimization for aircraft, 228-233 aircraft examples and analysis, 231-233 domain considerations, 229-230 endurance, 230 endurance as function of energy conservation,

230-231 dynamic energy allocation for cooperating

sensors, 234-241 application of theory, 236 behavior of system, 238-240 energy allocation and sensor measurements,

235 fusing sensor measurements, 234 minimizing variance through energy

allocation, 235-236


Application-level power awareness (cont.)

dynamic energy allocation for cooperating sensors (cont.)

parameterized sensor model, 236 two-sensor problem, 237-238

related problems, 240-242 Application partitioning, satellite-based parallel

signal processing, 251-252 Application programming interface, power-aware,

153-164 features, 156-160

interface between power manager and hardware, 157-158

interface between power manager and operating system, 158-160

future directions, 163-164 implementation, 160-161 predictive power-aware scheduling with eCos,

160-161 requirements, 155-156 results, 161-163

Application Specific ICs (ASICs), 170 Architectural innovations, 212, 213 Architectural level power modeling, 317-336

cycle simulator augmentation, 322-325 omitted details, 322-324 power estimation methodology, 324-325

future directions, 334-335 implementation of cycle-accurate power

estimator, 325-334 data structure and microarchitectural block

models, 326-328 power modeling techniques, 329-334

power metrics, 319-320 power modeling techniques, 329-334

clock distribution tree, 334-335 data path components, 332-333 memory models, 330-332 random logic and interconnections, 333

previous work, 320-322 Architecture: see also Energy-exposed instruction

sets; Microarchitecture design; RISC-accumulator architecture

HDLs: see PACT HDL instruction level parallelism, 63 power reduction approaches, 60 RISC, ARM-like, 211-223; see also ARM-like

RISC architecture sandwich/tunneling memory, 23-25, 31 satellite-based parallel signal processing, 251,

252 Architecture description file, PACT HDL

architecture independence, 180


ARM instruction set, software energy profiling, StrongARM SA-1100, 343

ARM-like RISC architecture, 211-223 benchmarks, 218-219, 220, 222 C compiler, 213 experimental setup, 213-217, 218 future prospects, 222 methodology, 218-219 metrics, 219, 222 off-chip memory and PCB bus models, 215-

217, 218 previous research, 212-213 results, 219-221, 222 scaling off-chip memory bus frequency and

voltage, 219-221, 222 Verilog simulation environment, 213-215

ARM processor, PACT ARM, 170 ARM Project Manager (APM), 341 ARM simulator, JouleTrack, 356 Array interleaving, 207-208 Arrays, PACT HDL, 174, 177 ASIC, 182, 183; see also PACT HDL Associative CAM-tag cache, 91, 92 AST (Abstract Syntax Tree), 172-173, 174 AS/X simulations, 43-44, 51-52 Average current consumption, 341

Backend, HDL AST, 181 Back-end (low-level) compiler optimization, 193-196 Balance equation, 113 Barrier instruction, 82-83 Barriers

restart analysis, 86 system calls, 85, 86

Base cost program blocks, 341 Baseline processor, energy exposed instruction

sets, 81 Batteries, FORTE, 252-253 Benchmarks, ARM-like RISC architecture, 218-

219, 220, 222 Bias logic table, 50 Bit-width analysis and reduction, 185 Block buffering, combined optimizations, 201-202 Body effect, 20 Branch instructions, restart analysis, 86 Branch prediction

confidence estimators, 61-62 pipeline gating, 64-65

Buffers, power reduction approaches, 60 Bus encoding, 164 Bus energy, compiler optimizations, 195-196 Bus modeling

ARM-like RISC architecture, 215-217, 218


Bus modeling (cont.)

ARM-like RISC architecture (cont.) frequency, 219-221, 222, 221, 222 power dissipation, 217, 218, 220

microarchitecture models, 326, 327-328 Bus state controller, 342 Bus transaction cycles, 323-324, 326 Bypass latches, 81, 87-90

compiler analysis, 89 evaluation, 90 ISA enhancements, 88

Cache architectural innovations, 213 block buffering, 201 CAM-tag, 91, 92 CPU cycle numbers, worst-case, 131 energy exposed instruction set techniques, 81 HDL AST optimization, 186 loop analysis, 185 loop transformations and tiling, 196-197 power reduction approaches, 60 software energy profiling, StrongARM SA-1100,

342, 343 subbanking, 201 web server workload, 269

Cache misses scheduling slack, 69, 70 compiler optimizations, 198-200

CACTI, 321, 331 Cai-Lim simulator, 319 CAM array read simulation, adaptive, 44-46 CAM/RAM queue design, 38-41, 55-56 CAM-tag cache, 81, 91, 92 Capacitance

power reduction, 60 transistor width and, 304, 305

C/C++, 171-172; see also PACT HDL ARM-like RISC architecture, 213 energy consumption estimation, JouleTrack, 356 FORTE power usage optimization, 255, 256 GNU compiler, 255, 256 tag-unchecked compiler analysis, 95

CCMOS Flip-Flop, modified, 11, 12, 13 Checkpointed state, 83 Chip memory access, ARM-like RISC

architecture, 222 Circuit blocks, SDT devices, 25 Circuit parameters

evaluation of power and performance overhead, 38

microarchitectural block models, 326 Circuit sizing, 304-306


Circuit state average current consumption measurement, 341 effects on energy, 193

Circular queue structure, 38 C Level Design System Compiler, 172 Clock-based power management policies, 102 Clock distribution tree, 334-335 Clock frequency

changes in, 139-140 software energy profiling, StrongARM SA-1100,

342 Clock gating

energy tradeoffs, 354-356 flip-flops, 4, 5 hardware innovations, 212 static flip-flops, 7

Clock speeds reverse levelization and, 185 scheduling slack, 67

Clustered voltage scaling, 61 CMOS circuit, 19; see also Flip-flops

energy and delay, 294, 295-296 power reduction, 60

C2MOS Flip-Flop, 7 CMOS level power optimization, 172 Co-Design Automation, Superlog, 172 Cold scheduling effects, 193 Common Gateway Interface (CGI) scripts, 356 Compaction, latch-based design with, 55-56 Compaq iPaq, API, 161-162 Compiler: see also C/C++; PACT HDL

ARM-like RISC architecture, 211-223; see also ARM-like RISC architecture

energy exposed instruction sets RISC accumulator architecture, 89 tag-unchecked loads and stores with direct

addressing, 94-95 software restart regions, 85-86

Compiler optimizations, 191-208 energy-aware low-level compilers, 193-196

instruction scheduling and energy, 193-194 register assignment and bus energy, 195-196

hardware-software interaction, 200-208 code transformations and power mode control

mechanisms, 202-206 data transformations and power mode control

mechanism effectiveness, 206-208 hardware, 200-201 optimizations for memory energy, 201-202

high-level loop optimizations, 196-200 cache miss rates versus energy consumption,

198-200 experimental evaluation, 197-198


Compiler optimizations (cont.)

high-level loop optimizations (cont.)

types of, 196-197 Compress program, 131 Computational energy, VLSI computations, 294 Confidence estimators

pipeline gateway, 61-63 speculation control, 63-66 terminology, 60

Consolidation of web servers, 262 Contactless RF identification tags, 27 Continuous-Time Markov Decision Processes

(CTMDP), 111, 112 Control pass, SUIF to HDL translation, 176, 178-

179 Control speculation: see Microarchitecture design Conventional Flip-Flop, 7, 11, 12, 13 Conversion pass, SUIF to HDL translation, 177-

178 Core power dissipation, 220 Counter

queue, 48 reorder buffer, 49

Conventional Flip-Flop, 7 CoWare N2C layer, 171 CPU cycle numbers, worst-case, 130-131

flow control modeling, 131-132 static (off-line) power management, 134

CPU frequency scaling, web server, 265 CPU speed

power management points, 134 slack computation, 137-139 worst-case execution time (WCET), 129, 130,

139-140 time overhead, 139-140

CPU time and energy model, web server, 277-280 CSIM engine, web server, 277-280 Cubic power reduction, 133 Cumulative probability tables, decision states, 114-

115 Curie temperature (CT), 25, 26 Current

sandwich/spin tunneling cell, 20, 23 software energy profiling, 343

factors affecting consumption, 341 leakage current observations, 349-351; see

also Leakage current/power instruction current, 343-345 operating point and, 345, 346 prediction, 347-348 separation of components, 353-354 variation within instruction, 344-345

Cycle-accurate power estimator, 325-334


Cycle-accurate power estimator (cont.)

data structure and microarchitectural block models, 326-328

power modeling techniques, 329-334 Cycle partitioning, 347 Cycle simulator, 322-325

efficiency of, 318 omitted details, 322-324 power consumption, 320 power estimation methodology, 324-325

Cycle window queue, 48, 55 shutdown logic, 49-50 SimpleScalar simulation, 52-53

CynApps C/C++ extensions, 171

DA: see Direct addressing, stores with Data bus streams, cycle simulators, 323-324 Data caches, software energy profiling,

StrongARM SA-1100, 342 Data gating, 4-5, 6; see also Flip-flops Data layout transformations, 207 Data path components, architectural level power

modeling, 332-333 Data storage: see Sandwich/spin tunneling memory

device Data structure, architectural level power modeling,

326-328 Data transformation: see Compiler optimizations DBench tool, 286 Deadline management, 156, 159, 160-161 Dead state elimination, 186 DEC AXP-21164, 63 Decay, slack indicator table state, 74 Decision logic, dynamic adaptation algorithms,

46-49 Decision making, power management policies, 102 Decode cycle, pipeline gating, 63-64, 65-66 Delay: see Efficiency metric Et2; Flip-flops Delay product, low-power, flip-flop, 14-15 Demand paging, restart schemes, 85-86 Density density, sandwich/spin tunneling cell, 23 Design; see also Application-level power

awareness; Microarchitecture design flip-flops, 7-8, 9, 10 power estimation methodology, 324

Design Compiler, 183 Design Manager, 182 Design Power, 183 Desktops, power performance comparisons, 119 di/dt noise, 320 Diffusion sharing, flip-flop analysis, 11 Digital Signal Processors (DSPs), 27, 339


Direct addressing, stores with, 90-97 compiler analysis, 94-95 DA register implementation, 93-94 evaluation, 95 example use, 92-93 ISA enhancements, 92

Discrete-Time Markov Decision Processes, 102, 111, 119

Display brightness, web server power management, 265

Dominating nest, 204, 205, 206 Drain Induced Barrier Lowering (DIBL)

coefficient, 351-352 DRAM

power mode control, 202-206 software energy profiling, StrongARM SA-1100, 342

DSTC, 6, 11, 12, 13 Duty cycle, and software energy estimation, 340 Dynamic adaptation, 52; see also Queue; Queue

design CAM/RAM design with, 55-56 queue design

algorithms, 46-49 issue queues, 41-46 issue queue size, 37

Dynamic circuits, microarchitectural block models, 326

Dynamic frequency scaling computation, 161 Dynamic management of power consumption, 294;

see also Application-level power awareness power-aware real-time systems, 136-139

evaluation of dynamic schemes, 146-147 reclaiming scheme, 129-130, 145

power consumption calculations, 320 system model, 104-110

hard disks, 108-109 overview, 110 portable devices, 106 queue, 110 smart badge, 107 user, 105, 106 WLAN cards, 109

techniques, 110-115 policy implementation, 114-115 results, 118-120, 123

voltage and frequency scaling, 103, 128 TISMDP, 115-118, 120-122, 123 web servers, 280-284

Dynamic power dissipation, 212 Dynamic reclaiming scheme, 129-130, 145 Dynamic reasoning, 241 Dynamic speed setting: see Power-aware real-time

systems


Earliest Deadline First (EDF) scheduling, 129, 129, 130

Earliness slack alpha-queue, 144-145 dynamic (on-line) power management, 136-137

eCos, 157, 160-161 EDF* policy, alpha-queue, 144, 145 EEPROMS, 20, 27 Efficiency, cycle simulator, 318 Efficiency metric Et2, 293-314

comparing algorithms, 296-299 Et, 297-298 Et2, 299 e, 298

energy and delay and VLSI computation, 294-296

power rule for sequential composition, 311-312 e efficiency of design, 300-304

parallelism, 300-302 pipelining, 302-304

e rules for parallel and sequential compositions, 312-313

transistor sizing for optimal e, 304-311 Etn with n ≠ 2, 306-307 experimental evidence, 308-309 minimum energy function, 308 multi-cycle system, 309-311 optimal energy and cycle time, 307-308

8TDFF, 6, 11, 12, 13 Embedded systems: see Application programming

interface, power-aware Empty states, HDL AST optimization, 186 Energy; see also Efficiency metric Et2

optimization: see Compiler optimizations overhead, power-aware real-time systems, 140 software, 229-258; see also Software energy

profiling transistor sizing for optimal e, 307-308 and VLSI computation, 294-296

Energy-aware applications: see Application programming interface, power-aware; Power-aware real-time systems

Energy delay metric SPEC2IW, 287 Energy efficiency, web servers, 265 Energy exposed instruction sets, 79-97

baseline processor, 81 exposing bypass latches with hybrid RISC-

accumulator architecture, 87-90 compiler analysis, 89 evaluation, 90 ISA enhancements, 88

future work, 95-97 instruction chain, 96-97


Energy exposed instruction sets (cont.)

software restart regions, 81-87 categories of machine state, 84 compiler analysis, 85-86 evaluation, 87 example use, 84-85 restart marker implementation, 83

tag-unchecked loads and stores with direct addressing, 90-97

compiler analysis, 94-95 DA register implementation, 93-94 evaluation, 95 example use, 92-93 ISA enhancements, 92

Enumerated types, SUIF, 174 Environment, web servers, 266 Esterel-C Language (ECL), 171 ET2: see Efficiency metric Et2

Event-driven power management policies, 102 Exception management

sequential instruction semantics and, 82 software restart markers and, 80, 82-83

Execution box (EBOX), 344 Execution progress information, 134 Execution time

API requirements, 156 worst-case (WCET), 129, 130

Exponential behavior, software energy profiling, 351-353

Expressions, HDL AST, 175

Factor values, queue design, 52-53, 55 False alarms, FORTE signal detection, 248 Fast-Fourier Transform

FORTE, 250 software energy profiling, 349-351, 355-356

Fast On-Orbit Recording of Transient Events: see FORTE

Fetch cycle, pipeline gating, 63-64, 65-66 File systems, flash memory, 164 Filtering, FORTE: see FORTE Finance workload, webserver, 269, 270, 274-276,

279, 283 Finite State Machine (FSM) HDL AST, 173, 174-

175, 181 FIR Filter, HDL PACT compiler test, 187, 188 First-order model, software energy profiling,

345 First-use table, 37, 38, 37 Fission, loop, 202-206 Flash memory, 20, 164 Flip-flops, 3-16

clock gating, 4, 5


Flip-flops (cont.)

comparative analysis and experimental results, 8, 10-13

data gating, 4-5, 6 design, static and dynamic, 7-8, 9, 10 pipelined multiplier design 8 x 8, 13-14, 15, 16

Floating point operations, 258-259, 340 Flow control models, power-aware real-time

systems, 131-132 FORTE: see also Satellite-based parallel signal

processing filtering, 249-250

application partitioning, 252, 253 sensor resources, 240 timing, 255, 256

goal, 244-245 hardware, 247-248

Fortran, 206 Field Programmable Gate Arrays (FPGA), 171,

182; see also PACT HDL Frame-based systems, periodic task model, 133 Frequency levels, software energy profiling, 347 Frequency scaling, 103, 115-118, 120-122, 123,

128 API, 157, 158 web servers, 280-284

Frequency-voltage tradeoff, SmartBadge, 107 FSM (Finite State Machine), 173, 174-175, 181

Gate leakage, 20 Gating: see also Flip-flops

architectural innovations, 212, 213 energy tradeoffs, 354-356 flip-flops, 4-5, 6 pipeline, 60-61; see also Pipeline gating

General purpose register (GPR) RISC architecture, 81

Global clock, software energy profiling, 343 Global register allocation, relabeling after, 195-

196 Global symbol table, SUIF, 174 GNU C compiler, 255, 256 GNU cross-assembler, 218 Go program, 131 Ground sensors, sandwich/tunneling memory

applications, 26-27

Half-select currents, 21, 23 Hamming distance, 213, 327-328 Hard disks, dynamic power management, 108-109,

118-119 portable devices, 108-109 user request arrival distribution, 105


Hardware, API power manager, 157-158 Hardware Abstraction Layer (HAL), API, 156, 157 Hardware C, 171 Hardware Description Languages (HDL): see HDL

AST; PACT HDL Hardware-software interaction

compiler optimizations, 200-208 code transformations and power mode control

mechanisms, 202-206 data transformations and power mode control

mechanism effectiveness, 206-208 hardware, 200-201 optimizations for memory energy, 201-202

interfaces: see Energy-exposed instruction sets Harvard Array of Clustered Computers (HACC),

286 Hazard information, single cycle simulation, 328-

329 HDL AST, 174-181; see also PACT HDL

backend, 181 PACT HDL, 185-186 SUIF to HDL translation, 176-179 symbols and symbol table, 175-176 target architecture independence, 179-181

High-impedance bus stream, 327-328 High-level loop optimizations, 196-200

cache miss rates versus energy consumption, 198-200

experimental evaluation, 197-198 types of, 196-197

High-Speed D Flip-Flop, 6, 11-12, 13 Hitachi SH-4 processors, 342; see also Software

energy profiling Horner's Rule, 177, 178 HSPICE, 9, 297 httperf simulation, 270-271 HTTP requests, server workload construction,

267-270 Hybrid Latch Flip-Flop (HLFF), 6, 11, 12, 13 Hybrid RISC-accumulator architecture: see Bypass

latches Hypercycle, 131

Idle mode, flip-flop designs, 6 Indicated slack, 73-74 Inductive effects, power consumption changes and,

320 Information Resource Caching (IRCache) Project,

267 Innate scheduling slack, 69-70 In-order ready queue, 37 Instruction base current cost, 341 Instruction-level parallelism (ILP), 36, 63

Instructions energy exposed instruction sets, 96-97 queue design, 36, 37


software energy profiling, StrongARM SA-1100 caches, 342 current profiles, 343-345 current variation within, 344-345

reduction in number of, 60 scheduling

compiler optimizations, 193-194 terminology, 61

sequential semantics, 82 software energy profiling, StrongARM SA-1100,

343, 347, 348 Instruction Set Architecture (ISA)

energy-exposed, 88; see Energy-exposed instruction sets

sequential instruction semantics, 82 tag-unchecked loads and stores with direct

addressing, 92 and software energy estimation, 340

Instructions per cycle (IPC) architectural improvements, 63 issue queue adaptation, 48 queue design and, 37, 38

Instruction trace, 341 Integer queue, 36 Intel

Evaluation, API, 161-162 SpeedStep technology, 286 StrongARM processor SA-110, 216-217 XScale, 160

Intel PentiumPro, 63 Interarrival times, 102, 104, 105 Interevent time set, TISMDP, 112 Inter-instruction (circuit-state) effects

average current consumption measurement, 341

on energy, 193 Interleaving, array, 207-208 Internal cycles, software energy profiling, 347,

348 Interrupt services routine (ISRs), API, 158 Inverter feedback based flip-flops, 7 Ionospheric dispersed signals, 246-247; see also

Satellite-based parallel signal processing IPC: see Instructions per cycle ISA: see Instruction Set Architecture Issue control logic, queue scheduling, 37-38 Issue queue, 36; see also Queue; Queue design

comparison of designs, 55-56 dynamic adaptation algorithms, 46-49 size of, 37


Java, data transformations, 206 JouleTrack, 356-357, 358; see also Software

energy profiling Jump instructions, restart analysis, 86 Junction resistance, sandwich/spin tunneling cell,

22

Ko's Low-Power Flip-Flop, 7

Laplace Transform, HDL PACT compiler test, 187, 188

Laptops, 285 power performance comparisons, 119 TISMDP, LAN-attached, 119-120

Latching structure: see also Bypass latches flip-flops, 6 issue queues, 38-41 SDT devices, 24

Latency, single cycle simulation information, 328-329

Latest time, scheduling slack, 69 Leakage current/power, 20

energy tradeoffs, 354-356 flip-flop analysis, 16

data gated, 5, 6 low-leakage input vector, 5 pipelined multiplier, 14

MOS network, 352-353 power estimation methodology, 325 separation of current components, 353-354 software energy profiling, 340, 348-351

LEDA Systems HDL PACT compiler test, 187, 188 libraries, 214, 321

Linear loop transformations, 196, 201-202 Li program, 131 Locality, loop analysis, 185 Local memory, PACT HDL, 170 Logic gate, production rules, 295 LongRun, 286 Look-up tables

analytical power models, 321 power estimation methodology, 324

Loop analysis, 185 Loop and data transformations: see Compiler

optimizations Loop distribution (fission), 202-206 Loop execution, array interleaving, 207-208 Loop fission, 202-206 Loop invariant code motion, HDL AST

optimization, 186 Loop nest, dominating, 204, 205, 206 Loop optimizations, 196-200


Loop optimizations (cont.) cache miss rates versus energy consumption,

198-200 combined optimizations, 202-206 data transformations, 206 experimental evaluation, 197-198 types of, 196-197

Loop reordering, 185 Loop tiling, 196-197, 201-202 Loop transformations

combined optimizations, 202-206 linear, 196, 197, 201-202

Loop unrolling, 197, 201-202 Low-Power Flip-Flop, 7, 11, 12, 13; see also Flip-flops

Low Power Sleep mode with output pull Down

Flip-Flop (LPSDFF) design and operation, 7-8, 9, 10, 11, 12, 13

low-power delay product, 14-15 pipelined multiplier design, 13-14, 15, 16

Low Power Sleep mode with output Pull-up Flip-Flop (LPSPFF) design and operation, 7-8, 9, 10, 11, 12, 13

low-power delay product, 14-15 pipelined multiplier design, 13-14, 15, 16

Low-power systems, compiler optimizations: see Compiler optimizations

Machine state energy exposed instruction sets, 84 software restart marker, 80-81

Macros, C/C++, 171 Magnetic material hysteresis: see Sandwich/spin

tunneling memory device Magnetic tunnel junction MRAM cell, 20-21,

28 Magnetic Tunnel Junctions (MTJs), 20, 21, 28 Magnetoresistive Random Access Memory

(MRAM), 20-21, 28 Markovian randomized stationary policies,

114 MATCH, 181 Matched filter, FORTE application partitioning,

252, 253 MATCH group, 172, 174 MATLAB, 172, 174, 255 Matrix Multiplication, HDL PACT compiler test,

187, 188 Maximum likelihood fit, FORTE application

partitioning, 252, 253 Mediabench

restart analysis, 86, 87 tag-unchecked compiler analysis, 95


Memory access

ARM-like RISC architecture, 222 microprocessors versus cycle simulators, 323 PACT HDL architecture independence, 180 software energy profiling, 347, 348

address PACT HDL architecture independence, 180 transition count reduction, 213

architectural level power modeling, 330-332 ARM-like RISC architecture bus, 215, 217, 218;

see also ARM-like RISC architecture frequency, 219-221, 222 power dissipation, 215, 217-218, 220

array interleaving, 207-208 flash, 20, 164 PACT HDL, 170

allocation of, 177 architecture independence, 180 caching, HDL AST optimization, 186 local, 170 pipelining, 180-181, 184

power mode control, 202-206 restart analysis, 85-86 sandwich/tunneling, 23-25, 31; see also

Sandwich/spin tunneling memory device Memory-like microarchitectural blocks, analytical

power models, 321 Metrics, 219, 222; see also Efficiency metric Et2

Microarchitecture design, 59-77 background and terminology, 60-61 costs and benefits of slack indication table slack

detection, 73-76 pipeline gateway, 61-66

confidence estimators, 61-63 speculation control with confidence

estimation, 63-66 slack detection, 70-72 slack indicator table, 72-76

using indicated slack, 73-74 slack scheduling, theoretical underpinnings, 68-

70 speculation control by exploitation of scheduling

slack, 66-68 Microarchitecture simulation-based results

block models, 326-328 queue design, 52-56

Microprocessor, hardware innovations, 212 Microsoft On-Now initiative, 286 Minimum energy function E(t), 304, 308 MIPS instruction set, restart regions, 83 MIPS-like instruction, SUIF, 174 MIPS R10000, 36

MIPS R3000, Et2 measurements, 299, 300 MIPS RISC microprocessor, 81 Miss rates, cache, 198-200 Model Sim, 182 Model Technologies synthesis flow tools, 182 Modified CCMOS Flip-Flop, 11, 12, 13 MP3 audio

dynamic voltage scaling, 120-122, 123 TISMDP, 116-117

MPEG video decoder, 120-122, 123 MRAM (Magnetoresistive Random Access

Memory), 20-21, 28


MTJs: see Magnetic Tunnel Junctions Multi-cycle system, transistor sizing for optimal

e, 309-311 Multiple program, multiple data-stream processing,

FORTE, 252, 253 Multiple program, single data stream processing,

FORTE application partitioning, 252, 254 Multiprocessor architecture, Power Aware

(PAMA), 244

Nagano Winter Olympics: see Web server National Laboratory for Applied Network

Research (NLANR), 267 Neel temperature (TN), 25, 26 Nest, dominating, 204, 205, 206 Nested loops, loop distribution/fission, 202-206 9TDFF, 6, 11, 12, 13 nMOS transistor current, 295-296 Nonstationary policies, 102 N2C layer, 171

Off-chip bus frequency, 221, 222 Off-chip bus power dissipation, 217, 218 Off-chip memory and PCB bus models, 215-217,

218 Off-chip memory bus frequency and voltage, 219-

221, 222 Off-line scheduling, 128-129 Off-line (static) power management, 130, 134-136 Olympics of 1998: see Web server Olympus Synthesis System Hardware C, 171 On-line (dynamic) power management, 130, 136-

139 On-Now initiative, 286 Operating frequency, software energy profiling,

340, 347 Operating point, and program current

consumption, 345, 346 Operating systems: see Application programming

interface, power-aware


Operating voltage and frequency, software energy estimation, 340

Operational latency, scheduling slack, 68 Operators

HDL AST, 175 production rules, 295

Optimization: see Compiler optimizations; Flip-flops; Queue; Queue design; PACT HDL

Out-of-order execution, instruction level parallelism, 63

Out-of-order processors pipeline gating, 65-66 power consumption, 36

Output control, flip-flop design and, 7

PACT HDL, 169-189 future work, 188, 189 HDL AST, 174-181

backend, 181 SUIF to HDL translation, 176-179 symbols and symbol table, 175-176 target architecture independence, 179-181

optimization for power and performance, 183-186

HDL AST, 185-186 memory pipelining, 184 SUIF AST, 184-185

results, 186-187, 188 static size arrays, 177 SUIF, 173 synthesis flow, 182-183

ASIC design flow, 183 FPGA design path, 182

PACT (Power Aware Architecture and Compliation Techniques), 170; see also PACTHDL

Parallelism FORTE application partitioning, 252, 253, 254 instruction level, speCUlation and out-of-order

execution, 63 issue queue adaptation, 48-49 SOT junctions, 23 shutdown logic, 52 signal processing: see Satellite-based parallel

signal processing Θ efficiency of design, 300-302 transistors, leakage current, 352-353 utilization-based algorithms versus, 55-56

Parameter estimation, satellite-based parallel signal processing, 249-251

Pareto distribution, user request arrivals, 105, 110 Partitioned memory architecture, array

interleaving, 207-208


Pass-gate, dynamic flip-flop design and operation, 7-8, 9, 10

PCB bus models, ARM-like RISC architecture, 215-217, 218

Peak power, power consumption calculations, 320 Performance

optimization: see PACT HDL web servers, 284-285

Periodic task model, power-aware real-time systems, 133

Perl program, 131 Personal digital assistants, 285 Personal Digital Assistants (PDAs), 285, 339 Pipeline

energy exposed instruction set techniques, 81 PACT HDL, 184 PACT HDL architecture independence, 180-181

Pipelined multiplier design 8 x 8, 13-14, 15, 16 Pipeline gating

at decode and issue stage, 65-66 goal of, 63 microarchitecture design, 61-66

confidence estimators, 61-63 speculation control with confidence

estimation, 63-66 scheduling slack, 67-68, 69, 70 terminology, 60-61

pMOS transistor, pull-up network, 295-296 Pointers, SUIF, 174 Portable computers, 285 Portable devices, 106, 285, 339; see also Laptops Port symbol table, 180 Power, total (Ptotal), flip-flop analysis, 10 Power availability, satellite-based parallel signal

processing, 252-255 Power-aware API: see Application programming

interface, power-aware Power Aware Architecture and Compilation

Techniques (PACT), 170; see also PACT HDL

Power-aware design: see Application-level power awareness

Power Aware Multiprocessor Architecture (PAMA), 244

Power-aware real-time systems, 127-149 dynamic (on-line) power management, 136-139 energy overhead, 140 evaluation of dynamic schemes, 146-147 maximizing reward while meeting time and

energy constraints, 147-148 modeling flow control, 131-132 periodic task model, 133 power consumption model, 133-134



Power-aware real-time systems (cont.) power management points, 134 speculative speed retention, 145-146 speed management overhead, 139-140 static (off-line) power management, 134-136 system level dynamic power management, 143-

145 task and systems models, 130-131 task level dynamic power management, 141-142,

143 time overhead, 139-140

Power Compiler, ARM-like RISC architecture, 214-215, 218

Power consumption management of: see Dynamic management of

power consumption power-aware real-time systems, 133-134 web servers, 271-277

measurements, 271-275 opportunities for power management, 275-

277 Power-delay product, 10, 11, 14-15 Power dissipation, 212

ARM-like RISC architecture, 217, 218 power estimation methodology, 325 SIA roadmap, 36 simulation of, 321, 322

Power estimation methodology, architectural level power modeling, 324-325

Power management API

and hardware, 157-158 and operating system, 158-160

architectural level power modeling, 319-320, 329-334

clock distribution tree, 334-335 data path components, 332-333 memory models, 330-332 random logic and interconnections, 333

dynamic: see Dynamic management of power consumption

optimization: see Flip-flops power-aware real-time systems: see Power-aware real-time systems system-level DPM, 102

Power management points, 134 Power management policy, 102 Power mode control mechanisms, compiler

optimization code transformations, 202-206 data transformations, 206-208

Power PC 603 Flip-Flop, 7 Power rule for sequential composition, 311-312

PowerScope, 287 Power state machine model, 110 Power supply voltage, ARM-like RISC

architecture, 215 Power usage, satellite-based parallel signal

processing, 255-259 PPC603 Flip-Flop, 11, 12, 13


Prediction assessment, confidence estimators, 61-62 Predictive power-aware scheduling with eCos,

160-161 PrimePower, 321 Printed Circuit Board (PCB) bus, Verilog

simulation, 213, 214 Printed Circuit Board Power model, ARM-like

RISC architecture, 216-217 Processor

frequency, power reduction, 60 PACT HDL, 170 pipeline: see Pipeline gating power: see Power-aware real-time systems

Processor-cache interface, ISA enhancements, 92 Processor states, API, 158 Process scheduling: see Application programming

interface, power-aware Producer, dynamic adaptation algorithms, 46-47 Production rules, 295 Program counter

restart, 83 sequential instruction semantics and, 82

Programmable gate arrays, magnetic memory applications, 27

Program performance completion time, ARM-like RISC architecture,

221 reducing number of instructions, 60

Proxy servers, 267 Proxy workload, web server, 270-271, 276, 279,

280, 283 Pull-down bus state, 327-328 Pull-down network, single nMOS transistor as,

295-296 Pull-up bus state, 327-328 Pulse-Triggered True Phase Flip-Flop (PTTFF), 6,

11, 12, 13 Push-Pull Flip-Flop, 7, 11, 12, 13

Quadratic power reduction, 133, 279 Quality metric, 213 Queue

dynamic power consumption management, 110 models

TISDMP, 104 web server, 277-280



Queue design, 35-57 dynamic adaptation algorithms, 46-49 dynamic adaptation in issue queues, 41-46 latch and CAM/RAM based issue queues, 38-

41 microarchitecture simulation-based results, 52-

56 shut-down logic, 49-52

Radiation, solar array degradation, 253-254 Radiofrequency signals: see FORTE Random access memory

DRAM power mode control, 202-206 software energy profiling, StrongARM

SA-1100, 342 MRAM, 20-21, 28

Random logic and interconnections, architectural level power modeling, 333

Rate Monotonic Scheduling (RMS), 129, 161 Read access time

MRAM, 21 SDT devices, 25

Read caching, bypass latches, 88 Read circuits, sandwich/spin tunneling cell, 22 Ready queue, 37 Real-time systems: see Power-aware real-time

systems Reconfiguration elements, magnetic memory

applications, 27 RedHat eCos, 160 Register assignment, compiler optimizations, 195-

196 Register files

energy-exposed processor, 87, 88, 90, 91 extensions, slack state, 71

Register Transfer Level (RTL) models ARM-like RISC architecture, 214 HDL codes, 170 synthesis flow, 182

REL HDL codes, macros, 171 Remote Debug Interface (RDI), 357 Remote sensing, 245-248; see also Satellite-based

parallel signal processing Remote sensing applications, 245-248

FORTE hardware, 247-248 ionospheric dispersed signals, 246-247

Reorder Buffer (ROB) queue, 48-49 parallelism-based algorithm, 52 slack state, 71, 73

Replay program, web servers, 270-271 Request interarrival times, 102, 104, 105

Resistance, tunneling, 22 Resource Allocation Strategy, HDL AST

optimization, 186 Restart

machine state categories, 83


software: see Software restart regions, energy exposed instruction sets

Reverse levelization, 185 Reward-based model of power management, 147-

148 RF identification tags, 27 RISC architecture

ARM-like, 211-223; see also ARM-like RISC architecture

bypass latch exposure, 81 energy exposed instruction sets, 81, 87-90

compiler analysis, 89 evaluation, 90 ISA enhancements, 88

software restart marker, 80 ROB: see Reorder Buffer RTL (Register Transfer Level) model, 170, 182,

214 Run-time execution profile, 341 Runtime slack, 69

SA-1100: see Software energy profiling; StrongARM SA-1100

Sandwich/spin tunneling memory device, 19-32 magnetic tunnel junction MRAM cell, 20-21,

28 memory circuits/architecture, 23-25, 31 potential applications, 26-27, 29 potential higher density sandwich/tunneling

memory, 25-26, 29, 32 sandwich spin tunneling cell, 21-23, 29, 30

Satellite-based parallel signal processing, 243-259 adaptive power-aware processing, 251-252

application partitioning, 251-252 architecture, 251, 252

conventional solutions to power management, 244

FORTE goal, 244-245 power availability, 252-255 power usage, 255-259 remote sensing applications, 245-248

FORTE hardware, 247-248 ionospheric dispersed signals, 246-247

signal filters for parameter estimation, 249-251 signal filtering, 249-251 trigger and digitizer output signals, 249-251

Scaling CPU frequency, web server, 265



Scaling (cont.)

off-chip memory bus frequency and voltage, ARM-like RISC architecture, 219-221, 222

voltage: see Power-aware real-time systems Scheduling

API requirements, 154-155, 160; see also Application programming interface, power-aware

compiler optimizations, 193-194 effects on energy, 193-194 Rate Monotonic, 129 slack state, 61, 72; see also Slack static dynamic, 129 static off-line, 128-129

SDFF, 11, 12, 13 SDRAM, software energy profiling, 342 Second-order model, software energy profiling,

345-348 Segment flow graph, flow control modeling, 131-132 Selection logic, issue queue instructions, 36 Select transistor, MTJs, 21 Semantics, instruction, 82 Semi-dynamic Flip-Flop (SDFF), 6 Semi-Markov decision processes (SDMP), 111, 112 SenseAmp Flip-Flop, 6, 11, 12, 13 Sense amplifier, sandwich/spin tunneling cell

architectures, 24 Sense Current Driver, 24 Sensors

dynamic energy allocation, 234-241 application of theory, 236 behavior of system, 238-240 energy allocation and sensor measurements,

235 fusing sensor measurements, 234 minimizing variance through energy

allocation, 235-236 parameterized sensor model, 236 two-sensor problem, 237-238

remote: see Satellite-based parallel signal processing

time multiplexing, 240 Sequential memory access, software energy

profiling, 347, 348 Servers: see Web servers Short-circuit power dissipation, 212 Shutdown states

API requirements, 154-155 queue design, 49-52

SIA roadmap for power dissipation, 36 Signal detection, FORTE, 248 Signal filters

FORTE, 249

Signal filters (cont.) for parameter estimation, 249-251

signal filtering, 249-251


trigger and digitizer output signals, 249-251 timing, 255, 256

Signal processing: see Satellite-based parallel signal processing

SimplePower, register assignment effects on bus energy, 195, 196

SimpleScalar, 196, 322 SimpleScalar 3.0, 52-56, 131, 322 SimpleScalar ARM, architectural innovations, 213 Simulator

ARM-like RISC architecture, 213-215 web servers, 277-280

Single cycle behavior, simulation of, 328-329 Single nMOS transistor, transistor current, 295-

296 Single SDT junction memory cell (UC), 23, 31 Slack

dynamic (on-line) power management, 136-137 microarchitecture design

detection, 70-72, 73-76 speculation control, 66-68 terminology, 61 theoretical underpinnings, 68-70

power management point calculation, 134 workload variation, 103

Slack indicator table (SIT), 71, 72-76 SmartBadge, 103

dynamic power consumption management, 107 dynamic voltage scaling, 122 portable devices, 106, 107 user request arrival distribution, 105

Smooth circuits, 295-296 Sobel Transform, HDL PACT compiler test, 187,

188 Software energy profiling, 339-358

energy tradeoffs, 354-357, 358 exponential behavior, explanation of, 351-353 factors affecting software energy, 340 first-order model, 345 instruction current profiles, 343-345 leakage current observations, 349-351 leakage energy measurement, 348-349 related work, 340-341 second-order model, 345-348 separation of current components, 353-354 StrongARM experimental setup, 341-342

Software restart regions, energy exposed instruction sets, 80, 81-87

categories of machine state, 84 compiler analysis, 85-86



Software restart regions, energy exposed instruction sets (cont.)

evaluation, 87 example use, 84-85 restart marker implementation, 83

Software trigger, FORTE application partitioning, 252, 253

Solar array degradation, 253-254 SPEC benchmarks, 131 SPECint95, 42, 86, 87 Speculation

microarchitecture design: see also Microarchitecture design

with confidence estimation, 61, 63-66 exploitation of scheduling slack, 66-68 terminology, 60

power-aware real-time systems, 145-146 dynamic (on-line) power management, 137-

139 speed adjustment, 130 speed reduction, 145-146 task scheduling, 129-130

SPECWeb/Watt metric, 287 Speed computation, time overhead, 139-140 Speed management overhead, power-aware real-time systems, 139-140 Speed reduction, worst-case execution time

(WCET), 129, 130 SpeedStep technology, 286 Spin tunneling, MRAM cells, 20-21; see also

Sandwich/spin tunneling memory device Squid, 267 Stable state, 83 Standby/sleep mode

dynamic flip-flop design and operation, 7-8, 9, 10

flip-flop design and, 7 SDT devices, 25

Stanford University Intermediate Format: see SUIF State-action frequencies, TISDMP, 114 Statements

HDL AST, 175, 176 SUIF to HDL translation, 176

State Node representations, HDL AST, 175, 176 States, SUIF to HDL translation, 176 State transition decisions, power management

policy, 102 Static circuits, microarchitectural block models,

326 Static current, exponential behavior, 350 Static (off-line) power management, 128, 130,

134-136 Static reclaiming scheme, 129-130

Static scheduling, issue queue, 37 Stationary policies, 102


Stochastic power management policies, 102 Storage: see Sandwich/spin tunneling memory

device Stores with direct addressing: see Direct

addressing, stores with StrongARM

floating point operations, 258-259 SmartBadge, 107

StrongARM-1, 81 StrongARM 110 Flip-Flop, 6, 11, 12, 13 StrongARM SA-110, 216-217 StrongARM SA-1100, 339

clock frequency changes, 139-140 software energy profiling, 339, 341-342; see

also Software energy profiling StrongARM SA-1110, 158, 160, 161-162 Structs, SUIF, 174 Subbanking, combined optimizations, 202 Sub-expression elimination, 185 SUIF, 173, 174

HDL AST correlations, 175 HDL translation, 176-179 PACT HDL optimizations, 183-185 tag-unchecked compiler analysis, 95

Superlog, 172 Superscalar processors

issue queue power consumption, 36 SimpleScalar simulation, 52-56

Switching ARM-like RISC architecture, 215, 217-218 sandwich/spin tunneling cell

current, 22 mechanism, 26 speed,23 time, 22

Switching energy energy tradeoffs, 354-356 operating voltage and frequency effects, 340

Switching power dynamic power dissipation, 212 power estimation methodology, 325

Symbol table HDL AST, 175-176 PACT HDL architecture independence, 180 SUIF, 174

Synopsys ARM-like RISC architecture, 214-215, 218 Design Compiler, 183, 214, 321 HDL PACT compiler test, 187, 188 Power Compiler, 214-215, 218 synthesis flow, 182



Synopsys (cont.) SystemC, 171-172

Synplicity Synplify, 182 Synthesis flow, PACT HDL, 182-183; see also

PACT HDL ASIC design flow, 183 FPGA design path, 182

SystemC, 171-172 System level

dynamic power management dynamic reclaiming and aggressive

scheduling, 130 power-aware real-time systems, 130-131, 143-

145 power management point, 134 static (off-line) power management, 135

innovations, 212 power and energy estimates, ARM-like RISC

architecture, 221

Tag checks elimination of, 95 ISA enhancements, 92

Tag-unchecked loads and stores with direct addressing, energy exposed instruction sets, 90-97

compiler analysis, 94-95 DA register implementation, 93-94 evaluation, 95 example use, 92-93 ISA enhancements, 92

Target processor, and software energy estimation, 340

Task level power management API requirements, 154-156 power-aware real-time systems, 141-142, 143

dynamic reclaiming and aggressive scheduling, 130

models, 130-131 periodic tasks, 133 power management point, 133, 134 static (off-line) power management, 135

Task scheduling, 129-130, 154-155 Task termination, API requirements, 155-156 Temperature, and magnetic memory cells, 25-26 Temporary state, 83 Termination of task, API requirements, 155-156 Θ

comparison of algorithms, 298 efficiency of design, 300-304 rules for parallel and sequential compositions,

312-313 transistor sizing for optimizing, 304-311

Θ (cont.) transistor sizing for optimizing (cont.)

Etⁿ with n ≠ 2, 306-307 experimental evidence, 308-309 minimum energy function, 308 multi-cycle system, 309-311 optimal energy and cycle time, 307-308

Threshold levels, FORTE signal detection, 248 Threshold values, queue design, 55 Tiling, loop, 196-197, 201-202 TI LowPower DFF, 11, 12, 13 Time: see Efficiency metric Et²

Time and energy model, web server, 277-280 Time-Indexed Semi-Markov Decision Process

Model (TISDMP): see also Dynamic management of power consumption

goal of optimisation, 112-114 dynamic voltage scaling and, 115-116

Time-indexed states, TISDMP, 113 Time multiplexing of sensors, 240


Time overhead, power-aware real-time systems, 139-140

Times between arrivals of user requests, 102 Time-specific information, PACT HDL

architecture independence, 180 Timing of FORTE signal filters, 255, 256 TISMDP: see Time-Indexed Semi-Markov

Decision Process Model TLBs, ISAs and, 80, 86 Total task utilization, 128 Transistor connectivity, flip-flop analysis, 11 Transistor count

flip-flop analysis, 10, 11 shutdown logic, 51-52

Transistor current energy and delay analysis, 295-296 leakage, 20, 352-353

Transistor size flip-flop analysis, 8-9 for optimal Θ, 304-311

Etⁿ with n ≠ 2, 306-307 experimental evidence, 308-309 minimum energy function, 308 multi-cycle system, 309-311 optimal energy and cycle time, 307-308

Transition, production rules, 295 Transition count reduction, memory address bus,

213 Transition distribution models, TISDMP, 104 Translation Lookaside Buffer (TLB), 342 Transmeta Crusoe processor, 286 Transmeta TM5400, 139-140 Transmogrifier C compiler, 171



Triceps, 218 Trigger, FORTE application partitioning, 252, 253 Trimaran, 213, 218, 219, 221, 222 TSMC library, 214 TSPC (True Single Phase Clocking) Flip-Flop, 6,

11, 12, 13 Tunneling magnetoresistance, 20-21, 22; see also

Sandwich/spin tunneling memory device Two SDT junction memory cell architecture, 23-

25, 31

Unmanned Airborne Vehicle (UAV) queue usage, 54-55

Unrolling, loop, 197, 201-202 User

dynamic power consumption management, 105, 106

request interarrival times, 102, 104, 105 TISDMP, 104

Utilization-based algorithm dynamic adaptation, 47-48 queue design, 55-56 shutdown logic, 52

Variable-voltage CPUs, power consumption reduction, 133

Vdd, 294, 299 Vectorsum, HDL PACT compiler test, 187, 188 Verilog simulation environment, 169; see also

PACT HDL ARM-like RISC architecture, 213-215, 218-219 HDL AST, 174 Superlog, 172

VHDL, 169; see also PACT HDL HDL AST, 174 RTL, 181, 183 synthesis flow, 182 WILDSTAR, 182, 183

Virtex FPGA, 187, 188 Voltage

gating, architectural innovations, 213 power reduction, 60 sandwich/spin tunneling cell, 22 and software energy estimation, 340

Voltage scaling: see also Power-aware real-time systems

algorithms for, 128-129 API, 157, 158 dynamic power consumption management, 115-

118, 120-122, 123 terminology, 60 web servers, 280-284

Wait queue, 37, 38, 37 Wattch simulator, 287, 319 Web servers, 261-288

consolidation of, 261-262


dynamic voltage and frequency scaling, 280-284

JouleTrack, 356-357, 358 methodology, 265-271

environment, 266 measurement system, 266-267 replay program, 270-271 workloads, 267-270

performance metrics, implication for, 284-285 power consumption, 271-277

measurements, 271-275 opportunities for power management, 275-

277 power management, 263-265

energy efficiency, 265 server loads, 263-265

related work, 285-287 simulator, 277-280

Website, Trimaran benchmarks, 219 WILDSTAR, FPGA design path, 182, 183 Window, issue queue, 36 Winter Olympics of 1998: see Web server Wireless local area network (WLAN)

portable devices, 106, 109 TISDMP, 119-120 user request arrival distribution, 106 dynamic power consumption management, 109

Workload, 219, 221, 222 variation in, slack times, 103 web servers, 267-270 worst case, 130

Workload completion efficiency (WCE), 219, 221, 222

Workload completion rate efficiency (WCRE), 219, 221, 222

Workload completion rate (WCR), 219, 221, 222 Worst-case CPU cycle numbers, 130-131 Worst-case execution time (WCET), 103, 129, 130 Write operations, sandwich/spin tunneling cells,

21, 23, 24, 25

Xilinx Forge J HDL, 172 Xilinx Foundation Tools Design Manager, 182 Xilinx 4000 series, 171 Xilinx XCV 400, HDL PACT compiler test, 187,

188 XScale, API, 161-163

Yield problems, MRAM cells, 20