Dec 31, 2015


Computer Architecture Research Overview

Rajeev Balasubramonian

School of Computing, University of Utah

http://www.cs.utah.edu/~rajeev


What is Computer Architecture?


• If the Intel Pentium4 has a faster clock speed than the IBM Power4, does it execute your programs faster?


[Figure: timeline of clock ticks and completing instructions over time for two cases (Case 1, Case 2)]


To a large extent, computer architecture determines:

• the number of instructions used to execute a program

• the time each instruction takes to execute

• the idle cycles when no work gets done

• the number of instructions that can execute in parallel
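These factors combine in the standard processor performance relation: execution time = instruction count × cycles per instruction ÷ clock rate. The Python sketch below uses made-up numbers for two hypothetical processors (they are not measurements of the Pentium4 or the Power4) to show how a chip with the faster clock can still finish a program later.

```python
def execution_time(instructions, cpi, clock_hz):
    """Seconds to run a program: instructions * cycles-per-instruction / clock rate."""
    return instructions * cpi / clock_hz

# Two hypothetical processors; every value here is an illustrative assumption.
fast_clock = execution_time(instructions=1_000_000_000, cpi=2.0, clock_hz=3.0e9)
slow_clock = execution_time(instructions=1_000_000_000, cpi=1.0, clock_hz=2.0e9)

print(f"3 GHz, CPI 2.0: {fast_clock:.3f} s")
print(f"2 GHz, CPI 1.0: {slow_clock:.3f} s")
```

In this sketch, idle cycles and limited parallelism show up as a higher average CPI, which is why the clock rate alone does not settle the question.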


A Typical Microprocessor

[Figure: block diagram of a typical microprocessor. Components: branch predictor, L1 instruction cache, decode & rename, issue logic, register file, four ALUs, L1 data cache, L2 cache]


Architecture Trends in the 90s

• Performance was the ultimate metric

• Transistors were a limiting factor

As on-chip transistors became available in the 90s, more functionality and complex circuitry were added to boost performance – most of the low-hanging fruit has now been picked


Hitting the Wall

We have now hit the following walls:

• Single core performance

• Memory

• Complexity

• Power, temperature


Hitting the Power Wall

Power is as important a metric today as performance

From Shekhar Borkar, MICRO’99


The Advent of Multi-Core Chips

• In the past, performance magically increased by 50% every year

• In the future, this improvement will be only ~20% every year … unless … the application is multi-threaded!

[Figure: multi-core chip layout. Legend: core, cache bank]
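To see how far apart the two growth rates quoted on this slide drift, the short Python sketch below compounds them over several years. The function name and the chosen time spans are illustrative assumptions; only the 50% and ~20% rates come from the slide.

```python
def cumulative_speedup(annual_gain, years):
    """Compound an annual performance gain (0.5 means 50% per year)."""
    return (1.0 + annual_gain) ** years

for years in (1, 5, 10):
    past = cumulative_speedup(0.50, years)
    future = cumulative_speedup(0.20, years)
    print(f"{years:2d} years: 50%/yr -> {past:6.1f}x, 20%/yr -> {future:5.1f}x")
```

After ten years the gap is roughly 58x versus roughly 6x, which is the shortfall that multi-threaded applications on multi-core chips are expected to make up.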


Upcoming Architecture Challenges

• Improving single core performance

• Functionalities in multi-core chips

• Simplifying the programmer’s task

• Efficient interconnects

• Power and temperature-efficient designs

• Designs tolerant of errors

For publications, see http://www.cs.utah.edu/~rajeev/research.html


Interconnects as a Bottleneck

• In the past, on-chip data transmission on wires cost almost nothing

• Interconnect speed and power have been improving, but not at the same rate as transistor speeds

Hence, relative to computation, communication is much more expensive

• In the near future, it will take 100 cycles to travel across the chip

• 50% of chip power can be attributed to interconnects


Interconnects in Multi-Core Chips

[Figure: multi-core chip with CPU 1, CPU 2 and CPU 3, their L1 caches, an L2 cache and L2 controllers, with copies of a block A held in several places]


Not all Wires are Created Equal

                      B-Wires   L-Wires   W-Wires   PW-Wires
Relative latency      1x        0.5x      1.6x      3.2x
Relative area         1x        4x        0.5x      0.5x
Dynamic power (W/m)   2.65      1.46      2.9       0.87
Static power (W/m)    1.02      0.57      1.16      0.31
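One way to read these numbers is as a menu: each message can be sent on the wire type that best fits its needs. The Python sketch below copies the values from the slide and applies a hypothetical selection policy (lowest total power among the wires that meet a latency bound and an area bound). The policy and the function name `pick_wire` are assumptions for illustration, not the mapping used in the author's work.

```python
# Wire properties copied from the slide: latency and area are relative to B-Wires.
WIRES = {
    "B":  {"latency": 1.0, "area": 1.0, "dynamic": 2.65, "static": 1.02},
    "L":  {"latency": 0.5, "area": 4.0, "dynamic": 1.46, "static": 0.57},
    "W":  {"latency": 1.6, "area": 0.5, "dynamic": 2.90, "static": 1.16},
    "PW": {"latency": 3.2, "area": 0.5, "dynamic": 0.87, "static": 0.31},
}

def pick_wire(max_latency, max_area):
    """Hypothetical policy: lowest-power wire that meets both bounds."""
    candidates = [
        name for name, w in WIRES.items()
        if w["latency"] <= max_latency and w["area"] <= max_area
    ]
    if not candidates:
        raise ValueError("no wire type meets the given bounds")
    return min(candidates, key=lambda n: WIRES[n]["dynamic"] + WIRES[n]["static"])

print(pick_wire(max_latency=0.5, max_area=4.0))  # latency-critical, area available
print(pick_wire(max_latency=1.0, max_area=1.0))  # baseline latency and area
print(pick_wire(max_latency=4.0, max_area=0.5))  # latency-tolerant, area-constrained
```

The area bound matters because a wire that takes 4x the area leaves room for fewer wires, so fast wires suit narrow, urgent messages rather than wide data transfers.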


Data Transfers have Varying Needs

• Example of a cache coherence transaction: Read exclusive request for a shared block
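To make the example concrete, the Python sketch below lists the messages that a read-exclusive request for a shared block might generate in a simplified directory-based protocol. It is a generic illustration, not the protocol from the slide: the message names, the choice to send acknowledgements to the requester, and the core names are all assumptions.

```python
def read_exclusive(directory, block, requester):
    """Handle a read-exclusive request in a simplified directory protocol.

    Returns the (message, source, destination) tuples generated and
    updates the directory entry so the requester is the sole owner.
    """
    entry = directory[block]
    messages = [("ReadExclusive", requester, "directory")]
    others = sorted(entry["sharers"] - {requester})
    for core in others:
        # Every other sharer must drop its copy.
        messages.append(("Invalidate", "directory", core))
    for core in others:
        # Assumption: sharers acknowledge directly to the requester.
        messages.append(("InvalidateAck", core, requester))
    messages.append(("Data", "directory", requester))
    entry["state"] = "Modified"
    entry["sharers"] = {requester}
    return messages

# Hypothetical starting state: block A is shared by cpu2 and cpu3.
directory = {"A": {"state": "Shared", "sharers": {"cpu2", "cpu3"}}}
msgs = read_exclusive(directory, "A", "cpu1")
for msg in msgs:
    print(msg)
```

The messages differ in size and urgency: invalidations and acknowledgements are short control messages, while the data reply carries a whole cache block, so a single wire type is unlikely to suit all of them.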


Other Interconnect Choices

• Optical interconnects: speed of light, cost in converting between optical and electrical domains

• 3D chips: reduced communication distances, low cost for vertical signal transmission, increased power density


3D Layouts

[Figure: three 3D layouts across Die 0 and Die 1: (a) Arch-1 (cache-on-cluster), (b) Arch-2 (cluster-on-cluster), (c) Arch-3 (staggered). Legend: cluster, cache bank, intra-die horizontal wire, inter-die vertical wire]


Upcoming Architecture Challenges


Clustered architectures: relatively low complexity, scalable solution, easily handles multiple threads


Upcoming Architecture Challenges


• Heterogeneous perf/power

• Cores that execute the OS

• Cores that verify results


Upcoming Architecture Challenges


Hardware to support transactional memory


Upcoming Architecture Challenges


Faults are caused by high energy particles that deposit enough charge to toggle bits

Variations in conditions may cause a circuit to not produce its result in time
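The first kind of fault above, a particle strike that toggles a stored bit, can be modelled in a few lines. The Python sketch below uses a single parity bit, a standard detection technique chosen here only for illustration; the slide does not say which mechanism is used, and the stored value and bit position are arbitrary assumptions.

```python
def parity(word):
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2

def flip_bit(word, position):
    """Model a particle strike by toggling one bit of the word."""
    return word ^ (1 << position)

stored = 0b10110010             # arbitrary example value
stored_parity = parity(stored)  # parity bit saved alongside the data

corrupted = flip_bit(stored, 5)
fault_detected = parity(corrupted) != stored_parity
print(f"stored={stored:08b} corrupted={corrupted:08b} detected={fault_detected}")
```

A single parity bit detects any one-bit flip but cannot say which bit changed, and it misses two flips in the same word; correcting errors needs a stronger code or redundant computation.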


Research Methodologies

It’s all about the simulators!

• SimpleScalar, Wattch, and HotSpot: about 10,000 lines of C code that model the flow of instructions through a modern processor

• Inputs: a configuration file that specifies processor parameters, and a benchmark program (say, gzip)

• Outputs: how long the program runs on the simulated processor (SimpleScalar), how much power is consumed (Wattch), and what the peak temperature is (HotSpot)
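The input/output shape described above can be sketched with a toy model. The Python code below is not SimpleScalar and does not use its configuration format; it is a deliberately tiny, hypothetical in-order model in which a configuration dictionary and an instruction trace go in and a cycle count comes out. All parameter names and values are assumptions.

```python
def simulate(config, trace):
    """Count cycles for a toy processor that runs one instruction at a time."""
    cycles = 0
    for op in trace:
        cycles += config["latency"][op["kind"]]
        # A load that misses in the cache pays an extra penalty.
        if op["kind"] == "load" and not op["cache_hit"]:
            cycles += config["miss_penalty"]
    return cycles

# Hypothetical processor parameters (the "configuration file").
config = {"latency": {"alu": 1, "load": 2, "branch": 1}, "miss_penalty": 100}

# Hypothetical four-instruction trace (the "benchmark program").
trace = [
    {"kind": "alu"},
    {"kind": "load", "cache_hit": True},
    {"kind": "load", "cache_hit": False},
    {"kind": "branch"},
]

cycles = simulate(config, trace)
print(f"{len(trace)} instructions in {cycles} cycles (CPI = {cycles / len(trace):.2f})")
```

A research simulator models far more (out-of-order issue, branch prediction, cache hierarchies), but trying a new idea follows the same pattern: change the model, rerun the benchmark, compare the outputs.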


Evaluating a New Idea

• Lots of reading (it’s better than waiting for divine inspiration)

• Identify bottlenecks, identify problems, develop an idea, repeatedly question that idea

• Understand simulator

• Engineer a solution, modify simulator code (perhaps writing fewer than 1,000 lines of C code)

• Analyze data (things never work the first time), engineer/optimize/debug your solution

• Write papers

• Implement in silicon?


To Learn More…

• CS/EE 3810: Computer Organization

• CS/EE 6810: Computer Architecture

• CS/EE 7810: Advanced Computer Architecture

• CS/EE 7820: Parallel Computer Architecture

• CS 7937 / 7940: Architecture Reading Seminar
