Slide 1 Michael Flynn EE382 Winter/99 EE382 Processor Design Stanford University Winter Quarter 1998-1999 Instructor: Michael Flynn Teaching Assistant: Steve Chou Administrative Assistant: Susan Gere Lecture 1 - Introduction

Mar 31, 2015


Alisha Lardner
Slide 1

EE382 Processor Design

Stanford University, Winter Quarter 1998-1999
Instructor: Michael Flynn
Teaching Assistant: Steve Chou
Administrative Assistant: Susan Gere

Lecture 1 - Introduction

Slide 2

Class Objectives

Learn theoretical analysis and limits
  - develop intuition
  - project long-term trends and bound design space more efficiently than simulation
Learn models for VLSI component cost tradeoffs
  - emphasis on microprocessor
Learn modeling techniques for computer system performance
  - emphasis on queuing
Put it all together to balance system performance and cost
  - Emphasis on multiprocessors, memory, and I/O
  - Practical examples and design targets

Slide 3

Course Prerequisites

Computer Architecture and Organization (EE282)
  - Instruction Set Architecture
  - Machine Organization
  - Basic Pipeline Design
  - Cache Organization
  - Branch Prediction
  - Superscalar Execution
      • In-Order
      • Out-of-Order
Statistics
  - Basic probability
      • distribution functions
      • statistical measures
  - Familiarity with stochastic processes and Markov models is helpful, but not required

Slide 4

Course Information

Access to the course web page is necessary: http://www-leland.stanford.edu/class/ee382/
  - Course info, assignments, old exams, design tools, FAQs, ...
Textbook and reference material
  - Computer Architecture: Pipelined and Parallel Processor Design, Michael J. Flynn
Problem set and design problem philosophy
  - Learn by doing: maximize learning/effort
Exam philosophy
  - Extend what you have learned
  - Open-book, not a speed or trick contest
You are expected to give us feedback
  - Questions, office hours, email, surveys

Slide 5

Grading

Problem Sets and Design Problems: 40%
  - 6 problem sets
  - 2 design problems
Midterm: 20%
Final Exam: 40%
  - Covers entire course
  - Scheduled March 15, 8:30-11:30 AM

Slide 6

Key Concepts of Abstraction

Instruction Set Architecture (ISA)
  - Functional interface for assembly-language programmer
  - Examples: SGI MIPS, Sun SPARC, PowerPC, HPPA, DEC Alpha, Intel (x86), IBM System/390, IBM AS/400
Implementation (Machine Organization)
  - Partitioning into units and logic design
  - Examples
      • Intel386 CPU, Intel486 CPU, Pentium® Processor, Pentium® Pro Processor
      • Alpha 21064, 21164, 21264
Realization
  - Physical fabrication and assembly
  - Examples
      • IBM 709 ('54) built with vacuum tubes and 7090 ('59) built with transistors
      • Pentium Processor in 0.8 µm, 0.6 µm, 0.35 µm BiCMOS/CMOS

Slide 7

Instruction Set Architecture

“... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flow and controls, the logical design, and the physical implementation.” Amdahl, Blaauw, and Brooks, 1964

Consists of:
  - Organization of storage
  - Data types
  - Encodings and representations (instruction formats)
  - Instruction (or Operation Code) Set
  - Modes for addressing data items and instructions
  - Program visible exceptional conditions

Specifies requirements for binary compatibility across implementations

Slide 8

Instruction Set Types

Load/Store (L/S)
  - Only load and store instructions refer to memory
      • no memory ALU ops
  - Used by several microprocessors
      • PowerPC, HP, DEC Alpha
Register/Memory (R/M)
  - ALU operations can have either source or destination in memory
  - Used by mainframes and most microprocessors
      • IBM System/370, Intel Architecture (x86), all x86 compatibles
Register or Memory (R+M)
  - ALU operations can have any/all operands in memory
  - Not used commonly now
      • DEC VAX

Slide 9

L/S ISA General Characteristics

32 GPR x 32b ... more recently 64b
instr size: 32b ... more recently 64b
instr types
  - R1 <- R2 op R3 for ALU ops
  - R1 <-> MEM [RB,D] for LD/ST

Slide 10

R/M ISA General Characteristics

16 GPR x 32b
instr size: 16b, 32b, 48b
instr types
  - RR: R1 <- R1 op R2
  - RM: R1 <- R1 op MEM [RB,RX,D]
  - MM: MEM1 [RB,RX,D] <- MEM1 [RB,RX,D] op MEM2 [RB,RX,D] (used for character, decimal ops only)
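To make the difference between the two instruction set types concrete, here is a minimal sketch that counts instructions and code bytes for the statement a = b + c with all three variables in memory. The instruction sequences, the variable names, and the assumption that every R/M instruction used here is a 32b RM-format instruction are illustrative choices, not taken from the slides.

```python
# Hypothetical encodings of "a = b + c" with a, b, c all in memory.
# Each entry is (assembly text, size in bytes).

# L/S machine: only LD/ST touch memory, every instruction assumed 32b.
LS_SEQUENCE = [
    ("LD  R2, MEM[b]", 4),
    ("LD  R3, MEM[c]", 4),
    ("ADD R1, R2, R3", 4),   # R1 <- R2 op R3
    ("ST  MEM[a], R1", 4),
]

# R/M machine: the ADD takes one operand from memory (RM format, assumed 32b).
RM_SEQUENCE = [
    ("LD  R1, MEM[b]", 4),
    ("ADD R1, MEM[c]", 4),   # R1 <- R1 op MEM
    ("ST  MEM[a], R1", 4),
]


def summarize(sequence):
    """Return (instruction count, total code bytes) for a sequence."""
    return len(sequence), sum(size for _, size in sequence)


if __name__ == "__main__":
    for name, seq in (("L/S", LS_SEQUENCE), ("R/M", RM_SEQUENCE)):
        count, size = summarize(seq)
        print(f"{name}: {count} instructions, {size} bytes")
```

Under these assumptions the R/M version needs one instruction fewer, because the ALU operation reads one operand from memory directly.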

Slide 11

ISA Syntax Terminology

OP.type destination, source1, source2
  - e.g. ADD.F R1,R2,R3 puts result of floating pt. add in floating reg 1.
  - OP without type implies integer type unless fp is clear from the context.
  - destination is always first operand, so that store is ST MEM [RB,RX,D], R2
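The syntax convention above can be sketched as a small parser. This is only an illustration: the function name `parse_instruction`, the returned dictionary layout, and the use of "I" to label the default integer type are assumptions, not part of the course material.

```python
import re


def parse_instruction(text):
    """Split 'OP.type dest, src1, src2' into its parts.

    A missing '.type' suffix is treated as integer, following the slide's
    convention; the label 'I' for integer is a hypothetical choice.
    """
    mnemonic, _, operand_text = text.strip().partition(" ")
    op, _, op_type = mnemonic.partition(".")
    # Split on commas that are not inside [...], so MEM [RB,RX,D] stays whole.
    parts = re.split(r",(?![^\[]*\])", operand_text)
    operands = [p.strip() for p in parts if p.strip()]
    return {
        "op": op,
        "type": op_type or "I",
        "destination": operands[0] if operands else None,
        "sources": operands[1:],
    }


if __name__ == "__main__":
    print(parse_instruction("ADD.F R1,R2,R3"))
    print(parse_instruction("ST MEM [RB,RX,D], R2"))
```

The store example shows why the destination-first rule matters: the memory operand comes first and the register being stored is the source.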

Slide 12

ISA Assumptions

Assume all instruction sets have a PSW and condition codes (CC)
Branch is BC.CC target; target is either R or Mem.
Unconditional branch is BR, even though it’s implemented with BC
Other branches: BCT, BAL (branch and link)

Slide 13

Moore’s Law

Moore’s Law: number of transistors per chip increases 4X every 3 years (CAGR = 60%)

[Figure: transistors per die (log scale, 1 to 10^8) vs. year, 1970-2000. Memory series: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M. Microprocessor series: 4004, 8080, 8086, 80286, Intel386™ Processor, Intel486™ Processor, Pentium™ Processor. Source: Intel]
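As a quick check on the Moore's Law figures above, the following sketch converts "4X every 3 years" into a compound annual growth rate and a doubling time. The function names are hypothetical; the arithmetic is standard compound growth.

```python
import math


def annual_growth_rate(factor, years):
    """Compound annual growth rate for growth by `factor` over `years`."""
    return factor ** (1.0 / years) - 1.0


def doubling_time_years(factor, years):
    """Years needed to double, given growth by `factor` over `years`."""
    return years * math.log(2) / math.log(factor)


if __name__ == "__main__":
    cagr = annual_growth_rate(4, 3)
    print(f"CAGR: {cagr:.1%}")
    print(f"Doubling time: {doubling_time_years(4, 3):.2f} years")
```

The exact rate is a little under 59% per year, which the slide rounds to 60%, and it corresponds to a doubling every 1.5 years.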

Slide 14

Die Size Growth

[Figure: die size (mm², log scale, 10 to 1000) vs. year, 1975-2000. DRAM series: 64K, 256K, 1M, 4M, 16M. Logic series: 8086, 68000, 80286, 68020, 80386, 80486, 68040, Pentium (tm). Source: Intel]

Slide 15

Finer Lithography

[Figure: resolution (µm, log scale, 0.01 to 10) vs. year, '83 to '01. Curves: Resolution, Overlay, CD Control. Generations: 1.0, 0.8, 0.5, 0.35, 0.25. Source: Intel]

Slide 16

Limits on scaling

As device sizes get smaller there are difficulties maintaining the rate of downsizing of feature sizes
It currently appears that around 50 nm several factors may limit scaling
  - hot carrier effects
  - time dependent dielectric breakdown
  - gate tunneling current
  - short channel effects and effect on VT

Slide 17

Beyond CMOS MOSFETs

If “limits” prove real, there are alternative technologies with system implications
  - low temperature CMOS
  - sub threshold logic
  - new gate oxide materials
  - SOI

Slide 18

Fabrication Facility Costs

[Figure: fabrication facility cost (dollars in millions, log scale, 1 to 10000) vs. year, 1965-2000. Source: VLSI Research, Inc.]

Moore’s Second Law: Fab Costs Grow 40% Per Year
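To see what 40% annual growth in fab cost implies, this sketch computes the doubling time and the cumulative growth over a decade. The function names and the ten-year horizon are illustrative assumptions.

```python
import math


def doubling_time(annual_rate):
    """Years for a quantity to double at a fixed compound annual rate."""
    return math.log(2) / math.log(1.0 + annual_rate)


def growth_over(annual_rate, years):
    """Multiplicative growth after `years` at a fixed compound annual rate."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    print(f"Doubling time at 40%/year: {doubling_time(0.40):.2f} years")
    print(f"Growth over 10 years: {growth_over(0.40, 10):.1f}x")
```

At this rate fab cost doubles roughly every two years and grows by close to a factor of 29 in ten years, which is consistent with the roughly straight line on the log-scale plot.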

Slide 19

Microprocessor Business Model

New “generation” of silicon technology every 2.5-3 years
  - 30% reduction in linear dimensions => 50% reduction in area
  - 30% reduction in device delay => 50% increase in speed
  - Used to reduce cost and improve performance on previous generation microprocessor
  - Used to enable new generation of microprocessor with wider, more parallel, more functional machine organization
  - Incremental changes between generations
Business growth enables investment in new technology
  - Driven by performance, new applications, and “dancing bunny people”
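The per-generation scaling numbers above can be worked through directly. This sketch assumes area scales with the square of linear dimension and that speed is simply the reciprocal of device delay; both are simplifying assumptions, and the function names are hypothetical.

```python
def area_factor(linear_reduction):
    """Relative area after shrinking linear dimensions by `linear_reduction`."""
    return (1.0 - linear_reduction) ** 2


def speedup_from_delay(delay_reduction):
    """Relative speed if speed is the reciprocal of device delay."""
    return 1.0 / (1.0 - delay_reduction)


if __name__ == "__main__":
    print(f"Area factor: {area_factor(0.30):.2f}")
    print(f"Speedup: {speedup_from_delay(0.30):.2f}x")
```

Under these assumptions a 30% linear shrink gives about half the area (a factor of 0.49), while a 30% delay reduction alone gives a speedup of about 1.43x, so the slide's 50% speed figure reads as a round number rather than an exact consequence of the delay reduction.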

Slide 20

Performance Growth

[Figure: workstation performance (axis 0 to 1200) vs. year, 1987-1997. Labeled systems: SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21264/600. Figure 1.20 from P&H]

Workstation Performance Improving 54% per year
That’s almost 1% per week!
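The "almost 1% per week" remark can be checked by converting the annual rate into an equivalent weekly compound rate. This is a small sketch assuming 52 weeks per year; the function names are hypothetical.

```python
WEEKS_PER_YEAR = 52  # assumption: ignore the extra day or two per year


def weekly_rate(annual_rate):
    """Weekly compound rate equivalent to a given annual compound rate."""
    return (1.0 + annual_rate) ** (1.0 / WEEKS_PER_YEAR) - 1.0


def annual_rate_from_weekly(rate_per_week):
    """Annual compound rate produced by a given weekly compound rate."""
    return (1.0 + rate_per_week) ** WEEKS_PER_YEAR - 1.0


if __name__ == "__main__":
    print(f"54%/year is {weekly_rate(0.54):.2%} per week")
    print(f"1%/week is {annual_rate_from_weekly(0.01):.1%} per year")
```

The weekly equivalent of 54% per year comes out a little above 0.8%, and a full 1% per week would compound to roughly 68% per year, so "almost 1%" is a fair description.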

Slide 21

PC Shipment Growth

Performance Growth and New Applications Drive Volume

Source: Dataquest by A. Yu in IEEE Micro 12/96

Slide 22

System Price/Performance

System                   Year   Performance   Memory   Price             Price per MIPS
IBM System 360/50        1965   0.15 MIPS     64 KB    $1M               $6.6M per MIPS
DEC VAX 11/780           1977   1 MIPS        1 MB     $200K             $200K per MIPS
Dell Dimension XPS-300   1998   725 MIPS      64 MB    $2412 (1/4/98)    $3.33 per MIPS

Photographs from Virtual Computing History Group
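The price-per-MIPS figures for the three systems above follow from dividing price by performance. The sketch below recomputes them from the slide's numbers; the data structure and function name are illustrative assumptions.

```python
# (system, year, MIPS, price in dollars), taken from the slide.
SYSTEMS = [
    ("IBM System 360/50", 1965, 0.15, 1_000_000),
    ("DEC VAX 11/780", 1977, 1.0, 200_000),
    ("Dell Dimension XPS-300", 1998, 725.0, 2_412),
]


def price_per_mips(mips, price):
    """Dollars per MIPS for a system with the given performance and price."""
    return price / mips


if __name__ == "__main__":
    for name, year, mips, price in SYSTEMS:
        print(f"{year} {name}: ${price_per_mips(mips, price):,.2f} per MIPS")
    first = price_per_mips(SYSTEMS[0][2], SYSTEMS[0][3])
    last = price_per_mips(SYSTEMS[-1][2], SYSTEMS[-1][3])
    print(f"Improvement 1965 to 1998: {first / last:,.0f}x")
```

The recomputed values agree with the slide up to rounding, and the 1965 to 1998 improvement in price/performance is on the order of two million times.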

Slide 23

Representative System

[Figure: block diagram of a representative system. Labeled blocks: CPU ... CPU (each with Pipelines, Registers, L1 Icache, L1 Dcache), L2 Cache, Chipset, Memory, I/O Bus(es)]

Slide 24

Summary

Current architectures exploit parallelism for performance
  - Multiple pipelines and caches
  - Multiprocessors
Technology costs are increasing rapidly
  - High volume is critical to recover costs
      • interface standards and evolution necessary
  - Product success depends on cost-effective area allocation and partitioning
Technology capacity and performance increasing rapidly
  - Critical to evaluate broad space of design options at each generation
      • Opportunity to learn from the past and to innovate

Theoretical analysis and modeling combined with design targets are powerful tools for developing computer systems.

This course will help prepare you to apply those for your future career in theory or practice.

Slide 25

This Week

Check access to the web page
  - Make sure you can read and print
  - First problem set will be posted by Friday
Reading
  - Scan Chapter 1
  - Sections 2.1, 2.2
Room Change
  - move to Gates B03
  - no festival Friday lecture