1
1801, Joseph Marie Jacquard
Jacquard Loom and punch cards to program it.
(George H. Williams, photos from Wikipedia)
Slide courtesy Anselmo Lastra
2
COMP 740 (formerly 206): Computer Architecture and Implementation
Montek Singh
Tue, Jan 13, 2009
Lecture 1
3
Computer Architecture Is …
Term coined by Fred Brooks and colleagues at IBM:
“…the structure of a computer that a machine language programmer must understand to write a correct (timing independent) program for that machine.”
Amdahl, Blaauw, and Brooks, 1964 “Architecture of the IBM System 360”, IBM Journal of Research and Development
Do you know about the System 360 family?
Term used differently by Hennessy and Patterson (our textbook)
Includes much implementation
4
Outline
Course Information
Course Web Page
  Linked from mine: http://www.cs.unc.edu/~montek
6
Course Information (2)
Prerequisites
  Undergrad comp. org. (COMP120) and digital logic
  I assume you know the following topics
    CPU: ALU, control unit, registers, buses, memory management
    Control Unit: register transfer language, implementation,
  Representative books (available in Brauer Library)
    Baron & Higbie: Computer Architecture. Addison Wesley, 1992
    Kuck: The Structure of Computers and Computations (Vol. 1). Wiley 1978
    Stallings: Computer Organization and Architecture: Designing for Performance (4th edition). Prentice Hall, 1996
    Patterson & Hennessy: Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann Publishers.
7
Course Information (3)
Textbook
  Hennessy & Patterson: Computer Architecture: A Quantitative Approach (4th edition), Morgan Kaufmann Publishers, Sep 2006
    available in the university bookstore; also: amazon.com, bn.com…
    Quite different from 3rd ed.: more on multiprocessing (multicore)
8
Course Information (4)
Textbook (contd.)
  We will cover the following material:
    Fundamentals of Computer Design (Chapter 1)
    Instruction Set Principles and Examples (App B & J)
    Pipelining: Basic and Intermediate Concepts (App A)
    Instruction-Level Parallelism (Chapter 2 & 3)
    VLIW Architectures (App G)
    Vector Architectures (App F)
    Multiprocessors (Chapter 4)
    Memory-Hierarchy Design (App C & Chapter 5)
    Storage Systems (Chapter 6)
  Additional readings/papers may be handed out
    e.g., case studies
9
Course Information (5)
Grading
  25-30% homework assignments (5 or 6)
  20-25% midterm exam
  20-30% small project
    no system building, no extensive programming
    typically: performance measurement using simulators etc.
  30-35% final exam
Assignments are due at beginning of class on due date
  Late assignments: penalty=10%/day or part thereof
Honor Code is in effect: for all homework/exams/projects
  encouraged to discuss ideas/concepts with others
  work handed in must be your own
10
What is in COMP 206 for me?
Understand modern computer architecture so you can:
  Write better programs
    Understand the performance implications of algorithms, data structures, and programming language choices
  Write better compilers
    Modern computers need better optimizing compilers and better programming languages
  Write better operating systems
    Need to re-evaluate the current assumptions and tradeoffs
    Example: fully exploit multicore/manycore architectures
  Design better computer architectures
    There are still many challenges left
    Example: how to design efficient multicore architectures
  Satisfy the Distribution Requirement
11
Acknowledgements
Material for this class taken from
  My old COMP 206 course notes
  Prof. Anselmo Lastra’s 740 slides
  Prof. Sid Chatterjee’s old 206 slides
  Professor David Patterson’s (Berkeley) course notes
  Textbook web site
Trends of this decade (early 2000s)
Technology
  Very large dynamic RAM: 256 Mbits to 1Gb and beyond
  Large fast static RAM: 16 MB, 5ns
Complete systems on a chip
  100+ million transistors (approaching 1 billion)
Trends of this decade (early 2000s)
Low Power
  50% of PCs portable now (?)
  Hand held communicators
  Performance per watt, battery life
  Transmeta
  Asynchronous (clockless) design
Communication (I/O)
  Many applications I/O limited, not computation
  Computation scaling, but memory, I/O bandwidth not
Diversion: Clocked Digital Design
Most current digital systems are synchronous:
  Clock: a global signal that paces operation of all components
  [Figure: clock signal]
Benefit of clocking: enables discrete-time representation
  all components operate exactly once per clock tick
  component outputs need to be ready by next clock tick
    allows “glitchy” or incorrect outputs between clock ticks
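The clocking discipline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signal trace, the 4-step clock period, and the helper name sample_at_ticks are all hypothetical and do not come from the slides. The point it shows is that a clocked register looks at its input only on a tick, so glitches earlier in the cycle never become visible.

```python
def sample_at_ticks(trace, period):
    """Return the values a clocked register would capture.

    trace  -- list of signal values, one per time step (hypothetical data)
    period -- time steps per clock tick (an assumed value, not from the slides)
    """
    # The register samples only on the last step of each clock cycle:
    # steps period-1, 2*period-1, and so on.
    return [trace[t] for t in range(period - 1, len(trace), period)]


# Hypothetical combinational output: it glitches early in each cycle,
# then settles to the correct value before the next tick.
trace = [0, 1, 0, 0,   # cycle 1 settles to 0
         1, 0, 1, 1,   # cycle 2 settles to 1
         0, 0, 1, 1]   # cycle 3 settles to 1

captured = sample_at_ticks(trace, period=4)
print(captured)
```

Only the settled value of each cycle is captured; the intermediate transitions are discarded, which is what allows a discrete-time view of the circuit.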
16
Microelectronics Trends
Current and Future Trends: Significant Challenges
  Large-Scale “Systems-on-a-Chip” (SoC)
    100 Million ~ 1 Billion transistors/chip
  Very High Speeds
    multiple GigaHertz clock rates
  Explosive Growth in Consumer Electronics
    demand for ever-increasing functionality …
    … with very low power consumption (limited battery life)
  Higher Portability/Modularity/Reusability
    “plug ’n play” components, robust interfaces
17
Alternative Paradigm: Asynchronous Design
Digital design with no centralized clock
Synchronization using local “handshaking”
Asynchronous Benefits:
  Higher Performance: not limited by slowest component
  Lower Power: zero clock power; inactive parts consume little power
  Reduced Electromagnetic Noise: no clock spikes [e.g., Philips pagers]
  Greater Modularity: variable-speed interfaces; reusable
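A rough Python sketch may help make "local handshaking" and the performance claim concrete. The slides do not name a protocol, so the four-phase (return-to-zero) request/acknowledge sequence used here is an assumed, commonly taught choice. The wire names, the delay values, and the 1 ns handshake overhead are all hypothetical. The timing model is deliberately simplified to a single component with data-dependent delays.

```python
def four_phase_transfer(value):
    """Event sequence for one four-phase (return-to-zero) handshake.

    The req/ack wire names are illustrative, not taken from the slides.
    """
    return [
        ("sender", "data", value),  # sender drives the data wires
        ("sender", "req", 1),       # request: data is valid
        ("receiver", "ack", 1),     # acknowledge: data has been latched
        ("sender", "req", 0),       # sender withdraws the request
        ("receiver", "ack", 0),     # receiver resets; channel is idle again
    ]


def synchronous_time(delays):
    # The clock period must cover the slowest operation, on every cycle.
    return len(delays) * max(delays)


def asynchronous_time(delays, handshake_overhead=0):
    # Each operation takes only as long as it needs, plus handshake cost.
    return sum(d + handshake_overhead for d in delays)


# Hypothetical data-dependent delays of one component, in ns.
delays = [2, 3, 2, 9, 2]

events = four_phase_transfer(42)
sync_ns = synchronous_time(delays)
async_ns = asynchronous_time(delays, handshake_overhead=1)
print(sync_ns, async_ns)
```

Under these assumed numbers the clocked version pays the worst-case 9 ns on every operation, while the handshaking version pays each operation's actual delay plus the handshake cost. Whether that trade wins in practice depends on how large the handshake overhead is relative to the spread in delays.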
Era of the microprocessor. Increases due to transistors and architectural improvements
19
Performance
By around 2002, performance was about 7X higher than fabrication tech (e.g. 0.13 micron) alone would have delivered
What has slowed the trend?
  Note what is really being built
    A commodity device!
    So cost is very important
  Problems
    Amount of heat that can be removed economically
    Limits to instruction level parallelism
    Memory latency
20
Moore’s Law
Originally: Number of transistors on a chip at the lowest cost/component
It’s not quite clear what it really is
  Moore’s original paper, doubling yearly
  Often quoted as doubling every 18 months
  Sometimes as doubling every two years
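To see how much the three quoted doubling periods differ, here is a small Python calculation. The ten-year horizon and the helper name growth_factor are choices made for this illustration, not figures from the slides. It assumes pure exponential growth, factor = 2^(years / doubling period).

```python
def growth_factor(years, doubling_period_years):
    """Factor by which a quantity grows if it doubles every given period."""
    return 2 ** (years / doubling_period_years)


# The three doubling periods listed above, compared over one decade.
for label, period in [("yearly", 1.0), ("18 months", 1.5), ("two years", 2.0)]:
    print(f"{label:>10}: x{growth_factor(10, period):,.0f}")
```

Over a decade the three readings give roughly a 1,000-fold, 100-fold, and 32-fold increase, so the choice of doubling period matters a great deal when extrapolating.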