Evolution of Personal Computing Microprocessors and SoCs

Post on 04-Dec-2014


DESCRIPTION

Demonstrates how microprocessors evolved from the Intel 4004 to the 4th-generation Core i7.

Transcript

Evolution of Personal Computing Microprocessors

Azmath Moosa
M. Tech 1st Year
13304006
Credit Seminar

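[Figure: "2002 Architecture by Azmath!", a toy processor. An instruction decoder drives the select lines (sel: 01, 10, 11) of a 4:2 MUX that picks the output (Out) of one ALU block: a full adder (inputs A, B; outputs Sum, Carry), a full subtractor (inputs A, B; outputs D, B), or the shifter and logic units. Example instruction: SUB 2, 3]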
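The blocks named in the ALU figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-bit word size, the ripple-carry chaining, the `ripple` and `execute` helpers, and the `OPS` table (standing in for the instruction decoder and multiplexer select lines) are all hypothetical and do not come from the slides.

```python
WIDTH = 4  # assumed word size; the slides do not state one


def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def full_subtractor(a, b, borrow_in):
    """One-bit full subtractor: returns (difference, borrow_out)."""
    d = a ^ b ^ borrow_in
    borrow_out = ((a ^ 1) & b) | (((a ^ b) ^ 1) & borrow_in)
    return d, borrow_out


def ripple(cell, x, y):
    """Chain a one-bit cell across WIDTH bits, least significant bit first."""
    result, link = 0, 0
    for i in range(WIDTH):
        bit, link = cell((x >> i) & 1, (y >> i) & 1, link)
        result |= bit << i
    return result, link  # link is the final carry or borrow


# Hypothetical decoder table: the mnemonic plays the role of the select lines.
OPS = {
    "ADD": lambda x, y: ripple(full_adder, x, y),
    "SUB": lambda x, y: ripple(full_subtractor, x, y),
    "AND": lambda x, y: (x & y, 0),
    "SHL": lambda x, y: ((x << y) & (2**WIDTH - 1), 0),
}


def execute(instruction):
    """Decode a string such as 'SUB 2, 3' and run it through the ALU."""
    mnemonic, operands = instruction.split(maxsplit=1)
    x, y = (int(tok) for tok in operands.split(","))
    return OPS[mnemonic](x, y)
```

With these assumptions, `execute("SUB 2, 3")` returns `(15, 1)`: the 4-bit pattern 1111 (two's-complement -1) with the borrow output set.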

The Invention

• Intel was established as a memory device manufacturer
• Nippon Calculating Machine Corporation approached Intel to design 12 custom chips for its new calculator
• Intel suggested a family of just 4 chips; the 4004 was one of them

1969

4004 to 8085

4004
• 4 bit
• 2,300 transistors
• 10 µm PMOS
• Clocked at 740 kHz

8008
• 8 bit
• 3,500 transistors
• 500 kHz

8080
• 4,500 transistors
• 6 µm NMOS
• 2 MHz

8085
• 3 µm depletion-type NMOS
• 3 MHz

1969 - 76

8086/80186/88

• 16 bit
• Pipelined
• 29,000 transistors
• 3 µm process
• 5, 8, 10 MHz
• Chosen by

1978

[Figure: two-stage pipeline: Fetch, Execute]

The PC 1981

80286 1984

• 16 bit
• Pipelined
• 134,000 transistors
• 1.5 µm process
• Up to 16 MHz

80386

• 32 bit
• 275,000 transistors
• 1 µm process technology
• 33 MHz

1985

x86 (8086, 80186, 80286, 80386, 80486) has become the standard CPU architecture for the PC platform. All vendors must adhere to this standard to make compatible CPUs for the PC.

Instruction Set

• Includes a specification of the set of opcodes (machine language) and the native commands implemented by a particular processor
• Ex: MMX, 3DNow!, SSE, AVX, AES etc.
• Implemented either as hardwired logic or as microcode routines

[Figure: processor block diagram with labels Micro-op table, Integer ALU, FP ALU, Load/Store, "instruction op1 op2", Operand Fetch, Backend, Frontend]

Pipeline

[Figure: pipeline stages: Fetch, Decode, Execute, Write to Memory]

Frontend
• Fetches instructions and operands from memory
• Any techniques to optimize fetching can be implemented here
• Converts instructions to internal micro-op codes
• Has to process instructions in order

Backend
• Executes instructions
• Parallel units that perform the same operation can be present
• Instructions can be processed out of order
• Any techniques to optimize write back can be implemented here
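To see why the Fetch, Decode, Execute, Write stages are overlapped at all, the timing can be sketched as follows. This is a minimal model that assumes one cycle per stage and no stalls or hazards; the names `STAGES`, `schedule` and `total_cycles` are illustrative and not taken from the slides.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]


def schedule(instructions, stages=STAGES):
    """Return {cycle: {stage: instruction}} for an ideal, stall-free pipeline."""
    timeline = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(stages):
            cycle = i + s + 1  # instruction i occupies stage s in this cycle
            timeline.setdefault(cycle, {})[stage] = instr
    return timeline


def total_cycles(n_instructions, n_stages=len(STAGES), pipelined=True):
    """Cycles needed to finish n instructions, with or without overlap."""
    if n_instructions == 0:
        return 0
    if pipelined:
        # The first instruction fills the pipe; each later one adds a cycle.
        return n_stages + n_instructions - 1
    return n_stages * n_instructions


if __name__ == "__main__":
    program = ["i1", "i2", "i3", "i4", "i5"]
    for cycle, slots in sorted(schedule(program).items()):
        print(cycle, slots)
    print(total_cycles(len(program)), total_cycles(len(program), pipelined=False))
```

Under these assumptions five instructions finish in 8 cycles when pipelined, against 20 cycles when each instruction must complete all four stages before the next one starts.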

The Pentium

• Codenamed P5
• Superscalar architecture
• Longer pipeline
• 3.1 million transistors
• 800 nm process technology
• Up to 233 MHz
• Included MMX instruction set

1993

[Figure: Pentium pipeline with labels Prefetch, Decode, Decode, Execute, Execute, Writeback]
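A superscalar design such as the one listed for the Pentium issues more than one instruction per cycle when the instructions do not depend on each other. The sketch below is a deliberately simplified model, not the Pentium's actual pairing rules: it assumes instructions are hypothetical `(dest, src1, src2)` register tuples, issues in order, and pairs two adjacent instructions only when the second neither reads nor overwrites the first one's destination.

```python
def can_pair(first, second):
    """True if `second` neither reads nor overwrites the destination of `first`."""
    dest1, *_ = first
    dest2, *srcs2 = second
    return dest1 not in srcs2 and dest1 != dest2


def issue_cycles(program):
    """Count issue cycles for an in-order machine that can issue up to two
    adjacent, independent instructions per cycle."""
    cycles, i = 0, 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            i += 2  # both instructions issue together
        else:
            i += 1  # dependency (or end of program): issue one instruction
        cycles += 1
    return cycles


if __name__ == "__main__":
    program = [
        ("r1", "r2", "r3"),  # r1 = r2 op r3
        ("r4", "r5", "r6"),  # independent of the line above: pairs with it
        ("r7", "r1", "r4"),  # needs r1 and r4
        ("r8", "r7", "r2"),  # needs r7, so it cannot pair with the line above
    ]
    print(issue_cycles(program))
```

For the four-instruction program shown, this model needs 3 issue cycles instead of the 4 a single-issue machine would take.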

Pentium Pro

• Codenamed P6
• Integrated L2 cache
• Chipset + memory controller = Northbridge
• Interface to ATA, PCI, ISA, BIOS, SuperIO = Southbridge

1995

P6 Architecture

• 10-stage pipeline

• Branch predictor: predicts branch outcomes and prefetches instructions along the predicted path
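The idea of predicting a branch from its past behaviour can be illustrated with a two-bit saturating counter per branch address. This is a textbook scheme chosen here for brevity and is an assumption of this sketch: the slides do not say which predictor the P6 uses, and the class and function names below are hypothetical.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counter: states 0-1 predict not taken,
    states 2-3 predict taken."""

    def __init__(self):
        self.counters = {}  # branch address -> state in 0..3

    def predict(self, address):
        # Unseen branches start in state 1 (weakly not taken).
        return self.counters.get(address, 1) >= 2

    def update(self, address, taken):
        state = self.counters.get(address, 1)
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        self.counters[address] = state


def accuracy(predictor, trace):
    """Fraction of (address, taken) outcomes in `trace` predicted correctly."""
    correct = 0
    for address, taken in trace:
        correct += predictor.predict(address) == taken
        predictor.update(address, taken)
    return correct / len(trace)


if __name__ == "__main__":
    # A loop branch taken nine times and then not taken, executed twice.
    loop = [(0x40, True)] * 9 + [(0x40, False)]
    print(accuracy(TwoBitPredictor(), loop * 2))
```

On this trace the counter mispredicts the first iteration and each loop exit, giving 17 correct predictions out of 20. The second bit is what keeps a single loop exit from flipping the prediction for the next run of the loop.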

Pentium II/III 1997-99

• 250 nm process
• 7.5/9.5 million transistors
• AGP for faster graphics
• SSE instruction set
• 1 GHz

Performance Comparison

Pentium 4

• NetBurst microarchitecture
• 42 million transistors
• 180 nm process technology
• SSE instruction set
• 1.4 to 3.0 GHz

2000

NetBurst Architecture

• 20-stage long pipeline
• Trace cache
• Load operands and store
• Queue
• OoO (out-of-order) execution
• ALU clocked at double the core frequency
• Hyper-Threading
• Pipeline too long; high power dissipation

Performance Comparison

[Figure: bar chart comparing Pentium 4 and Pentium 3 on the Content Creation Benchmark; score axis from 320 to 365]

Core Architecture

• 65 nm process

• 291 Million Transistors

• Shorter, Efficient Pipeline

• Wide Dynamic Execution: superscalar, macro-fusion

• Advanced Digital Media Boost: 128-bit ALU

• Advanced Smart Cache

• Smart Memory Access

• Execute Disable Bit

• HT disabled

2006

Performance Comparison

Tick - Tock 2007

Nehalem Architecture

• 45 nm process
• 700 million transistors
• Shared L3 cache
• Integrated memory controller
• Improved loop stream detector
• Improved branch prediction
• SSE4+ instruction set
• Turbo Boost
• HT reintroduced

Performance Comparison

Sandy Bridge Architecture

• 32 nm process
• 1.2 billion transistors
• On-die GPU
• Ring-style on-die interconnect
• Aggressive Turbo
• AVX instruction set
• Improved BPU
• Micro-op cache
• Wider ALU

2010

Performance Comparison

Ivy Bridge

• Tick
• 22 nm FinFET transistors

2012

FinFET Structure

Performance Comparison

Haswell Architecture

• 1.4 billion transistors
• AVX2 instruction set
• Improved cache bandwidth
• Improved GPU & QuickSync
• Improved BPU
• Unified decoder queue
• Wider reorder buffer
• Wider EU with 2 additional ports
• FMA: fused multiply-add

2013
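The numerical point of fused multiply-add is that a*b + c is rounded once instead of twice. The sketch below emulates that behaviour in software with exact rational arithmetic; it does not call the hardware FMA instruction, and the function names and sample values are assumptions chosen to make the difference visible.

```python
from fractions import Fraction


def unfused(a, b, c):
    """a*b is rounded to a double, then the sum is rounded again."""
    return a * b + c


def fused(a, b, c):
    """Emulated FMA: compute a*b + c exactly, then round to a double once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


if __name__ == "__main__":
    a = 1.0 + 2.0**-27
    c = -(1.0 + 2.0**-26)
    # Exactly, a*a = 1 + 2**-26 + 2**-54; the last term is lost when the
    # product is rounded to a double before the addition.
    print(unfused(a, a, c))  # 0.0
    print(fused(a, a, c))    # 2**-54, about 5.55e-17
```

The separate multiply and add return 0.0 for these inputs, while the single-rounding result keeps the 2**-54 term.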

[Figure: backend block diagram]

Performance Comparison


Thank You
