Microprocessor Central Processing Unit (CPU).
Dec 26, 2015
The First Microprocessor
• Intel created the first microprocessor, the 4004, in 1971.
• It ran at a clock speed of 108 kHz.
• It contained 2,300 transistors and was built on a 10-micron process.
• Then came the Z-80 by Zilog in 1976, used in the TRS-80.
• The 6502 by MOS Technology was used in the Apple I and II.
The First PC
• In 1981 IBM introduced the IBM PC, which was based on a 4.77 MHz Intel 8088 processor running the Microsoft Disk Operating System (MS-DOS) 1.0.
– 30,000 transistors at 4.77 MHz
– 16-bit internal registers (IA-16) / 8-bit external data bus
– 20-bit address bus could address up to ________ bytes?
• Later x86 processors, such as the Pentium 4 and AMD Athlon, are backward compatible with it.
Intel Architecture - 16, 32, 64 bits
• IA-16 was introduced with the 8086 processor in 1978
• IA-32 was introduced with the 386DX processor in 1985
• IA-64 was introduced with the Itanium processor in 2001
The First Apple
• The Apple I and II were built using the 6502 because it cost $25, compared to $300 for the 8080.
• The 6502 was based on the 6800 processor.
• Motorola went on to create the 68000 series, which became the basis for the Apple Macintosh line of computers.
• Today those systems use the PowerPC chip, also by Motorola and a successor to the 68000 series.
AMD Opteron™ Processor
• AMD Opteron processor is based on AMD’s eighth-generation processor core, which is planned to mark the introduction of the industry’s first 64-bit, x86 technology implementation.
• This technology is planned to preserve companies’ investments in 32-bit applications, while allowing a seamless transition to 64-bit computing as those companies require. Includes an integrated memory controller, which reduces memory bottlenecks.
AMD Athlon 64 versus AMD Opteron
• For the server/workstation market, the AMD Opteron processor will undergo more stringent validation and reliability testing. Another difference will be in the number of HyperTransport links embedded on the chip.
• The AMD Athlon 64 processor will contain one HyperTransport link offering 6.4 GB/s data transfer while the AMD Opteron processor will offer three links.
• The processors will also contain different amounts of cache.
Two main types of Processor
• CISC (Complex Instruction Set Computer)
– full-featured instruction set
• complicated
• RISC (Reduced Instruction Set Computer)
– less complicated instruction set
• fewer and simpler instructions
Two types of Instructions
Processor Specifications
• Processors can be identified by two main parameters:
– how wide they are
• Internal registers
• Data bus
• Address bus
• wider is better!
– how fast they are
• faster is better!
Rating CPUs
• CPU speed is measured in megahertz (MHz)
• More efficient CPUs require fewer steps to perform fundamental operations
• Word size, or internal data path size, is the largest number of bits the CPU can process in one operation
– ranges from 16 bits to 64 bits
• Data path, or external data path size, is the largest number of bits that can be transported into and out of the CPU
– ranges from 8 bits to 64 bits
• A CPU has a fixed range of addresses it can assign to physical memory; the number of addresses limits the amount of physical memory the computer can use effectively
• Most CPUs have built-in storage for recently accessed instructions and data; this is internal Level-1 or L1 Cache
• Most CPUs include a coprocessor
• There may be chips with a special function, such as the Pentium MMX designed to work faster with multimedia
• Processor Speed Ratings (MHz)
– quartz crystal with voltage applied to it vibrates (oscillates)
– forms time base on which computer operates
– single cycle is smallest element of time for processor
– at least one clock cycle required for every operation, usually many clock cycles needed
• memory transfers usually require 3 clock cycles for the first transfer and 1 clock cycle for the next 3 to 6 consecutive transfers
• time to execute instructions varies from 12 cycles per instruction to three or more instructions per cycle
– must consider how many instructions can be performed per clock cycle as well as speed of clock
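The point that clock speed alone does not determine performance can be sketched with a little arithmetic. The following is a minimal illustration, assuming throughput is simply clock rate multiplied by average instructions per cycle; the function name and the two example CPUs are hypothetical and are not taken from any benchmark.

```python
def instructions_per_second(clock_hz, instructions_per_cycle):
    """Rough throughput estimate: clock rate times average instructions per cycle."""
    return clock_hz * instructions_per_cycle


# Hypothetical CPU A: 4.77 MHz clock, 12 cycles per instruction (1/12 per cycle).
cpu_a = instructions_per_second(4.77e6, 1 / 12)

# Hypothetical CPU B: 600 MHz clock, 3 instructions per cycle.
cpu_b = instructions_per_second(600e6, 3)

print(f"CPU A: {cpu_a:,.0f} instructions/s")
print(f"CPU B: {cpu_b:,.0f} instructions/s")
print(f"Clock ratio: {600e6 / 4.77e6:.0f}x, throughput ratio: {cpu_b / cpu_a:.0f}x")
```

In this sketch the throughput gap is far larger than the clock gap, because the second CPU also needs fewer cycles per instruction.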
• Processor Speed / Motherboard Speed
– processors run at some multiple speed of motherboard
– set motherboard speed and multiplier either by jumpers or CMOS setup
– new motherboard chipsets capable of setting up motherboard and CPU speed automatically
CPU Type              CPU Speed (MHz)   Clock Multiplier   Motherboard Speed (MHz)
Pentium               60                1x                 60
Pentium               150               2.5x               60
Pentium/Pro           180               3x                 60
Pentium/Pro           200               3x                 66
Pentium II            233               3.5x               66
Pentium II Celeron    300               4.5x               66
Pentium II Celeron    366               5.5x               66
Pentium Celeron       500               7.5x               66
Pentium II/Xeon       400               4x                 100
Pentium III/Xeon      600               4.5x               133
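The relationship in the table above is CPU speed = motherboard speed × clock multiplier. A minimal sketch follows; the function name is hypothetical, and it assumes the nominal "66" and "133" MHz bus speeds are really about 66.66 and 133.33 MHz, which is why the products round to the marketed speeds.

```python
def core_speed_mhz(motherboard_mhz, multiplier):
    """CPU core speed is the motherboard (bus) speed times the clock multiplier."""
    return motherboard_mhz * multiplier


# A few rows from the table: (CPU type, motherboard MHz, multiplier).
rows = [
    ("Pentium 150", 60.0, 2.5),
    ("Pentium II 233", 66.66, 3.5),       # nominal 66 MHz bus (assumed 66.66)
    ("Pentium II/Xeon 400", 100.0, 4.0),
    ("Pentium III/Xeon 600", 133.33, 4.5),  # nominal 133 MHz bus (assumed 133.33)
]

for name, bus, mult in rows:
    print(f"{name}: {bus} MHz x {mult} = {core_speed_mhz(bus, mult):.0f} MHz")
```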
Processor Specifications
• Data Bus
– Used to receive and transmit data
• Internal Registers
– Area for processor to calculate data
• Memory Address Bus
– Carries addressing information concerning where data is being sent to or received from in RAM
Processor Modes
• Real Mode
– 16-bit instructions; can use only 1 MB of memory
• Protected Mode
– 32-bit instructions; applications run in protected memory spaces
• Virtual Real Mode
– Used in Windows for backward compatibility; allows multitasking of DOS programs
Processor Sockets
• Intel and AMD have created a set of socket and slot designs for their processors
• ZIF socket (Zero Insertion Force)
• Ability to upgrade
Processor Features
• CPU Operating Voltage
• Heating and cooling problems
• Intel-Compatible Processors (AMD and Cyrix)
• Processor Generation
• SMM (System Management Mode)
– Power management standard used in portables
• Superscalar Execution
– Execute multiple instructions simultaneously
• MMX (Multi Media eXtensions)
– Improves video, imaging, and I/O processing
• Dynamic Execution
– Anticipates a program’s needs beforehand
• DIB Architecture (Dual Independent Bus)
– Separate buses for memory access
• Streaming SIMD Extensions
– 70 new instructions for multimedia processing
• Data Bus
– also used to rate processors
• the wider the bus, the faster a processor can send and receive data
• Internal Registers (Internal Data Bus)
– dictates:
• how much information a processor can operate on at one time
• how data is moved around internally in the processor
• the type of commands that the processor can run
– 386s and above capable of running 32-bit instructions
• Address Bus
– must be considered because:
• determines how much memory a processor can access
CPU Type              Address Bus   Bytes             MBs      GBs
8088/8086             20-bit        1,048,576         1
286/386SX             24-bit        16,777,216        16
386DX/486/P5 Class    32-bit        4,294,967,296     4,096    4
P6 Class              36-bit        68,719,476,736    65,536   64
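The byte counts in the table follow from the address bus width: an n-bit address bus can select 2^n distinct byte addresses. A short sketch that reproduces the table's figures (the function name is a hypothetical helper):

```python
def addressable_bytes(address_bits):
    """An n-bit address bus can address 2**n bytes."""
    return 2 ** address_bits


MB = 2 ** 20  # bytes per megabyte
GB = 2 ** 30  # bytes per gigabyte

for cpu, bits in [("8088/8086", 20), ("286/386SX", 24),
                  ("386DX/486/P5 Class", 32), ("P6 Class", 36)]:
    total = addressable_bytes(bits)
    print(f"{cpu}: {bits}-bit -> {total:,} bytes = {total // MB:,} MB = {total / GB:g} GB")
```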
• Internal Level-1 Cache
– memory built into processor (486 and up)
– runs at same speed as processor
– size varies
• 8KB
• 32KB
• 64KB
• Level-2 Cache
– found on motherboard in Pentiums (P5)
– found in processor in P6 class
• Xeon/Celeron run at processor speed
• mainstream Pentium at half the processor speed
• SMM (Power Management)
– System Management Mode
– power management circuitry inside processor
– processor’s power use based on its activity level
– controlled via CMOS setup
– found in 486SL and up
• Superscalar Execution
– the ability to perform multiple instructions at the same time using multiple pipelines
[Figure: PIPELINING. Five stages (Instruction Fetch, Instruction Decode, Operand Fetch, Instruction Execution, Write Back) plotted against Time (clock cycles 1 to 9), with instructions 1 to 9 overlapping in successive stages.]
• MMX Technology
– multimedia extensions
– later P5 processors
– improves video compression/decompression, image manipulation, encryption, I/O processing
• larger L1 cache
• 57 new instructions
• SSE (Streaming SIMD Extensions)
– 70 new instructions for graphics and sound processing over and above MMX
– allows for floating-point calculations using a separate unit in processor instead of sharing standard floating-point unit
– benefits when used with SSE-aware software:
• higher resolution and higher quality image viewing and manipulation
• better audio, MPEG2 video, simultaneous MPEG2 encoding and decoding
• better speech recognition
• SSE2
– Introduced with Pentium 4
– 144 new instructions
• 3DNow and Enhanced 3DNow
– AMD’s version of SSE
– Introduced with the K6-2 processor, improved on the Athlon processor.
– Software written specifically for SSE will not use 3DNow, and vice versa.
• Dynamic Execution
– helps processor manipulate data more efficiently
– three techniques used:
• Multiple Branch Prediction
– processor anticipates branches in instruction flow
– can predict where next instructions are in memory with an accuracy of 90% or better
• Data Flow Analysis
– analyzes and schedules instructions to be executed in the best sequence no matter how program was written
• Speculative Execution
– completes instructions before they are needed
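To give a feel for how branch prediction can reach accuracies around 90%, here is a toy simulation of a 2-bit saturating counter, a common textbook scheme. This is an illustrative assumption, not a description of the predictor Intel actually uses; the function name and the looping workload are hypothetical.

```python
def prediction_accuracy(outcomes):
    """Simulate a single 2-bit saturating counter over a list of branch outcomes.

    Counter states 0-1 predict "not taken"; states 2-3 predict "taken".
    """
    counter = 2  # start in the "weakly taken" state
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken == taken:
            correct += 1
        # Nudge the counter toward the actual outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return correct / len(outcomes)


# Hypothetical workload: a loop branch taken 9 times, then not taken once,
# with the whole loop repeated 100 times.
loop_branch = ([True] * 9 + [False]) * 100
accuracy = prediction_accuracy(loop_branch)
print(f"Prediction accuracy: {accuracy:.0%}")
```

The predictor is wrong only on the loop exit, so for this workload it is right 9 times out of 10.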
• Dual Independent Bus (DIB) Architecture
– two buses inside newer processors
• L2 cache bus
• processor-to-main-memory bus (system)
– can run simultaneously
– L2 cache was moved into processor package
• Processor Packaging
– PGA (Pin Grid Array)
• used from 286 up to Pentiums
– SPGA (Staggered Pin Grid Array)
• Pentium Pro
– SEC (Single Edge Contact)
• Pentium II/III
– SEP (Single Edge Processor)
• same as SEC but simpler, and L2 cache is optional
AMD, Cyrix, NexGen, IDT, and Rise Processors
Abbreviations
FPU = Floating-Point Unit (internal math coprocessor)
WT = Write-Through cache (caches reads only)
WB = Write-Back cache (caches both reads and writes)
M = Millions
Bus = Processor external bus speed (motherboard speed)
Core = Processor internal core speed (CPU speed)
MMX = Multimedia extensions, 57 additional instructions for graphics and sound processing
3DNow = MMX plus 21 additional instructions for graphics and sound processing
Enh. 3DNow = 3DNow plus 24 additional instructions for graphics and sound processing
3DNow! Pro = Enh. 3DNow plus SSE instructions for graphics and sound processing
SSE = Streaming SIMD (Single Instruction Multiple Data) Extensions; MMX plus 70 additional instructions for graphics and sound processing
SSE2 = Streaming SIMD Extensions 2; SSE plus 144 additional instructions for graphics and sound processing
Processor Speed Ratings
• A computer system’s clock speed is measured as a frequency, usually expressed as a number of cycles per second.
• A crystal oscillator controls clock speeds using a sliver of quartz sometimes contained in what looks like a small tin container.
Intel Processor and Motherboard Speeds
AMD Processor and Motherboard Speeds
• The Athlon-to-North Bridge processor bus actually runs at a double (2x) transfer speed, which is twice the actual motherboard clock speed.
Processor Memory-Addressing Capabilities
CPU Speeds Relative to Cache, SIMM/DIMM, and Motherboard
• RDRAM technically runs at 800MHz, but the channel is only 16 bits wide, resulting in a bandwidth of 1.6GB/sec, which is equivalent to running at 200MHz across the 64-bit width of the processor data bus.
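The RDRAM equivalence can be checked with the usual bandwidth formula: transfers per second multiplied by bytes per transfer. A minimal sketch, with a hypothetical helper name:

```python
def bandwidth_bytes_per_sec(transfers_per_sec, bus_width_bits):
    """Peak bandwidth = transfer rate x bus width in bytes."""
    return transfers_per_sec * bus_width_bits // 8


rdram = bandwidth_bytes_per_sec(800_000_000, 16)        # 800MHz, 16-bit channel
cpu_bus_equiv = bandwidth_bytes_per_sec(200_000_000, 64)  # 200MHz, 64-bit data bus

print(f"RDRAM channel:      {rdram / 1e9} GB/sec")
print(f"64-bit bus @200MHz: {cpu_bus_equiv / 1e9} GB/sec")
```

Both expressions come to the same peak figure, which is why a narrow but fast channel can match a wide but slower bus.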
Processor Modes
All Intel 32-bit and later processors, from the 386 on up, can run in several modes. Processor modes refer to the various operating environments and affect the instructions and capabilities of the chip.
The processor mode controls how the processor sees and manages the system memory and the tasks that use it.
The three possible modes of operation are:
• Real mode (16-bit software)
• Protected mode (32-bit software)
• Virtual real mode (16-bit programs within a 32-bit environment)
SMM (System Management Mode)
Intel has created power-management circuitry. This was introduced initially in the Intel 486SL processor and incorporated into all Pentium and later processors.
• SMM circuitry is integrated into the physical chip but operates independently to control the processor’s power use based on its activity level.
• It enables the user to specify time intervals after which the CPU will be partially or fully powered down.
• It also supports the Suspend/Resume feature that allows for instant power on and power off, used mostly with laptop PCs.
• These settings usually are controlled via system BIOS settings.
ASUS P4T533 Motherboard
533-MHz or 400-MHz System Bus
The Pentium 4 processor's 533-MHz system bus supports Intel's highest performance desktop processor by delivering 4.2 GB of data per second into and out of the processor.
This is accomplished through a physical signaling scheme of quad pumping the data transfers over a 133-MHz clocked system bus and a buffering scheme allowing for sustained 533-MHz data transfers.
The Pentium 4 processor's 400-MHz system bus supports Intel's performance desktop processor by delivering 3.2 GB of data-per-second into and out of the processor. This is accomplished through a physical signaling scheme of quad pumping the data transfers over a 100-MHz clocked system bus and a buffering scheme allowing for sustained 400-MHz data transfers.
This compares to 1.06 GB/s delivered on the Pentium III processor's 133-MHz system bus.
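The bandwidth figures above can be approximated from the bus clock, the number of transfers per clock (four for quad pumping), and the bus width. The sketch below assumes a 64-bit (8-byte) data bus and uses a hypothetical helper name; rounding explains the small differences from the quoted values.

```python
def system_bus_gb_per_sec(clock_mhz, transfers_per_clock, width_bits=64):
    """Peak bus bandwidth in GB/s (10**9 bytes), assuming a 64-bit data bus."""
    bytes_per_transfer = width_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


p4_533 = system_bus_gb_per_sec(133, 4)  # quad-pumped 133-MHz clock
p4_400 = system_bus_gb_per_sec(100, 4)  # quad-pumped 100-MHz clock
p3_133 = system_bus_gb_per_sec(133, 1)  # one transfer per clock

for label, value in [("Pentium 4, 533-MHz bus", p4_533),
                     ("Pentium 4, 400-MHz bus", p4_400),
                     ("Pentium III, 133-MHz bus", p3_133)]:
    print(f"{label}: {value:.2f} GB/s")
```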
Pipelining
• Fetching and execution of each instruction is split into many stages, all working in parallel.
• This allows the processing of up to five instructions to be overlapped.
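The benefit of overlapping instructions can be estimated with a simple cycle count. This sketch assumes an idealized pipeline in which every stage takes one clock cycle and there are no stalls; the function names are hypothetical.

```python
def nonpipelined_cycles(instructions, stages):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """The first instruction fills the pipeline; each later one finishes a cycle after."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


n, k = 9, 5  # nine instructions through a five-stage pipeline
print(f"Nonpipelined: {nonpipelined_cycles(n, k)} cycles")
print(f"Pipelined:    {pipelined_cycles(n, k)} cycles")
```

Under these assumptions, nine instructions need 45 cycles without pipelining and 13 with it; real pipelines lose some of this gain to stalls and branches.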
• The 8085 had no pipelining.
• The 8086 was the first to use a form of pipelining.
• In the 486 the pipeline is broken down even further, into 5 stages as follows:
– 1. fetch (prefetch)
– 2. decode 1 (two-stage decode)
– 3. decode 2
– 4. execute
– 5. register write-back (for example, the result goes to EAX)
[Figure: pipelined vs. nonpipelined execution]
[Figure: Pipelining. Five stages (Instruction Fetch, Instruction Decode, Operand Fetch, Instruction Execution, Write Back) plotted against Time (clock cycles 1 to 9), with instructions 1 to 9 overlapping in successive stages.]
Heavy pipelining
• By heavily pipelining the fetching and execution of instructions, many 486 instructions are executed in only 1 clock cycle instead of the 3 clocks needed on the 386.
Other Fifth-Generation Processors Features:
• 16KB instruction cache
• Dynamic execution
• Five-stage RISC-like pipeline
• FPU
• Pin-selectable clock multiples of 1.5x & 2x
• IDT Centaur C6 Winchip
– Socket-7 compatible
– Speeds of 180, 200, 225, 240 MHz
– Not superscalar
– Slower with multimedia applications and games
– Smaller, less power consumption
Sixth-Generation (P6)
Dynamic Execution
Dual Independent Bus Architecture
Better Superscalar Design
Pentium Pro:
– 387-pin package
– has on-board (included with the CPU but not internal) L2 cache of 256KB, 512KB, or 1MB running at full core speed
Processor Memory-Addressing Capabilities
• Two integer and two memory units that can execute four instructions per clock cycle
• Two floating-point multiply accumulate units with 82-bit operands
• FMAC unit is capable of executing two floating-point operations per clock
• Two additional MMX units
• Eight single-precision FP operations can be executed every cycle
• 128 integer registers, 128 floating-point registers, 8 branch registers, 64 predicate registers
• New cartridge type which includes processor and L3 cache
• Dedicated cartridge power connector