Multi-core, Mega-nonsense

Multi-core, Mega-nonsense. Will multicore cure cancer? Given that multicore is a reality –…and we have quickly jumped from one core to 2 to 4 to 8 –It.

Jan 18, 2018


Gyles Malone


Multi-core, Mega-nonsense


Will multicore cure cancer?

• Given that multicore is a reality
  – …and we have quickly jumped from one core to 2 to 4 to 8
  – It is easy to let one’s imagination run wild: a million cores!

• A lot of misinformation has surfaced

• What multi-core is and what it is not

• And where we go from here


To whet the appetite

• Can multi-core save power via the frequency cube law?

• Is ILP dead?

• Should sample benchmarks drive future designs?

• Is hardware really sequential?

• Should multi-core structures be simple?

• Does productivity demand we ignore what’s below?
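
For readers unfamiliar with the first question: the "cube law" is usually argued from dynamic power P = C * V^2 * f, with supply voltage assumed to scale linearly with frequency, so per-core power goes as f^3. The sketch below works through that arithmetic only; it is not from the slides. The function names are hypothetical, and the linear voltage scaling and the Amdahl-style serial fraction are assumptions made for illustration.

```python
def relative_power(num_cores: int, freq_scale: float) -> float:
    """Total dynamic power relative to one core at full frequency.

    ASSUMPTION: per-core power is C * V^2 * f and V scales linearly
    with f, so per-core power scales with freq_scale ** 3.
    """
    return num_cores * freq_scale ** 3


def relative_throughput(num_cores: int, freq_scale: float,
                        parallel_fraction: float = 1.0) -> float:
    """Throughput relative to one core at full frequency.

    ASSUMPTION: Amdahl-style model in which the serial part of the
    work runs on a single slowed-down core.
    """
    serial_time = (1.0 - parallel_fraction) / freq_scale
    parallel_time = parallel_fraction / (num_cores * freq_scale)
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    # Two cores at half the clock, for a fully and a half parallel workload.
    for p in (1.0, 0.5):
        print(p, relative_power(2, 0.5), relative_throughput(2, 0.5, p))
```

Under these assumptions, two cores at half the clock draw a quarter of the power and match the original throughput only when the work is perfectly parallel. With any serial fraction the throughput falls below the single fast core.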


Mega-nonsense

• Multi-core was a solution to a performance problem
• Hardware works sequentially
• Make the hardware simple – thousands of cores
• Do in parallel at a slower clock and save power
• ILP is dead
• Examine what is (rather than what can be)
• Communication: off-chip hard, on-chip easy
• Abstraction is a pure good
• Programmers are all dumb and need to be protected
• Thinking in parallel is hard


How we got here (Moore’s Law)

• The first microprocessor (Intel 4004), 1971
  – 2300 transistors
  – 106 kHz

• The Pentium chip, 1992
  – 3.1 million transistors
  – 66 MHz

• Today
  – more than one billion transistors
  – Frequencies in excess of 5 GHz

• Tomorrow ?


How have we used the available transistors?

[Figure: number of transistors (y-axis) over time (x-axis), divided between cache and microprocessor]


Intel Pentium M


Intel Core 2 Duo

• Penryn, 2007
• 45nm, 3MB L2


Why Multi-core chips?

• In the beginning: a better and better uniprocessor
  – improving performance on the hard problems
  – …until it just got too hard

• Followed by: a uniprocessor with a bigger L2 cache
  – forsaking further improvement on the “hard” problems
  – poorly utilizing the chip area
  – and blaming the processor for not delivering performance

• Today: dual core, quad core, octo core

• Tomorrow: ???


Why Multi-core chips?

• It is easier than designing a much better uni-core …and cheaper!

• It was embarrassing to continue making L2 bigger

• It was the next obvious step


So, What’s the Point?

• Yes, Multi-core is a reality

• No, it wasn’t a technological solution to performance improvement

• Ergo, we do not have to accept it as is

• i.e., we can get it right the second time, and that means:
  – What goes on the chip
  – What are the interfaces


Hardware is the ultimate in parallelism!

It is NOT about cycle by cycle.
It is about what goes on in EACH cycle.


The Asymmetric Chip Multiprocessor (ACMP)

[Figure: three ways to fill the same chip area]

• ACMP Approach: 1 large core plus 12 Niagara-like cores
• “Niagara” Approach: 16 Niagara-like cores
• “Tile-Large” Approach: 4 large cores


Large core vs. Small Core

Large Core

• Out-of-order
• Wide fetch, e.g. 4-wide
• Deeper pipeline
• Aggressive branch predictor (e.g. hybrid)
• Many functional units
• Trace cache
• Memory dependence speculation

Small Core

• In-order
• Narrow fetch, e.g. 2-wide
• Shallow pipeline
• Simple branch predictor (e.g. Gshare)
• Few functional units


Throughput vs. Serial Performance

[Figure: speedup vs. 1 large core (y-axis, 0 to 9) against degree of parallelism (x-axis, 0 to 1), with curves for Niagara, Tile-Large, and ACMP]
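
The shape of such a comparison can be sketched with a simple Amdahl-style model. The core counts (16 small, 4 large, 1 large plus 12 small) come from the three approaches in the diagram. Everything else is an assumption made for illustration and is not taken from the slides: the names are hypothetical, a small core is assumed to deliver half the performance of a large core, the serial phase is assumed to run on the fastest available core, and the parallel phase is assumed to use every core.

```python
# ASSUMPTION: performance of one small core relative to one large core.
SMALL_CORE_PERF = 0.5

# (number of large cores, number of small cores), as in the diagram.
CONFIGS = {
    "Niagara": (0, 16),
    "Tile-Large": (4, 0),
    "ACMP": (1, 12),
}


def speedup(parallel_fraction: float, large: int, small: int,
            small_perf: float = SMALL_CORE_PERF) -> float:
    """Speedup relative to a single large core (Amdahl-style model)."""
    # Serial phase: run on the fastest core present on the chip.
    serial_perf = 1.0 if large > 0 else small_perf
    # Parallel phase: ASSUMPTION that every core contributes fully.
    parallel_perf = large * 1.0 + small * small_perf
    time = ((1.0 - parallel_fraction) / serial_perf
            + parallel_fraction / parallel_perf)
    return 1.0 / time


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        row = {name: round(speedup(p, *cfg), 2)
               for name, cfg in CONFIGS.items()}
        print(p, row)
```

In this model the all-small chip wins only when the work is almost entirely parallel, the all-large chip is limited by its low core count, and the asymmetric chip stays close to the better of the two across the range.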


Huh?


ILP is dead

• We double the number of transistors on the chip
  – Pentium M: 77 million transistors (50M for the L2 cache)
  – 2nd Generation: 140 million (110M for the L2 cache)

• We see 5% improvement in IPC

• Ergo: ILP is dead!

• Perhaps we have blamed the wrong culprit.

• The EV4,5,6,7,8 data: from EV4 to EV8:
  – Performance improvement: 55X
  – Performance from frequency: 7X
  – Ergo: 55/7 > 7, so more than half is due to microarchitecture
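
The arithmetic behind both points can be checked in a few lines. The figures are the ones quoted above. The variable names are hypothetical, and treating the EV4-to-EV8 gain as the product of a frequency factor and a microarchitecture factor is an assumption of this sketch.

```python
# Transistor counts from the slide, in millions: (total, L2 cache).
pentium_m = (77, 50)
second_gen = (140, 110)


def cache_share(total: float, cache: float) -> float:
    """Fraction of the transistor budget spent on the L2 cache."""
    return cache / total


# Where did the extra transistors go?
added_total = second_gen[0] - pentium_m[0]
added_cache = second_gen[1] - pentium_m[1]
added_core = added_total - added_cache

# EV4 -> EV8. ASSUMPTION: total gain = frequency gain * microarchitecture gain.
ev_total_gain = 55.0
ev_frequency_gain = 7.0
ev_microarch_gain = ev_total_gain / ev_frequency_gain

if __name__ == "__main__":
    print("L2 share, Pentium M:", cache_share(*pentium_m))
    print("L2 share, 2nd generation:", cache_share(*second_gen))
    print("added transistors (M): total, cache, core:",
          added_total, added_cache, added_core)
    print("microarchitecture factor:", ev_microarch_gain)
```

Of the 63 million added transistors, 60 million went to the L2 cache and only 3 million to the core, which is why a 5% IPC gain says little about ILP. The EV microarchitecture factor comes out slightly larger than the frequency factor.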


Moore’s Law

• A law of physics
• A law of process technology
• A law of microarchitecture
• A law of psychology


Examine what is (rather than what can be)

Should sample benchmarks drive future designs?

Another bridge over the East River?


“Abstraction” is Misunderstood

• Taxi to the airport
• The Scheme Chip (Deeper understanding)
• Sorting (choices)
• Microsoft developers (Deeper understanding)


Not all programmers are created equal

• Some want to just get their work done
  – Performance be damned
  – They couldn’t care less about how computers work

• Some want performance above all else
  – They understand how computers work
  – They can program at the lowest level

Ergo: At least two interfaces


Thinking in Parallel is Hard

• Perhaps: Thinking is Hard

• How do we get people to believe: Thinking in parallel is natural


Parallel Programming is Hard?

• What if we start teaching parallel thinking in the first course to freshmen

• For example:
  – Factorial
  – Parallel search
  – Streaming
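
As one illustration of the first example, factorial can be presented as a set of independent partial products that are combined at the end, rather than as a single running product. The sketch below is not from the slides; the function names and the chunking scheme are hypothetical. It uses Python threads only to show the decomposition, and in CPython threads should not be expected to make this particular computation faster.

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod


def chunk_product(bounds: tuple[int, int]) -> int:
    """Product of the integers lo..hi inclusive."""
    lo, hi = bounds
    return prod(range(lo, hi + 1))


def parallel_factorial(n: int, workers: int = 4) -> int:
    """Compute n! as a product of independently computed partial products."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return 1
    # Split 1..n into at most `workers` contiguous ranges.
    step = -(-n // workers)  # ceiling division
    chunks = [(lo, min(lo + step - 1, n)) for lo in range(1, n + 1, step)]
    # Each chunk is independent, so the partial products can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_product, chunks))
    # Combine step: multiply the partial results together.
    return prod(partials)


if __name__ == "__main__":
    print(parallel_factorial(10))
```

The point for a first course is the shape of the solution: multiplication is associative, so the work splits into pieces that do not depend on each other.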

