Why everything I learned at Leeds in 1972 is no longer true! Andrew Herbert, 30th March 2007

Page 1

Why everything I learned at Leeds in 1972 is no longer true!

Andrew Herbert

30th March 2007

Page 2

Personal Computing at Leeds (late 1960’s)

Page 3

Rethinking software

◦ In the 1970s, computer software was designed in large part to overcome hardware limitations

◦ In the 21st century, many of these limitations no longer apply. We have an abundance of computing resources

◦ This is changing how we think about software

Page 4

Caution

◦ Much of what I will say relates primarily to personal computing

◦ There are many computer applications that are still held back by hardware

  ◦ Large server systems such as Google or Hotmail

  ◦ High Performance Computing applications for science, such as genomics, computational physics and chemistry

Page 5

Moore’s Law (1965)

◦ Not really a “law”, but an observation, intended to hold for “...the next few years…”

◦ $(N_t/A)_{t_1} = (N_t/A)_{t_0} \times 1.58^{\,t_1 - t_0}$ ($t$ in years)

◦ $N_t$: number of transistors; $A$: area

◦ Moore’s observation has held for 35 years and has sustained the personal computer industry

◦ NB Moore’s Law is about transistor count, not speed…
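A minimal sketch of the compounding in the formula on this slide. The annual factor of 1.58 and the 35-year span come from the slide; the function names are illustrative only.

```python
import math

ANNUAL_FACTOR = 1.58  # growth in transistor density per year, from the slide


def density_growth(years: float, factor: float = ANNUAL_FACTOR) -> float:
    """Ratio of (N_t/A) at t1 to (N_t/A) at t0, where t1 - t0 = years."""
    return factor ** years


def doubling_time(factor: float = ANNUAL_FACTOR) -> float:
    """Years needed for transistor density to double."""
    return math.log(2) / math.log(factor)


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time():.2f} years")
    print(f"Growth over 35 years: {density_growth(35):.3g}x")
```

A factor of 1.58 per year corresponds to a doubling time of roughly 18 months, and 35 years of compounding gives a density increase of several million-fold.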

Page 6

Implications for Processors

◦ More complex designs: rich instruction sets, pipelining, out of order execution, speculative execution, caching

◦ More than one processor on a chip (homogeneous multi-processor)

◦ More than one processor on a chip, with specialized functions
  ◦ Graphics performance is improving much faster than CPU performance

Page 7

“… we see a very significant shift in what architectures will look like in the future ...fundamentally the way we've begun to look at doing that is to move from instruction level concurrency to … multiple cores per die. But we're going to continue to go beyond there. And that just won't be in our server lines in the future; this will permeate every architecture that we build. All will have massively multicore implementations.”

Pat Gelsinger, Chief Technology Officer, Senior Vice President, Intel Corporation
Intel Developer Forum, Spring 2004 (February 19, 2004)

[Figure: Power density (W/cm², log scale from 1 to 10,000) against year (’70 to ’10) for Intel processors: 4004, 8008, 8080, 8085, 8086, 286, 386, 486 and Pentium® processors. Reference levels marked: Hot Plate, Nuclear Reactor, Rocket Nozzle, Sun’s Surface. Source: Intel Developer Forum, Spring 2004 - Pat Gelsinger]

[Figure: Today's Architecture: Memory access speed not keeping up with CPU clock speeds. CPU Clock Speed and DRAM Access Speed (MHz, log scale from 10 to 10,000) against year (1990 to 2004). Source: Modern Microprocessors - Jason Patterson]

Today’s Architecture: Heat becoming an unmanageable problem!

Intel Cancels Top-Speed Pentium 4 Chip
Thu Oct 14, 6:50 PM ET, Technology - Reuters, by Daniel Sorid

Intel … canceled plans to introduce its highest-speed desktop computer chip, ending for now a 25-year run that has seen the speeds of Intel's microprocessors increase by more than 750 times.

Memory Wall: ~90 cycles of the CPU clock to access main memory!

The Intel perspective…

Page 8

Obsolete software idea 1: Single-threaded programs

◦ With uniprocessors, there is no compelling reason to try to use parallelism and write “concurrent” programs
  ◦ Your program will probably run slower due to thread creation, destruction, and switching
  ◦ Your program is harder to debug (“Heisenbugs”)

◦ Now we don’t have a choice!
  ◦ Attempts to tease the parallelism out of a sequential program automatically haven’t worked out very well
  ◦ We need better education, better languages, and better tools, since building concurrent programs is hard
  ◦ Data driven computation - e.g. “Map-Reduce”
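As an illustration of the “Map-Reduce” style mentioned on this slide, here is a minimal word-count sketch in Python. It is not taken from the talk: the function names are hypothetical, and a thread pool stands in for a real cluster. In CPython, threads do not speed up CPU-bound work, so a process pool or a distributed framework would be the realistic substitute. The point is the shape: independent map steps followed by an associative reduce.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    """Map step: count words in one chunk, independently of the others."""
    return Counter(chunk.lower().split())


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: combine two partial results."""
    return left + right


def word_count(chunks: list[str], workers: int = 4) -> Counter:
    # Chunks share no state, so the map step can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    docs = ["the cat sat", "the dog sat", "the end"]
    print(word_count(docs).most_common(2))
```

Because the map step touches no shared data, there is nothing to lock, which avoids the class of timing-dependent bugs the slide calls “Heisenbugs”.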

Page 9

Obsolete software idea 2: Low-level programming languages

◦ Errors in handling array and heap storage are one of the leading causes of system unreliability in C and C++

◦ Switching to languages like Java and C#, which provide automatic storage management, makes many types of errors impossible

◦ Until recently, programmers argued that these languages were too expensive in space and time
  ◦ Today, neither is an issue

◦ High level languages also allow programs that are easier to understand (and maintain)
  ◦ Maintenance is a high cost to the industry, since software evolves

◦ Functional programming finally comes of age?
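A small sketch of what “makes many types of errors impossible” means for array handling. Python stands in here for the managed languages named on the slide (an assumption for illustration): an out-of-range write is caught by the runtime instead of silently overwriting neighbouring memory, as it can in C or C++.

```python
def store(buffer: list[int], index: int, value: int) -> bool:
    """Write value at index; report failure instead of corrupting memory."""
    try:
        buffer[index] = value
        return True
    except IndexError:
        # The runtime checked the bound; the buffer is left untouched.
        return False


if __name__ == "__main__":
    buffer = [0] * 4
    print(store(buffer, 2, 7))   # index in range
    print(store(buffer, 10, 7))  # index out of range, rejected by the runtime
    print(buffer)
```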

Page 10

Moore’s Law for Primary Memory

◦ Capacity improvement: 1,000,000 X since 1970

◦ Bandwidth improvement: 100 X

◦ Latency reduction: only 10-20 X

◦ Dealing with latency is the largest problem for a computer system designer
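To compare the three trends on this slide, the sketch below converts each total improvement into an implied compound annual rate. The total factors come from the slide; the 37-year span (1970 to the 2007 talk date) is an assumption.

```python
def annual_rate(total_factor: float, years: float) -> float:
    """Compound annual improvement factor implied by a total improvement."""
    return total_factor ** (1.0 / years)


# Assumption: the improvements span 1970 to 2007 (the year of the talk).
YEARS = 2007 - 1970

TRENDS = {
    "capacity": 1_000_000,
    "bandwidth": 100,
    "latency (upper estimate)": 20,
}

if __name__ == "__main__":
    for name, factor in TRENDS.items():
        rate = annual_rate(factor, YEARS)
        print(f"{name}: {100 * (rate - 1):.1f}% per year")
```

Under that assumption capacity grows by about 45% a year, bandwidth by about 13% and latency improves by under 10%, which is why latency dominates the designer's problems.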

Page 11

Obsolete software idea 3: Dynamic Paging / Backing Store

◦ Originally used to compensate for expensive, small main store

◦ Later extended to increase address space and to add protection
  ◦ Today, it slows systems down because of disk access, and providing enough real memory is inexpensive

◦ Yet all major operating systems use it
  ◦ Although you can turn off paging in Windows

◦ Increasing use of modern programming languages is also reducing the need for paging

◦ But ‘virtualization’ for controlled resource sharing and resource separation is very important!

Page 12

Software Complexity

◦ Complexity of developing, testing and supporting large scale software systems continues to escalate

Source: Capers Jones, Estimating Software Costs, pg. 140; Capers Jones, Patterns of Software Systems Failure & Success

[Figure: Today’s Software Development Process (Large Projects): Only 13% of projects are on time! Percentage of projects (0% to 100%) that finish Early, On-time, Delayed or Canceled, against Lines of Code (100 to 10M).]

[Figure: Today’s Software Development Process (Medium/Large Projects): Only 18% of time spent on coding, 35% debugging! Percentage of effort by task (Documentation, Support and Management, Coding, Finding & Removing Defects), 0% to 100%, against Lines of Code (100 to 1M).]

Page 13

Obsolete software idea 4: Verifying software quality by testing

◦ Testing is needed, but has limits:
  ◦ Hard to find certain types of problems (e.g. concurrency)
  ◦ Huge number of configurations is daunting

◦ Need more use of formal methods
  ◦ Mathematical model of system: “specification”
  ◦ Automated “verification” of software implementation
  ◦ Used to be impractical due to state space size explosion and CPU speeds
  ◦ Now routinely used in hardware design, where the cost of a bug is much larger
  ◦ Increasingly used in Microsoft development
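A toy sketch of the specification/verification split described above, using a one-bit full adder as a hypothetical example in the spirit of hardware design. The specification is stated arithmetically, the implementation is built from logic gates, and the check enumerates the whole input space. Production tools rely on SAT solvers and model checkers rather than enumeration, because the input space doubles with every added bit.

```python
from itertools import product


def spec(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Specification: a one-bit full adder, stated arithmetically."""
    total = a + b + carry_in
    return total % 2, total // 2  # (sum bit, carry out)


def impl(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Implementation: the same adder built from XOR, AND and OR gates."""
    partial = a ^ b
    return partial ^ carry_in, (a & b) | (partial & carry_in)


def verify() -> list[tuple[int, int, int]]:
    """Return every input on which impl disagrees with spec."""
    return [bits for bits in product((0, 1), repeat=3)
            if impl(*bits) != spec(*bits)]


if __name__ == "__main__":
    counterexamples = verify()
    print("verified" if not counterexamples else counterexamples)
```

Unlike a test suite, an empty counterexample list here covers every possible input, not a sample of them.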

Page 14

[Figure: Capacity Growth in Proof Engines for Propositional Logic. Number of variables (0 to 10,000) by year (1952 to 2002). Source: Sharad Malik (Princeton) – CAV invited talk]


Page 16

Storage

◦ 30GB Personal file systems

◦ TerraServer - 5TB

◦ SkyServer – 40TB

Page 17

Obsolete software idea 5: Hierarchical file systems

◦ Originally introduced to improve access efficiency and to give users a familiar metaphor (the office file cabinet with folders)

◦ Today, it has serious problems
  ◦ A clean install of Windows and Office has 45,000+ files
  ◦ The structure chosen by a person today will not be appropriate in six months

◦ Need a new way to organize things so we don’t drown in information when we have a personal terabyte

Page 18

New Ways of Organizing Files

◦ Full text indexing

◦ Microsoft Vista desktop search

◦ Extend to audio and handwriting

◦ Object recognition for image and video search

◦ Huge opportunities for personalization and exploiting context

◦ Timelines, work flow discovery

◦ Email and IM buddies

◦ Location awareness

◦ Network location, GPS etc
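A minimal sketch of full text indexing as an alternative to navigating folders. It builds an inverted index from words to the files that contain them, so a file is found by its content wherever it lives in the hierarchy. The file names and contents are made up for illustration, and this does not describe how any particular desktop search product works.

```python
from collections import defaultdict


def build_index(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased word to the set of file names containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in files.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the files that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    files = {
        "notes/2007/talk.txt": "moore law and memory latency",
        "mail/inbox/42.txt": "memory prices keep falling",
    }
    index = build_index(files)
    print(search(index, "memory latency"))
```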

Page 19

Example – Visual Summary

Thumbnail Viewer - Win XP

Tapestry Viewer

Images in a folder

Page 20

The Perfect Tapestry or Photo Collage System

Input Images

- Compose new image (Graphics) - Semantic inference

Background Object People

Object Recognition and Detection

Flowers Rest

Page 21

The MSR Cambridge Digital Tapestry [Rother, Kumar, Kolmogorov, Blake; CVPR ’05]

Objective: Choose informative, representative parts from many images and place them realistically

Objective: Remove any visual seams

Contributions:
◦ Novel problem formulation
◦ Extension of optimisation technique
◦ Novel “membrane blending” based on [Perez, Gangnet, Blake Siggraph ‘03]

Block Tapestry

Digital Tapestry

Input Images

Page 22

Displays

Page 23

Obsolete idea 6: Computer monitors (screens) on desks

• What happens when display real-estate is free?

• What happens when dynamic displays are as ubiquitous as conventional signage, or whiteboards?

• What happens when displays can be as large as your wall?

• What happens when displays are as thin as your wallpaper?

• What happens if your cell phone has an A3 display, yet retains portability?

• What happens when all of this is factored in to other trends (wireless, speed, cost, …)?

Page 24

New models of interaction

◦ Combine new display formats with machine perception
  ◦ Handwriting, gesture – Tablet PC
  ◦ Speech
  ◦ Touch
  ◦ Scanning / Recognition
  ◦ Physical objects

◦ The future is “interactive surfaces”

Page 25

Obsolete idea 7: Standalone computers

◦ Electronics and photonics provide ever more bandwidth
  ◦ 1.36Tb/s “femto second networks” running now in the laboratory

◦ Wireless networking becoming ubiquitous
  ◦ Increasing bandwidth, decreasing power consumption, better spectrum optimization

◦ Everything is interconnected
  ◦ And therefore more complex…
  ◦ The “computer” is no longer the system boundary
  ◦ Tension between server versus client oriented computing models: rich vs. thin client, services vs. products vs. free (with adverts), centralized vs. peer to peer sharing and collaboration

◦ NB Latency

Page 26

Obsolete idea 8: Artificial Intelligence

◦ Real world is too complex to model using formal data structures

◦ Statistical models for computing with hypotheses turn out to be very robust and applicable to a wide range of problems

◦ Machine Learning
  ◦ Train an algorithm with labelled data
  ◦ Classify new data using trained algorithm
  ◦ Bayesian and Markov models dominate
  ◦ Contrast with “rule-based” AI

◦ Machine perception, but not understanding
  ◦ Increasingly practical due to both Moore’s law and better algorithms
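The train-then-classify loop described on this slide can be sketched with a small naive Bayes text classifier. This is one simple Bayesian model chosen for illustration, not a system from the talk, and the training sentences and labels are made up.

```python
import math
from collections import Counter, defaultdict


def train(examples: list[tuple[str, str]]):
    """Count words per label from (text, label) pairs."""
    label_counts: Counter = Counter()
    word_counts: dict[str, Counter] = defaultdict(Counter)
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocabulary


def classify(model, text: str) -> str:
    """Return the label with the highest posterior log-probability."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total_examples)  # log prior
        total_words = sum(word_counts[label].values())
        denominator = total_words + len(vocabulary)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words never give log(0).
            score += math.log((word_counts[label][word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Made-up training data, for illustration only.
    labelled = [
        ("cheap pills buy now", "spam"),
        ("buy cheap watches", "spam"),
        ("meeting agenda for monday", "ham"),
        ("lecture notes for monday", "ham"),
    ]
    model = train(labelled)
    print(classify(model, "buy cheap pills"))
```

No rule says which words indicate which label; the counts learned from labelled data carry that information, which is the contrast with “rule-based” AI drawn above.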

Page 27

Machine Translation

Microsoft data-driven machine translation system

◦ Hybrid system uses symbolic and statistical techniques

◦ Learns translation correspondences automatically from over a million bilingual sentence pairs

◦ Builds on same technology used in grammar checkers & other Microsoft products

Page 30

Translating Knowledge Bases

◦ Automatically translate Microsoft on-line Knowledge Base into multiple languages
  ◦ Previously English-only
  ◦ Only a small set of crucial articles (about 5% of total) had been translated
  ◦ Size (140,000 documents, 80 million words) made human translation economically infeasible
  ◦ Spanish online today, French and German coming…

                                                       SP KB (5 mos)   US KB (5 mos)
% of customers who are satisfied with KB               79%             73%
% of customers who could solve their issues using KB   55%             57%

Page 31

What Might Stop Moore’s Law?

◦ Physical limits
  ◦ “Atoms are too large, and light is too slow”
  ◦ Today, the problem isn’t making the transistors faster, it’s the time for signals to propagate on the wires (latency again)

◦ Power. Lots of transistors => lots of power. Cooling is hard

◦ Design complexity
  ◦ Designing a billion-transistor chip takes a large team, even with good design tools
  ◦ The “junk DNA” problem

◦ Economics
  ◦ Factories are very expensive

◦ Latency: See D. Patterson, “Latency Lags Bandwidth”, CACM, October 2004
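A back-of-the-envelope sketch of why “light is too slow”: the distance light covers in a single clock period sets a hard upper bound on how far any signal can travel in one cycle. The 3 GHz clock is an assumption chosen as typical of desktop processors around the time of the talk, and real on-chip signals are considerably slower than light in vacuum.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum; on-chip signals are slower


def distance_per_cycle_cm(clock_hz: float) -> float:
    """Upper bound on how far a signal can travel in one clock period."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100


if __name__ == "__main__":
    # Assumption: a 3 GHz clock.
    print(f"{distance_per_cycle_cm(3e9):.1f} cm per cycle")
```

At 3 GHz the bound is about 10 cm per cycle, so a round trip to memory a few centimetres away already costs a noticeable fraction of a cycle before any circuit delay is counted.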

Page 32

Conclusions

◦ Hardware has evolved rapidly

◦ We haven’t exploited it as well as we should

◦ Maybe some of the stuff I learned at Leeds was useful after all!