Scale as a Competitive Advantage David Chou [email protected] blogs.msdn.com/dachou

Scale as a Competitive Advantage

Nov 01, 2014


Technology

David Chou

Deck presented at the 2010 SOA & Cloud Symposium
Transcript
Page 1: Scale as a Competitive Advantage

Scale as a Competitive Advantage

David Chou [email protected] blogs.msdn.com/dachou

Page 2: Scale as a Competitive Advantage

The age of “big data”

Source: Wired Magazine: Issue 16.07, 2008.06.23; illustration by Marian Bantjes http://www.wired.com/science/discoveries/magazine/16-07/pb_intro

2010: ~1PB / 60 minutes (projected)

2008: ~1B views / day

2009: 600K photos served /sec

Page 3: Scale as a Competitive Advantage

“More is different”

Source: Wired Magazine: Issue 16.07, 2008.06.23 http://www.wired.com/science/discoveries/magazine/16-07/pb_intro

Infinite storage. Clouds of processors. Our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn't just more. More is different.

Page 4: Scale as a Competitive Advantage

“The future belongs to the companies and people that turn data into products”

Source: “What is data science?”, An O’Reilly Radar Report, 2010.06.02, Mike Loukides http://radar.oreilly.com/2010/06/what-is-data-science.html

Page 5: Scale as a Competitive Advantage

Working with data at scale

Source: “Data science democratized”, 2010.07.01, Mac Slocum http://radar.oreilly.com/2010/07/data-science-democratized.html

#teaparty cluster

#justinbieber cluster

• 45M tweets pattern visualization in minutes

…. “political world has more connective tissue than of-the-moment entertainment”

Page 6: Scale as a Competitive Advantage

Facebook (2009)

• +200B pageviews /month

• >3.9T feed actions /day

• +300M active users

• >1B chat messages /day

• 100M search queries /day

• >6B minutes spent /day (ranked #2 on Internet)

• +20B photos, +2B/month growth

• 600,000 photos served /sec

• 25TB log data /day processed thru Scribe

• 120M queries /sec on memcache

Big data needs big processing

Twitter (2009)

• 600 requests /sec

• avg 200-300 connections /sec; peak at 800

• MySQL handles 2,400 requests /sec

• 30+ processes for handling odd jobs

• process a request in 200 milliseconds in Rails

• average time spent in the database is 50-100 milliseconds

• +16 GB of memcached

Google (2007)

• +20 petabytes of data processed /day by +100K MapReduce jobs

• 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks

• +200 GFS clusters, each at 1-5K nodes, handling +5 petabytes of storage

• ~40 GB /sec aggregate read/write throughput across the cluster

• +500 servers for each search query < 500ms

• >1B views / day on Youtube (2009)
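As a rough sanity check on the petabyte sort figure above, the sketch below converts "1 petabyte in ~6 hours on ~4K servers" into average throughput. It assumes decimal units (1 PB = 10^15 bytes) and an even split of work across servers; both are assumptions made for illustration, not figures from the slide.

```python
# Back-of-the-envelope throughput for "1 PB sorted in ~6 hours on ~4K servers".
# Assumptions (not from the slide): decimal units, work spread evenly across servers.

PETABYTE = 10**15  # bytes, decimal definition
GIGABYTE = 10**9
MEGABYTE = 10**6


def aggregate_throughput(total_bytes: float, hours: float) -> float:
    """Return the average bytes per second over the whole run."""
    return total_bytes / (hours * 3600)


def per_server_throughput(total_bytes: float, hours: float, servers: int) -> float:
    """Return the average bytes per second handled by each server."""
    return aggregate_throughput(total_bytes, hours) / servers


if __name__ == "__main__":
    agg = aggregate_throughput(PETABYTE, 6)
    per = per_server_throughput(PETABYTE, 6, 4000)
    print(f"aggregate:  {agg / GIGABYTE:.1f} GB/s")
    print(f"per server: {per / MEGABYTE:.1f} MB/s")
```

Under these assumptions the job averages about 46 GB/s across the cluster but under 12 MB/s per server: a very large total built from modest individual machines.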

Myspace (2007)

• 115B pageviews /month

• 5M concurrent users @ peak

• +3B images, mp3, videos

• +10M new images/day

• 160 Gbit/sec peak bandwidth

Flickr (2007)

• +4B queries /day

• +2B photos served

• ~35M photos in squid cache

• ~2M photos in squid’s RAM

• 38k req/sec to memcached (12M objects)

• 2 PB raw storage

• +400K photos added /day

Source: multiple articles, High Scalability http://highscalability.com/

Page 7: Scale as a Competitive Advantage

• Big data collection and processing

– flying planes over nearly every inch of the United States

  • on road photos

  • 45-degree low-altitude aerial photos

  • high altitude plane photos

  • satellite photos

– 10% done (August 2010)

– previous “all USA” flight image gathering exercise took 10 years

– 5PB storage and thousands of servers in one container

Bing Maps

Source: “Map Wars (visiting Bing’s imaging center)”, 2010.08.10, Robert Scoble http://scobleizer.com/2010/08/10/map-wars-visiting-bings-imaging-center/

Page 8: Scale as a Competitive Advantage

• Characteristics

– On-demand self-service

– Broad network access

– Resource pooling

– Rapid elasticity

– Measured service

• Service models

– Software as a service

– Platform as a service

– Infrastructure as a service

• Deployment models

– Private cloud

– Community cloud

– Public cloud

– Hybrid cloud

Cloud computing

Source: The NIST Definition of Cloud Computing, Version 15, 2009.10.07, Peter Mell and Tim Grance http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc

“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.”

Page 9: Scale as a Competitive Advantage

• 2007

– founded by 6 people

• 2008

– $29M funding from VC

• 2009

– revenue: $270M

– $180M funding from Digital Sky Technologies

• 2010

– 1,200+ employees

– $300M funding from Google and Softbank

• Active unique players

– 215M monthly; 10% of world internet population (updated 2010.10); 60M daily

– 1M daily 4 days after launch; 10M after 60 days

– 3B neighborhood connections

• Cloud infrastructure

– 12,000 Amazon EC2 nodes

– Adding 1,000 servers per week (updated 2010.10)

– Moving 1PB data per day (updated 2010.10)

– 3 Gigabits/sec of traffic between FarmVille and Facebook (at peak)

– caching cluster serves another 1.5 Gigabits/sec to the application

Cloud levels the playing field

Source(s):

“How FarmVille Scales to Harvest 75 Million Players a Month”, HighScalability.com, 2010.02.08, Todd Hoff

“Zynga Moves 1 Petabyte Of Data Daily; Adds 1,000 Servers A Week”, TechCrunch.com, 2010.09.22, Leena Rao

Page 10: Scale as a Competitive Advantage

• Utility computing

– on-demand infrastructure

– self-provisioning and servicing

– rapid elasticity

– economy of scale

– operational expenditures

• Infrastructure-as-a-Service

• Service delivery model

… but cloud computing != cloud hosting

Cloud as a platform

Page 11: Scale as a Competitive Advantage

• Native cloud applications

– horizontal scaling (scale-out)

– parallelization

– shared-nothing architecture

– partitioned data (sharding)

– multi-tenancy

– failure resilient (or fail-in-place)

– service-oriented

– federated composition

• Platform-as-a-Service

• Application development model
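To make "partitioned data (sharding)" and "shared-nothing" from the list above concrete, here is a minimal sketch of hash-based shard routing. The `ShardedStore` class, the `shard_for` helper, and the use of in-memory dicts as stand-ins for database nodes are all hypothetical and chosen for illustration; the deck does not prescribe a particular partitioning scheme.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would break routing
    between processes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy partitioned key-value store; each dict stands in for one database node."""

    def __init__(self, num_shards: int) -> None:
        # Shards share nothing: each one holds only its own keys.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value: object) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    for user in ("alice", "bob", "carol", "dave"):
        store.put(user, {"name": user})
    print(store.get("carol"))
    print([len(shard) for shard in store.shards])  # keys per shard
```

Simple modulo routing like this remaps most keys when the shard count changes, which is why production systems often use consistent hashing or a lookup directory instead.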

Cloud as a platform

Page 12: Scale as a Competitive Advantage

[Diagram: four stacks labeled On-Premise, Infrastructure (as a Service), Platform (as a Service), and Software (as a Service). Each stack lists the same nine layers: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking. The layers in each stack are bracketed as either "You manage" or "Managed by vendor".]

Service delivery models

Page 13: Scale as a Competitive Advantage

Use more pieces, not bigger pieces

LEGO 10179 Ultimate Collector's Millennium Falcon

• 33 x 22 x 8.3 inches (L/W/H)

• 5,195 pieces

LEGO 7778 Midi-scale Millennium Falcon

• 9.3 x 6.7 x 3.2 inches (L/W/H)

• 356 pieces

Page 14: Scale as a Competitive Advantage

Live Journal (from Brad Fitzpatrick, then Founder at Live Journal, 2007)

Partitioned Data

Distributed Cache

Web Frontend

Distributed Storage

Apps & Services
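The "Distributed Cache" box that sits between the web frontend and the partitioned data in these architectures is commonly used in a cache-aside fashion. The sketch below shows that read path under simplifying assumptions: `TTLCache` is an in-process stand-in for a cache such as memcached, and `cached_read` and `load_from_db` are hypothetical names, not anything taken from the LiveJournal design.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for a distributed cache such as memcached."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._items[key] = (time.monotonic() + self.ttl, value)

    def delete(self, key: str) -> None:
        self._items.pop(key, None)


def cached_read(cache: TTLCache, key: str, load: Callable[[str], object]) -> object:
    """Cache-aside read: try the cache, fall back to the data store, then populate.

    Note: a stored value of None is indistinguishable from a miss in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value)
    return value


if __name__ == "__main__":
    database = {"user:1": "alice"}
    calls = []

    def load_from_db(key: str) -> object:
        calls.append(key)  # record each trip to the slower store
        return database[key]

    cache = TTLCache(ttl_seconds=60)
    print(cached_read(cache, "user:1", load_from_db))  # miss: loads from the database
    print(cached_read(cache, "user:1", load_from_db))  # hit: served from the cache
    print(f"database reads: {len(calls)}")
```

On writes, the usual companion step is to update the data store and then call `delete` on the cached key so the next read repopulates it.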

Page 15: Scale as a Competitive Advantage

Flickr (from Cal Henderson, then Director of Engineering at Yahoo, 2007)

Partitioned Data

Distributed Cache

Web Frontend

Distributed Storage

Apps & Services

Page 16: Scale as a Competitive Advantage

SlideShare (from John Boutelle, CTO at Slideshare, 2008)

Partitioned Data

Distributed Cache

Web Frontend

Distributed Storage

Apps & Services

Page 17: Scale as a Competitive Advantage

Twitter (from John Adams, Ops Engineer at Twitter, 2010)

Partitioned Data

Distributed Cache

Web Frontend

Distributed Storage

Apps & Services

Queues

Async Processes
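The "Queues" and "Async Processes" boxes describe a pattern where the web tier only enqueues work and background workers drain it later. A minimal single-machine sketch follows, using Python's standard `queue` and `threading` modules as stand-ins for a real message queue and worker daemons; `start_workers` and `deliver` are hypothetical names, and nothing here reflects Twitter's actual implementation.

```python
import queue
import threading
from typing import Callable, List


def start_workers(jobs: "queue.Queue", handler: Callable[[object], None],
                  count: int) -> List[threading.Thread]:
    """Start `count` threads that pull jobs off the queue until they see None."""

    def run() -> None:
        while True:
            job = jobs.get()
            try:
                if job is None:  # sentinel: stop this worker
                    return
                handler(job)  # a real worker would also catch and log errors
            finally:
                jobs.task_done()

    threads = [threading.Thread(target=run, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


if __name__ == "__main__":
    jobs: "queue.Queue" = queue.Queue()
    delivered = []
    lock = threading.Lock()

    def deliver(message: object) -> None:
        with lock:
            delivered.append(message)

    workers = start_workers(jobs, deliver, count=3)

    # The "web request" only enqueues and returns immediately.
    for i in range(10):
        jobs.put(f"tweet-{i}")

    for _ in workers:
        jobs.put(None)  # one sentinel per worker
    jobs.join()  # block until every queued item has been processed
    print(len(delivered), "messages delivered")
```

Decoupling the request from the work this way lets the frontend respond quickly and lets the worker count scale independently of web traffic.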

Page 18: Scale as a Competitive Advantage

2010 stats (Source: http://www.facebook.com/press/info.php?statistics)

– People

• +500M active users

• 50% of active users log on in any given day

• people spend +700B minutes /month

– Activity on Facebook

• +900M objects that people interact with

• +30B pieces of content shared /month

– Global Reach

• +70 translations available on the site

• ~70% of users outside the US

• +300K users helped translate the site through the translations application

– Platform

• +1M developers from +180 countries

• +70% of users engage with applications /month

• +550K active applications

• +1M websites have integrated with Facebook Platform

• +150M people engage with Facebook on external websites /month

Facebook (from Jeff Rothschild, VP Technology at Facebook, 2009)

Partitioned Data

Distributed Cache

Web Frontend

Distributed Storage

Apps & Services

Parallel Processes

Async Processes
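"Parallel Processes" in this picture can be read as splitting a large job into independent chunks, processing them side by side, and merging the results, in the spirit of the MapReduce jobs mentioned earlier in the deck. The word-count sketch below is a toy illustration of that shape only; the function names are hypothetical, and a thread pool is used for simplicity even though CPU-bound work in CPython would normally need processes or separate machines to run truly in parallel.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def count_words(chunk: str) -> Counter:
    """Map step: count words in one independent chunk of input."""
    return Counter(chunk.lower().split())


def merge_counts(partials: Iterable[Counter]) -> Counter:
    """Reduce step: combine the per-chunk counts into one total."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def parallel_word_count(chunks: List[str], workers: int = 4) -> Counter:
    """Run the map step on a pool of workers, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge_counts(pool.map(count_words, chunks))


if __name__ == "__main__":
    chunks = ["more is different", "more data more answers"]
    print(parallel_word_count(chunks).most_common(1))
```

Because each chunk is counted without reference to any other, adding workers (or machines) does not require coordination until the final merge.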

Page 19: Scale as a Competitive Advantage

• Scale-out architecture + distributed computing

– small logical units of work

– loosely-coupled processes

– stateless

– event-driven design

– optimistic concurrency

– partitioned data

– redundancy fault-tolerance

– re-try-based recoverability
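Two items in the list above, optimistic concurrency and re-try-based recoverability, work naturally together: a writer does not lock a record while it works, and if another writer got there first it simply re-reads and tries again. The sketch below illustrates this with a version-checked write. `VersionedStore`, `write_if_unchanged`, and `update_with_retry` are hypothetical names for illustration, and the in-memory dict stands in for a real data store.

```python
import threading
from typing import Callable, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    """Toy store where every value carries a version number."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, int]] = {}
        self._lock = threading.Lock()  # guards only the short compare-and-set

    def read(self, key: str) -> Tuple[int, int]:
        """Return (version, value); missing keys read as version 0, value 0."""
        with self._lock:
            return self._data.get(key, (0, 0))

    def write_if_unchanged(self, key: str, expected_version: int, value: int) -> None:
        """Write only if nobody else has written since `expected_version` was read."""
        with self._lock:
            current_version, _ = self._data.get(key, (0, 0))
            if current_version != expected_version:
                raise VersionConflict(key)
            self._data[key] = (current_version + 1, value)


def update_with_retry(store: VersionedStore, key: str,
                      change: Callable[[int], int], max_attempts: int = 5) -> int:
    """Read, compute, and write; on conflict, re-read and try again."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        new_value = change(value)
        try:
            store.write_if_unchanged(key, version, new_value)
            return new_value
        except VersionConflict:
            continue  # another writer won the race; retry with fresh data
    raise RuntimeError(f"gave up on {key!r} after {max_attempts} attempts")


if __name__ == "__main__":
    store = VersionedStore()
    update_with_retry(store, "likes", lambda n: n + 1)

    # Simulate a competing writer holding a stale version.
    stale_version, _ = store.read("likes")
    update_with_retry(store, "likes", lambda n: n + 1)
    try:
        store.write_if_unchanged("likes", stale_version, 99)
    except VersionConflict:
        print("stale write rejected")
    print(store.read("likes"))
```

The retry loop is only safe because `change` is a small, repeatable unit of work, which ties back to the "small logical units of work" and "stateless" items in the same list.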

Cloud computing as a new paradigm

[Diagram: multiple web, app server, and data store nodes, plus parallel tasks and async tasks.]

Page 20: Scale as a Competitive Advantage

Strategic advantages of cloud computing

• cost reduction

• time to market

• pay by use

• ability to scale

Page 21: Scale as a Competitive Advantage

• Data

– data federation

– data purification

– data democratization

– derived intelligence

• Process

– Web as a platform

– federated applications

– adaptive agents

What’s next?

Page 22: Scale as a Competitive Advantage

© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Thank you!

David Chou [email protected] blogs.msdn.com/dachou