Directions for Hadoop Innovation, Yahoo

Post on 01-Jul-2015

BD Hadoop SF 2013

Directions for Hadoop Innovation

Apr 2013

Eric Bax

Hadoop in Online Advertising at Yahoo!

Response Prediction – Clicks and Conversions

Allocation and Pricing – Guaranteed

Analytics – Marketplace Monitoring

Science – Value of Advertising

Marketplace Operations

Model Construction

Auction

Reconciliation

Analytics and Billing

Ad Calls

Auction Log

Ad Served

Clicks and Conversions

Response Frequencies

Predict Model

Ad + Response

ROI Evaluation

Online/Offline Sales $

Desiderata

Faster Answers

Fewer Computations per Datum

From Analytics to Active Monitoring

From Batch Cycles to Sense and Respond

Faster Turnaround

9/18/2013

Act on the 80% of Data That Arrives Quickly

Then Correct as Late-Landing Data Arrive

Pull for Initial Result; Push for Updates?
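One way to picture "act early, then correct" is a running estimate built from additive counts, so late-landing batches simply adjust the totals. The sketch below is illustrative only: the class name, the batch sizes, and the click counts are hypothetical and do not come from the talk.

```python
class RunningClickRate:
    """Click-through rate that answers immediately and absorbs late data."""

    def __init__(self):
        self.impressions = 0
        self.clicks = 0

    def add_batch(self, impressions, clicks):
        # Counts are additive, so early and late batches are handled the
        # same way: a late batch just corrects the running totals.
        self.impressions += impressions
        self.clicks += clicks
        return self.rate()

    def rate(self):
        # Return 0.0 until at least one impression has been seen.
        return self.clicks / self.impressions if self.impressions else 0.0


ctr = RunningClickRate()
# Hypothetical fast-arriving portion of the data: act on this first.
early = ctr.add_batch(impressions=8000, clicks=40)
# Hypothetical late-landing portion: the estimate is corrected in place.
corrected = ctr.add_batch(impressions=2000, clicks=20)
```

A consumer could pull `early` as the initial result and be pushed `corrected` when the late batch lands, which is one reading of the pull/push question above.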

Online Updates to Models

Each day produces Big Data.

Whole history: HUMONGOUS DATA.

Update models based on new data only.

And perhaps exceptions / borderline cases from history.
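As a minimal sketch of updating a model from new data only, the code below applies one stochastic gradient step of logistic regression per new example instead of retraining on the whole history. The choice of logistic regression, the learning rate, and the toy examples are assumptions made for illustration; the talk does not name a model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Predicted response probability under the current weights."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def online_update(weights, features, label, learning_rate=0.1):
    """One stochastic gradient step on a single new example."""
    error = predict(weights, features) - label
    return [w - learning_rate * error * x for w, x in zip(weights, features)]


# Weights carried over from earlier days (zeros here, as a stand-in).
weights = [0.0, 0.0]

# Hypothetical new-day data: (features, clicked). The first feature is a bias term.
new_day = [([1.0, 1.0], 1), ([1.0, 0.0], 0)] * 50

for features, label in new_day:
    weights = online_update(weights, features, label)
```

Borderline cases from history could be replayed through the same `online_update` call alongside the new examples.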

“Embedded” Computation

Move Computation Closer to Where Data are Generated

Monitor for Anomalies Where they Occur

(Sometimes) Compress into Sketches before Transmitting Data

Hadoop as Part of Serving vs Isolated Clusters?
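To make "compress into sketches before transmitting" concrete, here is a small Count-Min sketch, one common sketch type; the talk does not say which sketches it has in mind. Each ingest point keeps a fixed-size table of counters and ships only that table, and tables from different points merge by addition. The width, depth, and key names are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency summary; estimates never undercount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Derive one hash per row by salting the key with the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so take the minimum.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

    def merge(self, other):
        # Both sketches must share the same width and depth.
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]


# Two hypothetical ingest points summarize locally, then one ships its sketch.
node_a, node_b = CountMinSketch(), CountMinSketch()
for _ in range(3):
    node_a.add("ad_42")
node_a.add("ad_7")
for _ in range(2):
    node_b.add("ad_42")
node_a.merge(node_b)
```

After the merge, `node_a.estimate("ad_42")` is at least the true combined count of 5, and the transmitted payload stays at `width * depth` counters regardless of how many events each node saw.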

Localized / Contextual Computation

Propagate Data Among Logical Neighbors Quickly

Multi-Resolution Approach at Different Time Scales

Challenge: Clustering into Logical Neighborhoods to Fit Problem
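A multi-resolution view at different time scales can be sketched as rolling the same fine-grained counts up into coarser buckets, so that detail stays local while coarser aggregates travel further. The function name, the bucket sizes, and the click counts below are hypothetical.

```python
from collections import Counter


def roll_up(counts_by_second, bucket_seconds):
    """Aggregate per-second counts into buckets of the given size."""
    coarse = Counter()
    for timestamp, count in counts_by_second.items():
        # Key each bucket by the timestamp at which it starts.
        coarse[timestamp - timestamp % bucket_seconds] += count
    return dict(coarse)


# Hypothetical click counts keyed by second offset.
clicks = {0: 2, 30: 1, 61: 4, 3599: 3, 3600: 5}

by_minute = roll_up(clicks, 60)    # finer resolution, e.g. for neighbors
by_hour = roll_up(clicks, 3600)    # coarser resolution, e.g. for global use
```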

Search Clusters

Who Clicks?

Who Doesn’t?

Hadoop in Five Years

Will Hadoop grow by adding features / options?

Will it branch: faster, lighter, approximate, embedded versions?

Truly huge version? With approximation / sampling / multi-resolution?

The Right Fit

Multi-resolution sense and respond.

Details to neighbors, sketches and aggregates globally.

Migrate processes and storage to ingest points or logical neighbors.

Tune system-wide performance through human-machine dialog.

Thank You

Eric Bax

ebax@yahoo-inc.com
