Page 1: Dale Wong - Spark GraphX Demo

A Different Look at Ad Targeting

A Swarm of Ads

Page 2: Dale Wong - Spark GraphX Demo

Nature-Inspired Algorithm for AdTech Model Exploration

• Pages are linked by similarity, forming a network of branches

• Ads are like butterflies, drifting towards attractors

• The flight of the butterflies is a function of local attraction to pages, plus some randomness to escape local minima

• System converges to ads hovering around relevant pages
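
The slides do not show the movement rule itself, so the following is only a minimal sketch of one way "local attraction plus some randomness" could be coded. The names `ButterflyStep`, `nextPage`, and `explore` are hypothetical, and the roulette-wheel choice is an assumption, not the author's confirmed method.

```scala
import scala.util.Random

object ButterflyStep {
  // Hypothetical helper: choose the next page for a single ad.
  // `attraction` maps each neighboring page id to a non-negative score.
  // With probability `explore` the ad jumps to a uniformly random neighbor
  // (the randomness that lets it escape a local minimum); otherwise it moves
  // to a neighbor picked with probability proportional to its attraction.
  def nextPage(attraction: Map[Long, Double], explore: Double, rng: Random): Long = {
    require(attraction.nonEmpty, "an ad needs at least one neighboring page")
    val pages = attraction.keys.toVector.sorted
    val total = pages.map(attraction).sum
    if (rng.nextDouble() < explore || total <= 0.0) {
      pages(rng.nextInt(pages.size))
    } else {
      // Roulette-wheel selection over the attraction scores
      val target = rng.nextDouble() * total
      val cumulative = pages.scanLeft(0.0)((acc, p) => acc + attraction(p)).tail
      val idx = cumulative.indexWhere(_ > target)
      if (idx >= 0) pages(idx) else pages.last
    }
  }
}
```

With `explore` set to 0 the ad always follows attraction; raising it trades convergence speed for a wider search.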

Page 7: Dale Wong - Spark GraphX Demo

Similarity Graph

vertex = page

edge = similarity

val allPairs = pages.cartesian(pages).filter { case (a, b) => a._1 < b._1 }
val similarPairs = allPairs.filter { case (page1, page2) =>
  page1._2.intersect(page2._2).length >= 1
}
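
To make the pairing rule easy to try without a cluster, here is a small sketch of the same filter on plain Scala collections. It assumes each page is an `(id, words)` tuple, which is what the `_1` and `_2` accessors above suggest; `SimilarityPairs`, `minShared`, and the returned shared-word count are illustrative additions, not part of the original demo.

```scala
object SimilarityPairs {
  // Assumed shape of a page: (page id, list of words)
  type Page = (Long, Seq[String])

  // Returns (lower id, higher id, number of shared distinct words) for every
  // unordered pair of pages that shares at least `minShared` words.
  def similarPairs(pages: Seq[Page], minShared: Int = 1): Seq[(Long, Long, Int)] =
    for {
      a <- pages
      b <- pages
      if a._1 < b._1 // keep each unordered pair once, drop self-pairs
      shared = a._2.distinct.intersect(b._2.distinct).length
      if shared >= minShared
    } yield (a._1, b._1, shared)
}
```

Like `cartesian`, this compares every pair, so the work grows quadratically with the number of pages; the shared-word count could serve as an edge weight if one is wanted.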

Page 8: Dale Wong - Spark GraphX Demo

Data Set

• Kaggle 2012 Challenge: Click-Thru Rate Prediction

• Actual data provided by a Chinese search company

• CSV files

• 26M search queries

  • each query has its list of words

  • e.g. “data scientist”

• 4M ads

  • each ad has its list of words

  • e.g. “Insight Data Engineering Program”

Page 11: Dale Wong - Spark GraphX Demo

Ads Converge on Similar Pages Over Time

Page 14: Dale Wong - Spark GraphX Demo

Butterfly Simulation is a Good Fit for Spark GraphX

• Many parallel computations of localized operations

  • ad migration

  • attraction propagation

  • select ad for page request

Page 18: Dale Wong - Spark GraphX Demo

Spark GraphX Google Pregel API

MapReduce for each vertex:

1. Send Messages (for each edge)

• send msgs to neighbors

2. Merge Messages

• merge msgs to same vertex

3. Vertex Program

• process incoming msgs

[Diagram labels: Vertex Program, Send Message, Merge Messages]
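
The three steps can be emulated on a single machine to see how they fit together. The sketch below is not the GraphX implementation: `MiniPregel` and its simplified `Edge` type are hypothetical, and it evaluates every edge on every round, whereas GraphX passes an edge triplet to `sendMsg` and skips edges whose endpoints are inactive. The function names `vprog`, `sendMsg`, and `mergeMsg` mirror the three arguments GraphX's `Pregel` operator takes.

```scala
object MiniPregel {
  final case class Edge(src: Long, dst: Long)

  // Single-machine sketch of the Pregel loop (assumption: small in-memory graph).
  def run[V, M](
      vertices: Map[Long, V],
      edges: Seq[Edge],
      initialMsg: M,
      maxIterations: Int
  )(
      vprog: (Long, V, M) => V,
      sendMsg: (Edge, V, V) => Iterator[(Long, M)],
      mergeMsg: (M, M) => M
  ): Map[Long, V] = {
    // Superstep 0: every vertex processes the initial message
    var state = vertices.map { case (id, v) => id -> vprog(id, v, initialMsg) }
    var iteration = 0
    var active = true
    while (active && iteration < maxIterations) {
      // 1. Send Messages: evaluated once per edge
      val sent = edges.flatMap(e => sendMsg(e, state(e.src), state(e.dst)))
      // 2. Merge Messages: combine messages addressed to the same vertex
      val inbox = sent.groupBy(_._1).map { case (id, msgs) =>
        id -> msgs.map(_._2).reduce(mergeMsg)
      }
      // 3. Vertex Program: update only the vertices that received a message
      state = state ++ inbox.map { case (id, m) => id -> vprog(id, state(id), m) }
      active = inbox.nonEmpty
      iteration += 1
    }
    state
  }
}
```

As a usage idea, attraction propagation could be modeled with `Double` vertex values, a `sendMsg` that forwards half of the source's attraction when it exceeds the destination's, and `math.max` as both `mergeMsg` and `vprog`; the halving factor is an arbitrary choice for illustration.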

Page 19: Dale Wong - Spark GraphX Demo

Need to Adapt Programming Model

• Adapt Vertex-centric algorithm to Edge-centric API

• Replicate each vertex’s data onto its neighbors

• Replication implemented as an initialization phase Pregel cycle

• Localizes vertex calculations

• With a Spark GraphX cluster, network bandwidth is more of a concern than storage
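
The replication step can be pictured as one send/merge/update round in which every vertex ships its own data across each of its edges. The sketch below shows that idea on plain Scala collections; `NeighborReplication`, `Replicated`, and the undirected `(Long, Long)` edge encoding are assumptions made for illustration and are not taken from the demo's code.

```scala
object NeighborReplication {
  // Vertex state after initialization: own data plus a local copy of each neighbor's data
  final case class Replicated[D](own: D, neighbors: Map[Long, D])

  def replicate[D](data: Map[Long, D], edges: Seq[(Long, Long)]): Map[Long, Replicated[D]] = {
    // Send: each edge carries the data of both endpoints to the opposite endpoint
    val msgs: Seq[(Long, Map[Long, D])] = edges.flatMap { case (a, b) =>
      Seq(b -> Map(a -> data(a)), a -> Map(b -> data(b)))
    }
    // Merge: union the per-neighbor maps addressed to the same vertex
    val inbox = msgs.groupBy(_._1).map { case (id, ms) =>
      id -> ms.map(_._2).reduce(_ ++ _)
    }
    // Vertex program: store the merged copies next to the vertex's own data
    data.map { case (id, d) =>
      id -> Replicated(d, inbox.getOrElse(id, Map.empty[Long, D]))
    }
  }
}
```

After this round each vertex can evaluate its neighbors locally, at the cost of storing one extra copy of a vertex's data per edge endpoint.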

Page 22: Dale Wong - Spark GraphX Demo

About Dale

• Bachelor's in Computer Science, UC Berkeley

• Algorithm development for semiconductor design

• Algorithm development for genomic analysis

• Co-founder of three startups

• 18 US patents granted

• I hate vacationing in nature