Data Science at Flurry
Post on 13-May-2015
Data Science at Flurry
Soups Ranjan, PhD (soups@flurry.com)
We all know that when we’re talking about mobile, we’re talking about apps
Source: Nielsen- State of the App Nation 2012 Report and June 2013 Cross Platform Report
[Chart: time spent on mobile devices]
Flurry has the deepest insight into consumer behavior on mobile
[Chart: Monthly Device Reach (Millions): Flurry 1,200; Facebook 875; Google 700; Millennial Media 400; Twitter 320; JumpTap 100]
Source: data gathered from public statements/filings by companies; Facebook denotes property and network; Google reach denotes sites and network
Flurry Product Overview
• Flurry Analytics: track users, sessions, events and crashes
• Flurry AppCircle: advertise with Flurry to acquire new users for your app
• Flurry AppSpot: monetize your app traffic via ads
AppCircle: Advertise to Acquire Users
• AppCircle: advertiser configuration to set up an ad:
  – Ad type: CPI, CPC, CP Video
  – Corresponding bid
  – Ad format: banner or interstitial
  – Targeting (age, gender, device, location, persona)
• AppCircle Bidder: optimally acquire ad-space inventory where ads can be shown
AppCircle Bidder Strategy
[Diagram: bidder data flow]
• Cost model (bid-price estimation): given a bid request (user, pub, exchange) and the set of eligible ads, uses the history of bids and win prices to produce (Ad1, Bid1, P(win)1) … (Adn, Bidn, P(win)n)
• Revenue model: uses the history of ad impressions and conversions to produce (Ad1, AdvBid1, P(conv)1) … (Adn, AdvBidn, P(conv)n)
• Budget pacing: tracks each ad's advertiser bid, daily budget and spend
• Ad selector: combines the above with advertiser goals (α, β) to pick an ad and its bid price, then bids that ad on the exchange
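The budget-pacing component in the strategy above is only named in the slides, not specified. A minimal sketch, assuming a simple uniform-over-the-day pacing rule (the factor names and the rule itself are illustrative, not Flurry's actual policy):

```python
from datetime import datetime

def should_bid(daily_budget, spend_so_far, now=None):
    """Uniform budget pacing sketch: keep bidding only while spend is at or
    below the fraction of the daily budget proportional to the hours elapsed
    today. Illustrative assumption; the deck does not give the real rule."""
    now = now or datetime.now()
    hours_elapsed = now.hour + now.minute / 60.0
    # Allow at least one hour's worth of budget so we never stall at midnight.
    allowed = daily_budget * max(hours_elapsed, 1.0) / 24.0
    return spend_so_far <= allowed
```

A pacer like this prevents an ad from exhausting its daily budget in the first few minutes of high-volume exchange traffic.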
Bidder Ad Selection Model - I
• Ad selection model:
  SelectAd(adv, pub, exchange, user) =
    argmax ( P(win)^α * (Revenue(adv, pub, exchange, user) - β * Cost(adv, pub, exchange, user)) )
• Maximize-margin model (α = β = 1):
  SelectAd(adv, pub, exchange, user) =
    argmax ( P(win) * (Revenue(adv, pub, exchange, user) - Cost(adv, pub, exchange, user)) )
  – May lead to a lower advertiser fill rate, since we then only bid to show an advertiser's ad when we expect to win at a price lower than the advertiser's bid

  Ad     Rev (eCPM)   Cost   P(win)   Rank
  Adv1   1.50         1.30   0.30     0.3 * (1.5 - 1.3) = 0.06
  Adv2   0.60         0.50   0.70     0.7 * (0.6 - 0.5) = 0.07
Bidder Ad Selection Model - II
• Maximize fill rate for advertiser (α = 1, β = 0):
  SelectAd(adv, pub, exchange, user) =
    argmax ( P(win) * Revenue(adv, pub, exchange, user) )
  – We select the ad that maximizes our revenue goal; however, we only bid if revenue > cost

  Ad     Rev (eCPM)   Cost   P(win)   Rank
  Adv1   1.50         1.30   0.30     0.3 * (1.5) = 0.45
  Adv2   0.60         0.50   0.70     0.7 * (0.6) = 0.42
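The two parameterizations of the selection rule can be sketched in a few lines; the numbers below reproduce the Adv1/Adv2 worked examples from the slides (the dict-based ad representation is my own, for illustration):

```python
def select_ad(ads, alpha=1.0, beta=1.0):
    """Rank eligible ads by P(win)^alpha * (revenue - beta * cost)
    and return the highest-ranked one, per the selection model above."""
    def rank(ad):
        return ad["p_win"] ** alpha * (ad["rev"] - beta * ad["cost"])
    return max(ads, key=rank)

ads = [
    {"name": "Adv1", "rev": 1.50, "cost": 1.30, "p_win": 0.30},
    {"name": "Adv2", "rev": 0.60, "cost": 0.50, "p_win": 0.70},
]

# Maximize margin (alpha = beta = 1): Adv2 wins, 0.07 > 0.06.
margin_winner = select_ad(ads, alpha=1.0, beta=1.0)
# Maximize fill rate (alpha = 1, beta = 0): Adv1 wins, 0.45 > 0.42.
fill_winner = select_ad(ads, alpha=1.0, beta=0.0)
```

Note how the same two ads rank differently under the two advertiser goals, which is exactly the trade-off the (α, β) knobs expose.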
AppCircle: Ad Revenue Optimization
• Ad revenue optimization problem:
  – Maximize: P(conv) * bid
  – Conversion prediction model: maximize P(conv)
• Historical estimation: past conversion rate as a predictor of future conversion rates
• ML conversion prediction model: features include publisher, ad, user, time, location
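The historical-estimation baseline above can be sketched as follows; the smoothing toward a prior is my own addition (a common guard for low-traffic publisher/ad pairs), not something the slides specify:

```python
def historical_conv_rate(conversions, impressions, prior=0.01, strength=10):
    """Past conversion rate as a predictor of the future one, smoothed
    toward a prior click-to-install rate so that pairs with few
    impressions don't get extreme estimates. Prior values are
    illustrative assumptions."""
    return (conversions + prior * strength) / (impressions + strength)

def expected_revenue(conversions, impressions, adv_bid):
    """Expected revenue per impression = P(conv) * advertiser bid."""
    return historical_conv_rate(conversions, impressions) * adv_bid
```

With zero history the estimate falls back to the prior; as impressions accumulate, the observed rate dominates.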
[Chart: conversion probability by user id, comparing the conv-prob of users who saw Ad1 in Pub1's app against the average conv-prob]
Bidder Cost Model
• Cost model: we don't know about the other players in the auction; the best we can do is predict based on our own wins and losses:
  1) If historically we win auctions for users in Kansas City,
  2) then most likely other bidders are not interested in Kansas City users,
  3) so next time we lower our bid for Kansas City users.
  4) If we still win those Kansas City users, continue (1-3);
  5) if not, we revise our bid back up.
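The win/lose feedback loop above can be sketched as a one-step bid update; the down/up factors and the floor are hypothetical parameters chosen for illustration:

```python
def adjust_bid(bid, won, floor=0.01, down=0.95, up=1.10):
    """One iteration of the feedback loop over auction outcomes for a
    segment (e.g. Kansas City users): while we keep winning, shave the
    bid down; on a loss, revise it back up. Factors are illustrative,
    not from the talk."""
    if won:
        return max(bid * down, floor)
    return bid * up
```

Run repeatedly per segment, this walks the bid down toward the (unobserved) second-highest price and bounces back up when it undershoots.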
Machine Learning based Bidder Cost Model
• A machine-learnt model gives us both: cost and P(win)
• Multi-class classification model (logistic regression) to predict the win price of an ad impression
[Diagram: example win-price classes predicted by the model (win-price = 27c, 28c, 52c, and "no win"), with associated P(win) ranging from ~1.0 down to ~0.0]
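Given the class probabilities such a multi-class model emits over win-price buckets (including a "no win" class), both quantities the bidder needs can be read off directly. A small sketch, assuming the classifier output is a probability per bucket (the dict shape and example numbers are mine):

```python
def cost_and_p_win(class_probs):
    """class_probs maps a win-price bucket (in cents; None = 'no win')
    to its predicted probability, e.g. from a multi-class logistic
    regression. Returns (expected win price given a win, P(win))."""
    p_win = sum(p for price, p in class_probs.items() if price is not None)
    if p_win == 0:
        return None, 0.0
    exp_cost = sum(price * p for price, p in class_probs.items()
                   if price is not None) / p_win
    return exp_cost, p_win

# Hypothetical model output for one ad impression
probs = {27: 0.5, 28: 0.3, 52: 0.1, None: 0.1}
cost, p_win = cost_and_p_win(probs)
```

So a single classifier feeds both inputs of the ad-selection rule: the cost estimate and the win probability.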
AppCircle Conversion Rates: Local Hour of Day
[Chart: regression coefficients for localHourOfDay, local hour of day 0-23]
[Chart: conversion probability by localHourOfDay, local hour of day 0-23, with peaks around 12 noon, 4 pm, 6 pm and 7 pm]
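A per-hour regression weight implies the localHourOfDay feature is expanded into indicator variables rather than fed in as a single number. A minimal sketch of that encoding (my assumption about the feature pipeline, consistent with one coefficient per hour):

```python
def one_hot_hour(local_hour):
    """Encode localHourOfDay (0-23) as 24 indicator features, so a
    linear model such as logistic regression learns one weight per
    hour instead of forcing a monotonic effect of the raw hour."""
    if not 0 <= local_hour <= 23:
        raise ValueError("hour must be in 0..23")
    return [1.0 if h == local_hour else 0.0 for h in range(24)]
```

This is what lets the model capture the non-monotonic daily pattern (lunch-time and evening peaks) seen in the conversion chart.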
Machine Learning Workflow
• How much data is enough?
• Parallelize feature generation vs. model generation
• Interpretable vs. black-box models
• Batch vs. online learning
• Time to score a model
• Unbalanced data
• Over-fitting & regularization
Recommender System
• Recommender system as an ad-ranking method:
  – Given users and the apps they have installed in the past, what other apps are they likely to install?
  – Given users and their app usage (time spent), what new apps are they likely to engage with highly?
[Diagram: bipartite user-app graph, edges weighted by time spent per app (values ranging from 0.1 hr to 3 hr)]
Recommender System
• Item-item based collaborative filtering:
  – Missing-value prediction
[Diagram: user x app engagement matrix (App1-App4) with missing entries to be predicted]
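Item-item collaborative filtering for missing-value prediction is commonly done with cosine similarity between app columns; the slides don't give the exact formulation, so the following is a standard sketch on a toy engagement matrix (all data invented for illustration):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two app columns ({user: hours})."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict(user, app, ratings):
    """Predict a missing (user, app) engagement value as the
    similarity-weighted average of the user's known values for
    other apps. ratings: {app: {user: hours}}."""
    num = den = 0.0
    for other, col in ratings.items():
        if other == app or user not in col:
            continue
        sim = cosine(ratings[app], col)
        num += sim * col[user]
        den += abs(sim)
    return num / den if den else 0.0

# Toy engagement matrix (hours spent), apps as columns
ratings = {
    "App1": {"u1": 1.0, "u2": 2.0},
    "App2": {"u1": 1.6, "u2": 2.1, "u3": 0.3},
    "App3": {"u2": 3.0, "u3": 0.1},
}
pred = predict("u3", "App1", ratings)  # u3 has no App1 entry yet
```

The predicted value can then rank candidate apps (ads) for the user, which is the "recommender as ad-ranking" idea above.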
Engagement Model - Android All
• Category of SocialApp: Social
• Number of users of SocialApp: 2,227
• Number of predicted users of SocialApp: 1,131
Other Flurry Data Science Problems
• Age and gender estimation
• Click fraud detection
• Optimize the AppSpot waterfall