
Sponsored Search Advertising from a Database Perspective

Feb 04, 2022


George Trimponias, CSE


The 3 Stages of Sponsored Search

• Ad Selection: Select all candidate ads that may be relevant.
  – Matchtype is important in this process.

• Ad Ranking: Rank candidate ads and select top-K.
  – Industry Ranking Score: Max Bid × Quality Score

• Ad Pricing: Determine the actual cost per click for every advertiser in the top-K list.
  – Most prominent pricing scheme is the Generalized Second-Price Auction (GSP).
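The ranking and pricing stages can be illustrated with a minimal sketch. The slides name only the ranking score (Max Bid × Quality Score) and GSP; the quality-weighted price used here (the next ad's score divided by this ad's quality score), the `reserve` parameter, and all names and numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    advertiser: str
    max_bid: float        # maximum bid for the matched bid term
    quality_score: float  # engine-estimated quality, assumed > 0


def rank_and_price(candidates, k, reserve=0.0):
    """Rank by max_bid * quality_score, keep the top k, price with GSP."""
    ranked = sorted(candidates,
                    key=lambda c: c.max_bid * c.quality_score,
                    reverse=True)
    results = []
    for i, ad in enumerate(ranked[:k]):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            # Assumed GSP rule: smallest cost per click that still keeps
            # this ad ranked above the next one.
            cpc = nxt.max_bid * nxt.quality_score / ad.quality_score
        else:
            cpc = reserve  # nobody ranked below: pay the assumed reserve
        results.append((ad.advertiser, round(cpc, 2)))
    return results


# Hypothetical candidates for one query.
ads = [
    Candidate("A", max_bid=2.00, quality_score=0.5),
    Candidate("B", max_bid=1.50, quality_score=0.8),
    Candidate("C", max_bid=3.00, quality_score=0.2),
]
print(rank_and_price(ads, k=2))
```

Under this rule an advertiser never pays more than its own maximum bid, because the next ad's score is at most its own score.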


Account Structure

• Recall that advertiser information is hierarchically structured.
  – Account.
  – Ad Campaign.
    • Related to a specific marketing goal.
    • Characterized by a budget.
  – Ad Group.
    • Contains dozens of creatives and hundreds of bid terms.
    • Maximum bid for each of the bid terms.


Ad Structure

• Headline.

• Lines of text.

• Display URL.

• Destination URL (Landing Page).
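The account hierarchy and the ad fields listed above could be modeled roughly as follows. This is only a sketch: the class and field names are assumptions chosen for illustration, not a schema taken from the slides, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Creative:
    headline: str
    lines: List[str]       # lines of ad text
    display_url: str
    destination_url: str   # landing page


@dataclass
class AdGroup:
    creatives: List[Creative] = field(default_factory=list)
    # bid term -> maximum bid for that term
    max_bids: Dict[str, float] = field(default_factory=dict)


@dataclass
class Campaign:
    goal: str      # the marketing goal the campaign relates to
    budget: float
    ad_groups: List[AdGroup] = field(default_factory=list)


@dataclass
class Account:
    advertiser: str
    campaigns: List[Campaign] = field(default_factory=list)


# Hypothetical data, for illustration only.
creative = Creative(
    headline="Quality Used Cars",
    lines=["Certified vehicles.", "Visit us today."],
    display_url="example.com/cars",
    destination_url="https://example.com/cars/landing",
)
group = AdGroup(creatives=[creative],
                max_bids={"used cars": 0.40, "cheap cars": 0.25})
account = Account(
    advertiser="Example Motors",
    campaigns=[Campaign(goal="sell used cars", budget=500.0,
                        ad_groups=[group])],
)
```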


Database Example


Properties of Ad Corpora

• Ads can be indexed in main memory.

• Retrieving candidate ads for infrequent queries is a difficult problem.


Ad Indexing

• The retrieval of <creative, term> pairs from a structured schema.
  – Structured retrieval problem, where the unit of retrieval is defined hierarchically.

• Naively indexing all possible retrieval units would result in wasted storage due to the Cartesian product semantics.

• To avoid this, we can utilize hierarchical indexing schemes that reduce the amount of duplication.


3 Main Approaches

• Term Coupling Index: Index units that are composed of <creative, term> pairs.

• Creative Coupling Index: The indexing unit is a single creative coupled with all the bid terms associated with its ad group.

• Ad Group Coupling Index: The indexing unit is the ad group itself.

• Different indexing strategies have a different impact on ad retrieval effectiveness.
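To make the difference between the three indexing units concrete, here is a small sketch that enumerates the units each approach would produce for one ad group. The function names and the tuple representation are assumptions for illustration; the slides do not prescribe a data layout.

```python
from itertools import product


def term_coupling_units(creatives, terms):
    # One unit per <creative, term> pair (the Cartesian product).
    return [(c, t) for c, t in product(creatives, terms)]


def creative_coupling_units(creatives, terms):
    # One unit per creative, carrying all bid terms of its ad group.
    return [(c, tuple(terms)) for c in creatives]


def ad_group_coupling_units(creatives, terms):
    # A single unit: the ad group itself.
    return [(tuple(creatives), tuple(terms))]


# Hypothetical ad group with 2 creatives and 3 bid terms.
creatives = ["creative-1", "creative-2"]
terms = ["used cars", "cheap cars", "car dealer"]

print(len(term_coupling_units(creatives, terms)))      # 2 * 3 units
print(len(creative_coupling_units(creatives, terms)))  # one per creative
print(len(ad_group_coupling_units(creatives, terms)))  # one per ad group
```

With dozens of creatives and hundreds of bid terms per ad group, the first count grows multiplicatively, which is the duplication the hierarchical schemes try to avoid.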


Indexing for Broad Matchtype

• Broad Matchtype: a user’s query contains all terms in the keyword in any order, possibly along with other terms.

• Consider the user query cheap used cars
  – The bid phrase used cars matches the query.
  – The bid phrase fast cars does not match the query.

• Inverse operation from classical document retrieval: here the query must contain every word of the indexed phrase, rather than the document containing the query words.

• In practice, broad match also accounts for singular and plural forms, synonyms and other variations, misspellings, and extensions.
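The basic broad-match test reduces to a subset check on word sets. The sketch below covers only that core rule; the function name is hypothetical, and the practical extensions (plurals, synonyms, misspellings) are deliberately left out.

```python
def broad_match(bid_phrase: str, query: str) -> bool:
    """True if every word of the bid phrase occurs in the query, in any order."""
    return set(bid_phrase.lower().split()) <= set(query.lower().split())


print(broad_match("used cars", "cheap used cars"))
print(broad_match("fast cars", "cheap used cars"))
```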


Traditional IR Techniques Fail

• Consider the use of inverted indexes containing ad IDs as postings.

• Using them, we obtain the union of the postings in the inverted indexes corresponding to keywords in the query.

• It is still necessary to filter out ads whose bid phrase contains words not in the query.
  – This operation is not directly supported by inverted indexes!
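A toy example shows why the extra filtering step is needed. The ad corpus, IDs, and function names below are made up for illustration; the point is only that the union of postings over-retrieves.

```python
from collections import defaultdict

# Hypothetical corpus: ad ID -> bid phrase.
ads = {1: "used cars", 2: "fast cars", 3: "cheap flights"}

# Classical inverted index: word -> set of ad IDs (postings).
inverted = defaultdict(set)
for ad_id, phrase in ads.items():
    for word in phrase.split():
        inverted[word].add(ad_id)


def candidates(query):
    """Union of the postings of all query words."""
    return set().union(*(inverted.get(w, set()) for w in query.split()))


def broad_matches(query):
    """Post-filter: keep ads whose whole bid phrase is contained in the query."""
    qwords = set(query.split())
    return {ad_id for ad_id in candidates(query)
            if set(ads[ad_id].split()) <= qwords}


print(candidates("cheap used cars"))     # includes ads that only share one word
print(broad_matches("cheap used cars"))  # only ads that truly broad-match
```

The filter has to re-read each candidate's full bid phrase, which is work the inverted index itself cannot do.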


A Simple Framework

1. Index the set of words in each ad phrase using a hash table.

2. Process queries by retrieving the entries associated with all subsets of words in a query.


Query Processing in the Simple Framework

1. Given a query Q, generate all subsets q.

2. Visit the corresponding nodes in the hash table.

3. Return all listings associated with ads for which the bid phrase is a subset of the query Q.

Example with Q=“new fiction books”
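The query-processing steps can be sketched as follows for this example query. The ad corpus is hypothetical, and using a `frozenset` of words as the hash key is an assumption about how the word set of a phrase might be hashed.

```python
from collections import defaultdict
from itertools import combinations


def build_index(ads):
    """Hash table keyed by the word set of each bid phrase."""
    index = defaultdict(list)
    for ad_id, phrase in ads.items():
        index[frozenset(phrase.split())].append(ad_id)
    return index


def lookup(index, query):
    """Probe the table once for every non-empty subset of the query words."""
    words = sorted(set(query.split()))
    matches, probes = [], 0
    for size in range(1, len(words) + 1):
        for subset in combinations(words, size):
            probes += 1
            matches.extend(index.get(frozenset(subset), []))
    return sorted(matches), probes


# Hypothetical ads.
ads = {1: "fiction books", 2: "new books", 3: "books", 4: "used books"}
index = build_index(ads)

matches, probes = lookup(index, "new fiction books")
print(matches, probes)
```

A three-word query has 2^3 - 1 = 7 non-empty subsets, so seven probes are issued even though only a few of them hit an entry.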


Reducing Main Memory Latency in the Simple Framework

• Two strategies:

1. Traverse fewer data nodes.

2. Perform fewer hash-lookups against the hash table.


Traversing fewer data nodes

• Consider two ads A and B, for which bid_phrase(A)⊆bid_phrase(B).

• We can move the data of ad B to the data node corresponding to ad A.

• We call this data node remapping.


Traversing fewer data nodes

• Any query accessing the superset will by default have to access the subsets as well.

• We save one random access.

• We reduce the number of hash-entries by one.

• The remapped data is read sequentially within one node, instead of through a separate random access.
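Data node remapping can be sketched as below. The layout is an assumption: each node stores (full word set, ad ID) pairs so that a remapped ad can still be checked against the query while scanning the node. Function and parameter names are hypothetical.

```python
def build_remapped_index(ads, remap):
    """ads: ad_id -> bid phrase.
    remap: ad_id -> locator phrase, which must be a subset of the ad's phrase.
    Ads without a remap entry are stored under their own phrase."""
    index = {}
    for ad_id, phrase in ads.items():
        words = frozenset(phrase.split())
        locator = frozenset(remap.get(ad_id, phrase).split())
        assert locator <= words, "locator must be a subset of the bid phrase"
        index.setdefault(locator, []).append((words, ad_id))
    return index


def scan_node(index, locator, query):
    """Read one data node and keep the ads whose full phrase is in the query."""
    qwords = set(query.split())
    node = index.get(frozenset(locator.split()), [])
    return sorted(ad_id for words, ad_id in node if words <= qwords)


# bid_phrase(A) is a subset of bid_phrase(B), so B is stored in A's node.
ads = {"A": "books", "B": "fiction books"}
index = build_remapped_index(ads, remap={"B": "books"})

print(len(index))                                      # number of hash entries
print(scan_node(index, "books", "new fiction books"))
print(scan_node(index, "books", "cheap books"))
```

The node for "books" now holds both ads, so a query containing "fiction books" finds B during the scan it had to perform for A anyway.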


Reducing the Number of Hash Lookups

• The number of subsets grows exponentially with the query length q.

• Solution: remap all long phrases to data nodes with node locators of length no more than k.

• Number of hash lookups bounded by $\sum_{i=1}^{k} \binom{q}{i}$, as opposed to $2^q - 1$.
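A quick calculation shows how much the bound on locator length helps. The function names are hypothetical; the two formulas are the ones stated on the slide.

```python
from math import comb


def bounded_lookups(q, k):
    """Number of non-empty subsets of q words with at most k words."""
    return sum(comb(q, i) for i in range(1, min(k, q) + 1))


def all_subset_lookups(q):
    """Number of non-empty subsets of q words."""
    return 2 ** q - 1


# A 10-word query with locators of at most 3 words.
print(bounded_lookups(10, 3), "versus", all_subset_lookups(10))
```

When k is at least q the two counts coincide, so the bound only matters for queries longer than k words.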


Optimizing the Index Structure

• The reduction in the number of data nodes leads to fewer random accesses, but comes at the cost of a larger amount of data to read per data node visited.

• We need to find the optimal tradeoff between these two factors.

• For this, we need the actual workload, i.e., the query history with the corresponding frequencies.

• The above problem is actually NP-complete (weighted set cover), but can be approximated within a reasonably good factor.
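The slide does not say which approximation is meant. As one illustration, the classical greedy heuristic for generic weighted set cover repeatedly picks the set with the lowest cost per newly covered element, and is known to be within a logarithmic factor of the optimum. The sketch below shows that generic heuristic only; how data nodes and workload queries map onto sets and costs is not spelled out here, and the instance is made up.

```python
def greedy_weighted_set_cover(universe, sets, cost):
    """Greedy heuristic: pick the cheapest set per newly covered element."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = min(
            (name for name in sets if sets[name] & uncovered),
            key=lambda name: cost[name] / len(sets[name] & uncovered),
            default=None,
        )
        if best is None:
            raise ValueError("universe cannot be covered by the given sets")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


# Hypothetical instance.
universe = {1, 2, 3, 4, 5}
sets = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5}, "S4": {1, 2, 3, 4, 5}}
cost = {"S1": 2.0, "S2": 1.0, "S3": 1.5, "S4": 6.0}

cover = greedy_weighted_set_cover(universe, sets, cost)
print(cover, sum(cost[name] for name in cover))
```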


Relevance

• Relevance is critical to the search engine’s success.

• During the ad selection process, the search engine must identify low relevance ads and get rid of them.

• Relevance differs from CTR, which measures how attractive (as opposed to relevant) an ad is.


Prior Work on Relevance

1. Direct query-ad matching: ads are treated as documents and are ranked using a standard information retrieval technique.

2. Query Rewriting/Query Substitution: generate a relevant rewrite qj for a given query qi.

3. Query Recommendation/Query Clustering: consider the bipartite click graph of queries and ads with edges that correspond to click information.


Direct Query/Ad Matching

• Simple text overlap features.
  – Important but insufficient.
  – Consider the ad with title “Best Jogging Shoes” and a user searching for “running gear”: the ad is relevant, yet it shares no words with the query.

• Historical click rates for a query-ad pair.
  – When there is limited click history for a specific query-ad pair, back off to higher levels in the account hierarchy.

• Click Propensity in Query/Ad Translation.
  – Translation-Based Systems.
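The back-off idea for historical click rates can be sketched as a walk from the most specific level to more general ones. The impression threshold, the key format, and the function name are all assumptions for illustration; the slides do not specify when history counts as "limited".

```python
def backoff_ctr(stats, keys, min_impressions=100, default=0.0):
    """Return the click rate of the most specific level with enough history.

    stats: key -> (clicks, impressions)
    keys:  levels to try, most specific first
    """
    for key in keys:
        clicks, impressions = stats.get(key, (0, 0))
        if impressions >= min_impressions:
            return clicks / impressions
    return default


# Hypothetical click history for one query.
stats = {
    ("running gear", "ad:7"): (1, 10),          # too little history
    ("running gear", "adgroup:3"): (30, 1000),  # enough history
    ("running gear", "campaign:2"): (90, 5000),
}
levels = [
    ("running gear", "ad:7"),
    ("running gear", "adgroup:3"),
    ("running gear", "campaign:2"),
    ("running gear", "account:1"),
]
print(backoff_ctr(stats, levels))
```

Here the ad itself has only 10 impressions, so the estimate comes from its ad group.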


Query Substitution

• Use query substitutions.

• A hybrid of exact and broad match.

• Has two phases: online and offline.

• Offline Phase:
  – Fix a large set of sufficiently frequent queries.
  – Learn a function that substitutes input queries.

• Online Phase:
  – Use exact match to find ads matching the substitute query.


Substitution Framework

1. For each query, obtain the top S results returned by a Web Search Engine.

2. Find the k ads most related to the input query.

3. The bid phrases of these ads form a pool of candidates.

4. The highest scoring bid phrase is selected as the query substitution.


Query Recommendation

• Consider the bipartite graph of queries and ads.

• An edge exists if and only if a user who issued the query clicked on the ad.

• The edge is also weighted with a positive weight, which represents the strength of the association.
  – For instance, position-normalized CTR, or a machine-learned estimate of the click probability P(click|q,ad).


Query Recommendation through Collaborative Filtering

1. Compute the similarities between queries.

2. Compute a prediction of the response between a query and an ad based on how similar queries responded to the same ad.

– Reminiscent of PageRank…
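The two steps can be sketched on a tiny weighted click graph. The slides do not fix a similarity measure or a prediction rule; cosine similarity over edge weights and a similarity-weighted average are assumptions chosen for illustration, and the graph is made up.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two queries' edge-weight vectors."""
    dot = sum(w * v[ad] for ad, w in u.items() if ad in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def predict(graph, query, ad):
    """Similarity-weighted average of how other queries responded to `ad`."""
    num = den = 0.0
    for other, edges in graph.items():
        if other == query or ad not in edges:
            continue
        sim = cosine(graph[query], edges)
        num += sim * edges[ad]
        den += sim
    return num / den if den else 0.0


# Hypothetical click graph: query -> {ad: edge weight}.
graph = {
    "running shoes": {"ad1": 0.10, "ad2": 0.05},
    "jogging shoes": {"ad1": 0.08, "ad2": 0.04, "ad3": 0.06},
    "used cars": {"ad4": 0.09, "ad3": 0.01},
}

# "running shoes" has no edge to ad3; estimate one from similar queries.
print(predict(graph, "running shoes", "ad3"))
```

Because "used cars" shares no ads with "running shoes", its similarity is zero and the estimate is driven entirely by "jogging shoes".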


CTR

• CTR is the most prominent measure of ad quality employed by all large search engines.

• Crucial factor for ad ranking.

• Its estimation has attracted considerable attention in the scientific community.

• It is usually formulated as a supervised learning problem.
  – Maximum Entropy Model (EM).
  – Nonlinear conjugate gradient descent algorithm.


Click Prediction as a Supervised Problem

• There is a set of training query-ad pairs (samples), containing both click and non-click events.

• We want to estimate P(c|q,a), the probability of a click c given query q and ad a.

• We carefully select a proper set of features to represent the query-ad pair.
  – Lexical Similarity Features
  – Historical Performance of Ads
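As a rough illustration of the supervised setup, the sketch below fits a binary maximum entropy model, which for two outcomes is logistic regression, on two made-up features per query-ad pair. It uses plain batch gradient descent rather than a nonlinear conjugate gradient method, and the feature values, learning rate, and epoch count are assumptions.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict_click(w, b, x):
    """Estimated P(c | q, a) for the feature vector x of a query-ad pair."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log-loss."""
    n = len(samples)
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict_click(w, b, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical features: [lexical overlap with the query, historical CTR].
samples = [[1.0, 0.10], [0.8, 0.08], [0.2, 0.01], [0.0, 0.02]]
labels = [1, 1, 0, 0]  # 1 = click, 0 = non-click

w, b = train(samples, labels)
p_high = predict_click(w, b, [0.9, 0.09])
p_low = predict_click(w, b, [0.1, 0.015])
print(p_high > p_low)
```

A personalized model would extend the feature vector with user features, which leaves the training loop unchanged.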


Personalized Click Prediction

• Estimate P(c|q,a,u).

• We need to consider additional user features.
  – Demographic Features (age, gender, marital status, interests, job status, occupation)
  – User-Specific Features
    • Noisy
    • Sparse


References

• Konig, Church, Markov. A Data Structure for Sponsored Search. ICDE, 2009.

• Bendersky et al. The Anatomy of an Ad: Structured Indexing and Retrieval for Sponsored Search. WWW, 2010.

• Hillard, D., Schroedl, S., Manavoglu, E., Raghavan, H., Leggetter, C. Improving Ad Relevance in Sponsored Search. WSDM, 2010.

• Radlinski, F., Broder, A., Ciccolo, P., Gabrilovich, E., Josifovski, V., Riedel, L. Optimizing Relevance and Revenue in Ad Search: A Query Substitution Approach. SIGIR, 2008.


• Anastasakos, T., Hillard, D., Kshetramade, S., Raghavan, H. A Collaborative Filtering Approach to Ad Recommendation using the Query-Ad Click Graph. CIKM, 2009.

• Cheng, H., Cantu-Paz, E. Personalized Click Prediction in Sponsored Search. WSDM, 2010.

• Richardson, M., Dominowska, E., Ragno, R. Predicting Clicks: Estimating the Click-Through Rate for New Ads. WWW, 2007.

• Shen, S., Hu, B., Chen, W., Yang, Q. Personalized Click Model through Collaborative Filtering. WSDM, 2012.
