AWS Architecture Case Study: Real-Time Bidding
Tom Maddox – AWS Solutions Architect
Who am I?
• Gardener (Capacity Planning)
• Motorcyclist (Agility)
• Mobile App Writer
• Problem Solver
• Technology Geek
• Solutions Architect
• Intentional Generalist
Agenda
• What Is Real-Time Bidding (RTB)?
• Architectural Challenges
• Architecture Deep Dive
– DynamoDB Streams
– Big Data Transformation and Analysis Tools
– Machine Learning
What is Real-Time Bidding?
Real-Time Bidding (RTB) is a service offered by
advertising networks to agencies. The agencies
decide on the value of advertising opportunities in
real-time and bid accordingly on behalf of their
advertising clients. Typically the window of
opportunity for bids to be calculated from provided
consumer details (e.g. cookies) and then
submitted is 100ms.
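The time budget described above can be sketched as a deadline check inside the bidding loop. This is a minimal illustration, not an actual exchange protocol: the profile and campaign shapes, the keyword-overlap pricing rule, and the `safety_margin` reserved for the network round trip are all assumptions made for the example.

```python
import time

BID_WINDOW_SECONDS = 0.100  # the 100ms window for calculating and submitting a bid


def decide_bid(profile, campaigns, received_at, now=time.monotonic, safety_margin=0.020):
    """Return (campaign_id, price) for the best match, or None for "no bid".

    profile:   dict with an "interests" set (hypothetical shape)
    campaigns: iterable of dicts with "id", "keywords" (set), "max_bid" (hypothetical)
    """
    # Keep part of the window in reserve for sending the response.
    deadline = received_at + BID_WINDOW_SECONDS - safety_margin
    best = None
    for campaign in campaigns:
        if now() >= deadline:
            break  # out of time: answer with the best candidate found so far
        overlap = len(profile["interests"] & campaign["keywords"])
        if overlap == 0:
            continue
        # Illustrative pricing: scale the maximum bid by keyword coverage.
        price = campaign["max_bid"] * overlap / len(campaign["keywords"])
        if best is None or price > best[1]:
            best = (campaign["id"], price)
    return best
```

Passing the clock in as `now` keeps the deadline logic testable without real delays.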
The 100ms Handshake
Real Estate Owner
• Websites
• Mobile Apps
• Video Streaming
Advertising Agency
• Logged in with…
• Referred by…
• Location
• Historic user profiles
• Active Campaigns
• Content Management
• Billing
• Campaign Management
• Keywords
• Interests
Why is it interesting?
• Most agencies want to maximize their campaign audiences by responding to advertising opportunities all over the world.
• Responses based on data-driven, calculated decisions are important to yield value to campaigns.
• Consistently bidding on global opportunities in under 100ms can be challenging at any scale.
AdRoll
AdRoll is a global leader in retargeting with more than 10,000 active advertisers across over 100 countries.
AdRoll stores 1.5 PB of data in Amazon S3 and runs just 30 core Amazon Elastic Compute Cloud (Amazon EC2) instances. Additional instances (anywhere from 200 to 1,000 of them, including Amazon EC2 Spot Instances) are used for variable capacity.
“We need high performance, but we need more than that,” says Valentino Volonghi, CTO. “We need flexibility, and we need software that could scale across multiple data centers and machines, software we could optimize as we go. Moving our operations to the cloud was really our only option.”
SocialVibe
SocialVibe has built a global business that handles peaks in its traffic using Amazon DynamoDB and multiple
Availability Zones across different Regions. Using AWS has enabled SocialVibe to experiment with new architectures to meet the demands of a diverse worldwide customer base.
“We had to order hardware in advance, we couldn’t experiment with new hardware easily,” explains Joshua Rangsikitpho, CTO. “Once we moved over to AWS all those problems went away.”
On to the architecture…
Architecture Overview
[Diagram components: Click Stream Ingest; Real-Time Bidding; Regional Hubs; Big Data Processing (Billing, Profile Tracking, Machine Learning); Campaign Mgmt]
Architecture Overview
Flyby Comments
• This architecture focuses on bidding logic. We’re not going to look closely at content management or serving.
• Workloads are split by time sensitivity. Bidding is done as fast as possible, but clickstream data can be buffered before it updates the data used for bidding decisions.
• Long-range connectivity can be error-prone. We’re leveraging AWS-managed, resilient replication techniques wherever possible.
Deep Dive: Campaign Management
• Role: Management and monitoring of advertising campaigns.
• Usage: Marketing departments and advertising agencies log into a marketing portal to define campaigns and monitor dashboards.
• Top Tip: Campaign managers can optimize target audiences in real time, based on ongoing success.
• Services: Elastic Beanstalk, Elastic Load Balancers, EC2 Instances, RDS
Deep Dive: Click Stream Ingest
Introducing Amazon Kinesis
Managed Service for Real-Time Processing of Big Data
• Real-time processing
• High throughput; elastic
• Easy to use
• S3, Redshift, DynamoDB Integrations
[Diagram components: Data Sources; AWS Endpoint; Amazon Kinesis stream (Shard 1, Shard 2, … Shard N) spanning three Availability Zones; App.1 [Aggregate & De-Duplicate]; App.2 [Metric Extraction]; App.3 [Sliding Window Analysis]; App.4 [Machine Learning]; S3; DynamoDB; Redshift]
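One of the consumer applications named in the Kinesis diagram performs sliding window analysis. As a rough sketch of that idea only (it does not use the Kinesis API), the counter below tracks how many events fall inside the most recent window. The class name and the assumption that timestamps arrive in non-decreasing order are illustrative.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events seen within the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # oldest event on the left

    def add(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        self._evict(now)
        return len(self._timestamps)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
```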
Deep Dive: Click Stream Ingest
• Role: Collect and aggregate click stream data from audience interactions.
• Usage: Interactions with audiences are recorded in a DynamoDB table. Streams and Lambda batch the content into S3 objects.
• Services: Elastic Beanstalk, DynamoDB with Streams, Lambda, S3
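The Streams-to-S3 batching described above might look roughly like the Lambda handler below. It is a sketch under assumptions: the bucket name, key prefix, and newline-delimited JSON layout are hypothetical, and the stream is assumed to be configured with a view type that includes new images. The S3 call is injectable so the batching logic can be exercised without AWS access.

```python
import json


def records_to_s3_object(event, prefix="clickstream/"):
    """Turn one DynamoDB Streams batch into (key, body) for a single S3 object.

    Returns None when the batch contains no newly inserted items.
    """
    lines = []
    first_seq = None
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # only newly recorded interactions are batched
        stream_data = record["dynamodb"]
        if first_seq is None:
            first_seq = stream_data["SequenceNumber"]
        lines.append(json.dumps(stream_data["NewImage"], sort_keys=True))
    if not lines:
        return None
    key = f"{prefix}{first_seq}.jsonl"  # hypothetical naming scheme
    return key, "\n".join(lines) + "\n"


def handler(event, context, put_object=None):
    """Lambda entry point; returns the number of items written."""
    batch = records_to_s3_object(event)
    if batch is None:
        return 0
    key, body = batch
    if put_object is None:
        import boto3  # imported lazily so the batching logic runs without AWS
        put_object = boto3.client("s3").put_object
    # "example-clickstream-bucket" is a placeholder bucket name.
    put_object(Bucket="example-clickstream-bucket", Key=key, Body=body.encode("utf-8"))
    return len(body.splitlines())
```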
Deep Dive: Big Data Processing
S3 Cross Region Replication
Replication of data across AWS regions reliably
• All new uploads into source bucket will be replicated
• Asynchronous
• Entire bucket or prefixes
• Versioning required
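As a sketch of how the points above map onto configuration, the helper below builds a single-rule replication configuration and applies it with boto3's `put_bucket_versioning` and `put_bucket_replication`. The role ARN, bucket names, and rule ID scheme are placeholders, and the rule uses the older rule-level `Prefix` form. An empty prefix stands for the entire bucket.

```python
def replication_config(role_arn, destination_bucket, prefix=""):
    """Build a ReplicationConfiguration dict with one rule.

    An empty prefix replicates the entire bucket; otherwise only keys
    under the prefix are replicated.
    """
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [{
            "ID": "replicate-" + (prefix.strip("/") or "all"),  # hypothetical ID scheme
            "Status": "Enabled",
            "Prefix": prefix,
            "Destination": {"Bucket": "arn:aws:s3:::" + destination_bucket},
        }],
    }


def enable_replication(s3, source_bucket, config):
    """Apply the configuration using a boto3 S3 client passed in as `s3`."""
    # Versioning is required; the destination bucket needs it enabled as well
    # (not handled here).
    s3.put_bucket_versioning(
        Bucket=source_bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=config,
    )
```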
Deep Dive: Big Data Processing
• Role: Consolidate clickstream data from around the world, update user profiles, KPIs and client invoices.
• Usage: S3 bucket replication is used to bring regional data together. Then AWS analytics services transform the data to derive insights. Updated user profiles and bidding tactics are distributed with the DynamoDB replication client.
• Top Tip: Faster analysis can lead to better return on investment. Leveraging Spot Instances can maximize this optimization opportunity.
• Services: S3 bucket replication, Data Pipeline, EMR, Redshift, Machine Learning, Kinesis, DynamoDB Replication Client, Spot Market.
Don’t lock your big data pipeline down. Keep it agile and experiment often!
Deep Dive: Real-Time Bidding
Optimized Connectivity
• Many advertising exchanges already use AWS.
• Enquire into low latency connectivity options
such as VPC peering.
• If an exchange does not use AWS, consider Direct Connect as a means to
get the best possible latency from AWS to anyone.
Deep Dive: Real-Time Bidding
• Role: Respond to advertising opportunities with bids based on
campaign and user data in under 100ms.
• Usage: An API responds to opportunities in real-time using in-memory caches and
cross-region replication.
• Services: Elastic Beanstalk, ElastiCache,
RDS and DynamoDB
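The in-memory caching mentioned above can be illustrated with a small read-through cache. This is a generic sketch, not the ElastiCache API: the class name, the `loader` callback (standing in for a lookup against a backing store such as DynamoDB), and the TTL policy are assumptions for the example.

```python
import time


class TTLCache:
    """Read-through cache that reloads entries after `ttl_seconds` (hypothetical helper)."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # called on a miss, e.g. a database lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no trip to the backing store
        value = self._loader(key)    # miss or expired: reload and remember
        self._entries[key] = (now + self._ttl, value)
        return value
```

A short TTL bounds how stale a user profile or campaign record can be, while repeated lookups within the bidding window avoid the backing store.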
What are DynamoDB Streams?
A stream of updates that scales with your table
• Asynchronous
• Exactly once
• Strictly ordered records
[Diagram: successive item updates i=A, i=B, i=C flowing through the stream; labelled "Cross region replication"]
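To illustrate why ordered, exactly-once records suit cross-region replication, the sketch below replays a batch of stream records against a plain dictionary standing in for the replica table. The function name, the `id` key attribute, and the in-memory replica are assumptions. The record fields (`eventName`, `Keys`, `NewImage`, `SequenceNumber`) follow the DynamoDB Streams record layout, with new images assumed to be enabled on the stream.

```python
def apply_stream_records(replica, records, key_attr="id"):
    """Replay stream records, in order, onto `replica` (a dict keyed by item id).

    Returns the last sequence number applied, or None for an empty batch.
    """
    last_seq = None
    for record in records:
        data = record["dynamodb"]
        key = data["Keys"][key_attr]["S"]  # assumes a string partition key
        if record["eventName"] == "REMOVE":
            replica.pop(key, None)
        else:  # INSERT or MODIFY: the new image replaces the item
            replica[key] = data["NewImage"]
        last_seq = data["SequenceNumber"]
    return last_seq
```

Because updates to an item arrive in order, replaying A, then B, then C leaves the replica holding C, matching the source table.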
Summary
• Use regional hubs to minimize latency.
• Leverage asynchronous cross-region replication to keep regional hubs in sync, in real-time.
• Use event-driven workloads that remain responsive at any scale.
• Many exchanges already use AWS, so enquire about optimized connectivity.
• Let us manage moving data around, so that you can concentrate on finding new insights from your data.
Further Challenges
• How could we make sure a campaign budget is
never exceeded?
• What’s the best way to manage and serve
advert content too?
Call to Action
1. Come to us and tell us more about your AdTech
use cases and what your priorities are.
2. We are building an AdTech community for AWS
customers to get early previews of products and
collaborate with us.
3. We are looking for cool and popular AdTech
blog ideas and for customers to tell their story.
London Loft