Page 1
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Building a Data Anonymization Platform on AWS
László Török, Engineering Lead, Telefónica NEXT
BIG DATA / ANALYTICS / STREAMING
Özkan Can, Senior Solutions Architect, Amazon Web Services
Page 2
Cloud-native application storage
Özkan Can, Senior Solutions Architect, Amazon Web Services
STG5
Page 3
Agenda
1. Why cloud-native?
2. What is cloud-native?
3. What does this mean for Storage?
Page 5
Why cloud-native?
1. Reliability
2. Frugality
3. Security
4. Elasticity & scale
5. Performance
6. Better design
Page 8
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Paths to cloud-native
Assess and prioritize, app by app
Pick path to modernization
• Lift & shift: data center → EC2
• Re-platform: VMs → containers
• Refactor: monolith → microservices
• Re-invent: host fleets → serverless
Page 9
Traditional app example, in-cloud
• Amazon Aurora MySQL
• Amazon EC2 Auto Scaling
• Application Load Balancer
Page 10
Traits of modern applications
Security in every layer.
Built to scale on demand.
Continuously integrated and deployed.
Microservice-oriented and API-backed.
Leveraging purpose-built databases and cloud-native storage options.
Page 12
File vs Block vs Object
Block Storage
• Raw storage
• Data organized as an array of unrelated blocks
• Host file system places data on disk
• e.g.: Microsoft NTFS, Unix ZFS
File Storage
• Unrelated data blocks managed by a file (serving) system
• Native file system places data on disk
Object Storage
• Stores virtual containers that encapsulate the data, data attributes, and metadata
• API access to data
• Metadata-driven, policy-based, etc.
Page 13
File vs Block vs Object
File
• Amazon Elastic File System
• Amazon FSx for Lustre (NEW)
• Amazon FSx for Windows File Server (NEW)
Block
• Amazon Elastic Block Store (EBS)
Object
• Amazon Simple Storage Service (S3)
• Amazon S3 Glacier
Page 14
Storage options for cloud-native applications
• Amazon EC2
• Amazon EBS
• Amazon EFS
• Amazon FSx for Lustre (NEW!)
• Amazon FSx for Windows File Server (NEW!)
• Amazon S3
• AWS Storage Gateway Family
Page 15
Storage Services on AWS
• Amazon Elastic Block Store (EBS)
• Amazon Elastic File System
• Amazon FSx for Lustre
• Amazon FSx for Windows File Server
• Amazon Simple Storage Service (S3)
• Amazon S3 Glacier
• AWS Snowball
• AWS Snowball Edge
• AWS Snowmobile
• AWS Storage Gateway
• AWS Backup
NEW NEW
NEW
Page 16
Magic Quadrant
Magic Quadrant for Public Cloud Storage Services, Worldwide – 2018
Positioned furthest for completeness of vision and highest for ability to execute in each report since inception in 2014
Magic Quadrant for Public Cloud Storage Services, July 2018 – Raj Bala, Julia Palmer
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Amazon Web Services. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Page 17
Benefits of Amazon S3
• Enterprise applications
• Website hosting
• Media master files
• Big data
• File sharing
• Content distribution
• Archive
• Data analytics
• Backup & restore
• Dynamic websites
• Mobile sync & backup
• Disaster recovery
• Re-creatable data
Page 18
Amazon Simple Storage Service (S3)
Collect, Store, Analyze
• Designed for 99.999999999% durability
• Unmatched security and compliance capabilities
• Replication options across regions
• On-demand analytics
• Built-in support for SQL expressions with S3 Select
• Detailed data on usage patterns and access
• The most ways to move data in/out
• Security that helps the CISO
• Automated cost reduction tools
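The durability figure on this slide can be turned into a rough expected-loss estimate. The following is a minimal sketch, not an AWS formula: it assumes the 99.999999999% is an annual per-object design target and that object losses are independent. The function names are hypothetical.

```python
from fractions import Fraction

# Design target from the slide: 99.999999999% durability (eleven nines).
# Assumption: the figure applies per object and per year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
ANNUAL_LOSS_PROBABILITY = 1 - DURABILITY  # exactly 1 / 10**11


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year (independence assumed)."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years until one object loss is expected."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical bucket size
    print("expected losses per year:", float(expected_annual_losses(objects)))
    print("years per expected loss:", years_per_expected_loss(objects))
```

Exact fractions are used instead of floats because subtracting a number this close to 1 loses precision in floating point.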
Page 19
Your choice of Amazon S3 storage classes
Access frequency: frequent → infrequent
S3 Standard
• Active, frequently accessed data
• Millisecond access
• ≥ 3 AZs
• From: $0.0210/GB
S3 Intelligent-Tiering (NEW!)
• Data with changing access patterns
• Millisecond access
• ≥ 3 AZs
• From: $0.0210 to $0.0125/GB
• Monitoring fee per object
• Min storage duration
S3 Standard-IA
• Infrequently accessed data
• Millisecond access
• ≥ 3 AZs
• From: $0.0125/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
S3 One Zone-IA
• Re-creatable, less frequently accessed data
• Millisecond access
• 1 AZ
• From: $0.0100/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
S3 Glacier
• Archive data
• Minutes-to-hours access
• ≥ 3 AZs
• From: $0.0040/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
S3 Glacier Deep Archive (NEW!)
• Archive data
• Hours access
• ≥ 3 AZs
• From: $0.00099/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
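To make the price differences between the classes concrete, here is a minimal sketch that compares storage-only cost for a given data volume. It uses the "From" prices listed on this slide and assumes they are per GB per month. Retrieval fees, monitoring fees, minimum storage durations and minimum object sizes are ignored, so real bills will differ. S3 Intelligent-Tiering is left out because the slide gives a price range rather than a single figure. The function names are hypothetical.

```python
# "From" prices per GB as listed on the slide (USD).
# Assumption: these are per-GB-month storage prices.
PRICE_PER_GB = {
    "S3 Standard": 0.0210,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.0100,
    "S3 Glacier": 0.0040,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Storage-only cost estimate for one month, in USD."""
    return size_gb * PRICE_PER_GB[storage_class]


def compare_classes(size_gb: float) -> list[tuple[str, float]]:
    """Return (class, estimated cost) pairs, cheapest first."""
    costs = [(name, monthly_storage_cost(size_gb, name)) for name in PRICE_PER_GB]
    return sorted(costs, key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in compare_classes(10_000):  # hypothetical 10 TB data set
        print(f"{name:<24} ${cost:>8.2f}")
```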
Page 20
Data Lakes from AWS
Data Lake on AWS
Lowest cost
Scalable and durable
Secure
Open and comprehensive
Analytics
Machine Learning
Real-time Data Movement
On-premises Data Movement
Page 21
Lowest Cost: Tiered Storage to Optimize Price/Performance
• Tiered storage to optimize price/performance:
  • S3 Standard
  • S3 Standard-Infrequent Access
  • S3 One Zone-Infrequent Access
  • Amazon Glacier
• Migrate between tiers based on lifecycle policies
• Store data at $0.023/GB/month with S3
• Store data at $0.004/GB/month with Glacier
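Migration between tiers is driven by lifecycle rules attached to a bucket. As a minimal sketch, the helper below builds one such rule as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument. The prefix, the day thresholds and the helper name are hypothetical, and the AWS call itself is not made here.

```python
import json


def build_lifecycle_rule(prefix: str, days_to_ia: int = 30,
                         days_to_glacier: int = 90) -> dict:
    """Build a rule moving objects to Standard-IA, then to Glacier.

    The day thresholds are illustrative defaults, not recommendations.
    """
    if not 0 < days_to_ia < days_to_glacier:
        raise ValueError("expected 0 < days_to_ia < days_to_glacier")
    return {
        "ID": f"tier-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
            {"Days": days_to_glacier, "StorageClass": "GLACIER"},
        ],
    }


if __name__ == "__main__":
    # "raw/" is a hypothetical key prefix.
    config = {"Rules": [build_lifecycle_rule("raw/")]}
    print(json.dumps(config, indent=2))
```

Keeping the rule as data makes it easy to review and version before it is applied to a bucket.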
Page 22
Building a Data Anonymization Platform on AWS
László Török, Engineering Lead, Telefónica NEXT
BIG DATA / ANALYTICS / STREAMING
Page 24
We are part of the global telco group
Telefónica Germany's network has the most customer lines
Page 25
An average day of the Telefónica Germany Network
• 45M customer lines
• 5B+ network events
Page 26
Very valuable data
• Help predict capacities in regional transport in Berlin-Brandenburg in the ProTrain project, supported by the German Ministry of Transport
• Model city traffic for special events like football matches with partner Intraplan
• Retail applications: store location planning
Page 27
https://next.telefonica.de/so-bewegt-sich-deutschland
Page 29
Very sensitive data
Protected by German privacy laws (TMG, BDSG)
We are committed to protecting privacy and giving users control over their data
Page 30
Valuable insights vs. individual privacy?
Most value comes from aggregate analysis of groups, not individuals.
Page 31
Telefónica's Data Anonymization Platform
1. Cellular signal data produced by regular mobile network use
2. Anonymization and aggregation of data (opt-out possible)
3. Analysis of anonymized data
4. Solutions for society and economy
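The idea of analysing groups rather than individuals can be illustrated with a small aggregation step. This is a minimal sketch and not the platform's actual anonymization method, which the slides do not describe. It assumes events shaped as (subscriber, cell, hour) tuples, counts distinct subscribers per cell and hour, discards the identifiers, and suppresses groups below a minimum size. The event shape, the threshold and the function name are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable

Event = tuple[Hashable, str, int]  # (subscriber_id, cell_id, hour), hypothetical


def aggregate_with_min_group_size(events: Iterable[Event],
                                  min_group_size: int = 5) -> dict:
    """Count distinct subscribers per (cell, hour), dropping small groups."""
    # De-duplicate so each subscriber counts once per cell and hour.
    distinct = {(sub, cell, hour) for sub, cell, hour in events}
    # Identifiers are discarded here; only group sizes remain.
    counts = Counter((cell, hour) for _, cell, hour in distinct)
    # Suppress groups too small to hide an individual.
    return {key: n for key, n in counts.items() if n >= min_group_size}


if __name__ == "__main__":
    sample = [
        ("a", "cell-1", 9), ("b", "cell-1", 9), ("a", "cell-1", 9),
        ("c", "cell-2", 9),
    ]
    print(aggregate_with_min_group_size(sample, min_group_size=2))
```

Suppressing small groups is only one building block. On its own it does not guarantee anonymity.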
Page 33
Context
Page 34
Requirements for our Data Hub
Page 37
Data Hub v0.9
Page 38
Fully programmable provisioning
Page 39
Can we do better?
Page 40
Optimizing S3 storage costs
Page 41
First month of production
Top 5 items on our end-of-month AWS bill
1. Lambda
2. Kinesis
3. CloudWatch (?!?)
4. S3
5. KMS (~ on par with S3)
Page 42
Use compression – fewer Kinesis shards needed
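A minimal sketch of why compression lowers the shard count: a Kinesis Data Streams shard accepts roughly 1 MB/s of writes (taken as 10^6 bytes here) and 1,000 records per second, so fewer bytes per second means fewer shards. The record layout, the batching into newline-delimited JSON and the function names are assumptions for illustration, not the Data Hub's actual format.

```python
import gzip
import json
import math

# Per-shard write limit, taken as 10**6 bytes per second for this estimate.
SHARD_WRITE_LIMIT_BYTES_PER_S = 1_000_000


def serialize(records: list[dict]) -> bytes:
    """Batch records as newline-delimited JSON (hypothetical wire format)."""
    return "\n".join(json.dumps(r) for r in records).encode("utf-8")


def compressed_batch(records: list[dict]) -> bytes:
    """Gzip a serialized batch before it is put on the stream."""
    return gzip.compress(serialize(records))


def shards_needed(bytes_per_second: float) -> int:
    """Shards required for a sustained write rate (byte limit only)."""
    return max(1, math.ceil(bytes_per_second / SHARD_WRITE_LIMIT_BYTES_PER_S))


if __name__ == "__main__":
    # Hypothetical, repetitive network events.
    batch = [{"cell": "cell-1", "event": "attach", "ts": i} for i in range(1000)]
    ratio = len(compressed_batch(batch)) / len(serialize(batch))
    raw_rate = 8_000_000  # hypothetical raw ingest rate in bytes per second
    print("compression ratio:", round(ratio, 3))
    print("shards without compression:", shards_needed(raw_rate))
    print("shards with compression:", shards_needed(raw_rate * ratio))
```

Batching several events into one compressed record also reduces the record count, which helps with the per-shard records-per-second limit.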
Page 43
What went wrong?
Page 44
Data Hub v2 – Fargate to the rescue
Page 45
Access Management via IAM
Page 46
Monitoring via CloudWatch
Page 48
Lessons learnt
• Superpower: less DIY ops
• Not always significant €€€ savings
Page 49
Thank you!
László Török
Engineering Lead, Big Data Privacy Services
[email protected]
https://next.telefonica.de
@telefonicaNEXT
We are hiring!
linkedin.com/company/telefónicanext