ML and Data Analytics with Google Cloud Platform
The power of machine learning on any data, any size
Alex Osterloh, Solution Engineer, Google
[email protected] @BigDataWizard
An Evolving Cloud
1st Wave: Colocation. Your kit, someone else’s building. Yours to manage.
2nd Wave: Virtualized Data Centers. Standard virtual kit, for rent. Still yours to manage.
3rd Wave: Automated Services, Scalable Data. Focus on insight, not infrastructure.
“Google is living a few years in the future and sending the rest of us messages”
Doug Cutting, Chief Architect, Cloudera
Google Research in Data Technologies
10+ Years of Tackling Data Problems
Timeline (2002 to 2013): GFS, MapReduce, BigTable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, PubSub, F1
Google Research publications referenced are available here: http://research.google.com/pubs/papers.html
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009: http://research.google.com/pubs/pub35290.html
Timeline (2002 to 2016)
Google Papers: GFS, MapReduce, BigTable, Dremel, PubSub, Flume Java, Millwheel
Open Source: Apache Beam, TensorFlow
Google Cloud Products: BigQuery, Pub/Sub, Dataflow, Bigtable, ML
Google Cloud Platform Confidential & Proprietary 11
“We don’t really use MapReduce anymore”
Urs Hölzle, SVP Technical Infrastructure, Google
Management
Mobile
Services
Compute
Big Data
Networking
Storage
Developer Tools
ML
The Big Data Lifecycle
Capture: Pub/Sub
Process: Dataflow
Store: Storage, SQL, Datastore, BigTable
Analyze: BigQuery, Dataflow, Cloud ML
Learn
Enterprise Big Data Architecture on Google
Your Data
PubSub
Cloud Storage
Bigtable
Dataflow
BigQuery: Fast ETL, Regex, JSON, UDFs
Hadoop on Compute Engine (unmanaged)
Cloud Dataproc (managed)
GCS-Hadoop Connector
Spreadsheets
BI Tools
Coworkers
Applications + Reports
http://blog.shinetech.com/2015/10/14/google-cloud-dataproc-and-the-17-minute-train-challenge/
Google confidential | Do not distribute
Applications that can see, hear & understand
Neural Networks
Input → Output
Machine Learning Use Cases
Examples of applying ML
Structured Data
Classification / Regression
● Customer Churn Analysis
● Product Diagnostics
● Forecasting
Recommendation
● Content Personalization
● Product X-Sells/Up-sells
Anomaly Detection
● Fraud Detection
● Asset Sensor Diagnostics
● Log Metric Anomalies
Unstructured Data
Image Analytics
● Identify damaged shipments
● Explicit Content Classification
● Identify “styles” in images
Text Analytics
● Call Center log analysis
● Language Identification
● Topic Classification
● Sentiment Analysis
The Spectrum of Machine Learning
Use pretrained models: Cloud Vision API, Cloud Speech API, Cloud Translate API
Or use your own data to train models
The Machine Learning Spectrum
From academic / research to industry / applications:
TensorFlow (OSS SDK)
Cloud Machine Learning (managed infrastructure), with a Cloud Datalab notebook experience
Machine Learning APIs: Vision API, Speech API, Translate API
Google Cloud Vision API
● Detect faces, landmarks, logos, text, and more
● Perform sentiment analysis
● Straightforward REST API
● Works on a base64-encoded image
● Connects to Google Cloud Storage
● Returns label and score pairs
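The Vision API bullets can be sketched as a request body. This is a minimal sketch, assuming the public v1 `images:annotate` REST endpoint. The helper names `build_label_request` and `extract_labels` are hypothetical, and the sample response is hand-written for illustration, not real API output. Nothing is sent over the network; an actual call would also need credentials or an API key.

```python
import base64
import json

# Public REST endpoint of the Vision API (v1). This sketch only builds and
# parses JSON; it does not send anything.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes=None, gcs_uri=None, max_results=5):
    """Build an images:annotate body asking for labels on one image.

    Pass either raw image bytes (sent base64-encoded) or a gs:// URI that
    points at an object in Google Cloud Storage.
    """
    if gcs_uri is not None:
        image = {"source": {"gcsImageUri": gcs_uri}}
    else:
        image = {"content": base64.b64encode(image_bytes).decode("ascii")}
    return {
        "requests": [
            {
                "image": image,
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def extract_labels(response):
    """Return (label, score) pairs from an images:annotate response."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]


# Hypothetical response, hand-written for illustration only.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "cat", "score": 0.98},
                {"description": "pet", "score": 0.91},
            ]
        }
    ]
}

body = build_label_request(image_bytes=b"placeholder bytes, not a real image")
print(json.dumps(body, indent=2))
print(extract_labels(sample_response))
```

Other feature types (for example `FACE_DETECTION` or `TEXT_DETECTION`) go in the same `features` list.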
Google Cloud Speech API
● Pass raw audio data and language
● Returns a transcript of the audio data
● Works across more than 80 languages
● Receive the response in streaming or non-streaming mode
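The non-streaming case above can be sketched as follows. This is a sketch under assumptions: it uses the v1 `speech:recognize` REST endpoint, which postdates this deck, and the helper names and the sample response are hypothetical. It builds and parses JSON only and sends nothing.

```python
import base64
import json

# v1 REST endpoint for non-streaming recognition (an assumption; the deck
# does not name an API version).
SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US",
                            sample_rate_hertz=16000):
    """Build a recognize body: raw audio plus the language it is spoken in."""
    return {
        "config": {
            "encoding": "LINEAR16",  # uncompressed 16-bit PCM samples
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Audio travels as base64-encoded text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def extract_transcript(response):
    """Join the top alternative of every result into one transcript."""
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            parts.append(alternatives[0]["transcript"])
    return " ".join(parts)


# Hypothetical response, hand-written for illustration only.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "what are you", "confidence": 0.9}]},
        {"alternatives": [{"transcript": "thinking about"}]},
    ]
}

body = build_recognize_request(b"\x00\x01\x02\x03", language_code="de-DE")
print(json.dumps(body["config"]))
print(extract_transcript(sample_response))
```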
Speech API
● Enable voice interface to devices and applications
● Transcribe audio from stored media
● Multiple language support
● Access from mobile devices
Speech API Demo
“What are you sinking about?”
Google Cloud Translate API
● Translates text between thousands of language pairs
● Lets websites and programs integrate with Google Translate programmatically
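Programmatic integration can be sketched as a request and response pair. This is a minimal sketch, assuming the v2 REST endpoint of the Translation API. The helper names and the sample response are hypothetical, and the code builds and parses JSON without sending anything.

```python
import html
import json

# v2 REST endpoint (an assumption; the deck does not name an API version).
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_translate_request(texts, target, source=None):
    """Build a translate body for one or more strings.

    If source is omitted, the service detects the source language.
    """
    body = {"q": list(texts), "target": target, "format": "text"}
    if source is not None:
        body["source"] = source
    return body


def extract_translations(response):
    """Return the translated strings, with any HTML entities unescaped."""
    translations = response["data"]["translations"]
    return [html.unescape(t["translatedText"]) for t in translations]


# Hypothetical response, hand-written for illustration only.
sample_response = {
    "data": {
        "translations": [
            {"translatedText": "What are you thinking about?"}
        ]
    }
}

body = build_translate_request(["Woran denkst du?"], target="en", source="de")
print(json.dumps(body, ensure_ascii=False))
print(extract_translations(sample_response))
```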
A brief look at TensorFlow
Largest Machine Learning repository on GitHub
Operates over tensors: n-dimensional arrays
Uses a flow graph: a data flow computation framework
● Train on CPUs, GPUs
● Run wherever you like (local, cloud, mobile)
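The "tensors plus flow graph" idea can be illustrated with a toy example. This is not TensorFlow's API: it is a pure-Python sketch with hypothetical names (`Node`, `constant`, `placeholder`, `run`) that first builds a graph of operations over 2-D arrays and only then evaluates it with concrete inputs.

```python
class Node:
    """One operation in the graph; its inputs are other nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value


def constant(value):
    return Node("const", value=value)


def placeholder(name):
    # The value is supplied later, when the graph is run.
    return Node("placeholder", value=name)


def add(a, b):
    return Node("add", (a, b))


def matmul(a, b):
    return Node("matmul", (a, b))


def _matmul(a, b):
    # Plain nested-list matrix product: rows of a times columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def _add(a, b):
    # Element-wise sum of two matrices of the same shape.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def run(node, feed):
    """Evaluate a node by recursively evaluating the nodes it depends on."""
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node.value]
    args = [run(i, feed) for i in node.inputs]
    if node.op == "add":
        return _add(*args)
    if node.op == "matmul":
        return _matmul(*args)
    raise ValueError("unknown op: " + node.op)


# Build the graph y = x . w + b first; no arithmetic happens yet.
x = placeholder("x")
w = constant([[2.0], [3.0]])
b = constant([[1.0]])
y = add(matmul(x, w), b)

# Data flows through the graph only when it is run.
print(run(y, {"x": [[1.0, 1.0]]}))  # [[6.0]]
```

Separating graph construction from execution is what lets a framework place the same graph on CPUs, GPUs, or a mobile device.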
What Cloud Machine Learning Can Do
● Fully managed service
● Train using a custom TensorFlow graph
● Batch and online predictions, at scale
● Integrated Datalab experience
● Regression and classification tasks
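An online prediction call can be sketched as follows. This is a sketch under assumptions: it uses the v1 REST shape of the service (`projects/.../models/...:predict` with an `instances` list), which postdates this deck. The project and model names, the instance fields, and the sample response are all hypothetical, and nothing is sent over the network.

```python
import json

# Root of the v1 REST API (an assumption; the deck does not name a version).
ML_API_ROOT = "https://ml.googleapis.com/v1"


def predict_url(project, model, version=None):
    """URL for online prediction against a deployed model (or one version)."""
    name = "projects/{}/models/{}".format(project, model)
    if version is not None:
        name += "/versions/{}".format(version)
    return "{}/{}:predict".format(ML_API_ROOT, name)


def build_predict_request(instances):
    """Wrap input rows in the body shape expected by :predict."""
    return {"instances": list(instances)}


def extract_predictions(response):
    """Return the list of predictions, or raise if the service reported an error."""
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]


# Hypothetical names and values, for illustration only.
url = predict_url("my-project", "churn_model", version="v1")
body = build_predict_request([{"tenure_months": 3, "plan": "basic"}])
sample_response = {"predictions": [{"churn_probability": 0.72}]}

print(url)
print(json.dumps(body))
print(extract_predictions(sample_response))
```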
Want more? → http://bit.ly/gcp16data
Thank You
Alex [email protected]