IoT NY - Cloud services for IoT James Chittenden Google Cloud Platform Solutions Engineer [email protected]
IoT NY - Google Cloud Services for IoT

Jan 15, 2017

Transcript
Page 1: IoT NY - Google Cloud Services for IoT

IoT NY - Cloud services for IoT

James Chittenden, Google Cloud Platform Solutions Engineer

[email protected]

Page 2: IoT NY - Google Cloud Services for IoT

+James Chittenden (Big Data Cloud Engineer)

[email protected]

Page 3: IoT NY - Google Cloud Services for IoT

Agenda

1. Big Data the Cloud Way - Why would you?

2. Fully Managed: NoOps Ingest, Process & Analyse

3. Hands On Demo: Building an Event Streaming Pipeline

Page 4: IoT NY - Google Cloud Services for IoT

Big Data at Google

aka. Data at Google

Page 5: IoT NY - Google Cloud Services for IoT

20-?? BILLION devices will be connected by 2020

$4-11 Trillion Economic Impact

54% of top performer companies will invest more in sensors this year

Sources: Gartner, PwC, McKinsey

Page 7: IoT NY - Google Cloud Services for IoT

What is IoT?

IoT is a period of transformation

Phone IoT Phone

Page 8: IoT NY - Google Cloud Services for IoT

Wearables

Watches

Phones

Cars

Home Appliances

Existing Business Owned Equipment

Connected

IoT is a transition to connected

Not Connected

Page 9: IoT NY - Google Cloud Services for IoT

Back in the 70s ….

Page 10: IoT NY - Google Cloud Services for IoT

The PC

Page 11: IoT NY - Google Cloud Services for IoT

The Result

Page 12: IoT NY - Google Cloud Services for IoT

A datacenter is not a collection of computers, a datacenter is a computer.

The same is happening in the Cloud today

Page 13: IoT NY - Google Cloud Services for IoT

State of the art Data Centers.

For the past 17 years, Google has been building out the world’s fastest, most powerful, highest quality cloud infrastructure on the planet.

Page 14: IoT NY - Google Cloud Services for IoT

Google’s Big Data Innovations go far back

[Timeline figure, 2002 to 2014: GFS, MapReduce, Bigtable, Dremel, Colossus, FlumeJava, MillWheel, Spanner, Dataflow, BigQuery]

Page 15: IoT NY - Google Cloud Services for IoT

Extends the Android platform to IoT devices

Page 16: IoT NY - Google Cloud Services for IoT

Weave - IoT Protocol and Schema

Page 17: IoT NY - Google Cloud Services for IoT

Google Glass at Work

Page 18: IoT NY - Google Cloud Services for IoT

Nest - solutions for the connected home

Page 19: IoT NY - Google Cloud Services for IoT
Page 20: IoT NY - Google Cloud Services for IoT

Health and Wearables

Page 21: IoT NY - Google Cloud Services for IoT

Google Cloud Platform

Management

Mobile

Services

Compute

Big Data

Networking

Storage

Developer Tools

Page 22: IoT NY - Google Cloud Services for IoT

Fully Managed: NoOps Ingest, Process & Analyze

Page 23: IoT NY - Google Cloud Services for IoT

Manage the Entire Lifecycle of Big Data

Stages: Capture, Store, Process, Analyze

Services shown: Pub/Sub, Kafka, Cloud Storage, Cloud SQL, Cloud Datastore, Dataflow, Hadoop/Spark, BigQuery

Page 24: IoT NY - Google Cloud Services for IoT

Dataflow

BigQuery

Fast ETL, Regex, JSON, UDFs

Spreadsheets

BI Tools

Coworkers

Applications + Reports

Pub/Sub

Cloud Storage

Bigtable

Your Data

GCS-Hadoop Connector

Hadoop on Compute Engine (unmanaged)

Cloud Dataproc (managed)

Big Data Architecture with Google managed services

Page 25: IoT NY - Google Cloud Services for IoT

Scales automatically

No setup or administration

Stream up to 100,000 rows per second

Easily integrates with third-party software

Google BigQuery makes complex data analysis simple

Page 26: IoT NY - Google Cloud Services for IoT

Question: Find the root cause of why an ad was or was not delivered in the last 30 days.

select date, rejection_reason, count(*)
from line_item_table.last30days
where line_item_id = 56781234
group by date, rejection_reason

1.2B rows scanned, result in ~5 seconds!

BigQuery Use @Google: DoubleClick Support

Page 27: IoT NY - Google Cloud Services for IoT

BigQuery scales “Google scale”

Streaming ingest at peak: 2.3 Million rows per second

Largest Data Lake on BigQuery: 38 Petabytes

Largest query by data size: 2.1 Petabytes

Largest query by rows: 10.5 Trillion rows

Page 28: IoT NY - Google Cloud Services for IoT

What is BigQuery?

Externalization of Google Dremel

Convenience of SQL

Petabyte-Scale and Fast

Fully Managed, No-Ops Data Warehouse

Page 29: IoT NY - Google Cloud Services for IoT

Merges batch and stream processing

Data processing pipelines

Monitoring interface

Significantly lower cost

Runs on Google or Cloudera Spark (Github)

Google Cloud Dataflow makes complex data analysis simple

Page 30: IoT NY - Google Cloud Services for IoT

What is Cloud Dataflow?

Cloud Dataflow is a collection of SDKs for building batch or streaming parallelized data processing pipelines.

Cloud Dataflow is a fully managed service for executing optimized parallelized data processing pipelines.

Page 31: IoT NY - Google Cloud Services for IoT

Cloud Pub/Sub

• Globally redundant
• Low latency (sub sec.)
• Batched read/write
• Custom labels
• Push & Pull
• Auto expiration

Publisher A Publisher B Publisher C

Message 1

Topic A Topic B Topic C

Subscription XA Subscription XB Subscription YC

Subscription ZC

Cloud Pub/Sub

Subscriber X Subscriber Y

Message 2 Message 3

Subscriber Z

Message 1

Message 2

Message 3

Message 3
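The diagram above can be read as fan-out: each topic copies a published message into every subscription attached to it, so Message 3 on Topic C reaches both Subscriber Y and Subscriber Z. The following is a minimal in-memory sketch of that behaviour, not the Cloud Pub/Sub client API. The class and method names (`PubSubFanOutSketch`, `createSubscription`, `publish`, `pull`) are hypothetical, and acknowledgements, redelivery, and expiration are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of topic-to-subscription fan-out (hypothetical names, no real Pub/Sub calls).
class PubSubFanOutSketch {
    // Topic name -> names of the subscriptions attached to it.
    private final Map<String, List<String>> subscriptionsByTopic = new HashMap<>();
    // Subscription name -> messages waiting to be pulled.
    private final Map<String, Deque<String>> backlog = new HashMap<>();

    public void createSubscription(String topic, String subscription) {
        subscriptionsByTopic.computeIfAbsent(topic, t -> new ArrayList<>()).add(subscription);
        backlog.put(subscription, new ArrayDeque<>());
    }

    // Publishing copies the message into every subscription on the topic.
    public void publish(String topic, String message) {
        for (String s : subscriptionsByTopic.getOrDefault(topic, Collections.<String>emptyList())) {
            backlog.get(s).addLast(message);
        }
    }

    // Pulling drains the backlog of a single subscription.
    public List<String> pull(String subscription) {
        Deque<String> queue = backlog.get(subscription);
        if (queue == null) {
            return new ArrayList<>();
        }
        List<String> out = new ArrayList<>(queue);
        queue.clear();
        return out;
    }

    public static void main(String[] args) {
        PubSubFanOutSketch bus = new PubSubFanOutSketch();
        bus.createSubscription("Topic C", "Subscription YC");
        bus.createSubscription("Topic C", "Subscription ZC");
        bus.publish("Topic C", "Message 3");
        System.out.println(bus.pull("Subscription YC"));
        System.out.println(bus.pull("Subscription ZC"));
    }
}
```

Keeping one backlog per subscription, rather than per topic, is what lets Subscriber Y and Subscriber Z consume the same message independently.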

Page 32: IoT NY - Google Cloud Services for IoT

Dataflow goodies

• Autoscaling mid-job
• Fully managed - No-Ops
• Intuitive Data Processing Framework
• Batch and Stream Processing in one
• Liquid sharding mid-job

// Read text, extract tags, count them, expand prefixes,
// keep the top 3 per key, and write the result.
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();
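To make the data flow of the pipeline above concrete, here is a plain-Java sketch of what the chain of transforms could compute on a small in-memory input. The slides do not show the bodies of `ExtractTags` or `ExpandPrefixes`, so this assumes that tags are `#hashtag` tokens and that prefixes are the leading substrings of each tag (an autocomplete-style use case). The class name `TagPrefixSketch` and its methods are hypothetical and use no Dataflow SDK calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Single-machine illustration of extract -> count -> expand prefixes -> top N per key.
class TagPrefixSketch {
    private static final Pattern TAG = Pattern.compile("#(\\w+)");

    // Assumed stand-in for ExtractTags: pull "#tag" tokens out of one line.
    public static List<String> extractTags(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            tags.add(m.group(1).toLowerCase());
        }
        return tags;
    }

    public static Map<String, List<String>> topTagsPerPrefix(List<String> lines, int n) {
        // Count step: occurrences per tag.
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String tag : extractTags(line)) {
                counts.merge(tag, 1L, Long::sum);
            }
        }

        // Assumed stand-in for ExpandPrefixes: key each (tag, count) by every prefix of the tag.
        Map<String, List<Map.Entry<String, Long>>> byPrefix = new TreeMap<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            String tag = e.getKey();
            for (int i = 1; i <= tag.length(); i++) {
                byPrefix.computeIfAbsent(tag.substring(0, i), k -> new ArrayList<>()).add(e);
            }
        }

        // Top step: keep the n most frequent tags per prefix (ties broken alphabetically).
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<Map.Entry<String, Long>>> p : byPrefix.entrySet()) {
            List<Map.Entry<String, Long>> entries = new ArrayList<>(p.getValue());
            entries.sort((a, b) -> {
                int byCount = Long.compare(b.getValue(), a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            });
            List<String> top = new ArrayList<>();
            for (int i = 0; i < Math.min(n, entries.size()); i++) {
                top.add(entries.get(i).getKey());
            }
            result.put(p.getKey(), top);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("#iot #cloud");
        lines.add("#iot #ios");
        System.out.println(topTagsPerPrefix(lines, 3));
    }
}
```

In the managed service each of these loops would be a parallel transform over a distributed collection. The sketch only shows the shape of the data between steps.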

Page 33: IoT NY - Google Cloud Services for IoT

Dataflow goodies

Deploy

Schedule & Monitor

Page 34: IoT NY - Google Cloud Services for IoT

Dataflow goodies

800 RPS 1200 RPS 5000 RPS 50 RPS

Page 35: IoT NY - Google Cloud Services for IoT

Dataflow goodies

Page 36: IoT NY - Google Cloud Services for IoT

Dataflow goodies

// Batch: read from and write to Cloud Storage.
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();

// Streaming: swap in a Pub/Sub source and sink and add a fixed 5-minute window.
    .apply(PubsubIO.Read.from("input_topic"))
    .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
    .apply(PubsubIO.Write.to("output_topic"));
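The `FixedWindows.of(5, MINUTES)` call above groups an unbounded stream into consecutive 5-minute buckets so that aggregations can produce a result per bucket. A minimal plain-Java sketch of that assignment follows, assuming windows aligned to the epoch and integer values that are summed per window. The class `FixedWindowSketch` and its methods are hypothetical and do not use the Dataflow SDK.

```java
import java.util.Map;
import java.util.TreeMap;

// Assigns timestamps to fixed 5-minute windows and sums values per window.
class FixedWindowSketch {
    static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    // Start of the window containing the timestamp (windows aligned to the epoch, an assumption).
    public static long windowStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, WINDOW_MILLIS);
    }

    // Sums the values that fall into each window, keyed by window start.
    public static Map<Long, Integer> sumPerWindow(long[] timestampsMillis, int[] values) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (int i = 0; i < timestampsMillis.length; i++) {
            sums.merge(windowStart(timestampsMillis[i]), values[i], Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000L, 299_999L, 300_000L, 610_000L};
        int[] values = {1, 2, 3, 4};
        System.out.println(sumPerWindow(timestamps, values));
    }
}
```

Because the window is derived from the element's own timestamp, the same function works whether the elements arrive from a file in batch or from a topic in streaming.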

Page 37: IoT NY - Google Cloud Services for IoT

Dataflow goodies

Nighttime Mid-Day Nighttime

Page 39: IoT NY - Google Cloud Services for IoT

Demo Time

Ingest: Pub/Sub

Process: Cloud Dataflow

Analyse: BigQuery

Git: https://github.com/james-google/event-streams-dataflow

Page 41: IoT NY - Google Cloud Services for IoT

Questions?

Page 42: IoT NY - Google Cloud Services for IoT

Thank You

James Chittenden

[email protected]