Fast Cars, Big Data - How Streaming Can Help Formula 1

Feb 17, 2017

Tugdual Grall
Transcript
Page 1: Fast Cars, Big Data - How Streaming Can Help Formula 1

© 2016 MapR Technologies, MapR Confidential

Fast Cars, Big Data How Streaming Can Help Formula 1

Tugdual Grall (@tgrall)

Page 2: Fast Cars, Big Data - How Streaming Can Help Formula 1

{"about" : "me"}

Tugdual "Tug" Grall

• MapR
  • Technical Evangelist
• MongoDB
  • Technical Evangelist
• Couchbase
  • Technical Evangelist
• eXo
  • CTO
• Oracle
  • Developer/Product Manager
  • Mainly Java/SOA
• Developer in consulting firms

• Web
  • @tgrall
  • http://tgrall.github.io
  • tgrall
• NantesJUG co-founder
• Pet Project:
  • http://www.resultri.com

Page 3: Fast Cars, Big Data - How Streaming Can Help Formula 1

Agenda

• What’s the point of data in motorsports?
• Live demo
• Architecture
• What’s next?

Page 4: Fast Cars, Big Data - How Streaming Can Help Formula 1

How data plays in F1 motorsports

Page 5: Fast Cars, Big Data - How Streaming Can Help Formula 1

Page 6: Fast Cars, Big Data - How Streaming Can Help Formula 1

Data in Motorsports

http://f1framework.blogspot.de/2013/08/short-guide-to-f1-telemetry-spa-circuit.html

Page 7: Fast Cars, Big Data - How Streaming Can Help Formula 1

Difference is due to later and sharper braking

Page 8: Fast Cars, Big Data - How Streaming Can Help Formula 1

Real Analytics as Well as Visualization

• Inputs
  • Predictive analysis of consumables and tires
  • Physical models of car + driver performance
    • Tire wear slows lap times; lower fuel weight speeds them up
  • Competitors’ options
  • Weather conditions
  • Current GP points status
• Outputs
  • Tactical options, outcome distributions
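The two physical effects named on this slide (tire wear slows laps, burning off fuel speeds them up) can be combined into a toy lap-time model. The sketch below is purely illustrative: the class name, the linear form and every coefficient are assumptions made for the example, not figures from the talk or from any team.

```java
/** Toy lap-time model. All coefficients are hypothetical and chosen only for illustration. */
final class LapTimeSketch {
    // Assumed lap time in seconds on fresh tires with an empty tank.
    static final double BASE_LAP_S = 90.0;
    // Assumed time lost per lap of tire age, in seconds.
    static final double TIRE_PENALTY_S_PER_LAP = 0.05;
    // Assumed time lost per kilogram of fuel on board, in seconds.
    static final double FUEL_PENALTY_S_PER_KG = 0.03;

    /** Predicted lap time for a given tire age (in laps) and fuel load (in kg). */
    static double lapTime(int tireAgeLaps, double fuelKg) {
        return BASE_LAP_S
                + TIRE_PENALTY_S_PER_LAP * tireAgeLaps
                + FUEL_PENALTY_S_PER_KG * fuelKg;
    }

    public static void main(String[] args) {
        double fuelKg = 100.0;        // assumed starting fuel load
        double burnKgPerLap = 1.8;    // assumed fuel burned per lap
        for (int lap = 0; lap < 5; lap++) {
            System.out.printf("lap %d: %.3f s%n", lap + 1, lapTime(lap, fuelKg));
            fuelKg -= burnKgPerLap;
        }
    }
}
```

A real strategy tool would fit such terms from telemetry and feed them, together with competitor and weather inputs, into the outcome distributions mentioned above.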

Page 9: Fast Cars, Big Data - How Streaming Can Help Formula 1

Data for Marketing as well

http://formula1.ferrari.com/en/inforacing-hungarian-gp-2015/

Pages 10-17: Fast Cars, Big Data - How Streaming Can Help Formula 1

Some Examples?

• Up to 300 sensors per car
• Up to 2,000 channels
• Sensor data is sent to the paddock in 2 ms
• 1.5 billion data points for a race
• 5 billion for a full race weekend
• 5 to 6 GB of compressed data per car for 90 minutes

US Grand Prix 2014: 243 TB (race teams combined)
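To put these volumes in perspective, the figures on this slide can be turned into rough average rates. This is back-of-the-envelope arithmetic only. It assumes the 1.5 billion data points are spread evenly over a 90-minute race (the slide does not say whether that count is per car or for the whole grid), and it reads the per-car figure as 5 to 6 gigabytes, taking 5.5 × 10^9 bytes as a midpoint.

```java
/** Back-of-the-envelope rates derived from the slide's figures. */
final class TelemetryRateSketch {

    /** Average amount per second for a total spread evenly over a duration in minutes. */
    static double perSecond(double total, double minutes) {
        return total / (minutes * 60.0);
    }

    public static void main(String[] args) {
        double raceMinutes = 90.0;
        // 1.5 billion data points over the race.
        double pointsPerSecond = perSecond(1.5e9, raceMinutes);
        // Assumed midpoint of the 5 to 6 GB range, in bytes.
        double bytesPerSecond = perSecond(5.5e9, raceMinutes);
        System.out.printf("data points per second: %.0f%n", pointsPerSecond);
        System.out.printf("compressed MB per second per car: %.2f%n", bytesPerSecond / 1e6);
    }
}
```

Under these assumptions the average works out to roughly 278,000 data points per second and about 1 MB of compressed data per second per car.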

Page 18: Fast Cars, Big Data - How Streaming Can Help Formula 1

So how does that work? Especially for real-time data?

Page 19: Fast Cars, Big Data - How Streaming Can Help Formula 1

Production System Outline

Page 20: Fast Cars, Big Data - How Streaming Can Help Formula 1

Simplified Demo System Outline

[Diagram labels: TORCS race simulator; MapR Streams; Archive (MapR DB); Apache Drill (SQL access); Jetty / Bootstrap / d3]

Page 21: Fast Cars, Big Data - How Streaming Can Help Formula 1

TORCS for Cars, Physics and Drivers

TORCS is a pseudo-physics-based racing simulator with full graphics output and pluggable control modules.

TORCS is commonly used for AI research, but a control module can just as well collect data.

Page 22: Fast Cars, Big Data - How Streaming Can Help Formula 1

Let’s see it work!

Page 23: Fast Cars, Big Data - How Streaming Can Help Formula 1

IoT: Racing Cars

[Diagram labels: Producers; sensors data; Consumers; Real Time Analytics]

https://github.com/mapr-demos/racing-time-series

Page 24: Fast Cars, Big Data - How Streaming Can Help Formula 1

IoT: Racing Cars

[Diagram labels: sensors data; Kafka Producer (Java); Kafka Consumer + OJAI (Java); Kafka Consumer + WebSocket (Java + JS); SQL]

https://github.com/mapr-demos/racing-time-series

Page 25: Fast Cars, Big Data - How Streaming Can Help Formula 1

Big Datastore

• Distributed File System: HDFS/MapR-FS
• NoSQL Database: HBase/MapR-DB
• …

Page 26: Fast Cars, Big Data - How Streaming Can Help Formula 1

Store data as File or Row?

HDFS / MapR-FS
• Data stored as “files”
• Fast with large scans
• Slow random reads/writes

NoSQL (HBase/MapR-DB)
• Data stored as rows/documents
• Fast with random reads/writes

Page 27: Fast Cars, Big Data - How Streaming Can Help Formula 1

Kafka & MapR Streams

Page 28: Fast Cars, Big Data - How Streaming Can Help Formula 1

What is Kafka?

• http://kafka.apache.org/
• Created at LinkedIn, open sourced in 2011
• Implemented in Scala / Java
• Distributed messaging system built to scale

Page 29: Fast Cars, Big Data - How Streaming Can Help Formula 1

Key Concepts

• Feeds of messages are organised in topics
• Processes that publish messages are called producers
• Processes that subscribe to topics and process messages are called consumers
• A Kafka cluster is made of one or more brokers (broker == node)

Page 30: Fast Cars, Big Data - How Streaming Can Help Formula 1

Topics and Partitions

• Split topics into partitions for scalability

Partition 0: 0 1 2 3 4 5 6 7 8
Partition 1: 0 1 2 3 4 5
Partition 2: 0 1 2 3 4 5 6 7

[Diagram: writes are appended at the end of each partition]
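One common way to spread writes across partitions is to hash the message key, so that all messages with the same key (for example, one car) land in the same partition and keep their relative order. The sketch below uses `String.hashCode()` for brevity. That choice is an assumption made for illustration (Kafka's own default partitioner uses a murmur2 hash of the key bytes), and `partitionFor` is a hypothetical helper, not part of the Kafka API.

```java
/** Illustrative key-based partition selection; not Kafka's actual partitioner. */
final class PartitionSketch {

    /** Maps a key to a partition index in the range [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (String car : new String[] {"car1", "car2", "car3", "car4"}) {
            System.out.println(car + " -> partition " + partitionFor(car, numPartitions));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, every event for a given car goes to the same partition as long as the number of partitions does not change.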

Page 31: Fast Cars, Big Data - How Streaming Can Help Formula 1

Consumer Groups

• Single consumer abstraction for scalability
• Max 1 consumer per partition (within a group)
• Any number of consumer groups
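Within a consumer group, each partition is read by at most one consumer, so adding consumers (up to the number of partitions) spreads the load, and any extra consumers stay idle. The sketch below illustrates the idea with a simple round-robin assignment. `assign` is a hypothetical helper: real Kafka clients negotiate assignments through the group coordinator and pluggable assignors rather than this exact logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative round-robin assignment of partitions to the consumers of one group. */
final class GroupAssignmentSketch {

    /** Returns, for each consumer, the list of partitions it owns. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        // Each partition gets exactly one owner within the group.
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers share three partitions.
        System.out.println(assign(Arrays.asList("c1", "c2"), 3));
        // Four consumers, three partitions: the fourth consumer gets nothing.
        System.out.println(assign(Arrays.asList("c1", "c2", "c3", "c4"), 3));
    }
}
```

A second consumer group running the same logic would receive its own full copy of the partitions, which is how several independent applications can read the same topic.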

Page 32: Fast Cars, Big Data - How Streaming Can Help Formula 1

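Produce Messages

// Key is the event name; value is the serialized sensor payload
ProducerRecord<String, byte[]> rec = new ProducerRecord<>(
    "/stream/car_1_topic", eventName, value.toString().getBytes());

// Send asynchronously; the callback receives any error
producer.send(rec, (recordMetadata, e) -> {
    if (e != null) { … }
});

// Block until buffered records have been sent
producer.flush();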
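The `producer` object used on this slide has to be created from a configuration first. The following is a minimal sketch of such a configuration, built with plain `java.util.Properties` so that it runs without the Kafka client on the classpath. The serializer class names match the `ProducerRecord<String, byte[]>` types on the slide, while the `localhost:9092` address is a placeholder assumption that applies to an Apache Kafka cluster (the deck notes that MapR Streams does not use the same broker architecture).

```java
import java.util.Properties;

/** Builds producer settings matching a ProducerRecord<String, byte[]>. */
final class ProducerConfigSketch {

    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        // Broker address list; placeholder value supplied by the caller.
        props.put("bootstrap.servers", bootstrapServers);
        // String keys (the event name) and raw byte[] values (the payload).
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("localhost:9092"); // assumed address
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With the Kafka client library available, these properties would be passed to `new KafkaProducer<String, byte[]>(props)`.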

Page 33: Fast Cars, Big Data - How Streaming Can Help Formula 1

Consume Messages

long pollTimeOut = 800;
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(pollTimeOut);
    if (!records.isEmpty()) {
        Iterable<ConsumerRecord<String, String>> iterable = records::iterator;
        StreamSupport.stream(iterable.spliterator(), false).forEach((record) -> {
            // work with record object
            … record.value(); …
        });
        consumer.commitAsync();
    }
}

Page 34: Fast Cars, Big Data - How Streaming Can Help Formula 1

Big Picture

[Diagram labels: three Producers; three Consumers]

Page 35: Fast Cars, Big Data - How Streaming Can Help Formula 1

More real life Kafka …

[Diagram labels: Zookeeper; Broker 1, Broker 2, Broker 3, each hosting Topic A and Topic B; three Producers; three Consumers]

Page 36: Fast Cars, Big Data - How Streaming Can Help Formula 1

MapR Streams

• Distributed messaging system built to scale
• Uses the Apache Kafka 0.9.0 API
• No code change
• Does not use the same “broker” architecture
  • Log stored in MapR Storage (Scalable, Secured, Fast, Multi DC)
  • No Zookeeper

Page 37: Fast Cars, Big Data - How Streaming Can Help Formula 1

Kafka

[Diagram labels: Zookeeper; Broker 1, Broker 2, Broker 3, each hosting Topic A and Topic B; three Producers; three Consumers]

Page 38: Fast Cars, Big Data - How Streaming Can Help Formula 1

MapR Streams

[Diagram labels: three Streams, each containing Topic A and Topic B; three Producers; three Consumers]

Page 39: Fast Cars, Big Data - How Streaming Can Help Formula 1

What’s next?

Page 40: Fast Cars, Big Data - How Streaming Can Help Formula 1

Sensor Data V1

• 3 main data points:
  • Speed (m/s)
  • RPM
  • Distance (m)
• Buffered

{
  "_id": "1.458141858E9/0.324",
  "car": "car1",
  "timestamp": 1458141858,
  "racetime": 0.324,
  "records": [
    {
      "sensors": { "Speed": 3.588583, "Distance": 2003.023071, "RPM": 1896.575806 },
      "racetime": 0.324,
      "timestamp": 1458141858
    },
    {
      "sensors": { "Speed": 6.755624, "Distance": 2004.084717, "RPM": 1673.264526 },
      "racetime": 0.556,
      "timestamp": 1458141858
    },
    …
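The sample document batches several sensor readings under one `_id`. One way such a batch could be assembled is sketched below with plain Java collections. The class and method names are hypothetical, and deriving `_id` from the first reading's timestamp and race time is an assumption inferred from the sample value (`1.458141858E9/0.324`), not a confirmed detail of the demo code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative assembly of a buffered sensor document from individual readings. */
final class SensorBatchSketch {

    /** One reading, shaped like an entry of the "records" array. */
    static Map<String, Object> reading(long timestamp, double racetime,
                                       double speed, double distance, double rpm) {
        Map<String, Object> sensors = new LinkedHashMap<>();
        sensors.put("Speed", speed);
        sensors.put("Distance", distance);
        sensors.put("RPM", rpm);
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("sensors", sensors);
        record.put("racetime", racetime);
        record.put("timestamp", timestamp);
        return record;
    }

    /** Wraps a non-empty list of readings into one document keyed by the first reading. */
    static Map<String, Object> batch(String car, List<Map<String, Object>> records) {
        Map<String, Object> first = records.get(0);
        long timestamp = (Long) first.get("timestamp");
        double racetime = (Double) first.get("racetime");
        Map<String, Object> doc = new LinkedHashMap<>();
        // Assumed id scheme: timestamp rendered as a double, "/", then race time.
        doc.put("_id", (double) timestamp + "/" + racetime);
        doc.put("car", car);
        doc.put("timestamp", timestamp);
        doc.put("racetime", racetime);
        doc.put("records", records);
        return doc;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> buffer = new ArrayList<>();
        buffer.add(reading(1458141858L, 0.324, 3.588583, 2003.023071, 1896.575806));
        buffer.add(reading(1458141858L, 0.556, 6.755624, 2004.084717, 1673.264526));
        System.out.println(batch("car1", buffer));
    }
}
```

Buffering several readings into one document reduces the number of messages and writes, at the cost of a short delay before the first reading in a batch becomes visible.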

Page 41: Fast Cars, Big Data - How Streaming Can Help Formula 1

Sensor Data V2

• Main data points:
  • Speed (m/s)
  • RPM
  • Distance (m)
  • Throttle
  • Gear
  • …
• Buffered

{
  "_id": "1.458141858E9/0.324",
  "car": "car1",
  "timestamp": 1458141858,
  "racetime": 0.324,
  "records": [
    {
      "sensors": { "Speed": 3.588583, "Distance": 2003.023071, "RPM": 1896.575806, "gear": 2 },
      "racetime": 0.324,
      "timestamp": 1458141858
    },
    {
      "sensors": { "Speed": 6.755624, "Distance": 2004.084717, "RPM": 1673.264526, "gear": 2 },
      "racetime": 0.556,
      "timestamp": 1458141858
    },
    …

Page 42: Fast Cars, Big Data - How Streaming Can Help Formula 1

• It works and is available on GitHub under the Apache License 2.0 (ASL 2)
• Data collected is unrealistically limited; it lacks:
  – Tire pressure, temperature x 4
  – Brake usage, temperature x 8
  – Engine monitoring is primitive (RPMs only, no KERS)
  – Data rate is fixed; real data comes in at highly variable rates
  – Real data has variable delays due to RF dropout + buffering
• Data collected is in pure JSON
  – Real data is columnar compressed blobs

Page 43: Fast Cars, Big Data - How Streaming Can Help Formula 1

Next Steps

• Near Real Time Data Processing
  • Aggregation
  • Machine Learning
  • Alerts
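As a rough illustration of what aggregation and alerts could mean for this data, the sketch below averages speed over a batch of readings and flags readings whose RPM exceeds a threshold. It uses plain Java rather than a stream processor, and the `Reading` class, the method names and the 18,000 RPM limit are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative aggregation and alerting over a window of sensor readings. */
final class AlertSketch {

    /** Minimal reading holding the fields used in this example. */
    static final class Reading {
        final double racetime;
        final double speed;
        final double rpm;

        Reading(double racetime, double speed, double rpm) {
            this.racetime = racetime;
            this.speed = speed;
            this.rpm = rpm;
        }
    }

    /** Aggregation: mean speed over the window, or 0.0 for an empty window. */
    static double averageSpeed(List<Reading> window) {
        if (window.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (Reading r : window) {
            sum += r.speed;
        }
        return sum / window.size();
    }

    /** Alerts: readings whose RPM is strictly above the given limit. */
    static List<Reading> rpmAlerts(List<Reading> window, double maxRpm) {
        List<Reading> alerts = new ArrayList<>();
        for (Reading r : window) {
            if (r.rpm > maxRpm) {
                alerts.add(r);
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        // Values taken from the sample sensor document in this deck.
        List<Reading> window = Arrays.asList(
                new Reading(0.324, 3.588583, 1896.575806),
                new Reading(0.556, 6.755624, 1673.264526));
        double maxRpm = 18000.0; // assumed limit, for illustration only
        System.out.println("average speed (m/s): " + averageSpeed(window));
        System.out.println("alerts: " + rpmAlerts(window, maxRpm).size());
    }
}
```

In a streaming setup the same two functions would run on each micro-batch or window of records instead of on a fixed list.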

Page 44: Fast Cars, Big Data - How Streaming Can Help Formula 1

Apache Spark

• Cluster Computing Platform
• Extends “MapReduce” with:
  – Streaming
  – Interactive Analytics
• Runs in Memory
• http://spark.apache.org/

Page 45: Fast Cars, Big Data - How Streaming Can Help Formula 1

Apache Flink

• Streaming Dataflow Engine
  • Datastream/Dataset APIs
  • CEP, Graph, ML
• Runs in Memory
• https://flink.apache.org/

Page 46: Fast Cars, Big Data - How Streaming Can Help Formula 1

IoT: Racing Cars V2.0

[Diagram labels: sensors data; Alerts]

https://github.com/mapr-demos/racing-time-series

Page 47: Fast Cars, Big Data - How Streaming Can Help Formula 1

Spark & Streams

val topics = "/app/racing/stream:all_cars"
val sparkConf = new SparkConf().setAppName("SensorStream")
val ssc = new StreamingContext(sparkConf, Seconds(2))

// Create direct kafka stream with brokers and topics
val topicsSet = topics.split(",").toSet
val kafkaParams = Map[String, String](
  ConsumerConfig.GROUP_ID_CONFIG -> "race1",
  ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG ->
    "org.apache.kafka.common.serialization.StringDeserializer",
  ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG ->
    "org.apache.kafka.common.serialization.StringDeserializer",
  ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "earliest",
  ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG -> "false",
  "spark.kafka.poll.time" -> "1000"
)

val messages = KafkaUtils.createDirectStream[String, String](ssc, kafkaParams, topicsSet)
val sensorDStream = messages.map(_._2).map(parseSensor)

sensorDStream.foreachRDD { rdd =>
  // There exists at least one element in RDD
  if (!rdd.isEmpty) {
    …
  }
}

Pages 48-49: Fast Cars, Big Data - How Streaming Can Help Formula 1

Streaming Architecture & Formula 1

• Stream data in real time
• Big Data Store to deal with the scale
  • NoSQL Database, Distributed File System
• Decouple the source from the consumer(s)
  • Dashboard, Analytics, Machine Learning
  • Add new use cases…

This is not only about Formula 1! (Telco, Finance, Retail, Content, IT)

Page 50: Fast Cars, Big Data - How Streaming Can Help Formula 1

MapR Converged Data Platform

[Diagram labels: Open Source Engines & Tools; Commercial Engines & Applications; Search & Others; Cloud & Managed Services; Custom Apps; Data Processing; Utility-Grade Platform Services; Enterprise Storage (MapR-FS); Database (MapR-DB); Event Streaming (MapR Streams); Global Namespace, High Availability, Data Protection, Self-healing, Unified Security, Real-time, Multi-tenancy; Unified Management and Monitoring]

Page 51: Fast Cars, Big Data - How Streaming Can Help Formula 1

MapR Platform Services: Open API Architecture
Assures Interoperability, Avoids Lock-in

• MapR-FS (Enterprise Storage): HDFS API, POSIX NFS
• MapR-DB (NoSQL Database): SQL, HBase API, JSON API
• MapR Streams (Global Event Streaming): Kafka API

Page 52: Fast Cars, Big Data - How Streaming Can Help Formula 1

Page 53: Fast Cars, Big Data - How Streaming Can Help Formula 1

Q & A

@mapr | @tgrall

Engage with us!

maprtech, MapR, mapr-technologies

Page 54: Fast Cars, Big Data - How Streaming Can Help Formula 1
