Jun 07, 2018

IoT data ingestion in Spark Streaming using Kaa

Andrew Shvayka

ashvayka@cybervisiontech.com
© 2015 CyberVision, Inc. All rights reserved.

kaaproject.org © 2015 CyberVision, Inc. All rights reserved.

Agenda

➢ Data ingestion challenges
➢ Why Kaa?
➢ Why Spark?
➢ Reference architecture overview
➢ Hands-on
    ➢ Environment setup
    ➢ Intel Edison application code walkthrough
    ➢ Spark application code walkthrough
    ➢ Live demo
➢ Q&A

Data ingestion requirements/challenges

Must have:
➢ Guaranteed data delivery
➢ Scalability
➢ Security
➢ Performance
➢ Low latency

Nice to have:
➢ Built-in data structure validation
➢ Device platform independence
➢ Low footprint
➢ Low bandwidth support

Why Kaa?

➢ Full-featured IoT middleware platform
➢ 10 KB RAM footprint (with C SDK)
➢ Guaranteed data delivery and reliable local storage
➢ Built-in transport security
➢ Efficient data serialization
➢ Horizontally scalable and fault tolerant
➢ 100% open source (Apache License 2.0)
➢ Rapid application development using C / C++ / Java SDKs
➢ Integration with popular device platforms

Why Spark?

➢ Fast and performant cluster computing
➢ Rapid application development
➢ SQL support
➢ Streaming analytics support
➢ Machine learning and graph processing support
➢ 100% open source (Apache License 2.0)
➢ Easy deployment

Problem description

[Figure: six zones, labeled Zone 1 through Zone 6]

Reference architecture

[Figure: Solar panels → (raw data) → Intel Edison (client application + Kaa SDK) → (structured data) → Kaa node → (Flume event) → Flume agent → Spark node. The Kaa node belongs to the Kaa cluster/sandbox and the Spark node to the Spark cluster/sandbox. Two solar panel / Intel Edison pairs are shown.]
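The "raw data" to "structured data" step performed on the Intel Edison can be illustrated with a small sketch. It is not the sample project's code and does not use the Kaa SDK: the raw line format (`zoneId;panelId;power`), the `PanelSample` class, and its field names are all assumptions made up for illustration. It only shows the idea of validating a raw reading before it is handed over as a structured record.

```java
public class RawToStructuredSketch {

    /** Hypothetical structured sample; field names are illustrative only. */
    public static final class PanelSample {
        public final int zoneId;
        public final int panelId;
        public final double power;

        public PanelSample(int zoneId, int panelId, double power) {
            this.zoneId = zoneId;
            this.panelId = panelId;
            this.power = power;
        }
    }

    /**
     * Parses one raw line in the assumed "zoneId;panelId;power" format.
     * Throws IllegalArgumentException (NumberFormatException is a subclass)
     * when the line is malformed or the power value is not a finite,
     * non-negative number.
     */
    public static PanelSample parse(String raw) {
        String[] parts = raw.trim().split(";");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + raw);
        }
        int zoneId = Integer.parseInt(parts[0].trim());
        int panelId = Integer.parseInt(parts[1].trim());
        double power = Double.parseDouble(parts[2].trim());
        // !(power >= 0) also rejects NaN.
        if (!(power >= 0) || Double.isInfinite(power)) {
            throw new IllegalArgumentException("invalid power value: " + raw);
        }
        return new PanelSample(zoneId, panelId, power);
    }

    public static void main(String[] args) {
        PanelSample s = parse("3;17;42.5");
        System.out.println("zone=" + s.zoneId + " panel=" + s.panelId + " power=" + s.power);
    }
}
```

In a real client the validated sample would presumably be mapped onto the log schema configured for the application rather than onto a hand-written class.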

Development environment setup

➢ Sample project repository: https://github.com/kaaproject/kaa-spark-sample
➢ Apache Spark (Standalone mode): http://spark.apache.org/docs/latest/spark-standalone.html
➢ Kaa Sandbox: http://www.kaaproject.org/download-kaa
➢ Intel Edison: https://docs.kaaproject.org/display/KAA/Intel+Edison

Spark processing

DStream<SparkFlumeEvent>
    → FlatMap → JavaPairDStream<ZoneId, ZoneStats>
    → ReduceByKey → JavaPairDStream<ZoneId, ZoneStats>
    → Sort → JavaPairDStream<ZoneId, ZoneStats>
    → Map → JavaDStream<String>
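The four stages above can be sketched for a single batch without any Spark dependency. This is only an illustration using plain Java collections, not the sample project's code: the `Reading` class, the fields of `ZoneStats`, the choice to sort by zone id, and the output string format are all assumptions, since the slide shows only the stage names and stream types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

public class ZonePipelineSketch {

    /** Hypothetical reading: one power sample reported for a zone. */
    public static final class Reading {
        final int zoneId;
        final double power;

        public Reading(int zoneId, double power) {
            this.zoneId = zoneId;
            this.power = power;
        }
    }

    /** Hypothetical aggregate; the real ZoneStats fields are not shown on the slide. */
    public static final class ZoneStats {
        final int count;
        final double sum;

        public ZoneStats(int count, double sum) {
            this.count = count;
            this.sum = sum;
        }

        ZoneStats merge(ZoneStats other) {
            return new ZoneStats(count + other.count, sum + other.sum);
        }

        double average() {
            return sum / count;
        }
    }

    /** Applies FlatMap, ReduceByKey, Sort, and Map to one batch of events. */
    public static List<String> process(List<List<Reading>> events) {
        Map<Integer, ZoneStats> reduced = events.stream()
            // FlatMap: every event may carry several readings.
            .flatMap(List::stream)
            // ReduceByKey: merge the per-reading stats that share a zone id.
            .collect(Collectors.toMap(
                r -> r.zoneId,
                r -> new ZoneStats(1, r.power),
                ZoneStats::merge));

        return reduced.entrySet().stream()
            // Sort: by zone id (an assumption; the slide does not name the sort key).
            .sorted(Map.Entry.comparingByKey())
            // Map: render each (zone, stats) pair as a line of text.
            .map(e -> String.format(Locale.ROOT, "zone %d: avg=%.2f n=%d",
                e.getKey(), e.getValue().average(), e.getValue().count))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Reading>> batch = Arrays.asList(
            Arrays.asList(new Reading(2, 20.0), new Reading(1, 10.0)),
            Arrays.asList(new Reading(1, 14.0)));
        process(batch).forEach(System.out::println);
    }
}
```

In Spark Streaming the same shape would be expressed with DStream transformations applied to each micro-batch, with the pair stream keyed by zone.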

THANK YOU FOR YOUR ATTENTION
QUESTIONS?

Andrew Shvayka
ashvayka@cybervisiontech.com

kaaproject.org
cybervisiontech.com

Fault-tolerance and horizontal scalability

[Figure: Kaa cluster diagram. Labels: Zookeeper quorum, Control servers (active, standby), Bootstrap servers, Operations servers, Endpoints]