MongoDB Europe 2016, Old Billingsgate, London, 15th November. Distributed Ledgers, Blockchain + MongoDB. Bryan Reinero

Live Demo: Introducing the Spark Connector for MongoDB

Jan 10, 2017

Transcript
Page 1: Live Demo: Introducing the Spark Connector for MongoDB

MongoDB Europe 2016

Old Billingsgate, London

15th November

Distributed Ledgers, Blockchain + MongoDB

Bryan Reinero

Page 2: Live Demo: Introducing the Spark Connector for MongoDB

MongoDB Connector For Spark

Page 3: Live Demo: Introducing the Spark Connector for MongoDB
Page 4: Live Demo: Introducing the Spark Connector for MongoDB

HDFS

Distributed Data

Page 5: Live Demo: Introducing the Spark Connector for MongoDB

Spark Stand Alone

YARN

Mesos

HDFS

Distributed Resources

Page 6: Live Demo: Introducing the Spark Connector for MongoDB

YARN

Spark

Mesos

HDFS

Spark Stand Alone

Hadoop

Distributed Processing

Page 7: Live Demo: Introducing the Spark Connector for MongoDB

YARN

Spark

Mesos

Hive

Pig

HDFS

Hadoop

Spark Stand Alone

Domain Specific Languages

Page 8: Live Demo: Introducing the Spark Connector for MongoDB

YARN

Spark

Mesos

Hive

Pig

SparkSQL

Spark Shell

SparkStreaming

HDFS

Spark Stand Alone

Hadoop

Page 9: Live Demo: Introducing the Spark Connector for MongoDB

YARN

Spark

Mesos

Hive

Pig

SparkSQL

Spark Shell

SparkStreaming

Spark Stand Alone

Hadoop

Page 10: Live Demo: Introducing the Spark Connector for MongoDB

Stand Alone

YARN

Spark

Mesos

SparkSQL

SparkShell

SparkStreaming

Page 11: Live Demo: Introducing the Spark Connector for MongoDB

Stand Alone

YARN

Spark

Mesos

SparkSQL

SparkShell

SparkStreaming

Page 12: Live Demo: Introducing the Spark Connector for MongoDB

executor

Worker Node

executor

Worker Node

Master

Spark Connector

Driver Application

Page 13: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize
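
The four boxes on this slide can be read as four partitions of one distributed dataset. A minimal sketch of that step follows, assuming a local master and a made-up range of numbers as input; the app name and the partition count of 4 are illustrative choices, not values from the talk.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the shell's context if one exists; otherwise start a local one.
// "local[4]" and the app name are assumptions made for this sketch.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[4]").setAppName("parallelize-sketch"))

// Split a local collection into 4 partitions, one per box on the slide.
val numbers = sc.parallelize(1 to 100, numSlices = 4)

// Each partition can be processed by a different executor.
println(numbers.partitions.length)
```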

Page 14: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Page 15: Live Demo: Introducing the Spark Connector for MongoDB

Transformations
filter( func )
union( otherDataset )
intersection( otherDataset )
distinct( numPartitions )
map( func )
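
A short sketch of the transformations listed on this slide, run against two small made-up RDDs. The input values and the local master are assumptions for illustration. Each call only describes a new RDD; no work happens until an action is invoked.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the shell's context if one exists; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("transformations-sketch"))

// Illustrative inputs, not data from the talk.
val a = sc.parallelize(Seq(1, 2, 3, 4, 4))
val b = sc.parallelize(Seq(3, 4, 5))

// Every transformation returns a new RDD and is evaluated lazily.
val evens    = a.filter(_ % 2 == 0)  // keeps elements that satisfy the predicate
val combined = a.union(b)            // concatenates both RDDs, duplicates included
val common   = a.intersection(b)     // elements present in both, without duplicates
val unique   = a.distinct()          // removes duplicate elements
val doubled  = a.map(_ * 2)          // applies the function to every element
```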

Page 16: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Transform Transform Transform Transform

Page 17: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform

Transform Transform Transform

Transform

Transform

Transform

Transform

Action

Action

Action

Action

Page 18: Live Demo: Introducing the Spark Connector for MongoDB

Actions
collect()
count()
first()
take( n )
reduce( function )
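
The actions on this slide are what trigger computation and bring a result back to the driver. A minimal sketch, assuming a local master and a handful of made-up rating values:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the shell's context if one exists; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("actions-sketch"))

// Illustrative values only.
val ratings = sc.parallelize(Seq(3, 5, 4, 1, 5))

val all      = ratings.collect()      // every element, as an Array on the driver
val howMany  = ratings.count()        // number of elements
val head     = ratings.first()        // first element
val firstTwo = ratings.take(2)        // first n elements
val total    = ratings.reduce(_ + _)  // combines all elements pairwise
```

Note that `collect()` copies the whole dataset to the driver, so on large collections `take( n )` is usually the safer way to peek at the data.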

Page 19: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Transform Transform Transform Transform

Action

Action

Action

Action

Result

Result

Result

Result

Page 20: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Transform Transform Transform Transform

Action

Action

Action

Action

Result

Result

Result

Result

Lineage
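
Lineage is the recorded chain of transformations that produced an RDD. Spark can replay it to rebuild a lost partition. The sketch below prints that chain with `toDebugString`; the input range, the two transformations, and the local master are assumptions chosen for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the shell's context if one exists; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("lineage-sketch"))

// Parallelize, then chain two transformations.
val base   = sc.parallelize(1 to 10, 2)
val result = base.map(_ * 2).filter(_ > 5)

// Prints the chain of parent RDDs, from the final RDD back to the source.
println(result.toDebugString)
```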

Page 21: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Transform Transform Transform Transform

Action

Action

Action

Action

Page 22: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize Parallelize Parallelize Parallelize

Transform Transform Transform Transform

Transform Transform Transform Transform

Action

Action

Action

Action

Page 23: Live Demo: Introducing the Spark Connector for MongoDB

Parallelize

Parallelize

Parallelize

Parallelize

Transform

Transform

Transform

Transform

Transform

Transform

Transform

Transform

Action

Action

Action

Action

Result

Result

Result

Result

Page 24: Live Demo: Introducing the Spark Connector for MongoDB

Using the Connector

Page 25: Live Demo: Introducing the Spark Connector for MongoDB

https://github.com/mongodb/mongo-spark

Page 26: Live Demo: Introducing the Spark Connector for MongoDB
Page 27: Live Demo: Introducing the Spark Connector for MongoDB

http://spark.apache.org/docs/latest/

Page 28: Live Demo: Introducing the Spark Connector for MongoDB
Page 29: Live Demo: Introducing the Spark Connector for MongoDB
Page 30: Live Demo: Introducing the Spark Connector for MongoDB

{ "_id" : ObjectId("578be1fe1fe699f2deb80807"), "user_id" : 196, "movie_id" : 242, "rating" : 3, "timestamp" : 881250949}

Page 31: Live Demo: Introducing the Spark Connector for MongoDB

./bin/spark-shell \
  --conf "spark.mongodb.input.uri=mongodb://127.0.0.1/movies.movie_ratings" \
  --conf "spark.mongodb.output.uri=mongodb://127.0.0.1/movies.user_recommendations" \
  --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0
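
The same two settings can also be supplied from application code rather than on the command line. The following is a sketch only: the app name and local master are hypothetical, and the connector package still has to be on the classpath for the URIs to have any effect.

```scala
import org.apache.spark.SparkConf

// The URI form is mongodb://host/database.collection, as in the shell command.
val conf = new SparkConf()
  .setAppName("movie-ratings")  // hypothetical application name
  .setMaster("local[*]")        // assumption: a local run
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/movies.movie_ratings")
  .set("spark.mongodb.output.uri", "mongodb://127.0.0.1/movies.user_recommendations")
```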

Page 35: Live Demo: Introducing the Spark Connector for MongoDB

import com.mongodb.spark._
import com.mongodb.spark.rdd.MongoRDD
import org.bson.Document

val rdd = sc.loadFromMongoDB()
for( doc <- rdd.take( 10 ) ) println( doc )

Page 36: Live Demo: Introducing the Spark Connector for MongoDB

Read Config Write Config
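
Read and write configurations let a single job override the defaults taken from the Spark configuration. The sketch below follows the pattern in the 1.x connector documentation as best I recall it, so treat the option keys as assumptions to check against the version in use. The target collection name `user_recommendations_v2` is hypothetical, and `copyRatings` is defined but not called because it needs a running MongoDB.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.{ReadConfig, WriteConfig}

// Reuse the shell's context if one exists; otherwise start a local one
// carrying the same URIs that the spark-shell command passes with --conf.
val sc = SparkContext.getOrCreate(new SparkConf()
  .setMaster("local[2]").setAppName("config-sketch")
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/movies.movie_ratings")
  .set("spark.mongodb.output.uri", "mongodb://127.0.0.1/movies.user_recommendations"))

// Start from the defaults in the Spark configuration, override selectively.
val readConfig = ReadConfig(
  Map("readPreference.name" -> "secondaryPreferred"),
  Some(ReadConfig(sc)))

val writeConfig = WriteConfig(
  Map("collection" -> "user_recommendations_v2", "writeConcern.w" -> "majority"),
  Some(WriteConfig(sc)))

// Not invoked here: loading and saving require a reachable MongoDB instance.
def copyRatings(): Unit = {
  val ratings = MongoSpark.load(sc, readConfig)
  MongoSpark.save(ratings, writeConfig)
}
```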

Page 37: Live Demo: Introducing the Spark Connector for MongoDB

Aggregation Filters

$match | $project | $group

Page 38: Live Demo: Introducing the Spark Connector for MongoDB

[Diagram: JSON documents]

Page 39: Live Demo: Introducing the Spark Connector for MongoDB

[Diagram: JSON documents]

Page 40: Live Demo: Introducing the Spark Connector for MongoDB

val aggRdd = rdd.withPipeline( Seq( Document.parse( "{ $match: { Country: \"USA\" } }" ) ) )
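
A pipeline can hold more than one stage. The sketch below applies a `$match` followed by a `$project` to the rating documents, using the field names from the sample document shown earlier; the threshold of 4 and the helper name `highlyRated` are illustrative assumptions. Because the stages run inside MongoDB, only matching, trimmed documents are shipped to Spark.

```scala
import org.bson.Document
import com.mongodb.spark.rdd.MongoRDD

// Stages are executed by MongoDB in the order given.
val pipeline = Seq(
  Document.parse("{ $match: { rating: { $gte: 4 } } }"),
  Document.parse("{ $project: { _id: 0, user_id: 1, movie_id: 1 } }")
)

// Hypothetical helper: attaches the pipeline to an RDD loaded from MongoDB.
// Nothing is read until an action runs on the returned RDD.
def highlyRated(rdd: MongoRDD[Document]): MongoRDD[Document] =
  rdd.withPipeline(pipeline)
```

In the shell this would be used as `highlyRated( sc.loadFromMongoDB() ).take( 10 )`, which needs a running MongoDB.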

Page 41: Live Demo: Introducing the Spark Connector for MongoDB

Spark SQL + Dataframes

Page 42: Live Demo: Introducing the Spark Connector for MongoDB

RDD + Schema = Dataframe
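
The equation on this slide can be made concrete with plain Spark, without the connector. In the sketch below the rows are shaped like the sample rating document, the schema names its fields, and the two combine into a DataFrame. The local master is an assumption, and all values after the first row are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Reuse the shell's contexts if they exist; otherwise start local ones.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("dataframe-sketch"))
val sqlContext = SQLContext.getOrCreate(sc)

// The RDD: untyped rows of (user_id, movie_id, rating).
val rows = sc.parallelize(Seq(Row(196, 242, 3), Row(7, 11, 5), Row(8, 12, 1)))

// The schema: column names and types.
val schema = StructType(Seq(
  StructField("user_id", IntegerType, nullable = false),
  StructField("movie_id", IntegerType, nullable = false),
  StructField("rating", IntegerType, nullable = false)))

// RDD + Schema = DataFrame
val ratings = sqlContext.createDataFrame(rows, schema)
ratings.printSchema()

// Columns can now be addressed by name.
val liked = ratings.filter(ratings("rating") >= 3)
```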

Page 43: Live Demo: Introducing the Spark Connector for MongoDB
Page 44: Live Demo: Introducing the Spark Connector for MongoDB

[Diagram: JSON documents]

$sample

Page 45: Live Demo: Introducing the Spark Connector for MongoDB
Page 46: Live Demo: Introducing the Spark Connector for MongoDB

Data Locality

mongos

Page 47: Live Demo: Introducing the Spark Connector for MongoDB

Courses and Resources

Page 48: Live Demo: Introducing the Spark Connector for MongoDB

https://university.mongodb.com/courses/M233/about

Page 49: Live Demo: Introducing the Spark Connector for MongoDB

THANKS!

@blimpyacht