MongoDB use cases and setup involving Elasticsearch. MongoDB Meetup @hikeapp Gurgaon. Bharvi Dixit (@d_bharvi). 13th February 2015.
Page 1: MongoDB meetup at Hike

MongoDB use cases and setup involving Elasticsearch

MongoDB Meetup @hikeapp Gurgaon
Bharvi Dixit
@d_bharvi

13th February 2015

Page 2: MongoDB meetup at Hike

Agenda

• About Me and Orkash.
• Why we chose MongoDB.
• Our use cases and setup of MongoDB.
• Better Than Apple: MongoDB-Elasticsearch.
• Elasticsearch: An Overview.
• The most common issues.
• Mongo University: Learn from the masters.

Page 3: MongoDB meetup at Hike

About Me

• Software Engineer @Orkash.
• Organizer and Speaker @Delhi Elasticsearch Meetup.
• Loves Java, Data, Elasticsearch, MongoDB, Eclipse.
• Interested in all things scale, search, security & DevOps.
• Working with NoSQL databases for more than a year.
• Social Media and News Media Intelligence. (Complex schemas & query designs)

Page 4: MongoDB meetup at Hike

About Orkash

• Founded in 2007 by Ashish Sonal.
• An R&D-driven company which provides a Big Data Automated Intelligence Platform with a focus on the following areas:
  – Counter-terrorism, Security intelligence and Risk management.
  – Political Consulting and Homeland Security.
  – Decision Support Systems.
  – Market/Brand intelligence.
• We create the FOUR pillars of Automated Intelligence:
  – Information Extraction and Monitoring.
  – Semantic and Link Analysis.
  – Geo-Spatial Analysis.
  – Data Mining & Forensics.

Page 5: MongoDB meetup at Hike

Everything starts with a problem..!!

• Data Driven Decisions
• Logfiles for scaling up/down
• Warehouse withdrawal triggers orders
• History for fraud detection
• Internet of Things and Smart Cities.

... data explosion

Page 6: MongoDB meetup at Hike

Everything starts with a problem..!!

Better decisions == more data
And NoSQL adds more problems

Data

Big Data

BIG DATA

Page 7: MongoDB meetup at Hike

Big Data Problem goes on..
• I need BIG DATA.
• I need to analyze this data.
• I need to enrich this big data & make it even bigger.
• I need fast searching.
• I need real-time analytics.
• Ohh wait.. I need relational queries on this big data to get more insights..

Page 8: MongoDB meetup at Hike

Why we chose MongoDB

• It does the impossible. (Can incorporate any kind of data)
• Document model.
• Distributed computing.
• Awesome sharding and replication.
• Scales big (horizontally) on commodity hardware.
• Powerful analytics with the aggregation framework.
• High persistence and read-write performance.
• Awesome security features.
• Memory management handled by the OS.

Page 9: MongoDB meetup at Hike

Our use cases and setup of MongoDB.

• A primary data store for collecting and storing humongous amounts of unstructured/semi-structured text.

• Building GIS applications for government and security agencies using geospatial features.

• Data analytics.
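
As a rough illustration of the geospatial use case above, the sketch below builds a GeoJSON point and a MongoDB `$near` query document as plain Python dictionaries. The field name `location`, the coordinates, and the 5 km radius are made up for the example and are not taken from the talk; `$near`, `$geometry`, and `$maxDistance` are MongoDB query operators that expect a `2dsphere` index on the queried field.

```python
def geojson_point(lon, lat):
    """Build a GeoJSON Point. GeoJSON order is longitude first, then latitude."""
    return {"type": "Point", "coordinates": [lon, lat]}


def near_query(field, lon, lat, max_distance_m):
    """Build a $near filter: documents sorted by distance from the point,
    limited to max_distance_m metres. Needs a 2dsphere index on `field`."""
    return {
        field: {
            "$near": {
                "$geometry": geojson_point(lon, lat),
                "$maxDistance": max_distance_m,
            }
        }
    }


# Hypothetical example: everything within 5 km of a point in Gurgaon.
query = near_query("location", 77.03, 28.46, 5000)
print(query)
```

With a driver such as PyMongo, a dictionary like this would be passed as the filter argument to `find` on the collection.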

Page 10: MongoDB meetup at Hike

Our use cases and setup of MongoDB.

Our current production setup has 14 nodes:

Node Type              # of nodes   Hardware Specifications
Data nodes             5            20 GB RAM with 8 core CPU each
Mongos (VMs)           4            4 GB RAM with 4 core CPU each
Arbiter nodes (VMs)    2            1 GB RAM with 1 core CPU each
Config servers (VMs)   3            4 GB RAM with 2 core CPU each
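
A quick way to check the node count and get a feel for the cluster's total capacity is to add up the table. The sketch below uses only the figures from the slide; the names `CLUSTER` and `totals` are made up for the example.

```python
# Node inventory copied from the slide: (node type, count, RAM in GB, CPU cores).
CLUSTER = [
    ("data", 5, 20, 8),
    ("mongos", 4, 4, 4),
    ("arbiter", 2, 1, 1),
    ("config", 3, 4, 2),
]


def totals(cluster):
    """Return (number of nodes, total RAM in GB, total CPU cores)."""
    nodes = sum(count for _, count, _, _ in cluster)
    ram_gb = sum(count * ram for _, count, ram, _ in cluster)
    cores = sum(count * cpu for _, count, _, cpu in cluster)
    return nodes, ram_gb, cores


print(totals(CLUSTER))  # (14, 130, 64)
```

The total of 14 matches the number of nodes stated on the slide; 100 GB of the 130 GB of RAM sits on the five data nodes.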

Page 11: MongoDB meetup at Hike

Better Than Apple: MongoDB-Elasticsearch

• One of the greatest combinations this era has seen.
• Continuous improvements.
• Each fills in the other's missing features.
• Both have very similar concepts and data types.
• Both keep the cloud in mind.
• Driven by the open-source community, knowledge sharing, and high collaboration with users.

Page 12: MongoDB meetup at Hike

Better Than Apple: MongoDB-Elasticsearch

Sources: Twitter

Page 13: MongoDB meetup at Hike

Elasticsearch Overview

What is Elasticsearch:
• “you know, for search”
• Schema-free, REST & JSON based distributed full-text search engine & document store.
• Written in Java & built on top of Lucene.
• Highly reliable, scalable, fault tolerant.
• Supports distributed indexing, replication, and load-balanced querying.
• Powerful geospatial queries.
• Latest release: 1.4.2

Wait..!! Schema free?? The real gotcha.. Mongo-ES breakup
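
The slide does not spell out the "schema free" gotcha. One common reading, assumed here, is a field-type conflict: MongoDB will store a field as a string in one document and as an object in the next, while Elasticsearch fixes a field's type in the index mapping the first time it sees the field and can reject later documents that disagree. The sketch below is a hypothetical pre-flight check, not part of the talk, that scans documents for top-level fields with more than one JSON type before they are sent for indexing.

```python
from collections import defaultdict


def json_type(value):
    """Map a Python value to a coarse JSON type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return type(value).__name__


def conflicting_fields(docs):
    """Return {field: sorted type names} for top-level fields whose values
    have more than one JSON type across the documents. None is ignored."""
    seen = defaultdict(set)
    for doc in docs:
        for field, value in doc.items():
            if value is not None:
                seen[field].add(json_type(value))
    return {field: sorted(types) for field, types in seen.items() if len(types) > 1}


# Made-up sample: "user" is a string in one document and an object in the other.
docs = [
    {"user": "bharvi", "likes": 10},
    {"user": {"name": "bharvi"}, "likes": 12},
]
print(conflicting_fields(docs))  # {'user': ['object', 'string']}
```

The check only looks at top-level fields; nested documents would need a recursive walk.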

Page 14: MongoDB meetup at Hike

Elasticsearch Overview

What does it add to Lucene:
• REST service: JSON APIs over HTTP.
• High availability & performance: clustering & replication.
• A powerful query DSL.
• Interoperation with non-Java/JVM languages.
• More and more resilience.
• Multitenancy.
• And the best one: it allows relationships to be maintained among documents.
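
To make "JSON APIs over HTTP" and "query DSL" concrete, here is a minimal sketch of a search request body built in Python. The field name `content` and the search text are hypothetical; `match` is a standard full-text query type in the Elasticsearch query DSL, and the serialized body would be sent to an index's `_search` endpoint.

```python
import json


def search_body(text, field="content", size=10):
    """Build a full-text `match` query that returns at most `size` hits."""
    return {"query": {"match": {field: text}}, "size": size}


# The JSON string below is what would travel over HTTP to <index>/_search.
body = json.dumps(search_body("mongodb meetup", size=5))
print(body)
```

Because the request is plain JSON over HTTP, any language with an HTTP client can issue it, which is the interoperability point made in the list above.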

Page 15: MongoDB meetup at Hike

The Elasticsearch Open Source Model

Page 16: MongoDB meetup at Hike

Understanding Elasticsearch Structure in respect to MongoDB

Page 17: MongoDB meetup at Hike

The most common issues..

1. Distributed computing comes with two problems: node failures and network bottlenecks.
   Node failures can be handled by MongoDB very easily, but network bottlenecks/partitions won’t let you sleep at night because of replica set failovers and rollbacks.
   Separate networks for read and write.
2. Assuring a business continuity plan.
   Mongodump is not fit for backing up large datasets.
3. Data modeling.
4. Keeping a close eye on connections.
5. Importing embedded documents in CSV.
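
For the last issue, CSV is flat while MongoDB documents can be nested. One possible workaround, sketched here as an assumption rather than the speaker's approach, is to use dotted column names such as `address.city` and rebuild the embedded documents before inserting. The sample data and function names are made up for the example.

```python
import csv
import io


def nest(flat_row):
    """Turn {"address.city": "Gurgaon"} into {"address": {"city": "Gurgaon"}}.
    Assumes no column is both a value and a prefix (e.g. "a" and "a.b")."""
    doc = {}
    for dotted_name, value in flat_row.items():
        parts = dotted_name.split(".")
        node = doc
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend, creating sub-documents
        node[parts[-1]] = value
    return doc


def read_nested_csv(text):
    """Parse CSV text with a header line into a list of nested documents."""
    return [nest(row) for row in csv.DictReader(io.StringIO(text))]


SAMPLE = "name,address.city,address.pin\nHike,Gurgaon,122001\n"
print(read_nested_csv(SAMPLE))
```

All values come back as strings, so numeric or date fields would still need converting before the documents are stored.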

Page 18: MongoDB meetup at Hike

Mongo University: Learn from the masters..!!