Delhi Elasticsearch Meetup
Bharvi Dixit @d_bharvi
Nov 29, 2014
Delhi Elasticsearch Meetup: Talk About the Most Advanced Search Engine of the World, "Elasticsearch"

Jul 13, 2015


Bharvi Dixit
Page 1: Delhi elasticsearch meetup

Delhi Elasticsearch Meetup

Bharvi Dixit @d_bharvi

Nov 29, 2014

Delhi Elasticsearch Meetup: Talk About the Most Advanced Search Engine of the World, "Elasticsearch"

Page 2: Delhi elasticsearch meetup

Agenda

• What is a search engine?
• Lucene Overview and Indexing Pipeline.
• Data Driven Approaches & Problems.
• Elasticsearch Comes to Rescue.
• Understanding Elasticsearch Architecture.
• Logstash & Kibana Overview.
• The ELK stack together.
• Some tips.


Page 3: Delhi elasticsearch meetup

About Me

• Software engineer @Orkash.
• Loves Java, Data, Elasticsearch, MongoDB, Eclipse.
• Interested in all things scale, search, security & DevOps.
• Creator: CIBET Pro Manager
• Working on Elasticsearch for more than a year.
• Social Media and News Media Intelligence. (Complex schemas & query designs)


Page 4: Delhi elasticsearch meetup

What is a search engine?

• An information retrieval system designed to find information stored in a computer system.

A search engine has different modules:

[Diagram labels: Data collected from various sources; Data stored in indexes; Data is queried; Indexing]

• But what about relevant versus irrelevant results??


Page 5: Delhi elasticsearch meetup

What is a search engine?


• Auto completion
• Did-You-Mean
• Spell correction
• Multi-lingual
• Stemming
• Synonyms
• Highlighting
• More-Like-This

Page 6: Delhi elasticsearch meetup

Lucene Overview

Lucene:
• Open source, fast, high-performance search/IR library.
• Written in Java.
• Initially developed by Doug Cutting (also the author of Hadoop).
• Indexing and searching.
• Inverted index of documents.
• Provides advanced search options like synonyms, stopwords, similarity, proximity.


Page 7: Delhi elasticsearch meetup

Lucene Internals- Inverted Index

Credit: https://developer.apple.com/library/mac/documentation/userexperience/conceptual/SearchKitConcepts/searchKit_basics/searchKit_basics.html
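
The inverted index idea can be sketched in a few lines of Python. This is only a toy illustration, not how Lucene stores its index: the sample documents and the function names `build_inverted_index` and `search` are made up for this example, and terms are split on whitespace only.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return ids of documents that contain every term of the query (AND)."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Elasticsearch is built on Lucene",
    2: "Lucene is a search library",
    3: "Logstash ships logs to Elasticsearch",
}
index = build_inverted_index(docs)
print(index["lucene"])
print(search(index, "lucene search"))
```

Looking up a term is a dictionary access rather than a scan over every document, which is the main reason the structure is used for search.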


Page 8: Delhi elasticsearch meetup

Lucene Internals- Continued

• Defines a document model.
• An index contains documents.
• Each document consists of fields.
• Each field has attributes:
– What is the data type (FieldType)
– How to handle the content (Analyzers, Filters)
– Is it a stored field (stored="true") or an indexed field (indexed="true")
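
A rough Python sketch of this document model follows. It is not Lucene's API: the `Field` and `Document` classes, their methods, and the sample values are hypothetical and only illustrate the difference between a stored field (its value can be returned) and an indexed field (it can be searched).

```python
from dataclasses import dataclass, field

@dataclass
class Field:
    name: str
    value: str
    stored: bool = True    # keep the original value so it can be returned
    indexed: bool = True   # make the value searchable

@dataclass
class Document:
    fields: list = field(default_factory=list)

    def searchable(self):
        """Names of the fields a query can match against."""
        return [f.name for f in self.fields if f.indexed]

    def retrievable(self):
        """Field values that can be shown in a search result."""
        return {f.name: f.value for f in self.fields if f.stored}

doc = Document([
    Field("title", "Delhi Elasticsearch Meetup"),
    Field("body", "Lucene overview and indexing pipeline", stored=False),
    Field("url", "http://example.org/talk", indexed=False),
])
print(doc.searchable())
print(doc.retrievable())
```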


Page 9: Delhi elasticsearch meetup

Indexing Pipeline

• Analyzer: creates tokens using a Tokenizer and/or by applying Filters (Token Filters).

• Each field can define an Analyzer at index time, at query time, or both.

[Diagram: Document → Analysis Phase (Tokenizer → Token Filter) → Document Writer → Inverted Index]
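
The analysis phase can be approximated in plain Python as a tokenizer followed by a chain of token filters. This is a minimal sketch, not Lucene's analyzer API: the stopword list, the function names, and the choice of filters are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "is", "of", "on"}

def tokenizer(text):
    """Split raw text into word tokens."""
    return re.findall(r"\w+", text)

def lowercase_filter(tokens):
    """Token filter: normalize case."""
    return [t.lower() for t in tokens]

def stopword_filter(tokens):
    """Token filter: drop very common words."""
    return [t for t in tokens if t not in STOPWORDS]

def analyze(text, filters=(lowercase_filter, stopword_filter)):
    """Run the tokenizer, then each token filter in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens

print(analyze("Elasticsearch is built on top of Lucene"))
# ['elasticsearch', 'built', 'top', 'lucene']
```

Applying the same chain to both the indexed text and the query text is what lets a query such as "LUCENE" match a document that contains "Lucene".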


Page 10: Delhi elasticsearch meetup

Everything starts with a problem..!!

• Data Driven Decisions
• Logfiles for scaling up/down
• Warehouse withdrawal triggers orders
• History for fraud detection
• Assembly line, throughput improvement

... data explosion


Page 11: Delhi elasticsearch meetup

Everything starts with a problem..!!

Better decisions == more data?

Data

Big Data

BIG DATA


Page 12: Delhi elasticsearch meetup

Big Data Problem goes on..

• I need BIG DATA.
• I need to analyze this data.
• I need to enrich this big data & make it even bigger.
• I need fast searching.
• I need real-time analytics.
• Ohh wait.. I need relational queries on this big data to get more insights..
• I need .. I need .. I need..


Page 13: Delhi elasticsearch meetup

And I guess this is why someone nailed it..


Page 14: Delhi elasticsearch meetup

Elasticsearch comes to rescue..

What is Elasticsearch:
• “you know, for search”
• Schema-free, REST & JSON based distributed full-text search engine & document store.
• Written in Java & built on top of Lucene.
• Highly reliable, scalable, fault tolerant.
• Supports distributed indexing, replication, and load-balanced querying.
• Powerful geo-spatial queries.
• Latest release: 1.4.1

Wait..!! Schema free?? The real gotcha..


Page 15: Delhi elasticsearch meetup

Elasticsearch comes to rescue..

What does it add to Lucene:
• REST service: JSON APIs over HTTP.
• High availability & performance: clustering & replication.
• A powerful query DSL.
• Interoperation with non-Java/JVM languages.
• More and more resilience.
• Multitenancy.
• And the best one: it lets you maintain relationships among documents.
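
To give a feel for "JSON APIs over HTTP" and the query DSL, here is a small Python sketch that only builds a search request and does not send it. The host `http://localhost:9200`, the index name `talks`, and the field `title` are assumptions for illustration; the `_search` endpoint and the `match` query are part of Elasticsearch's REST API.

```python
import json

def build_search_request(host, index, field, text, size=10):
    """Return the URL and JSON body for a simple full-text match query."""
    url = "{}/{}/_search".format(host, index)
    body = {
        "query": {"match": {field: text}},  # analyzed full-text match
        "size": size,                        # number of hits to return
    }
    return url, json.dumps(body)

# Hypothetical host, index and field names.
url, body = build_search_request(
    "http://localhost:9200", "talks", "title", "elasticsearch meetup"
)
print(url)
print(body)
```

Because the request is plain JSON over HTTP, any language with an HTTP client can send it, which is what makes interoperation with non-JVM languages straightforward.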


Page 16: Delhi elasticsearch meetup

The Elasticsearch Open Source Model


Page 17: Delhi elasticsearch meetup

The Popularity of Elasticsearch

10M downloads in 2 years and counting..


Page 18: Delhi elasticsearch meetup

The Popularity of Elasticsearch

Have a look at the case studies here: http://www.elasticsearch.org/case-studies/


Page 19: Delhi elasticsearch meetup

Understanding Elasticsearch Structure

A live demo is better than nothing


Page 20: Delhi elasticsearch meetup

Logstash

• Tool for receiving, processing and outputting logs. (Input → Filter → Output)

• All kinds of logs: system logs, error logs, webserver logs, application logs & just about anything you can throw at it.

• Open Source: Apache License 2.0.
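
The input, filter, output flow can be mimicked with Python generators. This is an analogy only, not Logstash configuration syntax: the log line format, the field names, and the three function names are invented for this sketch.

```python
import json

def read_input(lines):
    """Input stage: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}

def parse_filter(events):
    """Filter stage: split 'timestamp level text' into structured fields."""
    for event in events:
        parts = event["message"].split(" ", 2)
        if len(parts) == 3:
            event["timestamp"], event["level"], event["text"] = parts
        yield event

def write_output(events):
    """Output stage: serialize each event as one JSON document."""
    return [json.dumps(event, sort_keys=True) for event in events]

# Hypothetical log lines.
raw = [
    "2014-11-29T10:00:00 ERROR disk full",
    "2014-11-29T10:00:05 INFO recovered",
]
out = write_output(parse_filter(read_input(raw)))
for line in out:
    print(line)
```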


Page 21: Delhi elasticsearch meetup

Kibana

• Execute queries on your data & visualize results.
• Add/remove widgets.
• Share/Save/Load dashboards.
• No need to know coding.
• Open Source: Apache License 2.0.


Page 22: Delhi elasticsearch meetup

The ELK Stack Together


Page 23: Delhi elasticsearch meetup

meetup.com RSVP stream

• All RSVPs are written out to an HTTP stream.
• Each line is a JSON document.
• Available at http://stream.meetup.com/2/rsvps
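
Since each line of the stream is one JSON document, it can be consumed line by line. The sketch below reads from an in-memory sample instead of the live stream; the field names `response` and `city` and the sample values are assumptions for illustration, not the stream's documented schema.

```python
import io
import json
from collections import Counter

def count_responses(stream):
    """Count RSVP responses from a stream of newline-delimited JSON."""
    counts = Counter()
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        rsvp = json.loads(line)
        counts[rsvp.get("response", "unknown")] += 1
    return counts

# Hypothetical sample standing in for the live HTTP stream.
sample = io.StringIO(
    '{"response": "yes", "city": "Delhi"}\n'
    '{"response": "no", "city": "Delhi"}\n'
    '\n'
    '{"response": "yes", "city": "Mumbai"}\n'
)
counts = count_responses(sample)
print(counts["yes"], counts["no"])
```

The same loop works on any file-like object that yields lines, so an HTTP response body could be passed in place of the sample.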


Page 24: Delhi elasticsearch meetup

meetup.com RSVP stream


Page 25: Delhi elasticsearch meetup

In the end..

• Look out for best practices. (Proper cluster formation, bulk indexing)
• Continuous monitoring: Marvel, Bigdesk, HQ.
• Open-JDK strictly prohibited.
• Elasticsearch is always hungry: Give me more RAM..!!
• Benchmark your data before creating indexes/shards. (Once created, they can't be broken up)
• And don't forget to create mappings.
• Manage your security.. but now it's coming soon: Elasticsearch Shield.. “you know, for security”
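
As an illustration of the bulk indexing tip, the sketch below builds a request body in the newline-delimited format used by the `_bulk` endpoint: one action line followed by one source line per document, with a trailing newline. The index name `meetup`, the type `rsvp`, and the documents are hypothetical; `_type` applies to the 1.x releases this talk covers and was removed in later versions.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build a newline-delimited bulk body: action line, then source line."""
    lines = []
    for doc_id, doc in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical index, type and documents.
docs = {
    "1": {"member": "a", "response": "yes"},
    "2": {"member": "b", "response": "no"},
}
body = build_bulk_body("meetup", "rsvp", docs)
print(body, end="")
```

Sending many documents in one request like this avoids paying the HTTP round-trip cost per document.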


Page 26: Delhi elasticsearch meetup

Thank You for Listening

[email protected]
twitter.com/d_bharvi
slideshare.net/bharvidixit/
