elasticsearch basics workshop, Mathieu Elie at Giroll, Tuesday 17 December 2013

elasticsearch basics workshop

Jan 26, 2015

Mathieu Elie

Quick install of elasticsearch: index documents, query them, set a mapping, and prepare yourself to read the docs!
Transcript
Page 1: elasticsearch basics workshop

elasticsearch basics workshop

Mathieu Elie at Giroll

Tuesday 17 December 2013

Page 2: elasticsearch basics workshop

speaker: @mathieuel

• freelance & founder @oneplaylist

• full stack skills

• see what I’ve done at http://www.mathieu-elie.net

Page 3: elasticsearch basics workshop

goal

• go through the first steps

• get over the first frustration

• give you the power needed to learn by yourself

Page 4: elasticsearch basics workshop

install

• make sure you have a Java runtime

• apt-get install openjdk-6-jre-headless -y

• consider the Oracle JVM
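a quick way to check for the runtime before going further could look like the sketch below. The helper name `have_cmd` is made up for this example, and the message text is only a suggestion.

```shell
# Return success if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
  # Print the first line of the version banner (java writes it to stderr).
  java -version 2>&1 | head -n 1
else
  echo "no java runtime found; install one before starting elasticsearch"
fi
```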

Page 5: elasticsearch basics workshop

unzip and run!

## Get the latest stable archive
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.7.zip

## Extract the archive
unzip elasticsearch-0.90.7.zip
cd elasticsearch-0.90.7

## run!
# This will run elasticsearch in the foreground.
./bin/elasticsearch -f

Page 6: elasticsearch basics workshop

it's alive!

[2013-12-13 15:45:25,187][INFO ][node ] [Bridge, George Washington] version[0.90.7], pid[37998], build[36897d0/2013-11-13T12:06:54Z]
[2013-12-13 15:45:25,189][INFO ][node ] [Bridge, George Washington] initializing ...
[2013-12-13 15:45:25,202][INFO ][plugins ] [Bridge, George Washington] loaded [], sites []
[2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington] initialized
[2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington] starting ...
[2013-12-13 15:45:28,491][INFO ][transport ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.12:9300]}
[2013-12-13 15:45:31,545][INFO ][cluster.service ] [Bridge, George Washington] new_master [Bridge, George Washington][pKCdh1b_TP2TlurO1gm4_g][inet[/192.168.1.12:9300]], reason: zen-disco-join (elected_as_master)
[2013-12-13 15:45:31,577][INFO ][discovery ] [Bridge, George Washington] elasticsearch/pKCdh1b_TP2TlurO1gm4_g
[2013-12-13 15:45:31,595][INFO ][http ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.12:9200]}
[2013-12-13 15:45:31,596][INFO ][node ] [Bridge, George Washington] started
[2013-12-13 15:45:31,629][INFO ][gateway ] [Bridge, George Washington] recovered [0] indices into cluster_state

Page 7: elasticsearch basics workshop

ping es on port 9200

curl http://127.0.0.1:9200

{
  "ok" : true,
  "status" : 200,
  "name" : "Gideon, Gregory",
  "version" : {
    "number" : "0.90.6",
    "build_hash" : "e2a24efdde0cb7cc1b2071ffbbd1fd874a6d8d6b",
    "build_timestamp" : "2013-11-04T13:44:16Z",
    "build_snapshot" : false,
    "lucene_version" : "4.5.1"
  },
  "tagline" : "You Know, for Search"
}
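the startup log shows the node needing a few seconds before it reports "started", so a script that pings too early gets a refused connection. One possible way to wait is sketched below. The function name `wait_for_es` and the retry values in the usage comment are assumptions for this example, not part of elasticsearch.

```shell
# Poll an HTTP endpoint until it answers, or give up after a number of tries.
# $1 = base URL, $2 = maximum attempts, $3 = seconds to sleep between attempts
wait_for_es() {
  url=$1
  tries=$2
  delay=$3
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -s: silent, -f: treat HTTP errors as failure, --max-time: never hang forever
    if curl -sf --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (hypothetical values): up to 30 attempts, one second apart.
# wait_for_es http://127.0.0.1:9200 30 1 && echo "elasticsearch is up"
```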

Page 8: elasticsearch basics workshop

Store a Document

curl -XPUT http://localhost:9200/workshop/site/1 -d '{
  "url": "http://www.elasticsearch.org",
  "title": "Open Source Distributed Real Time Search & Analytics",
  "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
  "tags": ["Open Source", "elasticsearch", "Distributed"]
}'

{"ok":true,"_index":"workshop","_type":"site","_id":"1","_version":1}
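the URL in the PUT command reads as index / type / id: here `workshop`, `site` and `1`. A small sketch that makes the three parts explicit follows. The helper `doc_url` and the variable `ES_URL` are names invented for this example.

```shell
# Base URL of the local node (the default HTTP port seen in the startup log).
ES_URL="http://localhost:9200"

# Build the address of one document: <base>/<index>/<type>/<id>
doc_url() {
  printf '%s/%s/%s/%s\n' "$ES_URL" "$1" "$2" "$3"
}

doc_url workshop site 1
# prints http://localhost:9200/workshop/site/1

# Usage with curl (not run here):
# curl -XGET "$(doc_url workshop site 1)"
```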

Page 9: elasticsearch basics workshop

retrieve the document

curl -XGET http://localhost:9200/workshop/site/1

{"_index":"workshop","_type":"site","_id":"1","_version":2,"exists":true, "_source" :{ "url": "http://www.elasticsearch.org", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"]}}

Page 10: elasticsearch basics workshop

add more documents

curl -XPUT http://localhost:9200/workshop/site/2 -d '{
  "url": "http://www.mathieu-elie.net",
  "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization",
  "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.",
  "tags": ["elasticsearch", "Data Visualization"]
}'

curl -XPUT http://localhost:9200/workshop/site/3 -d '{
  "url": "http://www.giroll.org",
  "title": "Collectif Giroll - Gironde Logiciels Libres",
  "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation d''Install Party Linux tous les six",
  "tags": ["Open Source", "Collectif"]
}'

Page 11: elasticsearch basics workshop

now search!

Page 12: elasticsearch basics workshop

curl 'http://localhost:9200/workshop/_search?pretty=true'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "1",
      "_score" : 1.0,
      "_source" :{ "url": "http://www.elasticsearch.org", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"]}
    }, {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "3",
      "_score" : 1.0,
      "_source" :{ "url": "http://www.giroll.org", "title": "Collectif Giroll - Gironde Logiciels Libres", "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation dInstall Party Linux tous les six", "tags": ["Open Source", "Collectif"]}
    }, {

Page 13: elasticsearch basics workshop

ok great, but now I want to search for text!

Page 14: elasticsearch basics workshop

step 1: pass the query as a request body

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "match_all" : { } }}'

Page 17: elasticsearch basics workshop

so let's use the query_string query from the query DSL

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "query_string" : { "query" : "elasticsearch" } }}'

Page 18: elasticsearch basics workshop

the result is quite verbose, so let's get only the title and tags fields

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }}'

Page 19: elasticsearch basics workshop

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "hits" : {
    "total" : 2,
    "max_score" : 0.081366636,
    "hits" : [ {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "1",
      "_score" : 0.081366636,
      "fields" : {
        "tags" : [ "Open Source", "elasticsearch", "Distributed" ],
        "title" : "Open Source Distributed Real Time Search & Analytics"
      }
    }, {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "2",
      "_score" : 0.06780553,
      "fields" : {
        "tags" : [ "elasticsearch", "Data Visualization" ],
        "title" : "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization"
      }
    } ]
  }
}

Page 21: elasticsearch basics workshop

Facets DSL

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'

Page 22: elasticsearch basics workshop

oh no!!

"facets" : {
  "tags" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 7,
    "other" : 0,
    "terms" : [
      { "term" : "elasticsearch", "count" : 2 },
      { "term" : "visualization", "count" : 1 },
      { "term" : "source", "count" : 1 },
      { "term" : "open", "count" : 1 },
      { "term" : "distributed", "count" : 1 },
      { "term" : "data", "count" : 1 }
    ]
  }
}

Page 23: elasticsearch basics workshop

• hey! see "Open Source"! it has been lowercased and split into multiple tokens!

• this is done by the default mapping and its analyzer
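to get a feel for what happened to the tag, here is a rough imitation in plain shell. It is only an approximation written for this workshop, not how elasticsearch implements its analyzers, and the function names are invented for the example: one function lowercases and splits on anything that is not a letter or digit, the other leaves the text untouched.

```shell
# Rough stand-in for the default analyzer: lowercase, then split on
# anything that is not a letter or a digit (one token per line).
analyze_standard_like() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '\n' | sed '/^$/d'
}

# Stand-in for an analyzer that keeps the whole input as a single token.
analyze_whole_value() {
  printf '%s\n' "$1"
}

analyze_standard_like 'Open Source'   # two tokens: open, source
analyze_whole_value 'Open Source'     # one token: Open Source
```

the first form is what you want for full text search, the second is what you want when a value such as a tag must stay in one piece.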

Page 24: elasticsearch basics workshop

curl 'http://localhost:9200/workshop/site/_mapping?pretty=true'

{
  "site" : {
    "properties" : {
      "description" : { "type" : "string" },
      "tags" : { "type" : "string" },
      "title" : { "type" : "string" },
      "url" : { "type" : "string" }
    }
  }
}

Page 26: elasticsearch basics workshop

test the default analyzer

curl -XGET 'localhost:9200/workshop/_analyze?pretty=true' -d 'Open Source'

{
  "tokens" : [
    { "token" : "open", "start_offset" : 0, "end_offset" : 4, "type" : "<ALPHANUM>", "position" : 1 },
    { "token" : "source", "start_offset" : 5, "end_offset" : 11, "type" : "<ALPHANUM>", "position" : 2 }
  ]
}

Page 28: elasticsearch basics workshop

curl -XGET 'localhost:9200/workshop/_analyze?analyzer=keyword&pretty=true' -d 'Open Source'

{
  "tokens" : [
    { "token" : "Open Source", "start_offset" : 0, "end_offset" : 11, "type" : "word", "position" : 1 }
  ]
}

got it! now how do we apply this to our tags field?

Page 29: elasticsearch basics workshop

curl -XPUT 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d '{
  "site" : {
    "properties" : {
      "url" : {"type" : "string"},
      "title" : {"type" : "string"},
      "description" : {"type" : "string"},
      "tags" : {"type" : "string", "analyzer": "keyword" }
    }
  }
}'

{
  "error" : "MergeMappingException[Merge failed with failures {[mapper [tags] has different index_analyzer]}]",
  "status" : 400
}

oops! the analyzer of an existing field cannot be changed, so we need to drop the index..

Page 30: elasticsearch basics workshop

curl -XDELETE 'http://localhost:9200/workshop/'

{"ok":true,"acknowledged":true}

# the index must exist before we can put a mapping..
curl -XPUT 'http://localhost:9200/workshop/'

{"ok":true,"acknowledged":true}

curl -XPUT 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d '{
  "site" : {
    "properties" : {
      "url" : {"type" : "string"},
      "title" : {"type" : "string"},
      "description" : {"type" : "string"},
      "tags" : {"type" : "string", "analyzer": "keyword" }
    }
  }
}'

{"ok":true,"acknowledged":true}

Page 31: elasticsearch basics workshop

# test the analysis of the field
curl -XGET 'localhost:9200/workshop/_analyze?pretty=true&field=site.tags' -d 'Open Source'

{
  "tokens" : [
    { "token" : "Open Source", "start_offset" : 0, "end_offset" : 11, "type" : "word", "position" : 1 }
  ]
}

# congrats!

Page 32: elasticsearch basics workshop

# let's push the data again
curl -XPUT http://localhost:9200/workshop/site/1 -d '{
  "url": "http://www.elasticsearch.org",
  "title": "Open Source Distributed Real Time Search & Analytics",
  "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
  "tags": ["Open Source", "elasticsearch", "Distributed"]
}'

curl -XPUT http://localhost:9200/workshop/site/2 -d '{
  "url": "http://www.mathieu-elie.net",
  "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization",
  "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.",
  "tags": ["elasticsearch", "Data Visualization"]
}'

curl -XPUT http://localhost:9200/workshop/site/3 -d '{
  "url": "http://www.giroll.org",
  "title": "Collectif Giroll - Gironde Logiciels Libres",
  "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation d''Install Party Linux tous les six",
  "tags": ["Open Source", "Collectif"]
}'

Page 33: elasticsearch basics workshop

# faceting ok???
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'

Page 34: elasticsearch basics workshop

"facets" : {
  "tags" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 5,
    "other" : 0,
    "terms" : [
      { "term" : "elasticsearch", "count" : 2 },
      { "term" : "Open Source", "count" : 1 },
      { "term" : "Distributed", "count" : 1 },
      { "term" : "Data Visualization", "count" : 1 }
    ]
  }
}

cool! our facets contain whole tags! great job!!
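the difference between the two facet results can be replayed without a server. The sketch below counts the tags of the two documents matched by the "elasticsearch" query, once split into lowercase words and once kept whole. It only imitates the counting with standard shell tools; the function names are invented for the example and nothing here is elasticsearch code.

```shell
# Tags of the two documents matched by the query, one tag per line.
tags='Open Source
elasticsearch
Distributed
elasticsearch
Data Visualization'

# Imitation of the default analysis: lowercase, split on non-alphanumerics.
split_into_words() {
  tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '\n' | sed '/^$/d'
}

# Imitation of the keyword analysis: every tag passes through untouched.
keep_whole() {
  cat
}

# Count identical terms and list the most frequent first, as "count term".
count_terms() {
  awk '{ n[$0]++ } END { for (t in n) print n[t], t }' | sort -k1,1nr -k2
}

echo "split into words:"
printf '%s\n' "$tags" | split_into_words | count_terms

echo "kept whole:"
printf '%s\n' "$tags" | keep_whole | count_terms
```

the first listing has six distinct terms and the second has four, the same shape as the two facet responses shown in the slides.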

Page 36: elasticsearch basics workshop

• more efficient than full text search

• cached / indexed

• you can filter using facet items

curl -XGET 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "match_all" : { } }, "filter" : { "term" : { "tags" : "Open Source"} }}'

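since a facet item is just a field name and a term, the filter body can be generated from it. A minimal sketch follows, assuming the term contains no double quotes or backslashes (no JSON escaping is done); the function name `term_filter_body` is invented for the example.

```shell
# Print a search body that matches everything, then keeps only documents
# whose field ($1) holds exactly the term ($2).
# Assumption: the term needs no JSON escaping.
term_filter_body() {
  printf '{ "query" : { "match_all" : { } }, "filter" : { "term" : { "%s" : "%s" } } }\n' "$1" "$2"
}

term_filter_body tags "Data Visualization"

# Usage with curl (not run here):
# curl -XGET 'http://localhost:9200/workshop/site/_search?pretty=true' -d "$(term_filter_body tags 'Open Source')"
```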

Page 37: elasticsearch basics workshop

RTFM WAY

• the elasticsearch documentation is great

• but it is exhaustive

• so at the beginning it's a bit frustrating

Page 38: elasticsearch basics workshop

Think about the JSON hierarchy

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "fields" : ["title", "tags"],
  "query" : {
    "query_string" : { "query" : "elasticsearch" }
  },
  "facets" : {
    "tags" : {
      "terms" : {"field" : "tags"}
    }
  }
}'

Page 40: elasticsearch basics workshop

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html

you're using the query DSL

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'


Page 41: elasticsearch basics workshop

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-queries.html

you're using one of the different types of queries

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'


Page 42: elasticsearch basics workshop

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

this query is of the query_string type, with its query parameter set to elasticsearch

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'


Page 43: elasticsearch basics workshop

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html

we also use faceting

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'


Page 44: elasticsearch basics workshop

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html

we use a terms facet

curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } }}'


Page 45: elasticsearch basics workshop

RTFM WAY

• common pitfall: the code examples in the docs do not always show the whole query

• so you should put the code from the docs back into its place in the whole DSL hierarchy

• think about the hierarchy and everything should become clearer
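as an illustration of putting a doc snippet back into the hierarchy, the sketch below takes a fragment that starts at the query type and wraps it under the top-level "query" key that a search body expects. The function name `as_search_body` is invented for the example, and the fragment is the one used throughout this workshop.

```shell
# A query fragment as it might appear on its own in the reference docs.
fragment='"query_string" : { "query" : "elasticsearch" }'

# Put a query fragment back under the top-level "query" key of a search body.
as_search_body() {
  printf '{ "query" : { %s } }\n' "$1"
}

as_search_body "$fragment"
# prints { "query" : { "query_string" : { "query" : "elasticsearch" } } }
```

the printed body is what goes after `-d` in the search commands.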

Page 46: elasticsearch basics workshop

the end for me...

the beginning for you...

Page 47: elasticsearch basics workshop

questions and more

• twitter @mathieuel

• contact me via my freelance website

• http://www.mathieu-elie.net

• thanks to giroll for hosting this workshop !
