Jan 26, 2015
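LAMP人 Topic Sharing and Exchange Session
Session 12: "Challenges and Optimization of Next-Generation Internet Behavioral Targeting Advertising Technology"
- 品友互动 special session
www.LAMPER.cn  QQ group: 83304912
http://weibo.com/lampercn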
ElasticSearch: A search engine “ready to fly”
Medcl/2012/2/18
Why I am here?
• Good things should be shared with everyone!
What’s elasticsearch?
• “Distributed, (Near) Real Time, Search Engine”
• Open Source (Apache 2.0)
• RESTful
• Schema-free (dynamic)
• Multi-tenant
• Scalable
• High Availability
• Rich Search Features
• Good extensibility
• … …
first impression
Let’s start the trip
Debug Tools
Index a document
curl -XPOST http://localhost:9200/myindex/share/1 -d '{
  "url": "http://www.lamper.cn/",
  "date": "2012-02-18 13:00:00",
  "location": "beijing, 北京"
}'
RESTful URL
Document content to index, in JSON format
Field: field name and field value
Index Response
{ "ok": true, "_index": "myindex", "_type": "share", "_id": "1", "_version": 1}
Explain the url
http://localhost:9200/myindex/share/1
Server IP address
HTTP port
Index name
Type name
Unique document ID
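The URL anatomy above can be sketched in Python. This is only an illustration: the function name `explain_document_url` is hypothetical, and it assumes the path always has exactly three segments (index, type, id) as on the slide.

```python
from urllib.parse import urlsplit


def explain_document_url(url):
    """Split a document URL of the form host:port/index/type/id into named parts."""
    parts = urlsplit(url)
    # Assumption: the path is exactly /<index>/<type>/<id>, as on the slide.
    index, doc_type, doc_id = parts.path.strip("/").split("/")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "index": index,
        "type": doc_type,
        "id": doc_id,
    }


info = explain_document_url("http://localhost:9200/myindex/share/1")
print(info)
```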
Query the document
curl -XGET 'http://localhost:9200/myindex/share/_search?q=location:beijing'
ES server address
Index name
Type name
RESTful search endpoint
Query parameter
Query condition, as field name : value
Search Response
{
  "took": 12,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.5,
    "hits": [
      {
        "_index": "myindex",
        "_type": "share",
        "_id": "1",
        "_score": 0.5,
        "_source": {
          "url": "http://www.lamper.cn/",
          "date": "2012-02-18 13:00:00",
          "location": "beijing, 北京"
        }
      }
    ]
  }
}
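A client usually only cares about a few fields of this response. A minimal sketch, assuming the response has exactly the shape shown on the slide (with `hits.total` as a plain integer); the helper name `summarize_hits` is hypothetical:

```python
import json

# A trimmed copy of the response from the slide, kept inline so the
# example runs without a server.
response_text = """
{
  "took": 12,
  "timed_out": false,
  "hits": {
    "total": 1,
    "max_score": 0.5,
    "hits": [
      {
        "_index": "myindex", "_type": "share", "_id": "1", "_score": 0.5,
        "_source": {
          "url": "http://www.lamper.cn/",
          "date": "2012-02-18 13:00:00",
          "location": "beijing, 北京"
        }
      }
    ]
  }
}
"""


def summarize_hits(response):
    """Return the total hit count and a list of (id, score, source) tuples."""
    hits = response["hits"]
    docs = [(h["_id"], h["_score"], h["_source"]) for h in hits["hits"]]
    return hits["total"], docs


total, docs = summarize_hits(json.loads(response_text))
for doc_id, score, source in docs:
    print(doc_id, score, source["url"])
```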
Queries
http://localhost:9200/myindex/share/_search?q=beijing
http://localhost:9200/myindex/share,conf/_search?q=beijing
http://localhost:9200/myindex/_search?q=beijing
http://localhost:9200/myindex,myindex2/_search?q=beijing
http://localhost:9200/_search?q=beijing
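The five URLs follow one pattern: optional comma-separated indices, optional comma-separated types, then `_search`. A small sketch of a builder for that pattern; the function `search_url` and its parameters are hypothetical, not part of any client library:

```python
from urllib.parse import urlencode


def search_url(base, indices=(), types=(), q=None):
    """Build a search URL over any number of indices and types."""
    if types and not indices:
        # Without an index, the first path segment would be read as an index name.
        raise ValueError("types require at least one index")
    segments = []
    if indices:
        segments.append(",".join(indices))
    if types:
        segments.append(",".join(types))
    segments.append("_search")
    url = base.rstrip("/") + "/" + "/".join(segments)
    if q is not None:
        url += "?" + urlencode({"q": q})
    return url


base = "http://localhost:9200"
print(search_url(base, ["myindex"], ["share", "conf"], "beijing"))
print(search_url(base, ["myindex", "myindex2"], q="beijing"))
print(search_url(base, q="beijing"))
```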
QueryDSL
curl -XPOST http://localhost:9200/myindex/_search -d '{
  "query": {
    "term": { "location": "beijing" }
  }
}'
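The same Query DSL request can be prepared from Python with only the standard library. This is a sketch, not an official client: `term_query` and `build_search_request` are hypothetical helpers, and the `Content-Type` header is an addition that does not appear on the slide. The request is only built here, not sent.

```python
import json
import urllib.request


def term_query(field, value):
    """Return a Query DSL body with a single term query, as on the slide."""
    return {"query": {"term": {field: value}}}


def build_search_request(base_url, index, body):
    """Prepare (but do not send) a POST to <base_url>/<index>/_search."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
        method="POST",
    )


body = term_query("location", "beijing")
request = build_search_request("http://localhost:9200", "myindex", body)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running server: urllib.request.urlopen(request)
```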
Why QueryDSL?
Filters, Caching, Highlighting, Facet, Complex Query … …
Scalability & HA
Distributed Lucene Directory
• Each index is fully sharded with a configurable number of shards.
• Each shard can have zero or more replicas.
• Read / search operations can be performed on either the primary or a replica shard.
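A toy sketch of the idea in the bullets above: a document is mapped to one of a fixed number of shards by hashing, and a read can go to any copy of that shard. `zlib.crc32` is only a stand-in hash chosen for this example, not the function elasticsearch itself uses, and the helper names are hypothetical.

```python
import zlib


def shard_for(doc_id, number_of_shards):
    """Map a document id to a shard number (stand-in hash, for illustration)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def copies_for_read(shard, number_of_replicas):
    """List every copy of a shard that could serve a read or search."""
    copies = [f"shard{shard}/primary"]
    copies += [f"shard{shard}/replica{r}" for r in range(1, number_of_replicas + 1)]
    return copies


shard = shard_for("1", number_of_shards=5)
print(shard, copies_for_read(shard, number_of_replicas=2))
```

The same id always lands on the same shard, which is why the shard count is fixed per index while the replica count can vary.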
Automatic shard allocation
From: http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010#
Scalability
• There are two kinds of nodes: nodes that can hold data, and nodes that do not.
• There is no need for a load balancer in elasticsearch. Each node can receive a request, and if it can’t handle it, it automatically delegates it to the appropriate node(s).
• If you want to scale out search, you can simply add more replicas per shard.
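The delegation behaviour described above can be pictured with a toy model. Everything here (the `Node` class, node names, shard layout) is made up for illustration and says nothing about how elasticsearch implements it internally.

```python
class Node:
    def __init__(self, name, shards=()):
        self.name = name
        # An empty set models a node that holds no data.
        self.shards = set(shards)

    def handle(self, shard, cluster):
        """Serve the request locally if possible, otherwise delegate it."""
        if shard in self.shards:
            return f"{self.name} served shard {shard}"
        owner = next(node for node in cluster if shard in node.shards)
        return f"{self.name} -> " + owner.handle(shard, cluster)


cluster = [
    Node("node1", {0, 1}),
    Node("node2", {2, 3, 4}),
    Node("client-node"),  # holds no shards, only forwards requests
]

print(cluster[0].handle(0, cluster))
print(cluster[2].handle(3, cluster))
```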
Transaction log
• Indexed / deleted doc is fully persistent
• No need for a Lucene IndexWriter#commit
• Managed using a transaction log / WAL
• Full single node durability (kill dash 9)
• Utilized when doing hot relocation of shards
• Periodically “flushed” (calling IW#commit)
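The write-ahead-log idea can be sketched in a few lines. This is a toy in-memory model under stated assumptions: `ToyShard` is hypothetical, `committed` stands in for data already committed by Lucene, and a real transaction log lives on disk rather than in a Python list.

```python
class ToyShard:
    def __init__(self):
        self.committed = {}  # stands in for committed Lucene data
        self.translog = []   # operations recorded since the last flush
        self.live = {}       # current in-memory view of the shard

    def index(self, doc_id, doc):
        self.translog.append(("index", doc_id, doc))  # log first
        self.live[doc_id] = doc

    def delete(self, doc_id):
        self.translog.append(("delete", doc_id, None))
        self.live.pop(doc_id, None)

    def flush(self):
        """Commit the current state and start a fresh log."""
        self.committed = dict(self.live)
        self.translog.clear()

    def recover(self):
        """Rebuild state after a crash: committed data plus replayed log."""
        state = dict(self.committed)
        for op, doc_id, doc in self.translog:
            if op == "index":
                state[doc_id] = doc
            else:
                state.pop(doc_id, None)
        self.live = state
        return state


shard = ToyShard()
shard.index("1", {"location": "beijing"})
shard.flush()
shard.index("2", {"location": "shanghai"})
shard.delete("1")
shard.live = {}  # simulate losing in-memory state (kill -9)
recovered = shard.recover()
print(recovered)
```

Because every operation is logged before it is applied, nothing indexed after the last flush is lost when the in-memory state disappears.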
BASE
• Each document you index is there once the index operation is done.
• No need to commit or something similar to get everything persisted.
• A shard can have 1 or more replicas for HA.
• Gateway persistency is done in the background in an async manner.
Not Mentioned Here…
• Versioning
• Template
• River
• Percolator
• Partial Update
• Routing
• Parent-Child Type
• Scripting
• … …
That’s too much; discover it yourself
Community & Support
• http://github.com/elasticsearch
• http://groups.google.com/group/elasticsearch
• IRC: #elasticsearch@freenode
• QQ group: 190605846
• http://doc.elasticsearch.cn
• http://s.medcl.net/
BTW
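• We are hiring
– Distributed systems
– High performance
– Massive data processing
– Personalized recommendation
– Search engines
• If you are interested in any of the above:
– You are welcome to join our gang!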
My Company
Thank you!