Administering and Monitoring SolrCloud Clusters
Post on 11-May-2015
Administering and Monitoring SolrCloud
Rafał Kuć – Sematext Group, Inc. @kucrafal @sematext sematext.com
Ta me…
Sematext consultant & engineer
Solr.pl co-founder
Father and husband
SolrCloud Concepts
[Diagram: an application and four Solr servers hosting Shard1, Shard2, a Shard1 replica, and a Shard2 replica]
Local SolrCloud Cluster
java -Dbootstrap_confdir=./solr/revolution/conf -Dcollection.configName=revolution -DzkRun -DnumShards=1 -jar start.jar
Runs embedded ZooKeeper
Bootstraps a collection with 1 shard
Starts Solr
Starting Solr Cluster
[Diagram: three ZooKeeper nodes and four Solr servers; no collection exists on any server yet]
Each Solr server is started with the full ZooKeeper ensemble in -DzkHost, written without spaces after the commas:
-DzkHost=192.168.1.2:2181,192.168.1.1:2181,192.168.1.3:2181
-DzkHost=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181
-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181
-DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181
Uploading Collection Configuration
./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181 -confdir ./conf/ -confname revolution
[Diagram: three ZooKeeper nodes, the collection configuration, and Solr]
Collections API
Create
Delete
Reload
Split
Create Alias
Delete Alias
Shard Creation/Deletion
http://wiki.apache.org/solr/SolrCloud
Collection Creation
curl 'http://solrhost:8983/solr/admin/collections?action=CREATE&name=revolution&numShards=3&replicationFactor=4'
Parameters: name, numShards, replicationFactor, maxShardsPerNode, createNodeSet, collection.configName
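The CREATE request above can also be assembled programmatically, which avoids stray spaces in the query string. A minimal Python sketch, assuming only the host, action and parameter names shown on the slide; the helper name collections_api_url is hypothetical and is not part of Solr:

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, params):
    """Build a Collections API URL (hypothetical helper, not a Solr API).

    Parameters whose value is None are skipped, so optional ones such as
    maxShardsPerNode or collection.configName can simply be left out.
    """
    query = {"action": action}
    query.update({key: value for key, value in params.items() if value is not None})
    return f"{base_url}/admin/collections?{urlencode(query)}"


# Same request as the curl example: 3 shards, replicationFactor of 4.
url = collections_api_url(
    "http://solrhost:8983/solr",
    "CREATE",
    {"name": "revolution", "numShards": 3, "replicationFactor": 4},
)
print(url)
```

The resulting string can be passed to curl or any HTTP client; the sketch only builds the URL and does not contact a cluster.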
Collection Split Example
$ curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1'
Collection Split Example
$ curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1'
Getting Deeper – CoreAdmin API
curl 'http://solrhost:8983/solr/admin/cores?action=CREATE&name=newcore&collection=revolution&shard=shard2'
Parameters: collection, shard, numShards, collection.configName
Schema – the API
Reading (Solr 4.2)
  Fields
  Dynamic fields
  Types
  Copy fields
  Name (4.3)
  Version (4.3)
  Unique Key (4.3)
  Similarity (4.3)
Writing (Solr 4.4)
  Adding new fields
  Adding copy fields
Reading Your Schema
curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name'
Full reference: http://wiki.apache.org/solr/SchemaRESTAPI
{
  "responseHeader" : {
    "status" : 0,
    "QTime" : 5
  },
  "field" : {
    "name" : "name",
    "type" : "text_general",
    "indexed" : true,
    "stored" : true
  }
}
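A short Python sketch of how a client might consume the response above. It assumes the response body has exactly the shape shown on the slide; the function name describe_field is hypothetical.

```python
import json

# Response body copied from the slide above.
response_text = """
{
  "responseHeader": {"status": 0, "QTime": 5},
  "field": {"name": "name", "type": "text_general", "indexed": true, "stored": true}
}
"""


def describe_field(text):
    """Summarize one field definition returned by the schema read API."""
    body = json.loads(text)
    status = body["responseHeader"]["status"]
    if status != 0:
        raise ValueError(f"Solr returned status {status}")
    field = body["field"]
    # Keep only the flags that are switched on for this field.
    flags = [flag for flag in ("indexed", "stored") if field.get(flag)]
    return f"{field['name']} ({field['type']}): {', '.join(flags) or 'no flags'}"


summary = describe_field(response_text)
print(summary)
```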
Dynamic Schema Modifications
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
curl -XPUT 'http://solrhost:8983/solr/rev/schema/fields/content' -d '{
"type" : "text",
"stored" : "false",
"copyFields" : ["catchAll"]
}'
curl -XPOST 'http://solrhost:8983/solr/rev/schema/copyFields' -d '[
{
"source" : "name",
"dest" : [ "text", "personal" ]
}
]'
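Hand-written JSON inside shell quotes is easy to get wrong, so the request bodies above can be generated instead. A minimal Python sketch, assuming the same property names as the slide (type, stored, copyFields, source, dest); both helper names are hypothetical. Note that it emits JSON booleans, whereas the slide passes "false" as a string.

```python
import json


def field_body(field_type, stored, copy_fields=()):
    """JSON body for the PUT .../schema/fields/<name> request (hypothetical helper)."""
    body = {"type": field_type, "stored": stored}
    if copy_fields:
        body["copyFields"] = list(copy_fields)
    return json.dumps(body)


def copy_field_body(rules):
    """JSON body for the POST .../schema/copyFields request.

    rules maps a source field name to a list of destination field names.
    """
    return json.dumps([{"source": src, "dest": list(dests)} for src, dests in rules.items()])


content_field = field_body("text", stored=False, copy_fields=["catchAll"])
copy_rules = copy_field_body({"name": ["text", "personal"]})
print(content_field)
print(copy_rules)
```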
The Right Directory
_0.fdt _0.fdx _0.fnm _0.nvd
_1.fdt _1.fdx _1.fnm _1.nvd
StandardDirectory
SimpleFSDirectory
NIOFSDirectory
MMapDirectory
NRTCachingDirectory
RAMDirectory
<directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory" />
Segment Merging
[Diagram: segments a, b, c, d, e and segments c, f, g, arranged in Level 0 and Level 1]
Segment Merge Under Control
Merge policy
Merge scheduler
Merge factor
Merge policy configuration
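To build intuition for what the merge factor controls, here is a deliberately simplified Python model of level-based merging: once merge_factor segments pile up on a level, they are merged into a single segment on the next level. This is an illustration only, not Lucene's actual merge policy (Solr 4 defaults to TieredMergePolicy, which picks segments by size rather than by strict levels), and all names in it are hypothetical.

```python
def add_segment(levels, merge_factor):
    """Flush one new segment to level 0 and cascade merges upward.

    levels[i] holds the number of segments currently sitting on level i.
    """
    if not levels:
        levels.append(0)
    levels[0] += 1
    level = 0
    # A level that reaches merge_factor segments collapses into one
    # segment on the next level, which may trigger a further merge there.
    while levels[level] >= merge_factor:
        levels[level] -= merge_factor
        if level + 1 == len(levels):
            levels.append(0)
        levels[level + 1] += 1
        level += 1
    return levels


def simulate(flushes, merge_factor):
    """Return the per-level segment counts after a number of flushes."""
    levels = []
    for _ in range(flushes):
        add_segment(levels, merge_factor)
    return levels


print(simulate(25, 10))   # segment counts per level after 25 flushes
print(simulate(25, 2))    # a lower merge factor merges far more often
```

In this model a lower merge factor keeps fewer segments around, which helps search, at the cost of more merge work during indexing.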
https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
Autocommit or Not?
<autoCommit>
<maxTime>15000</maxTime>
<maxDocs>1000</maxDocs>
<openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
<maxTime>1000</maxTime>
</autoSoftCommit>
autoCommit: automatic data flush (hard commit)
autoSoftCommit: automatic index view refresh
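The autoCommit and autoSoftCommit settings above can be read back and sanity-checked with a few lines of Python. A minimal sketch: the updateHandler wrapper element is added here only so that the fragment parses as one XML document, and the function name commit_settings is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment from the slide, wrapped in a single root element for parsing.
config = """
<updateHandler>
  <autoCommit>
    <maxTime>15000</maxTime>
    <maxDocs>1000</maxDocs>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
"""


def commit_settings(xml_text):
    """Collect the child settings of autoCommit and autoSoftCommit as strings."""
    root = ET.fromstring(xml_text)
    settings = {}
    for section in ("autoCommit", "autoSoftCommit"):
        node = root.find(section)
        if node is not None:
            settings[section] = {child.tag: (child.text or "").strip() for child in node}
    return settings


settings = commit_settings(config)
hard = settings["autoCommit"]
soft = settings["autoSoftCommit"]
print(f"hard commit: every {int(hard['maxTime']) / 1000:g} s or {hard['maxDocs']} docs, "
      f"openSearcher={hard['openSearcher']}")
print(f"soft commit: every {int(soft['maxTime']) / 1000:g} s")
```

With these values, data is flushed to disk at most 15 seconds or 1000 documents after an update without opening a new searcher, while the soft commit refreshes the index view roughly every second.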
Caches
Solr Cache
Refreshed with IndexSearcher
Configurable
Different purposes
Different implementations
Monitoring Importance
What to Pay Attention to?
Cluster State
Health
Shards and replica status
Shard placement
Failing nodes
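A check for shard and replica status can be scripted against the cluster state that SolrCloud keeps in ZooKeeper. The Python sketch below assumes a collection -> shards -> replicas -> state layout resembling the clusterstate.json document of Solr 4.x; the sample data, the replica names and the function name are all made up for illustration.

```python
import json

# Hypothetical sample, shaped like a Solr 4.x clusterstate.json (assumed layout).
cluster_state = json.loads("""
{
  "revolution": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"state": "active", "leader": "true"},
          "core_node2": {"state": "down"}
        }
      },
      "shard2": {
        "replicas": {
          "core_node3": {"state": "active", "leader": "true"},
          "core_node4": {"state": "recovering"}
        }
      }
    }
  }
}
""")


def unhealthy_replicas(state):
    """List (collection, shard, replica, state) for every replica that is not active."""
    problems = []
    for collection, collection_data in state.items():
        for shard, shard_data in collection_data.get("shards", {}).items():
            for replica, replica_data in shard_data.get("replicas", {}).items():
                if replica_data.get("state") != "active":
                    problems.append((collection, shard, replica, replica_data.get("state")))
    return sorted(problems)


for problem in unhealthy_replicas(cluster_state):
    print(problem)
```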
Indexing Related Metrics
Index throughput
Document distribution
I/O subsystem metrics
Merging
Search-related Metrics
Count
Latency
Distribution among nodes
Anomalies and spikes
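Averages alone can mislead when latency has spikes, which is why count, latency and anomalies are listed separately above. A small Python sketch, assuming query times in milliseconds have already been collected (for example from the QTime value in Solr responses); the sample numbers are invented and the function name is hypothetical.

```python
import math


def latency_summary(qtimes_ms):
    """Return count, average, median, 95th percentile and max for query times."""
    if not qtimes_ms:
        raise ValueError("no samples")
    ordered = sorted(qtimes_ms)

    def percentile(p):
        # Nearest-rank percentile: smallest value with at least p% of samples at or below it.
        rank = math.ceil(p / 100 * len(ordered))
        return ordered[max(rank, 1) - 1]

    return {
        "count": len(ordered),
        "avg": sum(ordered) / len(ordered),
        "p50": percentile(50),
        "p95": percentile(95),
        "max": ordered[-1],
    }


# Invented sample: nine fast queries and one slow outlier.
summary = latency_summary([5, 7, 6, 8, 5, 9, 7, 6, 250, 6])
print(summary)
```

In this sample the median stays at a few milliseconds while the average and the 95th percentile are dominated by the single slow query, which is the kind of spike worth alerting on.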
Monitoring Memory and GC
Heap details
Pool size
Pool utilization
Garbage collection count
Garbage collection time
Monitoring OS Related Metrics
CPU details
Load
I/O activity
Network usage
Solr Administration Panel
Solr & JMX
<jmx />
java -Dcom.sun.management.jmxremote -jar start.jar
Solr & JMX
SPM
Index statistics
Request # and latency
Caches and warmup
CPU
JVM Memory and OS Memory
Garbage collector
OS related statistics
SPM Dashboard
Other Monitoring Tools
Ganglia http://ganglia.sourceforge.net/
New Relic http://www.newrelic.com/
Opsview http://www.opsview.com
Too much is too much
Too hot
Caches
We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig working with and in open-source?
We're hiring worldwide! http://sematext.com/about/jobs.html
Rafał Kuć @kucrafal rafal.kuc@sematext.com Sematext @sematext http://sematext.com http://blog.sematext.com SPM discount code: LR2013SPM20
Thank You!
@ Sematext booth ;)