
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Group, Inc.

Jan 21, 2018


Technology

LucidWorks
Transcript
Page 1: Solr on Docker: the Good, the Bad, and the Ugly

Radu Gheorghe, Sematext Group, Inc.

Page 2: Agenda

The Good (well, arguably). Why containers? Orchestration, configuration drift...

The Bad (actually, not so bad). How to do it? Hardware, heap size, shards...

The Ugly (and exciting). Why is it slow or crashing? Container limits, GC and OS settings

Page 3: Our own dockerizing (dockerization?)

Clients

Sematext Cloud

logs

metrics

...

Page 4: Because Docker is the future!

Page 5: Orchestration

* you’re not tied to the provider’s autoscaling

* you may get better deals with huge VMs

Page 6: Demo: Kubernetes

github.com/sematext/lucene-revolution-samples

Page 7: More on “the Good” of containerization

dev=test=prod; infrastructure as code. Sounds familiar? But:

○ light images

○ faster start and stop

○ hype ⇒ community

Efficiency (overhead vs isolation): (processes + VMs)/2 = containers

Page 8: Moving on to “how”

Zookeeper on separate hosts

nodes

Avoid hotspots:

○ Equal nodes per host

○ Equal shards per node (per collection)

○ podAntiAffinity on k8s
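To illustrate "equal shards per node (per collection)", here is a minimal Python sketch of round-robin placement. The function names, node names, and collection names are hypothetical, chosen only for illustration; this is not how Solr or Kubernetes actually assigns replicas.

```python
from collections import Counter


def place_shards(collections, nodes):
    """Assign shards to nodes round-robin, one collection at a time.

    collections: dict of collection name -> number of shards (hypothetical input)
    nodes: list of node names (hypothetical)
    Returns a dict mapping (collection, shard_index) -> node.
    """
    placement = {}
    for name, num_shards in collections.items():
        for shard in range(num_shards):
            placement[(name, shard)] = nodes[shard % len(nodes)]
    return placement


def shards_per_node(placement, collection):
    """Count how many shards of one collection landed on each node."""
    return Counter(
        node for (name, _), node in placement.items() if name == collection
    )
```

With this scheme the distribution is only even when a collection's shard count is a multiple of the node count, which is one reason to pick shard counts with the cluster size in mind.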

Page 9: On scaling

Overshard*. A bit.

Time series? Size-based indices

[Diagram: logs1, logs2, logs3 laid out along a time axis]

* Moving shards creates load ⇒ be aware of spikes
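The idea of size-based indices for time series can be sketched as follows: keep writing to the current collection until it reaches a size threshold, then move to the next one. This is a minimal Python sketch; the function name, the numeric-suffix naming scheme (logs1, logs2, ...) and the threshold are assumptions for illustration, and how the size is measured is left out.

```python
def next_collection(current, current_size_gb, max_size_gb):
    """Return the collection to write to, rolling over by size rather than by date.

    Assumes (hypothetically) that collection names end in a number, e.g. "logs1".
    """
    prefix = current.rstrip("0123456789")
    index = int(current[len(prefix):])
    if current_size_gb >= max_size_gb:
        # Current collection is full: move on to the next one in the series.
        return f"{prefix}{index + 1}"
    return current
```

Compared with one collection per day, this keeps collections at a similar size regardless of how traffic varies over time.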

Page 10

volumes/StatefulSet for persistence

local > network (esp. for full-text search)

permissions

latency (mostly to Zookeeper); on AWS → enhanced networking

network storage on a different interface; on AWS → EBS-optimized

Page 11: Big vs small hosts

Not too small

○ OS caches are shared between containers ⇩

○ >1 Solr nodes per host?

○ Co-locate with less IO-intensive apps?

Not too big

○ Host failure will be really bad

○ Overhead (e.g. memory allocation)

Page 12: Big vs small containers/nodes

Many small Solr nodes ⇒ bigger cluster state, # of shards

Multithreaded indexing

Full text search is usually bound by IO latency

Facets are usually parallelized between shards/collections

Size usually limited by heap (can’t be too big due to GC) or by recovery time

bigger = better

Page 13: How much heap?

More data → more heap (terms, docValues, norms…)

Caches (generally, fieldValueCache is evil, use docValues)

Transient memory (serving requests) → add 50-100% headroom

Make sure to leave enough room for OS caches
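The sizing logic above can be written as simple arithmetic: add up what stays on the heap, then add headroom for transient memory. This is a minimal Python sketch; the function names and input figures are hypothetical, and the real inputs have to be measured on your own data.

```python
def estimate_heap_gb(static_gb, caches_gb, headroom=0.5):
    """Rough heap estimate: long-lived data plus caches, plus transient headroom.

    headroom is a fraction; the slide suggests 0.5 to 1.0 (50-100%).
    """
    return (static_gb + caches_gb) * (1 + headroom)


def os_cache_room_gb(ram_gb, heap_gb):
    """Memory left for OS caches once the heap is accounted for.

    Ignores other off-heap JVM usage, so the real figure will be lower.
    """
    return ram_gb - heap_gb
```

For example, 8 GB of long-lived data and 2 GB of caches with 50% headroom suggests a 15 GB heap; on a 64 GB box that leaves at most 49 GB for OS caches.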

Page 14: The 32GB heap problem

@32GB → no more compressed object pointers

Depending on OS, >30GB → still compressed, but not 0-based → more CPU

Uncompressed pointers’ overhead varies by use case; 5-10% is a good estimate

Larger heaps → GC is a bigger problem
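The thresholds on this slide can be expressed as a small lookup. This is a minimal Python sketch; the function name is hypothetical, and the 30 GB boundary is taken from the slide as an approximation, since the exact zero-based limit depends on the OS and JVM.

```python
def pointer_mode(heap_gb, zero_based_limit_gb=30):
    """Classify a heap size by the object pointer mode described on the slide.

    zero_based_limit_gb is an approximation; the real boundary is OS-dependent.
    """
    if heap_gb >= 32:
        return "uncompressed"
    if heap_gb > zero_based_limit_gb:
        return "compressed, non-zero-based"
    return "compressed, zero-based"
```

To check what a given JVM actually does, one option is to inspect the UseCompressedOops flag in the output of `java -Xmx31g -XX:+PrintFlagsFinal -version`.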

Page 15: GC Settings

Defaults → should be good up to 30GB

Larger heaps need tuning for latency

100GB+ per node is doable.

CMS: NewRatio, SurvivorRatio, CMSInitiatingOccupancyFraction

G1 trades heap for latency and throughput:

■ Adaptive sizing depending on MaxGCPauseMillis

■ Compacts old gen (check G1HeapRegionSize)

More useful info: https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr

usually jump to 45GB+

typical cluster killer (timeouts)
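As an illustration of the G1 options named above, here is a minimal Python sketch that assembles a JVM flag list. The function name and the default values are assumptions, not recommendations from the talk; the region-size check reflects the 1 to 32 MB power-of-two range documented for JDK 8.

```python
def g1_flags(heap_gb, max_pause_ms=250, region_size_mb=None):
    """Build a list of JVM flags for G1 with a fixed heap size.

    max_pause_ms and region_size_mb are illustrative values, to be tuned
    against real GC logs.
    """
    flags = [
        f"-Xms{heap_gb}g",
        f"-Xmx{heap_gb}g",
        "-XX:+UseG1GC",
        f"-XX:MaxGCPauseMillis={max_pause_ms}",
    ]
    if region_size_mb is not None:
        if region_size_mb not in (1, 2, 4, 8, 16, 32):
            raise ValueError("region size must be a power of two from 1 to 32 MB")
        flags.append(f"-XX:G1HeapRegionSize={region_size_mb}m")
    return flags
```

The resulting flags, joined with spaces, could then be supplied to Solr through its GC_TUNE setting in solr.in.sh.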

Page 16: N nodes per host ⇒ threads

GC-related

○ young: ParallelGCThreads

○ old: ConcGCThreads + G1ConcRefinementThreads

facet.threads

merges*: maxThreadCount & maxMergeCount

* also account for IO throughput and latency

Before Java 9, defaults depend on the host’s number of CPUs
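To show why host-based defaults matter with several nodes per host, here is a minimal Python sketch. The default formulas are an assumption based on the commonly documented HotSpot heuristics (all CPUs up to 8, then 5/8 of the rest; concurrent threads roughly a quarter of the parallel ones for G1); the function names are hypothetical, and the actual values should be checked on your JVM.

```python
def default_parallel_gc_threads(cpus):
    """Assumed HotSpot default for ParallelGCThreads given a CPU count."""
    return cpus if cpus <= 8 else 8 + (cpus - 8) * 5 // 8


def per_node_gc_threads(host_cpus, nodes_per_host):
    """Size GC threads from each node's share of CPUs, not the whole host."""
    cpus = max(1, host_cpus // nodes_per_host)
    parallel = default_parallel_gc_threads(cpus)
    concurrent = max(1, (parallel + 2) // 4)
    return {"ParallelGCThreads": parallel, "ConcGCThreads": concurrent}
```

Under these assumptions, four nodes on a 32-CPU host would each default to 23 parallel GC threads (92 in total), whereas sizing by share gives each node 8.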

Page 17: Container limits

Memory: more than heap, but won’t include OS caches

CPU

○ Single NUMA node? --cpu-shares

○ Multiple NUMA nodes? --cpuset*

○ vm.zone_reclaim_mode to store caches only on local node?

* Docker isn’t NUMA aware: https://github.com/moby/moby/issues/9777
But kernel automatically balances threads by default
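These limits map onto `docker run` options. Below is a minimal Python sketch that builds the arguments; the function name and the off-heap allowance are assumptions for illustration, while `--memory`, `--cpu-shares` and `--cpuset-cpus` are existing Docker flags.

```python
def docker_limits(heap_gb, off_heap_gb, cpuset=None, cpu_shares=None):
    """Build docker run resource arguments for one Solr container.

    The memory limit is heap plus an allowance for off-heap JVM memory
    (off_heap_gb is a hypothetical figure you would measure yourself).
    """
    args = [f"--memory={heap_gb + off_heap_gb}g"]
    if cpuset is not None:
        # Pin to specific CPUs, e.g. those of a single NUMA node.
        args.append(f"--cpuset-cpus={cpuset}")
    if cpu_shares is not None:
        # Relative CPU weight, for hosts where pinning is not needed.
        args.append(f"--cpu-shares={cpu_shares}")
    return args
```

For a 16 GB heap with a 2 GB off-heap allowance pinned to CPUs 0-7, this yields `--memory=18g --cpuset-cpus=0-7`.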

Page 18: JVM+Docker+Linux = love. Or not.

Memory leak → OOM killer with a wide range of Java versions*

What helps:

Similar leaks (growing RSS) → NativeMemoryTracking

Don’t overbook memory + leave room for OS caches

Allocate on startup via AlwaysPreTouch

Increase vm.min_free_kbytes?

* https://bugs.openjdk.java.net/browse/JDK-8164293
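The advice not to overbook memory can be checked with simple arithmetic, sketched here in Python. The function name and the figures are hypothetical; the two JVM flags in the list are the HotSpot options behind NativeMemoryTracking and AlwaysPreTouch mentioned above.

```python
# JVM options corresponding to two of the suggestions above.
DIAGNOSTIC_FLAGS = ["-XX:NativeMemoryTracking=summary", "-XX:+AlwaysPreTouch"]


def spare_memory_gb(host_ram_gb, container_limits_gb, os_cache_reserve_gb):
    """Memory left on a host after container limits and an OS cache reserve.

    A negative result means the host is overbooked.
    """
    committed = sum(container_limits_gb)
    return host_ram_gb - committed - os_cache_reserve_gb
```

With native memory tracking enabled, `jcmd <pid> VM.native_memory summary` reports where the JVM's off-heap memory goes, which helps when RSS keeps growing.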

Page 19: More on that love

Newer kernels and Dockers are usually better (e.g. 4.4+ and 1.13+)

Open files and locked memory limits

Check dmesg and kswapd* CPU usage

Dare I say it: try smaller hosts

Try niofs? (if you thrash the cache - and TLB - too much)

A bit of swap? (swappiness is configurable per container, too)

Play with mmap arenas and THP

* kernel’s (single-threaded) GC: https://linux-mm.org/PageOutKswapd
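The open-files, locked-memory and swappiness settings can be passed per container. This is a minimal Python sketch that builds the `docker run` arguments; the function name and default values are assumptions for illustration, and `--memory-swappiness` may not be available on every Docker or cgroup setup.

```python
def docker_os_args(nofile=65000, unlimited_memlock=True, swappiness=None):
    """Build docker run arguments for ulimits and per-container swappiness.

    The default values are illustrative, not recommendations from the talk.
    """
    args = ["--ulimit", f"nofile={nofile}:{nofile}"]
    if unlimited_memlock:
        # -1 means unlimited for both the soft and hard limit.
        args += ["--ulimit", "memlock=-1:-1"]
    if swappiness is not None:
        if not 0 <= swappiness <= 100:
            raise ValueError("swappiness must be between 0 and 100")
        args.append(f"--memory-swappiness={swappiness}")
    return args
```

Keeping these next to the resource limits in one place makes it easier to apply the same settings to every Solr container on a host.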

Page 20: Summary

The Good:

Orchestration

Dynamic allocation of resources (works well for bigger boxes)

Might actually deliver the promise of dev=testing=prod, because Docker is the future!

The Bad:

Pets → cattle requires good sizing, config, scaling practices

The Ugly:

Ecosystem is still young → exciting bugs

Page 21: Thank You!

And please check out:

Solr & Kubernetes cheatsheets: sematext.com/resources/#publications

Openings: sematext.com/jobs

@sematext @radu0gheorghe

Our booth :)