© 2019 SPLUNK INC.
Fast and Scalable Knowledge Bundle Replication in Splunk Enterprise 8.0
October 23rd 2019, The Venetian Sands Expo, Las Vegas

Page 1: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Fast and Scalable Knowledge Bundle Replication in Splunk Enterprise 8.0

October 23rd 2019, The Venetian Sands Expo, Las Vegas

Page 2: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Fast and Scalable Knowledge Bundle Replication in Splunk Enterprise 8.0

Aditya Dhoke
Software Engineer | Splunk

Anish Shrigondekar
Software Engineer | Splunk

Page 3: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Forward-Looking Statements

During the course of this presentation, we may make forward-looking statements regarding future events or plans of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results may differ materially. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, it may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements made herein.

In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionalities described or to include any such feature or functionality in a future release.

Splunk, Splunk>, Turn Data Into Doing, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2019 Splunk Inc. All rights reserved.

Page 4: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Introduction
What are knowledge bundles and why do they matter?

Page 5: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Knowledge Bundles
Splunk Distributed Search 101

Created on the search-head tier

Composed of knowledge objects
• lookups
• datamodels
• tags
• alerts …

The search head sends bundles to search peers

A search peer runs searches on behalf of the search head using the relevant knowledge bundle

Page 6: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Classic Knowledge Bundle Replication
How does knowledge bundle replication work today?

Page 7: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Classic Bundle Replication
Search Head sends bundle to all Search Peers

[Diagram: the Search Head sends the bundle payload directly to each of ten Indexers. Labels: payload, Search]

num_threads = 4

Page 8: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Problem Statement
Could we do better?

Page 9: Fast and Scalable Knowledge Bundle Replication in Splunk ...

What you told us…
Voice of the customer

Replication could be slow in large deployments

WAN latency impacts performance

Search Head could potentially become a bottleneck

Page 10: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Cascading Knowledge Bundle Replication
Fast and scalable

Page 11: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Cascading Bundle Replication

Ultra-fast performance

Easy to configure and manage

Site and deployment aware

Page 12: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Cascading Bundle Replication
Search Head only sends bundle to some Search Peers

[Diagram: the Search Head and ten Indexers, each of which ends up with the bundle. Legend: plan, payload. Label: Search]

1. Send cascade plan to all peers
2. Search Head sends payload to designated receivers
3. Search Peers send payload to designated receivers

num_threads = 4
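
To see why forwarding through peers scales better than fanning out from a single sender, here is a rough back-of-the-envelope sketch. It is not Splunk's scheduling logic. It assumes, purely for illustration, that every sender can push the bundle to num_threads receivers at once, that one transfer takes one "round", and that in cascading mode every node that already holds the bundle forwards it in the next round. The function names are hypothetical.

```python
import math


def classic_rounds(peers: int, num_threads: int) -> int:
    """Rounds needed when only the search head sends (assumed model)."""
    return math.ceil(peers / num_threads)


def cascading_rounds(peers: int, num_threads: int) -> int:
    """Rounds needed when every holder of the bundle forwards it (assumed model)."""
    holders = 1  # the search head is the only holder at the start
    rounds = 0
    while holders - 1 < peers:  # holders - 1 = peers served so far
        holders += holders * num_threads  # each holder serves num_threads new peers
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 120, 1000):
        print(n, classic_rounds(n, 4), cascading_rounds(n, 4))
```

Under these assumptions the classic round count grows linearly with the number of peers, while the cascading round count grows logarithmically. Real deployments also depend on bundle size, WAN links, and the plan's depth and width.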

Page 13: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Terminology
Terms & Definitions

Cascade Plan (Control Plane)
• Topology-aware execution plan generated on the search head
• Sent to all peers from the search head
• Stored in JSON format on all nodes

Cascade Payload (Data Plane)
• Actual payload packets sent by a sender to a set of local receivers based on the cascade plan
• A search peer can act as sender as well as receiver

Knowledge Bundle Replication Cycle
• Cycle triggered by the search head, composed of target peers and bundle
• Composed of peers receiving the full bundle as well as the delta bundle
• One cascade plan per bundle type

REST: /services/replication/cascading/plans

Page 14: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Cascading Plan Generation
Site and Topology Aware

Calculate Depth

Calculate Width

Group Classification

Peer Selection

Topology Assignment

[Diagram: logical overlay tree. Search Head at the root; Indexer A (Site 1) and Indexer D (Site 2) on the first level; Indexer B and Indexer C (Site 1) and Indexer E and Indexer F (Site 2) on the second level]

Depth = 2
Width = 2
Groups: Site 1 and Site 2
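
The overlay tree in the example can be reproduced with a small sketch. This is not the plan generator that ships with Splunk. It assumes, for illustration only, that peers are grouped by site, that the search head sends to the first peer of each site, and that each sender then forwards to at most `width` peers in its own site. The function name and the "search_head" key are hypothetical.

```python
from collections import defaultdict, deque


def build_cascade_plan(peers, width):
    """Return {sender: [receivers]} for a list of (peer_name, site) pairs."""
    if width < 1:
        raise ValueError("width must be at least 1")

    # Group classification: bucket peers by site, keeping input order.
    groups = defaultdict(list)
    for name, site in peers:
        groups[site].append(name)

    plan = defaultdict(list)
    for site in sorted(groups):
        members = groups[site]
        root = members[0]
        plan["search_head"].append(root)  # one entry point per site (assumption)

        # Topology assignment: breadth-first fill, `width` receivers per sender.
        senders = deque([root])
        remaining = deque(members[1:])
        while remaining:
            sender = senders.popleft()
            for _ in range(width):
                if not remaining:
                    break
                receiver = remaining.popleft()
                plan[sender].append(receiver)
                senders.append(receiver)
    return dict(plan)


if __name__ == "__main__":
    example = [("A", "Site 1"), ("B", "Site 1"), ("C", "Site 1"),
               ("D", "Site 2"), ("E", "Site 2"), ("F", "Site 2")]
    print(build_cascade_plan(example, width=2))
```

Keeping each subtree inside one site means the bundle crosses the WAN once per site rather than once per peer, which is the motivation the slide gives for being site aware.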

Page 15: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Fault Tolerance
What if things go wrong?

Cascading replication policy built with high resiliency and fault-tolerant design

State about all peers maintained on the Search Head

Bundle state informed through periodic heartbeat from Search Head to Indexers

Search Head responsible for attempting retry

Bundle replication activity blocked if an active replication cycle is in progress
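
Two of the points above, search-head-driven retries and blocking a new cycle while one is active, can be illustrated with a toy coordinator. Everything here is hypothetical: the class, its state names, and the retry count are assumptions made for the sketch, and the heartbeat mechanism is not modelled.

```python
class ReplicationCoordinator:
    """Toy search-head-side bookkeeping for one replication cycle at a time."""

    def __init__(self, peers, max_retries=2):
        self.state = {peer: "pending" for peer in peers}
        self.max_retries = max_retries
        self.cycle_active = False

    def run_cycle(self, send):
        """Call send(peer) -> bool for every peer, retrying failures."""
        if self.cycle_active:
            # A new cycle is refused while another one is still running.
            raise RuntimeError("replication cycle already in progress")
        self.cycle_active = True
        try:
            for peer in self.state:
                for _attempt in range(1 + self.max_retries):
                    if send(peer):
                        self.state[peer] = "ok"
                        break
                else:
                    # All attempts failed; record it so the peer can be reported.
                    self.state[peer] = "failed"
        finally:
            self.cycle_active = False
        return dict(self.state)
```

The coordinator keeps all per-peer state in one place, mirroring the slide's statement that state about all peers is maintained on the Search Head.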

Page 16: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Fault Tolerance
What all could possibly go wrong? Like literally…

Topology Changes
• Peer addition
• Peer deletion
• Peer restarted

Bundle Failures

Bundle Stuck

Duplicate/Late Delivery

[Diagram: cascade tree with the Search Head (SH) at the root and six Search Peers (SP) below it]

Page 17: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Performance Analysis
Let's come to the numbers…

61%  Average reduction in use of WAN bandwidth
67%  Average improvement in replication time
75%  Average reduction in CPU time on Search Head

Page 18: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Performance Analysis
Replication Time with Bundle Indexing

Bundle Size = 50MB
• 120 indexers: 79% faster
• 1000 indexers: 97% faster

Bundle Size = 1GB
• 120 indexers: 36% faster
• 1000 indexers: 85% faster

Page 19: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Performance Analysis
Replication Time Without Bundle Indexing

If bundle indexing (lookups) is excluded, we get even better performance with cascading bundle replication

For example, with Bundle Size = 1GB:
• replication time with indexing in cascading mode: 37% faster
• replication time without indexing in cascading mode: 70% faster

Page 20: Fast and Scalable Knowledge Bundle Replication in Splunk ...

System Metrics
Resource Usage Analysis

Search Head Resource Usage

SH Metrics     Classic          Cascading        Comparison
CPU            300%~500%        100%             Cascading policy consumes lower CPU
Memory         300MB            300MB            Similar
IO Read        0                0                Similar
IO Write       200~2000         200~2000         Similar
Network Recv   0                0                Similar
Network Sent   40MBps~50MBps    10MBps           Cascading policy consumes lower network bandwidth

Search Peer Resource Usage

SP Metrics     Classic          Cascading        Comparison
CPU            100%             100%~400%        Cascading policy consumes more CPU
Memory         300MB~2000MB     300MB~2000MB     Similar
IO Read        0                0                Similar
IO Write       500~2500         500~3600         Similar
Network Recv   10MBps           10MBps           Similar
Network Sent   0                10MBps~30MBps    Cascading policy consumes higher network bandwidth

Page 21: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Configuration
Deploying in production

distsearch.conf
• Applicable on: Search Head
• Requires restart: Yes
• Preferred mode of deployment in Search Head Cluster: Deployer

[replicationSettings]
replicationPolicy = cascading

server.conf
• Applicable on: Search Peer
• Requires restart: Yes
• Preferred mode of deployment in Indexer Cluster: Cluster Master Bundle Push

[cascading_replication]
pass4SymmKey = <secret>
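
Before rolling the change out, it can help to confirm that the stanza actually parses the way you expect. The sketch below reads a distsearch.conf fragment with Python's standard configparser and reports the configured policy. It assumes the fragment is plain INI, which holds for the two lines shown above but not necessarily for every Splunk .conf file. The function name is hypothetical.

```python
import configparser


def replication_policy(conf_text):
    """Return the replicationPolicy value from a distsearch.conf fragment, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep camelCase keys; configparser lowercases by default
    parser.read_string(conf_text)
    return parser.get("replicationSettings", "replicationPolicy", fallback=None)


if __name__ == "__main__":
    sample = "[replicationSettings]\nreplicationPolicy = cascading\n"
    print(replication_policy(sample))
```

A None result means the stanza or key is absent from the fragment, so the search head would use whatever default policy its version defines.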

Page 22: Fast and Scalable Knowledge Bundle Replication in Splunk ...

REST & CLI
Feature Visibility

Bundle Replication Configuration
• REST: /services/search/distributed/bundle/replication/config
• CLI: ./splunk show bundle-replication-config

Bundle Replication Status
• REST: /services/search/distributed/bundle/replication/cycles
• CLI: ./splunk show bundle-replication-status
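
The two REST paths can also be polled from a script. The following is a minimal sketch using only the Python standard library. The host, the management port 8089, basic authentication, and the output_mode=json query parameter are assumptions about a typical deployment rather than something stated in the slides, and the helper names are hypothetical. The shape of the returned JSON is not shown here because the slides do not describe it.

```python
import base64
import json
import urllib.request
from urllib.parse import urlencode

# Paths taken from the slide; the short keys are just labels for this sketch.
ENDPOINTS = {
    "config": "/services/search/distributed/bundle/replication/config",
    "cycles": "/services/search/distributed/bundle/replication/cycles",
}


def endpoint_url(name, host="localhost", port=8089):
    """Build the full URL for one of the bundle replication endpoints."""
    query = urlencode({"output_mode": "json"})
    return f"https://{host}:{port}{ENDPOINTS[name]}?{query}"


def fetch(name, user, password, host="localhost", port=8089):
    """GET the endpoint with basic auth and return the decoded JSON body."""
    request = urllib.request.Request(endpoint_url(name, host, port))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling fetch("cycles", user, password) against a search head would return the replication status. Certificate verification is left at the library default, so an instance with a self-signed certificate would need its CA added to the trust store first.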

Page 23: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Monitoring
How do I observe the feature?

New Dashboards in Splunk Distributed Monitoring Console (DMC)
• Configure in Distributed Mode
• Click on Search->Knowledge Bundle Replication

New Metrics in metrics.log
• bundle_metadata
• cycle_dispatch
• peer_dispatch

Telemetry collected for better diagnosis and supportability

Page 24: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Key Takeaways
3 things you should take home with you…

1. Improved search experience with highly performant (avg 67% faster) and fault-tolerant knowledge bundle replication in cascading mode

2. Simple turn-key configuration with automatic site and deployment awareness

3. Finer-grained monitoring and visibility into knowledge bundle replication than ever before

Page 25: Fast and Scalable Knowledge Bundle Replication in Splunk ...

DEMO

Page 26: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Q&A

Page 27: Fast and Scalable Knowledge Bundle Replication in Splunk ...

Thank You!

RATE THIS SESSION
Go to the .conf19 mobile app to