Advancing the Elastic Stack - It’s more than just log aggregation!
Analyzing Data with the ELK Stack

Jan 21, 2018

DataScienceMD
Page 1: Analyzing Data with the ELK Stack

Advancing the Elastic Stack - It’s more than just log aggregation!

Page 2: Analyzing Data with the ELK Stack

Introduction

Mike Clarke, DevOps Engineer/SA

Mike Keith, Senior Software Engineer

Page 3: Analyzing Data with the ELK Stack

Agenda

● Project/Problem Overview
○ Our environment and the problem we were solving
○ Initially to solve the distributed log problem
● Elastic Stack Overview
● Kibana and Elasticsearch Demo

Page 4: Analyzing Data with the ELK Stack

Architecture Overview

● Our Environment
○ Multiple geographical regions/zones
○ Ingest processing application
○ Webservice application
■ Our webservice application logs tell us a lot about what is going on with customers sending us information.
○ Access logs for JBoss
○ Data archive application

Page 5: Analyzing Data with the ELK Stack

Architecture Overview

[Diagram components: JBoss Webservice (x4), JBoss UI (x4), RDBMS, NoSQL DB]

Page 6: Analyzing Data with the ELK Stack

Project / Problem

● Log aggregation is hard
● No historical reference, as logs age off
● Obtaining stats was painful
○ Realistically, when all your service stats are in your logs, what do you do?
● Cluster SSH only helps so much

Page 7: Analyzing Data with the ELK Stack

Obtaining stats was painful ?!?!?!

cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc

host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
5234
host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
...
host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
...
host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
20
host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
1240
host@me$: cat log | grep "someword" | awk '{print $8}' | paste -sd+ | bc
650
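For readers less familiar with the shell pipeline, here is a rough Python sketch of what it computes: keep the lines containing a word, take the eighth whitespace-separated field, and add the values up. The function name and the sample log lines are made up for illustration; the real log layout is not shown on the slide.

```python
from typing import Iterable


def sum_field(lines: Iterable[str], word: str, field: int = 8) -> float:
    """Sum one whitespace-separated field (1-based, like awk's $8)
    over every line that contains `word` (like grep for a plain word)."""
    total = 0.0
    for line in lines:
        if word not in line:
            continue
        parts = line.split()
        if len(parts) < field:
            continue  # line is too short to have the requested field
        try:
            total += float(parts[field - 1])
        except ValueError:
            continue  # field is not numeric, so it cannot be summed
    return total


# Hypothetical log lines; the eighth field holds the number to add up.
sample = [
    "2018-01-21 10:00:01 host1 INFO ws someword bytes 20",
    "2018-01-21 10:00:02 host1 INFO ws otherword bytes 99",
    "2018-01-21 10:00:03 host2 INFO ws someword bytes 30",
]

print(sum_field(sample, "someword"))
```

Like the one-liner, this only answers the question for one log file on one host, so it still has to be repeated on every server.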

Page 8: Analyzing Data with the ELK Stack

Technical Overview

● For the most part restricted to FOSS products
● Needed to be easily obtainable
● Available options
○ Graylog
○ Grafana
○ Airbrake
○ Splunk
○ Elastic Stack

Page 9: Analyzing Data with the ELK Stack

Elastic Stack (formerly ELK) Overview

Elasticsearch - Distributed, RESTful search and analytics engine

Logstash - Server-side data processing pipeline

Kibana - Powerful visualization UI

Beats - Single-purpose, lightweight data shippers

X-Pack - Powerful features which enhance the Elastic Stack

Page 10: Analyzing Data with the ELK Stack

Elastic Stack (formerly ELK) Overview

Page 11: Analyzing Data with the ELK Stack

Initial Solution - Log Aggregation

● Single node servers
● Installed Elastic Stack and began shipping all application server logs to a centralized server.
● Near real time
● Raw log message transitioned into a fielded log message
● Grok parsing (text pattern matching)
● Filters etc.

Page 12: Analyzing Data with the ELK Stack

Architecture Overview

[Diagram components: Filebeat (x8), Logstash, Elasticsearch, Kibana]

Page 13: Analyzing Data with the ELK Stack

Filebeat

filebeat.prospectors:
- input_type: log
  paths:
    - /data/logs/apache/*.log
  fields:
    type: apache
  fields_under_root: true

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["localhost:5443"]
  bulk_max_size: 1024

Page 14: Analyzing Data with the ELK Stack

Logstash - Input & Output

input {
  beats {
    port => 5443
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][type]}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
    user => "elastic"
    password => "*******"
  }
}
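The index setting in the output block produces one index per beat, type, and day. The sketch below mimics that naming in Python so the result is easier to picture. The values "filebeat" and "apache" are assumptions about what the metadata fields resolve to in this setup, and the function name is made up for illustration. The date part follows the event timestamp in UTC, which is how Logstash formats %{+YYYY.MM.dd}.

```python
from datetime import datetime, timezone


def daily_index_name(beat: str, doc_type: str, timestamp: datetime) -> str:
    """Build a '<beat>-<type>-YYYY.MM.dd' index name from an event time.

    `timestamp` must be timezone-aware; it is converted to UTC first,
    so an event late in the local day can land in the next day's index.
    """
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{beat}-{doc_type}-{day}"


# Assumed metadata values for an Apache log shipped by Filebeat.
event_time = datetime(2018, 1, 21, 23, 30, tzinfo=timezone.utc)
print(daily_index_name("filebeat", "apache", event_time))
# prints "filebeat-apache-2018.01.21"
```

Daily indices like this make it simple to age off old data by deleting whole indices rather than individual documents.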

Page 15: Analyzing Data with the ELK Stack

Kibana - Discover

Page 16: Analyzing Data with the ELK Stack

Kibana - Discover

Page 17: Analyzing Data with the ELK Stack

Logstash - Filters

filter {
  grok {
    match => { "message" => "%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code:int} %{NUMBER:bytes:int} "}
  }
  mutate {
    add_field => { "read_timestamp" => "%{@timestamp}" }
  }
  date {
    match => [ "time", "dd/MMM/YYYY:H:m:s Z" ]
    remove_field => "time"
  }
}
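To show what the grok and date steps do to a single line, here is an approximate Python equivalent using a regular expression with named groups. It is a sketch, not a faithful port: the regex is a simplified stand-in for the grok patterns, the sample access-log line is invented, and the mutate step (copying the original @timestamp) is left out.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Simplified stand-in for the grok pattern: one named group per field.
ACCESS_LINE = re.compile(
    r'(?P<remote_ip>\S+) - (?P<user_name>.*?) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<url>.*?) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<response_code>\d+) (?P<bytes>\d+)'
)


def parse_access_line(line: str) -> Optional[dict]:
    """Turn a raw access-log line into a dict of fields, or None if it
    does not match (grok would tag such an event as a parse failure)."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Same effect as the ':int' suffixes in the grok pattern.
    fields["response_code"] = int(fields["response_code"])
    fields["bytes"] = int(fields["bytes"])
    # Like the date filter: parse 'time', store it as the event
    # timestamp in UTC, and drop the original field.
    parsed = datetime.strptime(fields.pop("time"), "%d/%b/%Y:%H:%M:%S %z")
    fields["@timestamp"] = parsed.astimezone(timezone.utc).isoformat()
    return fields


# Invented sample line in the common access-log layout.
sample = ('203.0.113.7 - alice [21/Jan/2018:13:05:09 -0500] '
          '"GET /index.html HTTP/1.1" 200 5234 "-"')
print(parse_access_line(sample))
```

Once every line is a set of typed fields, questions like "total bytes per response code" become a search-time aggregation instead of a per-host shell pipeline.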

Page 18: Analyzing Data with the ELK Stack

Kibana

Page 19: Analyzing Data with the ELK Stack

Evolution of the solution

● We changed from looking at who is talking to us to what they are talking to us about.
○ We kept adding more to our logs just so we could see it in Kibana.
○ Our data was already in Avro format, which made it easy to convert to JSON.
○ Then we used the JSON codec for Logstash to input directly into Elasticsearch.
● Considered Accumulo
○ But there was just too much we had to build to get it to a usable state.
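As a small illustration of the JSON step, the sketch below serializes an already-decoded record as a single-line JSON document, the shape a JSON codec in Logstash can decode straight into event fields. The record and its field names are hypothetical, and decoding Avro itself (which needs an Avro library) is not shown.

```python
import json


def to_json_line(record: dict) -> str:
    """Serialize one record as compact, single-line JSON."""
    return json.dumps(record, separators=(",", ":"), sort_keys=True)


# Hypothetical record, standing in for data decoded from Avro.
record = {"customer": "acme", "message_type": "status", "payload_bytes": 5234}
line = to_json_line(record)
print(line)
```

Because the fields arrive already named and typed, no grok pattern is needed for this data.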

Page 20: Analyzing Data with the ELK Stack

Kibana Twitter Demo

● Let’s take a look at some interesting things you can see in Kibana
● Counting very easily across different fields in your data (makes aggregations and histograms very easy)
● Data changes over time; sometimes you need to go back and update something you already stored
○ State changes or updates of some kind to the original document.

Page 21: Analyzing Data with the ELK Stack

Twitter Data Demo

Basic twitter JSON:

{ screen_name, text, retweeted_status.user.screen_name, retweeted_status.retweet_count, retweeted_status.text, ... }
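Counting across a field is what an Elasticsearch terms aggregation does. The sketch below reproduces the idea locally in Python on a few invented tweets, then shows a request body of the kind that could ask Elasticsearch for the same counts. The tweets, the aggregation name, and the helper names are made up, and depending on the index mapping the field may need a keyword sub-field.

```python
from collections import Counter


def get_path(doc: dict, dotted: str):
    """Follow a dotted path such as 'a.b.c' through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def count_by_field(docs: list, dotted: str, size: int = 10) -> list:
    """Count documents per distinct value of a field, most common first."""
    counts = Counter()
    for doc in docs:
        value = get_path(doc, dotted)
        if value is not None:
            counts[value] += 1
    return counts.most_common(size)


# Invented tweets using the field names from the slide.
tweets = [
    {"screen_name": "ann", "text": "RT hello",
     "retweeted_status": {"user": {"screen_name": "elastic"},
                          "retweet_count": 12, "text": "hello"}},
    {"screen_name": "bob", "text": "RT hello",
     "retweeted_status": {"user": {"screen_name": "elastic"},
                          "retweet_count": 13, "text": "hello"}},
    {"screen_name": "cat", "text": "RT charts",
     "retweeted_status": {"user": {"screen_name": "kibana"},
                          "retweet_count": 3, "text": "charts"}},
    {"screen_name": "dan", "text": "an original tweet"},
]

print(count_by_field(tweets, "retweeted_status.user.screen_name"))

# A terms aggregation request body asking for the same kind of counts.
terms_query = {
    "size": 0,  # return only the aggregation, not the matching documents
    "aggs": {
        "top_retweeted": {
            "terms": {"field": "retweeted_status.user.screen_name", "size": 10}
        }
    },
}
```

In Kibana the same aggregation sits behind a bar or pie chart split by that field, so no query has to be written by hand.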

Page 22: Analyzing Data with the ELK Stack

Data Storage Elastic Stack Architecture

[Diagram components: Filebeat (x2), Logstash Node 1 ... Logstash Node 4, Elasticsearch Client Node, Elasticsearch Master Node 1, Elasticsearch Master Node 2, ..., Elasticsearch Data Node 1 ... Elasticsearch Data Node 20, Kibana]

Page 23: Analyzing Data with the ELK Stack

Conclusion & Takeaways

● Low Barrier to Entry
● Quickly Search Across Data
● Horizontally Scalable
● Easily Visualize Data

Page 24: Analyzing Data with the ELK Stack

About Clarity Business Solutions

● We are a team of Software and System Engineers
● Customer focused and mission driven
● For more about us, please visit: www.claritybizsol.com
● Follow us: @claritybizsol