Fluentd vs. Logstash Masaki Matsushita NTT Communications

Fluentd vs. Logstash for OpenStack Log Management

Jan 14, 2017

Page 1: Fluentd vs. Logstash for OpenStack Log Management

Fluentd vs. Logstash
Masaki Matsushita

NTT Communications

Page 2: Fluentd vs. Logstash for OpenStack Log Management

About Me
● Masaki MATSUSHITA
● Software Engineer at NTT Communications
○ We are providing Internet access here!
● GitHub: mmasaki, Twitter: @_mmasaki
● 16 commits in Liberty
○ Trove, oslo_log, oslo_config
● CRuby committer
○ 100+ commits for performance improvement

Page 3: Fluentd vs. Logstash for OpenStack Log Management

What are Log Collectors?
● Provide a pluggable and unified logging layer
[Figure: Without Log Collectors vs. With Log Collectors. Images from http://fluentd.org/]

Page 4: Fluentd vs. Logstash for OpenStack Log Management

Input, Filter and Output
● They are implemented as plugins
● Can be replaced easily
[Diagram: Log files and components → Input plugins (tail, syslog) → Filter plugins (grep, hostname) → Output plugins (InfluxDB, Elasticsearch)]
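The input, filter, and output stages can be pictured as interchangeable functions. Below is a minimal Python sketch of that idea. All names in it (`tail_input`, `grep_filter`, `add_hostname`, `ListOutput`, `run_pipeline`) are hypothetical and are not part of Fluentd or Logstash; it only illustrates how one stage can be replaced without touching the others.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]
Filter = Callable[[Iterable[Event]], Iterator[Event]]


def tail_input(lines: Iterable[str]) -> Iterator[Event]:
    """Hypothetical input stage: turn raw lines into events."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def grep_filter(pattern: str) -> Filter:
    """Hypothetical filter stage: keep only events whose message contains `pattern`."""
    def apply(events: Iterable[Event]) -> Iterator[Event]:
        return (e for e in events if pattern in e["message"])
    return apply


def add_hostname(hostname: str) -> Filter:
    """Hypothetical filter stage: attach a hostname field to every event."""
    def apply(events: Iterable[Event]) -> Iterator[Event]:
        return ({**e, "hostname": hostname} for e in events)
    return apply


class ListOutput:
    """Hypothetical output stage: collect events in memory instead of a real backend."""
    def __init__(self) -> None:
        self.events: List[Event] = []

    def write(self, events: Iterable[Event]) -> None:
        self.events.extend(events)


def run_pipeline(source: Iterable[Event], filters: List[Filter], outputs: List[ListOutput]) -> None:
    """Chain the filters over the source, then hand the result to every output."""
    events: Iterable[Event] = source
    for apply_filter in filters:
        events = apply_filter(events)
    materialized = list(events)  # materialize once so each output sees all events
    for output in outputs:
        output.write(materialized)
```

Swapping `ListOutput` for another object with a `write` method, or reordering the filter list, changes the pipeline without changing any other stage.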

Page 5: Fluentd vs. Logstash for OpenStack Log Management

Two Popular Log Collectors
● Fluentd
○ Written in CRuby
○ Used in Kubernetes
○ Maintained by Treasure Data Inc.
● Logstash
○ Written in JRuby
○ Maintained by elastic.co
● They have similar features
● Which one is better for you?

Page 6: Fluentd vs. Logstash for OpenStack Log Management

Agenda
● Comparisons
○ Configuration
○ Supported Plugins
○ Performance
○ Transport Protocol
● Integrate OpenStack with Fluentd/Logstash
○ Considering High Availability

Page 7: Fluentd vs. Logstash for OpenStack Log Management

Configuration: Fluentd
● Every input is tagged
● Logs are routed by tag
[Diagram: nova-api.log (tag: openstack.nova) and cinder-api.log (tag: openstack.cinder) pass through Filter/Route to <match openstack.nova> and <match openstack.cinder>]

Page 8: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Configuration: Input
Example of tailing the nova-api log:
<source>
  @type tail
  path /var/log/nova/nova-api.log
  tag openstack.nova
</source>
● Every input is tagged

Page 9: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Configuration: Output
<match openstack.nova> # nova related logs
  @type elasticsearch
  host example.com
</match>
<match openstack.*> # all other OpenStack related logs
  @type influxdb
  # …
</match>
Routed by tag (the first matching block takes priority)
Wildcards can be used
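The first-match routing described on this slide can be sketched in Python. This is a simplified illustration, not Fluentd's implementation: the function names and the `RULES` table are hypothetical, and only the `*` wildcard (standing for exactly one dot-separated tag part) is modeled.

```python
from typing import List, Optional, Tuple


def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified matching: `*` stands for exactly one dot-separated tag part."""
    pattern_parts = pattern.split(".")
    tag_parts = tag.split(".")
    if len(pattern_parts) != len(tag_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern_parts, tag_parts))


def route(tag: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the output of the first rule whose pattern matches the tag."""
    for pattern, output in rules:
        if tag_matches(pattern, tag):
            return output
    return None  # no rule matched


# Hypothetical rule table mirroring the two <match> blocks on the slide.
RULES = [
    ("openstack.nova", "elasticsearch"),
    ("openstack.*", "influxdb"),
]
```

Because rules are checked in order, `openstack.nova` never reaches the wildcard rule, which is why the more specific block has to come first.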

Page 10: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Configuration: Copy
<match openstack.*>
  @type copy
  <store>
    @type influxdb
  </store>
  <store>
    @type elasticsearch
  </store>
</match>
The copy plugin enables multiple outputs for a tag.
[Diagram: tag openstack.* → copied output]

Page 11: Fluentd vs. Logstash for OpenStack Log Management

Logstash Configuration
● No tags
● All inputs will be aggregated
● Logs will be scattered to outputs
[Diagram: nova-api.log and cinder-api.log → Filter/Aggregate → aggregated logs]

Page 12: Fluentd vs. Logstash for OpenStack Log Management

Logstash Configuration
input {
  file { path => "/var/log/nova/*.log" }
  file { path => "/var/log/cinder/*.log" }
}
output {
  elasticsearch { hosts => ["example.com"] }
  influxdb { host => "example.com" ... }
}

Page 13: Fluentd vs. Logstash for OpenStack Log Management

Case 1: Separated Streams
● Handle multiple streams separately
[Diagram: Input1 → Output1, Input2 → Output2, Input3 → Output3]

Page 14: Fluentd vs. Logstash for OpenStack Log Management

Case 1: Separated Streams
Fluentd: Simple matching by tag
<match input.input1>
  @type output1
</match>
<match input.input2>
  @type output2
</match>
<match input.input3>
  @type output3
</match>
Logstash: Conditional outputs
output {
  if [type] == "input1" {
    output1 {}
  } else if [type] == "input2" {
    output2 {}
  } else if [type] == "input3" {
    output3 {}
  }
}
The aggregated logs need to be split again.

Page 15: Fluentd vs. Logstash for OpenStack Log Management

Case 2: Aggregated Streams
● Streams will be aggregated and scattered
[Diagram: Input1, Input2 and Input3 are aggregated, then scattered to Output1, Output2 and Output3]

Page 16: Fluentd vs. Logstash for OpenStack Log Management

Case 2: Aggregated Streams
Fluentd: The copy plugin is needed
<match input.*>
  @type copy
  <store>
    @type output1
  </store>
  <store>
    @type output2
  </store>
  <store>
    @type output3
  </store>
</match>
Logstash: Quite simple
output {
  output1 {}
  output2 {}
  output3 {}
}

Page 17: Fluentd vs. Logstash for OpenStack Log Management

Configuration
● Fluentd
○ Routed by simple tag matching
○ Suited to handling log streams separately
● Logstash
○ Logs are aggregated
○ Suited to handling logs in gather-scatter style

Page 18: Fluentd vs. Logstash for OpenStack Log Management

Plugins
● Both provide many plugins
○ Fluentd: 300+, Logstash: 200+
● Popular plugins are bundled with Logstash
○ They are maintained by the Logstash project
● Fluentd contains only minimal plugins
○ Most plugins are maintained by individuals
● Plugins can be installed easily with a single command

Page 19: Fluentd vs. Logstash for OpenStack Log Management

Performance
● Depends on circumstances
● More than enough for OpenStack logs
○ Both can handle 10000+ logs/s
● Applying heavy filters is not a good idea
● CRuby is slow because of GVL?
○ GVL: Global VM (Interpreter) Lock
○ It’s not true for IO-bound loads

Page 20: Fluentd vs. Logstash for OpenStack Log Management

GVL on IO-bound loads
● IO operations can be performed in parallel
[Diagram: timelines for Thread 1 and Thread 2. Labels: Idle; User Space: Ruby Code Execution; Kernel Space: Actual Read/Write; GVL Released/Acquired; IO operations in parallel]
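The same effect can be observed in CPython, whose global interpreter lock is likewise released while a thread waits on blocking IO. The sketch below is an analogy in Python, not a measurement of CRuby or Fluentd: `time.sleep` stands in for a blocking read or write, and the function names are hypothetical.

```python
import threading
import time


def blocking_io(duration: float) -> None:
    """Stand-in for a blocking read/write; the interpreter lock is released while waiting."""
    time.sleep(duration)


def run_in_threads(n_threads: int, duration: float) -> float:
    """Run `n_threads` blocking operations concurrently and return the wall-clock time."""
    threads = [
        threading.Thread(target=blocking_io, args=(duration,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

With four threads each waiting 0.2 seconds, the total wall-clock time stays close to 0.2 seconds rather than 0.8, because the waits overlap even though only one thread can run interpreter code at a time.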

Page 21: Fluentd vs. Logstash for OpenStack Log Management

Transport Protocol
● Both collectors have their own transport protocol
○ Failure Detection and Fallback
● Logstash: Lumberjack protocol
○ Active-Standby only
● Fluentd: forward protocol
○ Active-Active (Load Balancing), Active-Standby
○ Some additional features

Page 22: Fluentd vs. Logstash for OpenStack Log Management

Logstash Transport: lumberjack
● Active-Standby
lumberjack { # config@source
  hosts => [
    "primary",
    "secondary"
  ]
  port => 1234
  ssl_certificate => …
}
The secondary is used when the primary fails.
[Diagram: source → primary (Fail); source → secondary (Fallback)]
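The active-standby behavior amounts to trying the hosts in order and falling back when one fails. The following Python sketch illustrates that logic only; `send_with_fallback` and its `send` callback are hypothetical names, and it does not implement the Lumberjack protocol itself.

```python
from typing import Callable, Optional, Sequence


def send_with_fallback(hosts: Sequence[str], send: Callable[[str], None]) -> str:
    """Try hosts in priority order and return the one that accepted the data.

    `send` is a caller-supplied function that raises ConnectionError on failure.
    """
    last_error: Optional[ConnectionError] = None
    for host in hosts:
        try:
            send(host)
            return host
        except ConnectionError as exc:
            last_error = exc  # remember the failure and fall back to the next host
    raise ConnectionError("all hosts failed") from last_error
```

The standby host only sees traffic when every host listed before it has failed, which is the difference from a load-balanced setup.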

Page 23: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Transport: forward
● Active-Active (Load Balancing)
<match openstack.*>
  @type forward
  <server>
    host dest1
  </server>
  <server>
    host dest2
  </server>
</match>
[Diagram: source → dest1 and dest2, equally balanced outputs]

Page 24: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Transport: forward
● Active-Standby
<match openstack.*>
  @type forward
  <server>
    host primary
  </server>
  <server>
    host secondary
    standby
  </server>
</match>
[Diagram: source → primary (Fail); source → secondary (Fallback)]

Page 25: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Transport: forward
● Weighted Load Balancing
<match openstack.*>
  @type forward
  <server>
    host dest1
    weight 60
  </server>
  <server>
    host dest2
    weight 40
  </server>
</match>
[Diagram: source → dest1 (60%), source → dest2 (40%)]
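To make the 60/40 split concrete, here is a small Python sketch of one way to distribute events by weight. It uses a smooth weighted round-robin scheme chosen for illustration; this is an assumption and not necessarily how Fluentd's forward plugin selects servers. The function name is hypothetical.

```python
from typing import Dict, List


def weighted_schedule(weights: Dict[str, int], n_events: int) -> List[str]:
    """Pick a destination for each event using smooth weighted round-robin.

    Each round, every host's running score grows by its weight; the host with
    the highest score is chosen and its score is reduced by the total weight.
    """
    current = {host: 0 for host in weights}
    total = sum(weights.values())
    picks: List[str] = []
    for _ in range(n_events):
        for host, weight in weights.items():
            current[host] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks
```

With weights of 60 and 40, any run of 100 consecutive events sends 60 to `dest1` and 40 to `dest2`, and the two destinations are interleaved rather than served in long bursts.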

Page 26: Fluentd vs. Logstash for OpenStack Log Management

Fluentd Transport: forward
● At-least-once semantics (may affect performance)
<match openstack.*>
  @type forward
  require_ack_response
  <server>
    host dest
  </server>
</match>
Logs are re-transmitted until an ACK is received.
[Diagram: source sends logs to dest; dest replies with an ACK]
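Retransmitting until an ACK arrives guarantees delivery but can deliver the same chunk more than once. The Python sketch below illustrates this with a simulated receiver; `send_until_acked` and `FlakyReceiver` are hypothetical names and the code does not speak the real forward protocol.

```python
from typing import Callable, List


def send_until_acked(chunk: str, transmit: Callable[[str], bool], max_attempts: int = 5) -> int:
    """Retransmit `chunk` until `transmit` reports an ACK; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        if transmit(chunk):
            return attempt
    raise TimeoutError(f"no ACK after {max_attempts} attempts")


class FlakyReceiver:
    """Hypothetical receiver that stores every chunk but loses the first `lost_acks` ACKs."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.received: List[str] = []

    def __call__(self, chunk: str) -> bool:
        self.received.append(chunk)  # the data arrives even when the ACK is lost
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return False
        return True
```

If two ACKs are lost, the sender transmits three times and the receiver ends up holding three copies of the chunk, so downstream storage has to tolerate duplicates.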

Page 27: Fluentd vs. Logstash for OpenStack Log Management

Transport Protocol

● Both can be configured in Active-Standby mode
● Fluentd has additional features:
○ Active-Active Mode (Load Balancing)
○ At-least-once Semantics
○ Weighted Load Balancing

Page 28: Fluentd vs. Logstash for OpenStack Log Management

Forwarders
● Fluentd and Logstash each have their own “forwarder”
○ Lightweight implementation written in Golang
○ Low memory consumption
○ One binary: fewer dependencies and easy to install
[Diagram: on each node, the Forwarder tails log files and sends them to the Log Aggregator over the Forward/Lumberjack Protocol]

Page 29: Fluentd vs. Logstash for OpenStack Log Management

Forwarders: Config Example
fluentd-forwarder:
[fluentd-forwarder]
to = fluent://fluentd1:24224
to = fluent://fluentd2:24224
Always send logs to both servers.
logstash-forwarder:
"network": {
  "servers": [ "logstash1:5043", "logstash2:5043" ]
}
Pick one active server and send logs only to it. Fall back to another server on failure.

Page 30: Fluentd vs. Logstash for OpenStack Log Management

Integration with OpenStack

● Tail log files with a local Fluentd/Logstash
○ Must parse many forms of log files
● Rsyslog
○ Installed by default in most distributions
○ Can receive logs in JSON format
● Direct output from oslo_log
○ oslo_log: logging library used by OpenStack components
○ Logging without any parsing

Page 31: Fluentd vs. Logstash for OpenStack Log Management

Tail Log Files
[Diagram: OpenStack nodes tail log files and send them over the Forward Protocol to the Log Aggregators dest1 and dest2]

Page 32: Fluentd vs. Logstash for OpenStack Log Management

Tail Log Files
● Must handle many log files…

syslog

kern.log

apache2/access.log

apache2/error.log

keystone/keystone-all.log

keystone/keystone-manage.log

keystone/keystone.log

cinder/cinder-api.log

cinder/cinder-scheduler.log

neutron/neutron-server.log

neutron/neutron-server.log

nova/nova-api.log

nova/nova-conductor.log

nova/nova-consoleauth.log

nova/nova-manage.log

nova/nova-novncproxy.log

nova/nova-scheduler.log

mysql/error.log

mysql/mysql-slow.log

mysql.log

mysql.err

nova/nova-compute.log

nova/nova-manage.log
...

Page 33: Fluentd vs. Logstash for OpenStack Log Management

Tail Log Files
● But you can use wildcards
Fluentd:
<source>
  @type tail
  path /var/log/nova/*.log
  tag openstack.nova
</source>
Logstash:
input {
  file {
    path => ["/var/log/nova/*.log"]
  }
}

Page 34: Fluentd vs. Logstash for OpenStack Log Management

Parse Text Log
● Welcome to regular expression hell!
<source>
  @type tail # or syslog
  path /var/log/nova/nova-api.log
  format /^(?<asctime>.+) (?<process>\d+) (?<loglevel>\w+) (?<objname>\S+)( \[(-|(?<request_id>.+?) (?<user_identity>.+))\])? ((?<remote>\S*) "(?<method>\S+) (?<path>[^\"]*) \S*?" status: (?<code>\d*) len: (?<size>\d*) time: (?<res_time>\S+)|(?<message>.*))/
</source>
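To show what such a pattern does, here is a simplified Python sketch that parses one text log line into named fields. The pattern and the sample line are assumptions made for illustration; they are deliberately simpler than the regular expression on the slide and do not cover the request-specific fields (method, path, status and so on). Note that Python writes named groups as `(?P<name>...)`.

```python
import re
from typing import Dict, Optional

# Simplified, hypothetical pattern for an oslo-style text log line:
# timestamp, process id, level, logger name, bracketed context, message.
LOG_PATTERN = re.compile(
    r"^(?P<asctime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<process>\d+) "
    r"(?P<loglevel>[A-Z]+) "
    r"(?P<objname>\S+) "
    r"\[(?P<context>[^\]]*)\] "
    r"(?P<message>.*)$"
)

# Hypothetical sample line, written for this example.
SAMPLE = (
    "2015-10-15 10:09:12.255 4132 INFO nova.osapi_compute.wsgi.server "
    "[-] (4132) wsgi starting up on http://0.0.0.0:8774/"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None
```

Every log format that deviates from the pattern (tracebacks, access logs, MySQL logs) needs its own expression, which is where the maintenance burden comes from.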

Page 35: Fluentd vs. Logstash for OpenStack Log Management

Rsyslog
[Diagram: on OpenStack nodes, components log via /dev/log to rsyslog, which sends the logs to the Log Aggregators over the Syslog Protocol (TCP or UDP)]

Page 36: Fluentd vs. Logstash for OpenStack Log Management

Rsyslog: Logging.conf
● Logging Configuration in detail
● Handler: Syslog, Formatter: JSON
# /etc/{nova,cinder…}/logging.conf
[handler_syslog]
class = handlers.SysLogHandler
args = ('/dev/log', handlers.SysLogHandler.LOG_LOCAL1)
formatter = json
[formatter_json]
class = oslo_log.formatters.JSONFormatter

Page 37: Fluentd vs. Logstash for OpenStack Log Management

Example Output: JSONFormatter
{

"levelname": "INFO",

"funcname": "start",

"message": "Starting conductor node (version 13.0.0)",

"msg": "Starting %(topic)s node (version %(version)s)",

"asctime": "2015-09-29 18:29:57,690",

"relative_created": 2454.8499584198,

"process": 25204,

"created": 1443518997.690932,

"thread": 140119466896752,

"name": "nova.service",

"process_name": "MainProcess",

"thread_name": "GreenThread-1",

...
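The idea behind a JSON formatter can be sketched with the Python standard library alone. The class below is a hypothetical stand-in written for illustration; it is not the implementation of `oslo_log.formatters.JSONFormatter` and emits only a few of the fields shown in the output above.

```python
import io
import json
import logging


class MinimalJSONFormatter(logging.Formatter):
    """Hypothetical stand-in: serialize selected LogRecord fields as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "levelname": record.levelname,
            "funcname": record.funcName,
            "message": record.getMessage(),  # template with arguments applied
            "msg": record.msg,               # raw template
            "process": record.process,
            "created": record.created,
            "name": record.name,
        }
        return json.dumps(payload)


def build_logger(stream: io.StringIO) -> logging.Logger:
    """Create a logger (hypothetical name) that writes JSON lines to `stream`."""
    logger = logging.getLogger("nova.service.example")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(MinimalJSONFormatter())
    logger.addHandler(handler)
    return logger


def demo() -> dict:
    """Log one message and return the decoded JSON record."""
    buffer = io.StringIO()
    logger = build_logger(buffer)
    logger.info(
        "Starting %(topic)s node (version %(version)s)",
        {"topic": "conductor", "version": "13.0.0"},
    )
    return json.loads(buffer.getvalue())
```

Because each record is already structured, the aggregator can decode it with a JSON parser instead of a hand-written regular expression.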

Page 38: Fluentd vs. Logstash for OpenStack Log Management

Syslog Facilities

● Assign the local0..7 facilities to components
● Logs are tagged like “syslog.local0” in Fluentd
● Example:
○ local0: Keystone
○ local1: Nova
○ local2: Cinder
○ local3: Neutron
○ local4: Glance

Page 39: Fluentd vs. Logstash for OpenStack Log Management

Rsyslog: Config@OpenStack nodes

● Active-Standby Configuration

# /etc/rsyslog.d/rsyslog.conf

user.* @@primary:5140

$ActionExecOnlyWhenPreviousIsSuspended on

& @@secondary:5140

Page 40: Fluentd vs. Logstash for OpenStack Log Management

Rsyslog: Config@Aggregator
Fluentd:
<source>
  @type syslog
  port 5140
  protocol_type tcp
  format json
  tag syslog
</source>
Specify TCP or UDP
Logstash:
input {
  syslog {
    codec => json
    port => 5140
  }
}
Listen on both TCP and UDP


Page 42: Fluentd vs. Logstash for OpenStack Log Management

Direct output from oslo_log
[Diagram: OpenStack nodes send logs via FluentHandler to a local Fluentd for buffering/load balancing (Logstash can also be used), which forwards them over the Forward Protocol to the Log Aggregators]

Page 43: Fluentd vs. Logstash for OpenStack Log Management

Direct output from oslo_log
# logging.conf:
[handler_fluent]
# FluentHandler is provided by fluent-logger
class = fluent.handler.FluentHandler
formatter = fluent
args = ('openstack.nova', 'localhost', 24224)
[formatter_fluent]
# FluentFormatter is proposed in our Blueprint
class = fluent.handler.FluentFormatter
Format logs as Dictionary

Page 44: Fluentd vs. Logstash for OpenStack Log Management

Our BP in oslo_log: FluentFormatter
{

"hostname":"allinone-vivid",

"extra":{"project":"unknown","version":"unknown"},

"process_name":"MainProcess",

"module":"wsgi",

"message":"(4132) wsgi starting up on http://0.0.0.0:8774/",

"filename":"wsgi.py",

"name":"nova.osapi_compute.wsgi.server",

"level":"INFO",

"traceback":null,

"funcname":"server",

"time":"2015-10-15 10:09:12,255"

}

No need to parse!

Page 45: Fluentd vs. Logstash for OpenStack Log Management

Conclusion
● Log Handling
○ Fluentd: Logs are distinguished by tag
○ Logstash: No tags. Logs are aggregated
● Transport Protocol
○ Both support active-standby mode
○ Fluentd supports some additional features
■ Client-side load balancing (Active-Active)
■ At-least-once semantics
■ Weighted load balancing

Page 46: Fluentd vs. Logstash for OpenStack Log Management

Conclusion
● Integration with OpenStack
○ Tail log files: regular expression hell
○ Rsyslog: No agents are needed
○ Direct output from oslo_log w/o any parsing
○ Review is welcome for our Blueprint (oslo_log: fluent-formatter)

Page 47: Fluentd vs. Logstash for OpenStack Log Management

Thank you!

Please visit our booth!

Robot Racing over WebRTC! →