Fluentd Overview, Now and Then
Satoshi Tagomori (@tagomoris)
Fluentd meetup in Matsue #fluentdmeetup
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
Fluentd overview
What’s Fluentd?
Simple core + Variety of plugins
Buffering, HA (failover), Secondary output, etc.
Like syslogd, but in a streaming manner
AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
Log collection with traditional logrotate + rsync
[Diagram: log files written by applications on Servers A, B, and C are copied to a log server]
Hard to analyze!! Complex text parsers
High latency!! Must wait for a day
Streaming way with Fluentd
[Diagram: log files written by applications on Servers A, B, and C are streamed to a log server]
Low latency! Seconds or minutes
Easy to analyze!! Parsed and formatted
M x N problem for data integration
[Diagram labels: LOG sources connected through scripts to parse data, a cron job for loading, a filtering script, a syslog script, a Tweet-fetching script, aggregation scripts, and an rsync server]
A solution: centralized log collection service
M + N
Fluentd Architecture
Internal Architecture (simplified)
Input (Plugin) -> Filter (Plugin) -> Buffer (Plugin) -> Output (Plugin)
Time: 2012-02-04 01:33:51
Tag: myapp.buylog
Record: {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}
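The Time / Tag / Record triple above can be modelled in a few lines. This is only an illustrative sketch: the `Event` type and `make_event` helper are hypothetical names, not part of Fluentd, and reading the slide's timestamp as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Dict, NamedTuple


class Event(NamedTuple):
    tag: str                # dot-separated routing key, e.g. "myapp.buylog"
    time: int               # seconds since the Unix epoch
    record: Dict[str, Any]  # JSON-like payload


def make_event(tag: str, timestamp: str, record: Dict[str, Any]) -> Event:
    # Assumption: the timestamp shown on the slide is interpreted as UTC.
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    epoch = int(parsed.replace(tzinfo=timezone.utc).timestamp())
    return Event(tag, epoch, record)


event = make_event(
    "myapp.buylog",
    "2012-02-04 01:33:51",
    {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
print(event.tag, event.time, event.record)
```

Keeping the tag separate from the record is what lets routing decisions be made without parsing the payload.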
Architecture: Input Plugins
Receive logs
Or pull logs from data sources
In non-blocking manner
HTTP+JSON (in_http)
File tail (in_tail)
Syslog (in_syslog)
…
Architecture: Filter Plugins
Transform logs
Filter out unnecessary logs
Enrich logs
Encrypt personal data
Convert IP to countries
Parse User-Agent
…
Architecture: Buffer Plugins
Improve performance
Provide reliability
Provide thread-safety
Memory (buf_memory)
File (buf_file)
[Diagram: Input -> Buffer (Chunk, Chunk, Chunk) -> Output]
Architecture: Output Plugins
Write or send event logs
File (out_file)
Amazon S3 (out_s3)
kafka (out_kafka_buffered)
…
Divide & Conquer for retry
[Diagram labels: Batch, Stream, Error, Retry]
Divide & Conquer for recovery
Buffer (on-disk or in-memory)
[Diagram labels: queued chunks, Error, Overloaded!!, recovery, recovery + flow control]
Example Use Cases
Streaming from Apache/Nginx to Elasticsearch
in_tail /var/log/access.log
/var/log/fluentd/buffer
buf_file
Error Handling and Recovery
in_tail /var/log/access.log
/var/log/fluentd/buffer
buf_file
Buffering for any outputs
Retrying automatically
With exponential wait and persistence on a disk
and secondary output
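The retry behaviour described above (automatic retry, exponentially growing wait, then a secondary output) can be sketched as follows. This is not Fluentd's implementation: `flush_with_retry` and its parameters are hypothetical names, and doubling the wait on every failure is an assumed backoff policy.

```python
import time


def flush_with_retry(chunk, primary, secondary,
                     retry_limit=3, retry_wait=1.0, sleep=time.sleep):
    """Try the primary output, backing off exponentially between attempts.

    If every attempt fails, hand the chunk to the secondary output.
    """
    wait = retry_wait
    for attempt in range(retry_limit):
        try:
            primary(chunk)
            return "primary"
        except Exception:
            # Wait before the next attempt, but not after the final one.
            if attempt < retry_limit - 1:
                sleep(wait)
                wait *= 2
    secondary(chunk)
    return "secondary"


def always_fail(chunk):
    raise IOError("destination is down")


waits = []   # record the waits instead of actually sleeping
rescued = []
result = flush_with_retry(["event"], always_fail, rescued.append,
                          sleep=waits.append)
print(result, waits, rescued)
```

Injecting `sleep` keeps the sketch testable without real delays; a real deployment would also persist the chunk on disk between attempts.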
Tailing & parsing files
Read a log file
Custom regexp
Custom parser in Ruby
Supported built-in formats:
• apache • apache_error • apache2 • nginx
• json • csv • tsv • ltsv
• syslog • multiline • none
[Diagram labels: (your app), events.log, pos file]
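The role of the pos file in tailing can be illustrated with a small sketch: remember the byte offset that has already been read, so a restart resumes where it left off. `read_new_lines` is a hypothetical helper, not in_tail itself; it assumes UTF-8 logs and ignores log rotation, which a real tailer must handle.

```python
import os
import tempfile


def read_new_lines(log_path, pos_path):
    """Return complete lines appended since the offset stored in pos_path."""
    pos = 0
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            pos = int(f.read().strip() or 0)
    with open(log_path, "rb") as f:
        f.seek(pos)
        data = f.read()
    # Only consume up to the last newline; a partial line waits for next time.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    with open(pos_path, "w") as f:
        f.write(str(pos + end))
    return lines


with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "events.log")
    pos_path = os.path.join(workdir, "events.log.pos")
    with open(log_path, "w") as f:
        f.write("first\nsecond\npar")
    first_read = read_new_lines(log_path, pos_path)
    with open(log_path, "a") as f:
        f.write("tial\n")
    second_read = read_new_lines(log_path, pos_path)
print(first_read, second_read)
```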
Out to Multiple Locations
Routing based on tags
Copy to multiple storages
[Diagram labels: access.log, in_tail, buffer]
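Tag-based routing with copying to several storages can be sketched as below. The matcher is a simplified take on Fluentd's tag patterns (`*` for exactly one tag part, `**` for zero or more parts); the route table, destination names, and function names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def _match(pattern: Sequence[str], tag: Sequence[str]) -> bool:
    if not pattern:
        return not tag
    if pattern[0] == "**":
        # "**" may swallow zero or more tag parts.
        return any(_match(pattern[1:], tag[i:]) for i in range(len(tag) + 1))
    if not tag:
        return False
    return pattern[0] in ("*", tag[0]) and _match(pattern[1:], tag[1:])


def tag_matches(pattern: str, tag: str) -> bool:
    return _match(pattern.split("."), tag.split("."))


# Hypothetical route table: the first matching pattern wins, and each route
# may copy the event to several destinations.
ROUTES: List[Tuple[str, List[str]]] = [
    ("nginx.access", ["elasticsearch", "s3"]),
    ("app.**", ["kafka"]),
]


def route(tag: str) -> List[str]:
    for pattern, outputs in ROUTES:
        if tag_matches(pattern, tag):
            return outputs
    return []


print(route("nginx.access"), route("app.events.purchase"), route("other"))
```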
Example configuration for real time batch combo
Data partitioning by time on HDFS / S3
Custom file formatter
Slice files based on time
2016-01-01/01/access.log.gz
2016-01-01/02/access.log.gz
2016-01-01/03/access.log.gz
…
[Diagram labels: access.log, in_tail, buffer]
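The hourly paths above follow directly from the event time. A minimal sketch, assuming an hourly slice and a fixed file name (`slice_path` and `partition` are hypothetical helpers):

```python
from collections import defaultdict
from datetime import datetime


def slice_path(event_time: datetime, name: str = "access.log.gz") -> str:
    # One directory per day and one sub-directory per hour.
    return event_time.strftime("%Y-%m-%d/%H/") + name


def partition(events):
    """Group (time, record) pairs by the file they would be written to."""
    slices = defaultdict(list)
    for event_time, record in events:
        slices[slice_path(event_time)].append(record)
    return dict(slices)


events = [
    (datetime(2016, 1, 1, 1, 5), "GET /a"),
    (datetime(2016, 1, 1, 1, 59), "GET /b"),
    (datetime(2016, 1, 1, 2, 0), "GET /c"),
]
print(partition(events))
```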
3rd party input plugins
dstat
df AMQL
munin
jvmwatcher
SQL
3rd party output plugins
Graphite
Real World Use Cases
Microsoft
Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources."
[Diagram labels: Linux Computer (Syslog, Operating System, Apache, MySQL, Containers, omsconfig (DSC), PS DSC Providers, OMI Server (CIM Server), omsagent), Firewall / proxy, OMS Service, Upload Data (HTTPS), Pull configuration (HTTPS)]
Atlassian
"At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline."
[Diagram labels: Kinesis, Elasticsearch cluster, Ingestion service]
Amazon Web Services
The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook’s Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe.
Types of Data (Collect / Store)
Transactional
• Database reads & writes (OLTP)
• Cache
Search
• Logs
• Streams
File
• Log files (/var/log)
• Log collectors & frameworks
Stream
• Log records
• Sensors & IoT data
[Diagram labels: Web Apps, Mobile Apps, IoT, Applications, Logging, Database, Search, File Storage, Stream Storage]
Container and Logging
The Container Era
                      Server Era         Container Era
Service Architecture  Monolithic         Microservices
System Image          Mutable            Immutable
Managed By            Ops Team           DevOps Team
Local Data            Persistent         Ephemeral
Log Collection        syslogd / rsync    ?
Metrics Collection    Nagios / Zabbix    ?
How should log & metrics collection be done in The Container Era?
Problems
The traditional logrotate + rsync on containers
[Diagram: log files written by applications in Containers A, B, and C are copied to a log server]
Hard to analyze!! Complex text parsers
High latency!! Must wait for a day
Ephemeral!! Could be lost at any time
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2) connect to Kafka, elasticsearch, and HDFS]
Small & many containers make storages overloaded
Too many connections from micro containers!
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2) connect to Kafka, elasticsearch, and HDFS]
System images are immutable
Too many connections from micro containers!
Embedding destination IPs in ALL Docker images makes management hard
How to collect logs from Docker containers
Text logging with --log-driver=fluentd
[Diagram labels: Server, Container, App, STDOUT / STDERR, Fluentd]
docker run \
  --log-driver=fluentd \
  --log-opt \
  fluentd-address=localhost:24224
{
  "container_id": "ad6d5d32576a",
  "container_name": "myapp",
  "source": "stdout"
}
Metrics collection with fluent-logger
[Diagram labels: Server, Container, App, fluent-logger library, Fluentd]
from fluent import sender
from fluent import event

# Set the tag prefix and the Fluentd endpoint once at startup.
sender.setup('app.events', host='localhost')
# Emit one event; the label is appended to the tag prefix.
event.Event('purchase', {
    'user_id': 21,
    'item_id': 321,
    'value': '1'
})

tag = app.events.purchase
{
  "user_id": 21,
  "item_id": 321,
  "value": "1"
}
Shared data volume and tailing
[Diagram labels: Server, Container, App, /mnt/nginx/logs, Fluentd]
<source>
  @type tail
  path /mnt/nginx/logs/access.log
  pos_file /var/log/fluentd/access.log.pos
  format nginx
  tag nginx.access
</source>
Logging methods for each purpose
• Collecting log messages
  > --log-driver=fluentd
• Application metrics
  > fluent-logger
• Access logs, logs from middleware
  > Shared data volume
• System metrics (CPU usage, Disk capacity, etc.)
  > Fluentd’s input plugins (Fluentd pulls those data periodically)
Deployment Patterns
Primitive deployment…
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2) connect to Kafka, elasticsearch, and HDFS]
Too many connections from many containers!
Embedding destination IPs in ALL Docker images makes management hard
Source aggregation decouples config from apps
[Diagram: Server 1 (Containers A and B, Fluentd) and Server 2 (Containers C and D, Fluentd); destinations Kafka, elasticsearch, and HDFS]
destination is always localhost from app’s point of view
Destination aggregation makes storages scalable for high traffic
[Diagram: Server 1 (Containers A and B, Fluentd) and Server 2 (Containers C and D, Fluentd); aggregation server(s) with active / standby / load balancing]
Aggregation servers
• Logging directly from microservices makes log storages overloaded.
  > Too many RX connections
  > Too frequent import API calls
• Aggregation servers make the logging infrastructure more reliable and scalable.
  > Connection aggregation
  > Buffering for less frequent import API calls
  > Data persistency during downtime
  > Automatic retry at recovery from downtime
Fluentd ♡ Container
• Fluentd model fits container based systems
  > This is why Treasure Data joined CNCF
  > TD wants to improve cloud native ecosystem
• Fluentd, Prometheus, Docker and Kubernetes collaboration is good for modern systems
  • Easy to scale and easy to maintain
  • Fluentd logging driver in Docker
  • fluent-plugin-prometheus to send application metrics to prometheus
  • EFK for log visualization in Kubernetes
Fluentd v0.14 and Later
• v0.14.0: Released on May 31, 2016
• v0.14.1: Released on Jun 30, 2016
• New Features
  • New Plugin APIs, Plugin Helpers & Plugin Storage
  • Time with Nanosecond resolution
  • ServerEngine based Supervisor
  • Windows support
v0.14
New Plugin APIs
• Input/Output plugin APIs w/ well-controlled lifecycle
  • stop, shutdown, close, terminate
• New Buffer API for delayed commit of chunks
  • parallel/async "commit" operation for chunks
• 100% Compatible w/ v0.12 plugins
  • compatibility layer for traditional APIs
  • it will be supported between v1.x versions
v0.12 buffer design
[Diagram labels: Input / Filter, emit, Router, Output, Buffer, Chunk (key:foo, key:bar, key:baz), buffer_chunk_limit, Queue, buffer_queue_limit; events carry Tag, Time, Record]
enqueue: exceed flush_interval or buffer_chunk_limit
Key pattern:
- BufferedOutput: empty string or specified key
- ObjectBufferedOutput: tag
- TimeSlicedOutput: time slice
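The v0.12 model (one open chunk per key, moved to a bounded queue once it would exceed the chunk limit or when the flush interval fires) can be sketched roughly as follows. `KeyedBuffer` and its methods are hypothetical names; sizes are counted in bytes, and `flush_all` stands in for the timer-driven flush.

```python
from typing import Dict, List, Tuple


class QueueFullError(Exception):
    """Raised when the queue already holds queue_limit chunks."""


class KeyedBuffer:
    def __init__(self, chunk_limit: int, queue_limit: int) -> None:
        self.chunk_limit = chunk_limit    # plays the role of buffer_chunk_limit
        self.queue_limit = queue_limit    # plays the role of buffer_queue_limit
        self.chunks: Dict[str, bytearray] = {}
        self.queue: List[Tuple[str, bytes]] = []

    def emit(self, key: str, data: bytes) -> None:
        chunk = self.chunks.get(key)
        # Enqueue the open chunk first if appending would exceed the limit.
        if chunk and len(chunk) + len(data) > self.chunk_limit:
            self.enqueue(key)
            chunk = None
        if chunk is None:
            chunk = self.chunks.setdefault(key, bytearray())
        chunk.extend(data)

    def enqueue(self, key: str) -> None:
        if len(self.queue) >= self.queue_limit:
            raise QueueFullError(key)
        self.queue.append((key, bytes(self.chunks.pop(key))))

    def flush_all(self) -> None:
        # Stand-in for flush_interval expiring: enqueue every open chunk.
        for key in list(self.chunks):
            self.enqueue(key)


buf = KeyedBuffer(chunk_limit=10, queue_limit=8)
buf.emit("foo", b"12345678")
buf.emit("bar", b"xy")
buf.emit("foo", b"abcd")  # 8 + 4 > 10, so the first "foo" chunk is enqueued
print(buf.queue)
```

Raising when the queue is full is one possible policy; it is what gives upstream inputs a signal to slow down.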
v0.14 buffer design
Plugin Storage & Helpers
• Plugin Storage: new plugin type for plugins
  • provides key-value storage for plugins
  • to persist intermediate status of plugins
  • built-in plugins (in plan): in-memory, local file
  • pluggable: 3rd party plugin to store data to Redis?
• Plugin Helpers:
  • collections of utility methods for plugins
  • making threads, sockets, network servers, ...
  • fully integrated with test drivers to run test code after the setup phase of helpers (e.g., after created threads have started)
v0.12 plugins
[Diagram labels: Input, Parser, Filter, Buffer, Output, Formatter; “input-ish”, “output-ish”]
v0.14 plugins
[Diagram labels: Input, Parser, Filter, Buffer, Output, Formatter, Storage, Helper; “input-ish”, “output-ish”]
Time with nanosecond
• For sub-second systems: Elasticsearch, InfluxData, etc.
• Fluent::EventTime
  • behaves as Integer (used as time in v0.12)
  • has methods to get sub-second resolution
  • is serialized into msgpack using Ext type
• Fluentd core can handle both Integer and EventTime as time
  • compatible with older versions and software in the ecosystem (e.g., fluent-logger, Docker logging driver)
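The idea of a time value that still behaves as an integer while carrying sub-second precision can be shown with a Python analogy. This is not Fluent::EventTime (which is Ruby): the class below is a hypothetical stand-in, and the 8-byte payload layout (seconds then nanoseconds, both 32-bit big-endian) is an assumption that should be checked against the Fluentd forward protocol specification.

```python
import struct


class EventTime(int):
    """An int (epoch seconds) that also carries a nanosecond part."""

    def __new__(cls, sec: int, nsec: int = 0) -> "EventTime":
        obj = super().__new__(cls, sec)
        obj.nsec = nsec
        return obj

    def to_float(self) -> float:
        return int(self) + self.nsec / 1_000_000_000

    def to_ext_payload(self) -> bytes:
        # Assumed layout: seconds, then nanoseconds, both 32-bit big-endian.
        return struct.pack(">II", int(self), self.nsec)


t = EventTime(1464652800, 500_000_000)
print(t == 1464652800, t + 60, t.to_float(), len(t.to_ext_payload()))
```

Code written against integer timestamps keeps working because comparisons and arithmetic fall through to `int`, which mirrors the compatibility goal stated above.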
ServerEngine based Supervisor
• Replacing supervisor process with ServerEngine
  • it has SocketManager to share listening sockets between 2 or more worker processes
• Replacing Fluentd's processing model from fork to spawn
  • to support Windows environment
Windows support
• Fluentd and core plugins work on Windows
  • several companies have already used the v0.14.0.pre version in production
• We will send a patch to popular plugins if they don’t work on Windows
• Use HTTP RPC instead of signals
v0.14.x - v1
• v0.14.x (some versions in 2016)
  • Symmetric multi-core processing
  • Counter API
  • TLS/authentication/authorization support (merging secure forward)
  • https://github.com/fluent/fluentd/issues/1000
• v1 (4Q in 2016 or 1Q in 2017)
  • Stable version for new APIs / features
  • Fully compatible with v0.12
    • exclude v0 config syntax and detach_process
Symmetric multi core processing
• 2 or more workers share a configuration file
  • and share listening sockets via PluginHelper
  • under a supervisor process (ServerEngine)
• Multi core scalability for huge traffic
  • one input plugin for a tcp port, some filters and one (or some) output plugin
  • buffer paths are managed automatically by Fluentd core
[Diagram: Supervisor and Worker processes, comparing "Using fluent-plugin-multiprocess" with "v0.14"]
Counter API
• APIs to increment/decrement values
  • shared by some processes
  • persisted on disk backed by Storage API
• Useful for collecting metrics or stats filters
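A counter whose values survive restarts can be sketched as below. This is not the planned Fluentd Counter API: `FileCounter` is a hypothetical, single-process stand-in that persists to a JSON file, with no locking for sharing between processes.

```python
import json
import os
import tempfile


class FileCounter:
    def __init__(self, path):
        self.path = path
        self.values = {}
        if os.path.exists(path):
            with open(path) as f:
                self.values = json.load(f)

    def incr(self, key, amount=1):
        self.values[key] = self.values.get(key, 0) + amount
        self._save()
        return self.values[key]

    def decr(self, key, amount=1):
        return self.incr(key, -amount)

    def _save(self):
        # Write to a temporary file and rename it, so a crash in the middle
        # of a write never leaves a truncated counter file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.values, f)
        os.replace(tmp, self.path)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "counter.json")
    counter = FileCounter(path)
    counter.incr("events", 3)
    counter.decr("events")
    # A new instance reading the same file sees the persisted value.
    restored = FileCounter(path).values["events"]
print(restored)
```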
TLS/Authn/Authz support for forward plugin
• secure-forward will be merged into built-in forward
  • TLS w/ at-least-once semantics
  • Simple authentication/authorization w/ non-SSL forwarding
• Authentication and Authorization providers
  • Who can connect to input plugins? What tags are permitted for clients?
  • New plugin types (3rd party authors can write it)
  • Mainly for in/out forward, but available from others
Benchmark (1 CPU usage)
100,000 msgs/sec                   v0.14   v0.12
in_tail (none) + out_forward       70%     66%
in_forward + flowcounter_simple    11%     11%
in_forward + tdlog                 43%     38%
※ Use EC2 c3.8xlarge
※ Not fully optimized yet
Treasure Agent 3.0 (td-agent 3)
• fluentd v0.14
• Ruby 2.3 and latest core components
• Environments
  • Add msi Windows package
  • Remove CentOS 5, Ubuntu 10.04 support
• Release date is not fixed…
Enjoy logging!
H.A. configuration (high availability)
Retry automatically
Exponential retry wait
Persistent on a disk
Automatic fail-over
Load balancing
[Diagram labels: access.log, in_tail, buffer]
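Load balancing with automatic fail-over can be sketched as a round-robin over active servers that falls back to standby servers when every active one refuses the chunk. The function and server names are hypothetical, and treating only `ConnectionError` as a failure is an assumption made for this sketch.

```python
from typing import Callable, List, Tuple

Server = Tuple[str, Callable[[bytes], None]]


def send_with_failover(chunk: bytes, active: List[Server],
                       standby: List[Server], start: int = 0) -> str:
    """Return the name of the first server that accepts the chunk."""
    # Rotate through the active servers, beginning at `start`.
    ordered = [active[(start + i) % len(active)] for i in range(len(active))]
    for name, send in ordered + list(standby):
        try:
            send(chunk)
            return name
        except ConnectionError:
            continue
    raise ConnectionError("all servers are down")


def up(chunk: bytes) -> None:
    pass


def down(chunk: bytes) -> None:
    raise ConnectionError("unreachable")


balanced = [send_with_failover(b"x", [("a", up), ("b", up)], [], start=i)
            for i in range(4)]
failed_over = send_with_failover(b"x", [("a", down), ("b", down)],
                                 [("standby", up)])
print(balanced, failed_over)
```

Advancing `start` on each call spreads chunks across the active servers; the standby list is only consulted once all of them have failed.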