Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann


Unravelling Logs
Matt Jarvis - Head of Cloud Computing @ DataCentred

Traditional log file analysis ...

● Troubleshooting
● Post incident forensics
● Security auditing
● Reporting and analysis

Nova Controller:

● nova-api.log
● nova-cert.log
● nova-conductor.log
● nova-scheduler.log

Glance Server:

● api.log
● image-cache.log
● registry.log

Neutron Controller:

● openvswitch-agent.log
● server.log

Network Node:

● openvswitch-agent.log
● neutron-ns-metadata-proxy*.log
● metadata-agent.log
● dhcp-agent.log

Compute Node:

● openvswitch-agent.log
● nova-compute.log

● INGEST CENTRALLY

● STRUCTURE

● INDEX

● ANALYZE

Elasticsearch:

● Distributed search engine
● Highly scalable
● Super fast
● HTTP interface (see the example query below)
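A minimal sketch of what that HTTP interface looks like from Clojure, the same language used for the Riemann configuration later in the deck. This is not from the talk: it assumes the clj-http library is available, Elasticsearch is listening on localhost:9200, and the logstash-* indices carry the program and severity fields produced by the grok filters shown further on.

(require '[clj-http.client :as http])

; URI search against the Logstash indices: the ten most recent
; ERROR-severity events logged by nova-compute
(http/get "http://localhost:9200/logstash-*/_search"
          {:query-params {"q"    "program:nova-compute AND severity:ERROR"
                          "size" "10"}
           :as :json})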

[Kibana dashboard screenshot]

Logstash:

● Collect
● Parse
● Transform

Log Shipping

● Lightweight log shipper
● Written in Go
● Minimal resource usage
● SSL
● Transformation capabilities

Log Courier

{ "general": { "log file": "/var/log/log-courier.log", "admin enabled": true }, "network": { "transport": "tls", "servers": [ "your.logstash.server:55516" ], "ssl certificate": "/var/lib/puppet/ssl/certs/yourcert.pem", "ssl key": "/var/lib/puppet/ssl/private_keys/yourkey.pem", "ssl ca": "/var/lib/puppet/ssl/certs/ca.pem", "timeout": 40 }, "files": [ { "paths": [ "/var/log/syslog" ], "fields": { "shipper": "log-courier", "type": "syslog" } },]

input {
  courier {
    port            => 55516
    ssl_verify      => true
    ssl_verify_ca   => "/var/lib/puppet/ssl/certs/ca.pem"
    ssl_certificate => "/var/lib/puppet/ssl/certs/yourcert.pem"
    ssl_key         => "/var/lib/puppet/ssl/private_keys/yourkey.pem"
    type            => "log-courier"
  }
}

filter { if [type] == "syslog" { if [message] =~ /Registrar received .* event/ { drop {} } grok { match => [ "message", "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ] match => [ "message", "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY} %{TIME} %{POSINT:syslog_pid} %{WORD:severity} %{GREEDYDATA:syslog_message}"] match => [ "message", "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY} %{TIME} %{POSINT:syslog_pid} %{WORD:severity} %{GREEDYDATA:syslog_message}"] add_field => [ "received_at", "%{@timestamp}" ] add_field => [ "received_from", "%{host}" ] add_field => [ "program", "%{syslog_program}" ] add_field => [ "timestamp", "%{syslog_timestamp}" ]

} syslog_pri { } date { match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ] } }}

filter { if [type] == "native_syslog" { grok { match => [ "message", "%{SYSLOGLINE}" ] add_field => [ "received_at", "%{@timestamp}" ] add_field => [ "received_from", "%{host}" ] } syslog_pri { } date { match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ] } }}

filter {
  # Add in group tags we didn't add in forwarder due to bug
  # https://github.com/elasticsearch/logstash-forwarder/issues/65
  # By grouping the logs using tags we can then search all the related logs in kibana
  if [type] =~ /cinder.*/ {
    mutate {
      add_tag => [ "cinder", "oslofmt" ]
    }
  }
}

output {
  elasticsearch {
    host     => "elasticsearch"
    embedded => false
    protocol => "http"
  }
}

output {
  if [type] == "syslog" {
    riemann {
      riemann_event => {
        "description" => "%{syslog_message}"
        "service"     => "%{syslog_program}"
        "state"       => "%{syslog_severity_code}"
      }
    }
  }
}
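For orientation (not on the slides): the riemann output above produces events that Riemann sees roughly as the Clojure map below. The concrete values are invented, and the comments simply restate the field mapping from the config.

; hypothetical event as received by Riemann (values invented)
{:host        "compute-01.example.com"
 :service     "nova-compute"               ; "service"     => "%{syslog_program}"
 :state       "3"                          ; "state"       => "%{syslog_severity_code}"
 :description "Instance shutdown by user"  ; "description" => "%{syslog_message}"
 :time        1484216132}                  ; event timestamp in Unix seconds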

FILTER

aggregate, alter, anonymize, collate, csv, cidr, clone, cipher, checksum, date, de_dot, dns, drop, elasticsearch, extractnumbers, environment, elapsed, fingerprint, geoip, grok, i18n, json, json_encode, kv, mutate, metrics, multiline, metaevent, prune, punct, ruby, range, syslog_pri, sleep, split, throttle, translate, uuid, urldecode, useragent, xml, zeromq

INPUT

beats, couchdb_changes, drupal_dblog, elasticsearch, exec, eventlog, file, ganglia, gelf, generator, graphite, github, heartbeat, heroku, http, http_poller, irc, imap, jdbc, jmx, kafka, log4j, lumberjack, meetup, pipe, puppet_facter, relp, rss, rackspace, rabbitmq, redis, salesforce, snmptrap, stdin, sqlite, s3, sqs, stomp, syslog, tcp, twitter, unix, udp, varnishlog, wmi, websocket, xmpp, zenoss, zeromq

OUTPUT

boundary, circonus, csv, cloudwatch, datadog, datadog_metrics, email, elasticsearch, elasticsearch_java, exec, file, google_bigquery, google_cloud_storage, ganglia, gelf, graphtastic, graphite, hipchat, http, irc, influxdb, juggernaut, jira, kafka, lumberjack, librato, loggly, mongodb, metriccatcher, nagios, null, nagios_nsca, opentsdb, pagerduty, pipe, riemann, redmine, rackspace, rabbitmq, redis, riak, s3, sqs, stomp, statsd, solr_http, sns, syslog, stdout, tcp, udp, webhdfs, websocket, xmpp, zabbix, zeromq

Riemann - an event stream processor

● very low latency
● extensive Clojure API
● API can also be extended with Java

; email every critical event from services whose name starts with "riak"
(streams
  (where (and (service #"^riak")
              (state "critical"))
    (email "delacroix@vonbraun.com")))

; split the stream into one substream per host/service pair
(by [:host :service])

; per host/service: pass only state changes, then allow at most 5 emails
; per hour (3600 s), rolling further events up into a summary
(by [:host :service]
  (changed :state
    (rollup 5 3600
      (email "delacroix@vonbraun.com"))))
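As a sketch of where fragments like these live (not from the slides): a minimal /etc/riemann/riemann.config starts the TCP server that Logstash's riemann output sends to, then attaches the streams. The addresses and recipients here are placeholders.

; accept events from logstash over TCP (Riemann's default port is 5555)
(tcp-server {:host "0.0.0.0" :port 5555})

(def email (mailer {:from "riemann@example.com"}))

(streams
  (by [:host :service]
    (changed :state
      (rollup 5 3600
        (email "ops@example.com")))))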

(use 'clojure.java.io)

; read a file of regex fragments, one per line
(defn get_messages [filename]
  (with-open [rdr (reader filename)]
    (doall (line-seq rdr))))

(def messages (get_messages "/etc/riemann.conf.d/riemann.whitelist"))

; negative-lookahead pattern matching only descriptions that contain
; none of the whitelisted messages
(def whitelist_pattern (str "^((?!(" (clojure.string/join "|" messages) ")).)*$"))

(def email (mailer {:from "riemann@core.sal01.datacentred.co.uk"}))

; take syslog emergency/alert/critical events (severity codes 0-2), keep only
; descriptions matching none of the whitelisted patterns, and send at most
; 3 emails per hour per service, rolling further events up into a summary
(streams
  (by :service
    (where (or (state "2") (state "1") (state "0"))
      (where (description (re-pattern whitelist_pattern))
        (rollup 3 3600
          (email "sysmail@core.sal01.datacentred.co.uk"))))))

/etc/riemann.conf.d/riemann.whitelist (one pattern per line):

Ignoring invalid UTF-8 byte sequences in data to be sent to PuppetDB
tftp: client does not accept options
DHCP packet received on [a-zA-Z0-9-_]+ which has no address
Can\'t create new lease file: Permission denied
\[\-\] Authorization failed\. The request you have made requires authentication\. from 127\.0\.0\.1
\[\-\] \[instance: [a-zA-Z0-9-]+\] Instance not resizing[,] skipping migration\.
^.*dhcp-failover rejected: incoming update is less critical than outgoing update$
^.*Please use the the default quota class for default quota.$
^.*FAILED: Has an address record but no DHCID, not mine.$
^.*Found \d+ in the database and \d+ on the hypervisor.$
^.*Arguments dropped when creating context.*
^.*Failed to inspect.*of instance.*domain is in state of SHUTOFF
^.*Unknown base file: /var/lib/nova/instances/_base/*
^.*Couldn\'t obtain IP address of instance.*
\[*\] IPMI message handler: BMC returned incorrect response, expected*
\[-\] While synchronizing instance power states, found \d+ instances in the database and \d+ instances on the hypervisor
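One way to hand-test the whitelist stream above is to inject a synthetic event from a REPL. A sketch, assuming the riemann-clojure-client library is on the classpath and Riemann is listening locally:

(require '[riemann.client :as client])

(def c (client/tcp-client {:host "127.0.0.1" :port 5555}))

; send one fake critical nova-api event and wait up to 5 s for the ack
(-> (client/send-event c {:service     "nova-api"
                          :state       "2"
                          :description "Something went wrong that is not whitelisted"})
    (deref 5000 ::timeout))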

(use 'clojure.java.io)

(defn get_messages [filename]
  (with-open [rdr (reader filename)]
    (doall (line-seq rdr))))

(def messages (get_messages "/etc/riemann.conf.d/riemann.blacklist"))

; pattern matching any description that begins with one of the blacklisted messages
(def blacklist_pattern (str "^?(" (clojure.string/join "|" messages) ").*$"))

(def pd (pagerduty "pagerduty_api_key"))

; any event whose description matches the hardware blacklist is re-labelled as a
; Hardware failure and, at most once per host every 12 hours (43200 s), logged
; and escalated to PagerDuty
(streams
  (by :host
    (where (description (re-pattern blacklist_pattern))
      (with {:state "Failure" :service "Hardware"}
        (throttle 1 43200
          #(info %)
          (:trigger pd))))))

/etc/riemann.conf.d/riemann.blacklist (one pattern per line):

EDAC MC\d+: \d+ CE error on CPU#\d+Channel#\d+_DIMM#\d+.*
ata\d+.\d+: exception.*
ata\d+.\d+: failed command:.*
ata\d+: link is slow to respond, please be patient.*
ata\d+.\d+:.*failed.*

[Architecture diagram: log files → log courier → logstash → elasticsearch (searched with kibana) and riemann (alerting via pagerduty and email)]

Thanks for Listening!
