Monitoring PostgreSQL using Time-Series systems like Graphite and/or Grafana with OpenCollector Jan Wieck - OpenSCG


Feb 08, 2017


Jan Wieck
Transcript
Page 1: Monitoring pg with_graphite_grafana

Monitoring PostgreSQL using Time-Series systems like Graphite and/or Grafana with OpenCollector

Jan Wieck - OpenSCG

Page 2: Monitoring pg with_graphite_grafana

Overview

• What is Monitoring?
• Graphite & Carbon
• Grafana
• Why use Carbon?
• Why use Graphite AND Grafana?
• PostgreSQL Metric Data
• OpenCollector
• osinfofdw

Page 3: Monitoring pg with_graphite_grafana

What is Monitoring?

• Capture Time-Series data
  • Metric-Name, Value, Timestamp
• Visualize Time-Series data
• Define alerts based on Time-Series data
• Statistical analysis of Time-Series data

• Getting an alert when your primary DB server is down is covered by the above!

Page 4: Monitoring pg with_graphite_grafana

Graphite & Carbon

• Carbon is a server for collecting Time-Series data
  • Simple line based protocol on port 2003
  • Python-pickle protocol on port 2004
• Graphite is a web-based GUI on top of Carbon
  • Some Dashboard functionality

Page 5: Monitoring pg with_graphite_grafana

Example Graphite screen

Page 6: Monitoring pg with_graphite_grafana

Grafana

• Grafana is more Dashboard focused
• Grafana can use many Time-Series data sources
  • Graphite
  • Elasticsearch
  • CloudWatch
  • InfluxDB
  • OpenTSDB
  • KairosDB
  • Prometheus

Page 7: Monitoring pg with_graphite_grafana

Example Grafana dashboard

Page 8: Monitoring pg with_graphite_grafana

Why use Carbon?

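Carbon provides an extremely simple protocol to send Time-Series data

    #!/bin/sh
    # Send a single data point to Carbon's line protocol port.
    CHOST="graphite.host.name"
    CPORT="2003"
    METRIC="test.PI"
    VALUE="3.1415"

    # Line format: <metric name> <value> <Unix timestamp>
    echo "$METRIC $VALUE $(date +%s)" | nc "$CHOST" "$CPORT"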

Page 9: Monitoring pg with_graphite_grafana

Why use Carbon?

Page 10: Monitoring pg with_graphite_grafana

Why use Carbon?

Not a very useful metric, but consider capturing the runtime of a shell script based cron job.
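The cron job idea can be sketched in a few lines of Python. This is only an illustration of the line protocol described earlier (metric name, value, Unix timestamp); the metric name `cron.nightly_backup.runtime` and the helper names are made up for the example, and the `time.sleep` call stands in for the real job.

```python
import socket
import time


def carbon_line(metric, value, timestamp):
    """Format one data point for Carbon's line protocol (port 2003)."""
    return "%s %s %d\n" % (metric, value, int(timestamp))


def timed(metric, job, clock=time.time):
    """Run job() and return a Carbon line holding its runtime in seconds."""
    start = clock()
    job()
    end = clock()
    return carbon_line(metric, round(end - start, 3), end)


def send_lines(lines, host, port=2003):
    """Send already formatted lines to a Carbon server over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Hypothetical job: the sleep is a placeholder for the real work.
line = timed("cron.nightly_backup.runtime", lambda: time.sleep(0.01))
print(line, end="")
# send_lines([line], "graphite.host.name")  # placeholder host from the shell example
```

Keeping the formatting separate from the network send makes the line easy to check without a running Carbon server.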

Carbon also provides a Python-pickle based protocol on port 2004 that can be used to send hundreds of metric points condensed in a single send(2) call.
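As a rough sketch of what such a batched message looks like: the Graphite documentation describes the pickle payload as a list of `(path, (timestamp, value))` tuples, prefixed with a 4-byte big-endian length header. The function name and the sample metrics below are illustrative only.

```python
import pickle
import struct


def carbon_pickle_message(points):
    """Pack (metric, value, timestamp) tuples into one pickle-protocol message."""
    # Carbon expects a list of (path, (timestamp, value)) tuples.
    payload = pickle.dumps(
        [(metric, (timestamp, value)) for metric, value, timestamp in points],
        protocol=2,
    )
    # The payload is prefixed with its length as a 4-byte big-endian integer.
    header = struct.pack("!L", len(payload))
    return header + payload


# Illustrative data points; a real collector would batch hundreds of them.
points = [
    ("test.PI", 3.1415, 1486512000),
    ("test.E", 2.7183, 1486512000),
]
message = carbon_pickle_message(points)
# A client would then call sock.sendall(message) on a TCP connection to port 2004.
```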

Page 11: Monitoring pg with_graphite_grafana

Why use Graphite AND Grafana?

• Grafana is more Dashboard focused
  • Templating makes it easy to define one Dashboard and use it for many hosts/databases
  • Getting to a Dashboard is easier
  • Can define Alerts
  • Looks cool
• Graphite is better at ad-hoc graphing
  • The metric tree is easier to navigate than clicking through Grafana’s pull down system

Page 12: Monitoring pg with_graphite_grafana

However …

This isn’t a talk advertising Graphite or Grafana.

This is a talk about capturing monitoring data from PostgreSQL and delivering it into a Time-Series data system. Carbon/Graphite and Grafana are example destinations.

Page 13: Monitoring pg with_graphite_grafana

PostgreSQL Metric Data

PostgreSQL produces quite a number of data points.
• On the table level
  • about 30 metric points
• On the index level
  • about 6 metric points
• On the database level
  • about 20 metric points

Page 14: Monitoring pg with_graphite_grafana

PostgreSQL Metric Data

Those per table/index numbers are not of concern when you look at your typical benchmark database.

But what about a database with 1,800 tables and 13,000 indexes?

Now we are talking about 132,000 metric points every time interval! Captured every minute, that is 7.9M per hour, 190M per day, 17.1B per quarter. Don’t do that with snapshots captured inside the DB.
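The figures follow directly from the approximate per-object counts given earlier. The short calculation below reproduces them; it assumes a 90-day quarter and leaves out the roughly 20 database-level metrics, which do not change the rounded totals.

```python
TABLES, INDEXES = 1800, 13000
PER_TABLE, PER_INDEX = 30, 6  # approximate metric points per object

per_interval = TABLES * PER_TABLE + INDEXES * PER_INDEX
per_hour = per_interval * 60   # one capture per minute
per_day = per_hour * 24
per_quarter = per_day * 90     # assumption: a quarter is 90 days

print(per_interval)  # 132000
print(per_hour)      # 7920000
print(per_day)       # 190080000
print(per_quarter)   # 17107200000
```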

Page 15: Monitoring pg with_graphite_grafana

PostgreSQL Metric Data

That isn’t as exotic as it looks at first glance.

PostgreSQL system views like pg_stat_all_tables will report every single metric point even if a table or index hasn’t been used for the past 12 months.

How many dead tables (schemas) does your database have?

A generic monitoring system can’t tell them apart.

Page 16: Monitoring pg with_graphite_grafana

PostgreSQL Metric Data

But that isn’t all. Many metrics are presented as continuously increasing counters, but the useful value is actually their increase per second.

Examples:
• Tuples inserted, updated, deleted, fetched
• Index/Sequential scans

This is the same as for OS statistics like:
• Network operations
• Disk operations

Page 17: Monitoring pg with_graphite_grafana

PostgreSQL Metric Data

While that is efficient inside of the PostgreSQL server for collecting the data, it is rather inconvenient when browsing it in a system like Graphite or Grafana.

Sure, they can apply a function like “perSecond()” and it is only 20 mouse clicks away …
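The alternative is to turn counters into rates before they are sent. A minimal sketch of that conversion, assuming samples arrive as `(timestamp, counter)` pairs; the function name and the handling of counter resets are choices made for this example, not something the slides specify.

```python
def per_second(prev, curr):
    """Return the per-second increase between two (timestamp, counter) samples.

    Returns None when no rate can be derived: no time has passed, or the
    counter went backwards (for example after a statistics reset).
    """
    prev_ts, prev_val = prev
    curr_ts, curr_val = curr
    elapsed = curr_ts - prev_ts
    if elapsed <= 0 or curr_val < prev_val:
        return None
    return (curr_val - prev_val) / elapsed


# Four samples taken one minute apart; the last one follows a counter reset.
samples = [(0, 1000), (60, 1600), (120, 1600), (180, 50)]
rates = [per_second(a, b) for a, b in zip(samples, samples[1:])]
print(rates)  # [10.0, 0.0, None]
```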

Page 18: Monitoring pg with_graphite_grafana

OpenCollector

• OpenCollector is a PostgreSQL monitoring daemon sponsored by OpenSCG
• It is designed to address the aforementioned problems
• JSON configuration files define all the operation
  • Target Carbon server
  • Source Database(s)
  • Queries to run and what metrics they return
  • Sparse metric reporting
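The slides do not spell out what sparse metric reporting does. One plausible reading, given the dead-table problem described earlier, is that values which did not change since the last sample are simply not sent. The sketch below illustrates that reading only; the function and the metric names are hypothetical and not taken from OpenCollector.

```python
def sparse_points(previous, current):
    """Return only the metrics from current whose value differs from previous.

    Assumption: "sparse" means "skip unchanged values". This is a guess at
    the idea, not OpenCollector's documented rule.
    """
    return {
        name: value
        for name, value in current.items()
        if previous.get(name) != value
    }


# Hypothetical metric names: one busy table, one table nobody touches.
previous = {"table:orders.n_tup_ins": 500.0, "table:old_audit.n_tup_ins": 12.0}
current = {"table:orders.n_tup_ins": 530.0, "table:old_audit.n_tup_ins": 12.0}
changed = sparse_points(previous, current)
print(changed)  # {'table:orders.n_tup_ins': 530.0}
```

Under this reading, a table that has been idle for months costs nothing per interval after its first report.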

Page 19: Monitoring pg with_graphite_grafana

OpenCollector

An example from the sample configs:

    "name": "global_stats",
    "prefix": "database:{datname}.global_stats",
    "query": [
        "SELECT ",
        " datname, numbackends::float8, ",
        " xact_commit::float8, xact_rollback::float8, ",
        " blks_read::float8, blks_hit::float8, ",
        " pg_catalog.pg_database_size(datid)::float8, ",
        " pg_xlog_location_diff(pg_current_xlog_insert_location(), '0/0') ",
        "FROM pg_catalog.pg_stat_database ",
        "WHERE datname = current_database() "
    ],
    "result": [
        { "name": "datname", "type": "internal" },
        { "name": "numbackends", "type": "value" },
        { "name": "xact_commit", "type": "counter" },
        ...
    ]
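To make the config easier to read, here is a sketch of how a collector could turn one result row into metric names. It assumes that "internal" columns are not reported and only fill placeholders such as `{datname}` in the prefix, and that every other column becomes `<prefix>.<column name>`. That interpretation, the function name and the sample row are assumptions; OpenCollector's real code may work differently.

```python
def row_to_metrics(prefix, result_spec, row):
    """Map one query result row to (metric_path, type, value) tuples."""
    columns = dict(zip((col["name"] for col in result_spec), row))
    # Assumption: "internal" columns are not reported; they only fill the prefix.
    internal = {
        col["name"]: columns[col["name"]]
        for col in result_spec
        if col["type"] == "internal"
    }
    base = prefix.format(**internal)
    return [
        ("%s.%s" % (base, col["name"]), col["type"], columns[col["name"]])
        for col in result_spec
        if col["type"] != "internal"
    ]


result_spec = [
    {"name": "datname", "type": "internal"},
    {"name": "numbackends", "type": "value"},
    {"name": "xact_commit", "type": "counter"},
]
# Hypothetical row for a database called "appdb".
metrics = row_to_metrics(
    "database:{datname}.global_stats", result_spec, ("appdb", 7.0, 123456.0)
)
for metric in metrics:
    print(metric)
```

Under the same reading, a "counter" column would be converted to a per-second rate before sending, while a "value" column would be sent as is.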

Page 20: Monitoring pg with_graphite_grafana

OpenCollector

• Since the queries are in config files, you can customize them
  • Additional WHERE clauses
  • Change from pg_stat_all_ to pg_stat_user_
  • Add your own, application specific queries
• OpenCollector is modular and allows adding other things
• OpenCollector is open source

Page 21: Monitoring pg with_graphite_grafana

osinfofdw

• osinfofdw is another open source project sponsored by OpenSCG
• A Multicorn based FDW around Python-psutil
• Access OS level statistics via SELECT
  • CPU usage
  • Memory usage
  • Disk IO
  • Network IO
  • Filesystem information

Page 22: Monitoring pg with_graphite_grafana

Links

• https://bitbucket.org/openscg/opencollector
• https://bitbucket.org/openscg/osinfofdw

Questions?