99acres.com Monitoring [graphite, collectd, statsd, seyren, logstash, ELK ...]

Time series data monitoring at 99acres.com

Jan 29, 2018

Ravi Raj

Transcript
Page 1: Time series data monitoring at 99acres.com

99acres.com Monitoring [graphite, collectd, statsd, seyren, logstash, ELK ...]

Page 2: Time series data monitoring at 99acres.com

Current Structure (single box setup)

Page 3: Time series data monitoring at 99acres.com

Current Structure (single box setup)

Page 4: Time series data monitoring at 99acres.com

Current Structure (single box setup): Carbon

An event-driven, Python-based daemon that listens on a TCP port, expecting a stream of time-series data.

Time-series data in concept: someMetric:someValue:timestamp

Carbon expects time-series data in a particular format (of two primary types - details later). Third party tools that support Graphite are used to feed properly formatted data, such as Collectd or StatsD.

Metrics can be anything from OS memory usage to event counts fired off from an application (e.g. number of times a function was called).

After Carbon receives metrics, it periodically flushes them to a storage database.
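
As a rough illustration (not from the deck itself; the Graphite host name and metric path below are made up), anything that can open a TCP socket can speak Carbon's line format of "metric.path value timestamp":

# minimal sketch: push one data point to a Carbon line receiver (2003 is the usual default port)
import socket
import time

sock = socket.create_connection(("graphite.example.com", 2003))
sock.sendall(("hosts.hostA.memory-free 1024 %d\n" % int(time.time())).encode())
sock.close()

Collectd and StatsD do essentially this for us, continuously and in bulk.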

Page 5: Time series data monitoring at 99acres.com

Current Structure (single box setup): Whisper

A lightweight, flat-file database format for storing time-series data.

It does not run as a stand-alone service or bind to a port. Carbon natively supports writing to disk in "Whisper format".

Each unique metric is stored in its own fixed-size file. If we fed in the metrics memory-free and memory-used for both Host A and Host B, a database file would be created for each host/metric pair, e.g. $WHISPER_DIR/carbon/whisper/HostA/memory-free.wsp

The size of database files is determined by the number of data points stored - this is configurable (details later).

Page 6: Time series data monitoring at 99acres.com

Current Structure (single box setup): Graphite Web

A Django web UI that can query Carbon daemons and read Whisper data to return complete metrics data, such as all memory-used values logged for Host A over the last 6 hours.

Graphite Web can be used directly for composing basic graphs.

Graphite Web provides a REST API that can be queried by third-party tools (such as Grafana) to create complete dashboards.

The API can return either raw text data or a rendered graph (.png format).
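
For example (a hedged sketch; the host and metric names are placeholders, not 99acres' real ones), the query behind "all memory-used values for Host A over the last 6 hours" can be made directly against the render endpoint:

# fetch the last 6 hours of one metric as JSON via the Graphite Web render API
import json
import urllib.request

url = ("http://graphite.example.com/render"
       "?target=hosts.hostA.memory-used&from=-6hours&format=json")
with urllib.request.urlopen(url) as resp:
    series = json.load(resp)
for s in series:
    print(s["target"], len(s["datapoints"]), "datapoints")

Swapping format=json for format=png returns a rendered graph image instead.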

Page 7: Time series data monitoring at 99acres.com

Current Structure (single box setup): Diff Configs

If we’re using a tool like Collectd to feed metrics from theoretical Host A, it's sending in line format and should be transmitting to our Graphite box on port 2003. The pickle format is used when a Carbon-Relay daemon is load-balancing / proxying metrics data to multiple Carbon-Cache daemons in a Graphite cluster. Understanding these functional boundaries should help clarify why all these listening directives with fuzzy names exist.

When Carbon-Cache writes data to disk, it stores it in the Whisper database format. As previously mentioned, a .wsp file is created per unique metric. Each file is created at the time a given metric type is first received. Every file is a fixed-size (for performance) determined by the resolution and retention configured in Whisper's config, storage-schemas.conf.
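
For reference, a storage-schemas.conf entry pairs a metric-name regex with a retention definition, and the first matching section wins (the values below are a generic sketch, not 99acres' actual schema):

[collectd]
pattern = ^collectd\.
retentions = 10s:30d,10m:180d

[default]
pattern = .*
retentions = 60s:90d

Here collectd metrics keep 10-second points for 30 days, then 10-minute rollups for 180 days; everything else keeps 1-minute points for 90 days. Those data-point counts fix the size of each .wsp file at creation time.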

Carbon-Cache and Whisper are on the side of writing data. Graphite Web is how data is queried. Basically, it's a Django app that can read metric data from one of three sources:

1. Directly from Whisper database files on disk
2. From Carbon-Cache daemons on their CACHE_QUERY_PORT (remember this directive in the carbon config? see the snippet below)
3. From other instances of Graphite Web through the REST API
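
The relevant carbon.conf directives look roughly like this (shown with the stock defaults as an assumption, not 99acres' actual config):

[cache]
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_PORT = 2004
CACHE_QUERY_PORT = 7002

Graphite Web queries CACHE_QUERY_PORT so it can also return points that Carbon-Cache is still holding in memory and has not yet flushed to Whisper.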

Page 8: Time series data monitoring at 99acres.com

Current Structure (single box setup): Diff Configs

Once data is fetched by Graphite Web, it delivers it in two fashions:

1. Makes it directly accessible in our web browser by simply visiting the Graphite Web app address, and allows us to construct our own graphs or dashboards

2. As raw data or rendered png graphs emitted through a REST API

Page 9: Time series data monitoring at 99acres.com

Frontend clustering (easy alternate)

Page 10: Time series data monitoring at 99acres.com

Frontend clustering (easy alternate)

Page 11: Time series data monitoring at 99acres.com

Issues: Frontend clustering

In statsd, the same thread is responsible for both buffering incoming metrics and performing aggregations on them at every flush interval. While computing aggregations, the thread stops listening for incoming metrics, which accumulate in the UDP buffer. As the rate of metrics increases, the UDP buffer overflows and drops metrics. We use single-threaded, event-looping frameworks in a few places (Node.js-based daemons for a couple of things, Python-based gunicorn+gevent for several), and we have seen this type of problem before. The event loop doesn’t help us when a blocking I/O operation brings processing to a halt. Sometimes we work around or solve such problems within the event-loop paradigm, and sometimes we take a completely different approach.

Page 12: Time series data monitoring at 99acres.com

Issues: Frontend clustering

If carbon-cache.py's writer thread dies, the Whisper file it was writing is not closed explicitly; in most cases the Python garbage collector would eventually come along, close the out-of-scope file descriptor, and unlock the file.

The Node.js statsd implementation topped out at around 40,000 packets per second. Scaling to 200,000 packets per second consumed 8 cores and 8 GB of RAM and still dropped packets. A lot of packets.

The same thing written in Go is benchmarked at consuming 250,000 UDP packets per second, with no packet drops, using 10 MiB of RAM and about half a CPU core.

Page 13: Time series data monitoring at 99acres.com

Issues: Frontend clustering

This is caused by duplicate Whisper files for the same metric that do not have identical data in them, which is exactly what happens during a rebalance. It also happens with replication set higher than 1, although without an outage the Whisper DBs remain identical.

https://github.com/graphite-project/graphite-web/pull/1293

https://github.com/graphite-project/whisper/pull

Page 14: Time series data monitoring at 99acres.com

Final Approach

Page 15: Time series data monitoring at 99acres.com

Final Approach ...

Page 16: Time series data monitoring at 99acres.com

Tools

https://github.com/jjneely/statsrelay

1. Written in Go; benchmarked at consuming 250,000 UDP packets per second, with no packet drops, using 10 MiB of RAM and about half a CPU core.

https://github.com/jssjr/carbonate

1. To migrate Whisper data

https://github.com/jjneely/buckytools

1. whisper-fill.py alternate
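
As a usage sketch (the paths are made up; check each repo's README for the exact invocation), the fill-style tools are pointed at a source and a destination .wsp for the same metric:

# backfill the destination Whisper file with points from the source file
whisper-fill.py /data/whisper/hosts/hostA/memory-free.wsp /mnt/restore/hosts/hostA/memory-free.wsp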

Page 17: Time series data monitoring at 99acres.com

Tools

https://github.com/bitly/statsdaemon

A more efficient implementation of StatsD

https://github.com/grobian/carbon-c-relay

A much more efficient consistent-hashing metric router

Page 18: Time series data monitoring at 99acres.com

Final Approach ...

Double the Graphite boxes.

Put a third box in front of these two Graphite boxes. This will be a dedicated Carbon-Relay.

All metrics are sent to this dedicated Carbon-Relay box. It then proxies all the data to each of the Graphite machines using the Pickle protocol and according to its own configured RELAY_METHOD.

Put a fourth box behind the two Graphite boxes. This will be a dedicated Graphite Web.

This will now be the "master" Web app to use. It will be configured to query the Graphite Web instance API running local to each Graphite box.
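
Sketched as a diagram (roles only, following the description above):

                      collectd / statsd senders
                                 |
                      [ dedicated Carbon-Relay ]        line 2003 / pickle 2004
                         /                 \
            [ Graphite box A ]       [ Graphite box B ]  carbon-cache + Whisper
              local Graphite Web       local Graphite Web
                         \                 /
                      [ master Graphite Web ]            CLUSTER_SERVERS = A, B
                                 |
                       browsers / dashboards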

Page 19: Time series data monitoring at 99acres.com

Final Approach ...

Carbon-Relay will be configured like this:

[relay]

● LINE_RECEIVER_INTERFACE = 0.0.0.0
● LINE_RECEIVER_PORT = 2003
● PICKLE_RECEIVER_INTERFACE = 0.0.0.0
● PICKLE_RECEIVER_PORT = 2004
● RELAY_METHOD = consistent-hashing
● DESTINATIONS = 10.10.17.99:2004, 10.10.17.100:2004

Page 20: Time series data monitoring at 99acres.com

Final Approach ...

The master Graphite Web local_settings.py will look like this:

CLUSTER_SERVERS = ["10.10.17.99:80", "10.10.17.100:80"]

DATABASES = {
    'default': {
        'NAME': '/opt/graphite/storage/graphite.db',
        'ENGINE': 'django.db.backends.sqlite3',
        'USER': '',
        'PASSWORD': '',
        'HOST': '',
        'PORT': ''
    }
}

Page 21: Time series data monitoring at 99acres.com

Final Approach ...

Assuming we're using consistent-hashing on every relay in the cluster, every inbound metric is hashed by name and delivered to the same host, then to the same Carbon-Cache daemon, and written to the same Whisper db, every time. Taken together, the inbound metrics are spread (practically) evenly across all Carbon-Cache daemons in the cluster.
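
To make that concrete, here is a toy hash ring in Python (a simplified illustration, not Carbon's actual ConsistentHashRing; the destinations are the two boxes from the relay config above and the metric name is a placeholder):

import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, points_per_node=100):
        # place several virtual points per node on the ring so keys
        # spread (practically) evenly across the nodes
        self.ring = []
        for node in nodes:
            for i in range(points_per_node):
                self.ring.append((self._hash("%s:%d" % (node, i)), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, metric_name):
        # walk clockwise from the metric's hash to the next node point
        idx = bisect.bisect(self.ring, (self._hash(metric_name),)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["10.10.17.99:2004", "10.10.17.100:2004"])
print(ring.get_node("hosts.hostA.memory-free"))  # same metric name, same destination, every time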

The master Graphite Web will query the APIs of all the secondary Graphite Web instances. Each instance will read from its local Whisper data and Carbon-Cache instances and return the data if it has it. The master Graphite Web instance will ultimately combine and present the data through the same ol' methods as a stand-alone Graphite setup. We can compose a single graph for a single host from metrics data stored on 100 Graphite boxes. Likewise, the aggregated data can be fed through the master Graphite Web REST API into a third-party dashboarding utility.

Page 22: Time series data monitoring at 99acres.com

Tools ...

whisper-calculator.py:

https://gist.github.com/jjmaestro/5774063

It's a Python script that we can feed Whisper retention syntax (e.g. 10s:30d,10m:180d) and it will tell how many data points it translates to.

whisper-info.py: a script included with Whisper that we can point at an existing database file to get data point and sizing values
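
Typical invocations look like this (the .wsp path is illustrative, and whisper-calculator.py is assumed to take the retention string as its only argument):

$ python whisper-calculator.py 10s:30d,10m:180d
$ whisper-info.py /opt/graphite/storage/whisper/hosts/hostA/memory-free.wsp

The first prints how many data points (and roughly how many bytes) that retention expands to; the second prints the existing file's maxRetention, xFilesFactor, aggregationMethod, fileSize, and per-archive point counts.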

Page 23: Time series data monitoring at 99acres.com

Fault tolerance

Carbon-Relay supports replication of proxied metrics, meaning each received metric will be sent to two (or more) different downstream storage nodes. This is defined on the top-level relays using the REPLICATION_FACTOR directive:

[relay]

● LINE_RECEIVER_INTERFACE = 0.0.0.0
● LINE_RECEIVER_PORT = 2003
● PICKLE_RECEIVER_INTERFACE = 0.0.0.0
● PICKLE_RECEIVER_PORT = 2004
● RELAY_METHOD = consistent-hashing
● REPLICATION_FACTOR = 2
● DESTINATIONS = 127.0.0.1:2014:1, 127.0.0.1:2024:2

Page 24: Time series data monitoring at 99acres.com

Fault tolerance...This would send metrics to two of the DESTINATIONS hosts listed. If we had 8 total storage nodes with a replication factor of 3, three separate nodes would be chosen (also through consistent-hashing) to store the data.

In terms of reading the data, Graphite Web is actually smart enough to detect duplicate data and present it only once. If data is missing entirely, we just get a gap in our graphs.

Page 25: Time series data monitoring at 99acres.com

Thanks