The Gnocchi Experiment playing with timeseries

Jan 21, 2018

Gordon Chung
Page 1: The Gnocchi Experiment

The Gnocchi Experiment

playing with timeseries

Page 2: The Gnocchi Experiment

History

● Ceilometer started in 2012
  ○ Original mission: provide an infrastructure to collect any information needed regarding OpenStack projects
● Added alarming in 2013
  ○ Create rules based on threshold conditions that, when broken, trigger an action
● Added events in 2014
  ○ The state of an object in an OpenStack service at a point in time
● New mission
  ○ To reliably collect data on the utilization of the physical and virtual resources comprising deployed clouds, persist these data for subsequent retrieval and analysis, and trigger actions when defined criteria are met.

Page 3: The Gnocchi Experiment

Ceilometer Architecture

[Diagram: OpenStack Services, Notification Bus, Notification Agents (Agent1, Agent2, … AgentN) with Pipeline, Polling Agents (Agent1, Agent2, … AgentN) with Pipeline, Collectors (Collector1, Collector2, … CollectorN), Databases (Alarms, Events, Meters), Alarm Evaluator, Alarm Notifier, API, External Systems]

Page 4: The Gnocchi Experiment

this didn’t work.

Page 5: The Gnocchi Experiment

Growing pains

● Too large of a scope - we did everything
● Too complex - must deploy everything
● Too much data - all data in one place
● Too few resources - handful of developers
● Too generic a solution - storage designed to handle any scenario
● Good at nothing, average/bad at everything

Page 6: The Gnocchi Experiment

Ceilometer Architecture

[Diagram: the architecture split across projects. Ceilometer: Notification Agents and Polling Agents (Agent1, Agent2, … AgentN) with Pipeline, and Collectors (Collector1, Collector2, … CollectorN). Gnocchi: Metrics and Metrics API. Aodh: Alarms, Alarm Evaluator, Alarm Notifier. Panko: Events and Events API. Also shown: OpenStack Services, Notification Bus, External Systems.]

Page 7: The Gnocchi Experiment

Componentisation

● Split functionality into their own projects
  ○ Faster rate of change
  ○ Less expertise required
● Important functionality lives on
● Ceilometer - data gathering and transformation service
● Gnocchi - time series storage service
● Aodh - alarming service
● Panko - event-focused storage service
● They all work together and separately

Page 8: The Gnocchi Experiment

Gnocchi

Page 9: The Gnocchi Experiment

Gnocchi use cases

● Storage brick for a billing system
● Alarm-triggering or monitoring system
● Statistical usage of data

Page 10: The Gnocchi Experiment

Ceilometer to Gnocchi

● Ceilometer legacy storage captures full-resolution data
  ○ Each datapoint has: timestamp, measurement, IDs, resource metadata, metric metadata, etc…
● Gnocchi stores pre-aggregated data in a time series
  ○ Each datapoint has: timestamp, measurement… that’s it… and then it’s compressed
  ○ Resource metadata is an explicit subset AND not tied to measurement
  ○ Defined archival rules
    ■ capture data at 1 min granularity for 1 day AND 3 hr granularity for 1 month AND ...

Page 11: The Gnocchi Experiment

Archive Policies

5 minute granularity for a day

1 day granularity for a year
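
To make the two rules on this slide concrete, here is a minimal sketch of how many aggregated points each rule retains. The list-of-tuples representation and the function names are hypothetical, not Gnocchi's own API, and "a year" is assumed to mean 365 days.

```python
from datetime import timedelta

# Hypothetical representation of an archive policy: a list of
# (granularity, timespan) rules. "A year" is assumed to be 365 days.
ARCHIVE_POLICY = [
    (timedelta(minutes=5), timedelta(days=1)),   # 5 minute granularity for a day
    (timedelta(days=1), timedelta(days=365)),    # 1 day granularity for a year
]


def points_per_rule(granularity, timespan):
    """Number of aggregated points one rule keeps per aggregation method."""
    return timespan // granularity


def total_points(policy):
    """Total aggregated points kept across all rules of a policy."""
    return sum(points_per_rule(g, t) for g, t in policy)


if __name__ == "__main__":
    for granularity, timespan in ARCHIVE_POLICY:
        print(granularity, "for", timespan, "->",
              points_per_rule(granularity, timespan), "points")
    print("total:", total_points(ARCHIVE_POLICY))
```

The point count is bounded by the policy, not by how long the metric has existed, which is what keeps storage per metric predictable.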

Page 12: The Gnocchi Experiment

How it all works...

Page 13: The Gnocchi Experiment

Ceilometer

Raw sample

{
  "user_id": "0d9d089b8f8340999fbe01354ef84643",
  "resource_id": "a7c7cf84-5bf7-4838-a116-645ea376f4e0",
  "timestamp": "2016-05-11T18:23:46.166000",
  "meter": "disk.write.bytes",
  "volume": 56114794496,
  "source": "openstack",
  "recorded_at": "2016-05-11T18:23:47.177000",
  "project_id": "dec2b73655154e31be903fc93e575146",
  "type": "cumulative",
  "id": "7fbf56ca-17a5-11e6-a210-e8bdd1f62a56",
  "unit": "B",
  "metadata": {
    "instance_host": "cloud03.wz",
    "ephemeral_gb": "0",
    "flavor.vcpus": "8",
    "OS-EXT-AZ.availability_zone": "nova",
    "memory_mb": "16384",
    "display_name": "gord_dev",
    "state": "active",
    "flavor.id": "5",
    "status": "active",
    "ramdisk_id": "None",
    "flavor.name": "m1.xlarge",
    "disk_gb": "160",
    "kernel_id": "None",
    "image.id": "dba2c73c-3f11-45a1-998a-6a4ca2cf243e",
    "flavor.ram": "16384",
    "host": "64fe410a8b602f69fe43a180c62b02d6c00e41c03caba40a092e2fb6",
    "device": "['vda']",
    "flavor.ephemeral": "0",
    "image.name": "fedora-23-x86_64"
  }
}

Page 14: The Gnocchi Experiment

Separation of value

Resource

● Id
● User_id
● Project_id
● Start_timestamp: timestamp
● End_timestamp: timestamp
● Metadata: {attribute: value}
● Metric: list

Measurements

● [ (timestamp, value), ... ]

Metric

● Name
● archive_policy
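
The separation on this slide can be sketched with plain dataclasses. This is only an illustration of the split between resource, metric, and measurements; the class layout is an assumption, not Gnocchi's schema. In particular, attaching the measurement list to `Metric` and the archive policy name `"low"` are choices made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# One measurement is just (timestamp, value); nothing else is attached to it.
Measure = Tuple[datetime, float]


@dataclass
class Metric:
    name: str
    archive_policy: str
    # Assumption of this sketch: measures are held next to the metric.
    measures: List[Measure] = field(default_factory=list)


@dataclass
class Resource:
    id: str
    user_id: str
    project_id: str
    start_timestamp: datetime
    end_timestamp: Optional[datetime] = None
    metadata: Dict[str, str] = field(default_factory=dict)
    metrics: List[Metric] = field(default_factory=list)


# Values taken from the raw Ceilometer sample shown earlier;
# the archive policy name "low" is hypothetical.
disk_write = Metric(name="disk.write.bytes", archive_policy="low")
disk_write.measures.append(
    (datetime(2016, 5, 11, 18, 23, 46, 166000), 56114794496.0)
)

vm = Resource(
    id="a7c7cf84-5bf7-4838-a116-645ea376f4e0",
    user_id="0d9d089b8f8340999fbe01354ef84643",
    project_id="dec2b73655154e31be903fc93e575146",
    start_timestamp=datetime(2016, 5, 11, 18, 23, 46, 166000),
    metadata={"display_name": "gord_dev", "flavor.name": "m1.xlarge"},
    metrics=[disk_write],
)
```

The resource metadata is stored once per resource, while each measurement carries only a timestamp and a value.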

Page 15: The Gnocchi Experiment

Gnocchi Architecture

[Diagram: API, Resource Indexer, Metric Storage, and MetricD computation workers, with data flowing between them]

Page 16: The Gnocchi Experiment

MetricD Aggregation

[Diagram: MetricD computation workers and Metric Storage (raw metric dump, computed aggregates, backlog), with arrows numbered 1 to 3 matching the steps below]

1. Get unprocessed datapoint
2. Compute new aggregations
  a. Update sum, avg, min, max, etc… values based on defined policy
3. Add datapoint to backlog for next computation
  a. Delete datapoints not required for future aggregations
  b. By default, only keep backlog for single period.
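
The three steps above can be sketched as a single processing pass over one metric. This is a simplified illustration, not MetricD's code: it assumes integer timestamps in seconds, a single granularity, in-memory lists instead of storage objects, and the function name `process` is hypothetical.

```python
from collections import defaultdict


def process(new_points, backlog, granularity):
    """One aggregation pass for a single metric.

    new_points, backlog: lists of (timestamp_seconds, value).
    Returns (aggregates, new_backlog), where aggregates maps each
    period start to its sum/min/max/mean.
    """
    # Steps 1 and 2: merge unprocessed points with the backlog and recompute
    # every period that the merged points touch.
    points = sorted(backlog + new_points)
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % granularity].append(value)

    aggregates = {
        start: {
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for start, values in buckets.items()
    }

    # Step 3: keep only the points of the latest period as the new backlog,
    # since older periods are assumed complete in this sketch.
    if not points:
        return aggregates, []
    latest = max(buckets)
    new_backlog = [(ts, v) for ts, v in points if ts - ts % granularity == latest]
    return aggregates, new_backlog


first_aggs, backlog = process([(0, 1.0), (30, 3.0), (60, 5.0)], [], 60)
second_aggs, backlog = process([(90, 7.0)], backlog, 60)
```

In the second call the point at 90s is combined with the backlogged point at 60s, so the 60s period is recomputed without re-reading the points from the first period.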

Page 17: The Gnocchi Experiment

Storage format

[Diagram: Metric Storage holding three kinds of object: raw metric dump, backlog, computed aggregates]

Raw metric dump

● [ (timestamp, value), (timestamp, value) ]
● One object per write

Backlog

● { values: { timestamp: value, timestamp: value }, block_size: max number of points, back_window: number of blocks to retain }
● Binary serialised using msgpack
● One object per metric

Computed aggregates

● { first_timestamp: first timestamp of block, aggregation_method: sum, min, max, etc…, max_size: max number of points, sampling: granularity (60s, 300s, etc…), timestamps: [ time1, time2, … ], values: [ value1, value2, … ] }
● Binary serialised using msgpack
● Compressed with LZ4
● Split into chunks to minimise transfer when updating large series
● (potentially) multiple objects per aggregate per granularity per metric
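
The idea of splitting an aggregated series into chunks, so that an update only rewrites the chunk it touches, can be sketched as follows. To stay within the standard library, this sketch swaps in `json` and `zlib` as stand-ins for msgpack and LZ4; the function names and the `points_per_split` parameter are hypothetical, and the payload carries only some of the fields listed above.

```python
import json
import zlib


def split_key(timestamp, sampling, points_per_split):
    """Start timestamp of the chunk that a point falls into."""
    span = sampling * points_per_split
    return timestamp - timestamp % span


def split_series(series, sampling, points_per_split):
    """Group (timestamp, value) points into chunks keyed by chunk start."""
    splits = {}
    for ts, value in series:
        key = split_key(ts, sampling, points_per_split)
        splits.setdefault(key, []).append((ts, value))
    return splits


def serialise(points, sampling, aggregation_method):
    """Pack one chunk. json + zlib stand in for msgpack + LZ4 here."""
    payload = {
        "first_timestamp": points[0][0],
        "aggregation_method": aggregation_method,
        "sampling": sampling,
        "timestamps": [ts for ts, _ in points],
        "values": [value for _, value in points],
    }
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def deserialise(blob):
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Ten points at 60s sampling, four points per chunk: three chunks.
series = [(i * 60, float(i)) for i in range(10)]
splits = split_series(series, sampling=60, points_per_split=4)
blobs = {key: serialise(pts, 60, "mean") for key, pts in splits.items()}
```

A new point at timestamp 600 would fall into the chunk starting at 480, so only that object would need to be fetched, updated, and written back.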

Page 18: The Gnocchi Experiment

Query path

What’s the cpu utilisation for VM1?

[Diagram: API, Resource Indexer, Metric Storage; arrows labelled resource_id, metric_id, and Measures (all granularities)]

+---------------------------+-------------+----------------+
| timestamp                 | granularity | value          |
+---------------------------+-------------+----------------+
| 2016-04-07T00:00:00+00:00 | 86400.0     | 0.30323927544  |
| 2016-04-07T17:00:00+00:00 | 3600.0      | 1.2855184725   |
| 2016-04-07T18:00:00+00:00 | 3600.0      | 0.188613527791 |
| 2016-04-07T19:00:00+00:00 | 3600.0      | 0.188871232024 |
| 2016-04-07T20:00:00+00:00 | 3600.0      | 0.188876901916 |
| 2016-04-07T21:00:00+00:00 | 3600.0      | 0.189646641908 |
| 2016-04-07T21:10:00+00:00 | 300.0       | 0.190019839676 |
| 2016-04-07T21:15:00+00:00 | 300.0       | 0.186565358466 |
| 2016-04-07T21:20:00+00:00 | 300.0       | 0.183166934543 |
| 2016-04-07T21:25:00+00:00 | 300.0       | 0.179994544916 |
| 2016-04-07T21:30:00+00:00 | 300.0       | 0.186649908928 |
| 2016-04-07T21:35:00+00:00 | 300.0       | 0.193315212093 |
| 2016-04-07T21:40:00+00:00 | 300.0       | 0.193272093903 |
| 2016-04-07T21:45:00+00:00 | 300.0       | 0.196677374077 |
| 2016-04-07T21:50:00+00:00 | 300.0       | 0.193300189049 |
+---------------------------+-------------+----------------+

Page 19: The Gnocchi Experiment

Query path

What’s the metadata for VM1?

[Diagram: API, Resource Indexer, Metric Storage; arrows labelled resource_id and resource]

+-----------------------+----------------------------------------------------------------+
| Field                 | Value                                                          |
+-----------------------+----------------------------------------------------------------+
| created_by_project_id | f7481a38d7c543528d5121fab9eb2b99                               |
| created_by_user_id    | 9246f424dcb341478067967f495dc133                               |
| display_name          | test3                                                          |
| ended_at              | None                                                           |
| flavor_id             | 1                                                              |
| host                  | 7f218c8350a86a71dbe6d14d57e8f74fa60ac360fee825192a6cf624       |
| id                    | e90974a6-31bf-4e47-8824-ca074cd9b47d                           |
| image_ref             | 671375cc-177b-497a-8551-4351af3f856d                           |
| metrics               | cpu.delta: 20cd1d71-de2f-43d5-90a8-b23ad31a7d04                |
|                       | cpu_util: 22cd22e7-e48e-4f21-887a-b1c6612b4c98                 |
|                       | disk.iops: 9611a114-d37e-42e7-9b0c-0fb5e61d96c8                |
|                       | disk.latency: 6205c66f-2a5d-49c8-85e6-aa7572cfb34a             |
|                       | disk.root.size: c9f9ca31-7e54-4dd7-81ad-129d86951dbc           |
|                       | disk.usage: 4f29ca2e-d58f-40a9-94a7-15084233c1bb               |
| original_resource_id  | e90974a6-31bf-4e47-8824-ca074cd9b47d                           |
| project_id            | 71bf402adea343609f2192ce998fa38e                               |
| revision_end          | None                                                           |
| revision_start        | 2016-04-07T17:32:33.245924+00:00                               |
| server_group          | None                                                           |
| started_at            | 2016-04-07T17:32:25.740862+00:00                               |
| type                  | instance                                                       |
| user_id               | fd3eb127863b4177bf1abb38dda1f557                               |
+-----------------------+----------------------------------------------------------------+

Page 20: The Gnocchi Experiment

Zero computation at query. Only lookup.
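
A minimal sketch of what "only lookup" means on the query path: the indexer resolves a resource and metric name to a metric id, and storage returns the aggregates already computed for that id. The two dictionaries are in-memory stand-ins for the Resource Indexer and Metric Storage, and `get_measures` is a hypothetical name. Pairing the cpu_util metric id from the resource slide with rows from the measures slide is an assumption made for illustration.

```python
# Stand-in for the Resource Indexer: resource_id -> metric name -> metric_id.
INDEXER = {
    "e90974a6-31bf-4e47-8824-ca074cd9b47d": {
        "cpu_util": "22cd22e7-e48e-4f21-887a-b1c6612b4c98",
    },
}

# Stand-in for Metric Storage: metric_id -> precomputed
# (timestamp, granularity, value) rows, all granularities together.
STORAGE = {
    "22cd22e7-e48e-4f21-887a-b1c6612b4c98": [
        ("2016-04-07T00:00:00+00:00", 86400.0, 0.30323927544),
        ("2016-04-07T21:00:00+00:00", 3600.0, 0.189646641908),
        ("2016-04-07T21:50:00+00:00", 300.0, 0.193300189049),
    ],
}


def get_measures(resource_id, metric_name):
    """Two dictionary lookups, no aggregation at query time."""
    metric_id = INDEXER[resource_id][metric_name]
    return STORAGE[metric_id]


rows = get_measures("e90974a6-31bf-4e47-8824-ca074cd9b47d", "cpu_util")
```

All the arithmetic was done when the points were written, so the cost of a query does not depend on how many raw datapoints were ever received.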

Page 21: The Gnocchi Experiment

Results (benchmark data, Gnocchi 1.3.x)

Page 22: The Gnocchi Experiment

Ceilometer to Gnocchi

Ceilometer legacy storage

● A single datapoint averages ~1.5KB/point (mongodb) or ~150B/point (SQL)

● For 1000 VMs, capturing 10 metrics/VM, every minute: ~15MB/minute, ~900MB/hour, ~21GB/day, etc…

Gnocchi

● A single datapoint is AT MOST 9B/point

● For 1000 VMs, capturing 10 metrics/VM, every minute: ~90KB/minute, ~5.4MB/hour, ~130MB/day, etc…
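
The figures on this slide follow from simple multiplication, which the sketch below reproduces. It assumes decimal units (1KB = 1000B) and uses the mongodb figure of ~1.5KB/point for Ceilometer; the function name is hypothetical.

```python
def footprint(vms, metrics_per_vm, bytes_per_point):
    """Bytes written per minute, hour, and day at one point per metric per minute."""
    per_minute = vms * metrics_per_vm * bytes_per_point
    per_hour = per_minute * 60
    per_day = per_hour * 24
    return per_minute, per_hour, per_day


# Ceilometer legacy storage on mongodb: ~1.5KB per point (1500 bytes assumed).
ceilometer = footprint(vms=1000, metrics_per_vm=10, bytes_per_point=1500)

# Gnocchi: at most 9 bytes per point.
gnocchi = footprint(vms=1000, metrics_per_vm=10, bytes_per_point=9)

if __name__ == "__main__":
    for name, (minute, hour, day) in (("ceilometer", ceilometer), ("gnocchi", gnocchi)):
        print(f"{name}: {minute / 1e6:.2f} MB/min, "
              f"{hour / 1e6:.1f} MB/hour, {day / 1e9:.2f} GB/day")
```

With these inputs the daily totals come to 21.6GB and about 0.13GB, matching the rounded ~21GB/day and ~130MB/day on the slide.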