Diagnosis via monitoring & tracing
Greg Ganger, Garth Gibson, Majd Sakr (adapted from Raja Sambasivan)
15-719/18-847b: Advanced Cloud Computing, CMU, Spring 2017 (revised 4/3/2017)

Problem diagnosis is difficult
• For developers of clouds
• For cloud users (i.e., software developers)
  • Must debug own applications
  • Must debug interactions w/ cloud
  • E.g., is a slowdown due to other VMs or my app?
Ganglia federates multiple clusters together using a tree of point-to-point connections. Each leaf node specifies a node in a specific cluster being federated, while nodes higher up in the tree specify aggregation points. Since each cluster node contains a complete copy of its cluster's monitoring data, each leaf node logically represents a distinct cluster while each non-leaf node logically represents a set of clusters. (We specify multiple cluster nodes for each leaf to handle failures.) Aggregation at each point in the tree is done by polling child nodes at periodic intervals. Monitoring data from both leaf nodes and aggregation points is then exported using the same mechanism, namely a TCP connection to the node being polled followed by a read of all its monitoring data.
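As a rough sketch of that polling mechanism (not Ganglia's actual code), the snippet below opens a TCP connection to a child node, reads its complete monitoring dump, and parses the XML it returns; the host list and the port number are placeholder assumptions.

```python
import socket
import xml.etree.ElementTree as ET

def poll_node(host, port=8649):
    """Connect to a node (gmond or a lower-level gmetad), read all of its
    monitoring data, and return the parsed XML tree."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        while True:
            data = sock.recv(4096)
            if not data:              # the node closes the connection after its dump
                break
            chunks.append(data)
    return ET.fromstring(b"".join(chunks))

def aggregate(children):
    """One polling round at an aggregation point: poll every child node."""
    return {host: poll_node(host) for host in children}
```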
4. Implementation
The implementation consists of two daemons, gmond and gmetad, a command-line program gmetric, and a client-side library. The Ganglia monitoring daemon (gmond) provides monitoring on a single cluster by implementing the listen/announce protocol and responding to client requests by returning an XML representation of its monitoring data. gmond runs on every node of a cluster. The Ganglia Meta Daemon (gmetad), on the other hand, provides federation of multiple clusters. A tree of TCP connections between multiple gmetad daemons allows monitoring information for multiple clusters to be aggregated. Finally, gmetric is a command-line program that applications can use to publish application-specific metrics, while the client-side library provides programmatic access to a subset of Ganglia's features.
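For instance, an application might publish a custom metric by invoking gmetric from a helper script; the metric name and value below are made up, and the exact flag spellings should be checked against the installed version of gmetric.

```python
import subprocess

# Publish an application-specific metric through the gmetric command-line tool.
# Metric name, value, and flag names here are illustrative assumptions.
subprocess.run(
    ["gmetric", "--name=app_request_latency_ms",
     "--value=42.7", "--type=double", "--units=ms"],
    check=True,
)
```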
4.1. Monitoring on a single cluster
Monitoring on a single cluster is implemented by the Ganglia monitoring daemon (gmond). gmond is organized as a collection of threads, each assigned a specific task.
…the work done in a system on behalf of a given initiator. For example, Figure 1 shows a service with 5 servers: a front-end (A), two middle tiers (B and C), and two backends (D and E). When a user request (the initiator in this case) arrives at the front end, it sends two RPCs to servers B and C. B can respond right away, but C requires work from backends D and E before it can reply to A, which in turn responds to the originating request. A simple yet useful distributed trace for this request would be a collection of message identifiers and timestamped events for every message sent and received at each server.
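A minimal version of such a trace record could look like the following sketch (field names are ours, not Dapper's): each server appends one timestamped event per message sent or received, keyed by an identifier that both endpoints share.

```python
import time
from dataclasses import dataclass, field

@dataclass
class MessageEvent:
    message_id: str                  # shared by the sender and receiver of one message
    server: str                      # e.g., "A".."E" from Figure 1
    kind: str                        # "send" or "recv"
    timestamp: float = field(default_factory=time.time)

# The per-request trace is simply the collection of these events; joining
# them on message_id recovers which servers the request touched and when.
trace = []  # list of MessageEvent
trace.append(MessageEvent(message_id="rpc-1", server="A", kind="send"))
trace.append(MessageEvent(message_id="rpc-1", server="B", kind="recv"))
```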
Two classes of solutions have been proposed to aggregate this information so that one can associate all record entries with a given initiator (e.g., RequestX in Figure 1): black-box and annotation-based monitoring schemes. Black-box schemes [1, 15, 2] assume there is no additional information other than the message record described above, and use statistical regression techniques to infer that association. Annotation-based schemes [3, 12, 9, 16] rely on applications or middleware to explicitly tag every record with a global identifier that links these message records back to the originating request. While black-box schemes are more portable than annotation-based methods, they need more data in order to gain sufficient accuracy due to their reliance on statistical inference. The key disadvantage of annotation-based methods is, obviously, the need to instrument programs. In our environment, since all applications use the same threading model, control flow and RPC system, we found that it was possible to restrict instrumentation to a small set of common libraries, and achieve a monitoring system that is effectively transparent to application developers.
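The annotation-based idea of confining instrumentation to a few shared libraries can be sketched as follows; the header name, the incoming-headers dict, and the stub.invoke call are hypothetical stand-ins for a common RPC layer, not any real framework's API.

```python
import contextvars
import uuid

# Thread/task-local storage for the identifier of the originating request.
current_request_id = contextvars.ContextVar("request_id", default=None)

def on_incoming_request(headers):
    """Called by the (hypothetical) shared RPC library on every inbound call."""
    # Reuse the caller's id if one was propagated; otherwise we are the initiator.
    rid = headers.get("x-request-id") or uuid.uuid4().hex
    current_request_id.set(rid)
    return rid

def rpc_call(stub, method, payload):
    """Outbound wrapper: tag every message record with the same global id."""
    headers = {"x-request-id": current_request_id.get()}
    return stub.invoke(method, payload, headers=headers)   # hypothetical API
```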
We tend to think of a Dapper trace as a tree of nested RPCs. However, our core data model is not restricted to our particular RPC framework; we also trace activities such as SMTP sessions in Gmail, HTTP requests from the outside world, and outbound queries to SQL servers. Formally, we model Dapper traces using trees, spans, and annotations.
2.1 Trace trees and spans
In a Dapper trace tree, the tree nodes are basic units of work which we refer to as spans. The edges indicate a causal relationship between a span and its parent span. Independent of its place in a larger trace tree, though, a span is also a simple log of timestamped records which encode the span's start and end time, any RPC timing data, and zero or more application-specific annotations as discussed in Section 2.3.
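Viewed this way, a span is essentially a small append-only log, as in the sketch below (a paraphrase of the description above, not Dapper's actual data model or field names).

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Span:
    name: str                                   # human-readable span name
    start: float = field(default_factory=time.time)
    end: Optional[float] = None
    records: List[Tuple[float, str]] = field(default_factory=list)

    def log(self, text: str) -> None:
        """Append a timestamped record (RPC timing data, an annotation, ...)."""
        self.records.append((time.time(), text))

    def finish(self) -> None:
        self.end = time.time()
```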
We illustrate how spans form the structure of a larger trace in Figure 2. Dapper records a human-readable span name for each span, as well as a span id and parent id
Figure 2: The causal and temporal relationships between five spans in a Dapper trace tree.
in order to reconstruct the causal relationships between the individual spans in a single distributed trace. Spans created without a parent id are known as root spans. All spans associated with a specific trace also share a common trace id (not shown in the figure). All of these ids are probabilistically unique 64-bit integers. In a typical Dapper trace we expect to find a single span for each RPC, and each additional tier of infrastructure adds an additional level of depth to the trace tree.
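Generating those identifiers is straightforward; the sketch below (helper names are ours) produces probabilistically unique 64-bit ids, leaves the parent id empty for root spans, and gives every child span its parent's trace id so the tree can be reconstructed later.

```python
import secrets

def new_id():
    """A probabilistically unique 64-bit integer."""
    return secrets.randbits(64)

def start_root_span():
    """A root span has no parent id; it also fixes the trace id shared by all spans."""
    trace_id = new_id()
    return {"trace_id": trace_id, "span_id": new_id(), "parent_id": None}

def start_child_span(parent):
    """Each additional tier of infrastructure adds one level to the trace tree."""
    return {"trace_id": parent["trace_id"],   # common to every span in the trace
            "span_id": new_id(),
            "parent_id": parent["span_id"]}

# Reconstructing the trace tree later is a join of span_id against parent_id.
```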
Figure 3 provides a more detailed view of the logged events in a typical Dapper trace span. This particular span describes the longer of the two "Helper.Call" RPCs in Figure 2. Span start and end times as well as any RPC timing information are recorded by Dapper's RPC library instrumentation. If application owners choose to augment the trace with their own annotations (like the "foo" annotation in the figure), these are also recorded with the rest of the span data.
It is important to note that a span can contain information from multiple hosts; in fact, every RPC span contains annotations from both the client and server processes, making two-host spans the most common ones. Since the timestamps on client and server come from different host machines, the analysis must be mindful of clock skew between them.
Figure 3: A detailed view of a single span from Figure 2.