Distributed Clustering for Robust Aggregation in Large Networks Ittay Eyal, Idit Keidar, Raphi Rom Technion, Israel

Dec 16, 2015

Page 1

Distributed Clustering for Robust Aggregation in Large Networks

Ittay Eyal, Idit Keidar, Raphi Rom

Technion, Israel

Page 2

Aggregation in Sensor Networks – Applications

Temperature sensors thrown in the woods

Seismic sensors

Grid computing load

Page 3

Aggregation in Sensor Networks – Applications

• Large networks with light nodes.
• Target is a function of all sensed data.

Average, min, majority, etc.

Page 4

Tree Aggregation and Gossip

Hierarchical solution

Fast - O(height of tree)

Disadvantages:
• Limited to static topology.
• No failure robustness.

• D. Kempe, A. Dobra, and J. Gehrke. Gossip-based computation of aggregate information. In FOCS, 2003.

• S. Nath, P. B. Gibbons, S. Seshan, and Z. R. Anderson. Synopsis diffusion for robust aggregation in sensor networks. In SenSys, 2004.

[Figure: aggregation tree over leaf values 1, 3, 9, 11, with intermediate aggregates 2 and 10.]
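The hierarchical scheme can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the nested-list tree encoding and the function name `aggregate_subtree` are hypothetical. Each subtree reports a (sum, count) pair to its parent, so the root learns the exact average after a single pass up the tree, which is where the O(height of tree) cost comes from.

```python
def aggregate_subtree(node):
    """Return (sum, count) for a subtree.

    A leaf is a number; an internal node is a list of children.
    (Hypothetical encoding chosen only for this sketch.)
    """
    if isinstance(node, (int, float)):
        return float(node), 1
    total, count = 0.0, 0
    for child in node:
        child_sum, child_count = aggregate_subtree(child)
        total += child_sum
        count += child_count
    return total, count


# Leaf values taken from the slide's example.
tree = [[1, 3], [9, 11]]

left_sum, left_count = aggregate_subtree(tree[0])
right_sum, right_count = aggregate_subtree(tree[1])
root_sum, root_count = aggregate_subtree(tree)

left_average = left_sum / left_count      # average of 1 and 3
right_average = right_sum / right_count   # average of 9 and 11
root_average = root_sum / root_count      # average of all four leaves
```

If any internal node fails, its whole subtree drops out of the result, which illustrates the lack of failure robustness noted above.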

Page 5

Tree Aggregation and Gossip

[Figure: four nodes holding the values 1, 3, 9, 11.]

Gossip:

Each node maintains a synopsis.

Page 6

Tree Aggregation and Gossip

[Figure: four nodes holding the values 1, 3, 9, 11.]

Gossip:

Each node maintains a synopsis.

Occasionally, each node contacts a neighbor and they improve their synopses.

[Figure: node values after a gossip exchange: 5, 5, 7, 7.]

Page 7

Tree Aggregation and Gossip

[Figure: four nodes holding the values 5, 7, 5, 7.]

Gossip:

Each node maintains a synopsis.

Indifferent to topology changes. Crash robust.

Occasionally, each node contacts a neighbor and they improve their synopses.

[Figure: node values after a further gossip exchange: 6, 6, 6, 6.]
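The gossip steps on these slides can be sketched as pairwise averaging. This is a minimal sketch, assuming the synopsis is a single number and that contacting a neighbor replaces both values with their average; the function name `gossip_round` and the fixed pairing schedule are assumptions chosen to reproduce the numbers in the figures. A real deployment would pick neighbors at random.

```python
def gossip_round(values, pairs):
    """Apply one round of pairwise averaging.

    `pairs` lists (i, j) node indices that contact each other in this
    round; both nodes end up holding the average of their two values.
    """
    updated = list(values)
    for i, j in pairs:
        average = (updated[i] + updated[j]) / 2
        updated[i] = average
        updated[j] = average
    return updated


# Initial node values from the slides.
values = [1, 3, 9, 11]

# Hypothetical schedule: first 1<->9 and 3<->11, then neighbors again.
after_first = gossip_round(values, [(0, 2), (1, 3)])
after_second = gossip_round(after_first, [(0, 1), (2, 3)])
```

Each exchange preserves the sum of the two values involved, so the network-wide average stays fixed while every node converges toward it. No node depends on a fixed parent, which is why the scheme is indifferent to topology changes.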

Page 8

Outliers - Definition

Samples deviating from the distribution of the bulk of the data.

Page 9

Sources of Outliers

Sensor malfunction: short circuit in a seismic sensor

Sensing error: an animal sitting on a temperature sensor

Software bugs: in grid computing, a machine reports negative CPU usage

Interesting info:
• DDoS: irregular load on some machines in a grid
• Fire outbreak: extremely high temperature in a certain area of the woods
• Intrusion: a truck driving by a seismic detector

Page 10

The Implications of Outliers

A single erroneous sample can radically offset the data.

31 105

27°

The average (47°) doesn’t tell the whole story.

25° 26° 25° 28° 98° 120° 27°
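The effect can be checked with simple arithmetic on the seven readings listed on this slide. This is a minimal sketch; the 50-degree cutoff used to separate the bulk from the outliers is a hypothetical threshold chosen only for illustration. Note that these seven values alone average to roughly 49.9 rather than the 47° quoted on the slide, which may reflect values in the original figure that did not survive in this transcript.

```python
# Readings as listed on the slide (degrees).
readings = [25, 26, 25, 28, 98, 120, 27]

# Plain average over all samples, outliers included.
mean_all = sum(readings) / len(readings)

# Average over the bulk only; the cutoff of 50 is an assumption
# made for this example, not part of the original material.
bulk = [r for r in readings if r < 50]
mean_bulk = sum(bulk) / len(bulk)
```

Two extreme readings out of seven pull the average from about 26 up to about 50, a value that describes neither the bulk nor the outliers.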

Page 11

Outlier Detection Challenge

A double bind:

• A centralized solution won’t cut it:
• Bandwidth limitations
• Power limitations
• Storage limitations

No one in the system has enough information.

[Figure: regular data distribution and outliers.]

Page 12

Aggregate Clusters

Each cluster has its own mean and mass. A bounded number (k) of clusters is maintained.

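Original samples: a = 1, b = 3, c = 9, d = 11, each with mass 1. Here k = 2.

Aggregation of a and b: clusters at 1 (mass 1) and 3 (mass 1). (No compression)

Aggregation of a, b and c: clusters at 2 (mass 2) and 9 (mass 1).

Aggregation result: clusters at 2 (mass 2) and 10 (mass 2).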
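The example on this slide can be reproduced with a short sketch. It assumes one-dimensional samples and a simple compression rule, merging the two clusters whose means are closest whenever more than k exist. That rule and the names `merge` and `summarize` are assumptions for illustration; the slide states only that a bounded number of clusters is maintained.

```python
def merge(a, b):
    """Merge two (mean, mass) clusters into one mass-weighted cluster."""
    mass = a[1] + b[1]
    mean = (a[0] * a[1] + b[0] * b[1]) / mass
    return (mean, mass)


def summarize(samples, k):
    """Fold samples into at most k (mean, mass) clusters."""
    clusters = []
    for x in samples:
        clusters.append((float(x), 1))
        clusters.sort()
        while len(clusters) > k:
            # In sorted 1-D data the closest pair is always adjacent.
            i = min(range(len(clusters) - 1),
                    key=lambda n: clusters[n + 1][0] - clusters[n][0])
            clusters[i:i + 2] = [merge(clusters[i], clusters[i + 1])]
    return clusters


two = summarize([1, 3], k=2)           # no compression needed
three = summarize([1, 3, 9], k=2)      # 1 and 3 are merged
four = summarize([1, 3, 9, 11], k=2)   # 9 and 11 are merged as well
```

Because each merged cluster keeps its mass, the weighted mean over all clusters still equals the plain average of the original samples.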

Page 13

But what does the mean mean?

The variance must be taken into account

Page 14

Gossip Aggregation of Gaussian Clusters

Each cluster is described by:
• Mass
• Mean
• Covariance matrix

Distribution is described as k clusters.

Nodes gossip: unify synopses by merging close clusters.

[Figure: clusters a and b are merged into a single cluster.]
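One common way to merge two Gaussian clusters is moment matching: the merged cluster keeps the total mass, the mass-weighted mean, and the covariance of the combined mixture. The sketch below illustrates that technique under stated assumptions; the slides do not spell out the merge rule, and the function name `merge_gaussians` and the (mass, mean, covariance) tuple layout are hypothetical.

```python
def merge_gaussians(a, b):
    """Merge two clusters given as (mass, mean, cov) by moment matching.

    mean is a list of d floats and cov is a d x d list of lists.
    """
    (mass_a, mean_a, cov_a), (mass_b, mean_b, cov_b) = a, b
    mass = mass_a + mass_b
    d = len(mean_a)
    mean = [(mass_a * mean_a[i] + mass_b * mean_b[i]) / mass
            for i in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Spread contributed by each cluster's offset from the new mean.
            spread_a = (mean_a[i] - mean[i]) * (mean_a[j] - mean[j])
            spread_b = (mean_b[i] - mean[i]) * (mean_b[j] - mean[j])
            cov[i][j] = (mass_a * (cov_a[i][j] + spread_a)
                         + mass_b * (cov_b[i][j] + spread_b)) / mass
    return mass, mean, cov


# Two unit-mass clusters with identity covariance, two units apart.
identity = [[1.0, 0.0], [0.0, 1.0]]
a = (1.0, [0.0, 0.0], identity)
b = (1.0, [2.0, 0.0], identity)
mass, mean, cov = merge_gaussians(a, b)
```

In this example the merged covariance grows along the axis that separated the two means and stays unchanged along the other, so the merged cluster records how spread out its contents are rather than only where they are centered.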

Page 15

It works where it matters

Not Interesting

Easy

Page 16

Protocol is Crash Robust

Regular, 5% Crash probability per round

Standard, No Crashes

Outlier robust, with and without crashes

Page 17

Describe Elaborate Data

Page 18

Describe Elaborate Data

Temperature sensors on a fence by the woods

[Figure: three panels, each with regions labeled Fire and No Fire.]

Page 19

Summary

Robust Aggregation requires outlier detection

27° 27° 27° 27° 27° 98° 120°

We present outlier detection by Gaussian clustering:

Merge

Page 20

Summary – Our Protocol

Outlier Detection (where it’s important)

Crash Robustness

Elaborate Data

Page 21

Ittay Eyal, Idit Keidar, Raphael Rom. Distributed Clustering for Robust Aggregation in Large Networks, Technion, 2009

Thank you