Page 1:

Thomas G. Dietterich

School of EECS

Oregon State University

Corvallis, Oregon 97331

http://www.eecs.oregonstate.edu/~tgd

Learning and Inference in the Knowledge Plane

Page 2:

Claim: KP applications will be driven by learned models

[Diagram: the Network Model links Traffic, Configurations, and Performance Measures]

Example: a configuration/traffic model captures the tripartite relationship among network configurations, traffic properties, and network performance measures.

Configuration X + Traffic Y ⇒ Performance level Z
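A minimal sketch of what such a learned model could look like, assuming tabular features and synthetic data (all variable names and the choice of a random-forest regressor are illustrative assumptions, not the slides' method):

```python
# Sketch: learn Performance = f(Configuration, Traffic) from observed records.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic observations: columns 0-1 are hypothetical configuration knobs,
# columns 2-3 are hypothetical traffic properties (e.g., load, session length).
X = rng.uniform(size=(500, 4))
# Performance measure (e.g., throughput) as an unknown function of both.
y = X[:, 0] * X[:, 2] - 0.5 * X[:, 3] + 0.05 * rng.normal(size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Query the model: Configuration X + Traffic Y => predicted performance Z.
config, traffic = [0.8, 0.3], [0.6, 0.2]
z_hat = model.predict([config + traffic])[0]
print(f"predicted performance level: {z_hat:.3f}")
```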

Page 3:

Network Model Drives Configuration

[Diagram: the Configuration Engine takes a traffic mix and performance objectives as input, consults the Network Model (Traffic, Configurations, Performance Measures), and outputs a proposed network configuration]

Page 4:

Roles for Learning

Learn the network model:
  measure configuration information
  measure traffic properties (protocol mix, session lengths, error rates, …)
  measure performance (throughput, E2E delays, errors, application-level measures)
  fit the model

Page 5:

Roles for Learning (2)

Improving the configuration engine:
  learn repair rules by observing operator-initiated repairs
  learn heuristics for rapidly finding good solutions
  cache previous good solutions

Page 6:

Models for WHY

[Diagram: the Network Model (Traffic, Configurations, Performance Measures) is paired with a Sensor Model that records the variables measured, their error bounds, and their costs]

Page 7:

Network and Sensor Models Drive Diagnosis and Repair

[Diagram: the Diagnosis Engine receives an observed anomaly or a user's complaint, consults the Network Model and the Sensor Model, drives Sensors and Interventions, and outputs a diagnosis and recommended repair]

Process: the user makes a complaint; the Diagnosis Engine (DE) chooses a measurement or intervention, executes it, receives the results, and repeats until it outputs a diagnosis.

Page 8:

Example: Bayesian Network Drives Diagnosis

Page 9:

Semantics

Every node stores a conditional probability distribution:

Bat State   Charge   P(Power | BS, C)
Ok          Ok       0.99
Ok          Bad      0.10
Worn        Ok       0.45
Worn        Bad      0.01
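As a concrete illustration, the table above can be stored as a lookup keyed by the parent values (a minimal sketch; reading the entries as P(Power = Ok | …) is an assumption about the slide's table):

```python
# Sketch: CPT for P(Power = Ok | BatteryState, Charge) from the table above.
CPT_POWER_OK = {
    ("Ok",   "Ok"):  0.99,
    ("Ok",   "Bad"): 0.10,
    ("Worn", "Ok"):  0.45,
    ("Worn", "Bad"): 0.01,
}

def p_power(power: str, battery_state: str, charge: str) -> float:
    """Return P(Power = power | BatteryState = battery_state, Charge = charge)."""
    p_ok = CPT_POWER_OK[(battery_state, charge)]
    return p_ok if power == "Ok" else 1.0 - p_ok

print(p_power("Ok", "Worn", "Ok"))    # 0.45
print(p_power("Bad", "Ok", "Bad"))    # 0.90
```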

Page 10:

Diagnostic Process

Interventions:
  observation: observe Radio
  repair attempt: fill gas tank
  observe & repair: inspect fuel pump, replace if bad

Algorithm (a sketch follows below):
  compute P(component is bad | evidence)
  repair the component that maximizes P(bad)/cost
  choose the observation that maximizes the value of information
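A minimal sketch of the greedy repair rule, assuming a posterior P(bad | evidence) per component and a per-component repair cost; update_posterior is a hypothetical stand-in for re-running Bayesian-network inference after each repair:

```python
# Sketch: greedy diagnostic loop. Repair the component with the highest
# P(bad | evidence) / cost ratio, update beliefs, and repeat.
def greedy_repair(posterior, costs, update_posterior, max_steps=10):
    for _ in range(max_steps):
        target = max(posterior, key=lambda c: posterior[c] / costs[c])
        if posterior[target] < 0.05:          # nothing plausibly broken
            break
        print(f"repair {target} (P(bad) = {posterior[target]:.2f})")
        posterior = update_posterior(target, posterior)

# Toy usage mirroring the slides' car example; the numbers are invented.
beliefs = {"SparkPlugs": 0.60, "Battery": 0.25, "FuelPump": 0.15}
costs   = {"SparkPlugs": 1.0,  "Battery": 3.0,  "FuelPump": 5.0}
greedy_repair(beliefs, costs,
              lambda c, p: {k: (0.01 if k == c else v) for k, v in p.items()})
```

Choosing an observation instead would compare each action's expected reduction in diagnosis cost, i.e., its one-step value of information.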

Page 11:

Example

Choose SparkPlugs as the next component to repair. Repair it, update the probabilities, and repeat.

Page 12:

Role for Learning

Learn the sensor model:
  the basic sensor model is manually engineered
  learn error bounds and costs (e.g., time delay, traffic impact)

Page 13:

Anomaly Detection Models

[Diagram: three anomaly-detection models, mapping Traffic, Configurations, and Routes each to a measure of "normalness"]

Monitor network for unusual traffic, configurations, and routes

Anomalies are phenomena to be understood, not alarms to be raised.

Page 14:

Role for Learning

Learn these models by observing traffic, configurations, routes

Page 15:

Model Properties

Spatially distributed: replicated/cached
Hierarchical: multiple levels of abstraction
Constantly maintained

Page 16:

Available Technology: Configuration Engine

Existing formulations: constraint-satisfaction problems (CSPs) with an objective function
Systematic and repair-based solution methods
Some ideas for how to incorporate learning (see the sketch below)
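A minimal, self-contained sketch of configuration as a CSP with an objective; the variables, the constraint, and the objective are invented for illustration, and brute-force enumeration stands in for real systematic or repair-based search:

```python
# Sketch: pick the best feasible configuration under a constraint + objective.
from itertools import product

domains = {                     # hypothetical configuration variables
    "link_mtu":  [1500, 9000],
    "queue":     ["fifo", "red"],
    "buffer_kb": [64, 256, 1024],
}

def feasible(cfg):
    # Hypothetical constraint: jumbo frames require larger buffers.
    return not (cfg["link_mtu"] == 9000 and cfg["buffer_kb"] < 256)

def objective(cfg):
    # Hypothetical objective: favor large MTU, penalize large buffers.
    return cfg["link_mtu"] / 1500 - cfg["buffer_kb"] / 1024

names = list(domains)
best = max((dict(zip(names, vals)) for vals in product(*domains.values())),
           key=lambda cfg: objective(cfg) if feasible(cfg) else float("-inf"))
print("proposed configuration:", best)
```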

Page 17:

Available Technology (2): Diagnostic Engine

Special cases where optimal diagnosis is tractable:
  single fault; all actions are repair attempts
  single fault; all actions are pure observations

Widely used heuristic: one-step value of information (greedy approximation)

Fully general approach: partially observable Markov decision process (POMDP); some approximation algorithms are available

Page 18:

Available Technology (3): Anomaly Detection

Unsupervised learning methods:
  clustering
  probability density estimation
  one-class classification formulation (see the sketch below)
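A minimal sketch of the one-class formulation using scikit-learn's OneClassSVM (the traffic features and numbers are invented; any density estimator or clustering method could stand in):

```python
# Sketch: one-class classifier trained on (assumed-normal) traffic windows.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical per-window traffic features: [packets/sec, mean RTT (sec)].
normal_traffic = rng.normal(loc=[1000.0, 0.05],
                            scale=[100.0, 0.01], size=(500, 2))

detector = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(normal_traffic)

new_windows = np.array([[1020.0, 0.049],   # looks like the training data
                        [9000.0, 0.300]])  # suspicious burst
print(detector.predict(new_windows))       # +1 = normal, -1 = anomaly
```

Consistent with the slide above, a −1 here marks a phenomenon to investigate, not necessarily an alarm to raise.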

Page 19:

Research Gaps (1): Spatially-Distributed, Multi-Level Models of Traffic

What are the right variables to model?
  packet-level statistics (RTT, throughput, jitter, …)
  connection-level statistics
  routing statistics
  application statistics

What levels reveal anomalies? What levels can best be related to performance goals and configuration settings?

Page 20:

Research Gaps (2): Learning models of configurations

Network components: switches, routers, firewalls, web servers, file servers, wireless access points, …
LANs and WANs
Autonomous Systems

What are the right levels of abstraction? Can these things be observed?

Page 21:

Research Gaps (3): Relational learning

Network models are relational:
  the (traffic, configuration, performance) relationship
  network structure is a graph
  routes are paths with attached properties

Relational modeling is a relatively young area of ML

Page 22:

Research Gaps (4): Distributed learning and reasoning

Distributed model construction:
  bottom-up summary statistics (easy)
  mixed bottom-up/top-down information flow (unexplored); essential for higher-level modeling
  opportunities for data sharing at lower levels

Distributed configuration and distributed diagnosis:
  opportunities for inference sharing

Page 23:

Research Gap (5): Openness

Standard AI models assume a fixed set of classes/categories/faults/components

How do we reason about the possible existence of new classes, new components, new protocols (including intrusions/worms)?

How do we evaluate such systems?

Page 24:

Application/Model Interface

Subscription?
  applications subscribe to regular model updates/summaries
  applications specify the models they want the KP to build/maintain

Query?
  applications make queries to models?

Page 25:

Application/KP Interface: Two possibilities

1. The KP provides inference services in addition to model services: the WHY client sends the end-user's complaint to the TP where the inference engine operates.

2. Inference is performed on the end-user's machine: the WHY client does the inference and just sends queries to TPs. Some inference about the end-user's machine needs to happen locally; maybe view it as a local TP?

Page 26:

Concluding Remarks

KP applications will be driven by learned models:
  traffic models
  sensor models
  models of "normalness"

Models are acquired by a mix of human authoring and machine learning

Main research challenges arise from multiple levels of abstraction and world-wide distribution

Page 27:

KP support for learning models

Example: HP Labs email loop detection system (Bronstein, Das, Duro, Friedrich, Kleyner, Mueller, Singhal, Cohen, 2001)
Convert mail log data into four "detectors" and combine them using a Bayesian network

Page 28:

KP support for learning models (2)

Variables to be measured:

Raw sensors: mail log (when received, from, to, size, time of delivery attempt, status of attempt)

Derived variables (10-minute windows):
  IM: # incoming msgs
  IMR: # incoming msgs / # outgoing msgs
  PEAK: magnitude of the peak bin in the message-size histogram
  SHARPNESS: ratio of the magnitude of the peak bin to the average size of four neighboring non-empty bins (excluding the 2 nearest-neighbor bins on either side of the peak bin)

Summary statistics: mean and standard deviation of the derived variables, trimmed to remove outliers beyond ±3.5 σ (a sketch of the derived variables follows below)
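A hedged sketch of computing these derived variables for one 10-minute window (the record layout and the histogram bin width are assumptions; SHARPNESS follows the slide's description):

```python
# Sketch: derived variables (IM, IMR, PEAK, SHARPNESS) for one window.
import numpy as np

def derived_variables(in_sizes, n_out, bin_width=1024):
    """in_sizes: sizes (bytes) of incoming msgs in the window; n_out: # outgoing."""
    im = len(in_sizes)                          # IM: # incoming msgs
    imr = im / max(n_out, 1)                    # IMR: incoming / outgoing
    edges = np.arange(0, max(in_sizes) + 2 * bin_width, bin_width)
    hist, _ = np.histogram(in_sizes, bins=edges)
    peak_idx = int(np.argmax(hist))
    peak = int(hist[peak_idx])                  # PEAK: magnitude of peak bin
    # SHARPNESS: peak vs. the four nearest non-empty bins, skipping the
    # 2 bins immediately on either side of the peak.
    near = sorted((abs(i - peak_idx), int(h)) for i, h in enumerate(hist)
                  if h > 0 and abs(i - peak_idx) > 2)[:4]
    sharpness = peak / np.mean([h for _, h in near]) if near else float("inf")
    return im, imr, peak, sharpness

# Hypothetical window: many same-size messages suggest a loop (sharp peak).
print(derived_variables([512] * 40 + [100, 3000, 7000, 9000], n_out=5))
```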

Page 29:

KP Services

Provide a language so that I can define raw features, derived features, and summary statistics

Provide a subscription service so that I can collect the summary statistics, with automatic or semi-automatic support for aggregating summary statistics from multiple sources (e.g., multiple border SMTP servers)

Provide a subscription service so that I can sample the derived features (to build a supervised training data set)

Page 30:

Statistical Aggregation Middleware

Routines for aggregating statistics. Example: given

  S1 = Σ_i x_{1,i} and N1, and
  S2 = Σ_j x_{2,j} and N2, from two independent sources,

I can compute S3 = S1 + S2 and N3 = N1 + N2.

From these, I can compute the mean value: μ = S3 / N3.

To compute the variance, I also need the sums of squares, e.g. SS1 = Σ_i x_{1,i}² (a sketch follows below).
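A minimal sketch of such mergeable sufficient statistics (N, S, SS), which suffice for the pooled mean and variance; the class and method names are invented:

```python
# Sketch: mergeable sufficient statistics for mean and variance.
from dataclasses import dataclass

@dataclass
class Stats:
    n: int = 0       # N:  count
    s: float = 0.0   # S:  sum of x
    ss: float = 0.0  # SS: sum of x^2

    def add(self, x: float) -> None:
        self.n += 1; self.s += x; self.ss += x * x

    def merge(self, other: "Stats") -> "Stats":
        # Aggregation across independent sources: S3 = S1 + S2, N3 = N1 + N2.
        return Stats(self.n + other.n, self.s + other.s, self.ss + other.ss)

    def mean(self) -> float:
        return self.s / self.n

    def variance(self) -> float:            # population variance
        return self.ss / self.n - self.mean() ** 2

# Two sources compute statistics locally; the middleware merges them.
a, b = Stats(), Stats()
for x in [1.0, 2.0, 3.0]: a.add(x)
for x in [4.0, 5.0]: b.add(x)
pooled = a.merge(b)
print(pooled.mean(), pooled.variance())    # 3.0 2.0
```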

Page 31:

Model-Fitting Middleware

Given summary statistics, compute probabilities in a Bayesian network

[Diagram: Bayesian network with Mail Loop as the parent of the four detector variables IM, IMR, PEAK, and SHARPNESS]
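A hedged sketch of that fitting step, assuming the naive-Bayes structure shown above (Mail Loop as the sole parent) with Gaussian detector variables whose parameters come directly from the aggregated summary statistics; all numbers and the two-detector simplification are invented:

```python
# Sketch: score a window with Bayes' rule, P(detector | MailLoop) ~ Gaussian
# with (mean, std) taken from aggregated summary statistics per class.
import math

params = {   # hypothetical {class: {detector: (mean, std)}}
    "loop":    {"IM": (900.0, 120.0), "IMR": (9.0, 2.0)},
    "no_loop": {"IM": (100.0,  40.0), "IMR": (1.1, 0.4)},
}
prior = {"loop": 0.01, "no_loop": 0.99}

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_loop(observed):
    """P(MailLoop = loop | observed detector values)."""
    joint = {c: prior[c] * math.prod(gaussian(observed[d], *params[c][d])
                                     for d in observed)
             for c in prior}
    return joint["loop"] / sum(joint.values())

print(p_loop({"IM": 850.0, "IMR": 8.0}))   # near 1.0: almost surely a loop
```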

Page 32:

Sufficient Statistics

Andrew Moore (CMU): For nearly every learning algorithm, there is a set of statistics sufficient to permit that algorithm to fit its models

There is usually a scalable way of collecting and aggregating these sufficient statistics

Page 33:

One KP Goal

Design the services for defining sensors, defining derived variables, defining sufficient statistics, and defining aggregation methods

Scalable, secure, etc.

Page 34:

Hierarchical Modeling

An abstract (or aggregate) model could treat the conclusions/assertions of other models as input variables

"One model's inference is another model's raw data":
  probably requires associated meta-data: provenance, age of the original data
  major issue: assessing the independence of multiple data sources (we don't want to double-count evidence); this requires knowledge of the KP/network topology

Page 35:

Fusing Multiple Subscriptions

Multiple KPs and/or multiple KP apps may register for the same sufficient statistics

Fuse their subscriptions to save computation

Keep meta-data on non-independence