INFORMATION THEORETIC METRICS FOR SOFTWARE ARCHITECTURES Monday, March 30, 2001 Ali Mili West Virginia University

Page 1:

INFORMATION THEORETIC METRICS FOR SOFTWARE ARCHITECTURES

Monday, March 30, 2001

Ali Mili

West Virginia University

Page 2:

ACKNOWLEDGEMENTS

Funded by NSF, under the ITR program, for 2000-2003.

Collaboration with Dr H. Ammar (WVU), Dr M. Shereshevsky (WVU) and Dr Lionel Briand (Carleton U, Canada).

Co-funded by NASA IV&V, Fairmont, WV, for 2000-2001 (HCS).

Page 3:

SOFTWARE ARCHITECTURES: A KEY PARADIGM

Codifying Best Practices into recognizable abstractions.

Supporting various forms of Software Reuse (PLE, CBSE, COTS).

Architecture: Captures scope of reusable assets and inter-component protocols.

Page 4:

Quantifying Architectural Attributes

Intrinsic Attributes: The architecture as an artifact.

Extrinsic Attributes: The architecture as a blueprint.

Page 5:

PREMISES OF THE APPROACH

A Three Tier Quality Model.

A Three Dimensional Hierarchy of Metrics.

A Three Step Quantification Procedure.

A Three Pronged Analysis Methodology.

Page 6:

Three Tier Quality Model

Distinguishing between what we want to measure and what we can measure.

Qualitative Attributes, arbitrarily vague, arbitrarily (non)quantifiable.

Quantitative Factors, formally defined, arbitrarily difficult to compute, apprehend QA.

Computable Factors, easily computable, related to QF.

Page 7:

Three Dimensional Hierarchy of Metrics

Data vs. Control. Data flow, Control flow between and within components.

Static vs. Dynamic. Communication vocabulary vs. Communication language.

Coupling vs. Cohesion. Flow between vs. within components.

Page 8:

Three Step Quantification Procedure

Architectures to a Canonical Representation. Predefined architectural style in Rapide.

Canonical Representation to Random Variables. Information flow.

Random Variables to Metrics. Information Theory Functions (known properties, known interpretations).

Page 9:

Three Pronged Research Method

Analytical Methods. Elucidating cause-effect relationships.

Empirical Methods. Eliciting laws from empirical observations.

Experimental Methods. Validating relationships or laws against experimental data.

Page 10:

MODELING DECISIONS

Standardizing mapping from coupling to cohesion (cohesion as self coupling).

Standardizing mapping from Static to Dynamic (dynamic is language defined by static vocabulary).

Standardizing mapping from Random variable to Metric (Shannon’s entropy).

Page 11:

Modeling Decisions, II

Data vs. Control: Different ranges; possibly different correlations.

Dynamic vs. Static: Static is easier to compute, more reliable, but misses relevant aspects.

Cohesion as Self Coupling: Gives meaning to comparison (re: diagonality).

Page 12:

A QUALITY MODEL

A Canonical Architecture, Focal Point.

- Predefined architectural style: Independent Components.

- Predefined Notation: A subset of Rapide.

Distinction Between

- Qualitative Attributes: relevant to architect, evade quantification.

- Quantitative Factors: easy to define, evade derivation/estimation.

Page 13:

Qualitative Attributes

Intrinsic Attributes: Conceptual integrity; Completeness and Correctness; Feasibility.

Extrinsic Attributes: Run-time Properties (performance, availability, security, usability, functionality); Product Properties (testability, integrability, modifiability, portability, reusability).

Page 14:

Quantitative Factors: Error Propagation

Definition:

$EP(A,B) = P([B](x) \neq [B](x') \mid x \neq x')$.

Reflects the probability that an error generated by A (feeding into B) is propagated by B (vs. masked).

Relevance: fault tolerance.
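As an illustration of the definition above, here is a minimal Python sketch that computes EP exactly for a deterministic component over a small finite input domain. It assumes (this is not stated on the slide) that the correct input x and the erroneous input x' are drawn uniformly among all ordered pairs with x != x'. The function name `error_propagation` and the three toy components are hypothetical.

```python
from itertools import permutations


def error_propagation(b, domain):
    """Exact EP for a deterministic component `b` over a finite domain.

    Assumption: (x, x') is uniform over all ordered pairs with x != x'.
    """
    pairs = list(permutations(domain, 2))  # every ordered (x, x') with x != x'
    propagated = sum(1 for x, x_err in pairs if b(x) != b(x_err))
    return propagated / len(pairs)


domain = range(16)

# Identity never masks an input error: every distinct pair gives distinct outputs.
ep_identity = error_propagation(lambda x: x, domain)

# Integer division by 4 masks errors that stay inside the same bucket of 4 values.
ep_quantize = error_propagation(lambda x: x // 4, domain)

# A constant function masks every error.
ep_constant = error_propagation(lambda x: 0, domain)

print(ep_identity, ep_quantize, ep_constant)  # 1.0 0.8 0.0
```

The quantizing component shows the intended reading: the more B collapses distinct inputs onto the same output, the lower EP(A,B) becomes.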

Page 15:

Quantitative Factors: Change Propagation

Definition:

$CP(A,B) = P([B] \neq [B'] \mid [A] \neq [A'] \wedge [S] = [S'])$.

Probability that a change in A mandates a change in B.

Relevance: Perfective Maintenance.

Page 16:

Quantitative Factors: Requirements Propagation

Definition:

$RP(A,B) = P([B] \neq [B'] \mid [A] \neq [A'] \wedge [S] \neq [S'])$.

Probability that a change in A due to a requirements shift yields a change in B.

Relevance: Adaptive Maintenance.

Page 17:

Quantitative Factors: Design Propagation

Definition:

$DP(A,B) = P(B \neq B' \mid A \neq A' \wedge [S] = [S'])$.

Probability that a function preserving change in A causes a change in B.

Relevance: Corrective Maintenance.

Page 18:

ARCHITECTURAL METRICS

A Hierarchy of Eight Metrics:

- Data and Control

- Static and Dynamic

- Coupling and Cohesion.

Four Matrices.

Validation will select; most likely combine.

Page 19:

Rationale: Static vs. Dynamic

Static Metrics: Entropy of the vocabulary of information flow within/ between components.

Dynamic Metrics: Entropy of the language generated from that vocabulary during a typical execution.

Page 20:

Rationale: Data vs. Control

Data Interchange: carried by messages, parameters, shared data, etc.

Usually high bandwidth.

Control Interchange: carried by method calls, synchronization signals, event notifications.

Usually low bandwidth.

Page 21:

Elements of Information Theory

Random variable $X$, over set $E$, probability distribution $P$. Abbrev: $P(X=e) = p(e)$.

Shannon’s Entropy: $H(X) = -\sum_{e \in E} p(e) \log p(e)$.

Renyi’s Entropy (of order $\alpha$): $N(X) = \frac{1}{1-\alpha} \log \sum_{e \in E} p(e)^{\alpha}$.

Other interesting functions: conditional entropies, joint entropies, etc.
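The two entropy functions above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tooling: it assumes logarithms in base 2 (so results are in bits), a distribution given as a list of probabilities, and the function names `shannon_entropy` and `renyi_entropy` are made up for this example.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha in bits (alpha >= 0, alpha != 1)."""
    if alpha == 1:
        # The order-1 case is defined as the limit, which is Shannon entropy.
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


uniform8 = [1 / 8] * 8
skewed = [0.5, 0.25, 0.125, 0.125]

print(shannon_entropy(uniform8))   # 3.0
print(renyi_entropy(uniform8, 2))  # 3.0
print(shannon_entropy(skewed))     # 1.75
print(renyi_entropy(skewed, 2))    # about 1.54
```

On a uniform distribution the two functions agree; on a skewed one the order-2 Renyi entropy is smaller than the Shannon entropy, which is one reason the choice between them is left to validation.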

Page 22:

Static Data Metrics

Static Data Coupling:

SDR(A,B): Random variable that represents the vocabulary of data transfer from A to B.

SDC(A,B): H(SDR(A,B)).

Page 23:

Static Data Metrics, II

SDR(A,B): an integer over 32 bits, uniformly used: SDC(A,B) = 32 bits.

SDR(A,B): three independent integer variables: SDC(A,B) = 96 bits.

SDR(A,B): an integer representing a Boolean (a la C): SDC(A,B) = 1 bit.

SDR(A,B): an array index 0..7, uniform usage: SDC(A,B) = 3 bits.
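The four values above follow from two standard facts about Shannon entropy, written out here as a worked check. The set $E$ is the value set of the random variable; the names $X_1, X_2, X_3$ are introduced only for this illustration, and the 96-bit case assumes each of the three integers is itself uniform over 32 bits.

```latex
\begin{align*}
% Uniform distribution over a finite set E: entropy is log2 of its size.
H(X) &= -\sum_{e \in E} \frac{1}{|E|} \log_2 \frac{1}{|E|} = \log_2 |E| \\
% 32-bit integer, uniformly used
|E| = 2^{32} &\;\Rightarrow\; SDC(A,B) = \log_2 2^{32} = 32 \text{ bits} \\
% Independent variables: joint entropy is the sum of the individual entropies.
H(X_1, X_2, X_3) &= H(X_1) + H(X_2) + H(X_3) = 3 \times 32 = 96 \text{ bits} \\
% Integer used as a Boolean: only two values actually occur.
|E| = 2 &\;\Rightarrow\; SDC(A,B) = \log_2 2 = 1 \text{ bit} \\
% Array index 0..7, uniform usage
|E| = 8 &\;\Rightarrow\; SDC(A,B) = \log_2 8 = 3 \text{ bits}
\end{align*}
```

The Boolean case is the instructive one: the metric depends on the values actually used, not on the 32 bits the declared type could carry.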

Page 24:

Static Data Metrics, III

Static Data Cohesion:

SDR(A): shorthand for SDR(A,A).

Implicitly, SDR(A,A): data transferred from A to A: state space of A.

Static Data Coupling, Cohesion: a Static Data NxN Matrix. N: # of components.

Page 25:

Static Control Metrics

Static Control Coupling:

SCR(A,B): Random variable that represents the vocabulary of control transfer from A to B.

SCC(A,B): H(SCR(A,B)).

Page 26:

Static Control Metrics, II

SCR(A,B): A may call 8 methods of B, with equal likelihood: SCC(A,B) = 3 bits.

SCR(A,B): A may call 2 methods of B, with equal likelihood: SCC(A,B) = 1 bit.

SCR(A,B): A may call 1 method of B: SCC(A,B) = 0 bits. Dynamic control metrics will distinguish this case from 0 methods.

Page 27:

Static Control Metrics, III

Static Control Cohesion:

SCR(A): shorthand for SCR(A,A).

Implicitly: control flow within A: evades precise generic definition.

Static Control Coupling, Cohesion: Static Control Matrix.

Page 28:

Dynamic Metrics

Static Random Variable: SR.

Dynamic Random Variable: DR = plausible sequences on SR.

DDR: plausible data/parameter sequences.

DCR: plausible call/control sequences.

Page 29:

Dynamic Metrics, II

Normalizing Dynamic Metrics: If a sample execution produces a call sequence of 1000 method names, is it because

- traffic between A and B is intense,

- or the data sample is large?

Reflect the 1st dimension, normalize the 2nd.

Page 30:

Dynamic Metrics, III

Normalizing for the Size of Data: Let $L_n$ be the sequence generated by a datum of size $n$. Rather than compute $H(L_n)$, we compute

$\lim_{n \to \infty} (H(L_{n+1}) - H(L_n))$.

Whether this limit exists? Investigation.

$\lim_{n \to \infty} \frac{1}{n} H(L_n)$.
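To make the two limits concrete, here is a small Python sketch under a strong simplifying assumption that is not in the slides: the sequence $L_n$ is modeled as $n$ symbols emitted by a first-order Markov source over a two-method vocabulary, started from its stationary distribution. The vocabulary, the probabilities and the function name `sequence_entropy` are all hypothetical.

```python
import math
from itertools import product


def sequence_entropy(initial, transition, n):
    """H(L_n) in bits, by enumerating every length-n sequence of the source."""
    symbols = list(initial)
    total = 0.0
    for seq in product(symbols, repeat=n):
        p = initial[seq[0]]
        for prev, cur in zip(seq, seq[1:]):
            p *= transition[prev][cur]
        if p > 0:
            total += p * math.log2(1 / p)
    return total


# Hypothetical source: method "a" is usually followed by "a" again.
transition = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.5, "b": 0.5}}
initial = {"a": 5 / 6, "b": 1 / 6}  # stationary distribution of `transition`

entropies = [sequence_entropy(initial, transition, n) for n in range(1, 9)]

# First normalization: entropy added by one more unit of data size.
increments = [h2 - h1 for h1, h2 in zip(entropies, entropies[1:])]

# Second normalization: entropy per unit of data size.
per_symbol = [h / n for n, h in enumerate(entropies, start=1)]

print(increments)
print(per_symbol)
```

In this toy model the increments are constant from the first step, while the per-symbol values approach the same number only gradually. Real call sequences need not behave this way, which is why the existence of the limit is listed as an open question.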

Page 31:

Dynamic Metrics, IV

Measuring the Size of Data: A Generic Procedure.

- Well founded ordering on data space,

- Transitive root,

- Stratify data space,

- Size of a datum: ordinal of its stratum.

Page 32:

Dynamic Metrics, V

Metrics Dependent on Choice of Ordering? Condition of Convergence Weeds Out Poor Choices of Ordering.

Binary Trees: Height, vs. Number of Nodes.

With number of nodes, limits are defined. Sequence increment: traffic generated by an extra node.

Page 33:

Dynamic Metrics, VI

Reflecting Dynamic Behavior: If A calls a single method in B, static control coupling is 0 bits. Dynamic control coupling is the entropy of the random variable that represents the length of the (unitary) call sequence: a meaningful non-trivial value.
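A rough Python sketch of this idea: when only one method is ever called, the only thing that varies between executions is how many times it is called, so the entropy is taken over the observed call-sequence lengths. The sketch uses a plug-in estimate from sample counts, which is an assumption of this example; the observed lengths are invented data and `length_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def length_entropy(observed_lengths):
    """Plug-in entropy (bits) of the call-sequence length over sample runs."""
    counts = Counter(observed_lengths)
    total = len(observed_lengths)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Invented data: number of calls from A to B's single method in each sample run.
varying_runs = [1, 1, 2, 3]  # lengths 1, 2, 3 with probabilities 1/2, 1/4, 1/4
fixed_runs = [5, 5, 5, 5]    # the call count never varies

print(length_entropy(varying_runs))  # 1.5
print(length_entropy(fixed_runs))    # 0.0
```

Both cases have a static control coupling of 0 bits, yet the first carries 1.5 bits of dynamic information while the second carries none.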

Page 34:

Measures of Integrity

Ideal Matrix: High diagonal values; low values outside diagonal.

Absolute Diagonality: Distance to the subspace of diagonal matrices.

Relative Diagonality: Sine of the Angle between the matrix and the subspace of diagonal matrices.

Captures modularity of the architecture by single scalars 0 .. 1. Mixed blessing.
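The two diagonality measures can be sketched as follows. The slide does not say which norm or inner product is meant; this example assumes the Frobenius norm, under which the closest diagonal matrix is obtained by zeroing the off-diagonal entries, so the distance is the norm of the off-diagonal part and the sine of the angle is that distance divided by the norm of the whole matrix. Function names and sample matrices are hypothetical.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def absolute_diagonality(matrix):
    """Frobenius distance from `matrix` to the subspace of diagonal matrices."""
    off_diagonal = [
        [0.0 if i == j else v for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]
    return frobenius_norm(off_diagonal)


def relative_diagonality(matrix):
    """Sine of the angle between `matrix` and its diagonal projection (0..1)."""
    norm = frobenius_norm(matrix)
    if norm == 0:
        raise ValueError("angle is undefined for the zero matrix")
    return absolute_diagonality(matrix) / norm


modular = [[3.0, 0.0], [0.0, 4.0]]  # all information flow stays within components
tangled = [[3.0, 4.0], [0.0, 0.0]]  # heavy flow from component 0 to component 1

print(absolute_diagonality(modular), relative_diagonality(modular))  # 0.0 0.0
print(absolute_diagonality(tangled), relative_diagonality(tangled))  # 4.0 0.8
```

Under this reading a value near 0 indicates the ideal, strongly diagonal matrix, and a value near 1 indicates that most flow crosses component boundaries.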

Page 35:

DEPLOYMENT PLAN: Architectures to Rapide

UML as An Architectural Representation: Rules for extracting architectural information.

Page 36:

Five Architectural Styles

Style: Topology + Msg Data Types + Protocols.

- Independent Components: event based systems; communicating processes.

- Virtual Machines: Interpreter based. Example: Rule Based Systems.

- DataFlow Architectures: Data triggers nodes. Examples: Batch; Pipe and Filter.

- Data Centered Systems: Data Bases; Blackboard Systems.

- Call/Return Architectures: Main/Sub; RPC; OO Systems; Layered Systems.

Page 37:

Rapide Paradigms/Constructs

- Object Oriented Executable ADL.

- Specifying and Prototyping Systems.

- Collection of Interfaces, connections between interfaces, and formal constraints.

- Three types of connections: pipeline (=>), agent (||>), and identification (to).

- Execution model is event-based and supports concurrency of node executions.

Page 38:

DEPLOYMENT PLAN: Rapide to Random Variables

The most difficult/contentious/controversial issues.

Mapping a Rapide Architectural description into an NxN matrix of random variables.

Relies on information that is for the most part available at the architectural level: Data/Control Flow within and between nodes, with relevant probability distributions.

Page 39:

Eliciting Interchange Information

Data Flow within nodes: State Variables.

Data Flow between nodes: Message Passing, Parameters, Shared Data.

Control Flow between nodes: Exchange of method calls; event flow.

Control Flow within nodes: Debate.

Page 40:

Eliciting Probability Distributions

Specified Usage Probabilities.

Inferred Usage Probabilities (e.g., Stack).

Simulated Usage Probabilities.

Default Usage Probabilities (uniform over data type, over known subrange).

Page 41:

DEPLOYMENT PLAN: Random Variables to Metrics

Straightforward: Applying the Entropy function.

Subject to Validation: Shannon vs. Renyi. Perhaps other forms.

Selection of metrics formulas dependent on validation step. Anticipated: logical/numeric/probabilistic relationships between CM and QF.

Page 42:

AUTOMATION PLAN: Rapide to Random Variables

Syntax Directed Translation (Yacc-like) of Rapide declarations into Ensemble definitions.

Bare Rapide Parser, progressively extended.

Investigation: Probabilistic Annotations of Rapide, using closed (wrt aggregate declarations) Prob. Distribution vocabulary.

Outcome: A square matrix of random variables.

Page 43:

AUTOMATION PLAN: Random Variables to Metrics

Deriving Matrix of metrics from Matrix of random variables, using Shannon/Renyi.

Assessing Diagonality, other properties.

Assessing/Correlating/Providing Bounds for Quantitative Factors.

Apprehending/Providing Ratings for Qualitative Attributes.
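The first step of this plan, turning a matrix of random variables into a matrix of metrics, might look like the Python sketch below. It is an illustration only: each random variable is represented as a dictionary from outcome to probability, Shannon entropy in bits is the only function applied, a pair of components with no flow is modeled as a single-outcome distribution, and every name and number is hypothetical.

```python
import math


def entropy_bits(distribution):
    """Shannon entropy in bits of a {outcome: probability} dictionary."""
    return math.fsum(
        p * math.log2(1 / p) for p in distribution.values() if p > 0
    )


def uniform(n):
    """Uniform distribution over n anonymous outcomes (the default usage profile)."""
    return {i: 1 / n for i in range(n)}


def metrics_matrix(rv_matrix):
    """Apply the entropy function entry by entry: rv_matrix[i][j] -> bits."""
    return [[entropy_bits(rv) for rv in row] for row in rv_matrix]


no_flow = {"none": 1.0}  # assumption: absence of flow is a one-outcome variable

# Hypothetical two-component architecture: diagonal = cohesion, off-diagonal = coupling.
rv_matrix = [
    [uniform(16), uniform(2)],
    [no_flow, uniform(8)],
]

metrics = metrics_matrix(rv_matrix)
print(metrics)  # [[4.0, 1.0], [0.0, 3.0]]
```

Swapping `entropy_bits` for another information-theoretic function would leave the rest of the pipeline unchanged, which matches the plan of letting validation choose the formula.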

Page 44:

VALIDATION PLAN: Analytical Validation

Validating Computable Metrics with respect to Quantitative Factors: Documented approximations.

Under Weak Hypothesis, found equality between EP(A,B) and Renyi entropy of SDR(A,B).

Page 45:

VALIDATION PLAN: Empirical Validation

Case Study: HCS (Hub Control Software, ISS); UML descriptions.

Map to Rapide, Compute Metrics.

Correlate with measurable propagation probabilities, in light of system logs.

Other examples: a Client Server, a Pacemaker, a KWIC index.

Page 46:

CONCLUSION AND PROSPECTS

A Three-Tier Quality Model.

A Three Dimensional Hierarchy of Metrics.

A Three-Step Quantification Procedure.

A Three-Pronged Methodology.

Preliminary Work; tentative/speculative.

Looks easier (nicer?) than it is.

Page 47:

Questions?…

A science is as old as its instruments of measurement. (Une Science a l’Age de Ses Instruments de Mesure.)

Louis Pasteur.

One of the fundamental aims of Science has been and continues to be that of progressing from perceptions to measurements.

Lotfi A. Zadeh.