

Security Optimization of Dynamic Networks with Probabilistic Graph Modeling and Linear Programming

Hussain M.J. Almohri, Member, IEEE, Layne T. Watson, Fellow, IEEE, Danfeng (Daphne) Yao, Member, IEEE, and Xinming Ou, Member, IEEE

Abstract—Securing the networks of large organizations is technically challenging due to their complex configurations and constraints. Managing these networks requires rigorous and comprehensive analysis tools. A network administrator needs to identify vulnerable configurations and needs tools for hardening the networks. Such networks usually have dynamic and fluid structures, so one may have incomplete information about the connectivity and availability of hosts. In this paper, we address the problem of statically performing a rigorous assessment of a set of network security defense strategies with the goal of reducing the probability of a successful large-scale attack in a dynamically changing and complex network architecture. We describe a probabilistic graph model and algorithms for analyzing the security of complex networks with the ultimate goal of reducing the probability of successful attacks. Our model naturally utilizes a scalable state-of-the-art optimization technique called sequential linear programming that is extensively applied and studied in various engineering problems. In comparison to related solutions on attack graphs, our probabilistic model provides mechanisms for expressing uncertainties in network configurations, which is not reported elsewhere. We have performed comprehensive experimental validation with real-world network configuration data of a sizable organization.

Index Terms—Network security, attack graph, probabilistic model, vulnerability analysis, optimization.


1 INTRODUCTION

Large organizations need rigorous security tools for analyzing potential vulnerabilities in their networks. However, managing large-scale networks with complex configurations is technically challenging. For example, organizational networks are usually dynamic with frequent configuration changes. These changes may include changes in the availability and connectivity of hosts and other devices, and services added to or removed from the network.

Network administrators also need to respond to newly discovered vulnerabilities by applying patches and modifications to the network configuration and security policies, or utilizing defensive security resources to minimize the risk from external attacks. For instance, to prevent a remote attack targeting a host it is useful to analyze the candidate defensive strategies in choosing installation and runtime parameters for one or several intrusion prevention systems.

• H. M. J. Almohri is with the Department of Computer Science, Kuwait University, Kuwait. Email: [email protected].

• L. T. Watson is with the Departments of Computer Science and Mathematics, Virginia Tech, Blacksburg, VA 24060. Email: [email protected].

• D. Yao is with the Department of Computer Science, Virginia Tech, Blacksburg, VA 24060. Email: [email protected].

• X. Ou is with the Department of Computing and Information Sciences, Kansas State University, Manhattan, KS 66506. Email: [email protected].

• This work was supported in part by Kuwait University Research Grant No. [ZQ02/14].

To facilitate a scalable security analysis of organizational networks, attack graphs (e.g., [1], [2]) were proposed. Attack graphs show possible attack paths with respect to a particular network setting, which provides the necessary elements for modeling and improving the security of the network.

Existing work utilizes attack graphs (for example, [1], [2], [3]) for analyzing the security risks by quantifying attack graphs using a variety of techniques such as Bayesian belief propagation [4], [5], [6], [7], basic laws of probability [8], [9], and vertex ranking algorithms [10], [11]. These models lack a systematic and scalable computation of optimized network configurations. Current attack graph quantification models assume a network with known and fixed configurations in terms of the connectivity, availability, and policies of the network services and components, disregarding the dynamic nature of modern networks. Moreover, except for a few attempts [12], [13], [14], [6], previous work has solely focused on computing a numerical representation of the risk without addressing the more challenging problem of risk management and reduction.

In this paper, we present a rigorous probabilistic model that measures the security risk as the probability of success in an attack. Our probabilistic model, referred to as the success measurement model, has three main features: (i) a rigorous and scalable model with a clear probabilistic semantics, (ii) computation of risk probabilities with the goal of finding the maximum attack capabilities, and (iii) consideration of dynamic network features and the availability of mobile devices in the network.

As an application of our success measurement model, we formalize the problem of utilizing network security resources as an optimization problem with the goal of computing an optimal placement of security products across a network. Our new contribution is to define this optimization problem and provide an efficient algorithm based on a standard technique called sequential linear programming. Our algorithm is proved to converge and it is scalable to large networks with thousands of components and attack paths. Our contributions in this paper include:

• A scalable probabilistic model that uses a Bernoulli model to measure the risk in terms of the probability of success in achieving an attack goal.

• An efficient security optimization model, generated based on a quantified attack graph, to compute an optimal placement of security products according to organizational and technical constraints.

• Modeling dynamic network features for a realistic and accurate analysis of the risk associated with modern networks.

The results of our experiments confirm three key properties of our model. First, the vulnerability values computed from our model are accurate. Our manual inspection of the results confirms that the probability values obtained in the experiments correlate with the vulnerabilities of components in the network. Second, our security improvement method efficiently finds the optimal placement of security products subject to constraints. Third, we quantify the additional vulnerabilities introduced by mobile devices of a dynamic network. Our results indicate that an infected mobile device within the trusted region creates a preferred attack direction towards the attack target, which increases the chance of success at the target host. Our implementation efficiently computes the probabilities throughout large attack graphs with a quadratic execution performance.

2 RELATED WORK

The literature has a significant number of attempts to provide methods, algorithms, and tools for the various problems concerning graph-based analysis of security in large networks. Graph-based analysis of networks was proposed in [15], where a graph of attack stages in a network topology was introduced to analyze specific attacks in a network. The work in [15] was followed by the method proposed in [16] that, in addition to producing attack graphs using model checking, introduced an analysis of guarding options against the attacks.

The effort to enhance graph-based analysis and security hardening has continued since [15] and [16]. Unfortunately, some of the ongoing challenges facing automated network security analysis remain unresolved. Per our survey, the literature lacks a comprehensive and rigorous methodology for the assessment of a set of network security defense strategies with the goal of reducing the success of an attack. In the following, we present a thorough comparison of our work with the related research followed by a summary of the novelty of our work.

2.1 Probabilistic Analysis

The use of probability theory to compute quantitative security measures has been reported in [4], [5], [9]. The work by Wang et al. [9] considers a probabilistic model for computing a security risk metric using attack graphs.

Bayesian analysis of networks using attack graphs [6], [5], [7] differs from our success measurement model in that our model does not require the knowledge of conditional probabilities. In [5], a dynamic Bayesian network model was proposed that is capable of incorporating temporal factors. Bayesian threat probability based on security and organization-specific knowledge as well as attacker profile is discussed in [4]. Xie et al. [7] introduced a Bayesian model that adds a node to the Bayesian network indicating whether or not an attack has happened. Although this extension improves the models in [5], it does not capture the various possibilities of attack paths taken by an attacker before reaching an intermediate attack goal, which is addressed in our work.

The work in [17] attempts to broaden the definition of attack graphs as well as providing algorithms for predicting vulnerabilities in the network. This work introduces temporal probabilistic attack graphs that are used to update the vulnerability information in time. Our work differs from [17] in the modeling assumption. The probabilistic attack graph presented in [17] models time intervals for each attack step that may or may not occur with specific probabilities. While this is a very useful approach, our work models a direct attack step probability based on the Bernoulli model of successes and failures, considering uncertainties in an attacker's action and a notion of device availability (i.e., assuming a device may not always be connected to the network). Although we share the same goal of mitigating security risks, our work takes a different approach through the application of efficient mathematical programming methods to security optimization problems that we point out in Section 3.

The work in [9] discusses an interpretation of the metric and a heuristic to compute the metric. In our work, we provide a success measurement model that generalizes the method in [9] by capturing the uncertainty in the attacker's choices (discussed as a random selector in Section 4).

2.2 Ranking

In attack graph ranking, an initial input score is used to bootstrap a ranking algorithm that produces a quantified attack graph. A number of attack graph ranking algorithms are inspired by, and are extensions of, PageRank [18]. PageRank is an algorithm proposed by Page et al. [18], which is used to rank important web pages.

AssetRank [11] was proposed to rank any dependency attack graph using a random walk model. AssetRank is a generalization of PageRank extending it to handle both conjunctive and disjunctive nodes. AssetRank is supported by an underlying probabilistic interpretation based on a random walk. Mehta et al. propose a ranking method using state enumeration attack graphs [10]. The idea of PageRank is applied to state enumeration attack graphs with a modified interpretation of the ranking. Attack graphs based on model checking have been proposed in [16], formalizing an intrusion attack in a finite state model. The authors in [16] do not propose a complete attack graph ranking method. Instead, a method to compute minimal critical attack assets based on user-specified metrics has been introduced.

Other approaches to security assessments include a goal-motivated attacker model based on a Markov decision process [19], a weakest-adversary approach to ranking attack graphs [20], a generic framework for an attack resistance metric [21], and an enterprise IT risk metric using CVSS scores [22].

2.3 Security Improvement

Quantified attack graphs or similar formalisms are particularly useful when utilized as a basis for improving the security of a network. The authors in [6], [13] proposed solutions for the security hardening problem as a multiobjective optimization problem. The main advantage of our work compared to the use of genetic algorithms in [13] is that we formulate the security hardening problem as a general mathematical programming problem that is directly developed according to an attack graph. The mathematical programming problem presented in this paper can be extended to consider a variety of constraints that we discuss as a future direction. Moreover, our work differs in the research goal as we focus on reducing the success rates of attackers, whereas the work in [13] is on optimizing costs (similar to [12]) and reducing damages. Noel and Jajodia presented a greedy solution for the problem of the best placement of IDS sensors in a network using attack graphs [14]. The solution finds a minimal number of sensors that can cover all critical attack paths. Wang, Noel, and Jajodia proposed a method for finding the (initial) conditions that need to be removed to improve network security [23]. Both these solutions aim at reorganizing the networks to improve security. In comparison, our work provides network hardening solutions beyond network reorganization. Our probabilistic model supports the computation of optimal network security defense strategies.

Huang et al. proposed a method for distilling the critical attack graph surface iteratively through minimum-cost SAT solving [24]. The presented method is useful in finding the most critical attack path, which can be considered later for hardening the security of the network. Such a result can be used to guide our improvement recommendation method to consider hosts found on a critical path.

In [8], a probabilistic metric was introduced. The core component of the proposed work is to simulate the attack scenario and provide recommendation options to find a better configuration of the network. In comparison, our improvement model is not limited to making an optimal choice between available configuration options. Our work goes further by considering additional security hardening options (such as installing an IPS) and finding an optimal recommendation accordingly. Our proposed model finds an optimal recommendation based on a nonlinear program and is not limited to simulation results. In [8] the authors provided a method to quantify the attack graph and simulate attackers' choices to compute an improved reconfiguration. While this is a valuable approach, the proposed method does not take into account the availability of machines and uncertainty in attackers' decisions.

2.4 Summary of Comparisons

In summary, there are several differences that distinguish our work from the existing research.

1. None of the previous work considers the effect of device availability on open networks. Furthermore, the optimization of network configurations and the security improvement addressed in our work have not been previously studied. Bayesian methods are powerful in computing unobserved facts, such as predicting possible threats. It remains unclear how Bayesian methods can be used to support variability in the attacker's decisions, device availability, and the effect of mobile devices.

2. Our probability calculation scheme is general enough to allow performing various levels of success probability analysis by introducing variable attack steps as part of success probability computation.

3. We complete the analysis of network security threats by providing a sound and computationally efficient security improvement recommendation technique that is capable of finding optimal network configurations as well as optimal placement of security solutions in the network.

3 OVERVIEW

Motivated by the general research goal of developing optimized network security settings, our work focuses on the problem of statically performing a rigorous assessment of a set of network security defense strategies with the goal of reducing the probability of a successful large-scale attack in a dynamically changing and complex network architecture. This problem represents a practical concern in modern organizational networks, where there is a need for a highly reliable and mathematically sound platform to conduct effective security hardening analyses.

3.1 Challenges

In addressing the aforementioned problem, our work faces and attempts to solve several technical challenges. First, to provide a reliable analysis for improving network security, we face the challenge of developing a rigorous model that accurately captures reality. Though this is not an entirely new challenge, the current state of the art does not focus on the rigor of the proposed models. As explained in Section 4, we approach this challenge by developing a model with a clear theoretical foundation that accurately captures a complete view of a multistage attack on a network. The main advantage of our model is in the use of established mathematical concepts that best fit the problem, which enables us to exploit efficient methods to develop reliable algorithms.

Second, most of the literature in attack graph analysis focuses on various techniques to transform what we call plain attack graphs (that is, attack graphs with no quantification metrics) to quantified attack graphs that provide clearer insights into the seriousness of the attacks. Effective assessment of network security defense strategies remains a challenge that requires significant effort in terms of further enriching the attack graph model and transforming it into an equivalent optimization problem. We address this challenge by developing theoretically sound mathematical models that represent a complete view of attack graphs, and are capable of including candidate network security defense options. The result of our optimization is to find which network security defense strategy will yield enhanced security according to the provided input.

For example, a system administrator managing a complex network architecture can use our method to compare and contrast the effectiveness of two security defense packages, one to install a software firewall on a number of hosts, thus downgrading computational performance and potentially increasing false positives, and the other to move a set of services behind a hardware-based load balancer, thus increasing cost as well as network latency.

Third, we discover and address a novel challenge in systematically modeling the uncertainties in an adversary's attack steps towards a major attack goal. This is a problem when dealing with attacks of multiple steps. For instance, an adversary may be faced with a range of vulnerabilities to try to exploit when executing an attack step. When analyzing the level of security in a network, in the absence of historical attack data, it is particularly challenging to deal with such uncertainties at the modeling level. We use a statistical approach where we define a random behavior to model the various possibilities of attacks. Specifically, we define a special random variable $Y_{u_i}$ for each possible attack step within an attack path. The value of $Y_{u_i}$ corresponds to the probability that the adversary chooses an attack step $u_i$. We further explain the details of defining and using this method in Section 4.1.

3.2 Approach

We approach the challenges mentioned in Section 3.1 by defining, implementing, and experimenting with a new probabilistic quantification model that we combine with our novel optimization problem as described in Sections 4 and 5. Our probabilistic quantification model, referred to as the success measurement model, quantifies the vulnerabilities of networked components and resources by computing the expected chance of successful attack (ECSA) at every attack step, which is represented by an attack graph node. Our security improvement model uses the computed probabilities from the success measurement model to find optimal security defense strategies given a set of available options.

As depicted in Figure 1, the computation in the success measurement model requires three sets of inputs, which are a set of attack steps, a set of network configurations and potential vulnerabilities, and a set of ground facts. The first set includes the steps necessary to execute a targeted attack in a network. These steps represent intermediate attack goals such as compromising a machine that has internal connectivity with a targeted server. In addition, the attack steps also describe the various parallel choices available to an attacker when achieving a specific target. The second set includes the network configurations and vulnerability data that collectively provide host software installations, inter-host connectivity, running services and connections, and known or potential software vulnerabilities. The third set contains the ground fact values that describe the vulnerability, availability, and connectivity of various network configurations.

[Figure 1 (diagram). Labels shown: Attack graph; Attack step rules; Network configuration and vulnerabilities; Ground fact values; Vulnerability scores and estimates; Mobile attack rules and data; Success Measurement Model; Expected chances of successful attacks; Security improvement nodes; Defense strategy candidates; Security Improvement Model; Optimal security defense strategy.]

Fig. 1. Our models work based on three input sets from attack graph generators as well as initial belief values associated with potential vulnerabilities and network configuration data.

In our implementation, the first two sets of inputs (i.e., the attack steps and the network configuration data) are taken from dependency attack graphs. The system administrators use vulnerability assessment tools (such as OVAL [25]) to explore the configurations and vulnerability data in their networks. The output of such assessment is provided as an input to attack graph generation tools. Attack graph generation tools (such as MulVAL [26]) often include customized predefined attack step rules that are applied to the configurations and vulnerability data of a network and produce a plain (that is, not quantified) attack graph. The additional step required by our model is to develop a set of ground fact values (described in detail in Section 6). The values bootstrap the computation of success probabilities throughout an attack graph.

The output of the computation based on our success measurement model is the input to the security optimization model (Figure 1). Using the security improvement model, we transform the quantified attack graph from the success measurement model into a mathematical program. The resulting mathematical program includes an additional set of data that represent various network security defense strategies. In the tool that we developed, the security administrators simply feed this information as logical predicates such as ips_installed(T, E), which describes a potential installation of an intrusion prevention system of type T and security effectiveness E. The effectiveness value E is a score estimated by the system administrator based on prior experiences and available effectiveness data.

3.3 Results

Validating the results of theoretical modeling of network security under the assumption of lack of data is challenging. In this work, we only use the data from attack graphs to perform a manual analysis of the results produced from the application of our two models. We set up our experiments based on the network configuration data, existing potential vulnerabilities, and attack graphs produced for a functioning real-world corporate network. We summarize our experience with implementing the models as follows:

1. All the algorithms were programmed from scratch in Java, automating the entire process from receiving input from attack graph generators to recommending the best security defense strategies.

2. The implementation performance relies only on the performance of the simplex method used for solving the optimization problem. Since the simplex method is heavily and successfully used in practice [27], our model features a high level of computational scalability and efficiency.

In addition, we give a summary of our experiments(Section 7) next.

1. The focus of our experiments is to demonstrate the practicality, feasibility, and accuracy of the model.

2. Our experiments include novel features such as analyzing networks with less studied but potentially vulnerable devices such as mobile devices and networked printers. To the best of our knowledge, the experiments in the network analysis literature lack this level of detail.

3. Our model will give system administrators a solid analysis of the security in their networks that will assist in the actual implementation of security features to reduce the likelihood of a successful attack.

4 SUCCESS MEASUREMENT MODEL

In this section we present our success measurement model to compute the expected chance of a successful attack on a network with respect to the attack's ultimate goal. We first present the definitions of the expected chance of a successful attack (ECSA) followed by the description of an efficient method to compute ECSA values.

4.1 Definitions of ECSA Values

The key component of our success measurement model is the probabilistic definition of the expected chance of a successful attack against any node in the attack graph.

We present an alternative approach to the Bayesian analysis discussed in [6], [7]. Our success measurement model computes probabilities as a function of initial belief probabilities without the need for specifying conditional probabilities required by Bayes' theorem. The set of initial belief values required by our model is small and can be obtained from standard vulnerability assessment systems (discussed in Section 6).

Our model measures the success of an attacker based on the attack dependencies determined by a logical attack graph.


Definition 1: A logical attack graph $G = (V, E)$ is a digraph where $V = N_f \cup N_g \cup N_r$ and $N_f$, $N_g$, $N_r$ are disjoint sets of nodes containing fact nodes, goal nodes, and rule nodes, respectively. $E$ is the set of arcs, and $G \in N_g$ is the attacker's goal.

In a logical attack graph, nodes are of three types and are defined as tuples.

Definition 2: Each attack graph node $u$ is a tuple $(d_u, E[X_u])$ where $d_u$ is the description of a network configuration item (when $u \in N_f$), an attack rule (when $u \in N_r$), or an attack goal (when $u \in N_g$), and $E[X_u] \in [0, 1]$ is the corresponding ECSA (see (1) and (3)) of the node $u$.

A rule node in an attack graph represents a logical conjunction of its predecessors, a goal node in an attack graph represents a logical disjunction of its predecessors, and a fact node is a node with no predecessor.

We define the sample space for a node and a corresponding random variable representing attack outcomes. The outcome of an attack attempt on a node can either be a success or a failure. Let $\Omega(u)$ be the sample space for a node $u \in V$ of an attack graph $G$. We define the random variable $X_u$ for the node $u$ as a Bernoulli random variable with $X_u(\omega) = 1$ denoting success in an attack and $X_u(\omega) = 0$ denoting failure, where $\omega$ is an outcome.

Definition 3: For any node $u \in V$ of an attack graph, the expected chance of a successful attack (ECSA) at a node $u$ is given as $E[X_u] = P(X_u = 1)$, that is, the probability of success for the random variable $X_u$.

Let $\phi(u) = \{v \mid (v, u) \in E\}$ be the set of predecessors (dependencies) of a node $u$. In the following, we define ECSA for the derived nodes based on the corresponding logical semantics (that is, conjunction for a rule node and disjunction for a goal node).
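As a minimal sketch of the predecessor set, the following Python fragment builds $\phi(u)$ from a list of arcs. The node names and the toy graph are hypothetical and are not taken from the paper's data; the arc list representation is an assumption made for illustration.

```python
from collections import defaultdict


def predecessors(edges):
    """Map each node u to phi(u) = {v | (v, u) in E}."""
    phi = defaultdict(set)
    for v, u in edges:
        phi[u].add(v)
    return phi


# Hypothetical toy graph: two fact nodes feed a rule node,
# and the rule node feeds a goal node.
edges = [("f1", "r1"), ("f2", "r1"), ("r1", "g1")]
phi = predecessors(edges)
```

Fact nodes have no incoming arcs, so they never appear as keys unless they are looked up explicitly.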

ECSA value of a rule node. Let $u \in N_r$ be a rule node and $\phi(u) = \{v_1, v_2, \ldots, v_t\}$. The random variable $X_u$, corresponding to the success or failure of the attacker at node $u$, is defined as the product of the random variables for all predecessor nodes $v \in \phi(u)$, for which the expected value is

\[
E[X_u] = \prod_{v \in \phi(u)} E[X_v], \tag{1}
\]

assuming independence of the predecessor random variables (further discussed in Section 4.4).
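Equation (1) can be illustrated with a short Python sketch. The function name and the input values are hypothetical; the only assumption carried over from the text is independence of the predecessor variables.

```python
import math


def rule_node_ecsa(predecessor_ecsa):
    """Equation (1): product of the predecessors' ECSA values,
    valid under the independence assumption stated in the text."""
    return math.prod(predecessor_ecsa)


# Hypothetical predecessor values, not taken from the paper.
example = rule_node_ecsa([0.9, 0.5, 0.8])  # approximately 0.36
```

Because every factor lies in [0, 1], a rule node can never have a larger ECSA than its weakest predecessor, which matches the conjunctive semantics.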

ECSA value of a goal node. An attack graph has several goal nodes. A goal node either depends on a single exploitation rule (represented by a rule node) or multiple exploitation rules such as $u_1$ in Figure 2.

A goal node with multiple rule node dependencies is a logical disjunction. In reality, this disjunction indicates that there are multiple attack choices for an attacker towards a specific attack goal. For instance, consider a server that has a local privilege escalation vulnerability (which is exploitable remotely in a multi-step attack) and runs a network service with multiple remote vulnerabilities. An attacker must exploit one (or more) of these vulnerabilities to gain privileges on the target server. In the absence of observable evidence, one needs to compute the ECSA of a goal node with a function that correctly captures the probabilities of such attack choices.

[Figure 2 (diagram). Nodes shown: $u_1$, $u_2$, $u_3$, $u_4$. Labels shown: Remote exploitation rule (1); Local exploitation rule; Remote exploitation rule (2); code execution on H. Values shown: $E[X_{u_2}] = 0.7$, $E[X_{u_3}] = 0.66$, $E[X_{u_4}] = 0.8$, $E[Y_1] = 0.3$, $E[Y_2] = 0.5$, $E[X_{u_1}] = 0.721$.]

Fig. 2. A goal node for an attack on host H with three attack choices: a local exploitation and two methods of remote exploitation. The variables $Y_1$ and $Y_2$ measure the probability of attack choices. We assume $E[Y_1]$ and $E[Y_2]$ are not available, and thus, we computationally determine their values based on Equation 2.

Our approach is to computationally determine attack choice probabilities according to various attack patterns (Section 4.2). To our knowledge, no previous work has modeled these choices.

In the attack graph of Figure 2, node $u_1$ has three predecessors (rule nodes $u_2$, $u_3$, and $u_4$). To compute $E[X_{u_1}]$, we introduce auxiliary Bernoulli random variables $Y_i$ (referred to as the random selectors) to capture the random selection of an attack path.

Definition 4: A random selector $Y_i$ is a Bernoulli variable that is associated with a rule node $u_i$. $Y_i$ acts as a weighting variable for the corresponding rule node variable $X_{u_i}$. For any goal node $v$, with a set of predecessor rule nodes $\phi(v)$, we have $\sum_{u_i \in \phi(v)} E[Y_i] = 1$.

The values of $Y_i$ are multiplied with the computed ECSA for the predecessor nodes to reflect the attack choices. In Section 4.2, we show how the values of $Y_i$ variables are computed.

Let $\phi(u) = \{v_1, v_2, \ldots, v_t\}$ be the set of dependencies of $u$. Then we define the random variable $X_u$ for a goal node $u \in N_g$ for which the expected value is

\[
E[X_u] = \sum_{k=1}^{t-1} \left[ E[Y_k]\, E[X_{v_k}] \prod_{i=1}^{k-1} \bigl(1 - E[Y_i]\bigr) \right] + E[X_{v_t}] \prod_{i=1}^{t-1} \bigl(1 - E[Y_i]\bigr). \tag{2}
\]

Observe that the definition above selects $X_u = X_{v_i}$ by the event $Y_i = 1$, $Y_j = 0$ for $j < i < t$ (for example, Figure 2). Note that the Bernoulli variables $Y_i$ in general depend on the node $u$, but this dependence is not reflected with the notation $Y_i^{(u)}$ for simplicity.
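The following Python sketch evaluates Equation (2) and checks it against the values shown in Figure 2. The function name is hypothetical, and the ordering of the predecessors as $v_1 = u_2$, $v_2 = u_3$, $v_3 = u_4$ is an assumption; under that ordering the sketch reproduces the figure's value of 0.721.

```python
def goal_node_ecsa(x_preds, y_selectors):
    """Equation (2).

    x_preds     -- [E[X_v1], ..., E[X_vt]] for the t predecessors
    y_selectors -- [E[Y_1], ..., E[Y_{t-1}]] (the last selector is implicit)
    """
    if len(y_selectors) != len(x_preds) - 1:
        raise ValueError("need exactly t-1 selectors for t predecessors")
    total = 0.0
    none_chosen = 1.0  # running product of (1 - E[Y_i]) for i < k
    for x_k, y_k in zip(x_preds, y_selectors):
        total += y_k * x_k * none_chosen
        none_chosen *= 1.0 - y_k
    # The last predecessor is selected when no earlier selector fired.
    return total + x_preds[-1] * none_chosen


# Values from Figure 2: 0.3*0.7 + 0.7*0.5*0.66 + 0.7*0.5*0.8
fig2 = goal_node_ecsa([0.7, 0.66, 0.8], [0.3, 0.5])  # approximately 0.721
```

The weights applied to the predecessors are nonnegative and sum to one, so the goal node's ECSA is always a weighted average of its predecessors' ECSA values.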


4.2 Computing ECSA Values

From a defender's point of view, attack choices are uncertain with various attack scenarios. Existing work such as [16], [9], [11] has provided ways to compute a static view of the security risk corresponding to specific attack scenarios. In this section we describe the method for computing ECSA values of an attack graph with a goal of finding the highest possible chance of success for an attack.

Finding the most vulnerable components. The computation method described in this section allows one to find the ECSA values such that the ECSA of the attack target is maximized. The result of this computation is particularly important for the optimal placement of security hardening products described in Section 5.1.

To find the most vulnerable components, we formulate a maximization problem with a nonlinear objective function subject to linear and nonlinear equality constraints. The decision variables represent the nodes of an attack graph.

Let $x_i = E[X_i]$ be a decision variable for a node $i \in N_r \cup N_g$, and $x = (x_1, x_2, \ldots, x_M)^T$ be the vector of unknown ECSA values for all nodes. Let $y_i = E[Y_i]$ be a decision variable for a random selector $Y_i$, and $y = (y_1, y_2, \ldots, y_P)^T$ be the vector of unknown expected values of the random selectors. For a rule node $u \in N_r$ with predecessors $\phi(u)$, the constraint function is

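\[
f_u(x, y) =
\begin{cases}
x_u - x_j \displaystyle\prod_{\substack{k \in \phi(u) \\ k \in N_f}} P(X_k = 1), & j \in \phi(u) \cap N_g, \\[2ex]
x_u - \displaystyle\prod_{\substack{k \in \phi(u) \\ k \in N_f}} P(X_k = 1), & \phi(u) \cap N_g = \emptyset.
\end{cases} \tag{3}
\]

Note that Equation (3) has two cases. The first case is for rule nodes with one goal node as a predecessor and the second case is for rule nodes with no goal nodes as predecessors. For a goal node $u \in N_g$ with predecessors $\phi(u) = \{v_1, v_2, \ldots, v_t\}$, the constraint function is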

$$f_u(x, y) = x_u - \sum_{k=1}^{t-1} \left[ y_{m_u+k}\, x_{v_k} \prod_{i=1}^{k-1} \bigl(1 - y_{m_u+i}\bigr) \right] - x_{v_t} \prod_{i=1}^{t-1} \bigl(1 - y_{m_u+i}\bigr). \tag{4}$$

All the selector variables for all the goal nodes are numbered consecutively, so that the $y_i$ for node $u$ are $y_{m+1}, y_{m+2}, \ldots, y_{m+t-1}$ for some $m = m_u$ depending on $u$. Note that there is no variable $y_{m+t}$, since $y_{m+t}$ is dependent on $y_{m+1}, y_{m+2}, \ldots, y_{m+t-1}$; $y_{m+t} = 1$ only when all other selectors for $u$ are zero.

Let $f(x, y) = (f_1, f_2, \ldots, f_M)^T$ be a vector-valued function. The nonlinear program for finding the most vulnerable components is

$$\begin{aligned} \text{maximize} \quad & x_G \\ \text{subject to} \quad & f(x, y) = 0, \\ & 0 \le x_i \le 1, \quad i = 1, \ldots, M, \\ & 0 \le y_i \le 1, \quad i = 1, \ldots, P. \end{aligned} \tag{5}$$

In (5), the vector-valued function f(x, y) holds allthe constraint functions (that is, (3) and (4)) for allrule and goal nodes in the attack graph. Note thatthe constraints in f(x, y) are the ECSA equations (1)and (2) set to zero.

4.3 Computational Procedure

For a network configuration $w$, let $G_w$ be the corresponding attack graph. The complete procedure to compute the ECSA values of nodes (Definition 2) for an attack graph (Definition 1) is given next.

To prepare the attack graph for computation, we execute the following procedure.

Procedure 1:

1. Determine the set of initial belief values $B^0 = \{E[X_{u^f_1}], E[X_{u^f_2}], \ldots, E[X_{u^f_{|N_f|}}]\}$ for each fact node $u^f \in N_f$. Let $B^0_i$ denote $E[X_{u^f_i}]$.

2. Create a set of fact nodes $N'_f$ such that $|N'_f| = |N_f|$, where for each node $u_i \in N_f$ there is a node $v_i \in N'_f$ corresponding to the same network item description (i.e., $d_{u_i}$ is identical to $d_{v_i}$) and with $E[X_{v_i}] = B^0_i$.

3. Update the attack graph $G_w$ such that the original set of fact nodes $N_f$ is replaced with the new quantified set of fact nodes $N'_f$, producing attack graph $G'_w$.
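The three steps of Procedure 1 amount to attaching a belief value to every fact node while keeping its description unchanged. A minimal sketch follows, assuming a hypothetical `FactNode` record and a lookup table of beliefs keyed by the node description. Neither is part of the paper's tool, and the node descriptions and values are made up.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FactNode:
    description: str                 # network item description d_u
    belief: Optional[float] = None   # E[X_u]; None until quantified


def quantify_fact_nodes(fact_nodes: List[FactNode],
                        initial_beliefs: Dict[str, float]) -> List[FactNode]:
    """Return the quantified set N'_f: same descriptions, beliefs filled in."""
    quantified = []
    for node in fact_nodes:
        belief = initial_beliefs[node.description]
        if not 0.0 <= belief <= 1.0:
            raise ValueError(f"belief for {node.description!r} is outside [0, 1]")
        quantified.append(replace(node, belief=belief))
    return quantified


# Hypothetical fact nodes and beliefs.
n_f = [FactNode("fact_a"), FactNode("fact_b")]
n_f_prime = quantify_fact_nodes(n_f, {"fact_a": 0.9, "fact_b": 0.64})
```

Because the records are immutable, the original set $N_f$ is left untouched and the quantified set can be swapped into the graph in a single step.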

We input the resulting attack graph $G'_w$ to the procedure below for computing the maximum possible ECSA values.

Procedure 2:

1. Transform $G'_w$ into the corresponding mathematical program $P_w$ as explained in Section 4.2.

2. With $Z = (x, y)$, choose a starting point $Z^0$ with each variable being a random value in the range [0, 1].

3. Replace all the nonlinear functions $f_i(Z)$ with a linear approximation $f_i(Z) \approx f_i(Z^0) + \nabla f_i(Z^0)(Z - Z^0)$.

4. To prevent large changes in $Z$, add the constraint $|Z_i - Z^0_i| \le \varepsilon$, that is, each variable can change by no more than $\varepsilon$.

5. Solve the resulting LP problem using an efficient LP method, such as the simplex method, producing the candidate optimal point $Z^*$, which replaces $Z^0$.

6. Repeat Steps 3–5 until the solution converges to a stationary point.
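The iteration in Steps 2 to 6 can be sketched on a toy problem. The sketch below is not the paper's implementation and makes several simplifying assumptions, all labeled in the comments: the equality constraints are eliminated by substitution so that $x_G$ becomes a function of the selectors only, gradients come from central finite differences rather than symbolic differentiation, the starting point is fixed rather than random, and the remaining box-constrained LP is solved coordinate-wise in closed form instead of calling an LP solver. The toy graph and its belief values are hypothetical.

```python
from typing import Callable, List, Sequence


def numeric_gradient(f: Callable[[Sequence[float]], float],
                     z: Sequence[float], h: float = 1e-6) -> List[float]:
    """Central finite differences (assumption: stands in for symbolic derivatives)."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def slp_maximize(f: Callable[[Sequence[float]], float], z0: Sequence[float],
                 eps: float = 0.1, tol: float = 1e-7,
                 max_iter: int = 100) -> List[float]:
    """Sequential linear programming on the box [0, 1]^n with step bound eps."""
    z = list(z0)
    for _ in range(max_iter):
        g = numeric_gradient(f, z)  # Step 3: linearize around the current point
        # Steps 4-5: maximize g . (z_new - z) s.t. |z_new_i - z_i| <= eps and
        # 0 <= z_new_i <= 1. This LP is separable, so each coordinate simply
        # moves to the bound favoured by the sign of its gradient entry.
        z_new = []
        for z_i, g_i in zip(z, g):
            if g_i > tol:
                z_new.append(min(1.0, z_i + eps))
            elif g_i < -tol:
                z_new.append(max(0.0, z_i - eps))
            else:
                z_new.append(z_i)
        step = max(abs(a - b) for a, b in zip(z_new, z))
        z = z_new
        if step < tol:  # Step 6: stop at a stationary point
            break
    return z


def x_goal(y: Sequence[float]) -> float:
    """Hypothetical reduced objective: goal G chooses between rule r1 and rule r2,
    and r2 depends on a second goal g2 that chooses between two further rules."""
    y1, y2 = y
    r1 = 0.8 * 0.5                       # rule with two fact predecessors
    g2 = y2 * 0.6 + (1.0 - y2) * 0.3     # goal node with one selector
    r2 = 0.9 * g2                        # rule with one fact and one goal predecessor
    return y1 * r1 + (1.0 - y1) * r2


y_star = slp_maximize(x_goal, [0.5, 0.5])
```

On this toy objective the iterates move the first selector toward 0 and the second toward 1, ending at the attack path with the largest product of beliefs. A problem that keeps the equality constraints would need a general LP solver in Step 5.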

Procedure 2 computes the maximum possible ECSA value for node $u$ in the attack graph. Our procedure is an instance of sequential linear programming (SLP) [28]. SLP is a standard technique for solving nonlinear optimization problems, which is found to be computationally efficient and to converge to an optimal solution [29].

Complexity. In terms of computational efficiency, all of the steps in Procedures 1 and 2 require polynomial time in the number of nodes. The most complex step is the fifth step in Procedure 2, whose complexity depends on that of the LP algorithm; the simplex method is polynomial in practice [27]. Since SLP has linear convergence, the number of iterations is also polynomial.

Optimizing the initial attack graph. Our attack graphs have goal nodes with no outgoing arcs (the ultimate goal) and may include several paths towards satisfying a goal. It is possible to remove unnecessary paths that may or may not be taken by an attacker towards the ultimate goal. An experienced attacker may take a near minimum path towards the goal, thus saving time and perhaps bypassing difficult paths along the way. Finding a minimal attack graph will precede any computation with regard to the ECSA value.

4.4 Attack Dependencies

A major problem in probabilistic risk assessment is to accurately capture attack step dependencies and correlations. Attack dependencies in the form of attack preconditions are intrinsically captured by our model. That is because we base our analysis on attack graphs that are formed based on the dependency relations among the nodes. Therefore, the probabilities of success are computed by considering the dependency relations determined in an attack graph.

Definition 5: An attack step represented by a goal or rule node $u$ in an attack graph is dependent on another attack step $v$ if achieving $v$ affects the decision of the attacker in achieving $u$.

The dependency, as defined in Definition 5, occurs when a dependent node $u$ is a direct or indirect successor of $v$. The only way $u$ can be dependent on $v$ is if $v$ is known to have $X_v = 1$. Knowing $X_v = 1$ indicates an attack has succeeded, and the attacker is now using that knowledge to stage a second attack. In our current model, we assume independence of all attack steps since the scope of this paper is limited to analyzing a single attack. The attack step dependencies could occur when multiple consequent attacks are analyzed. To compute these dependencies, consider the following formulation.

Let $A$, $B$, and $C = AB$ be the random variables associated with nodes in an attack graph. If we assume $A$ and $B$ are dependent, then

$$E[C] = P[C = 1] = P[A = 1 \wedge B = 1] = P[A = 1 \mid B = 1]\, P[B = 1] = E[A \mid B = 1]\, E[B].$$

To compute $E[A \mid B = 1]$, set $B = 1$, which forces all predecessors of $B$ to also be 1, and recompute all expected values at nodes affected by assuming $B = 1$. After

$$E[C] = E[A \mid B = 1]\, E[B]$$

is computed, the rest of the computation proceeds without modification.
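To make the identity above concrete, the following sketch checks it on a small joint distribution of two dependent Bernoulli variables. The joint probabilities are hypothetical and chosen only so that $A$ and $B$ are visibly dependent.

```python
from typing import Dict, Tuple


def expected_product(joint: Dict[Tuple[int, int], float]) -> Tuple[float, float]:
    """joint[(a, b)] = P[A = a, B = b] for a, b in {0, 1}.

    Returns E[AB] computed directly and via E[A | B = 1] * E[B]."""
    p_b1 = joint[(0, 1)] + joint[(1, 1)]          # E[B] = P[B = 1]
    direct = joint[(1, 1)]                        # E[AB] = P[A = 1 and B = 1]
    cond = joint[(1, 1)] / p_b1 if p_b1 > 0 else 0.0  # E[A | B = 1]
    return direct, cond * p_b1


# Hypothetical dependent pair: E[A] = 0.6, E[B] = 0.5, but E[AB] = 0.4.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}
direct, via_conditional = expected_product(joint)
independent_guess = (joint[(1, 0)] + joint[(1, 1)]) * (joint[(0, 1)] + joint[(1, 1)])
```

Here the conditional form reproduces $E[C] = 0.4$, whereas multiplying the marginals as if the steps were independent would give 0.3.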

Given the above formulation, we conclude that considering dependence in attack information in our success measurement model (in contrast to existing probabilistic work) is straightforward and does not require significant additional computation.

5 SECURITY OPTIMIZATION

To achieve our main research goal (described in Section 3) of reducing the probability of success in an attack, and thus optimizing the overall security of the network, we point out the necessity of modeling this problem as an optimization problem. Further, we model an important feature, namely the availability of machines in the network. In this section we describe these two contributions of our work, as summarized below.

• Optimizing the security of the network. Given a set of security hardening products (e.g., a host-based firewall), we compute an optimal distribution of these resources subject to given placement constraints. Using the rigorous probabilistic model introduced in Section 4.1, this is the first work in which a logical attack graph (Definition 1) is transformed into a system of linear and nonlinear equations with the global objective of reducing the probability of success on the graph's ultimate attack goal. This transformation is performed efficiently and naturally, and directly captures our research goal.

• Machine availability and the effect of mobile devices. Our work is the first to show how to represent and assess devices with variable availability (frequently joining and leaving the network), which is one of the characteristics of mobile devices with variable connectivity.

5.1 Optimizing the security of the network

With limited resources for hardening an organizational network, it is important to install a single or a combination of security hardening products so that the expected chance of a successful attack on the network is minimized. To find the best placement of a set of security products in a network, we extend the attack graph to define a security product as a special fact node referred to as an improvement node, which is a fact node that represents a security hardening product, service, practice, or policy.

The objective of solving the problem of optimal placement of security products is to compute the effects of various placements of one or more improvement nodes subject to certain constraints and choose the placement that minimizes the attack goal's ECSA value.

The following describes computing the best place to deploy a single security product (which can be generalized to multiple security products) in the network. We formulate this optimal placement problem as a minimax problem: finding the best placement of the improvement option that minimizes $x_G$, where $x_G$ is the maximum of $E[X_G]$ with respect to $X_u$ and $Y_i^{(u)}$.

We consider a single improvement option for rule nodes given deployment constraints. We define the set of admissible rule nodes $N_r^a \subseteq N_r$ as a subset of all rule nodes. Let $P(X_\tau = 1)$ be the initial belief of some improvement option $\tau$. The problem is to find a configuration that minimizes $x_G$. That is, we aim to find a rule node $u \in N_r^a$ such that if $\tau \in \phi(u)$, the value of $x_G$ is minimized.

Let $A = |N_r^a|$ and $j_1 < j_2 < \cdots < j_A$ be the nodes in $N_r^a$. Define 0-1 variables $t_{j_i}$ for $i = 1, \ldots, A$ and let $T = (t_{j_1}, \ldots, t_{j_A})$. A single improvement corresponds to the constraint

$$t_{j_1} + t_{j_2} + \cdots + t_{j_A} = 1,$$

and the generalization to multiple improvements is obvious.

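We modify the definition of $f_u(x, y)$ for a rule node given in Equation (3) to include the effect of the improvement option $\tau$. For a rule node $u \in N_r^a$, define

$$f_u(T, x, y) = \begin{cases} x_u - \bigl(P(X_\tau = 1)\bigr)^{t_u}\, x_j \prod_{\substack{k \in \phi(u) \\ k \in N_f}} P(X_k = 1), & j \in \phi(u) \cap N_g, \\[6pt] x_u - \bigl(P(X_\tau = 1)\bigr)^{t_u} \prod_{\substack{k \in \phi(u) \\ k \in N_f}} P(X_k = 1), & \phi(u) \cap N_g = \emptyset. \end{cases} \tag{6}$$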

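$f_u$ is unmodified for rule nodes $u \in N_r$ outside $N_r^a$. This modified definition adds the improvement node at exactly one rule node in $N_r^a$. Note that the definition of $f_u$ for a goal node is identical to Equation (4). The minimax problem to find the best placement of security products is

$$\begin{aligned} \underset{T \in \{0,1\}^A}{\text{minimize}} \quad & x_G \\ \text{subject to} \quad & t_{j_1} + \cdots + t_{j_A} = 1, \end{aligned} \tag{7}$$

where $x_G$ is the solution to

$$\begin{aligned} \underset{x,\, y}{\text{maximize}} \quad & x_G \\ \text{subject to} \quad & f(T, x, y) = 0, \\ & 0 \le x_i \le 1, \quad i = 1, \ldots, M, \\ & 0 \le y_i \le 1, \quad i = 1, \ldots, P. \end{aligned} \tag{8}$$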

The minimax problem (7) maximizes the ECSA value of the attack's goal ($E[X_G]$) to find the highest chance of success in attacking a specific network component (such as a server). The result of the inner maximization problem (8) is then used in the outer minimization problem (7) to find the best placement of the security product such that the maximized ECSA is minimized.

The inner maximization problem is solved using SLP as before. The outer minimization problem is a limited combinatorial problem for one improvement. For multiple improvements, the outer problem can be solved by an LP relaxation (change $t_i \in \{0, 1\}$ to $0 \le t_i \le 1$) with branch and bound. For $k$ improvements, the complexity is $\binom{A}{k}$.
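For a single improvement, the outer problem (7) can be illustrated by enumerating the admissible rule nodes and re-solving the inner problem (8) for each. The sketch below uses a hypothetical four-rule acyclic graph. As a labeled shortcut, it does not run SLP for the inner problem: in Equation (4) the selector weights are nonnegative and sum to one, so on an acyclic graph the maximum over the selectors is attained by taking the largest predecessor value at every goal node. All node names and belief values are assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy graph.
# rule node -> (product of its fact beliefs, its goal predecessor or None)
RULES: Dict[str, Tuple[float, Optional[str]]] = {
    "r1": (0.2, None),
    "r2": (0.9, "g2"),
    "r3": (0.6, None),
    "r4": (0.3, None),
}
# goal node -> rule-node predecessors
GOALS: Dict[str, List[str]] = {"G": ["r1", "r2"], "g2": ["r3", "r4"]}


def worst_case_ecsa(node: str, placed_at: Optional[str], tau: float) -> float:
    """Inner problem (8) on the toy graph, using the max-predecessor shortcut."""
    if node in GOALS:
        return max(worst_case_ecsa(v, placed_at, tau) for v in GOALS[node])
    fact_product, goal_pred = RULES[node]
    # Equation (6): multiply by P(X_tau = 1) only where the improvement sits.
    value = fact_product * (tau if node == placed_at else 1.0)
    if goal_pred is not None:
        value *= worst_case_ecsa(goal_pred, placed_at, tau)
    return value


def best_single_placement(admissible: List[str], tau: float,
                          target: str = "G") -> Tuple[str, float]:
    """Outer problem (7) for one improvement: try each admissible rule node."""
    scores = {u: worst_case_ecsa(target, u, tau) for u in admissible}
    best = min(scores, key=scores.get)
    return best, scores[best]


baseline = worst_case_ecsa("G", None, 0.3)
best_node, best_value = best_single_placement(["r1", "r2", "r3", "r4"], tau=0.3)
```

In this toy instance, protecting the weakest rule (`r4`) or the rule on the losing branch (`r1`) leaves the goal's worst-case value unchanged. Only a placement on the dominant attack path lowers it, which is why each candidate has to be evaluated through the inner maximization.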

5.2 Machine Availability and Threats from Mobile Devices

To capture the increase in security threats due to the inclusion of mobile devices (such as laptops, smartphones, and tablet computers) in the network, our approach is to extend an original attack graph for a network to include attack paths from mobile devices. Specifically, we define special rules to represent the uncertain availability of mobile devices in an attack graph, as well as the corresponding ECSA formulation and computation. The ability to model the availability of machines in attack graphs is general and useful beyond the specific mobile devices studied.

Attack graph extension. We extend the rules of the MulVAL attack graph generator [30] to include exploitation rules that capture the availability of mobile devices. An identified mobile device may not always appear in the network. Mobile devices rarely include server software. The majority of Internet-based mobile applications are clients to the outside world, requiring interaction with malicious input to execute a successful exploit. For instance, most of the vulnerabilities that we studied for the Android platform involved an interaction with malicious code (e.g., a malicious website) and exploiting a local vulnerability. Accordingly, we define basic exploitation rules for mobile devices in Figure 3.

    execCode(H,Perm) :-
        compromised(H),
        vulExists(H,Vulid,localExploit,privEscalation).

    compromised(H) :-
        deviceOnline(H,Platform),
        vulExists(H,Vulid,remoteClient,codeExecution),
        maliciousInteraction(H,_,App).

Fig. 3. The two predicates describe attack stages (i.e., remote and local exploits). The predicate deviceOnline(H,Platform) captures the availability of the device H.

We capture the availability of a device with the node deviceOnline(H,Platform). In the success measurement model, these nodes are dynamic nodes with no fixed initial belief. The availability of a device may be measured as the percentage of the time that the device is connected within the target network (e.g., through a wireless connection) in a certain period. This data may be collected or estimated for the target network.

Note that our notion of availability is general and is not necessarily bound to, for example, the connectivity of a machine. An availability element in our method may capture a machine's connectivity, the responsiveness of a particular vulnerable service, or the role of a firewall rule that might limit the availability of a service. For instance, our model captures the scenario in which a software firewall, based on a specific policy, limits the number of TCP connections opened by the Apache web server for a specific period of time. When analyzing a vulnerable network according to our model, a fractional probability value on a fact node representing a service such as Apache accurately reflects the limited attack chance that results from the limited availability.

ECSA for mobile devices. For mobile device fact nodes, the availability of the device cannot be deterministically specified. Thus, fact nodes similar to deviceOnline(H,Platform) cannot have a precomputed value for all instances of ECSA computation. In order to solve this issue, we define a stochastic fact node as a fact node that represents a dynamic ground fact that is not associated with a fixed initial belief. Each stochastic fact node $u$ is represented using a Bernoulli random variable $X_u$. For instance, for the node deviceOnline(H,Platform), $E[X_u] = P[X_u = 1]$ is the probability of the event that the device is online.
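One way to obtain $E[X_u]$ for a deviceOnline node is to measure the fraction of an observation window during which the device was connected. That estimate then enters the product of Equation (3) for the `compromised(H)` rule of Figure 3. The sketch below assumes non-overlapping connection intervals in hours; the interval data and the two remaining belief values are hypothetical.

```python
from typing import Sequence, Tuple


def availability(online_intervals: Sequence[Tuple[float, float]],
                 window_start: float, window_end: float) -> float:
    """Fraction of the observation window during which the device was connected.

    Assumes the (start, end) intervals do not overlap one another."""
    window = window_end - window_start
    if window <= 0:
        raise ValueError("observation window must have positive length")
    online = 0.0
    for start, end in online_intervals:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        online += max(0.0, clipped_end - clipped_start)
    return online / window


def compromised_ecsa(p_online: float, p_vuln: float, p_interaction: float) -> float:
    """Equation (3), second case, for a rule whose predecessors are three fact nodes:
    deviceOnline, vulExists, and maliciousInteraction."""
    return p_online * p_vuln * p_interaction


# Hypothetical day: the phone is on the network 09:00-12:00 and 13:00-18:00.
p_online = availability([(9.0, 12.0), (13.0, 18.0)], 0.0, 24.0)
x_compromised = compromised_ecsa(p_online, p_vuln=0.75, p_interaction=0.5)
```

A device that is online a third of the time scales the chance of the remote exploit by the same factor, which is how limited availability lowers the ECSA of downstream goals.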

6 DETERMINATION OF INITIAL BELIEF

In this section we discuss and provide a concrete example for choosing initial belief values for fact nodes and improvement nodes.

6.1 Discussion

Our model relies on the availability of the initial belief values, which are initial estimates of vulnerabilities at a subset of nodes in an attack graph. Since this raises a concern about the practicality of determining these values, there have been a number of recent attempts to automate this process. One notable attempt is presented by Wang et al. [31], where a vulnerability assessment metric is developed that can be computed regardless of the type of the vulnerability itself. That is, the metric relies on the number of unknown vulnerabilities required to compromise a network.

References [32] and [33] provide various ways to calculate metrics for zero-day or known vulnerabilities. While a comprehensive measurement model for computing initial estimates of vulnerability impacts may still be needed, existing methods as well as expert knowledge suffice for our computations.

6.2 Example

Initial belief for fact nodes. An initial belief value is a given probability of success $P(X_{u_i} = 1)$ at a fact node $u_i \in N_f$. Our success measurement model relies on a relatively small set of initial beliefs that provide an estimation of the expected chance of success for specific attacks on network services. In an attack graph, these network service vulnerabilities are formalized as fact nodes. The methods for obtaining initial belief values may vary. We illustrate some specific approaches next.

For documented software vulnerabilities, the value of standard vulnerability scores (such as CVSS) is used as an estimation of the expected chance of success in exploiting the vulnerability. The steps for assigning the initial belief values follow.

Analyzing the network configuration. A server A runs MySQL listening on port 3306, allowing remote connections. To protect A, iptables rules are set to allow tcp/udp connections either locally or to specific IP addresses inside a NAT subnet. These IP addresses belong to workstations from which the database administrators and developers connect to the server A, and a web server that runs the web applications.

Analyzing attacks and vulnerabilities. An attacker can exploit a remote privilege escalation vulnerability from a workstation W1 to a developer workstation W2. Since A accepts MySQL connections from W2, the attacker uses one of multiple remote denial of service vulnerabilities (such as CVE-2012-3147, with a CVSS base score of 6.4/10) to launch a denial of service attack on the MySQL server in A.

Assigning initial belief values. With multiple documented vulnerabilities with similar effects on $u_2$, we compute the value $P(X_{u_2} = 1) = \max(s_1, s_2, \ldots, s_K)$, where $s_j$ is a value in [0, 1] based on the CVSS base score for a vulnerability $j$ (for example, the score divided by 10), with $K$ documented vulnerabilities. Alternatively, take $P(X_{u_2} = 1) = \mu(s_1, s_2, \ldots, s_K)$, where $\mu$ is the mean of the score values.
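The two aggregation rules above can be sketched as follows. The score 6.4 is the CVSS base score quoted for CVE-2012-3147 in the text; the other two scores and the function name are hypothetical and stand in for further vulnerabilities with a similar effect.

```python
from statistics import mean
from typing import Sequence


def belief_from_cvss(base_scores: Sequence[float], aggregate: str = "max") -> float:
    """Map CVSS base scores (0-10) to an initial belief in [0, 1]."""
    if not base_scores:
        raise ValueError("at least one score is required")
    scaled = [score / 10.0 for score in base_scores]  # s_j = score / 10
    if any(not 0.0 <= s <= 1.0 for s in scaled):
        raise ValueError("CVSS base scores must lie in [0, 10]")
    if aggregate == "max":
        return max(scaled)
    if aggregate == "mean":
        return mean(scaled)
    raise ValueError("aggregate must be 'max' or 'mean'")


scores = [6.4, 5.0, 4.3]  # 6.4 from the text; 5.0 and 4.3 are hypothetical
p_max = belief_from_cvss(scores, "max")
p_mean = belief_from_cvss(scores, "mean")
```

The maximum is the more conservative choice for a defender, since it assumes the attacker picks the most exploitable of the available vulnerabilities.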

We create another fact node as a dependency of the rule node $u_1$ (see Figure 4), denoted $u_3$, to indicate that incoming traffic on port 3306 is allowed from host W2. We choose the probability value $P(X_{u_3} = 1) = 1$, indicating that the connection to port 3306 is reliable and the attacker is knowledgeable about port 3306 when attacking a MySQL database server. Otherwise, depending on the network configurations, we can set $P(X_{u_3} = 1) < 1$, with a reasonable value.

Initial belief for improvement nodes. Initial belief values for improvement nodes correspond to the reliability of the security solution represented by the nodes. There are several assessment factors for computing the initial belief values. We categorize these factors into two main groups: (i) effectiveness and (ii) deployment. Effectiveness is measured by detection accuracy and the rate of false positive/negative decisions. The deployment factor includes measurements for memory consumption, CPU utilization, library dependencies, maintenance, and financial cost.

[Figure 4: attack graph fragment. Node labels and values: networkService(A, MySQL, 3306, W2) with E[u3] = 1; vulnerability(A, remoteDoS, 6.4) with E[u2] = 0.64; attackerIn(W2) with E[u4] = 0.7; denialOfService with E[u1] = 0.448.]

Fig. 4. $u_1$ is a denial of service on A, $u_2$ is a vulnerability, $u_3$ is a network service info, and $u_4$ indicates attacker reached W2 that can access A.
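The value $E[u_1] = 0.448$ shown in Figure 4 can be reproduced from the first case of Equation (3), under the assumption that $u_4$ is the single goal predecessor of the rule node $u_1$ and that $u_2$ and $u_3$ are its fact predecessors:

```latex
\begin{align*}
E[X_{u_1}] &= E[X_{u_4}] \cdot P(X_{u_2} = 1) \cdot P(X_{u_3} = 1) \\
           &= 0.7 \times 0.64 \times 1 \\
           &= 0.448 .
\end{align*}
```

Lowering $P(X_{u_3} = 1)$ below 1, as discussed for less reliable connections, would scale the result by the same factor.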

To compute an estimated initial belief value for a security product, we use the mean of all the effectiveness and deployment parameters. Let $Z_k^{(u_i)}$ be a Bernoulli variable for an assessment factor $k$ for improvement option $u_i$, and let $L$ be the total number of assessment factors. We define the expected value for $X_{u_i}$ as

$$E[X_{u_i}] = \frac{\sum_k E\bigl[Z_k^{(u_i)}\bigr]}{L}. \tag{9}$$
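Equation (9) is an unweighted mean over the assessment factors. A minimal sketch follows, with three hypothetical factor values that are not taken from the paper's experiments.

```python
from typing import Sequence


def improvement_belief(factor_values: Sequence[float]) -> float:
    """Equation (9): mean of E[Z_k] over the L assessment factors."""
    if not factor_values:
        raise ValueError("at least one assessment factor is required")
    if any(not 0.0 <= value <= 1.0 for value in factor_values):
        raise ValueError("each E[Z_k] must lie in [0, 1]")
    return sum(factor_values) / len(factor_values)


# Hypothetical factors: one effectiveness value and two deployment values.
belief = improvement_belief([0.9, 0.6, 0.75])
```

Because every factor carries the same weight, adding more deployment factors dilutes the influence of the effectiveness factors, which may or may not be desirable for a given product.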

For an effectiveness factor $k$, the value of $E[Z_k^{(u_i)}]$ indicates the accuracy of improvement option $u_i$. For a deployment factor $k$, a higher value of $E[Z_k^{(u_i)}]$ indicates lower deployment overhead.

In the example scenario of Section 6, we create an improvement node for additional iptables rules to improve security. For instance, we modify the firewall rules on server A to allow connection to the database server on an unusual port $p$ other than the default 3306, and also change the MySQL socket configuration to listen on port $p$. Then we create an improvement node $u_5$ for an iptables rule dropping ICMP requests and limiting TCP ACK packets to already established connections to prevent the attacker from easily finding the port number $p$ through a port scanner such as nmap. We expect that the firewall rule of the node $u_5$ has an average effectiveness (some attacks may bypass this rule) with virtually no deployment overhead. Thus we compute the initial belief value for $u_5$ as $P(X_{u_5} = 1) = 0.5 \left( E[Z_1^{(u_5)}] + 0.5 \ast E[Z_2^{(u_5)}] \right)$, with a value of $E[Z_1^{(u_5)}] \ge 0.5$ for the effectiveness factor and $E[Z_2^{(u_5)}] = 1$ for the deployment factor.

7 EXPERIMENTS

To validate our models (introduced in Sections 4 and 5), we conduct four experiments on an actual corporate network (depicted in Figure 5). Our experiments focus on (i) computing the ECSA values for the network, (ii) assessing security defense strategies, (iii) adding mobile device data to the analysis, and (iv) security improvement without installation of new devices.

[Figure 5: network topology diagram. Labeled elements: Internet; Attacker (two positions); two Firewalls; three Public DMZ subnetworks; Private DMZ (Trusted); Web Server 1; Web Server 2; Mail Server; Application Server Linux 1; Cloud Server Linux 2; Application Server Win 3; Application Server Win 4; MySQL Database Server; Backup Server; Printer 1; Workstation; Wireless access point; Smartphone; Laptop.]

Fig. 5. Each machine on the three public DMZ subnetworks runs at least one network service with an open port. Data servers are on a NAT subnetwork and can only be accessed through the workstation. The attacker either attacks remotely or uses a phone to crack the wireless password and attack the servers.

We implemented a tool for our computational procedures (Section 4.3) in Java (approximately 3500 lines of code). We use the GNU Linear Programming Kit (GLPK) [34], a well-known open source linear programming API, for our SLP-based procedure.

Our tool parses an attack graph input file (obtained from MulVAL [30]), computes the ECSA values according to various parameters, and performs security improvement analysis based on a set of improvement options and constraints.

In Figure 6, we demonstrate the performance of our implementation. For each graph, we repeat the corresponding experiment to measure the time to compute the final expected chance of a successful attack at the graph's root vertex.

We compute ECSA values for the target graphs using our tool. We run our tool as a single-threaded program on a machine with a 2.4 GHz Intel Core i7 processor and 8 GB of DDR3 memory. All our experiments converged within at most 20 iterations. On average, 87.99% of the execution time for Procedure 2 is spent on the Taylor expansion, of which on average 78.27% is spent on symbolic differentiation performed using the DJep Java library for symbolic operations (see footnote 1). The Taylor expansion is parallelizable and scales with the number of vertices, and hence can be done efficiently offline.

1. DJep is available on http://www.singsurf.org/djep/.


Fig. 6. The x-axis captures the number of vertices for each experiment, while the y-axis shows the average time measured in seconds to execute one iteration of Procedure 2. On average 87.99% of the time is spent on computing the Taylor expansion.

7.1 Experimental setup

The target network of Figure 5 is open to a large number of users and contains several servers and workstations. This network has low usage restrictions and allows untrusted mobile devices to enter the network without mandatory security scanning. (Some of the data is sanitized while preserving the general structure and vulnerability information.) In this network, a connected user can easily obtain information about the network topology, and perform port scanning and operating system fingerprinting.

Generating the attack graphs. We used network scanning tools (such as nmap), online vulnerability repositories, and information provided by system administrators to create a network topology (depicted in Figure 5) and the attack graphs that represent the real network. We performed wireless network scanning to confirm the connectivity of wireless devices to the network. We generate two attack graphs (Table 1) with slight variations. Attack graph A (483 nodes) assumes no mobile devices in the network (i.e., availability of mobile devices is 0%), while attack graph B (549 nodes) includes attack scenarios from untrusted mobile devices.

Assigning the initial beliefs. A subset of the initial belief values (22 values for attack graphs A and B) for computing ECSA values for our example attack graphs is given in Figure 7. All the entries in the file correspond to known vulnerabilities for which a Common Vulnerability Scoring System (CVSS) score is available. A CVSS score is a number in the range [0, 10] that represents the exploitability level of a vulnerability. We use this number, scaled to [0, 1], as an approximation of the probability of success for known vulnerabilities.

Initial belief values are required for every ground fact node. In our network, attack graph A contains 229 ground fact nodes. We use an automated technique to determine the initial belief values for all the nodes without a CVSS score. Of the 229 ground fact nodes, 161 nodes describe host access control information between two machines in the network. We assume that all the actual connections are highly reliable, thus setting an initial belief value for availability of these hosts to 0.9. Our tool automatically detects host access control nodes and sets the initial belief values for them.

The other fact nodes are in three categories: 37 nodes describe network services (such as Apache), 30 describe a vulnerability (for which we use the CVSS scores, as depicted in Figure 7), and one node represents the existence of an attacker. Similar to the host access control nodes, we apply unified initial belief values for each category. Note that these parameters may be adjusted to test various scenarios, for example, under low probability of existence of an attacker.
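The category-based assignment described above can be sketched as a lookup with one unified value per category and a CVSS-derived value for vulnerability nodes. Only the 0.9 value for host access control nodes is stated in the text. The category labels and the remaining two defaults are hypothetical placeholders that a user would adjust per scenario.

```python
from typing import Dict, Optional

DEFAULT_BELIEFS: Dict[str, float] = {
    "host_access_control": 0.9,  # value stated in the text
    "network_service": 1.0,      # hypothetical placeholder
    "attacker": 1.0,             # hypothetical placeholder
}


def initial_belief(category: str, cvss_base_score: Optional[float] = None,
                   defaults: Dict[str, float] = DEFAULT_BELIEFS) -> float:
    """Return the initial belief for a ground fact node of the given category."""
    if category == "vulnerability":
        if cvss_base_score is None:
            raise ValueError("vulnerability nodes need a CVSS base score")
        if not 0.0 <= cvss_base_score <= 10.0:
            raise ValueError("CVSS base scores must lie in [0, 10]")
        return cvss_base_score / 10.0  # score scaled to [0, 1]
    return defaults[category]


hacl_belief = initial_belief("host_access_control")
vuln_belief = initial_belief("vulnerability", cvss_base_score=7.5)
# Scenario with a low probability that an attacker exists (hypothetical value).
cautious = initial_belief("attacker", defaults={**DEFAULT_BELIEFS, "attacker": 0.1})
```

Passing a modified `defaults` table is one way to test alternative scenarios without touching the attack graph itself.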

Vulnerability fact node u_i    E[X_{u_i}]
CVE_2006_1516                  5.0
CVE_2006_1518                  6.5
CVE_2008_1483                  6.9
CVE_2006_5752                  4.3
CVE_2011_1929                  5.0
CVE_2011_1968                  7.1
CVE_2004_0331                  5.0
CVE_2009_4565                  7.5
CVE_2005_2090                  4.3
CVE_2010_1899                  4.3

Fig. 7. First ten entries in the initial belief values file (containing 22 entries) for attack graphs A and B. We use the Common Vulnerability Scoring System values as an approximation of $E[X_{u_i}]$ for documented vulnerabilities.

7.2 Chances of a Successful Attack

Given complicated attack structures represented by an attack graph of the network, it is particularly interesting to analyze the attack to understand the weakest points of the network that enable the ultimate attack goal. In addition to computing the highest expected chance of success given an ultimate attack goal $E[X_G]$, the solution to Equation (5) as described in Section 4.2 finds the expected chance of successful attack on intermediate attack goals that are necessary to achieve $G$.

To verify the solution computed by our tool, consider the partial view of attack graph A in Figure 8. We highlight two attack vectors leading to privilege escalation on the database server, namely through compromising servers 3 and 4 (see Figure 5). Computing the ECSA for all the nodes in the graph, the results suggest that both application servers 3 and 4 (denoted server3 and server4 in Figure 9) have high ECSA values for their goal node, indicating high chances of successful attacks. This is because application servers 3 and 4 have highly scored software vulnerabilities with a number of open ports that increase the attack surface, and thus are relatively more exposed to the outside world. The chance of successful attack on the target database server is the lowest, which is due to a better network configuration to protect it. In our target network, the probability of success (based on our computation) for compromising the database server is a function of both the success probability at preceding goals (thus taking into account the dependency on previous stages of the attack) and the independent probability of success at the database server itself. Our computation is more rigorous than simply regarding the probability of success at the database server to be the same as the probability for compromising the preceding servers in the attack chain.

The results obtained from this experiment significantly improve on the manual inspection of network vulnerabilities, even with the assistance of plain attack graphs. In addition, the ECSA values and the mathematical programming structure of Equation (5) lay the foundation for an efficient assessment of security improvement options as discussed in Section 7.3.

Attack Graph    Hosts  Nodes  Edges  Placement Options  Min. Size of Initial Belief Set
A: No mobile    13     483    663    206                22
B: With mobile  13     549    757    235                22

TABLE 1. Attack graph A is generated with no mobile devices in the network and attack graph B is generated with two mobile devices. Placement options refers to the number of nodes that can be considered for the addition of an improvement node.

[Figure 8: partial attack graph. Labeled nodes: 25f (starting point of attack); 6g (privilege escalation on server 3); 96g (privilege escalation on server 4); 1g (privilege escalation on DB server, the ultimate goal); further nodes 2r, 3g, 464f, 465f, 466r, 467g, 472f, 473f, 474r, 475g, 480f, 481f, 482r, 483f. Edge annotations: network access through Apache vulnerability cve_2006_5752; network access through MySQL vulnerabilities cve_2006_1518 and cve_2006_1516; network access through ssh vulnerability cve_2008_1483; other attack vectors; Apache exploits; Apache and ssh exploits.]

Fig. 8. A simplified partial view of attack graph A. Nodes are labeled with an ID followed by the node type (g for goal, r for rule, and f for fact). The attack sequence starts at node 25 and proceeds with a number of alternatives, among which is compromising either server 3 or server 4 through nodes 6 and 96, respectively. Both nodes 6 and 96 are predecessors of the final attack goal, node 1, which refers to privilege escalation on the database server.

Figure 9 also shows the results for the ECSA computed based on attack graph B, which are discussed in Section 7.4.

7.3 Optimal Placement of Security Products

We used the results from the previous section to find the best placement of an improvement option for the network of Figure 5. Our improvement option is the installation of an intrusion prevention system (IPS) on a single device to minimize the risk on the target host (the database server). Our choice of IPS has some deployment overhead because of memory and CPU usage. After testing its effectiveness, we believe that this IPS has a low false negative rate. Using Equation (9), the initial belief for each improvement fact node for the IPS is $E[X_\tau] = 0.3$.

According to our method (described in Section 5.1), we add all the exploitation rules to the set of applicable placement nodes $N_r^a$ (i.e., 206 nodes for attack graph A and 235 nodes for attack graph B; note that one can choose fewer rule nodes for solving the optimal placement problem, depending on possible placement constraints). Then we modify the original attack graph to include improvement fact nodes as predecessors to each $u \in N_r^a$.

Cursory reasoning may recommend that the target server (i.e., the database server) itself must be where we install the IPS. However, this recommendation may not be optimal. We computed the improvement for the attack graph with no mobile devices and with the mobile devices present in the network. Table 2 shows the improvement results, for each attack graph configuration, ordered based on the percentage decrease in $E[X_G]$. The third column shows the best placement of the IPS. $E'[X_G]$ and $E[X_G]$ denote the expected chances of a successful attack on $G$ (i.e., the database server) in the improved attack graph and the original attack graph, respectively.

The results in Table 2 demonstrate a significant decrease in $E[X_G]$ when considering the improvement option for attack graphs A and B. Our results indicate that installing the IPS on application server 3 has the best effect in minimizing the ECSA of the attack's goal. The reason is that the target server can be attacked from a number of ports indicated by goal nodes. Based on the computed values of the random selectors $Y_i$, a particular port $p_1$ receives a high chance of being used to attack the database server.

[Figure 9: bar chart titled "Computation of ECSA using Equation (5)". x-axis: exploitation goal nodes (web1, web2, server4, mail, server2, server1, server3, printer, workstation, database, laptop, backup, phone). y-axis: E[X], ranging from 0.0 to 0.6. Series: "Mobile device available" and "No mobile devices".]

Fig. 9. ECSA values for attack graphs A (no mobile devices) and B (with mobile devices). In the experiment with mobile devices, the availability of a mobile device is captured with a random variable and is not assumed to be fixed.

Rank  Attack Graph    Machine       E'[X_G]  E[X_G]  % ↓
1     A: No mobile    App Server 3  0.0520   0.1739  70.09
1     B: With mobile  Database      0.0521   0.2651  70.04
2     A: No mobile    Database      0.0552   0.1739  79.18
2     B: With mobile  Workstation   0.0791   0.2651  70.16

TABLE 2. Optimal selection for IPS installation for attack graphs A and B. The attack target is code execution on the database server. The results are compared against the original ECSA values without the improvement option. $E'[X_G]$ is the ECSA value of the improved model.

In the results, attacking the database server from p1 has a lower ECSA compared to attacking application server 3. In the attack graph, attacking application server 3 is a predecessor of attacking the database server on port p1. Thus, the improvement option multiplied by the ECSA of attacking application server 3 reduces the value of E[XG] more, and installing the IPS on application server 3 yields a slightly lower value of E[XG].

Notice that the second-ranked improvement recommendation for attack graph B (obtained during the course of solving the minimization problem (7)) suggests the workstation as the best place to install the IPS. This is consistent with the conclusions from the ECSA values, since the workstation is one of the most vulnerable devices determined in the previous experiment.

7.4 Effect of Mobile Devices

The network architecture presented in Figure 5 is also vulnerable to threats from mobile devices. For example, in the network of Figure 5, the system administrators have allowed mobile devices to join the wireless access point that is set up for internal purposes in the private DMZ region. Also, the laptop (connected to the wireless access point) is directly accessible from the workstation and the printer. Such configurations increase the attack surface. We assessed the security of the network by computing the ECSA values for attack graph B, which includes the attack vectors from mobile devices.

The ECSA in our experiments is computed according to the method for computing the most vulnerable components (Section 4.2). Therefore, the results of the experiment on attack graph B (Figure 9) show lower values for exploiting the application servers, but higher values for exploiting the smartphone and the laptop (with highly scored known software vulnerabilities), the workstation, and the printer. This is because the mobile devices in the network of Figure 5 have highly scored vulnerabilities that make them more attractive to attackers.

From the results of the experiment with mobile devices, we can conclude that the presence of highly vulnerable mobile devices in the network increases the chance of a successful attack on the target machine. Using attack graph B, the most vulnerable components are the workstation, the printer (which has vulnerable server software), and the mobile devices (i.e., the laptop and the smartphone). In this experiment, the chance of success in exploiting the database server is increased by 52.44% (E[XG] rises from 0.1739 to 0.2651).
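The 52.44% figure can be reproduced from the E[XG] values reported in Table 2 for the two attack graphs. The short check below is only a worked calculation; the helper name `relative_increase` is introduced here for illustration.

```python
def relative_increase(baseline, value):
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


ecsa_a = 0.1739  # E[XG] for attack graph A (no mobile devices), from Table 2
ecsa_b = 0.2651  # E[XG] for attack graph B (with mobile devices), from Table 2

increase = relative_increase(ecsa_a, ecsa_b)
print(f"{increase:.2f}%")
```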

7.5 Improving Network Configuration

Our optimal recommendation method is capable of computing an improved network configuration with no extra security products (such as an IPS) added to the network. In particular, we find a port p (amongst all open ports on all machines) such that if it is disabled, the value of E[XG] (the optimum value for (5)) is minimized. That is, for any other port p′, if p′ is disabled in the network (for which we obtain E′[XG]), then E′[XG] ≥ E[XG].
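The selection rule above amounts to an exhaustive search over single-port closures. The sketch below illustrates only that outer loop, under stated assumptions: the function name `best_port_to_close` is hypothetical, the evaluator is a plain lookup standing in for re-solving the optimization problem with a port disabled, and the machine names, port numbers, and ECSA values are made up for the example rather than taken from the experiments.

```python
def best_port_to_close(open_ports, ecsa_if_closed):
    """Return (port, value) minimizing the goal ECSA over single-port closures.

    open_ports: iterable of (machine, port) pairs.
    ecsa_if_closed: callable mapping a (machine, port) pair to the goal ECSA
    obtained after re-solving with that port disabled (assumed interface).
    """
    scored = [(ecsa_if_closed(p), p) for p in open_ports]
    # Compare on the ECSA value only; on ties, the first candidate wins.
    value, port = min(scored, key=lambda item: item[0])
    return port, value


# Stand-in evaluator: a lookup table of illustrative (made-up) values.
hypothetical_results = {
    ("database", 2200): 0.0,
    ("backup", 2200): 0.12,
    ("workstation", 445): 0.21,
}
port, value = best_port_to_close(hypothetical_results, hypothetical_results.get)
```

By construction, every other candidate port yields a goal ECSA at least as large as the returned value, which is the inequality stated in the text.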


We used our method to examine the option on every possible open port that appears in the attack graph. The results of our experiments on attack graphs A (no mobile) and B (with mobile) are summarized in Table 3.

Rank  Attack Graph    Machine, Port     E′[XG]  E[XG]   % ↓
1     A: No mobile    Database, 2200    0.0     0.1739  100
1     B: With mobile  Database, 2200    0.0     0.2651  100
2     A: No mobile    App Server 3, 22  0.0     0.1739  100
2     B: With mobile  Backup, 2200      0.12    0.2651  53.8

TABLE 3. Optimal selection for closing a single port with the best effect on the security of the network.

To verify the accuracy of our method, we considered open ports on the target database server that, if disabled, would eliminate the chance of attack. Although it is common practice to eliminate straightforward attacks on well-known ports, some of the servers in the target network did have open ports with minimal firewall rules.

The results in Table 3 show that the best recommendation is to disable port 2200 on the database server, yielding a zero expected chance of successful attack. The second-ranked recommendations are to close ports on application server 3 and the backup server. Notice that both recommendations achieved a lower value of E[XG], thus improving the security of the network.

All the analyses (including vulnerability and optimization analyses) conducted in our experiment finished within several seconds for the attack graphs used.

8 CONCLUSIONS AND FUTURE WORK

In this work we formalized, implemented, and evaluated a new probabilistic model for measuring the security threats in large enterprise networks. The novelty of our work is the ability to quantitatively analyze the chance of successful attack in the presence of uncertainties about the configuration of a dynamic network and routes of potential attacks.

For future work, we plan to utilize and extend our success measurement model and optimal security placement algorithm to solve more complex network security optimization problems. For instance, an important issue is noise elimination in the initial belief set of values. Solving this problem would lead to more accurate results.

REFERENCES[1] K. Ingols, R. Lippmann, and K. Piwowarski, “Practical attack

graph generation for network defense,” in Computer SecurityApplications Conference, 2006., 12 2006, pp. 121 –130.

[2] S. Jajodia, S. Noel, and B. O’Berry, “Topological analysisof network attack vulnerability,” in Managing Cyber Threats:Issues, Approaches and Challanges, V. Kumar, J. Srivastava, andA. Lazarevic, Eds. Kluwer Academic Publisher, 2003, ch. 5.

[3] S. Jha, O. Sheyner, and J. Wing, “Two formal analyses ofattack graphs,” in Computer Security Foundations Workshop,2002. Proceedings. 15th IEEE, 2002, pp. 49–63.

[4] S. Fenz, “An ontology- and Bayesian-based approach fordetermining threat probabilities,” in Proceedings of the 6th

ACM Symposium on Information, Computer and CommunicationsSecurity, ser. ASIACCS ’11. New York, NY, USA: ACM, 2011,pp. 344–354.

[5] M. Frigault, L. Wang, A. Singhal, and S. Jajodia,“Measuring network security using dynamic Bayesiannetwork,” in Proceedings of the 4th ACM Workshopon Quality of Protection, ser. QoP ’08. New York,NY, USA: ACM, 2008, pp. 23–30. [Online]. Available:http://doi.acm.org/10.1145/1456362.1456368

[6] N. Poolsappasit, R. Dewri, and I. Ray, “Dynamic security riskmanagement using Bayesian attack graphs,” IEEE Transactionson Dependable and Secure Computing, vol. 9, no. 1, pp. 61–74,Jan 2012.

[7] P. Xie, J. H. Li, X. Ou, P. Liu, and R. Levy, “Using Bayesiannetworks for cyber security analysis,” in The 40th AnnualIEEE/IFIP International Conference on Dependable Systems andNetworks (DSN), 2010.

[8] S. Noel, S. Jajodia, L. Wang, and A. Singhal, “Measuringsecurity risk of networks using attack graphs,” InternationalJournal of Next-Generation Computing, vol. 1, no. 1, July 2010.

[9] L. Wang, T. Islam, T. Long, A. Singhal, and S. Jajodia, “An at-tack graph-based probabilistic security metric,” in Proceedingsof the 22nd annual IFIP WG 11.3 Working Conference on Dataand Applications Security. Berlin, Heidelberg: Springer-Verlag,2008, pp. 283–296.

[10] V. Mehta, C. Bartzis, H. Zhu, E. Clarke, and J. Wing, “RankingAttack Graphs,” in Recent Advances in Intrusion Detection.Springer Berlin, 2006, vol. 4219, pp. 127–144.

[11] R. E. Sawilla and X. Ou, “Identifying Critical Attack Assetsin Dependency Attack Graphs,” in Proceedings of the 13thEuropean Symposium on Research in Computer Security: ComputerSecurity, ser. ESORICS ’08. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 18–34.

[12] M. Albanese, S. Jajodia, and S. Noel, “Time-efficient and cost-effective network hardening using attack graphs,” in Depend-able Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIPInternational Conference on, june 2012, pp. 1–12.

[13] R. Dewri, N. Poolsappasit, I. Ray, and D. Whitley, “Optimalsecurity hardening using multi-objective optimization on at-tack tree models of networks,” in Proceedings of the 14th ACMconference on Computer and Communications Security, ser. CCS’07. New York, NY, USA: ACM, 2007, pp. 204–213.

[14] S. Noel and S. Jajodia, “Optimal IDS sensor placement andalert prioritization using attack graphs,” Journal of Networkand Systems Management, vol. 16, no. 3, pp. 259–275, Sep.2008. [Online]. Available: http://dx.doi.org/10.1007/s10922-008-9109-x

[15] C. Phillips and L. P. Swiler, “A Graph-based System forNetwork-vulnerability Analysis,” in Proceedings of the 1998Workshop on New Security Paradigms, ser. NSPW ’98. NewYork, NY, USA: ACM, 1998, pp. 71–79. [Online]. Available:http://doi.acm.org/10.1145/310889.310919

[16] O. Sheyner, J. Haines, S. Jha, R. Lippmann, andJ. M. Wing, “Automated Generation and Analysisof Attack Graphs,” in Proceedings of the 2002 IEEESymposium on Security and Privacy. Washington, DC,USA: IEEE Computer Society, 2002. [Online]. Available:http://portal.acm.org/citation.cfm?id=829514.830526

[17] M. Albanese, S. Jajodia, A. Pugliese, and V. Subrahmanian,“Scalable analysis of attack scenarios,” in Computer Secu-rity ESORICS 2011, ser. Lecture Notes in Computer Science,V. Atluri and C. Diaz, Eds., vol. 6879. Springer BerlinHeidelberg, 2011, pp. 416–433.

[18] L. Page, S. Brin, R. Motwani, and T. Winograd, “The Pagerankcitation ranking: Bringing order to the web,” Stanford DigitalLibrary Technologies Project, Tech. Rep., September 1999.

[19] Z. Zhang, F. Naıt-Abdesselam, X. Lin, and P.-H. Ho, “A Model-based Semi-quantitative Approach for Evaluating Security ofEnterprise Networks,” in Proceedings of the 2008 ACM Sympo-sium on Applied Computing, ser. SAC ’08. New York, NY, USA:ACM, 2008, pp. 1069–1074.

16

[20] J. Pamula, S. Jajodia, P. Ammann, and V. Swarup, “Aweakest-adversary security metric for network configurationsecurity analysis,” in Proceedings of the 2nd ACM Workshopon Quality of Protection, ser. QoP ’06. New York,NY, USA: ACM, 2006, pp. 31–38. [Online]. Available:http://doi.acm.org/10.1145/1179494.1179502

[21] L. Wang, A. Singhal, and S. Jajodia, “Measuring theoverall security of network configurations using attackgraphs,” in Proceedings of the 21st annual IFIP WG 11.3working conference on Data and applications security. Berlin,Heidelberg: Springer-Verlag, 2007, pp. 98–112. [Online].Available: http://dl.acm.org/citation.cfm?id=1770560.1770573

[22] S. Bhatt, W. Horne, and P. Rao, “On computing enterprise ITrisk metrics,” in Future Challenges in Security and Privacy forAcademia and Industry, ser. IFIP Advances in Information andCommunication Technology. Springer Boston, 2011, vol. 354,pp. 271–280.

[23] L. Wang, S. Noel, and S. Jajodia, “Minimum-cost networkhardening using attack graphs,” Computer Communications,vol. 29, no. 18, pp. 3812–3824, Nov. 2006. [Online]. Available:http://dx.doi.org/10.1016/j.comcom.2006.06.018

[24] H. Huang, S. Zhang, X. Ou, A. Prakash, and K. Sakallah,“Distilling critical attack graph surface iteratively throughminimum-cost SAT solving,” in Proceedings of the 27th

Annual Computer Security Applications Conference. New York,NY, USA: ACM, 2011, pp. 31–40. [Online]. Available:http://doi.acm.org/10.1145/2076732.2076738

[25] G. Lee, I.-s. Ko, and T.-h. Kim, “A Vulnerability AssessmentTool Based on OVAL in System Block Model,” in Proceedingsof the 2006 International Conference on Intelligent Computing -Volume Part I, ser. ICIC’06. Berlin, Heidelberg: Springer-Verlag, 2006, pp. 1115–1120.

[26] X. Ou, S. Govindavajhala, and A. W. Appel, “Mulval: A logic-based network security analyzer,” in Proceedings of the 14thConference on USENIX Security Symposium. Berkeley, CA, USA:USENIX Association, 2005, pp. 113–128. [Online]. Available:http://portal.acm.org/citation.cfm?id=1251398.1251406

[27] D. A. Spielman and S.-H. Teng, “Smoothed analysisof algorithms: Why the simplex algorithm usuallytakes polynomial time,” Journal of the ACM (JACM),vol. 51, pp. 385–463, May 2004. [Online]. Available:http://doi.acm.org/10.1145/990308.990310

[28] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, NonlinearProgramming. John Wiley and Sons, Inc., 2005.

[29] F. Palacios-Gomez, L. Lasdon, and M. Engquist, “Nonlinearoptimization by successive linear programming,” ManagementScience, vol. 28, no. 10, pp. 1106–1120, 1982. [Online].Available: http://www.jstor.org/stable/2630940

[30] X. Ou, W. F. Boyer, and M. A. McQueen, “A scalable approachto attack graph generation,” in Proceedings of the 13th ACMConference on Computer and Communications Security, ser. CCS’06. New York, NY, USA: ACM, 2006, pp. 336–345. [Online].Available: http://doi.acm.org/10.1145/1180405.1180446

[31] L. Wang, S. Jajodia, A. Singhal, P. Cheng, and S. Noel, “k-zeroday safety: A network security metric for measuring therisk of unknown vulnerabilities,” IEEE Trans. Dependable Sec.Comput., vol. 11, no. 1, pp. 30–44, 2014. [Online]. Available:http://doi.ieeecomputersociety.org/10.1109/TDSC.2013.24

[32] D. Balzarotti, M. Monga, and S. Sicari, “Assessing the riskof using vulnerable components,” in Quality of Protection, ser.Advances in Information Security, D. Gollmann, F. Massacci,and A. Yautsiukhin, Eds. Springer US, 2006, vol. 23, pp. 65–77.

[33] M. McQueen, T. McQueen, W. Boyer, and M. Chaffin, “Em-pirical estimates and observations of 0-day vulnerabilities,”in System Sciences, 2009. HICSS ’09. 42nd Hawaii InternationalConference on, Jan 2009, pp. 1–12.

[34] R. Ceron, “The GNU linear programming kit,part 1: Introduction to linear optimization,” 2006,http://www.ibm.com/developerworks/linux/library/l-glpk1/.

Hussain M. J. Almohri received his B.S. from Kuwait University and M.S. from Kansas State University. In 2013 he received his Ph.D. from Virginia Tech and started as an assistant professor of computer science at Kuwait University. His research focuses on mobile system security and quantitative analysis of security. He has served as a reviewer for several IEEE journals. He is a member of ACM, IEEE, and USENIX.

Layne T. Watson received the B.A. degree (magna cum laude) in psychology and mathematics from the University of Evansville, Indiana, in 1969, and the Ph.D. degree in mathematics from the University of Michigan, Ann Arbor, in 1974. He is a professor of computer science, mathematics, and aerospace and ocean engineering at Virginia Polytechnic Institute and State University. His professional service includes stints as associate editor of ORSA Journal on Computing, SIAM Journal on Optimization, Computational Optimization and Applications, Evolutionary Optimization, Engineering Computations, and the International Journal of High Performance Computing Applications, and he is senior editor of Applied Mathematics and Computation. He has published well over 300 refereed journal articles and 200 refereed conference papers. He is a fellow of the IEEE, the National Institute of Aerospace, and the International Society of Intelligent Biological Medicine.

Danfeng (Daphne) Yao is an associate professor and L-3 Faculty Fellow in the Department of Computer Science at Virginia Tech, Blacksburg. She received her Computer Science Ph.D. degree from Brown University in 2007. She received the NSF CAREER Award in 2010 for her work on human-behavior driven malware detection, and most recently the ARO Young Investigator Award in 2014 for her work on semantic reasoning for mission-oriented security. She received the Outstanding New Assistant Professor Award from Virginia Tech College of Engineering in 2012. Dr. Yao has several Best Paper Awards (ICICS '06, CollaborateCom '09, and ICNP '12). She was given the Award for Technological Innovation from Brown University in 2006. She held a U.S. patent for her anomaly detection technologies. Dr. Yao is an associate editor of IEEE Transactions on Dependable and Secure Computing (TDSC) and serves as a PC member for numerous computer security conferences, including ACM CCS.

Xinming "Simon" Ou is an associate professor of computer science and the Peggy and Gary Edwards Chair in Engineering at Kansas State University. His research interests include designing better technologies to facilitate cyberdefense. Ou received a PhD in computer science from Princeton University. He is a 2010 National Science Foundation Faculty Early Career Development award recipient and a three-time winner of the HP Labs Innovation Research Program award. Contact him at [email protected].
