Hu et al. EURASIP Journal on Information Security 2014, 2014:17
http://jis.eurasipjournals.com/content/2014/1/17
RESEARCH Open Access
MUSE: asset risk scoring in enterprise network with mutually reinforced reputation propagation
Xin Hu1*, Ting Wang1, Marc Ph Stoecklin2, Douglas L Schales1, Jiyong Jang1 and Reiner Sailer1
Abstract
Cyber security attacks are becoming ever more frequent and sophisticated. Enterprises often deploy several security protection mechanisms, such as anti-virus software, intrusion detection/prevention systems, and firewalls, to protect their critical assets against emerging threats. Unfortunately, these protection systems are typically ‘noisy’, e.g., regularly generating thousands of alerts every day. Plagued by false positives and irrelevant events, it is often neither practical nor cost-effective to analyze and respond to every single alert. The main challenges faced by enterprises are to extract important information from the plethora of alerts and to infer potential risks to their critical assets. A better understanding of risks will facilitate effective resource allocation and prioritization of further investigation. In this paper, we present MUSE, a system that analyzes a large number of alerts and derives risk scores by correlating diverse entities in an enterprise network. Instead of considering a risk as an isolated and static property pertaining only to individual users or devices, MUSE exploits a novel mutual reinforcement principle and models the dynamics of risk based on the interdependent relationships among multiple entities. We apply MUSE to real-world network traces and alerts from a large enterprise network consisting of more than 10,000 nodes and 100,000 edges. To scale up to such large graphical models, we formulate the algorithm using a distributed memory abstraction model that allows efficient in-memory parallel computations on large clusters. We implement MUSE on Apache Spark and demonstrate its efficacy in risk assessment and flexibility in incorporating a wide variety of datasets.
Keywords: Risk scoring; Enterprise network; Reputation
propagation
Introduction
Mitigating and defending against ever more frequent and sophisticated cyber attacks are often top priorities for enterprises. To this end, a plethora of detection and prevention solutions have been developed and deployed, including anti-virus software, intrusion detection/prevention systems (IDS/IPS), blacklists, firewalls, and so on. With these state-of-the-art technologies capturing various types of security threats, one would expect that they are very effective in detecting and preventing attacks. In reality, however, the effectiveness of these systems often falls short. The increasingly diversified types of cyber attacks, coupled with the increasing
*Correspondence: [email protected]
1 Research Department, IBM T.J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY, USA
Full list of author information is available at the end of the article
collection of applications, hardware configurations, and network equipment, have made the enterprise environment extremely ‘noisy’. For example, IDS/IPS systems regularly generate over 10,000 alerts every day, the majority of which turn out to be false positives. Even true alerts are often triggered by low-level threats such as brute-force password guessing and SQL injection attempts. Although the suspicious nature of these events warrants the reports by IPS/IDS systems, their excessive number often only makes the situation even noisier.
Digging into the haystack of alerts to find clues to actual threats is a daunting task that is very expensive, if not impossible, through manual inspection. As a result, most enterprises practically only have resources to investigate a very small fraction of the alerts raised by IPS/IDS systems. The vast majority of the others are stored in a database merely for forensic purposes and inspected only after significant incidents have been discovered or critical assets have
© 2014 Hu et al.; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
already been severely damaged, e.g., by security breaches and data leakage. However, even the small fraction of true positive alerts (e.g., device compromise, virus infection on a user's computer) is often too voluminous to become security analysts' top priority. In addition, these alerts are often considered low-level threats and pose far less risk to the enterprise compared with more severe attacks, such as server compromise or sensitive data leakage. Evidently, a more effective solution is to better understand and rank the potential risks associated with these alerts so that analysts can effectively prioritize and allocate resources for alert investigation.
In a typical enterprise environment, as shown in Figure 1, there are different sets of entities: servers, devices, users, credentials, and (high-value) assets (e.g., databases, business processes). The connections between these entities represent their intuitive relationships; for example, a user may own multiple devices, a device may connect to different types of servers and internal databases, a device may be associated with multiple user accounts with different credentials, etc. We note that the reputation of these entities provides valuable indicators of their corresponding risks and is an important factor in ranking the various security incidents associated with these entities, e.g., IPS/IDS alerts and behavior anomalies. More importantly, an entity's reputation and the risk it may produce are not restricted to each individual entity. In fact, multiple entities are often tied together in a mutually reinforcing relationship, with their reputations closely interdependent on each other.
In this work, we develop MUSE (Mutually-reinforced Unsupervised Scoring for Enterprise risk), a risk analysis framework that analyzes a large amount of security alerts
Figure 1 Entities in a typical enterprise network.
and computes the reputation of diverse entities based on various domain knowledge and the interactions among these entities. Specifically, MUSE models the interactions with composite, multi-level bipartite graphs where each pair of entity types (e.g., a user and a device) constitutes one bipartite graph. MUSE then applies an iterative propagation algorithm on the graphical model to exploit the mutual reinforcement relationship between the connected entities and derive their reputations and risk scores simultaneously. Finally, with the refined risk scores, MUSE is able to provide useful information such as a ranking of low-reputation entities and potential risks to critical assets, allowing security analysts to make an informed decision as to how resources can be prioritized for further investigation. MUSE also provides greater visibility into the set of alerts that are responsible for an entity's low reputation, offering insights into the root cause of cyber attacks.
The main contributions of this work include: 1) a mutual reinforcement framework to analyze the reputation and the risk of diverse entities in an enterprise network, 2) a scalable propagation algorithm that exploits the networking structures and identifies potentially risky entities that may be overlooked by a discrete risk score, 3) a highly flexible system that can incorporate data sources in multiple domains, 4) an implementation of MUSE that takes advantage of recent advances in distributed in-memory cluster computing frameworks and scales to very large graphical models, and 5) evaluations with real network traces from a large enterprise to verify the efficiency and efficacy of MUSE.
Risk and reputation in a multi-entity environment
In the following sections, we present the design and the architecture of MUSE. We first formulate the problem of risk and reputation assessment in an enterprise network and then discuss specific domain knowledge and intuition that are crucial for solving the problem.
Problem formulation
In a typical enterprise environment, there are multiple sets of connected entities. Specifically, we consider five distinct types of entities as depicted in Figure 1: users, devices, credentials, high-value assets, and external servers. These entities are often related in a pairwise many-to-many fashion. For example, a device can access multiple external servers. A user may own several devices, e.g., laptops and workstations, while one device (e.g., a server cluster) can be used by multiple users.
We model the interconnections between entities as a composite bipartite graph G = (V, E), schematically shown in Figure 2. In G, vertices V = {U, D, C, A, S} represent the various entities. Edges E = {MDS, MDU, MDA, MUC, MDC} of the bipartite graphs represent their
Figure 2 Network of interactions among multiple entities.
relationships, where MDS is the |D|-by-|S| matrix containing all the pairwise edges, i.e., MDS(i, j) > 0 if there is an edge between device di and external server sj. The value of MDS(i, j) denotes the edge weight derived from the characteristics of the relationship, such as the number of connections, the number of bytes transmitted, duration, and so on. Similarly, MDU, MDA, MUC, and MDC are the matrices of the pairwise edges representing the associations of their respective entities.
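These connectivity matrices can be sketched as sparse weighted edge maps, one per entity-type pair. The entity names and weights below are illustrative, not taken from the paper's dataset:

```python
from collections import defaultdict

def build_bipartite(weighted_edges):
    """Build one bipartite connectivity matrix M as a sparse map:
    M[(i, j)] > 0 iff there is an edge between entities i and j, and the
    value is the edge weight (e.g., number of connections or bytes)."""
    M = defaultdict(float)
    for i, j, w in weighted_edges:
        M[(i, j)] += w  # accumulate repeated observations of the same pair
    return M

# One matrix per entity-type pair, mirroring E = {MDS, MDU, MDA, MUC, MDC}
MDS = build_bipartite([("d1", "s1", 3.0), ("d1", "s2", 1.0), ("d2", "s1", 5.0)])
MDU = build_bipartite([("d1", "u1", 1.0), ("d2", "u1", 1.0), ("d2", "u2", 1.0)])
```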
Next, we define the risk and the reputation of the different entities more precisely. We treat each entity as a random variable X with a binary class label X = {xR, xNR} assigned to it. Here, xR is a risky (or bad) label, and xNR is a non-risky (or good) label. A probability distribution P is defined over this binary class, where P(xR) is the probability of being risky and P(xNR) is the probability of being non-risky. By definition, the sum of P(xR) and P(xNR) is 1. We use this probabilistic definition because it encompasses a natural mapping between P(xNR) and the general concept of reputation, i.e., an entity with a high probability of being good (or non-risky) is expected to have a high reputation. In addition, it allows different types of entities to incorporate specific domain knowledge into the reputation computation according to their respective characteristics, e.g.,
• External server reputation ps = Ps(xNR) indicates the server's probability of being malicious and infecting the clients connecting to it. Notice that a low reputation ps means a high probability of being malicious.
• Device reputation pd = Pd(xNR) represents the probability that a device may have been infected or compromised and is thus under the control of adversaries.
• User reputation pu = Pu(xNR) indicates how suspiciously a user behaves, e.g., an unauthorized access to sensitive data.
• Credential reputation pc = Pc(xNR) denotes the probability that a credential may have been leaked to the adversaries, making any servers associated with the credential vulnerable.
• High-value asset reputation pa = Pa(xNR) denotes the asset's probability of being risky; for instance, a confidential database being accessed by unauthorized users, exfiltration of sensitive data, etc.
As there is a natural correlation between reputation and risk (e.g., less reputable entities generally pose higher risks), we define an entity's risk as P(xR) = 1 − P(xNR) weighted by the importance of the entity, such that a high-value asset will experience a large increase in its risk score even with a small decline in its reputation. With these definitions, the goal of MUSE is to aggregate large amounts of security alerts, determine the reputation of each entity by exploiting their structural relationships in the connectivity graph, and finally output a ranked list of risky entities for further investigation. In the next section, we describe the mutual reinforcement principle [1] that underlies MUSE.
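The importance weighting above can be written as a one-line helper; the numeric importance scale is an assumption for illustration, not something the paper fixes:

```python
def risk_score(p_non_risky, importance):
    """Entity risk: P(xR) = 1 - P(xNR), weighted by the entity's importance,
    so a high-value asset gains risk quickly even for a small reputation drop."""
    return importance * (1.0 - p_non_risky)

# A small reputation decline on a high-value asset (importance 10) can outweigh
# a much larger decline on a commodity device (importance 1).
asset_risk = risk_score(0.95, 10.0)   # reputation dropped only to 0.95
device_risk = risk_score(0.70, 1.0)   # reputation dropped to 0.70
```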
Mutual reinforcement principle
The key observation of MUSE is that entities' reputations and risks are not separate; instead, they are closely correlated and interdependent. Through interacting with each other, an entity's reputation can impact the risk associated with its neighbors, and at the same time, the entity's risk can be influenced by the reputation of its neighbors. For example, a device is likely to be of low reputation 1) if the servers it frequently visits are suspicious or malicious, e.g., botnet C&C, phishing, or malware sites, 2) if the users using the device have bad reputations, and 3) if the credentials used to log into the device have a high risk of being compromised, leaked, or even used by an unauthorized user. Similarly, a credential's risk of being exposed will increase if it has been used by a less reputable user and/or on a device that exhibits suspicious behavior patterns. Along the same line, a user will have a low reputation if she owns several low-reputation devices and credentials. Last but not least, a high-value asset or the sensitive data stored in internal databases are likely to be under significant risk, e.g., data exfiltration, if they have been accessed by multiple low-reputation devices that also connect to external malicious servers. We describe these mutually dependent relationships more formally in our multi-layer mutual reinforcement framework, using the following set
of equations governing the server reputation ps, device reputation pd, user reputation pu, credential reputation pc, and high-value asset reputation pa:

\[
\begin{aligned}
p_d &\propto \omega_{ds}\sum_{d\sim s} m_{ds}\,p_s + \omega_{du}\sum_{d\sim u} m_{du}\,p_u + \omega_{dc}\sum_{d\sim c} m_{dc}\,p_c \\
p_u &\propto \omega_{du}\sum_{d\sim u} m_{du}\,p_d + \omega_{uc}\sum_{u\sim c} m_{uc}\,p_c \\
p_c &\propto \omega_{uc}\sum_{u\sim c} m_{uc}\,p_u + \omega_{dc}\sum_{d\sim c} m_{dc}\,p_d \\
p_a &\propto \omega_{da}\sum_{d\sim a} m_{da}\,p_d + \omega_{ua}\sum_{u\sim a} m_{ua}\,p_u + \omega_{ca}\sum_{c\sim a} m_{ca}\,p_c,
\end{aligned}
\]
where d ∼ s, d ∼ u, etc. represent edges connecting device d with server s, user u, etc., ∝ means ‘proportional to’, ωij indicates the weight associated with each edge and reputation type, and mij is the value in the connectivity matrices. Next, we exploit this mutual reinforcement principle in the bipartite graph network to simultaneously estimate the reputations and the risks using the propagation algorithm described in the next section.
Reputation propagation algorithm
Specifically, we employ the principle of belief propagation (BP) [2] on the large composite bipartite graph G to exploit the link structure and efficiently compute the reputations for all entities. Belief propagation is an iterative message passing algorithm on general graphs and has been widely used to solve many graph inference problems [3], such as social network analysis [1], fraud detection [4], and computer vision [5].
BP is typically used for computing the marginal distribution (the so-called ‘hidden’ distribution) for the nodes in the graph, based on the prior knowledge (or ‘observed’ distribution) about the nodes and their neighbors. In our case, the algorithm infers the probabilistic distribution of an entity's reputation in the graph based on two sources of information: 1) the prior knowledge about the entity itself and 2) information about the neighboring entities and the relationships between them. The inference is accomplished by iteratively passing messages between all pairs of entities ni and nj. Let mi,j denote the ‘message’ sent from i to j. The message represents i's influence on j's reputation. One could view it as if i, with a certain probability of being risky, passes some ‘risk’ to j. Additionally, the prior knowledge about i (e.g., the importance of the assets and a user's anomalous behavior) is expressed by the node potential function φ(i), which plays a role in determining the magnitude of the influence passed from i to j. In detail, edge ei,j is associated with message mi,j (and mj,i if the message passing is bi-directional). The outgoing message from i to neighbor j is updated at each iteration based on the incoming messages from i's other neighbors and the node potential function φ(i) as follows:
\[
m_{i,j}(x_j) \leftarrow \sum_{x_i \in \{x_R, x_{NR}\}} \phi_i(x_i)\,\psi_{ij}(x_i, x_j) \prod_{k \in N(i)\setminus j} m_{k,i}(x_i) \qquad (1)
\]
where N(i) is the set of i's neighbors, and ψi,j is the edge potential, a transformation function defined on the edge between i and j that converts a node's incoming messages into its outgoing messages. The edge potential also controls how much influence can be passed to the receiving nodes, depending on the properties of the connections between i and j (e.g., the number of connections and the volume of traffic). ψ(xi, xj) is typically set according to the transition matrix shown in Table 1, which indicates that a low-reputation entity (e.g., a less reputable user) is more likely to be associated with low-reputation neighbors (e.g., compromised devices). The algorithm runs iteratively and stops when the entire network has converged with some threshold T, i.e., the change of any mi,j is smaller than T, or when a maximum number of iterations has been reached. Convergence is not theoretically guaranteed for general graphs; however, the algorithm often does converge for real-world graphs in practice. At the end of the propagation procedure, each entity's reputation (i.e., marginal probability distribution) is determined by the converged messages mi,j and the node potential function (i.e., prior distribution):
\[
p(x_i) = k\,\phi_i(x_i) \prod_{j \in N(i)} m_{j,i}(x_i), \quad x_i \in \{x_R, x_{NR}\} \qquad (2)
\]

where k is the normalization constant.
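Equations 1 and 2 can be exercised on a toy two-node graph with a small sum-product implementation. This is a sketch, not the paper's system: uniform initial messages, a Table-1-style edge potential with an assumed ε = 0.1, and illustrative node potentials.

```python
STATES = ("xR", "xNR")  # risky / non-risky labels

def _prod(values):
    out = 1.0
    for v in values:
        out *= v
    return out

def belief_propagation(nodes, edges, phi, psi, iters=20, tol=1e-6):
    """Sum-product BP: messages follow Equation 1, beliefs follow Equation 2.
    phi[i][x] is the node potential; psi[xi][xj] the edge potential."""
    nbrs = {i: set() for i in nodes}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    # m[(i, j)][xj]: message from i to j, initialized uniformly
    m = {(i, j): {x: 1.0 for x in STATES} for i in nodes for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in m:
            msg = {xj: sum(phi[i][xi] * psi[xi][xj] *
                           _prod(m[(k, i)][xi] for k in nbrs[i] if k != j)
                           for xi in STATES)
                   for xj in STATES}
            z = sum(msg.values())
            new[(i, j)] = {x: v / z for x, v in msg.items()}
        delta = max(abs(new[e][x] - m[e][x]) for e in m for x in STATES)
        m = new
        if delta < tol:  # converged within threshold T
            break
    beliefs = {}
    for i in nodes:
        # Equation 2: belief = k * phi_i * product of incoming messages
        b = {x: phi[i][x] * _prod(m[(j, i)][x] for j in nbrs[i]) for x in STATES}
        z = sum(b.values())  # 1/k, the normalization constant
        beliefs[i] = {x: v / z for x, v in b.items()}
    return beliefs

eps = 0.1
psi = {"xR": {"xR": 0.5 + eps, "xNR": 0.5 - eps},
       "xNR": {"xR": 0.5 - eps, "xNR": 0.5 + eps}}
phi = {"s": {"xR": 0.95, "xNR": 0.05},   # server flagged high-risk
       "d": {"xR": 0.50, "xNR": 0.50}}   # device with no prior alerts
beliefs = belief_propagation(["s", "d"], [("s", "d")], phi, psi)
```

The flagged server pulls the previously neutral device toward the risky label, exactly the neighborhood effect the propagation is meant to capture.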
Incorporating domain knowledge
One of the major challenges in adopting a BP algorithm is to properly determine its parameters, particularly the node potential and the edge potential functions. In this section, we briefly discuss how we leverage the available data sources in a typical enterprise network and incorporate specific domain knowledge (unique to each entity type) to infer the parameters.
Characteristics of external servers
We develop an intrusion detection system that leverages several external blacklists to inspect all the HTTP traffic. It flags different types of suspicious web servers to which internal devices try to connect; this allows us to
Table 1 Edge potential function

ψ(xi, xj)    xi = xNR    xi = xR
xj = xNR     0.5 + ε     0.5 − ε
xj = xR      0.5 − ε     0.5 + ε
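Table 1 can be read as a simple homophily potential parameterized by ε; the paper does not state the ε it uses, and 0 < ε < 0.5 is the natural range:

```python
def edge_potential(eps=0.1):
    """psi(xi, xj) from Table 1: matching labels score 0.5 + eps, mismatched
    labels 0.5 - eps, so influence flows toward label agreement."""
    return {("xNR", "xNR"): 0.5 + eps, ("xNR", "xR"): 0.5 - eps,
            ("xR", "xNR"): 0.5 - eps, ("xR", "xR"): 0.5 + eps}
```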
assign the node potential function according to the maliciousness of the external servers. Specifically, we classify suspicious servers into the following five types:
• Spam websites: servers that are flagged by external spam blacklists like Spamhaus, SpamCop, etc.
• Malware websites: servers that host malicious software including viruses, spyware, ransomware, and other unwanted programs that may infect the client machine.
• Phishing websites: servers that purport to be popular sites such as bank sites, social networks, online payment, or IT administration sites in order to lure unsuspecting users into disclosing their sensitive information, e.g., user names, passwords, and credit card details. Recently, attackers have started to employ more targeted spear phishing attacks which use specific information about the target to increase the probability of success. Because of its potential to cause severe damage, we assign a high risky value to its node potential.
• Exploit websites: servers that host exploit toolkits, such as Blackhole and Flashpack, which are designed to exploit vulnerabilities of the victims' web browsers and install malware on the victims' machines.
• Botnet C&C servers: servers contacted by bot programs to receive command instructions, update bot programs, or extrude confidential information. If an internal device makes an attempt to connect to any known botnet C&C server, the device is likely to be compromised. In addition to blacklists (e.g., Zeus Tracker), we also design models to detect fast-fluxing and domain-name-generation botnets based on their distinct DNS request patterns.
Using this categorization of suspicious servers, we determine initial node potential values according to the severity of their categories. We assign (φ(xR), φ(xNR)) = (0.95, 0.05) for the high-risk types, such as botnet and exploit servers. For the medium-risk (phishing and malware) and low-risk (spam) types, we assign (0.75, 0.25) and (0.6, 0.4), respectively.
Characteristics of internal entities D, U, C, A
For internal entities, e.g., devices, users, credentials, and assets, rich information can be obtained from the internal asset management systems and IPS/IDS systems. Available information includes a device's status (e.g., OS version, patch level), device behavior anomalies (e.g., scanning), suspicious user activities (e.g., illegal accesses to sensitive data, multiple failed login attempts), and credential anomalies (e.g., unauthorized accesses). For instance, from the IPS system deployed in our enterprise network, we are able to collect over 500 different alert types, most of which are various attack vectors such as SYN port scans, remote memory corruption attempts, brute-force logons, XSS, SQL injection, etc. Based on this information, we adjust the node potentials for the internal entities by assigning a severity score (1 to 3 for low-, medium-, and high-risk alerts) to each type of suspicious activity exemplified above and summing up the severities of all suspicious activities associated with an entity i to get the total severity Si. Since an entity i may be flagged multiple times for the same or different types of suspicious behaviors, to avoid being overshadowed by a few outliers, we transform the aggregated severity score using the sigmoid function
\[
P_i = \frac{1}{1 + \exp(-S_i)}
\]

The node potential for i is then calculated as (φ(xR), φ(xNR)) = (Pi, 1 − Pi). The key benefit of using a sigmoid function is that if no alerts have been reported for an entity (i.e., Si = 0), its initial node potential will automatically be set to (0.5, 0.5), i.e., equal probability of being risky and non-risky, implying that no prior information exists for the particular entity.
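The severity aggregation and sigmoid squashing can be sketched directly. The 1-to-3 severity scale is from the text; which alert types count as low, medium, or high, and their names, are assumptions for illustration:

```python
import math

# Assumed severity mapping for a few example alert types (scale from the text).
SEVERITY = {"port_scan": 1, "bruteforce_logon": 2, "memory_corruption": 3}

def node_potential(alert_types):
    """Sum the severities S_i of an entity's alerts, squash with a sigmoid so a
    few outliers cannot dominate, and return (phi(xR), phi(xNR)) = (Pi, 1 - Pi)."""
    s_i = sum(SEVERITY[a] for a in alert_types)
    p_i = 1.0 / (1.0 + math.exp(-s_i))
    return (p_i, 1.0 - p_i)
```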
Although the parameters in MUSE require some level of manual tuning by domain experts, it is valuable to security analysts in several aspects. First, the output of MUSE is a ranking of high-risk entities, for which the absolute risk values are less important. As long as the parameters are assigned based on a reasonable estimation of the severities of the different types of alerts, MUSE will be able to derive a ranking of entities based on their potential risks, thus providing useful information to help analysts prioritize their investigation. Second, MUSE offers the flexibility to incorporate diverse types of entities and thus can be easily adapted to a wide variety of other domains. Finally, it is possible to automatically learn the appropriate parameter values through machine learning techniques, provided that properly labeled training sets are available. We leave this as our plan for future exploration.
Scale up propagation algorithm to big data
Another major challenge of applying the BP algorithm in a large enterprise network is scalability. Even though BP itself is a computationally efficient algorithm (its running time scales quadratically with the number of edges in the graph), for large enterprises with hundreds of thousands of nodes and edges, the computation cost can become significant. To make MUSE practical for large-scale graphical models, we observe that the main computation in belief propagation is localized, i.e., message passing is performed only between a specific node and its neighbors. This means that the computation can be efficiently parallelized and distributed to a cluster of machines.
One of the most prominent parallel programming paradigms is MapReduce, popularized by its open-source
implementation, Apache Hadoop [6]. The MapReduce framework consists of the map and the reduce stages that are chained together to perform complex operations in a distributed and fault-tolerant fashion. However, MapReduce is notoriously inefficient for iterative algorithms where the intermediate results are reused across multiple rounds of computation. Due to the lack of an abstraction for leveraging distributed memory, the only way to reuse data across two MapReduce jobs is to persist them to an external storage system, e.g., HDFS, and load them back via another map job. This incurs substantial overheads due to disk I/O and data synchronization, which can dominate the execution times. Unfortunately, the BP algorithm underlying MUSE is a typical example of iterative computation where the same set of operations, i.e., the message update in Equation 1, is repeatedly applied to multiple data items. As a result, instead of MapReduce, we leverage a new cluster computing abstraction called Resilient Distributed Datasets (RDDs) [7] that achieves orders of magnitude performance improvements for iterative algorithms over existing parallel computing frameworks^a.
RDDs are parallel data structures that are created through deterministic operations on data in stable storage or through transformations from other RDDs. Typical transformations include map, filter, join, reduce, etc. The major benefit of RDDs is that they allow users to explicitly specify which intermediate results (in the form of RDDs) they want to reuse in future operations. Keeping those persistent RDDs in memory eliminates unnecessary and expensive disk I/O and data replication across iterations, thus making them ideal for iterative algorithms. To abstract the BP algorithm into RDDs, our key observation is that the message update process of Equation 1 can be more efficiently represented by RDDs on a line graph induced from the original graph, which represents the adjacencies between the edges of the original graph. Formally, given a directed graph G = (V, E), a directed line graph or derived graph L(G) is a graph such that:
• each vertex of L(G) represents an edge of G. We use the following notation to denote vertices and edges in G and L(G): letting i, j ∈ V denote two vertices in the original graph G, we use (i, j) to represent the edge in G and the corresponding vertex in L(G);
• two vertices of L(G) are adjacent if and only if their corresponding edges share a common endpoint (‘are incident’) in G and they form a length-two directed path. In other words, for two vertices (i, j) and (m, n) in L(G), there is an edge from (i, j) to (m, n) if and only if j = m in the original G.
Figure 3 shows the conversion of an original graph G to its directed line graph. Since an edge (i, j) ∈ E in the original graph G corresponds to a node in L(G), the message passing process in Equation 1 is essentially an iterative updating process of a node in L(G) based on all of this node's adjacent nodes. In each iteration, each node in L(G) sends a message (or influence) mi,j to all of its neighbors and, at the same time, updates its own message based on the messages it received from its neighbors. This can be easily described with RDDs as follows:
Algorithm 1 Message passing algorithm using RDDs

# Load line graph L(G) as an RDD of (srcNode, dstNode) pairs
links = RDD.textFile(graphFile).map(split).persist()

# Load initial node potentials in the original graph G as (node, phi_i) pairs
potentials = RDD.textFile(potentialFile).map(split).persist()

# Initialize RDD of messages as (node, m_ij) pairs
messages = ...

for iteration in xrange(ITERATIONS):
    # Build an RDD of (dstNode, m_src) pairs with the messages
    # sent by all nodes to dstNode
    msgContrib = links.join(messages).map(
        lambda (srcNode, (dstNode, m_src)): (dstNode, m_src))

    # Multiplication of all incoming messages by dstNode
    aggContrib = msgContrib.reduceByKey(lambda m1, m2: m1 * m2)

    # Get new updated messages for the next iteration
    messages = potentials.join(aggContrib).map(
        lambda (dstNode, (phi_i, m_agg)): (dstNode, phi_i * psi_ij * m_agg))

# After the iterations, compute the final beliefs according to Eq. 2
belief = potentials.join(messages).mapValues(
    lambda (phi_i, m_agg): k * phi_i * m_agg)

# and save to external storage
belief.saveAsTextFile("Beliefs.txt")
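Since the listing above is Spark pseudocode, the same dataflow can be checked locally by modeling each RDD as a list of key-value pairs. The join and reduce_by_key helpers below mimic the Spark operations the loop relies on; the two-node cyclic line graph and the single scalar standing in for the full psi and message tables are illustrative simplifications:

```python
from collections import defaultdict

def join(left, right):
    """Inner join of two (key, value) pair lists, like RDD.join."""
    index = defaultdict(list)
    for k, v in right:
        index[k].append(v)
    return [(k, (lv, rv)) for k, lv in left for rv in index[k]]

def reduce_by_key(pairs, fn):
    """Combine values sharing a key, like RDD.reduceByKey."""
    out = {}
    for k, v in pairs:
        out[k] = fn(out[k], v) if k in out else v
    return list(out.items())

PSI = 0.6  # scalar stand-in for the edge potential
links = [("e1", "e2"), ("e2", "e1")]        # L(G): two mutually adjacent edges
potentials = [("e1", 0.9), ("e2", 0.4)]     # phi per line-graph node
messages = [("e1", 1.0), ("e2", 1.0)]       # uniform initial messages

for _ in range(2):  # ITERATIONS = 2
    # (dstNode, m_src): messages each source sends along its links
    contrib = [(dst, m) for (src, (dst, m)) in join(links, messages)]
    # multiply all incoming messages per destination node
    agg = reduce_by_key(contrib, lambda m1, m2: m1 * m2)
    # updated messages for the next iteration
    messages = [(node, phi * PSI * m)
                for (node, (phi, m)) in join(potentials, agg)]
```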
The above algorithm leads to the RDD lineage graph in Figure 4. In each iteration, the algorithm creates a new messages dataset based on the aggregated contributions AggContrib and the messages from the previous iteration, as well as the static links and potentials datasets that are persisted
Figure 3 Converting a directed graph G to a directed line graph.
in memory. Keeping these static datasets and intermediate messages in memory and making them readily available in the subsequent iterations avoids unnecessary I/O overhead and thus significantly speeds up the computation.
Evaluation
Datasets
We evaluated MUSE with datasets collected from multiple data sources in the entire US North IBM network. The data sources included DNS messages from local DNS servers, network flows from edge routers, proxy logs, IPS alerts, and HTTP session headers (for categorization of websites). The size of the raw data per day was about 200 GB, and the average data event rates are summarized in Table 2.
Experiment results
We first evaluate MUSE using data collected over a 1-week period of time. The resulting graph consists of 11,790 nodes and 44,624 edges. The numbers of the different entity types^b are shown in Table 3.
Figure 4 Lineage graph for the BP algorithm.
We applied MUSE to the graph, and our algorithm converged at the 5th iteration. We manually inspected the top-ranked entities (i.e., those with higher P(xR)) in each entity type and were able to confirm that they were all suspicious or malicious entities, including infected devices, suspicious users, etc. Here, we show one example of user reputation among our findings. We selected the five top risky users based on the output of MUSE. Figure 5 shows their risk values at each iteration. Note that all the users started with the neutral score (0.5, 0.5), meaning that these users had not been flagged for anomalous behaviors. However, due to their interactions with low-reputation neighbors, their associated risks increased. Further investigation showed that user332755 owned five devices which made 56 connections to spam websites, four connections to malware websites, and two connections to exploit websites during our monitoring period. user332755 inherited low reputation from his neighbors, including the user's devices, causing his risk to quickly rise to the top.
We also measured the running time of MUSE at each iteration against different sizes of the graph in terms of the number of edges. Experiments were performed on a server blade with a 2.00 GHz Intel(R) Xeon(R) CPU and 500 GB memory (the memory usage of MUSE is less than 1 GB). The experiment results are shown in Figure 6. From the figure, one can see that BP is efficient in handling small-to-medium-sized graphical models. Even for 1 week's worth of traffic data, MUSE is able to finish each iteration in less than 1 min. However, we also notice that the running
Table 2 Average traffic rate for IBM US North network
Data type Data rate
Firewall logs 950 M/day
DNS messages 1,350 M/day
Proxy logs 490 M/day
IPS/IDS events 4 M/day
Overall: 2.5 billion events/day
Table 3 Number of different entities for one week of data

Server: Spam 5,500; Malware 124; Phishing 10; Exploit 26; Botnet 16
Device: 3,823    User: 2,191    Assets: 100
time increases quadratically with the number of edges, which can become a bottleneck for handling large graphs, as we will demonstrate in the next section.
Scalability of MUSE using RDDs
To compare the scalability of the non-parallelized BP algorithm with that of the distributed version using the RDD abstraction, we collected half a month's worth of network traffic to stress test MUSE. The resulting graph consists of 24,706 nodes and 123,380 edges. The numbers of the different entities are listed in Table 4. As a baseline benchmark, we ran the non-distributed version of MUSE against this large dataset for five iterations on the same server blade. The overall experiment took 1,641 s to finish, and each iteration on average required 328 s.
We implement a distributed version of MUSE onApache Spark [8]
which is the open source implemen-tation of RDDs. We deploy Spark
on our blade centerwith three blade servers. We vary the number of
CPUcores available for the Spark framework from 10 to 30 andsubmit
the same workload to it. Figure 7 illustrates thecomparison
results. From the figure, we can notice thatMUSE is able to
leverage RDDs’ in-memory cluster com-puting framework to achieve
10× to 15× speed up. Forinstance, with 30 CPU cores, MUSE is able
to completefive iterations in 104 s with each iteration requiring
lessthan 20 s. Although the algorithm does not scale linearlywith
the number of cores due to fixed I/O and com-munication overhead,
the results demonstrate that with
Figure 5 Risk scores of top five risky users at the end of
eachiteration.
Figure 6 Running time for each iteration with varying sizes
ofgraphs.
moderate hardware configuration, MUSE is scalable andpractical
for large enterprise networks.
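The per-iteration cost is dominated by computing one message per directed edge, which is what makes the algorithm amenable to RDDs: each iteration can be expressed as a map over the edge list followed by a grouping by destination node. A rough single-machine sketch of that decomposition is below, with plain Python standing in for Spark's map and reduceByKey transformations; the node names, edge potential, and the simplified update (multiplying current beliefs rather than proper cavity messages) are illustrative assumptions, not MUSE's implementation:

```python
from collections import defaultdict

# Illustrative homophily edge potential: PSI[src_state][dst_state].
PSI = [[0.7, 0.3], [0.3, 0.7]]

def bp_iteration(edges, beliefs):
    """One simplified belief-propagation iteration over a directed edge list.

    edges   : list of (src, dst) pairs (each undirected edge appears twice)
    beliefs : dict node -> (p_benign, p_risky)
    In the Spark version, the map phase would be a transformation on the
    edge RDD and the grouping a reduceByKey on the destination node.
    """
    # Map phase: compute one message per directed edge.
    messages = [(dst, [sum(beliefs[src][s] * PSI[s][t] for s in range(2))
                       for t in range(2)])
                for src, dst in edges]

    # Reduce phase: combine messages by destination node and renormalize.
    new_beliefs = dict(beliefs)
    incoming = defaultdict(lambda: [1.0, 1.0])
    for dst, msg in messages:
        incoming[dst] = [incoming[dst][t] * msg[t] for t in range(2)]
    for node, prod in incoming.items():
        combined = [beliefs[node][t] * prod[t] for t in range(2)]
        z = sum(combined)
        new_beliefs[node] = tuple(c / z for c in combined)
    return new_beliefs

# Tiny example: a user linked to two risky devices and one benign server.
beliefs = {"user": (0.5, 0.5), "dev1": (0.1, 0.9),
           "dev2": (0.1, 0.9), "srv": (0.9, 0.1)}
edges = [("dev1", "user"), ("dev2", "user"), ("srv", "user"),
         ("user", "dev1"), ("user", "dev2"), ("user", "srv")]
beliefs = bp_iteration(edges, beliefs)
print(beliefs["user"])  # risky component now exceeds 0.5
```

Because the map phase touches each edge independently, the work partitions naturally across cores, which is consistent with the 10× to 15× speedup observed above once fixed I/O and communication overhead are accounted for.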
Related work
With cyber threats rapidly evolving toward large-scale and multi-channel attacks, security has become crucial for organizations of all types and sizes. Many traditional intrusion detection and anomaly detection methods focus on a single entity and apply rule-based approaches [9]; they are often too noisy to be useful in practice [10]. Our work is inspired by prominent research in the social network area that uses link structure to infer knowledge about network properties. Previous work demonstrated that social structure is valuable for finding authoritative nodes [11], inferring individual identities [12], combating web spam [13], and detecting securities fraud [14]. Among various graph mining algorithms, the belief propagation algorithm [2] has been successfully applied in many domains, e.g., detecting fraud [4], accounting irregularities [15], and malicious software [3]. For example, NetProbe [4] applied a BP algorithm to the eBay user graph to identify subgraphs of fraudsters and accomplices.
Table 4 Number of different entities for half a month of data
Server: Spam   Malware   Phishing   Exploit   Botnet     Device   User    Assets
        8,924  171       103        33        22         10,809   4,527   116

Figure 7 Comparison of running time between non-distributed MUSE and the distributed version implemented using RDDs.

Conclusion
In this paper, we proposed MUSE, a framework to systematically quantify and rank risks in an enterprise network. MUSE aggregates alerts generated by traditional IPS/IDS across multiple data sources and leverages the link structure among entities to infer their reputation. The key advantage of MUSE is that it derives the risk of each entity not only from the entity's own characteristics but also from the influence of its neighbors. This allows MUSE to pinpoint a high-risk entity based on its interactions with low-reputation neighbors, even if the entity itself appears benign. By providing risk rankings, MUSE helps security analysts make informed decisions about the allocation of resources and the prioritization of further investigation, so that proper defense mechanisms can be developed at an early stage. We have implemented and tested MUSE on real-world traces collected from a large enterprise network, demonstrating the efficacy and scalability of MUSE.
Endnotes
a A 10×-100× speedup as compared to Hadoop [8].
b Due to privacy issues, we were not able to include authentication logs to incorporate user credentials in our experiments.
Competing interests
The authors declare that they have no competing interests.
Acknowledgements
This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defense and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defense, or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation hereon.
Author details
1 Security Research Department, IBM T.J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY, USA. 2 IBM Zurich Research Lab, Saumerstrasse 4, 8803 Ruschlikon, Switzerland.
Received: 11 August 2014 Accepted: 28 October 2014
References
1. J Bian, Y Liu, D Zhou, E Agichtein, H Zha, in Proceedings of WWW '09. Learning to recognize reliable users and content in social media with coupled mutual reinforcement (ACM, New York, 2009)
2. JS Yedidia, WT Freeman, Y Weiss, in Exploring Artificial Intelligence in the New Millennium, Volume 8. Understanding belief propagation and its generalizations (Morgan Kaufmann Publishers Inc., San Francisco, 2003), pp. 236–239
3. DH Chau, C Nachenberg, J Wilhelm, A Wright, C Faloutsos, in SIAM International Conference on Data Mining. Polonium: tera-scale graph mining and inference for malware detection (2011)
4. S Pandit, DH Chau, S Wang, C Faloutsos, in International Conference on World Wide Web. NetProbe: a fast and scalable system for fraud detection in online auction networks (ACM, New York, 2007)
5. PF Felzenszwalb, DP Huttenlocher, Efficient belief propagation for early vision. Int. J. Comput. Vis. 70, 41–54 (2006)
6. AS Foundation, Apache Hadoop. http://hadoop.apache.org/
7. M Zaharia, M Chowdhury, T Das, A Dave, J Ma, M McCauley, MJ Franklin, S Shenker, I Stoica, in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing (USENIX Association, San Jose, 2012)
8. AS Foundation, Apache Spark: lightning-fast cluster computing. https://spark.apache.org/
9. P Garcia-Teodoro, J Diaz-Verdejo, G Maciá-Fernández, E Vázquez, Anomaly-based network intrusion detection: techniques, systems and challenges. Comput. Secur. 28, 18–28 (2009)
10. J Viega, Myths of Security (O'Reilly Media, Inc., Sebastopol, 2009)
11. JM Kleinberg, Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
12. S Hill, F Provost, The myth of the double-blind review? Author identification using only citations. ACM SIGKDD Explorations Newsl. 5(2), 179–184 (2003)
13. Z Gyöngyi, H Garcia-Molina, J Pedersen, in International Conference on Very Large Data Bases. Combating web spam with TrustRank (VLDB Endowment, Toronto, 2004)
14. J Neville, O Simsek, D Jensen, J Komoroske, K Palmer, H Goldberg, in ACM Conference on Knowledge Discovery and Data Mining. Using relational knowledge discovery to prevent securities fraud (ACM, New York, 2005)
15. M McGlohon, S Bay, MG Anderle, DM Steier, C Faloutsos, in ACM Conference on Knowledge Discovery and Data Mining. SNARE: a link analytic system for graph labeling and risk detection (ACM, New York, 2009)
doi:10.1186/s13635-014-0017-1
Cite this article as: Hu et al.: MUSE: asset risk scoring in enterprise network with mutually reinforced reputation propagation. EURASIP Journal on Information Security 2014, 2014:17.