Pallabi Parveen, Nate McDaniel, Varun S. Hariharan, Bhavani Thuraisingham and Latifur Khan Department of Computer Science at The University of Texas at Dallas

Jan 15, 2016

Transcript
Page 1:

Pallabi Parveen, Nate McDaniel, Varun S. Hariharan, Bhavani Thuraisingham and Latifur Khan
Department of Computer Science at The University of Texas at Dallas

Page 2:

Insider Threat
LZW & Quantized Dictionary
Concept Drift
Experiments & Results

Page 3:

An insider is someone who exploits, or has the intention to exploit, his/her legitimate access to assets for unauthorised purposes.

For example, over time, legitimate users may enter commands that read or write private data, or install malicious software.

Page 4:

Computer Crime and Security Survey 2001

$377 million financial losses due to attacks

49% reported incidents of unauthorized network access by insiders

WikiLeaks Breach Highlights Insider Security Threat: Even the toughest security systems sometimes have a soft center that can be exploited by someone who has passed rigorous screening

http://www.scientificamerican.com/article.cfm?id=wikileaks-insider-threat

Page 5:

Reduce false alarm rate without sacrificing threat detection rate

Threat detection is challenging since insiders mask and adapt their behavior to resemble legitimate system use.

Page 6:

Normal users have a repetitive sequence of commands, system calls, etc.

A sudden deviation from normal behavior raises an alarm indicating an insider threat.

To find an insider threat, we need to collect these repeated sequences of commands in an unsupervised fashion.
First challenge: variability in sequence length
Overcome: generate an LZW dictionary with combinations of potential patterns in the gathered data using the Lempel-Ziv-Welch (LZW) algorithm
Second challenge: huge size of the dictionary
Overcome: compress the dictionary
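
The dictionary-building step can be illustrated with a minimal sketch. It assumes the command stream has already been encoded as a string with one character per command. The function name `lzw_dictionary` is hypothetical, and the code only collects the phrases that a textbook LZW pass would add to its table; it is not the authors' implementation.

```python
def lzw_dictionary(stream: str) -> list[str]:
    """Return the multi-symbol phrases a textbook LZW pass adds to its table."""
    table = set(stream)        # LZW starts with every single symbol known
    patterns: list[str] = []   # newly learned phrases, in discovery order
    current = ""
    for symbol in stream:
        candidate = current + symbol
        if candidate in table:
            current = candidate          # keep extending a known phrase
        else:
            table.add(candidate)         # first sighting: learn the phrase
            patterns.append(candidate)
            current = symbol             # restart from the current symbol
    return patterns


print(lzw_dictionary("liftliftlift"))
# ['li', 'if', 'ft', 'tl', 'lif', 'ftl', 'lift']
```

Because phrases grow by one symbol each time they recur, repeated behavior produces patterns of varying length without fixing a window size in advance, which is how the first challenge is addressed.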

Page 7:

Using an ensemble of models increases the accuracy of threat anomaly detection

New data chunks create new models
Problem: the ensemble holds K models and there are K+1
Solution: remove the least accurate model

Majority voting by all models is used to determine which model is performing the worst
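
One possible reading of this update rule is sketched below. It is an assumption that "least accurate" means "agrees least often with the majority vote on the newest chunk", since no labels are available in the unsupervised setting. The `Model` type, `majority_vote` and `update_ensemble` are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence

# Hypothetical model interface: returns True if a pattern looks anomalous.
Model = Callable[[str], bool]


def majority_vote(models: Sequence[Model], pattern: str) -> bool:
    """Strict majority; a tie counts as 'not anomalous'."""
    votes = sum(model(pattern) for model in models)
    return votes * 2 > len(models)


def update_ensemble(ensemble: list[Model], new_model: Model,
                    chunk: Sequence[str], k: int) -> list[Model]:
    """Add new_model; if that makes K+1 models, drop the one that agrees
    least often with the majority vote on `chunk` (assumed criterion)."""
    candidates = ensemble + [new_model]
    if len(candidates) <= k:
        return candidates
    agreement = [0] * len(candidates)
    for pattern in chunk:
        verdict = majority_vote(candidates, pattern)
        for i, model in enumerate(candidates):
            if model(pattern) == verdict:
                agreement[i] += 1
    # On a tie, min() picks the earliest (oldest) model.
    worst = min(range(len(candidates)), key=agreement.__getitem__)
    return candidates[:worst] + candidates[worst + 1:]


def flags_x(pattern: str) -> bool:
    return "x" in pattern


def flags_x_too(pattern: str) -> bool:
    return "x" in pattern


def always_alarms(pattern: str) -> bool:
    return True


def newest(pattern: str) -> bool:
    return "x" in pattern


kept = update_ensemble([flags_x, flags_x_too, always_alarms], newest,
                       ["ab", "xc", "cd"], k=3)
print([model.__name__ for model in kept])
```

In this toy run the model that raises an alarm on every pattern disagrees with the majority most often, so it is the one removed.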

Page 8:

[Figure: system architecture. The system log is split into chunks (Chunk i, Chunk i+1, ...) of system calls/commands. Training: gather data from Chunk i; index the system calls with Unicode; unsupervised sequence learning (generate an LZW dictionary D containing all possible patterns using the Lempel-Ziv-Welch algorithm, then compress it into the quantized dictionary QD); incremental stream mining (update the previous QD and update the models). Online learning: data from week i+1 is indexed with Unicode and tested against the models to decide whether it is an anomaly.]
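
The "index the system calls with Unicode" step can be sketched as follows. The idea is that each distinct command gets its own character, so a session becomes a plain string that string algorithms such as LZW can consume. The class name `CommandIndex` and the starting code point are assumptions made for illustration; the slides do not say which code points were used.

```python
class CommandIndex:
    """Map each distinct command to one Unicode character (illustrative only)."""

    def __init__(self, first_code_point: int = 0x0100):
        # Assumed starting point; suitable for small vocabularies only.
        self._first = first_code_point
        self._to_char: dict[str, str] = {}

    def encode(self, commands: list[str]) -> str:
        chars = []
        for command in commands:
            if command not in self._to_char:
                # Assign the next unused code point to a new command.
                self._to_char[command] = chr(self._first + len(self._to_char))
            chars.append(self._to_char[command])
        return "".join(chars)

    def decode(self, text: str) -> list[str]:
        to_command = {char: cmd for cmd, char in self._to_char.items()}
        return [to_command[char] for char in text]


index = CommandIndex()
session = ["ls", "cd", "ls", "vi"]
encoded = index.encode(session)
print(len(encoded), index.decode(encoded))
```

Keeping the mapping in one object lets later chunks reuse the same characters for the same commands, which matters when dictionaries from different chunks are compared.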

Page 9:

[Figure: an unlabeled data stream ("liftliftlift...") is passed through LZW to produce an LZW dictionary of patterns such as li, lif, lift, ftl, tli and tlif; lossy compression then reduces the LZW dictionary to the quantized dictionary.]
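
The transcript does not spell out the lossy compression rule, so the sketch below is only one plausible scheme, stated as an assumption: weight each pattern by how often it occurs in the stream, then discard any pattern that is contained in a longer kept pattern of equal or greater weight. The names `count_overlapping` and `quantize` are hypothetical.

```python
def count_overlapping(stream: str, pattern: str) -> int:
    """Count occurrences of pattern in stream, overlaps included."""
    last_start = len(stream) - len(pattern)
    return sum(stream.startswith(pattern, i) for i in range(last_start + 1))


def quantize(patterns: list[str], stream: str) -> list[str]:
    """Assumed lossy rule: drop patterns subsumed by a longer kept pattern
    that is at least as frequent."""
    weights = {p: count_overlapping(stream, p) for p in patterns}
    kept: list[str] = []
    for pattern in sorted(patterns, key=len, reverse=True):
        subsumed = any(pattern in longer and weights[longer] >= weights[pattern]
                       for longer in kept)
        if not subsumed:
            kept.append(pattern)
    return kept


lzw_patterns = ["li", "if", "ft", "tl", "lif", "ftl", "lift"]
print(quantize(lzw_patterns, "liftliftlift"))
# ['lift', 'ftl']
```

Under this assumed rule the seven LZW patterns from the example stream shrink to two, which illustrates how compression addresses the dictionary-size problem while keeping the longest frequent patterns.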

Page 10:
Page 11:

[Figure: updating the quantized dictionary. Session 1, Session 2, ..., Session n are each passed through LZW to build an LZW dictionary. Compression of the LZW dictionary yields the old quantized dictionary (OQD); when new sessions arrive, their LZW dictionary is compressed together with the OQD to give the new quantized dictionary (NQD).]

Page 12:

Given a test data stream S and a quantized dictionary QD = {qd1, qd2, …}, an anomaly is a phrase/pattern in the stream that is more than α edit distance from all the patterns in QD.

Steps in identifying non-matching phrases:
Compute the edit distance matrix L for each phrase in the dictionary and the data stream S
If the edit distance is within α, delete the matching part from the stream S
The remaining patterns in the stream S are considered anomalies
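
The steps above can be sketched as follows. This is a simplified illustration, not the authors' matching procedure: it compares each dictionary pattern only against a stream window of the same length, scanning left to right and trying longer patterns first. The function names are hypothetical; `alpha` corresponds to α.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_anomalies(stream: str, quantized_dictionary: list[str],
                   alpha: int) -> list[str]:
    """Delete stream windows within alpha of a pattern; return what is left."""
    # Skip empty patterns so the scan always advances.
    patterns = sorted((p for p in quantized_dictionary if p),
                      key=len, reverse=True)
    anomalies: list[str] = []
    unmatched: list[str] = []
    position = 0
    while position < len(stream):
        for pattern in patterns:
            window = stream[position:position + len(pattern)]
            if (len(window) == len(pattern)
                    and edit_distance(window, pattern) <= alpha):
                position += len(pattern)   # "delete" the matching part
                if unmatched:
                    anomalies.append("".join(unmatched))
                    unmatched = []
                break
        else:
            unmatched.append(stream[position])   # no pattern matched here
            position += 1
    if unmatched:
        anomalies.append("".join(unmatched))
    return anomalies


print(find_anomalies("liftliftxyzqlift", ["lift"], alpha=1))
# ['xyzq']
```

The choice of α trades tolerance for sensitivity: with `alpha=1` the stream `"liftlixtlift"` yields no anomalies, while with `alpha=0` the segment `"lixt"` is reported.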

Page 13:

User command patterns shift over time, e.g. a programmer slowly evolves into an advanced programmer
Changes in users’ habits should not be identified as anomalies
Attribute natural changes to concept drift
Concept drift can be added artificially and anomalies are still detected

Page 14:
Page 15:

drift = [.7071, 1.1180, 1.5811, 1.5811, 1.5811]

Min/Max distributions = [.42929/.57071, .08820/.31180, 0/.25811, 0/.25811, 0/.25811]
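
The two lists above appear to be related by simple arithmetic. Each min/max pair matches p − 0.1·drift and p + 0.1·drift, with the lower bound clipped at zero, for base probabilities p = [0.5, 0.2, 0.1, 0.1, 0.1]. Both the base probabilities and the 0.1 factor are inferred from the numbers and are not stated in the transcript, so the sketch below is a consistency check rather than the authors' formula.

```python
drift = [0.7071, 1.1180, 1.5811, 1.5811, 1.5811]   # from the slide

# Inferred, not stated in the transcript:
base_probability = [0.5, 0.2, 0.1, 0.1, 0.1]
SCALE = 0.1


def drift_bounds(p: float, d: float, scale: float = SCALE) -> tuple[float, float]:
    """Return (min, max) of a probability shifted by scale * drift,
    with the lower bound clipped at zero."""
    delta = scale * d
    return max(0.0, p - delta), p + delta


bounds = [drift_bounds(p, d) for p, d in zip(base_probability, drift)]
for low, high in bounds:
    print(f"{low:.5f}/{high:.5f}")
```

Under these assumptions the printed pairs agree with the slide's min/max values to five decimal places.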

Page 16:

Modified Naïve Bayes that uses an incremental approach (NB-INC)*

Unsupervised ensemble approach (USSL-GG) that incrementally tests for anomalies and performs best with an ensemble size of 3

(*) R. A. Maxion, “Masquerade detection using enriched command lines,” in Proc. IEEE International Conference on Dependable Systems & Networks (DSN), 2003, pp. 5–14.

Page 17:
Page 18:
Page 19:
Page 20:
Page 21:
Page 22:
Page 23:
Page 24:
Page 25:
Page 26:

Ensemble-based stream mining effectively detects insider threats while coping with evolving concept drift

Our approach adopts advantages from stream mining, compression and ensembles:
Compression gives unsupervised learning
Stream mining offers adaptive learning
Ensembles increase accuracy with concept drift

Page 27:

Approach    Un/Supervised   Drift   Insider Threat   Sequence
Ju          S               N       Y                Y
Maxion      S               N       Y                N
Liu         U               N       Y                Y
Wang        S               N       Y                N
Szymanski   S               N       Y                Y
Masud       S               Y       N                N
Parveen     U               Y       Y                N
USSL-GG     U               Y       Y                Y

(S = supervised, U = unsupervised; Y = yes, N = no)

Page 28:

Update existing models based on user feedback

Update and refine models on ground truth when it is available

Page 29: