Simseer.com - Malware Similarity and Clustering Made Easy

Post on 18-Nov-2014

Simseer.com - Malware Similarity and Clustering Made Easy

Silvio Cesare <silvio@ruxcon.org.au>

Introduction

• Simseer.com is a set of web services that analyse malware using program structure as a signature. Why?

• AV string signatures are not very robust.

• They can’t detect ‘approximate’ matches.

• It is hard to generate a signature for an entire family.

• Program structure improves signature-based methods.

Who am I?

•Ph.D. Student at Deakin University.

•Presented at Ruxcon, Black Hat, AusCERT, etc.

•Published in academia.

•Book author

•Recently relocated to Canberra.

Outline

1. Introduction

2. Simseer.com’s Malware Services

3. Supporting Infrastructure

4. Other Services

5. Conclusion

Signatures

•Covered in detail in my other presentations.

•Signature is based on a ‘set of control flow graphs’.

Signature Extraction

•Transform ‘set of control flow graphs’ into a ‘feature vector’

•Decompilation + N-Grams

[Figure: control flow graph with nodes L_0 to L_7; edges labelled ‘true’]

proc() {
L_0:  while (v1 || v2) {
L_1:    if (v3) {
L_2:    } else {
L_4:    }
L_5:  }
L_7:  return;
}

W|IEH}R

W|IE  |IEH  IEH}  EH}R
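
The n-gram step can be sketched in a few lines of Python. This is only an illustration: the window size of 4 is inferred from the example above (the string W|IEH}R splits into W|IE, |IEH, IEH}, EH}R), and the function names `ngrams` and `feature_vector` are hypothetical, not part of Simseer.

```python
from collections import Counter


def ngrams(signature, n=4):
    """Return every contiguous substring of length n, in order."""
    return [signature[i:i + n] for i in range(len(signature) - n + 1)]


def feature_vector(signatures, n=4):
    """Count n-grams over the per-procedure strings of one program.

    `signatures` is assumed to hold one structure string per procedure,
    such as "W|IEH}R" for proc() above.
    """
    counts = Counter()
    for s in signatures:
        counts.update(ngrams(s, n))
    return counts


print(ngrams("W|IEH}R"))  # ['W|IE', '|IEH', 'IEH}', 'EH}R']
```

Each distinct n-gram becomes one dimension of the feature vector, and its count is the value in that dimension.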

Simseer

•Begin demo...

•A revamp of my existing http://www.FooCodeChu.com service.

•Submit an archive of malware samples.

•Results

▫A similarity matrix comparing samples.

▫An evolutionary tree showing relationships.

Submission Page

Results

Simseer

•Demo complete...

•Use ‘distance between vectors’ to show similarity.

•Visualize using phylogenetics software.
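
As a rough sketch of how a pairwise matrix could be built from feature vectors: the slide only says ‘distance between vectors’, so the choice of Euclidean distance here is an assumption, as are the function names and the dict-of-counts representation. A matrix like this is the kind of input that distance-based phylogenetics tools accept.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))


def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances; the diagonal is zero."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = euclidean(vectors[i], vectors[j])
    return matrix


# Hypothetical samples: two identical variants and one unrelated program.
samples = [{"W|IE": 1}, {"W|IE": 1}, {"W|IE": 4, "EH}R": 4}]
for row in distance_matrix(samples):
    print(row)
```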

SimseerCluster

• Begin demo...

• A new service.

• Submit an archive of malware samples.

• Define the number of clusters.

• Results

▫ Samples grouped into clusters.

▫ Cross checking samples with AV.

▫ Identification of families.

Submission Page

Results

SimseerCluster

•Demo complete...

•Use ‘similarity matrix’ and ‘cosine similarity’.

•Pass to ‘cluster analysis software’ – The Weka Machine Learning Toolkit.

•Use Hierarchical clustering.
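
The service hands this step to Weka. The sketch below is not Weka; it is a small pure-Python stand-in that shows the idea, assuming average linkage and dict-based feature vectors (both are assumptions, and the function names are hypothetical). It merges the two most similar clusters until the requested number of clusters remains.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hierarchical_clusters(vectors, k):
    """Agglomerative clustering (average linkage) down to k clusters.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    sim = [[cosine_similarity(a, b) for b in vectors] for a in vectors]
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean similarity over all cross-cluster pairs.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > max(k, 1):
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        x, y = max(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical samples: two families with no n-grams in common.
samples = [{"a": 1}, {"a": 2}, {"b": 1}, {"b": 3}]
print(hierarchical_clusters(samples, 2))
```

The nested loops make this cubic or worse in the number of samples, which is fine for a demonstration but not for a large archive.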

SimseerSearch

• Begin demo...

• A new service.

• Submit a malware sample.

• Specify threshold of similarity.

• Results

▫ All samples in database similar to query.

▫ An AV report.

▫ Heuristics to detect obfuscations (packing).

Submission Page

Results

SimseerSearch

•Demo complete...

•Use ‘nearest neighbour similarity search’ based on ‘Euclidean distance’.

•Packer detection based on entropy analysis.
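
Entropy-based packer detection can be illustrated as follows. Packed or encrypted data tends towards the maximum of 8 bits of entropy per byte. The cut-off of 7.0 and the name `looks_packed` are illustrative assumptions; the slides do not state what threshold, or what granularity (whole file or per section), the service uses.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(data).values())


def looks_packed(data, threshold=7.0):
    """Heuristic: flag data whose entropy exceeds the assumed threshold."""
    return shannon_entropy(data) > threshold


print(shannon_entropy(b"\x00" * 64))       # constant bytes: lowest entropy
print(shannon_entropy(bytes(range(256))))  # every byte value once: highest
```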

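[Figure: similarity search around a query q with radius r; p is a malware sample at distance d(p,q). Panel labels: ‘Query Malicious’, ‘Query Benign’. Legend: Malware, Query.]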
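
The threshold search with q, p, r and d(p,q) can be sketched as a range query: return every database sample p with d(p,q) ≤ r, and treat the query as malicious if any such sample exists. The linear scan, the dict-based vectors, and the names `similar_samples` and `classify` are assumptions for illustration; the real service may well use an index structure instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))


def similar_samples(query, database, radius):
    """Return (name, distance) for each sample within `radius`, nearest first.

    `database` is assumed to map a sample name to its feature vector.
    """
    hits = []
    for name, vector in database.items():
        d = euclidean(query, vector)
        if d <= radius:
            hits.append((name, d))
    return sorted(hits, key=lambda hit: hit[1])


def classify(query, database, radius):
    """Label a query malicious if any known malware lies within the radius."""
    return "malicious" if similar_samples(query, database, radius) else "benign"


# Hypothetical database of two known malware samples.
malware_db = {"sample_a": {"W|IE": 1}, "sample_b": {"W|IE": 10}}
print(similar_samples({"W|IE": 2}, malware_db, radius=2))
print(classify({"W|IE": 100}, malware_db, radius=2))
```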

Supporting Infrastructure

Other Services

•Other services on the same infrastructure

▫Clonewise

▫Bugwise

Clonewise – Detecting embedded libraries.

Bugwise on real Debian Linux binaries

Future Work

•Integrate Cuckoo sandbox

▫Unpacking with Volatility.

▫Non-EXE formats (PDF, DOC, etc.).

▫API call classification (non-signature-based).

Conclusion

•Free services.

•Control flow better than traditional string signatures.

•Try it!

•http://www.simseer.com
