Analysis of Malware Behavior: Type Classification using Machine Learning

Steven S. Hansen, Radu S. Pirscoveanu, Thor M. T. Larsen, Matija Stevanovic, Jens M. Pedersen, Alexandre Czech

Jul 27, 2020


Introduction

• Malware is a threat to modern society

• Approximately 390,000 new malware samples emerge each day, according to AV-TEST

– Many of them are variants originating from the same code

How can a large amount of malware be classified using machine learning?


Introduction

• Dynamic analysis

– Executing malware in a secure environment

– Collecting behavioral data from the samples

• Pre-filtering application

– Filter known malware from novel malware

– Proof of concept

Analysis Setup

• Scalable and distributed

• Emulate Internet services using INetSim

• VMs are personalized

• 80,000 samples are analyzed

• Each sample is analyzed for 200 seconds


Malware samples

• Supervised machine learning

• Avast is used to extract labels

– Approximately 42,000 samples are labeled
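
The slides do not say how a malware type is derived from an Avast detection name. A minimal sketch, assuming the type appears as a keyword inside the label string, is shown below. The four type names are the classes reported in the results; the example labels and the `malware_type` helper are illustrative assumptions, not the authors' code.

```python
# Hypothetical helper: map an antivirus detection name to a malware type.
# The keyword-matching rule is an assumption, not the authors' method.
TYPES = ("trojan", "pup", "adware", "rootkit")


def malware_type(label):
    """Return the first known type found in the label, or None if unlabeled."""
    lowered = label.lower()
    for type_name in TYPES:
        if type_name in lowered:
            return type_name
    return None


# Illustrative label strings only.
for label in ["Win32:Trojan-gen", "Win32:Adware-gen [Adw]", "Win32:Malware-gen"]:
    print(label, "->", malware_type(label))
```

Samples whose label matches no known type return `None`, which corresponds to leaving them out of the labeled set used for supervised learning.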


Features

• Main parameter

– API calls

• Secondary parameters

– Mutexes

– Registry Keys

– Files

– DNS requests


Feature Representation

• Sequence

– Distinct API calls
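
One plausible reading of the sequence representation is to keep each API call only the first time it appears, preserving the order of first appearance. The sketch below assumes that reading. The function name and the example trace are illustrative, not taken from the slides.

```python
def distinct_call_sequence(api_calls):
    """Return API calls in order of first appearance, dropping repeats."""
    seen = set()
    sequence = []
    for name in api_calls:
        if name not in seen:
            seen.add(name)
            sequence.append(name)
    return sequence


# Illustrative trace of API call names.
trace = ["NtCreateFile", "NtWriteFile", "NtWriteFile",
         "RegOpenKeyExW", "NtCreateFile", "NtClose"]
print(distinct_call_sequence(trace))
# ['NtCreateFile', 'NtWriteFile', 'RegOpenKeyExW', 'NtClose']
```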


Feature Representation

• Frequency Bins

– The frequencies of all API calls in each bin are summed
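
The slides do not state how API calls are assigned to bins. The sketch below assumes a fixed lookup table from API name to bin and sums the call frequencies per bin. The `API_BINS` mapping, the bin names, and the `"other"` fallback are hypothetical.

```python
from collections import Counter

# Hypothetical API-to-bin assignment; the real binning scheme is not given.
API_BINS = {
    "NtCreateFile": "file",
    "NtWriteFile": "file",
    "RegOpenKeyExW": "registry",
    "RegSetValueExW": "registry",
    "connect": "network",
}


def binned_frequencies(api_calls, api_bins=API_BINS, default_bin="other"):
    """Count each API call, then sum the counts of all APIs in the same bin."""
    totals = Counter()
    for name, count in Counter(api_calls).items():
        totals[api_bins.get(name, default_bin)] += count
    return dict(totals)


trace = ["NtCreateFile", "NtWriteFile", "NtWriteFile",
         "RegOpenKeyExW", "connect", "NtClose"]
print(binned_frequencies(trace))
# {'file': 3, 'registry': 1, 'network': 1, 'other': 1}
```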


Feature Representation

• Counters

– Count the number of actions performed for each secondary parameter
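
The counter features can be sketched as one count per secondary parameter (mutexes, registry keys, files, DNS requests). The `report` dictionary and its field names below are hypothetical and do not reflect the actual layout of a sandbox report.

```python
# Secondary parameters named in the slides; the key names are assumptions.
SECONDARY_PARAMETERS = ("mutexes", "registry_keys", "files", "dns_requests")


def secondary_counters(report):
    """Return the number of recorded actions for each secondary parameter."""
    return {key: len(report.get(key, [])) for key in SECONDARY_PARAMETERS}


# Illustrative behavioral summary for one sample.
report = {
    "mutexes": ["Global\\example_mutex"],
    "registry_keys": ["HKLM\\Software\\Example", "HKCU\\Software\\Example"],
    "files": ["C:\\Temp\\a.tmp", "C:\\Temp\\b.tmp", "C:\\Temp\\c.tmp"],
}
print(secondary_counters(report))
# {'mutexes': 1, 'registry_keys': 2, 'files': 3, 'dns_requests': 0}
```

A parameter that is missing from the report counts as zero, so every sample yields a feature vector of the same length.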


Results

• Random Forests

– 160 trees

Class           F-measure   AUC
Trojan          0.960       0.989
PUP             0.850       0.978
Adware          0.767       0.955
Rootkit         0.862       0.970
Weighted Avg.   0.898       0.980
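
For reference, the F-measure is the harmonic mean of precision and recall, and a weighted average is usually taken over classes in proportion to the number of samples in each class. The sketch below assumes those standard definitions. The slides give no per-class sample counts, so the numbers in the example are toy values, not the study's data.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(scores, supports):
    """Average per-class scores, weighting each class by its sample count."""
    total = sum(supports.values())
    return sum(scores[name] * supports[name] for name in scores) / total


# Toy values for illustration only.
print(f_measure(0.5, 0.5))
print(weighted_average({"A": 0.9, "B": 0.5}, {"A": 3, "B": 1}))
```

Because the weighting follows class size, a large class such as Trojan can pull the weighted average above the scores of smaller classes.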


Conclusion

• Cuckoo Sandbox

• Feature representation

• Random Forests

• Pre-filtering application


Future Work

• Uniform dataset

• Ambiguous type description

• Project is continued

• Q&A

• Email: [email protected]