Cythereal MAGIC Documentation, Release 0.4

Cythereal

Mar 14, 2017

Table of Contents

1 Getting Started
    1.1 Request API Key
    1.2 Install MAGIC Client
    1.3 Configure MAGIC Client
    1.4 MAGIC Client Examples

2 Capabilities
    2.1 Services Provided

3 Use Cases
    3.1 Malware Intelligence
    3.2 Malware Signature Generation
    3.3 Reverse Engineering

4 Pages In Progress

5 Overview of Operations

6 Archive
    6.1 API Flow Graph
    6.2 BinJuice
    6.3 Call Graph
    6.4 Data Model
    6.5 Downloading Files
    6.6 FILE IDENTIFIER
    6.7 FILE FORMATS SUPPORTED
    6.8 Legalese
    6.9 PROCESSING STEPS
    6.10 Querying Information
    6.11 Scalability
    6.12 Semantic Matching
    6.13 Similar Binaries
    6.14 Similar Procedures
    6.15 Strings
    6.16 Unpacking
    6.17 Uploading Files
    6.18 Client Reference

7 Indices and tables


This documentation is currently a work in progress. See http://docs.cythereal.com/en/latest for current documentation.

Cythereal MAGIC extracts intelligence from malicious programs (malware) to aid in cyber defense and incident response. It combines two orthogonal areas, semantic inference and statistical inference, with big data analytics to create what may be termed a “Google for malware”. Using a patent-pending technology, Cythereal MAGIC can make connections between malware, peering through its protection mechanisms to discover intrinsic relationships that provide strong evidence for classification and attribution.

CHAPTER 1

Getting Started

Request API Key

Access to the MAGIC service requires an API key.

To request a key, send an email to [email protected] with the subject “Key Request”. Include your name and the email address you want the key to be registered under.

Please note that, as part of our minimal identity verification, we do not provide keys to email addresses at open email services such as gmail.com, yahoo.com, aol.com, and many others. Please request a key using your business or university email address. You may also use your Facebook email address.

Install MAGIC Client

The MAGIC Client is a command line utility named vbclient. This client provides easy access to the functionality exposed by the MAGIC API.

This client requires Python 2.7. Older versions of Python and Python 3 are not supported at this time.

Access to the MAGIC service requires an API Key. If you don’t have one yet, please Request API Key.

There are three options for installing the MAGIC CLI Client:

• Stand Alone Executable

• Install Using Pip

• Build from Source

All installation methods result in the executable vbclient being created. This executable provides the command line interface to MAGIC.

Once installed, the client must be configured.

Stand Alone Executable

A single executable is provided that packages all necessary libraries for the MAGIC client. Simply download the appropriate version, place it in your PATH, and configure the client.

Version   Download URL
Linux     https://bitbucket.org/cythereal/virusbattle-sdk/downloads/vbclient
Windows   https://bitbucket.org/cythereal/virusbattle-sdk/downloads/vbclient.exe

Install Using Pip

The cythereal-sdk package on PyPI can be installed with pip:

pip install -U cythereal-sdk

Build from Source

The source for the MAGIC CLI client is released under the Apache License and can be found at https://bitbucket.org/cythereal/virusbattle-sdk.

To install the client from source:

git clone http://bitbucket.org/cythereal/virusbattle-sdk.git
pip install ./virusbattle-sdk

This makes the vbclient command available on your PATH.

Configure the client before continuing.

Configure MAGIC Client

Accessing the MAGIC client requires an API Key. If you do not have one, first request a key.

The vbclient program expects an environment variable named VIRUSBATTLE_KEY to contain your API key for accessing MAGIC.

On Linux, place the following in your .bashrc:

export VIRUSBATTLE_KEY=<your API key>

On Windows, you can use the System Properties editor to add a new environment variable named VIRUSBATTLE_KEY and set its value to your API key.
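Before running the client, it can help to confirm that the variable is actually visible to new processes. The following is a minimal sketch in Python; the helper name get_api_key is hypothetical and is not part of vbclient, and the only fact taken from this documentation is the variable name VIRUSBATTLE_KEY.

```python
import os


def get_api_key(environ=None):
    """Return the API key stored in VIRUSBATTLE_KEY, or raise if it is missing.

    `environ` defaults to the process environment; passing a dict makes the
    helper easy to try out without touching real settings.
    """
    env = os.environ if environ is None else environ
    key = env.get("VIRUSBATTLE_KEY", "").strip()
    if not key:
        raise RuntimeError("VIRUSBATTLE_KEY is not set or is empty")
    return key
```

Calling get_api_key() in a fresh shell raises RuntimeError if the export line has not yet taken effect, for example because .bashrc was not re-read.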

MAGIC Client Examples

For experimenting with MAGIC without having to upload malware, you can use this set of example hashes:

d7cf8805293e1b53e4978244a3846e8aa5694709
c3380880e0af850e41e515ec76cec5afa923f2a1
70d8ca750e7aba85b8d861a5726123ae57e92e0c
f67f72956c5dba5efbbdc7189aaa6dad901c5062
5357e97f4d68f68b8c7278d4ea9191b53711fa11
e9c9165377f0ad0d099bfbae00ccce1ab384c8d3
3a0bd01f3953a5497b912fd34d951fa01729ad69
71c77195f3bee7fbc4e1f6718030573ed38dda45
f871a5e2d43cc2dd498092b8ae3136713fb0fdfe
0ac2eace224747de60ba5f049279a0c898ffb7f4
20796a052453b8351ac4b1b1a985bdbfc2e5d1dc
2e0fb5e86ef3babe732c7d4fdf2a5e8a69ab25b0
33f8da48446ee4a5466fda03594102488c9c7742
24c0aacc8a565e0ab91752e564f74f561d76606b

Upload file for analysis

Archive files, e.g. zip, tar, rar, are supported.

If the archive is password protected, add the option -p <password>.

vbclient -a upload <path-to-binary-file/directory/archive>

This will create a file in the local directory named UploadedHashes.txt that lists the SHA1s of all uploaded files. Additional uploads in the same directory will append to, rather than overwrite, this list.
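Because repeated uploads append to UploadedHashes.txt, the list can accumulate duplicates. The sketch below de-duplicates its contents. It assumes the file holds whitespace-separated SHA1 values, which this documentation does not state explicitly; the function name parse_uploaded_hashes is hypothetical.

```python
import re

# A SHA1 digest written in hexadecimal is exactly 40 hex characters.
SHA1_RE = re.compile(r"^[0-9a-fA-F]{40}$")


def parse_uploaded_hashes(text):
    """Return the unique SHA1s found in `text`, lowercased, in first-seen order.

    Tokens that do not look like a SHA1 are ignored.
    """
    unique = []
    for token in text.split():
        token = token.lower()
        if SHA1_RE.match(token) and token not in unique:
            unique.append(token)
    return unique
```

A typical call would be parse_uploaded_hashes(open("UploadedHashes.txt").read()).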

Query file information

vbclient -a query <sha1>

Where <sha1> is the SHA1 of a previously uploaded file. You may compute it with a third-party program such as sha1sum. It is also available in the file UploadedHashes.txt created during file upload.

vbclient -a query `cat UploadedHashes.txt`
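If sha1sum is not available (for example on Windows), the same identifier can be computed with Python's standard hashlib module. This is a small sketch; the function names are illustrative and not part of the MAGIC client.

```python
import hashlib


def sha1_of_bytes(data):
    """Return the hexadecimal SHA1 digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA1 digest of a file, read in chunks.

    Reading in chunks keeps memory use flat for large samples.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result of sha1_of_file(path) is the value to pass as <sha1> in the commands in this section.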

Download analyses output

vbclient -a download <sha1>

This command creates a folder named “Results” and puts the downloaded files there. The unpacked files are not downloaded, as they are in executable format.

Download unpacked files

vbclient -a download <sha1> --enable_malware_download

This creates a folder named “Results” and puts the downloaded files there. The unpacked files are distributed as password-protected zips. The password is “unpacked”.

Generate mappings between uploaded and downloaded files

vbclient -a map

Search for similar files

vbclient -a matches <sha1>

This creates a folder named “Results” and saves the similarity results in the files similarity.csv and similarity.json, in CSV and JSON format respectively.

Search for similar procedures

vbclient -a search <sha1>/0x<rva-of-procedure>

View the underlying feature set

# For a binary.
vbclient -a show <sha1-of-binary>
# For a procedure.
vbclient -a show <sha1>/0x<rva-of-procedure>
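The search and show actions identify a procedure as <sha1>/0x<rva-of-procedure>. The sketch below builds such an identifier from a SHA1 and an integer RVA. The helper name procedure_id is hypothetical, and the choice of lowercase, unpadded hexadecimal for the RVA is an assumption; this documentation does not say whether the server is sensitive to case or padding.

```python
def procedure_id(sha1, rva):
    """Format a procedure identifier as '<sha1>/0x<rva>'.

    `sha1` is the 40-character hex digest of the binary and `rva` is the
    relative virtual address of the procedure as a non-negative integer.
    """
    hex_digits = "0123456789abcdefABCDEF"
    if len(sha1) != 40 or any(ch not in hex_digits for ch in sha1):
        raise ValueError("expected a 40-character hexadecimal SHA1")
    if rva < 0:
        raise ValueError("RVA must be non-negative")
    # Assumption: lowercase hex without zero padding.
    return "%s/0x%x" % (sha1.lower(), rva)
```

For example, procedure_id("d7cf8805293e1b53e4978244a3846e8aa5694709", 0x401000) produces a string that can be passed to vbclient -a search; the RVA used here is made up for illustration.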

CHAPTER 2

Capabilities

The capabilities may be categorized into three levels, based on the level of abstraction.

Highest level:

• Clusters of related binaries.

• Abstraction of the middle level by composing relations due to packed and unpacked binaries.

• Classification of malware: ransomware, keylogger, etc.

• Described in the document Charles just sent.

Middle level:

• Ability to upload malware, zip files, and tar files.

• Treats packed and unpacked binaries separately.

• Get matches based on binaries, whether packed or unpacked. Available via vbclient -a matches.

• Get access to malware binaries (unpacked as well), via vbclient -a download.

DeepDive level:

• Get the genome of an individual binary and its functions, via vbclient -a show.

• Get matching functions, via vbclient -a search.

The middle and deep levels are documented on the website that was sent earlier.

For VMware, our architecture should also be interesting. For high performance, I expect we would want to integrate more tightly. For instance, we use hypervisor introspection to monitor a process and then take its memory dump. We create a PE file and use that for processing. A tighter integration would feed directly off the memory dump, perhaps not even a complete dump but just the current page in memory.

As an example, just yesterday one of my users in Australia hand-created a function and used it to search for binaries that contain a similar function. He found matches in five malware binaries.

That’s a great proof of concept. We can take pieces of code and look for them in known goodware and badware, and then associate additional information to determine whether they are indicators of malware.

Services Provided

VirusBattle provides the following automated malware analysis services.

• *Unpacking*: Unpack PE-32 files packed with a large variety of packers, using VM Introspection to monitor execution below ring 0.

• *Reverse Engineering*: Calculate the abstract semantics (BinJuice) of basic blocks in the disassembly, generate Call Graphs and APIFlow Graphs, and extract strings.

• *Semantic Matching*: Make code similarity queries at both procedure and binary levels of granularity.

CHAPTER 3

Use Cases

Malware Intelligence

Cythereal MAGIC’s analysis method combines deep knowledge of operating system internals with state-of-the-art programming language theory for formal program analysis. This allows it to peer through most known obfuscations, analyze even the most complex malware, and extract a wealth of information about the inner structure and workings of malware. Add data mining to the mix and you get a very powerful tool for extracting intelligence from large repositories of malware at a previously unthought-of scale.

Connections between seemingly disparate malware families:

VirusBattle can be used to find connections among malware families that were not previously suspected. Further queries can be made to the system to find out the nature of the connection and to show the evidence, namely the semantically equivalent procedures that led the system to conclude the connection.

The image below shows VirusBattle identifying a connection between Gamarue worms and Leechole trojans. VirusBattle found that certain variants of the two families share the same packer. VirusBattle also identified the set of procedures that were common to the two families and formed the unpacking stub. This is of great help to reverse engineers who want to unpack the malware manually for deeper analysis.

Fig. 3.1: cluster-middle-row-right-column-2.png

The two images below show two procedures found in several variants of the DarkComet and Optima families. Variants of both families use different packers to hide these procedures from static analysis. The procedures were extracted by VirusBattle’s unpacker using VM Introspection at runtime.

Malware Signature Generation

VirusBattle can analyze a large collection of labeled malware and generate semantic signatures common to a family. VirusBattle analyses are capable of locating and identifying even the smallest set of procedures common to a family and of generating obfuscation-resistant, semantically meaningful signatures.

Additionally, VirusBattle can perform probabilistic analysis to calculate a confidence value with which it assigns a new malware variant to a known family.

The graph below shows the number of procedures (y-axis) versus the percentage of nitol binaries they are found in (x-axis), as identified by VirusBattle. The graph shows that VirusBattle is capable of finding the needle in the haystack: it generated juice-based signatures for the set of 5 procedures that were present in more than 95% of nitol executables.

Fig. 3.2: nitol Proc Sharing.png

Reverse Engineering

VirusBattle uses VM introspection to observe malware execution at a level below ring 0. Intricate knowledge of Windows internals is built into the system to monitor the malware’s interaction with the operating system as it executes. This is followed by a rigorous static analysis of the original code as well as of the runtime-generated code extracted during execution.

VirusBattle’s static analysis engine performs a variety of analyses, the most important to reverse engineers being the BinJuice analysis. Juice is an abstraction over semantics that can be computed and compared in a fast and scalable fashion.

Given a binary executable, VirusBattle can, in about a minute, calculate the juice of all procedures in the binary and find known procedures in the database that are semantically equivalent to procedures in the given binary. Users then have access to all the information and notes of malware analysts who have worked on the procedure before, leaving only the unique, never-seen-before procedures to be reversed. This reduces the workload by orders of magnitude.

Fig. 3.3: proc-sharing.jpeg

Propagating information from one procedure to another juice-equivalent procedure has interesting advantages. For instance, IDA, more often than not, fails to identify library procedures. Reverse engineers thus often end up spending time reversing a library procedure, which can be avoided.

The image below shows the percentage of library procedures identified by IDA, followed by the percentage identified by VirusBattle just by propagating IDA isLibrary tag information across juice-equivalent procedures.

The above is just a glimpse of what can be achieved by propagating information across equivalent procedures. One can also add labeled open source code, propagate information from it to equivalent procedures in malware, and use the labels to guide a reverse engineer trying to understand the malware’s behavior.

To better aid in understanding new malware, VirusBattle also reports the ControlFlow Graph of the malware and generates an APIFlow Graph. Since API calls are the most common way to interact with the OS, they can be used to understand malware behavior. The APIFlow Graph may thus be understood as an abstraction of the ControlFlow Graph in which each path describes the behavior of the program as it executes the corresponding path in the ControlFlow Graph.

CHAPTER 4

Pages In Progress

These pages are in the process of being moved to this new documentation.

• USAGE INSTRUCTIONS

– Upload Files

– Download Results

– Query Relation between Upload and Download

– Search Similar Binaries

– Search Similar Procedures

• DATA MODEL

– Objects and Relations

– Processing Steps

– File Identifier and Formats

• PERFORMANCE

– Scalability

– Usability

• COMMAND REFERENCE

– vbClient.py Command Line

• OTHER

– Legalese

– Version History

– Credits for Development

• HELP

– Troubleshooting installation issues

CHAPTER 5

Overview of Operations

Fig. 5.1: Data Flow Diagram.jpg

CHAPTER 6

Archive

Old documentation in the process of being moved to the new format.

API Flow Graph

Getting Started

Accessing VirusBattle requires downloading and setting up the VirusBattle SDK. See Installation, Setup, Registration to set it up.

APIFlowGraph with VirusBattle

VirusBattle provides several fully automated semantic reverse engineering services. The service that extracts the APIFlowGraph is called srlStatic. An APIFlowGraph is a directed graph whose nodes represent API call sites. A directed edge from call site A to call site B implies that there exists an interprocedurally valid path in the callgraph on which a call from site B follows a call from site A, with no intermediate call to any API. To extract the APIFlowGraph from an x86 binary, upload a PE-32 executable, either as is or as part of a compressed archive, wait a few seconds, and download the result files.

Uploading to VirusBattle

See Uploading Files for a detailed HowTo. The easiest way to upload to VirusBattle is:

vbclient.py -a upload <path to file>

Checking Status

To find out if the uploaded file has been processed or not:

vbclient.py -a status <sha1 of uploaded file>

You may also want to use the query action for details:

vbclient.py -a query <sha1 of uploaded file>

Downloading APIFlowGraph File

To download results of VirusBattle, use the Download action:

vbclient.py -a download <sha1 of uploaded file>

This downloads VirusBattle service result files into the ./Results folder. To avoid downloading results from other services (srlJuice, etc.), set VIRUSBATTLE_SERVICE_FILTER as described below.

Generate Mapping between PE File and APIFlowGraph File

To generate service maps, use the map action:

vbclient.py -a map <sha1 of uploaded file>

This creates CSV map files in the ./Results directory containing original_file_sha1,result_file_sha1 pairs. The srlStatic service produces three output files: Callgraph, APIFlowGraph, and Strings, which makes the map file tricky to interpret. The files are named result_file_sha1.callgraph.dot, result_file_sha1.apiflowgraph.json, and result_file_sha1.strings.json. This helps, but the map file itself does not say which kind of result each row refers to. We recommend processing the vb-srlStatic.map file produced by the above command, in the ./Results directory where you also downloaded the srlStatic result files, as follows:

VB_APIFLOWGRAPH_FILES=`ls|grep apiflowgraph|awk -F. -vORS='\\\|' '{print $1}'|head -c -2`
cat vb-srlStatic.map | grep $VB_APIFLOWGRAPH_FILES > vb-apiflowgraph.map
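The same filtering can be sketched in Python, which also works on Windows where grep and awk may not be installed. The function below is illustrative only: its name is hypothetical, and it assumes each map row has exactly the two comma-separated fields described above. Unlike the grep pipeline, it compares the result SHA1 against the second column only, instead of matching anywhere in the row.

```python
def apiflowgraph_map(filenames, map_lines):
    """Keep the map rows whose result SHA1 has an APIFlowGraph result file.

    `filenames` is a directory listing (for example os.listdir("Results")).
    `map_lines` is the content of vb-srlStatic.map, one row per item, each
    row assumed to be 'original_file_sha1,result_file_sha1'.
    """
    # Result files are named '<result_sha1>.apiflowgraph.json'.
    wanted = set(
        name.split(".")[0] for name in filenames if ".apiflowgraph." in name
    )
    kept = []
    for line in map_lines:
        fields = line.strip().split(",")
        if len(fields) == 2 and fields[1] in wanted:
            kept.append(line.strip())
    return kept
```

Writing the returned rows to vb-apiflowgraph.map gives a map restricted to APIFlowGraph results.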

Filter Other Services

If you are only interested in results from this service and want to filter out results from other VirusBattle services (srlUnpacker, srlJuice, srlSimService, etc.), set the VIRUSBATTLE_SERVICE_FILTER environment variable to the appropriate value.

export VIRUSBATTLE_SERVICE_FILTER="srlUnpacker,srlSimService,srlJuice"

The variable accepts a case-sensitive, comma-separated list of service names to filter out. You can filter out as many or as few services as you choose.
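Since the variable lists the services to exclude rather than the ones to keep, it is easy to get the list backwards. The sketch below computes the value from the services you want to keep. The function name is hypothetical, and KNOWN_SERVICES contains only the four service names mentioned in this documentation; treating that as the complete list is an assumption.

```python
# Service names as they appear in this documentation (case-sensitive).
# Assumption: these four are the only services that can be filtered.
KNOWN_SERVICES = ("srlUnpacker", "srlSimService", "srlJuice", "srlStatic")


def service_filter(keep):
    """Return a VIRUSBATTLE_SERVICE_FILTER value excluding all but `keep`."""
    unknown = [name for name in keep if name not in KNOWN_SERVICES]
    if unknown:
        raise ValueError("unknown service name(s): " + ", ".join(unknown))
    return ",".join(name for name in KNOWN_SERVICES if name not in keep)
```

For example, service_filter(["srlStatic"]) yields the value to export when only srlStatic results are wanted.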

See Also

• Quick Start Guide

• vbClient.py Command Line

• File Identifier and Formats

BinJuice

Getting Started

Accessing VirusBattle requires downloading and setting up the VirusBattle SDK. See Installation, Setup, Registration to set it up.

BinJuice with VirusBattle

VirusBattle provides several fully automated semantic reverse engineering services. The service that extracts the abstract semantics (juice) of basic blocks of x86 code is called srlJuice. To extract juice, upload a PE-32 executable, either as is or as part of a compressed archive, wait a few seconds, and download the result files.

Uploading to VirusBattle

See Uploading Files for a detailed HowTo. The easiest way to upload to VirusBattle is:

vbclient.py -a upload <path to file>

Checking Status

To find out if the uploaded file has been processed or not:

vbclient.py -a status <sha1 of uploaded file>

You may also want to use the query action for details:

vbclient.py -a query <sha1 of uploaded file>

Downloading Juice File

To download results of VirusBattle, use the Download action:

vbclient.py -a download <sha1 of uploaded file>

This downloads VirusBattle service result files into the ./Results folder. To avoid downloading results from other services (srlStatic, etc.), set VIRUSBATTLE_SERVICE_FILTER as described below.

Generate Mapping between PE File and Juice File

To generate service maps, use the map action:

vbclient.py -a map <sha1 of uploaded file>

This creates CSV map files in the ./Results directory containing original_file_sha1,result_file_sha1 pairs.

Filter Other Services

If you are only interested in results from this service and want to filter out results from other VirusBattle services (srlUnpacker, srlStatic, srlSimService, etc.), set the VIRUSBATTLE_SERVICE_FILTER environment variable to the appropriate value.

export VIRUSBATTLE_SERVICE_FILTER="srlUnpacker,srlSimService,srlStatic"

The variable accepts a case-sensitive, comma-separated list of service names to filter out. You can filter out as many or as few services as you choose.

See Also

• Quick Start Guide

• vbClient.py Command Line

• File Identifier and Formats

Call Graph

Getting Started

Accessing VirusBattle requires downloading and setting up the VirusBattle SDK. See Installation, Setup, Registration to set it up.

Callgraph with VirusBattle

VirusBattle provides several fully automated semantic reverse engineering services. The service that extracts the callgraph is called srlStatic. A callgraph is a directed graph whose nodes represent procedures; a directed edge from procedure A to procedure B implies a call from A to B. To extract the callgraph from an x86 binary, upload a PE-32 executable, either as is or as part of a compressed archive, wait a few seconds, and download the result files.
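To make the definition concrete, the sketch below represents a small callgraph as a Python dictionary that maps each procedure to the procedures it calls, and computes everything reachable from a starting procedure. The procedure names are invented for illustration, and the code does not parse the .callgraph.dot files produced by srlStatic.

```python
def reachable(callgraph, start):
    """Return the set of procedures reachable from `start` via call edges.

    `callgraph` maps a procedure name to a list of its callees, so an entry
    {"A": ["B"]} encodes the directed edge A -> B (A calls B).
    """
    seen = set()
    stack = [start]
    while stack:
        proc = stack.pop()
        for callee in callgraph.get(proc, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


# Hypothetical callgraph; the names are not taken from any real binary.
EXAMPLE_CALLGRAPH = {
    "main": ["decrypt", "connect"],
    "decrypt": ["xor_block"],
    "connect": [],
    "xor_block": [],
}
```

Here reachable(EXAMPLE_CALLGRAPH, "main") contains decrypt, connect, and xor_block, while a leaf such as xor_block reaches nothing.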

Uploading to VirusBattle

See Uploading Files for a detailed HowTo. The easiest way to upload to VirusBattle is:

vbclient.py -a upload <path to file>

Checking Status

To find out if the uploaded file has been processed or not:

vbclient.py -a status <sha1 of uploaded file>

You may also want to use the query action for details:

vbclient.py -a query <sha1 of uploaded file>

Downloading CallGraph File

To download results of VirusBattle, use the Download action:

vbclient.py -a download <sha1 of uploaded file>

This downloads VirusBattle service result files into the ./Results folder. To avoid downloading results from other services (srlJuice, etc.), set VIRUSBATTLE_SERVICE_FILTER as described below.

Generate Mapping between PE File and CallGraph File

To generate service maps, use the map action:

vbclient.py -a map <sha1 of uploaded file>

This creates CSV map files in the ./Results directory containing original_file_sha1,result_file_sha1 pairs. The srlStatic service produces three output files: Callgraph, APIFlowGraph, and Strings, which makes the map file tricky to interpret. The files are named result_file_sha1.callgraph.dot, result_file_sha1.apiflowgraph.json, and result_file_sha1.strings.json. This helps, but the map file itself does not say which kind of result each row refers to. We recommend processing the vb-srlStatic.map file produced by the above command, in the ./Results directory where you also downloaded the srlStatic result files, as follows:

VB_CALLGRAPH_FILES=`ls|grep callgraph|awk -F. -vORS='\\\|' '{print $1}'|head -c -2`
cat vb-srlStatic.map | grep $VB_CALLGRAPH_FILES > vb-callgraph.map

Filter Other Services

If you are only interested in results from this service and want to filter out results from other VirusBattle services (srlUnpacker, srlJuice, srlSimService, etc.), set the VIRUSBATTLE_SERVICE_FILTER environment variable to the appropriate value.

export VIRUSBATTLE_SERVICE_FILTER="srlUnpacker,srlSimService,srlJuice"

The variable accepts a case-sensitive, comma-separated list of service names to filter out. You can filter out as many or as few services as you choose.

See Also

• Quick Start Guide

• vbClient.py Command Line

• File Identifier and Formats

Data Model

Object Classes and Relations

The following diagram summarizes the various classes of objects and their parent-child relations.

The Archive class represents the variety of compressed formats, such as zip, tar, etc. An Archive object may contain another Archive object or an Executable. The Executable class represents the original executable files contained in an Archive. The result of unpacking an Executable is an object of class Unpacked.

DATA MODEL DIAGRAM

![vb-datamodel-0.2.jpg](https://bitbucket.org/repo/EAaeny/images/1732021486-vb-datamodel-0.2.jpg = 100px)

object_class property

The result of a query is a collection of objects (in JSON format), each object belonging to one of the three classes. The class of an object is available via the object_class property. Here is a list of values for the object_class property and their meanings:

archive.7z : 7z compressed archive
archive.tar : tar archive
archive.tgz : gzip compressed tar archive
archive.zip : zip compressed archive
binary.pe32 : Windows 32 PE executable
binary.unpacked.zip : Binary from unpacking a binary, in a password-protected zip file

We anticipate adding support for other archive formats, such as rar, and other binaries.
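The mapping from object_class values to the three data model classes can be sketched as follows. The function name is hypothetical; the prefixes and class names are the ones listed above.

```python
def data_model_class(object_class):
    """Map an object_class value to Archive, Executable, or Unpacked."""
    if object_class.startswith("archive."):
        return "Archive"
    # Check the unpacked case before the generic 'binary.' prefix.
    if object_class == "binary.unpacked.zip":
        return "Unpacked"
    if object_class.startswith("binary."):
        return "Executable"
    raise ValueError("unrecognized object_class: %r" % (object_class,))
```

Matching on the archive. and binary. prefixes rather than on exact values means that formats added later, such as rar, would still be classified, though that behavior is a design choice of this sketch and not something the service promises.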

Object Class: archive.*

{
    "object_class": "archive.zip",
    "sha1": "defa70e7c2c209d08bdb59d1dafa322368fba8ee",
    "unix_filetype": "Zip archive data, at least v1.0 to extract",
    "length": 318156,
    "uploadDate": "2014-09-18 23:28:27.817000",
    "children": [
        {
            "service_data": null,
            "service_name": "archiveHandler",
            "status": "success",
            "child": "9e63fc2115a65f06b25c9541ad463a9c53dbccb1"
        },
        {
            "service_data": null,
            "service_name": "archiveHandler",
            "status": "success",
            "child": "1661c84ab02129efa9461fc84b8f2c4290df407a"
        },
        {
            "service_data": null,
            "service_name": "archiveHandler",
            "status": "success",
            "child": "3c8030b8e344c0bc9211652388621b90d8896290"
        },
        {
            "service_data": null,
            "service_name": "archiveHandler",
            "status": "success",
            "child": "61a7409277ca86576a1006a4415701b95acda6b7"
        }
    ],
    "md5": "522f3af3d85d2e9b5f49330a3c5b5c7c"
}

Object Class: binary.pe32

Here is an example JSON structure returned in response to a query for information on 9e63fc2115a65f06b25c9541ad463a9c53dbccb1.

{
    "object_class": "binary.pe32",
    "sha1": "9e63fc2115a65f06b25c9541ad463a9c53dbccb1",
    "unix_filetype": "PE32 executable (GUI) Intel 80386, for MS Windows",
    "filepath": [
        "user202/testData-1/f53d2ffe563347776f50af7856f1f8b7",
        "./f53d2ffe563347776f50af7856f1f8b7"
    ],
    "length": 144384,
    "parents": [
        "defa70e7c2c209d08bdb59d1dafa322368fba8ee",
        "3c8030b8e344c0bc9211652388621b90d8896290"
    ],
    "uploadDate": "2014-09-18 23:33:34.926000",
    "children": [
        {
            "service_name": "srlUnpacker",
            "status": "success",
            "child": "5fb25ea31fdb45377ccc6f0b7542d537d73b71d2",
            "service_data": {
                "unpacker_config": {
                    "UNPACKER_MAX_TIME": "5",
                    "UNPACKER_DLLMODE": "0",
                    "UNPACKER_TIMEOUT": "100"
                },
                "unpacker_result": {
                    "message": "unpacked",
                    "time": "0 sec"
                }
            }
        }
    ],
    "md5": "f53d2ffe563347776f50af7856f1f8b7"
}

The object_class field gives the type of file. The above object has the class binary.pe32, implying that it is a Windows PE32 executable. More details about the format are available in the field unix_filetype, which gives the file type identification produced by the file command.

Notice the values for children and parents. Both values are lists, implying that a Windows PE32 may have multiple parents and multiple children. A parent of a binary.pe32 file must be an archive. In this case, the binary was found in two archives: defa70e7c2c209d08bdb59d1dafa322368fba8ee and 3c8030b8e344c0bc9211652388621b90d8896290.

You will likely want the name of the file in the archive. The filepath field gives this information. In the above example, it states that the file was found in the respective parents at the following paths: user202/testData-1/f53d2ffe563347776f50af7856f1f8b7 and ./f53d2ffe563347776f50af7856f1f8b7. The paths are given in the same order as the parents.
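Because the paths are given in the same order as the parents, the two lists can be paired by position. A minimal sketch, using the field names from the record above and a hypothetical helper name:

```python
def paths_by_parent(record):
    """Return a dict mapping each parent SHA1 to the file's path inside it.

    Relies on 'filepath' and 'parents' being parallel lists, as described
    in the text; raises if their lengths differ.
    """
    parents = record.get("parents", [])
    paths = record.get("filepath", [])
    if len(parents) != len(paths):
        raise ValueError("parents and filepath lists differ in length")
    return dict(zip(parents, paths))
```

Applied to the example record, this maps defa70e7c2c209d08bdb59d1dafa322368fba8ee to user202/testData-1/f53d2ffe563347776f50af7856f1f8b7 and the second parent to ./f53d2ffe563347776f50af7856f1f8b7.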

While it is obvious that an archive may have children, in our data model a PE32 executable can also have children. A child of a binary.pe32 file is the program resulting from unpacking the binary executable. In the above example, the program has one child, described by the record:

{
    "service_name": "srlUnpacker",
    "status": "success",
    "child": "5fb25ea31fdb45377ccc6f0b7542d537d73b71d2",
    "service_data": {
        "unpacker_config": {
            "UNPACKER_MAX_TIME": "5",
            "UNPACKER_DLLMODE": "0",
            "UNPACKER_TIMEOUT": "100"
        },
        "unpacker_result": {
            "message": "unpacked",
            "time": "0 sec"
        }
    }
}

The service_name field states the specific service used to create the child, the status field states whether the service completed successfully, and the child field gives the id associated with the child. In the above example, the child was created using the service srlUnpacker. The service completed successfully and produced a child with the id 5fb25ea31fdb45377ccc6f0b7542d537d73b71d2.
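The three fields described above can be read out of a query result with a few lines of Python. The sketch below collects the ids of children whose service completed successfully, optionally restricted to one service. The function name is hypothetical, and treating any status other than "success" as a failure is an assumption, since no other status values are listed in this documentation.

```python
def successful_children(record, service_name=None):
    """Return child ids from `record` whose service reported success.

    If `service_name` is given (for example "srlUnpacker"), only children
    created by that service are returned.
    """
    result = []
    for entry in record.get("children", []):
        # Assumption: only the literal string "success" marks completion.
        if entry.get("status") != "success":
            continue
        if service_name is not None and entry.get("service_name") != service_name:
            continue
        result.append(entry["child"])
    return result
```

For the binary.pe32 record shown earlier, successful_children(record, "srlUnpacker") returns a list containing the single id 5fb25ea31fdb45377ccc6f0b7542d537d73b71d2.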

The above qualification of the type of service indicates that VirusBattle supports multiple services on a binary. That is indeed the case. VirusBattle may also perform semantic reverse engineering of the binary or find semantic matches. This service is currently experimental, and is not yet released for general use.

Object class: binary.unpacked.zip

vbSDK provides the ability to recursively query information for the children of a parent. The following record provides the information extracted for the child described above.

{
    "object_class": "binary.unpacked.zip",
    "sha1": "5fb25ea31fdb45377ccc6f0b7542d537d73b71d2",
    "unix_filetype": "Zip archive data, at least v2.0 to extract",
    "length": 87607,
    "parents": [
        "9e63fc2115a65f06b25c9541ad463a9c53dbccb1"
    ],
    "uploadDate": "2014-09-18 23:33:53.217000",
    "password": "unpacked",
    "children": [],
    "md5": "cc7823ad925e210b1a7e3ac91831fe58"
}

Downloading Files

ACTION: DOWNLOAD FILE

To download a file from VirusBattle, vbclient.py provides the following command:

vbclient.py -a download [--norecursive] [-o outdir] [--enable_malware_download] [--downloadall] [--zipbinary] arg ...

The command may be used to download VirusBattle analysis output and/or uploaded binaries. The download command first queries the VirusBattle server for file information, and downloads a file only if it is not an executable or an archive containing an executable. It saves the downloaded files in the directory outdir (which defaults to Results). A downloaded file is named using its file identifier (sha1) along with a file extension that may identify the type of analysis that created it.

By default, vbclient.py prevents download of any executable files, including unpacker output and archives that may contain executable files. To enable downloading of unpacker output, use --enable_malware_download to explicitly allow it.

Though vbclient.py by default supports downloading only analysis results, it can also download all other files. The option --downloadall instructs vbclient.py to download all the files, including archive files and binary.pe32 files.

Note: executables and archives containing executables will still be blocked unless --enable_malware_download is used.

The --norecursive option, as before, disables recursive traversal. When it is NOT specified, --downloadall and --enable_malware_download will download all of the files in the entire tree, starting from a parent. That is, the client downloads the top archive and then each file in the archive as well. If any of those files is an archive, its content is downloaded recursively too. Such recursive downloading, if done without thought, may download large amounts of data to your machine and exhaust your disk space, so it ought to be used with care.

If you want to download unpacked files and other executable files in compressed archive format, use --zipbinary. This tells the VirusBattle server to compress each binary file into its own .zip file before sending it to the client.

In general, the files created by a download are given extensions, such as .zip, .7z, .exe, etc., consistent with their object_class.

COMMON USAGE

1. Recursively download analyses files of an executable with hash filehash (into default Results directory).

vbclient.py -a download filehash

Unpacked files are not downloaded.

2. Recursively download analyses files of an executable with hash filehash including the unpacked files (into default Results directory).

vbclient.py -a download --enable_malware_download filehash

Unpacked files are also downloaded.

3. Recursively download all analyses output for all executables in an archive with hash ziphash excluding unpacked files.

vbclient.py -a download ziphash

4. Recursively download all analyses output for all executables in an archive with hash ziphash including unpacked files.

vbclient.py -a download --enable_malware_download ziphash

5. Download unpacked files in compressed archive format.

vbclient.py -a download --enable_malware_download --zipbinary <sha1>

6. Download analyses files in the directory myoutdir.

vbclient.py -a download --enable_malware_download --outdir=myoutdir <sha1>

or

vbclient.py -a download -o myoutdir <sha1>

7. Download analyses files along with original files.

vbclient.py -a download --enable_malware_download --downloadall [--zipbinary] <sha1>
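When driving the client from a script, the same invocations can be assembled as an argument list. This is a minimal sketch that only builds the command line from the options documented above; the function build_download_command and its parameter names are hypothetical, and it assumes vbclient.py is on your PATH.

```python
import shlex


def build_download_command(hashes, outdir=None, recursive=True,
                           enable_malware_download=False,
                           download_all=False, zip_binary=False):
    """Assemble a 'vbclient.py -a download' command as an argv list."""
    if not hashes:
        raise ValueError("at least one filehash is required")
    cmd = ["vbclient.py", "-a", "download"]
    if not recursive:
        cmd.append("--norecursive")
    if outdir:
        cmd += ["-o", outdir]
    if enable_malware_download:
        cmd.append("--enable_malware_download")
    if download_all:
        cmd.append("--downloadall")
    if zip_binary:
        cmd.append("--zipbinary")
    cmd += list(hashes)
    return cmd


cmd = build_download_command(
    ["5fb25ea31fdb45377ccc6f0b7542d537d73b71d2"],
    outdir="myoutdir",
    enable_malware_download=True,
    zip_binary=True,
)
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to execute the download. Keeping recursive downloads opt-out mirrors the client's default, so the disk-space caveat above applies here as well.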

Generate Mappings for downloaded files

While one can easily parse the json output of the query results to identify each file's connection to the top-level uploaded file, an easier way is to use the map option to generate mappings.

vbclient.py -a map [-o Output-directory] arg1 arg2 ...

The arguments here are the filehashes for which you need the mappings. If no argument is provided, input is taken from the listfile. By default, this creates map files in the Results directory, unless you specify an alternative output path using the -o (or --outdir) option.

The mappings are csv files named service-name.csv, where service-name is the name of the service that produced the output file. The csv files are in the format Input-filehash,Output-filehash.
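A mapping file in this two-column format can be loaded into a dictionary keyed by input hash. The sketch below assumes there is no header row and that one input may map to several outputs; the helper name read_mapping is hypothetical, and the sample row reuses the packed and unpacked hashes from the Data Model example.

```python
import csv
import io
from collections import defaultdict


def read_mapping(lines):
    """Parse 'Input-filehash,Output-filehash' rows into {input: [outputs]}."""
    mapping = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        mapping[row[0].strip()].append(row[1].strip())
    return dict(mapping)


# In practice, pass an open file object for a map file in the Results directory.
sample = io.StringIO(
    "9e63fc2115a65f06b25c9541ad463a9c53dbccb1,"
    "5fb25ea31fdb45377ccc6f0b7542d537d73b71d2\n"
)
mapping = read_mapping(sample)
for source, outputs in mapping.items():
    print(source, "->", outputs)
```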

File Identifier and Formats

FILE IDENTIFIER

Currently VirusBattle uses the SHA1 of a file as its identifier. However, the specific choice of identifier is subject to change. In this discussion we use the terms filehash and objID interchangeably to mean the object identifier used by VirusBattle.
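Since the identifier is currently the SHA1 of the file contents, it can be computed locally before uploading or querying. A minimal sketch using Python's standard hashlib module follows; the function name file_sha1 is hypothetical and this is not necessarily how vbclient.py computes the hash internally.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned 40-character hex string can be used wherever a filehash argument is expected, for example in vbclient.py -a query --norecursive.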

FILE FORMATS SUPPORTED

VirusBattle currently analyzes Windows PE32 executables. Since most companies and people do not like to have malware executables on their machines or network, the system also allows uploading password-protected zip files. Besides zip, the system accepts the tar, tar.gz, and 7z formats. Other archive formats can be supported on request, to the extent their decompressors are available on the Ubuntu platform.

The system also supports nested archives: an exe contained in a zip, contained in a tar, contained in a 7z, zip, etc. It can dig deep into such archives.

VirusBattle can also decrypt password-protected archives, either with the default password infected or using a password provided by you. The only caveat is that, for nested archives, all password-protected archives must use the same password.

The design of the HTTP API is driven by the goal of supporting submission and querying of nested archives. VirusBattle treats nested archives like nested directories, and provides a mechanism to traverse the hierarchy until you reach the required binary and query its information. You may also directly access the necessary information using a binary's (or, for that matter, any file's) file identifier.

Legalese

PRIVACY AND ACCESS CONTROL

Though access to VirusBattle is restricted using an API Key, anyone with a valid API Key may have access to all the uploaded files, unpacked binaries, and their analyses. This includes files that were not uploaded by that user.

This means that any data you upload and all the analyses we perform are available to everyone who has access to VirusBattle. Please be aware of this before uploading any data to VirusBattle.

More fine-grained access control may be introduced in future versions.

DISCLAIMERS

The service is provided as is, without any claims of fitness for use. We offer no guarantees or warranties. There is also no assurance of any level of quality of service. The server is experimental and may be taken offline without prior notice.

IP RIGHTS

Access to use this service does not afford a user any rights on the SDK, the analyses results computed by the service, the underlying algorithms, or the implementation.

PROCESSING STEPS

The following diagram represents the flow of computation on VirusBattle server.

When a file is uploaded, VirusBattle places the file in the Upload queue if the file is not already in the database (or if a "--force" upload is requested). A Dispatch process reads the queue, determines the type of file, and passes it to the appropriate handler. An archive file is sent to the Archive Handler, a Windows PE32 executable to the Binary Handler, and other files to an Unknown File Handler (not represented in the diagram).

The Archive Handler decompresses an archive, walks through the directory to find files, and internally 'uploads' them. This leads to recursive decompression of nested archives, and it also puts all the enclosed Windows PE32 executables in the Unpack queue and the Juice queue for processing.

The Unpacker module unpacks binaries in the Unpack queue and, if successful, stores the result in the Virus Battle Mongo database. It also places the generated unpacked file in the Juice queue.

The Juice module uses UL Lafayette's patent pending BinJuice technology to perform semantic reverse engineering of binaries, both uploaded and generated by the unpacker. Upon completion of this step, the system computes a variety of hashes and generates indexes for connecting a malware binary to other binaries with semantically similar code. The data is stored in the Virus Battle Mongo database.

A download request is handled directly by the web server. It retrieves unpacked files from the database and returns them to the user. Query requests are handled by the Query module, which retrieves and returns the requested information.
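The Dispatch step described in this section can be pictured as a small routing loop. The following is an illustrative toy model only, not the server's implementation: it assumes file types are recognized from leading magic bytes (the well-known signatures for zip, 7z, gzip, and tar, and the "MZ" prefix shared by DOS/PE executables), and the names classify and dispatch are hypothetical.

```python
from collections import deque

# Leading signatures of the supported archive formats (zip, 7z, gzip).
ARCHIVE_MAGIC = (b"PK\x03\x04", b"7z\xbc\xaf\x27\x1c", b"\x1f\x8b")


def classify(data):
    """Guess a handler name from the first bytes of a file."""
    if data.startswith(b"MZ"):
        # "MZ" marks a DOS/PE executable; a real check would also parse the PE header.
        return "binary"
    if data.startswith(ARCHIVE_MAGIC):
        return "archive"
    if data[257:262] == b"ustar":
        # tar archives carry the "ustar" marker at offset 257.
        return "archive"
    return "unknown"


def dispatch(upload_queue):
    """Drain the upload queue and route each (name, data) item to a handler."""
    routed = {"archive": [], "binary": [], "unknown": []}
    while upload_queue:
        name, data = upload_queue.popleft()
        routed[classify(data)].append(name)
    return routed


queue = deque([
    ("sample.exe", b"MZ\x90\x00"),
    ("bundle.zip", b"PK\x03\x04rest-of-archive"),
    ("notes.txt", b"plain text"),
])
print(dispatch(queue))
```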

Querying Information

ACTION: QUERY INFORMATION

To query information about a filehash use the following command:

vbclient.py -a query [--norecursive] arg ...

An arg in this command is a filehash. The --norecursive option asks vbclient.py not to recursively query information about all of the children. The following describes the meaning of the children of a file.

Archive Files: The children of an archive file (such as zip) are the set of all files contained in that archive. VirusBattle does not maintain the directory structure of the uncompressed zip file, though it does maintain the relative path within the archive from which a file is extracted. The relative path may be used to match a file on VirusBattle to a specific file in the archive, in addition to the sha1 filehash identifiers.

Binary files: Though a binary file is not an archive, in VirusBattle it may also have children. The children of a binary file are other binaries generated from unpacking it. A binary may have multiple children resulting from unpacking it multiple times. In most cases the children have insignificant differences, yet they yield different hashes and are hence considered different.

The result of the query for a hash consists of a variety of information including its list of parents and children.

COMMON USAGE

1. Query info for filehash only

vbclient.py -a query --norecursive filehash

2. Recursively query info, use filehashes stored in UploadedHashes.txt

vbclient.py -a query

3. Check if a file is already on VirusBattle.

filehash=`sha1sum filename | cut -f 1 -d ' '`
vbclient.py -a query --norecursive $filehash

4. Recursively query all information for a file.

vbclient.py -a query <sha1>

Scalability

In general, computational cost depends on the algorithm, the implementation, the underlying hardware, and the dataset. The times provided below are ballpark times for analyzing a binary of around 650 KB (the 95th percentile in our collection).

These times are for an unoptimized implementation on modest hardware: a workhorse machine with 4 cores and 8 GB of memory, and separate machines for the database server and the web server.

• Unpacking: 15 seconds (sometimes 3 minutes)

• Semantic Reverse Engineering: 15 seconds

• Searching for similar binaries: around 15 seconds per binary, on a database of over 30,000 samples (with a naive algorithm)

• Searching for similar procedures: around 10 ms per procedure, on a database of over 30,000 samples

The implementations are unoptimized in that they do not take advantage of certain inherent parallelism in the computations. Thus, the above time estimates leave ample room for improvement through better hardware and improved software architecture.
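As a rough worked example of what these figures imply for a batch, the sketch below adds up the typical per-binary times. It assumes strictly sequential processing and ignores the occasional 3-minute unpacking outlier; the workers parameter is a hypothetical knob that assumes perfect linear scaling, which a real deployment would not achieve.

```python
# Ballpark per-binary times from the list above, in seconds.
UNPACK_SECONDS = 15
JUICE_SECONDS = 15
SEARCH_SECONDS = 15


def estimated_hours(num_binaries, workers=1):
    """Estimate wall-clock hours to unpack, juice, and search a batch."""
    per_binary = UNPACK_SECONDS + JUICE_SECONDS + SEARCH_SECONDS
    return num_binaries * per_binary / workers / 3600


print(estimated_hours(1000))  # 1000 * 45 s = 45,000 s = 12.5 hours
```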

Semantic Matching

ACTION: SEARCH FOR SIMILAR BINARIES

To search for similar binaries, use:

vbclient.py -a matches [--threshold] [--fullmatrix] [--outdir] arg1 arg2 ...

As before, arg1, arg2 are sha1 hashes of files that must be either binary.pe32 or binary.unpacked. The command doesn't provide similarity for archive files. The binaries included in an archive may be found using the vbclient.py -a query command.

One may consider the matches command similar to a Google query. It gives a list of all the binaries similar to the arg, up to a given similarity threshold, where threshold is a real number between 0 and 1. If threshold is not provided, a default value is used.

By default the matches command returns only the sha1s that are higher than the arg. This default serves the use case where a user wishes to compute the similarity matrix between a large collection of binaries. In such a case, since the similarity matrix is symmetric, it is sufficient to return just the upper diagonal. This option serves the power user who may upload a lot of files and search for similarity between all of them by searching for matches of one sha1 at a time.

The --fullmatrix option may be used should a user desire to receive all of the matches, i.e., sha1s lower and higher than the query sha1.
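To see why the upper half is sufficient, the sketch below rebuilds a full symmetric matrix from per-query results that contain only higher sha1s. It assumes that "higher" means string comparison of the hex hashes and uses short stand-in ids and made-up scores purely for illustration; the function name symmetrize is hypothetical.

```python
def symmetrize(upper_half):
    """Expand {query: {higher_id: score}} into a full symmetric matrix."""
    full = {}
    for query, matches in upper_half.items():
        full.setdefault(query, {})
        for other, score in matches.items():
            # Similarity is symmetric, so record the score in both directions.
            full[query][other] = score
            full.setdefault(other, {})[query] = score
    return full


# Stand-in ids; each query lists only matches that sort after it.
upper_half = {
    "aa": {"bb": 0.9, "cc": 0.7},
    "bb": {"cc": 0.8},
    "cc": {},
}
full = symmetrize(upper_half)
print(full["cc"])
```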

The output of the matches command is saved in the files $outdir/similarity.csv and $outdir/similarity.json. CAUTION: These files are overwritten. So if you want to search for similarity between a lot of files, it is suggested to use the --lf option to give the list of sha1s to be searched.

ACTION: SEARCH FOR SIMILAR PROCEDURES

To search for similar procedures, use:

vbclient.py -a search [--noLibrary] [--limit] sha1/0xrva1 sha1/0xrva2 ...

The search command searches for procedures similar to a given one. A procedure is identified as sha1/0xrva, where sha1 is the sha1 of the binary and rva is the relative virtual address of the procedure in hex format.

The --noLibrary option removes library procedures from the search.

The --limit option can be set to one of two case-sensitive values: High or Low.

High limits the procedure search results to semantically equivalent procedures, that is, procedures with the same juice or with very high similarity only.

Low limits the procedure search results to semantically similar, but not equivalent, procedures. These are usually procedures that share some blocks of juice but not all.

Similar Binaries

ACTION: SEARCH FOR SIMILAR BINARIES

To search for similar binaries, use:

vbclient.py -a matches [--threshold] [--fullmatrix] [--outdir] arg1 arg2 ...

As before, arg1, arg2 are sha1 hashes of files that must be either binary.pe32 or binary.unpacked. The command doesn't provide similarity for archive files. The binaries included in an archive may be found using the vbclient.py -a query command.

One may consider the matches command similar to a Google query. It gives a list of all the binaries similar to the arg, up to a given similarity threshold, where threshold is a real number between 0 and 1. If threshold is not provided, a default value is used.

By default the matches command returns only the sha1s that are higher than the arg. This default serves the use case where a user wishes to compute the similarity matrix between a large collection of binaries. In such a case, since the similarity matrix is symmetric, it is sufficient to return just the upper diagonal. This option serves the power user who may upload a lot of files and search for similarity between all of them by searching for matches of one sha1 at a time.

The --fullmatrix option may be used should a user desire to receive all of the matches, i.e., sha1s lower and higher than the query sha1.

The output of the matches command is saved in the files $outdir/similarity.csv and $outdir/similarity.json. CAUTION: These files are overwritten. So if you want to search for similarity between a lot of files, it is suggested to use the --lf option to give the list of sha1s to be searched.

Similar Procedures

ACTION: SEARCH FOR SIMILAR PROCEDURES

To search for similar procedures, use:

vbclient.py -a search [--noLibrary] [--limit] sha1/0xrva1 sha1/0xrva2 ...

The search command searches for procedures similar to a given one. A procedure is identified as sha1/0xrva, where sha1 is the sha1 of the binary and rva is the relative virtual address of the procedure in hex format.
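The sha1/0xrva identifier can be validated and split before it is passed to the client. This is a minimal sketch under the assumption that the sha1 part is 40 hex digits and the rva is a 0x-prefixed hex number; the helper names and the example address 0x401000 are hypothetical.

```python
import re

# 40 hex digits, a slash, then a 0x-prefixed hex address.
PROCEDURE_ID = re.compile(r"^([0-9a-fA-F]{40})/0x([0-9a-fA-F]+)$")


def parse_procedure_id(text):
    """Split 'sha1/0xrva' into (sha1, rva as int); raise ValueError if malformed."""
    match = PROCEDURE_ID.match(text)
    if match is None:
        raise ValueError("expected sha1/0xrva, got %r" % text)
    return match.group(1).lower(), int(match.group(2), 16)


def format_procedure_id(sha1, rva):
    """Build the 'sha1/0xrva' form from a hash and an integer address."""
    return "%s/0x%x" % (sha1, rva)


sha1, rva = parse_procedure_id("5fb25ea31fdb45377ccc6f0b7542d537d73b71d2/0x401000")
print(sha1, rva)
```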

TODO: Describe --noLibrary and --limit option.

ACTION: SHOW LIST OF PROCEDURES IN A BINARY

Before searching for similar procedures, you need the list of procedures. The list of procedures in a binary may be queried using the following command.

vbclient.py -a show arg

The show command takes the sha1 of the binary as arg. It produces the output in json format.

ACTION: GET CONTENT OF PROCEDURES

Once similar procedures are found, a user may wish to compare their code. You can get the code of a procedure using the following command.

vbclient.py -a show [--noLibrary] [--limit] sha1/0xrva1 sha1/0xrva2 ...

The show command gives quite a bit of information for each procedure. For each block of a procedure it gives its code, semantics, generalized code, and generalized semantics. In addition, it also gives the strings accessed in a procedure and the Windows APIs referenced in the procedure.

Strings

Getting Started

Accessing VirusBattle requires downloading and setting up the VirusBattle SDK. See Installation, Setup, Registration to set it up.

Extracting Strings with VirusBattle

VirusBattle provides several fully automated semantic reverse engineering services. The service that extracts strings from PE-32 binaries is called srlStatic. To extract strings from an x86 binary, all you need to do is upload a PE-32 executable, either as is or as part of a compressed archive. Wait a few seconds, then download the result files.

Uploading to VirusBattle

See Uploading Files for a detailed HowTo. The easiest way to upload to VirusBattle is:

vbclient.py -a upload <path to file>

Checking Status

To find out if the uploaded file has been processed or not:

vbclient.py -a status <sha1 of uploaded file>

You may also want to use the Query action for details:

vbclient.py -a query <sha1 of uploaded file>

Downloading Strings File

To download results of VirusBattle, use the Download action:

vbclient.py -a download <sha1 of uploaded file>

This downloads VirusBattle service result files into the ./Results folder. To avoid downloading results from other services (srlJuice, etc.), set the appropriate VIRUSBATTLE_SERVICE_FILTER as described below.

Generate Mapping between PE File and Strings File

To generate service maps, use the map action:

vbclient.py -a map <sha1 of uploaded file>

This creates csv map files in the ./Results directory containing original_file_sha1,result_file_sha1. In the case of the srlStatic service, the service produces 3 output files: Callgraph, APIFlowGraph and Strings, which makes the map file tricky to interpret. The files are named result_file_sha1.callgraph.dot, result_file_sha1.apiflowgraph.json, and result_file_sha1.strings.json. This is helpful, but still difficult. We recommend processing the vb-srlStatic.map file produced by the above command in the ./Results directory, where you also downloaded the srlStatic result files, as follows:

VB_STRING_FILES=`ls|grep strings|awk -F. -vORS='\\\|' '{print $1}'|head -c -2`
cat vb-srlStatic.map | grep $VB_STRING_FILES > vb-string.map
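The same filtering can be done in Python where the shell tools above are unavailable. This sketch assumes the map file holds comma-separated original_file_sha1,result_file_sha1 rows and that strings results are named result_file_sha1.strings.json, as described above; unlike the grep pipeline, it matches only on the second column. The function name filter_string_map is hypothetical.

```python
import os


def filter_string_map(results_dir, map_name="vb-srlStatic.map",
                      out_name="vb-string.map"):
    """Keep only the map rows whose result file is a strings file.

    Returns the number of rows written to out_name.
    """
    # Collect the sha1 prefixes of all downloaded *.strings.json files.
    string_ids = {
        name.split(".")[0]
        for name in os.listdir(results_dir)
        if name.endswith(".strings.json")
    }
    kept = []
    with open(os.path.join(results_dir, map_name)) as source:
        for line in source:
            fields = [field.strip() for field in line.split(",")]
            if len(fields) >= 2 and fields[1] in string_ids:
                kept.append(line)
    with open(os.path.join(results_dir, out_name), "w") as target:
        target.writelines(kept)
    return len(kept)
```

Calling filter_string_map("Results") after a download and a map run would write Results/vb-string.map with just the PE-file-to-strings-file pairs.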

Filter Other Services

If you are only interested in results from this service and want to filter out results from other VirusBattle services (srlUnpacker, srlJuice, srlSimService, etc.), set the appropriate value for the VIRUSBATTLE_SERVICE_FILTER environment variable.

export VIRUSBATTLE_SERVICE_FILTER="srlUnpacker,srlSimService,srlJuice"

The variable accepts a case-sensitive, comma-separated list of service names to filter out. You can filter out as many or as few services as you choose.
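When the client is launched from a script rather than an interactive shell, the variable can be set in the child process environment instead. A minimal sketch follows; the helper name service_filter_env is hypothetical, and it assumes vbclient.py reads the variable from its environment exactly as with the export above.

```python
import os


def service_filter_env(excluded_services, base_env=None):
    """Return a copy of the environment with VIRUSBATTLE_SERVICE_FILTER set."""
    env = dict(os.environ if base_env is None else base_env)
    # Service names are case-sensitive and comma-separated, with no spaces.
    env["VIRUSBATTLE_SERVICE_FILTER"] = ",".join(excluded_services)
    return env


env = service_filter_env(["srlUnpacker", "srlSimService", "srlJuice"])
print(env["VIRUSBATTLE_SERVICE_FILTER"])
```

The returned dictionary can be passed as the env argument of subprocess.run when invoking vbclient.py -a download, which leaves the parent shell's environment untouched.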

See Also

• Quick Start Guide

• vbClient.py Command Line

• File Identifier and Formats

Unpacking

Getting Started

Accessing VirusBattle requires downloading and setting up the VirusBattle SDK. See Installation, Setup, Registration to set it up.

Unpacking with VirusBattle

VirusBattle provides a fully automated generic unpacking service. All you need to do is upload a PE-32 executable, either as is or as part of a compressed archive. Wait a few seconds, then download the unpacked file.

Uploading to VirusBattle

See Uploading Files for a detailed HowTo. The easiest way to upload to VirusBattle is:

vbclient.py -a upload <path to file>

Checking Status

To find out if the uploaded file has been processed or not:

vbclient.py -a status <sha1 of uploaded file>

You may also want to use the Query action for details:

vbclient.py -a query <sha1 of uploaded file>

Downloading Unpacked File

To download results of VirusBattle, use the Download action:

vbclient.py -a download <sha1 of uploaded file> --enable_malware_download

This downloads VirusBattle service result files into the ./Results folder. To avoid downloading results from other services (srlStatic, srlJuice, etc.), set the appropriate VIRUSBATTLE_SERVICE_FILTER as described below.

Generate mapping between packed and unpacked file

To generate service maps, use the map action:

vbclient.py -a map <sha1 of uploaded file>

This creates csv map files in the ./Results directory containing original_file_sha1,unpacked_file_sha1.

Filter Other Services

If you are only interested in unpacking and want to filter out results from other VirusBattle services (srlJuice, srlStatic, srlSimService, etc.), set the appropriate value for the VIRUSBATTLE_SERVICE_FILTER environment variable.

export VIRUSBATTLE_SERVICE_FILTER="srlJuice,srlSimService,srlStatic"

The variable accepts a comma-separated list of service names to filter out. You can filter out as many or as few services as you choose.

See Also

• Quick Start Guide

• vbClient.py Command Line

• File Identifier and Formats

Uploading Files

ACTION: UPLOAD FILES

The following summarizes the command to upload a file to VirusBattle for analysis. (The syntax assumes that vbclient.py is on your command search PATH.)

vbclient.py -a upload [-p password] [--norecursive] [-f] [--lf listfile] arg ...

The CLI parameter -a upload selects the upload action. The -p, --norecursive, -f, and --lf arguments are optional. Their default values may be obtained by invoking vbclient.py with -h (or --help).

For this command an arg may be a file or a directory. The program can take a list of arguments. At least one arg must be provided for the system to do anything.

The meaning (or effect) of each optional argument is as follows.

-p password: This option is used to provide the password for unzipping .zip and .7z archives.

--norecursive: When an arg is a directory, the --norecursive option may be used to direct vbclient.py not to traverse it recursively. In the absence of the --norecursive option, an arg directory is traversed recursively and all files found are uploaded.

-f: VirusBattle, by default, does not unpack a binary that has been unpacked before. The -f option instructs VirusBattle that the executables being submitted ought to be re-unpacked, even if they have previously been unpacked. This option is currently provided to compensate for potential errors that may terminate the processing prematurely. We advise that -f be used only if you have reason to believe that the previous analysis was faulty.

--lf listfile: vbclient.py stores the file identifiers of the uploaded files in the listfile. These file identifiers may then be used for querying VirusBattle for its analysis. To disable this capability, use --lf /dev/null. The default listfile is UploadedHashes.txt.

COMMON USAGE

1. Upload a file (exe or archive, such as zip file)

vbclient.py -a upload <filename>

2. Upload a password protected zip file (with password mypassword)

vbclient.py -a upload -p mypassword <filename.zip>

3. Upload files in a directory (one level deep)

vbclient.py -a upload --norecursive <directoryname>

4. Upload all files in a directory, recursively.

vbclient.py -a upload <directoryname>

5. Upload multiple files and directories

vbclient.py -a upload <file1> <dir2> <file3> <file4>

6. Upload files, save file hashes returned in UploadedHashes.txt

vbclient.py -a upload --list-file UploadedHashes.txt <dir1> <file1> <dir2>

7. Upload and force analysis of the binary even if it was analyzed earlier

vbclient.py -a upload --force <filename>

8. Force analysis of a binary that was analyzed earlier (without uploading again)

vbclient.py -a reprocess <sha1>
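The difference between usages 3 and 4 can be made concrete by listing which files each mode would pick up. The sketch below is only an illustration of the documented behavior (one level deep with --norecursive, full traversal otherwise), not the client's actual code; the function name files_to_upload is hypothetical.

```python
import os


def files_to_upload(args, recursive=True):
    """List the files an upload of the given file/directory args would cover."""
    found = []
    for arg in args:
        if os.path.isfile(arg):
            found.append(arg)
        elif os.path.isdir(arg):
            if recursive:
                # Walk the whole tree below the directory.
                for root, _dirs, names in os.walk(arg):
                    found.extend(os.path.join(root, name) for name in sorted(names))
            else:
                # One level deep: only files directly inside the directory.
                for name in sorted(os.listdir(arg)):
                    path = os.path.join(arg, name)
                    if os.path.isfile(path):
                        found.append(path)
    return found
```

Running it with recursive=False before a large upload is a cheap way to preview how much the --norecursive option would exclude.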

Client Reference

REFERENCE: Command Line

vbSDK contains the program vbclient.py, which provides access to its CLI as well as the library. When executed as a program, it provides command line capability for uploading, querying, and downloading. When used as a package, it may be integrated with your own internal systems to create a seamless automated malware analysis pipeline.

The following shows the help message printed by vbclient.py. The client is a Swiss Army knife, providing access to all of the VirusBattle services. As a result the program has a variety of options, though not all are meaningful for all actions.

Usage: vbclient.py [options] [arg]

General Options:
  -h, --help            show this help message and exit.
  -f, --force           Force resubmission, if the file already exists.
  -p PASSWORD, --password=PASSWORD
                        Password for Zip and 7z encrypted archives.
  -a ACTION, --action=ACTION
                        Action to perform. One of upload, reprocess, query,
                        download, map, status, show, search, matches,
                        myuploads. Default is: upload
  -o OUTDIR, --outdir=OUTDIR
                        Directory to save downloaded files. Default is:
                        ./Results
  --norecursive         Do not recursively visit children nodes.
  --test                Do a test run, don't actually upload.
  --lf=LISTFILE, --list-file=LISTFILE
                        File to keep list of filehashes that are uploaded.
                        Default is: UploadedHashes.txt
  --loglevel=LOGLEVEL   Select log level. One of: info, debug, warn, error.
                        Default is: warn.
  -v, --verbose         Verbose output.
  --enable_malware_download
                        Download Malware files. Malware download is disabled
                        by default.
  --downloadall         Download all files in the tree. By default only
                        analyses output files are downloaded.
  --zipbinary           Download binary files as zip. Default- as .exe file.

Show or Search Action related options:
  --xl, --noLibrary     eXclude Library functions from juice and similarity
                        responses
  --fullmatrix          Get full matrix search; default upperhalf only
  --threshold=THRESHOLD
                        Threshold for similarity matching
  -l LIMIT, --limit=LIMIT
                        limit similarity search results to semantically
                        equivalent (High similarity) or semantically similar
                        procedures (Low similarity) only.

USAGE GUIDES

• Upload Files

• Download Results

• Query Relation between Upload and Download

• Perform Similarity Searches at procedure and/or binary level

CHAPTER 7

Indices and tables

• genindex

• modindex

• search
