CYBER FORENSICS Module 2

Introduction to Cyber forensics

Interrelation among Cybercrime, Cyber Forensics and Cyber Security - Historical background of Cyber Forensics - Cyber Forensics – Definition, Need, Objectives, Computer Forensics Investigations, Steps in Forensic Investigation, Forensic Examination Process, Classification of Cyber Forensics, Benefits of Cyber Forensics, Incident and Incident Handling - Computer Security Incident Response Team

______________________________________________________________________________

Interrelation among Cybercrime, Cyber Forensics and Cyber Security

Cybercrime – any criminal offence that involves a computer, a network or an ECD

Cyber forensics / Computer forensics – focuses on the investigation of cybercrimes

It deals with the acquisition, analysis and admissibility of digital evidence from a computer or any ECD after a cybercrime has occurred

The evidence gathered is used in criminal proceedings

An attempt is made to determine what happened to the digital media as a result of the incident

Cyber security – refers to the technologies, processes and practices designed to protect networks, computers, programs and data from attack, damage or unauthorized access

It identifies the vulnerabilities that exist in networks, computers, programs and data, and patches the loopholes

So cyber forensics is a response to cybercrime: it investigates whether anything adverse has happened and determines the source of the incident

Cyber security, in contrast, prevents incidents of cybercrime through the implementation of security measures

The three are interrelated: every cybercrime drives the cyber forensics team and the cyber security team to work in tandem, to respond to and to prevent cybercrimes

In a computing context, security can be viewed from different perspectives

The security of any organization has 4 dimensions: IT security, physical security, financial security and legal security

IT Security:

Application security – applications should be developed so as to overcome vulnerabilities and threats

Computing security – an efficient security policy should be in force so as to avoid threats

Data security – the organization's data should be protected from unauthorized manipulation, theft and loss, and its secrecy should be maintained

Information security – ensures the confidentiality, integrity and availability of information

Network security – the organization's network should be secure enough to facilitate safe data transfer

Physical Security:

Facilities security – all equipment within the organization must be secured from physical damage, system crashes and power failure

Human security – employees within the organization must be given security training so that they are aware of the security process

Financial Security:

The organization must adopt appropriate measures to be financially immune to threats from insiders and outsiders

Legal Security:

National security – involves checking for any lapse in security from threats that arise out of nationwide issues

Public security – security from threats associated with societal issues such as riots, strikes or clashes

CYBER FORENSICS

It is an electronic discovery technique used to determine and reveal technical criminal evidence

It involves the extraction of electronic data for legal purposes

It is also known as computer forensics

Definition:

“Computer forensics is the study of evidence from attacks on computer systems in order to learn what has occurred, how to prevent it from recurring and the extent of damage”

“Forensic computing is the process of identifying, preserving, analyzing and presenting digital evidence in a manner that is legally acceptable”

Other definitions exist along similar lines

Need for Cyber forensics

1. Traditional approaches such as fingerprinting and DNA extraction are insufficient to prove a cyber incident, or they end in deadlock during an investigation

2. With the advent of the internet, offences and crimes now span a diverse range, from hacking to cyber terrorism, so it is essential to curb and control them

3. Cybercrime has changed the mode of operation of crime, hence the investigation must be performed by a cyber forensics body rather than the regular crime branch

4. Cybercrime spreads across boundaries in no time, so it is necessary to address incidents that conflict with legal provisions

5. Cyber forensics is essential to prosecute a criminal if a compromise of some sort is observed

Objectives of Cyber forensics

1. To identify the evidence associated with a malicious activity in a short span of time

2. To recover and analyze the evidence and related materials from computers and ECDs

3. To present the collected evidence in a court of law

4. To estimate the potential impact of malicious activity

5. To assess the intention and identity of the offender

Computer/Cyber Forensics Investigation

A cybercrime investigation works in phases:

First phase: preliminary analysis – the forensic investigator gathers information on the crime scene

Second phase: forensic copy acquisition and recovery

Third phase: detailed analysis and preparation of a comprehensive report

To do this, the forensic investigator should have extensive knowledge of this area and highly specialized skills. The evidence has to be gathered from ECDs

Steps in Forensic Investigation

Cyber forensics should ensure the integrity of the evidence while handling and analyzing it, so that the evidence is admissible in court

Steps are:

1) The investigation starts when a crime is reported or a complaint is received

2) In response to the complaint, the following are done:

a. If evidence has to be gathered from a third party, a notice is served

b. If it is a criminal offence, a First Information Report (FIR) is filed

c. A search warrant (if required) is obtained from the court

3) First responder / Computer Emergency Response Team (CERT) procedures are performed

4) Evidence is seized from the crime scene after photographing the scene and marking the evidence

Necessary documentation is done

Witnesses present during the seizure of evidence, and the suspect, can be interviewed

If there are any complications in evidence collection, or the investigation officer does not possess evidence collection expertise, a third-party expert may be called in

The chain of custody has to be documented

5) The collected evidence is numbered and securely transported to the forensic laboratory for analysis

6) The following are done at the forensic lab:

a. Two bit-stream copies of the evidence are made. The hash values of the original and the forensic copies are verified

b. The chain of custody is maintained

c. The original evidence is stored in a secured location

d. The forensic copy is analysed for evidence

e. A forensic report is prepared stating the methods and recovery tools used, the potential evidence and the findings

f. The report is presented to the client

7) In some cases the forensic investigator may be called to testify in court as an expert witness
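The hash verification in step 6a can be sketched in a few lines of Python. This is only an illustration, not the procedure of any particular forensic tool: the function names file_hash and copies_match are hypothetical, and SHA-256 is an assumed choice, since the notes do not name a hash algorithm.

```python
import hashlib

def file_hash(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks so that
    large disk images do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(original_path, copy_paths, algorithm="sha256"):
    """True only if every forensic copy has the same digest as the original."""
    reference = file_hash(original_path, algorithm)
    return all(file_hash(path, algorithm) == reference for path in copy_paths)
```

If copies_match returns False for either bit-stream copy, that copy differs from the original and should not be used for analysis.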

The tasks performed by the forensic investigator are as follows:

a. Determine the extent of crime and damage caused due to it

b. Recover the data to be investigated from ECDs

c. Collect the evidence from ECDs in a forensically sound manner

d. Ensure the integrity of the evidence

e. Analyse the evidence

f. Consider all possible conclusions of investigations

g. Prepare a forensic report

h. Testify in court if required

Forensic Examination Process

The following steps are required to reconstruct the technical aspects of the data, to analyse computer usage in order to prove a crime, to examine residual data and to authenticate data by technical analysis

Identification – attempts to determine what evidence is present, where and how it is stored, and the context of the evidence, either physical (the hardware and software components of the disk drive) or logical (the location of the evidence in the drive)

The procedure used to locate the evidence should be documented

Acquisition – it is required for an incident that has already occurred

The tools and strategy used for acquisition vary with the incident, the type of information on the ECD and its format

Extraction – the forensic investigator extracts data from the ECD. Volatile data can be lost, so a copy of it is made and compared with the original

Preservation – the integrity of the original evidence has to be preserved; this is ensured by creating a forensic copy for analysis

Evaluation – attempts to ascertain whether the evidence is relevant to the case. Irrelevant information may be filtered out to avoid confusion

Interpretation – what is found during analysis should be interpreted in an easily understandable way

Presentation – the suitability of the evidence with respect to the case has to be presented before the court. Documentation has to be prepared, covering the chain of custody and the evidence analysis

Methods employed in Forensic Analysis

Data recovery

o Recovering and analyzing deleted files that have not been overwritten

o Carving out portions of text from the unallocated and slack space

String and keyword searching

o Attempts to identify readable text within a binary file, or a specific string within a file

o The search covers known and unknown files as well as unallocated and slack space

Volatile evidence analysis

o Gives details about the state of the system by looking into the connections, processes and cache tables gathered from the RAM

Timeline analysis

o Attempts to create a timeline of events, based on the modified, accessed and changed times associated with the files that are imaged

System file analysis

o Reveals any unauthorized changes made to system binaries
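String and keyword searching can be illustrated with a minimal Python sketch that pulls runs of printable ASCII out of raw bytes, much as a strings-style utility would, and then filters them by keyword. The function names and the minimum length of 4 are assumptions made for the example; real forensic tools also handle other encodings such as UTF-16.

```python
import re

def extract_strings(data: bytes, min_length: int = 4):
    """Yield (offset, text) for each run of printable ASCII bytes
    that is at least min_length characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

def keyword_hits(data: bytes, keywords, min_length: int = 4):
    """Return the extracted strings that contain any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        (offset, text)
        for offset, text in extract_strings(data, min_length)
        if any(keyword in text.lower() for keyword in lowered)
    ]
```

The same functions work on a whole disk image or on a block of unallocated space, because they operate on raw bytes rather than on files.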

Benefits of Cyber forensics

An ECD subjected to forensic examination is protected from alteration, damage, data corruption and viruses

Files, hidden files and password-protected files are discovered, and deleted data is recovered from the ECD

The contents of hidden files and of swap files used by application programs as well as the operating system are revealed

The contents of password-protected and encrypted files are recovered using tools

All possible relevant data present in special areas of the disk is analysed

All possible relevant files are discovered

Expert consultation and testimony are offered when required

Classification of Cyber forensics

1. Disk forensics

2. Network forensics

3. Wireless forensics

4. Database forensics

5. Malware forensics

6. Mobile device forensics

7. GPS forensics

8. Email forensics

9. Memory forensics

Disk forensics

It is the process of extracting forensic information from storage media such as hard disks, USB drives, CDs, DVDs, flash drives and floppy disks

Steps in disk forensics are:

Identification of evidence – locates the source of evidence at the crime scene

Seizure and acquisition of evidence – at the crime scene, after seizure, the hash value of the original evidence in the storage medium is computed using a forensic tool

The hash value is stored, and the evidence is packed and sealed

Acquisition is the process of taking a bit-by-bit copy of the original evidence, which is itself write-protected. This is done in the forensic lab

Authentication and analysis of evidence – this is done at the forensic lab, where the hash values of the original media and the forensic copy are compared to make sure that they are the same

Preservation of evidence – after acquisition and authentication, the evidence is kept in a place that is secured from magnetic and other radiation sources

Analysis of evidence – the process of collecting the evidence from the storage media

Report on findings – a case analysis report is prepared that includes all the details of examination, analysis and authentication. It should also include the observations of the examiner

Documentation – every activity at each step is documented to make the case admissible in court

Disk Forensics challenges:

1. Examiners usually use a text search utility to find keywords that would serve as evidence

It becomes impossible to gather evidence this way if a) the keyword is misspelt, b) the files in the media are encrypted or c) the text is stored as a graphic

Certain graphic files will open only if they are extracted from the image file and opened with the respective software

In such cases it is the responsibility of the examiner to look for other ways to gather evidence rather than concluding that the evidence does not exist

2. Hidden files, encrypted files, files with disguised names, files whose extensions have been altered and hidden areas in the storage media provide room for hiding evidential data unless specialized tools are used to analyse them
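One common way to catch a file whose extension has been altered is to compare the extension with the signature ("magic bytes") at the start of the file. The sketch below assumes a deliberately tiny signature table and a hypothetical function name, extension_mismatch; a real tool would use a far larger signature database.

```python
from pathlib import PurePath

# A few well-known file signatures mapped to the extensions they normally carry.
MAGIC_SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"%PDF-": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
}

def extension_mismatch(filename, header):
    """Compare a file name's extension with the first bytes of its content.

    Returns True if the content matches a known signature but the extension
    does not, False if they agree, and None if the signature is unknown."""
    extension = PurePath(filename).suffix.lower()
    for magic, expected in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return extension not in expected
    return None
```

A result of True, for example a JPEG signature under the name notes.txt, marks the file for closer inspection; a result of None says nothing either way.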

Network forensics

Refers to the capture, recording and analysis of network events so as to discover any malicious activity, security attack or other violation

It finds application in cases relating to hacking, fraud, data espionage, data theft, defamation, narcotics trafficking, credit card cloning, software piracy, sexual harassment etc.

Tools for analysis:

1) Intrusion detection system (IDS) – monitors networks and the systems on them for malicious activity or policy violations and maintains a record of the activity

Any such activity is reported to the network administrator

2) Logging gathers and records the activity on a network with the help of the IDS, which can help in tracking an offender or hacker

3) Packet capturing tools can gather and record every bit exchanged between any two designated hosts. Since these tools generate a large amount of data in a short span of time, they are impractical for capturing data over longer periods

4) A NetFlow data collector gathers and records data about every network connection, e.g. the source, the destination and the volume of data. Since it captures only a summary, it can be used to gather data over longer periods
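The difference between full packet capture and a flow summary can be seen in a short Python sketch. It reduces a list of packet records to one summary record per source and destination pair, which is the general idea behind a flow collector. The record layout and the name summarize_flows are assumptions for illustration and do not follow the actual NetFlow export format.

```python
from collections import defaultdict

def summarize_flows(packets):
    """Reduce packet records to per-connection summaries.

    packets: iterable of (timestamp, source, destination, size_in_bytes).
    Returns a dict keyed by (source, destination)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "first": None, "last": None})
    for timestamp, source, destination, size in packets:
        record = flows[(source, destination)]
        record["packets"] += 1
        record["bytes"] += size
        record["first"] = timestamp if record["first"] is None else min(record["first"], timestamp)
        record["last"] = timestamp if record["last"] is None else max(record["last"], timestamp)
    return dict(flows)
```

However many packets a connection carries, it occupies a single record here, which is why summary data can be retained for much longer than raw captures.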

Network forensic challenges:

1. The network generates a large volume of data every day, so it is tedious to search for evidence

2. The inherent anonymity of internet protocols is the biggest challenge in identifying the source of an incident: the MAC address at the data link layer, the IP address at the network layer and the email address at the application layer can all be spoofed

3. Single-purpose tools for collecting, filtering and stream reassembly from applications, routers and firewalls are insufficient to figure out the network activity. Raw network packets should be captured to gather the highest level of traffic detail; this is possible with sniffing

4. Sessioning is the act of assembling the raw packets exchanged between specified points into a complete stream, which helps in gathering information about a specific communication. Protocol analysis tools can be used to produce a tree-oriented view of sessions, and such a visual presentation gives a clear picture of what happened on the network
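Sessioning, as described in point 4, can be sketched as grouping captured packets by their two endpoints and joining the payloads of each direction in sequence order. The packet dictionaries and the name reassemble_sessions are assumptions for this example; the sketch ignores retransmissions, gaps and overlapping segments, which real stream reassembly has to handle.

```python
from collections import defaultdict

def reassemble_sessions(packets):
    """Group packets into sessions and rebuild each direction's byte stream.

    packets: iterable of dicts with keys src, dst, seq and payload, where
    src and dst are (ip, port) tuples and payload is bytes.
    Returns {session_key: {sender: stream_bytes}}."""
    sessions = defaultdict(lambda: defaultdict(list))
    for packet in packets:
        # Sort the endpoints so both directions share one session key.
        key = tuple(sorted((packet["src"], packet["dst"])))
        sessions[key][packet["src"]].append((packet["seq"], packet["payload"]))
    streams = {}
    for key, directions in sessions.items():
        streams[key] = {
            sender: b"".join(payload for _, payload in sorted(chunks, key=lambda c: c[0]))
            for sender, chunks in directions.items()
        }
    return streams
```

Packets that arrive out of order are placed back in sequence, so the result reads as one conversation per pair of endpoints.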

Wireless forensics

It is associated with network forensics

It involves capturing the data moving over the network and analyzing network events so as to uncover network anomalies, discover the source of security attacks and investigate breaches on computers and wireless networks

The evidence collected can be plain data or, with the broad usage of Voice over IP technologies, especially over wireless networks, can include voice conversations

Traffic analysis in wireless networks involves following stages:

1. Data normalization and mining to search through the data

2. Traffic pattern recognition for identifying suspect patterns

3. Protocol dissection for analyzing the header fields

4. Reconstruction of application sessions for visualization

Forensic tools: Network forensic analysis tools (NFATs) are available for network forensics, but there is no equivalent specifically for wireless networks. The following tools are used:

1. The graphical Wireshark protocol dissector is used to inspect every field of a captured frame

2. ngrep is used to search for specific strings in the contents of frames

3. The text-based tcpdump or tshark sniffers are used to automate and script certain analysis tasks, e.g. filtering traffic based on specific conditions

Database forensics

Malware forensics

Mobile device forensics

GPS forensics

Email forensics

Memory forensics

Chapter 5

Introduction to Cyber Forensics

© Oxford University Press 2018. All rights reserved.

Outline

• Interrelation among Cybercrime, Cyber Forensics and Cyber Security

• Cyber Forensics

• Disk Forensics

• Network Forensics

• Wireless Forensics

• Database Forensics

• Malware Forensics


Outline (Cont…)

• Mobile Forensics

• GPS Forensics

• Email Forensics

• Memory Forensics

• Building Forensic Computing Lab

• Incident and Incident Handling

• Computer Security Incident Response Team


Interrelation among Cybercrime, Cyber Forensics, and Cyber Security

• Cybercrime: Any criminal offence that involves a computer/network.

• Computer forensics: Focuses on the investigation of cybercrimes.

• Cyber security: Prevents cybercrime with the implementation of security measures.


Cyber Forensics

• Cyber Forensics

• Electronic discovery technique.

• Determines and reveals technical criminal evidence.

Definition

• Computer forensics is the study of evidence from attacks on computer systems in order to learn what has occurred, how to prevent it from recurring and the extent of damage.

- McGraw-Hill Dictionary of Scientific and Technical Terms


Cyber Forensics (Cont…)

Need

• Traditional approaches are either insufficient or end in deadlock.

• Wide range of cyber offences and crimes.

• Perpetrators of cybercrimes have changed the modus operandi.

• Cybercrime can spread across boundaries in no time.

• Integrity and existence have to be ensured.



Objectives

• To identify the evidence.

• To recover and analyze the evidence and related materials.

• To present the evidence in a court of law.

• To estimate the impact of the malicious activity.

• To assess the intention and identity of the offender.



Computer Forensics Investigations



Steps in Forensics Investigations



Forensic Examination Process

• Identification – attempts to determine the evidence present

• Acquisition

• Extraction

• Preservation

• Evaluation

• Interpretation

• Presentation



Methods Employed in Forensic Analysis

• Data recovery: Recovering and analyzing deleted files.

• String and keyword searching: Identifying readable text or specific string.

• Volatile evidence analysis: Details about the state of the system.

• Timeline analysis: Analysis on modified, accessed and changed times.

• System file analysis: Analysis on any unauthorized changes.



Classification of Cyber Forensics

• Disk Forensics

• Network Forensics

• Wireless Forensics

• Database Forensics

• Malware Forensics

• Mobile device Forensics

• GPS Forensics

• Email Forensics

• Memory Forensics



Benefits of Cyber Forensics

• Protected from alteration, damage, data corruption, and viruses.

• Files, hidden files, and password-protected files are discovered.

• Deleted data is recovered.

• Content of password-protected and encrypted files are accessed.

• Data present in the special area of the disk are analyzed.

• Offers expert consultation and testimony.


Disk Forensics

• Extracting information from storage media.

• Hard disk.

• USB drive.

• CD.

• DVD.

• Flash drive.

• Floppy disk.


Disk Forensics (Cont…)

• Steps in disk forensics:

• Identification of evidence.

• Seizure and acquisition of evidence.

• Authentication and analysis of evidence.

• Preservation of evidence.

• Analysis of evidence.

• Reports on findings.

• Documentation.


Disk Forensics (Cont…)

Challenges

• Text search utility:

• Misspelt keywords.

• Encrypted files.

• Files stored as graphics.

• Difficult to gather evidence:

• Hidden files, files with disguised names, and files whose extensions are altered.

• Hidden areas in the storage media.


Network Forensics

• Capture, recording, and analysis of network events.

• Cases:

• Hacking.

• Fraud.

• Data espionage.

• Data theft.

• Defamation.

• Narcotics trafficking.

• Credit card cloning.

• Software piracy.

• Sexual harassment.


Network Forensics (Cont…)

Tools for Analysis

• Intrusion Detection System (IDS): Monitors networks and systems.

• Logging: Gathers and records the network activity.

• Packet capturing tools: Gather and record every bit exchanged.

• NetFlow data collector: Gathers and records data about every network connection.


Network Forensics (Cont…)

Challenges

• Large volume of data in the order of gigabytes.

• Inherent anonymity of Internet protocols; addresses can be spoofed.

• Single-purpose tools are insufficient to figure out network activity.

• Protocol analysis tools: Produce a tree-oriented view of sessions.


Wireless Forensics

• Capturing the network data and analyzing the network events.

• Goal: To collect and analyze network traffic.

• Stages of traffic analysis:

• Data normalization and mining.

• Traffic pattern recognition.

• Protocol dissection.

• Application session reconstruction.


Wireless Forensics (Cont…)

Forensic Tools

• Graphical Wireshark protocol dissector: Inspects every field of the frame.

• ngrep (network grep): Searches for specific strings.

• Text-based tcpdump / tshark sniffers: Automate and script the analysis of certain tasks.


Wireless Forensics (Cont…)

Challenges

• Radio frequency communication and the complexity of the medium.

• Tracking data during roaming.

• Handling processing overheads and storage.


Database Forensics

• Determine the security breach to a database.

• Sources for database breach:

• Files where the metadata resides.

• Cached data (Internal structures).

• Index files (Logical structures).

Forensic Approaches

• Reactive Approach.

• Proactive Approach.


Database Forensics (Cont…)

Forensic Methodology

• Investigation Preparedness.

• Incident Verification.

• Artifact Collection.

• Artifact Analysis.


Malware Forensics

• Finding the malicious code.

• Determining how it got there and the changes it caused.

• Malware forensic process begins with the examination of the following:

• Master boot record.

• Volatile data.

• System files.

• Hash of the files.

• System programs.

• Auto-start locations.

• Host-based logs.

• File system artifacts.

• Web browsing history.

• Suspected malicious files.


Malware Forensics (Cont…)

Malware Analysis

• Removal of malware.

• Scanning of machine for malware.

• Gathering of data / evidence


Mobile Forensics

• Recovery of digital evidence from mobile devices.

Stages

• Seizure.

• Preparation.

• Legal authority.

• Goals of examination.

• Make, model, and identifying information of device.

• Removable and external data storage.


Mobile Forensics (Cont…)

• Acquisition.

• Evidence examination.

• Presentation and reporting.

Analysis Tools

• Manual extraction.

• Logical extraction.

• Physical extraction.

• Chip-off.

• Micro read.


GPS Forensics

• Recovery of live and deleted data from different navigation devices.


Email Forensics

• Email tracing and email tracking can be achieved with email forensics.

• Tracing: Done when an email header is available.

• Tracking: Done even when no information is available.

Client and Server in Email

• Email clients run programs such as Outlook Express, Eudora, or Pine.

• Servers run specialized operating systems such as Windows Server 2003 or Novell NetWare.

• Servers run programs such as Exchange, GroupWise, or Sendmail.


Email Forensics (Cont…)

Structure of Email

• Header: Email information source.

• Message body: Composed by the user and stored as binary data.

• Attachments: 80% of email data.

Working of Email

• Composed using a mail client (Gmail, Yahoo mail, etc.).

• Client sends the message to a mail transfer agent (MTA).

• MTA is a server that runs the simple mail transfer protocol (SMTP).

• Header information is placed on the top.

• Timestamp is added.

• Recipient accesses the mail server using POP3 or IMAP.



Email Protocols

• Post office protocol (POP).

• Internet message access protocol (IMAP).

• Microsoft’s mail API (MSMAPI).

Examining Email Messages

• Accessing the victim’s computer.

• Retrieving the evidence.

• Investigation

• Look for, open, and copy the evidence in the email along with the header.

• Look for protected and encrypted material.



Viewing Email Headers

• Information in the email header:

• Unique identifying numbers.

• IP address of the sending server.

• Time the mail was sent.

• Headers can be viewed using:

• GUI clients.

• Command-line clients.

• Web-based clients.



Examining Email Headers

• Return path.

• Recipient’s email address.

• Type of sending email service.

• IP address of the server from where the mail has been sent.

• Name of the email server.

• Unique message number.

• Date and time at which the mail has been sent.

• Information related to attached files.
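The header fields listed above can be read programmatically with Python's standard email package. The sketch below is illustrative only: the sample message, its addresses (taken from the reserved documentation ranges) and the name summarize_headers are all made up for the example, and the regular expression assumes that relay IP addresses appear in square brackets inside the Received lines, which is common but not guaranteed.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message; every address here is a documentation placeholder.
SAMPLE_MESSAGE = "\n".join([
    "Return-Path: <sender@example.org>",
    "Received: from mail.example.org (mail.example.org [192.0.2.10]) by mx.example.com; Mon, 05 Mar 2018 10:15:02 +0000",
    "Received: from workstation (unknown [198.51.100.7]) by mail.example.org; Mon, 05 Mar 2018 10:15:00 +0000",
    "Message-ID: <1234@mail.example.org>",
    "Date: Mon, 05 Mar 2018 10:15:00 +0000",
    "From: sender@example.org",
    "To: victim@example.com",
    "Subject: test",
    "",
    "body",
])

IP_PATTERN = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def summarize_headers(raw):
    """Collect the header fields an examiner typically looks at."""
    message = message_from_string(raw)
    received = message.get_all("Received", [])
    return {
        "return_path": message["Return-Path"],
        "recipient": message["To"],
        "message_id": message["Message-ID"],
        "sent": parsedate_to_datetime(message["Date"]),
        # Received lines are listed newest hop first, so the last IP is
        # the one closest to the original sender.
        "relay_ips": [ip for hop in received for ip in IP_PATTERN.findall(hop)],
    }
```

Because any header below the first trusted Received line can be forged by the sender, the extracted values are leads to verify rather than proof of origin.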



Tracing Email Messages

[Figures: ARIN and APNIC]


Email Servers and their Examination

• FINALeMAIL

• Scans email database files.

• Recovers deleted files.

• FTK

• Filters and finds files specific to email client and servers.



Tracking Emails

• Services

• Readnotify.

• DidTheyReadIt.

• getnotify.


Memory Forensics

• Examination of volatile data in a computer’s memory dump.

RAM Artifacts

• Network connections.

• Running process.

• Usernames and passwords.

• Dynamic link libraries.

• Contents of open window.

• Open registry key of process.

• Open files for process.

• Memory resident malware.

Memory Forensics (Cont…)

RAM Analysis

• Tools

• Volatility: Free and open-source.

• HBGary: Proprietary.

Forensic Tools

• Magnet RAM Capture.

• Belkasoft Live RAM Capturer

• MoonSols Dump.

• FTK Imager.


Building Forensic Computing Lab


Requirements:

• A log register should be maintained at the entrance of the lab as a layer of monitoring for protection.

• The lab area should be secured by cipher combination locks to ensure that the chain of custody is maintained.

• The lab should be equipped with fire safety measures.

• The work area should be equipped with necessary infrastructure such as work tables, chairs, and storage capability.

• The evidence storage area should have a strongly constructed metal shelf, be non-destructive, and fire-proof.

• A forensic toolkit should contain disassembly and removal tools, packaging and transport supplies, etc., facilitating the examiner to collect evidence from the crime scene.

Building Forensic Computing Lab (Cont…)

• A forensic lab should have the following: workstations, UPS, book racks with necessary reference materials, necessary software and tools, safe locker, LAN, and Internet connectivity.

• Necessary hardware equipment.

• Necessary software.

• Internet connectivity with sufficient bandwidth for the workstations is necessary.

• Multiple forensic tools as required.


Incident and Incident Handling

Incident

• An event or set of events that threaten the security of computing systems and the network.

Incident Handling


Incident and Incident Handling (Cont…)

Incident Reporting

• Report the incident to the CERT Coordination Center, a law enforcement agency or CSIRT.

Incident Response

• Ascertain the affected resources.

• Assess the incident.

• Assign a unique identity to the event.

• IIC coordinates with the task force.

• Collect the information related to the evidence.

• Perform forensic analysis.


Computer Security Incident Response Team

• Service organization.

• Receives and reviews reports.

• Responds to computer security incidents.

• Members:

• Incident investigator and coordinator (IIC).

• Incident liaison (IL).

• Senior system manager.

• Information system security officer.

Forensic Readiness

• Incident response procedures in place along with trained personnel to handle any investigation.


Using Data Mining Techniques in Cyber Security Solutions

Data mining is the process of identifying patterns in large datasets. Data mining techniques are heavily used in scientific research (in order to process large amounts of raw scientific data) as well as in business, mostly to gather statistics and valuable information to enhance customer relations and marketing strategies.

Data mining has also proven a useful tool in cyber security solutions for discovering vulnerabilities and gathering indicators for baselining.

The process of data mining

What is data mining? In general, it is a process that involves analyzing information, predicting future trends, and making proactive, knowledge-based decisions based on large datasets.

While the term data mining is usually treated as a synonym for Knowledge Discovery in Databases (KDD), it’s actually just one of the steps in this process. The main goal of KDD is to obtain useful and often previously unknown information from large sets of data.

The entire KDD process includes four steps:

Pre-processing – selecting, cleaning, and integrating data

Transformation – transforming information and consolidating it into forms appropriate for mining

Mining – collecting, extracting, analyzing, and statistically processing data

Pattern evaluation – identifying new and unusual patterns and presenting the knowledge gained from data mining

Data mining helps you find new interesting patterns, extract hidden (yet useful and valuable) information, and identify unusual records and dependencies from large databases. To obtain valuable knowledge, data mining uses methods from statistics, machine learning, artificial intelligence (AI), and database systems.

In recent years, many IT industry giants such as Comodo, Symantec, and Microsoft have started using data mining techniques for malware detection.

Data mining methods

Many methods are used for mining big data, but the following eight are the most common:

Association rules help find possible relations between variables in databases, discover hidden patterns, and identify variables and the frequencies of their occurrence.

Classification breaks a large dataset into predefined classes or groups.

Clustering helps identify data items that have similar characteristics and understand similarities and differences among data.

The decision tree technique creates classification and regression models in the form of a tree structure.

The neural network technique is used to model complex relationships between inputs and outputs and to discover new patterns.

Regression analysis is used for predicting the value of one item based on the known value of other items in a dataset by building a model of the relationship between dependent and independent variables.

Statistical techniques help find patterns and build predictive models.

Visualization discovers new patterns and shows the results in a way that is comprehensible for users.
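As a small illustration of the first method in the list, association rules are usually judged by two standard measures: support (how often an itemset occurs) and confidence (how often the consequent occurs when the antecedent does). The sketch below computes both over sets of events; the function names and the idea of treating security events as items are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.
    Returns 0.0 when the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base
```

With events such as port_scan and data_exfil as items, a rule with high confidence suggests that one event tends to accompany the other in the recorded data.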

You can apply one or several data mining methods to create an efficient model that will ensure successful detection of attacks.

Data mining for malware detection

Data mining is one of the four detection methods used today for detecting malware. The other three are scanning, activity monitoring, and integrity checking.

When building a security app, developers use data mining methods to improve the speed and quality of malware detection as well as to increase the number of detected zero-day attacks.

Malware detection strategies

There are three strategies for detecting malware:

Anomaly detection Misuse detection Hybrid detection

Anomaly detection involves modeling the normal behavior of a system or network in order

to identify deviations from normal usage patterns. Anomaly-based techniques can detect even previously unknown attacks and can be used for defining signatures for misuse detectors.

The main problem with anomaly detection is that any deviation from the norm, even if it is a legitimate behavior, will be reported as an anomaly, thus producing a high rate of false positives.

Misuse detection, also known as signature-based detection, identifies only known attacks

based on examples of their signatures. This technique has a lower rate of false positives but can’t detect zero-day attacks.

A hybrid approach combines anomaly and misuse detection techniques in order to increase the number of detected intrusions while decreasing the number of false positives. It doesn’t build any models, but instead uses information from both harmful and clean programs to create a classifier – a set of rules or a detection model generated by the data mining algorithm. Then the anomaly detection system searches for deviations from the normal profile and the misuse detection system looks for malware signatures in the code.

Detection process

When using data mining, malware detection consists of two steps:

Extracting features

Classifying/clustering

In the first step, various features such as API calls, n-grams, binary strings, and program behaviors are extracted statically and dynamically to capture the characteristics of the file samples. Feature extraction can be performed by running static or dynamic analysis (with or without actually running potentially harmful software). A hybrid approach that combines static and dynamic analysis may also be used.
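A minimal sketch of static feature extraction using byte n-grams, one of the feature types mentioned above. The sample bytes stand in for a file's contents and the function name byte_ngrams is hypothetical; a real tool would read the bytes from disk and usually keep only the most informative n-grams.

```python
from collections import Counter

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every run of n consecutive bytes in the data, without executing it."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))

# Hypothetical bytes standing in for the contents of a file sample.
sample = b"\x90\x90\x90\xcc"
features = byte_ngrams(sample, n=2)
print(features)
```

The resulting counts can be turned into a fixed-length feature vector and passed to the classification or clustering step.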

In the second step, file samples are sorted into groups based on feature analysis, using either classification or clustering techniques.

To classify file samples, you need to build a classification model (a classifier) using classification algorithms such as RIPPER, Decision Tree (DT), Artificial Neural Network (ANN), Naive Bayes (NB), or Support Vector Machines (SVM). Clustering is used for grouping malware samples that have similar characteristics.

Using machine learning techniques, each classification algorithm constructs a model that represents both benign and malicious classes. Training a classifier on such a collection of file samples makes it possible to detect even newly released malware.
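A minimal sketch of how a classifier can be trained on both benign and malicious samples, using a small Naive Bayes model written from scratch. It assumes each sample has already been reduced to a set of API call names; the training data, labels, and function names are illustrative assumptions, and a real system would train on far more samples.

```python
import math
from collections import Counter

def train(samples):
    """samples: list of (set_of_features, label). Returns the counts needed for prediction."""
    label_counts = Counter()
    feature_counts = {}              # label -> Counter of feature occurrences
    vocabulary = set()
    for features, label in samples:
        label_counts[label] += 1
        feature_counts.setdefault(label, Counter()).update(features)
        vocabulary.update(features)
    return label_counts, feature_counts, vocabulary

def predict(model, features):
    """Return the label with the highest log-probability, using add-one smoothing."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[label].values()) + len(vocabulary)
        for f in features:
            if f in vocabulary:      # features never seen in training are ignored
                score += math.log((feature_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training set: API calls observed in each file sample.
training = [
    ({"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"}, "malicious"),
    ({"WriteProcessMemory", "RegSetValue", "CreateRemoteThread"}, "malicious"),
    ({"ReadFile", "CreateWindow", "GetMessage"}, "benign"),
    ({"ReadFile", "WriteFile", "CreateWindow"}, "benign"),
]
model = train(training)

print(predict(model, {"CreateRemoteThread", "VirtualAllocEx", "ReadFile"}))
print(predict(model, {"ReadFile", "GetMessage"}))
```

Neither test sample appears in the training set, yet each is assigned to the class whose features it shares most, which is how a trained classifier can label files it has not seen before.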

Note that the effectiveness of data mining techniques for malware detection critically depends on the features you extract and the categorization techniques you use.

Data mining for intrusion detection

Aside from detecting malware code, data mining can be effectively used to detect intrusions and analyze audit results to detect anomalous patterns. Malicious intrusions may include intrusions into networks, databases, servers, web clients, and operating systems.

There are two types of intrusion attacks you can detect using data mining methods:

Host-based attacks, when the intruder focuses on a particular machine or a group of machines

Network-based attacks, when the intruder attacks the entire network (for instance, causing a buffer overflow)

https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Classification/JRip

To detect host-based attacks, you need to analyze features extracted from programs, while to detect network-based attacks, you need to analyze network traffic. And just like with malware detection, you can look for either anomalous behavior or cases of misuse.

Data mining for fraud detection

Fraudulent activities can be detected with the help of supervised and unsupervised learning.

With supervised learning, all available records are classified as either fraudulent or non-fraudulent. This classification is then used for training a model to detect possible fraud. The main drawback of this method is its inability to detect new types of attacks. Unsupervised learning methods help identify privacy and security issues in data without using statistical analysis.

Data mining pros and cons

Using data mining in cyber security lets you:

process large datasets faster;

create a unique and effective model for each particular use case;

apply certain data mining techniques to detect zero-day attacks.

While this list of benefits is impressive, there are also certain drawbacks you need to know about:

Data mining is complex, resource-intensive, and expensive

Building an appropriate classifier may be a challenge

Potentially malicious files need to be inspected manually

Classifiers need to be constantly updated to include samples of new malware

There are certain data mining security issues, including the risk of unauthorized disclosure of sensitive information

Data mining helps you quickly analyze huge datasets and automatically discover hidden patterns, which is crucial when it comes to creating an effective anti-malware solution that’s able to detect previously unknown threats. However, the final result of using data mining methods always depends on the quality of data you use.

When using data mining in cyber security, it’s crucial to use only quality data. However, preparing databases for analysis requires a lot of time, effort, and resources. You need to clear all your records of duplicate, false, and incomplete information before working with them. Lack of information or the presence of duplicate records or errors can significantly decrease the effectiveness of complex data mining techniques. Only using accurate and complete data can ensure high quality of analysis.
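A minimal sketch of the record cleaning described above, assuming each record is a dictionary and that a record is usable only if every required field is present. The field names, the sample events, and the function name clean_records are hypothetical. Detecting false information generally needs domain checks that are not shown here.

```python
def clean_records(records, required_fields):
    """Drop records with a missing required field, then drop duplicates
    (records with identical values in all required fields), keeping the first."""
    seen = set()
    cleaned = []
    for record in records:
        if any(record.get(field) in (None, "") for field in required_fields):
            continue                 # incomplete record
        key = tuple(record[field] for field in required_fields)
        if key in seen:
            continue                 # duplicate record
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Hypothetical log events collected for analysis.
records = [
    {"time": "10:00", "src_ip": "10.0.0.5", "event": "login_failed"},
    {"time": "10:00", "src_ip": "10.0.0.5", "event": "login_failed"},  # duplicate
    {"time": "10:01", "src_ip": "", "event": "login_failed"},          # missing source
    {"time": "10:02", "src_ip": "10.0.0.9", "event": None},            # missing event
    {"time": "10:03", "src_ip": "10.0.0.7", "event": "login_ok"},
]
cleaned = clean_records(records, ("time", "src_ip", "event"))
print(len(cleaned))
```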

Conclusion

Data mining has great potential as a malware detection tool. It allows you to analyze huge sets of information and extract new knowledge from it.

The main benefit of using data mining techniques for detecting malicious software is the ability to identify both known and zero-day attacks. However, since a previously unknown but legitimate activity may also be marked as potentially fraudulent, there’s the possibility for a high rate of false positives.
