
EIE 4114 Digital Forensics EIE4114 for Crime Investigation

Oct 16, 2021


dariahiddleston
Transcript
Page 1: EIE 4114 Digital Forensics EIE4114 for Crime Investigation

EIE 4114 Digital Forensics for Crime Investigation
  Lecturers:
    First half: Dr Bonnie Law
    Second half: Dr Wen Chen
  EIE4114, my part: 5 September to 15 October
  Room: DE 609
  Tel: 2766 4746
  [email protected]
  http://www.eie.polyu.edu.hk/~nflaw/eie4114.html

Topics Covered
  Forensics framework
  Collecting, searching and sorting evidence
  Machine learning forensics
    For crime investigation, prevention and detection
    Authenticating and attributing evidence source for email/image, identity behind social media account
  Forensics and issues of anti-forensics

Assessment
  Examination: 50%
  Continuous Assessment: 50% (2 parts)
  My part (27.5%):
    Quiz (10%): 26 September, 10 October
    Laboratory sessions (2.5%): 17 Sept, 15 Oct
    Mini-project (15%) (2 in a group)
      Phase 1: project proposal, due on 3 Oct 2019
      Phase 2: project report and presentation: 10-minute video; both members need to explain the findings verbally (29 November

Mini-Project
  Phase 1: proposal
    Outline a forensics problem to be addressed
    Identify techniques/software/methods to solve the forensics problem
  Phase 2:
    State the forensics problem
    A detailed comparison of two techniques/software/methods to solve the problem
    Outline the features of the new tools to be developed / propose new ideas that may better solve the problem

Forensic Science
  Application of scientific methods to establish factual answers to legal problems
    “What has happened?” “How did it happen?” “Who was involved?” “When did it occur?”
  Digital forensics: application of computer science and investigative procedures for a legal purpose
    Goal: reconstruct the incident and find supporting or refuting evidence

Forensics process
  Defines a structured investigation of digital evidence from any device capable of storing or processing data in a digital form
  Maximizing the usefulness of incident evidence data, minimizing the cost of forensics
  Cost:
    The estimated investigation time in a NZ hacker’s case, characterized as a typical intrusion scenario, was 417 hours, resulting in an investigation cost of $27,800 (one victim)
    A Russian hacker’s case (automated online auctions using stolen credit cards) that resulted in prosecution took 9 months of investigator’s time. A partial estimate of the cost was $100,000.
    Sometimes, if the cost is greater than the benefit, the charges are dismissed

Forensics process
  Example:
    Case: received an email with potentially important evidence of a crime
    As the email was sent over the Internet, its origin must be considered uncertain, as must the timestamps (they may have been tampered with en route)
    How may one know that it was created by the system’s owner, and not by an intruder (or by a Trojan horse/other malware)?

Usefulness of evidence
  Evidentiary weight in a court of law, relevance and sufficiency, trustworthiness of the evidence, determining root cause, linking the attacker to the incident, …

Definition of Digital Forensics
  Digital forensics involves the analysis of digital evidence after proper search authority
  Evidence integrity: preservation of evidence in its original form (without any intentional or unintentional changes)
  Chain of custody, validation with mathematics, use of validated tools, repeatability, reporting and possible expert presentation
  National Institute of Standards and Technology (NIST) (www.cftt.nist.gov): creates a set of criteria for evaluating forensics tools
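"Validation with mathematics" is commonly done with cryptographic hashes: if the hash of a working copy equals the hash of the original, the copy has not changed. The following is a minimal sketch of that idea, not a validated forensic tool. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_is_intact(original_path, copy_path):
    """True when both files hash to the same value (hypothetical helper)."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

Recording the digest at acquisition time and re-computing it before each analysis step gives a repeatable integrity check.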

Digital Forensics
  Covers the general practices of analyzing all forms of digital evidence
  Includes:
    Computer forensics (file system forensics): preservation and analysis of computers
    Network forensics: traffic analysis and logs from networks
    Mobile device forensics: cell phones, smart phones, satellite navigation systems (GPS)
    Malware forensics: malicious code (viruses, worms, Trojan horses)

Digital Forensics Process
  Evidence integrity: preservation of evidence in its original form (without any intentional or unintentional changes)
  Chain of custody
    Refers to the documentation of acquisition, control, analysis and disposition of physical and electronic evidence
    Shows how the evidence is acquired, managed and transferred during the investigation process, who is involved in the process, what their responsibilities are, for how long they store the evidence and how they transfer it to someone else

Example: Chain of Custody

Chronological documentation
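A chain of custody can be pictured as an append-only, chronological log attached to each evidence item. The sketch below is one hypothetical way to model it. The class names, field names and the rule that entries must be in time order are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class CustodyEvent:
    timestamp: datetime   # when the action happened
    action: str           # e.g. "acquired", "transferred", "analysed"
    handler: str          # who was responsible at that time
    note: str = ""

@dataclass
class EvidenceItem:
    evidence_id: str
    description: str
    sha256: str           # digest recorded at acquisition time
    log: list = field(default_factory=list)

    def record(self, action, handler, note="", when=None):
        """Append an event; refuse entries that would break chronology."""
        when = when or datetime.now(timezone.utc)
        if self.log and when < self.log[-1].timestamp:
            raise ValueError("custody events must be recorded in chronological order")
        self.log.append(CustodyEvent(when, action, handler, note))
```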

Source (Digital evidence)
  Data stored or transmitted using a computer that supports or refutes a theory of how an offense occurred, or that addresses critical elements of the offense such as intent
  Computer systems: laptops, desktops, servers, …
  Communication systems: Internet, networks, GPS, SMS messages, email, websites that were visited, …
  Embedded computer systems: mobile devices, smart cards, security cameras, …
  Digital evidence: fragile, can be modified / edited

Example of Digital evidence
  A scanner used to digitize illegal photos
    Evidence: it has unique scanning characteristics that link the hardware to the digitized images, so it can be seized as digital evidence
  All service providers (e.g., telephone companies, ISPs, banks, credit institutions)
    Reveal location and time of an individual’s activities (items purchased, car rentals, automated toll payment, mobile telephone calls, Internet access, online banking/shopping, …)

Example of Digital evidence
  “Cost” consideration:
    Based on the type of incident or crime scenario, focus on the most common places for evidence
    Hypothetical scenarios: consider possible scenarios that generate potential evidence and plan to collect it in a proper manner
  Questions:
    Where is the data? Format? How long is it stored or retained?
    How much data is produced? Who is the owner? Who has access?
    Is data generated during normal operations?

Investigation
  Investigating digital devices includes:
    Collecting data securely
    Examining suspect data to determine details such as origin and content
    Presenting digital information to courts
    Applying laws to digital device practices
  Goal:
    Present supporting facts and probabilities
    Resist the influence of others’ opinions and avoid jumping to conclusions
    Evidence: authentic and has not been tampered with

Collect digital evidence
  Investigative plan: identify sources of data
  Collect data according to the volatility of the data (data lifetimes)
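Collecting by volatility means acquiring the shortest-lived data first. The sketch below orders a list of data sources for collection. The category list follows the order of volatility given in RFC 3227, which the slides do not cite, so treat both the categories and the helper name as assumptions.

```python
# Most volatile first. Categories follow RFC 3227 (an assumption here;
# the slides only say "collect according to volatility").
VOLATILITY_ORDER = [
    "cpu registers and cache",
    "memory, process table, network state",
    "temporary file systems",
    "disk",
    "remote logs and monitoring data",
    "archival media",
]

def collection_plan(sources):
    """Sort the identified data sources so the shortest-lived come first."""
    rank = {name: i for i, name in enumerate(VOLATILITY_ORDER)}
    return sorted(sources, key=lambda s: rank[s])
```

For example, a plan containing a disk, archival media and memory would be reordered so that memory is captured before the disk is imaged.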

Collect digital evidence
  Recover data from
    Deleted files
    File fragments
    Complete files
  Deleted files remain on the disk until new data is saved to the same physical location
  Tools can be used to retrieve deleted files
    ProDiscover Basic
  Example: “search”
    Sample data search
    Identify and extract all email and deleted items
    Search media for evidence of photos
    Configure and load seized database for data mining
    Recover all deleted files for review

Examination
  Large volume of data
  File hashes can be used to identify files
    Known good files: many files that belong to the OS, software or other applications do not contain useful evidence
    Known bad hash databases: identify suspicious files like malware or images known to be associated with criminal activities
    National Software Reference Library: www.nsrl.nist.gov
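The hash-based filtering described above can be sketched as a simple set lookup: files whose hash is in a known-good set are dropped, files in a known-bad set are flagged, and the rest remain for manual examination. The function name and the in-memory sets are assumptions. A real workflow would load reference hash sets such as those published by the NSRL.

```python
def triage(file_hashes, known_good, known_bad):
    """Split files into (flagged, unknown) by their hash values.

    file_hashes: dict mapping file path -> hex digest
    known_good / known_bad: sets of hex digests (hypothetical inputs)
    """
    flagged, unknown = [], []
    for path, digest in file_hashes.items():
        if digest in known_bad:
            flagged.append(path)       # matches a known bad file
        elif digest not in known_good:
            unknown.append(path)       # needs examination
        # known good files are ignored to reduce the data volume
    return flagged, unknown
```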

Analysis
  Statistical methods, manual analysis, techniques for understanding protocols and data formats, linking of multiple data objects (through the use of data mining) and timeline analysis
  Keyword searches: targeted analysis technique that can be used if one knows what to look for
    E.g., imagine an illegal drug case where the investigation was triggered by a reported crime with specific info about a person and a certain drug or its code name: the name of the person or the drug can be used as a keyword

Analysis
  Pattern matching (regular expressions)
    Social security numbers can be relevant for identity theft cases
    Credit card numbers/account numbers for fraud cases
    File properties such as name, type, size, date of creation, and when accessed
    Phone numbers
    Addresses (IP, email, physical home or work, along with website URLs)
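As a rough illustration of pattern matching, the sketch below scans text for a few of the item types listed above. The regular expressions are deliberately simplified assumptions (US-style social security numbers, basic email and IPv4 shapes) and would produce false positives and misses on real evidence.

```python
import re

# Simplified, illustrative patterns; not exhaustive validators.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def scan(text):
    """Return every match of each pattern found in the text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
```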

Analysis
  Who/What
    Who or what application created, edited, modified, sent, received or caused the file to be?
    Who is this item linked to?
  Where
    Where was it found?
    Does it show where relevant events took place?
    Does the evidence point to a common source?

Analysis
  When
    When was it created, accessed, modified, received, sent, viewed and deleted?
    Time analysis?
  How
    How did it originate on the media?
    How was it created, transmitted, modified and used?

Analysis
  Reconstruction (timeframe analysis)
    Understanding the sequence of events
  Association (connects a person to a crime scene)
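One simple form of timeframe analysis is to gather file timestamps and sort them into a single timeline. The sketch below uses the timestamps exposed by `os.stat`. The helper names are assumptions, and note that `st_ctime` means metadata change time on Unix but creation time on Windows, so the label used here is deliberately vague.

```python
import os
from datetime import datetime, timezone

def file_events(path):
    """Return (time, kind, path) tuples for one file's timestamps."""
    st = os.stat(path)
    stamps = {
        "modified": st.st_mtime,
        "accessed": st.st_atime,
        "changed_or_created": st.st_ctime,  # platform-dependent meaning
    }
    return [(datetime.fromtimestamp(ts, tz=timezone.utc), kind, path)
            for kind, ts in stamps.items()]

def timeline(paths):
    """Merge the events of several files into chronological order."""
    events = []
    for p in paths:
        events.extend(file_events(p))
    return sorted(events)
```

Timestamps can be altered, so a timeline built this way is a starting point to be cross-checked against other sources such as logs.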

Forensic Analysis
  Groups such as the Scientific Working Group on Digital Evidence set standards for recovering, preserving and examining digital evidence
  Scientific evidence: evaluated using 4 criteria
    Whether the theory or technique can be (and has been) tested
    Whether there is a high known or potential rate of error, and the existence and maintenance of standards controlling the technique’s operation
    Whether the theory/technique has been subjected to peer review and publication
    Whether the theory/technique enjoys “general acceptance” within the relevant scientific community

Case Studies
  5 different case studies
  Steps:
    Identify relevant information concerning the case
    Locate all files and find relevant info (how?)
    Associate files with …?
    Reconstruct the events/activities

Forensic Framework
  Collection: identify and collect digital evidence
    Selective acquisition? Cloud storage? Generate data subset for examination?
  Examination of evidence
    String search? Pattern matching? Data visualization (timeline analysis)?
  Analysis: determine data significance and draw conclusions
    Data mining? Cluster analysis, discriminant analysis, rule mining
  Presentation

Supplementary Info
  Clustering
    Motivation: big volume of data (files)
    Manual inspection / string comparison: not effective
    Desirable
      Automatic system: document clustering into different groups
      Objects within a cluster are more similar to each other than to objects in other clusters
      Focus investigation on certain clusters

Supplementary Info
  File: feature vector
    Term frequency: how frequently the term appears
      TF(t) = (number of times term t appears in a document) / (total number of terms in the document)
    Inverse document frequency: how important the term is
      IDF(t) = log(total number of documents / number of documents with term t in it)

Supplementary Info
  Example:
    10 million documents found
    1000 of these 10 million documents contain the term “Honda”
    In a document containing 100 words, the term “Honda” appears 3 times
    TF(“Honda”) = 3/100 = 0.03
    IDF(“Honda”) = log(10,000,000 / 1000) = 4
    Feature = 0.03 x 4 = 0.12
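The worked example can be checked with a few lines of code. This is a minimal sketch that assumes a base-10 logarithm, which is what makes log(10,000,000 / 1000) equal 4. The function name is hypothetical.

```python
import math

def tf_idf(term_count, doc_length, total_docs, docs_with_term):
    """TF-IDF of one term in one document, using a base-10 logarithm."""
    tf = term_count / doc_length                  # term frequency
    idf = math.log10(total_docs / docs_with_term) # inverse document frequency
    return tf * idf

# The "Honda" example: 3 occurrences in 100 words,
# 1000 of 10 million documents contain the term.
honda = tf_idf(3, 100, 10_000_000, 1000)
```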

Supplementary Info
  Features: a matrix
    Each document has a set of TF-IDF features
  Similarity between two documents (sum of squared differences):
    (0.12-0)^2 + (0.34-0)^2 + (0.55-0)^2 + (0.11-0)^2 + (0.44-0)^2

Doc  Honda  Check  License  Phone  Buy   Sell  …
1    0.12                   0.11
2           0.34   0.55            0.44

Example
  1 million documents
  Identify six terms: “Honda”, “Check”, “License”, “Phone”, “Buy”, “Sell”
  Document frequencies: 10000, 1000, 1000, 50000, 6000, 1000
  Document 1: [10, 0, 0, 5, 5, 2]
  Document 2: [10, 0, 0, 5, 3, 5]
  Document 3: [3, 0, 2, 1, 0, 0]

Example

       Honda  Check  License  Phone  Buy   Sell
df     10000  1000   1000     50000  6000  1000
idf    2      3      3        1.3    2.2   3

tf       d1  d2  d3
Honda    10  10  3
Check    0   0   0
License  0   0   2
Phone    5   5   1
Buy      5   3   0
Sell     2   5   0

Example

Tf-idf   d1  d2  d3
Honda
Check
License
Phone
Buy
Sell

d1-d2:
d1-d3:
d2-d3:
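The blank table and the three pairwise distances can be worked out programmatically. The sketch below makes two assumptions: term frequency is the raw count per document, as in the tf table, and the distance is Euclidean, the square root of the sum of squared differences. Variable and function names are illustrative.

```python
import math

N_DOCS = 1_000_000
TERMS = ["Honda", "Check", "License", "Phone", "Buy", "Sell"]
DF = [10000, 1000, 1000, 50000, 6000, 1000]   # documents containing each term
COUNTS = {
    "d1": [10, 0, 0, 5, 5, 2],
    "d2": [10, 0, 0, 5, 3, 5],
    "d3": [3, 0, 2, 1, 0, 0],
}

# idf per term (base-10 log), then tf-idf vector per document.
IDF = [math.log10(N_DOCS / df) for df in DF]
TFIDF = {doc: [tf * idf for tf, idf in zip(counts, IDF)]
         for doc, counts in COUNTS.items()}

def distance(a, b):
    """Euclidean distance between the tf-idf vectors of two documents."""
    return math.dist(TFIDF[a], TFIDF[b])
```

Under these assumptions d1 and d2 come out closest to each other, so they would be the first candidates to fall into the same cluster.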

Supplementary Info
  Clusters: use similarity
    Inter-cluster distances are maximized
    Intra-cluster distances are minimized
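One common way to obtain clusters with small intra-cluster distances is k-means. The slides do not name a specific algorithm, so the sketch below is an assumed choice, with a deliberately simple initialisation (the first k points) to keep it deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means. Returns one cluster label per point.

    The first k points are used as initial centroids, an arbitrary
    choice made for this sketch."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In a document-clustering setting the points would be TF-IDF feature vectors, and the investigator would then focus on the clusters of interest.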

Example: Financial crime

Self-study: case study
  Censorship through Forensics: Video analysis in post-war crisis
  https://www.cmu.edu/chrs/documents/Wexler-Censorship-Through-Forensics.pdf
  On August 25, 2009, Channel 4 News in the U.K. broadcast a video depicting men in Sri Lankan military uniforms shooting naked, bound prisoners in the head.

Self-study: case study
  Channel 4 acquired the video, approximately one minute long, from Journalists for Democracy in Sri Lanka
    Condition: total anonymity of the source
  Forensic analysts sought to resolve these speculations by examining the video file for traces of image manipulation.
    Difficulties

Self-study: case study
  Background:
    Hashing to verify copies are the same
    Check the consistency with the proprietary container format
    Metadata: start / end time, duration of recording, sampling rate, make / model of the recording device, GPS location
    Visual analysis: visual discontinuities
    Simultaneous visual and aural analysis

Self-study: case study
  Background:
    Multiple compression detection
      Record in one format (usually a compressed format)
      Before editing can be done: decompress
      After editing: compress again
    Source origin analysis (PRNU signal)
      Determine if the PRNU signal is consistent through the whole video
      The PRNU signal will be different if video segments from another device are inserted

Challenges in digital forensics
  Increasing number and capacity of storage devices
  Increasing volume of data + need to provide fast results
  Availability of anti-forensics tools
    Negatively affect the existence, amount and/or quality of evidence

Challenges in digital forensics
  Disk cleaning utilities: overwrite existing data on disk
  File wiping utilities
    Delete individual files by overwriting the clusters occupied by files with random data, multiple times
      Guttmann standard: 35 times
      DoD standard: 7 times
    Much faster than disk cleaning utilities

Challenges in digital forensics
  Disk degaussing
    A magnetic field is applied to a digital media device
    The device is entirely cleaned of any previously stored data
    Expensive approach, although effective
  Trail obfuscation:
    Replace relevant info with false info (such as IP address spoofing), alter metadata such as date/time stamps, log deletion/modification

Data Hiding Techniques
  Changing or manipulating a file to conceal information
  Techniques:
    Hiding entire partitions
    Changing file extensions
    Setting file attributes to hidden
    Bit-shifting
    Encryption
    Password protection
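Bit-shifting, listed above, can be illustrated with a byte-wise rotation: every byte has its bits rotated, so readable text turns into data that no longer looks like text, yet the operation is fully reversible. Using a rotation rather than a plain shift is an assumption made here so that no bits are lost. The function names are hypothetical.

```python
def rotate_left(data: bytes, n: int = 1) -> bytes:
    """Rotate the bits of every byte left by n positions."""
    n %= 8
    return bytes(((b << n) | (b >> (8 - n))) & 0xFF for b in data)

def rotate_right(data: bytes, n: int = 1) -> bytes:
    """Undo rotate_left by rotating the remaining 8 - n positions."""
    return rotate_left(data, 8 - (n % 8))
```

An examiner who suspects this technique can try the eight possible rotations of a file and check whether any of them yields a recognisable header or readable text.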

Data Hiding Techniques
  Changing file extensions: one of the first techniques used to hide data
    Compare the file extension with the file header
  Bit-shifting
    Changes data from readable code to data that looks like binary executable code
  Data fabrication: e.g., modifying MAC info (modified, accessed, created dates) or creating an excessive amount of data of a certain type
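Comparing the extension with the file header can be sketched as a lookup of well-known leading bytes (file signatures). The table below holds only four common signatures and the helper names are assumptions. Real tools rely on much larger signature databases.

```python
import os

# A few well-known file signatures (leading "magic" bytes).
SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF",
    ".zip": b"PK\x03\x04",
}

def detect_type(header: bytes):
    """Return the extension implied by the header, or None if unknown."""
    for ext, magic in SIGNATURES.items():
        if header.startswith(magic):
            return ext
    return None

def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when the header identifies a type other than the extension."""
    claimed = os.path.splitext(filename)[1].lower()
    actual = detect_type(header)
    return actual is not None and claimed != actual
```

A JPEG image renamed to `notes.txt` would be reported as a mismatch, while files with unrecognised headers are left for other checks.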

NEW CHALLENGES


US election in 2016
  https://www.nytimes.com/2015/06/07/magazine/the-agency.html?_r=2

Reports
  http://www.theverge.com/2016/11/14/13626694/election-2016-trending-social-media-facebook-twitter-influence

Reports
  Agency: organized disinformation campaigns on social media using pseudonyms and virtual identities
    Promoting false news events
    Influencing public opinion on politics
  Digital forensics?
    Determine the underlying identities of the agency’s employees: content originating from them could be flagged and monitored (or banned?)

Reports
  American Scientist (Sept-Oct 2013, volume 101, no 5)
    Without developing fundamentally new tools and capabilities, forensics experts will face increasing difficulty and cost along with the ever-expanding data size and system complexity.

Summary
  Definition of digital forensics
  Case studies
  Forensics framework
  Challenges in forensics