Report

DISTRIBUTED INTRUSION DETECTION SYSTEM BASED ON PROTOCOL ANALYSIS

CHAPTER 1

INTRODUCTION

With the rapid development of computer and network technology, network security has become more
important for protecting network information from various kinds of attack. A single firewall cannot
protect the network from every possible abuse; real-time monitoring of the network is also needed, so
that, as far as possible, an intrusion is caught before the attack succeeds.

The Intrusion Detection System was developed and matured against this background. As a new, active
security-defense mechanism, an Intrusion Detection System can provide dynamic protection for both hosts
and networks. It can not only monitor internal attacks, external attacks and misoperation in real time,
but can also work in combination with other network security products to protect the network across its
full range. Its real-time and proactive characteristics make it an important complement to the firewall.
Today, intrusion detection has become an indispensable component of overall network security solutions.

However, with the continuous expansion of network scale and the complexity of the means of attack,

Distributed Intrusion Detection System

Computer systems have been made increasingly secure over the past decades. However, new attacks and

the spread of harmful viruses have shown that better methods must be used. One approach gaining increasing

popularity in the computer community is to use Intrusion Detection Systems (IDSs). Intrusion Detection

Systems identify attacks against a system or users performing illegitimate actions. Using a common analogy,

having an Intrusion Detection System is like having a "burglar alarm" in your house. The alarm will not

prevent the burglar from breaking into your house, but it will detect and warn you of the problem. Following

the publication of the first research in Intrusion Detection Systems, a large number of diverse applications

have been developed. One method of accomplishing this type of detection is the use of file system integrity

tools. When a system is compromised, an attacker will often alter certain key files to provide continued

access and to prevent detection. The changes could target any portion of the system software, e.g. the kernel,

libraries, log files, or other sensitive files.

DEPT. OF CSE / B.T.L.I.T 1


1.1 FILE SYSTEM INTEGRITY

File system integrity checkers detect those changes and trigger a corresponding alert. To guarantee the

integrity of the file system, two approaches can be followed.

The first approach is to create a secure database, which is usually composed of hashes. The stored hash will

be periodically checked against a newly computed hash. This method is used by tools such as Tripwire,

AIDE and others.
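The hash-database approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (file_digest, build_baseline, check_integrity) and the choice of SHA-256 are hypothetical and are not taken from Tripwire or AIDE.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(paths):
    """Record one digest per monitored file (the 'secure database')."""
    return {str(p): file_digest(p) for p in paths}


def check_integrity(baseline):
    """Return the monitored paths that are missing or whose digest changed."""
    changed = []
    for path, old_digest in baseline.items():
        if not Path(path).exists() or file_digest(path) != old_digest:
            changed.append(path)
    return changed
```

Note that the sketch shares the flaw discussed in this section: it trusts the operating system to return the real file contents, and the baseline itself must be stored where an attacker cannot rewrite it.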

The second, more recent approach is to create digital signatures of sensitive data, such as executable files,

using asymmetric cryptography, and to use these signatures to check the integrity of the signed file.

Both approaches have advantages and drawbacks, but they share a common flaw: the auditing relies on the

validity of the operating system. All the previous applications have made the assumption that the OS itself is

not corrupted. Once the operating system is compromised the intruder can easily defeat integrity tools. As an

example, in the Linux operating system, redirecting system calls using kernel modules can potentially

compromise the system.

Also, since the binary of the Integrity Tool resides in the machine to be audited, the attacker may be able to

corrupt the binary or the configuration files of the tool. This work develops a novel way to overcome the

problems of traditional Integrity Tools. Our approach is to use a Distributed Intrusion Detection System

Based on Protocol Analysis, to perform the integrity detection checks.

The area of distributed computing systems provides a promising domain for applications of machine

learning methods. One of the most interesting aspects of such applications is that learning algorithms that are

embedded in a distributed computing infrastructure are themselves part of that infrastructure and must

respect its inherent local computing constraints (e.g., constraints on bandwidth, latency, reliability, etc.),

while attempting to aggregate information across the infrastructure so as to improve system performance (or

availability) in a global sense.



Consider, for example, the problem of detecting anomalies in a wide-area network. While it is
straightforward to embed learning algorithms at local nodes to detect node-level anomalies, these
anomalies may not be indicative of network-level problems. Indeed, recent work demonstrated a useful role
for Principal Component Analysis (PCA) in detecting network anomalies. It showed that the minor
components of PCA (the subspace obtained after removing the components with the largest eigenvalues)
revealed anomalies that were not detectable in any single node-level trace. While that work did not face
the distributed data analysis problem (it involved centralized, off-line analysis of blocks of data), it
does provide clear motivation for attempting to design a distributed PCA-based system for analyzing
network anomalies in real time. The development of such a design involves facing several challenging
problems that have not been addressed in previous work. Naive solutions that continuously push all data
to a central analysis site simply cannot scale to large networks or massive data streams. Instead, viable
solutions need to process data "in-network" to intelligently control the frequency and size of data
communications.
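The idea of looking for anomalies in the minor components can be illustrated with a small centralized, off-line sketch (not the distributed design discussed here). Everything in it is an assumption made for illustration: the synthetic traffic matrix, the function name residual_scores, the choice k=1 and the use of NumPy.

```python
import numpy as np


def residual_scores(X, k):
    """Squared norm of each row of X after removing its top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:k]
    residual = Xc - Xc @ top.T @ top
    return (residual ** 2).sum(axis=1)


rng = np.random.default_rng(0)
# Hypothetical data: 200 time steps on 5 links that all follow one shared traffic pattern.
shared = rng.normal(size=(200, 1))
X = shared @ np.ones((1, 5)) * 10 + rng.normal(scale=0.1, size=(200, 5))
# Inject a small anomaly at time step 120, spread over two links with opposite signs.
X[120, 0] += 2.0
X[120, 1] -= 2.0

scores = residual_scores(X, k=1)
print(int(scores.argmax()))  # index of the time step with the largest residual
```

On each individual link the injected change of 2.0 is small compared with the normal variation of that link (standard deviation about 10), but it breaks the correlation between links, so it dominates the residual subspace.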

The key underlying problem is that of developing a mathematical understanding of how to trade off

quantization arising from local bandwidth restrictions against delay of the data analysis. We also need to

understand how this trade off impacts overall detection accuracy. Finally, the implementation needs to be

simple if it is to have impact on developers.



CHAPTER 2

TRADITIONAL INTRUSION DETECTION SYSTEM

A traditional intrusion detection system (TIDS) is a device or software application that monitors

network and/or system activities for malicious activities or policy violations and produces reports to a

Management Station. Intrusion prevention is the process of performing intrusion detection and attempting to

stop detected possible incidents. Intrusion detection and prevention systems (IDPS) are primarily focused on

identifying possible incidents, logging information about them, attempting to stop them, and reporting them

to security administrators. In addition, organizations use IDPSs for other purposes, such as identifying

problems with security policies, documenting existing threats, and deterring individuals from violating

security policies. IDPSs have become a necessary addition to the security infrastructure of nearly every

organization.

IDPSs typically record information related to observed events, notify security administrators of important

observed events, and produce reports. Many IDPSs can also respond to a detected threat by attempting to

prevent it from succeeding. They use several response techniques, which involve the IDPS stopping the

attack itself, changing the security environment (e.g., reconfiguring a firewall), or changing the attack’s

content.

2.1 TERMINOLOGY

Alert/Alarm: A signal suggesting that a system has been or is being attacked.

True Positive: A legitimate attack which triggers TIDS to produce an alarm.

False Positive: An event signaling TIDS to produce an alarm when no attack has taken place.

False Negative: A failure of TIDS to detect an actual attack.

True Negative: When no attack has taken place and no alarm is raised.

Noise: Data or interference that can trigger a false positive.

Site policy: Guidelines within an organization that control the rules and configurations of TIDS.



Site policy awareness: The ability a TIDS has to dynamically change its rules and configurations in

response to changing environmental activity.

Confidence value: A value an organization places on a TIDS based on past performance and analysis

to help determine its ability to effectively identify an attack.

Alarm filtering: The process of categorizing attack alerts produced from a TIDS in order to

distinguish false positives from actual attacks.

Attacker or Intruder: An entity who tries to find a way to gain unauthorized access to information,

inflict harm or engage in other malicious activities.

Masquerader: A user who is not authorized to use a system but tries to access its information

as an authorized user. Masqueraders are generally outside users.

Misfeasor: Commonly an internal user, of one of two types:

1. An authorized user with limited permissions.

2. A user with full permissions and who misuses their powers.

Clandestine user: A user who acts as a supervisor and tries to use his privileges so as to avoid being

captured.
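The four outcome terms defined above (true positive, false positive, false negative, true negative) are usually combined into rates when an IDS is evaluated. The following is a minimal sketch; the function name detection_metrics and the example counts are hypothetical.

```python
def detection_metrics(tp, fp, fn, tn):
    """Summarise IDS alarm counts as rates."""
    return {
        # Share of real attacks that raised an alarm.
        "detection_rate": tp / (tp + fn),
        # Share of benign events that raised an alarm.
        "false_positive_rate": fp / (fp + tn),
        # Share of alarms that corresponded to real attacks.
        "precision": tp / (tp + fp),
    }


# Hypothetical counts: 50 attacks (40 detected) among 1000 observed events.
metrics = detection_metrics(tp=40, fp=10, fn=10, tn=940)
```

A low false positive rate matters in practice because benign events usually far outnumber attacks, so even a small rate can produce more false alarms than true ones.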

2.2 TYPES

For the purpose of dealing with IT, there are two main types of IDS:

2.2.1 Network intrusion detection system (NIDS)

It is an independent platform that identifies intrusions by examining network traffic and monitors multiple

hosts. Network intrusion detection systems gain access to network traffic by connecting to a network hub,

network switch configured for port mirroring, or network tap. In a NIDS, sensors are located at choke points

in the network to be monitored, often in the demilitarized zone (DMZ) or at network borders. Sensors capture

all network traffic and analyze the content of individual packets for malicious traffic. An example of a NIDS

is Snort.




2.2.2 Host-based intrusion detection system (HIDS)

It consists of an agent on a host that identifies intrusions by analyzing system calls, application logs, file-

system modifications (binaries, password files, capability databases, Access control lists, etc.) and other host

activities and state. In a HIDS, sensors usually consist of a software agent. Some application-based IDS are

also part of this category. An example of a HIDS is OSSEC. Intrusion detection systems can also be

system-specific, using custom tools and honeypots.

2.3 PASSIVE AND/OR REACTIVE SYSTEMS

In a passive system, the intrusion detection system (IDS) sensor detects a potential security breach,
logs the information and signals an alert on the console and/or to the owner. In a reactive system, also
known as an intrusion prevention system (IPS), the IPS responds automatically to the suspicious activity
by resetting the connection or by reprogramming the firewall to block network traffic from the suspected
malicious source. The term IDPS is commonly used for systems that both "detect" (alert) and "prevent",
whether the response happens automatically or at the command of an operator.

2.4 COMPARISON WITH FIREWALLS

Though they both relate to network security, an intrusion detection system (IDS) differs from a firewall in

that a firewall looks outwardly for intrusions in order to stop them from happening. Firewalls limit access

between networks to prevent intrusion and do not signal an attack from inside the network. An IDS evaluates

a suspected intrusion once it has taken place and signals an alarm. An IDS also watches for attacks that

originate from within a system. This is traditionally achieved by examining network communications,

identifying heuristics and patterns (often known as signatures) of common computer attacks, and taking

action to alert operators. A system that terminates connections is called an intrusion prevention system, and is

another form of an application layer firewall.




2.5 STATISTICAL ANOMALY AND SIGNATURE BASED IDSs

All Intrusion Detection Systems use one of two detection techniques:

2.5.1 Statistical anomaly-based IDS

A statistical anomaly-based IDS establishes what normal network activity looks like (what bandwidth is
generally used, what protocols are used, and which ports and devices generally connect to each other) and
alerts the administrator or user when it detects traffic that is anomalous (not normal).
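One simple way to turn "normal activity" into a test is to compare a new measurement with the mean and standard deviation of past measurements. The sketch below assumes a single bandwidth metric and a threshold of three standard deviations; the function name, the sample values and the threshold are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """Flag value if it lies more than `threshold` standard deviations from the mean of history."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical per-minute bandwidth samples (Mbit/s) recorded during normal operation.
baseline = [48, 52, 50, 47, 53, 49, 51, 50, 52, 48]

print(is_anomalous(baseline, 55))  # 2.5 standard deviations above the mean: not flagged
print(is_anomalous(baseline, 90))  # 20 standard deviations above the mean: flagged
```

A real system would track many such metrics and update the baseline over time, since normal behaviour changes.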

2.5.2 Signature-based IDS

A signature-based IDS monitors packets on the network and compares them with preconfigured and
predetermined attack patterns known as signatures. The drawback is that there is a lag between the
discovery of a new threat and the application of its signature in the IDS. During this lag the IDS is
unable to identify the threat.
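Signature matching can be sketched as a search for known byte patterns in a packet payload. The signature names and patterns below are invented for illustration and do not come from any real rule set.

```python
# Hypothetical signature set: name -> byte pattern to look for in a payload.
SIGNATURES = {
    "directory-traversal": b"../..",
    "passwd-access": b"/etc/passwd",
}


def match_signatures(payload, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


print(match_signatures(b"GET /../../etc/passwd HTTP/1.0"))
print(match_signatures(b"GET /index.html HTTP/1.0"))
```

The lag described above is visible here: an attack whose pattern has not yet been added to SIGNATURES produces an empty result and passes unnoticed.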

2.6 TRADITIONAL IDS MODEL

Detection of known attacks

Should be able to identify malicious attackers.

Real-time/near real-time analysis

Analyze the information gathered by the IDS sensors as soon as possible.

Minimal resource use

Use minimal system resources when monitoring.

High accuracy

Ensure that detections are correct and keep false alarms low.



CHAPTER 3

THE ROLES AND RELATIONSHIPS IN TIDS

Hackers

People who attempt to gain unauthorized access to a computer system. These people are often

malicious and have many tools for breaking into a system.

System Manager (SM)

The person responsible for minimizing excess resource use and the costs of network management

and system maintenance. If an attack on a system causes the IDS to raise alarms, the system

manager has to find out where the problem is.

Fig. 3.1: Relationships in TIDS



Detection System (DS)

The system that monitors the events occurring in protected hosts or networks and analyzes

them for signs of intrusion.

Intrusion is a major concern for every network and can be harmful to the entire system.

A detection system is therefore needed to detect intrusions at an early stage, before they

damage the network.

An Intrusion Detection System can provide dynamic protection for both hosts and networks.

It can not only monitor internal attacks, external attacks and misoperation in real time, but can

also work in combination with other network security products to protect the network across its

full range.

Its real-time and proactive characteristics make it an important complement to the firewall.

Today, intrusion detection has become an indispensable component of overall network security

solutions.

However, with the continuous expansion of network scale and the complexity of the means of

attack, Distributed Intrusion Detection System



CHAPTER 4

PROTOCOL ANALYSIS TECHNOLOGY

The detection technologies commonly used in early Intrusion Detection Systems are misuse detection and
anomaly detection. Misuse detection matches observed activity against known methods of intrusion in order
to identify attacks. The technique most commonly used is simple pattern matching. It is simple, scales
well and detects efficiently, but it applies only to relatively simple attacks and has a high false alarm
rate. Although simple pattern matching has significant performance problems, it is widely used because
systems based on it are very convenient to implement, configure and maintain. An anomaly detection system
stores patterns of a user's normal behavior in advance; behavior that is inconsistent with these normal
patterns is treated as an attack.

Anomaly detection is the main research direction for Intrusion Detection Systems. It is characterized by
detecting abnormal behavior in the system and thereby discovering unknown attack patterns. The key
question in anomaly detection is how to establish a model of normal usage and how to compare current
system or user behavior with that model in order to judge the degree of deviation. IDS systems that use
these two methods do not have the intelligence to determine the true intention behind the patterns they
observe, and this is where protocol analysis and its advantages come in.

Protocol analysis is the main technique used by the new generation of IDS systems to detect attacks. It
exploits the high degree of regularity of network protocols: because protocol fields sit at known
positions, the system can go directly to the fields that are useful for intrusion detection and analyze
only those. Protocol decoding covers not only the lower-layer protocols but also the application-layer
protocols. Since protocol analysis directs the search to a specific part of the packet rather than the
entire payload, it reduces the search space and improves the efficiency of intrusion detection.



4.1 ANOMALY DETECTION INTRUSION DETECTION SYSTEM

• Anomaly is a pattern in the data that does not conform to the expected behaviour

• Also referred to as outliers, exceptions, peculiarities, surprise, etc.

• Anomalies translate to significant (often critical) real life entities

– Cyber intrusions

– Credit card fraud

4.2 REAL WORLD ANOMALIES

• Credit Card Fraud

– An abnormally high purchase made on a credit card

• Cyber Intrusions

– A web server involved in ftp traffic

Fig. 4.1: Cyber intrusion


Fig. 4.2: Credit card fraud

4.3 KEY CHALLENGES

• Defining a representative normal region is challenging

• The boundary between normal and outlying behavior is often not precise

• The exact notion of an outlier is different for different application domains

• Availability of labelled data for training/validation

• Malicious adversaries

• Data might contain noise

• Normal behaviour keeps evolving

4.4 TYPE OF ANOMALY

Point Anomalies: An individual data instance is anomalous with respect to the rest of the data.

Fig. 4.3: Point Anomalies

Contextual Anomalies: An individual data instance is anomalous within a context. This requires

a notion of context. Contextual anomalies are also referred to as conditional anomalies.


[Figure labels: axes X and Y; regions N1 and N2; points o1, o2 and O3]


Fig. 4.4: Contextual Anomalies

Collective Anomalies: A collection of related data instances is anomalous. This requires a

relationship among the data instances, as in:

- Sequential Data

- Spatial Data

- Graph Data

Fig. 4.5: Collective Anomalies


[Figure labels: Normal; Anomalous Subsequence; Anomaly]


CHAPTER 5

THE FUNDAMENTAL STRUCTURE OF PROTOCOL

Protocol decoding covers not only the lower-layer protocols but also the application-layer protocols.
This improves the efficiency of intrusion detection.

Fig 5.1: Protocol Structure

There are two different standards for the Ethernet MAC frame format: DIX Ethernet V2 and the IEEE 802.3
standard [4]. The Ethernet V2 format is the one most often used for MAC frames today, with upper-layer
protocols including IP, IPX, ARP, SNMP and NetBEUI. Its frame format is shown in fig. 5.2.

Fig. 5.2: Ethernet Frame Format



5.1 TWO IMPORTANT STAGES OF PROTOCOL ANALYSIS

5.1.1 IP datagram

Among the transmission protocols, TCP, UDP, ICMP and IGMP data are all carried in the IP data format. An
IP datagram is divided into an IP header and IP data. The IP header contains the version, header length,
service type, total length (TL), identifier, flags, fragment offset, TTL, protocol type, header checksum,
source IP address and destination IP address; see fig. 5.3.

Fig. 5.3: IP Datagram Format

The protocol field occupies 8 bits. Its value indicates which protocol is used by the data that the IP
datagram carries; for example, a protocol field value of 6 indicates that the data part uses the TCP
protocol.
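The header fields listed above can be decoded from the first 20 bytes of a datagram with fixed offsets. The sketch below uses Python's struct module; the function name parse_ipv4_header and the sample bytes are hypothetical, options are ignored, and the header checksum is read but not verified.

```python
import socket
import struct


def parse_ipv4_header(data):
    """Decode the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,  # the field counts 32-bit words
        "service_type": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,                      # 1 = ICMP, 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hypothetical sample header: a TCP datagram from 192.168.0.1 to 192.168.0.199.
sample = bytes.fromhex("4500003c1c4640004006b1e6c0a80001c0a800c7")
header = parse_ipv4_header(sample)
print(header["protocol"], header["source"], header["destination"])
```

Because every field sits at a known offset, the detector can read the protocol field directly instead of searching the packet.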

5.1.2 TCP datagram


Transmission Control Protocol (TCP) is a reliable, connection-oriented transmission service. Data is
transmitted in segments, and a connection must be established before data is exchanged. TCP communicates
using a byte stream, that is, unstructured data. For reliability, each transmitted TCP segment is
assigned a sequence number. A TCP datagram is divided into a TCP header and TCP data. The header contains
the source port, destination port, sequence number, acknowledgment number and so on, as shown in fig. 5.4.

Fig. 5.4: TCP Segment Format

Where:

Source Port

The 16-bit source port number, used by the receiver to reply.

Destination Port

The 16-bit destination port number.

Sequence Number

The sequence number of the first data byte in this segment. If the SYN control bit is set, the sequence

number is the initial sequence number (n) and the first data byte is n+1.

Acknowledgment Number

If the ACK control bit is set, this field contains the value of the next sequence number that the

receiver is expecting to receive.



Data Offset

The number of 32-bit words in the TCP header. It indicates where the data begins.

Reserved

Six bits reserved for future use; must be zero.

URG

Indicates that the urgent pointer field is significant in this segment.

ACK

Indicates that the acknowledgment field is significant in this segment.

PSH

Push function.

RST

Resets the connection.

SYN

Synchronizes the sequence numbers.

FIN

No more data from sender.

Window

Used in ACK segments. It specifies the number of data bytes beginning with the one indicated in the

acknowledgment number field which the receiver (= the sender of this segment) is willing to accept.

Checksum

The 16-bit one's complement of the one's complement sum of all 16-bit words in a pseudo-header, the

TCP header and the TCP data.

The pseudo-header is the same as that used by UDP for calculating the checksum. It is a pseudo-IP-header,

only used for the checksum calculation, with the format shown in fig. 5.5.


Fig. 5.5: Pseudo-IP Header
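The checksum definition above can be worked through in code. The sketch assumes IPv4, a 12-byte pseudo-header laid out as source address, destination address, a zero byte, the protocol number (6 for TCP) and the 16-bit TCP length, and a segment whose checksum field is zero while the sum is computed. The function names are hypothetical.

```python
import struct


def ones_complement_sum(data):
    """One's complement sum of the 16-bit big-endian words in data."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def tcp_checksum(src_ip, dst_ip, segment):
    """Checksum over pseudo-header + TCP header + TCP data.

    src_ip and dst_ip are 4-byte packed addresses; the checksum field
    inside segment must be zero when this function is called.
    """
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF
```

A receiver can verify a segment by summing the pseudo-header and the segment with the checksum filled in: a result of 0xFFFF means the checksum is consistent.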

CHAPTER 6

DISTRIBUTED INTRUSION DETECTION ON PROTOCOL ANALYSIS

6.1 DETECTOR UNIT MODEL

The most important part of the system is the design and working pattern of the Detect Module, which is
based on the principle of protocol analysis. It contains two parts: a data capture module and a protocol
analysis module.

Data capture module: The main role of the data capture module is to capture data from the Internet and
then send it to the protocol analysis module. Its role is comparatively simple and easy to implement.

Protocol analysis module: Protocol analysis is the focus of this module, which parses the captured data.
Its working principle is as follows.

From the Ethernet frame, the module first extracts the Ethernet header. The header is 14 bytes long and
consists of the 6-byte destination Ethernet address, the 6-byte source Ethernet address and the 2-byte
frame type. The frame type gives the protocol carried in the frame, such as ARP, RARP, IP or IPX, whose
type numbers are 0806, 8035, 0800 and 8137 (hexadecimal) respectively. ARP and RARP are data link
protocols, while IP and IPX are network layer protocols; only IP (0800) is analyzed further.

When there are no options, the IP header is 20 bytes long. Its main contents are the source IP address,
the destination IP address, the fragment flags and offset, and the one-byte protocol type of the IP
payload. The protocol type indicates whether the payload is TCP, UDP or ICMP, whose protocol numbers are
6, 17 and 1 respectively.

In the transport layer, when there are no options, the TCP header is 20 bytes long. Its main contents
include the source port, destination port, flags, sequence number and acknowledgment number. The TCP
header contains six flags: URG, SYN, ACK, FIN, RST and PSH. These flags reflect the status of the TCP
connection: for example, the two sides always exchange SYN packets to create a new connection and use FIN
or RST to terminate it. The type of traffic can be determined from the source and destination ports of
the TCP packet, for example port 23 for TELNET and port 25 for e-mail.

The application layer contains many protocols; only some everyday applications are analyzed, such as FTP,
e-mail, TELNET and WWW. After this analysis, the protocol analysis module extracts the protocol keywords
from the application-layer data. For an FTP packet, for example, it can extract RETR (the GET operation),
STOR (the PUT operation) and other protocol keywords. By comparing these keywords at the layer above the
detector modules, the system can determine whether a network intrusion has happened.
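The layer-by-layer decoding described above can be sketched as a single function that walks from the Ethernet header to the FTP keyword. This is an illustration under several assumptions: untagged Ethernet V2 frames, FTP control traffic on port 21, and a small illustrative keyword set. The names extract_ftp_keyword and build_test_frame are hypothetical, and the helper only builds a synthetic frame with zeroed checksums.

```python
import struct

ETHERTYPE_IP = 0x0800
PROTO_TCP = 6
FTP_KEYWORDS = {"RETR", "STOR", "USER", "PASS"}  # illustrative subset


def extract_ftp_keyword(frame):
    """Walk Ethernet -> IP -> TCP and return the FTP command keyword, or None."""
    if len(frame) < 14 + 20 + 20:
        return None
    _dst_mac, _src_mac, eth_type = struct.unpack("!6s6sH", frame[:14])
    if eth_type != ETHERTYPE_IP:
        return None                       # only IP (0x0800) is analysed further
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4              # IP header length in bytes (20 without options)
    if ip[9] != PROTO_TCP:
        return None
    tcp = ip[ihl:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    if 21 not in (src_port, dst_port):    # assumed FTP control port
        return None
    data_offset = (tcp[12] >> 4) * 4      # TCP header length in bytes
    payload = tcp[data_offset:]
    word = payload.split(b" ", 1)[0].strip().decode("ascii", "replace").upper()
    return word if word in FTP_KEYWORDS else None


def build_test_frame(payload, eth_type=ETHERTYPE_IP):
    """Hypothetical helper: build a synthetic frame carrying payload to TCP port 21."""
    eth = b"\xaa" * 6 + b"\xbb" * 6 + struct.pack("!H", eth_type)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 0, 0, 64,
                     PROTO_TCP, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    tcp = struct.pack("!HHIIBBHHH", 40000, 21, 0, 0, 5 << 4, 0x18, 8192, 0, 0)
    return eth + ip + tcp + payload


print(extract_ftp_keyword(build_test_frame(b"RETR secret.txt\r\n")))
```

Each step reads a field at a fixed offset and stops as soon as the frame is not of interest, which is the reduction in search space that protocol analysis relies on.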

6.1.1 Ethernet Version 2

The original Ethernet Version 2 frame varies slightly from the 802.3 Ethernet frame format in that a Type

field, also referred to as Ethernet type, is used in place of the Length field (also 2 bytes), as shown on the

Ethernet V2 Frame Format Diagram.

Fig 6.1: Ethernet V2 Frame Format

6.1.2 Ethernet 802.3

The 802.3 format of an Ethernet frame is shown on the IEEE 802.3 Ethernet Frame Format Diagram. It has a

Length field instead of a Type field, and an 802.2 LLC header (not shown) in the Information field. The

LLC header DSAP field indicates the protocol being carried and steers

the frame to the appropriate process in the Network Layer. An 802.3 Length field will always have a value of

less than 0x0600.

Fig 6.2: IEEE 802.3 Ethernet Frame Format
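Because an 802.3 Length value is always below 0x0600, the two bytes after the MAC addresses are enough to tell the two frame formats apart. A minimal sketch follows, with a hypothetical function name and untagged frames assumed.

```python
import struct


def classify_frame(frame):
    """Return ('type', value) for an Ethernet V2 frame or ('length', value) for 802.3."""
    (field,) = struct.unpack("!H", frame[12:14])
    return ("type" if field >= 0x0600 else "length", field)


# Twelve placeholder address bytes followed by the 2-byte Type/Length field.
print(classify_frame(bytes(12) + bytes.fromhex("0800")))  # Type-encoded, carrying IP
print(classify_frame(bytes(12) + bytes.fromhex("01a6")))  # Length-encoded
```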



6.1.3 802.1Q VLAN Frames

With the establishment of the 802.1Q VLAN standard, it is now possible to mix vendor switch equipment

and have the VLANs interoperate. That is, frames travelling from switch-to-switch between VLANs carry

VLAN membership information that all equipment meeting the standard recognize. The 802.1Q tag follows

the standard MAC header in Ethernet frames. If the frame is VLAN-tagged, the Type field contains a value

of 0x8100. The VLAN-tag format uses the next 2 bytes after the 0x8100 Type field for the VLAN tag. These

16 bits contain the 3-bit frame priority, the canonical format indicator (CFI), and the 12-bit VLAN ID.

Another way of looking at this is that Ethernet frames have either a Length or a Type field. When using

LLC, the field is Length-encoded. If not using LLC, the field is Type-encoded. Following the VLAN tag

would be the original 802.3 Length field or Version 2 Type value that the frame would have carried had it not

been tagged. That is, if this is an 802.1Q-tagged Type-encoded frame carrying IP, the 2 bytes after the VLAN

tag will be 0x0800. If the original frame was Length-encoded, the 2 bytes following the VLAN tag would be

a Length field, followed by the LLC header as the first part of the Information field.

If Length-encoded: 8100 0020 01A6--The 8100h and 0020h are the 4 additional VLAN bytes; the 01A6 is

an example of a valid 802.3 length; the LLC header would follow in the Information field. This concept is

illustrated on the 802.1Q Length-Encoded Frame Format Diagram.

Fig 6.3: Length-Encoded Frame Format

If Type-encoded: 8100 0020 0800--The 0800h is in the Type field, indicating IP is being carried. This

concept is illustrated on the 802.1Q Type-Encoded Frame Format Diagram.

Fig 6.4: Type-Encoded Frame Format
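The two byte sequences used as examples in this section (8100 0020 0800 and 8100 0020 01A6) can be decoded with a short sketch. The function name parse_vlan_tag is hypothetical, and the twelve zero bytes stand in for the MAC addresses.

```python
import struct


def parse_vlan_tag(frame):
    """Return (priority, cfi, vlan_id, inner_field) for an 802.1Q-tagged frame, else None."""
    if len(frame) < 18:
        return None
    tpid, tci, inner = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100:
        return None                      # not VLAN-tagged
    priority = tci >> 13                 # 3-bit frame priority
    cfi = (tci >> 12) & 1                # canonical format indicator
    vlan_id = tci & 0x0FFF               # 12-bit VLAN ID
    return (priority, cfi, vlan_id, inner)


type_encoded = bytes(12) + bytes.fromhex("810000200800")
length_encoded = bytes(12) + bytes.fromhex("8100002001a6")
print(parse_vlan_tag(type_encoded))
print(parse_vlan_tag(length_encoded))
```

In both examples the tag value 0020 decodes to priority 0, CFI 0 and VLAN ID 32; the field that follows is either the original Type (0800 for IP) or the original 802.3 Length.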


6.2 DISTRIBUTED INTRUSION DETECTION SYSTEM MODEL

Although an Intrusion Detection System can identify unauthorized use, abuse and misuse of computer and
network systems, intrusions have become more and more complex, and an individual intrusion detection
system is no longer able to deal with complex security issues. A Distributed Intrusion Detection System
therefore places a number of intrusion detection agents on the network and sets up a process module that
receives the keyword data sent by the agents and analyzes it comprehensively to determine whether an
attack is happening. The system is divided into a Detect Module, a Process Module and a Response Module;
the relationship between the modules is shown in Figure 6.5.

In this model, Internet data reaches a user's computer only after it has passed through detection by the
Detect Module and the Process Module. When the Process Module detects a network intrusion, it sends an
intrusion signal to the Response Module, which alerts the user. Each Detect Module is a small data
analysis system; it sends the data obtained through its analysis over the high-speed link to the Process
Module, where it is determined whether there is an intrusion. Together, the Detect Module and the Process
Module make up a complete intrusion detection system.




Fig. 6.5: Model of Distributed Intrusion Detection System.

The Process Module contains a rule base, which holds the keyword sets of currently common intrusion
patterns. As new intrusion techniques emerge the rule base can be expanded, and keywords can also be
deleted. The Detect Module sends its keywords to be compared with the rule base; if the string of received
keywords matches a rule, an intrusion has happened. The Response Module then notifies the user of the
intrusion and advises the user of the means and the target of the attack, allowing the user to take
timely preventive measures to avoid losses.
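The interaction between the three modules can be sketched as follows. The class and function names, the report format (agent identifier plus keyword) and the alert text are all hypothetical; the sketch only shows a rule base that can grow and shrink and a comparison of reported keywords against it.

```python
class ProcessModule:
    """Hypothetical process module holding a mutable keyword rule base."""

    def __init__(self, rules=()):
        self.rules = set(rules)

    def add_rule(self, keyword):
        """Extend the rule base when a new intrusion pattern is known."""
        self.rules.add(keyword)

    def delete_rule(self, keyword):
        """Remove a keyword that is no longer wanted in the rule base."""
        self.rules.discard(keyword)

    def analyse(self, reports):
        """reports: iterable of (agent_id, keyword) pairs sent by detect modules.

        Returns the pairs whose keyword matches the rule base.
        """
        return [(agent, kw) for agent, kw in reports if kw in self.rules]


def respond(alerts):
    """Hypothetical response module: one warning line per alert."""
    return [f"intrusion suspected: agent={agent} keyword={kw}" for agent, kw in alerts]


process = ProcessModule({"STOR"})
reports = [("agent-1", "RETR"), ("agent-2", "STOR")]
print(respond(process.analyse(reports)))
```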



CHAPTER 7

CHARACTERISTICS OF THE DIDS SYSTEM

Intrusion detection is the problem of identifying unauthorized use, misuse, and abuse of computer systems by

both system insiders and external penetrators. The proliferation of heterogeneous computer networks

provides additional implications for the intrusion detection problem. Namely, the increased connectivity of

computer systems gives greater access to outsiders, and makes it easier for intruders to avoid detection. IDS’s

are based on the belief that an intruder’s behavior will be noticeably different from that of a legitimate user.

The prototype Distributed Intrusion Detection System (DIDS) combines distributed monitoring and data

reduction (through individual host and LAN monitors) with centralized data analysis (through the DIDS

director) to monitor a heterogeneous network of computers. This

approach is unique among current IDS’s. A main problem considered in this paper is the Network-user

Identification problem, which is concerned with tracking a user moving across the network, possibly with a

new user-id on each computer. Initial system prototypes have provided quite favorable results on this

problem and the detection of attacks on a network. This paper provides an overview of the motivation behind

DIDS, the system architecture and capabilities, and a discussion of the early prototype.

7.1 SCENARIOS

The detection of certain attacks against a networked system of computers requires information from multiple

sources. A simple example of such an attack is the so-called doorknob attack. In a doorknob attack the

intruder’s goal is to discover, and gain access to, insufficiently-protected hosts on a system. The intruder

generally tries a few common account and password combinations on each of a number of computers. These

simple attacks can be remarkably successful. As a case in point, UC Davis’ NSM recently observed an

attacker of this type gaining super-user access to an external computer which did not require a password for

the super-user account. In this case, the intruder used telnet to make the connection from a university

computer system, and then repeatedly tried to gain access to several different computers at the external site.

In cases like these, the intruder tries only a few logins on each machine (usually with different account

names), which means that an IDS on each host may not flag the attack. Even if the behavior is recognized as



an attack on the individual host, current IDS’s are generally unable to correlate reports from multiple hosts;

thus they cannot recognize the doorknob attack as such. Because DIDS aggregates and correlates data from

multiple hosts and the network, it is in a position to recognize the doorknob attack by detecting the pattern of

repeated failed logins even though there may be too few on a single host to alert that host’s monitor.
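
The difference between the per-host view and the aggregated view can be illustrated with a small sketch. The thresholds, host names, and the tuple layout of a failed-login report are assumptions made for this example; they do not come from the DIDS prototype.

```python
from collections import defaultdict

# Hypothetical thresholds: a host monitor alerts at 5 failures on one host,
# while the director alerts when one source fails on 3 or more hosts.
PER_HOST_THRESHOLD = 5
HOSTS_THRESHOLD = 3


def host_alerts(failed_logins):
    """Per-host view: (source, target) pairs that exceed the local threshold."""
    counts = defaultdict(int)
    for source, target, _account in failed_logins:
        counts[(source, target)] += 1
    return {pair for pair, n in counts.items() if n >= PER_HOST_THRESHOLD}


def doorknob_suspects(failed_logins):
    """Director view: sources with failed logins on many distinct hosts."""
    targets = defaultdict(set)
    for source, target, _account in failed_logins:
        targets[source].add(target)
    return {src for src, hosts in targets.items() if len(hosts) >= HOSTS_THRESHOLD}


# Two guesses on each of four hosts, with different account names.
events = [
    ("uni-gw", host, account)
    for host in ("h1", "h2", "h3", "h4")
    for account in ("guest", "root")
]
print(host_alerts(events))        # no single host sees enough failures
print(doorknob_suspects(events))  # the aggregated view flags the source
```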

In another incident, our NSM recently observed an intruder gaining access to a computer using a guest

account which did not require a password. Once the attacker had access to the system, he exhibited behavior

which would have alerted most existing IDS’s (e.g., changing passwords and failed events). In an incident

such as this, DIDS would not only report the attack, but may also be able to identify the source of the attack.

That is, while most IDS’s would report the occurrence of an incident involving user "guest" on the target

machine, DIDS would also report that user "guest" was really, for example, user "smith" on the source

machine, assuming that the source machine was in the monitored domain. It may also be possible to go even

further back and identify all of the different user accounts in the "chain" to find the initial launching point of

the attack. Another possible scenario is what we call network browsing. This occurs when a (network) user is

looking through a number of files on several different computers within a short period of time. The browsing

activity level on any single host may not be high enough to raise any alarm by itself. However,

the network-wide, aggregated browsing activity level may be high enough to raise suspicion on this user.

Network browsing can be detected as follows. Each host monitor will report that a particular user is browsing

on that system, even if the corresponding degree of browsing is small. The expert system can then aggregate

such information from multiple hosts to determine that all of the browsing activity corresponds to the same

network user. This scenario presents a key challenge for DIDS: the tradeoffs between sending all audit

records to the director versus missing attacks because thresholds on each host are not exceeded. In addition to

the specific scenarios outlined above, there are a number of general ways that an intruder can use the

connectivity of the network to hide his trail and to enhance his effectiveness. Some of the attack

configurations which have been hypothesized include chain and parallel attacks. DIDS combats these

inherent vulnerabilities of the network by using the very same connectivity to help track and detect the

intruder. Note that DIDS should be at least as effective as host-based IDS’s (if we implement all of their

functionality in the DIDS host monitor), and at least as effective as the stand-alone NSM.



7.2 THE NETWORK-USER IDENTIFICATION (NID)

One of the most interesting challenges for intrusion detection in a networked environment is to track users

and objects (e.g., files) as they move across the network. For example, an intruder may use several different

accounts on different machines during the course of an attack. Correlating data from several independent

sources, including the network itself, can aid in recognizing this type of behavior and tracking an intruder to

their source. In a networked environment, an intruder may often choose to employ the interconnectivity of

the computers to hide his true identity and location. It may be that a single intruder uses multiple accounts to

launch an attack, and that the behavior can be recognized as suspicious only if one knows that all of the

activity emanates from a single source. For example, it is not particularly noteworthy if a user inquires about

who is using a particular computer (e.g., using the UNIX who or finger command). However, it may be

indicative of an attack if a user inquires about who is using each of the computers on a LAN and then

subsequently logs into one of the hosts. Detecting this type of behavior requires attributing multiple sessions,

perhaps with different account names, to a single source.

Fig. 7.1: Network User Identification (NID)



This problem is unique to the network environment and has not been dealt with before in this context. The

solution to the multiple user identity problem is to create a network-user identification (NID) the first time a

user enters the monitored environment, and then to apply that NID to any further instances of the user. All

evidence about the behavior of any instance of the user is then accountable to the single NID. In particular,

we must be able to determine that "smith@host1" is the same user as "jones@host2", if in fact they are. Since

the network-user identification problem involves the collection and evaluation of data from both the host and

LAN monitors, examining it is a useful method to understand the operation of DIDS. In the following

subsections we examine each of the components of DIDS in the context of the creation and use of the NID.
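
As a rough illustration of the idea, the sketch below assigns a NID when a session first appears and reuses it when a later session is known to originate from an already-identified session. The `NidTable` class, its method names, and the `origin` argument are hypothetical; how the origin of a session is actually established is the subject of the following subsections.

```python
import itertools


class NidTable:
    """Maps (user, host) sessions to a network-user identification (NID)."""

    def __init__(self):
        self._nid = {}
        self._next = itertools.count(1)

    def session_start(self, user, host, origin=None):
        """Register a session; `origin` is the (user, host) it was launched from."""
        if origin is not None and origin in self._nid:
            nid = self._nid[origin]      # same network user under a new account
        else:
            nid = self._nid.get((user, host))
            if nid is None:
                nid = next(self._next)   # first entry into the monitored domain
        self._nid[(user, host)] = nid
        return nid

    def same_user(self, a, b):
        """True if both sessions are known and carry the same NID."""
        return a in self._nid and self._nid[a] == self._nid.get(b)


table = NidTable()
table.session_start("smith", "host1")
table.session_start("jones", "host2", origin=("smith", "host1"))
table.session_start("brown", "host3")
print(table.same_user(("smith", "host1"), ("jones", "host2")))
print(table.same_user(("smith", "host1"), ("brown", "host3")))
```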

7.3 THE HOST MONITOR

The host monitor is currently installed on Sun SPARCstations running SunOS 4.0.x with the Sun C2

security package. Through the C2 security package, the operating system produces audit records for virtually

every transaction on the system. These transactions include file accesses, system calls, process executions,

and logins. The contents of the Sun C2 audit record are: record type, record event, time, real user ID, audit

user ID, effective user ID, real group ID, process ID, error code, return value, and label.

The host monitor examines each audit record to determine if it should be forwarded to the expert system for

further evaluation. Certain critical audit records are always passed directly to the expert system (i.e., notable

events); others are processed locally by the host monitor (i.e., profiles and attack signatures, which are

sequences of noteworthy events which indicate the symptoms of attacks) and only summary reports are sent

to the expert system.

Thus, one of the design objectives is to push as much of the processing operations down to the low-level

monitors as possible. In order to do this, the host event generator (HEG) creates a more abstract object called an event. The event

includes any significant data provided by the original audit record plus two new fields: the action and the

domain. The action and domain are abstractions which are used to minimize operating system dependencies

at higher levels. Actions characterize the dynamic aspect of the audit records. Domains characterize the

objects of the audit records. In most cases, the objects are files or devices and their domain is determined by

the characteristics of the object or its location in the file system. Since processes can also be objects of an

audit record, they are also assigned to domains, in this case by their function. The actions are: session start,



session end, read (a file or device), write (a file or device), execute (a process), terminate (a process), create

(a file or (virtual) device), delete (a file or (virtual) device), move (rename a file or device), change rights,

and change_user_id. The domains are: tagged, authentication, audit, network, system, sys_info, user_info,

utility, owned, and not_owned. The domains are prioritized so that an object is assigned to the first applicable

domain. Tagged objects are ones which are thought a priori to be particularly interesting in terms of

detecting intrusions. Any file, device, or process can be tagged (e.g., /etc/passwd). Authentication objects are

the processes and files which are used to provide access control on the system (e.g., the password file).

Similarly, audit objects relate to the accounting and security auditing processes and files. Network objects are

the processes and files not covered in the previous domains which relate to the use of the network. System

objects are primarily those which are concerned with the execution of the operating system itself, again

exclusive of those objects already assigned to previously considered domains. Sys_info and user_info objects

provide information about the system and about the users of the system, respectively. The utility objects are

the bulk of the programs run by the users (e.g., compilers and editors). In general, the execution of an object

in the utility domain is not interesting (except when the use is excessive), but the creation or modification of

one is. Owned objects are relative to the user. Not_owned objects are, by exclusion, every object not assigned

to a previous domain. They are also relative to a user; thus, files in the owned domain relative to "smith" are

in the not_owned domain relative to "jones".
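
The prioritized assignment can be sketched as a first-match lookup over the ordered domain list. The membership tables and file paths below are illustrative assumptions only; the source does not list the contents of each domain.

```python
# Domain names in priority order, as listed in the text.
DOMAINS = [
    "tagged", "authentication", "audit", "network", "system",
    "sys_info", "user_info", "utility", "owned", "not_owned",
]

# Hypothetical membership tables (paths are illustrative, not from the source).
MEMBERS = {
    "tagged": {"/etc/passwd"},
    "authentication": {"/etc/passwd", "/bin/login"},
    "network": {"/usr/ucb/rlogin", "/usr/ucb/telnet"},
    "user_info": {"/usr/ucb/finger", "/usr/bin/who"},
    "utility": {"/usr/bin/cc", "/usr/ucb/vi"},
}


def domain_of(obj, owner, user):
    """Return the first applicable domain for `obj`, relative to `user`."""
    for domain in DOMAINS[:-2]:
        if obj in MEMBERS.get(domain, ()):
            return domain
    # The last two domains depend on who is asking.
    return "owned" if owner == user else "not_owned"


print(domain_of("/etc/passwd", "root", "smith"))         # listed in two domains
print(domain_of("/home/smith/notes", "smith", "smith"))
print(domain_of("/home/smith/notes", "smith", "jones"))
```

Because the list is scanned in priority order, an object that belongs to both the tagged and the authentication tables is reported as tagged.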

All possible transactions fall into one of a finite number of events formed by the cross product of the

actions and the domains, and each event may also succeed or fail. Note that no distinction is made between

files, directories or devices, and that all of these are treated simply as objects. Not every action is applicable

to every object; for example, the terminate action is applicable only to processes. The choice of these

domains and actions is somewhat arbitrary in that one could easily suggest both finer and coarser grained

partitions. However, they capture most of the interesting behavior for intrusion detection and correspond

reasonably well with what other researchers in this field have found to be of interest. By mapping an infinite

number of transactions to a finite number of events, we not only remove operating system dependencies, but

also restrict the number of permutations that the expert system will have to deal with. The concept of the

domain is one of the keys to detecting abuses. Using the domain allows us to make assertions about the

nature of a user’s behavior in a straightforward and systematic way. Although we lose some details provided

by the raw audit information, that is more than made up for by the increase in portability, speed, simplicity,

and generality. An event reported by a host monitor is called a host audit record (har). The record syntax is:



har(Monitor-ID, Host-ID, Audit-UID, Real-UID, Effective-UID, Time, Domain, Action, Transaction, Object,

Parent Process, PID, Return Value, Error Code).

Of all the possible events, only a subset are forwarded to the expert system. For the creation and application

of the NID, it is the events which relate to the creation of user sessions or to a change in an account that are

important. These include all the events with session_start actions, as well as ones with an execute action

applied to the network domain. These latter events capture such transactions as executing the rlogin, telnet,

rsh, and rexec UNIX programs. The HEG consults external tables, which are built by hand, to determine

which events should be forwarded to the expert system. Because they relate to events rather than to the audit

records themselves, the tables and the modules of the HEG which use them are portable across operating

systems. The only portion of the HEG which is operating system dependent is the module which creates the

events.
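
A minimal sketch of the record and of the table-driven forwarding decision is given below. The field names follow the har syntax above; the Python representation, the contents of the `FORWARD` table, and the sample values (including the transaction strings) are assumptions made for illustration.

```python
from typing import NamedTuple


class Har(NamedTuple):
    """Host audit record, with fields in the order of the har syntax."""
    monitor_id: str
    host_id: str
    audit_uid: int
    real_uid: int
    effective_uid: int
    time: float
    domain: str
    action: str
    transaction: str
    obj: str
    parent_process: int
    pid: int
    return_value: int
    error_code: int


# Hand-built table of (action, domain) pairs to forward; None matches any domain.
FORWARD = {("session_start", None), ("execute", "network")}


def should_forward(event: Har) -> bool:
    """Decide from the event alone, not from the raw audit record."""
    return (event.action, None) in FORWARD or (event.action, event.domain) in FORWARD


rlogin = Har("hm1", "host1", 101, 101, 101, 0.0, "network", "execute",
             "exec", "/usr/ucb/rlogin", 200, 201, 0, 0)
editor = Har("hm1", "host1", 101, 101, 101, 1.0, "utility", "execute",
             "exec", "/usr/ucb/vi", 200, 202, 0, 0)
login = Har("hm1", "host1", 101, 101, 101, 2.0, "authentication", "session_start",
            "login", "/bin/login", 1, 203, 0, 0)
print([should_forward(e) for e in (rlogin, editor, login)])
```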

Fig. 7.2: DIDS target environment



Fig. 7.2 shows a generalized DIDS target environment. The DIDS architecture combines distributed

monitoring and data reduction with centralized data analysis. The DIDS architecture consists of a DIDS director, a

single host monitor per host, and a single LAN monitor for each broadcast LAN segment in the network

which is monitored. In DIDS, the host and LAN monitors report events, which possibly lead to intrusive

activity, to a centrally-located DIDS director. The director employs an expert system to detect the possible

intrusion attacks. This architecture provides accountability by tying users to their actions. The host

and LAN monitors are responsible for the collection of evidence of suspicious activity and the DIDS director is

responsible for its evaluation. Reports are sent independently and asynchronously from the host and LAN

monitors to the DIDS director through a communications architecture shown in Fig. 7.2. The high-level

communication protocols between the components are based on the Common Management Information Protocol

(CMIP) recommendations. The architecture provides a bidirectional communication between the DIDS

director and any monitor in the configuration.

7.4 THE LAN MONITOR

The LAN monitor is currently a subset of UC Davis’ Network Security Monitor. The LAN monitor builds its

own "LAN audit trail". The LAN monitor observes each and every packet on its segment of the LAN and,

from these packets, it is able to construct higher-level objects such as connections (logical circuits), and

service requests using the TCP/IP or UDP/IP protocols. In particular, it audits host-to-host connections,

services used, and volume of traffic per connection.
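
The step from individual packets to per-connection records can be sketched as a simple aggregation. The packet tuple layout and the `SERVICES` table are assumptions for this example, and only the client-to-server direction is counted, which is a simplification.

```python
from collections import defaultdict

# Well-known destination ports for services named in this chapter.
SERVICES = {23: "telnet", 513: "rlogin", 79: "finger", 25: "mail"}


def build_connections(packets):
    """Group (src, dst, dst_port, nbytes) packets into per-connection volumes."""
    volume = defaultdict(int)
    for src, dst, dst_port, nbytes in packets:
        service = SERVICES.get(dst_port, f"port-{dst_port}")
        volume[(src, dst, service)] += nbytes
    return dict(volume)


packets = [
    ("host1", "host2", 23, 60),
    ("host1", "host2", 23, 120),
    ("host3", "host2", 513, 40),
]
for connection, nbytes in build_connections(packets).items():
    print(connection, nbytes)
```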

Similar to the host monitor, the LAN monitor uses several simple analysis techniques to identify significant

events. The events include the use of certain services (e.g., rlogin and telnet) as well as activity by certain

classes of hosts (e.g., a PC without a host monitor). The LAN monitor also uses and maintains profiles of

expected network behavior. The profiles consist of expected data paths (e.g., which systems are expected to

establish communication paths to which other systems, and by which service) and service profiles (e.g., what

a typical telnet, mail, or finger is expected to look like). The LAN monitor also uses heuristics in an attempt

to identify the likelihood that a particular connection represents intrusive behavior. These heuristics consider

the capabilities of each of the network services, the level of authentication required for each of the services,

the security level for each machine on the network, and signatures of past attacks. The abnormality of a

connection is based on the probability of that particular connection occurring and the behavior of the



connection itself. Upon request, the LAN monitor is also able to provide a more detailed examination of any

connection, including capturing every character crossing the network (i.e., a wire-tap). This capability can be

used to support a directed investigation of a particular subject or object. Like the host monitor, the LAN

monitor forwards relevant security information to the director through its LAN agent. An event reported by a

LAN monitor is called a network audit record (nar). The record syntax is: nar(Monitor-ID, Source_Host,

Dest_Host, Time, Service, Domain, Status). A large amount of low level filtering and some analysis is

performed by the host monitor to minimize the use of network bandwidth in passing evidence to the director.

The LAN monitor has several responsibilities with respect to the creation and use of the NID. The LAN

monitor is responsible for detecting any connections related to rlogin and telnet sessions. Once these

connections are detected, the LAN monitor can be used to verify the owner of a connection. The LAN

monitor can also be used to help track tagged objects moving across the network. The SSO can also ask for a

wire-tap on a certain network connection to monitor a particular user’s behavior.



CHAPTER 8

DISTRIBUTED INTRUSION DETECTION SYSTEM ARCHITECTURE

The DIDS architecture combines distributed monitoring and data reduction with centralized data analysis.

This approach is unique among current IDS’s. The components of DIDS are the DIDS director, a single host

monitor per host, and a single LAN monitor for each broadcast LAN segment in the monitored network.

DIDS can potentially handle hosts without monitors since the LAN monitor can report on the network

activities of such hosts. The host and LAN monitors are primarily responsible for the collection of evidence

of unauthorized or suspicious activity, while the DIDS director is primarily responsible for its evaluation.

Reports are sent independently and asynchronously from the host and LAN monitors to the DIDS director

through a communications infrastructure. High level communication protocols between the components are

based on the ISO Common Management Information Protocol (CMIP) recommendations, allowing for future

inclusion of CMIP management tools as they become useful. The architecture also provides for bidirectional

communication between the DIDS director and any monitor in the configuration. This communication

consists primarily of notable events and anomaly reports from the monitors. The director can also make

requests for more detailed information from the distributed monitors via a "GET" directive, and issue

commands to have the distributed monitors modify their monitoring capabilities via a "SET" directive.
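
The two directives can be pictured with the following sketch. It only illustrates the request pattern: the `Monitor` class, the `handle` method, and the dictionary-based event and setting layout are hypothetical and do not reproduce the CMIP encoding or the actual DIDS message format.

```python
class Monitor:
    """A host or LAN monitor that answers director directives."""

    def __init__(self, monitor_id):
        self.monitor_id = monitor_id
        self.settings = {"wire_tap": False}
        self.events = []

    def handle(self, directive, key, value=None):
        if directive == "GET":
            # Return more detailed information about one subject.
            return [e for e in self.events if e["subject"] == key]
        if directive == "SET":
            # Modify what the monitor watches.
            self.settings[key] = value
            return self.settings[key]
        raise ValueError(f"unknown directive: {directive}")


lan = Monitor("lan1")
lan.events.append({"subject": "smith", "detail": "telnet host1 -> host2"})
lan.events.append({"subject": "jones", "detail": "rlogin host2 -> host3"})
print(lan.handle("GET", "smith"))
print(lan.handle("SET", "wire_tap", True))
```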

Like the host monitor, the LAN monitor consists of a LAN event generator (LEG) and a LAN agent. The

LEG is currently a subset of UC Davis’ NSM. Its main responsibility is to observe all of the traffic on its

segment of the LAN to monitor host-to-host connections, services used, and volume of traffic. The LAN

monitor reports on such network activity as rlogin and telnet connections, the use of security-related services,

and changes in network traffic patterns.

The DIDS director consists of three major components that are all located on the same dedicated

workstation. Because the components are logically independent processes, they could be distributed as well.

The communications manager is responsible for the transfer of data between the director and each of the host

and the LAN monitors. It accepts the notable event records from each of the host and LAN monitors and

sends them to the expert system. On behalf of the expert system or user interface, it is also able to send the

requests to the host and LAN monitors for more information regarding a particular subject.



The expert system is responsible for evaluating and reporting on the security state of the monitored system.

It receives the reports from the host and the LAN monitors, and, based on these reports, it makes inferences

about the security of each individual host, as well as the system as a whole. The expert system is a rule-based

system with simple learning capabilities. The director’s user interface allows the System Security Officer

(SSO) interactive access to the entire system. The SSO is able to watch activities on each host, watch

network traffic (by setting "wire-taps"), and request more specific types of information from the monitors.

8.1 COMMUNICATION ARCHITECTURE

We anticipate that a growing set of tools, including incident-handling tools and network-management tools,

will be used in conjunction with the intrusion-detection functions of DIDS. This will give the SSO the ability

to actively respond to attacks against the system in real-time. Incident-handling tools may consist of possible

courses of action to take against an attacker, such as cutting off network access, a directed investigation of a

particular user, removal of system access, etc. Network-management tools that are able to perform network

mapping would also be useful.

Fig. 8.1: Communication Architecture



The architecture provides bidirectional communication between the DIDS director and any monitor in the

configuration. This communication consists of notable events and anomaly reports. The director can also make

requests for more detailed information from the distributed monitors.

The host monitor consists of a host event generator (HEG) and a host agent. The HEG collects and analyzes

audit records from the host operating system, scanning them for notable events. The notable events are then

sent to the director for further analysis. The LAN monitor consists of a LAN event generator (LEG) and a

LAN agent. The LEG is a subset of the NSM and is responsible for observing all the traffic on its segment of

the LAN, in order to monitor host-to-host connections, services used, and volume

of traffic.

The DIDS director consists of three major components, namely a communication manager, an expert

system, and a user interface. The communication manager transfers data between the director and the monitors: it

accepts the notable event records from each host and LAN monitor and sends them to the expert system. It

also sends requests to the host and LAN monitors for information regarding a particular user.

The expert system is responsible for evaluating and reporting the security state of the monitored system. It

receives the reports from the host and LAN monitors and makes inferences about the security of each

individual host. The expert system has simple learning capabilities.

8.2 A STANDARD NETWORK INTRUSION DETECTION ARCHITECTURE

Fig. 8.2 shows a traditional sensor-based network intrusion detection architecture. A

sensor is used to “sniff” packets off of the network where they are fed into a detection

engine which will set off an alarm if any misuse is detected. These sensors are

distributed to various mission-critical segments of the network. A central console is

used to collect the alarms from multiple sensors. However, in order to better

understand the traditional sensor-based architecture, the lifecycle of a network packet

should be examined.

Fig. 8.2: A Standard Network Intrusion Detection Architecture

The network packet is created when one computer communicates with

another.

The packet is read, in real time, off the network through a sensor that

resides on a network segment located somewhere between the two

communicating computers. The sensor is usually a stand-alone machine or

network device.

A sensor-resident detection engine is used to identify predefined patterns

of misuse. If a pattern is detected, an alert is generated.

The security officer is notified about the misuse. This can be done through

a variety of methods including audible, visual, pager, email, or

any other method.


A response to the misuse is generated. The response subsystem matches

alerts to predefined responses or can take responses from the security

officer.

The alert is stored for correlation and review at a later time.

Reports are generated that summarize the alert activity.

Data forensics is used to detect long-term trends. Some systems allow

archiving of the original traffic to replay sessions.
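
The lifecycle listed above can be condensed into a short sketch in which one function plays the sensor and a small object plays the central console. The pattern table, the response names, and the packet dictionary layout are hypothetical choices for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Console:
    alerts: list = field(default_factory=list)     # stored for later correlation
    notices: list = field(default_factory=list)    # messages to the security officer
    responses: list = field(default_factory=list)  # predefined responses taken

    def report(self):
        """Summarize alert activity as a count per misuse pattern."""
        counts = {}
        for alert in self.alerts:
            counts[alert["pattern"]] = counts.get(alert["pattern"], 0) + 1
        return counts


# Hypothetical misuse patterns mapped to predefined responses.
PATTERNS = {"/etc/passwd": "log", "rm -rf /": "block-source"}


def sensor(packet, console):
    """Run one sniffed packet through detection, notification, response, storage."""
    for pattern, response in PATTERNS.items():
        if pattern in packet["payload"]:
            alert = {"src": packet["src"], "dst": packet["dst"], "pattern": pattern}
            console.notices.append(f"misuse from {alert['src']}: {pattern}")
            console.responses.append((response, alert["src"]))
            console.alerts.append(alert)


console = Console()
traffic = [
    {"src": "10.0.0.7", "dst": "web01", "payload": "GET /../../etc/passwd"},
    {"src": "10.0.0.8", "dst": "web01", "payload": "GET /index.html"},
    {"src": "10.0.0.7", "dst": "web02", "payload": "cat /etc/passwd"},
]
for packet in traffic:
    sensor(packet, console)
print(console.report())
```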

A few years ago all commercial network intrusion detection systems used

promiscuous-mode sensors. However, these technologies were subject to packet loss

on high speed networks. A new architecture for network intrusion detection was

created that dealt with the performance problem on high speed networks by

distributing sensors to every computer on the network. In network-node intrusion

detection each sensor is concerned only with packets directed at the target in which

the sensor resides. The sensors then communicate with each other and the main

console to aggregate and correlate alarms.

However, this network-node architecture has added to the confusion over the

difference between network and host-based intrusion detection. A network sensor that

is running on a host machine does not make it a host-based sensor. Network packets

directed to a host and sniffed at a host are still considered network intrusion detection.

8.3 DISTRIBUTED HOST RESIDENT INTRUSION DETECTION

Fig. 8.3 represents the network-node intrusion detection architecture. An agent is used to read packets off

the TCP/IP stack layer where the packets have been reassembled. The packet is then fed into the detection

engine located on the target machine. Network node agents communicate with each other on the network to

correlate alarms at the console.

A network packet is created.

The packet is read in real-time off the network through a sensor resident on the destination

machine.


A detection engine is used to identify pre-defined patterns of misuse. If a pattern is detected,

an alert is generated and forwarded to a central console or to other sensors in the network.

The security officer is notified.

A response is generated.

The alert is stored for later review and correlation.

Reports are generated summarizing alert activity.

Data forensics is then used to look for long-term trends.

Fig. 8.3: A Distributed Host Resident Intrusion Detection Architecture

These architectures can be run in different operational modes. Operational modes describe the manner in

which the intrusion detection system will operate and partially describe the end goals of monitoring. There

are two primary operational modes to use network-based intrusion detection: tip-off and surveillance. In

tip-off mode, the system is used to detect misuse as it is happening. This is the traditional context for intrusion detection

systems. By observing patterns of behavior, suspicious behavior can be detected to “tip off” the officer that

misuse may be occurring. The defining characteristic for tip-off is that the system is detecting patterns that

have not been detected before. During surveillance, targets are observed more closely for patterns of misuse.

Surveillance is characterized by an increased observance of the behavior of a small set of subjects. Unlike

tip-off, surveillance takes place when misuse has already been suspected. Surveillance results from a tip-off

from either the intrusion detection system or another indicator.

In order for there to be a tip-off a data source needs to be searched for suspicious behavior. Host-based

intrusion detection systems analyze data that originates on computers, such as application and operating

system event logs and file attributes. Host data sources are numerous and varied, including operating system

event logs, such as kernel logs, and application logs such as syslog. These host event logs contain

information about file accesses and program executions associated with inside users. If protected correctly,

event logs may be entered into court to support the prosecution of computer criminals.

There are many attack scenarios that host-based intrusion detection guards against. One of these scenarios is

the abuse of privilege attack scenario. That is when a user has root, administrative or some other privilege

and uses it in an unauthorized manner. Another scenario involves contractors with elevated privileges. This

usually happens when an administrator gives a contractor elevated privileges to install an application. Most

security policies restrict nonemployees from having root or administrator privileges; however, it might be

easier to elevate the user and reduce privileges later, and the administrator might then forget to remove the

privileges. A third attack scenario involves ex-employees utilizing their old accounts. Most organizations

have policies in place to delete or disable accounts when individuals leave.

However, accounts take time to delete or disable, leaving a window for a user to log back in. Another scenario

involves modifying web site data. There have been many cases, against government agencies in particular,

that result in uncomplimentary remarks posted on web sites. While these attacks originate from outside the

network, they are perpetrated on the machine itself through alteration of data. Having reviewed the attacks that

host-based intrusion detection systems prevent, it is important to examine the architecture to see how it

prevents those attacks. In the centralized architecture, data is forwarded to an analysis engine running

independently from the target. Fig. 8.4 represents the typical life cycle of an event record running through

this type of architecture, and Fig. 8.5 represents a distributed real-time host-based intrusion detection

architecture. The difference between the two is that in Fig. 8.4 the raw data is forwarded to a central location

before it is analyzed and, in Fig. 8.5, the raw data is analyzed in real time on the target first and then only

alerts are sent to the command console. There are advantages and disadvantages to each method. However,

the best systems offer both types of processing.

8.4 A CENTRALIZED HOST-BASED INTRUSION DETECTION

Fig. 8.4: A Centralized Host-Based Intrusion Detection Architecture



An event record is created. This occurs when an action happens, such as a

file being opened or a program being executed (for example, a text editor like Microsoft

Word). The record is written into a file that is usually protected by the

operating system trusted computing base.

The target agent transmits the file to the command console. This happens

at predetermined time intervals over a secure connection.

The detection engine, configured to match patterns of misuse, processes

the file.

A log is created that becomes the data archive for all the raw data that will

be used in prosecution.

An alert is generated. When a predefined pattern is recognized, such as

access to a mission critical file, an alert is forwarded to a number of

various subsystems for notification, response, and storage.

The security officer is notified.

A response is generated. The response subsystem matches alerts to

predefined responses or can take response commands from the security

officer. Responses include reconfiguring the system, shutting down a

target, logging off a user, or disabling an account.

The alert is stored. The storage is usually in the form of a database. Some

systems store statistical data as well as alerts.

The raw data is transferred to a raw data archive. This archive is cleared

periodically to reduce the amount of disk space used.

Reports are generated. Reports can be a summary of the alert activity.

Data forensics is used to locate long-term trends and behavior is analyzed

using both the stored data in the database and the raw event log archive.

The lifecycle of an event record through a distributed real-time architecture is similar, except that the record is discarded after the target detection engine analyzes it. The advantage of this approach is that everything happens in real time. The disadvantage is that the end users suffer from system performance degradation.

8.5 A DISTRIBUTED REAL-TIME HOST-BASED INTRUSION DETECTION ARCHITECTURE

Fig. 8.5: A Distributed Real-Time Host-Based Intrusion Detection Architecture

An event record is created.

The file is read in real time and processed by a target-resident detection engine.

The security officer is notified. Some systems notify directly from the target, while others notify from a central console.

A response is generated. The response may be generated from the target or the console.

An alert is generated and then sent to a central console.

The alert is stored. Statistical behavioral data outside alert data are not usually available in this architecture.

Data forensics is used to search for long-term trends. However, because there is no raw data archive and no statistical data, this capacity is limited.

Reports are generated.
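The key difference from the centralized case, that records are analyzed on the target and only alerts travel to the console, can be illustrated with a short sketch. The function and class names and the example pattern are assumptions made for this illustration.

```python
def detect_on_target(record, patterns):
    """Target-resident engine: analyze one record immediately.

    Returns an alert dict when the record matches a pattern, otherwise None.
    The record itself is not retained after analysis.
    """
    action, target = record
    if (action, target) in patterns:
        return {"action": action, "target": target}
    return None


class CentralConsole:
    """Receives and stores alerts only; it never sees raw event records."""

    def __init__(self):
        self.alerts = []

    def receive_alert(self, alert):
        self.alerts.append(alert)


patterns = {("open", "/etc/shadow")}          # hypothetical misuse pattern
console = CentralConsole()
stream = [("open", "/tmp/a.txt"), ("open", "/etc/shadow"), ("exec", "/bin/ls")]

for record in stream:
    alert = detect_on_target(record, patterns)
    if alert is not None:
        console.receive_alert(alert)
```

Because nothing but alerts is stored, later forensic analysis in this sketch has only the alert list to work with, which mirrors the limitation noted above.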



CHAPTER 9

THE EXPERT SYSTEM

DIDS utilizes a rule-based (or production) expert system. The expert system is currently written in Prolog,

and much of the form of the rule base comes from Prolog and the logic notation that Prolog implies. The

expert system uses rules derived from the hierarchical Intrusion Detection Model (IDM). The IDM describes

the data abstractions used in inferring an attack on a network of computers. That is, it describes the

transformation from the distributed raw audit data to high level hypotheses about intrusions and about the

overall security of the monitored environment. In abstracting and correlating data from the distributed

sources, the model builds a virtual machine which consists of all the connected hosts as well as the network

itself. This unified view of the distributed system simplifies the recognition of intrusive behavior which spans

individual hosts. The model is also applicable to the trivial network of a single computer.

The model is the basis of the rule base. It serves both as a description of the function of the rule base, and as

a touchstone for the actual development of the rules. The IDM consists of six layers, each layer representing

the result of a transformation performed on the data (see Table 9.1).

The objects at the first level of the model are the audit records provided by the host operating system, by

the LAN monitor, or by a third party auditing package. The objects at this level are both syntactically and

semantically dependent on the source. At this level, all of the activity on the host or LAN is represented.

At the second level, the event (which has already been discussed in the context of the host and LAN

monitor) is both syntactically and semantically independent of the source; it provides a standard format for events.

The third layer of the IDM creates a subject. This introduces a single identification for a user across many

hosts on the network. It is the subject who is identified by the NID. Upper layers of the model treat the

network-user as a single entity, essentially ignoring the local identification on each host. Similarly, above this

level, the collection of hosts on the LAN is generally treated as a single distributed system with little

attention being paid to the individual hosts.
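One way to picture the subject layer is a directory that hands out a network-user ID (NID) at a user's first login and carries it along when that user moves to another host. The sketch below is a simplification under stated assumptions: the class NidDirectory, its methods, and the keying of sessions by (host, local user) are hypothetical and are not taken from the DIDS implementation.

```python
import itertools


class NidDirectory:
    """Assigns one NID per network user and carries it across hosts."""

    def __init__(self):
        self._next = itertools.count(1)
        self._nid_of = {}  # (host, local_user) -> NID

    def initial_login(self, host, local_user):
        """A login that does not come from a tracked session gets a fresh NID."""
        nid = next(self._next)
        self._nid_of[(host, local_user)] = nid
        return nid

    def remote_login(self, src, dst):
        """A login from tracked session src to dst inherits the NID of src.

        Both src and dst are (host, local_user) pairs.
        """
        nid = self._nid_of[src]
        self._nid_of[dst] = nid
        return nid

    def nid(self, host, local_user):
        """Look up the NID for a local identity, or None if it is not tracked."""
        return self._nid_of.get((host, local_user))


directory = NidDirectory()
nid_a = directory.initial_login("hostA", "alice")
directory.remote_login(("hostA", "alice"), ("hostB", "guest"))
nid_c = directory.initial_login("hostC", "carol")
```

Here the account guest on hostB resolves to the same NID as alice on hostA, so upper layers can treat both local identities as one network user.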



Table 9.1: Intrusion Detection Model

The fourth layer of the model introduces the event in context. There are two kinds of context: temporal and

spatial. As an example of temporal context, behavior which is unremarkable during standard working hours

may be highly suspicious during off hours. The IDM, therefore, allows for the application of information

about wall-clock time to the events it is considering. Wall-clock time refers to information about the time of

day, weekdays versus weekends and holidays, as well as periods when an increase in activity is expected. In

addition to the consideration of external temporal context, the expert system uses time windows to correlate

events occurring in temporal proximity. This notion of temporal proximity implements the heuristic that a

call to the UNIX who command followed closely by a login or logout is more likely to be related to an

intrusion than either of those events occurring alone. Spatial context implies the relative importance of the

source of events. That is, events related to a particular user, or events from a particular host, may be more

likely to represent an intrusion than similar events from a different source. For instance, a user moving from

a low-security machine to a high-security machine may be of greater concern than a user moving in the

opposite direction. The model also allows for the correlation of multiple events from the same user or source.

In both of these cases, multiple events are more noteworthy when they have a common element than when

they do not.
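The two kinds of temporal context described above can be sketched as follows. The working hours, the 60-second window, and the event tuple layout are assumptions chosen for illustration; the source does not specify them.

```python
from datetime import datetime, timedelta

WORK_START, WORK_END = 8, 18       # assumed working hours (local time)
WINDOW = timedelta(seconds=60)     # assumed correlation window


def off_hours(ts):
    """True on weekends or outside the assumed working hours."""
    return ts.weekday() >= 5 or not (WORK_START <= ts.hour < WORK_END)


def who_then_session_change(events):
    """Pair each `who` event with the first login/logout within WINDOW.

    `events` is a list of (timestamp, user, name) tuples sorted by timestamp.
    The user field is carried along but not used for matching here.
    """
    pairs = []
    for i, event in enumerate(events):
        ts, _user, name = event
        if name != "who":
            continue
        for later in events[i + 1:]:
            if later[0] - ts > WINDOW:
                break                      # outside the time window
            if later[2] in ("login", "logout"):
                pairs.append((event, later))
                break
    return pairs


t0 = datetime(2024, 1, 3, 23, 0, 0)        # a Wednesday, 23:00
events = [
    (t0, "u1", "who"),
    (t0 + timedelta(seconds=20), "u1", "login"),
    (t0 + timedelta(minutes=5), "u1", "who"),
    (t0 + timedelta(minutes=10), "u1", "logout"),
]
pairs = who_then_session_change(events)
```

Only the first `who` is paired, because the second is followed by a logout five minutes later, well outside the assumed window.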


The fifth layer of the model considers the threats to the network and the hosts connected to it. Events in

context are combined to create threats. The threats are partitioned by the nature of the abuse and the nature of

the target. In other words, what is the intruder doing, and what is he doing it to? Abuses are divided into

attacks, misuses, and suspicious acts. Attacks represent abuses in which the state of the machine is changed.

That is, the file system or process state is different after the attack than it was prior to the attack. Misuses

represent out-of-policy behavior in which the state of the machine is not affected.

Suspicious acts are events which, while not a violation of policy, are of interest to the IDS. For example,

commands which provide information about the state of the system may be suspicious. The targets of abuse

are characterized as being either system objects or user objects and as being either passive or active. User

objects are owned by non-privileged users and/or reside within a non-privileged user’s directory hierarchy.

System objects are the complement of user objects. Passive objects are files, including executable binaries,

while active objects are essentially running processes.
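The partition of threats by nature of abuse and nature of target can be written down as two small classification functions. The boolean inputs are assumptions about what the lower layers would supply; the categories themselves follow the text.

```python
from enum import Enum


class Abuse(Enum):
    ATTACK = "attack"
    MISUSE = "misuse"
    SUSPICIOUS = "suspicious act"


def classify_abuse(violates_policy, changes_state):
    """Classify an event of interest by the nature of the abuse."""
    if not violates_policy:
        return Abuse.SUSPICIOUS   # of interest, but within policy
    if changes_state:
        return Abuse.ATTACK       # file system or process state is altered
    return Abuse.MISUSE           # out of policy, machine state unaffected


def classify_target(owner_privileged, in_user_directory, is_running_process):
    """Classify a target as (user|system, passive|active)."""
    is_user_object = (not owner_privileged) or in_user_directory
    ownership = "user" if is_user_object else "system"
    activity = "active" if is_running_process else "passive"
    return ownership, activity
```

For example, a privileged-owned binary outside any user directory is a passive system object, while a process started by an ordinary user is an active user object.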

At the highest level, the model produces a numeric value between 1 and 100 which represents the overall

security state of the network. The higher the number, the less secure the network. This value is a function of

all the threats for all the subjects on the system. Here again we treat the collection of hosts as a single

distributed system. Although representing the security level of the system as a single value seems to imply

some loss of information, it provides a quick reference point for the SSO. In fact, in the current

implementation, no information is lost since the expert system maintains all the evidence used in calculating

the security state in its internal database, and the SSO has access to that database.
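The source does not give the function that maps threats to the security state, so the following is purely an illustrative assumption: each threat carries a hypothetical weight between 0 and 1, the weights are combined with a noisy-OR, and the result is scaled onto the range 1 to 100. The evidence list is kept alongside the score, echoing the point that no information needs to be lost.

```python
def security_state(threat_weights):
    """Map threat weights in [0, 1] to a score from 1 (secure) to 100 (insecure).

    Assumed combination rule (not from the source): noisy-OR of the weights.
    """
    no_threat = 1.0
    for w in threat_weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError("threat weights must lie in [0, 1]")
        no_threat *= (1.0 - w)
    combined = 1.0 - no_threat
    return 1 + round(99 * combined)


# The evidence stays available next to the summary value.
evidence = {"nid-1": [0.5, 0.5], "nid-2": []}
all_weights = [w for weights in evidence.values() for w in weights]
score = security_state(all_weights)
```

Any monotone function with the same range would serve the same purpose; the choice here only guarantees that adding a threat never lowers the score.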

In the context of the network-user identification problem we are concerned primarily with the lowest three

levels of the model: the audit data, the event, and the subject. The generation of the first two of these has

already been discussed; thus, the creation of the subject is the focus of the following subsection.

The expert system is responsible for applying the rules to the evidence provided by the monitors. In

general, the rules do not change during the execution of the expert system. What does change is a numerical

value associated with each rule. This Rule Value (RV) represents our confidence that the rule is useful in

detecting intrusions. These rule values are manipulated using a negative reinforcement training method

which allows the expert system to continually lower the number of false attack reports. When a potential


attack is reported by the expert system, the SSO determines the validity of the report and gives feedback to

the expert system. If the report was deemed faulty, then the expert system lowers the RVs associated with

the rules that were used to draw that conclusion. In addition to this directed training, which may lower some

rule values, the system also automatically increases the RVs of all the rules on a regular basis. This recovery

algorithm allows the system to adapt to changes in the environment as well as recover from faulty training.
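The interplay of negative reinforcement and periodic recovery can be sketched as below. The integer scale, the halving penalty, the recovery step of 5, and the ceiling of 100 are all assumptions; the source states only that RVs are lowered after a false report and raised for all rules on a regular basis.

```python
class RuleValues:
    """Tracks a Rule Value (RV) per rule number on an assumed 0..100 scale."""

    def __init__(self, initial, recovery_step=5, ceiling=100):
        self.rv = dict(initial)
        self.recovery_step = recovery_step
        self.ceiling = ceiling

    def false_alarm(self, rule_numbers):
        """SSO feedback: halve the RV of every rule that led to a faulty report."""
        for n in rule_numbers:
            self.rv[n] //= 2

    def recover(self):
        """Periodic recovery: raise every RV by a fixed step, up to the ceiling."""
        for n in self.rv:
            self.rv[n] = min(self.ceiling, self.rv[n] + self.recovery_step)


values = RuleValues({1: 80, 2: 80, 3: 98})
values.false_alarm([1])   # rule 1 contributed to a report judged faulty
values.recover()          # scheduled recovery pass over all rules
```

After these two calls rule 1 sits well below rule 2, but repeated recovery passes would bring it back if no further false alarms are attributed to it.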

Logically the rules have the form:

antecedent => consequence

where the antecedent is either a fact reported by one of the distributed monitors, or a consequence of some

previously satisfied rule. The antecedent may also be a conjunction of these. The overall structure of the rule

base is a tree rooted at the top. Thus, many facts at the bottom of the tree will lead to a few conclusions at the

top of the tree.

The expert system shell consists of approximately a hundred lines of Prolog source code. The shell is

responsible for reading new facts reported by the distributed monitors, attempting to apply the rules to the

facts and hypotheses in the Prolog database, reporting suspected intrusions, and maintaining the various

dynamic values associated with the rules and hypotheses. The syntax for rules is:

rule(n,r,(single,[A]),(C)).

where n is the rule number, r is the initial RV, A is the single antecedent, and C is the consequence.

Conjunctive rules have the form:

rule(n,r,(and,[A1,A2,A3]),(C)).

where A1,A2,A3 are the antecedents and C is the consequence. Disjunctive rules are not allowed; that

situation is dealt with by having multiple rules with the same consequence.
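To show how rules of this shape drive inference from facts at the bottom of the tree to conclusions at the top, here is a small forward-chaining sketch. It is written in Python rather than Prolog and is not the DIDS shell; the tuple layout mirrors rule(n, r, (kind, antecedents), consequence), and the fact and hypothesis names are invented for the example.

```python
def forward_chain(rules, facts):
    """Fire rules until nothing new can be derived.

    Each rule is (number, rv, (kind, antecedents), consequence), where kind is
    "single" or "and". Returns (known, fired): all facts and hypotheses, and
    the rule numbers in the order they fired.
    """
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for number, _rv, (kind, antecedents), consequence in rules:
            if kind not in ("single", "and"):
                raise ValueError("disjunctive rules are not allowed")
            if number not in fired and all(a in known for a in antecedents):
                known.add(consequence)
                fired.append(number)
                changed = True
    return known, fired


rules = [
    (1, 50, ("single", ["failed_login"]), "suspicious_act"),
    (2, 70, ("and", ["suspicious_act", "off_hours"]), "threat"),
    # A second rule with the same consequence stands in for a disjunction.
    (3, 60, ("single", ["password_file_read"]), "threat"),
]
known, fired = forward_chain(rules, {"failed_login", "off_hours"})
```

Recording which rules fired is what would let a training step adjust exactly the RVs of the rules behind a given conclusion.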

9.1 ADVANTAGES

The distributed Intrusion Detection Model based on Protocol analysis has the following advantages:

The system structure is simple.

The system consists of three modules: the Detect Module, the Process Module and the Response Module. Data passes between the modules without many intermediate layers, which raises the transfer rate between them. This is a considerable advantage for intrusion detection on networks that carry large volumes of data. When traffic is heavy, the miss rate of a typical intrusion detection system rises sharply, and an attacker can exploit this by flooding the network with junk packets. If the detection or processing stage is delayed, or if matching the incoming data against the rule base takes too long, a large share of the traffic goes uninspected, and the attacker can hide intrusion packets among the junk packets so that they slip through. This model uses a high-speed link, which greatly improves the data transmission speed.

Detection is fast.

The Detect Module extracts only the important characteristics of each packet and passes them to the Process Module. These characteristics are often only a small percentage of the length of the full packet, which saves resources in the detection part and raises the rate at which packet characteristics can be transmitted per unit time. Because the rule base of the central part is built from the same kind of characteristics of intrusion data, resources of the Process Module are saved as well. The characteristic strings are short, so matching is much faster; even when a lot of data must be processed at the same time, the system can complete the matching, detect intrusions promptly, and improve the detection rate.

9.2 COMPARISON OF DIFFERENT ARCHITECTURES

Table 9.2 summarizes the advantages and disadvantages of a centralized detection

architecture. There is little performance impact on the target machine because all

the analysis happens elsewhere. Multi-host signatures are possible because the

centralized engine has access to data from all targets. Finally, the centralized raw data

can be used for prosecution provided the integrity of the data is preserved.



Table 9.2: Advantages and Disadvantages of a Centralized Detection Architecture

Table 9.3 illustrates the advantages and disadvantages of a real-time distributed

intrusion detection system. This table is a mirror image of Table 9.2 with a few minor

additions.

Table 9.3: Advantages and Disadvantages of a Distributed Real-Time Architecture

Host-based and network-based systems are both required because they provide

significantly different benefits. Detection, deterrence, response, damage assessment,

attack anticipation and prosecution support are available to different degrees from the

different technologies. Table 9.4 summarizes these differences.


Table 9.4: Comparing Network- and Host-Based Benefits

CONCLUSION

Intrusion detection technology based on protocol analysis has become one of the technologies for the next generation of intrusion detection systems. This paper presents a Distributed Intrusion Detection System based on protocol analysis that is simple in structure, fast and efficient in detection, and economical in resources, which makes it an affordable intrusion detection system. However, the diversity of network intrusions makes complete detection impossible, especially because the rule base can only describe intrusions that have already been observed; intrusions not yet seen are not recognized, which results in some missed detections. Research on distributed intrusion detection is still at an initial stage. As the technology develops, the system must be able to adapt to changes in network traffic, which calls for self-learning and adaptive functions. Nevertheless, the protocol analysis presented in this paper can help improve the performance of existing distributed intrusion detection systems, and should have practical significance for future Distributed Intrusion Detection Systems.



FUTURE WORK

The Distributed Intrusion Detection System (DIDS) is being developed to address the shortcomings of current single-host IDSs by generalizing the target environment to multiple hosts connected via a network (LAN). Most current IDSs do not consider the impact of the LAN structure when attempting to monitor user behavior for attacks against the system. Intrusion detection systems designed for a network environment will become increasingly important as the number and size of LANs increase. The prototype has demonstrated the viability of our distributed architecture in solving the network-user identification problem.

The system has been tested on a sub-network of Sun SPARC stations, and it has correctly identified network users in a variety of scenarios. Work continues on the design, development, and refinement of rules, particularly those which can take advantage of knowledge about particular kinds of attacks. The initial prototype expert system has been written in Prolog, but it is currently being ported to CLIPS due to the latter's superior performance characteristics and easy integration with the C programming language. A signature analysis component is being designed for the host monitors to detect events and sequences of events that are known to be indicative of an attack, based on a specific context. In addition to the current host monitor, which is designed to detect attacks on general-purpose multi-user computers, the intention is to develop monitors for application-specific hosts such as file servers and gateways. In support of the ongoing development of DIDS, there is a plan to extend the model to a hierarchical Wide Area Network environment.



