Design of a multimedia traffic classifier for Snort

Abstract

Contemporary Intrusion Detection Systems (IDS) face the challenge of
monitoring ever-increasing volumes of network traffic, which may result in
not all data being inspected for malicious activity. This paper describes
the design and operation of a Snort preprocessor that adds specialized
multimedia knowledge to the IDS packet analysis capabilities. Empowering
the IDS with multimedia-specific knowledge results in two significant
gains: (1) trusted multimedia contents can be identified and allowed to
bypass the detection engine, thereby allowing the IDS to focus on other
traffic; (2) the IDS is able to detect multimedia-specific exploits which
would otherwise go unnoticed. The solution has been tested in both
streaming and non-streaming scenarios. Test results confirm that this
additional specialized knowledge enhances the IDS capabilities and results
in substantial computational savings.

Keywords: Intrusion Detection Systems, multimedia security, network
security, Snort.

1 Introduction

Intrusion Detection Systems (IDS) have become valuable tools for ensuring

system and network security. IDS scan ongoing traffic in search of pat-

terns and signatures that might indicate malicious or unauthorized activ-

ity (Kemmerer and Vigna, 2002; Bace, 2000). One of the issues currently

facing network-based IDS is the high computational cost of doing real-time

analysis when a large amount of traffic is passing through a connection. In

such cases IDS usually have no choice but to skip packets (Bace, 2000). The

increase in multimedia traffic over communication networks, whether in the

form of downloading or streaming large audio and video files, or due to the

convergence of voice, data and video over IP, seems to compound the prob-

lem even further. We believe that the increase in multimedia traffic may

actually help alleviate the problem, if the IDS is aware of the main proto-

cols and file formats used to carry multimedia data and uses that knowledge

to focus its efforts on other types of traffic.

This paper focuses on the implementation aspects of a novel method

to improve the performance of IDS based on multimedia traffic classifica-

tion (Baillargeon, 2005), initially described in (Marques and Baillargeon,

2005). Under the proposed approach, the IDS has additional knowledge

about common multimedia file formats and related protocols, and uses this

knowledge to perform a more detailed analysis of packets carrying that type

of data. If the data complies with the standard format, the corresponding

stream is flagged as trusted and the remaining packets are ignored by the

IDS, which can now focus on other traffic, resulting in computational
savings. Otherwise, an anomaly is detected and reported for further
action by the system administrator.

The proposed method has been motivated by the following:

1. The amount and usage of multimedia contents has grown significantly

in recent years and will continue to increase as broadband Internet

access among home users and mobile computing technologies become

mainstream.

2. Multimedia traffic accounts for a significant percentage of the total

Internet traffic, both in terms of streaming and downloading (Guo,

Chen, Xiao and Zhang, 2005; van der Merwe, Sen and Kalmanek,

2002).

3. Multimedia traffic is usually perceived as benign, with relatively few

reports of multimedia-related security exploits so far (see Section 5).

As multimedia formats and algorithms become more complex, how-

ever, there is a greater chance that their implementation may leave

some security hole behind.

4. Increasing network bandwidth capacities create a greater workload
for IDS, which can become saturated and unable to perform their
duties (Bace, 2000).

2 Adding Multimedia Knowledge to Intrusion Detection Systems (IDS)

“Intrusion Detection Systems are software or hardware systems that
automate the process of monitoring the events occurring in a computer
system or network, analyzing them for signs of security problems” (Bace
and Mell, 2001). With the increase in the amount and severity of
network-based attacks over the past few years, IDS have become an
important and widely used additional tool in the network security
infrastructure of many organizations. Several tutorials, surveys and
taxonomies for IDS have been published recently, such as (Axelsson, 2000;
Sherif and Dearmond, 2002; McHugh, 2001).

The most important features of IDS are the ability to detect as many

intrusions as possible and to do so with a low rate of false alerts. While

IDS reduce the likelihood of an intrusion they cannot eliminate or prevent

it entirely (Sequeira, 2003). In fact, as pointed out by Kemmerer and Vi-

gna (2002), “IDS do not detect intrusions at all – they only identify evidence

of intrusions, either while they’re in progress or after the fact”.

Network traffic which contains multimedia content is not generally con-

sidered to be an issue which impacts the performance or detection rate of

IDS any differently than any other type of content. During the exhaustive

process of comparing hundreds or thousands of rules against the contents of

a packet, IDS typically do not apply any rules that represent multimedia-

specific knowledge.

Currently, IDS are capable of blocking multimedia content based on port
number (in the case of streaming audio/video), string matching of content
type (e.g., content: “User-Agent |3A| Quicktime”) and file extension. None
of these techniques verifies the validity of the content; they simply
assume that if data appears to be multimedia (judging from external
identifiers, such as the MIME type), then it is.

This scenario may soon change as a result of new, multimedia-specific,

exploits that have recently surfaced. When the PNG exploit was first discov-

ered, it was noted that existing IDS were not capable of detecting vulnerable

files because they did not have sufficient knowledge of the transfer protocols

or file formats which were being transmitted (CoreLabs, 2005). Exploits

have also been observed with streaming formats, where incomplete protocol

implementations can leave a system vulnerable to attack (US-CERT, 2005).

The work reported in this paper uses additional header information to

classify multimedia, examining the contents of an incoming data stream

and looking for known multimedia characteristics, such as JPEG or MPEG

markers. It leverages Snort’s architecture by encapsulating the multimedia-

specific knowledge into a preprocessor for Snort.

3 Snort

Snort (Snort, 2005) is a flexible, open-source, multi-platform intrusion
detection solution. Snort was chosen as the base utility upon which a
multimedia classifier was built. The ease with which preprocessing
functionality can be added to Snort and its open-source nature were the
primary factors behind this choice.

Snort’s “plug-in” functionality for preprocessor modules, which are
written in C, can extend the IDS beyond the base ruleset to include things
like anomaly detection and session reconstruction (Beale, 2004). Any
preprocessor in Snort can be turned on or off simply by adding or removing
a line in the program’s configuration file.

Figure 1: Basic dataflow using Snort as an IDS.

Snort works by matching traffic patterns to its rules, stored in a ruleset.
The process of intrusion detection within Snort begins with the capture
of packets from an Ethernet port with a tool such as Ethereal (Ethereal,
2005). The captured packets are sent through the packet decoder, which
determines which protocol is in use for a given packet and matches the data
against allowable behavior for patterns of that protocol. Then the packet
is processed by a series of preprocessors, which receive packets in the
order in which they are initialized in the Snort configuration file. After
the packet has gone through all of the preprocessors in sequence, it is
given to the detection engine for comparison against the ruleset. If any
anomaly (e.g., malformed headers, overly long packets, unusual or incorrect
TCP options) is detected, an alert is generated (Figure 1).

4 The Multimedia Traffic Classifier

4.1 Design

The proposed solution is a two-class classifier, in which an incoming or
outgoing stream is classified as either multimedia or non-multimedia, in a
way that resembles the stream splitting idea proposed in (Judd and
McEachen, 2003). Once a session is found to contain multimedia traffic, the
IDS is able to determine whether authorization is appropriate.
Authorization will not necessarily be immediate, and may require multiple
examinations of a data stream or session to determine content validity. If
a stream is deemed to be unauthorized, normal IDS operation will continue
and the data will be analyzed according to the remaining rulesets. The
ultimate goal of the classification stage is to be able to make intelligent
decisions based on previously encoded knowledge. By preprocessing and
authorizing the data, we can take advantage of Snort’s global do_detect
flag, which can be used to tell Snort to skip the detection phase (ruleset
comparison) for the flagged packet. Since multimedia files are usually very
large, the additional time spent classifying and flagging multimedia data
using the first few packets is offset by the time saved on the remaining
packets of that stream, thereby achieving the intended performance
improvements.

To accomplish the goals of multimedia traffic classification and multi-

media-specific exploit detection, a module was developed to perform packet

level preprocessing and analysis. This module, known as the multimedia

classification preprocessor, works in addition to, not as a replacement for,

the detection engine within the IDS. As network traffic is captured, it passes

through the multimedia classifier where it is compared against known stan-

dards to determine whether or not it meets the requirements to be trusted.

After packets are labeled as belonging to a trusted stream or session, they

bypass detailed analysis which is normally performed within the detection

engine, allowing it to focus on traffic which poses a potentially greater threat.

In addition to the rerouting of trusted traffic past the detection engine, the

multimedia-specific knowledge which has been added to the system enables

the detection of multimedia-specific exploits and administrator notification.

Figure 2 shows the dataflow within the multimedia classification pre-

processor. Each packet which enters the multimedia preprocessor is first

checked to see if it belongs to a stream or session which has already been

authorized. The packet then undergoes a series of byte-by-byte checks to

determine if it contains well-known multimedia identifiers. If such identifiers

are located, specialized rules can be applied to extract parameters or check

them for known vulnerabilities. If the packet passes the parameter extrac-

tion and exploit detection stages, it is then classified as part of a trusted

multimedia stream and assigned a flag which lets the detection engine know

that it should not be analyzed any further.
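The per-packet flow described above can be sketched as follows. This is a minimal illustration in Python, not the preprocessor's actual C code: the `MAGIC` table, the `MultimediaClassifier` class, the `no_exploit` placeholder and the three outcome labels are assumptions introduced here, and a simple prefix test stands in for the byte-by-byte identifier checks.

```python
from dataclasses import dataclass, field

# Hypothetical identifier table for this sketch: leading bytes -> format name.
MAGIC = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def no_exploit(kind, payload):
    """Placeholder exploit check (assumption): reports nothing."""
    return False


@dataclass
class MultimediaClassifier:
    authorized: set = field(default_factory=set)  # keys of trusted streams
    alerts: list = field(default_factory=list)    # (stream key, format) pairs

    def classify(self, stream_key, payload, exploit_check=no_exploit):
        # 1. Packets of an already authorized stream are trusted at once.
        if stream_key in self.authorized:
            return "trusted"
        # 2. Look for a well-known multimedia identifier.
        kind = next((name for magic, name in MAGIC.items()
                     if payload.startswith(magic)), None)
        if kind is None:
            return "inspect"  # not multimedia: normal detection engine path
        # 3. Format-specific exploit detection.
        if exploit_check(kind, payload):
            self.alerts.append((stream_key, kind))
            return "exploit"
        # 4. Authorize the stream so later packets skip the detection engine.
        self.authorized.add(stream_key)
        return "trusted"
```

In this sketch only the first packet of a stream pays for the identifier and exploit checks; every later packet with the same stream key returns at step 1.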

As seen in Figure 2, the multimedia classifier sits before the detection
engine, allowing specialized knowledge to be applied to the collected data
before it is passed to the detection engine for exhaustive analysis. By
adding a few extra steps to the preprocessor’s operation, our classifier is
able to distinguish legitimate multimedia traffic from all other types of
traffic. This additional functionality impacts the time spent doing packet
preprocessing and inevitably introduces some overhead. As will be
demonstrated in Section 6, the overhead is minimal and, except for the case
where there is virtually no multimedia traffic, the addition of this new
knowledge results in significant overall savings.

Figure 2: Modified dataflow as a result of using a multimedia
classification preprocessor for Snort.

Possible outcomes of the classification process – compared to a scenario

in which the proposed preprocessor is absent – are:

1. Data is recognized as part of a trusted multimedia stream and autho-

rized.

Result: Significant processor savings in successive packets belonging

to the same stream.

2. Data is recognized as a multimedia specific exploit, an alert is gener-

ated and the remainder of the stream bypasses the detection engine,

since we have already labeled it as having malicious contents.

Result: Significant processor savings and detection of malicious ac-

tivity that would otherwise go undetected.

3. Data is not multimedia in nature and continues normally through the

detection engine.

Result: Slight increase in processor usage resulting from overhead

generated by multimedia preprocessor.

4.2 Application Scenarios

4.2.1 Non-streaming (Downloading)

The non-streaming scenario can best be described by multimedia traffic

which has the following attributes:

1. All content must be entirely downloaded before playback can com-

mence.

2. There are no temporally-dependent sessions established for the trans-

mission of playback control messages or content.

3. Files can be partitioned into file header and compressed contents.

Figure 3 illustrates the impact of the proposed classifier on the analysis
of non-streaming traffic. Without the multimedia classifier, all multimedia
packets would pass through the detection engine, with each one undergoing
an exhaustive search for possible malicious contents. With the classifier
enabled, the stream is classified early by validating the file header and
ensuring it conforms to known standards. After the packet is marked as
containing trusted multimedia contents, it is then searched for known
multimedia exploits. If no exploits are detected, the multimedia stream is
given authorization, which reroutes future packets belonging to the stream
and greatly reduces the work of the detection engine.

Figure 3: Non-streaming packet routing with and without the proposed
multimedia classifier.

4.2.2 Streaming

Streaming multimedia contents are treated in a similar manner to
non-streaming contents once they are authorized, but the processes used to
accomplish this authorization are different. We use the streaming
description for multimedia traffic that has the following characteristics:

1. Session initiation, including handshaking mechanisms, is used to
establish stream parameters such as transfer ports.

2. Content playback can begin as soon as a buffer has received an
adequate amount of data.

3. Control messages for manipulation of playback, such as the ability to
pause, may be seen during the course of a session.

After a streaming session is authorized, the classifier flags all future
packets belonging to the stream as authorized until the stream terminates.
Figure 4 illustrates the impact of the proposed classifier on the analysis
of streaming traffic, in terms of computational savings.

Figure 4: Streaming packet routing with and without the proposed
multimedia classifier.

Marker Identifier   Marker Acronym   Marker Name
0xD8                SOI              Start of image
0xE0                APP0             Application use marker
0xE1-0xEF           APPn             Other application segments
0xDB                DQT              Quantization table
0xC0                SOF0             Start of frame 0
0xC4                DHT              Huffman table
0xDA                SOS              Start of scan
0xD9                EOI              End of image
0xFE                COM              Comments field

Table 1: JPEG markers.

4.3 Relevant Multimedia File Formats and Protocols

The multimedia classifier preprocessor handles non-streaming media formats

by examining the file headers for known multimedia markers. A representa-

tive subset of these techniques and the formats they examine are described

below.

4.3.1 JPEG

The JPEG header consists of markers which identify various parts of the
header. These markers can be identified as two-byte sequences: a prefix
byte of 0xFF followed by a byte which represents the function of the
marker. Like many other file formats, a JPEG file consists of both required
and optional markers and fields. Table 1 lists JPEG markers of particular
interest for this work and the parameters they define or describe.

The multimedia preprocessor performs analysis of JPEG files in a multi-

step sequence. First, the start of frame marker is located and the length of

the SOF0 field is read. From this point, we move ahead this many positions

within the file, to where the Huffman table should be located. If these two

conditions are met, we can safely say that we are in the first packet of a

JPEG file, which contains the file header. Once the JPEG file is identified,

we can begin searching for signs of an exploit within the file.
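The two-condition JPEG check described above can be sketched as follows. This is an illustrative Python fragment rather than the preprocessor's code; the function name `sof0_followed_by_dht` is hypothetical, and the sketch assumes the two-byte segment length is big-endian and counts itself but not the marker, as is the case for JPEG marker segments.

```python
import struct

SOF0 = b"\xff\xc0"  # start of frame 0 marker (Table 1)
DHT = b"\xff\xc4"   # Huffman table marker (Table 1)


def sof0_followed_by_dht(data: bytes) -> bool:
    """Return True if a SOF0 segment is found and a DHT marker follows it."""
    pos = data.find(SOF0)
    if pos < 0 or pos + 4 > len(data):
        return False  # no SOF0 marker, or its length field is cut off
    # Read the two-byte segment length that follows the marker.
    (length,) = struct.unpack_from(">H", data, pos + 2)
    # Skip the marker (2 bytes) plus the segment to reach the next marker.
    nxt = pos + 2 + length
    return data[nxt:nxt + 2] == DHT
```

A payload that passes this check would then be handed to the exploit detection stage; real JPEG files may order their segments differently, so the check is a heuristic rather than a full parser.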

4.3.2 PNG

The PNG format was designed as a successor to the GIF file format, with
network transmission and storage in mind. PNG files can be analyzed by
looking for PNG chunks, which are used to organize the structure of the
file.

There are four main PNG chunks, known as critical chunks, three of

which (IHDR, IDAT, IEND) are required in every PNG image. They are:

1. IHDR: Header chunk - Contains basic information about the image.

Must be the first chunk within the image.

2. PLTE: Palette chunk - Stores the colormap which is associated with

the image data. Only present if a palette is used.

3. IDAT: Image data chunk - Contains the image data. Multiple data

chunks may occur within a file and will be found in contiguous order.

4. IEND: Image trailer chunk - Marks the end of the image file.

Figure 5 displays a PNG image with chunks and image data identified.

The IDAT chunk is followed immediately by the compressed image data and

the height (0x0000001E) and width (0x00000019) values are visible just past

the IHDR chunk (0x49484452).

Figure 5: PNG file structure.
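As an illustration of the chunk structure described in this section, the sketch below walks the chunks of a PNG byte string. It assumes the standard PNG layout (an 8-byte signature followed by chunks made of a 4-byte big-endian length, a 4-byte type, the data and a 4-byte CRC); the helper names `iter_chunks` and `make_chunk` are hypothetical, and the CRC is not verified.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (chunk type, chunk data) pairs from a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack_from(">I4s", data, pos)
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + data + CRC


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one chunk, used here only to create sample input."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Sample image header using the width (0x19) and height (0x1E) quoted above.
ihdr = struct.pack(">IIBBBBB", 0x19, 0x1E, 8, 2, 0, 0, 0)
sample = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
          + make_chunk(b"IDAT", zlib.compress(b"\x00"))
          + make_chunk(b"IEND", b""))
chunk_types = [ctype for ctype, _ in iter_chunks(sample)]
```

For the sample above, `chunk_types` lists the three required critical chunks in order, and the first eight bytes of the IHDR data hold the image dimensions.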

4.3.3 RTSP

RTSP (Real Time Streaming Protocol) is a widely used streaming protocol,
which has been implemented in many server software packages, such as
Windows Media Services, Helix Servers (RealPlayer) and the
Quicktime/Darwin Streaming Servers. RTSP is used to negotiate
client/server streaming sessions and places no requirements on the
payload. This separation between session handling and transfer
specification is beneficial from an IDS standpoint because we can focus on
analyzing the session and exempt the payload from analysis.

Handling of RTSP for intrusion detection purposes starts by extracting
the session data port numbers from the SETUP method during session
initiation. Once an RTSP SETUP packet has been identified and the port
number stored for future reference, the IDS looks for an RTSP PLAY request,
which asks the server to begin transferring the media stream. When a PLAY
request is detected, the IDS flags all data entering through the
pre-specified port as exempted from further examination.

This process continues until the IDS detects a TEARDOWN request,

at which time the port is removed from the exemption list. Other RTSP

messages are effectively ignored within the preprocessor, because they do

not contain any information which would further improve the effectiveness

or accuracy of our session monitoring.
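The SETUP, PLAY and TEARDOWN handling described above amounts to a small state machine, sketched below in Python. The `RtspTracker` class is hypothetical, it tracks a single session only, and it assumes the data ports are announced through the `client_port` parameter of the RTSP Transport header; the source does not state which field the preprocessor actually reads.

```python
import re

# Assumption: data ports appear as "client_port=lo" or "client_port=lo-hi".
CLIENT_PORT = re.compile(r"client_port=(\d+)(?:-(\d+))?")


class RtspTracker:
    def __init__(self):
        self.pending = set()  # ports seen in SETUP, stream not yet playing
        self.exempt = set()   # ports whose traffic bypasses detection

    def on_request(self, message: str) -> None:
        method = message.split(" ", 1)[0].upper()
        if method == "SETUP":
            match = CLIENT_PORT.search(message)
            if match:
                low = int(match.group(1))
                high = int(match.group(2) or low)
                self.pending.update(range(low, high + 1))
        elif method == "PLAY":
            # Playback starts: exempt the ports negotiated during SETUP.
            self.exempt |= self.pending
            self.pending.clear()
        elif method == "TEARDOWN":
            # Session over: remove the ports from the exemption list.
            self.exempt.clear()
        # All other RTSP methods are ignored, as in the text.

    def is_exempt(self, port: int) -> bool:
        return port in self.exempt
```

A per-packet hook would call `is_exempt` with the destination port of each incoming packet and skip the detection engine when it returns True.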

5 Multimedia-specific Exploits

Multimedia exploits are relatively rare, given the overall number of
vulnerabilities which have been disclosed for all file types. However, in
recent years the number of these exploits has increased and may continue to
increase as more multimedia formats are introduced and more software
utilities for multimedia are released (Microsoft, 2004, 2005a,b; Mozilla
Foundation, 2005).

The proposed multimedia classifier makes decisions after having examined
only relevant bytes within the files’ headers. There are three main
reasons for this: (i) the computational overhead introduced by the
multimedia preprocessor is kept to a minimum; (ii) all relevant
multimedia-specific exploits described later in this Section can be
successfully detected within the header bytes; and (iii) the damage that
would be caused by one or more modified bytes past the header is most often
very limited and potentially noticeable by the end user, as illustrated by
Figure 6, which demonstrates the possible effects on a JPEG file in which
the contents (beyond the header bytes) have been modified. Moreover,
inspecting multimedia transfers for malicious content manipulation is
outside the scope of intrusion detection and is best dealt with using
digital watermarking and other authentication techniques.

Figure 6: Example of the effects of content modification (beyond header
bytes) to a JPEG image: (top) original; (bottom) modified.

The following subsections will describe a few multimedia specific exploits

in greater detail.

5.1 JPEG

The JPEG exploit allows an attacker to gain control of an exploited system.
The JPEG specification allows the embedding of comments in the JPEG file.
A comment section starts with the marker 0xFFFE, followed by a two-byte
length field whose value is the length of the comment plus two bytes (for
the length field itself). The two-byte field theoretically allows 65,533
bytes of comment data (invisible when the JPEG is viewed). If the comment
is empty, the length field must contain the minimum value of 2. However, if
a specially crafted JPEG file sets this length to 0 or 1 (illegal values),
it can cause a buffer overflow condition in a vulnerable decoder (PC
Magazine, 2005).

The multimedia classification preprocessor can easily detect exploits like

this one by reading the comment field when a JPEG transfer is detected

and alerting an administrator or blocking the traffic. Figure 7 contains a

JPEG image with a comment embedded in the file header. In this case,

the comment field identifier (0xFFFE) is followed by a field length of 0x000F

(fifteen) which is valid. To turn this valid image into an exploited image, one

would simply change this value to 0x0000 (zero) or 0x0001 (one), followed

by malicious code.

Figure 7: JPEG structure with valid markers and image data highlighted.
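The comment-length check described in this subsection can be sketched as follows. This is an illustrative Python fragment with a hypothetical function name; it searches the raw bytes for the 0xFFFE marker, which is a simplification, since a full implementation would walk the marker segments from the start of the file.

```python
import struct

COM = b"\xff\xfe"  # JPEG comment marker


def illegal_comment_offset(data: bytes):
    """Return the offset of the first COM segment whose length is 0 or 1,
    or None if every comment length found is legal (2 or more)."""
    pos = data.find(COM)
    while pos >= 0 and pos + 4 <= len(data):
        (length,) = struct.unpack_from(">H", data, pos + 2)
        if length < 2:
            return pos  # illegal length: matches the exploit described above
        # Skip the marker and the whole comment segment, then search again.
        pos = data.find(COM, pos + 2 + length)
    return None
```

With a length field of 0x000F, as in Figure 7, the function returns None; changing that value to 0x0000 or 0x0001 makes it return the offset of the comment marker.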

5.2 PNG

One of the published PNG exploits uses specially crafted PNG chunks to
create a buffer overflow condition that allows an attacker to gain control
of a target system. As with the JPEG exploit, the end user would likely not
realize their system was compromised, because the image would still
display correctly even though there was a malformed section within the file
header.

This vulnerability is particularly dangerous when combined with MSN

Messenger, because the user does not have to request or accept an image

to have their system compromised. The attacker can use a PNG file as a

buddy icon, which is automatically transferred during a chat session, thereby

allowing them to gain control of the target system.

The MSN Messenger PNG vulnerability is detailed in the advisory posted

at CoreLabs (CoreLabs, 2005). This exploit is constructed by placing
specific values in the IHDR and tRNS chunks of the image. The colors used
and palette used flags must be set in the color type field and the alpha
channel used flag must not be set, i.e., the color type field must have a
value of 0x03. In addition, the length of the tRNS chunk must exceed 256
bytes and reach a function

pointer address.
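The two conditions named above (color type 0x03 and an oversized tRNS chunk) can be tested with a short chunk walk. The Python sketch below is an assumption-laden illustration, not the preprocessor's code: the function names are hypothetical, chunk CRCs are not verified, and it does not check whether the overflow would actually reach a function pointer.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(ctype: bytes, body: bytes) -> bytes:
    """Build a sample chunk; the CRC is zeroed because it is not checked here."""
    return struct.pack(">I", len(body)) + ctype + body + b"\x00" * 4


def suspicious_trns(data: bytes) -> bool:
    """Return True for a palette image (color type 3) whose tRNS chunk
    is longer than 256 bytes."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos, color_type = len(PNG_SIGNATURE), None
    while pos + 8 <= len(data):
        length, ctype = struct.unpack_from(">I4s", data, pos)
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"IHDR" and len(body) >= 10:
            color_type = body[9]  # follows width, height and bit depth
        elif ctype == b"tRNS":
            return color_type == 3 and length > 256
        pos += 12 + length
    return False
```

The `chunk` helper exists only to build sample input, for example an IHDR with color type 3 followed by a 300-byte tRNS chunk, which the check flags.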

5.3 RTSP

The proposed multimedia classifier is capable of detecting RTSP buffer
overflow issues such as the Darwin Streaming Server vulnerability triggered
through the RTSP DESCRIBE method, which would allow an attacker to execute
malicious code (US-CERT, 2005). What follows is an example of a Snort rule
to handle the problem.

    alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS 554
      (msg:"WEB-MISC Real Server DESCRIBE buffer overflow attempt";
      flow:to_server,established; content:"DESCRIBE"; nocase;
      content:"../"; distance:1; pcre:"/^DESCRIBE\s[^\n]{300}/smi";
      reference:bugtraq,8476;
      reference:url,www.service.real.com/help/faq/security/rootexploit091103.html;
      classtype:web-application-attack; sid:2411; rev:5;)

This rule works as follows:

1. Snort looks for TCP traffic coming from any external network, from any
source port, which is inbound for an IP address defined as an HTTP server,
on port 554. Port 554 is the default port for RTSP traffic.

2. Snort looks for the plaintext string “DESCRIBE” (case-insensitively)
with a content match and then applies a PCRE (Perl Compatible Regular
Expressions) pattern anchored on that string.

3. The PCRE pattern measures the length of the contents which follow the
DESCRIBE string. If at least 300 characters follow on the same line, which
is sufficiently large to cause a buffer overflow condition, an alert is
generated.

4. The alerting mechanism generates an alert to the console or inserts a

new record in a database containing a short description of the intru-

sion along with reference data so an administrator can look up more

information online.
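The length test performed by the rule's pcre option can be reproduced outside Snort. The Python sketch below mirrors only that regular expression, with the s, m and i modifiers mapped to `re.S`, `re.M` and `re.I`; the other rule options (flow, the two content matches) are not modeled, and the function name is hypothetical.

```python
import re

# Same pattern as the rule's pcre option: DESCRIBE at the start of a line,
# one whitespace character, then 300 characters without a newline.
DESCRIBE_OVERFLOW = re.compile(r"^DESCRIBE\s[^\n]{300}", re.S | re.M | re.I)


def describe_too_long(payload: str) -> bool:
    """Return True if a DESCRIBE request line carries 300 or more characters."""
    return DESCRIBE_OVERFLOW.search(payload) is not None
```

An ordinary request such as "DESCRIBE rtsp://example.com/movie.mp4 RTSP/1.0" does not match, while a request line padded with several hundred characters does.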

Additional implementation details, including source code, can be found

in (Baillargeon, 2005).

6 Experiments and Results

Our approach was tested by monitoring FTP transfers of multimedia and

non-multimedia files using Snort. Experiments were performed on many

platforms to ensure interoperability and flexibility for future work. All ex-

periments had the same common goal: to measure processor usage and

evaluate the impact of adding the proposed specialized preprocessor on the

overall performance of the IDS. Statistics and measurements were reported

by Snort when the program was terminated.

6.1 Downloading

All Snort experimentation was done in a Windows environment, using Snort

and the WinPcap packet capture library. For measurement purposes, FTP

transfers were used to ensure that data was not being cached locally. FTP

transfers also allowed batch jobs to be scheduled, for ease of reproducing

the tests to ensure accuracy of the measurements. The classification prepro-

cessor was tested by transferring data with the preprocessor on, then again

with it turned off. Time measurements were taken in Snort, using the C
standard library clock() function, the unit of which is clock ticks.
Division of clock ticks by the constant CLOCKS_PER_SEC returns a value in
seconds. However, in almost all cases with the classifier on this value was
less than 1 second, so values were left in units of clock ticks.

6.1.1 Single file transfers

In the first series of experiments we looked at specific file types, one at
a time. For AVI files varying from 716 KB to 56 MB in size, a fixed number
of packets (two) per file was required to classify them as valid multimedia
data. The relative processor usage decreased as file size increased: for
small files, the processor was used only 15% of the time it would have been
used without our classifier, while for very large files, this number
dropped to less than 1%. Similar results were obtained for MPEG and JPEG
files between 417 KB and 67 MB in size. These experiments also confirm that
the overhead introduced by adding an extra preprocessing step is minimal,
less than 1%.

File Type   File Size (MB)   DEU OFF   DEU ON   MM ON
AVI         257              13189     70       30
AVI         10.9             640       40       10
JPG         2.5              571       30       0
MPG         0.4              110       20       0

Table 2: Time savings for miscellaneous downloads.

Table 2 legend:

• DEU OFF - Detection engine usage (in clock ticks) with MM classifier off.

• DEU ON - Detection engine usage with MM classifier on.

• MM ON - MM classifier usage with MM classifier on.

Table 2 contains sample collected data, demonstrating the drastic
reduction in overall CPU usage when the multimedia classifier was enabled.
The sum of the last two columns (DEU ON and MM ON) is much less than the
value in the DEU OFF

column for all cases.
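The comparison made above can be worked out directly from the Table 2 values. The short Python sketch below is only an arithmetic illustration; the function name `relative_usage` is introduced here and the rows are copied from Table 2.

```python
# (file type, size in MB, DEU OFF, DEU ON, MM ON), in clock ticks, from Table 2
ROWS = [
    ("AVI", 257, 13189, 70, 30),
    ("AVI", 10.9, 640, 40, 10),
    ("JPG", 2.5, 571, 30, 0),
    ("MPG", 0.4, 110, 20, 0),
]


def relative_usage(deu_off, deu_on, mm_on):
    """Total ticks with the classifier on, as a fraction of ticks with it off."""
    return (deu_on + mm_on) / deu_off


for kind, size_mb, deu_off, deu_on, mm_on in ROWS:
    share = 100 * relative_usage(deu_off, deu_on, mm_on)
    print(f"{kind} {size_mb} MB: {share:.2f}% of the original processor usage")
```

For the 257 MB AVI file the ratio is (70 + 30) / 13189, or about 0.76%, while for the 0.4 MB MPG file it is 20 / 110, or about 18%, which is consistent with the trend that larger files yield larger relative savings.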

6.1.2 Mixed traffic

The mixed traffic experiments combine multimedia and non-multimedia files
into a batch FTP job, with varying percentages of multimedia traffic, from
0% to 100% in 20% increments. Multiple files were transferred with the
multimedia classification preprocessor active and then again without it to
evaluate CPU utilization. Figure 8 summarizes the results, indicating that
processor usage (in clock ticks) decreases as the amount of multimedia data
in the batch increases, thereby confirming our claim of increased CPU
savings as the amount of multimedia traffic increases. Table 3 contains raw
data collected from mixed traffic experiments, again illustrating a drastic
reduction in overall CPU usage when multimedia traffic is present.

Figure 8: CPU savings for multimedia traffic downloading.

MM      NONMM   PMM     DEU ON   MM ON
0       100     0       19198    2103
18.86   81.52   18.79   15279    1954
41.68   60.93   40.6    12230    1422
62.51   40.95   60.42   8227     770
78.94   20.19   79.63   3966     550
99.78   0       100     30       160

Table 3: Time savings for mixed traffic.

Table 3 legend:

• MM - Amount of MM traffic (MB).

• NONMM - Amount of Non-MM traffic (MB).

• PMM - Percent of multimedia traffic.

• DEU ON - Detection engine usage (in clock ticks) with MM classifier on.

• MM ON - MM classifier usage with MM classifier on.

6.2 Streaming

Streaming experiments were conducted using RealPlayer, QuickTime and
Windows Media Player for client-side data capture. Server software utilized
includes Darwin Streaming Server, Windows Media Services and popular
web sites such as iFilm.com for Helix RealPlayer server testing. Data was
gathered in a similar manner to the non-streaming experiments, in that the
clock usage was measured over a period of time with the multimedia
preprocessor off, and then on.

Table 4 contains representative raw data from one of these tests. This
data was gathered by initiating streams of identical bitrates for varying
amounts of time and recording the CPU usage of the detection engine and
multimedia classifier, as was done in previous tests. Figure 9 shows those
results for a particular subset using QuickTime and Darwin Streaming Server
for streams of varying duration (1, 5, and 10 minutes long). Similar
results were obtained using other server/client combinations. Figure 9
illustrates the processor usage savings as playback time increases. Without
multimedia classification, the total amount of processor usage increases
linearly with time, but when the classifier is enabled the total usage
increase is minimal.

PL      DEU OFF   DEU ON   MM ON
1:00    4843      50       0
5:00    12983     240      40
10:00   24418     400      40

Table 4: Time savings for multimedia streaming.

Table 4 legend:

• PL - Playback length (minutes).

• DEU OFF - Detection engine usage with MM classifier off.

• DEU ON - Detection engine usage with MM classifier on.

• MM ON - MM classifier usage with MM classifier on.

Figure 9: Time savings for multimedia streaming.

7 Conclusions

We have described the design and operation of a Snort preprocessor that

adds specialized multimedia knowledge to the IDS packet analysis capabili-

ties. The fundamental idea behind the proposed method is that the addition

of multimedia-specific knowledge can greatly improve the efficiency of IDS,

both in terms of detection rate and resource utilization. Empowering the

IDS with multimedia-specific knowledge results in two significant gains: (1)

trusted multimedia contents can be identified and allowed to bypass the de-

tection engine, thereby allowing the IDS to focus on other traffic; (2) the

IDS is able to detect multimedia-specific exploits which would otherwise go
unnoticed.

Although the addition of multimedia knowledge will likely increase the

overall effectiveness of IDS, there is a tradeoff between the extra CPU time

needed to perform additional analysis on multimedia traffic and the CPU

savings that ensue as a result of exempting subsequent packets belonging

to the same, trusted, multimedia stream. Fine-tuning of the system to find

the proper balance of analysis is required and has to be done empirically.

The proposed solution has been tested with several combinations of
multimedia and non-multimedia traffic, both downloading and streaming.
Results of our experiments confirm that significant processing savings can
be obtained, provided that at least a small amount of multimedia traffic is
present on the network.

The proposed framework is modular and extensible, allowing a broad

coverage of different file formats and streaming protocols as well as an ad-

justable degree of specialized knowledge upon which the IDS decision is

made. Even though multimedia-specific attacks and exploits are still relatively

rare, the proposed solution allows for easy encoding of rules that help

detect and prevent reported attacks as soon as they become known,

giving the IDS a ‘future proofing’ capability.
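
One way to picture the modular, extensible design is a registry that maps a format signature to a format-specific check. The sketch below is illustrative only and is not the preprocessor's actual code: the verdict labels, `register`, `classify`, and the handler names are hypothetical, and the checks are deliberately shallow. The signatures and header sizes used are the standard ones for PNG and GIF.

```python
from typing import Callable, Dict

# Hypothetical verdict labels, not taken from the paper.
TRUSTED, SUSPICIOUS, UNKNOWN = "trusted", "suspicious", "unknown"

Handler = Callable[[bytes], str]
_HANDLERS: Dict[bytes, Handler] = {}


def register(magic: bytes):
    """Associate a handler with the leading bytes of a file format."""
    def decorator(fn: Handler) -> Handler:
        _HANDLERS[magic] = fn
        return fn
    return decorator


@register(b"\x89PNG\r\n\x1a\n")
def check_png(payload: bytes) -> str:
    # After the 8-byte signature, the first chunk must be IHDR with a
    # 4-byte big-endian length of 13.
    if len(payload) < 16:
        return UNKNOWN
    length = int.from_bytes(payload[8:12], "big")
    if payload[12:16] != b"IHDR" or length != 13:
        return SUSPICIOUS
    return TRUSTED


@register(b"GIF89a")
@register(b"GIF87a")
def check_gif(payload: bytes) -> str:
    # 6-byte header plus 7-byte logical screen descriptor.
    return TRUSTED if len(payload) >= 13 else UNKNOWN


def classify(payload: bytes) -> str:
    """Dispatch to the first handler whose signature matches the payload."""
    for magic, handler in _HANDLERS.items():
        if payload.startswith(magic):
            return handler(payload)
    return UNKNOWN


if __name__ == "__main__":
    good_png = b"\x89PNG\r\n\x1a\n" + (13).to_bytes(4, "big") + b"IHDR" + bytes(13)
    bad_png = b"\x89PNG\r\n\x1a\n" + (0xFFFFFFFF).to_bytes(4, "big") + b"IHDR"
    print(classify(good_png), classify(bad_png), classify(b"GET / HTTP/1.1"))
```

In this arrangement, covering a new format or encoding a check for a newly reported exploit amounts to registering one more handler, without touching the dispatch logic.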

Acknowledgment

This work was partially supported by a grant from the U.S. Department of

Defense (DoD).

References

Axelsson, S. 2000. Intrusion Detection Systems: A Survey and Taxonomy.

Technical Report 99-15, Department of Computer Engineering, Chalmers

University of Technology, Goteborg, Sweden.

Bace, R. 2000. An Introduction to Intrusion Detection and Assessment.

Technical report, ICSA Labs.

Bace, R. and P. Mell. 2001. Intrusion Detection Systems. Technical report,

National Institute of Standards and Technology.

Baillargeon, Pierre. 2005. A Method for Adding Multimedia Knowledge For

Improving Intrusion Detection Systems. Master’s thesis, Florida Atlantic

University.

Beale, J. 2004. Snort 2.1 Intrusion Detection. Second ed. Syngress.

CoreLabs. 2005. “Core Security Technologies Advisory: MSN Messenger

PNG Image Parsing Vulnerability.” http://www.coresecurity.com/.

Ethereal. 2005. “Ethereal: a network protocol analyzer.”

http://www.ethereal.com/.

Guo, Lei, Songqing Chen, Zhen Xiao and Xiaodong Zhang. 2005. Analysis

of Multimedia Workloads with Implications for Internet Streaming. In

WWW2005. Chiba, Japan.

Judd, J. and J. McEachen. 2003. An Architecture for Network Stream

Splitting in Support of Intrusion Detection. In ICICS-PCM. Singapore.

Kemmerer, R. and G. Vigna. 2002. “Intrusion Detection: A Brief History

and Overview.” IEEE Computer 35(4):27–30.

Marques, O. and P. Baillargeon. 2005. A Multimedia Traffic Classification

Scheme for Intrusion Detection Systems. In Proc. of the IEEE ICITA’05.

Sydney, Australia.

McHugh, John. 2001. “Intrusion and intrusion detection.” International

Journal of Information Security 1(1):14–35.

Microsoft. 2004. “Microsoft Security Bulletin MS04-028: Buffer Overrun in

JPEG Processing (GDI+) Could Allow Code Execution (833987).”

http://www.microsoft.com/technet/security/bulletin/MS04-028.mspx.

Microsoft. 2005a. “Microsoft Security Bulletin MS05-009: Vulnerability in

PNG Processing Could Allow Remote Code Execution (890261).”

http://www.microsoft.com/technet/security/bulletin/MS05-009.mspx.

Microsoft. 2005b. “Microsoft Security Bulletin MS05-053: Vulnerabilities

in Graphics Rendering Engine Could Allow Code Execution (896424).”

http://www.microsoft.com/technet/security/bulletin/MS05-053.mspx.

Mozilla Foundation. 2005. “Mozilla Foundation Security Advisory 2005-30:

GIF heap overflow parsing Netscape Extension 2.”

http://www.mozilla.org/security/announce/mfsa2005-30.html.

PC Magazine. 2005. “Security Watch Letter: Inside the JPEG Virus.”

http://www.pcmag.com/article2/0,1759,1661942,00.asp.

Sequeira, D. 2003. “Intrusion Prevention Systems: Security’s Silver Bullet?”

Business Communications Reviews pp. 36–41.

Sherif, Joseph S. and Tommy G. Dearmond. 2002. Intrusion Detection:

Systems and Models. In Proceedings of the IEEE WETICE02. Pittsburgh,

PA, USA.

Snort. 2005. “Official Snort website.” http://www.snort.org/.

US-CERT. 2005. “US-CERT: United States Computer Emergency Readiness

Team Vulnerability Note VU460350: Apple Quicktime/Darwin

Streaming Server fails to properly parse DESCRIBE requests.”

http://www.kb.cert.org/vuls/id/460350.

van der Merwe, Jacobus, Subhabrata Sen and Charles Kalmanek. 2002.

Streaming Video Traffic: Characterization and Network Impact. In

Proceedings of the Seventh International Web Content Caching and

Distribution Workshop. Boulder, CO.
