
QGuard: Protecting Internet Servers from Overload

Hani Jamjoom*, John Reumann†, and Kang G. Shin
Department of Electrical Engineering and Computer Science,
The University of Michigan, Ann Arbor, MI 48109
{jamjoom, reumann}@eecs.umich.edu

Abstract

Current operating systems are not well-equipped to handle sudden load surges that are commonly experienced by Internet servers. This means that service providers and customers may not be able to count on servers being available once their content becomes very popular. Recent Denial-of-Service attacks on major e-commerce sites have capitalized on this weakness.

Remedies that were proposed to improve server behavior under overload require substantial changes to the operating system or applications, which is unacceptable to businesses that only want to use the tried and true. This paper presents QGuard, a much less radical solution to providing differential QoS, protection from overload, and some DoS attacks. QGuard is an adaptive mechanism that exploits rate controls for inbound traffic in order to fend off overload and provide QoS differentiation between competing traffic classes.

Our Linux-2.2.14-based QGuard prototype provides freely configurable QoS differentiation (preferred customer treatment and service differentiation) and effectively counteracts SYN and ICMP-flood attacks. Since QGuard is a purely network-centric mechanism, it does not require any changes to server applications and can be implemented as a simple add-on module for any OS.

* Hani Jamjoom is supported by the Saudi Arabian Ministry of Higher Education.

† John Reumann is supported by IBM's Research Fellowship Program.

Our measurements indicate no performance degradation on lightly loaded servers and only a small reduction of aggregated server throughput (less than 2%) under overload. Well-behaved "preferred customers" remain virtually unaffected by server overload.

1 Introduction

Recent blackouts of major web sites, such as Yahoo, eBay, and E*Trade, demonstrated how susceptible e-business is to simple Denial-of-Service (DoS) attacks [9, 11]. Using publicly available software, amateur hackers can choose from a variety of attacks, such as SYN or ping-floods, to lock out paying customers. These attacks either flood the network pipe with traffic or pound the server with requests, thus exhausting precious server resources. In both attack scenarios, the server will appear dead to its paying (or otherwise important) customers.

This problem has been known since the early 1980s [5]. Since then, various fixes have been proposed [4, 17, 23]. Nevertheless, these fixes remain an insufficient answer to the challenges faced by service providers today. What makes things more difficult today is that service providers want to differentiate between their important and less important clients at all times, even while drawing fire from a DoS attack.

The recent DoS attacks are only one instance of poorly managed overload scenarios. A sudden load surge, too, can lead to a significant deterioration of service quality (QoS), sometimes coming close to a denial of service. Under such circumstances, important clients' response time may increase drastically. More severe consequences may follow if the amount of work-in-progress causes hard OS resource limits to be violated. If such failures were not considered in the design of the service, the service may crash, potentially leading to data loss.

These problems are particularly troubling for sites that offer price-based service differentiation. Depending on how much customers pay for the service, they have different QoS requirements. First of all, paying customers want the system to remain available even when it is heavily loaded. Secondly, higher-paying customers wish to see their work requests take priority over lower-paying customers when resources are scarce. For example, a Web site may offer its content to paying customers as well as free-riders. A natural response to overload is not to serve content to the free-riders. However, this behavior cannot be configured in current server OSs.

Although pure middleware solutions for QoS differentiation [1, 14] exist, they fail when the overload occurs before incoming requests are picked up and managed by the middleware. Moreover, middleware solutions fail when applications bypass the middleware's control mechanisms, e.g., by using their own service-specific communication primitives or simply by binding communication libraries statically. Therefore, much attention has been focused on providing strong performance-management mechanisms in the OS and network subsystem [4, 6, 7, 12, 15, 19, 20, 23, 25]. However, these solutions introduce more controls than necessary to manage QoS differentiation and defend the server from overload.

We propose a novel combination of kernel-level and middleware overload protection mechanisms called QGuard. QGuard learns the server's request-handling capacity independently and divides this capacity among clients and services according to administrator-specified rules. QGuard's differential treatment of incoming traffic protects servers from overload and immunizes the server against SYN-floods and the so-called "ping-of-death." This allows service providers to increase their capacities gradually as demand grows since their preferred customers' QoS is not at risk. Consequently, there is no need to build up excessive over-capacities in anticipation of transient request spikes. Furthermore, studies on the load patterns observed on Internet servers show that over-capacities can hardly protect servers from overload.

This paper is organized as follows. We present our design rationale in Section 2 and discuss its implementation in Section 3. Section 4 studies QGuard's behavior in a number of typical server overload and attack scenarios. Section 5 places QGuard in the context of related work. The paper ends with concluding remarks in Section 6.

2 What is QGuard?

Internet servers suffer from overload because of the uncontrolled influx of requests from network clients. Since these requests for service are received over the network, controlling the rate at which network packets may enter the server is a powerful means for server load management. QGuard exploits the power of traffic shaping to provide overload protection and differential service for Internet servers. By monitoring server load, QGuard can adapt its traffic shaping policies without any a priori capacity analysis or static resource reservation. This is achieved by the cooperation of the four QGuard components: traffic shaper, monitor, load-controller, and policy-manager (see Figure 1).

2.1 The Traffic Shaper

QGuard relies on shaping the incoming traffic as its only means of server control. Since QGuard promises QoS differentiation, differential treatment must begin in the traffic shaper, i.e., simply controlling aggregate flow rates is not good enough.

To provide differentiation, the QGuard traffic shaper associates incoming packets with their traffic classes. Traffic classes may represent specific server-side applications (IP destinations or TCP and UDP target ports), client populations (i.e., a set of IP addresses with a common prefix), DiffServ bits, or a combination thereof. Traffic classes should be defined to represent business or outsourcing needs. For example, if one wants to control the request rate to the HTTP service, a traffic class that aggregates all TCP-SYN packets sent to port 80 on the server should be introduced. This notion of traffic classes is commonly used in policy specifications for firewalls and was proposed initially by Mogul et al. [18]. Figure 2 displays a sample classification process. Once the traffic class is defined, it may be policed.

Figure 1: The QGuard architecture. (Incoming service requests pass through the kernel-space traffic shaper into TCP/IP and on to the server applications. A monitor collects statistics from the CPU, memory, network, and file subsystems into a load digest for the load-controller, which installs the traffic shaping policy. A user-space policy-manager daemon receives up-calls and derives the set of traffic policies from service differentiation requirements defined by the sysadmin.)

Figure 2: Classifying incoming traffic. (A packet matcher compares incoming packet headers against patterns such as "* -> 111.1.1.2:80 TCP, SYN", "111.1.1.*:* -> 111.1.1.2:80 TCP, SYN", and "111.1.1.1:* -> 111.1.1.2:80 UDP", mapping them to traffic classes such as "Web preferred", "Web standard", and "Some UDP service".)

For effective traffic management, traffic classification and policing are combined into rules. Each rule specifies whether a traffic class' packets should be accepted or dropped. Thus, it is possible to restrict certain IP domains from accessing certain (or all) services on the server while granting access to others without affecting applications or the OS. As far as the client and server OSs are concerned, certain packets simply get lost. Such all-or-nothing schemes are used for server security (firewalls). However, for load-control, more fine-grained traffic control is necessary. Instead of tuning out a traffic source completely, QGuard allows the administrator to limit its packet rate. Thus, preferred clients can be allowed to submit requests at a higher rate than non-preferred ones. Moreover, QGuard also associates a weight representing traffic-class priority with each rule. We refer to these prioritized, rate-based rules as QGuard rules. QGuard rules accept a specific traffic class' packets as long as their rate does not exceed the maximal rate specified in the rule. Otherwise, a QGuard rule will cause the incoming packets to be dropped.

QGuard rules can be combined to provide differential QoS. For example, the maximal acceptance rate of one traffic class can be set to twice that of another, thus delivering a higher QoS to the clients belonging to the traffic class identified by the rule with the higher acceptance rate. The combination of several QGuard rules — the building block of QoS differentiation — is called a QGuard filter (henceforth filter). A filter may consist of an arbitrary number of rules. Filters are the inbound equivalent of CBQ policies [10].
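As a concrete illustration, prioritized, rate-based rules and a filter built from them could be represented roughly as below. This is a sketch in C with our own invented field names, not the layout used by the QGuard implementation (which expresses rules in IP-Chains syntax); it encodes 2:1 QoS differentiation between a preferred client prefix and all other Web clients.

```c
#include <stdint.h>

/* Hypothetical representation of a QGuard rule: a firewall-style
 * packet pattern extended with a rate limit and a priority weight. */
struct qguard_rule {
    uint32_t src_net, src_mask;  /* client population (IP prefix) */
    uint16_t dst_port;           /* server-side service, e.g., 80 */
    uint8_t  proto;              /* IPPROTO_TCP = 6, IPPROTO_UDP = 17 */
    uint8_t  syn_only;           /* match only TCP-SYN packets */
    uint32_t max_rate;           /* maximal accepted packets per second */
    uint32_t weight;             /* traffic-class priority */
};

/* A filter is an ordered set of rules; QoS differentiation comes
 * from the ratio of the rules' max_rate values. */
struct qguard_filter {
    const struct qguard_rule *rules;
    int nrules;
};

/* 2:1 differentiation: a preferred /24 may send TCP-SYNs to port 80
 * at twice the rate granted to everybody else. */
static const struct qguard_rule web_rules[] = {
    { 0x6F010100u, 0xFFFFFF00u, 80, 6, 1, 200, 2 },  /* Web preferred */
    { 0x00000000u, 0x00000000u, 80, 6, 1, 100, 1 },  /* Web standard  */
};
static const struct qguard_filter web_filter = { web_rules, 2 };
```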


2.2 The Monitor

Since QGuard cannot assume knowledge of the ideal shaping rate for incoming traffic, it must monitor server load to determine it. Online monitoring takes the place of offline system capacity analysis.

The monitor is loaded as an independent kernel module to sample system statistics. At load time, the administrator may indicate the importance of different load indicators for the assessment of server overload. The monitoring module itself assesses server capacity based on its observations of different load indicators. Accounting for both the importance of all load indicators and the system capacity, the monitor computes the server load index. Other kernel modules may register with the monitor to receive a notification if the load index falls into a certain range.

Since the monitor drives QGuard's adaptation to overload, it must be executed frequently. Only frequent execution can ensure that it will not miss any sudden load surges. However, it is difficult to say exactly how often it should sample the server's load indicators because the server is subject to many unforeseeable influences [13], e.g., changes in server popularity or content. Therefore, all relevant load indicators should be oversampled significantly. This requires a monitor with very low runtime overheads. The important role of the monitor also requires that the monitor itself cannot be made to fail under overload. As a result of these stringent performance requirements, we decided that the logical place for the monitor is inside the OS.

2.3 The Load-Controller

The load-controller is an independent kernel module (for reasons similar to the monitor's) that registers its overload and underload handlers with the monitor when it is loaded into the kernel. Once loaded, it specifies to the monitor, in terms of the server load index, when it wishes to receive an overload or underload notification. Whenever it receives a notification from the monitor, it decides whether it is time to react to the observed condition or whether it should wait a little longer until it becomes clear whether the overload or underload condition is persistent.

The load-controller is the core component of QGuard's overload management. This is due to the fact that one does not know in advance to which incoming rate the packets of individual traffic classes should be shaped. Since one filter is not enough to manage server overload, we introduce the concept of a filter-hierarchy (FH). A FH is a set of filters ordered by filter restrictiveness (shown in Figure 3). These filter-hierarchies can be loaded into the load-controller on demand. Once loaded, the load-controller will use monitoring input to determine the least restrictive filter that avoids server overload.

The load-controller strictly enforces the filters of the FH, and any QoS differentiation that is coded into the FH in the form of relative traffic-class rates will be implemented. This means that QoS differentiation will be preserved in spite of the load-controller's dynamic filter selection.
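The dynamic filter selection described above amounts to stepping through the ordered FH. The fragment below is a simplified sketch of one such step (our own code and naming, not QGuard's): an overload notification moves the controller one filter toward the restrictive end, an underload notification one filter back, so it settles on the least restrictive filter that avoids overload.

```c
/* Monitor notifications, encoded so that the event value is also the
 * direction of movement through the filter-hierarchy. */
enum load_event { UNDERLOAD = -1, STEADY = 0, OVERLOAD = 1 };

/* The FH is an array ordered from least restrictive (index 0) to
 * most restrictive (index nfilters - 1); step and clamp. */
int select_filter(int current, int nfilters, enum load_event ev)
{
    int next = current + (int)ev;
    if (next < 0) next = 0;
    if (next > nfilters - 1) next = nfilters - 1;
    return next;
}
```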

Assuming an overloaded server and a properly set up FH, i.e.,

- all filters are ordered by increasing restrictiveness,
- the least restrictive filter does not shape incoming traffic at all, and
- the most restrictive filter drops all incoming traffic,

the load-controller will eventually begin to oscillate between two adjacent filters. This is due to the fact that the rate limits specified in one filter are too restrictive and not restrictive enough in the other.

Oscillations between filters are a natural consequence of the load-controller's design. However, switching between filters causes some additional OS overhead. Therefore, it is advantageous to dampen the load-controller's oscillations as it reaches the point where the incoming traffic rate matches the server's request-handling capacity. Should the load-controller begin to oscillate between filters of vastly different acceptance rates, the FH is too coarse-grained and should be refined. This is the policy-manager's job. To allow the policy-manager to deal with this problem, the load-controller keeps statistics about its own behavior.

Another anomaly resulting from ineffective filter-hierarchies occurs when the load-controller repeatedly switches to the most restrictive filter. This means that no filter of the FH can contain server load. This can either be the result of a completely misconfigured FH or due to an attack. Since switching to the most restrictive policy results in a loss of service for all clients, this condition should be reported immediately. For this reason the load-controller implements an up-call to the policy-manager (see Figure 1). This notification is implemented as a signal.

Figure 3: A sample filter-hierarchy

2.4 The Policy-Manager

The policy-manager fine-tunes filter-hierarchies based on the effectiveness of the current FH. A FH is effective if the load-controller is stable, i.e., the load-controller does not cause additional traffic burstiness. If the load-controller is stable, the policy-manager does not alter the current FH. However, whenever the load-controller becomes unstable, either because system load increases beyond bounds or because the current FH is too coarse-grained, the policy-manager attempts to determine the server's operating point from the oscillations of the load-controller, and reconfigures the load-controller's FH accordingly.

Since the policy-manager focuses the FH with respect to the server's operating point, it is the crucial component for maximizing throughput during times of sustained overload. It creates a new FH with fine granularity around the operating point, thus reducing the impact of the load-controller's oscillations and adaptation operations.

The policy-manager creates filter-hierarchies in the following manner. The range of all possible acceptance rates that the FH should cover (an approximate range given by the system administrator) is quantized into a fixed number of bins, each of which is represented by a filter. While the initial quantization may be too coarse to provide accurate overload protection, the policy-manager successively zooms into smaller quantization intervals around the operating point. We call the policy-manager's estimate of the operating point the focal point. By using non-linear quantization functions around this focal point, accurate, fine-grained control becomes possible. The policy-manager dynamically adjusts its estimate of the focal point as system load or request arrival rates change.

The policy-manager creates filter-hierarchies that are fair in the sense of max-min fair-share resource allocation [16]. This algorithm executes in two stages. In the first stage, it allocates the minimum bandwidth to each rule. It then allocates the remaining bandwidth based on a weighted fair-share algorithm. This allocation scheme has two valuable features. First, it guarantees a minimum bandwidth allocation for each traffic class (specified by the administrator). Second, excess bandwidth is shared among traffic classes based on their relative importance (also specified by the administrator). Figure 3 shows an example FH that was created in this manner. This figure shows that the policy-manager makes two exceptions from the max-min fair-share rule. The leftmost filter admits all incoming traffic to eliminate the penalty for the use of traffic shaping on lightly-loaded servers. Furthermore, the rightmost filter drops all incoming traffic to allow the load-controller to drain residual load if too many requests have already been accepted.
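The two-stage allocation can be sketched as follows. This is our own illustrative code under the stated assumptions (integer rates, a single capacity figure), not the policy-manager's actual algorithm: stage one hands every traffic class its guaranteed minimum, stage two splits the excess in proportion to the administrator-assigned weights.

```c
/* Two-stage weighted fair-share sketch: alloc[i] receives min_rate[i]
 * plus a weight-proportional share of whatever capacity remains. */
void fair_share(const int *min_rate, const int *weight, int n,
                int capacity, int *alloc)
{
    int used = 0, wsum = 0, i;
    for (i = 0; i < n; i++) {            /* stage 1: minimum guarantees */
        alloc[i] = min_rate[i];
        used += min_rate[i];
        wsum += weight[i];
    }
    int excess = capacity - used;
    if (excess < 0 || wsum == 0)         /* capacity below the guarantees */
        return;
    for (i = 0; i < n; i++)              /* stage 2: weighted excess */
        alloc[i] += excess * weight[i] / wsum;
}

/* Demo: two classes, equal minimums (10), weights 2:1, capacity 100. */
static int demo_alloc(int idx)
{
    static const int min_rate[2] = { 10, 10 };
    static const int weight[2]   = { 2, 1 };
    int alloc[2];
    fair_share(min_rate, weight, 2, 100, alloc);
    return alloc[idx];
}
```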

There are some situations that cannot be handled using the outlined successive FH refinement mechanism. Such situations often result from DoS attacks. In such cases, the policy-manager attempts to identify ill-behaved traffic classes in the hope that blocking them will end the overload. To identify the ill-behaved traffic class, the policy-manager first denies all incoming requests and admits traffic classes one-by-one on a probational basis (see Figure 8) in order of their priority. All traffic classes that do not trigger another overload are admitted to the server. Ill-behaved traffic classes are tuned out for a configurable period of time (typically a very long time).
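The probational re-admission pass might look roughly like this. The sketch is ours (invented names, fixed class count); triggers_overload() stands in for actually re-enabling a class's rules and watching the monitor for another overload notification.

```c
#define NCLASSES 4

static int admitted[NCLASSES];  /* 1 = class re-admitted, 0 = tuned out */

/* Walk the classes in priority order (assumed pre-sorted); any class
 * whose admission re-triggers overload is blacklisted. Returns the
 * number of classes tuned out. */
int probe_classes(int (*triggers_overload)(int cls))
{
    int cls, nblocked = 0;
    for (cls = 0; cls < NCLASSES; cls++) {
        if (triggers_overload(cls)) {
            admitted[cls] = 0;          /* tune out the ill-behaved class */
            nblocked++;
        } else {
            admitted[cls] = 1;
        }
    }
    return nblocked;
}

/* Example predicate: pretend class 2 is the source of a flood. */
static int demo_flood(int cls) { return cls == 2; }
```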

Since the policy-manager uses floating-point arithmetic and reads configurations from the user, it is implemented as a user-space daemon. This also avoids kernel bloat. Running in user space is not a problem because the load-controller already ensures that the system will not get locked up. Hence, the policy-manager will always get a chance to run.

3 Implementation

3.1 The Traffic Shaper

Linux provides sophisticated traffic management for outbound traffic inside its traffic shaper modules [8]. Among other strategies, these modules implement hierarchical link-sharing [10]. Unfortunately, there is nothing comparable for inbound traffic. The only mechanism offered by Linux for the management of inbound traffic is IP-Chains [21], a firewalling module. To our advantage, the firewalling code is quite efficient and can be modified easily. Furthermore, the concept of matching packet headers to find an applicable rule for the handling of each incoming packet is highly compatible with the notion of a QGuard rule. The only difference between QGuard's and IP-Chains' rules is the definition of a rate for traffic shaping. Under a rate limit, a packet is considered admissible only if the arrival rate of packets that match the same header pattern is lower than the maximal arrival rate.

QGuard rules are fully compatible with conventional firewalling policies. All firewalling policies are enforced before the system checks QGuard rules. This means that a system with QGuard will never admit any packets that are to be rejected for security reasons.

Our traffic shaping implementation follows the well-known token bucket [16] rate-control scheme. Each rule is equipped with a counter (remaining tokens), a per-second packet quota, and a timestamp to record the last token replenishment time. The remaining-tokens counter will never exceed V * quota, with V representing the bucket's volume. We modified the Linux-based IP-Chains firewalling code as follows.

Figure 4: A QGuard Firewall Entry

The matching of an incoming packet against a number of packet header patterns for classification purposes (see Figure 2) remains unchanged. At the same time, QGuard looks up the traffic class' quota, timestamp, and remaining tokens and executes the token bucket algorithm to shape incoming traffic. For instance, it is possible to configure the rate at which incoming TCP-SYN packets from a specific client should be accepted. The following command:

qgchains -A qguard --protocol TCP --syn --destination-port 80 --source 10.0.0.1 -j RATE 2

allows the host 10.0.0.1 to connect to the Web server at a rate of two requests per second. The syntax of this rule matches the syntax of Linux IP-Chains, which we use for traffic classification. We chose packets as our unit of control because we are ultimately interested in controlling the influx of requests. Usually, requests are small and, therefore, sent in a single packet. Moreover, long-lived streams (e.g., FTP) are served well by the packet-rate abstraction, too, because such sessions generally send packets of maximal size. Hence, it is relatively simple to map byte-rates to packet-rates.
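The token-bucket admission test described above can be sketched as follows. The field names follow the text (quota, remaining tokens, timestamp), but the code is our reconstruction, not the actual modification to the IP-Chains sources; time is simplified to whole seconds.

```c
struct tb_rule {
    long quota;        /* tokens (packets) added per second */
    long volume;       /* V: bucket volume in seconds */
    long remaining;    /* remaining tokens */
    long last_refill;  /* timestamp of last replenishment, in seconds */
};

/* Returns 1 if the packet is admissible, 0 if it must be dropped. */
int tb_admit(struct tb_rule *r, long now)
{
    long earned = (now - r->last_refill) * r->quota;
    r->remaining += earned;
    if (r->remaining > r->volume * r->quota)  /* never exceed V * quota */
        r->remaining = r->volume * r->quota;
    r->last_refill = now;
    if (r->remaining <= 0)
        return 0;
    r->remaining--;    /* one token per accepted packet */
    return 1;
}

/* Demo rule: 2 packets/s, bucket volume of 1 second, starting full. */
static struct tb_rule demo_rule = { 2, 1, 2, 0 };
```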

3.2 The Monitor

The Linux OS collects numerous statistics about the system state, some of which are good indicators of overload conditions. We have implemented a lightweight monitoring module that links itself into the periodic timer interrupt run queue and processes a subset of Linux's statistics (Table 1). Snapshots of the system are taken at a default rate of 33 Hz. While taking snapshots, the monitor updates moving averages for all monitored system variables.

When loading the monitoring module into the kernel, the superuser specifies overload and underload conditions in terms of thresholds on the monitored variables, the moving averages, and their rates of change. Moreover, each monitored system variable, x_i, may be given its own weight, w_i. The monitor uses overload and underload thresholds in conjunction with the specified weights to compute the amalgamated server load index, akin to Steere's "progress pressure" [24]. To define the server load index formally, we introduce the overload indicator function, I_i(X_i), which operates on the values of monitored variables and moving averages X_i:

    I_i(X_i) =   1   if X_i indicates an overload condition
                -1   if X_i indicates an underload condition
                 0   otherwise

For n monitored system variables, the monitor computes the server load index as sum_{i=1}^{n} I_i(X_i). Once this value has been determined, the monitor checks whether it falls into a range that triggers a notification to other modules (see Figure 5). Modules can simply register for such notifications by registering a notification range [a, b] and a callback function of the form

    void (*callback)(int load_index)

with the monitor. In particular, the load-controller (to be described in the following section) uses this monitoring feature to receive overload and underload notifications.
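The registration interface just described might be sketched as below. This is illustrative C of our own (the names monitor_register and monitor_dispatch are invented, and the table is a fixed array), not the actual monitor module: a subscriber supplies a [low, high] range and a callback, and the callback fires whenever the computed load index falls inside that range.

```c
#define MAX_SUBS 8

struct subscription {
    int low, high;                    /* notification range [low, high] */
    void (*callback)(int load_index);
};

static struct subscription subs[MAX_SUBS];
static int nsubs;

/* Register a range and callback; returns 0 on success, -1 if full. */
int monitor_register(int low, int high, void (*cb)(int))
{
    if (nsubs >= MAX_SUBS)
        return -1;
    subs[nsubs].low = low;
    subs[nsubs].high = high;
    subs[nsubs].callback = cb;
    nsubs++;
    return 0;
}

/* Called once per monitoring tick with the indicator sum I_1+...+I_n;
 * returns the number of callbacks fired. */
int monitor_dispatch(int load_index)
{
    int i, fired = 0;
    for (i = 0; i < nsubs; i++)
        if (load_index >= subs[i].low && load_index <= subs[i].high) {
            subs[i].callback(load_index);
            fired++;
        }
    return fired;
}

/* Demo subscriber standing in for the load-controller. */
static int last_seen = -999;
static void demo_cb(int idx) { last_seen = idx; }
```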

Since the server's true capacity is not known before the server is actually deployed, it is difficult to define overload and underload conditions in terms of thresholds on the monitored variables. For instance, the highest possible file-system access rate is unknown. If the administrator picks an arbitrary threshold, the monitor may either fail to report overload or indicate a constant overload. Therefore, we implemented the system to dynamically learn the maximal and minimal possible values for the monitored variables, rates of change, and moving averages. Hence, thresholds are not expressed in absolute terms but in percent of each variable's maximal value. Replacing absolute values with percentage-based conditions improved the robustness of our implementation and simplified administration significantly.

Figure 5: The monitor's notification mechanism. (The monitor periodically checks the current load index, on a scale from -100 to 100, against registered ranges such as low, normal, and high; a registered call-back, e.g., for the load-controller, fires when the index falls into its range.)

3.3 The Load-Controller

QGuard's sensitivity to load statistics is a crucial design parameter. If QGuard is too sensitive, it will never settle into a stable state. On the other hand, if QGuard is too insensitive to server load, it will fail to protect the server from overload. For good control of QGuard's sensitivity we introduce three different control parameters:

1. The minimal sojourn time, s, is the minimal time between filter switches. Obviously, it limits the switching frequency.

2. The length of the load observation history, h, determines how many load samples are used to determine the load average. The fraction 1/h is the grain of all load measurement. For example, a history of length 10 allows load measurements with 10% accuracy.

3. A moderator value, m, is used to dampen oscillations when the shaped incoming packet rate matches the server's capacity. To switch to a more restrictive filter, at least m times more overloaded than underloaded time intervals have to be observed. This means that the system's oscillations die down as the target rate is reached, assuming stable offered load.

Small values for m (3–6) serve this purpose reasonably well. Since both s and m slow down oscillations, relatively short histories (h in [5, 15]) can be used in determining system load. This is due to the fact that accurate load assessment is necessary only if the server operates close to its operating point. Otherwise, overload and underload are obvious even when using less accurate load measurements. Since the moderator stretches out the averaging interval as the system stabilizes, measurement accuracy is improved implicitly. Thus, QGuard maintains responsiveness to sudden load-shifts and achieves accurate load-control under sustained load.

  Indicator                                      Meaning
  High paging rate                               Incoming requests cause high memory consumption, thus severely limiting system performance through paging.
  High disk access rate                          Incoming requests operate on a dataset that is too large to fit into the file cache.
  Little idle time                               Incoming requests exhaust the CPU.
  High outbound traffic                          Incoming requests demand too much outgoing bandwidth, thus leading to buffer overflows and stalled server applications.
  Large inbound packet backlog                   Requests arrive faster than they can be handled, e.g., flood-type attacks.
  Rate of timeouts for TCP connection requests   SYN-attack or network failure.

Table 1: Load indicators used in the Linux implementation
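The moderator's damping condition can be made concrete with a small sketch (our own code and naming, not taken from the QGuard sources): a switch to a more restrictive filter is permitted only once the recent history contains at least m times more overloaded than underloaded intervals.

```c
/* Returns 1 if a switch to a more restrictive filter is warranted:
 * the overloaded intervals must outnumber the underloaded ones by at
 * least the moderator factor m, and at least one overloaded interval
 * must have been observed. */
int should_restrict(int overloaded_intervals, int underloaded_intervals, int m)
{
    return overloaded_intervals >= m * underloaded_intervals
        && overloaded_intervals > 0;
}
```

With m = 3, a 6:1 ratio of overloaded to underloaded intervals triggers a switch, while a 2:1 ratio does not; near the operating point, where overload and underload alternate, switches are thus suppressed.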

For statistical purposes and to allow refinement of filter hierarchies, the load-controller records how long each filter was applied against the incoming load. Higher-level software (Section 3.4) can query these values directly using the new QUERYQGUARD socket option. In response to this query, the load-controller will also indicate the most recent load condition (e.g., CPUOVERLOAD) and the currently deployed filter (Figure 6).

The load-controller signals an emergency to the policy-manager whenever it has to switch into the most restrictive filter (drop all incoming traffic) repeatedly to avoid overload. Uncontrollable overload can be a result of:

1. ICMP floods

2. CPU intensive workloads

3. SYN attacks

4. Congested inbound queues due to high arrival rate

5. Congested outbound queues as a result of large replies

6. The onset of paging and swapping

7. File system request overload

To avoid signaling a false uncontrollable overload, which happens when the effects of a previous overload are still present, the system learns the time, t, that it takes for the system to experience its first underload after the onset of an overload. The time t indicates how much system load indicators lag behind control actions. If 2t > s (the minimal sojourn time s), then t/2 is used in place of the minimal sojourn time. Thus, in systems where the effects of control actions are delayed significantly, the load-controller waits for a longer time before increasing the restrictiveness of inbound filters. Without the adaptation of minimal sojourn times, such a system would tend to oversteer, i.e., drop more incoming traffic than necessary. This problem occurs whenever server applications queue up large amounts of work internally. Server applications that decouple workload processing from connection management are a good example (e.g., the Apache Web server). However, if per-request work is highly variable, QGuard fails to stabilize. In such cases, a more radical solution like LRP [4] becomes necessary.
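The sojourn-time adaptation described above reduces to a small helper (our own formulation of the rule):

```python
def effective_sojourn(s, t):
    """Return the sojourn time the load-controller should use.

    s: configured minimal sojourn time,
    t: measured lag between the onset of an overload and the first
       observed underload.
    If 2t > s, the controller uses t/2 in place of the minimal
    sojourn time before switching to a more restrictive filter.
    """
    return t / 2 if 2 * t > s else s
```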

3.4 The Policy-Manager

The policy-manager implements three different features. First, it performs statistical analysis to dynamically adjust the granularity of the FH and estimates the best point of operation. Second, it identifies and reacts to sustained overload situations and tunes out traffic from malicious sources. Finally, it creates a FH that conforms to the service differentiation requirements.

The policy-manager views a FH as a set of n filters {F_0, F_1, ..., F_n}. As described in Section 2.1, filter F_i consists of a set of QGuard rules {r_i,0, r_i,1, ..., r_i,m}. For convenience we introduce some notation to represent different attributes of a filter.


Figure 6: The Load-Controller (kernel-level control loop: counted overload and underload signals select the current filter between minimal and maximal restrictiveness; user-level control monitors the loop and receives an alarm signal on unmanageable load)

TIME(F_i) is the amount of time for which the load-controller used F_i to contain system load. This attribute can be directly read from the statistics of the load-controller.

RATE(F_i) is the rate at which F_i accepts incoming packets. This is the sum of the rates, RATE(F_i, j), given for all QGuard rules j that belong to the filter.

Since QGuard provides fair-share-style resource allocation, the policy-manager must create filter hierarchies where adjacent filters F_i and F_i+1 satisfy the following: if a packet is admissible according to QGuard rule r_i+1,j, then it is also admissible according to rule r_i,j. However, the converse is not necessarily true. First, this implies that corresponding rules from different filters within a FH always specify the same traffic class. Second, RATE(F_i+1, j) < RATE(F_i, j) for all j. Furthermore, F_0 always admits all and F_n drops all incoming traffic. The monotonicity of the rates in a filter hierarchy is a result of our commitment to fair-share resource allocation.
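The nesting and monotonicity requirements on a FH can be checked mechanically. The following sketch (names are ours) represents each filter by its per-class rates, with RATE(F_i, j) = filters[i][j]:

```python
def valid_hierarchy(filters):
    """Check the FH invariants described above.

    filters[i][j] stands for RATE(F_i, j). Requirements:
      - all filters cover the same traffic classes,
      - RATE(F_{i+1}, j) < RATE(F_i, j) for all j (strict nesting),
      - F_0 admits everything (modeled here as infinite rates),
      - F_n drops everything (all rates zero).
    """
    n_classes = len(filters[0])
    if any(len(f) != n_classes for f in filters):
        return False                       # mismatched traffic classes
    for fi, fnext in zip(filters, filters[1:]):
        if not all(b < a for a, b in zip(fi, fnext)):
            return False                   # rates must strictly decrease
    inf = float("inf")
    return (all(r == inf for r in filters[0])
            and all(r == 0 for r in filters[-1]))
```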

The FH defined above guarantees that there is at least one filter, F_n, that can suppress any overload. Moreover, if there is no overload, no packet will be dropped by the load-controller because F_0 admits all packets. Depending on the amount of work that it takes to process each request and the arrival rate of requests, the load-controller will oscillate around some filter near the operating point of the system, i.e., the highest incoming rate that does not generate an overload. Since the rate difference between

Figure 7: The compressor function for q = 1/2 (x-axis: normalized input rate with interval boundaries r_0 ... r_7; y-axis: quantization intervals of f(x))

filters is discrete, it is unlikely that there is one particular filter that shapes incoming traffic exactly to the optimal incoming rate. Therefore, it is necessary to refine the FH. To construct the ideal filter F* that would shape incoming traffic to the maximal request arrival rate of the server, the policy-manager computes the focal point (FP) of the load-controller's oscillations:

    FP := ( sum_{i=1}^{n} TIME(F_i) * RATE(F_i) ) / ( sum_{i=1}^{n} TIME(F_i) )
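Computing the focal point from the load-controller's statistics is a time-weighted average of the filter rates. A direct transcription of the formula (the function name is ours):

```python
def focal_point(time_spent, rates):
    """FP = sum_i TIME(F_i) * RATE(F_i) / sum_i TIME(F_i).

    time_spent[i] is TIME(F_i), rates[i] is RATE(F_i), as recorded by
    the load-controller's statistics.
    """
    total_time = sum(time_spent)
    return sum(t * r for t, r in zip(time_spent, rates)) / total_time
```

For example, a controller that spent three intervals at rate 10 and one interval at rate 20 has its focal point at 12.5.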

Whether or not the policy-manager uses a finer quantization around the focal point depends on the load-controller's stability (absence of oscillations covering many filters). To switch between different quantization grains, the policy-manager uses a family of compressor functions [22] of the following form:

    f_q(x - FP) =  (x - FP)^q    for x >= FP
                  -(FP - x)^q    for x < FP

Our experimental configuration only used f_q(x) for q in {1, 1/2, 1/3}; Figure 7 shows f_1/2(x). The horizontal lines reflect the quantization of the same function based on 8 quantization levels (the dashes on the y-axis). The ranges for each interval, marked on the x-axis, illustrate how their widths become smaller as they approach the focal point. Therefore, we only need to decrease q to


Figure 8: State transition diagram for the identification of mis-behaving traffic classes

achieve higher resolution around the focal point. To compute the range values of each quantization interval, we apply the inverse function (a polynomial). This is illustrated by the shaded area in Figure 7.
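The inverse mapping used to obtain the interval boundaries can be sketched as follows. This is a simplified illustration with our own names; the normalized span and level count are assumptions, not values from the paper:

```python
def quantization_bounds(fp, q, levels, span=1.0):
    """Boundaries r_k of the quantization intervals around the focal point.

    Evenly spaced levels y_k on the compressed axis are mapped back
    through the inverse of f_q(x - fp) = sign(x - fp) * |x - fp|**q,
    i.e., x = fp +/- |y|**(1/q).  For q < 1 this packs the intervals
    more densely near fp.
    """
    bounds = []
    for k in range(-levels, levels + 1):
        y = span * k / levels                  # evenly spaced on the y-axis
        x = fp + (abs(y) ** (1.0 / q)) * (1 if y >= 0 else -1)
        bounds.append(x)
    return bounds
```

With q = 1/2 the inverse is the polynomial y^2, so interval widths shrink quadratically toward the focal point, matching the shaded construction in Figure 7.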

Under the assumption that the future will resemble the past, compressor functions should be picked to minimize the filtering loss that results from the load-controller's oscillations. However, this requires keeping long-term statistics, which in turn requires a large amount of bookkeeping. Instead, we choose a fast heuristic that selects the appropriate quantization, q, based on the load-controller's statistics. Simply put, if the load-controller only applies a small number of filters over a long time, a finer resolution is used. More specifically, if the load-controller is observed to oscillate between two filters, it is obvious that the filtering grain is too coarse and a smaller q is used. We found that it is good to switch to a smaller q as soon as the load-controller is found oscillating over a range of roughly 4 filters.

When a new FH is installed, the load-controller has no indication as to which filter it should apply against incoming traffic. Therefore, the policy-manager advances the load-controller to the filter in the new FH that shapes incoming traffic to the same rate as the most recently used filter from the previous FH. The policy-manager does not submit a new FH to the load-controller if the new hierarchy does not differ significantly from the old one. A change is significant if the new FP differs more than 5% from the previous one. This reduces the overheads created by the policy-manager, which include context switches and the copying of an entire FH.
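The significance test that gates installation of a new FH reduces to a one-line comparison (our own helper, illustrating the 5% rule):

```python
def should_install(old_fp, new_fp, threshold=0.05):
    """Install a new filter hierarchy only if its focal point moved
    by more than `threshold` (5%) relative to the old focal point."""
    return abs(new_fp - old_fp) > threshold * old_fp
```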

The above computations lead to improved server throughput under controllable overload. However, if the load-controller signals a sustained (uncontrollable) overload, the policy-manager identifies misbehaving sources as follows (see also Figure 8).

Assumed Bad: Right after the policy-manager recognizes that the load-controller is unable to contain the overload, each traffic class is labeled as potentially bad. In this state the traffic class is temporarily blocked.

Tryout: Traffic classes are admitted one-by-one and in priority order. A “tryout” admission is probationary and used to identify whether a given traffic class is causing the overload.

Good: A traffic class that passed the “tryout” state without triggering an overload is considered to be “good.” It is admitted unconditionally to the system. This is the normal state for all well-behaved traffic classes.

Bad: A traffic class that triggered another overload while being tried out is considered to be a “bad” traffic class. Bad traffic classes remain completely blocked for a configurable amount of time.

To avoid putting traffic classes on trial that are inactive, the policy-manager immediately advances such traffic classes from state “tryout” to “good.” All other traffic classes must undergo the standard procedure. Unfortunately, it is impossible to start the procedure immediately because the server may suffer from residual load as a result of the attack. Therefore, the policy-manager waits until the load-controller settles down and indicates that the overload has passed.
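The per-class identification procedure of Figure 8 can be written as a small state machine. This is a sketch; the event names are ours, and the transition out of the “bad” state after the blocking period is our assumption:

```python
ASSUMED_BAD, TRYOUT, GOOD, BAD = "assumed_bad", "tryout", "good", "bad"

def next_state(state, event):
    """Transition function for one traffic class during recovery.

    Events (our names): 'tryout'   -- the class is admitted on probation,
                        'quiet'    -- no overload while on trial, or the
                                      class was inactive,
                        'overload' -- the class triggered another overload,
                        'expire'   -- a bad class's blocking period ended.
    """
    transitions = {
        (ASSUMED_BAD, "tryout"): TRYOUT,
        (ASSUMED_BAD, "quiet"):  GOOD,    # inactive classes skip the trial
        (TRYOUT, "quiet"):       GOOD,
        (TRYOUT, "overload"):    BAD,
        (BAD, "expire"):         ASSUMED_BAD,  # back on probation (assumption)
    }
    return transitions.get((state, event), state)
```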

The problem of delayed overload effects became evident in the context of SYN-flood attacks. If Linux 2.2.14 is used as the server OS, SYN packets that the attacker places in the pending connection backlog queue of the attacked server take 75 s to time out. Hence, the policy-manager must wait at least 75 s after entering the recovery procedure for a SYN-attack. Another wait may become necessary during the recovery period after one of the traffic classes has revealed itself as the malicious source, because the malicious source had a second chance to fill the server's pending connection backlog.


Figure 9: Testbed

4 Evaluation

To study QGuard's performance under various workloads, we implemented a load-generating server application. This server load generator can be configured to generate different types of load depending on the UDP or TCP port on which it received the request. The server application is fully parallel and uses threads from a thread pool to serve incoming requests. The generated load may consist of four configurable components: CPU activity, memory usage, file accesses, and the generation of large response messages. We also implemented a client-side request generator which could simulate an arbitrary number of parallel clients. Each client can be configured to submit requests at a specific rate with random (Poisson-distributed) or constant inter-arrival times to the load-generating server.

The load-generating server was run on its own Intel Pentium-based PC (450 MHz, 210 MB memory). Up to 400 simulated clients located on two other PCs request service at an average rate of 1 req/s. Client and server were connected through Fast Ethernet (see Figure 9).

For each test run, we established a baseline by comparing the QGuard-controlled server's performance against the server without QGuard. We found that QGuard fully achieved the goals for which it was designed: differential QoS and defense from overload attacks. We further found that QGuard degrades maximal server throughput only minimally, by 2-3% (see Figure 10). This degradation results from the limitation of the input rate and the fact that we configured QGuard to keep the server's resource utilization at or below 99%.

4.1 Providing Differential QoS

The main objective in QGuard's design was graceful QoS degradation under overload. To study QGuard's behavior, we split our experiments into two series: one that studies the differential treatment of homogeneous services, a typical preferred vs. standard services scenario, and one that studies heterogeneous services. In both cases, 200 clients were configured to request the preferred service while the number of clients requesting the standard service increased from 0 to 200. Each measurement point represents the average response time or throughput over a 12-minute interval.

Figure 10: Performance loss due to QGuard

In the first experiment, we set the relative share of preferred vs. standard service to 4:1. As Figure 11 shows, QGuard clearly distinguishes the preferred from the standard service: the throughput of the preferred service remains stable as we increase the number of clients for standard service. This compares favorably with the approximately 40% performance drop that clients of the preferred service would have experienced without QGuard protection. The results are even more dramatic in terms of clients' response time (Figure 12).

These results remain valid if one differentiates across clients instead of services. The only difference in the setup would be to configure traffic classes based on source addresses rather than destination services.

In a second series of experiments with services of heterogeneous workloads, we configured the preferred service to be CPU-intensive and the standard service to be memory-intensive. We set the priority of the preferred service to 10 and that of the standard service to 1. This large ratio is critical for maximizing system throughput because the maximal throughput of the CPU-intensive service was an order of magnitude higher than that


Figure 11: Throughput of preferred customers as the load from standard clients grows

of the memory-intensive service. If the weights are not chosen appropriately (approximately reflecting the different maximal throughput values), then rounding errors in favor of the resource-heavy service's shaping rate can lead to significant performance degradation for the preferred service.

Aside from the previously mentioned limitation, Figures 13 and 14 show that QGuard performs well even if the workload is very heterogeneous. Without QGuard, the performance of clients requesting preferred service drops severely as demand for the standard service increases. With QGuard, the performance of both services matches the QoS differentiation requirements exactly, i.e., clients of the preferred service are served at 10x the rate of clients of the standard service.

The extreme increase in the clients' response time (around 40 clients) is a result of Linux's memory management. The likelihood of swapping increases as more non-preferred clients compete for service until swapping is inevitable (40 simultaneous non-preferred customers). Beyond this point there is always a number of requests from non-preferred clients swapped out, so that the preferred customers' requests receive a larger share of the CPU, thus improving their response time. However, response times with QGuard are still 3 times better than without.

Figure 12: Response time seen by preferred customers

Figure 13: Throughput for preferred customers when standard customers request memory-intensive services (e.g., database joins)

4.2 Effective SYN-Flood Defense

One of the main motivations behind our research on inbound traffic controls for overload defense mechanisms was the recent surge in the number of DoS attacks experienced by major Internet servers. To put QGuard to the test, we configured the server to provide service to three different client populations: preferred customers from host 1, standard service customers from host 2, and best-effort service for the rest of the world. 200 seconds into the measurement, we launched a SYN-flood attack on the server from a spoofed (unreachable) address. Service was quickly disrupted (point [b] in Figure 15). However, after a short time [c], QGuard detects the DoS attack


Figure 14: Response time seen by preferred customers

Figure 15: Restoring service under SYN-flood attacks

and disallows all incoming traffic until all SYN packets that are currently present in the server's connection backlog time out (point [d]). Then it enables client accesses in priority order ([d] and [e]). Since neither standard nor preferred clients cause the SYN-flood, they are labeled as good traffic classes. Once QGuard admits all other clients [g], including the attacker, the service experiences another disruption, which is detected by QGuard at point [h]. Upon detection, best-effort clients are denied access to the server and service resumes [j] after all false SYN packets that the attacker placed on the server during its temporary admission time out. The graph shown in Figure 15 represents a typical test run (not the best case).

As we studied the behavior of QGuard under SYN-floods, we found that it is difficult to distinguish a SYN-flood from a surge in legitimate requests until spoofed SYN packets begin to time out. Since this timeout is very large in the regular Linux kernel (75 s), the recovery phase takes quite long. Almost all of the recovery time can be attributed to this generous timeout. One may argue that we should simply wipe out all SYN packets in the server's backlog once a SYN attack has been discovered to speed up recovery. However, this is not possible without violating the TCP protocol. Such a protocol alteration could break some client/server applications.

4.3 Tuning out the “Ping-of-Death”

The “ping-flood” attack exploits the fact that the processing of ICMP requests is kernel-based, thus generally preempting all other system activities. In this scenario an attacker, either directly or through so-called zombies, sends a never-ending stream of ICMP ping packets to a server. Although the per-request processing overhead of ping is quite low, the large number of packets leads to a complete lock-up of the server. To create a lock-up situation on the experimental server, we flooded it on both of its incoming interfaces at the maximal packet rate of 100 Mbps Ethernet.

At 100 s in Figure 16, the start of the ping-flood, the server's throughput plummets in response to the high workload placed on the system by ICMP request processing. QGuard responds to the excessive load immediately and reduces the acceptance rate for ICMP packets until service throughput almost reaches pre-attack levels (after 175 s). The reason why the maximal throughput is not quite reached is that QGuard still admits a small, manageable number of ping requests. QGuard's reaction is triggered by three events: almost all busy cycles are executed on behalf of the system, a large backlog of incoming packets, and high CPU utilization.

Since QGuard successfully defends the system from this kind of attack, it is quite safe to connect a QGuard-protected server directly to the Internet even through high-bandwidth links. However, QGuard can only mitigate the effect that such a ping-flood has on the incoming link's available bandwidth. The sources of the attack may still saturate incoming bandwidth by flooding the link. However, a QGuard-protected system does not aggravate the


Figure 16: QGuard’s response to an ICMP flood

problem by sending replies over the same congested link.

5 Related Work

A number of commercial and research projects address the problem of server overload containment and differential QoS. Ongoing research in this field can be grouped into three major categories: adaptive middleware [2, 3, 14], OS [4, 6, 12, 15, 17, 20, 23], and network-centric solutions [7, 19].

5.1 Middleware for QoS Differentiation

Middleware solutions coordinate graceful degradation across multiple resource-sharing applications under overload. Since the middleware itself has only little control over the load of the system, it relies on monitoring feedback from the OS and application cooperation to make its adaptation choices. Middleware solutions work only if the managed applications are cooperative (e.g., by binding to special communication libraries).

IBM's workload manager (WLM) [3] is the most comprehensive middleware QoS management solution. WLM provides insulation for competing applications and capacity management. It also provides response time management, thus allowing the administrator to simply specify target response times for each application. WLM will manage resources in such a way that these target response times are achieved. However, WLM relies heavily on strong kernel-based resource reservation primitives, such as I/O priorities and CPU shares, to accomplish its goals. Such rich resource management support is only found in resource-rich mainframe environments. Therefore, its design is not generally applicable to small or mid-sized servers. Moreover, WLM requires server applications to be WLM-aware. WebQoS [14] models itself after WLM but requires fewer application changes and weaker OS support. Nevertheless, it depends on applications binding to the system's dynamic communication libraries. WebQoS is less efficient since it manages requests at a later processing stage (after they reach user-space).

5.2 Operating System Mechanisms forOverload Defense and Differential QoS

Due to the inefficiencies of user-space software and the lack of cooperation from legacy applications, various OS-based solutions for the QoS management problem have been suggested. OS-level QoS management solutions do not require application cooperation, and they strictly enforce the configured QoS.

The Scout OS [23] provides a path abstraction, which allows all OS activity to be charged to the resource budget of the application that triggered it. When network packets are received, for example, they are associated with a path as soon as their path affiliation is recognized by the OS; they are then handled using the resources that are available to that path. Unfortunately, to be effective, Scout's novel path abstraction must be used directly by the applications. Moreover, Scout and the other OS-based QoS management solutions [4, 6, 12, 15, 20] must be configured in terms of raw resource reservations, i.e., they do not manage Internet services on the more natural per-request level. These solutions provide very fine-grained resource controls but require significant changes to current OS designs.

Mogul's and Ramakrishnan's work [17] on the receive livelock problem has been a great inspiration to the design of QGuard. Servers may suffer from the receive livelock problem if their CPU and interrupt handling mechanisms are too slow to keep up with the interrupt stream caused by incoming packets. They solve the problem by making


the OS slow down the interrupt stream (by polling or NIC-based interrupt mitigation), thus reducing the number of context switches and unnecessary work. They also show that a monitoring-based solution that uses interrupt mitigation only under perceived overload maximizes throughput. However, their paper only targets receive-livelock avoidance and does not consider the problem of providing QoS differentiation, an important feature for today's Internet servers.

5.3 Network-Centric QoS Differentiation

Network-centric solutions for QoS differentiation are becoming the solution of choice. This is due to the fact that they are even less intrusive than OS-based solutions. They are completely transparent to the server applications and server OSs. This eases the integration of QoS management solutions into standing server setups. Some network-centric solutions are designed as their own independent network devices [7], whereas others are kernel modules that piggyback on the server's NIC driver [19].

Among the network-centric solutions is NetGuard's Guardian [19], which is QGuard's closest relative. Guardian, which implements its firewalling solution on the MAC layer, offers user-level tools that allow real-time monitoring of incoming traffic. Guardian policies can be configured to completely block misbehaving sources. Unlike QGuard, however, Guardian's solution is not only static but also lacks QoS differentiation, since it only implements an all-or-none admission policy.

6 The Future of QGuard

Since the QGuard prototype still requires the addition of kernel modules to the Internet server's OS, some potential users may shy away from deploying it. We quoted the same issue earlier as a reason for the popularity of network-centric solutions to the QoS-management problem. Therefore, we believe that QGuard should follow the trend. It would ideally be built into a separate firewalling/QoS-management device. Such a device would be placed in between the commercial server and the Internet, thus protecting the server from overload. Such

a setup necessitates changes in the QGuard monitoring architecture. Further research is necessary to determine whether an SNMP-based monitor can deliver sufficiently up-to-date server performance digests so that QGuard's load-controller can still protect the server from overload without adversely affecting server performance.

Another future direction for the QGuard architecture would be to embed it entirely on server NICs. This would provide the ease of plug-and-play, avoid an additional network hop (required for a special QGuard frontend), and reduce the interrupt load placed on the server's OS by dropping packets before an interrupt is triggered. Another advantage of the NIC-based design over our current prototype is that it would be a completely OS-independent solution.

In this paper we have proven that it is possible to achieve both protection from various forms of overload attacks and differential QoS using a simple monitoring control feedback loop. Neither the core networking code of the OS nor applications need to be changed to benefit from QGuard's overload protection and differential QoS. QGuard delivers surprisingly good performance even though it uses only inbound rate controls. QGuard's simple design allows decoupling QoS issues from the underlying communication protocols and the OS, and frees applications from the QoS-management burden. In light of these great benefits, we believe that inbound traffic controls will gain popularity as a means of server management. The next step for future firewall solutions is to consider the results of this study and add traffic shaping policies and a simple overload control loop similar to QGuard's load-controller. As we have shown in this paper, these two mechanisms may suffice for the design of sophisticated QoS management solutions such as QGuard's policy-manager.

7 Acknowledgements

We gratefully acknowledge helpful discussions with, useful comments from, and the support of Kang Shin, Brian Noble, and Padmanabhan Pillai.

References

[1] Abdelzaher, T., and Bhatti, N. Web Content Adaptation to Improve Server Overload Behavior. In International World Wide Web Conference (May 1999).

[2] Abdelzaher, T. F., and Shin, K. G. QoS Provisioning with qContracts in Web and Multimedia Servers. In IEEE Real-Time Systems Symposium (Phoenix, AZ, December 1999).

[3] Aman, J., Eilert, C. K., Emmes, D., Yocom, P., and Dillenberger, D. Adaptive Algorithms for Managing Distributed Data Processing Workload. IBM Systems Journal 36, 2 (1997), 242-283.

[4] Banga, G., and Druschel, P. Lazy Receiver Processing (LRP): A Network Subsystem Architecture for Server Systems. In Second Symposium on Operating Systems Design and Implementation (October 1996).

[5] Bellovin, S. M. Security Problems in the TCP/IP Protocol Suite. Computer Communication Review 19, 2 (April 1989), 32-48.

[6] Bruno, J., Brustoloni, J., Gabber, E., Ozden, B., and Silberschatz, A. Retrofitting Quality of Service into a Time-Sharing Operating System. In USENIX Annual Technical Conference (June 1999).

[7] Cisco Inc. Local Director (White Paper). http://cisco.com/warp/public/cc/cisco/mkt/scale/locald/tech/lobalwp.htm, 2000.

[8] Dawson, T. Linux NET-3-HOWTO. 1999.

[9] Elliott, J. Distributed Denial of Service Attacks and the Zombie Ant Effect. IT Pro (March 2000).

[10] Floyd, S., and Jacobson, V. Link-Sharing and Resource Management Models for Packet Networks. Transactions on Networking 3, 4 (1995), 365-386.

[11] Garber, L. Denial-of-Service Attacks Rip the Internet. Computer (2000).

[12] Hand, S. M. Self-Paging in the Nemesis Operating System. In Proceedings of the Third USENIX Symposium on Operating Systems Design and Implementation (New Orleans, Louisiana, February 1999), USENIX, pp. 73-86.

[13] Hewlett-Packard Corp. Ensuring Customer E-Loyalty: How to Capture and Retain Customers with Responsive Web Site Performance. http://www.internetsolutions.enterprise.hp.com/webqos/products/overview/e-loyaltywhitepaper.pdf.

[14] Hewlett-Packard Corp. WebQoS Technical White Paper. http://www.internetsolutions.enterprise.hp.com/webqos/products/overview/wp.html, 2000.

[15] Jeffay, K., Smith, F., Moorthy, A., and Anderson, J. Proportional Share Scheduling of Operating System Services for Real-Time Applications. In Proceedings of the 19th IEEE Real-Time Systems Symposium (Madrid, December 1998).

[16] Keshav, S. An Engineering Approach to Computer Networking. Addison-Wesley Publishing Company, 1997.

[17] Mogul, J. C., and Ramakrishnan, K. K. Eliminating Receive Livelock in an Interrupt-Driven Kernel. Transactions on Computer Systems 15, 3 (August 1997), 217-252.

[18] Mogul, J. C., Rashid, R. F., and Accetta, M. J. The Packet Filter: An Efficient Mechanism for User-Level Network Code. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (November 1987), ACM.

[19] NetGuard, Inc. Guardian Real-time Performance Monitoring (RPM). http://www.netguard.com/supportdoc.html.

[20] Reumann, J., Mehra, A., Shin, K., and Kandlur, D. Virtual Services: A New Abstraction for Server Consolidation. In Proceedings of the 2000 USENIX Annual Technical Conference (June 2000), USENIX.

[21] Russell, P. IPCHAINS-HOWTO. http://www.rustcorp.com/linux/ipchains/HOWTO.html.

[22] Sayood, K. Introduction to Data Compression. Morgan Kaufmann Publishers, Inc., 1996.

[23] Spatscheck, O., and Peterson, L. L. Defending Against Denial of Service Attacks in Scout. In Third Symposium on Operating Systems Design and Implementation (February 1999), pp. 59-72.

[24] Steere, D. C., Goel, A., Gruenberg, J., McNamee, D., Pu, C., and Walpole, J. A Feedback-Driven Proportion Allocator for Real-Rate Scheduling. In Third Symposium on Operating Systems Design and Implementation (New Orleans, LA, February 1999), pp. 145-158.

[25] Stockle, O. Overload Protection and QoS Differentiation for Co-Hosted Web Sites. Master's thesis, ETH Zurich, June 1999.
