
Detecting Application Denial-of-Service Attacks: A Group-Testing-Based Approach∗

Ying Xuan1, Incheol Shin1, My T. Thai1, Taieb Znati2

Abstract—The application DoS attack, which aims at disrupting application services rather than depleting network resources, has emerged as a larger threat to network services than the classic DoS attack. Owing to its high similarity to legitimate traffic and its much lower launching overhead than the classic DDoS attack, this new assault type cannot be efficiently detected or prevented by existing detection solutions. To identify application DoS attacks, we propose a novel group-testing (GT) based approach deployed on back-end servers, which not only offers a theoretical method to obtain short detection delay and low false positive/negative rates, but also provides an underlying framework against general network attacks. More specifically, we first extend the classic GT model with size constraints for practical purposes, then redistribute the client service requests to multiple virtual servers embedded within each back-end server machine, according to specific testing matrices. Based on this framework, we propose a two-mode detection mechanism using dynamic thresholds to efficiently identify the attackers. The focus of this work lies in the detection algorithms proposed and the corresponding theoretical complexity analysis. We also provide preliminary simulation results regarding the efficiency and practicability of this new scheme. Further discussions over implementation issues and performance enhancements are also appended to show its great potential.

Index Terms—Application DoS, Group Testing, Network Security.

1 INTRODUCTION

DENIAL-of-Service (DoS) attacks, which aim to make a service unavailable to legitimate clients, have become a severe threat to Internet security [21]. Traditional DoS attacks mainly abuse the network bandwidth around Internet subsystems and degrade the quality of service by generating congestion in the network [2][21]. Consequently, several network-based defense methods have tried to detect these attacks by controlling traffic volume or differentiating traffic patterns at the intermediate routers [10][13]. However, with the recent boost in network bandwidth and application service types, the target of DoS attacks has shifted from the network to server resources and application procedures themselves, forming a new class of application DoS attacks [1][2].

As stated in [2], by exploiting flaws in application design and implementation, application DoS attacks exhibit three advantages over traditional DoS attacks which help them evade normal detection: the malicious traffic is indistinguishable from normal traffic; automated scripts remove the need for a large number of "zombie" machines or large bandwidth to launch the attack; and the attacks are much harder to trace due to multiple redirections at proxies. According to these characteristics, the malicious traffic can be classified into legitimate-like requests of two kinds: (i) those arriving at a high inter-arrival rate, and (ii) those consuming more service resources. We call these two cases "high-rate" and "high-workload" attacks, respectively, in this paper.

1 Department of Comp. and Info. Sci. & Engineering, University of Florida, Gainesville, FL 32611. Email: {yxuan,ishin,mythai}@cise.ufl.edu
2 Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15215. Email: [email protected]
A preliminary version of this paper: M. T. Thai, Y. Xuan, I. Shin, and T. Znati, "On Detection of Malicious Users Using Group Testing Techniques," in Proceedings of the Int. Conf. on Distributed Computing Systems (ICDCS), 2008.
∗ This work of Dr. My T. Thai is supported in part by NSF grant number CNS-0847869.

Since these attacks usually do not cause congestion at the network level and thus bypass network-based monitoring systems [21], detection and mitigation at the end-system of the victim servers have been proposed [1][3][19]. Among them, the DDoS shield [1] and the CAPTCHA-based defense [3] are representatives of the two major system-based techniques: session validation based on legitimate behavior profiles, and authentication using human-solvable puzzles. By enhancing the accuracy of the suspicion assignment for each client session, the DDoS shield can provide efficient session schedulers for defending against possible DDoS attacks. However, the overhead of per-session validation is not negligible, especially for services with dense traffic. CAPTCHA-based defenses introduce additional service delays for legitimate clients and are also restricted to human-interaction services.

A kernel observation and brief summary of our method is: the identification of attackers can be much faster if we find them by testing the clients in groups instead of one by one. The key problem is thus how to group clients and assign them to different server machines in a sophisticated way, so that if any server is found under attack, we can immediately identify and filter the attackers out of its client set. This problem resembles group-testing (GT) theory [14], which aims to discover defective items in a large population with the minimum number of tests, where each test is applied to a subset of items, called a pool, instead of testing items one by one. We therefore apply GT theory to this network security issue and propose specific algorithms and protocols to achieve high detection performance in terms of short detection latency and low false positive/negative rates. Since the detections are based merely on the status of service resource usage of the victim servers, no individual signature-based authentications or data classifications are required, so our approach may overcome the limitations of the current solutions.

GT was proposed during World War II and has been

applied to many areas since then, such as medical testing, computer networks, and molecular biology [6][14][16]. The advantages of GT lie in its prompt testing efficiency and fault-tolerant decoding methods [6]. To the best of our knowledge, the first attempts to apply GT to network attack defense were proposed in parallel by Thai et al. [5] (the preliminary work of this journal paper) and Khattab et al. [4]. The latter proposed a detection system based on "Reduced-Randomness Non-adaptive Combinatorial Group Testing" [8]. However, since that method only counts the number of incoming requests rather than monitoring the server status, it is restricted to defending against high-rate DoS attacks and cannot handle high-workload ones. (More details about this work are included in Section 6.)

From a system viewpoint, our defense scheme is to embed

multiple virtual servers within each physical back-end server, map these virtual servers to the testing pools in GT, and then assign clients to these pools by distributing their service requests to different virtual servers. By periodically monitoring indicators of resource usage in each server (e.g., average response time) and comparing them with dynamic thresholds, each virtual server can be judged as "safe" or "under attack". By means of the GT decoding algorithm, all the attackers can then be identified. The biggest challenges of this method are therefore three-fold: (1) how to construct a testing matrix that enables prompt and accurate detection; (2) how to regulate the service requests to match the matrix in a practical system; and (3) how to establish proper thresholds for the server resource usage indicators, so as to generate accurate test outcomes.

Similar to all the earlier applications of GT, this new

application to network security requires modifications of the classical GT model and algorithms, so as to overcome the obstacles of applying theoretical models to practical scenarios. Specifically, the classical GT theory assumes that each pool can contain as many items as needed and that the number of pools for testing is unrestricted. However, in order to provide real application services, virtual servers cannot have unbounded quantity or capacity, i.e., constraints on these two parameters are required to complete our testing model.

Our main contributions in this paper are as follows:

• Propose a new size-constrained GT model for practical DoS detection scenarios.
• Provide an end-to-end underlying system for GT-based schemes, without introducing complexity at the network core.
• Provide multiple dynamic thresholds for resource usage indicators, which help avoid erroneous tests triggered by legitimate bursts and diagnose servers handling various numbers of clients.
• Present three novel detection algorithms based on the proposed system, and show their high efficiency in terms of detection delay and false positive/negative rates via theoretical analysis and simulations.

Besides application DoS attacks, our defense system is applicable to DoS attacks on other layers, e.g., the protocol-layer SYN flood attack [13], where victim servers are exhausted by massive half-open connections. Although these attacks occur in different layers and in different styles, the victim machines will gradually run out of service resources and indicate anomalies. Since our mechanism relies only on the feedback of the victims, instead of monitoring client behaviors or properties, it is promising for tackling these attack types.

M =
[ 1 0 1 1 0 ]
[ 0 1 0 0 1 ]
[ 0 0 0 1 0 ]   ==testing==>   V = [ 1 1 0 0 1 ]^T
[ 0 0 1 0 1 ]
[ 0 1 0 1 1 ]

Fig. 1. Binary testing matrix M and testing outcome vector V.

The paper is organized as follows. In Section 2, we briefly introduce some preliminaries of GT theory, as well as the attacker model and the victim/detection model of our system. In Section 3, we propose the detection strategy derived from the adjusted GT model, and illustrate the detailed components of the presented system. Descriptions and latency analyses of three concrete detection algorithms are included in Section 4, while Section 5 provides the results of our simulations in terms of detection latency and false positive/negative rates. Existing defense schemes and related work are discussed in Section 6, and we conclude in Section 7 by summarizing our contributions and providing further discussion of the false positive/negative rates.

2 PRELIMINARIES

2.1 Classic Group Testing Model

Basic Idea. The classic GT model consists of t pools and n items (including at most d positive ones). As shown in Fig. 1, this model can be represented by a t × n binary matrix M, where rows represent the pools and columns represent the items. An entry M[i, j] = 1 if and only if the ith pool contains the jth item; otherwise, M[i, j] = 0. The t-dimensional binary column vector V denotes the test outcomes of these t pools, where a 1-entry represents a positive outcome and a 0-entry represents a negative one. Note that a positive outcome indicates that at least one positive item exists within the pool, whereas a negative one means that all the items in the pool are negative.

Classic Methods. The two traditional GT methods are adaptive and non-adaptive [14]. Adaptive methods, a.k.a. sequential GT, use the results of previous tests to determine the pools for the next test, and complete the test within several rounds. Non-adaptive GT methods instead employ a d-disjunct matrix [14], run multiple tests in parallel, and finish the test within only one round. We investigate both methods and propose three algorithms accordingly.

Decoding Algorithms. For sequential GT, at the end of each round, items in negative pools are identified as negative, while those in positive pools need to be tested further. Notice that an item is identified as positive only if it is the only item in a positive pool. Non-adaptive GT takes a d-disjunct matrix as the testing matrix M, in which no column is contained in the boolean sum of any other d columns. A simple decoding algorithm for this matrix type was proposed in [14]. A sketch of this algorithm can be shown using Fig. 1 as an example. Outcomes V[3] and V[4] are 0, so the items in pool 3 and pool 4 are negative, i.e., items 3, 4, and 5 are negative. If this matrix M is d-disjunct, items other than those appearing in the negative pools are positive; therefore items 1 and 2 are the positive ones.
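To make the decoding concrete, the following Python sketch (illustrative code, not from the paper) applies this elimination rule to the instance of Fig. 1: every item appearing in a negative pool is cleared, and, when M is d-disjunct, the surviving items are exactly the positive ones.

```python
# Naive decoder for non-adaptive group testing.
# M[i][j] = 1 iff pool i contains item j; V[i] = 1 iff pool i tested positive.
def decode(M, V):
    n = len(M[0])
    candidate = [True] * n             # every item is suspect initially
    for i, outcome in enumerate(V):
        if outcome == 0:               # negative pool: all of its items are negative
            for j in range(n):
                if M[i][j] == 1:
                    candidate[j] = False
    return [j for j in range(n) if candidate[j]]

# The matrix and outcome vector of Fig. 1; the paper's items 1 and 2
# correspond to 0-indexed columns 0 and 1.
M = [[1,0,1,1,0],
     [0,1,0,0,1],
     [0,0,0,1,0],
     [0,0,1,0,1],
     [0,1,0,1,1]]
V = [1,1,0,0,1]
print(decode(M, V))   # -> [0, 1]
```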

Apply to Attack Detection. A detection model based on GT can be set up as follows: assume that there are t virtual servers and n clients, among which d clients are attackers. Considering the matrix M_{t×n} in Fig. 1, the clients can be mapped to the columns and the virtual servers to the rows of M, where M[i, j] = 1 if and only if the requests from client j are distributed to virtual server i. With regard to the test outcome vector V, we have V[i] = 1 if and only if virtual server i has received malicious requests from at least one attacker; however, we cannot identify the attackers at once unless this virtual server is handling only one client. Conversely, if V[i] = 0, all the clients assigned to server i are legitimate. The d attackers can then be captured by decoding the test outcome vector V against the matrix M.

2.2 Attacker Model

The maximum destruction caused by the attacks includes the depletion of the application service resources on the server side, the unavailability of service access to legitimate users, and possible fatal system errors which require rebooting the server for recovery. We assume that any malicious behavior can be discovered by monitoring the service resource usage, based on dynamic value thresholds over the monitored objects. Data manipulation and system intrusion are out of the scope of this paper.

Similar to [1], we assume that the application interfaces presented by the servers can be readily discovered and that clients communicate with the servers using HTTP/1.1 sessions over TCP connections. We consider the case where each client provides a non-spoofed ID (e.g., a SYN cookie [9]), which is used to identify the client during our detection period. Although application DoS attacks are difficult to trace, by identifying the IDs of attackers, the firewall can block their subsequent malicious requests.

As mentioned in Section 1, the attackers are assumed to launch application service requests either at a high inter-arrival rate or with high workload, or both. The term "request" refers to either a main request or an embedded request for an HTTP page. Since the proposed detection scheme is orthogonal to session affinity, we do not consider the repeated one-shot attack mentioned in [1].

We further assume that the number of attackers d ≪ n, where n is the total number of clients. This assumption arises from the characteristics of this attack [2]. Due to the benefits of the virtual servers we employ, this constraint can be relaxed, but we keep it for the theoretical analysis in the current work.

2.3 Victim/Detection Model

The victim model in our general framework consists of multiple back-end servers, which can be web/application servers, database servers, or distributed file systems. We do not take classic multi-tier web servers as the model, since our detection scheme is deployed directly on the victim tier and identifies attacks targeting that same tier; multi-tier attacks should thus be separated into several classes to utilize this detection scheme. The victim model, along with the front-end proxies, is shown in Fig. 2. We assume that all the back-end servers provide multiple types of application services to clients using the HTTP/1.1 protocol over TCP connections. Each back-end server is assumed to have the same amount of resources. Moreover, the application services are provided by K virtual private servers (K is an input parameter), which are embedded in the physical back-end server machine and operate in parallel. Each virtual server is assigned an equal amount of static service resources, e.g., CPU, storage, memory, and network bandwidth. The operation of any virtual server does not affect the other virtual servers in the same physical machine. The reasons for utilizing virtual servers are two-fold: first, each virtual server can reboot independently, which makes recovery from possible fatal destruction feasible; second, the state transfer overhead for moving clients among different virtual servers is much smaller than that for transfers among physical server machines.

Fig. 2. Victim/detection model.

As soon as client requests arrive at the front-end proxy, they are distributed to multiple back-end servers for load balancing, whether session-sticky or not. Notice that our detection scheme sits behind this front-end tier, so the load-balancing mechanism is orthogonal to our setting. On being accepted by a physical server, a request is simply validated against the list of all identified attacker IDs (the blacklist). If it passes this authentication, it is distributed to one of the virtual servers within the machine by means of a virtual switch. This distribution depends on the testing matrix generated by the detection algorithm. By periodically monitoring the average response time to service requests and comparing it with specific thresholds fetched from a legitimate profile, each virtual server is associated with a "negative" or "positive" outcome. From these outcomes, a decision over the identities of all clients can be made across all physical servers, as discussed further in Section 3.


TABLE 1
Main Notations

n      number of clients
K      number of server machines
d      maximum number of attackers, 1 ≤ d ≤ n (practically, d ≪ n)
w      maximum number of clients that can be handled by a virtual server in parallel
r_leg  the maximum inter-arrival rate of legitimate requests from one client
P      length of time of each testing round
A      the set of servers used for testing
S      the set of suspect clients
I      the set of servers which are under attack
L      the set of clients connecting to negative servers
G      the number of non-testing machines

3 STRATEGY AND DETECTION SYSTEM

3.1 Size Constraint Group Testing

As mentioned in the detection model, each testing pool is mapped to a virtual server within a back-end server machine. Although the maximum number of virtual servers can be extremely large, since each virtual server requires enough service resources to manage client requests, it is practical to have the virtual server quantity (maximum number of servers) and capacity (maximum number of clients that can be handled in parallel) constrained by two input parameters K and w, respectively. Therefore, the traditional GT model is extended with these constraints to match our system setting.

Definition 3.1 (Size Constraint Group Testing (SCGT)): For any 0/1 matrix M, let w_i = Σ_{j=1}^{n} M[i, j] be the weight of the ith pool, and let t be the number of pools. The model is to identify the d defective items in the minimum period of time, by constructing a matrix M_{t×n} and conducting group testing procedures based on it, where w_i ≤ w for a given w, and t ≤ K for a given server quantity K.

The maximum number of attackers d is assumed to be known beforehand. Scenarios with non-deterministic d are out of the scope of this paper. In fact, these scenarios can be readily handled by first testing with an estimated d, then increasing d if exactly d positive items are found.
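As a minimal illustration (a hypothetical helper, not part of the paper's system), the two SCGT size constraints can be checked directly on a candidate 0/1 matrix:

```python
def satisfies_scgt(M, K, w):
    """Check the SCGT size constraints on a 0/1 testing matrix M.

    t <= K: no more pools than available virtual servers;
    w_i = sum_j M[i][j] <= w: no pool exceeds the server capacity.
    """
    if len(M) > K:
        return False
    return all(sum(row) <= w for row in M)

# Example: the 5x5 matrix of Fig. 1 satisfies SCGT for K = 5, w = 3.
M = [[1,0,1,1,0],[0,1,0,0,1],[0,0,0,1,0],[0,0,1,0,1],[0,1,0,1,1]]
print(satisfies_scgt(M, K=5, w=3))   # -> True
```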

3.2 Detection System

The implementation difficulties of our detection scheme are three-fold: how to construct a proper testing matrix M, how to distribute client requests based on M with low overhead, and how to generate test outcomes with high accuracy. We address the last two in this section and leave the first to the next section.

Table 1 lists the notations used throughout the paper.

3.2.1 System Overview

As mentioned in the detection model, each back-end server works as an independent testing domain, in which all its virtual servers serve as testing pools. In the following sections, we only discuss the operations within one back-end server; they are similar in all other servers. The detection consists of multiple testing rounds, and each round can be sketched in four stages (Fig. 4).

Fig. 3. Two-state diagram of the system.

Fig. 4. One testing round in DANGER mode.

First, generate and update the matrix M for testing.

Second, "assign" clients to virtual servers based on M. The back-end server maps each client to one distinct column of M and distributes an encrypted token queue to it. Each token in the queue corresponds to a 1-entry in the mapped column, i.e., client j receives a token with destination virtual server i iff M[i, j] = 1. Piggybacked with one token, each request is forwarded to a virtual server by the virtual switch. In addition, requests are validated on arrival at the physical servers for faked tokens or identified malicious IDs. This procedure ensures that all the client requests are distributed exactly as the matrix M regulates, and prevents any attacker from accessing virtual servers other than the ones assigned to it.

Third, all the servers are monitored for their service resource usage periodically; specifically, the arriving request aggregate (the total number of incoming requests) and the average response time of each virtual server are recorded and compared with dynamic thresholds to be introduced later. All virtual servers are associated with positive or negative outcomes accordingly.

Fourth, decode these outcomes and identify legitimate or malicious IDs. By following the detection algorithms (presented in the next section), all the attackers can be identified within several testing rounds.

To lower the overhead and delay introduced by the mapping and piggybacking of each request, the system is exempted from this procedure in the normal service state. As shown in Fig. 3, the back-end server cycles between two states, which we refer to as NORMAL mode and DANGER mode. Once the estimated response time (ERT) of any virtual server exceeds a profile-based threshold, the whole back-end server transfers to DANGER mode and executes the detection scheme. Whenever the average response time (ART) of every virtual server falls below the threshold, the physical server returns to NORMAL mode.

3.2.2 Configuration Details

Several critical issues regarding the implementation are as follows.

Session State Transfer. By deploying the detection service on the back-end server tier, our scheme is orthogonal to the session state transfer problem caused by load balancing at the reverse proxies (front-end tier). To simplify the discussion of implementation details, we assume that the front-end proxies distribute client requests strictly evenly to the back-end servers, i.e., without session stickiness. The way token queues are distributed, as described later, is tightly related to this assumption. However, even if the proxies conduct more sophisticated forwarding, the token queue distribution can be readily adapted by manipulating the token piggybacking mechanism at the client side accordingly.

Since the testing procedure requires distributing intra-session requests to different virtual servers, an overhead for maintaining consistent session state is incurred. Our motivation for utilizing virtual servers is to reduce this overhead to the minimum, since multiple virtual servers can retrieve the latest client state through shared memory, which resembles the principle of the Network File System (NFS). An alternative is to forward intra-session requests to the same virtual server, which calls for a longer testing period per round (to be discussed further later), but we prefer faster detection in this paper and thus adopt the former method.

Matrix Generation. The testing matrix M, which regulates which client's requests are distributed to which server, is the kernel of this paper. All three algorithms proposed in the next section concern the design of M for the purpose of shortening the testing period and decreasing the false positive/negative rates. Since the detection phase usually spans multiple testing rounds, M needs to be regenerated at the beginning of each round. The time overhead for calculating M is quite low, as shown via analytical proofs in Section 4.

Distributing Tokens. The two main purposes of utilizing tokens are associating each client with a unique, non-spoofed ID and assigning clients to sets of virtual servers based on the testing matrix. On receiving the connection request from a client, each back-end server responds with a token queue where each token is a 4-tuple: (client ID, virtual server ID, matrix version, encrypted key). "Client ID" refers to the unique non-spoofed ID of each client, which we assume unchanged during our testing period (DANGER mode). "Virtual server ID" is the index of each virtual server within the back-end server; this can be implemented as a simple index value, or through a mapping from the IP addresses of the virtual servers. The back-end server blocks out-of-date tokens by checking their "matrix version" value, to avoid messing up the request distribution with non-uniform matrices. The "encrypted key" is an encrypted value generated by hashing the former three values together with a secured service key. This helps rule out any faked tokens generated by attackers.

Assuming that the load balancing at the proxies is strictly even across all back-end servers, the client has to agree to piggyback each request with the token at the head of one token queue, and the next request with the token at the head of the next token queue, when receiving this application service. Since there are multiple token queues released by multiple back-end servers, it is non-trivial to implement the correct request distribution.
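One possible realization of these tokens is sketched below, assuming the "encrypted key" is an HMAC over the first three fields under the secret service key; the paper does not fix the primitive, and all names here are illustrative.

```python
import hmac, hashlib

SERVICE_KEY = b"secret-service-key"    # assumed secret held by the back-end server

def make_token(client_id, vserver_id, matrix_version):
    """Build a 4-tuple token: (client ID, virtual server ID, matrix version, keyed digest)."""
    payload = f"{client_id}|{vserver_id}|{matrix_version}".encode()
    digest = hmac.new(SERVICE_KEY, payload, hashlib.sha256).hexdigest()
    return (client_id, vserver_id, matrix_version, digest)

def verify_token(token, current_version):
    """Reject forged tokens (bad digest) and out-of-date tokens (stale matrix version)."""
    client_id, vserver_id, version, digest = token
    if version != current_version:
        return False
    expected = make_token(client_id, vserver_id, version)[3]
    return hmac.compare_digest(digest, expected)

tok = make_token("client-42", 3, matrix_version=7)
print(verify_token(tok, current_version=7))   # -> True
print(verify_token(tok, current_version=8))   # -> False (stale matrix version)
```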

Length of Testing Round P. Since we need to distribute the requests exactly as M regulates, if P is too short, some clients may not have distributed their requests to all their assigned servers, i.e., not all the 1-entries in M are matched by at least one request. For an attacker who launches a low-rate high-workload attack, its high-workload requests may then enter only a part of its assigned servers, so a false negative occurs in this case. However, if P is too long, the detection latency will be significantly increased. In this paper, we predefine a proper value for P that is long enough for each client to spread its requests to all the assigned servers, i.e., each column j needs to be matched by at least Σ_{i=1}^{t} M[i, j] requests. Therefore,

P = max_{1≤j≤n} ( Σ_{i=1}^{t} M[i, j] ) / r_min,

where r_min denotes the minimum inter-arrival request rate, provides a theoretical lower bound on P.

Legitimate Profile. The legitimate profile does not refer to a session behavior profile for legitimate clients; instead, it records the distribution of the ART on a virtual server receiving only legitimate traffic. Malicious requests inevitably stress the victim server machines, whose ART will usually be much higher than in normal cases. Therefore, ART can serve as an indicator of application resource usage. However, the resource usage varies over different time intervals (peak/non-peak time) due to the change in client quantity, so we also investigate the ART distributions for each possible number of clients, assuming that there are at most n clients.

A sample construction of this profile is as follows. The distributions of ART under legitimate traffic at different time intervals are obtained over several weeks after the system is established. Legitimate traffic can be obtained by de-noising measures and ruling out the influence of potential attacks [25]. The ith entry of the profile records the distribution of ART on a virtual server with i ∈ [1, n] clients. Specifically, we assign i clients to a virtual server, evenly divide the round length P into 100 sub-intervals (referred to as P_sub throughout the paper), and monitor the server ART within each sub-interval P_sub.

NORMAL mode and Transfer Threshold. To decrease the length of the detection period, during which additional state maintenance overhead and service delay cannot be avoided, the back-end server cluster provides normal service to clients, without any special regulations. This is referred to as NORMAL mode, which takes the ERT (estimated response time) as the monitoring object. Notice that ERT is an expected value of ART in the near future, and can be computed via

ERT = (1 − α) · ERT + α · ART,

where α is a decay weight for smoothing fluctuations. Since the inter-arrival rate and workload of client requests can be randomly distributed, it is difficult to perfectly fit the ART distribution using classic distribution functions. Considering that the Normal Distribution [18] provides an approximate fit to ART, we adopt a simplified threshold based on the Empirical Rule: if any virtual server has an ERT > μ + 4σ (μ and σ denote the expected value and standard deviation of the fitted ART distribution), the back-end server is probably undergoing an attack, and thus transfers to DANGER mode for detection. To enhance the accuracy of this threshold, more sophisticated distributions could be employed for fitting the samples; however, this would introduce higher computation overhead for the threshold.
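A minimal sketch of the NORMAL-mode monitor follows, assuming the profile supplies μ and σ for the current client count (the class and parameter names are illustrative, not the paper's implementation):

```python
class NormalModeMonitor:
    def __init__(self, alpha, mu, sigma):
        self.alpha = alpha                 # decay weight for the EWMA
        self.mu, self.sigma = mu, sigma    # fitted ART distribution from the legitimate profile
        self.ert = mu                      # start the estimate at the profiled mean

    def update(self, art_sample):
        """ERT = (1 - alpha) * ERT + alpha * ART; flag DANGER when ERT > mu + 4*sigma."""
        self.ert = (1 - self.alpha) * self.ert + self.alpha * art_sample
        return self.ert > self.mu + 4 * self.sigma   # True -> transfer to DANGER mode

mon = NormalModeMonitor(alpha=0.2, mu=10.0, sigma=2.0)
print(mon.update(11.0))    # -> False: within the profiled range
print(mon.update(60.0))    # -> True after the EWMA crosses mu + 4*sigma
```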

DANGER mode and Attack Threshold. In DANGER mode, besides the ART values, the back-end server simultaneously counts the arriving request aggregate within each P_sub for each virtual server. The motivation for this strategy arises from the fact that high-rate DoS attacks always saturate the server buffer with a flood of malicious requests. By periodically counting the number of requests arriving in the buffer, possible high-rate attacks can be detected even before the service resources are depleted.

We predefine R as the maximum legitimate inter-arrival rate, and derive an upper bound C_j on the arriving request aggregate for virtual server j within a period P_sub as

C_j = Σ_i M[j, i] · ⌈ ( Σ_{s=j+1}^{t} M[s, i] + R · P_sub ) / Σ_{k=1}^{t} M[k, i] ⌉.

Once this threshold is violated on a virtual server, that server is certainly undergoing a high-rate attack or flash-crowd traffic, and a positive outcome can be generated for this testing round.

The outcome generation by monitoring the ART value of a virtual server consists of the following steps:

1) check the ART distribution profile for the values of the parameters μ and σ, as mentioned in the NORMAL mode section;
2) among all the 100 sub-intervals P_sub (each of length P/100) within the current testing period P, if no violation ART ≥ μ + 4σ occurs in any P_sub, the virtual server gets a negative outcome for this testing round;
3) if ART ≥ μ + 4σ occurs in some sub-interval P_sub, this virtual server is in danger: either under attack or undergoing flash-crowd traffic. In this case, we wait until the end of the round to obtain the distribution of ART values over all P_sub intervals for a further decision;
4) if the ratio of "danger" sub-intervals with ART ≥ μ + 4σ to the total number of sub-intervals (100 in this case) exceeds the quantile regulated by the Empirical Rule, e.g., the 4% quantile for the confidence interval [μ + 2σ, ∞), this virtual server is labeled as positive;
5) in all other cases, the virtual server gets a negative outcome.
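These steps reduce to a small per-round check; the sketch below (illustrative code, with the 4% quantile from the text hard-coded as a default) labels one virtual server from its per-P_sub ART samples:

```python
def server_outcome(art_samples, mu, sigma, quantile=0.04):
    """Return 'positive' or 'negative' from the per-P_sub ART samples of one round.

    art_samples: the (nominally 100) ART values measured in this round's sub-intervals.
    """
    danger = sum(1 for art in art_samples if art >= mu + 4 * sigma)
    if danger == 0:
        return "negative"                  # step 2: no violation in any P_sub
    # Steps 3-5: some violations occurred; decide from the fraction of danger sub-intervals.
    return "positive" if danger / len(art_samples) > quantile else "negative"

samples = [10.0] * 90 + [30.0] * 10       # 10% of the sub-intervals violate mu + 4*sigma
print(server_outcome(samples, mu=10.0, sigma=2.0))   # -> positive
```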

After identifying up to d attackers, the system remains in DANGER mode and continues monitoring the ART of each virtual server for one more round. If all the virtual servers have negative outcomes, the back-end server is diagnosed as healthy and returns to NORMAL mode; otherwise further detections are executed (because of test errors).

With regard to the two monitored objects, C_j provides different counting thresholds for different virtual servers, while the ART profile supplies dynamic thresholds for virtual servers handling different numbers of clients. The combination of these two dynamic thresholds helps decrease the testing latency and the false rates.

4 DETECTION ALGORITHMS AND LATENCY ANALYSES

Based on the system framework above, we propose three detection algorithms in this section: SDP, SDoP, and PND. Note that the length of each testing round is a predefined constant P; hence, for simplicity, we analyze the algorithm complexity in terms of the number of testing rounds. Since all these algorithms are deterministic and introduce no false positives/negatives themselves (compared to the non-deterministic construction in [4]), all test errors result from inaccuracies of the dynamic thresholds. We discuss this in Section 7.

Based on our assumption, there is a unique ID for each client, so the two terms "ID" and "client" are interchangeable in this section. In addition, all the following algorithms are executed in each physical back-end server, which is an independent testing domain as mentioned. Therefore, the term "servers" denotes the K virtual servers in this section.

For simplicity, we assume that n ≡ 0 (mod K), n > w ≥ d ≥ 1, and |A| ≥ d + 1. Notice that the last inequality holds in practice, as the properties of application DoS attacks indicated in Section 1 suggest.

4.1 Sequential Detection with Packing

This algorithm exploits the benefit of classic sequential group testing, i.e., optimizing the grouping of subsequent tests by analyzing existing outcomes. As in traditional sequential testing, each client (column) appears in only one testing pool (server) at a time. However, to make full use of the available K servers, all servers conduct tests in parallel. Details can be found in Algorithm 1.

Algorithm 1 Sequential Detection with Packing (SDP)
1:  while |S| ≠ 0 do
2:    for all servers i in A do
3:      w_i ← ⌈|S|/|A|⌉  // Assign an even number of distinct clients to each server.
4:    end for
5:    G ← ⌈(n − (|S| − |A \ I| · w_i)) / w⌉  // Identified legitimate IDs are exempt from the following tests. Their subsequent requests are handled by G non-testing servers, which are selected from the K servers and only provide normal services (no testing). We call this process "packing" the clients into non-testing servers.
6:    for all servers i under attack do
7:      Q ← set of IDs on i
8:      if |Q| = 1 then
9:        S ← S \ Q  // This ID belongs to an attacker. Add it to the blacklist.
10:     end if
11:   end for
12:   S ← S \ L  // All IDs on safe servers (with negative test outcomes) are identified as legitimate.
13:   A ← {server 1, . . . , server (K − G)}  // The remaining G servers serve as non-testing machines.
14: end while

Algorithm Description. The basic idea of the SDP algorithm can be sketched as follows. Given a set S of suspect IDs, we first randomly assign them (i.e., distribute their requests) to the available testing servers in set A, where each server receives requests from approximately the same number of clients, roughly w_i = ⌈|S|/|A|⌉. In each testing round, we identify the IDs on the negative servers as legitimate clients, and "pack" them into a number G of non-testing machines. Since they need no more tests, only normal services are provided to them in the following rounds. As more testing servers speed up the tests, given at most K server machines in total, G should be minimized, as long as all identified legitimate clients can be handled by the non-testing w-capacity servers. Hence G = ⌈(n − (|S| − |A \ I| · w_i)) / w⌉.

With the assumption |A| ≥ d + 1, we have at least |A| − d servers with negative outcomes in each testing round; hence at least (|A| − d) · w_i legitimate IDs are identified. If any server containing only one active ID is found under attack, that ID surely belongs to an attacker. Its ID is then added to the blacklist and all its requests are dropped. The algorithm iterates until all IDs are identified as malicious or legitimate. Via the "packing" strategy, legitimate clients are exempt from the influence of potential attacks as soon as they are identified.
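To illustrate SDP end to end, the following simulation sketch (not the paper's implementation: pool outcomes are computed from a known attacker set instead of ART thresholds, and all names are illustrative) runs the packing loop until every ID is classified:

```python
import math, random

def sdp(n, K, w, attackers):
    """Simulate SDP; returns the identified attacker set and the number of rounds used."""
    S = set(range(n))                     # suspect IDs
    identified, legit, rounds = set(), set(), 0
    A = K                                 # number of testing servers
    while S:
        rounds += 1
        suspects = list(S)
        random.shuffle(suspects)
        wi = math.ceil(len(suspects) / A)
        pools = [set(suspects[i*wi:(i+1)*wi]) for i in range(A)]
        for pool in pools:
            if pool & attackers:          # positive outcome
                if len(pool) == 1:        # isolated attacker: blacklist it
                    identified |= pool
                    S -= pool
            else:                         # negative outcome: whole pool is legitimate
                legit |= pool
                S -= pool
        G = math.ceil(len(legit) / w) if legit else 0   # non-testing "packing" servers
        A = max(K - G, 1)                 # remaining testing servers
    return identified, rounds

random.seed(1)
print(sdp(n=1000, K=20, w=100, attackers={3, 421, 777}))
```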

Performance Analysis. With regard to the computational overhead of the matrix M in this context, it is a re-mapping of the suspect clients to the testing servers based on simple strategies and previous testing outcomes; it is thus negligible, i.e., O(1). The time cost of each testing round is at most P, including the re-computation time of the testing matrix.

Beyond this, the overall time complexity of the algorithm in terms of the number of testing rounds is given in the following theorems.

Lemma 4.1: The number of available testing server machines satisfies |A| ∈ [K − ⌈(n − d)/w⌉, K].

Proof: In each round, the total number of identified legitimate IDs, n − (|S| − |A \ I| · w_i), is non-decreasing. Hence, the number of servers used for serving identified legitimate IDs, G = ⌈(n − (|S| − |A \ I| · w_i)) / w⌉, is non-decreasing. Therefore |A| = K − G finally converges to K − ⌈(n − d)/w⌉.

Theorem 4.1: The SDP algorithm can identify all the d attackers within at most O(log_{K′/d} (n/d)) testing rounds, where K′ = K − ⌈(n − d)/w⌉.

Proof: In each round, at most d servers are under attack, i.e., at most d|S|/|A| IDs remain suspect. Since at the end of the algorithm all legitimate IDs are identified, at most d malicious IDs remain suspect. If they happen to be handled by d different servers, i.e., each positive server contains exactly one attacker, the detection is complete. If not, one more testing round is needed. Assume that we need at most j testing rounds in total; then n(d/|A|)^{j−1} = d, where K − ⌈(n − d)/w⌉ ≤ |A| ≤ K. Therefore, j ≤ log_{K′/d} (n/d) + 1, where K′ = K − ⌈(n − d)/w⌉.

4.2 Sequential Detection without Packing

Algorithm 2 Sequential Detection without Packing (SDoP)
1:  w_i ← number of IDs on server i
2:  for all servers i do
3:    w_i ← ⌈|S|/|A|⌉  // Evenly assign clients to servers as SDP does.
4:  end for
5:
6:  while |S| ≠ 0 do
7:    Randomly reassign the |S| suspect IDs to the K servers, keeping legitimate IDs unmoved.
8:    L ← set of IDs on safe servers
9:    S ← S \ (S ∩ L)  // |S ∩ L| IDs are identified as legitimate.
10:   for all servers i under attack do
11:     Q ← set of IDs on i
12:     if |Q ∩ S| = 1 then
13:       S ← S \ Q  // The client in Q ∩ S is an attacker and is added to the blacklist.
14:     end if
15:   end for
16:   for all servers i do
17:     if w_i = w then  // The volume approaches the capacity.
18:       Reassign all n − |S| legitimate IDs to the K servers, and go to 6.  // Load balancing.
19:     end if
20:   end for
21: end while

Considering the potential overload problem arising from the "packing" scheme adopted in SDP, we propose another algorithm in which legitimate clients do not shift to other servers after they are identified. This emerges from the observation that legitimate clients cannot affect the test outcomes, since they are negative. Algorithm 2 gives the abstract pseudocode of this SDoP scheme. Notice that in this algorithm for the DANGER mode, requests from one client are still handled by one server, as in SDP.

Algorithm Description. The basic idea of the SDoP algorithm can be sketched as follows. Given a suspect ID set S with initial size n, evenly assign the IDs to the K server machines, as SDP does in the first round. In the following rounds, assign the suspect IDs to all K servers instead of only the |A| available ones. Identified legitimate IDs are never moved until their servers are about to be overloaded; in that case, all legitimate IDs are reassigned over the K machines to balance the load. For a server with a positive outcome, the IDs active on it but not included in the set of identified legitimate ones, i.e., the suspect IDs, remain suspect. However, if there is only one such suspect ID on a positive server, this ID is certainly an attacker.
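One SDoP round can be sketched as below, under the same simplifying assumptions as the SDP sketch (outcomes come from a known attacker set; names are illustrative): legitimate IDs stay where they are, and only the suspect IDs are reshuffled over all K servers.

```python
import random

def sdop_round(S, servers, attackers):
    """One SDoP round. S: suspect ID set (mutated in place); servers: list of K sets
    holding already-identified legitimate IDs. Returns the attackers found this round."""
    K = len(servers)
    suspects = list(S)
    random.shuffle(suspects)
    placed = [set() for _ in range(K)]        # suspects placed on each server this round
    for idx, cid in enumerate(suspects):
        placed[idx % K].add(cid)
    found = set()
    for i in range(K):
        if (placed[i] | servers[i]) & attackers:   # positive outcome
            if len(placed[i]) == 1:                # lone suspect on a positive server
                found |= placed[i]
                S -= placed[i]
        else:                                      # negative outcome: suspects here are legitimate
            servers[i] |= placed[i]                # they stay on this server afterwards
            S -= placed[i]
    return found
```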

Performance Analysis. It is easy to see that the computational overhead of the testing matrix M is similar to that of the SDP algorithm and can therefore be ignored. The following lemmas and theorem give the overall detection delay of the algorithm in terms of the number of testing rounds.

Lemma 4.2: The number of IDs w_i on server i does not exceed the server capacity w until round j, where j = log_{K/d} ( n / (n − w(K − d)) ).

Proof: At most d servers get positive outcomes in each round. If we assume round j is the last round with no overloaded servers, then the maximum number of IDs on one server at round j is w^j_max = Σ_{i=1}^{j} (n/K)(d/K)^{i−1}. Since round j + 1 has at least one overloaded server, we have w^j_max = w, and thus j = log_{K/d} ( n / (n − w(K − d)) ).


Lemma 4.3: All malicious IDs are identified within at most O(log_{K/d} (n/d)) testing rounds if w ≥ (n − d)/(K − d).

Proof: In each round with no overloaded servers, at most (d/K)|S| IDs remain suspect. After i = log_{K/d} (n/K) rounds, at most K suspect IDs are left, so only one more round is needed to finish the testing. Hence, we need i + 1 = log_{K/d} (n/d) rounds. Since this requires that no servers are overloaded within these rounds, we need i + 1 < j = log_{K/d} ( n / (n − w(K − d)) ), which yields w ≥ (n − d)/(K − d).

Lemma 4.4: All malicious IDs are identified within at most O(log_{K/d} (n/d)) testing rounds if 1 ≤ w ≤ (n − d)/(K − d).

Proof: If w ≤ (n − d)/(K − d), reassignments of legitimate IDs are needed after round j = log_{K/d} ( n / (n − w(K − d)) ). Assume that the algorithm needs x more testing rounds, in addition to i + 1 = log_{K/d} (n/d), to reassign and balance the legitimate IDs on all servers. Since the position changes of these legitimate IDs neither influence test outcomes nor delay the identification of the suspect set S (they take place simultaneously), we have

x ≤ i + 1 − j
  = log_{K/d} (n/d) − log_{K/d} ( n / (n − w(K − d)) )
  = log_{K/d} ( (n − w(K − d)) / d )
  ≤ log_{K/d} (n/d).

Therefore, the total number of testing rounds is at most O(log_{K/d} (n/d)).

Theorem 4.2: The SDoP algorithm can identify all attackers within at most O(log_{K/d} (n/d)) testing rounds.

Proof: Directly from Lemmas 4.3 and 4.4.

4.3 Partial Non-adaptive Detection

In the two sequential algorithms above, we cannot identify any attacker until each of them is isolated on a virtual server, which may increase the detection latency. Therefore, we propose a hybrid of the sequential and non-adaptive methods in this section. In this scenario, the requests from the same client are received and responded to by different servers in a round-robin manner. Different from SDP and SDoP, a d-disjunct matrix is used as the testing matrix in this scheme, and attackers can be identified without the need to isolate them on servers.

As mentioned in Section 2, a matrix M is called d-disjunct if no single column is contained in the boolean sum of any other d columns, and all positive items can then be identified within one round. Numerous construction methods for d-disjunct matrices have been proposed in [14][27], among which Du's method [14] has better performance in terms of size complexity (number of rows). We therefore adopt this method as a subroutine to generate M for each testing round. In the following analysis, we refer to this subroutine as dDisjunct(n, d), which takes n and d as input and outputs a d-disjunct matrix with n columns.
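The defining property can be verified directly by brute force for small matrices; the check below is illustrative only (it enumerates all d-subsets of columns, so it does not scale) and is not the construction of [14]:

```python
from itertools import combinations

def is_d_disjunct(M, d):
    """True iff no column of M is contained in the boolean sum (OR) of any d other columns."""
    t, n = len(M), len(M[0])
    cols = [{i for i in range(t) if M[i][j]} for j in range(n)]   # support of each column
    for j in range(n):
        others = [c for k, c in enumerate(cols) if k != j]
        for group in combinations(others, d):
            if cols[j] <= set().union(*group):    # column j covered by the OR of d others
                return False
    return True

# The Fig. 1 matrix is not even 1-disjunct: column 0's support {row 0}
# is contained in column 2's support {rows 0, 3} (0-indexed).
M = [[1,0,1,1,0],[0,1,0,0,1],[0,0,0,1,0],[0,0,1,0,1],[0,1,0,1,1]]
print(is_d_disjunct(M, 1))   # -> False
```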

Lemma 4.5: Let T(n) be the smallest number of rows in the obtained d-disjunct matrix; then

T(n) = min{ (2 + o(1)) d² log₂² n / log₂² (d log₂ n), n }.

Next we introduce our PND detection method using a d-disjunct matrix; for now we assume that the row weight of the constructed matrix does not exceed the server capacity w. Algorithms with respect to this weight constraint will be investigated in the future. Considering different ranges of the input K, the algorithm differs in two cases.

Case 1: K ≥ T(n). All n suspect IDs can be assigned to T(n) servers according to a T(n) × n d-disjunct matrix M, and thus identified within one testing round.

Case 2: K < T(n). With inadequate servers to identify all n clients in parallel, we can iteratively test a part of them, say n′ < n, where T(n′) ≤ |A|. The detailed scheme is depicted as follows, and the pseudocode is shown in Algorithm 3.

Algorithm Description. First, evenly assign the n suspect IDs to all K servers, as the other algorithms do. After the first round, we can pack all identified legitimate IDs into G = ⌈(n − |S|)/w⌉ servers; therefore, |A| = K − G servers are available for testing afterwards. If |A| ≥ T(|S|), we can construct a d-disjunct matrix M of size |A| × |S| by calling dDisjunct(|S|, d) and identify all IDs in one more testing round. If instead |A| < T(|S|), we first find the maximum n′ (1 ≤ n′ ≤ |S|) such that T(n′) ≤ |A|; second, partition the set S into two disjoint sets S1 and S2 with |S1| = n′; third, pack all of S2 onto the G non-testing server machines and call dDisjunct(|S1|, d) with updated d to identify all IDs in S1; then drop the attackers, pack all legitimate IDs into the G machines, and continue to test S = S2. This process iterates until all IDs are identified.

Algorithm 3 Partial Non-adaptive Detection with K < T(n)
1:  Evenly assign the n IDs to the K servers with no two servers having the same client IDs. Pack all legitimate IDs into G servers.
2:  |A| ← K − G  // Update the number of testing servers.
3:  if |A| ≥ T(|S|) then
4:    Run dDisjunct(|S|, d), decode from the obtained d-disjunct matrix, and identify all IDs; S ← ∅.  // Testing finishes.
5:  else
6:    while |S| ≠ 0 do
7:      Partition S into S1, S2 where |S1| = n′ and n′ = max n′′ ∈ [1, |S|] satisfying T(n′′) ≤ |A|.
8:      Run dDisjunct(|S1|, d), decode from the obtained d-disjunct matrix, and identify all IDs in S1; update d.
9:      Pack the legitimate IDs in S1 into the G machines, update G and A; S ← S2.  // Iterate with suspect ID set S2.
10:   end while
11: end if
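Putting Case 2 together, a compact sketch follows (illustrative only: T(n′) is approximated by n′, as in the analysis below, and dDisjunct/test_pools are assumed callables for the matrix construction and the pool tests):

```python
import math

def pnd_case2(S, K, w, d, dDisjunct, test_pools):
    """Sketch of PND Case 2 (K < T(n)); T(n') is approximated by n'.

    dDisjunct(n, d): returns a d-disjunct 0/1 matrix with n columns.
    test_pools(M, ids): returns the 0/1 outcome vector of the pools M defines over ids.
    """
    S = list(S)
    attackers = set()
    total_legit = 0
    while S:
        G = math.ceil(total_legit / w)        # non-testing servers packing legitimate IDs
        A = max(K - G, 1)                     # servers left for testing
        n1 = min(len(S), A)                   # largest n' with T(n') <= |A|, using T(n') ~ n'
        S1, S2 = S[:n1], S[n1:]
        M = dDisjunct(len(S1), d)
        V = test_pools(M, S1)
        # Decode as in Section 2: every ID appearing in a negative pool is legitimate.
        legit = {S1[j] for i, v in enumerate(V) if v == 0
                       for j in range(len(S1)) if M[i][j] == 1}
        found = set(S1) - legit
        attackers |= found
        d = max(d - len(found), 1)            # remaining attacker budget for the next call
        total_legit += len(legit)
        S = S2                                # iterate on the untested part
    return attackers
```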

Performance Analysis. The time complexity of the function dDisjunct(n, d) is O(sqn), i.e., O(Kn) with sq ≤ K. As mentioned in Section 3, this can be decreased to an O(1) time cost in real time. Since the upper bound (2 + o(1)) d² log₂² n / log₂² (d log₂ n) on T(n) is too complicated to use in the analysis, we use T(n) = n to get a rough estimate of the algorithm complexity in terms of the number of rounds. Moreover, we further investigate the algorithm's performance based on an optimal d-disjunct matrix, so as to demonstrate the potential of this scheme.

Lemma 4.6: The PND algorithm identifies all the attackers within at most O(dw²/(Kw − w − n)) testing rounds.

Proof: Consider the worst case. In each round, assuming no attackers were identified and filtered out previously, we have T(n′) + ⌈(n − n′)/w⌉ = K. Since T(n′) ≤ n′, we get n′ ≥ (Kw − w − n)/(w − 1). Therefore, the maximum number of testing rounds needed is (dn/K) / ((Kw − w − n)/(w − 1)) ≤ dw²/(Kw − w − n) = O(dw²/(Kw − w − n)).

This complexity of PND is not as good as that of the previous two algorithms. However, we are still interested in how PND would perform if the d-disjunct matrix construction method were improved. Since some tight lower bounds for T(n) have been proposed, we use one of them [16] to compute T(n) and investigate the corresponding performance of PND next.

Lemma 4.7 (D'yachkov–Rykov lower bound [16]): Let n > w ≥ d ≥ 2 and t > 1 be integers. For any superimposed (d − 1, n, w)-code ((d, n, w)-design) X of length t (X is called a (d − 1)-disjunct matrix with t rows and n columns), the following inequality holds:

t ≥ ⌈dn/w⌉.

The following performance analysis is based on this lower bound of T(n).

Corollary 4.1: In the PND algorithm, given w ≥ d ≥ 1, we have n′ = min{ |A|w/(d + 1), n − Kw + w + |A|w }.

Proof: According to Lemma 4.7, with the number of columns n′′ ∈ [1, |S|], we have

|A| ≥ ⌈(d + 1)n′′ / w⌉ ≥ (d + 1)n′′ / w  ⇒  n′ = max n′′ ≤ |A|w/(d + 1).

Meanwhile, in the PND algorithm, for each round i we have

|A_i| ≥ K − ⌈(n − n′)/w⌉  ⇒  n′ ≤ n − Kw + w + |A|w.

Lemma 4.8: In any testing round i:

1) n′ = |A_i|w/(d + 1) when K ∈ (d, (dw + n + w)/w];
2) n′ = n − Kw + w + |A_i|w when K ∈ [k1, (dn + n + w)/w),

where

k1 = ( dw + w + n + √((dw + w + n)² + 4wnd²) ) / (2w).

Proof: According to Corollary 4.1,

1) We have K ≤ (dw + n + w)/w ⟺ wK ≤ dw + n + w, and |A| ≥ d + 1 ⟺ |A|wd/(d + 1) ≥ wd; hence n + w − Kw + |A|w ≥ |A|w/(d + 1).

2) In order to get |A|w/(d + 1) ≥ n − Kw + w + |A|w, we need |A| ≤ ((K − 1)w − n)(d + 1)/(wd), which requires K − (K − d)n/(Kw) ≤ ((K − 1)w − n)(d + 1)/(wd), and hence wK² − (dw + w + n)K − nd² ≥ 0. Solving this inequality, we have K ∈ [k1, +∞). Note that if K ≥ ⌈(dn + n)/w⌉, no partitions are needed in the PND algorithm, and since k1 ≤ (dn + n + w)/w, we have K ∈ [k1, (dn + n + w)/w). Moreover, k1 > (dw + w + n)/w ≥ (d² + w + n)/w, so there is no overlap between these two intervals.

Therefore, we split K ∈ (d, +∞) into four disjoint intervals and, besides the interval K ∈ [(dn + n + w)/w, +∞) treated above, study which intervals of K yield O(1) testing rounds in the worst case, as well as the complexity for the other intervals:

• I: K ∈ (d, (dw + n + w)/w] yields n′ = |A|w/(d + 1);
• II: K ∈ ((dw + n + w)/w, k1) yields n′ = min{ |A|w/(d + 1), n − Kw + w + |A|w };
• III: K ∈ [k1, (dn + n + w)/w) yields n′ = n − Kw + w + |A|w;
• IV: K ∈ [(dn + n + w)/w, +∞) yields ONE testing round in total.

Lemma 4.9: The PND algorithm needs at most $O(1)$ testing rounds with $K \in \left[k_2, \frac{dw+n+w}{w}\right]$, where
$$d \le \frac{w + \sqrt{w^2 - 4n^2(n-w)}}{2(n-w)}$$
and
$$k_2 = \frac{n + w + \sqrt{n^2 + w^2 + 2nw - 4n^2w + 4d^2wn + 4dwn}}{2w}.$$


Proof: Since at least one server gets a positive outcome in the first testing round, we have
$$|A_0| \ge K - \left\lceil \frac{(K-1)n}{Kw} \right\rceil.$$
With simple algebraic computations, we can reach the interval $\left[k_2, \frac{dw+n+w}{w}\right]$ within interval I, on the condition that
$$d \le \frac{w + \sqrt{w^2 - 4n^2(n-w)}}{2(n-w)};$$
however, for intervals II and III, no such detailed subintervals of $K$ yielding $O(1)$ testing rounds can be obtained.

Lemma 4.10: Within interval I, the PND algorithm can identify all IDs within $O\!\left(d + \frac{K}{\sqrt{n}}\right)$ testing rounds.

Proof: We derive the time complexity from the following recurrence.

Starting round 0: $|S_0| \le \frac{dn}{K}$, and $K - \left\lceil \frac{(K-1)n}{Kw} \right\rceil \le |A_0| \le K - \left\lceil \frac{(K-d)n}{Kw} \right\rceil$.

Ending round $T$: $0 < |S_T| \le \frac{|A_T|w}{d+1}$.

Iteration: for all $i \in [0, T-1]$ we have
$$|S_{i+1}| = |S_i| - \frac{|A_i|w}{d+1}$$
and
$$K - \left\lceil \frac{n - |S_{i+1}|}{w} \right\rceil \le |A_{i+1}| \le K - \left\lceil \frac{n - |S_{i+1}| - d}{w} \right\rceil,$$
hence
$$\begin{cases} |A_{i+1}| \ge K - \dfrac{n}{w} + \dfrac{|S_i|}{w} - \dfrac{|A_i|}{d+1} - 1, \\[2mm] |S_0| \le \displaystyle\sum_{i=0}^{T} \frac{|A_i|w}{d+1}. \end{cases}$$

In order to estimate the maximum time cost, we use $|S_0| = \frac{dn}{K}$ to initiate the worst starting case. Solving this recurrence, we get the following inequality:
$$\frac{Kw}{2(d+1)}T^2 - \left(\frac{Kw}{2(d+1)} + \frac{dn}{K} - K + \frac{n}{w} + w\right)T - \left(K - \frac{(K-1)n}{w} - \frac{dn(d+2)}{K} + w - 1\right) \le 0,$$
therefore
$$T \le \frac{d+1}{Kw}\left(\alpha + \sqrt{\frac{\beta}{d+1} + \alpha^2}\right),$$
where
$$\alpha = \frac{Kw}{2(d+1)} + \frac{dn}{K} + \frac{n}{w} - K + w$$
and
$$\beta = 2K^2w - 2K^2n + 2Kn - 2Kw + 2Kw^2 - 2d(d+2)nw.$$
Since $\frac{n}{w} \le K$, $wK \le dw + n + w$, and $n > w$, we have $\alpha > 0$ and $\beta > 0$. So with trivial computation we can get
$$T \le \frac{2\alpha(d+1)}{Kw} + \frac{\sqrt{\beta}}{d+1} < 1 + \frac{2(d+1)^2}{K} + \sqrt{\frac{(4K-2)(d+1)}{n}} + \frac{2(d+1)}{d+1} < 1 + 2(d+1) + \sqrt{\frac{4K(d+1)}{n}} + 2 < 3 + \sqrt{2 + 2d} + \frac{2K}{\sqrt{n}}.$$
Therefore, PND will complete the identification within at most $O\!\left(d + \frac{K}{\sqrt{n}}\right)$ testing rounds.

Note that since K is always much smaller than n, the complexity will in fact approach O(d).

Lemma 4.11: Within interval III, PND can identify all IDs within at most $O(d)$ testing rounds.

Proof: Similarly, we can get $T \le 2d + 1 + 2\sqrt{\frac{1+d}{4}}$ by solving the following recurrence.

Starting round 0: $|S_0| = \frac{dn}{K}$, and $K - \left\lceil \frac{(K-1)n}{Kw} \right\rceil \le |A_0| \le K - \left\lceil \frac{(K-d)n}{Kw} \right\rceil$.

Ending round $T$: $0 < |S_T| \le n - Kw + w + |A_T|w$.

Iteration: $\forall i \in [0, T-1]$, $|S_{i+1}| = |S_i| - (n - Kw + w + |A_i|w)$ and $K - \left\lceil \frac{n - |S_{i+1}|}{w} \right\rceil \le |A_{i+1}| \le K - \left\lceil \frac{n - |S_{i+1}| - d}{w} \right\rceil$.

Hence, PND will complete the identification within at most $O(d)$ testing rounds.

Corollary 4.2: Within interval II, the PND algorithm can identify all IDs within $O\!\left(d + \frac{K}{\sqrt{n}}\right)$ testing rounds.

Proof: According to Lemmas 4.10 and 4.11, the time complexity of the PND algorithm depends on the value of $n'$ at each round; since $n'$ within interval II oscillates between $\frac{|A|w}{d+1}$ and $n - Kw + w + |A|w$, the time complexity is at most $O\!\left(d + \frac{K}{\sqrt{n}}\right)$.

Theorem 4.3: Given $1 \le d \le w$, the PND algorithm can identify all IDs within
1) at most $O\!\left(d + \frac{K}{\sqrt{n}}\right)$ testing rounds when $K \in (d, k_1)$, whilst at most $O(1)$ testing rounds when $K \in \left[k_2, \frac{dw+n+w}{w}\right]$ on the condition that $d \le \frac{w + \sqrt{w^2 - 4n^2(n-w)}}{2(n-w)}$;
2) at most $O(d)$ testing rounds when $K \in \left[k_1, \frac{dn+n+w}{w}\right)$;
3) at most $O(1)$ testing rounds when $K \in \left[\frac{dn+n+w}{w}, +\infty\right)$;
where
$$k_1 = \frac{dw + w + n + \sqrt{(dw+w+n)^2 + 4wnd^2}}{2w} \quad \text{and} \quad k_2 = \frac{n + w + \sqrt{n^2 + w^2 + 2nw - 4n^2w + 4d^2wn + 4dwn}}{2w}.$$

Although the number of testing rounds needed differs among these three algorithms, the time complexity of computing each testing round is comparable in practice. This cost is clearly negligible for SDP and SDoP, but not for the PND algorithm, which involves polynomial computation over a Galois field. However, considering that upper bounds on both the number of clients n and the number of attackers d can be estimated, the detection system can pre-compute the d-disjunct matrices for all possible (n, d) pairs offline and fetch the results in real time. The overhead is thereby decreased to O(1), and the client requests can be smoothly redistributed at the turn of each testing round without suffering long delays from matrix updates.

Fig. 5. Status of the back-end server: (a) service resource usage ratio; (b) average response time.
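A minimal sketch of this offline pre-computation strategy follows; the class, method names, and key layout are our own assumptions, and dDisjunct here is only a placeholder for the Galois-field construction of Section 3, not its implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Offline cache of d-disjunct matrices keyed by (n, d), so that a testing
// round only performs an O(1) lookup at run time.
public class MatrixCache {
    private final Map<Long, boolean[][]> cache = new HashMap<>();

    private static long key(int n, int d) { return ((long) n << 20) | d; }

    // Offline phase: build matrices for every (n, d) pair we may encounter,
    // up to the estimated upper bounds on clients and attackers.
    public void precompute(int maxClients, int maxAttackers) {
        for (int n = 1; n <= maxClients; n++)
            for (int d = 1; d <= Math.min(n, maxAttackers); d++)
                cache.put(key(n, d), dDisjunct(n, d));
    }

    // Online phase: O(1) fetch at the turn of each testing round.
    public boolean[][] fetch(int n, int d) { return cache.get(key(n, d)); }

    private boolean[][] dDisjunct(int n, int d) {
        // Placeholder only: the actual Galois-field construction of Section 3
        // would fill in the pool assignments here.
        return new boolean[d * d][n];
    }
}
```

With the construction cost paid offline, fetching a matrix at the turn of a testing round is a single hash lookup, matching the O(1) overhead claimed above.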

5 SIMULATION CONFIGURATIONS AND RESULTS

To demonstrate the theoretical complexity results shown in the previous section, we conduct a simulation study of the proposed system in terms of four metrics: the average testing latency T, which refers to the length of the time interval from when the attackers start sending requests until all of them are identified; the average false positive rate fp and false negative rate fn; and the average number of testing rounds Rtest, which stands for the number of testing rounds needed by each algorithm to identify all the clients.

5.1 Configurations

The purpose of this simulation is not to fully implement a detection system for production use, but instead to validate the theoretical results we derived. Although we make several assumptions below to simplify the implementation of the simulation environment, as we will show later, these issues are orthogonal to the performance of our scheme. Therefore, the simulation results can provide a reliable overview of the applicability and practical performance of our framework for general network types.

To this end, we implement a simulator in Java by modeling

both the n clients and K virtual servers as independent threads.

Since all back-end servers are independent testing domains, we simulate only one back-end server, as we did for the algorithms, and thus do not test the flash crowd scenario, which was addressed in Section 3 using multiple back-end servers.

In order to mimic the real client (including attacker) behavior and the dynamic network environment, we implement the client/server system as follows:

• each legitimate client joins and leaves the system at uniformly distributed random times, while the attacker threads arrive at time t = 30s and stay alive until being filtered out (more complicated client behaviors could be modeled, but since our scheme contains a learning phase over various client behaviors and adjusts the thresholds accordingly, the performance of this system will not significantly decay when applied to a different network type, i.e., the network type is orthogonal to the scheme);

• both legitimate and malicious clients send requests with random inter-arrival rates and CPU processing times (workloads) to the virtual servers; however, legitimate clients draw from a much smaller random range than the attackers do;

• each virtual server is equipped with an infinite request buffer, and all client requests arrive at the buffers with zero transmission and distribution delay, plus a 1 ms access time for retrieving states from the shared memory; each server handles the incoming requests in its own buffer in FCFS manner and responds to the client upon completing the corresponding request; the average response time and incoming request aggregate are recorded periodically to generate the test outcomes by comparing them against the dynamic thresholds fetched from the established legitimate profiles (a sketch of such a server thread follows this list).
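As a concrete illustration of the server side described above, here is a minimal Java sketch; all identifiers are our own, a request is encoded as {arrivalTimeMs, workloadMs}, and the thresholds would be fetched from the learned legitimate profiles:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// One virtual-server thread: drains its FCFS buffer, accumulates
// response-time statistics, and a periodic closeRound() call turns them
// into a test outcome by comparison with the dynamic thresholds.
class VirtualServer extends Thread {
    final ConcurrentLinkedQueue<long[]> buffer = new ConcurrentLinkedQueue<>();
    volatile boolean positiveOutcome;   // test outcome of the current round
    private long totalRespMs;           // summed response times this round
    private int served;                 // requests completed this round

    @Override public void run() {
        while (!isInterrupted()) {
            long[] req = buffer.poll();              // FCFS: oldest request first
            if (req == null) { Thread.onSpinWait(); continue; } // idle
            try { Thread.sleep(req[1]); }            // simulate CPU workload
            catch (InterruptedException e) { return; }
            served++;
            totalRespMs += System.currentTimeMillis() - req[0];
        }
    }

    // Called once per testing period P with the thresholds for the current
    // number of active clients on this server.
    synchronized void closeRound(double artThreshold, double aggregateThreshold) {
        double art = served == 0 ? 0 : (double) totalRespMs / served;
        positiveOutcome = art > artThreshold || served > aggregateThreshold;
        served = 0;
        totalRespMs = 0;                             // reset for the next round
    }
}
```

Synchronization is deliberately simplified here; a full simulator would guard the per-round counters more carefully.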

The purpose of assuming both transmission and distribution delays to be 0 is to simply quantify the length of the whole detection phase (testing latency). The transmission delay can be large due to geographical distances in a large-scale distributed system, and it can possibly increase the testing latency if a client sends requests in a stop-and-wait manner (it does not send a request until all previous requests have been responded to, so the request rate is quite low and each testing round must be longer); yet since the detection completes in just several rounds, such an increase in the detection length is not significant. The assumption of zero distribution delay is also practical, since the computational overhead for the testing matrix and dynamic thresholds is negligible when the results are pre-computed and fetched from the profiles. The 1 ms state maintenance time is likewise practical, since all the clients reside on one physical server and all the virtual servers can quickly retrieve the client states from the shared memory.

With regard to the details of client behavior, the legitimate request inter-arrival rate is randomized from 1 to 3 requests per ms, and the legitimate workload is randomized from 1 to 3 ms of CPU processing time. In contrast, the malicious request inter-arrival rates range from 5 to 20 per ms, and the malicious workloads range from 5 to 20 ms of CPU time. Although requests with arbitrarily large rates or workloads would serve attackers best, they are in fact easier to discover, so we consider malicious requests with only a small margin over legitimate ones.
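A client thread under these settings can be sketched as follows (again with our own identifiers; the {arrivalTimeMs, workloadMs} request encoding matches the server sketch above):

```java
import java.util.Queue;
import java.util.Random;
import java.util.concurrent.ConcurrentLinkedQueue;

// One client thread: legitimate clients draw inter-arrival rate and workload
// from [1, 3], attackers from [5, 20], as in the simulation configuration.
class ClientThread extends Thread {
    private final Queue<long[]> serverBuffer;  // buffer of the assigned virtual server
    private final boolean malicious;
    private final Random rnd = new Random();

    ClientThread(Queue<long[]> serverBuffer, boolean malicious) {
        this.serverBuffer = serverBuffer;
        this.malicious = malicious;
    }

    private long draw() { return malicious ? 5 + rnd.nextInt(16) : 1 + rnd.nextInt(3); }

    @Override public void run() {
        while (!isInterrupted()) {
            long ratePerMs = draw();               // requests sent this millisecond
            long workloadMs = draw();              // CPU time each request demands
            for (long i = 0; i < ratePerMs; i++)
                serverBuffer.add(new long[]{System.currentTimeMillis(), workloadMs});
            try { Thread.sleep(1); } catch (InterruptedException e) { return; }
        }
    }

    public static void main(String[] args) {
        Queue<long[]> buf = new ConcurrentLinkedQueue<>();
        new ClientThread(buf, false).start();      // one legitimate client
        new ClientThread(buf, true).start();       // one attacker
    }
}
```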

5.2 Results

By setting K = 50, n = 1000, w = 100, d = 40, and P = 1 second, we first show the efficiency of our detection system using the PND algorithm, in terms of reducing the service resource usage ratio (SRUR) and the average response time (ART), as shown in Fig. 5(a) and 5(b).


Fig. 6. Robustness by different numbers of back-end server machines k: (a) false negative rate; (b) false positive rate; (c) detection latency; (d) number of testing rounds.

Fig. 7. Robustness by different numbers of attackers d: (a) false negative rate; (b) false positive rate; (c) detection latency; (d) number of testing rounds.

Fig. 8. Robustness by different total numbers of clients n: (a) false negative rate; (b) false positive rate; (c) detection latency; (d) number of testing rounds.

The values of SRUR and ART climb up sharply at t = 30s when the attack starts, and then gradually fall back to normal before t = 100s. Therefore, it takes only 70s for the system to filter out the attackers and recover to its normal status. Notice that the actual detection latency should be shorter than this, because the threshold of ART for the system to convert from DANGER mode back to NORMAL mode is slightly higher than the normal ART. Therefore, the system SRUR and ART recover to normal shortly after the detection period ends.

In the following, we show the robustness of this performance under different environment settings, namely an increasing number of 1) virtual servers K; 2) malicious clients d; 3) all clients n.

In Fig. 6, we simulate identifying d = 10 attackers out of n = 1000 clients with the number of virtual servers ranging over [25, 200]. It can be seen, on the one hand, that all the false negative (Fig. 6(a)) and false positive (Fig. 6(b)) rates are upper-bounded by 5% and decrease as K grows, for all three algorithms. This makes sense, since the most likely way for an attacker to hide itself is to be accompanied in a testing pool by many clients with low request rates and workloads. In that case, it is quite possible that the aggregate inter-arrival rates and workloads at this server are still less than those of a server with the same number of legitimate clients. Therefore, the fewer clients serviced by each server, the less likely this false negative case happens. Notice that even if there is only one attacker and no legitimate client in a server whose incoming traffic is not very dense, the server can still be tested positive, since all the tests are conducted against dynamic thresholds (the threshold for a server with one client, in this case), which change as the number of active clients in the current virtual server varies. On the other hand, the testing latencies and the numbers of testing rounds keep declining, starting from under 11s and 4 rounds respectively, because the identification speeds up with more available virtual servers. With respect to the three different algorithms, they achieve comparable performance, except that SDoP has a slightly higher false negative rate than the


other two. This is because of those legitimate clients who are identified at earlier stages yet stay in the servers till the end of the detection. Since their request rates and workloads are likely to be smaller than normal (which is why they are identified early), they may camouflage attackers in the following rounds.

Due to the space limitation, we briefly summarize the results of the following two experiments. In Fig. 7, we simulate the identifications for n = 1000, K = 50, w = 100, P = 1s with [1, 15] attackers, and find that all four measures slowly increase as d goes up. Overall, the false negative/positive rates stay below 5%, and the testing latencies are smaller than 6 seconds with 5 testing rounds. Similar to the previous experiment, PND and SDP exhibit better performance than SDoP in terms of false negative rate, but for the other measures they are approximately the same.

Fig. 8 shows the robustness for the cases d = 10, K = 50, w = 100, P = 1s with [1000, 2000] clients. Both the false rates and the number of testing rounds stay stably below 5% and 4 rounds respectively as the number of clients increases. The testing latencies grow from 10 seconds to 20 seconds for all three algorithms, due to the increasing time costs of state maintenance as the number of clients doubles (from 1000 to 2000). However, this small latency is still tolerable in real-time applications and can be further reduced by decreasing the state maintenance time cost within the same physical machine.

Overall, the simulation results can be summarized as follows:

• in general, the system can efficiently detect the attacks, filter out the malicious clients, and recover to NORMAL mode within a short period of time, in real-time network scenarios;

• all three detection algorithms can complete the detection with short latency (less than 30s) and low false negative/positive rates (both less than 5%) for up to 2000 clients; thus they are applicable to large-scale, time/error-sensitive services;

• the PND and SDP algorithms achieve slightly better performance than the SDoP algorithm. Furthermore, the efficiency of the PND algorithm can be further enhanced by optimizing the d-disjunct matrix employed and the state maintenance efficiency.

6 RELATED WORK IN DOS DETECTION

Numerous defense schemes against DoS have been proposed

and developed [7], which can be categorized into network-

based mechanisms and system-based ones.

Existing network-based mechanisms aim to identify the

malicious packets at the intermediate routers or hosts [10][12],

by either checking the traffic volumes or the traffic distributions. However, application DoS attacks do not necessarily deviate from the legitimate traffic statistics in terms of these metrics; therefore, network-based mechanisms cannot efficiently handle these attack types.

On the other hand, many proposed system-based mechanisms try to capture attackers at the end server via authentication [3][24] or classification [1][26]. Honeypots [24] are used to trap attackers who attempt to evade authentication, and can efficiently mitigate flood attacks. However, this kind of mechanism relies on the accuracy of the authentication: once an attacker passes it, all the productive servers are exposed and targeted. Classification-based methods study the traffic behavior model, for example, the inter-arrival time distribution [1] and the bandwidth consumption pattern [26]. With rigorous classification criteria, abnormal packets can be filtered out, but this requires statistically checking each session individually and can significantly increase the overhead of the system. Besides these two defense types, Kandula et al. [3] augment the protected service with CAPTCHA puzzles which are solvable by human clients but not by zombie machines. This technique can eliminate the service requests from botnets, but it also introduces additional operations at the client end, which delays the consumer from receiving services.

In parallel with our previous work [5], Khattab et al. [4] proposed another system-based "live-baiting" defense scheme by applying group testing theory to application DoS detection. Based on a "high probability d-disjunct" matrix [8], their system avoids large modifications to the network topology by spreading the incoming requests over multiple virtual servers for testing, using encrypted tokens, a technique that would also benefit our detection system. However, the live-baiting mechanism needs to be improved in several facets. Firstly, the number of states (pools for testing) is not as small as O(d), but instead approaches O(n). Secondly, the static threshold High-Water-Mark (HWM), the periodically arriving request aggregate at each server, is inaccurate in the face of the possibly asymmetric distribution of one client's requests over its assigned servers under round-robin; moreover, this threshold cannot be applied to servers with dynamic client sets. Thirdly, the virtual servers are merely request counters and cannot tackle high-workload requests. In our system, virtual servers with limited service resources are mapped to testing pools and tested with dynamic thresholds, which overcomes these limitations.

7 CONCLUSIONS AND DISCUSSIONS

We proposed a novel technique for detecting application DoS attacks by means of a new constraint-based group testing

model. Motivated by classic GT methods, three detection

algorithms were proposed and a system based on these algo-

rithms was introduced. Theoretical analysis and preliminary

simulation results demonstrated the outstanding performance

of this system in terms of low detection latency and false

positive/negative rate.

The focus of this paper is to apply group testing principles to application DoS attacks and to provide an underlying framework for detection against a general class of network assaults, where malicious requests are indistinguishable from normal ones. For future work, we will continue to investigate the potential of this scheme and improve the proposed system to enhance the detection efficiency. Some possible directions are: (1) the sequential algorithm can be adjusted to avoid the requirement of isolating attackers; (2) a more efficient d-disjunct matrix could dramatically decrease the detection


latency, as we showed in the theoretical analysis; a new construction method for this is to be proposed and can be a major theoretical work for another paper; (3) the overhead of maintaining the state transfer among virtual servers can be further decreased by more sophisticated techniques; (4) even though we already obtain quite low false positive/negative rates from the algorithms, they can still be improved via fault-tolerant group testing methods, as discussed next.

Besides attempting to alleviate the errors resulting from the proactive learning of legitimate traffic, we will develop our algorithms using the error-tolerant d-disjunct matrix [23], which can still provide correct identification results even in the presence of some erroneous tests. Specifically, a testing matrix M is (d; z)-disjunct if any single column of it has at least z elements that are not in the union of any d other columns. By applying such a matrix M in our PND algorithm, up to d positive items can be correctly identified as long as the number of erroneous tests is at most z − 1. With the efficient decoding algorithms proposed in [14], this error-tolerant matrix has great potential to improve the performance of the PND algorithm and to handle application DoS attacks more efficiently.
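To make the decoding intuition concrete, the following is a minimal sketch (our own illustration with our own identifiers, not the decoding algorithm of [14]): with an error-free d-disjunct matrix, an ID can be cleared as legitimate as soon as it appears in one negative pool (z = 1); with a (d; z)-disjunct matrix, the same loop clears an ID only after z negative appearances, which can absorb up to z − 1 erroneous outcomes under a suitable error model.

```java
// Threshold-based decoding over a disjunct testing matrix.
// M[t][j] is true iff client j is assigned to pool t in this round;
// outcome[t] is true iff pool t tested positive against its thresholds.
public class Decoder {
    public static boolean[] decode(boolean[][] M, boolean[] outcome, int z) {
        int n = M[0].length;
        boolean[] suspect = new boolean[n];
        for (int j = 0; j < n; j++) {
            int negativeAppearances = 0;
            for (int t = 0; t < M.length; t++)
                if (M[t][j] && !outcome[t]) negativeAppearances++;
            // an ID stays suspect unless it shows up in at least z negative pools
            suspect[j] = negativeAppearances < z;
        }
        return suspect;
    }
}
```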

REFERENCES

[1] S. Ranjan, R. Swaminathan, M. Uysal, and E. Knightly, "DDoS-resilient scheduling to counter application layer attacks under imperfect detection," in Proceedings of IEEE INFOCOM, Barcelona, Spain, April 2006.

[2] S. Vries, "A Corsaire White Paper: Application Denial of Service (DoS) Attacks," http://research.corsaire.com/whitepapers/040405-application-level-dos-attacks.pdf.

[3] S. Kandula, D. Katabi, M. Jacob, and A. W. Berger, "Botz-4-Sale: Surviving Organized DDoS Attacks That Mimic Flash Crowds," in 2nd NSDI, Boston, MA, May 2005.

[4] S. Khattab, S. Gobriel, R. Melhem, and D. Mosse, "Live Baiting for Service-level DoS Attackers," in INFOCOM, 2008.

[5] M. T. Thai, Y. Xuan, I. Shin, and T. Znati, "On Detection of Malicious Users Using Group Testing Techniques," in Proceedings of ICDCS, 2008.

[6] M. T. Thai, P. Deng, W. Wu, and T. Znati, "Approximation Algorithms of Non-unique Probes Selection for Biological Target Identification," in Proceedings of the Conference on Data Mining, Systems Analysis and Optimization in Biomedicine, 2007.

[7] J. Mirkovic, J. Martin, and P. Reiher, "A Taxonomy of DDoS Attacks and DDoS Defense Mechanisms," Tech. Rep. 020018, Computer Science Department, UCLA, 2002.

[8] M. J. Atallah, M. T. Goodrich, and R. Tamassia, "Indexing Information for Data Forensics," ACNS, Lecture Notes in Computer Science 3531, Springer, pp. 206-221, 2005.

[9] J. Lemon, "Resisting SYN flood DoS attacks with a SYN cache," in Proceedings of the BSD Conference, 2002.

[10] "Service provider infrastructure security: detecting, tracing, and mitigating network-wide anomalies," http://www.arbornetworks.com, 2005.

[11] Y. Kim, W. C. Lau, M. C. Chuah, and H. J. Chao, "PacketScore: Statistics-based overload control against distributed denial-of-service attacks," in Proceedings of INFOCOM, Hong Kong, 2004.

[12] F. Kargl, J. Maier, and M. Weber, "Protecting web servers from distributed denial of service attacks," in WWW '01: Proceedings of the 10th International Conference on World Wide Web, New York, NY, USA, pp. 514-524, ACM Press, 2001.

[13] L. Ricciulli, P. Lincoln, and P. Kakkar, "TCP SYN flooding defense," in Proceedings of CNDS, 1999.

[14] D. Z. Du and F. K. Hwang, Pooling Designs: Group Testing in Molecular Biology, World Scientific, Singapore, 2006.

[15] M. T. Thai, D. MacCallum, P. Deng, and W. Wu, "Decoding Algorithms in Pooling Designs with Inhibitors and Fault Tolerance," IJBRA, vol. 3, no. 2, pp. 145-152, 2007.

[16] A. G. D'yachkov, A. J. Macula, D. C. Torney, and P. A. Vilenkin, "Two models of nonadaptive group testing for designing screening experiments," in Proceedings of the 6th Int. Workshop on Model-Oriented Designs and Analysis, pp. 635, 2001.

[17] J. Kurose and K. Ross, Computer Networking: A Top-Down Approach, 4th edition, Addison-Wesley, July 2007.

[18] Y. Chu and J. Ke, "Mean response time for a G/G/1 queueing system: Simulated computation," Applied Mathematics and Computation, vol. 186, no. 1, March 2007.

[19] G. Mori and J. Malik, "Recognizing objects in adversarial clutter: Breaking a visual CAPTCHA," in IEEE Computer Vision and Pattern Recognition, 2003.

[20] P. Sharma, P. Shah, and S. Bhattacharya, "Mirror Hopping Approach for Selective Denial of Service Prevention," in WORDS '03, 2003.

[21] V. D. Gligor, "Guaranteeing Access in Spite of Distributed Service-Flooding Attacks," in Proceedings of the Security Protocols Workshop, 2003.

[22] T. V. Zandt, "How to fit a response time distribution," Psychonomic Bulletin & Review, 7(3), 2000.

[23] A. G. D'yachkov, V. V. Rykov, and A. M. Rashad, "Superimposed distance codes," Problems of Control and Information Theory, 1989.

[24] S. M. Khattab, C. Sangpachatanaruk, D. Mosse, R. Melhem, and T. Znati, "Honeypots for Mitigating Service-Level Denial-of-Service Attacks," in ICDCS '04, 2004.

[25] V. Sekar, N. Duffield, K. van der Merwe, O. Spatscheck, and H. Zhang, "LADS: Large-scale Automated DDoS Detection System," in USENIX Annual Technical Conference, 2006.

[26] F. Kargl, J. Maier, and M. Weber, "Protecting web servers from distributed denial of service attacks," in World Wide Web, pp. 514-524, 2001.

[27] D. Eppstein, M. T. Goodrich, and D. Hirschberg, "Improved Combinatorial Group Testing Algorithms for Real-World Problem Sizes," WADS, LNCS 3608, Springer, pp. 86-98, 2005.

Ying Xuan received the BE degree in Computer Science and Engineering from the University of Science and Technology of China, Anhui, China in 2006, and is currently working toward the PhD degree in the Department of Computer & Information Science & Engineering, University of Florida, under the supervision of Dr. My T. Thai. His research topics include group testing theory, network security, and network reliability.

Incheol Shin received the BE degree in Computer Science and Engineering from Hansung University, Seoul, Korea in 2002. He is now a PhD student in the Department of Computer & Information Science & Engineering, University of Florida, under the supervision of Dr. My T. Thai. His research interests include network security and group testing.

Dr. My T. Thai received a PhD degree in Computer Science from the University of Minnesota, Twin Cities, in 2006. Dr. Thai is an Assistant Professor in the Department of Computer and Information Sciences and Engineering at the University of Florida. Her current research interests include algorithms and optimization on network science and engineering. She also serves as an associate editor for the Journal of Combinatorial Optimization (JOCO) and Optimization Letters. Dr. Thai is a recipient of Young Investigator Awards from the DoD.

Dr. Taieb Znati received a PhD degree in Computer Science from Michigan State University in 1988, and an MS degree in Computer Science from Purdue University in 1984. Dr. Znati is a Professor in the Department of Computer Science, with a joint appointment in Telecommunications in the Department of Information Science, University of Pittsburgh. He currently serves as the Director of the Computer and Network Systems (CNS) Division at the National Science Foundation (NSF). From 2000 to 2004, he served as a Senior Program Director for networking research at NSF. He also served as the Committee Chairperson of the Information Technology Research (ITR) Program, an NSF-wide research initiative in information technology.

Dr. Znati's current research interests are in network science and engineering, with a focus on the design of scalable, robust and reliable network architectures and protocols for wired and wireless communication networks. He is a recipient of several research grants from government agencies and from industry. He is frequently invited to present keynotes at networking and distributed systems conferences both in the United States and abroad.
