PREFETCHing to ease DNSSEC deployment over large Resolving Platforms

Daniel Migault∗, Stéphane Sénécal∗, Stanislas Francfort∗, Emmanuel Herbert† and Maryline Laurent†

∗ Francetelecom, [email protected]

† Institut Mines-TELECOM, UMR CNRS 5157 SAMOVAR, [email protected]

Abstract—Today's DNS resolving platforms have been designed for DNS, with fast and light resolutions over the Internet. With signature checks and larger payloads, experimental measurements [10] showed that DNSSEC resolutions require up to 4 times more CPU. Since DNSSEC requires more CPU than DNS, this paper proposes PREFETCHX, an architecture that optimizes the use of CPU and thus avoids resolving platforms having to grow for the DNSSEC migration. Current resolving platforms, IPXOR, split the traffic between the nodes according to IP addresses. Alternatively, PREFETCHX takes advantage of the FQDNs' popularity distribution (Zipf), a layered cache and cache sharing mechanisms between the nodes, and requires at least 4 times fewer nodes. Furthermore, PREFETCHX does not impact the network infrastructure, which eases its deployment. Finally, defining X, the number of prefetched FQDNs, makes PREFETCHX highly scalable and flexible to different types of traffic.

Index Terms—DNS, DNSSEC, DHT, Pastry, Cache sharing

I. INTRODUCTION

Internet Service Providers (ISPs) have always hosted Domain Name System (DNS) resolution platforms for their end users and Machine-to-Machine communications. However, little attention has been paid to optimizing DNS resolving platforms. Therefore, most of them are currently composed of a load balancer, like [12], that splits the traffic between the platform's nodes according to the IP addresses of the incoming queries. This architecture is named IPXOR and is used to compare the proposed alternative architectures.

In fact, DNSSEC resolution involves larger payloads than DNS, and signature checks that require more CPU. Although this CPU overhead depends on the type of traffic, the server's implementation, the resolution policy and the hardware, Migault et al. experimentally measured in [10] this overhead to be up to 425%. This has been confirmed in real deployments, mainly because DNSSEC generates many more exchanges. In addition, the demand for DNS resolutions keeps on increasing, and ISPs observe roughly an 8% monthly increase rate. With Content Delivery Networks and short Time to Live (TTL) DNS responses, we expect the DNS traffic to keep or increase its growth rate.

As ISPs can hardly afford increasing their resolution platforms by a factor of 5, and as DNS traffic is still expected to increase, it is now crucial to optimize resolving platforms to face the future demands of DNS and DNSSEC evolutions. This paper proposes the PREFETCHX architecture, designed to:

1) Reduce the number of nodes (i.e. CPU or resources).
2) Provide management facilities for Operation, Administration & Management (OAM), such as overcoming node fail-over or the addition of a node.
3) Deliver limited impact on the core network, with a straightforward design to ease its deployment.

PREFETCHX takes advantage of the Zipf distribution of Fully Qualified Domain Names (FQDNs). This distribution means that a large part of the traffic is made up of a small number of FQDNs. In this paper we call HEADX the X most popular FQDNs and TAILX the remaining FQDNs. PREFETCHX also takes advantage of a layered cache, by prefetching the HEADX FQDNs in a dedicated cache and pre-processing each DNS query before it is sent to the server's software. More specifically, for each incoming DNS query, we check whether the queried FQDN is in HEADX. Only if the query is not in HEADX is it forwarded to the traditional DNSSEC server's software ([1], [18]). We typically expect the dedicated cache and its lookup process to be hosted on a Network Hardware Acceleration Card (NHAC). The cache length X is a trade-off between fast lookup, a large cache hit rate (CHR), and a uniform distribution, over the nodes of the platform, of the CPU necessary to deal with TAILX. X is the adjustment variable used to find this trade-off for multiple types of traffic, making PREFETCHX highly flexible. A sketch of this pre-processing step follows.
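As an illustration, here is a minimal Python sketch of the pre-processing step, assuming a simple in-memory map. The names head_x, preprocess, handle_query and resolver are hypothetical; on a real platform the lookup would run on the NHAC, not in software.

```python
from typing import Callable, Optional

def preprocess(fqdn: str, head_x: dict[str, bytes]) -> Optional[bytes]:
    """Return a prefetched response if fqdn is in HEAD_X, else None."""
    return head_x.get(fqdn)

def handle_query(fqdn: str, head_x: dict[str, bytes],
                 resolver: Callable[[str], bytes]) -> bytes:
    response = preprocess(fqdn, head_x)
    if response is not None:      # HEAD_X hit: answered by the dedicated cache
        return response
    return resolver(fqdn)         # TAIL_X: forwarded to the DNS(SEC) software
```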

PREFETCHX also takes advantage of cache sharing mechanisms. This means that a given FQDN should be resolved by a single node or a small set of nodes, mainly to avoid duplicated resolutions. In this paper, we considered for TAILX different cache sharing mechanisms on top of Pastry [15]. Not only does Pastry avoid duplicated resolutions, it also provides OAM facilities. Note that the advantages provided by Pastry mainly apply to DNSSEC and not to DNS. In fact, with Pastry based architectures, an inter-node communication occurs instead of a resolution. Because DNSSEC resolution requires signature checks, an inter-node communication costs much less than a resolution. This is not so obvious with DNS.

The remainder of this paper is organized as follows. Section II positions the paper with respect to existing work. Section III details the motivations for PREFETCHX and its design. As mentioned above, X is defined by the traffic in order to find a trade-off between cache efficiency and a uniform CPU distribution between the nodes. In this paper we considered 10 minute DNS live captures from a large ISP, with more than 35 million DNS queries. In section IV we show that setting X = 2000 FQDNs results in a uniform CPU distribution among the nodes of the platform for the remaining TAILX FQDNs. Sections V, VI and VII are dedicated to the cache sharing architectures that deal with TAILX. Section V defines analytic models of various cache sharing architectures. Section VI computes these models with a live traffic capture and compares their efficiencies. At last, section VII compares the efficiency of the Pastry model with that provided by an experimental platform based on FreePastry [5], to validate our analytic models and thus our considerations for real deployment.

II. RELATED WORK

Many works focus on Web caching architectures. Wang, in [19], describes different caching architectures, providing inputs on where and how to place the cache devices according to the Web requirements. Tewari et al. introduce a cost comparison of different distributed caching methods in [17]. Wolman et al. investigate the benefits of cooperative caching in [21], and demonstrate that performance gains are reached only with a limited number of nodes. DHT web caching methods are investigated by Iyer et al. with Squirrel [6], which is based on Pastry [15]. This paper compares two architectures, one using a home node dedicated to a set of web pages, and one with a home node that can also delegate the resolution to other nodes during a flash crowd. It turns out that the less flexible but simpler architecture has better performance.

Massey et al. in [9] and Risson et al. in [14] analyze how DHTs could enhance the robustness of the Naming System. The robustness of both Chord and DNS is considered with respect to data failure rate, path failure rate and path length. DNS efficiency was shown to be linked to the popularity of a zone and the number of labels of the domain name, whereas DHT efficiency is related to the popularity of its RRsets. In fact, the DHT's main drawback is its heavy routing algorithms. A DHT is also more robust to orchestrated attacks and could achieve the same availability as the current DNS with added mechanisms like proactive caching (Beehive [13]).

Our paper differs from the papers cited above since their DHT ring is used for hosting authoritative data, whereas our architecture uses the Pastry DHT as a way to define the node responsible for performing the resolutions. The way the data is stored in the DHT also differs: instead of DHash we use PAST. In DHash, blocks of the files are spread over the DHT nodes, whereas in PAST the whole file is hosted on a node. Our architecture is also expected to consider at most around one or two hundred nodes, whereas the DHT described by Cox et al. [3] is a 1000 node experiment, which is mentioned as being a restricted number.

III. PREFETCH ARCHITECTURE: GOALS AND DESIGN

As mentioned in section I, this paper introduces the design of an architecture that makes DNSSEC deployment feasible for a resolving platform. Current resolving platforms, IPXOR, are based on load balancers that split the DNS queries by XORing their IP addresses, like with [11]. Our target architecture must (1) reduce the involved resources, (2) provide OAM and (3) enjoy a straightforward design.

A. Global resource reduction of IPXOR with FQDNSHA1

The resource we care about in this paper is the CPU, which is heavily used for DNS resolution with signature checks and cache lookups over large caches. Thus, our goals are to reduce (a) the number of resolutions, (b) the size of the cache and (c) the number of cache lookups. Then, to prove the architecture is efficient, we must check that the CPU load is uniformly distributed among the nodes of the platform.

We first consider FQDNSHA1, an architecture that assigns each incoming FQDN to a specific node according to the hash of the FQDN, as sketched below. As a result, a given FQDN is resolved and stored by a single node, which avoids simultaneous resolutions and partly reduces the cache size. More specifically, the cache size is reduced for FQDNs that are queried more than once. The problem with caches in IPXOR is that they are very large and filled with infrequently queried FQDNs. This increases the CPU necessary for cache lookup without really providing the advantage of the caching mechanism. Finally, FQDNSHA1 addresses goals (a), (b) and (c).
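A minimal sketch of this hash based assignment, assuming SHA-1 and a modulo over the node count; the exact mapping used on a production platform may differ.

```python
import hashlib

def fqdn_sha1_node(fqdn: str, n_nodes: int) -> int:
    """Map an FQDN to one node by hashing its name (FQDNSHA1-style)."""
    digest = hashlib.sha1(fqdn.lower().encode("ascii", "ignore")).digest()
    return int.from_bytes(digest, "big") % n_nodes

# The same FQDN always lands on the same node of an 18 node platform.
assert fqdn_sha1_node("example.com.", 18) == fqdn_sha1_node("example.com.", 18)
```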

B. Caching to make FQDNSHA1 CPU distribution uniform

The problem faced by FQDNSHA1 is that resources are non-uniformly distributed among the nodes of the platform. With a 10 minute DNS capture of more than 35 million queries, figure 1 depicts how the CPU cycles, queries and resolutions are distributed among the nodes. Our live traffic is captured on an 18 node platform, so we replay it for an 18 node platform, and plot how many nodes are associated with a given value of CPU cycles, query and resolution numbers. The CPU cycles associated with resolutions are provided by the experimental measurements in [10]. Figures 1a and 1b show that FQDNSHA1 globally reduces the CPU by 30% compared to IPXOR. On the other hand, CPU is unfairly distributed among the nodes. Similarly, figures 1c and 1d depict the relative dispersion of the number of queries (δQ), respectively resolutions (δR), as defined in equation (2), with Q_i the number of queries on node i and Q̄ the mean query number over the platform. From figures 1c and 1d, δQ^FQDN ≈ 5·δQ^XOR and δR^FQDN ≈ 0.026·δR^XOR. Finally, SHA1 does not provide any benefit over XOR in terms of query load balancing, which confirms the use of XOR for load balancing the DNS traffic. FQDNSHA1 is adapted for balancing resolutions, whereas IPXOR is more adapted for query load balancing.

δQ = |MAX(Q_i) − MIN(Q_i)| / Q̄,  i ∈ [1..n]    (1)

δQ_i = |Q_i − Q̄| / Q̄    (2)
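A direct transcription of equations (1) and (2) in Python (a sketch; the per-node counts would come from replaying the capture):

```python
def dispersion(counts: list[float]) -> float:
    """Equation (1): delta = |max - min| / mean over per-node counts."""
    mean = sum(counts) / len(counts)
    return abs(max(counts) - min(counts)) / mean

def relative_dispersion(counts: list[float]) -> list[float]:
    """Equation (2): delta_i = |Q_i - mean(Q)| / mean(Q) for each node."""
    mean = sum(counts) / len(counts)
    return [abs(c - mean) / mean for c in counts]
```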

[Figure 1: histograms of the number of nodes versus (a) CPU DNS and (b) CPU load DNSSEC (both in units of 10^4 %CPU), (c) Delta(Query)/mean(Query) and (d) Delta(Resolution)/mean(Resolution), each comparing FQDNSHA1 with IPXOR.]

Fig. 1. Resource Distributions on an 18 node platform

The non-uniform distribution of the resources results from a large inequality between the FQDNs' popularities, as depicted in figure 2a. Given the Zipf distribution of FQDN popularity, counterbalancing the query number of the FQDN with popularity rank i requires involving a large number K of FQDNs with popularity ranks i + k, k ≤ K. This is especially true for FQDNs in HEADX, so, to prevent these FQDNs from unbalancing the resource distribution over the platform, we consider a pre-process that handles FQDNs belonging to HEADX. Thus, only the remaining FQDNs of TAILX are handled by the DNS(SEC) server, and the cache sharing architecture must be designed for TAILX. As represented in figure 2b, each incoming DNS query is pre-processed. If the queried FQDN is in HEADX, the response is directly sent back; otherwise, the query is forwarded to the responsible node to be resolved. The key advantage provided by this pre-processing step is that a large part of the traffic queries FQDNs in HEADX, and X is a relatively small number. This makes cache lookup much faster than when it is handled by the DNS(SEC) server with the large red-black binary tree cache of regular DNS server implementations (BIND [1], UNBOUND [18]). More specifically, we expect the pre-processing part to be computed on a Network Hardware Acceleration Card (NHAC) [2], [4]. The key challenge of PREFETCHX is to derive X from the DNS traffic so that the resources are uniformly distributed among the nodes of the platform. Section IV shows that an X as small as X = 2000 provides a uniform distribution of queries, resolutions, and thus CPU. The sketch below illustrates how such an X can be evaluated on a query stream.
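As an illustration of this derivation, a small simulation sketch: it draws queries from an assumed Zipf popularity, absorbs the X most popular FQDNs, and hashes the rest onto the nodes, so the per-node counts can be fed to the dispersion functions above. All parameters are illustrative and not the paper's capture.

```python
import hashlib
import random

def node_of(rank: int, n_nodes: int) -> int:
    """Hash a TAIL_X FQDN (identified here by its rank) onto one node."""
    h = hashlib.sha1(str(rank).encode()).digest()
    return int.from_bytes(h, "big") % n_nodes

def per_node_load(x: int, n_nodes: int = 18, n_fqdns: int = 10**5,
                  n_queries: int = 10**5, s: float = 1.0) -> list[int]:
    """Per-node query counts once the X most popular FQDNs are prefetched."""
    weights = [1.0 / (r ** s) for r in range(1, n_fqdns + 1)]  # Zipf popularity
    ranks = random.choices(range(1, n_fqdns + 1), weights=weights, k=n_queries)
    counts = [0] * n_nodes
    for r in ranks:
        if r > x:                     # HEAD_X hits are absorbed by the NHAC
            counts[node_of(r, n_nodes)] += 1
    return counts

# Example: dispersion (from the earlier sketch) shrinks as X grows to 2000.
# print(dispersion(per_node_load(0)), dispersion(per_node_load(2000)))
```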

[Figure 2: (a) the Zipf popularity curve split into HEADX and TAILX at X = 2000; (b) the PREFETCH node architecture: behind the load balancer, an acceleration card holds the HEADX DNS cache and answers a query directly on a HEADX hit; otherwise the query goes to the responsible node for TAILX resolution.]

Fig. 2. PREFETCH Node Parameters

C. Providing OAM via DHT

Because PREFETCHX prefetches HEADX, the DNS server software only deals with FQDNs of TAILX. Each of these FQDNs is assigned to a specific node, as with FQDNSHA1. Thus, it becomes convenient to place the DNS(SEC) resolving servers on top of a Distributed Hash Table (DHT) infrastructure. In this paper, we consider Pastry [15], which provides auto-configuration when a node is added to or removed from the platform. Note that we only use a subset of the Pastry functionalities. More specifically, we do not use the Pastry Dynamic Node Discovery that defines which Pastry node hosts a given content. This mechanism is quite complex because DHT protocols have been designed for billions of dynamic nodes, making it impossible for each node to have a global knowledge of the platform. On the other hand, our platform is not expected to have more than a few hundred nodes, all administered by the same ISP. This makes it possible for each node to consider the other nodes as its Neighbors, and thus does not require Dynamic Node Discovery. On the other hand, we take advantage of the content distribution among the nodes of the platform and of the Pastry auto-configuration mechanisms. Figure 2b shows that when the queried FQDN is not in HEADX, the query is forwarded to the DNS(SEC) server on top of Pastry. The Pastry protocol forwards the query to the node responsible for that FQDN. The Pastry infrastructure for TAILX is detailed in section V. A sketch of such a static mapping follows.
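A minimal sketch of the static responsible-node mapping assumed here: every node knows the full node list, so no dynamic discovery is needed, and a query's FQDN is hashed onto the same space as the node IDs and served by the numerically closest one. Names are illustrative, and a real Pastry ring would use circular distance rather than the simple absolute difference below.

```python
import hashlib

def _id(value: str) -> int:
    """160-bit identifier for either a node name or an FQDN."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big")

def responsible_node(fqdn: str, node_names: list[str]) -> str:
    """Pick the node whose ID is closest to the FQDN's key."""
    key = _id(fqdn)
    return min(node_names, key=lambda name: abs(_id(name) - key))

# Every node computes the same answer from the shared node list.
rn = responsible_node("example.com.", ["node-1", "node-2", "node-3"])
```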

IV. DERIVING X, HEADX, TAILX FROM LIVE CAPTURE

Section III defined PREFETCHX and showed that caching the X most popular FQDNs in a NHAC results in a uniform distribution of the number of queries and resolutions, and thus of the CPU. This section derives X (and so HEADX and TAILX) from a live DNS traffic capture. We define X so that the resources necessary to handle TAILX are uniformly distributed among the nodes. The distribution must remain uniform when different hash functions are used in the DHT, and at any time of the day. Finally, we show that PREFETCH2000 only needs half of the resources required by IPXOR.

A. X for δCPU_X hash function stability

Equation (2) in section III-B defined the relative dispersions δQ_X and δR_X, and we showed that resources for TAILX are uniformly distributed when these values are small. PREFETCHX is more efficient than IPXOR for a value Xo when δQ_Xo ≤ δQ^XOR and δR_Xo ≤ δR^XOR, where Q^XOR (resp. R^XOR) is the number of queries (resp. resolutions) with IPXOR.

Figure 3 sums up the hash function stability results of a computation over a 10 minute DNS capture. Figures 3a and 3b depict δQ and δR for different hash functions (SHA1, MD5, CRC32), and table 3c compares them with IPXOR. It clearly shows that with X = 2000, PREFETCH2000 overcomes both the query and the resolution dispersion of IPXOR. On the other hand, the comparison of CPU shows that already with X = 500 PREFETCHX has a better CPU distribution than IPXOR. Considering queries and resolutions imposes more constraints than considering CPU, so we consider X = 2000 in the remainder of the paper.

[Figure 3: (a) δQ_X and (b) δR_X, dispersion (%) versus the number of prefetched FQDNs (10^2 to 10^5), for PREFETCH with SHA1, MD5 and CRC32, FQDN, and IPXOR.]

Table 3c. PREFETCHX / IPXOR ratios for Q, R and CPU:

X    | δQ_X/δQ_XOR (min, max) | δR_X/δR_XOR (min, max)
1000 | 1, 1                   | 0.19, 0.28
2000 | 0.63, 0.78             | 0.32, 0.47
4000 | 0.31, 0.625            | 0.13, 0.15

X    | δCPU_X/δCPU_XOR (DNS) | δCPU_X/δCPU_XOR (DNSSEC)
100  | 2.04                  | 0.4
250  | 1                     | 0.2
500  | 0.75                  | 0.4
2000 | 0.25                  | 0.2

Fig. 3. PREFETCHX δQ_X, δR_X measurements

B. X for δCPU_X time stability

Section IV-A derived X = 2000 from hash function stability. In this section, we check whether this value is stable over at least a day. Figures 4a and 4b plot δQ_X and δR_X for various X values during a whole day, and figure 4c depicts, for different values of X, the maximum δQ_X (respectively δR_X) observed during that day. From figures 4a and 4b, PREFETCHX provides, over time, a much more uniform distribution of the resources than IPXOR. X = 2000 limits variations to less than 10% over time, and increasing X to X = 10000 does not significantly improve PREFETCHX's stability. This confirms our choice of X = 2000.

[Figure 4: (a) δQ_X and (b) δR_X over the day (0h to 24h), with delta = (Max − Min)/Mean, for IPXOR, FQDN, PF(2000, SHA1/MD5/CRC32) and PF(10^4, SHA1/MD5/CRC32); (c) MAX(δQ_X) and MAX(δR_X) versus the number of prefetched FQDNs (500 to 10^4).]

Fig. 4. Time Stability

In section III, we exposed our motivations for a new architecture for a DNSSEC resolving platform and designed PREFETCHX. This architecture requires setting X, the number of most popular FQDNs that are cached and pre-processed by NHACs. X must be set according to the DNS traffic; section IV details how to derive X and shows that, on our live DNS capture, PREFETCH2000 provides a uniform distribution of the resources (queries, resolutions and thus CPU). These distributions remain stable across different hash functions and over time. In terms of efficiency, PREFETCH2000 roughly requires half of the nodes required by IPXOR.

The second part of the paper focuses on the DHT architecture for TAILX. Section V models various DHT configurations. These analytic models lead to simulations with real traffic in section VI, to define which architecture best fits the PREFETCHX goals. At last, we validate these models in section VII with a real implementation based on FreePastry.

V. DHT THEORETICAL ANALYTIC MODELS FOR TAILX

In DNS over Pastry [15], as represented in figure 5a, queries are identified according to the requested FQDN. Each node is assigned a set of FQDNs and resolves the queries it is the Responsible Node (RN) for. Other queries are forwarded to their associated RN.

Pastry is widely known by the community, but other DHT protocols could be used, like Chord or Tapestry. The way we use Pastry differs from what it was originally designed for. First, the Pastry nodes constitute the platform and belong to the same administrator; they are located in the same data center, on the same LAN, and every node knows all the other nodes: the platform is not expected to be larger than a few hundred nodes. We do not require that the node ID be derived from the data by a simple hash function (SHA1); we may apply a specific mapping known by all nodes. Finally, we do not consider the Pastry routing discovery algorithm. On the other hand, we take advantage of Pastry's auto-configuration mechanisms and robustness to DoS attacks [3], [9], [14], which may balance the sensitivity to DoS attacks introduced by DNSSEC's heavier resolutions. We also take advantage of cache sharing mechanisms for enhancing Pastry based platforms. Note that how FQDNs are associated with their Responsible Node (RN) may require the auto-configuration mechanism to be reconsidered. Furthermore, robustness to DoS requires architectures that somehow cache the responses. This section provides a model description for Pastry based architectures, with different mechanisms that either reduce routing traffic or enhance the cache of the platform. Section VI computes the models and compares the different Pastry based architectures with FQDN and IPSHA1.

[Figure 5: (a) Pastry: an IP load balancer dispatches [1] DNS Query; the receiving node issues [2] DHT Query to the responsible node, which performs [3] DNS Query / [4] DNS Response over the Internet and returns [5] DHT Response, and [6] DNS Response goes back to the end user; (b) Node J model.]

Fig. 5. DHT Traffic and Node Modelization

From figure 5b, the traffic has the following features:

- Q: the DNS queries sent by all end users, with query rate q.
- Q_j: the DNS queries sent to n_j, with query rate q_j.
- M_j^in (resp. M_j^out): the incoming (resp. outgoing) exchanges between n_j and the other n − 1 nodes of the n node platform.

The different entities we consider are:

- p(IPs, fqdn): a DNS query with IPs = (IP_source, IP_destination), and fqdn the queried FQDN as defined below.
- fqdn(name, ttl, rank): an FQDN where name designates the FQDN, ttl its TTL and rank its rank, which reflects its popularity. Note that rank is based on the popularity of the FQDN, i.e. the query rate value associated with that FQDN. Each FQDN has a distinct rank; that is, two FQDNs with the same query rate are ordered alphabetically.

The architecture components are the following: a Load Balancer splits the traffic among the different nodes n_j of the platform. We note:

- F = {the ranked fqdn_r, r ∈ R}: the list of the different FQDNs with their associated rank r.
- OT: the Occupancy Time (OT), which depends on the action performed (cache lookup OT_H, cache insertion OT_R, query forwarding OT_FWD, ...) and the considered protocol (DNS, DNSSEC).

The different probabilities considered for each node are:

- L_j: the probability that n_j receives a packet p of Q: q_j = L_j·q.
- H_{r,j}: the probability that n_j is RN for fqdn_r.
- Φ(r): the probability that fqdn_r of rank r is queried.
- CM (resp. CH): the platform probability of a Cache Miss (resp. Cache Hit), i.e. the probability that the response is not (resp. is) stored in any cache node C_j of the platform.
- CM_j (resp. CH_j): the probability that a packet addressed to n_j triggers a Cache Miss (resp. a Cache Hit).
- R_j: the probability that p in Q_j triggers a resolution over the Internet on n_j.
- M_j: the probability that p in Q_j is forwarded by n_j.

Note that we have CM + CH = 1, CM_j + CH_j = 1 and CM_j = M_j + R_j, so an architecture is fully characterized by CM, R_j and M_j.

A. Single Node Model: τ_r

Let us consider the case where a DNS traffic Q is addressed to a single node. In our model, we consider, as in [8], a constant TTL for the FQDNs and a constant query rate q measured in queries·s⁻¹. Let CM_o be the experimental Cache Miss rate measured on our live capture. We define τ_r as the time (s) such that the measured

CM_o = Σ_{r∈R} Φ(r)·(1 − Φ(r))^{q·τ_r}

For each fqdn with associated rank r, a Cache Miss (CM) occurs if the fqdn is queried at time t and has not been queried during the last TTL seconds. Φ(r) represents the probability that the FQDN is queried, and (1 − Φ(r)) the probability that it is not. If q·τ_r represents the number of queries received during fqdn.ttl, then (1 − Φ(r))^{q·τ_r} represents the probability that the FQDN has not been queried during fqdn.ttl. As such, τ_r is the mean time an FQDN is cached in the live capture. The empirical definition of τ_r can also be expressed theoretically as a function of T, the duration of the capture, and the various fqdn.ttl values of the FQDNs. We clearly have R_o = CM_o and M_o = 0.
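A direct transcription of this single-node cache-miss model, under an assumed Zipf popularity Φ(r) and a single τ shared by all ranks (both assumptions are illustrative simplifications):

```python
def zipf_phi(n_fqdns: int, s: float = 1.0) -> list[float]:
    """Normalized Zipf popularity Phi(r) for ranks 1..n_fqdns."""
    w = [1.0 / (r ** s) for r in range(1, n_fqdns + 1)]
    total = sum(w)
    return [x / total for x in w]

def cache_miss_probability(q: float, tau: float, phi: list[float]) -> float:
    """CM = sum_r Phi(r) * (1 - Phi(r))**(q * tau)."""
    return sum(p * (1.0 - p) ** (q * tau) for p in phi)

# A popular FQDN is almost always cached; the long tail almost never is.
cm = cache_miss_probability(q=1000.0, tau=60.0, phi=zipf_phi(10**5))
```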

B. IPSHA1 Architecture

If we assume that FQDNs and IP addresses are independent, then CM = Σ_{i=0}^{n−1} L_i·CM_i. Because nodes are equal, CM_j = CM, and CM_j = Σ_{r∈R} Φ(r)·(1 − Φ(r))^{L_j·q·τ_r}, R_j = CM_j, and M_j = 0, as nodes do not forward incoming traffic. Finally:

OT(q_j) = q_j·OT_H + q_j·CM_j·OT_R    (3)

C. FQDN Architecture

In FQDN, we assume that the load balancers redirect each fqdn_r to its RN. We also consider that our FQDN mapping results in a uniform distribution of queries, resolutions and homed FQDNs. Then CM = CM_o, CM_j = CM_o, R_j = CM_j and M_j = 0. Finally:

OT(q_j) = q_j·OT_H + q_j·CM·OT_R    (4)
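Equations (3) and (4) share the same shape, so a single helper can evaluate both; only the cache-miss probability plugged in differs. The numeric values below are placeholders, not measurements from the paper.

```python
def occupancy_time(qj: float, cmj: float, ot_hit: float, ot_res: float) -> float:
    """OT(q_j) = q_j*OT_H + q_j*CM_j*OT_R, as in equations (3) and (4)."""
    return qj * ot_hit + qj * cmj * ot_res

# IPSHA1 vs FQDN: same formula, different cache-miss probability.
ot_ip   = occupancy_time(qj=2000.0, cmj=0.30, ot_hit=1e-5, ot_res=4e-4)
ot_fqdn = occupancy_time(qj=2000.0, cmj=0.15, ot_hit=1e-5, ot_res=4e-4)
```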

D. Pastry Architecture (no cache, no replication)

Pastry architectures are of interest since Pastry [15] is quite robust to DoS attacks [3], [9], [14], nodes are self organized, and the platform can be deployed as a flat architecture without modifying the core network infrastructure. In Pastry, nodes route queries to their RN node and resolve the queries they are RN for. The challenge is to find out how managing the traffic balances the number of avoided DNS(SEC) resolutions. There are various ways to manage the incoming traffic. In this section we consider the following mechanisms:

- No cache, no Replication (Pastry): when a node receives a query it is not RN for, a query/response exchange is performed with the RN node.
- Stateless Forwarding (Pastry-SF): works like Pastry, except that the RN node sends the response directly to the end user rather than to the Pastry node.
- Passive Caching (Pastry-PC): works like Pastry, except that nodes cache the responses they forward.
- Replication (Pastry-R): works like Pastry, but when the RN node performs a resolution, it provides a copy of the response to k neighbors that cache it.
- Active Caching (Pastry-AC): works like Pastry-SF, but takes advantage of the FQDNs' Zipf distribution. Each RN provides the responses for its γ most popular FQDNs to all other nodes of the platform.

Pastry, Pastry-SF, Pastry-PC, Pastry-R and Pastry-AC are illustrated in figure 6.

[Figure 6: message flows for the variants. (a) Pastry-SF: [1] DNS Query, [2] DHT Query, [3] DNS Query, [4] DNS Response, [5] DNS Response sent directly to the end user; (b) Pastry-AC: as Pastry-SF, plus [6] Replication of the γ most popular FQDNs to all nodes; (c) Pastry-PC: as Pastry, plus [7] Cache Insertion on the forwarding node; (d) Pastry-R: as Pastry-SF, plus [6] Replication to k neighbors.]

Fig. 6. DHT Architecture Principles

Thus, for Pastry: CM = CM_o,

CM_j = Σ_{r∈R} (1 − H_{r,j})·Φ(r) + H_{r,j}·Φ(r)·(1 − Φ(r))^{q·τ_r}
R_j = Σ_{r∈R} H_{r,j}·Φ(r)·(1 − Φ(r))^{q·τ_r}
M_j = Σ_{r∈R} (1 − H_{r,j})·Φ(r)

With OT_FWD the OT required to forward a packet, we have:

OT(q_j) = q_j·(CH_j + M_j·CH)·OT_H + q_j·M_j·OT_FWD + q_j·(R_j + M_j·CM)·OT_R    (5)

OT_FWD ≈ OT_H, as forwarding consists of a routing table lookup on a small table, forwarding the query to the RN node, and then forwarding the response to the end user. OT_H includes reading the DNS query, but this might not be necessary: redirection may be based on reading and hashing a fixed number of bits at a defined position, as routers do with IP addresses.

E. Pastry Stateless Forwarding (Pastry-SF): (no cache, no replication)

In Pastry-SF, n_j sends the response directly to the end user rather than to the Pastry node that forwarded the query. Similarly to Pastry in section V-D:

OT(q_j) = q_j·(CH_j + M_j·CH)·OT_H + q_j·M_j·OT_FWD′ + q_j·(R_j + M_j·CM)·OT_R    (6)

OT_FWD′ ≈ OT_H / 2, since, unlike with OT_FWD, no response is forwarded, and responses are much larger than queries.

F. Pastry Active Caching (Pastry-AC): (no replication)

Active Caching [13] takes advantage of the Zipf distribution of the FQDNs. Each node informs all other nodes of its γ most popular RN'd FQDNs. In our model, the nodes responsible for the γ FQDNs are proactive, which means that no cache miss occurs for those FQDNs. If we consider that all FQDNs have the same TTL (as in [8]), then during TTL, n_j sends γ responses to all n − 1 nodes and receives the γ most popular FQDNs from all n − 1 nodes. As a consequence, a DNS query with rank greater than n·γ follows the Stateless Forwarding Pastry resolution procedure. As such:

CM = Σ_{r∈R, r>n·γ} Φ(r)·(1 − Φ(r))^{q·τ_r}
R_j = Σ_{r>n·γ} H_{r,j}·Φ(r)·(1 − Φ(r))^{q·τ_r}
M_j = Σ_{r>n·γ} (1 − H_{r,j})·Φ(r)

Finally:

OT(q_j) = q_j·(CH_j + M_j·CH)·OT_H + q_j·(R_j + M_j·CM)·OT_R + q_j·M_j·OT_FWD′ + (γ/τ_r)·OT_RAC + (2·γ·(n − 1)/τ_r)·OT_FWDAC    (7)

OT_FWD′_j ≈ OT_H / 2, as in equation (6). For resolutions of Active Cached FQDNs, OT_RAC_j ≈ OT_H. OT_FWDAC_j ≈ n·OT_R^DNS, since cache updates are sent in one block and the cache to be updated is quite small. Optimizations may also consider different levels of caches, so as to improve cache lookup and the interactions between C_j and the Active Cache.

7

G. Pastry Passive Caching (Pastry-PC): (cache, no replication)

With Passive Caching, a node proceeds as in regular Pastry, but keeps the forwarded response in its cache. Similarly to Pastry: CM = CM_o,

R_j = Σ_{r∈R} Φ(r)·H_{r,j}·(1 − Φ(r))^{q·τ_r}
M_j = Σ_{r∈R} (1 − H_{r,j})·Φ(r)·(1 − Φ(r))^{q·L_j·τ_r}

Finally:

OT(q_j) = q_j·(R_j + M_j·CM)·OT_R + q_j·(CH_j + M_j·CH)·OT_H + q_j·M_j·OT_FWD′′    (8)

OT_FWD′′ ≈ OT_R^DNS, as the insertion is performed on a large cache.

H. Pastry Replication (Pastry-R): (no cache, replication)

Replication with no cache works like the regular Pastry architecture with no cache, except that when a resolution is performed, the response is replicated on k neighbors. This mechanism is close to the Active Caching mechanism to the extent that all FQDNs are replicated. On the contrary, Active Caching only cares about the most popular FQDNs; here replication occurs only towards k nodes, whereas Active Caching replicates the responses on all nodes. When node n_j does not have the response in its cache, it proceeds as in Pastry (cf. section V-D). Pastry-R provides CM = CM_o,

R_j = Σ_{r∈R} Φ(r)·H_{r,j}·(1 − Φ(r))^{q·τ_r}
M_j = Σ_{r∈R} Φ(r)·k·H_{r,j}·(1 − Φ(r))^{q·τ_r} + Φ(r)·(1 − (k + 1)·H_{r,j})

Then, we estimate the probability of replicating or receiving a replication from a peer as P_CFWD = 2·k·R_j. Finally:

OT(q_j) = q_j·(CH_j + M_j·CH)·OT_H + q_j·(R_j + M_j·CM)·OT_R + q_j·M_j·OT_FWD′ + q_j·2·k·R_j·OT_FWDPC    (9)

OT_FWD′_j ≈ OT_H / 2 (cf. equation (6)). OT_FWDPC_j ≈ OT_R^DNS / 2, since a cache update is performed, but no response is sent and no cache lookup is performed.

VI. EXECUTING DHT MODELS WITH TAILX

This section computes the distribution of TAIL2000 over the various DHT architectures modeled in section V. The efficiency of a DHT architecture depends on the relative cost of caching versus requesting another node. Caching may result in large, expensive caches, whereas requesting may result in heavy network operations. Because the impact of cache length on performance is hard to estimate, in our evaluation we assume that all cache operations have the same cost over the various DHT architectures. This holds as long as the caches remain small, or of the same size.

Figure 7 exhibits how traffic variations impact the CPU ratio of the DHT architectures to IPXOR. Architectures based on routing provide better performance and are more stable to TTL or query rate variations (with a constant number of FQDNs); thus Pastry-SF is recommended. Pastry-PC and Pastry-R, especially for DNSSEC, take advantage of large TTL values and large query rates, because the number of FQDNs remains constant in our computations. Hence, increasing the TTL or the query rate results in increasing the CHR.

Figure 7c shows that when R_CPU = CPU_Resolution/CPU_Cache ≥ 20, the DHT provides a clear advantage over IPXOR, which remains stable for greater values. [10] measured on UNBOUND R_CPU^DNS = 3.74 and R_CPU^DNSSEC = 38.69, which shows that Pastry-SF only requires 55% of the IPXOR resources for DNS and 19.2% for DNSSEC. Figure 7a shows that with TAIL2000, TTLs above 100 have a small impact on the platform. Furthermore, routing a packet is lighter than caching operations, which makes Pastry-SF and Pastry the recommended architectures for TAILX. Since our simulations run with a constant number of FQDNs, this reduces the Cache Hit Rate (CHR), and increasing the query rates in figure 7b increases the Cache Hit Rate. This makes Pastry-PC and Pastry-R gain much more advantage over the other architectures, especially with DNSSEC. Architectures provide a balance between caching mechanisms and routing, and the efficiency of an architecture really depends on the traffic characteristics.

Figure 7d confirms that DHT architectures reduce the CPU consumption over IPXOR by up to 60%. Then, Pastry-AC is clearly the most scalable architecture, and is able to lower the load by expanding the number of prefetched FQDNs; this is confirmed by figure 7f. However, Pastry-AC is redundant with PREFETCHX, which is expected to take advantage of NHACs, so Pastry-SF is the most appropriate architecture in terms of scalability. Figure 7d also shows that response replication proves to be beneficial with DNSSEC, due to the relative costs of caching versus forwarding. However, replication on multiple nodes increases the size of the cache, which is not taken into account in our models. Pastry-R is useful in case of failover, and future work should measure how starting a node with an empty cache may impact the platform. If this impact is significant, then the marginal cost shown in figure 7d may justify activating replication.

In section IV, we concluded that prefetching X = 2000 results in dividing the number of nodes of PREFETCH by 2 compared to IPXOR. In this section we showed that, for the remaining TAILX traffic, which represents around 32% of the traffic, the DHT can reasonably decrease the necessary resources by between 55% and 80%. This means that, overall, the DHT reduces the necessary resources by roughly between 20% and 35% compared to IPXOR, which is consistent with the 30% resource reduction provided by FQDNSHA1. As a result, PREFETCHX requires at least 4 times fewer nodes than IPXOR. The purpose of section VII is to validate our models with experimental measurements.
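To make the composition of the "4 times fewer nodes" figure explicit, one plausible back-of-the-envelope reading of the numbers above (illustrative, not a computation from the paper's raw data):

```latex
% Prefetching HEAD_X halves the platform; the DHT then removes a further
% 55% (up to 80%) of the resources that TAIL_X would otherwise require.
\frac{N_{\mathrm{PREFETCH}}}{N_{\mathrm{IPXOR}}}
  \approx 0.5 \times (1 - 0.55) \approx 0.23,
\qquad
\text{i.e. } 1/0.23 \approx 4.4 \text{ times fewer nodes, hence at least } 4.
```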

VII. FREE PASTRY EXPERIMENTATION

This section validates our models by implementing and testing a DNS platform based on FreePastry [5]; the results are compared to those provided by the theoretical model of Pastry (which models Pastry without any cache) and by IPXOR. In this section, a uniform FQDN popularity distribution is used for both the experimentation and the simulation.

[Figure 7: CPU(DHT)/CPU(IPs) for IP, Pastry, Pastry-SF, Pastry-AC, Pastry-PC and Pastry-R, as a function of (a) TTL (DNSSEC), (b) Query Rate (DNSSEC), (c) CPU(Resolution)/CPU(Cache Hit), (d) the number of nodes (DNSSEC), (e) the number of neighbors (DNSSEC), and (f) the number of prefetched FQDNs for Active Caching (DNSSEC).]

Fig. 7. TAIL2000 Models evaluation with live DNS capture

First, we measure the performance on a single node, and then extend the platform to 10 nodes. Our 10 node Pastry experimental platform is based on the Java FreePastry library (version 2.1alpha3) [5]. FreePastry implements Pastry [15] as the routing protocol between DHT nodes, PAST [16] implements the DHT, and GCPAST, on top of PAST, implements a garbage collector to remove expired contents. DNS responses are stored in the DHT and indexed by the hash of the queried FQDN and type; the GCPAST TTL assigned is the one associated with the DNS response. DNS resolutions and interactions with GCPAST are performed with DNSJava [20]. The format of the DHT contents must be chosen carefully, and we choose to store a complete answer involving multiple RRsets, rather than each RRset individually. The hash key is built using the name, class and type of the query field, as sketched below. The main drawback is that this generates a high redundancy between the RRsets contained in different DHT contents and increases the size of the DHT caches on the nodes. For the TTL, we assign to GCPAST the TTL provided in the ANSWER field of the DNS response.
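A minimal sketch of this indexing scheme; the field separator and encoding are assumptions for illustration, not the actual FreePastry/DNSJava code.

```python
import hashlib

def dht_key(name: str, qclass: str, qtype: str) -> bytes:
    """Key a complete DNS answer by the query's name, class and type."""
    return hashlib.sha1(f"{name.lower()}|{qclass}|{qtype}".encode()).digest()

# The whole answer (possibly several RRsets) is stored under one key.
key = dht_key("example.com.", "IN", "A")
```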

Our experimental platform is composed of 10 Pentium III and Pentium II servers running Debian Lenny. Although different configurations were used, the CPU frequency varies from 500 MHz to 1 GHz, and the RAM varies from 128 MB to 384 MB. The 10 node platform is loaded with traffic with a T = 3 minute TTL. We fix CHR_o = 0.7, so caches are pre-populated with FQDNs from a list, and 7 out of 10 queries are from this list. Tests last T_test = 1 minute, with a maximum of Q = 1100 q·s⁻¹ with Pastry.

Figure 8a shows, for different numbers of nodes n, the percentage of answered queries. We consider the Maximum Load (ML) to be reached when there are more than 2% errors. Figure 8b plots the experimentally measured ML, the Stand Alone value (for n = 1, ML(1) = 200), and the linear approximation of the experimental values for n ≥ 2: ML_Pastry(n) = 114.7·n − 96.9. In Stand Alone mode, the whole FreePastry network is supported by the same hardware, and no routing operations are required. In our theoretical model we neglect the routing table lookup: we have reliable nodes, our routing discovery protocol is much lighter than the one originally designed in Pastry, and all CPU estimations have been modeled by measuring a C implementation of a performance driven server, UNBOUND. FreePastry and our DNS module, by contrast, add results from an academic development of Java software, where the routing table cost is definitely not negligible. As such, the Stand Alone mode does not provide a good estimation for the experimental IPXOR measurement. In order to provide an experimental measurement for IPXOR, we need to estimate the cost of the routing table on a FreePastry node. ML(n) estimates that routing table lookup cost at ≈ 96.9 queries·s⁻¹, which leads to the experimental IPXOR measurement ML_IP(n) = 200·n − 96.9 queries·s⁻¹.

Figure 8c compares our theoretical and experimental results for ML_Pastry/ML_IP(n). Theoretical and experimental measurements confirm that the ratio between Pastry and IPXOR remains stable and independent of n. However, the theoretical model expects Pastry to be equivalent to IPXOR, whereas the experimental measurements show that Pastry costs around twice as much as IPXOR. The difference between the two results can be explained by the FreePastry implementation and the testing conditions. 1) FreePastry is not optimized for performance: profiling the code with JRat [7] revealed that 64% of the time would be spent on insertion and DHT lookup if we were using FreePastry in synchronous mode. Using asynchronous mode relieves this bottleneck, but does not mean the code is now optimal. In addition, 2) routing operations are much heavier than those we require in Pastry. When we compare the time necessary to perform a resolution in a Stand Alone configuration and in a FreePastry configuration, we measure that when the DNS response is in the cache, it takes 5 ms to 65 ms in Stand Alone mode (resp. 4 ms to 300 ms in DHT mode). When the response requires a resolution, it takes 8 ms to 175 ms in Stand Alone mode (resp. 59 ms to 342 ms in DHT mode). In both cases, the time variation reveals that the event thread synchronization of the FreePastry implementation could be further optimized. On the other hand, it also shows that the routing function is clearly not negligible in this implementation. At last, the lack of FreePastry and hardware performance only allows experimental measurements under quite small loads, which makes the overhead more visible. In our case, routing overhead accounts for a bit less than half of the performance.

Another way to measure the dependence between the experimental and theoretical values is to derive the sample correlation coefficient as an estimator of the Pearson correlation. For both IPXOR and Pastry, we consider the set of experimental measurements of the Maximum Load, ML_exp, and the set of computed values of the Maximum Load, ML_model. The various values are those measured and computed for the various values of n, the number of nodes. We derive r^IP_{ML_exp, ML_model} = 0.9991 and r^Pastry_{ML_exp, ML_model} = 0.9827. The correlation coefficients are very close to 1, which shows an almost perfect positive linear relationship between the measured values and those computed from our models. As a result, the experimental measurements show that our theoretical model does not present a major bias, and confirm the stable ratio ML_Pastry/ML_IP(n) in the testing conditions. On the other hand, they also show that a FreePastry implementation may not fulfill the requirements of our models, and that further investigation would need specific developments, especially to simplify the routing code and algorithms.
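A minimal sketch of this validation step in Python/NumPy; the arrays below are placeholders standing in for the measured and modeled Maximum Loads, not the paper's data.

```python
import numpy as np

nodes = np.arange(2, 11)
ml_model = 114.7 * nodes - 96.9                                  # modeled ML
ml_exp = ml_model + np.random.normal(0.0, 5.0, nodes.size)       # "measured"

slope, intercept = np.polyfit(nodes, ml_exp, 1)   # linear approximation ML(n)
r = np.corrcoef(ml_exp, ml_model)[0, 1]           # sample Pearson correlation
```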

[Figure 8: (a) experimental error rate; (b) experimental Maximum Load (Q/s) versus the number of nodes (1 to 10), with the linear approximation and the Stand Alone point; (c) ML(Pastry)/ML(IPs) versus the number of nodes: experimental, theoretical DNS and theoretical DNSSEC.]

Fig. 8. Maximum Load

VIII. CONCLUSION

This paper proposes the PREFETCHX architecture, which prefetches X FQDNs (HEADX) and handles the remaining FQDNs (TAILX) with a Distributed Hash Table (DHT) structure. Traffic analysis sets X = 2000 to uniformly distribute the resources among the nodes. Prefetching HEADX reduces the number of nodes by a factor of 2, and the DHT reduces the nodes needed for TAILX by between 55% and 80%, making PREFETCHX 4 times more efficient than the current IPXOR architecture. PREFETCHX's efficiency can be enhanced by increasing X, provided an adapted, light Pastry-like implementation. In this paper, we keep X small to limit the exchanges between the nodes needed to fill their caches. Large values of X require protocols and architectures optimized for cache updates. Similarly, a light Pastry implementation is also expected to enhance the architecture. All these aspects are left for future work.

REFERENCES

[1] BIND. http://www.isc.org/bind10.
[2] Cavium. http://www.cavium.com/table.html.
[3] R. Cox, A. Muthitacharoen, and R. Morris. Serving DNS using a peer-to-peer lookup service. In Revised Papers from the First International Workshop on Peer-to-Peer Systems, IPTPS '01, pages 155–165, London, UK, 2002. Springer-Verlag.
[4] Endace. http://www.endace.com.
[5] FreePastry. http://www.freepastry.org/.
[6] S. Iyer, A. Rowstron, and P. Druschel. Squirrel: A decentralized peer-to-peer web cache. In 12th ACM Symposium on Principles of Distributed Computing (PODC 2002), July 2002.
[7] JRat. The Java Runtime Analysis Toolkit. http://jrat.sourceforge.net/.
[8] J. Jung, E. Sit, H. Balakrishnan, and R. Morris. DNS performance and the effectiveness of caching. In Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, IMW '01, pages 153–167, New York, NY, USA, 2001. ACM.
[9] D. Massey. A comparative study of the DNS design with DHT-based alternatives. In Proceedings of IEEE INFOCOM'06, 2006.
[10] D. Migault, C. Girard, and M. Laurent. A performance view on DNSSEC migration. In CNSM 2010, Oct. 2010.
[11] Nortel. Alteon OS 21.0, Alteon Application Switch, Sept. 2003.
[12] Radware. Alteon Application Switch 5412 Case Study, 2010.
[13] V. Ramasubramanian and E. G. Sirer. Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays. In Proceedings of the 1st Conference on Symposium on Networked Systems Design and Implementation - Volume 1, pages 8–8, Berkeley, CA, USA, 2004. USENIX Association.
[14] J. Risson, A. Harwood, and T. Moors. Topology dissemination for reliable one-hop distributed hash tables. IEEE Transactions on Parallel and Distributed Systems, 20:680–694, 2009.
[15] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. Lecture Notes in Computer Science, 2218:329, 2001.
[16] A. Rowstron and P. Druschel. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. In 18th ACM Symposium on Operating Systems Principles (SOSP '01), pages 188–201, Oct. 2001.
[17] R. Tewari, M. Dahlin, H. Vin, and J. Kay. Beyond hierarchies: Design considerations for distributed caching on the Internet. Technical Report TR98-0, 1998.
[18] UNBOUND. http://unbound.net/.
[19] J. Wang. A survey of Web caching schemes for the Internet. ACM Computer Communication Review, 25(9):36–46, 1999.
[20] B. Wellington. dnsjava. http://www.dnsjava.org/.
[21] A. Wolman, G. M. Voelker, N. Sharma, N. Cardwell, A. R. Karlin, and H. M. Levy. On the scale and performance of cooperative web proxy caching. In Symposium on Operating Systems Principles, pages 16–31, 1999.