Using Latency-Recency Profiles for Data Delivery on the Web


Laura Bright and Louiqa Raschid

University of Maryland, College Park, MD 20742

[email protected], [email protected]

Abstract

An important challenge to web technologies such as proxy caching, web portals, and application servers is keeping cached data up-to-date. Clients may have different preferences for the latency and recency of their data. Some prefer the most recent data; others will accept stale cached data that can be delivered quickly. Existing approaches to maintaining cache consistency do not consider this diversity and may increase the latency of requests, consume excessive bandwidth, or both. Further, this overhead may be unnecessary in cases where clients will tolerate stale data that can be delivered quickly. This paper introduces latency-recency profiles, a set of parameters that allow clients to express preferences for their different applications. A cache or portal uses profiles to determine whether to deliver a cached object to the client or to download a fresh object from a remote server. We present an architecture for profiles that is both scalable and straightforward to implement at a cache. Experimental results using both synthetic and trace data show that profiles can reduce latency and bandwidth consumption compared to existing approaches, while still delivering fresh data in many cases. When there is insufficient bandwidth to answer all requests at once, profiles significantly reduce latencies for all clients.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 28th VLDB Conference, Hong Kong, China, 2002

1 Introduction

Recent technological advances and the rapid growth of the Internet have increased the number of clients who access objects from remote servers on the Web. Remote data access is often characterized by high latency due to network traffic and remote server workloads. Data at remote servers is often characterized by frequent updates. For example, stock quotes may be updated as frequently as once per minute [12].

The increased popularity of the Internet has resulted in a corresponding increase in the diversity of both the clients and the types of data and services that they request. As a consequence of this increased diversity, clients may have varying preferences about the recency and latency of their requests. For example, a stock trader who requires accurate stock information may be willing to wait longer for the most recent data. On the other hand, if there is heavy congestion on the Internet or at a remote server, a client reading the news may accept slightly stale data if it can be delivered quickly.

To date, many caching solutions have been developed to improve access latencies and data availability on the Web. However, the effectiveness of all these approaches is limited by cached data that becomes stale as updates are made at remote servers. In addition to more traditional technologies such as proxy caching, newer technologies such as web portals and application servers also utilize caching. At present, the techniques used to keep cached objects up-to-date cannot handle diverse client preferences and may perform poorly with respect to latency, recency, or bandwidth consumption. To handle the popularity and diversity of the web, and to scale to the large number of clients and data sources, it is crucial to develop web caching technologies that can accommodate diverse client preferences with respect to recency and latency.

This paper introduces latency-recency profiles to address this limitation of current web technologies. We first describe some existing caching technologies that improve latency and availability. We then present the consistency mechanisms currently used by these technologies, and discuss their shortcomings. Finally, we introduce latency-recency profiles, our flexible, scalable solution that allows clients to express their preferences with respect to the latency and recency of data.

1.1 Web Caching Technologies

Figure 1: Proxy Cache Architecture [diagram: clients connect through a cache to servers across the Internet]

Proxy Caches

Client-side proxy caching is a widely used technique to reduce access latencies on the Web [7, 10, 20]. In this architecture, a cache resides between a group of clients, e.g., a company or university campus, and the Internet. This architecture is shown in Figure 1. A proxy cache stores objects previously requested by clients, and these cached objects may be used to serve subsequent requests. Since multiple clients access objects through the proxy, a proxy cache can leverage commonalities in client requests and reduce access latencies.

Figure 2: Application Server Architecture [diagram: clients, Internet, web server, application server, database servers]

Application Server Caches

Application servers are an emerging technology to improve the performance of data-intensive web sites by offloading some functionality from web servers. Many commercial products are available, e.g., Oracle9iAS Web Cache [4], IBM WebSphere [29], and BEA WebLogic [28]. Application servers are well-suited for large scale data handling, and can perform caching to further improve performance. For example, an application server cache can reside between the database server and the Internet, and cache components of dynamically generated web pages. The application server can then automatically generate pages without contacting the database server. The Oracle Application Server Web Cache [4] is an example of a product with this functionality. We note that there is typically some cooperation between the database server and the application server, and there may be a high-bandwidth link connecting them. The architecture is shown in Figure 2.

Figure 3: Portal Architecture [diagram: a portal server gathers data from servers across the Internet and serves clients]

Web Portals

Portals make it easier for clients to locate and access relevant information. Earlier portals served primarily as "hubs" to direct users to relevant web sites. Today, however, many portals cache data gathered from other sites, so clients can easily access all relevant information from a single site. According to a recent study by Booz-Allen & Hamilton [26], an increasing number of portals serve as destinations themselves, rather than gateways to other sites, and 60% of Web sessions include at least one access to a portal site. The web portal architecture is shown in Figure 3. The challenge of providing up-to-date data is critical to the success of a portal, in particular since clients will only visit portals that meet their needs. Any portal that fails to do so risks losing clients to competitor sites. Therefore, it is crucial for portals to keep cached data up-to-date while meeting client preferences with respect to latency and recency.

A challenge for all these technologies is that cached data becomes stale as updates are made at remote servers. To date, several techniques have been proposed to keep cached data consistent with data at remote servers. Most implementations of proxy caches, portals, and application server caches use some combination of these. However, none of these techniques considers diverse client preferences. We describe these existing consistency approaches and discuss their limitations. In the following discussion, we use the generic term "cache" to refer to any of the above technologies.

1.2 Existing Consistency Approaches

Time-to-Live (TTL)

An accepted strategy for maintaining the consistency of cached objects is to assign each object a time-to-live (TTL) [8, 11, 17], i.e., the estimated length of time the object will remain fresh. If the TTL of a requested object has expired, the cache must validate the object (check for updates) at the remote server. Note that this approach requires no cooperation from remote servers. While TTL delivers more recent data to clients, validation adds overhead to client requests and reduces the benefits of caching. Further, it is difficult to accurately estimate an object's TTL. An estimate that is too conservative will improve freshness but result in many unnecessary validations at remote servers, while an estimate that is too optimistic reduces contact with remote servers but may result in many clients receiving stale data. TTL is the most commonly used mechanism in proxy caches, and may also be used by web portals.

Always-Use-Cache (AUC)

Another strategy is to serve all requests from a cache, and perform prefetching in the background to keep cached objects up to date. We refer to this approach as Always Use Cache (AUC). Prefetching strategies to maximize the overall recency of a cache are described in [9]. This approach has several limitations. First, in the general case where there is no cooperation from remote servers, the cache has no knowledge of when updates occur. Therefore, AUC typically must poll remote servers to keep cached data up to date. This may consume large amounts of bandwidth and does not scale well to large numbers of objects. Also, there may be some delay between when the update occurs and when the cache checks for updates. During this time, the cache will return stale data. Therefore, while AUC minimizes the latency of client requests, it may perform poorly with respect to recency and consume large amounts of bandwidth, since it does not scale well to large caches or frequently updated objects. AUC is commonly used by web portals and other technologies that maintain copies of objects from many web sites.

Server-Side Invalidation (SSI)

A third approach to maintaining cache consistency is to have servers maintain information about objects stored in client caches, and send invalidation messages to caches when an object is updated. Alternatively, a server may push the updated object to caches. We refer to this approach as SSI. This approach has been studied in the context of proxy caching in [23], and related techniques have been studied by the database community in [2, 19, 25]. SSI requires cooperation from remote servers. When a server sends only invalidation messages to the cache, SSI has performance comparable to TTL, assuming TTL estimates are accurate. This approach was shown to be feasible in terms of bandwidth and server load in [23]. However, if servers instead send the updated objects after each update, as is the case with many application server caches [4], the overhead of SSI may be considerably greater.

SSI is often used in application server caches or web portals. It may be feasible in these environments because the number of servers and caches is typically fixed, which facilitates cooperation and reduces scalability concerns. It is unlikely to be a viable alternative in proxy caches because many web servers are either unable or unwilling to implement it.

To summarize, existing approaches to maintaining cache consistency may increase the latency of requests, consume excessive bandwidth, or both. Further, this overhead may be unnecessary in cases where clients will tolerate stale data that can be delivered quickly. A scalable solution that can handle clients and applications with diverse latency and recency requirements is needed.

1.3 Our Solution: Latency-Recency Profiles

In this paper, we present latency-recency profiles to overcome the limitations of existing approaches. Profiles comprise a set of parameters that allow clients to express their preferences with respect to latency and recency for their different applications. A client may choose a single profile for all applications. Alternatively, a client can specify a set of application-specific profiles. A profile-based downloading strategy (labelled Profile) uses latency-recency profiles to determine whether to download a requested object or to use a cached copy.

Profile is a generalization of TTL and AUC. TTL aims to deliver the most recent data, while AUC aims to minimize latency. The parameters of Profile can be tuned to provide performance anywhere between these two extremes. Profile can also provide an upper bound with respect to latency or recency. Further, unlike SSI, it requires no cooperation from remote servers, and can potentially be used for any data source on the web.

Profile is further distinguished by its scalable architecture. A client's browser can communicate profile parameters to a cache by appending parameters to an HTTP request. Profile incorporates these parameters into the downloading decision. The advantage of this architecture is that it is straightforward to implement. It also eliminates the need for caches to maintain profile information and allows a cache to scale to a large number of clients. Further, clients can easily change their profiles without having to inform the cache.
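As an illustration, such a request might carry the profile parameters in extension headers, as sketched below. The paper does not specify a wire format, so the header names and encoding here are our own assumptions:

    GET /q?s=XYZ HTTP/1.1
    Host: finance.yahoo.com
    X-Profile-TR: 1
    X-Profile-TL: 1.0
    X-Profile-W: 0.5
    X-Profile-KR: 1
    X-Profile-KL: 1

Here the hypothetical X-Profile-TR and X-Profile-TL headers carry the target recency and target latency defined in Section 3.2, and the remaining headers carry the weight and tuning constants of the decision function of Section 3.3; the parameters could equally be appended as query-string arguments.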

There are incentives for both clients and caches to support profiles. From the client's perspective, profiles can reduce access latencies compared to the more traditional TTL when appropriate, while providing greater recency than AUC. From the cache's perspective, profiles can reduce bandwidth consumption compared to TTL, AUC, and SSI, which can reduce the cache's operating costs and can improve access latencies for clients who access data from remote servers.

Another key benefit of using profiles is that they are well-suited to handling brief periods of overload ("surges") which often occur on the web. Surges can be caused by congestion on the link connecting a proxy cache to the Internet, or by a large number of requests at a popular server. When there is insufficient bandwidth or server capacity to process all requests during such surges, using profiles can reduce the latency of all requests by downloading only the ones that require the most recent data.

The rest of this paper is organized as follows: Section 2 surveys related work, and Section 3 describes the framework for latency-recency profiles. We present experimental results using both synthetic and trace data in Section 4, and conclude in Section 5.

2 Related Work

There has been much research in the web caching community [7, 10, 11, 17, 27] on proxy caching to improve performance on the web. Cache consistency techniques are presented in [8, 11, 17, 23]. [8, 11, 17] present the TTL approach described in the previous section. [23] studies techniques for strong cache consistency, i.e., guaranteeing fresh data. The authors show that server-side invalidation is effective for maintaining strong cache consistency; however, this technique must be implemented by remote servers. [9] considers prefetching locally cached objects to improve the overall recency of a cache. This work assumes that all requests are served from the cache, and does not consider client preferences for recency or latency.

Recent work in the database community considers caching dynamically generated web content [6, 24]. Work in [24] caches components of dynamically generated web pages to exploit overlap in queries. However, this work does not consider updates to the underlying databases, and requires cooperation from remote servers. [6] presents techniques for database-backed web sites to invalidate cached dynamic content. These techniques apply only to servers and are not appropriate for other caching technologies, e.g., client-side proxy caching or portals.

Work in materialized views, e.g., [16, 18, 30], also considers the problem of keeping data up to date. These works typically assume full knowledge of updates to the underlying database. Work in [14] allows stale data to be incorporated into materialized views by adding an obsolescence cost, and shares our goal of allowing clients to accept stale data in exchange for lower latencies. WebView materialization [21, 22] addresses the problem of computing materialized views for web-accessible databases. Efficient strategies for the database to propagate updates to the WebView are presented in [22]. Our problem differs from [21, 22] since we address updating cached objects at the client side, where we do not know exactly when objects are updated at remote servers.

There is also relevant work in the database community [2, 19, 25] that relaxes the requirement that cached copies be consistent with objects that reside on remote servers. These works allow cached copies of objects to deviate from the data at the server in a controlled way. This requires servers to propagate updates to the client-side cache when a cached value no longer has an acceptable degree of precision, which places a burden on servers and does not scale well to a large number of clients.

3 Latency-Recency Profiles

Our latency-recency profiles allow clients to express their preferences for their applications using a few parameters. Profiles are set individually by each client, and a single client can specify either a single profile or different profiles for different applications. We first discuss several key issues that are crucial to successfully implementing and using profiles. We then describe how clients can choose target latency and recency values, and present a parameterized decision function that can capture the latency-recency tradeoff for a particular client or application. This is the basis of our approach, labelled Profile. Finally, we briefly discuss upper bounds provided by our function, and describe how the parameters can be tuned to meet client requirements with minimal overhead for the clients.

3.1 Issues

There are several issues that are important to successfully implementing and using client profiles at a cache. The first issue is scalability. An implementation of profiles that requires a cache to store detailed information about each client would add considerable overhead, because clients would need to register profiles with the cache, and the cache would need to keep the information up to date. This does not scale well to large numbers of clients. While our solution Profile exploits knowledge of profiles, it does not require the cache to store any profile information. Browsers append the profile parameters to client HTTP requests, and Profile uses these parameters in the decision function. Thus, Profile can easily scale to a large number of clients, with no additional communication overhead between the client and the cache. This scalability is a key benefit of using a parameterized function.

A second issue is flexibility. Clients should be able to specify profiles that are appropriate for each of their applications, and they should be able to easily adjust their profiles as needed. To allow clients to use different profiles for different applications, clients can choose a default profile which they can override for specific domain names or URLs. For example, a client requiring the most recent stock quotes may specify that all requests to the domain finance.yahoo.com [12] require the most recent data, but that all other requests can tolerate up to 1 update. Clients can easily change their profiles using their browser, without communicating with the cache.

The third issue is ease of implementation. It is straightforward to modify a cache to implement Profile. Profile allows clients with diverse profiles to share a cache without adding any overhead to each other's requests. For each individual request, the cache will use Profile to choose how to serve the request based on that client's profile. If clients with different profiles request the same object simultaneously, the cache could serve one client's request from the cache while downloading a fresh copy for the other client.

The final issue relates to guarantees. Profile is a generalization of TTL, which guarantees fresh data (assuming TTL estimates are accurate), and AUC, which guarantees low latency. In addition, Profile can support upper bounds on either latency or recency, which other approaches do not support.

3.2 Parameters

Profiles include the following parameters:

Target Latency: The first parameter is a target latency (TL), which is the desired end-to-end latency to download an object. We note that the cache can estimate the latency of downloading an object using techniques described in [1, 15], which have been shown to be reasonably accurate in practice.

Target Recency (Age): Clients specify a target recency (TR). There are many possible recency metrics that could be chosen. In this paper, our recency metric is the number of times the object has been updated at the remote server since it was cached. We refer to this metric as age.
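For concreteness, these parameters, together with the weight and tuning constants introduced in Section 3.3, can be grouped into a single record. The following is a minimal sketch in C++ (the language of our simulator); the struct and field names are ours:

    // Minimal sketch of a client profile; names are illustrative only.
    struct Profile {
        double TL;  // target latency, in seconds
        double TR;  // target recency (age), in updates
        double w;   // weight on meeting the latency target, in [0, 1]
        double KL;  // tuning constant (>= 0) for the latency score
        double KR;  // tuning constant (>= 0) for the recency score
    };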

We briefly discuss our choice of recency metric. There have been many different metrics described and used in the literature, e.g., [2, 9, 14, 19]. One metric is the amount of time elapsed since the cached object became stale [9]. Obsolescence measures age in terms of the number of insertions, deletions, and modifications [14]. Work in [19] considers age, the number of times an object has been updated at the remote server. The choice of recency metric depends on the semantics of the application and the types of updates that occur, so each of the above metrics is useful in different circumstances. In this paper we selected age as the recency metric [19] because we believe this metric is useful for a variety of applications. In the remainder of this paper, we use the terms recency and age interchangeably.

3.3 Profile: Parameterized Decision Function and Profile-Based Downloading

We now present the details of Profile. It uses a parameterized function that incorporates client profiles into the decision of whether to download a requested object or to use a cached copy. First, we describe the decision function. We note that there are many different functions that could be used. We chose this particular function because it has several desirable properties. First, it can be tuned to provide an upper bound with respect to latency or recency. Second, when it is impossible to meet both targets, two parameters can be set to reflect a tradeoff, i.e., the relative importance of meeting each of the targets.

Our function first calculates a score for both recency and latency as follows:

Score(T, x, K) = 1                  if x <= T
                 K / (x - T + K)    otherwise

T is the target value of recency or latency, x is the actual value, and K is a constant >= 0 that is used to tune the rate at which the score decreases. Let KL be the K value used to control the latency score, and let KR be the K value used to control the recency score. Note that the K values are set automatically by the browser based on client preferences, using a graphical interface (described in Section 3.4).

Combined Weighted Score

The decision function is a separable function that combines the scores for recency and latency. It can also be tuned to capture the latency-recency tradeoff for a client or application. This is done by assigning (relative) weights to the importance of latency and recency. The sum of the weights must equal 1. For some applications it may be more important to meet the recency target; for others it may be more important to meet the latency target. Let w be the weight assigned to meeting the latency target, and let (1 - w) be the weight assigned to meeting the recency target. We compute the combined score of an object as follows:

CombinedScore = (1 - w) * Score(TR, Age, KR) + w * Score(TL, Latency, KL)

Profile-Based Downloading

Our algorithm Profile uses the combined scoring function to make the decision of whether or not to download an object. When an object is requested, we compute the score of either downloading the object (DownloadScore) or using the cached copy (CacheScore). The Profile policy is as follows: when an object is requested, if DownloadScore > CacheScore, the object is downloaded from the remote server. Otherwise the cached copy is used.

We compute DownloadScore for an object as follows: recall that when an object is downloaded, its Age is 0 because the remote server always provides the most recent data. Therefore, Score(TR, Age, KR) is always 1.0. Latency is the estimated latency of downloading the object from a remote server. Thus, DownloadScore, the combined score of downloading an object, is

DownloadScore = (1 - w) * 1.0 + w * Score(TL, Latency, KL)

We now consider CacheScore. Recall that when an object is read from the cache, its Latency is 0. Therefore, Score(TL, Latency, KL) is always 1.0. Age is the estimated age of the cached object. Thus, CacheScore, the combined score of using a cached copy of an object, is

CacheScore = (1 - w) * Score(TR, Age, KR) + w * 1.0
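The complete decision logic follows directly from these formulas. The C++ sketch below is our own illustration, not the authors' implementation; it assumes the Profile record sketched in Section 3.2 and the estimated latency and age values that the cache maintains:

    // Score decays from 1.0 toward 0 once the actual value x exceeds
    // the target T; the constant K controls the rate of decay.
    double score(double T, double x, double K) {
        if (x <= T) return 1.0;   // target met: perfect score
        return K / (x - T + K);
    }

    // Profile policy: download iff DownloadScore > CacheScore.
    bool shouldDownload(const Profile& p, double estLatency, double estAge) {
        // A downloaded object has Age 0, so its recency score is 1.0.
        double downloadScore = (1 - p.w) * 1.0
                             + p.w * score(p.TL, estLatency, p.KL);
        // A cache hit has Latency 0, so its latency score is 1.0.
        double cacheScore = (1 - p.w) * score(p.TR, estAge, p.KR)
                          + p.w * 1.0;
        return downloadScore > cacheScore;
    }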

3.4 Choosing a Pro�le

The success of latency-recency profiles depends on the ease of creating a profile. If setting the parameters is complicated and time consuming, clients will be less inclined to use profiles. We describe an interface that allows clients to express the most appropriate profiles for their applications.

Figure 4: Behavior of (a) TTL (w=0, TR=0), (b) AUC (w=1, TL=0), and (c) Profile with TR=TL=0, w=0.5, and KR=KL=1. [Each panel plots Age (updates) on the x axis against Latency (sec) on the y axis; shaded regions indicate that the object is downloaded, white regions that the cached copy is used.]

Default Profiles

The default profile has its targets set to provide identical performance to TTL. This corresponds to settings of w=0 and TR=0. Note that with these settings, the TL value is irrelevant. This TTL setting is what many caches currently provide, e.g., proxy caches [5]. For those clients who wish to explicitly trade recency for improved latency, the browser will present a small number of parameter settings to the client, and let the client choose the settings that best suit their needs for each application. We describe how this choice can be made using the graphical interfaces of Figure 4 and Figure 5.

Figure 4 illustrates the latency-recency tradeoffs of three possible parameter settings. In these graphs, we plot the recency (age) of a cached object as a number of updates x on the x axis, and the latency of downloading the object as y seconds on the y axis. If a point (x, y) lies in the shaded area, then the object is downloaded. If (x, y) lies in the white area, then the object is read from the cache. Figure 4(a) displays the behavior of the default TTL profile (w=0, TR=0) to the user. Any object with 0 updates is served from the cache, while any object with 1 or more updates is downloaded. Figure 4(b) displays the behavior of AUC (w=1, TL=0), where the client will tolerate any amount of staleness to minimize access latency. AUC always uses the cached object (no shaded area), regardless of the number of updates.

Tuning Profiles

For clients who desire performance between the two extremes of TTL and AUC, a profile with parameters (w = 0.5, KR and KL = 1, and TR and TL = 0) can be chosen. Figure 4(c) displays the behavior of this profile. We see that the decision function captures the latency-recency tradeoff. When objects have higher access latencies, users may tolerate older cached objects (white area). Conversely, as the cached object becomes more stale, users are willing to wait longer to download a fresh object (gray area).
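As a worked example (ours, computed from the scoring function of Section 3.3), consider this profile and a cached object with Age = 2 updates whose estimated download latency is 1 second. Then CacheScore = 0.5 * 1/(2 + 1) + 0.5 * 1.0, approximately 0.67, while DownloadScore = 0.5 * 1.0 + 0.5 * 1/(1 + 1) = 0.75, so the fresh object is downloaded. If the estimated latency instead rises to 3 seconds, DownloadScore falls to 0.5 + 0.5 * 1/(3 + 1) = 0.625 < 0.67, and the stale cached copy is served, matching the regions of Figure 4(c).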

The profiles illustrated in Figure 4 can be tailored further. This is straightforward to do in our framework. For example, consider a client who wishes to receive data with recency of no more than 1 update. Such a client could choose the default TTL as in Figure 4(a), but change the TR value from 0 to 1. This would result in any object with 2 or more updates being downloaded, rather than 1 or more updates as shown in Figure 4(a).

Upper Bounds

The profiles of Figure 4 do not provide any upper bounds on latency or recency. For clients who desire even greater control over the settings of their profiles, the values for w, KL, and KR can be chosen to provide upper bounds. Clients do not need to manually choose w and K values. Instead, clients are aided by a graphical interface (similar to Figure 5) that illustrates the tradeoff for settings of w and K values, and allows them to make the appropriate choice.

Figure 5: Upper Bounds on the Latency-Recency Tradeoff. [Both panels plot Age (x axis) against Latency in sec (y axis), with a MaxLatency line at the upper bound: (a) KL = 2, KR = 2, w = 0.6; (b) KL = 2, KR = 10, w = 0.6.]

An upper bound for either latency or recency can be chosen. In particular, assigning a higher weight to latency (w > 0.5) places an upper bound on the latency of a downloaded request, and assigning a higher weight to age (w < 0.5) places an upper bound on the age of an object delivered to the client from the cache.

In Figure 5, w = 0.6 and TL, TR = 0. The upper bound on latency is 4 seconds. The choices of KL and KR in Figures 5(a) and 5(b) illustrate the latency-recency tradeoff that clients can select, which controls how the latency asymptotically approaches the upper bound of 4 seconds. The choice of K values can make Profile more aggressive about downloading data, as reflected by a larger shaded area.

4 Experiments

We use both trace data and synthetic data to compare Profile against three algorithms: TTL, AUC, and SSI. Our simulation models the proxy cache architecture of Figure 1. (These results also apply to portals, if we do not consider the additional time to send data from a portal to a client.) We first describe the details of these algorithms. We then describe the details of both the trace and synthetic datasets. Finally, we present our results. Our key results are as follows:

- Profile significantly reduces bandwidth consumption compared to all approaches, for both trace and synthetic data. Compared to TTL, Profile reduces bandwidth consumption with only a slight increase in the amount of stale data delivered to clients (trace data). Profile also provides better recency than AUC (trace and synthetic data).

- Profile can benefit from an increased cache size more than either TTL or AUC (trace data). AUC cannot deliver recent data when the cache size is large, while TTL cannot utilize a larger cache size to reduce latency. Profile can exploit an increasing cache size to reduce both age and latency.

- In the presence of surges, Profile improves latencies for all clients, even for clients who require the most recent data.

4.1 Algorithms

We consider the following algorithms:

- TTL: This is the cache consistency mechanism currently used in most proxy caches [5, 8, 11, 17]. Cached objects are assigned a TTL value, which is an estimate of how long they will be fresh in the cache. The TTL approach guarantees that all cached objects are up-to-date if the TTL estimate is accurate. It uses two parameters, UpdateThreshold and DefaultMax; these are explained in Section 4.2.

- AUC: We implemented a modified version of the prefetching strategy presented in [9]. We relax their assumption that all objects must be in the cache. Instead, we refresh objects that are currently in the cache in a round-robin manner (a sketch of this background refresh loop appears after this list). On a cache miss, objects are downloaded from a remote server. This strategy has the advantage of being straightforward to implement at the cache, and was shown to be near optimal in [9]. Objects are validated in the background at a specified PrefetchRate, and only validated objects that have been updated at the remote server are downloaded.

- Profile: This is implemented as described in Section 3. The decision function uses the estimated latency of downloading objects, and the estimated age of cached objects. We describe how to compute these for the trace data in Section 4.2. The settings of the profile parameters are described with the results in Section 4.4.

- SSI-Msg: We consider two variations of SSI. In the first, SSI-Msg, the server sends invalidation messages to a cache whenever an object is updated, but does not send the actual object to the cache. If the cached object is subsequently requested, the updated object is downloaded from the server. Note that this approach is comparable to TTL with accurate expiration times. This approach was shown to consume a comparable amount of bandwidth to TTL in [23].

- SSI-Obj: In the second variation, SSI-Obj, the server sends all updated objects to the cache. This consumes more bandwidth than SSI-Msg but guarantees that all cached objects will be up-to-date, which reduces the latency of requests.
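The background refresh loop of our AUC implementation can be sketched as follows. This C++ sketch is our own illustration; the cache and network routines are stubbed out, and all names are ours:

    #include <deque>
    #include <iostream>
    #include <string>

    // Stubs standing in for the network and the cache; real HTTP
    // validation and object storage are elided.
    bool validateAtServer(const std::string& url) { return false; }
    void downloadFresh(const std::string& url) {
        std::cout << "refresh " << url << "\n";
    }

    // One tick of AUC's background refresher: validate up to ratePerTick
    // cached objects in round-robin order, downloading only the ones the
    // server reports as modified.
    void prefetchTick(std::deque<std::string>& roundRobin, int ratePerTick) {
        for (int i = 0; i < ratePerTick && !roundRobin.empty(); ++i) {
            std::string url = roundRobin.front();
            roundRobin.pop_front();
            if (validateAtServer(url))   // object changed at the server
                downloadFresh(url);      // refresh the cached copy
            roundRobin.push_back(url);   // requeue for the next pass
        }
    }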

4.2 Data

We now describe the trace and synthetic data used inour experiments.

4.2.1 Trace Data

We used trace data from NLANR [13]. This data was gathered from a proxy cache in the United States in January 2002. We considered approximately 3.7 million requests made over a period of 5 days. We performed preprocessing on the NLANR trace data to prepare it for the experiments. Specifically, the trace data did not report on object modification or expiration times, which we need to make downloading decisions and to determine the recency of cached objects. Our solution to this problem was to create an "augmented" trace using the workload from the original NLANR trace data. Over a period of 5 days, we replicated the trace workload by sending requests to the servers in the traces at (approximately) the same time of day as in the original workload. The requests were made from the domain umiacs.umd.edu, which is connected to its ISP via a high speed DS3 line with a maximum bandwidth of 27 Mbps. When each requested object arrived, we logged the latency of the request and the time the object was last modified (when available). We used the logging mechanism provided by the Squid cache [5], but did not cache any objects. This augmented trace data provided the information we needed for this study.

In our trace-based experiments, we cached only objects that had last-modified information available and were not labelled uncacheable. For the TTL algorithm, to estimate the TTL of an object, we use the policy implemented in Squid [5]. When an object's last-modified timestamp is available, Squid estimates the lifetime of an object using the adaptive TTL technique [8, 17]. In adaptive TTL, an object's TTL is estimated to be proportional to the age of the object at the time it was cached. The exact value depends on a parameter UpdateThreshold. We used an UpdateThreshold of 0.05, which is representative of values used in practice [5]. We calculate TTL = (CurrentTime - LastModifiedTime) * UpdateThreshold. An object's ExpirationTime = CurrentTime + TTL. If this estimate exceeds a default maximum value DefaultMax, then an object's TTL is estimated as DefaultMax. As in the Squid cache implementation, we use a DefaultMax of 3 days.

For the Profile algorithm, we need estimates of the latency and recency of objects to make a downloading decision. We estimated the latency of an object as the average latency over all previous requests, which was shown to perform well in [1]. To estimate the age of cached objects, we defined UpdateInterval as ExpirationTime - LastModifiedTime, and defined the age of a cached object as (CurrentTime - LastModifiedTime) / UpdateInterval.
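Both estimates follow directly from the formulas above. The following C++ sketch is ours; it assumes time values are given in seconds:

    #include <algorithm>

    const double kUpdateThreshold = 0.05;        // Squid-style parameter
    const double kDefaultMax = 3 * 24 * 3600.0;  // DefaultMax of 3 days

    // Adaptive TTL: proportional to the object's age at caching time,
    // capped at DefaultMax.
    double estimateTTL(double currentTime, double lastModifiedTime) {
        double ttl = (currentTime - lastModifiedTime) * kUpdateThreshold;
        return std::min(ttl, kDefaultMax);
    }

    // Estimated age (in updates) of a cached object, where
    // UpdateInterval = ExpirationTime - LastModifiedTime.
    double estimateAge(double currentTime, double lastModifiedTime,
                       double expirationTime) {
        double updateInterval = expirationTime - lastModifiedTime;
        if (updateInterval <= 0) return 0.0;  // guard (ours) for degenerate input
        return (currentTime - lastModifiedTime) / updateInterval;
    }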

For AUC, all cache hits were served directly from the cache, and we validated objects in the background at a specified PrefetchRate. We considered AUC with two different prefetch rates: 60 objects per minute (AUC-60) and 300 objects per minute (AUC-300). Note that for TTL and Profile we did not perform any prefetching in this study.

On a cache hit, we need to determine if an object is fresh or stale. For all schemes, an object was fresh if the object's last-modified time was unchanged since the previous request. For TTL and Profile, an object was stale if its last-modified time had changed. For AUC, we also need to consider the effects of prefetching. If the object's last-modified time had changed and was more recent than the time the object was last prefetched, the object was stale. Otherwise it was fresh.

4.2.2 Synthetic Data

To complement our trace results and study the performance of profiles, we also performed simulation studies using synthetic data, where we control updates at remote servers and use more accurate age information. We used the following parameters to generate the synthetic data:

- Update Interval is the average length of time between consecutive updates. In our simulation this value ranged from once every 10 minutes to once every 2 hours.

- Estimated Latency is the expected end-to-end latency of downloading the object from the remote server. We modeled the latencies of objects using latency distributions from NLANR traces [13]. To reduce the effects of network and server errors in this data, we considered only requests with latencies of less than 5000 msec. The distribution of these values was highly skewed, with a median of approximately 200 msec and a mean of approximately 500 msec. 90% of the requests had latencies less than 1400 msec.

- Workload is the average number of requests per minute. We report on a workload of 8 requests/sec (480 requests/minute), which is representative of many cache workloads [13]. We ran simulations for 6 hours of simulation time, for a total of approximately 172800 requests.

- World Size: We considered a world of 100,000 objects with popularity following a Zipf-like distribution. The i-th most popular object had popularity proportional to 1/i^alpha, where alpha is a value between 0 and 1.0. We generated a distribution with alpha = 0.7, which was typical of traces analyzed in [3] (a sampling sketch follows this list).
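Requests can be drawn from such a Zipf-like popularity distribution by inverse-transform sampling over the normalized weights 1/i^alpha. This C++ sketch is our own illustration, not the authors' workload generator:

    #include <algorithm>
    #include <cmath>
    #include <random>
    #include <vector>

    // Cumulative popularity weights proportional to 1/i^alpha for
    // objects ranked 1..n (alpha = 0.7 in our experiments).
    std::vector<double> zipfCdf(int n, double alpha) {
        std::vector<double> cdf(n);
        double sum = 0.0;
        for (int i = 0; i < n; ++i) {
            sum += 1.0 / std::pow(i + 1.0, alpha);
            cdf[i] = sum;
        }
        for (double& c : cdf) c /= sum;  // normalize to [0, 1]
        return cdf;
    }

    // Draw one object id (0-based rank) by inverse-transform sampling.
    int sampleObject(const std::vector<double>& cdf, std::mt19937& rng) {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        return int(std::lower_bound(cdf.begin(), cdf.end(), u(rng))
                   - cdf.begin());
    }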

We note that for TTL and Profile, for the synthetic data we assumed that the cache had accurate expiration times (TTL estimates) for all objects. We use the trace data to compare TTL, AUC, and Profile in the real-world case where estimates are often inaccurate.

4.3 Setup and Metrics

We implemented our simulation environment in C++. We ran simulations and experiments with trace data on a Sparc 20 workstation running Solaris 2.6. We assumed the cache was initially empty.

For the synthetic data, we ran simulations for 2 hours of simulation time to warm up the cache, then ran them for an additional 6 hours. For the trace data, we used the first 12 hours of the trace to warm up the cache, then collected data on the remainder of the trace. We repeated each simulation 10 times to verify the accuracy of our results, and validated that our results satisfied the 95% confidence intervals. For both trace and synthetic data, we consider cache sizes ranging from 1% of the world size to an infinite cache. We first report on results for an infinite cache. We then consider the effects of varying cache size on the performance of all approaches. We used the Least Recently Used (LRU) policy to replace objects when the cache was full; this is commonly used in practice [5].

We report on the following metrics:

- Validation messages (vals): This is the number of messages that were sent between the cache and remote servers. For TTL, AUC, and Profile, a validation message is sent from the cache to a server to check for updates. The requested object was only downloaded if it had actually been updated. For SSI-Msg, a validation message is sent from a server to a cache to invalidate cached objects. Messages are typically much smaller than the actual objects. We note that for SSI-Obj, the server sends the actual objects to the cache, so no messages are sent.

- Downloads (Useful Validations): This is the number of requested objects that were validated and subsequently downloaded because they were stale in the cache. For SSI-Obj, this includes all objects that were updated at remote servers and sent to the cache. For SSI-Msg, TTL, and Profile, this includes requested cached objects that were not sufficiently recent in the cache. For AUC, this includes objects that were prefetched (validated) in the background and were downloaded because they were stale.

                SSI-Obj   SSI-Msg   TTL
    Val. Msgs   0         161768    67170
    Downloads   161768    67170     67170
    AvgAge      0         0         0
    StaleHits   0         0         0

                AUC-60    AUC-300   Profile
    Val. Msgs   21600     100797    15932
    Downloads   16158     75833     15932
    AvgAge      2.09      0.61      1.38
    StaleHits   112492    74548     96281

Table 1: Results for Experiments with Synthetic Data

- Useless Validations: For the trace data, we also report on useless validations. These are objects that were in the cache and were validated at the remote server, but had not actually been modified since they were cached. Since useless validations add unnecessary latency to requests, it is important to minimize this number.

- Stale Hits: For the trace data, this is the number of objects served from the cache (without validation) that had actually been updated at the remote server.

- Age: This is the average age of objects delivered to clients, i.e., the number of times they were updated at the remote server. Objects that were downloaded from a server always had an age of 0.

- Latency: This is the average latency of the requests, in msec.

4.4 Comparison of Profile to Existing Technologies

Our first set of results shows the benefits of using Profile for an infinite cache. We first show simulation results using synthetic data. We then use the trace data to compare Profile to TTL and AUC. The trace data reflects the situation when TTL estimates are inaccurate, which is often the case in practice. We do not study SSI on the traces since it requires server cooperation.

In these experiments, all clients used a single profile with TR = 1 update and TL = 1 second. The other settings were w = 0.5 and KL, KR = 1. Recall that with w = 0.5, neither latency nor recency is favored in the tradeoff.

                   TTL      AUC-60   AUC-300   Profile
    Val. msgs      151367   378312   1891560   92943
    Useful vals    24898    933      2810      22896
    Useless vals   122074   279349   327776    67601
    Avg Est. Age   0        18.4     11.1      0.87
    Stale Hits     4282     31285    22897     7704

Table 2: Results for Experiments with Trace Data

The number of validations and downloads for the simulation study with synthetic data is shown in Table 1. The first observation is that SSI-Obj consumes the most bandwidth because it sends a large number of objects to the cache to keep the cache up to date. This is shown in the Downloads row. SSI-Msg and AUC-300 also consume significant amounts of bandwidth compared to Profile. While they download fewer objects than SSI-Obj, they still send many validation messages. In contrast, Profile performs fewer validations and fewer downloads than all other approaches. We will use the trace data to further quantify the bandwidth savings of Profile relative to TTL and AUC.

The average ages of objects and the number of stale hits are also shown in Table 1. These results show that while AUC-60 and Profile have a comparable number of downloads, AUC-60 achieves this at the cost of delivering significantly less recent data. AUC-60 delivers objects with an average age of 2.09 updates, compared to 1.38 updates for Profile. AUC-60 also produces nearly 20% more stale hits than Profile. AUC-300 provides better recency (0.61) than Profile. However, it does so at the cost of validating over six times as many objects as Profile (100797 vs. 15932) and downloading nearly five times as many objects (75833 vs. 15932).

Our trace results further compare TTL, AUC, and Profile in the real-world case where TTL estimates are often inaccurate. Table 2 shows the number of validations for TTL, AUC, and Profile for an infinite cache. The first observation is that both variants of AUC validate significantly more objects than either TTL or Profile. Recall that AUC validates objects at the specified PrefetchRate. TTL also validates many more objects than Profile.

The number of useful and useless validations is shown in the second and third lines of Table 2. (In some cases the trace did not contain a last-modified date to determine whether a validation was useful or useless, so the sum of these values is less than the total number of validations.) We note that for AUC, for a fair comparison, we measured useful and useless validations only for prefetched objects that were subsequently requested.

A key observation is that TTL performs nearly twice as many useless validations as Profile (122074 vs. 67601). In these cases, TTL adds latency to requests without improving recency. In contrast, Profile can significantly reduce the number of useless validations (by approximately 60,000) with only a small increase in the number of stale hits (approximately 4000 more than TTL). Another key observation is that both variants of AUC perform many more useless validations than either TTL or Profile. Further, AUC performs very few useful validations for objects that are subsequently requested (fewer than 3000 for AUC-300 vs. 22896 for Profile). Thus, AUC can consume large amounts of bandwidth to keep the cache refreshed while doing little to improve the recency of data delivered to clients.

Since the trace data does not indicate how many times servers actually modified objects, we must estimate the age, i.e., the number of times a stale object was updated at a server since it was cached. We compute age = (CurrentTime - LastModifiedTime) / UpdateInterval, where UpdateInterval is estimated as defined in Section 4.2. While this is an estimate, it gives an idea of how out of date the stale objects were.

Both variants of AUC prefetch a large number of objects, while still delivering many stale objects to clients. The average estimated age of the stale hits is shown in the last line of Table 2, and shows that AUC can deliver very out-of-date objects. This is because the prefetching strategy for AUC prefetches all objects with equal frequency, which may cause frequently updated objects to become very out of date. While this prefetching strategy is near optimal for minimizing the number of stale hits [9], our results clearly show that AUC may nevertheless result in very stale data. Thus, prefetching may not be appropriate for applications that cannot tolerate stale data, especially when the data is updated frequently.

4.5 Effect of Cache Size

We now use the trace data to measure the performance of TTL, AUC, and Profile for varying cache sizes. We varied the relative cache size from 1% of the world size to 100% of the world size (i.e., an infinite cache). We show that Profile can better utilize cache size to reduce latency (compared to TTL) and to reduce age (compared to AUC).

Figure 6: Effect of Cache Size on Average Latency. [Plots average latency (msec) against relative cache size (%) for Profile, AUC, and TTL.]

The average latencies for Profile and the baseline algorithms are plotted in Figure 6. The first observation is that both Profile and AUC make better use of an increased cache size to reduce latency. While increasing the cache size increases the number of objects that can be cached, objects that expire in the cache must always be validated under TTL. Increasing the cache size does not decrease the number of stale objects in the cache, so TTL does not benefit significantly from a larger cache. In contrast, Profile and AUC can benefit more from an increased cache size: while objects in the cache may be stale, they may still be useful to some clients.

Figure 7: Effect of Cache Size on Number of Stale Hits. [Plots the number of stale hits against relative cache size (%) for Profile, AUC-60, AUC-300, and TTL.]

Figure 7 shows the number of stale hits. As the cache size increases, the stale hits for both AUC-60 and AUC-300 increase dramatically. This is because prefetching for AUC does not scale well, and a greater number of client requests are serviced by (possibly stale) cached objects, so the number of stale hits increases. This shows that the reduced latency of AUC comes at the high cost of delivering very stale data. In contrast, the number of stale hits for Profile increases by a much smaller amount. In summary, AUC cannot utilize a large cache size to reduce age and delivers very stale data. Similarly, TTL cannot utilize a larger cache size to reduce latency. In contrast, Profile is flexible and can exploit an increasing cache size to reduce both age and latency.

4.6 Effect of Surges

Under normal workloads, there is typically sufficient bandwidth and server capacity to handle all requests. However, from time to time networks or servers may experience "surges", i.e., periods of time during which the demand exceeds the available resource capacity. During surges, many requests will be backlogged and their processing may be delayed significantly. As an example, we consider the case where there is insufficient bandwidth between a cache and the servers. In this case, the servers will attempt to deliver many objects simultaneously, which will cause delays in delivering the objects to the cache. This could occur in a proxy cache if a surge in remote requests saturates the bandwidth between the Internet and the cache. It could also occur in an application server cache if many clients make requests to the server simultaneously.

Our final experiment is a simulation that compares Profile to TTL in the presence of surges. A surge is represented by a capacity ratio, the ratio of the available resources per second to the resources required per second. For example, during a surge period, if a server can handle 10 requests per second and requests arrive at the rate of 20 requests per second, then the capacity ratio during this period is 1/2. A capacity ratio of 1 means there are sufficient resources to handle all requests, and requests will incur no extra delay as a result of the surge. However, if this ratio is less than 1, performance can severely degrade.

We consider two groups of clients. The first group, MostRecent, has TR = 0 updates and TL = 1 sec. The second, LowLatency, has TR = 1 update and TL = 0 sec. For both groups, the other settings were w = 0.5 and KL, KR = 1. In our simulation, we considered a surge with a duration of 30 seconds. The request rate is 100 requests per second. We vary the available capacity from 20 to 100 objects per second, i.e., the capacity ratio varies from 0.2 to 1.0. For simplicity, we assume no requested objects are evicted from the cache during the surge period. We warmed up the cache for 10000 requests at a non-surge workload of 8 requests/sec, then began the surge period and gathered data.

Figure 8: Avg. Latency during a 30-sec. surge period. [Plots average latency (sec) against capacity ratio from 1.0 down to 0.2 for TTL, Profile-MostRecent, and Profile-LowLatency.]

Figure 8 plots the average latencies of all requests. For TTL, all requests are treated equally, and all objects that have expired in the cache are downloaded. As expected, the latency is very high, especially when the capacity ratio is below 0.5. However, Profile can distinguish between the two groups of clients and better serve their requests. As expected, the latency for some LowLatency clients is significantly lower than for TTL. This is because, when a requested object is in the cache, stale data can be delivered to the LowLatency clients. Consequently, there is more available bandwidth to serve the other clients, so the latency for the MostRecent clients also decreases compared to TTL. Thus, using Profile during a surge can significantly improve access latencies for all clients, not just those that can tolerate stale data.

5 Conclusions and Future Work

The growing diversity of clients and services on the Web requires data delivery techniques that can accommodate diverse client needs with minimal overhead to clients, caches, and web servers. This paper introduced latency-recency profiles, a scalable, tunable approach to delivering web data that is straightforward to implement for both clients and caches, and requires no cooperation from remote web servers. Our results show that profiles can significantly reduce bandwidth consumption compared to existing technologies, while also providing greater flexibility than TTL or AUC.

In future work, we plan to explore prefetching strategies that exploit profiles. For example, objects for which clients require the most recent data should be kept more up-to-date. We will also consider the effects of wide variations in profiles on performance. Finally, we plan an implementation of profiles in Squid [5], and plan to use it to further study the effectiveness of profiles.

6 Acknowledgements

This research is supported by National Science Foundation grant DMI9908137 and a National Physical Science Consortium fellowship. Bobby Bhattacharjee, Selcuk Candan, Mike Franklin, Avigdor Gal, and Samir Khuller provided useful feedback. We thank Duane Wessels and the National Laboratory for Applied Network Research for the trace data used in this study. NLANR is supported by National Science Foundation grants NCR-9616602 and NCR-9521745.

References

[1] S. Adali, K.S. Candan, Y. Papakonstantinou, and V.S. Subrahmanian. Query caching and optimization in distributed mediator systems. Proc. SIGMOD, 1996.

[2] R. Alonso, D. Barbara, and H. Garcia-Molina. Data caching issues in an information retrieval system. ACM TODS, Vol. 15, No. 3, 1990.

[3] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. Proc. IEEE INFOCOM, 1999.

[4] Oracle9iAS Web Cache. http://otn.oracle.com/products/ias/web cache/content.html.

[5] Squid Proxy Cache. http://www.squid-cache.org.

[6] K.S. Candan, W.-S. Li, Q. Luo, W.-P. Hsiung, and D. Agrawal. Enabling dynamic content caching for database-driven web sites. Proc. SIGMOD, 2001.

[7] P. Cao and S. Irani. Cost-aware WWW proxy caching algorithms. Proc. USENIX Symposium on Internet Technologies and Systems, 1997.

[8] V. Cate. Alex - a global filesystem. Proc. USENIX File System Workshop, 1992.

[9] J. Cho and H. Garcia-Molina. Synchronizing a database to improve freshness. Proc. SIGMOD, 2000.

[10] E. Cohen, B. Krishnamurthy, and J. Rexford. Improving end-to-end performance of the web using server volumes and proxy filters. Proc. SIGCOMM, 1998.

[11] A. Dingle and T. Partl. Web cache coherence. Proc. 5th WWW Conf., 1996.

[12] Yahoo! Finance. http://finance.yahoo.com.

[13] National Laboratory for Applied Network Research. ftp://ircache.nlanr.net/Traces.

[14] A. Gal. Obsolescent materialized views in query processing of enterprise information systems. Proc. CIKM, 1999.

[15] J.R. Gruser, L. Raschid, V. Zadorozhny, and T. Zhan. Learning response time for web sources using query feedback and application in query optimization. VLDB Journal 9(1):18-37, 2000.

[16] H. Gupta. Selection of views to materialize in a data warehouse. Proc. ICDT, 1997.

[17] J. Gwertzman and M. Seltzer. World wide web cache consistency. Proc. USENIX Technical Conf., 1996.

[18] V. Harinarayan, A. Rajaraman, and J. Ullman. Implementing data cubes efficiently. Proc. SIGMOD, 1996.

[19] Y. Huang, R. Sloan, and O. Wolfson. Divergence caching in client-server architectures. Proc. PDIS, 1994.

[20] T. Kroeger, D. Long, and J. Mogul. Exploring the bounds of web latency reduction from caching and prefetching. Proc. USENIX Symposium on Internet Technologies and Systems, 1997.

[21] A. Labrinidis and N. Roussopoulos. WebView materialization. Proc. SIGMOD, 2000.

[22] A. Labrinidis and N. Roussopoulos. Update propagation strategies for improving the quality of data on the web. Proc. VLDB, 2001.

[23] C. Liu and P. Cao. Maintaining strong cache consistency on the world wide web. Proc. ICDCS, 1997.

[24] Q. Luo and J. Naughton. Form-based proxy caching for database-backed web sites. Proc. VLDB, 2001.

[25] C. Olston, B.T. Loo, and J. Widom. Adaptive precision setting for cached approximate values. Proc. SIGMOD, 2001.

[26] H. Rozanski and G. Bollman. The great portal payoff. Booz-Allen & Hamilton, http://www.bah.com.

[27] P. Scheuermann, J. Shim, and R. Vingralek. A unified algorithm for cache replacement and consistency in web proxy servers. Proc. WebDB, 1998.

[28] BEA WebLogic. http://www.bea.com.

[29] IBM WebSphere. http://www-3.ibm.com/software/webservers/.

[30] Y. Zhuge, H. Garcia-Molina, J. Hammer, and J. Widom. View maintenance in a warehousing environment. Proc. SIGMOD, 1995.