
Cache Architecture for On-Demand Streaming on the Web

RAJ SHARMAN, SHIVA SHANKAR RAMANNA, and RAM RAMESH

State University of New York, Buffalo

and

RAM GOPAL

University of Connecticut

On-demand streaming from a remote server through best-effort Internet poses several challenges because of network losses and variable delays. The primary technique used to improve the quality of distributed content service is replication. In the context of the Internet, Web caching is the traditional mechanism that is used. In this article we develop a new staged delivery model for a distributed architecture in which video is streamed from remote servers to edge caches, where the video is buffered and then streamed to the client through a last-mile connection. The model uses a novel revolving indexed cache buffer management mechanism at the edge cache and employs selective retransmissions of lost packets between the remote and edge cache for a best-effort recovery of the losses. The new Web cache buffer management scheme includes a dynamic adjustment of cache buffer parameters based on network conditions. In addition, the performance of buffer management and retransmission policies at the edge cache is modeled and assessed using a probabilistic analysis of the streaming process as well as system simulations. The influence of different endogenous control parameters on the quality of the stream received by the client is studied. Calibration curves on the QoS metrics for different network conditions have been obtained using simulations. Edge cache management can be done using these calibration curves. ISPs can use the calibration curves to set the values of the endogenous control parameters for specific QoS in real-time streaming operations based on network conditions. A methodology to benchmark transmission characteristics using real-time traffic data is developed to enable effective decision making on edge cache buffer allocation and management strategies.

Categories and Subject Descriptors: H.3.5 [Information Storage and Retrieval]: Online Information Services—Web-based services; H.4.3 [Information Systems Applications]: Communication Applications—Videotex

General Terms: Design

Additional Key Words and Phrases: Web caching, on-demand streaming, quality of service, edge cache, buffering, selective retransmissions

Authors’ address: R. Sharman, 325 Jacobs, State University of New York, Buffalo, NY 14260; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2007 ACM 1559-1131/2007/09-ART13 $5.00 DOI 10.1145/1281480.1281483 http://doi.acm.org/10.1145/1281480.1281483

ACM Transactions on the Web, Vol. 1, No. 3, Article 13, Publication date: September 2007.


ACM Reference Format: Sharman, R., Ramanna, S. S., Ramesh, R., and Gopal, R. 2007. Cache architecture for on-demand streaming on the Web. ACM Trans. Web 1, 3, Article 13 (September 2007), 35 pages. DOI = 10.1145/1281480.1281483 http://doi.acm.org/10.1145/1281480.1281483

1. INTRODUCTION

When the public Internet is used for transmission of IP-TV and streaming video, congestion and network latency can occur at any time, significantly impacting the quality of service [Sen et al. 1999]. Further, access to content, traditionally via the Web, now comes from a wide range of heterogeneous devices that are connected to the Internet through different wireless technologies such as GPRS, UMTS, Bluetooth, and so on [Grieco et al. 2005]. The primary technique used to improve the quality of distributed content service is replication. Replication leads to improved system availability through better fault tolerance and better system scalability through distributed provisioning of content and services [Rabinovich and Aggarwal 1999]. By positioning content replicas optimally over the edge of the Internet, content delivery that might otherwise require traversal across the core of the Internet to the users can be simplified and expedited, and many of the latencies avoided [Rabinovich and Spatscheck 2002]. Hence, in the context of traditional Web content, such replication is done through Web caching. There is substantial literature in this area dealing with replacement policies, architecture, content propagation, and so on. However, the traditional Web caching mechanism of storing the entire content is too expensive in the case of streaming media. (Rabinovich and Spatscheck [2002] provide extensive insightful details on Web caching.) As broadband connections become commonplace, the demand for high-quality multimedia streaming applications over the Internet has increased enormously. Increasingly, multimedia applications are also becoming available over wireless networks through digital multimedia broadcasting (DMB) mobile phone services [Shim and Ahn Forthcoming; Shim et al. 2006; Sinha and Papadopoulos 2004]. Such applications include on-demand movies, live programs, and videoconferencing, to name a few [Berra et al. 1993].
Streaming such content online with low latency levels and jitter-free presentations is a major challenge for any service provider, as it requires enormous server, bandwidth, and storage capacities. Internet quality of service (QoS) has always been a major concern to both providers and consumers of service [Dai et al. 2003; Gupta et al. 1997]. The stochastic nature of the latency is especially important for streaming media. Wu et al. [2000] have proposed a window-based method that achieves stable throughput at a target level by utilizing a variation of the classical Robbins-Monro stochastic approximation algorithm. However, their algorithm is efficient only when the bandwidth requirements are modest, often requiring only a small fraction of the available bandwidth. High variability in customer demand and in the returns on investments renders these massive investments risky for any service provider. Therefore, topographical leveraging through a staged streaming architecture, in which video is streamed from a remote server to an edge cache where the video is buffered and then streamed to the client through a last-mile connection, is a viable option for many service providers.
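To make the Robbins-Monro idea concrete, the following toy sketch adapts a send-window size until measured throughput settles at a target level. This is an illustration of the general stochastic-approximation scheme, not the algorithm of Wu et al. [2000]; the step size a_n = 1/n, the saturating link model, and all names are our own assumptions.

```python
import random

def robbins_monro_window(measure_throughput, target, steps=500, seed=1):
    """Adapt a send-window size w so that the noisy observed throughput
    converges to `target`, using Robbins-Monro step sizes a_n = 1/n."""
    random.seed(seed)
    w = 1.0
    for n in range(1, steps + 1):
        observed = measure_throughput(w)      # noisy throughput sample
        w += (1.0 / n) * (target - observed)  # stochastic-approximation step
        w = max(w, 0.1)                       # keep the window positive
    return w

def toy_link(w):
    """Toy link: throughput grows with the window, saturates at 10 units,
    and carries zero-mean observation noise."""
    return min(w, 10.0) + random.uniform(-0.5, 0.5)

w_star = robbins_monro_window(toy_link, target=6.0)
```

Because the decreasing step sizes average out the noise, the window converges to the point where expected throughput equals the target (here, near 6 on the toy link).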

The architecture proposed in this article differs from those of players such as PPLive, SopCast, and Coolstreaming in that they were essentially built for peer-to-peer networks, where the stream is obtained from multiple servers; the client is essentially swarmed from multiple sources. The bandwidth utilization in these players is suboptimal and becomes critical especially when content is accessed from wireless devices and from networks where the last mile does not have sufficient capacity. The edge cache, by contrast, is an intermediary device that is intended to cache content closer to the end user.

The focus of this article is to develop and analyze a staged streaming strategy involving two servers to ensure the required quality of service levels at the customer end. The staged streaming strategy entails streaming and buffering from the remote server to the edge cache in the first stage, followed by a second stage in which the content is streamed from the edge cache to the client. Denoting the two servers as the remote host and edge cache servers, we develop a staged streaming model using a selective retransmission strategy from the host server and a revolving cache buffer architecture at the edge server. The architecture provides a new cache buffer management scheme and is sensitive to network conditions. The proposed architecture and the staging strategy can be extended to other network configurations of streaming as well.

On-demand streaming applications can be broadly classified into two types: near and pure on-demand streaming. Near on-demand applications use techniques such as batching, where the user has limited choices in making a request and controlling the stream. Pure on-demand streaming applications, though, have higher bandwidth requirements, allowing clients not only to request the video to be streamed but also to have complete control over the stream. In this research we develop a distributed architecture for pure on-demand streaming for the following scenario. We consider a content provider hosting a set of videos at its host server. The clients are distributed geographically, and each client is served by a local/regional ISP. Besides providing last-mile access to the clients, each ISP hosts a set of servers on the edge. The edge caches often include a few seconds of prefix cache for video clips. When a client requests a video from the content provider, the video is streamed from its host server to an edge cache of the connecting ISP through best-effort Internet. The edge server provides the last-mile delivery to the client, besides buffering the incoming stream. The architecture presented in this paper considers both situations: one where the bandwidth on the client side of the caching proxy is abundant and less prone to variability in delay, as well as one where there is congestion in the last mile and bandwidth is a problem.

Two popular techniques used to achieve such on-demand streaming are: (a) replicate all videos on edge cache servers and stream directly to the clients via last-mile connections, thereby avoiding Internet traffic; (b) peer streaming servers on the edge with assured bandwidth among them so that videos not available on a given streaming server can be obtained from another server in real time. Since the clients could be geographically distributed and the typical sizes of videos are huge, a full replication of all videos in each edge server may not be cost-effective or even feasible for many service providers. Furthermore, an ISP may not have adequate peering bandwidth resources to cover the needs of the entire distribution of its clients. Hence, both solutions (a) and (b) may not be feasible in many cases. Consequently, the challenge lies in delivering pure on-demand streaming through best-effort Internet without these assumptions.

We propose a staged streaming model to address significant lacunae in the current state-of-the-art methods. The central ideas of the proposed model are twofold: edge caching and selective retransmissions. While selective retransmission in UDP-based streaming has been extensively considered in the literature, the central contribution of this research is the management of an edge cache using a network-aware revolving buffer architecture. The revolving buffer is implemented using a circular linked list. However, the management of the buffer characteristics can be done in real time based on network conditions and the calibration methodology proposed in this paper. The buffering architecture is used to estimate the value of a retransmission in terms of its ability to deliver lost or delayed packets to the client station on time. Consequently, the buffering architecture is used to trigger and manage such potential retransmissions for maximal QoS at the client. Using the edge cache server as an intermediate buffer that also determines the flow of retransmitted packets from the host would yield better control over the streaming process. This approach would lead to enhanced levels of best-effort packet delivery at the client.
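A revolving indexed buffer of this kind can be sketched as a fixed-size ring indexed by sequence number, where slots are freed and reused as playout advances. The class below is an illustrative approximation only; the paper's actual implementation uses a circular linked list, and the names and the hard window policy here are our own assumptions.

```python
class RevolvingBuffer:
    """Illustrative fixed-size revolving buffer for an edge cache.
    Packets are indexed by sequence number modulo the buffer size, so the
    structure 'revolves' as playout advances and slots are reused."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size     # slot i holds (seq, payload) or None
        self.playout_seq = 0           # next sequence number due for playout

    def insert(self, seq, payload):
        """Accept a packet only if it falls inside the current window."""
        if self.playout_seq <= seq < self.playout_seq + self.size:
            self.slots[seq % self.size] = (seq, payload)
            return True
        return False                   # too late or too far ahead: dropped

    def pop_next(self):
        """Advance playout by one slot; return the payload, or None on a gap."""
        i = self.playout_seq % self.size
        entry = self.slots[i]
        self.slots[i] = None           # free the slot for reuse (revolve)
        self.playout_seq += 1
        if entry is not None and entry[0] == self.playout_seq - 1:
            return entry[1]
        return None                    # lost or not yet arrived

    def missing(self):
        """Sequence numbers inside the window with no packet yet: the
        candidates for a selective retransmission request."""
        return [s for s in range(self.playout_seq, self.playout_seq + self.size)
                if self.slots[s % self.size] is None]
```

For example, with a window of 4, inserting packets 0 and 2 leaves `missing()` reporting gaps at 1 and 3, which is exactly the list the edge cache would consider for retransmission requests.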

The proposed architecture consists of two broad components: a remote server application and an edge cache server application to handle pure on-demand unicast streaming of pre-encoded video through best-effort Internet. Videos that are not locally cached are streamed from a remote server to the edge cache server, which eventually serves the client over the last mile. In the event that the edge server includes a prefix cache, when a client request is received this prefix cache is played out to the client while the edge server makes a request for the remaining content. A number of techniques are available to smooth the prefix cache or trailer into the streamed content [Sen et al. 1999; Frossard et al. 2002; Shen et al. 2004; Wang et al. 2002; Chen et al. 2004]. Before playout, the edge cache parameters are set using the calibration curves (explained in Section 6). In this context, retransmission-based schemes are generally considered inappropriate for multimedia applications because of the latency involved. However, if timely retransmissions can be performed with a significant probability of success, such an approach to error control is attractive because of the little overhead it imposes on network resources and the enhanced QoS that may result from using it in conjunction with preventive error control schemes [Dempsey et al. 1996; Zimmermann et al. 2003; Zink et al. 2000]. Consequently, our proposed strategy is to employ an edge cache buffering mechanism to overcome the variability in packet transmission delays over the Internet and, at the same time, to use the buffering mechanism to trigger potential retransmissions that can be selectively used to enhance the final QoS. Jin et al. [2003] propose network-aware algorithms that make use of partial edge caching to reduce start-up latency. The contribution of that work also includes an optimal solution for populating caches with prior knowledge of request arrival rates. The model proposed in this paper has the objective of providing a comprehensive cache buffer management strategy for ISPs when retransmission of lost packets is considered. Further, the architecture proposed in this article utilizes RTP, RSCP, and RTCP. However, the model proposed by Jin et al. [2003] can be used in conjunction with the model proposed in this article. The issues of incoming stream management, edge cache buffer management, coordination between the original and retransmission streams, and the output streaming strategy together define the edge cache server application architecture. Similarly, the issues of outgoing stream (original and retransmitted) management, remote buffer management, and retransmission request handling together define the remote server application architecture. In another work, Jin et al. [2002] propose an edge cache mechanism that is not only network aware but also stream aware: it takes into account the popularity of the streaming media objects, their bit rate requirements, and the available bandwidth between clients and servers. They focus on determining the cache content with a view to minimizing average service delay using a fractional knapsack formulation. Our architecture differs in that its focus is to optimize edge cache parameters based on network conditions such as congestion, packet loss, delays, encoding rate, and so on, using a simulation optimization methodology; this paper provides a framework and architecture to do this. The architecture proposed by Jin et al. [2002] could be integrated with our work, which may possibly improve the QoS metrics of both models. However, this is an area for potential future study and is beyond the scope of this article.
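The "value of a retransmission" mentioned above boils down to a timing question: can a resent packet still arrive before its playout deadline? The sketch below uses a hard round-trip-time cutoff for illustration only; the function names, the fixed playout interval, and the safety margin are our assumptions, whereas the paper evaluates this value probabilistically.

```python
def worth_retransmitting(seq, playout_seq, playout_interval_ms, rtt_ms,
                         safety_ms=20):
    """A retransmission of packet `seq` only helps if the resent copy can
    reach the edge cache before the packet's playout deadline."""
    time_to_deadline = (seq - playout_seq) * playout_interval_ms
    return time_to_deadline > rtt_ms + safety_ms

def retransmission_requests(missing, playout_seq, playout_interval_ms, rtt_ms):
    """Filter the buffer's gap list down to requests that can arrive in time."""
    return [s for s in missing
            if worth_retransmitting(s, playout_seq, playout_interval_ms, rtt_ms)]
```

With packets played out every 10 ms and an 80 ms round trip to the remote server, gaps fewer than roughly ten packets ahead of the playout point are hopeless and are suppressed, so only far-enough-ahead losses generate requests.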

We have modeled and assessed the performance of the proposed architecture using a probabilistic analysis as well as system simulations. These analyses have yielded a set of calibration curves on the client-level QoS metrics in terms of the endogenous control parameters of the architecture and the exogenous network parameters. The methodology to benchmark transmission characteristics using real-time traffic data, to enable effective decision making on edge cache server buffer allocation and management strategies, significantly enhances the practical appeal of the proposed model.

The remainder of this article is structured as follows. Section 2 presents a review of the closely related QoS approaches to multimedia streaming and positions our work in this body of literature. Section 3 develops the proposed remote streaming architecture. The central ideas behind the proposed architecture, along with the remote server and edge cache server application designs, are discussed in this section. Section 4 develops the various system parameters and performance metrics that are used in the proposed model. A probabilistic performance model and a simulation model of the proposed architecture are developed in Section 5. The computational results with both these models are discussed along with their implications on application design and real-time streaming control in Section 6. Section 7 concludes the paper and provides future research directions.


Table I. A Taxonomy of Multimedia QoS Approaches

1. Video Compression Schemes
   a. Nonscalable Encoding Schemes
   b. Scalable Encoding Schemes
2. Continuous Media Distribution Schemes
   a. Network Filtering Methods
   b. Application-Level Multicast Methods
   c. Content Replication Methods
      I. Mirroring Strategies
      II. Caching Strategies
3. Application Layer QoS Schemes
   a. Congestion Control Methods
   b. Error Control Methods
      I. Error Resilient Encoding and Concealment Schemes
      II. Forward Error Correction Schemes (FEC) and Retransmission Schemes

2. RELATED WORK

Ensuring appropriate levels of QoS in multimedia streaming is a central theme of a vast body of literature in this area. Past efforts in this direction can be classified along several dimensions. In order to provide a more coherent view of the body of literature, we review current approaches to enhancing delivery quality on the best-effort Internet through a taxonomy [Wu et al. 2001] of multimedia QoS approaches. The schemes addressed include video compression techniques, continuous media distribution services, and application-layer QoS techniques. Using these categories as the base dimensions, a taxonomy of approaches used in the literature is presented in Table I. We briefly synthesize these developments in streaming architectures and strategies as follows. While the research on each of these categories is vast, we address some of the recent works in the following discussion.

2.1 Video Compression Schemes

Video compression schemes can be further categorized into nonscalable encoding and scalable encoding techniques. Nonscalable techniques (also known as "base only") encode videos into a single compressed bitstream, whereas a scalable video encoder compresses a video stream into multiple substreams. One of the compressed substreams is the base substream, which can be independently decoded. The other compressed substreams are enhancements, which can only be decoded together with the base substream and provide better quality. The complete bitstream (i.e., the combination of all the substreams) provides the highest quality. Scalability of quality, image size, or frame rate is called SNR, spatial, or temporal scalability, respectively. These three scalabilities are the basic scalable mechanisms, and combinations of them are possible. Compression schemes have been discussed in Girod et al. [1995] and Li et al. [1999]. Rejaie et al. [2000] present a compression mechanism for layering video in the context of congestion control. Conklin et al. [2001] address various problems associated with video compression techniques in streaming over the Internet. Hsiao et al. [2001] show how TCP can be modified to accommodate hierarchically layered video streams, while Kangasharju et al. [2002] develop mechanisms for layered encoding using caches.
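The layering constraint described above — the base substream decodes on its own, each enhancement is useful only on top of every layer below it — can be illustrated with a simple greedy selector. This is a hypothetical sketch; the layer rates and the bandwidth-fitting policy are our own assumptions, not a scheme from the cited works.

```python
def select_layers(layer_rates_kbps, available_kbps):
    """Pick the decodable prefix of substreams that fits the bandwidth.
    layer_rates_kbps[0] is the base substream; an enhancement layer is only
    included if every layer below it is also included."""
    chosen, total = [], 0
    for i, rate in enumerate(layer_rates_kbps):
        if total + rate > available_kbps:
            break                      # higher layers are useless without this one
        chosen.append(i)
        total += rate
    if not chosen:
        return None                    # even the base layer does not fit
    return chosen
```

With a 300 kbps base and two 200 kbps enhancements, a 600 kbps link carries the base plus one enhancement, while a link below 300 kbps cannot carry the stream at all.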

2.2 Continuous Media Distribution Schemes

Continuous media distribution schemes can be further classified into network filtering, application-level multicast, and content replication methods. These schemes have been developed to achieve significant QoS as well as improved streaming efficiency over best-effort Internet. Network filtering aims to maximize video quality during network congestion. Such filters can adapt the rate of video streams according to the network congestion status. Network filters are placed suitably by service providers to achieve the best results. Network filtering, as discussed in Hemy et al. [1999] and Karrer and Gross [2001], addresses several factors that affect the quality of multimedia streams during handoffs among the various nodes in a transmission path. Application-level multicast is aimed at building a multicast service on top of the Internet to support large-scale content delivery. Application-level multicast addresses the issues of scalability, network management, deployment, and support for higher-layer functionality (e.g., error, flow, and congestion control), which cannot be achieved by IP-level multicast. In this context, McCanne et al. [1996] present layered multicast protocols, and Rizzo [2000] develops a scalable and stable TCP-based multicast congestion control mechanism. Content replication is used to improve the scalability of the media delivery system. Content replication primarily reduces the load on the streaming server, besides decreasing the latency for clients by increasing the availability of media content. Content replication techniques include two broad categories: mirroring and caching. Mirroring is used to place copies of the original multimedia files on other machines scattered around the Internet. Cache sharing and cache hierarchies allow each cache to access files stored at other caches so that the load on the origin server can be reduced and network bottlenecks can be alleviated. Major works in this field include Bouazizi et al. [2003], Fu and Vahdat [2002], Miao et al. [1999], Mourad [1996], Sen et al. [1999], and Rejaie et al. [2000].

2.3 Application Layer QoS Control Schemes

Application-layer QoS control schemes have been proposed to cope with varying network conditions and the different presentation qualities requested by users. Buffer management and admission control at the application layer are also important considerations in multimedia streaming presentations, and several approaches have been proposed in this regard [Li et al. 1999; Balkir and Ozsoyoglu 1998]. The application-layer techniques can be further categorized into congestion control schemes and error control schemes. Congestion control schemes are employed to prevent packet loss and reduce delay. For streaming video, congestion control usually takes the form of rate control. Rate control minimizes the possibility of network congestion by matching the rate of the video stream to the available network bandwidth. Congestion control schemes have been discussed in Eleftheriadis and Anastassiou [1995]. Error control schemes, on the other hand, are used to improve video presentation quality in the presence of packet loss. Error control mechanisms can be further categorized into error-resilient encoding, error concealment, forward error correction (FEC), and retransmission schemes.

2.3.1 Encoding, Concealment, and Correction Schemes. The objective of error-resilient encoding is to enhance the robustness of compressed video to packet loss. However, most of these schemes, such as resynchronization marking, data partitioning, and data recovery, are targeted at error-prone environments such as wireless channels and may not be applicable to the Internet environment [Tan and Zakhor 2001]. However, some research exists in the area of network-aware Internet video encoding [Briceno et al. 1999; Jin et al. 2002; Jin et al. 2003], and research in this area is ongoing. On a comparative note, research on error concealment strategies is more extensive. Wada [1989] discusses a few error concealment schemes. However, these schemes are not strictly error control strategies because they do not actually recover lost data, but rather create an approximation based on information before and/or after the loss. High-quality concealment algorithms might be expensive for high-bandwidth applications and may necessitate the use of specialized hardware, as suggested in Papadopoulos et al. [1996]. Lu and Christensen [1999] propose a selective buffering strategy using error concealment to improve the quality of MPEG videos. Varadarajan et al. [2002] introduce the notion of error spreading in continuous media streaming and develop a packet permuting algorithm to effectively conceal errors. Cuetos and Ross [2003] present a framework for scheduling decisions at the sending server that specifically accounts for error concealment in video reproduction at a client in lossy networks. The forward error correction (FEC) approach is closely related to error concealment; in FEC, redundant information is added to a packet so that the original packet can be recovered in the presence of packet loss. FEC techniques have been discussed in Wu et al. [2004], Bolot and Turletti [1996], Albanese et al. [1996], Nonnenmacher et al. [1998], Puri et al. [2000], and Tan and Zakhor [2001].

2.3.2 Retransmission Schemes. Retransmission is usually not considered a viable option for multimedia streaming to recover lost packets, since a retransmitted packet may not reach the client within its playout time. However, selective retransmission, also known as delay-constrained retransmission, which suppresses requests that will not arrive in time, is considered a very good approach. Selective retransmission is often compared to FEC as an alternate mechanism for enhancing QoS. In this context, although FEC reduces transmission latency, its disadvantages include a significant increase in transmission rate and hence greater bandwidth requirements. Also, unlike FEC, which adds redundancy regardless of correct receipt or loss, a retransmission-based scheme resends only the packets that are lost. Thus, a retransmission-based scheme is adaptive to varying loss characteristics, resulting in efficient use of network resources.
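The bandwidth side of this FEC-versus-retransmission trade-off can be put in back-of-envelope terms. The sketch below assumes a single retransmission round and independent losses, and the function name and parameters are illustrative rather than drawn from any cited scheme.

```python
def extra_bandwidth_kbps(stream_kbps, loss_rate, scheme, fec_redundancy=0.10):
    """Rough expected bandwidth overhead of the two error-control schemes.
    Assumes one retransmission round and independent packet losses."""
    if scheme == "fec":
        # Redundancy is sent whether or not packets are actually lost.
        return stream_kbps * fec_redundancy
    if scheme == "retransmission":
        # Only lost packets are resent, so overhead tracks the loss rate.
        return stream_kbps * loss_rate
    raise ValueError("unknown scheme")
```

For a 1 Mbps stream with 2% loss, FEC provisioned at 10% redundancy costs about 100 kbps of extra bandwidth at all times, whereas selective retransmission costs only about 20 kbps — the adaptivity claimed above — at the price of at least one extra round-trip of latency per recovered packet.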

In this research, we develop a novel scheme for streaming control that employs selective retransmissions. The selective retransmission works in tandem with the buffer management scheme proposed in this paper, which allows ISPs to set the endogenous parameters based on network conditions using the calibration curves. Before we discuss other related work on retransmissions, we should mention that some of the techniques mentioned above, such as FEC, scalable encoding, or various application-layer techniques such as multicasting, can be used in conjunction with the proposed revolving indexed buffer architecture. These are important and effective extensions to the work presented in this paper.

Piecuch et al. [2000] propose a selective retransmission protocol (SRP) to balance the loss found in UDP and the latency found in TCP. SRP uses an application-specific decision algorithm to determine whether a retransmission request should be sent, adjusting loss and latency to the optimum levels for the application. Mulabegovic et al. [2002] propose a new streaming protocol called the Lightweight Streaming Protocol (LSP). LSP uses an estimated round-trip time (ERTT) to selectively request retransmissions. It also addresses network congestion by selectively dropping frames, thereby reducing the transmission rate. However, both these protocols use a client-server model and do not provide services such as synchronization and multiplexing as in RTP/RTCP.

Papadopoulos et al. [1996] present an error control scheme for continuous media applications based on retransmissions. The client-side buffer is a simple FIFO queue, and the determination of which packets to request for retransmission is made by maintaining a smoothed RTD (round-trip delay) parameter. Hasegawa et al. [1996] develop a video retrieval protocol that incorporates both prefetch and selective retransmissions such that lost or delayed packets are obtained before the playout time. Wang and Bhargava [1997] employ selective retransmissions in a method for transmitting large multimedia objects over the Internet. Their approach combines retransmissions with packet size reduction and multipass transmission. Li et al. [1999] propose the LVMR (Layered Video Multicast with Retransmissions) algorithm. This algorithm addresses the network congestion problem using layered video coding techniques, allowing each receiver to subscribe to a subset of the video layers according to its processing power and available network bandwidth. The model uses retransmissions in such a layered environment. Lee and Lee [1998] develop a retransmission scheme for mission-critical MPEG streams by employing virtual buffers that keep track of error propagation ranges and packet characteristics such as size and type. Yamaguchi et al. [2000] propose a transport layer protocol termed R3TP. This protocol provides high-quality real-time data transfer by adopting ATM Block Transfer (ABT) and retransmission-based error control; the model, however, is realized only on IP over ATM networks. Anjum and Jain [2000] study the option of using link layer retransmissions to improve TCP performance over lossy networks. Nithish et al. [2002] propose an RTP-based retransmission scheme using network processor-based routers (NPRs). Such intermediate routers have the capability of buffering and retransmitting packets upon request. This work addresses the important issue of latency involved with an end-to-end method for determining the necessity of retransmission. However, deploying such routers would require a significant change in the existing network infrastructure. The edge server model that we propose here is a better option because no such changes are required.

Since intermediate buffering is not considered in most of these approaches, the efficacy of retransmissions is questionable. This is because the client playing stations have to cope with the stringent time constraints of continuous media traffic and usually have very little time to recover lost packets while maintaining continuous play, or must compensate with a large client-side buffer, which would mean a substantial increase in startup latency. Hence, an efficient intermediate buffering technique is crucial to taking advantage of the retransmission strategy. Furthermore, instead of proposing a new streaming protocol to address these problems, we develop a streaming model that uses the widely adopted streaming standards RTP, RTCP, and RTSP, as suggested by a few recent Internet drafts on using RTCP-based feedback for retransmissions. This compliance with well-known standards renders the proposed application layer methodology easily implementable, transportable, and scalable. We have used RTCP-based feedback for the determination of potentially useful retransmission requests and for the management and control of the retransmission process in our model. However, we do not specify the details of the packet formats to accommodate retransmissions, as this is beyond the scope of this work. Interested readers may refer to the latest Internet drafts on RTP/RTCP for the proposed packet formats.

3. REMOTE STREAMING ARCHITECTURE

In order to simplify the presentation, we consider the following streaming scenario. The streaming process entails a remote server, a local edge cache, and a client. The remote server streams to the edge cache over the Internet; the edge cache server buffers the stream and then transmits it to the client over a last-mile connection. Videos are assumed to be pre-encoded at different encoding rates and maintained at the remote server. Encoding rate is a measure of video quality; higher rates imply greater video quality. Accordingly, an appropriate encoding rate for a video is chosen based on the available last-mile bandwidth and the client requirements. In a given session, let ER (packets/sec) denote the encoding rate chosen. Let SR denote the rate at which the remote server streams the data to the edge cache. Due to network effects (congestion and packet losses), the edge cache server receives the incoming stream at a rate less than or equal to SR. Assuming that the edge cache server streams out at the same rate at which it receives, the client also receives the data at a rate less than or equal to SR. We denote this rate as the client realized rate CR. The client then plays back at rate ER, using a local buffer to compensate for the mismatch between CR and ER, with media players such as RealPlayer and Windows Media Player.

3.1 Streaming Architecture: An Integrated View

An integrated view of the proposed streaming architecture is presented in Figure 1. The streaming architecture consists of two central application components, one at the remote server and the other at the edge cache server. The user interacts with the edge cache server application from the client player through an RTSP channel. RTSP runs over TCP (which is not a persistent connection), whose default port number on the server side is 554. The actual transmission of video data is over UDP channels using the RTP/RTCP protocols, which come in pairs. First, a separate UDP channel pair supporting RTP/RTCP is established for the transmission from the edge cache server to the client. This is denoted as Channel 1. Next, two UDP channel pairs of communication are established between the remote and edge cache servers. We call them Channels 2 and 3. Channel 2 is used for the original stream and Channel 3 for the retransmission stream.

Fig. 1. Communication channels for the proposed streaming architecture.

Each of these channels is established through an initial RTSP handshake and terminated by the same RTSP route. Note that unlike RTSP channels, these are persistent connections and remain open throughout the entire streaming session. In each channel, RTP uses an even-numbered UDP port (2n) and the corresponding RTCP uses the immediately following odd-numbered UDP port (2n + 1). Furthermore, RTCP2 and RTCP3 are not used for any specific purpose in the proposed architecture; therefore they can be set to UDP port number 0.
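As an illustration of this port-pairing convention, the following sketch derives a channel's (RTP, RTCP) UDP port pair; the function name, port values, and return shape are our own, not part of the proposed architecture:

```python
def allocate_channel(base_even_port: int, rtcp_used: bool = True):
    """Return the (RTP, RTCP) UDP port pair for one channel.

    RTP is bound to an even-numbered port 2n; its RTCP companion uses
    the next odd port, 2n + 1.  A channel whose RTCP side is unused is
    reported with RTCP port 0, as described for RTCP2 and RTCP3 above.
    """
    if base_even_port % 2 != 0:
        raise ValueError("RTP must be bound to an even-numbered port")
    rtcp = base_even_port + 1 if rtcp_used else 0
    return base_even_port, rtcp

# Channel 1 (edge cache -> client) keeps its RTCP side;
# Channels 2 and 3 (remote -> edge cache) may leave RTCP unused.
channel1 = allocate_channel(5004)
channel2 = allocate_channel(5006, rtcp_used=False)
```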

Figure 2 shows the proposed RTSP streaming control mechanism. Note that for every RTSP message sent to the edge cache server from the client, the edge cache server acts as an application proxy and forwards the message to the remote server. Further, the RTSP session between the remote and edge cache servers embeds the two streams, original and retransmission, between its setup and teardown. A high-level algorithmic architecture¹ of the integrated set of components is described in Section 5.

In the following discussion, we develop three further specific views of the architecture: a remote server view, an edge cache server view, and an edge cache server buffer design and management view.

¹A detailed C-style pseudocode of the entire application can be obtained by requesting the lead author.


Fig. 2. Proposed RTSP streaming control.

Fig. 3. Remote server view of the architecture.

3.2 Streaming Architecture: Remote Server View

The streaming component architecture at the remote server is shown in Figure 3. This architecture consists of four separate threads that are bound to different transport layer sockets. These threads are denoted as: stream control remote interface (rs control interface), transmitter (rs app out), request receptor (rs retran in), and retransmitter (rs retran out). We describe the internals and functionalities of these threads below.

Stream control remote interface. The stream control remote interface receives and interprets RTSP requests from the edge cache server. Upon receiving a setup request, it checks whether the client can be streamed using a client admission control algorithm. If the client can be streamed, then it allocates the necessary resources for the session. It also maintains the state of the stream session at the remote server.

Fig. 4. Edge cache server view of the architecture.

Transmitter. The transmitter reads the pre-encoded video file. Chunks of video data retrieved from the pre-encoded file are encapsulated in RTP packets, and each RTP packet in turn is encapsulated in a UDP segment. The transmitter sends the RTP packets as a series of datagrams at a constant rate. As the packets are sent out on RTP Channel 2, a copy of each packet is stored in a FIFO (First In, First Out) buffer. The packets remain in the buffer until they have been streamed from the edge cache server, so that if a retransmission request arrives, they can be streamed again from the buffer without having to be read from disk.

Request receptor. The request receptor waits for retransmission requests from the edge cache server and passes them to the retransmitter.

Retransmitter. The retransmitter receives requests from the request receptor and retransmits the requested packets by retrieving them from the FIFO buffer.
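The interplay of the transmitter, request receptor, and retransmitter around the FIFO buffer can be sketched as below. This is a minimal illustration under our own naming and trimming assumptions, not the authors' implementation:

```python
from collections import OrderedDict

class RetransmissionBuffer:
    """FIFO store of recently sent RTP packets at the remote server.

    Packets stay in the buffer until the edge cache has streamed them
    to the client, so a retransmission request can be served from
    memory rather than by re-reading the video file from disk.
    """
    def __init__(self):
        self._packets = OrderedDict()   # seq -> payload, in send order

    def store(self, seq: int, payload: bytes):
        # Called by the transmitter as each packet goes out on Channel 2.
        self._packets[seq] = payload

    def retransmit(self, requested_seqs):
        # Called by the retransmitter for each request from the edge cache;
        # sequence numbers already released are silently skipped.
        return [(s, self._packets[s]) for s in requested_seqs
                if s in self._packets]

    def release_up_to(self, seq: int):
        # Drop packets already streamed from the edge cache to the client.
        # Assumes packets were stored in increasing sequence order.
        for s in list(self._packets):
            if s > seq:
                break
            del self._packets[s]
```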

3.3 Streaming Architecture: Edge Cache Server View

The streaming component architecture at the edge cache server is shown in Figure 4. The architecture consists of six separate threads, besides a revolving indexed buffer, a pooled buffer, and an index table. The threads are denoted as: stream control edge interface (es control interface), receiver (es app in), transmitter (es app out), receptor (es retran in), requester (es retran out), and policy manager. The first five are independent threads bound to different transport layer sockets. Their internals and functionalities are briefly discussed below. The policy manager is discussed in the next section, after the buffer design is described. The index table is maintained to reflect the current set of packet sequence numbers in the indexed buffer and the packets that have already been requested for retransmission.

Stream control edge interface. The stream control edge interface interprets RTSP messages. When a setup request is received, it checks whether the client can be admitted using a client admission control algorithm. If the video is locally available, it sends a reply back to the client and, upon a play request, starts streaming by starting the transmitter thread. In the case of a request for a video held at a remote server, a setup request is sent to the remote server; upon a positive response from the remote server, the interface starts all the remaining four threads. It allocates resources (bandwidth, port numbers, buffer space, etc.) before streaming begins. It interprets all RTSP requests and passes them on to the remote server.

Receiver. The receiver receives packets from the remote server on RTP Channel 2. When a packet is received, it checks the sequence number and inserts the packet into the appropriate bucket in the indexed buffer (the buffer design is described in the next section). If the sequence number of an arriving packet is outside the range maintained by the current set of indexed buckets, the packet is inserted into the pooled buffer area.

Transmitter. The transmitter is responsible for streaming the RTP packets from the prefix cache buffer, followed by the packets in the indexed buffer, at a constant rate from the edge cache server to the client player. The packets are streamed bucket by bucket, in sequence.

Requester. The requester makes requests for retransmissions of packets missing from the indexed buffer, as determined by the policy manager. RTCP requests for these packets are constructed and sent to the remote server through RTCP Channel 3.

Receptor. The receptor listens on its UDP socket for retransmitted packets. When it receives a packet, it inserts it into the appropriate bucket of the indexed buffer. If it cannot find the bucket outside the active zone of the indexed buffer, it drops the packet.

3.4 Edge Cache Server Buffer Design and Management

The objectives in designing a buffer system at the edge cache server are: (a) the available buffer space should be utilized effectively; (b) the buffer space should be organized into buckets such that arriving packets can be inserted into their appropriate buckets efficiently; (c) when packets arrive out of the range of the buckets, they should be inserted into auxiliary buffers rather than dropped; (d) the buffer architecture should yield an efficient determination of which packets should be requested for retransmission and when to request them; and, finally, (e) the need for sequencing on the client player side when packets are sent from the edge cache server should be minimized. The proposed organization of the buffer space into buckets yields a partial ordering of the packets in memory. This ordering scheme enables efficient insertion of packets into buckets without sorting or searching. Furthermore, this scheme also enables efficient buffer management and control of the request-response process for retransmissions, and leads to buffer configurations that could maximize the effectiveness of using retransmissions in enhancing the QoS at the client. The proposed buffering and streaming strategy is as follows. The buffering system consists of three components: a prefix cache buffer, a revolving indexed buffer, and a pooled buffer. The prefix cache buffer contains the prefix cache for a few seconds of the video. The indexed buffer has a fixed number of buckets. Each bucket is of fixed size and is assigned a range of sequence numbers of packets that it can hold at any point in time. Let N denote the number of buckets. The buckets are indexed sequentially as i = 1, ..., N. Let [L_i(t), U_i(t)] denote the range of packet sequence numbers assigned to bucket i at any given time t. At time t = 0, the prefix cache is streamed to the client, and at the same time a request is sent to the remote server for the remainder of the video that is not in the prefix cache. Since packets are numbered sequentially, we have L_{i+1}(t) = U_i(t) + 1, i = 1, ..., N − 1, and U_i(t) − L_i(t) = S for all i, where S is the constant bucket size in terms of number of packets. When a packet arrives at the edge cache server, its sequence number is tested against the bucket ranges to determine the bucket to which it belongs. If a bucket is thus determined, the packet is inserted into it. If the sequence number of the arriving packet is less than L_1(t), then the packet has arrived too late and is discarded. If the sequence number is greater than U_N(t), then it has arrived too soon; in this case, the packet is saved in the pooled buffer, which is not indexed.

The buckets are streamed out of the edge cache server to the client in sequence. Accordingly, bucket 1 is streamed first. After its completion, the buckets are re-indexed as follows: current buckets 2, ..., N are indexed as 1, ..., N − 1, and current bucket 1 is indexed as bucket N. As a result, its packet sequence number range is changed as follows: L_N(t) = U_{N−1}(t) + 1 and U_N(t) = L_N(t) + S. When a bucket is streamed out, all packets currently available in the pooled buffer that belong to the newly designated bucket N are moved into that bucket. In this strategy, streaming is always carried out from the bucket designated as 1 in each streaming step, at constant intervals of time. The whole process follows a revolving scheme, with a bucket streamed out from one end and a new bucket added at the other end within the same buffer space. The revolving indexed buffer can be implemented as a circular array of dynamically created linked lists, where each bucket is a linked list. An index table is used to access and manage the linked lists. This is illustrated in Figure 5. The pooled buffer can be implemented as an ordered linked list. Note that the actual number of packets in a bucket at any given time is a random variable.
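A minimal sketch of this revolving scheme follows. It substitutes Python lists for the linked lists and invents its own class and method names, but it mirrors the insert/stream/re-index cycle described above, with each bucket covering S consecutive sequence numbers:

```python
class RevolvingIndexedBuffer:
    def __init__(self, num_buckets: int, bucket_size: int, low: int):
        self.N, self.S = num_buckets, bucket_size
        self.low = low                    # L1(t): lowest sequence number held
        self.buckets = [[] for _ in range(num_buckets)]
        self.pooled = {}                  # seq -> packet, for early arrivals

    def insert(self, seq: int, packet):
        i = (seq - self.low) // self.S    # candidate bucket index (0-based)
        if i < 0:
            return False                  # arrived too late: discard
        if i >= self.N:
            self.pooled[seq] = packet     # arrived too soon: pooled buffer
            return True
        self.buckets[i].append((seq, packet))
        return True

    def stream_bucket(self):
        """Stream bucket 1 to the client and revolve the index."""
        out = sorted(self.buckets.pop(0))
        self.low += self.S                # all ranges shift down one bucket
        self.buckets.append([])           # old bucket 1 becomes bucket N
        lo_n = self.low + (self.N - 1) * self.S   # new range of bucket N
        for seq in [s for s in self.pooled if lo_n <= s < lo_n + self.S]:
            self.buckets[-1].append((seq, self.pooled.pop(seq)))
        return out
```

A production version would use a circular index rather than `pop(0)`/`append`, so that revolving costs O(1); the sketch favors readability.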

Let b_i(t) ≤ S denote the actual number of packets stored in bucket i at time t. If the total buffer capacity at the edge cache server is M packets, then the number of packets stored in the indexed buffer at any point of time t is I(t) = Σ_{i=1}^{N} b_i(t), and consequently the maximum capacity available in the pooled buffer at time t is P(t) = M − I(t). The implementation of the indexed buffer and the pooled buffer as linked lists thus enables maximum utilization of the available total buffer space. We develop the final thread in the edge cache server architecture, the policy manager, in the following discussion.

Fig. 5. Implementation of the revolving indexed buffer.

Fig. 6. Zone structure of the indexed buffer.

Policy manager. The policy manager implements a retransmission-request policy and coordinates with the other threads in securing missing packets in time from the remote server. The retransmission-request policy entails two decisions: which missing packets are to be requested, and when. The policy manager looks up the index table at regular intervals of time, makes these policy decisions, and communicates them to the requester thread, which carries out the request. Each request may comprise packets missing from several buckets. The strategy of this policy is to make requests that will be useful in securing truly missing packets. The objectives of a policy are threefold: (a) the policy should not create false alarms by requesting packets that may still be on their way; (b) the policy should not miss retransmission opportunities that can be truly useful; and (c) the policy should avoid requests that involve genuinely missing packets that could not be obtained in time for the next-stage transmission to the client. In order to implement these criteria in a policy, the indexed buffer is classified into five zones, as shown in Figure 6.


First, the set of N buckets is classified into three basic zones: active, retransmission, and passive. Choosing a parameter A, let the sequence of buckets {1, ..., A} denote the active zone. Similarly, choosing a parameter P, let the sequence of buckets {P, ..., N} denote the passive zone. The sequence {A + 1, ..., P − 1} represents the retransmission zone. Whenever a retransmission request is made, only the missing packets in the retransmission zone are requested. This is because requesting packets in the passive zone could create false alarms, and requesting from the active zone may result in retransmitted packets arriving too late. For arriving packets, we employ an orthogonal classification of the same buffer space into no-insertion and insertion zones, as shown in Figure 6. If a packet belongs to a bucket in the no-insertion zone, it is discarded; otherwise it is inserted into the appropriate bucket. The reason for this is that the packet insertion thread should not contend with the transmitter for the buffer resources; otherwise, the expected steady throughput from the edge cache server to the client may be affected. Typically, the no-insertion zone is small and would entail one or two buckets at the head of the indexed buffer. Using these parametric settings, the policy manager initiates retransmission requests at constant intervals of time, denoted as T. A detailed performance analysis of these parametric settings is reported in the following sections.
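Given these zone parameters, the policy manager's periodic decision can be sketched as follows; the index-table representation (per-bucket sets of present sequence numbers plus their assigned ranges) and all names are illustrative assumptions:

```python
def select_retransmission_requests(present, ranges, A, P, already_requested):
    """Pick missing packets to request, restricted to the retransmission zone.

    present[i]: set of sequence numbers currently held by bucket i (1-based).
    ranges[i]:  (Li, Ui), the range of sequence numbers assigned to bucket i.
    Only buckets {A+1, ..., P-1} are scanned: gaps in the passive zone may
    still be in flight (false alarms), and gaps in the active zone could not
    be recovered before the next-stage transmission to the client.
    """
    requests = []
    for i in range(A + 1, P):                 # retransmission zone only
        lo, hi = ranges[i]
        for seq in range(lo, hi + 1):
            if seq not in present[i] and seq not in already_requested:
                requests.append(seq)
                already_requested.add(seq)    # index table: avoid duplicates
    return requests
```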

4. SYSTEM PARAMETERS AND PERFORMANCE METRICS

We have carried out detailed analytical investigations and simulation studies of the proposed architecture. In this section, we summarize the system parameters and performance metrics employed in these studies. The different system parameters that influence QoS performance can be broadly classified into two categories: exogenous and endogenous.

Exogenous Parameters. These parameters are not under the control of either the application designer or the runtime administrator. They pertain to the Internet environment in which remote streaming is carried out. In our models, we denote these parameters as follows: (a) average network packet loss L, expressed as a percentage of the total number of packets transmitted from the remote server; (b) end-to-end delay for any packet between the remote and edge cache servers, expressed using a delay distribution D(μ, σ²).

Endogenous Parameters. These are control variables that can be varied by the application designer, within the constraints of the system resources, to optimize streaming performance. Furthermore, a flexible implementation of the architecture would enable a runtime administrator to set the values of these parameters after assessing the exogenous parameters in any given streaming session. The endogenous variables are: remote server streaming rate (SR_RS), edge server memory size (M), bucket size (S), edge server streaming rate (SR_ES), time interval between retransmission requests (T), and retransmission range (RR). The parameter RR is the number of buckets in the retransmission zone for which retransmission requests are actually sent for missing packets. Clearly, RR = P − A.


Performance Metrics. System performance is measured at two levels: the client level and the edge server level. Client-level performance is indicative of the QoS received by the client. At the edge server level, we measure performance in terms of the usefulness of the retransmissions in improving client-level QoS. Two metrics are used to assess client-level performance: (a) QoS1, the percentage of packets transmitted in time from the edge server to the client; and (b) QoS2, the packet arrival rate realized at the client expressed as a percentage of the encoding rate. It must be noted that both QoS1 and QoS2 realized at the client are expected to be lower than at the edge server because of congestion in the last mile. Best transmission quality is achieved when both QoS1 and QoS2 are close to 100%. At the edge server level, the usefulness of retransmissions is measured by the percentage of packets that were genuinely lost during the original transmission but were recovered in time through the retransmission scheme, so that they could be delivered to the client before playout. This metric is denoted γ. Best transmission efficiency is achieved when γ is close to 100%.

5. PERFORMANCE MODELING

In this section, we first develop a probabilistic analysis of the performance of the proposed streaming architecture. Next, we outline the simulation model developed to evaluate performance empirically. The computational results from the two models are mutually consistent and are discussed in Section 7.

5.1 Probabilistic Performance Model

The probabilistic performance model provides an analytical basis for evaluating the performance impact of retransmissions on the overall quality of service. We begin the discussion by first considering the case without retransmissions.

5.1.1 Without Retransmissions. Without loss of generality, consider a bucket with S data packets, numbered packet 1, packet 2, ..., packet S. At time t = 0, the remote server begins transmitting the packets in the bucket to the edge server. The streaming rate SR_RS determines the elapsed time between the transmissions of consecutive packets. Let t_RS (where t_RS = 1/SR_RS) denote the time between successive transmissions. Therefore, packet 1 is sent at time 0, packet 2 at time t_RS, packet 3 at time 2t_RS, and packet S at time (S − 1)t_RS. Let T denote the time at which the packets in the bucket are assembled by the edge server and sent to the client. Some of the packets transmitted by the remote server may not be delivered to the client, because of either packet losses or time delays in the transmission. Let p denote the probability that a packet is not lost in transmission; it follows that p = 1 − L/100. The packet delays are characterized by the delay distribution D with mean μ and variance σ². Let F(t) denote the cumulative distribution function, that is, the probability that a packet will arrive at the edge cache by time t. The probability that the ith packet will arrive by time T is F(T − (i − 1)t_RS). Therefore, the probability that packet i arrives at the edge cache by time T and is not lost is p F(T − (i − 1)t_RS). If p_LMile denotes the fraction of packets lost in the last mile, then the expected number of packets from the bucket that are delivered to the client is given by the expression:

(1 − p_LMile) p Σ_{i=1}^{S} F(T − (i − 1)t_RS).   (1)

The quality of service metric QoS1, the percentage of packets received by the client, is then

QoS1 = (1 − p_LMile) p [Σ_{i=1}^{S} F(T − (i − 1)t_RS)] / S × 100.   (2)
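Equation (2) is straightforward to evaluate numerically. The sketch below assumes a normal delay distribution for D, so that F can be computed from the error function; the distribution choice and the parameter values are ours, for illustration only:

```python
import math

def qos1_no_retx(S, t_rs, T, p, p_lmile, mu, sigma):
    """Eq. (2): expected % of a bucket's packets delivered to the client."""
    def F(t):  # CDF of the assumed normal delay distribution D(mu, sigma^2)
        return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    arrived = sum(F(T - (i - 1) * t_rs) for i in range(1, S + 1))
    return (1.0 - p_lmile) * p * arrived / S * 100.0

# Example (assumed values): a 50-packet bucket, 1 ms packet spacing,
# 50 ms mean backbone delay, 2% backbone loss, lossless last mile.
q = qos1_no_retx(S=50, t_rs=0.001, T=1.0, p=0.98, p_lmile=0.0,
                 mu=0.05, sigma=0.01)
```

With the generous assembly deadline T = 1 s, essentially every surviving packet arrives in time, so QoS1 approaches the loss-limited ceiling of 98%; shrinking T below μ drives it toward zero.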

5.1.2 With Retransmissions. We now consider the impact of requesting retransmissions. Let the retransmission time be R, where R ≤ T. At time t = R, all packets from the bucket that have not yet arrived at the edge server are requested for retransmission. The request is sent back to the remote server, which responds by resending the packets.

For analytical convenience, we assume that the remote server retransmits all the requested packets at the same time. Since the packets have already been compiled and assembled the first time they were sent, the remote server can seamlessly respond to retransmission requests. For expository purposes, we present three cases for the retransmission analysis.

Case 1. No packet delays in the backbone, but there are packet losses. Consider any packet i. The probability that the packet does not arrive in the first transmission is (1 − p); similarly, the probability that it does not arrive in the retransmission is (1 − p). Therefore, the probability that packet i will not be delivered to the client is (1 − p)². It follows that

QoS1 = (1 − p_LMile)(1 − (1 − p)²) × 100.   (3)

The number of packets retransmitted depends on R. When R ≥ μ + (S − 1)t_RS, the expected number of packets retransmitted is (1 − p)S. When R < μ, the entire bucket of S packets needs to be retransmitted, as no packet would have arrived by the retransmission time. Assuming R ≥ μ + (S − 1)t_RS, the number of wasted packets, that is, the number of packets sent but not utilized, is given by (S + number of retransmission requests − number of packets delivered to the client). This simplifies to Sp(1 − p). Note that this is inversely related to the performance metric γ, which represents the percentage of retransmitted packets that are useful.
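As a quick numeric check of Equation (3): with p = 0.9 and a lossless last mile, one retransmission round raises delivery from 90% to 1 − (0.1)² = 99%. In code (the parameter values are assumed for illustration):

```python
def qos1_case1(p, p_lmile):
    # Eq. (3): no backbone delays, losses only; one retransmission round.
    return (1 - p_lmile) * (1 - (1 - p) ** 2) * 100

without_retx = 0.9 * 100                      # single transmission: 90%
with_retx = qos1_case1(p=0.9, p_lmile=0.0)    # one retransmission round: 99%
```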

Case 2. Packet delays, but there are no packet losses. Consider packet 1. At the retransmission time R, the probability that the packet has not yet arrived is 1 − F(R); this is the probability that it will be requested for retransmission. If the packet is requested for retransmission, the probability that the retransmitted packet will not arrive by time T is 1 − F(T − R − μ). Similarly, the probability that the original packet will not arrive by time T is 1 − F(T). Thus, the probability that packet 1 will not be delivered to the client is (1 − F(T))(1 − F(T − R − μ)). Generalizing, the probability that packet i will be in the bucket sent to the edge cache is 1 − (1 − F(T − (i − 1)t_RS))(1 − F(T − R − μ)). Therefore, the number of packets delivered to the edge cache is given by

Σ_{i=1}^{S} [1 − (1 − F(T − (i − 1)t_RS))(1 − F(T − R − μ))],

which simplifies to

S F(T − R − μ) + (1 − F(T − R − μ)) Σ_{i=1}^{S} F(T − (i − 1)t_RS).   (4)

Therefore,

QoS1 = [S F(T − R − μ) + (1 − F(T − R − μ)) Σ_{i=1}^{S} F(T − (i − 1)t_RS)] × 100 / S.   (5)

The number of retransmission requests is

Σ_{i=1}^{S} (1 − F(R − (i − 1)t_RS)).   (6)

The number of packets that were streamed but not utilized is given by (S + number of retransmission requests − number of packets delivered to the client) and can be computed from the above expressions.

Case 3. Packet delays and losses. This represents the general case and can be compiled from the previous analysis. The number of packets delivered to the client can be computed as

S p F(T − R − μ) + (1 − p F(T − R − μ)) Σ_{i=1}^{S} p F(T − (i − 1)t_RS).   (7)

From the above we obtain

QoS1 = (1 − p_LMile) [S p F(T − R − μ) + (1 − p F(T − R − μ)) Σ_{i=1}^{S} p F(T − (i − 1)t_RS)] × 100 / S.   (8)

The number of retransmission requests is

Σ_{i=1}^{S} (1 − p F(R − (i − 1)t_RS)).   (9)

This probabilistic model provides an analysis of retransmissions and their impact on the quality of service, the number of retransmissions requested, and the number of wasted packets.
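Equations (7)–(9) can be evaluated in the same way as the earlier cases. The sketch below again assumes a normal delay distribution and illustrative parameter values, returning QoS1 and the expected number of retransmission requests:

```python
import math

def case3(S, t_rs, T, R, p, p_lmile, mu, sigma):
    """Eqs. (7)-(9): packet delays and losses, one retransmission round."""
    def F(t):  # CDF of the assumed normal delay distribution D(mu, sigma^2)
        return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    g = p * F(T - R - mu)                      # retransmitted copy in time
    first = sum(p * F(T - (i - 1) * t_rs) for i in range(1, S + 1))
    delivered = S * g + (1.0 - g) * first      # Eq. (7)
    qos1 = (1.0 - p_lmile) * delivered * 100.0 / S   # Eq. (8)
    requests = sum(1.0 - p * F(R - (i - 1) * t_rs)   # Eq. (9)
                   for i in range(1, S + 1))
    return qos1, requests

# Example (assumed values): 10% backbone loss, delays well inside the
# deadline, lossless last mile.
q, reqs = case3(S=10, t_rs=0.001, T=1.0, R=0.5, p=0.9,
                p_lmile=0.0, mu=0.05, sigma=0.01)
```

For these values, delays are negligible relative to R and T, so the result reduces to Case 1: QoS1 ≈ 99% with about S(1 − p) = 1 retransmission request expected per bucket.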

5.2 System Simulation Model

A simulation model of the streaming scenario with the proposed architecture has been developed using the popular discrete-event simulation tool ARENA, along with code in the C language for processing. The simulation model replicates the staged streaming model we propose. The topology of the simulation model consists of a remote server, an edge server, and a client. Using extensive simulations of the scenario, the performance metrics of the proposed architecture have been assessed under various parametric settings. A completely randomized block design yielded 1296 independent simulations in total. Each simulation was run for 10 replications, and each run used a different random number seed. A sufficient warm-up period was allowed in each simulation run to remove any initial bias in the results; statistics were collected after the warm-up period. The performance metrics are assessed in each simulation run.

The simulations are realistic as they rely on actual field experiments conducted on real Internet paths among the following locations: Austin, Buffalo, Connecticut, and Seattle. The dynamics of traffic patterns, service overloads, and packet routing contribute to random variations in network delay. We measured variability in network delays using a volatility measure. The results shown below indicate that there are significant variations in transfer times across all the sites used in the study. Since congestion and packet loss in the last mile may be significant depending on last-mile conditions, we performed simulations assuming a lossless last mile as well as a congested last mile with both delays and packet loss. Our simulations used several distributions and several different conditions of delay and packet loss. In Figure 7, we present simulation runs for twelve different combinations of network losses and delay distributions, varying the endogenous parameters among various levels in each combination under the assumption that the last mile has sufficient bandwidth and no packet loss. This is typical of many university settings in the US and of many large companies. In Figure 8, we present results of simulations under varying last-mile conditions using a variety of distributions. The endogenous parameters have been varied as follows: three levels of edge server memory size, three levels of number of buckets, three levels of retransmission intervals, two levels of interbucket delay, and two levels of retransmission ranges. The performance metrics are assessed in each simulation run.
The simulation study has been conducted for the following purposes: (a) comparison of QoS with and without retransmissions, (b) studying the effects of endogenous control parameters on QoS with and without retransmissions under various exogenous conditions, (c) studying the effects of control parameters on the percentage of useful retransmissions realized at the edge server, and (d) obtaining the calibration curves (QoS1 vs. QoS2) for different exogenous network conditions that can be used by a runtime administrator to set appropriate values for the control parameters in a given streaming session.
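The factorial design described above can be sketched as follows. The specific level values and distribution names below are hypothetical placeholders (the retransmission intervals and interbucket delays echo Tables II and III), but the structure of 3 × 3 × 3 × 2 × 2 endogenous levels crossed with twelve exogenous loss/delay combinations reproduces the 1296-simulation count:

```python
from itertools import product

# Hypothetical level values; only the counts of levels (3, 3, 3, 2, 2) and
# the twelve exogenous combinations are taken from the text.
memory_sizes   = [200, 400, 800]   # edge server memory, three levels
bucket_counts  = [5, 8, 10]        # number of buckets, three levels
retx_intervals = [3, 5, 8]         # retransmission interval (s), three levels
bucket_delays  = [50, 55]          # interbucket delay (s), two levels
retx_ranges    = [1, 2]            # retransmission range, two levels

delay_dists = ["expo", "normal", "uniform", "triangular"]  # assumed names
loss_levels = [5, 10, 20]                                  # assumed % losses
exogenous = list(product(delay_dists, loss_levels))        # 12 combinations

endogenous = list(product(memory_sizes, bucket_counts, retx_intervals,
                          bucket_delays, retx_ranges))     # 108 configurations

design = list(product(exogenous, endogenous))
print(len(design))  # 12 x 108 = 1296 independent simulations
```

Each of these 1296 cells is then replicated 10 times with distinct random seeds, as described above.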

6. PERFORMANCE RESULTS

We discuss the results of the probabilistic performance model in Section 6.1 and the results of the system simulation model in Section 6.2.

6.1 Probabilistic Performance Model Results

A detailed analysis of the probabilistic performance model has been conducted to provide analytical insights on the impact of retransmission choices on the

Fig. 7. Calibration curves for twelve combinations of network losses and delay, assuming no packet loss and no delay in the last mile.

Fig. 8. Calibration curves for six combinations with network losses and delay in the last mile.

Fig. 9. a and b. Probabilistic performance model curves.

quality of service and the overall usage of network resources. Figure 9(a) depicts the impact of retransmission time on the performance metric QoS1. As expected, early and frequent retransmission requests result in substantial improvements in quality of service. An increase in the retransmission request time lowers the quality of service, as an increasing number of retransmitted packets arrive too late to be deliverable to the client. The ability of retransmitted packets to enhance QoS is further elaborated in Figure 9(b). Initially, as the retransmission request time increases, the percentage of useful packets also increases. This follows from the fact that too early a request can raise false alarms, so that packets that would arrive shortly are unnecessarily requested. Such false alarms can be avoided by increasing the retransmission request time. On the other hand, retransmission requests that are delayed too long can result in a sharp drop in the percentage of useful packets delivered, as they may arrive too late to be of use and waste critical bandwidth resources. These results clearly point to the importance of careful management of retransmission policies to enhance the overall utilization of resources. In the following discussion we present results from a detailed simulation study that would enable an administrator to fine-tune critical control parameters to achieve optimal overall performance, and to systematically leverage resources to enhance the client's QoS.

6.2 System Simulation Results

The system simulation results are organized into three categories. The first compares QoS with and without retransmissions, along with the effects of varying different control parameters on QoS. The second presents the effects of varying different parameters on the usefulness of retransmissions at the edge server, and the third synthesizes the results into a set of calibration curves obtained for different network conditions. Typical representative sets of results in each category are presented below, with a discussion and analysis of their behaviors. Similar behaviors have been observed in all the simulation experiments.

Fig. 10. a, b. Graph showing effect of varying edge server memory on QoS.

Fig. 11. a, b. Graphs showing effect of varying number of buckets on QoS.

6.2.1 Effects of Control Parameters on QoS. The effects of varying edge server memory on QoS are shown in Figure 10. As expected, both QoS1 and QoS2 increase as the edge server memory is increased. Furthermore, using retransmissions improves both QoS1 and QoS2 by nearly 10%. The effects of varying the number of buckets on QoS are shown in Figure 11. Both QoS1 and QoS2 at the edge decrease as the number of buckets increases. This is because, the fewer the buckets, the more time is allowed for a retransmitted packet to

Table II. Effect of Varying Interbucket Delay on QoS (Expo(1.5) Distribution, 20% Loss, 8 Buckets, Bucket Size = 50, 3-Second Retransmission Interval)

Interbucket Delay (s) | QoS1 (w/o Retx) | QoS1 (w/ Retx) | QoS2 (w/o Retx) | QoS2 (w/ Retx)
50                    | 79.23           | 88.60          | 77.00           | 88.35
55                    | 79.57           | 95.30          | 75.63           | 93.20

Table III. Effect of Varying Retransmission Interval on QoS (Expo(1.5) Distribution, 20% Losses, Bucket Size = 40, 5 Buckets)

Retransmission Interval (s) | QoS1 (%) | QoS2 (%)
3                           | 87.91    | 85.57
5                           | 85.94    | 83.60
8                           | 83.84    | 81.51

reach back to the edge server before the bucket is streamed out. Consequently, the utility of retransmissions improves, which impacts the QoS metrics at the client.

The effects of the interbucket delay parameter on QoS are summarized in Table II. Increasing the interbucket delay decreases the streaming rate at the edge server. Reducing the streaming rate allows more time for a retransmitted packet to reach the edge server, so the percentage of packets that eventually reach the client in time (i.e., QoS1) also increases. Although QoS2 might be expected to decrease when the interbucket delay is increased, it shows a rise because of the increase in the number of packets being streamed; as a result, the effective streaming rate at the edge server increases. This argument is also strengthened by considering the QoS metrics without retransmission: though QoS1 increases with an increase in interbucket delay, QoS2 decreases in this case, as the increase in QoS1 cannot compensate for the effects of the reduced streaming rate. Finally, the effects of the retransmission request interval on QoS are summarized in Table III. As can be intuitively observed, the higher the frequency of retransmissions, the better the QoS, as more packets get requested and consequently more packets can be delivered to the client in time.

6.2.2 Effects of Control Parameters on the Usefulness of Retransmissions. The usefulness of retransmissions is measured by the percentage of retransmitted packets that were not late and hence were delivered to the client in time. These results are shown in Figure 12. Increasing the edge server memory increases the percentage of useful retransmissions, which is an intuitive result.

Increasing the number of buckets decreases the effectiveness of retransmissions, as the buckets are streamed out at a faster rate and there is less time for a retransmitted packet to arrive before its stream-out time. Similarly, increasing the interbucket delay allows more time for retransmitted packets to reach the edge server, and hence the effectiveness of retransmissions increases. However, although increasing the frequency of retransmissions

Fig. 12. a–d: Graphs showing the effect of varying different parameters on the usefulness of retransmissions.

increases the number of packets that are requested for retransmission, it also reduces the percentage of useful retransmissions.

6.2.3 Calibration Curves. The QoS calibration curves for the sets of exogenous parametric combinations (network losses and delay distributions) are presented in Figure 7 (twelve combinations) and Figure 8 (six combinations).

6.2.3.1 Over-Provisioned Last Mile—No Congestion and No Loss. The calibration curves are obtained as follows. Note that QoS1 and QoS2 are two objectives that have to be simultaneously optimized. Further, for each combination of the exogenous parameters, a set of simulations has been carried out, where each simulation pertains to a set of endogenous variable settings. By plotting the values of QoS1 and QoS2 from these simulations, the set of dominated endogenous control configurations is eliminated; the illustrations show the Pareto frontier. A configuration g is said to dominate another configuration g′ if both of the following conditions hold: QoS1(g) ≥ QoS1(g′) and QoS2(g) ≥ QoS2(g′). Figure 7 presents the nondominated frontier of endogenous control
configurations in the bi-criteria QoS1–QoS2 space obtained from this study when the last mile is overprovisioned. For the sake of brevity, the exact details of the control parameter configurations for the points on the nondominated frontiers are not provided in this figure. The nondominated frontier presents the tradeoffs between the two objectives for a runtime administrator. An administrator could use the calibration curves in a given streaming situation as follows. First, the existing network traffic conditions, in terms of expected losses and delay distribution, can be assessed using network sniffers. Second, an appropriate calibration curve that best describes the network conditions is chosen; the curve thus chosen provides a set of nondominated control parameter configurations for the prevalent network conditions. Next, based on the tradeoffs among these options, the administrator could choose an appropriate configuration for streaming. Modeling and analysis of such decision problems is considered in the multicriteria optimization literature, and their application to the streaming parameter selection problem is suggested for future research.
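The dominance filtering behind each calibration curve can be sketched as a simple pairwise comparison. This is a generic Pareto-frontier computation over (QoS1, QoS2) pairs, not the authors' code, and the sample points are invented:

```python
def nondominated(points):
    """Pareto frontier of (QoS1, QoS2) pairs, both maximized: keep a point
    unless some other point is at least as good on both metrics."""
    return [c for c in points
            if not any(o != c and o[0] >= c[0] and o[1] >= c[1]
                       for o in points)]

# Invented sample of five configurations plotted in QoS1-QoS2 space.
frontier = nondominated([(88, 85), (90, 80), (85, 90), (80, 80), (88, 84)])
print(sorted(frontier))  # the three nondominated configurations survive
```

The O(n²) pairwise scan is entirely adequate at the scale of this study, where each exogenous combination contributes on the order of a hundred endogenous configurations.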

The major results of the system simulation studies can be summarized as follows: (a) using selective retransmissions and choosing the various endogenous parameters appropriately yields definite improvements in the QoS of the stream received at the client for general network conditions, (b) higher memory at the edge server yields better results, (c) a lower number of buckets yields higher QoS at the client, although the packets sent from the edge server are less sorted, (d) lower edge server streaming rates imply higher QoS, albeit with a threshold limit after which quality starts dropping, (e) a higher frequency of retransmission requests implies higher quality, but with many more wasted retransmissions due to false alarms and potential late arrivals at the edge server, and finally, (f) the calibration curves can be effectively used by a streaming administrator to set appropriate values for the control parameters in a streaming session in order to optimize the QoS criteria subject to the prevalent network conditions.

6.2.3.2 Underprovisioned Last Mile. Loss of IP packets may occur for multiple reasons: bandwidth limitations, network congestion, failed links, and transmission errors. Packet loss usually presents bursty behavior, commonly related to periods of network congestion. In Figure 8, we present the results when the last mile is congested and there are delays as well as packet losses. Such losses are typical when streaming media is accessed over wireless devices, as is the case in DMB and IPTV. The total amount of video-stream data that can be sent is ultimately limited by the customer's actual ADSL/ADSL2+ rate. Core IP infrastructure is usually based on optical networks with a low level of congestion; therefore, bandwidth limitations are commonly located only within the access network or the customer's home network. When traffic levels hit the maximum bandwidth available, packets are discarded, leading to video quality degradation. ADSL2+ rates may be temporarily affected by external factors, which in turn can generate pixelization of the image. Simulation results in Figure 8 demonstrate that last-mile conditions severely impact both QoS1 and QoS2. In fact, as the conditions in the last mile deteriorate, both QoS1 and QoS2 levels are lower (we show the results for 1% packet loss and 5%
packet loss in the last mile). Consequently, the Pareto frontier of nondominated solutions also moves down to lower values. The graphs show that the range of acceptable solutions is lower regardless of the distribution used.

6.2.3.3 Use of Calibration Curves. The article provides several calibration curves simulating a variety of delay and packet loss conditions. The calibration curves presented in the article are only representative of the conditions in which an ISP can use the model. A more extensive set of calibration curves should be developed to detail the effect of a substantially larger set of possible network conditions. This extended set, which includes parameter settings for a variety of network conditions, is stored as a hash map and can then be used as a reference table. Each point on a calibration curve corresponds to a specific quality of service and is associated with a corresponding set of values for the edge server cache parameters. Clients are provided with different qualities of the stream. When a client requests a stream of a specific quality, the stream control edge interface (see Figure 4) does a lookup on the available remote servers that have this stream. It pings the remote server with a stream of sample packets, and the round-trip time taken by the packets to return to the edge server from the remote server is used to determine the average network loss and delay incurred. Depending on the loss, the delay, and the quality of the stream requested, the corresponding calibration curve is picked by the stream control edge interface to fix the parameters of the edge server and allocate resources accordingly. If the interface determines that the available resources are not sufficient to achieve the requested quality under existing network conditions, the edge cache server automatically downgrades the quality of the stream to the next available level. Thus the calibration curves can be used in real time. The parameters are set prior to the start of the streaming process.
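A minimal sketch of the lookup just described, with the calibration table held as a hash map. The bucketing thresholds, quality levels, and parameter values below are all illustrative assumptions, as the article does not specify them:

```python
# Hypothetical calibration table: (loss bucket %, delay class) -> quality
# level -> edge cache parameter set. All keys and values are illustrative.
CALIBRATION = {
    (5, "low"):   {"high": {"memory": 800, "buckets": 5, "retx_interval": 3},
                   "med":  {"memory": 400, "buckets": 8, "retx_interval": 5}},
    (20, "high"): {"med":  {"memory": 800, "buckets": 5, "retx_interval": 3}},
}

def pick_parameters(measured_loss, measured_delay, requested_quality,
                    quality_order=("high", "med", "low")):
    """Quantize the loss/delay measured by the ping probe, look up the
    matching calibration curve, and downgrade the quality level if the
    requested one is not achievable under these conditions."""
    loss_bucket = 5 if measured_loss <= 0.10 else 20          # assumed buckets
    delay_class = "low" if measured_delay < 0.2 else "high"   # assumed cutoff
    curve = CALIBRATION.get((loss_bucket, delay_class), {})
    for quality in quality_order[quality_order.index(requested_quality):]:
        if quality in curve:              # downgrade to next level if needed
            return quality, curve[quality]
    return None, None                     # no feasible configuration
```

For example, a "high" request under 18% measured loss and heavy delay would, with this table, be downgraded to "med" and assigned that level's cache parameters before streaming begins.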

7. CONCLUSION

On-demand streaming from a remote location through an edge cache server over the best-effort Internet continues to be a challenging problem. In this paper we provide a comprehensive review and develop a taxonomy of current methods to enhance the delivery quality of multimedia streaming. We extend the current state of the art by developing a staged streaming architecture for pure on-demand unicast streaming of pre-encoded video from a remote location using partial caching on edge cache servers. We have designed component architectures of the remote streaming process. The architecture is presented using three views: an integrated view, a remote server view, and an edge cache view. The architecture employs the existing standard streaming protocols RTP/RTCP and RTSP. The proposed architecture employs a novel cache buffering mechanism at the edge cache server, termed revolving indexed buffering, and uses selective retransmissions of lost packets to optimize the eventual quality of service at the client. The revolving buffering architecture and selective retransmission strategies are the central concepts of this work. The main contribution includes the edge cache management scheme, which is network-aware. Extensive simulations to evaluate the proposed streaming, retransmission, and buffer management policies have been carried out. We also provide a probabilistic model of the
proposed streaming process that supports the simulation results. The results show that significant improvements in QoS can be achieved using the proposed edge cache buffering and retransmission concepts, and that they lead to viable and robust mechanisms for runtime optimization of system resources to maximize QoS in any remote streaming context. These mechanisms are presented in terms of a set of calibration curves on the QoS measures for a runtime administrator. They serve as valuable decision aids for edge server buffer allocation and management and thus have significant practical appeal. The calibration curves show that, regardless of the delay distribution used to model last-mile conditions, the quality of service parameters (QoS1 and QoS2) realized at the client degrade as network conditions degrade. The buffering architecture proposed in this paper is configured for use at an edge cache server. The concept of network-aware buffer parameter setting could be adopted in client players such as Windows Media Player or Real Media, but the architecture would have to be modified for such implementations. Since the buffering in our proposed architecture is for an intermediary edge server, a direct comparison with client players has not been possible.

In this article we proposed a framework and a buffer mechanism suited for streaming content such as video-on-demand movies, delayed feeds, etc. [Marioni 2007]. However, recent trends show that video content (for example, ESPN Motion [Ozer 2003]) is also being delivered to consumers using what is commonly termed "HTTP streaming" or "progressive downloading." Progressive downloading takes a video file, downloads it using HTTP, and starts playing the video when sufficient content has been downloaded. This is the simplest and cheapest way to stream video from a website, and small to medium-sized Web sites are more likely to use this method [Bouthillier 2003]. Video quality is higher with progressive downloading, as videos with a higher encoding rate can be downloaded. But there are some distinct problems with progressive downloading stemming from the nature of the content and from network conditions. Progressive downloading allows the entire content to be easily stored on the user's disk. For reasons relating to protecting intellectual property, only some of the video can be streamed to the edge server or to the client; such content cannot be delivered using progressive downloading or HTTP streaming. Video streaming does not allow storing the entire video and therefore provides better safeguards compared to HTTP. Streaming delivery of video data has distinct advantages compared to progressive downloading because (a) the system allows one to monitor exactly what people are watching and for how long they are watching it, and (b) it makes more efficient use of bandwidth, since only the part of the file that is watched gets transferred [Bouthillier 2003]. For some streaming applications where a delayed feed is provided to noncommercial channels (for example, cricket test match telecasts on free governmental channels), HTTP streaming is not an option.
Even though progressive download starts displaying the file during the download process, rather than waiting for the download to complete, quality is limited by how long and how many times viewers have to wait for the buffer to fill up. Video that had started playing stops playing until all missing and delayed packets are in place in the buffer. Such intermissions during a longer video clip lower the quality of the
video considerably. In contrast, the buffer mechanism proposed in this paper attempts to acquire missing packets through retransmission, as is the case with TCP. However, unlike progressive downloading, if the missing packets do not arrive on time, the video is still played with the missing packets, avoiding intermissions. The net effect is some jitter, but there is no recurring intermission. Video quality is higher with progressive downloading, as videos with a higher encoding rate can be downloaded, but this holds for short movies and clips where intellectual property is not an issue. Getting quality video on the Web is all about trade-offs, and progressive downloading is increasingly becoming a low-cost alternative to streaming for some of the video content that is distributed.

Several avenues of future research arise from this work. We have considered unicast streaming in this research. However, unicast poses a considerable load on available bandwidth, especially when bandwidth is limited; employing application-level multicasting could therefore be a viable and efficient solution under bandwidth-restricted streaming scenarios. Extensions to the proposed architecture under bandwidth constraints are themselves major avenues for future research. Modeling the proposed framework to provide better quality of service for Internet Protocol Television (IPTV) is another area for future research. Creating implementations is a useful exercise that adds value in terms of creating a product based on the proposed mechanism; this is another area for further development as an application. A limitation of the architecture is that, though the buffer parameters are set at the outset based on network conditions when streaming begins, the parameters are not adaptively reconfigured as network conditions change. This poses an interesting extension to the work and is an area for future research. Another interesting extension to this work would be QoS-based SLAs (service level agreements) between a content provider (CP) and a service provider (SP), and also between service providers. The SLA would specify (a) resource allocation to a customer and (b) expected QoS. In real time, based on demand and network conditions, an SP could follow suitable strategies in accommodating requests, downgrading QoS, and paying penalties where required, in order to maximize its expected profit. Architectural issues, the economics of the SLAs, and runtime stream control optimization under these conditions are important areas for further study.
Another interesting extension would be to model and develop architectures for collaborative resource-sharing arrangements among different service providers, where an SP could leverage certain resource availabilities and strategic positioning on the Internet to enable cost-effective QoS-based streaming delivery. Such applications are quite promising in grid computing environments. We are currently pursuing some of these issues.

REFERENCES

ALBANESE, A., BLOMER, J., EDMONDS, J., LUBY, M., AND SUDAN, M. 1996. Priority encoding transmission. IEEE Trans. Inform. Theory 42 (Nov.), 1737–1744.

ANJUM, F. AND JAIN, R. 2000. Performance of TCP over lossy upstream and downstream links with link level retransmissions. In Proceedings of the 8th IEEE International Conference on Networks (ICON'00).

BALKIR, N. AND OZSOYOGLU, G. 1998. Delivering presentations from multimedia servers. In Proceedings of the IEEE International Workshop on Multimedia DBMS.

BERRA, B., GOLSHANI, F., MEHETRO, R., AND SHENG, O. 1993. Multimedia information systems. IEEE Trans. Knowl. Data Engin. 5, 4 (Aug.), 545–550.

BOLOT, J. AND TURLETTI, T. 1996. Adaptive error control for packet video in the Internet. In Proceedings of the IEEE International Conference on Image Processing (ICIP'96), 25–28.

BOUAZIZI, I. AND GUNES, M. 2003. Selective proxy caching for robust video transmission over lossy networks. In IEEE ITRE, Special Session for Robust Video Transmission.

BOUTHILLIER, L. 2003. Streaming vs. downloading video: Understanding the differences. StreamingMedia.com. http://www.streamingmedia.com/article.asp?id=8456&page=2&c=11. Accessed April 2007.

BRICENO, H., GORTLER, S., AND MCMILLAN, L. 1999. NAIVE – Network aware internet video encoding. In Proceedings of the 7th ACM Multimedia Conference. 251–260.

CHEN, S., SHEN, B., WEE, S., AND ZHANG, X. 2004. Designs of high quality streaming proxy systems. In Proceedings of the Twenty-Third Annual Joint Conference of the IEEE Computer and Communications Societies.

CONKLIN, G., GREENBAUM, G., LILLEVOLD, K., LIPPMAN, A., AND REZNIK, Y. 2001. Video coding for streaming media delivery on the Internet. IEEE Trans. Circuits Syst. Video Techn. 11, 3 (Mar.), 269–281.

CUETOS, P. AND ROSS, K. 2003. Optimal streaming of layered video: joint scheduling and error concealment. In Proceedings of the 3rd ACM Multimedia Conference. 55–64.

DAI, R., STAHL, D., AND WHINSTON, A. 2003. The economics of smart routing and quality of service. Netw. Group Comm. 318–331.

DEMPSEY, B., LIEBEHERR, J., AND WEAVER, A. 1996. On retransmission based error control for continuous media traffic in packet-switching networks. Comp. Netw. ISDN Syst. 28, 5 (Mar.), 719–736.

ELEFTHERIADIS, A. AND ANASTASSIOU, D. 1995. Meeting arbitrary QoS constraints using dynamic rate shaping of coded digital video. In Proceedings of the 5th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV'95), 95–106.

FROSSARD, P. AND VERSCHEURE, O. 2002. Batched patch caching for streaming media. Comm. Let. 6, 4, 159–161.

FU, Y. AND VAHDAT, A. 2002. Service level agreement based distributed resource allocation for streaming hosting systems. In Proceedings of the 7th International Workshop on Web Caching and Content Distribution (WCW).

GIROD, B., HORN, U., AND BELZER, B. 1995. Scalable video coding with multiscale motion compensation and unequal error protection. In Proceedings of the Symposium on Multimedia Communications and Video Coding.

GRIECO, R., MALANDINO, D., SCARANO, V., VARRIALE, F., AND MAZZONI, F. 2005. An intermediary software infrastructure for edge services. In Proceedings of the IEEE International Conference on Distributed Computing Systems Workshops.

GUPTA, A., JUKIC, B., PARAMESWARAN, M., STAHL, D., AND WHINSTON, A. 1997. Streamlining the digital economy: how to avert a tragedy of the commons. IEEE Internet Comput. 1, 6, 38–46.

HEMY, M., HENGARTNER, U., STEENKISTE, P., AND GROSS, T. 1999. MPEG system streams in best-effort networks. In Proceedings of IEEE Packet Video.

HILLESTAND, O. I., LIBAK, B., AND PERKIS, A. 2005. Performance evaluation of multimedia services over IP networks. In Proceedings of ICME.

HSIAO, P., KUNG, H., AND TAN, K. 2001. Video over TCP with receiver-based delay control. In Proceedings of ACM NOSSDAV. 199–208.

JIN, S., BESTAVROS, A., AND IYENGAR, A. 2002. Accelerating Internet streaming media delivery using network-aware partial caching. In Proceedings of the International Conference on Distributed Computing Systems.

JIN, S., BESTAVROS, A., AND IYENGAR, A. 2003. Network-aware partial caching for Internet streaming media. Multimed. Syst. Springer-Verlag.

KANGASHARJU, J., HARTANTO, F., REISSLEIN, M., AND ROSS, K. 2002. Distributed layered encoded video through caches. IEEE Trans. Comput. 51, 6 (June), 622–636.

KARRER R. AND GROSS, T. 2001. Dynamic Handoff of Multimedia Streams. In Proceedings of ACMNOSSDAV. Port Jefferson, NY. 125–133.

LEE S. AND LEE, S. 1998. Retransmission scheme for MPEG streams in mission critical multime-dia applications. In Proceedings of 24th EUROMICRO Conference.

LI, S., WU, F., AND ZHANG, Y. 1999. Study of a new approach to improve FGS video coding efficiency.ISO/IEC JTC1/SC29/WG11, MPEG99/M5583.

LI, X., PAUL, S., AND AMMAR, M. 1998. Layered video multicast with retransmissions (LVMR):Evaluation of hierarchical rate control. In Proceedings of IEEE Infocom.

LITTLE, T. 1993. A framework for synchronous delivery of time-dependentmultimedia data. Multime. Syst. 1, 2, 87–94.

LU Y. AND CHRISTENSEN, K. 1999. Using selective discard to improve real-time video quality on anethernet local area network. Inter. J. Network Manage. 9, 106–117.

MARIONI, R., Streaming video and the media. Rich Web. http://richweb.net/Streaming Video Articles.htm. Accessed April 2007.

MCCANNE, S., JACOBSON, V., AND VETTERLI, M. 1996. Receiver-driven layered multicast. In Proceed-ings of ACM SIGCOMM. 117–130.

MIAO Z. AND ORTEGA, A. 1999. Proxy caching for efficient video services over the Internet. InProceedings of Packet Video.

MOURAD, A. 1996. Doubly striped disk mirroring: Reliable storage for video servers. Multimed.Tools Appl. 2, 253–272.

MULABEGOVIC, E., SCHONFELD, D., AND ANSARI, R. 2002. Lightweight streaming protocol (LSP). InProceedings of the 10th ACM International Conference on Multimedia. Juan-les-Pins, France.

NITHISH, M., RAMAKRISHNA, C., RAMKUMAR, J., AND LAKSHMI, P. 2002. Design and Evaluation ofIntermediate retransmission and packet loss detection schemes for MPEG4 transmission. InProceedings of the International Conference Information Technology: Coding and Computing(ITCC’04).

NONNENMACHER, J., BIERSACK, E., AND TOWSLEY, D. 1998. Parity-Based Loss recovery for ReliableMulticast transmission. IEEE/ACM Trans. Netwo. 6, 4 (Aug.), 349–361.

OZER, J. 2003. The moving picture: The problem with streaming. e-Media Live. http://www.emedialive.com/Articles/ReadArticle.aspx?ArticleID=8071. Accessed April 2007.

PAPADOPOULOS, C. AND PARULKAR, G. 1996. Retransmission-based error control for continuous media applications. In Proceedings of the International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV). 5–12.

PIECUCH, M., FRENCH, K., OPRICA, G., AND CLAYPOOL, M. 2000. A selective retransmission protocol for multimedia on the Internet. In Proceedings of the SPIE International Symposium on Multimedia Systems and Applications.

PURI, R., LEE, K., RAMCHANDRAN, K., AND BHARGHAVAN, V. 2000. Application of FEC-based multiple description coding to Internet video streaming and multicast. In Proceedings of the Packet Video Workshop. Cagliari, Sardinia, Italy.

RABINOVICH, M. AND AGGARWAL, A. 1999. Radar: A scalable architecture for a global Web hosting service. Comput. Netw. 31, 11–16, 1645–1661.

RABINOVICH, M. AND SPATSCHECK, O. 2002. Web Caching and Replication. Addison-Wesley, Boston, MA.

REJAIE, R., HANDLEY, M., AND ESTRIN, D. 1999. Quality adaptation for congestion controlled video playback over the Internet. In Proceedings of ACM SIGCOMM.

REJAIE, R., YU, H., HANDLEY, M., AND ESTRIN, D. 2000. Multimedia proxy caching for quality adaptive streaming applications in the Internet. In Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'00). 980–989.

RIZZO, L. 2000. pgmcc: A TCP-friendly single-rate multicast congestion control scheme. In Proceedings of ACM SIGCOMM'00. 17–28.

SEN, S., REXFORD, J., AND TOWSLEY, D. 1999. Proxy prefix caching for multimedia streams. In Proceedings of IEEE INFOCOM.

SHEN, B., LEE, S., AND BASU, S. 2004. Caching strategies in transcoding-enabled proxy systems for streaming media distribution networks. IEEE Trans. Multimedia 6, 2, 375–386.

SHIM, J. AND AHN, K. Forthcoming. Empirical findings on the perceived use of digital multimedia broadcasting mobile phone services. Indus. Manag. Data Syst.


SHIM, J., VARSHNEY, U., DEKLEVA, S., AND KNOERZER, G. 2006. Mobile and wireless networks: Services, evolution & issues. Inter. J. Mobile Comm. 4, 4, 405–417.

SINHA, R. AND PAPADOPOULOS, C. 2004. An adaptive multiple retransmission technique for continuous media streams. In Proceedings of the 14th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV'04).

TAN, W. AND ZAKHOR, A. 2001. Video multicast using layered FEC and scalable compression. IEEE Trans. Circuits Syst. Video Techn. 11, 3, 373–387.

VARADARAJAN, S., NGO, H., AND SRIVASTAVA, J. 2002. Error spreading: A perception-driven approach to handling error in continuous media streaming. IEEE/ACM Trans. Netw. 10, 1, 139–152.

WADA, M. 1989. Selective recovery of video packet loss using error concealment. IEEE J. Select. Areas Commun. 7, 807–814.

WANG, B., SEN, S., ADLER, M., AND TOWSLEY, D. 2002. Optimal proxy cache allocation for efficient streaming media distribution. In Proceedings of the 21st Annual Joint Conference of the IEEE Computer and Communications Societies.

WANG, S. AND BHARGAVA, B. 1997. Multi-pass transmission policy: An effective method of transmitting large multimedia objects in the wide-area network. In Proceedings of the 21st International Computer Software and Applications Conference (COMPSAC'97).

WU, D., HOU, Y., ZHU, W., ZHANG, Y., AND PEHA, J. 2001. Streaming video over the Internet: Approaches and directions. IEEE Trans. Circuits Syst. Video Techn. 11, 3 (Mar.).

WU, D., HOU, Y., ZHU, W., LEE, H., CHIANG, T., ZHANG, Y., AND CHAO, H. 2000. On end-to-end architecture for transporting MPEG-4 video over the Internet. IEEE Trans. Circuits Syst. Video Techn.

WU, Q., RAO, N. S. V., AND IYENGAR, S. S. 2004. On measurement-based transport method for message delay minimization over wide-area networks. In Proceedings of the International Conference on Computer Communications and Networks (IC3N'04).

YAMAGUCHI, M., ITO, K., AND TAKASAKI, Y. 2000. Packet loss detection scheme for retransmission-based real-time data transfer. In Proceedings of the IEEE 7th International Conference on Parallel and Distributed Systems: Workshops (ICPADS'00 Workshops).

ZIMMERMANN, R., FU, K., NAHATA, N., AND SHAHABI, C. 2003. Retransmission-based error control in a many-to-many client-server environment. In Proceedings of the ACM Multimedia Computing Networking Conference. Santa Clara, CA.

ZINK, M., GRIWODZ, C., JONAS, A., AND STEINMETZ, R. 2000. LC-RTP (Loss Collection RTP): Reliability for video caching in the Internet. In Proceedings of the IEEE 7th International Conference on Parallel and Distributed Systems Workshops (ICPADS'00 Workshops).

ZINK, M., SCHMITT, J., AND STEINMETZ, R. 2002. Retransmission scheduling in layered video caches. In Proceedings of the IEEE International Conference on Communications (ICC'02). 2474–2478.

Received February 2006; revised April 2007; accepted April 2007
