Design of A Transcoding Proxy
Server for Mobile Web Browsing
Fan Peng Kong
A thesis submitted in conformity with the requirements for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering, University of Toronto
© Copyright by Fan Peng Kong, 2000
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Abstract
A transcoding proxy server is a middleware interface between the server and the mobile client: it decreases the size of the web content by transcoding, thus reducing the downloading time dramatically.
Previous transcoding proxy systems can cope only with single content items, e.g., transcoding the images in the web page one by one rather than processing the entire page within one connection. This research introduces a method of web page transcoding, which transcodes the whole web page as one object at the proxy server. The method makes possible many new processing techniques that enable a wide range of client requirements regarding the processing of a web page, for example, web page re-arrangement, searching and filtering, and dynamic browsing. An adaptive transcoding policy is designed for the web page transcoding, and overall optimization is achieved for the first time.
This research designs a new transcoding proxy server system and implements it in Java. The results show that the new software proxy system can decrease the downloading time by a factor of 25 and still make the transcoded web page recognizable.
Acknowledgements
I would like to express my most sincere gratitude to my thesis supervisor Professor Venetsanopoulos for his guidance, advice, ideas, and encouragement during the course of my MASc degree.
I would also like to thank Dr. Adriana Dumitras for her numerous comments and careful reading of the manuscript.
Finally, I thank my family and friends for their love and support.
Contents
Abstract
Acknowledgements
List of Tables
List of Figures
1 Introduction
1.1 Motivation of Research
1.2 Previous work
1.2.1 Research Work
1.2.2 Commercial Implementations
1.3 Research Objective and Contributions
1.3.1 A Method of Web Page Transcoding
1.3.2 An Adaptive Transcoding Policy
1.3.3 A System Design and Java Implementation
1.3.4 An Experimental Evaluation
1.4 Thesis Structure
2 Background
2.1 Web Proxy Servers
2.2 The HTTP Protocol
2.3 HTTP Proxy Servers
2.4 TCP Sockets
2.5 Glossary
2.6 Summary
3 System Design of Proposed Transcoding Proxy Server
3.1 Design Fundamentals
3.2 System Design for Proposed Proxy System
3.3 Main Flow Chart of Proposed Proxy Software System
3.4 Single File Process Thread
3.4.1 Request Processing
3.4.2 Virtual Client
3.4.3 Data Processing
3.5 Web Page Process Thread
3.5.1 Request Processing
3.5.2 Virtual Client
3.5.3 Data Processing
3.6 Summary
4 Web Page Transcoding
4.1 Data Processing Flow Chart
4.2 Image Content Analysis
4.3 Adaptive Transcoding Policy
4.3.1 Prediction of the Transcoding Time Delay
4.3.2 Size Reduction Decisions
4.3.3 Transcoding Parameters Decisions
4.4 Transcoding and Reconstruction
4.5 Summary
5 Implementation and Experimental Results
5.1 Java Implementation of Proposed Transcoding System
5.1.1 RTTI
5.1.2 Nested Multiple Thread Structure
5.1.3 TCP Sockets: ServerSocket, Socket class
5.1.4 Connection Control and Resource Management
5.1.5 HTML Parser
5.1.6 Protocol Handler
5.1.7 User Preference Definition
5.2 Experimental Results
5.2.1 Transcoding of single images
5.2.2 Transcoding of web page with ETP
5.2.3 Transcoding of web page with ESRR
5.3 Summary
6 Conclusions and Future Work
A Test Images & Web page
B Transcoded web page
Bibliography
List of Tables
1.1 Implementation features of existing systems
4.1 Comparison of image properties for different image types
4.2 Transcoding time delay for images with different size
4.3 Transcoding time delay for different transcoding parameters
4.4 Linear time delay estimation results for different transcoding parameters
4.5 Sizes of transcoded images for different transcoding parameters (bytes)
4.6 Linear size reduction estimation for different images
Definition of preference[0]
Definition of preference[1] & [2]
Definition of preference[3]
Definition of preference[4] & [5]. The notations CON, L-IXF, INF, MAP, RCL, BUL, DEC, and ADV have been explained in Chapter 4
Definition of preference(i1
Definition of pref
Web page transcoding and transcoding parameters evaluated by ESRR
List of Figures
1.1 An Internet connection via proxy [6]
2.1 Layer correspondence between OSI and IP
3.1 Proxy server system design
3.2 Main flowchart of the software proxy server system
3.3 Single file process thread
3.4 Web page process service thread
3.5 Web page processing
4.1 Data processing module for web page transcoding
4.2 Image content decision tree
4.3 Transcoding time delay vs. no. of pixels (R=0.375, C=4bit)
4.4 Transcoding time delay vs. no. of pixels (R=0.5, C=4bit)
4.5 Transcoding time delay vs. no. of pixels (R=0.625, C=4bit)
4.6 Transcoding time delay vs. no. of pixels (R=0.75, C=4bit)
4.7 Transcoding time delay vs. no. of pixels (all samples)
4.8 Mean transcoding time delay vs. no. of pixels
4.9 Mapping of relative importance to size reduction ratio
4.10 Size reduction vs. equivalent pixel no. for announcer.gif
4.11 Size reduction vs. equivalent pixel no. for baboon.gif
4.12 Size reduction vs. equivalent pixel no. for commhead2.gif
4.13 Size reduction vs. equivalent pixel no. for uoft.gif
4.14 Size reduction vs. equivalent pixel no. for all images
4.15 Adaptive transcoding policy
5.1 Image downloading time at 28.8 kbps (with transcoding)
5.2 Image downloading time comparison at 28.8 kbps
5.3 Image downloading time at 14.4 kbps (with transcoding)
5.4 Image downloading time comparison at 14.4 kbps
5.5 Image downloading time at 9.6 kbps (with transcoding)
5.6 Image downloading time comparison at 9.6 kbps
5.7 Web page downloading time at 28.8 kbps (ETP transcoding)
5.8 Web page downloading time comparison at 28.8 kbps
5.9 Web page downloading time at 14.4 kbps (ETP transcoding)
5.10 Web page downloading time comparison at 14.4 kbps
5.11 Web page downloading time at 9.6 kbps (ETP transcoding)
5.12 Web page downloading time comparison at 9.6 kbps
5.13 Web page downloading time at 28.8 kbps (ESRR transcoding)
5.14 Web page downloading time comparison at 28.8 kbps
A.1 adv-l-1Zyahoo.gif
A.2 advL-li.5;vahoo.gif
A.3 advmiadora-yahoo.gif
A.4 advmonster.gif
A.5 adv-ntap-yahhoo.gif
A.6 announcer128.gif
A.7 commhead2.gif
A.8 anemone.gif (scaled by 0.5 x 0.5)
A.9 gold.gif (scaled by 0.5 x 0.5)
A.10 announcer.gif
A.11 baboon.gif
A.12 cwheel.gif (scaled by 0.5 x 0.5)
A.13 guanyin.gif
A.14 inf-bag-yahoo.gif
A.15 inf-billpay-yahoo.gif
A.16 inf-mail-yahoo.gif
A.17 inf-pts-yahoo.gif
A.18 uoft.gif
A.19 jordan-poster2.gif (scaled by 0.8 x 0.8)
A.20 jordan-poster3.gif
A.21 kids.gif
A.22 map.gif
A.23 splash.gif
A.24 stock.gif
A.25 tree.gif (scaled by 0.7 x 0.7)
A.26 Original web page (scaled by 0.4 x 0.4)
B.1 Transcoded web page (ETP) (scaled by 0.8 x 0.8) (for "CON": R=0.375 C=4; for other content: R=0.625 C=4)
B.2 Transcoded web page (ETP) (scaled by 0.8 x 0.8) (colored; for "CON": R=0.375 C=4; for other content: R=0.625 C=4)
B.3 Transcoded web page (ESRR with 10s) (scaled by 0.8 x 0.8)
B.4 Transcoded web page (ESRR with 15s) (scaled by 0.8 x 0.8)
B.5 Transcoded web page (ESRR with 20s) (scaled by 0.8 x 0.8)
B.6 Transcoded web page (ESRR with 25s) (scaled by 0.8 x 0.8)
B.7 Transcoded web page (ESRR with 30s) (scaled by 0.8 x 0.8)
B.8 Transcoded web page (ESRR with 35s) (scaled by 0.8 x 0.8)
Chapter 1
Introduction
1.1 Motivation of Research
Nowadays, a rapidly growing number of users are connected to the Internet for various purposes, all of which are related to the exchange of multimedia content of web documents.
The bandwidth of different access to the Internet varies greatly [1]. Professional Internet users use high bit-rate links such as primary rate ISDN (2 Mbps) or ATM (up to 2 Gbps). Average consumers access the Internet through low bit-rate connections, such as V.34 modems (up to 33.6 kbps) or newly released V.90 modems (up to 56 kbps downstream, up to 31.2 kbps upstream). Some users use cable modems (around 1.5 Mbps downstream and 300 kbps upstream) or ADSL modems (around 3 Mbps downstream and 1 Mbps upstream). For mobile users using cellular phone links, hand-held computers (HHC) and personal digital assistants (PDA) [-Io this communication rate is even lower, usually only 14.4 kbps or less. Furthermore, the actual transmission rate depends on the current user load and the infrastructure of the part of the Internet in use. From a client's point of view, it may vary unpredictably between zero and the maximum bit rate of its network modem.
By default, the content provider (the web server) does not consider which kind of network connection lies between itself and the client. It simply returns a response to a request from the client, with a web page that might be full of images, sound files, applets, JavaScript, and text.
Thus the problems for mobile web browsing are:
• Limited bandwidth: The very low bandwidth, plus the high disconnection rate and high error rate of the mobile link, can easily "kill" the process of mobile web browsing because of the long downloading time.
• High cost: The cost of mobile connections is very high. For example, the connection cost for the Palm VII is one dollar per 5 or 6 kilobytes of downloaded data [3]. Therefore, a mobile user will pay 4 dollars to view a 20 KB JPEG [4] image.
• Limited computation ability: Certain types of content cannot be handled by mobile devices because of their limited computing power and memory. Examples are active content such as Java applets, and audio content.
• Limited display: One of the practical problems for mobile web browsing is that different mobile devices may have different and limited display requirements when displaying web documents. For example, a PDA has a screen of only 320 x 200 pixels with 8 gray levels. Thus a colorful web page suitable to be displayed on a PC will not be displayed properly on a mobile device.
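The cost figure in the list above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python rather than the Java used for the thesis implementation. The function name is hypothetical, and the two rates (5 KB and 6 KB per dollar) are simply the two ends of the "one dollar per 5 or 6 kilobytes" figure quoted above.

```python
def download_cost_dollars(size_kb: float, kb_per_dollar: float) -> float:
    """Cost of downloading size_kb kilobytes at a flat rate of kb_per_dollar."""
    if kb_per_dollar <= 0:
        raise ValueError("kb_per_dollar must be positive")
    return size_kb / kb_per_dollar


# A 20 KB image priced at the two quoted rates.
high = download_cost_dollars(20, 5)  # 5 KB per dollar
low = download_cost_dollars(20, 6)   # 6 KB per dollar
print(f"20 KB image: between ${low:.2f} and ${high:.2f}")
```

At 5 KB per dollar the 20 KB image costs exactly 4 dollars, which matches the figure in the text. At 6 KB per dollar it is somewhat less.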
One solution to the problems mentioned above is to place a "transcoding proxy server" as an interface between the Web server and the client, as indicated in Figure 1.1 [6].
A transcoding proxy server is a piece of middleware that acts as an agent between the server and the mobile client. When the data stream passes through, the proxy server tries to cut its amount to such a level that the mobile client can browse the Internet at a relatively convenient speed. The method is aggressive compression, mainly of the images inside a web page.
[Figure 1.1: An Internet connection via proxy [6]. Diagram labels: Wireless LAN, Wireless Modem, Transcoding Proxy, ISDN, Internet, Multimedia PC.]
Thus a transcoding proxy server can achieve:
• Shortening the downloading time
• Reducing the cost
• Filtering unwanted content
• Adjusting images for different display requirements to be directly viewable
Transcoding is effective in that images form up to 70% of the Internet data traffic [SI. It becomes even more significant when network congestion and data error rates are considered. Packets get dropped when the network is congested. Some experiments have shown that loss rates in the 5% range are quite common and that, in the worst case, the network can drop up to 25% of all packets [7]. When high error rates occur, the data has to be sent repeatedly until the correct data is accepted by the end user. Therefore a much smaller web document is much more desirable for downloading, because at the same congestion rate and error rate it further reduces the number of dropped packets and the amount of erroneous data, and thus shortens the entire download even further. This is also true when the problem of frequent disconnection is taken into consideration for mobile web browsing.
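The effect of packet loss on a large versus a small document can be illustrated with a deliberately simple model. The sketch below is in Python and assumes that each packet is lost independently with a fixed probability and is resent until it arrives. That is an illustrative simplification, not a description of real TCP behaviour. The function name, the 512-byte payload and the two document sizes are all hypothetical. Only the 25% worst-case loss rate comes from the text.

```python
import math


def expected_transmissions(size_bytes: int, loss_rate: float,
                           payload_bytes: int = 512) -> float:
    """Expected number of packet transmissions needed to deliver size_bytes.

    Assumes independent losses with probability loss_rate and resending
    until success, so each packet needs 1 / (1 - loss_rate) sends on average.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    packets = math.ceil(size_bytes / payload_bytes)
    return packets / (1.0 - loss_rate)


# Hypothetical sizes: a 100 KB page before and a 10 KB page after transcoding,
# both sent over a link with the 25% worst-case loss rate.
original = expected_transmissions(100_000, 0.25)
transcoded = expected_transmissions(10_000, 0.25)
print(f"original: {original:.1f} sends, transcoded: {transcoded:.1f} sends")
```

Under this model the number of resent packets scales with document size, so shrinking the document reduces the absolute number of retransmissions as well as the payload itself.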
Transcoding at a proxy server is more favorable than at the content provider's server or on the client's side. Compared with transcoding at web servers, transcoding at proxy servers has the following advantages:
• Flexibility: If we do transcoding on the server's side, every web server in the Internet has to be modified to have the ability to do transcoding. If instead we do transcoding via a proxy server, none of the web servers needs any modification. When functions need to be added for transcoding, there is no pain of modifying every web server in the Internet. The only thing to modify is the proxy server.
• Enhanced security: From a security perspective, it is more beneficial to separate the origin servers and proxy servers. For example, when separated, origin web servers do not make connections to the internal network, which makes it possible for firewall proxy servers to be set up to block any connections initiated by the web server. This protects the internal network if the web server machine becomes compromised: even if an intruder gains access to the web server host, he is still unable to connect to the hosts inside the firewall [23].
• Ease of administration: Separating the origin web server and proxy server functionality makes administration easier, as origin web server and proxy server features are clearly separated into different administration interfaces. This reduces the risk of misconfiguration. For example, access control might be incorrectly set up so that it applies to the origin server and not the proxy server, or vice versa [23].
• Modularization of development: From a software developer's point of view, separating these two functionalities makes development easier. Web servers and proxy servers, which indeed share some functionality, are quite different from each other and fairly complicated on their own. Separation makes development, stabilization and testing easier as the size of the software becomes smaller [23].
As for transcoding on the client's side, it is definitely not a good choice. Such transcoding is useless because it does not save any time: the big images have already been downloaded. Furthermore, with the client's limited computing power and memory, it will cause more delay.
1.2 Previous work
Previous transcoding systems can be classified into two broad categories: research work and commercial implementations.
1.2.1 Research Work
Theoretical Analysis
Theoretical research on transcoding proxies has been carried out by IBM [2, 20] and [6]. The research work focuses on analyzing only the transcoding part of the proxy system and the processing of one image per operation. The work in [2, 20] is on the analysis of content-based transcoding, where the content of the images in the Web document is detected and the transcoding policy is based on the image content. This means that as long as two images belong to the same content group, they will be transcoded with the same transcoding parameters. Also, those transcoding parameters must be pre-set by the proxy server or by the client.
The work in [6] consists of some theoretical analysis of the adaptation policy of image transcoding, where algorithms are analyzed to decide whether the transcoding should be performed according to the connection bandwidth and the user's display requirement. The important thing is that the algorithms are based on fixed transcoding parameters. Therefore, no adaptation of transcoding parameters can be obtained through them. That is the reason why, in the real implementation shown in [6], the transcoding policy uses none of their prediction algorithms; instead it uses simple "if, then" switches to set the transcoding parameters (for example, if the user is "Palm", then the color depth is reduced to --bit gray and the scaling ratio is fixed, which must be told by the client in advance).
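A fixed "if, then" policy of the kind described above can be pictured as a lookup from device type to a pre-set parameter bundle. The following Python sketch is purely illustrative: the class name, the rule table, and every numeric value in it (bit depth, scale factors) are hypothetical and are not taken from [6] or from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedParams:
    color_depth_bits: int   # bits per pixel after transcoding
    grayscale: bool         # whether color is discarded
    scale: float            # linear scaling ratio applied to every image


# Hypothetical rule table: one fixed parameter set per declared device type.
RULES = {
    "palm": FixedParams(color_depth_bits=2, grayscale=True, scale=0.5),
}
# Hypothetical fallback: leave the image untouched.
DEFAULT = FixedParams(color_depth_bits=8, grayscale=False, scale=1.0)


def choose_params(device: str) -> FixedParams:
    """Return the pre-set parameters for a device, ignoring everything else."""
    return RULES.get(device.lower(), DEFAULT)


print(choose_params("Palm"))
```

Note that `choose_params` takes neither the connection bandwidth nor any property of the image as input, which is exactly the lack of adaptation the paragraph above points out.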
IBM also presents a content adaptation system in [Hl that adapts multimedia web content to optimally match the resources and capabilities of diverse client devices. This transcoding system is an extension to a web server, and the content adaptation is performed by selecting from a number of different possible transcoded versions of the content. The selection is based on a so-called "subjective measure of fidelity", whose computational mechanism still cannot be determined and whose value must be pre-set for each content item.
Research with implementation
Some research work with implementations of transcoding proxy systems has been carried out by different universities: the University of California at Berkeley, the University of Maryland, and the University of Helsinki, resulting in the GloMop, Mowser, and Mowgli systems respectively.
In the GloMop model described in [10], the proxy performs "distillation" of the image received from the server before sending it to the client. Distillation is defined as highly lossy, real-time, datatype-specific compression that preserves most of the semantic content of an image. The transcoding policy is based on prediction of the transcoding time, using fixed parameters, which is similar to [6].
The Mowser model [SI designs the HTTP proxy server to run on a Mobile Support Station (MSS) and communicate with the Mobile Host (MH). It has a preference lookup table which uses the IP address as index. The approach to transcoding image files is to find out the size of the image and, if it is bigger than the Mobile Host can handle, to scale the image down, reduce its colors, or both, without losing the semantics. The proxy does transcoding not only for the response stream from the server, but also for the request to the server. It also performs so-called HTML transcoding by HTML parsing and active content filtering. HTML parsing is based on the fact that different content in a web page can be identified by its tag [30, 31].
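The idea that content types can be told apart by their HTML tags can be sketched with the standard-library `html.parser` module. This is an illustration in Python, not Mowser's actual code (which the comparison table lists as Perl). The class name and the set of tags treated as "active content" are assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumption for this sketch: these tags mark active content to be filtered.
ACTIVE_TAGS = {"applet", "script", "object", "embed"}


class ContentClassifier(HTMLParser):
    """Collects image URLs and active-content tags while parsing a page."""

    def __init__(self):
        super().__init__()
        self.images = []   # values of <img src=...>
        self.active = []   # names of active-content tags encountered

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])
        elif tag in ACTIVE_TAGS:
            self.active.append(tag)


def classify(html: str):
    """Return (image URLs, active-content tag names) found in the page."""
    parser = ContentClassifier()
    parser.feed(html)
    parser.close()
    return parser.images, parser.active


page = '<html><body><IMG SRC="a.gif"><applet code="X.class"></applet><p>hi</p></body></html>'
print(classify(page))
```

A proxy could then schedule the collected image URLs for transcoding and drop or rewrite the elements whose tags were flagged as active content.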
The Mowgli model [11, 12] consists of two mediators, the Mowgli agent and the Mowgli Proxy, located on the Mobile Host and the mobile-connection host respectively. The Mowgli proxy performs GIF [9] to JPEG conversion, and large embedded images are not transferred at all to the mobile node. The Mowgli agent and Mowgli Proxy use a special protocol, called Mowgli HTTP protocol, to communicate with each other. They reduce the data transfer over the wireless link in three ways: data compression, caching and filtering.
A summary of the previous implementations of the proxy systems is given in Table 1.1.
Implementation Features                            | Mowser                             | Mowgli                             | GloMop
Communication protocol                             | HTTP                               | Mowgli HTTP                        | HTTP
Transcoding policy                                 | Simple threshold of size and color | Simple threshold of size and color | Statistical model for single image
Transcoding objective                              | Single content                     | Single content                     | Single content
Image transcoding method                           | Reduction in size and color        | Reduction in size and color        | Fixed
HTML parsing                                       | Yes                                | Yes                                | Yes
Active content filtering                           | Yes                                | No                                 | No
Video stream transcoding                           | Yes                                | No                                 | Yes
Transcoding in both directions                     | Yes                                | No                                 | No
Coding language                                    | Perl                               | Not Available                      | Perl
Adaptive policy to evaluate transcoding parameters | No                                 |                                    | No
Image content analysis                             | No                                 |                                    | No
Proxy system design                                | No                                 |                                    | No
Table 1.1: Implementation features of existing systems
1.2.2 Commercial Implementations
Intel's Quick Web technology [13] resides on Internet Service Provider (ISP) servers. It compresses images by selectively dropping bits or pixels out of an image and caches data to overcome the problem of bandwidth bulge. The technology is only used when the access to the Internet is through ISPs. Similarly, Spyglass' Prism works on ISPs to carry out the transcoding functions [14].
IBM's Web Express [15] consists of two components: ARTour (Advanced Radio Communication on Tour) Gateway and ARTour Client. The gateway provides secure, compressed data across the selected network with authentication. It can automatically retrieve Web requests in the background while mobile users are performing other tasks.
IBM also developed "Web Intermediaries" (WBI) [16], entities that can be positioned anywhere along the HTTP stream and are programmed to tailor, customize, personalize, or otherwise enhance data as they flow along the stream. WBI of "WebSphere Transcoding Publisher" [17] is server-side software that adapts, reformats, and filters the existing content on the server, to make the data optimally formatted for the destination environment. This means, first, that it is a server-side transcoding scheme and, second, that there is no need to download any content, so the only concern for the transcoding policy becomes optimal formatting. WBI of "Web Browser Intelligence" [18] resides on the client's side and modifies content by observing the user's past activity. WBI of "KidDesk Internet Safe" [19] also resides on the client's side and, together with the browser, keeps kids safe from inappropriate content on the Internet. Those two transcoding schemes can be viewed as client-side transcoding, and are basically not aimed at mobile users.
Since commercial implementations hide the details of the technologies they use, we do not comment on them further.
1.3 Research Objective and Contributions
Previous research work on transcoding proxy systems focuses on the transcoding of individual content inside a web page. This means that the web page is not regarded as one integrated document but rather as a series of separate data items such as texts, image files, sound files and active content. The problem behind this is that the transcoding proxy cannot control the downloading process of the entire web page. For example, if the client wants to download the web page within 10 seconds, it would be impossible for previous proxy systems to find an answer, because they don't see the web page and thus cannot find a transcoding policy to do it.
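Why a whole-page view is needed to meet such a deadline can be seen from a back-of-the-envelope calculation. The Python sketch below assumes an idealized link (no latency, no protocol overhead, no transcoding delay). The function names and the 90,000-byte page size are hypothetical. The 10-second deadline comes from the example above and 14.4 kbps from the mobile link rate quoted in Section 1.1.

```python
def byte_budget(bandwidth_bps: float, deadline_s: float) -> float:
    """Bytes that fit through an idealized link within the deadline."""
    return bandwidth_bps * deadline_s / 8.0


def required_size_ratio(page_bytes: float, bandwidth_bps: float,
                        deadline_s: float) -> float:
    """Fraction of the original page size that may be sent (at most 1.0)."""
    if page_bytes <= 0:
        raise ValueError("page_bytes must be positive")
    return min(1.0, byte_budget(bandwidth_bps, deadline_s) / page_bytes)


# Hypothetical 90,000-byte page over a 14.4 kbps link with a 10 s deadline.
budget = byte_budget(14_400, 10)
ratio = required_size_ratio(90_000, 14_400, 10)
print(f"budget: {budget:.0f} bytes, page must shrink to {ratio:.0%} of its size")
```

The ratio depends on `page_bytes`, the total size of the page and all its embedded items. A proxy that handles each image in isolation never knows this total, so it cannot compute the reduction needed to meet the deadline.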
No previous transcoding proxy system considers the fact that mobile users may browse the Internet for very specific purposes. Since the cost is high, people tend to know what they want before going to the Internet. For example, some may only want to check e-mails, some may focus their interest on the stock market with some simple figures showing the trend of a certain stock, and some may want their browsing completely free of the distraction of advertisements. All other information, while still taking time to download, is therefore useless from the user's point of view. Dynamic browsing, by which the users can define their browsing requirements with every connection, is thus highly desired.
All previous transcoding proxy systems use fixed transcoding parameters to transcode. No adaptation to the user's bandwidth or the user's requirement is obtained. The existing transcoding policies are basically "if then" switches for a fixed number of situations.
Thus in this thesis, the goal is to:
• Propose a method to transcode the entire web page within each connection, which optimizes the entire web page downloading according to the user's requirements, including filtering, transcoding, and web page re-arranging.
• Design an adaptive transcoding policy that can evaluate transcoding parameters adaptively and dynamically for each connection.
• Implement the proposed transcoding proxy server in Java, which is completely portable to all major operating systems.
• Compare the performance and speed of the proposed transcoding system vs. downloading without transcoding.
By achieving these goals, the main contributions of the thesis are:
1. A method of Web Page Transcoding at a proxy server
2. An adaptive transcoding policy for web page transcoding that can evaluate transcoding parameters adaptively for all the images in a web page
3. A system design and Java implementation of the proposed transcoding system
4. An experimental evaluation of the performance and efficiency of the proposed transcoding system
1.3.1 A Method of Web Page Transcoding
In this thesis, a method of "web page transcoding" is introduced to process the entire web page as one object at the proxy server within one connection. This method gives the proxy server the power to decide how to process the entire web page according to the user's requirement. Thus it has total control of the downloading of the web page and complete knowledge of the web page, and consequently is capable of finding a solution for any requirements made by the client regarding the web page.
Web page transcoding makes possible all of the following new transcoding techniques:
• It can optimize the downloading time according to a transcoding policy applied to the whole web page.
• It can reconstruct a new web page according to the user's preference. Unwanted or unimportant images or other content can be filtered or compressed at a much higher ratio.
• The user does not have to send all those requests whose responses will be filtered by the proxy, so the overall information exchange becomes more efficient.
• It can provide the new web page with the correct information about the images, so that browsers such as the Netscape browser can start displaying the correct layout as soon as the new HTML web page is received.
• It supports the dynamic web browsing required by the mobile user. For example, the user can set the parameters for browsing and change them dynamically with every connection. For one connection, the user may want a text-only web page. If something inside this web page seems interesting, the user may click it, creating a new connection with new parameters that will bring a more viewable version of this content.
• More efficient caching can be expected for the proxy system. Processing the web page as a whole makes the system easy to manage and fast to access. This is because the system only needs to recognize the web page link; all the links inside the web page are automatically associated with it. Thus the system knows right away where to find a group of images with only one link (the link of the web page) as index.
1.3.2 An Adaptive Transcoding Policy
The essential purpose of the adaptive transcoding policy for web page transcoding is to decide whether to transcode and how to transcode. The theoretical analysis of the adaptive transcoding policy in [6] is based on the transcoding of a single image file, and it uses fixed transcoding parameters to decide whether to transcode. The same holds for the research and implementation of [10]. In fact, however, how to transcode (changing the transcoding parameters) will affect the decision of whether to transcode. In terms of web page transcoding, the new transcoding policy also needs to consider the different importance of different content, for example, different images.

This research introduces a new adaptive transcoding policy that decides the transcoding parameters adaptively for all content inside a web page according to its different importance, and simultaneously decides whether to transcode with varying transcoding parameters.

This research also improves the method of image content analysis by designing a new image content decision tree. The proposed transcoding policy treats images differently according to their content.
1.3.3 A System Design and Java Implementation
A system design is given for the proposed transcoding proxy server. From the concept of web page transcoding, the system design requires the design of an adaptive transcoding policy, filtering, and HTML parsing and reconstructing. From the point of view of a software proxy system, it further requires the monitoring and control of the simultaneous downloading and processing of multiple content items within one web page. Furthermore, some important system design features regarding the software server are considered, for example, the system design for self-extensible multiple services, multiple user connections, and multiple protocol support. The proxy system is designed to use TCP sockets, with a multi-thread structure.

The proxy system is implemented in Java, which is completely portable to most operating systems. The implementation uses object-oriented programming techniques, which make it very easy to modify and add function classes.
1.3.4 An Experimental Evaluation
An experimental evaluation of the performance and efficiency of the proposed transcoding system is given. The experiments include the following. Tests of image content analysis are run to evaluate the new image content decision tree. Tests of the proposed transcoding policy are run to evaluate transcoding parameters adaptively for different images in a web page. Tests of the transcoding proxy server are run to compare the downloading time with transcoding vs. without transcoding. Tests of downloading time components are run for different transcoding scenarios. Finally, transcoded web pages are evaluated for visual effect in different transcoding scenarios.
1.4 Thesis Structure
There are 6 chapters in the thesis:
• Chapter 1 is the "Introduction" to the research. It gives a description of the research problem, the literature survey, and the objective and contributions of this research.
• Chapter 2 gives the background of the proxy server design. It discusses the functionality of proxy servers and describes the HTTP protocol and the definition of an HTTP proxy server. It also gives a brief description of TCP sockets.
• Chapter 3 gives the system design of the proposed transcoding proxy system with the design of web page transcoding. The functions of each module are described in detail. It also introduces the services that the proposed transcoding proxy server can provide.
• Chapter 4 designs an adaptive transcoding policy for web page transcoding. The proposed transcoding policy can evaluate transcoding parameters for each image differently and adaptively in the sense of optimal downloading for the entire web page.
• Chapter 5 describes how the proposed transcoding proxy server system is implemented. Different experiments are run to compare the performance and efficiency of the transcoding system vs. downloading without transcoding.
• Chapter 6 gives the conclusion of the research and suggests some future work.
Chapter 2
Background
In this chapter, we present the background knowledge for our research. The content includes:
• Web Proxy Servers
• The HTTP Protocol
• HTTP Proxy Servers
• TCP Sockets
2.1 Web Proxy Servers
At the beginning of web history in 1990, proxy servers were originally referred to as gateways. The first such generic WWW gateway was written by the WWW team at CERN (a European high-energy particle physics research center in Switzerland), headed by the inventor of the World Wide Web, Tim Berners-Lee [23].
The term "gateway" has traditionally been used to refer to devices that forward packets between networks, sometimes converting between protocols. In 1993, the term web proxy server was chosen as the preferred term for these web gateways, to make a better distinction between Internet/firewall gateways ("proxies"), which allow web-related traffic to enter secured intranets, and information gateways ("gateways"), which interface third-party information systems to the web. The Internet proxy server was given the name proxy server to better reflect the fact that it acts on behalf of the client. Information gateways, on the other hand, act on behalf of the server. Therefore they are sometimes referred to as reverse proxies.
There are several types of proxy servers, as listed below.
• Generic firewall proxies: The generic proxy server is the most common type of proxy server, and can handle traffic with different protocols, including HTTP, FTP, and Gopher. Generic proxy servers are able to provide access control, filtering, logging, and caching features.
• Departmental proxies: Departmental proxy servers are generic firewall proxies, except that their base is narrower: a single department of a large corporation or institution. Departmental proxy servers are daisy-chained to firewall proxy servers, probably with different restrictive access control, to construct two layers of proxies.
• Personal proxies: Personal proxy servers are trimmed-down proxy servers intended for individual users only. They typically run on the same host as the client program. Features provided by personal proxies include local caching, active cache updates, polling for changes and notification about them, personal hot list management, and local searches.
• Specialized proxies: Specialized proxy servers are a diverse group that perform specialized actions appropriate for the target environment. An example is a proxy server serving client software running on a palmtop device. The proxy may reduce image quality and the number of colors used and convert the image to a format understood by the palmtop computer.
The general properties of proxy servers are:
• Transparency to client: Aside from any filtering performed on proxies, they do not affect the end result. Users will get the same response whether the connection was direct or through a proxy server.
• Flexibility: The client determines whether to use a proxy or not.
• Transparency to server: The destination server is unaffected by any intermediate proxy servers and, often, completely unaware of them.
2.2 The HTTP Protocol
HTTP is short for HyperText Transfer Protocol. While the World Wide Web consists of, and is built on top of, a number of different protocols, HTTP is the primary protocol used for transferring web documents. HTTP is a request/response protocol. The client sends a request to the server, and the server sends back a response. There are no multiple-step handshakes at the beginning as with some other protocols, such as FTP.
In the presence of proxy servers, the client may send a request to the proxy server, and the proxy server will forward the request to the server or to another proxy. Thus a request chain is formed, and the response follows the same chain in reverse order.
An HTTP request consists of a method, a target URL, a protocol version identifier, and a set of headers. The method specifies the type of operation, among which GET is the most common one. Headers contain additional information about the request. An HTTP response consists of a protocol version identifier, a status code, a human-readable response status line, response headers, and the requested resource content.
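As a small illustration of the response structure just described, the following Java sketch splits the first line of a response into its protocol version identifier, status code, and human-readable text. The class name StatusLine and its fields are hypothetical names chosen for this example; they are not taken from the thesis implementation.

```java
/** Hypothetical value class for the first line of an HTTP response. */
final class StatusLine {
    final String version; // protocol version identifier, e.g. "HTTP/1.0"
    final int code;       // numeric status code, e.g. 200
    final String reason;  // human-readable text, e.g. "OK"

    private StatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    /** Parses a line such as "HTTP/1.0 200 OK". */
    static StatusLine parse(String line) {
        // Split into at most three parts so the reason text may contain spaces.
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + line);
        }
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], Integer.parseInt(parts[1]), reason);
    }
}
```

The headers and the resource content that follow the status line are not handled here; a proxy would read them from the same stream after this line.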
The first version of HTTP, referred to as HTTP/0.9, supported only the "GET" method for requests. The first version was sufficient for retrieving documents, but it provided no authentication or access control features other than those based on the IP address and the DNS host and domain names of the requesting client. The HTTP/0.9 response contained only the requested document, with no additional information.
The HTTP/1.0 protocol is documented in the Informational RFC 1945 [24], which introduced an extended format for requests and responses, allowing more data to be passed in both directions. After the actual request, a set of header fields follows to provide additional information, such as authentication credentials, to be passed to the server. The HTTP header section is similar to Multipurpose Internet Mail Extensions (MIME).
Correspondingly, the response also includes a status line and its own header section in addition to the document. The header section in the response may contain information such as the type of the document and its length.
Some major improvements that the HTTP/1.1 protocol makes are [25]:
• Persistent connection: HTTP/1.1 allows connections to remain open over several requests, for the server implicitly knows its hostname, port number, and the protocol it uses.
• Request pipelining: Used in conjunction with persistent connections, request pipelining reduces latency between requests and responses and delivers better perceived performance.
• Cache control: This was one of the biggest missing features in HTTP/1.0. HTTP/1.1 introduces a variety of directives that can be used to control caching on proxies and clients.
• Formalized validation model (conditional requests): HTTP/1.0 only supported the conditional GET feature to perform up-to-date checks. HTTP/1.1 formalizes the HTTP validation model and provides validators, instead of just the last modification date and time used by HTTP/1.0.
• Content variants: HTTP/1.1 provides the basic utilities for associating multiple representations of a resource under a single URL, which is very useful when providing one resource in multiple languages or different formats.
• Protocol tracing: HTTP/1.1 specifies a new request method, TRACE, which is useful in debugging proxy chains (more than one proxy chained together between client and origin server).
2.3 HTTP Proxy Servers
The proxy server defined in HTTP/1.0 is an intermediary program that acts both as a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them, with possible translation, to other servers. A proxy must interpret and, if necessary, rewrite a request message before forwarding it.
When HTTP/1.0 is used to communicate between the client and proxy servers, as well as between proxies, the request is very different from that used between the client and the origin server. For a request made through a proxy, the requested URL is used in full form, including the protocol prefix, hostname, and the optional port number, while these are omitted when the request is sent directly to the origin server from the client. For example, a request for the URL http://www.comm.utoronto.ca/faculty.html from a client to a proxy would look like:
GET http://www.comm.utoronto.ca/faculty.html HTTP/1.0
User-agent: tesla
Accept: text/html, image/gif, image/jpeg
But when forwarded to the origin server by the proxy, the request is rewritten to include only the path part of the URL:
GET /faculty.html HTTP/1.0
User-agent: tesla
Accept: text/html, image/gif, image/jpeg
Forwarded: by http://myproxy.comm.utoronto.com:5080
The "Forwarded:" header is added to indicate that the request received was passed on by a proxy server. We note that in HTTP/1.1 the header Via: is used for the same purpose.

We also note that it is actually not desirable for the request to be written in short form, owing to the rapid development of web technology. For example, many companies and individuals might share their web server addresses with others, since they cannot afford, or do not want, to spend money on dedicated web server hardware. Two solutions can cope with this problem: the Host: header (to distinguish among different web sites) and the full URL in the request.
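The rewriting step described in this section can be sketched in Java as follows. This is only a minimal illustration under stated assumptions: the class RequestLineRewriter and its method names are hypothetical, only the request line is handled, and the standard java.net.URI class is used to split the full URL.

```java
import java.net.URI;

/** Hypothetical helper: rewrites a request line received by a proxy for the origin server. */
final class RequestLineRewriter {

    /** Converts "GET http://host/path HTTP/1.0" into "GET /path HTTP/1.0". */
    static String toOriginForm(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + requestLine);
        }
        URI target = URI.create(parts[1]);
        String path = target.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // a URL without a path refers to the root resource
        }
        if (target.getRawQuery() != null) {
            path = path + "?" + target.getRawQuery();
        }
        return parts[0] + " " + path + " " + parts[2];
    }

    /** Builds a Host header from the full URL, so that shared servers can tell sites apart. */
    static String hostHeader(String requestLine) {
        URI target = URI.create(requestLine.trim().split("\\s+")[1]);
        int port = target.getPort(); // -1 when the URL carries no explicit port
        return "Host: " + target.getHost() + (port == -1 ? "" : ":" + port);
    }
}
```

Applied to the request line of the example above, toOriginForm yields the short form sent to the origin server, and hostHeader preserves the host name that the short form drops.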
2.4 TCP sockets
The innovation of sockets allows the programmer to treat a network connection as another stream that bytes can be written to or read from. A socket is the programming interface between the upper three layers (the "application") and the transport layer.
In the Internet protocol suite, the network layer protocol is IPv4 or IPv6. The transport layer protocols that can be chosen are TCP and UDP, corresponding to TCP sockets and UDP sockets. Note that there is a gap between UDP and TCP in Figure 2.1; this gap indicates that it is possible for an application to bypass the transport layer and use IPv4 or IPv6 directly. This direct access corresponds to a raw socket [26].
[Figure: the OSI model (Application, Presentation, Session, Transport, Network, Datalink, Physical) beside the Internet protocol suite (Application; UDP, TCP; IPv4, IPv6; device driver and hardware), with the upper layers marked as application details and the lower layers as communication details.]
Figure 2.1: Layer correspondence between OSI and IP
The upper three layers of the OSI model are merged into a single layer called the application in the Internet protocol suite. This can be the web client (browser), the Telnet client, the web server, the FTP server, or any other application, e.g., in this thesis, the transcoding proxy server. With the Internet protocols there is rarely any distinction between the upper three layers of the OSI model.
Data is transmitted across the Internet in packets of finite size called datagrams. Each datagram contains a header and a payload. The header contains the address and port the packet is going to, the address and port the packet comes from, and various other information used to ensure reliable transmission. The payload contains the data itself. Packets can be lost or corrupted in transit and need to be retransmitted, or they may arrive out of order. Keeping track of this (splitting the data into packets, generating headers, parsing the headers of incoming packets, keeping track of the sequence of the packets, etc.) is therefore a lot of work and requires a lot of intricate software [27].

Thus there are two reasons for the socket design. First, the upper three layers handle all the details of the application (FTP, Telnet, or HTTP, for example) and know little about the communication details. The lower four layers, on the other hand, know little about the application but handle all the communication details: sending data, waiting for acknowledgement, sequencing data that arrives out of order, calculating and verifying checksums, and so on. The second reason is that the upper three layers often form what is called a user process, while the lower four layers are normally provided as part of the operating system kernel. Many contemporary operating systems provide this separation, for example, Unix, Windows, and the Macintosh.
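To illustrate how a socket lets the application treat a connection as an ordinary stream, the following Java sketch sends one line of text over a loopback TCP connection and reads the reply. It is a minimal sketch, not the thesis implementation: the class SocketStreamSketch and the "echo: " reply format are assumptions made for this example, and only standard java.net and java.io classes are used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: one request line and one reply line over a loopback TCP connection. */
final class SocketStreamSketch {

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // The second argument enables auto-flush, so println() sends the line immediately.
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.US_ASCII), true);
    }

    static String echoOnce(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverSide = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    writer(peer).println("echo: " + reader(peer).readLine());
                } catch (IOException e) {
                    // Sketch only: on failure the client simply sees a closed connection.
                }
            });
            serverSide.setDaemon(true);
            serverSide.start();

            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                writer(client).println(message);    // write bytes to the stream
                return reader(client).readLine();   // read bytes from the stream
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Neither side of the exchange deals with packets, sequence numbers, or retransmission; those communication details stay below the socket interface.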
2.5 Glossary
• Resource: A file, HTML document, image, applet, or any other object addressable by a single URL.
• URL: Uniform Resource Locator, a World Wide Web resource address, for example, http://www.comm.utoronto.ca/faculty.html. It contains three parts: the protocol (http), the DNS (domain name system) name of the machine on which the web page is located (www.comm.utoronto.ca), and the file name (faculty.html).
• Client: The client side of a request-response transaction: the client side makes the request, and the server side responds. The client may be the web navigation software program, such as Netscape Navigator or Internet Explorer. However, a proxy server acting as a client may also be referred to as a "client".
• Server: A program accepting and servicing requests from clients. A server may be an origin server (content provider) or a proxy server.
• Proxy Server: An intermediary server that accepts requests from clients and forwards them to other proxy servers or the origin server, or services the request from its own cache. A proxy acts both as a server and as a client: the proxy is a server to the client connecting to it and a client to the servers that it connects to.
• Host: A physical computer, running client, server, proxy, or other software.
2.6 Summary
This chapter gives a basic description of what a proxy server does and why the proxy server, rather than the origin server or the client itself, is used to do transcoding. For the purpose of transcoding in the application layer, the HTTP protocol and the definition of an HTTP proxy server are also analyzed. TCP sockets are used for the client, the proxy, and the server to communicate at the level of the application layer.
Chapter 3
System Design of Proposed
Transcoding Proxy Server
In this chapter, we present the design of the proposed transcoding proxy server. The design of the proxy server system requires both network programming and the design of web page transcoding. The content includes:
• Design Fundamentals
• System Design for Proposed Proxy System
• Main Flow Chart
• Single File Process Thread
• Web Page Process Thread
3.1 Design Fundamentals
The "Transcoding Proxy Server" is in the application layer of the Internet protocol suite [26], as indicated in Figure (2.1).

The "Transcoding Proxy Server" uses TCP sockets to write robust client and server programs that are integrated in the proxy. The reason is basically that the service provided by TCP [28] is connection-oriented. This means that TCP provides connections between clients and servers. A TCP client establishes a connection with a given server, exchanges data with the server across the connection, and then terminates the connection. In addition:
• TCP provides reliability: when TCP sends data to the other end, it requires an acknowledgement in return. If no acknowledgement is returned, TCP automatically retransmits the data and waits a longer amount of time. This retransmission will repeat until the acknowledgement is received or TCP decides to give up. TCP contains an algorithm to estimate the round-trip time (RTT) between a client and server dynamically so that it knows how long to wait for the acknowledgement.
• TCP sequences the data by associating a sequence number with every byte that it sends.
• TCP provides flow control. TCP always tells its peer exactly how many bytes of data it is willing to accept. This is called the advertised window. At any time, the window is the amount of room available in the receiving buffer, guaranteeing that the sender cannot overflow the receiver's buffer.
• A TCP connection is a full-duplex connection. This means that an application can send and receive data in both directions on a given connection at any time. TCP keeps track of state information such as sequence numbers and window sizes for each direction of data flow: both sending and receiving.
As the interface between the server and the client, the proxy behaves as both server and client. That is, from the point of view of the server, the proxy acts like a client; and from the point of view of the client, the proxy acts like a server. In addition, the proxy also performs certain types of data processing, for example, web page transcoding. The type of data processing depends on which type of service the client user wants. In other words, the proxy provides multiple services on different ports, and it is the client's choice to decide which service it needs. To be able to accept new services created by users, the proxy is designed to have services that are extensible at run time, which means that the proxy does not have to know any particular service in advance in order to provide it. Furthermore, the proxy is able to handle many users at the same time for a certain service, and the proxy is thus designed to use multiple threads for processing the requests.
We also design the communication protocol between the client and the proxy to always be HTTP. No matter which particular protocol the client really wants to use, the HTTP request sent via the proxy by the client wraps the non-HTTP URL of the server inside the HTTP request. For example, an FTP URL, ftp://somesite/somefile, is requested from the proxy server as:
GET ftp://somesite/somefile HTTP/1.0
User-agent: tesla
Accept: text/html, image/gif, image/jpeg
It is then the proxy server's responsibility to parse the request and contact the FTP server in the corresponding protocol through the protocol handler module.
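The parsing step that precedes the choice of protocol handler can be sketched as follows. This is a hedged illustration, not the thesis code: the class ProxyRequestLine and its fields are hypothetical, and the default ports listed (80 for HTTP, 21 for FTP, 70 for Gopher) are the well-known ports of those protocols rather than values taken from the thesis.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical holder for the pieces of a request line that a protocol handler would need. */
final class ProxyRequestLine {
    final String method;
    final String protocol;
    final String host;
    final int port;
    final String file;

    private ProxyRequestLine(String method, String protocol, String host, int port, String file) {
        this.method = method;
        this.protocol = protocol;
        this.host = host;
        this.port = port;
        this.file = file;
    }

    /** Parses a line such as "GET ftp://somesite/somefile HTTP/1.0". */
    static ProxyRequestLine parse(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + requestLine);
        }
        URI url = URI.create(parts[1]);
        if (url.getScheme() == null || url.getHost() == null) {
            throw new IllegalArgumentException("Full URL required via proxy: " + parts[1]);
        }
        String protocol = url.getScheme().toLowerCase(Locale.ROOT);
        int port = url.getPort() != -1 ? url.getPort() : defaultPort(protocol);
        String path = url.getRawPath();
        String file = (path == null || path.isEmpty()) ? "/" : path;
        return new ProxyRequestLine(parts[0], protocol, url.getHost(), port, file);
    }

    /** Well-known default ports; -1 means the protocol is unknown to this sketch. */
    static int defaultPort(String protocol) {
        switch (protocol) {
            case "http":   return 80;
            case "ftp":    return 21;
            case "gopher": return 70;
            default:       return -1;
        }
    }
}
```

The protocol field is what a protocol handler module would inspect to decide how to contact the server, while the host, port, and file identify what to fetch.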
3.2 System Design for Proposed Proxy System
We design the new transcoding proxy server as indicated in Figure (3.1). The design of the new system includes two major parts: a network programming part and a web page transcoding part. To make sure that the client, the proxy server, and the server are able to communicate with each other, network programming is required, and this basically includes the "Virtual Server" module and the "Virtual Client" module. To make transcoding possible, the "Request Processing" module and the "Data Processing" module (based on that proposed in [6]) are designed.
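[Figure: block diagram of the transcoding proxy server, including the Virtual Client and Request Processing modules, placed between the HTTP server and the client; the request via proxy and the transcoded web page pass between the client and the proxy.]
Figure 3.1: Proxy server system design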
The modules inside Figure (3.1) are designed to perform the following functions:
Virtual Server
Virtual Server listens on particular ports (corresponding to different services) and receives requests from different clients. It then creates a socket to handle each request.
Request Processing
Request Processing reads the request from the socket created by Virtual Server and parses the request. It has two functions:
• Request Parsing: Request parsing finds out the URL of the original request, including the protocol, host, and file name. It then passes this information to the Protocol Handler module of Virtual Client to make a connection to the original server.
• Preference Parsing: The user will send a special line indicating its preference in the HTTP request header. This preference will be stored as the reference for this connection and used by the Transcoding Policy Manager module of Data Processing to decide the transcoding policy for images in the web document.
Virtual Client
Virtual Client gets the parsed request and sends a corresponding connection request to the original server. It then receives the response and saves it locally at the proxy. Virtual Client has the following functions:
• Protocol Handler: The protocol handler makes the proxy capable of handling different protocols. Thus it allows the user to use any existing protocol, or even define its own protocol, by incorporating the corresponding protocol class at run time, which makes the proxy both flexible and extensible.
• Client Socket: While server sockets always wait for a connection, client sockets actually initiate connections. Each socket has two values to identify the connection endpoint: an IP address and a port number. After the protocol handler finds out the protocol that the client wishes to use to connect to the server, it opens this client socket with the host address and port number and starts the connection with the corresponding protocol.
• HTML Parser: The HTML parser is used to find out the linked content of the web document, for example, images, text, active classes, etc. Whenever a new link to some content is found, it will require the MultiConnection Manager module to create a new connection for downloading that content.
• MultiConnection Manager: The MultiConnection manager gets the link and the host, and opens a new thread to handle the download of this link. The thread will consequently activate the protocol handler to create a new client socket. The MultiConnection manager monitors the downloading of each web content item and then invokes the content handler for handling the different content of the web document.
• Content Handler: Different content of the web document will be treated differently. Basically, the text/html part of the document will be sent to the client without modification, the active class part of the document will be filtered, and the images in the document will be transcoded based on their content and the transcoding policy given by the Transcoding Policy Manager module in Data Processing. Furthermore, for images of different formats, for example GIF, a corresponding decoder should be used in the content handler to decode the image after it is downloaded in the MultiConnection Manager module. The decoded image can then be sent for content analysis and transcoding.
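The three treatments applied by the content handler can be summarized in a short Java sketch. This is only an illustration under stated assumptions: the class ContentHandlerSketch and its Action enum are hypothetical, and content is classified here by file extension (.jpg, .jpeg, and .gif for images, and, as an assumption of this example, .class for active classes), whereas the actual proxy may classify content differently.

```java
import java.util.Locale;

/** Hypothetical sketch of how a content handler might choose a treatment per content item. */
final class ContentHandlerSketch {

    enum Action {
        PASS_THROUGH, // e.g. text/html: sent to the client without modification
        FILTER,       // e.g. active classes: removed from the page
        TRANSCODE     // images: decoded, analyzed, and transcoded
    }

    /** Chooses an action from the path of a linked content item (assumed extension-based). */
    static Action actionFor(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg") || lower.endsWith(".gif")) {
            return Action.TRANSCODE;
        }
        if (lower.endsWith(".class")) {
            return Action.FILTER;
        }
        return Action.PASS_THROUGH;
    }
}
```

In the design described above, the TRANSCODE branch would additionally select a decoder matching the image format before content analysis; that step is omitted here.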
Data Processing
Data Processing processes the response saved by Virtual Client. It can perform single file transcoding as well as web page transcoding. Data Processing has the following functions:
• Transcoding Policy Manager: The transcoding policy manager is responsible for deciding the transcoding policy for each image in the web page adaptively. The decision is based on the user's preference as well as the content of each image. The policy sets two parameters for each image: the down-scaling ratio (R) and the color reduction (in bits, C). If the scaling ratio becomes zero, it actually means that the image is filtered out.
• Content Analyzer: The content analyzer finds out the content of an image by its functionality, for example, whether the image is for decoration, for clicking, for information, for advertisement, etc.
• Image Transcoding: Image transcoding actually has two parts. The first part transcodes the image according to the corresponding transcoding parameters for this type of image. The second part invokes an encoder to encode the image back to its original format; for example, for the GIF format, a GIF encoder will be called.
• HTML Writer: The HTML writer composes the web document again, after the unwanted or unimportant content has been filtered and the wanted images transcoded. The resulting HTML file will also have all its links rewritten as full URLs, instead of relative URLs, pointing to the transcoded images at the proxy server.
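The link rewriting performed by the HTML writer can be illustrated with the following Java sketch, which turns relative image links into full URLs. It is a simplified sketch under stated assumptions: the class HtmlLinkRewriter is hypothetical, only double-quoted src attributes of img tags are handled, and links are resolved against the URL of the page itself. How the proxy names the transcoded copies it serves is not specified in this section, so that mapping is not attempted here.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: rewrites relative image links in an HTML page as full URLs. */
final class HtmlLinkRewriter {

    // Matches <img ... src="..."> and captures the text before, inside, and after the quotes.
    private static final Pattern IMG_SRC = Pattern.compile(
            "(<img\\b[^>]*?\\bsrc\\s*=\\s*\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    /** Resolves one link against the URL of the page that contains it. */
    static String toFullUrl(URI pageUrl, String link) {
        return pageUrl.resolve(link).toString();
    }

    /** Returns the HTML text with every matched image link replaced by its full URL. */
    static String rewriteImageLinks(URI pageUrl, String html) {
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replaced = m.group(1) + toFullUrl(pageUrl, m.group(2)) + m.group(3);
            // quoteReplacement keeps '$' and '\' in URLs from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replaced));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A regular expression is used only to keep the sketch short; a full HTML writer would work from the structure produced by the HTML parser instead.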
3.3 Main Flow Chart of Proposed Proxy Software
System
The main flow chart of the proxy software system is indicated in Figure (3.2).

[Figure: flow from proxy start and proxy initialization to the looping services (HTTP Proxy Service, Service #2, ...), each of which spawns multiple proxy service threads for Client #1, Client #2, ..., Client #n.]
Figure 3.2: Main flowchart of the software proxy server system

The proposed proxy system is designed with a nested multithread structure. Each time the Virtual Server of the proxy identifies a service, it first checks whether the communication port that this service is going to use has been occupied by some other service. If so, this service cannot be added to the requested port. If no other service uses the port, or in other words, the port is available, Virtual Server creates a thread that has a server socket to listen on that port. This is the first layer of threads, for running multiple services. Whenever a request for a certain type of service is "heard", the corresponding server socket will create a service thread. Each service thread handles the communication between the proxy and the client and the communication between the proxy and the server. It also handles the processing of a web content item (or web page). When multiple users send multiple requests for a service, multiple service threads, the second layer of threads, run at the same time, as indicated in Figure (3.2). Inside each service thread, a third layer of threads may be used. For example, MultiConnection Manager creates multiple threads to download multiple content items inside a web page within a proxy service thread.
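The first two layers of this nested thread structure can be sketched in Java as follows. This is a minimal sketch rather than the thesis implementation: the class VirtualServerSketch, its Service interface, and the convention of returning -1 for an occupied port are all assumptions made for illustration. The third layer (the download threads of MultiConnection Manager) would be started inside a service and is not shown.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the first two thread layers: one listener per service, one thread per client. */
final class VirtualServerSketch implements AutoCloseable {

    /** A service handles one client connection; the name and shape are assumptions. */
    interface Service {
        void serve(Socket client) throws IOException;
    }

    private final Map<Integer, ServerSocket> listeners = new ConcurrentHashMap<>();

    /**
     * Adds a service on the requested port (0 lets the operating system pick one).
     * Returns the bound port, or -1 if the port is already occupied.
     */
    synchronized int addService(int requestedPort, Service service) {
        if (requestedPort != 0 && listeners.containsKey(requestedPort)) {
            return -1; // another service of this proxy already uses the port
        }
        final ServerSocket listener;
        try {
            listener = new ServerSocket(requestedPort);
        } catch (IOException portUnavailable) {
            return -1; // the port is held by some other program
        }
        int port = listener.getLocalPort();
        listeners.put(port, listener);
        // First layer: one listening thread per service.
        Thread listenThread = new Thread(() -> acceptLoop(listener, service), "service-" + port);
        listenThread.setDaemon(true);
        listenThread.start();
        return port;
    }

    private static void acceptLoop(ServerSocket listener, Service service) {
        while (true) {
            final Socket client;
            try {
                client = listener.accept();
            } catch (IOException listenerClosed) {
                return; // accept() fails once close() has closed the listener
            }
            // Second layer: one service thread per client request.
            Thread worker = new Thread(() -> {
                try (Socket c = client) {
                    service.serve(c);
                } catch (IOException e) {
                    // Sketch only: a real proxy would send an error message to the client.
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    @Override
    public synchronized void close() {
        for (ServerSocket listener : listeners.values()) {
            try {
                listener.close();
            } catch (IOException ignored) {
                // nothing useful to do while shutting down
            }
        }
        listeners.clear();
    }
}
```

Creating one thread per request keeps the sketch close to the description above; a thread pool would be a reasonable alternative where the number of simultaneous clients must be bounded.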
As a server, the proxy can have any services that a standard server has. For the purpose of transcoding, two designated classes of services are provided by the proposed transcoding proxy server. Each of them can carry out the requirement of transcoding; the difference is whether it is for a single file (same as previous research) or for the entire web page (new concept). The first service, "single file process", is designed for the proxy to transcode one content item within one connection. The requested file may be a text/html file or an image file that has an extension of .jpg, .jpeg, or .gif. The second service, "web page process", is designed for the proxy to transcode the entire web page within one connection.

Each service can be put into the nested multithread flowchart, as indicated in Figure 3.2, as a thread, the proxy service thread. For "single file process", the corresponding proxy service thread is the single file process thread; for "web page process", the corresponding proxy service thread is the web page process thread. The proxy service thread controls the processing of both the request from the mobile client and the response from the server. Thus it activates and monitors Request Processing, Virtual Client, and Data Processing. It is also responsible for detecting any errors and sending the corresponding error messages to the client during the entire process.
3.4 Single File Process Thread
The "single file process" thread is the thread that transcodes a single file within each connection. It is equivalent to the service that previous proxy systems can provide. The single file process thread goes through the following steps to perform transcoding:
1. Request Processing: parsing the HTTP request
2. Virtual Client: downloading the single content
3. Data Processing: transcoding the single content
4. Single file process thread: sending the transcoded content
3.4.1 Request Processing
Request Processing reads the MIME header of the request sent by the client. It parses the URL of the HTTP request that the client makes via the proxy. Such a request has the form: GET someprotocol://somesite/somefile HTTP/1.0. Request Processing then extracts the original URL, which is someprotocol://somesite/somefile, and sends this information to Virtual Client.
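The parsing step described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names `parse_proxy_request_line` and `file_kind` are hypothetical, and deciding the file type purely from the extension (.jpg, .jpeg, .gif) is an assumption based on the extensions named in Section 3.3.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions treated as images (taken from the list in Section 3.3).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif"}


def parse_proxy_request_line(line):
    """Split 'GET <absolute-URL> HTTP/1.0' into (method, url, version)."""
    parts = line.strip().split()
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % line)
    method, url, version = parts
    target = urlsplit(url)
    # A request sent through a proxy carries the full URL, not only a path.
    if not target.scheme or not target.netloc:
        raise ValueError("proxy request needs an absolute URL: %r" % url)
    return method, url, version


def file_kind(url):
    """Return 'image' or 'text/html', judged only by the file extension."""
    extension = posixpath.splitext(urlsplit(url).path)[1].lower()
    return "image" if extension in IMAGE_EXTENSIONS else "text/html"
```

For example, `parse_proxy_request_line("GET someprotocol://somesite/somefile HTTP/1.0")` yields the method, the original URL to hand to Virtual Client, and the protocol version.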
3.4.2 Virtual Client
From Request Processing, the type of the file requested by the client is parsed and known. If the file is an image, transcoding is needed, and the image has to be saved at the proxy first. Thus, after Virtual Client opens a connection with the original server, it uses a client socket (at the proxy) to download the image to local storage at the proxy server.
If the file is text/HTML, no transcoding is needed for the moment (for simplicity, we do not consider here the situation in which transcoding has to filter out some unwanted content inside the HTML file). Thus the proxy server jumps directly to the last step, sending the text/HTML directly. It opens a pair of threads inside the proxy service thread: one communicates with the server, while the other communicates with the client. These two threads return when one of them finishes the data exchange and notifies the other.
Figure 3.3: Single file process thread (flowchart: thread initialization; Request Processing parses the URL, finds and saves the filename and the preference, and finds the file's content type; Virtual Client creates a URLConnection by protocol handler and checks that it is valid; an image is saved to the proxy, after which Data Processing decides the transcoding policy, analyzes the image content, and transcodes the image, and the MIME header and transcoded image are sent to the client; for text/HTML, the MIME header is sent and two threads are created, one communicating with the server and one with the client; on any error, error information is sent to the client, the connections with server and client are closed, and the service finishes)
3.4.3 Data Processing
A simple method of setting transcoding parameters and transcoding is used to transcode the single downloaded image, just as previous systems do. The next step, sending the transcoded image, is carried out by the single file process thread itself after it finds that the image has been successfully transcoded.
3.5 Web Page Process Thread
The "web page process" thread is the thread that transcodes the entire web page within each connection, as indicated in Figure (3.4) and (3.5). Note that when the mobile client requests a single image through this "web page process" service, the service becomes the same as the "single file process" service because they go through exactly the same procedure. This means that "web page process" can completely substitute for "single file process" even when only a single image file is required.
In order to perform web page transcoding, the proxy server needs a thorough understanding of the web page and total control of downloading and processing the web page. This requirement results in the design of the modules of Request Processing, Virtual Client, Data Processing, and the web page process thread itself. The web page process thread goes through the following steps to perform web page transcoding:
1. Request Processing: parsing the HTTP request and user preference
2. Virtual Client: downloading the HTML file
3. Virtual Client: parsing the HTML file
4. Virtual Client: downloading the multi-content linked by the HTML file
5. Data Processing: analyzing image content
6. Data Processing: evaluating transcoding parameters adaptively
7. Data Processing: transcoding all images with corresponding transcoding parameters
8. Data Processing: reconstructing the new web page
9. Web page process thread: sending the transcoded web page to the client
3.5.1 Request Processing
In order to transcode the web page, the proxy needs to know how the client wants the web page transcoded. Thus Request Processing not only needs to parse the HTTP request, but also needs to parse the preference sent by the client. The detailed definition of user preference is given in Chapter 5, the implementation part.
3.5.2 Virtual Client
In order to transcode the web page, Virtual Client has to carry out several tasks.
The first step: Virtual Client makes a connection to the original server, using the URL that was parsed by Request Processing, as indicated in Figure (3.4).
The second step: Virtual Client parses the HTML file (the web page), as indicated in the starting part of Figure (3.5). The original web page may contain some URL tags that link to content such as images, active content, sound files, etc. The HTML parser detects all these tags and finds the corresponding file types. The tags that link to active content and sound files are not saved for later downloading by the MultiConnection manager, since most mobile users would not want this content when browsing the Internet. All the tags for images are saved for further downloading by the MultiConnection manager.
The third step: Virtual Client now downloads the multiple content using its MultiConnection manager. The MultiConnection manager gets all the URL links that were saved by the HTML parser, changes all the relative URLs into absolute URLs, and checks whether each URL is valid. If the URL is valid, the MultiConnection manager creates a sub-thread for handling this URL.
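The second and third steps can be illustrated with a short sketch using Python's standard HTML parser. It is only an approximation of the behaviour described: the class name `ImageLinkCollector` and function `collect_image_urls` are hypothetical, only `<img src=...>` tags are collected, and "valid" is assumed to mean an http or https URL with a host name.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ImageLinkCollector(HTMLParser):
    """Collect image links; tags for applets and sounds are simply ignored."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            # Change a relative URL into an absolute one.
            self.image_urls.append(urljoin(self.base_url, attributes["src"]))


def collect_image_urls(html, base_url):
    """Return the absolute, valid-looking image URLs found in the page."""
    collector = ImageLinkCollector(base_url)
    collector.feed(html)
    valid = []
    for url in collector.image_urls:
        parts = urlsplit(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
    return valid
```

Each URL returned by `collect_image_urls` would then be handed to one downloading sub-thread.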
Figure 3.4: Web page process service thread (flowchart: thread initialization; Request Processing parses the URL, finds and saves the filename and the preference, and finds the file's content type; Virtual Client creates a URLConnection by protocol handler and checks that the URL is valid; a single image is saved to the proxy, after which Data Processing decides the transcoding policy, analyzes the image content, and transcodes the image, and the MIME header and transcoded image are sent to the client; a web page goes to web page processing, detailed in Figure 3.5; on any error, error information is sent to the client, the connections with server and client are closed, and the service finishes)
Figure 3.5: Web page processing (flowchart: Virtual Client saves the web page as an HTML file; the HTML file is parsed to find the tags with URL links, save the tags and URL links, and find out the image types; the MultiConnection manager creates one downloading thread for each valid URL, with each URLConnection created by the protocol handler, saves each image file, waits for all threads to return, and saves the downloaded images; the content handler adds pre-processing functions for the saved images and returns the downloaded images; finally the MIME header and the transcoded web page are sent to the client, or error information is sent, the connection with the client is closed, and the service finishes)
Inside each downloading sub-thread, the protocol handler is invoked first to create a connection to the corresponding server. Once the connection is set up, the image file requested by the URL is downloaded and saved. After downloading is finished, the sub-thread returns.
The MultiConnection manager monitors all the sub-threads and adds a time limit for some if necessary. It waits for all sub-threads to end and checks whether each file is saved correctly. If the image file is correctly saved, it saves the new filename as a string. If not, it stores an empty string to indicate that the file is not available.
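The bookkeeping described above (one sub-thread per URL, an optional time limit, and an empty string for every file that could not be saved) can be sketched as follows. This is an illustrative approximation, not the thesis code: `download_all` is a hypothetical name, `fetch_and_save` is a caller-supplied function assumed to download one URL and return the saved filename, and a single time limit is applied to all downloads rather than to selected ones.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def download_all(urls, fetch_and_save, time_limit=None):
    """Return one saved filename per URL, or "" where the download failed.

    fetch_and_save(url) is assumed to return the local filename or raise.
    time_limit is in seconds; None means wait until every thread returns.
    """
    if not urls:
        return []
    pool = ThreadPoolExecutor(max_workers=len(urls))  # one thread per URL
    futures = [pool.submit(fetch_and_save, url) for url in urls]
    done, _pending = wait(futures, timeout=time_limit)
    filenames = []
    for future in futures:
        if future in done and future.exception() is None:
            filenames.append(future.result())
        else:
            # Failed or too slow: an empty string marks the file unavailable.
            filenames.append("")
    # A running download cannot be interrupted here; it is no longer awaited.
    pool.shutdown(wait=False)
    return filenames
```

The returned list keeps the order of the input URLs, so each position can later be matched with the tag that referenced it.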
3.5.3 Data Processing
Data Processing performs the web page transcoding. Again it has to carry out several tasks.
The first step: Data Processing finds out the content of all the images that have been successfully downloaded. The content, along with user preference, will be used as the basic criterion to assign different importance to different images.
The second step: Data Processing evaluates the transcoding parameters adaptively. Different images are considered with different importance, and the optimization of the overall downloading time for the entire web page is used as the criterion to decide how much each image should be transcoded.
The third step: Data Processing transcodes all the images with the corresponding transcoding parameters evaluated.
The fourth step: Data Processing uses all the transcoded images and the original HTML to reconstruct a new HTML file with links modified to point to the transcoded images at the proxy server.
Details of Data Processing are given in Chapter 4, web page transcoding. After the transcoded web page is obtained, the next step, sending the transcoded web page, is carried out by the web page process thread itself.
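The fourth step, pointing the image links at the transcoded copies, might look roughly like the sketch below. It is a simplified illustration under stated assumptions: `rewrite_image_links` is a hypothetical name, only quoted `src` attributes of `<img>` tags are handled, and images without a transcoded copy are left untouched rather than removed.

```python
import re

# Matches <img ... src="value"> or <img ... src='value'>, case-insensitively.
_IMG_SRC = re.compile(
    r'(<img\b[^>]*?\bsrc\s*=\s*)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)


def rewrite_image_links(html, transcoded):
    """Replace each image src that has a transcoded copy.

    transcoded maps the original src text to the URL of the transcoded
    image at the proxy; an empty or missing entry leaves the tag as it is.
    """
    def swap(match):
        new_url = transcoded.get(match.group(3))
        if new_url:
            return match.group(1) + match.group(2) + new_url + match.group(2)
        return match.group(0)

    return _IMG_SRC.sub(swap, html)
```

A production HTML writer would work on the parsed tags saved earlier instead of on raw text; the regular expression is used here only to keep the sketch short.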
3.6 Summary
This Chapter presents the system design of the proposed transcoding proxy server. The system not only has the network-related features of multiple extensible services, multiple-user support, protocol handler, and content handler, but also has the transcoding-related features of web page transcoding, adaptive transcoding policy, content analysis, HTML parsing, active content filtering and HTML writer. This Chapter also introduces two services provided by the proposed transcoding proxy server. It is important to note that both services are carried out in threads, which means the proxy server is capable of serving multiple users at the same time.
Chapter 4
Web Page Transcoding
In this Chapter, we present the details of how web page transcoding is carried out in the Data Processing module. This includes how to analyze the individual image content, how to decide the transcoding policy for the web page, how to transcode, and how to put all the transcoded elements back together into a new transcoded web page. The content of this Chapter includes:
• Data Processing Flow Chart
• Image Content Analysis
• Adaptive Transcoding Policy
• Transcoding and Reconstruction
4.1 Data Processing Flow Chart
Usually the web page contains many links to different content inside it. For example, a web page may include text, images, applets, sound files, JavaScript, etc. For web page transcoding, the entire page is transcoded as one object with different content treated differently. Active content such as Java applets is filtered out. The audio files may be filtered out or kept according to the requirement of the user. Text and HTML content get through directly. Different images are classified by their purposes. The images that the user does not want to see will be filtered out after their content is identified. Images allowed to get through are processed by an adaptive transcoding policy with different importance, decided by image content analysis and the user's preference. With the transcoded images and the text/html content, a new web page is constructed and sent to the mobile user. The main flow chart of the Data Processing module for web page transcoding is indicated in Figure (4.1).
4.2 Image Content Analysis
Image content analysis gives the information of what content an image has so that a corresponding importance is applied to it according to the user's preference. The content of images is classified into 8 groups according to the purpose of an image [2]. The 8 groups of images are:
• ADV: advertisement
• DEC: decoration, i.e., background image
• BUL: bullets, points, balls, dots
• RUL: rules, lines, separators
• MAP: maps, i.e., images with clicking focus
• INF: information, i.e., icons, logos
• L-INF: linked information, i.e., scaled images linked with original size images
• CON: content related images, the most important image group for web browsing, i.e., photos, stock graphics
Figure 4.1: Data Processing module for web page transcoding (flowchart: find the content of each image, for all images in the web page; image decoding, scaling, color reducing and encoding, for all images in the web page; return the transcoded image filename; replace the tags with transcoded images and remove the tags whose content is filtered, including images, applets, etc.)
In order to decide which group a certain image belongs to, a step by step classification is designed. The first classification is carried out by the "HTML Parser" of the "Virtual Client" module. At this step, images are divided into 4 groups according to what type of HTML tags they have. A detailed description of these four types of tags will be given in Chapter 5. The 4 types of tags are:
• BAK: background, i.e., <body background = ...>
• ISMAP: ismap, i.e., <img src = ... ismap>
• LIN: linked, i.e., <a href = ...> <img src = ...>
• INL: inline, i.e., <img src = ...>
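The four tag types can be recognized while parsing, as in the sketch below. It is an illustration only: the class name `TagTypeClassifier` is hypothetical, and giving ISMAP precedence over LIN when an image map sits inside a link is an assumption, since the text does not state the order of the checks.

```python
from html.parser import HTMLParser


class TagTypeClassifier(HTMLParser):
    """Label every image reference as BAK, ISMAP, LIN or INL."""

    def __init__(self):
        super().__init__()
        self.anchor_stack = []  # True for each open <a> that has an href
        self.found = []         # list of (url, tag_type) in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "body" and attributes.get("background"):
            self.found.append((attributes["background"], "BAK"))
        elif tag == "a":
            self.anchor_stack.append("href" in attributes)
        elif tag == "img" and attributes.get("src"):
            if "ismap" in attributes:
                kind = "ISMAP"  # assumed to take precedence over LIN
            elif any(self.anchor_stack):
                kind = "LIN"    # image enclosed by a link
            else:
                kind = "INL"    # plain inline image
            self.found.append((attributes["src"], kind))

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_stack:
            self.anchor_stack.pop()
```

After `feed()` has been called on the page text, `found` holds the image URLs together with the tag type used in the first classification step.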
The second step is to find out some characteristics of an image, for example, whether an image is color or non-color, how many colors or gray levels it has, what size (bytes) and dimension it has, and whether the image is a photo or graphics.
To find the color characteristics of an image, a different approach is used here instead of the complex and time-consuming mathematical formulas given in [?]. The new approach uses the default ColorModel [32] in Java AWT, which uses 8 bits per pixel for each of red, green, and blue, along with another 8 bits for the alpha (transparency) level, to decide the color information of an image. Since an image can be constructed by giving Java AWT all of its pixel values and the default ColorModel, our software first uses Java AWT to load the image from its string name, then gets each pixel value of the image to form a pixel array. After the default ColorModel representation of the image is obtained, it is used to find out whether the image is non-color or color, and the number of colors or gray levels the image has is counted.
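Once the pixel array is available, the counting itself is straightforward. The sketch below assumes each pixel is a 32-bit integer packed as alpha, red, green, blue (8 bits each), which is how the default AWT color model lays out a pixel; the function name `color_statistics` is hypothetical, and the luminance weights used to convert a color pixel to a gray level are an assumption, since the text does not give the conversion.

```python
def color_statistics(argb_pixels):
    """Summarize a pixel array of packed 0xAARRGGBB integers."""
    colors = set()
    gray_levels = set()
    is_color = False
    for pixel in argb_pixels:
        red = (pixel >> 16) & 0xFF
        green = (pixel >> 8) & 0xFF
        blue = pixel & 0xFF
        colors.add((red, green, blue))
        if not (red == green == blue):
            is_color = True  # at least one pixel is not a pure gray
        # Assumed gray conversion: rounded weighted sum (0.299, 0.587, 0.114).
        gray_levels.add((299 * red + 587 * green + 114 * blue + 500) // 1000)
    return {
        "is_color": is_color,
        "num_colors": len(colors),
        "num_gray_levels": len(gray_levels),
    }
```

The alpha byte is ignored, so two pixels that differ only in transparency count as the same color.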
To find whether an image is a photo or graphics, the intensity switch of each image is calculated ([2]). The intensity switch is the minimum of the horizontal switch and the vertical switch. Horizontal switch is defined as:
A test of the color characteristics and intensity switch of different images is shown in Table (4.1).
Images                  Image type  Image size (B)  No. of colors (gray levels)  Width x height (pixels)  Intensity switch
adv-1472-yahoo.gif      LIN         2,009           177 (71)                     88 x 31                  0.3900
adv-1475-yahoo.gif      -           4,499           255 (128)                    105 x 60                 0.4883
adv-miadora-yahoo.gif   LIN         2,481           204 (104)                    -                        -
adv-ntap-yahoo.gif      LIN         2,694           64 (57)                      105 x 60                 0.3271
adv-monster.gif         -           8,980           128 (76)                     468 x 60                 0.3988
inf-uofT.gif            INL         -               32 (32)                      120 x 132                0.3939
inf-billpay-yahoo.gif   INL         496             31 (13)                      27 x 25                  0.5822
inf-mail-yahoo.gif      INL         371             28 (25)                      25 x 20                  0.58
inf-pts-yahoo.gif       INL         457             32 (27)                      28 x 28                  0.4987
inf-bag-yahoo.gif       INL         245             7 (7)                        -                        -
stock.gif               INL         14,285          16 (15)                      500 x 285                0.1545
guanyin.gif             INL         73,048          16 (14)                      403 x 550                0.5704
tuibei.jpg              INL         36,538          256 gray levels              308 x 502                0.4759
Gray levels are counted after color images are changed into gray images.
Table 4.1: Comparison of image properties for different image types
The third step is to design an image content decision tree, as indicated in Figure (4.2), to find out the content of an image. Note that the distinguishing criterion in this new decision tree treats the image file size as the main concern instead of the image characteristic of whether it is graphics or a photo ([2]). As an example of how image size is more important than the photo/graphics property, consider a big image, a stock trend indication, which will be graphics. If we think the photo/graphics property is the more important issue, then since the image has a tag of "LIN", we will decide the image is an "ADV". But when image size is the main concern, as long as the image has a size greater than 10 KB, it is considered as "L-INF". Our decision tree does not try to find key words like "ad", "texture", "map", "logo", "icon", etc. This is because, first, it is time consuming, and time consideration is essential for a transcoding proxy server; second, not every image has an explanatory key word near its image tag.
Figure 4.2: Image content decision tree (the tags of images are parsed into the types BAK, LIN, ISMAP and INL; BAK gives a "DEC" image and ISMAP gives a "MAP" image; LIN images are split by image size (< 5 KB, 5 KB to 10 KB, > 10 KB) and by graphics or photo into "ADV" and "L-INF" images; INL images are split by image size and user preference into "INF", "BUL", "RUL" and "CON" images. w: width; h: height; r: aspect ratio (width/height); #: threshold from [2])
Thus the important concept under the improved decision tree is that, when the image has a relatively small size, it becomes less important to decide its exact content. This is because the objective of finding the images' content is to decide their "importance" to the entire web page. When the image has a smaller size relative to other images in the web page, it would be considered as less important. Even for the same images in the same web page, their "importance" can change due to the user's preference from time to time. For example, if the user really wants to download the entire web page fast, all the images other than "CON" can be put into one big group with the same "importance" level. At this moment, there is no difference whether an image belongs to "ADV" or "L-INF". Furthermore, if the time requirement from the client is very tight, the proxy can further decide the small "CON" images to be "INF" and thus become less important images, and only big "CON" images remain as "CON" images.
4.3 Adaptive Transcoding Policy
The purpose of a new adaptive transcoding policy is to decide not only when to transcode, but also how. Actually, how to transcode will affect the decision of when to transcode. The research work in [6] gives a prediction method based on fixed transcoding parameters, and thus no adaptation of transcoding parameters is obtained. Therefore, our goal is to design a new adaptive transcoding policy that evaluates the transcoding adaptively for each image. In other words, all the unknowns for the decision of when to transcode should be predicted with unknown transcoding parameters, which will eventually be evaluated as how to transcode. Furthermore, images with different content are treated differently so that the entire web page is transcoded with optimization. The new transcoding policy also considers the quality of each transcoded image; for example, the image should be big enough to be recognized.
The proposed transcoding policy for web page transcoding finds how to transcode from the condition of when to transcode, which is based on time evaluation. The condition of when to transcode is whether, with transcoding, the client can download the web page faster than without transcoding. The inequality is represented as:
where T_W is the transcoding time delay of the web page, S_W is the size of the web page, ΔS_W is the size reduction of the web page, and B_cl is the mobile connection bandwidth.
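Written out from the definitions above, the condition can be sketched in the following form. This is a reconstruction from the surrounding text rather than a verbatim formula, and it assumes that download time is modelled simply as size divided by bandwidth: the left side is the time with transcoding (transcoding delay plus download of the reduced page), and the right side is the time to download the original page.

```latex
T_W + \frac{S_W - \Delta S_W}{B_{cl}} < \frac{S_W}{B_{cl}}
\quad\Longleftrightarrow\quad
T_W < \frac{\Delta S_W}{B_{cl}}
```

Read this way, transcoding pays off only when the time saved by sending fewer bytes exceeds the time spent transcoding.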
In inequality (4.2), the size of the web page, S_W, is known. The mobile connection bandwidth is also known from preference[0] sent by the client (defined in Chapter 5). So two unknowns need to be evaluated: the transcoding time delay of the web page, T_W, and the size reduction of the web page, ΔS_W. The right side of the inequality is actually a time threshold for the mobile client to download the original size of the web page. This threshold itself, T, can be adjusted by the user through preference[7], defined in Chapter 5. Preference[7] is the relative downloading time ratio (β), used to give the mobile user the flexibility of sacrificing resolution to make the downloading time shorter. This is done by a linear interpolation given by:
Thus inequality (4.2) becomes:
Inequality (4.4) is the basic criterion to decide the transcoding policy. The goal now is to evaluate the transcoding parameters of scaling ratio and color reduction for each image component of the web page adaptively. The procedure to get these transcoding parameters is designed as three steps:
• Prediction of the transcoding time delay, T_W
• Decisions about size reduction for each image from ΔS_W
• Decisions about transcoding parameters for each image
4.3.1 Prediction of the Transcoding Time Delay
The time delay for web page transcoding, T_W, can be represented in summation form as:
$$T_W = \sum_{i=1}^{N} T_{image(i)} \qquad (4.5)$$
where T_image(i) is the transcoding time delay of the ith image in the web page.
Thus if we find the individual transcoding time delay for each image in the web page, T_W can be obtained by summation. And once T_W is known, we can decide the transcoding requirement for ΔS_W.
The transcoding time delay for an image depends on many factors, including image size (bytes), image dimension (width x height), image content (simple or complicated), coding algorithm, system speed, number of users sharing the CPU, etc. A group of images (listed in Appendix A) is tested for transcoding time delay, as shown in Table (4.2).
Note that in Table (4.2) the transcoding parameters, the scaling factor and the color reduction, are set as constants. The scaling factor is set to be the scaling ratio from the dimension of a PC (1024 x 768) to that of a HHC (640 x 480), which is 0.625. The color reduction is set according to the HHC's display capability: 256 gray levels. Of course these parameters are just one case of testing the transcoding time delay. In order to predict the time delay, we have to see what happens to the time delay when different transcoding parameters are applied (the actual transcoding parameters are still unknown at this moment). Table (4.3) shows the transcoding time delay with changing transcoding parameters. Since the same group of images is tested, they are identified using the same number sequence as in Table (4.2) instead of listing their names again.
Image                   Image dimension  No. of pixels  Image size (bytes)  No. of colors or gray levels
adv-1472-yahoo.gif      88 x 31          2,728          2,009               177 colors
adv-1475-yahoo.gif      105 x 60         6,300          4,499               255 colors
adv-miadora-yahoo.gif   88 x 31          2,728          2,481               204 colors
adv-monster.gif         -                -              -                   128 colors
adv-ntap-yahoo.gif      -                -              -                   64 colors
anemone.gif             -                -              -                   256 colors
announcer.gif           -                -              -                   256 colors
announcer128.gif        -                -              -                   252 colors
baboon.gif              -                -              -                   256 colors
commhead2.gif           -                -              -                   14 colors
cwheel.gif              -                -              -                   255 colors
gold.gif                -                -              -                   256 colors
guanyin.gif             -                -              -                   16 colors
inf-bag-yahoo.gif       -                -              -                   7 colors
inf-billpay-yahoo.gif   -                -              -                   31 colors
inf-mail-yahoo.gif      -                -              -                   28 colors
inf-pts-yahoo.gif       -                -              -                   32 colors
uofT.gif                -                -              -                   32 colors
Remaining images: jordanposter2.gif, jordanposter3.gif, kids.gif, map.gif, splash.gif, stock.gif. Remaining entries: 256 gray levels, 40,479; 256 colors, 68,555; 245 gray levels, 126,...; 256 colors, 56,928.
*1 Mobile device as HHC, using fixed transcoding parameters to evaluate transcoding time delay: scaling 0.625 = sqrt((640x480)/(1024x768)) and 256 gray levels.
*2 Transcoding delay time includes image content analysis and transcoding.
Table 4.2: Transcoding time delay for images with different size
                          R=0.75           R=0.625          R=0.5            R=0.375          R=0.25
Image                     C=256 C=16 C=4   C=256 C=16 C=4   C=256 C=16 C=4   C=256 C=16 C=4   C=256 C=16 C=4
1 adv-1472-yahoo.gif      110   60   110   50    60   110   110   100  50    110   110  110   110   110  110
2 adv-1475-yahoo.gif      60    110  110   110   -    60    110   50   50    110   110  110   50    110  110
3 adv-miadora-yahoo.gif   60    60   50    50    60   110   60    50   60    60    110  60    110   110  110
baboon.gif                930   610  550   770   550  500   600   440  540   550   390  390   380   380  380
Remaining rows: commhead2.gif, cwheel.gif, gold.gif, guanyin.gif, inf-bag-yahoo.gif, inf-billpay-yahoo.gif, inf-mail-yahoo.gif, inf-pts-yahoo.gif, uofT.gif, jordanposter2.gif, jordanposter3.gif, kids.gif, map.gif, splash.gif, stock.gif.
R: Scaling ratio. C: No. of color bits for new image.
Table 4.3: Transcoding time delay for different transcoding parameters
The result of Table (4.3) shows that, for the same image, the transcoding time delay does vary with the transcoding parameters. And if we plot the transcoding time delay as a function of the size and of the number of pixels of an image, for each case of transcoding parameters, a stronger linear correlation is found between time delay and number of pixels, as shown in Table (4.4).
Table 4.4: Linear time delay estimation results for different transcoding parameters (columns: estimation Y = a X + b, correlation coefficient, gradient a; rows: R=0.75, R=0.625, R=0.5, R=0.375 and R=0.25, each with C=256, C=16 and C=4, followed by "All cases" and "Mean". R: scaling ratio; C: No. of color bits for new image)
We can see that when a particular parameter pair (R & C) is used, the corresponding correlation coefficient (q) is very good; see, for example, Figure (4.3), Figure (4.4), Figure (4.5) and Figure (4.6). But when the parameter pair R & C is unknown, and all the samples are estimated together, a poorer correlation coefficient is obtained in Table (4.4), as indicated in Figure (4.7). As a result, the mean transcoding time delay is calculated over all scenarios, and when we estimate this mean value of time delay from the number of pixels of an image, a better correlation is found in Table (4.4), as indicated in Figure (4.8).
Thus whenever an image's dimension is known, we can roughly predict how long the transcoding time delay will be for this image, using the linear estimation results from the estimation of the mean transcoding time delay. And by summation of the transcoding time delay for all images, we can predict the transcoding time delay for the web page.
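The prediction described above amounts to an ordinary least-squares line through (number of pixels, mean delay) samples, followed by a sum over the images of the page. The sketch below shows that calculation; `fit_line` and `predict_page_delay` are hypothetical names, and the numbers in the usage note are made-up illustrations, not measurements from the tables.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def predict_page_delay(image_dimensions, a, b):
    """Sum the predicted delay a * (width * height) + b over all images."""
    return sum(a * width * height + b for width, height in image_dimensions)
```

For instance, fitting the illustrative samples (1000, 60), (2000, 110), (3000, 160) gives a gradient of about 0.05 and an intercept of about 10, and a page with one 88 x 31 image and one 105 x 60 image would then be predicted to need roughly 471 time units in total.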
4.3.2 Size Reduction Decisions
Once the transcoding time delay of the web page is predicted, the only unknown in inequality (4.4) is the size reduction of the web page, ΔS_W. Since our ultimate goal is to predict the transcoding parameter pair for each image, we need to find out how big the size reduction is for each image, in other words, how the size reduction of the web page is constituted, or shared, by each image.
Again ΔS_W can be represented in summation form as:
$$\Delta S_W = \sum_{i=1}^{N} \Delta S_{image(i)} \qquad (4.6)$$
where ΔS_image(i) is the size reduction of the ith image in the web page.
Now if there are N images inside a web page, we have N unknowns and only one equation, (4.6). To solve for the N unknowns, we need another N-1 equations. Thus we design two rules to obtain the N-1 extra equations:
Equal Transcoding Parameter: By this rule ("ETP"), we let all the images of the same content group have the same transcoding parameters, R & C.
Figure 4.3: Transcoding time delay vs. No. of pixels (R=0.375, C=4bit)
Figure 4.4: Transcoding time delay vs. No. of pixels (R=0.5, C=4bit)
Figure 4.5: Transcoding time delay vs. No. of pixels (R=0.625, C=4bit)
Figure 4.6: Transcoding time delay vs. No. of pixels (R=0.75, C=4bit)
Figure 4.7: Transcoding time delay vs. No. of pixels (all samples)
Figure 4.8: Mean transcoding time delay vs. No. of pixels
Equal Size (Byte) Reduction Ratio: By this rule ("ESRR"), we let all the images of the same content group have the same size reduction ratio in bytes, α.
Note that the size reduction ratio, α, is not the same as the scaling ratio R. The size reduction ratio for the ith image is defined as:
$$\alpha(i) = \frac{\Delta S_{image(i)}}{S_{image(i)}}$$
When the first rule is applied, we find the "equivalent pixel No." (defined in (4.9)) and then use equation (4.10) to find the size reduction for each image. Then we can substitute all the ΔS_image(i) in equation (4.6) and get a second order equation for the unknown R. This equation is still not difficult to solve if all the images inside a web page have the same content type. The difficulty comes when different importance is applied to images with different content. This means that less important images will be compressed more. Furthermore, when the image quality is taken into consideration, which means R also depends on whether the transcoded images can be recognized by the client, the entire decision of R & C would be overly complicated in terms of the time needed to find the results. So for this situation, the transcoding policy avoids solving the equation and sets parameters for the different groups of images with different content according to previous knowledge it has acquired for similar scenarios. This also applies when some clients want to set size reduction ratio and color as their downloading criterion instead of the downloading time in inequality (4.4). The results of this type of web page transcoding are shown in Chapter 3.
When the second rule is applied, every image within the same content group has the same α. A relative importance is decided according to preference[5], defined in Chapter 5. Usually the importance of "CON" images equals 1, and the relative importance of other content will be less than or equal to 1. Then we can find the size reduction ratio, α(i), for other images depending on their individual importance relative to "CON" images by a mapping indicated in Figure (4.9).
Figure 4.9: Mapping of relative importance to size reduction ratio
Now if we substitute all the ΔS_image(i) in equation (4.6) with the product of S_image(i) and α(i) (expressed in terms of the α of "CON"), the α of "CON" images can be solved from a first order equation. And using α(i) and S_image(i), the individual ΔS_image(i) for each image is solved.
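The first order equation mentioned above can be illustrated as follows. The mapping from relative importance to size reduction ratio is not reproduced here, so the sketch assumes it has already been applied and is expressed as a hypothetical multiplier k(i) with α(i) = k(i) · α_CON (k = 1 for "CON" images); the function name `solve_size_reductions` is likewise an assumption.

```python
def solve_size_reductions(sizes, factors, total_reduction):
    """Solve sum(S_i * k_i * alpha_con) = total_reduction for alpha_con.

    sizes[i]        -- original size of image i in bytes, S_image(i)
    factors[i]      -- assumed multiplier k_i so that alpha(i) = k_i * alpha_con
    total_reduction -- required size reduction of the web page in bytes

    Returns (alpha_con, list of per-image size reductions in bytes).
    The result is not clamped, so the caller should check alpha(i) <= 1.
    """
    weighted_size = sum(size * k for size, k in zip(sizes, factors))
    alpha_con = total_reduction / weighted_size
    reductions = [size * k * alpha_con for size, k in zip(sizes, factors)]
    return alpha_con, reductions
```

With illustrative sizes of 10000, 2000 and 4000 bytes, multipliers 1.0, 1.5 and 2.0, and a required reduction of 8400 bytes, the "CON" ratio comes out near 0.4 and the per-image reductions near 4000, 1200 and 3200 bytes, which add up to the required total.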
4.3.3 Transcoding Parameters Decisions
Once we know the ΔS_image(i) for the ith image, the next step is to get the transcoding
parameter pair for this image. To do this, we have to find out the relationship between
ΔS_image(i) and the transcoding parameter pair. Again the same group of images is tested
with different transcoding parameters to find out the sizes of the transcoded images.
Table (4.5) shows the result.
Table 4.5: Sizes of transcoded images for different transcoding parameters (bytes). R: scaling ratio; C: No. of color bits for new image
Based on Table (4.5), we can again use linear estimation, but this time it is between
ΔS_image(i) and the transcoding parameters. Since we need to consider two parameters
(R & C) at the same time and also need to consider the specialty of each individual
image, we define a variable that combines all these factors, the equivalent pixel No.:

Equivalent pixel No. = No. of pixels × (1 − R² × C_new / C_old)    (4.8)

where R is the scaling ratio; C_new is the color depth in bits of the transcoded image, which is C;
and C_old is the color depth in bits of the original image.
If we use linear estimation to find the correlation between ΔS_image(i) for the ith image
and the equivalent pixel No. of the ith image, a strong linear correlation is found for all 25
images of the testing group; see, for example, Figure (4.10), Figure (4.11), Figure (4.12) and
Figure (4.13).
But if we put all the samples of equivalent pixel No. into one estimation, a lower
correlation coefficient is indicated in Figure (4.14). This is due to the fact that different
images have different characteristics (complexity, color depth, size, and dimension) and
their correlation patterns with transcoding parameters are different.
Thus we need a way to find each set of a, b for the linear estimation before we
can make the estimation between ΔS_image(i) and the equivalent pixel No. accurate enough.
Consider two extreme situations:
• R = 0 means the transcoded image will have a dimension of 0. Consequently the
transcoded image will have a size of 0 bytes and the equivalent pixel No. will be the
original No. of pixels (equation (4.8)). Thus when the equivalent pixel No. is the original
No. of pixels, the size reduction should be the original size of the image: the point
[original No. of pixels, original size] should be on the line.
• R = 1 with the color bits unchanged for the transcoded image means no change
at all. Consequently the transcoded image will have the size (bytes) of the original
image and the equivalent pixel No. will be 0 (equation (4.8)). Thus when the equivalent
pixel No. is 0, the size reduction should also be 0: the point [0, 0] should be on the
line.
From these two reference points, the estimated a* and b* of the line equation for
image(i) should be:

a* = S_image(i) / (No. of pixels of image(i)),    b* = 0    (4.9)
Figure 4.10: Size reduction vs. equivalent pixel No. for announcer.gif
Figure 4.11: Size reduction vs. equivalent pixel No. for baboon.gif
Figure 4.12: Size reduction vs. equivalent pixel No. for commhead2.gif
Figure 4.13: Size reduction vs. equivalent pixel No. for uoft.gif
Figure 4.14: Size reduction vs. equivalent pixel No. for all images
Therefore, ΔS_image(i) is actually proportional to the equivalent pixel No.:

ΔS_image(i) = (S_image(i) / No. of pixels of image(i)) × Equivalent pixel No. of image(i)    (4.10)

If we go one step further, we may find that if two images have the same content and
color depth, the estimated transcoding parameters for them will be the same whether the
"ETP" rule or the "ESRR" rule is applied. This is the first comparison we get between the "ETP"
rule and the "ESRR" rule. Another comparison will be shown in Chapter 5 after we get the
transcoded web page by these two rules.
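The proportional estimate can be illustrated with a minimal Java sketch. It assumes that the equivalent pixel No. has the form N × (1 − R² × C_new / C_old), which is only one form consistent with the two reference points [original No. of pixels, original size] and [0, 0]; the class and method names are hypothetical and are not taken from the thesis implementation.

```java
/** Sketch of the per-image linear size-reduction estimate (assumed formula). */
final class SizeReductionEstimate {

    /** Equivalent pixel No.: N * (1 - R^2 * Cnew / Cold). This form is an assumption. */
    static double equivalentPixels(long pixels, double r, int newBits, int oldBits) {
        return pixels * (1.0 - r * r * newBits / (double) oldBits);
    }

    /** Size reduction on the line through (0, 0) and (pixels, originalBytes). */
    static double sizeReduction(long pixels, long originalBytes,
                                double r, int newBits, int oldBits) {
        double slope = originalBytes / (double) pixels; // slope a, with intercept b = 0
        return slope * equivalentPixels(pixels, r, newBits, oldBits);
    }

    public static void main(String[] args) {
        // Hypothetical 100 x 100 image of 8000 bytes, scaled by 0.5, 8 -> 4 color bits.
        System.out.println(sizeReduction(10_000, 8_000, 0.5, 4, 8));
    }
}
```

With R = 0 the sketch returns the full original size, and with R = 1 and an unchanged color depth it returns 0, matching the two extreme situations.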
The linear estimation for different images and the relative error obtained by substituting a* and
b* from equation (4.9) are shown in Table (4.6).
Table 4.6: Linear size reduction estimation for different images (size reduction vs. equivalent No. of pixels, y = a'x + b'; columns: image, No. of pixels, size (bytes), correlation, relative error)
From (4.10), given ΔS_image(i), the equivalent pixel No. for each image can be solved right
away. And from the definition of the equivalent pixel No., given the original pixel No. of the
image, the scaling ratio R and the color bits of the new image C can be solved easily.
The detailed procedure of the decision of the transcoding policy is indicated in Figure
(4.15). A more complicated one can be obtained by adding the following functions (shown
in Figure (4.15) as bold dotted blocks).
• Real time adaptation: A parallel module can be added to record the resulting
transcoding time delay and re-estimate the coefficients for linear prediction of
transcoding time delay dynamically.
• Recursive evaluation control: The resulting transcoding parameters can be put back
into equation (4.6) to re-adjust α and evaluate the transcoding parameters again
to pursue higher precision. This procedure can be performed recursively until the
precision of equation (4.6) is found to be within a given error.
The resulting downloading time will be more precise at the expense of a longer transcoding
time delay.
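The step of solving R from a target size reduction can be sketched in Java as follows. This is only an illustration under the assumption that the size reduction obeys ΔS = S × (1 − R² × C_new / C_old); the lower bound minRatio stands for the min_dim / image_dim limit that keeps an image recognizable, and all names are hypothetical.

```java
/** Sketch: invert the assumed size-reduction relation to obtain the scaling ratio R. */
final class ScalingRatioSolver {

    /**
     * Solve deltaS = S * (1 - R^2 * Cnew / Cold) for R,
     * then clamp the result to the range [minRatio, 1].
     */
    static double solveRatio(double targetReduction, long originalBytes,
                             int newBits, int oldBits, double minRatio) {
        double keptFraction = 1.0 - targetReduction / originalBytes;
        double rSquared = Math.max(0.0, keptFraction) * oldBits / newBits;
        double r = Math.sqrt(rSquared);
        return Math.min(1.0, Math.max(minRatio, r));
    }

    public static void main(String[] args) {
        // Hypothetical image: remove 7000 of 8000 bytes while going from 8 to 4 color bits.
        System.out.println(solveRatio(7000, 8000, 4, 8, 0.1));
    }
}
```

Clamping means the requested reduction is not always reached exactly, which is one reason a recursive re-evaluation such as the one described above can improve precision.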
4.4 Transcoding and Reconstruction
After the transcoding policy is decided, the rest of the work of the Data Processing
module, as indicated in Figure (4.1), is to use the evaluated transcoding parameters to
carry out the transcoding for each individual image in the web page. The "HTMLWriter"
will use these transcoded images along with the original saved HTML file (original web
page) to form a new HTML file (the transcoded web page), with all the link tags modified
accordingly. When the Data Processing module returns, it gives the web page process service
thread, as indicated in Figure (3.4), the transcoded web page, and the web page process
service thread will send this new web page back to the client and finish the service.
Figure 4.15: Adaptive transcoding policy
Therefore, the web page transcoding is transparent to both the client and the server.
The client sends one request and receives the whole transcoded web page.
4.5 Summary
In this Chapter we propose the design of web page transcoding in the Data Processing module
of the proxy server system. An improved way of image content analysis is introduced
and the performance of the new image content decision tree is evaluated. An adaptive
policy is designed for web page transcoding. Details are given on how to decide the
transcoding parameters adaptively, according to the user's different requirements (display
requirements and browsing requirements) and according to the different importance of
different images.
Chapter 5
Implementation and Experimental Results
In this Chapter, we describe the Java implementation and the experimental results of
the proposed transcoding proxy server system. The content includes two parts:
• Java Implementation of Proposed Transcoding System
• Experimental Results
5.1 Java Implementation of Proposed Transcoding System
The proxy server system is implemented in Java. As described before, it includes both
the network programming part and the transcoding part. The source code exceeds
3300 lines, which contain the core parts of a transcoding proxy server. One can add
some functions to the server, for example, server security and maintenance, caching and
searching, and resource management. Since the system is designed using object-oriented
programming techniques, new functions can be added conveniently.
5.1.1 RTTI
RTTI stands for run-time type identification, which uncovers a whole plethora of interesting
object-oriented design and raises fundamental questions of how to structure the
program. In brief, RTTI allows the program to discover information about objects and
classes at run-time, which means the program can handle classes that are unknown
at compile time. The technology makes the program self-extensible in the sense that any
class can be added to the program group and be recognized at run-time. The proxy
software system uses RTTI to give the proxy extensible multi-service support, which
enables services to be added at run time [34].
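As an illustration of this idea, the following Java sketch loads a service class that is known only by name at run time, using the standard reflection calls Class.forName and getDeclaredConstructor().newInstance(). The ProxyService interface and the WebPageService class are hypothetical names chosen for this example; the thesis does not show its own service classes.

```java
/** Sketch: discover and instantiate a service class at run time. */
final class RttiSketch {

    /** Hypothetical service contract; not taken from the thesis source code. */
    interface ProxyService {
        String name();
    }

    /** Hypothetical service that the loader knows only by its class name. */
    static final class WebPageService implements ProxyService {
        public String name() { return "web-page-transcoding"; }
    }

    /** Load and instantiate a service class that is named only at run time. */
    static ProxyService load(String className) {
        try {
            Class<?> type = Class.forName(className);
            if (!ProxyService.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(className + " is not a ProxyService");
            }
            return (ProxyService) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load(WebPageService.class.getName()).name());
    }
}
```

Because the loader depends only on the interface, a new service can be added by supplying a new class name, for instance from a configuration file, without recompiling the loader.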
5.1.2 Nested Multiple Thread Structure
As seen in the main flow chart of the proxy software system, a nested multiple thread
structure is used. The first level of multiple threads is used for providing multiple services;
the second level is used for providing one service for multiple clients. Inside each service
thread, a third level of threads may be used when necessary. For example, for the
single file process thread, two sub-threads are used to receive from the server and send to
the client a single text/html file simultaneously. For the web page service thread, in Virtual
Client, a third level of multiple threads is used to download different content in the web
page.
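The third level of threads, which downloads the different content items of one web page in parallel, might look like the Java sketch below. It uses java.util.concurrent, which was added to Java after this work was written, so it is a modern stand-in rather than the original code; the fetcher function is a hypothetical placeholder for the real download logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Sketch: one worker thread per content link inside a single web page service. */
final class ParallelDownloadSketch {

    /** Run one fetch task per link on its own thread; results keep the link order. */
    static List<byte[]> fetchAll(List<String> links, Function<String, byte[]> fetcher) {
        if (links.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(links.size());
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (String link : links) {
                Callable<byte[]> task = () -> fetcher.apply(link);
                pending.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : pending) {
                results.add(future.get()); // wait for each download to finish
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```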
5.1.3 TCP Sockets: ServerSocket, Socket class
Two classes are invoked to realize the TCP sockets used by the proxy server [29]. The
ServerSocket "listens to" the requests from the client. The Socket "speaks to" the original
server to get the response and "speaks to" the client to send the transcoded response.
1. public class ServerSocket extends Object: This class implements server sockets. A
server socket waits for requests to come in over the network. It performs some
operation based on that request, and then possibly returns a result to the requester.
The actual work of the server socket is performed by an instance of the SocketImpl
class. An application can change the socket factory that creates the socket
implementation to configure itself to create sockets appropriate to the local firewall.
2. public class Socket extends Object: This class implements client sockets (also called
just "sockets"). A socket is an endpoint for communication between two machines.
The actual work of the socket is performed by an instance of the SocketImpl class.
An application, by changing the socket factory that creates the socket
implementation, can configure itself to create sockets appropriate to the local firewall.
5.1.4 Connection Control and Resource Management
A semaphore can be used to realize the connection control, and it can be programmed as
a separate thread to run simultaneously. A semaphore is important when multiple clients
share the same resources at the proxy server.
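A counting semaphore for connection control can be sketched in Java as below. The sketch uses java.util.concurrent.Semaphore from the current standard library instead of a hand-written semaphore thread, so it shows the idea rather than the original code; the class name ConnectionGate and the connection limit are hypothetical.

```java
import java.util.concurrent.Semaphore;

/** Sketch: limit the number of clients served by the proxy at the same time. */
final class ConnectionGate {
    private final Semaphore permits;

    ConnectionGate(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Try to admit one client; returns false when the proxy is already full. */
    boolean tryEnter() {
        return permits.tryAcquire();
    }

    /** Called when a client connection finishes, freeing one slot. */
    void leave() {
        permits.release();
    }
}
```

A service thread would call tryEnter() after accepting a socket and leave() in a finally block once the response has been sent.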
Resource management includes many details of server programming, including file
operations, file and record locking, querying and modifying process attributes and resource
limits, etc. [35]. At the moment, we implement only simple considerations for file
operations. So that identical filenames from different URL links do not conflict locally at
the transcoding proxy server, a filename administration is considered. This is realized by
adding the URL into the saved filenames.
For example, a file at the URL someprotocol://somesite/somefile will use the
local filename someprotocolsomesitesomefile for saving. Note that ":" and "/" are
changed in the new filename.
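The filename administration can be sketched in a few lines of Java. The replacement character is not legible in the text, so the underscore used here is purely an assumption, and the class name is hypothetical.

```java
/** Sketch: derive a flat local filename from a URL so that equal names do not collide. */
final class LocalFileName {

    /** Replace ':' and '/' so the whole URL can serve as one filename ('_' is assumed). */
    static String fromUrl(String url) {
        return url.replace(':', '_').replace('/', '_');
    }

    public static void main(String[] args) {
        System.out.println(fromUrl("someprotocol://somesite/somefile"));
    }
}
```

Two files called somefile on different sites then map to different local names, because the site name is part of the result.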
5.1.5 HTML Parser
The HTML Parser parses the web page and saves the link tags for all images inside this web
page. In an HTML file, there are basically four types of tags that are linked to images
[30, 31]:
• Background image: defined within the body tag, <body background = URL ...
>; the URL specifies an image to be tiled in the document background.
• Inline images: defined in the tags for images, <img src = URL ... >; the URL
points to the address from which the image should be downloaded.
• Ismap images: defined in the tags for images ended by "ismap", <img src = URL
... ismap>.
• Linked images: defined in the tags for images right after a hyperlink tag, <a href
= ... > <img src = ... > </a>; the image can be clicked, and after clicking the
browser will use the address indicated in the hyperlink to find the linked content.
Another tag can point to an image when there is a hyperlink tag <a href = URL
... > and the URL points to an image. But since in this case the image will not be
shown in the present web page and is only downloaded after being clicked by the user later,
the image is not counted for downloading.
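A much simplified way to collect these image links is sketched below with regular expressions. It is not the parser of the proxy system: it recognizes only the body background, inline and ismap cases, does not track enclosing hyperlink tags, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: find image URLs in an HTML page and label how each one is referenced. */
final class ImageLinkScanner {
    private static final Pattern BODY_BG = Pattern.compile(
            "<body\\b[^>]*?\\bbackground\\s*=\\s*[\"']?([^\"'\\s>]+)",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern IMG = Pattern.compile(
            "<img\\b([^>]*?)\\bsrc\\s*=\\s*[\"']?([^\"'\\s>]+)[\"']?([^>]*)>",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern ISMAP =
            Pattern.compile("\\bismap\\b", Pattern.CASE_INSENSITIVE);

    /** Return entries of the form "kind url", where kind is background, ismap or inline. */
    static List<String> scan(String html) {
        List<String> found = new ArrayList<>();
        Matcher bg = BODY_BG.matcher(html);
        if (bg.find()) {
            found.add("background " + bg.group(1));
        }
        Matcher img = IMG.matcher(html);
        while (img.find()) {
            // Look for the ismap attribute before or after src, but not inside the URL.
            String otherAttributes = img.group(1) + " " + img.group(3);
            boolean isMap = ISMAP.matcher(otherAttributes).find();
            found.add((isMap ? "ismap " : "inline ") + img.group(2));
        }
        return found;
    }
}
```

A real parser would also resolve relative URLs against the page address and would cope with attribute values that contain spaces or the ">" character.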
5.1.6 Protocol Handler
The protocol handler of Virtual Client is responsible for setting up the connection between the
proxy server and the original server using the protocol given in the URL. It makes
the proxy server capable of coping with different application layer protocols, which means
a new protocol can be written by the user and recognized by the protocol handler without
modifying the rest of the software system.
The protocol handler mechanism is implemented through four different classes: URL,
URLStreamHandler, URLConnection, and URLStreamHandlerFactory [27]. Among them,
only the URL class is concrete. URLStreamHandler and URLConnection are abstract
classes, and URLStreamHandlerFactory is an interface. That means that to write the
protocol, one has to write concrete subclasses of URLStreamHandler and URLConnection.
There are three steps to realize the protocol handler.
1. To use these classes, URLStreamHandlerFactory is used to take the protocol and
locate an appropriate subclass of URLStreamHandler for the protocol.
2. This URLStreamHandler will then parse the string representation of the URL into
its separate parts and create a corresponding URLConnection.
3. The new URLConnection is responsible for the interaction with the server; it converts
anything the server sends into an InputStream, and converts anything the proxy
sends into an OutputStream.
In most cases, an instance of a URLStreamHandler subclass is not created directly by
an application. Rather, the first time a protocol name is encountered when constructing
a URL, the appropriate stream protocol handler is automatically loaded.
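The two concrete subclasses can be illustrated with the following Java sketch for a made-up "demo" protocol that simply echoes the URL path. For brevity the handler is passed directly to the URL constructor (a constructor that is deprecated in recent Java releases but still available) instead of being located through a URLStreamHandlerFactory; all class names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

/** Sketch: a custom protocol built from URLStreamHandler and URLConnection subclasses. */
final class DemoProtocol {

    /** Concrete URLConnection for the made-up "demo" protocol; it echoes the URL path. */
    static final class DemoConnection extends URLConnection {
        DemoConnection(URL url) { super(url); }

        @Override public void connect() { connected = true; }

        @Override public InputStream getInputStream() {
            connect();
            byte[] body = ("demo:" + url.getPath()).getBytes(StandardCharsets.UTF_8);
            return new ByteArrayInputStream(body);
        }
    }

    /** Concrete URLStreamHandler that creates the connection above. */
    static final class DemoHandler extends URLStreamHandler {
        @Override protected URLConnection openConnection(URL u) {
            return new DemoConnection(u);
        }
    }

    /** Open a demo URL through the handler and read the whole response as text. */
    static String fetch(String spec) {
        try {
            URL url = new URL(null, spec, new DemoHandler());
            try (InputStream in = url.openConnection().getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[256];
                for (int n; (n = in.read(buffer)) != -1; ) {
                    out.write(buffer, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The rest of the program only sees a URL and an InputStream, which is what allows a new protocol to be added without touching the other modules.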
5.1.7 User Preference Definition
Preference from the client contains the information of mobile link bandwidth, display
requirements, and browsing requirements. There is no particular reason why the definition
in our implementation is designed in the following way. Actually the client can define its
preference in any format, as long as it contains the relevant information needed.
In our implementation, the preference is designed to be in the HTTP header right
after the "GET" request and it uses two 32-bit integers separated by a space:
Preference: pref1 pref2
Pref1 and pref2 are defined, by our implementation, as two 32-bit integers. They are
interpreted by Request Processing and translated into preference[], which is an internal
integer array. The size of the array is eight and we define it as follows:
• Preference[0]: Mobile connection bandwidth (Bm). 4 bits, bits 0-3 of pref1. The
definition is detailed in Table (5.1).
• Preference[1] & [2]: Mobile device display size requirement. 4 bits, bits 4-7 of pref1.
The definition is detailed in Table (5.2).
• Preference[3]: Mobile device display color requirement. 4 bits, bits 8-11 of pref1.
The definition is detailed in Table (5.3).
• Preference[4] & [5]: Mobile user image filter, used to indicate whether a certain type
of image is wanted by the mobile user. 8 bits, bits 12-19 of pref1. When the bit is
set to "1", the corresponding image will be transcoded at the proxy and sent to the
mobile user. If the bit is "0", the image will be filtered out by the proxy. Also each type
of image is assigned a relative importance to the "CON" type image (images are
classified by their content into 8 groups, shown in Table (5.4), detailed in Chapter
4). This will be used for deciding the adaptive transcoding policy. Basically a more
important image type will be assigned a longer transcoding time. This image
preference is defined as preference[5] or integer pref2. Each image type takes
4 bits out of the 32-bit integer, with a value from 0 to 15. For "CON", the four bits
will always be 15, which means the relative base. Other image types will have a value
from 0 to 15, which gives the relative importance: value/15. For example, for an "INF"
image, if the 4-bit value is 3, then it has a relative importance of 0.2 compared with
"CON". This basically means that given the same transcoding time, an "INF" image will
be transcoded so that the transcoded image is 0.2 times the size of the transcoded image
from "CON".
• Preference[6]: Image quality parameter, used to give the image quality loss tolerance
of the mobile user.
• Preference[7]: Relative downloading time ratio, used to give the mobile user the
flexibility of sacrificing resolution to make the downloading time shorter. The usage of
this ratio has been introduced in Chapter 4. It also takes 4 bits, bits 24-27 of pref1.
The definition is detailed in Table (5.5).
The overall construction of pref1 is given in Table (5.6).
Table 5.1: Definition of preference[0]
Mobile connection bandwidth (Bm)    preference[0]    Bits 0-3 of pref1
Bm <= 9.6 kbps                      0                0000
9.6 kbps < Bm <= 14.4 kbps          1                0001
14.4 kbps < Bm <= 28.8 kbps         2                0010
28.8 kbps < Bm <= 33.6 kbps         3                0011
33.6 kbps < Bm <= 56 kbps           4                0100
56 kbps < Bm                        5                0101

Table 5.2: Definition of preference[1] & [2]
Device      Width          Height        Bits 4-7 of pref1
...
HHC         640 pixels     480 pixels    0100
...
Color PC    1024 pixels    768 pixels    1100

Table 5.3: Definition of preference[3]
Display color                       preference[3]    Bits 8-11 of pref1
B/W                 2 gray levels     0              0000
Gray (bit 11 = 0)   4 gray levels     1              0001
                    8 gray levels     2              0010
                    16 gray levels    3              0011
                    256 gray levels   4              0100
Color (bit 11 = 1)  8 colors          10             1010
                    16 colors         11             1011
                    256 colors        12             1100
                    24-bit color      15             1111

Table 5.4: Definition of preference[4] & [5]. The notations CON, L-INF, INF, MAP,
RUL, BUL, DEC, and ADV have been explained in Chapter 4.
Image filter, preference[4] (bits 12-19 in pref1)
CON       L-INF     INF       MAP       RUL       BUL       DEC       ADV
Bit 19    Bit 18    Bit 17    Bit 16    Bit 15    Bit 14    Bit 13    Bit 12
Image preference, preference[5] = pref2 (bits 0-31)
CON       L-INF     INF       MAP       RUL       BUL       DEC       ADV
Bits      Bits      Bits      Bits      Bits      Bits      Bits      Bits
28-31     24-27     20-23     16-19     12-15     8-11      4-7       0-3

Table 5.5: Definition of preference[7]

Table 5.6: Definition of pref1 (bits 0-31)
preference[7]    preference[6]    preference[4]    preference[3]    preference[1] & [2]    preference[0]
Bits 24-27       Bits 20-23       Bits 12-19       Bits 8-11        Bits 4-7               Bits 0-3
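The bit layout of pref1 and pref2 can be decoded with a few shift-and-mask operations. The Java sketch below follows the bit positions stated in the text; the class and method names are hypothetical, and the group numbering 0 (ADV) to 7 (CON) is simply the bit order of pref2 divided by four.

```java
/** Sketch: decode the two 32-bit preference words sent by the mobile client. */
final class PreferenceBits {

    /** Extract a bit field of the given width starting at bit 'low'. */
    static int field(int word, int low, int width) {
        return (word >>> low) & ((1 << width) - 1);
    }

    static int bandwidthCode(int pref1) { return field(pref1, 0, 4); }   // preference[0]
    static int displayCode(int pref1)   { return field(pref1, 4, 4); }   // preference[1] & [2]
    static int colorCode(int pref1)     { return field(pref1, 8, 4); }   // preference[3]
    static int imageFilter(int pref1)   { return field(pref1, 12, 8); }  // preference[4]
    static int timeRatioCode(int pref1) { return field(pref1, 24, 4); }  // preference[7]

    /** Relative importance (value / 15) of image group g, with ADV = 0 up to CON = 7. */
    static double importance(int pref2, int group) {
        return field(pref2, 4 * group, 4) / 15.0;
    }

    public static void main(String[] args) {
        int pref2 = (15 << 28) | (3 << 20); // CON = 15, INF = 3, all other groups 0
        System.out.println(importance(pref2, 5));
    }
}
```

The unsigned shift (>>>) matters for the "CON" field, because a value of 15 in bits 28-31 sets the sign bit of a Java int.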
5.2 Experimental Results
In order to test the proxy system, an HTTPServer and an HTTPClient are also coded. The
HTTPServer is a server that waits for HTTP requests from a client at port 80. The
HTTPClient initiates a connection by sending an HTTP request, which is actually sent to
the proxy server. The proxy server will then send this request to the HTTPServer,
get the response, transcode the web page and send it back to the HTTPClient.
We test our system in three scenarios:
1. Transcoding of single images
2. Transcoding of web pages with "ETP" rule
3. Transcoding of web pages with "ESRR" rule
5.2.1 Transcoding of single images
Different images are tested for the effect of downloading with transcoding vs. without
transcoding. The tests are repeated for mobile links of different bandwidths, equal
to 28.8 kbps, 14.4 kbps, and 9.6 kbps. For example, if announcer.gif is requested by the
client via the proxy, Figure (5.1) shows how the total downloading time (with transcoding)
is constituted and Figure (5.2) shows the comparison of downloading time with
transcoding and without transcoding, with different transcoding parameter pairs, at the
bandwidth of 28.8 kbps for the mobile link. Figures (5.3) and (5.4) show the results at the
bandwidth of 14.4 kbps. Figures (5.5) and (5.6) show the results at the bandwidth of 9.6
kbps. Since we don't know the bandwidth between the proxy server and the HTTPServer,
a 1 Mbps connection is assumed to calculate the "Image Saving Time" (in Figures (5.1),
(5.3), (5.5)) for the HTTPServer to send the image to the proxy server. In a real situation,
this connection rate can always be monitored and used for calculation.
From Figures (5.1)-(5.6), we see:
• As the scaling ratio (R) or color depth of the transcoded image (C) becomes smaller, the
size (bytes) of the transcoded image becomes smaller. Thus the "Transmission
Time of Transcoded Image" (in Figures (5.1), (5.3), (5.5)) becomes shorter and
consequently the total downloading time (with transcoding) becomes shorter.
• As the bandwidth of the mobile link becomes smaller, the "Transmission Time of
Transcoded Image" becomes longer. Thus its proportion of the total downloading
time (with transcoding) becomes larger.
• As the bandwidth of the mobile link becomes smaller, the downloading time (without
transcoding) increases by the same ratio as the "Transmission Time" does.
Thus for the cases when the "Transmission Time" is the dominant factor of the
total downloading time (with transcoding), the relative ratio between the downloading
time with transcoding and that without transcoding remains almost unchanged
(in Figures (5.2), (5.4), (5.6)).
Figure 5.1: Image downloading time at 28.8 kbps (with transcoding)
Figure 5.2: Image downloading time comparison at 28.8 kbps
Figure 5.3: Image downloading time at 14.4 kbps (with transcoding)
Figure 5.4: Image downloading time comparison at 14.4 kbps
5.2.2 Transcoding of web page with ETP
With the "ETP" rule, we tested the web page shown in Appendix A, Figure (A.26). Figure
(5.7) shows how the entire downloading time (with transcoding) is constituted and Figure
(5.8) shows the comparison of downloading time with transcoding and without
transcoding. Again there are different situations with different transcoding parameter
pairs, and the bandwidth of the mobile link is 28.8 kbps. Figures (5.9) and (5.10) show the
results at the bandwidth of 14.4 kbps. Figures (5.11) and (5.12) show the results at the
bandwidth of 9.6 kbps. Actually all the parameter pairs are for "CON" images; for other
images, the transcoding parameter pair is set to R = 0.625, C = 4. The same connection
bandwidth of 1 Mbps between the proxy server and the HTTPServer is assumed.
Similarly, from Figures (5.7)-(5.12), we see:
• The same scaling ratio (R) and color depth (C) apply to all the images with the same
content. For different content, different R & C are decided according to previous
knowledge of transcoding, the content, the importance given by the user's preference,
the client's display requirement and the quality requirement for the transcoded image
(for example, the smallest R that makes the transcoded image still recognizable).
• As the scaling ratio (R) or color depth (C) gets smaller, the size (bytes) of the transcoded
web page becomes smaller. Thus the "Transmission Time of Transcoded Web Page"
(in Figures (5.7), (5.9), (5.11)) becomes shorter and consequently the total
downloading time (with transcoding) becomes shorter.
• As the bandwidth of the mobile link becomes smaller, the "Transmission Time of
Transcoded Web Page" becomes longer. Thus its proportion of the total
downloading time (with transcoding) becomes larger.
• As the bandwidth of the mobile link becomes smaller, the downloading time (without
transcoding) increases by the same ratio as the "Transmission Time of Transcoded
Web Page" does. Thus for the cases when the "Transmission Time of Transcoded
Web Page" is the dominant factor of the total downloading time (with transcoding),
the relative ratio between the downloading time with transcoding and that
without transcoding remains almost unchanged (in Figures (5.8), (5.10), (5.12)).
From previous knowledge of transcoding, along with the delay time components analysis
and the downloading time comparison analysis, when the client gives the proxy a
requirement that includes the same scaling ratio for same-content images and an amount of
time for downloading, the proxy server can decide which set of parameter pairs meets
the client's requirement. An example consisting of two transcoded web pages (using
ETP) is shown in Appendix B. Figure (B.1) shows the non-color version of ETP web
page transcoding, and Figure (B.2) shows the color version.
Figure 5.9: Web page downloading time at 14.4 kbps (ETP transcoding)
Figure 5.10: Web page downloading time comparison at 14.4 kbps
Figure 5.11: Web page downloading time at 9.6 kbps (ETP transcoding)
Figure 5.12: Web page downloading time comparison at 9.6 kbps
5.2.3 Transcoding of web page with ESRR
With the "ESRR" rule, we tested the web page shown in Appendix A, Figure (A.26). Now the
different situations are due to the different time thresholds given by the client; for example,
the client wants to get the web page downloaded within 10 s, etc. The bandwidth of the
mobile link is 28.8 kbps (the same tests can be run for bandwidths of 14.4 kbps and 9.6 kbps).
The estimated transcoding parameter pairs are shown in Table (5.7).
Table 5.7: Web page transcoding and transcoding parameters evaluated by ESRR
Downloading time threshold (s)                  15         ...
Total size before transcoding (bytes)           774,278
Total size after transcoding (bytes)            22,707
Web page saving time (s)                        6.19
Transcoding time (s)                            2.64
Transmission time of transcoded web page (s)    6.31
Total downloading time (s)                      15.14
Transcoding parameter pairs (R, C) per image    ...
Another comparison between the ETP transcoding and the ESRR transcoding can
be observed, with the same downloading time requirement of 25 s (at 28.8 kbps).
Figure 5.14: Web page downloading time comparison at 28.8 kbps
• As the downloading time requirement becomes longer, the evaluated transcoding
parameters for images inside the web page become larger (the same trend for all
images, but not necessarily the same amount). Thus transcoded images have larger
dimensions and more color depth (in Table (5.7)).
• For small images, including the "ADV" images and "INF" images, the scaling
ratio R & color depth C are actually much more affected by the image quality,
i.e., the smallest image size that can be recognized by the client. This is because
the evaluated transcoding parameters make the image too small to be recognized.
Thus the finally decided transcoding parameter pair remains the same for different
time requirement scenarios (in Table (5.7)). But for big images like "CON", each
time the adaptive policy will decide the optimal parameters using ESRR, and each
image may be different if they originally have different color depths (in bits).
• The transcoded web pages have images clear enough to be recognized, even
for the 10 s requirement case. 10 s of downloading time with transcoding is actually
5% of that without transcoding, which is 215 s (in Figures (B.3)-(B.4)). Also it is
possible to put a link in the "HTML writer" for each scaled image so that if the
client wants a clearer view, the image can be clicked to be shown bigger. At this
time, the proxy server can send back a bigger version without downloading the
image again from the original server.
• Results in the "Total downloading time" line in Table (5.7) show that the adaptive
transcoding policy works rather precisely. For example, for the 25 second requirement,
the resulting downloading time is 23.1 seconds, giving a relative error of
7.6%.
• From Figure (5.14), we see that the user can choose to download a web page within
"any" amount of time. Of course, the time should not be shorter than that of downloading
the text-only version of the web page. This is guaranteed by equation (5.3).
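The arithmetic behind these observations can be checked with a short Java sketch. The figures used in the example (a transcoded page of 22,707 bytes on a 28.8 kbps link, and 23.1 s achieved against a 25 s requirement) are the values quoted in the text and in Table (5.7) as far as they are legible; the class and method names are hypothetical.

```java
/** Sketch: transfer time over a link and relative error against a time threshold. */
final class DownloadTimeCheck {

    /** Seconds needed to move 'bytes' over a link of 'bitsPerSecond'. */
    static double transferSeconds(long bytes, double bitsPerSecond) {
        return bytes * 8 / bitsPerSecond;
    }

    /** Relative error of the achieved time against the requested threshold. */
    static double relativeError(double requestedSeconds, double achievedSeconds) {
        return Math.abs(requestedSeconds - achievedSeconds) / requestedSeconds;
    }

    public static void main(String[] args) {
        System.out.println(transferSeconds(22_707, 28_800)); // about 6.31 s
        System.out.println(relativeError(25.0, 23.1));       // about 0.076
    }
}
```

The same transferSeconds calculation with the assumed 1 Mbps proxy-to-server rate gives the saving time component of the total downloading time.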
5.3 Summary
In this Chapter we present some technical considerations for the Java implementation of the
proposed transcoding proxy system. Then some experimental test results are shown for
different transcoding scenarios: tests of the proposed transcoding policy to evaluate
transcoding parameters for different images in a web page adaptively; tests of the transcoding
proxy server to compare the downloading time with transcoding vs. without transcoding. The
test results show that the downloading time and resolution of a web page can be
controlled completely by the new proxy system. They also show an excellent performance
and efficiency of the proposed transcoding system; for example, the client can reduce the
downloading time with transcoding to 5% of that without transcoding.
Chapter 6
Conclusions and Future Work
This thesis has addressecl the tlesigri of trariscocling proxy semer. mhich consists of four
parts: the clesign of web page t rariscoding. the clesign of adapt ive transcoding policy.
the .lava iniplenieritatiori. iiritl experinierital tests. Eacli part coritributes to the pro-
posecl transcocling pro- system and riiakes it superior to the esisting transcoding proxy
systems. Following are some concliisions:
Tlic niethotl of web page transcoding rriakes the proposed prosy system capable
of transcoding the entire web page withiri one connection. The method rnakes
possible many new processing techniques that enable a wide range of transcoding
reqiiirements. for example. overall downloading optirnization (the user c m give a
time boiindary for downloading the web page), searching and filtering unimportant
or unn-anted images. and dynamic browsing (the user can set the parameters for
browsing n i th each connection). It also reduces the reqiiest clel- for unwanted
content. and will render a better caching management.
• The proposed adaptive transcoding policy can adaptively evaluate transcoding
parameters for each image inside a web page. The policy takes into consideration the
image content, decided through a new image content decision tree, and the transcoded
image quality. With the proposed adaptive policy for web page transcoding,
optimization of the downloading of the entire web page is achieved for the first time.
Also, the resulting transcoding parameters are precise, in that the error between the
resulting downloading time and the arbitrary time threshold given by the mobile
user is small.
• The research gives a system design of the proposed transcoding system as well as a
Java implementation. The design and implementation not only have the network-related
features of multiple extensible services, multiple-user support, protocol
handler, and content handler, but also have the transcoding-related features of web
page transcoding, adaptive transcoding policy, content analysis, HTML parsing,
active content filtering, and HTML writer. The implementation also features
object-oriented programming techniques and portability across all major operating systems.
• Experiments were carried out, and the results show that web page transcoding has
excellent performance and efficiency (the downloading time can be as low as 5% of the
original downloading time), with the added value that the transcoded web page still
allows the client to recognize all the images well.
• The proposed transcoding proxy system has some limitations: the system
needs more server functions, including server security and
maintenance, caching and searching capability, and resource management; the
system needs coding and decoding functions to cope with JPEG images; and the
system needs video stream transcoding functions so that the mobile user
can view video content inside a web page.
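The overall downloading optimization summarized above, in which the user gives a time boundary for the whole page, can be illustrated with a minimal Java sketch. It is not the adaptive policy of this thesis: it assumes a much simpler model in which downloading time equals bytes divided by link rate, ignores latency and transcoding delay, and applies one uniform byte-reduction ratio to all images. The class name, method name, and parameters are hypothetical.

```java
/** Hypothetical sketch: choose one byte-reduction ratio so a page fits a time boundary. */
final class TimeBudgetSketch {

    /**
     * Returns the fraction (0.0 to 1.0) of the original image bytes that can be kept
     * so that the HTML plus all images download within timeBoundSeconds.
     * Assumption: download time = bytes / bytesPerSecond (no latency, no transcoding delay).
     */
    static double imageByteRatio(long htmlBytes, long[] imageBytes,
                                 double bytesPerSecond, double timeBoundSeconds) {
        long totalImageBytes = 0;
        for (long b : imageBytes) {
            totalImageBytes += b;
        }
        if (totalImageBytes == 0) {
            return 1.0; // nothing to transcode
        }
        // Bytes left for images after the (untranscoded) HTML has been sent.
        double budgetBytes = bytesPerSecond * timeBoundSeconds - htmlBytes;
        double ratio = budgetBytes / totalImageBytes;
        // Clamp: never enlarge images, and never return a negative ratio.
        return Math.max(0.0, Math.min(1.0, ratio));
    }

    public static void main(String[] args) {
        long[] images = {40_000L, 60_000L};
        double ratio = imageByteRatio(2_000L, images, 1_200.0, 10.0);
        System.out.println("keep ratio = " + ratio);
    }
}
```

A per-image policy such as the one proposed in this thesis would distribute the byte budget unevenly according to image content; the uniform ratio here only shows how a time boundary translates into a size constraint.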
Some future work can be done to improve the proxy server system:
• Simulation with the server, the proxy server, and the client running on different hosts
can be tested to measure the transcoding time delay. At present the server, the proxy
server, and the client run on the same host for the simulation, which means
they share the entire CPU time. Also, time estimation can itself be carried out in
an adaptive way. This means the system can "learn" the best estimation
factors after doing some transcoding and recording the time. Further
experiments on time sharing among multiple clients should also be carried out.
• Some server functions can be added, for example, server security and maintenance,
caching and searching capability, and resource management. Now
that the core parts of the transcoding proxy server are already set up, one has
the option of implementing these functions while all the existing parts keep working properly.
• The ability to cope with other image formats can be added; for example, support
for JPEG images can be added to the proposed proxy system without much difficulty.
Another interesting issue is to make the proxy capable of coping with video
streams.
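The idea that the system can "learn" its time estimation factors from recorded transcoding times can be sketched as follows. This is only one possible approach, assumed here for illustration: an exponentially weighted moving average of the measured seconds per input byte. The class name, the smoothing factor alpha, and the per-byte cost model are all hypothetical and are not taken from the implementation described in this thesis.

```java
/** Hypothetical sketch: learn a transcoding-time factor from recorded measurements. */
final class AdaptiveTimeEstimator {
    private final double alpha;      // weight of the newest measurement, in (0, 1]
    private double secondsPerByte;   // learned estimation factor
    private boolean initialized;

    AdaptiveTimeEstimator(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records one completed transcoding job and updates the learned factor. */
    void record(long inputBytes, double measuredSeconds) {
        if (inputBytes <= 0) {
            return; // ignore empty jobs
        }
        double sample = measuredSeconds / inputBytes;
        if (!initialized) {
            secondsPerByte = sample;
            initialized = true;
        } else {
            // Exponentially weighted moving average of the per-byte cost.
            secondsPerByte = alpha * sample + (1.0 - alpha) * secondsPerByte;
        }
    }

    /** Estimated transcoding time for a new job; 0.0 before any measurement is recorded. */
    double estimate(long inputBytes) {
        return secondsPerByte * inputBytes;
    }

    public static void main(String[] args) {
        AdaptiveTimeEstimator estimator = new AdaptiveTimeEstimator(0.5);
        estimator.record(1_000L, 2.0);
        estimator.record(1_000L, 4.0);
        System.out.println("estimate for 2000 bytes = " + estimator.estimate(2_000L));
    }
}
```

A larger alpha makes the estimate follow recent load changes more quickly (for example, when several clients share the CPU), at the cost of more noise in the estimate.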
Figure A.26: Original web page (scaled by 0.4 x 0.4)
B. Transcoded web pages
Figure B.1: Transcoded web page (ETP) (scaled by 0.8 x 0.8) (for "CON": R=0.315, C=4; for other content: R=0.625, C=4)
Figure B.2: Transcoded web page (ETP) (scaled by 0.8 x 0.8) (colored; for "COS": R=0.375, C=4; for other content: R=0.625, C=4)
Figure B.3: Transcoded web page (ESRR with 10s) (scaled by 0.8 x 0.8)
Figure B.4: Transcoded web page (ESRR with 15s) (scaled by 0.8 x 0.8)
Figure B.5: Transcoded web page (ESRR with 20s) (scaled by 0.8 x 0.8)
Figure B.6: Transcoded web page (ESRR with 25s) (scaled by 0.8 x 0.8)
Figure B.7: Transcoded web page (ESRR with 30s) (scaled by 0.8 x 0.8)
Figure B.8: Transcoded web page (ESRR with 35s) (scaled by 0.8 x 0.8)
Bibliography
[l] Guido X I . S huster. etc.. 5patial ly disjoint source channel coding: taking advan-
tage of the current dial-up architectiire for vicleo over the Internet". IEEE Proc.
International Conference on Image Processing. Vol. 3. p. 17. Oct. 1998
[2] .J. R. Smith. etc.. ''Content-basecl transcocling of images in the Internet". Proc. IEEE
International Conference on Image Processing. Vol. 3. Oct. 1998
(31 Palm \'II. h t tp://w~~wvw.palm.com/procliicts/palmvii/serviceplans. htrnl
(41 W. Pennebaker. J . '\.Iitchell. ".IPEG still image compression standard". Chapman Sr
Hall. 1993
[5] -4. Ortega. etc.. *'Frorn digitized images to online catalogs. data mining a sky survey".
.\I Slagazine. Arnerican r\ssociation for Artificial Intelligence (AAAI). Summer 1996
[61 Richard Han. etc.. .bDynamic Adaptation in an image transcoding prosy for mobile
web bromsing'. IEEE Personal Communications. Dec. 1998
[Tl T. Kostas. etc.. "Real-tinie voice over packet switched networks"? IEEE Network
magazine. vol. 12. pplS 27. Feb., 1998
[SI Harini Bharadi-aj, etc.: "An active transcoding proxy to support mobile web access",
IEEE Proc. Symposium on Reliable Distributed Systems. 1998, pp118 123
[9] 11. Neison. .J. Gailly. T h e data coriipression book". 2ncl ecl.. SI S T Books. 1996
[IO] A. Fox. etc.. ..Retlucing W i V W latency and bandaidt h requirernents by real-t iine
tlistillations". Fifth International Worlcl Wide Uéb Conference. SI- 1996
[Il] SI. Liljeberg. etc.. --Entiancecl services for World-Wcie Héb in niobile wan environ-
nient". report C-1996-28 April 1996
(121 Markku Kojo. etc.. . A n efficient transport service for slow wireless telephone links".
IEEE Journal on Selectecl Areas in Conirnunications. Vol. 15. 30. 7. pp1337 13-18.
Sept. 1997
[13] Intel Corporatiori. %tel cpick web tectinology: white paper".
http://w~viv.intel.coni/qi~ickweb/mhite.htni.
[l4I S pyglass-Prisni. ht t p://~viv~r..spyglass/procliicts/prisni
[lj] IBhI Corporation. "Ringing in wireless services: web access withoitt wires",
h t tp://at~tv.ibni.com/stories/ 1997/0S/aireless. ht ml
[16] IBSI Corporation. http://~~rvw.almnden.ibm.coni/cs/rvbi/incles.html
[li] IBhI Corporation. ht t p://mww-4.ibm.com/software/webservers/ t ranscocling
[lSI IBM Corporation. http://www.almanden.ibm.com/cs/1vbi/papers/chi9ï/~~~bipaper.html
[19] IBN Corporation. ht tp://~viv~v.edmark.coni/prod/kdis/
.J. R. Smith. etc.. Transcoding internet content for heterogeneous client devices".
Proc. IEEE International Symposium on Circuits and Systems. Vol. 3. June 1998
Rakesh Uohan. etc.. %iapt ing multimedia Internet content for universal access" . IEEE Trans. on Siultimedia? Vol. 1: No. 1: March 1999
[22] C. Brooks. etc.. *Applicat ion-specific prosy servers as HTTP stream transducers" .
Proc. \V\VIV-4, Boston. ht t p://ww~v.~v3.org/piib/conferences/wivn4/papers/56.
Slay 1996
[23] Ari Luotonen. W e b Pros? Servers'? . Prentice Hall. 1998
1241 .'HyperTest Transfer Protocol-HTri'P/ 1.O". KFC 1945
[25] -HyperText Transfer Protocol-HTTP/l. 1". RFC 2068
[26] CIw. Richard Stevens. T N I S netnork programniing-Networking APIS: Sockets ancl
STI" . 1998. Prentice Ha11 PTR
[27] Elliotte Riisty Harold. ".Java Network Progreniming" . 1997. O'REILLY
[?SI Postel. J . B.. "Transniission Control Protocol". RFC 793. 1951
[29] http://java.sun.coni/. .Java 2 Platforni ;\PI Specification
[30] Elizabeth Castro. "HTSLL for the Worltl Wicie W e b . 2nd Ecl.. Peachpit Press. 1997
[31] Valerie Quercia. "Internet in a nutshell". O'Reilly. 1997
[32] .John Zukowski "dava AWT reference". O'REILLY. 1997
(331 Alberto Leon-Garcia. "Probability and random process for electrical engineering",
Addison-Wesley. 1994
[34] Bruce Eckel. "Thinking in dava". Prentice-Hall Canada. 1998
[35] Xancy .J. Yeager. Robert E. SlcGrath. 9Veb semer technology: the advanced guide
for world web information providers.'. Morgan Kaufmann Publisherso 1998