A System Architecture of Intelligent-Guided Browsing on the Web

Hsiangchu Lai, Tzyy-Ching Yang
Department of Information Management
National Sun Yat-sen University, Taiwan, R. O. C.
[email protected], [email protected]

1060-3425/98 $10.00 (c) 1998 IEEE

Abstract

Compared with traditional business operations, www-based commerce has many advantages, such as timeliness, worldwide communication, hyper-links, and multimedia. However, its major weakness is that it lacks the customized interactive abilities of traditional sales representatives. To gain competitive advantages over the countless web sites, it is critical to have such customized interactive abilities. The purpose of this paper is to present a system architecture of intelligent-guided browsing on the web. In the architecture, we present five kinds of browsing agents: recommendation agent, new-content agent, search agent, customized agent, and personal-status agent. To support these agents, a user analyzer maintains the user profile by analyzing the log file and CGI parameters, and a site monitor maintains the site database by monitoring all changes of the site. Finally, we present a prototype to demonstrate the proposed system architecture.

1. Introduction

The World Wide Web has gained amazing popularity through the availability of “point and shoot” browsing tools [7] like Netscape and Explorer. Due to the surprising growth of its user population, www-based commerce has become a new competitive business weapon [4] for companies, whether their goals are increasing revenue, streamlining business processes or enhancing productivity [6].

Compared with the traditional business operation environment, the web has many advantages, such as timeliness, worldwide communication, hyper-links, and multimedia. However, boundless resources and lots of freedom result in some browsing problems, such as getting lost, spending much time, and missing the most valuable pages or something new.

In addition, the lack of the one-to-one and customized interactive abilities of traditional sales representatives is another major weakness of current www-based commerce. In fact, it is critical to have such customized interactive abilities in order to gain competitive advantages over the countless web sites. Unfortunately, the cost to develop and maintain customized browsing capabilities will be very high, or the task even impossible, if they cannot be implemented automatically.

An intelligent agent can be a solution because it is an autonomous softbot that can act intelligently based on the monitored environment. Therefore, if there are some kinds of intelligent-guided browsing agents to improve users’ browsing activities, the above problems may be solved. The purpose of this paper is to present a system architecture of intelligent-guided browsing on the web. The rest of this paper is organized as follows. In Section 2, we provide a classification framework of web browsing activities to describe problems and the capabilities required to solve them. In Section 3, we summarize the general characteristics and intelligent capabilities a web site requires for improving browsing activities. Then we propose a system architecture of intelligent-guided browsing and discuss all details of the major components. In Section 4, a prototype is used to explain how to design these agents and how they work. Finally, we end this paper with conclusions in Section 5.

2. Classification framework of web browsing activities

Before discussing online browsing problems, we first categorize browsing activities. We propose two dimensions: web familiarity and purpose of browsing. Web familiarity refers to whether the user is familiar with the web site. It affects the way the user browses. On the other hand, the purpose of browsing refers to whether a browsing behavior is driven by a particular purpose. If there is no purpose, the user may browse whatever he pleases or randomly. In terms of purposes, we can classify it into two further categories: a regular purpose or an ad hoc purpose. For example, a user may have a habit of reading news online every day. Therefore, browsing for a regular purpose means that the user browses a web site for the same purpose regularly. On the other hand, if a user browses the New York Times for job hunting, it is for an ad
hoc purpose because it may happen only when the user wants to find a job. Table 1 is the proposed classification framework of browsing activities.

Table 1. Classification framework of web browsing activities

                                Web familiarity
Purpose of browsing             unfamiliar      familiar
------------------------------  --------------  --------------
Purposeless browsing
Purposeful browsing, regular
Purposeful browsing, ad hoc

2.1 For users unfamiliar with a web site

A user unfamiliar with a web site may browse it aimlessly or on an ad hoc purpose to search for some particular information. If the user happens to browse a web site without any purpose, he may just try each link step by step or randomly to see what the web site has. It may cost the user much time to browse all hyperlinked pages backwards and forwards continually. This time-consuming process will become worse if homepages contain multimedia [12]. In addition, the user probably quits before he browses all pages because nothing can excite him. Furthermore, during the browsing process, he may miss some valuable pages or get lost due to the hyperlinked structure. For example, in order to know the details of a selected topic, users usually browse it by depth-first search. Then, it is very easy to miss other links at higher or lateral levels. For a web site designer, it is important to help users quickly understand the structure and contents of the site and keep users interested through the whole browsing process.

There are several possibilities for solving the above problems. First, a web site can provide users with an overview map to help them quickly catch its whole picture. Second, providing a trace map for the user during his browsing process can avoid the well-known problem of getting lost in hyperspace. For example, WebMap updates a 2-dimensional graphical map of the user’s journey by dynamically analyzing the navigation actions [2]. Third, a window can be dedicated to providing all main options no matter where the user is browsing. This method can more or less prevent users from getting lost or quitting before browsing all pages. Fourth, recommending the most often visited pages of the web site to users will not only catch the users’ attention but also allow them to grasp the interesting pages immediately. Fifth, highlighting the most valuable pages can reduce the risk of not browsing them. Finally, providing valuable contents is the root of attracting users’ attention all the time.

If a user unfamiliar with a web site browses it on purpose to search for some information, this purpose will be a kind of ad hoc purpose. Because the user is not familiar with the locations of the desired information or the provided search engine, he may have an inefficient search or not be satisfied with the search result. Here, we propose several helpful solutions. First, users need some intelligent search engines embedded in the web site to help them make an efficient and effective search. For example, Excite for Web Servers [3] offers concept-based text searching which enables users to restart a search by synonyms of the original keyword. Second, providing an overview map is another helpful tool. Finally, listing search keywords and hot links can be useful too for the user unfamiliar with a web site.

2.2 For users familiar with a web site

Different from users unfamiliar with the web site, users familiar with the web site may visit it very often, know what the content is, and even have registered as members. However, the purposeless user may have issues similar to those encountered by users unfamiliar with the web site. Therefore, the above possibilities suggested for web site design, such as providing a trace map or recommending hot pages, are applicable here too. But there is a unique issue encountered by users familiar with the web site: the user may miss something new. In order to solve this problem, a widespread design is to highlight new content. Because the web site may have collected frequent users’ browsing activities or personal backgrounds, it is possible to provide tailored homepages for users. For example, the web site can provide the hot pages browsed by users with a background similar to the user’s, or it can present all new content which has been changed or added since the user’s last visit. This should help turn the user into a frequent user. Overall, for the user browsing a familiar web site aimlessly, how to catch his attention is an important design issue.

On the other hand, if the user browses a web site on purpose, it may be either for a regular purpose or for an ad hoc purpose. Most of the time, the regular purpose is to browse routinely updated homepages, such as to read news every day. The news homepage design may stay unchanged for a long time, but its content changes every day or even every half hour. The possible problems may include inefficient browsing, missing new content and losing interest because nothing is exciting. Because most web sites not only update their content frequently but also change their design quite often, the web site should remind the frequent user with a regular purpose that there is something new while presenting him the regularly updated information. A current browsing agent, Web Browser Intelligence (WBI), can check whether the user’s favorite pages have new content, and provide suggestions to the user [1].

When the user visits a site regularly for the same purpose, his browsing behavior may form a pattern which displays a particular browsing path through the site. If the web site can present a tailored page or filter information for each regular user based on his browsing pattern, it could save the user much time and keep his visiting interest. For example, Letizia, a user interface agent [9], can trace “user browsing behavior - following links, initiating searches, requests for help - and tries to attempt what items may be of interest to him. [10]”

Regarding the user on an ad hoc purpose to browse the web site, he may have the same problems as those of users unfamiliar with the web site. Therefore, solutions to help users unfamiliar with the web site will also be useful. In addition, highlighting new content is another helpful design. Moreover, an ad hoc purpose might be that the user wants to know up-to-the-minute information about the services he has requested. For example, when the user sends an online order or posts messages on a board, he may be concerned about the progress afterwards. On these occasions, the web site should sum up the user’s personal status instead of asking him to jump to different links to check the information. We summarize the above discussions in Table 2 and Table 3.

Table 2. Problems of web browsing activities

Purpose of browsing    Web familiarity: unfamiliar         Web familiarity: familiar
---------------------  ----------------------------------  ----------------------------------------------
Purposeless browsing   * cost much time                    * cost much time
                       * get lost                          * get lost
                       * miss the most valuable pages      * miss the most valuable pages
                       * quit before browsing all pages    * lose interests due to nothing new or exciting
                                                           * miss something new

Purposeful browsing,                                       * browse inefficiently
regular                                                    * miss something new
                                                           * lose interests due to less value of content

Purposeful browsing,   * have inefficient and              * have inefficient and ineffective search
ad hoc                   ineffective search                * have unsatisfied search result
                       * have unsatisfied search result


Table 2 lists problems of web browsing activities while Table 3 portrays the necessary requirements of the web site to improve web browsing activities. Both tables are based on the proposed classification framework of web browsing activities. Such systematic discussion can serve as a foundation for designing a web site which can give comprehensive help to all kinds of users no matter how frequently they visit the web site, how familiar they are with it and whether they access it by chance or on purpose.

Table 3. Requirements of improving web browsing

Purpose of browsing    Web familiarity: unfamiliar              Web familiarity: familiar
---------------------  ---------------------------------------  ---------------------------------------
Purposeless browsing   * overview map                           * overview map
                       * trace map                              * trace map
                       * a dedicated window to main options     * a dedicated window to main options
                       * hot link recommendation                * hot link recommendation
                       * valuable content                       * valuable content
                       * flexibility                            * flexibility
                       * highlight of the most valuable links   * highlight of the most valuable links
                                                                * highlight of new content
                                                                * tailored homepage

Purposeful browsing,                                            * tailored homepages
regular                                                         * highlight of new content
                                                                * information filtering

Purposeful browsing,   * overview map                           * overview map
ad hoc                 * intelligent search engine              * intelligent search engine
                       * default keyword list                   * default keyword list
                       * hot link recommendation                * hot link recommendation
                                                                * highlight of new content
                                                                * personal current status

3. System architecture

Based on Table 3, we can learn that the general required characteristics of a web site are to provide valuable content and an overview map and to allow users great browsing flexibility. However, this is not enough to provide an excellent browsing environment for its users unless the site has the intelligent capabilities listed in Table 4.

In order to have the intelligent capabilities, the web site has to develop several kinds of intelligent agents. An intelligent agent is a software program which has the capability of autonomous goal-oriented behavior in a heterogeneous computing environment. As with any software, a basic component is the user interface [11,14]. In order to have intelligence, it should have a control engine to guide the agent’s behavior based on its knowledge base [5,13]. Therefore, the control engine and knowledge base are the other two major components of an intelligent agent. According to the above discussions, we propose a system architecture of intelligent-guided browsing as Figure 1.

Table 4. Intelligent capabilities and agents for improving browsing activities

For users unfamiliar with the web site
  Required intelligent capabilities        Required intelligent agents
  * Recommend hot links                    -> recommendation agent
  * Highlight the most valuable content
  * Provide default search keywords        -> search agent
  * Provide intelligent search

For users familiar with the web site
  Required intelligent capabilities        Required intelligent agents
  * Recommend hot links                    -> recommendation agent
  * Highlight the most valuable content
  * Provide intelligent search             -> search agent
  * Provide default search keywords
  * Highlight new contents                 -> new-content agent
  * Provide customized content             -> customized agent
  * Monitor personal status                -> personal-status agent

Figure 1: A system architecture of intelligent-guided browsing

[Figure not reproduced. Labeled components: WWW Browser and User (Client Request, Server Reply); Web Documents; Log File & CGI Parameters; User Analyzer; User Profile (Stable: Background, Agent-guide; Variable: Transaction status, Log record); Site Monitor; Site Database (Site contents, Site relations); Intelligent-Guided Browsing Agents (Recommendation / New-content / Search / Customized / Personal-status browsing agent), each with User Interface, Application Layer, Control Engine and Knowledge Base.]

In the architecture, we present five types of application agents to provide intelligent-guided browsing capability: recommendation agent, new-content agent, search agent, customized agent and personal-status agent. In order to let these intelligent browsing agents work well, it is necessary to have a user analyzer to maintain the user profile by analyzing the log file and CGI parameters, and a site monitor to maintain the site database by tracing all changes of documents in the site. We will discuss all details of these major components in the following sections.

3.1 User analyzer and user profile

The more a sales representative understands his customers, the better he will be. In fact, the web site plays the role of a sales representative and any user of the web site is a potential customer. The ultimate purpose of all agents proposed here is to satisfy users. In order to achieve this ultimate purpose, the starting point is to collect and analyze user behavior. The user analyzer and user profile are designed for this purpose. The user profile is a database that keeps all information about users. We take the classification proposed by Laine-Cruzel, et al. [8] as a foundation to characterize each user by stable information and variable information. Stable information includes all data about individual inherent characteristics and preferences, such as background and agent-guides. Background information is a general description of the user, such as name, gender, interests and job. Agent-guides consist of what the user asks agents to do for him. For example, he may ask that the number of hot links recommended by an agent not exceed 5. Variable information consists of the changeable information about the user’s purchasing records or online browsing activities. A user’s transactions with the web site belong to this type of information. Another major type is log records of users’ browsing behavior.
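The split between stable and variable information can be pictured as a small data structure. The following is only a minimal sketch in Python, not the schema used in the paper (the prototype stores its data in Access databases); the class names, field names and the `max_hot_links` key are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StableInfo:
    """Inherent characteristics and preferences of a user."""
    # General description: name, gender, interests, job, ...
    background: Dict[str, str] = field(default_factory=dict)
    # Agent-guides: instructions the user gives to each agent (hypothetical layout).
    agent_guides: Dict[str, Dict[str, int]] = field(default_factory=dict)


@dataclass
class VariableInfo:
    """Changeable information about a user."""
    transactions: List[dict] = field(default_factory=list)  # orders, messages, ...
    log_records: List[str] = field(default_factory=list)    # visited documents


@dataclass
class UserProfile:
    user_id: str
    stable: StableInfo = field(default_factory=StableInfo)
    variable: VariableInfo = field(default_factory=VariableInfo)

    def max_hot_links(self, default: int = 10) -> int:
        """Return the user's limit on recommended hot links, if he set one."""
        guide = self.stable.agent_guides.get("recommendation", {})
        return guide.get("max_hot_links", default)


# Example: a user who asks for at most 5 recommended hot links.
profile = UserProfile(user_id="u001")
profile.stable.background.update({"name": "Example User", "job": "student"})
profile.stable.agent_guides["recommendation"] = {"max_hot_links": 5}
profile.variable.log_records.append("/bookstor/login.htm")
```

Keeping agent-guides inside the stable part mirrors the text: they change only when the user edits his instructions, whereas transactions and log records grow with every visit.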

The log file and CGI parameters are the two main data sources of the user profile. The log file records a lot of information, such as client IP, server IP, navigation date and time, document name and file size, for all browsing activities that occur on the web site. It is recorded on the web server and is the main information source for analyzing online behavior, such as which homepages are visited most frequently. However, the log file alone is not sufficient to create a comprehensive user profile. CGI parameters are another important data source of the user profile. For example, the transaction data comes from CGI parameters rather than log files. The user analyzer is designed to continually analyze log files and CGI parameters to grasp valuable information in order to update the user profile.

3.2 Site monitor and site database


To satisfy users, understanding them is only a starting point. The ultimate goal is to provide what the users are concerned with or interested in, based on the user profile. In addition, the web site should also provide efficient and effective browsing functions and reduce the risk of users’ getting lost or missing the most valuable information. Therefore, we need to have a full understanding of all documents of the web site. The site database is created to serve this function. It stores important information about each homepage. Site content and site relations are the two major types of information in the site database. Site content refers to the characteristics of the document, such as URL, last modified time, file size, title, summary and keywords. On the other hand, site relations describe the relations among homepages. Their types can be tree, graph or other structures. They are very helpful in assisting search functions.

The site monitor is created to capture site information in order to update the site database. It will capture what the site documents are in terms of site content and site relations and detect any changes to them. An up-to-date site database is another fundamental element for reaching the goal of satisfying users.
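One simple way to picture the site monitor's change detection is to compare two snapshots of the site content. The sketch below is an assumption-laden illustration in Python, not the paper's implementation: the `SiteContent` record and the `detect_changes` function are hypothetical names, and only a few of the site-content fields mentioned above are modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SiteContent:
    """A subset of the site-content fields named in the text."""
    url: str
    last_modified: str  # any comparable date string, e.g. "1997-04-23"
    file_size: int
    title: str = ""


def detect_changes(old: Dict[str, SiteContent],
                   new: Dict[str, SiteContent]) -> Dict[str, List[str]]:
    """Compare two snapshots keyed by URL and report what changed."""
    added = sorted(url for url in new if url not in old)
    removed = sorted(url for url in old if url not in new)
    modified = sorted(url for url in new
                      if url in old and new[url] != old[url])
    return {"added": added, "removed": removed, "modified": modified}


# Hypothetical snapshots taken before and after an update of the site.
before = {
    "/award.htm": SiteContent("/award.htm", "1997-04-23", 1989),
    "/board.htm": SiteContent("/board.htm", "1997-03-15", 1993),
}
after = {
    "/award.htm": SiteContent("/award.htm", "1997-04-28", 2100),
    "/award3.htm": SiteContent("/award3.htm", "1997-04-28", 1651),
}
changes = detect_changes(before, after)
```

The `added` and `modified` lists are the kind of information a new-content agent would need; how often snapshots are taken is left open here.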

3.3 Intelligent-guided browsing agents

From Figure 1, we can see that the user profile and site database exist to support five kinds of agents: recommendation agent, new-content agent, search agent, customized agent and personal-status agent. Each agent’s goal and examples of tasks are listed in Table 5. The recommendation agent is designed to give browsing suggestions based on analyses of user behavior and site content, while the responsibility of the new-content agent is to draw users’ attention to new content. As for the search agent, it will help users to conduct an efficient and effective search. The goal of the customized agent is to learn the user’s browsing pattern and, therefore, to present tailored homepages to each user. Finally, the personal-status agent will provide up-to-date personal status upon each user’s request.

Each agent consists of a user interface, application layer, control engine and knowledge base. The user interface is responsible for communication between users and the agent. The application layer defines the goal and execution conditions of the agent. When the user interface receives an external message, the application layer will check whether the execution condition is matched. If it is, the control engine will be enabled. Following that, the control engine controls the problem solving process. It may fire appropriate rules which reside in the knowledge base to decide how to guide the user through his browsing process. It will also collect the required CGI parameters during the process. The system then sends the results to the user interface for presentation to users.
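The message flow just described (user interface, then the application layer's execution condition, then the control engine firing knowledge-base rules) can be sketched in a few lines. This is a hedged illustration in Python only: the prototype's CGI programs were written in Visual Basic, and the class, the rule function and the message keys below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
Rule = Callable[[Message], Optional[str]]  # returns guidance, or None if not fired


@dataclass
class BrowsingAgent:
    goal: str                                         # application layer: goal
    execution_condition: Callable[[Message], bool]    # application layer: condition
    knowledge_base: List[Rule] = field(default_factory=list)
    collected_parameters: List[Message] = field(default_factory=list)

    def control_engine(self, message: Message) -> List[str]:
        """Fire every rule in the knowledge base that applies to the message."""
        self.collected_parameters.append(dict(message))  # keep parameters seen
        results = []
        for rule in self.knowledge_base:
            advice = rule(message)
            if advice is not None:
                results.append(advice)
        return results

    def user_interface(self, message: Message) -> List[str]:
        """Receive an external message and return results for presentation."""
        if not self.execution_condition(message):
            return []  # condition not matched: control engine stays disabled
        return self.control_engine(message)


def new_since_last_visit(message: Message) -> Optional[str]:
    """Hypothetical rule: highlight a page modified after the last visit."""
    if message.get("last_visit", "") < message.get("page_modified", ""):
        return "highlight " + message["page"]
    return None


new_content_agent = BrowsingAgent(
    goal="Catch users' attentions to new content",
    execution_condition=lambda m: m.get("request") == "new-content",
    knowledge_base=[new_since_last_visit],
)
result = new_content_agent.user_interface({
    "request": "new-content",
    "page": "/hotnew4a.htm",
    "last_visit": "1997-07-01",
    "page_modified": "1997-07-07",
})
```

Representing rules as plain functions keeps the sketch short; a real knowledge base could equally hold declarative rules interpreted by the control engine.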

Table 5. Goal and examples of tasks of intelligent-guided browsing agents

(1) Recommendation agent
Goal: Give browsing suggestions by analyzing behavior of reference group and site content.
Examples of tasks:
  * Retrieve log file and count the hits of each homepage;
  * Recommend hot links based on the rank of hits of homepages from all users;
  * Recommend different hot links to different types of users based on the rank of hits of homepages from users of same type;
  * Highlight the most valuable content for all users based on the judgment of the web site owner;
  * Analyze the interests of different types of users and then present a tailored highlight of the most valuable content to each user according to his type.

(2) New-content agent
Goal: Catch users’ attentions to new content.
Examples of tasks:
  * Retrieve new content from the site database and highlight them for general users;
  * Based on the user profile, retrieve new content from the site database which are new for that user and present him a tailored highlight of new content.

(3) Search agent
Goal: Provide efficient and effective search.
Examples of tasks:
  * Provide keywords list;
  * Give suggestion of precise keywords to the user based on his query;
  * Learn the user’s search behavior as basis of providing suggestions for his next search;
  * Provide default keywords for search.

(4) Customized agent
Goal: Present customized homepages based on the analysis of a user’s browsing pattern.
Examples of tasks:
  * Analyze the user profile such as hits of each homepage, personal status, and background to find his browsing pattern and preference;
  * Analyze the site database such as each homepage’s hits, change frequency, size, keywords;
  * Apply AI algorithm to the analysis of the user profile and the site database to provide a tailored homepage for each user.

(5) Personal-status agent
Goal: Provide up-to-date personal status for each user.
Examples of tasks:
  * Monitor all transactions of web site and update the user profile if it is related to any user;
  * Summarize up-to-date personal status in a single homepage upon the user’s request.
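As an illustration of the recommendation agent's first tasks in Table 5 (ranking homepages by hits, for all users or for users of the same type), the following Python sketch may help. It is not the prototype's code; the `(user_type, page)` hit format, the function name and the tie-breaking by page name are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

Hit = Tuple[str, str]  # (user_type, page); hypothetical record format


def recommend_hot_links(hits: Iterable[Hit],
                        user_type: Optional[str] = None,
                        limit: int = 5) -> List[str]:
    """Rank pages by number of hits and return the top `limit` pages.

    With user_type=None all users are counted; otherwise only hits from
    users of the same type are counted.
    """
    counts = Counter(page for utype, page in hits
                     if user_type is None or utype == user_type)
    # Sort by descending hit count; break ties by page name for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:limit]]


sample_hits = [
    ("student", "/a.htm"), ("student", "/a.htm"), ("student", "/c.htm"),
    ("teacher", "/b.htm"), ("teacher", "/b.htm"), ("teacher", "/b.htm"),
]
for_everyone = recommend_hot_links(sample_hits)
for_students = recommend_hot_links(sample_hits, user_type="student")
```

The `limit` argument corresponds to the kind of agent-guide a user can set, such as asking for no more than 5 recommended hot links.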


4. Prototyping

In this section, we describe some prototyping experience of running the web site of the Fu-Wen bookstore. The bookstore is the university bookstore located in National Sun Yat-sen University, Kaohsiung, Taiwan. Its target customers are all teachers and students in the university. The reasons we chose it as a pilot web store for developing intelligent customer services are to have more convenient communications due to its location, to provide better services to all members in school, and to offer us an opportunity to do field experiments. We developed some applications based on the proposed system architecture of intelligent-guided web browsing.

The prototype is running on Windows NT. We chose WebSite 1.1 as the web server because it provided us some examples and modules for developing CGI programs. All CGI programs were written with Visual Basic (VB) 4.0 which could maintain the databases of Access 7.0.

4.1 User analyzer and user profile

The log file and CGI parameters are the two sources of the user profile. Table 6 is an example of the log file, access.log, provided by WebSite 1.1. It records client IP, server IP, navigation date, navigation time, document name, HTTP protocol and file size of any browsing activities occurring at the web site. These records can be used to analyze online behavior, such as the number of hits of each homepage.

However, the log file itself is not enough to provide all information required to create the user profile. We wrote CGI programs to supplement it. The major function of the CGI programs is to collect information about users’ activities and the related environment of the web site by passing CGI parameters.

Table 6. An example of the log file

140.117.95.176 140.117.75.61-[07/Jul/1997:08:15:59 -0800] "GET /bookstor/login.htm HTTP/1.0" 200 3056
140.117.95.176 140.117.75.61-[07/Jul/1997:08:16:00 -0800] "GET /Picture/Backgnd2.jpg HTTP/1.0" 200 9753
140.117.74.176 140.117.75.61-[07/Jul/1997:10:33:11 -0800] "GET /cgi/agent1b.exe HTTP/1.0" 200 830
203.66.35.34 140.117.75.61-[07/Jul/1997:10:52:55 -0800] "GET /hotnew4a.htm HTTP/1.0" 200 2856
203.66.35.34 140.117.75.61-[07/Jul/1997:10:52:59 -0800] "GET /Picture/Hotnew6.jpg HTTP/1.0" 200 4478
203.66.109.55 140.117.75.61-[08/Jul/1997:00:31:55 -0800] "GET /bookstor/picture/first1.gif HTTP/1.0" 200 53782
140.117.110.207 140.117.75.61-[18/Jul/1997:05:49:58 -0800] "GET /cgi/checkord.exe HTTP/1.0" 200 290


The user analyzer is developed to analyze the log file and CGI parameters to update the user profile. It is executed automatically whenever there are new data from the log file or CGI parameters. Then it analyzes the data contents and updates the user profile. It will delete log records which are unnecessary or have been kept for one month.
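To make the log analysis concrete, the sketch below parses lines in the layout shown in Table 6 and counts hits per homepage. It is written in Python purely for illustration (the prototype used Visual Basic 4.0), and several points are assumptions: the regular expression is inferred from the sample lines rather than from WebSite 1.1 documentation, the field names are hypothetical, and treating only `.htm`/`.html` documents as homepages is a simplification.

```python
import re
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Optional

# Pattern inferred from the Table 6 samples:
# client IP, server IP, "-", [timestamp], "METHOD document PROTOCOL", status, size
LOG_PATTERN = re.compile(
    r'^(\S+)\s+(\S+?)\s*-\s*\[([^\]]+)\]\s+"(\S+)\s+(\S+)\s+(\S+)"\s+(\d+)\s+(\d+)$'
)


def parse_log_line(line: str) -> Optional[Dict[str, object]]:
    """Split one access.log line into fields; return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    client, server, stamp, method, document, protocol, status, size = match.groups()
    return {
        "client_ip": client,
        "server_ip": server,
        "time": datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"),
        "method": method,
        "document": document,
        "protocol": protocol,
        "status": int(status),
        "size": int(size),
    }


def count_page_hits(lines: Iterable[str]) -> Counter:
    """Count hits per homepage, skipping images and CGI programs (assumption)."""
    hits: Counter = Counter()
    for line in lines:
        record = parse_log_line(line)
        if record is None:
            continue
        document = str(record["document"])
        if document.lower().endswith((".htm", ".html")):
            hits[document] += 1
    return hits


sample_lines = [
    '140.117.95.176 140.117.75.61-[07/Jul/1997:08:15:59 -0800] "GET /bookstor/login.htm HTTP/1.0" 200 3056',
    '140.117.95.176 140.117.75.61-[07/Jul/1997:08:16:00 -0800] "GET /Picture/Backgnd2.jpg HTTP/1.0" 200 9753',
    '203.66.35.34 140.117.75.61-[07/Jul/1997:10:52:55 -0800] "GET /hotnew4a.htm HTTP/1.0" 200 2856',
]
page_hits = count_page_hits(sample_lines)
first_record = parse_log_line(sample_lines[0])
```

Lines that do not match the pattern are skipped rather than raising an error, since the text notes that unnecessary log records are discarded anyway.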

We collect a member’s background information through a registration form. In order to become a member, the user has to fill in personal background, such as ID, name, password, gender, address, telephone, e-mail, job, school, department and reading interest.

The user’s background information is only part of the stable information about him. Another part is how a user gives instructions to his agents, i.e., agent-guides. In the prototype, we give users the privilege of instructing their agents. Normally, for each type of instruction, there are several options available. For example, the agent-guide of the recommendation agent includes the period of time that the user wants to refer to and the number of recommended links he desires. When the user gives his instructions, the user analyzer will catch these CGI parameters, identify the user and then update the user profile with these data.

Regarding variable information, the user analyzer will record and analyze all transaction data to study user behavior. The transaction data includes orders, book recommendations, book searches and messages, etc.

In addition to transaction data, the log record is another type of variable information. It records all browsing activities by the sequence of URLs visited. Since “access.log” does not have information about who has visited the web site, the user analyzer utilizes the IP address passed by a CGI parameter as a key to identify the user by mapping the log record to the user profile. In this way, the owner of each log record can be identified. Based on this identification, it retrieves relevant personal background information and appends it to the log record. Meanwhile, some meaningless information in the log file will be deleted.
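The identification step and the one-month retention rule can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the prototype's logic: the record layout, the `ip_to_user` mapping (imagined as built from IP addresses passed in CGI parameters), the function names, and the approximation of "one month" as 30 days are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def annotate_log_records(records: List[dict],
                         ip_to_user: Dict[str, str],
                         backgrounds: Dict[str, dict]) -> List[dict]:
    """Attach the user ID and background to each log record, keyed by client IP."""
    annotated = []
    for record in records:
        user_id: Optional[str] = ip_to_user.get(record["client_ip"])
        background = backgrounds.get(user_id, {}) if user_id else {}
        annotated.append({**record, "user_id": user_id, "background": background})
    return annotated


def prune_old_records(records: List[dict],
                      now: datetime,
                      max_age: timedelta = timedelta(days=30)) -> List[dict]:
    """Drop records older than max_age (one month approximated as 30 days)."""
    return [record for record in records if now - record["time"] <= max_age]


log_records = [
    {"client_ip": "140.117.95.176", "document": "/bookstor/login.htm",
     "time": datetime(1997, 7, 7, 8, 15, 59)},
    {"client_ip": "203.66.35.34", "document": "/hotnew4a.htm",
     "time": datetime(1997, 5, 1, 10, 0, 0)},
]
ip_to_user = {"140.117.95.176": "m001"}        # hypothetical member ID
backgrounds = {"m001": {"job": "student"}}     # hypothetical background data

annotated = annotate_log_records(log_records, ip_to_user, backgrounds)
recent = prune_old_records(annotated, now=datetime(1997, 7, 18))
```

Records whose IP address cannot be matched keep `user_id` set to `None` here; whether the prototype kept or discarded such records is not stated in the text.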

4.2 Site monitor and site database

The site monitor is designed to collect site information and update the site database. Due to time and manpower limits, we chose Tornado, a full-text search engine applicable to the Chinese environment, as a tool instead of writing programs from scratch to analyze homepages in order to build the site database. “Tornado” is developed by the GACT group of Chang-Yang Information Company in Taiwan. Its advantage is that it can run in a Chinese environment. It can automatically analyze all files and produce keywords, title, summary and file name for each file following the path specified by the web site owner. The results are stored in the site database. Because some homepages in our web site are not big enough to form relevant keywords automatically, we give one to three default keywords for these homepages manually. Furthermore, we use the function “Page Info” of Netscape to collect each homepage’s last modified time and its file size. Table 7 is an example of the site database.

For the purpose of providing intelligent new book recommendations, the site database should include information about books. Whenever we add a new book, we update the database with the book’s title, file name and how many pictures it contains. In addition, we categorize these new books. The book category is used to sort books.

Table 7. An example of the site database

File-Name   Words  Graphs  Links  Level  Modified Date  Click  Title  Keywords
award.htm   1989   7       17     3      4/23/97        8      …      …
award3.htm  1651   6       17     1      4/28/97        10     …      …
board.htm   1993   9       22     3      3/15/97        5      …      …
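To make the layout of the site database concrete, the rows of Table 7 can be represented as records like the ones below. This is a sketch under our own assumptions: the field names, the reading of “Click” as a visit counter, and the sample default keyword are hypothetical, and the title and keyword values that Table 7 elides are left empty.

```python
from dataclasses import dataclass, field


@dataclass
class PageRecord:
    """One row of the site database, with the columns of Table 7."""
    file_name: str
    words: int
    graphs: int
    links: int
    level: int
    modified: str                 # kept as the table's M/D/YY text
    clicks: int                   # assumed meaning of the "Click" column
    title: str = ""
    keywords: list = field(default_factory=list)


def with_default_keywords(record, manual_keywords):
    """Fall back to one to three hand-picked keywords when none were extracted."""
    if not record.keywords:
        if not 1 <= len(manual_keywords) <= 3:
            raise ValueError("expected one to three default keywords")
        record.keywords = list(manual_keywords)
    return record


site_db = {
    r.file_name: r
    for r in [
        PageRecord("award.htm", 1989, 7, 17, 3, "4/23/97", 8),
        PageRecord("award3.htm", 1651, 6, 17, 1, "4/28/97", 10),
        PageRecord("board.htm", 1993, 9, 22, 3, "3/15/97", 5),
    ]
}
# "bulletin board" is a made-up keyword used only for illustration.
with_default_keywords(site_db["board.htm"], ["bulletin board"])
```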

4.3 Intelligent-guided browsing agents

Up to now, we have developed only partial functions of the search agent, customized agent and personal-status agent. Here we explain how these intelligent-guided browsing agents are designed and how they work by describing their agent-guides, control engines and knowledge bases.

4.3.1 Search browsing agent

Because Tornado has provided most of the search functions we need, we set it up as an embedded basic search engine of the Fu-Wen web site. However, it doesn’t have the learning ability which is necessary for efficient as well as effective search. Therefore, we add an additional function which learns search keywords from all queries requested by users. That is, the agent ranks the frequencies of all submitted search keywords whenever there is a search request from users. For each query, the agent will provide the top ten frequently used keywords for users to choose from, or users can type other keywords by themselves. Besides, users can set their search instructions in the agent-guide. The instructions the user can choose include simple search, synonymous search and tolerable search. The simple search is designed to improve search efficiency when the speed of network transmission is slow. It only displays the title of each search result. The synonymous search means the search will be done with synonymous keywords too. On the other hand, the tolerable search allows spelling errors. Figure 2 is the user interface for search. Both the agent-guide and the top ten keywords are shown on it. Figure 3 and Table 8 portray the algorithm of the search browsing agent.

Figure 2. Prototype of the search browsing agent

[Figure 3 flowchart: the WWW browser user sends a client request through the user interface to the application layer. If the user asks to search, the control engine provides the top ten keywords, captures the search condition, initiates Tornado to start the search, and displays the search results to the UI as the server reply; otherwise it stops. The control engine draws on the user profile (stable: agent-guide; variable: keywords list), the knowledge base (sort rules, syntax rules), the site database (site contents, title, URL link, number of keyword occurrences, summary) and the search engine Tornado.]

Figure 3. Algorithm of the search browsing agent

Table 8. Actions of the search browsing agent

Action                          Description
1. Provide top ten keywords     - Rank the frequencies of all submitted search keywords.
                                - Present the top ten keywords for the user’s reference.
2. Capture search condition     - Retrieve the keyword and agent-guides.
                                - Add the keyword to the user profile.
3. Initiate Tornado to start    - Create the query syntax following the related syntax rules.
   search                       - Send the query syntax to ‘Tornado’ for search.
4. Display search results to    - Retrieve the site database to get the title, URL link and
   user interface                 number of keyword occurrences of the searched pages.
                                - Summarize the search result for users.


Whenever a user wants to search, the control engine will provide him with the up-to-the-minute top ten keywords. After the query is submitted, it will catch the search condition from the agent-guides and then start the search following the keywords and agent-guides. That is, Tornado will be initiated. Then the result is presented to the user.
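The keyword-learning function can be sketched with a simple frequency counter. The class and method names below are hypothetical, and the normalisation step (trimming and lower-casing) is our assumption; the call to Tornado itself is not shown because its interface is not described here.

```python
from collections import Counter


class KeywordLearner:
    """Counts every submitted search keyword and reports the most frequent ones."""

    def __init__(self, top_n=10):
        self.top_n = top_n
        self.counts = Counter()

    def record(self, keyword):
        # Hypothetical normalisation: trim and lower-case so "Agent" and "agent" merge.
        cleaned = keyword.strip().lower()
        if cleaned:
            self.counts[cleaned] += 1

    def top_keywords(self):
        # most_common() orders keywords by descending frequency.
        return [word for word, _ in self.counts.most_common(self.top_n)]


learner = KeywordLearner()
for query in ["agent", "web", "Agent", "browsing", "web", "agent"]:
    learner.record(query)
print(learner.top_keywords())   # ['agent', 'web', 'browsing']
```

Because the counter is updated on every request, the list offered to the next user always reflects all queries submitted so far.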

4.3.2 Customized browsing agent

On the web site of Fu-Wen, we present ten to twelve new books to all users every week. Each new book may have two to five homepages of detailed information. We keep one hundred new books in the database on a first-in, first-out basis. In order to provide better service, we design the customized browsing agent to recommend new books to users based on the analysis of the users’ browsing experience, which is stored in the user profile. Furthermore, in the prototype, the user can decide how many of his most recently browsed books are used as the basis of analysis. The prototype also allows users to set the number of recommended books. For example, if the user sets the number of most recently browsed books to 50 and the number of recommended books to 12, the agent will analyze his fifty most recently browsed books to figure out his current reading preference. Then, it retrieves the twelve books that best satisfy the user’s interest and that he hasn’t browsed yet. Figure 4 displays a customized homepage of book recommendation.

Figure 4. Prototype of the customized browsing agent - book recommendation

We briefly describe the algorithm of the customized browsing agent in Figure 5 and Table 9. This agent first identifies the user and retrieves the types of books being browsed. Then it analyzes the user’s browsing pattern based on the agent-guide to infer his preference. Following that, it captures the books which best satisfy the agent-guide and presents them to the user. When the user browses a new book from the recommended books, this agent records the browsing action and presents the details of the book. This process will repeat until the user quits the homepage of intelligent new book recommendation.
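The category-based selection that Table 9 spells out step by step can be sketched as follows. The function name, its parameters and the sample books are hypothetical. We also assume that categories the user has never browsed are not considered, which the description leaves open.

```python
from collections import Counter


def recommend(browsed, new_books, history_size, limit):
    """Pick unbrowsed new books, filling from the user's most-browsed categories first.

    browsed:   list of (title, category) pairs, oldest first
    new_books: list of (title, category) pairs currently kept in the site database
    """
    recent = browsed[-history_size:] if history_size > 0 else []
    # Rank categories by how many of the recently browsed books fall in each.
    ranked = [cat for cat, _ in Counter(cat for _, cat in recent).most_common()]
    already_read = {title for title, _ in browsed}

    picks = []
    for category in ranked:
        for title, cat in new_books:
            if len(picks) == limit:
                return picks             # the agent-guide's limit is reached
            if cat == category and title not in already_read:
                picks.append(title)
    return picks                         # fewer than `limit` books were available


# Made-up browsing history and new-book list.
browsed = [("A", "novel"), ("B", "novel"), ("C", "travel")]
new_books = [("A", "novel"), ("D", "novel"), ("E", "travel"),
             ("F", "novel"), ("G", "science")]
print(recommend(browsed, new_books, history_size=50, limit=3))   # ['D', 'F', 'E']
```

In the sample run the preferred category "novel" is exhausted after two books, so the third recommendation comes from the next category, "travel".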

[Figure 5 flowchart, in two parts. (a) The WWW browser user sends a client request through the user interface to the application layer. If the user asks the system to recommend books, the control engine identifies the user, retrieves the agent-guide, computes and sorts the book categories, and shows the results and sends them to the UI; otherwise it stops. (b) If the user chooses one book to browse, the control engine identifies the user, captures the browsing action, checks the category of the browsed book, and shows the results and sends them to the UI; otherwise it stops. Both parts draw on the user profile (stable: background, agent-guide; variable: browsing record), the knowledge base (category rules, sort rules, presentation rules) and the site database (book contents, book name, book picture, book URL, book page, book category).]

Figure 5. Algorithm of the customized browsing agent

Table 9. Actions of the customized browsing agent

Action                          Description

<a> When the user asks the system to recommend books
1. Identify who is the user     - Catch the IP address from CGI parameters when there is a client request.
                                - Decode who is the user by matching the IP and user’s ID in the user profile.
2. Retrieve agent-guide         - Retrieve the user’s agent-guide.
3. Compute and sort the book    - Categorize the books browsed by the user recently.
   category                     - Rank the book categories based on the number of books browsed in each category.
4. Show the results and send    (1) Choose the number one category.
   them to user interface       (2) Retrieve a new book of the category that the user hasn’t browsed from the site database.
                                (3) Put the new book on the recommendation list.
                                (4) Repeat steps 2 to 3 until there is not any book belonging to the category, then choose the next preferred category.
                                (5) Repeat steps 2 to 4 until the number of books in the recommendation list is equal to the number limit assigned in the agent-guide or there is no new book in the database.
                                (6) Present the recommended books to the user.

<b> When the user chooses one book to browse
1. Identify who is the user     - Catch the IP address from CGI parameters when there is a client request.
                                - Decode who is the user by matching the IP and user’s ID in the user profile.
2. Capture browsing action      - Identify which book is browsed.
3. Check the category of the    - Retrieve the book category from the site database.
   browsed book                 - Add one to the number of browsed books in that book category and update the browsing experience of the user profile.
                                - Mark the book as read.
4. Show the book page chosen    - Present the cover of the book and other detailed links to the user.
   to user interface

4.3.3 Personal-status browsing agent

In the prototype, personal status includes four kinds of information: new book, order, message and spent amount. “New book” refers to books suggested by the recommendation agent; “order” is related to the current status of the user’s order; “message” covers all messages posted or replied to by the user on the bulletin board; “spent amount” is the amount that the user has spent in Fu-Wen bookstore. The personal-status browsing agent helps users quickly browse all up-to-date personal status together. Without it, the user may have to go to different places to find this information, and therefore it is difficult to have a whole picture of his personal status. However, the user may not be concerned about all four kinds of information at any time. He can make his choice by setting the agent-guide. “Agent task” allows the user to choose the kinds of personal information he is concerned about. The other option is “time period”, with which the user can set the time period he hopes to trace back.

The algorithm of the personal-status browsing agent is described in Figure 6 and Table 10. The control engine has to identify the user, then retrieve his agent-guide to know the agent tasks and the time period. According to the selected tasks, it will fire appropriate rules in the knowledge base. Then it tries to detect whether the status has changed. Figure 7 is an example. In this example, the user, tcyang, requested two agent tasks: “message” and “spent amount”. The time period was set as 28 days. The agent traced the two tasks from June 2, 1997 to June 29, 1997. It shows that there were three messages posted or replied to by tcyang. Furthermore, he had spent NT 11,204 in this bookstore. The system also automatically displayed that the user’s bonus was NT 500 in this case.

[Figure 6 flowchart: the WWW browser user sends a client request through the user interface to the application layer. If the user asks to check personal status, the control engine identifies the user, retrieves the agent-guide, detects whether the status has changed, and summarizes the results to the UI as the server reply; otherwise it stops. The control engine draws on the user profile (stable: background, agent-guide; variable: browsing record, order record, message record, transaction record), the knowledge base (status-updated rules: new book rules, order rules, message rules, spent amount rules; presentation rules) and the site database (book contents, book name, book picture, book URL, book page, book category).]

Figure 6. Algorithm of the personal-status browsing agent

Table 10. Actions of the personal-status browsing agent

Action                          Description
1. Identify who is the user     - Catch the IP address from CGI parameters when there is a client request.
                                - Decode who is the user by matching the IP and user’s ID in the user profile.
2. Retrieve agent-guide         - Retrieve the agent task and time period from the user’s agent-guide.
3. Detect whether the status    For each task assigned in the agent-guide:
   has changed                  - Fire the appropriate status-updated rules.
                                    [new book rules] check the user’s browsing records of the newest recommended books.
                                    [order rules] check whether the order status changed.
                                    [message rules] check messages which the user posted or replied to.
                                    [spent amount rules] check the updated spent amount.
                                - Detect whether the status of the required tasks has changed within the time period.
4. Summarize results and send   - Summarize the results on the homepage.
   to the user interface

Figure 7. Prototype of the personal-status browsing agent
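The rule-firing step of the personal-status agent can be sketched as a lookup from each agent task to a summarising rule applied to the records inside the time period. The record layout, the rule table and the individual sample entries below are our assumptions; the entries are made up so that the totals reproduce the tcyang example (three messages and NT 11,204 over the 28 days ending June 29, 1997). Whether the prototype sums the spent amount over the time period or over all time is not stated, so the windowed sum is also an assumption.

```python
from datetime import date, timedelta

# Hypothetical status-updated rules: how each task's entries are summarised.
STATUS_RULES = {
    "new book": len,          # number of newly recommended books browsed
    "order": len,             # number of order status changes
    "message": len,           # number of messages posted or replied to
    "spent amount": sum,      # total amount spent
}


def personal_status(records, tasks, period_days, today):
    """Summarise the selected tasks over the trailing time period.

    records: dict mapping a task name to a list of (date, value) entries
    """
    start = today - timedelta(days=period_days - 1)   # window includes both end days
    summary = {}
    for task in tasks:
        in_window = [value for day, value in records.get(task, [])
                     if start <= day <= today]
        summary[task] = STATUS_RULES[task](in_window)
    return start, summary


records = {
    "message": [(date(1997, 5, 20), "m0"), (date(1997, 6, 3), "m1"),
                (date(1997, 6, 15), "m2"), (date(1997, 6, 29), "m3")],
    "spent amount": [(date(1997, 6, 5), 5000), (date(1997, 6, 20), 6204)],
}
start, summary = personal_status(
    records, ["message", "spent amount"], period_days=28, today=date(1997, 6, 29)
)
print(start, summary)   # 1997-06-02 {'message': 3, 'spent amount': 11204}
```

Tasks that the user did not select in the agent-guide are never evaluated, which mirrors the agent firing only the rules for the assigned tasks.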


5. Conclusions

It is a challenge to let users of a web site enjoy the whole browsing process and then become frequent visitors. We propose a system architecture of intelligent-guided browsing. This research has contributed in three aspects.

First, we have provided a classification framework of web browsing activities by two dimensions: web familiarity and purpose of browsing. It can serve as a foundation to understand web browsing activities and problems.

Second, we have proposed a system architecture of intelligent-guided browsing on the web. In the architecture, the user analyzer and the site monitor are designed to maintain the user profile and the site database respectively. The user profile and the site database exist to support five kinds of agents: recommendation agent, new-content agent, search agent, customized agent and personal-status agent. Each of them has a control engine and a knowledge base.

Third, we have developed three prototypes of browsing agents to demonstrate the feasibility of the proposed architecture. We describe the control engine and knowledge base of each agent. Meanwhile, the prototypes also show the way to maintain and utilize the user profile and the site database.

There are some promising issues for future research. If we hope to have a more intelligent agent, how to apply Artificial Intelligence technology would be a worthwhile research issue. Next, how the different agents can communicate with each other and support each other is another promising research area. For this research, a common communication language and coordination mechanism need to be studied. Finally, all the agents proposed here are a kind of passive agent because none of them will operate unless users visit the web site. To be more proactive, agents which can push messages to users are worth developing.

6. Acknowledgements

We wish to thank the National Science Council because this work was supported in part by its grant. The grant number is NSC-86-2416-H-110-011.

7. References


[1] Barrett, R., Maglio, P. P., and Kellem, D. C., “Autonomous Interface Agents,” Conference on Human Factors in Computer Systems, Georgia, USA, 1997, http://www.acm.org/sigchi/chi97/proceedings/paper/hl.htm

[2] Domel, P., “WebMap: A graphical hypertext navigation tool,” Computer Networks & ISDN Systems, Vol.28, No.1&2, 1995, pp.85-97.

[3] Excite Inc., “About Excite for Web Servers,” 1996, http://204.211.87.12/Architext/AT-ad2.html

[4] Foo, S., and Lim, E. P., “A Hypermedia database to manage World-Wide-Web Documents,” Information & Management, Vol.31, No.5, 1997, pp.235-249.

[5] Hayes-Roth, B., “An Architecture for Adaptive Intelligent Systems,” Artificial Intelligence, Vol.72, 1995, pp.329-365.

[6] Henry, J., “Using the Web to cut costs and build sales,” Computer Reseller News, No.711, 1996, pp.S34-S35.

[7] Kent, R. E., and Neuss, C., “Creating a Web Analysis and Visualization Environment,” Computer Networks & ISDN Systems, Vol.28, No.1-2, 1995, pp.109-117.

[8] Laine-Cruzel, S., Lafouge, T., Lardy, J. P., and Abdallah, N. B., “Improving Information Retrieval by Combining User Profile and Document Segmentation,” Information Processing and Management, Vol.32, No.3, 1996, pp.305-315.

[9] Lieberman, H., “Autonomous Interface Agents,” Conference on Human Factors in Computer Systems, Georgia, USA, 1997, http://www.acm.org/sigchi/chi97/proceedings/paper/hl.htm

[10] Lieberman, H., and Maulsby, D., “Instructible agents: Software that just keeps getting better,” IBM Systems Journal, Vol.35, No.3&4, 1996, pp.539-556.

[11] Maes, P., “Agents that Reduce Work and Information Overload,” Communications of the ACM, Vol.37, No.7, 1994, pp.31-40.

[12] Mehling, H., “Tools to help you beat the ‘World Wide wait’,” Information week, No.597, 1996, pp.49-56.

[13] Motiwalla, U.F., “An Intelligent Agent for Prioritizing E-Mail Message,” Information Resources Management Journal, Vol.8, No.2, 1995, pp.16-24.

[14] Selker, T., “Coach: a Teaching Agent that Learns,” Communications of the ACM, Vol.37, No.7, 1994, pp.92-99.
