Design and Implementation of an Administration System

11
The following paper was originally published in the Proceedings of the Twelfth Systems Administration Conference (LISA ’98) Boston, Massachusetts, December 6-11, 1998 For more information about USENIX Association contact: 1. Phone: 510 528-8649 2. FAX: 510 548-5738 3. Email: [email protected] 4. WWW URL: http://www.usenix.org Design and Implementation of an Administration System for Distributed Web Server C. S. Yang and M. Y. Luo National Sun Yat-Sen University, Taiwan, R.O.C.

Transcript of Design and Implementation of an Administration System

The following paper was originally published in theProceedings of the Twelfth Systems Administration Conference (LISA ’98)

Boston, Massachusetts, December 6-11, 1998

For more information about USENIX Association contact:

1. Phone: 510 528-86492. FAX: 510 548-57383. Email: [email protected]. WWW URL: http://www.usenix.org

Design and Implementation of an Administration Systemfor Distributed Web Server

C. S. Yang and M. Y. LuoNational Sun Yat-Sen University, Taiwan, R.O.C.

Design and Implementation of anAdministration System for

Distributed Web ServerC. S. Yang and M. Y. Luo – National Sun Yat-Sen University, Taiwan, R.O.C.

ABSTRACT

The explosive growth of the World Wide Web has raised great concerns regarding manychallenges – performance, scalability and availability of the Web system. Consequently, Website builders are increasingly to construct their Web servers as distributed system for solvingthese problems, and this trend is likely to accelerate. In such systems, a group of loosely-coupled hosts will work together to serve as a single virtual server. Although the distributedserver can provide compelling performance and accommodate the growth of web traffic, itinevitably increases the complexity of system administration. In this paper, we exploit theadvantages of Java to design and implement an administration system for addressing thischallenging problem.

Introduction

The explosive growth of the World Wide Web(WWW for short) [1] has raised great concernsregarding many challenges – performance, scalabilityand availability of the Web system. Due to the expo-nential growth of the World Wide Web, many popularWeb sites are being overwhelmed by huge requestsand suffer from server overload. In order to cope withthe continued increasing demand on their Web servers,Web site managers must continually increase theserver ’s capacity for providing the desired levels ofquality of service. Upgrading the server to a morepowerful machine may only solve the problem in theshort term and might not be cost-effective in the longterm. An increasingly popular solution for these prob-lems is to deploy a set of computers, and enable themto work together as a single virtual server. A Webserver based on such approach can provide the scala-bility necessary to keep up with growing clientdemand at popular Web sites. It could allow additionalprocessing power to be dynamically added to increasethe total capacity of the server as demand grows. Theother key advantage of such approach is that one canmore easily build a server that can tolerate hardwareor software failures.

However, although such distributed server canprovide compelling performance and accommodatethe growth of web traffic, it inevitably increases thecomplexity of system administration. Failures, perfor-mance inefficiencies, content management, resourceallocation, security compromises, and accounting aresome of the problems associated with the operation oftraditional server. When the server is composed of agroup of loosely-coupled machines, the administrativeburden will be larger. Reliable operational status ofsuch distributed server, unless made in an automatedway, requires significant human effort for propagating

one administrative function to all nodes. In particular,such distributed servers tend to be more heteroge-neous, and this heterogeneity will come at the cost ofgreatly increased management costs. As a result,effective management mechanisms and tools arerequired to mask the complexity and heterogeneity ofinternal composition of such distributed server, andmanage the server composed of several nodes as easyas manage a single node. In this paper, we describe thework we are pursuing in exploiting the advantages ofJava [2] to design and implement an administrationsystem for addressing these problems. With the inno-vative administration system, the Web site managercan manage and maintain the distributed server as asingle large system. In addition, the administrationsystem running in the background can also make theclustered server a better place to work.

The rest of this paper is organized as follows.The next gives an overview of distributed Web server.Subsequently, we describe the management problemsraised by such an environment, and then present anoverview of the architecture of proposed system. Thedetails of implementation are given in the next section.Then, we discuss some related issues and compare oursystem with other related works, before presenting theconclusion and future work.

Overview of Distributed Web Server

With the rapid growth of the Internet, and theWeb in particular, it is difficult for organizations andpeople running web sites to predict future demandsthat clients will place on their servers. Consequently,the Web server should be designed to be capable ofmeeting the evolving demand constantly and easily.That is, if the offered load begins to exceed theserver ’s capabilities, there should be an easy way toscale up the system when the hardware or software of

1998 LISA XII – December 6-11, 1998 – Boston, MA 131

Design and Implementation of an Administration System . . . Yang and Luo

the existing server does not need to be replaced. Inaddition, as more and more commercial applicationsappear on the WWW, it has become apparent that thefunction performed by these servers is critical. Conse-quently, making these servers highly available isanother problem that is gradually getting attention.

Figure 1: Overview of Distributed Web Server.

For addressing these problems, we [3,4,5]designed and implemented a scalable and highly avail-able Web server. Figure 1 illustrates the overview ofour system. The entire server is composed of a groupof interconnected machines. For providing the illu-sion of single virtual server across these machines, theentire severs cluster should be presented to the outsideworld by single addressing interface (e.g., singleDomain name). Consequently, we designed and imple-mented a distributing mechanism to spread all incom-ing Web requests destined for this addressing inter-face. Such a mechanism contains the following threefunctions: load balancing, failure detection, and direct-ing. That is, when a new HTTP request arrives, someload balancing mechanisms are invoked for selecting asuitable node to serve the request, and such a mecha-nism could ensure that the workload is evenly spread

among these server nodes. If one server node goesdown, or during periods of maintenance, the distribut-ing mechanism could discover it and respond bydirecting new requests to the other available nodes.Finally, a directing mechanism is required for direct-ing the request to the selected node, in a manner that istransparently to the outside user. We [3,4] analyzed theTCP/IP [6,7] and HTTP [8,9] protocols that the web isbased on, and we found out a number of ways inwhich one could direct requests to a selected node.After evaluating the tradeoff among these methods, we[4,5] choose reroute TCP connection as the directingmethod in our system. We extend the Linux kernel tobuild in all mechanisms mentioned above for fulfillingthe distributor, which performs the distributing mecha-nism in the distributed server.

One node in the system will run the modifiedkernel to serve as distributor for distributing incomingweb requests; the other nodes will execute Web serversoftware responding the incoming HTTP request.Each host participating in the clustered server has aunique IP address, but only the distributor’s IP addressis associated with the domain name representing this

132 1998 LISA XII – December 6-11, 1998 – Boston, MA

Yang and Luo Design and Implementation of an Administration System . . .

web site. As a result, only the distributor’s IP addressis advertised to the outside world, so that all HTTPrequests destined for this web site will be delivered todistributor. The distributor will forward these incom-ing requests to the selected nodes via distributingmechanism.

In addition, as the incoming requests may be dis-tributed to any node in the system, we also must pro-vide a mechanism to ensure that each server nodewould look and feel identical to the requesting clients.In other words, we must make each node with thesame capability of responding to requests for any por-tion of resource that the web site provides. For solvingthis problem, all content provided by the web site canbe served from a centralized network file system.However, accessing data over the network file systemcan be very slow due to the overhead of LAN conges-tion. Furthermore, such a design will suffer from thesingle-point-failure problem, which will make theentire system more vulnerable. As a result, we choosean alternative approach as the content sharing methodin our system: replicating all content on the local filesystem of each server nodes.

With these mechanisms, although all machinesparticipating in the servers cluster are autonomous,they will seamlessly work together to serve therequests and provide the illusion of a single server.The new machine can be dynamically added toincrease the total capacity of the server as demandgrows. We expect that this approach should be attrac-tive, because it can preserve the previous financialinvestment and reduce the costs of scaling the server.The other key advantage of this architecture is that wecan more easily build a server that can tolerate hard-ware or software failures.

Although the foregoing system can provide adegree of high availability, the failure of distributorwill still bring down the entire web server. Wedesigned a primary-backup mechanism for addressingsuch a problem. One backup distributor can be setupto prevent the problem of single point failure. The pri-mary distributor will broadcast a message periodicallyin addition to perform the distributing mechanism.The backup distributor threats the message as ‘‘heart-beat’’ of the primary distributor. It performs the samemechanism as just mentioned for detecting the failureof primary distributor, and takes over the responsibil-ity of distributing request when the primary distributorfails.

Proposed System

Design ConsiderationAlthough this system can accommodate growth

of web traffic, it inevitably increases the complexity ofsystem administration. Unlike traditional single-serverconfiguration in which the web site manager can hasfull control on the whole system easily. When theentire Web server is composed by a group of loosely-

coupled machines, the administrative burden of man-aging and maintaining the entire system will be con-siderable. Consequently, we intend to provide anadministrative mechanism for web site manager tomask the complexity and heterogeneity of internalcomposition of the distributed server, and manage theserver composed of several nodes as easy as managinga single node. Such a mechanism should address thefollowing problems to which an administrator of dis-tributed Web server would like to have answers:

• Content management. To guarantee that anyrequest could access any resource regardless ofthe server to which that it is directed, we repli-cate all content on the local file system of eachserver nodes. However, the content of a Website may change over time. When a changeoccurs, the system must propagate that changethroughout the entire Web site. Usually, suchchanges need to be updated by manager manu-ally. To address the problem, the proposedadministrative mechanism should enable theweb site manager to be capable of performingfunctions on all nodes at once.

• Configuration. We should provide a mecha-nism for administrator to know which node isoperating as a part of the system. In addition,adding or removing a node should be an easyway and not require the extensive reconfigura-tion of all other nodes.

• Self-diagnose. If any failure or specific condi-tion occurs in the system, the system shouldautomatically inform the administrator by E-mail or other ways. Otherwise, the trouble-shooting will become administrator’s nightmarewhen they face such a complex system.

• Performance and health monitoring. Weshould provide a mechanism for administratorto monitor the status (e.g., resource utilization)of each node, ensure the resources provided bythe web site are operational, and specified con-tent can be delivered.

• Security Concerns. We should provide amechanism (i.e., watch the log files created byeach server node) to identify security problemsor other situations.

System ArchitectureTo fulfill the administration system, we intended

to construct a group of daemons running on each nodeand enable them to cooperate for performing theadministrative functions. However, many problemsarose when we tried to implement such an idea. Thefirst major problem is platform heterogeneity, whicharises from the fact that we hope the proposed systemis flexible enough that each server node can use anykind of hardware, operating system, and Web serversoftware. This means that these daemons that make upthe administration system must be capable of runningon different platforms. The second considerable prob-lem is extensibility (or versioning and distribution

1998 LISA XII – December 6-11, 1998 – Boston, MA 133

Design and Implementation of an Administration System . . . Yang and Luo

problem). That is, the functionality of this administra-tion system cannot be extended or modified withoutrewriting, recompilation, reinstallation, and re-instan-tiation of all existing daemons.

Figure 2: Overall Architecture of the Proposed System.

For tackling these problems, we decided to con-struct the administration system by Java [2,10]. Java isdeveloped to support applications in a heterogeneousenvironment, which requires that applications be capa-ble of executing on a variety of hardware architecturesand operating system. This is achieved by generatingan intermediate code called bytecode, which is anarchitecturally neutral format designed to be trans-ported easily to multiple hardware and software plat-form. Such a design can relieve both the developerand user of concerns related to heterogeneity of thetarget platforms. Thus it is the primary reason that wechoose Java to implement the administration system.In addition, the another attractive feature of Java is thenotion of downloaded executable content (data thatcontain programs that are executed upon receipt).Base on this, we can implement each administrationfunction in the form of Java class instead of imple-menting it in the daemon. All daemons distributed oneach node will download the appropriate classes froma central location to perform the management task.Such a design will avoid the software distribution

problem. If a new capability needs to be added, all wehave to do is to implement a new Java class. Weexpect that using the capability of downloadable codefrom Java should provide unlimited possibility toenhance the function of the administration system.

The Java-based administration system is com-posed of the following four key components: con-troller, broker, agent, and remote console. Figure 2illustrates the overall architecture of the proposed sys-tem. The broker will run on each Web server node toperform the administrative function, and monitor thestatus of the managed node. The administrative func-tions will be implement in agent, which is in the formof Java class. One special daemon called controllerwill be responsible for receiving request from admin-istrator, and then invokes brokers to perform the dele-gated tasks by dispatching the corresponding agent.The remote console is a Java applet, which can be runon any Java-enabled Web browser. The administratorcan download the remote console and interact with itto perform management operations.

Implementation

In this section we describe the implementationdetails and the current status of the proposed system.

134 1998 LISA XII – December 6-11, 1998 – Boston, MA

Yang and Luo Design and Implementation of an Administration System . . .

Control InterfaceBecause we made the distributor the control cen-

ter of system administration, as well as distributingrequests, we provided several administrative functionsfor the system administrator to control the system suchas the following:

• Turn on/off the distributing mechanism.• Join/remove a server node into/from the sys-

tem.• Read the related statistical data, such as out-

standing request of each server nodes.These functions are user-level programs written in Clanguage. Because all related data are located in thekernel, we also implemented several new system callsthat provide an interface for these user-level routinesto access kernel-level data.

ControllerThe controller is a standalone [11] Java applica-

tion, which runs as a background process on distribu-tor as the control center of system administration. Themain function of controller is to respond the requestfrom system administrator. When the controller startsas a process under the kernel, it forks a main thread toregister a socket on a pre-assigned TCP port and blockitself on listening mode for any incoming request.Upon a request arriving, the main thread will create anew working thread to carry out the requests so that itcan handle other requests waiting in queue or wait fornew requests. We expect such multithreaded imple-mentation will allow controller to be capable of serv-ing multiple requests simultaneously with improvedresponse time and throughput.

The request issued by an administrator may bemainly classified into two categories: (1) configuringthe system (e.g., add/remove a node into/from the sys-tem), or (2) performing a management function (e.g.,the web site manager intend to delete or add a file) onall nodes. For condition (1), the working thread willinvoke the administrative functions described in thecontrol interface. These administrative functions per-form in the form of native method. The native methodis a mechanism in Java, which is used to call functionthat is written in languages other than Java. Theadministrative function will look up the kernel andsetup the related data via the corresponding systemcall, which is defined in the control interface. In thecondition (2), the controller will dispatch the corre-sponding agent to each server node for performingdelegated task. The working thread will keep on lis-tening for the execution result of the agent and reportit to the administrator.

BrokerThe broker is also a standalone Java application

program, which performs as a daemon process on eachserver node in the system. The broker is composed oftwo operating threads: agent thread and monitorthread.

Agent Thread

The agent thread is responsible for accepting thedispatched agent from controller, and executing it forperforming the delegated task. As a result, it should becapable of loading code from a variety of resourcesand provide an environment for executing it. Foraddressing this, we implemented the following com-ponents:

• Class Loader. To load code from othersources, the Java runtime system calls a sub-class of the ClassLoader (an abstract class inJava), which defines an interface for the run-time system to ask a Java program to provide aclass [12]. As a result, we implemented a spe-cialized version of the ClassLoader as theskeleton of the agent thread, which can loadJava class files from a variety of resources andconvert the raw data of a class into an internaldata structure representing that class.

• Agent Context. We defined and implementedan interface called agent Context, which isresponsible for providing an environment inwhich the downloaded codes executes andinteracts with broker.

• Security Manager. The capability of down-loadable code is powerful, but it is also a poten-tial security threat to a system and raises manyconcerns. The essence of the problem is thatrunning programs on a computer typicallyneeds to access certain resources on the host.However, if downloaded code is not careful torestrict the access of some critical systemresources, it can also provide a malicious codewith the same ability to do mischief on the host.In Java, the SecurityManager class is meant todefine an interface for access control [13,14].The SecurityManager class itself is notintended to be used directly, instead it isintended to be subclassed and installed as thesystem security manager. The Java platform isdesigned in such a way that all system callsmade by a Java program must be routedthrough a security manager, which can decidewhether or not certain sensitive operationsshould be allowed. Consequently, we defined asecurity policy and implemented a specificsecurity manager to support runtime security onhost environment. For example, the down-loaded code can only access the given directoryand file on the local file system.

Monitor Thread

The primary role of the monitor thread is to mon-itor the status of the managed node. Periodically it willwake up and initiate a request to web server runningon the managed node. For minimizing the additionalworkload added on the managed node, such a requestis designed for retrieving a small file (e.g., Home pagein our implementation)

1998 LISA XII – December 6-11, 1998 – Boston, MA 135

Design and Implementation of an Administration System . . . Yang and Luo

If the server responds normally, then the brokerwill send a message to distributor. Such a messagewill be treated as ‘‘heartbeat’’ of this managed node.The distributor keeps a counter for each server node,and such a counter will be incremented periodically.When the distributor receives a heartbeat from oneserver node, it resets the counter associated with thatserver node. On selecting a server for a new arrivingrequest, distributor checks the counter associated withthe candidate server, which is selected by load balanc-ing module. If the counter exceeds a ‘‘warning value,’’which means that the server node may be eitherunreachable or overburdened. Such a node will beskipped, and the request will automatically be allo-cated to the next most available server. If the counterexceeds a ‘‘dead value’’ (which is a higher thresholdthan warning value), the node will be declared deadand be removed from the server cluster. The adminis-tration system will automatically record such an eventin a log file and inform administrator by E-mail. As aresult, any failure in the clustered server can bemasked by such a mechanism, and the user from theoutside world will be unaware of it.

In addition, the monitor thread also measures theresponse time that the request took from start to finishand then performs the following algorithm:If (RPT_NOW > RPT_LAST) thenInterval_time = RPT_NOW * Multiplier;Else

Interval_time = RPT_LAST * Multiplier;RPT_LAST = RPT_NOW;Sleep(Interval_time);

RTP_NOW denote the response time of therequest issued by this ‘‘wake-up.’’ We also keep theresponse time of request issued by the last ‘‘wake-up’’in RTP_Last. Multiplier is a pre-assigned value forcalculating the Interval_time. The Interval_time is theinterval from this ‘‘wake-up’’ to next ‘‘wake-up.’’There are two main purposes in such a design. First, itwill prevent the monitor thread from burdening theload of web server. If the server is overloaded, themonitor thread will decrease the frequency of probeby discovering the longer response time. Second, fine-grained load balancing can be achieved by such adesign. That is, if the managed server node is over-loaded, the time of interval will be lengthened. Thismeans that the broker will increase the time of intervalbetween this heartbeat and the next. As a result, thedistributor will stop to dispatch new request to thisnode, because it does not receive the heartbeat for along time.

AgentThe primary role of agent is to perform one dele-

gated task on all nodes according to the request ofadministrator. An agent is dispatched by controller andexecutes within the environment created by broker.The agent is in the form of Java classes and is trans-ported across the network as byte streams. It will bereconstituted into Class objects by class loader and

runs in its own thread of execution after arriving at ahost.

We first defined an Agent class (a subclass ofObject) to be the abstract base class, and then weimplemented (inherit the Agent class) all managementfunctions in the form of agent. For instance, we imple-ment an agent to visit all server nodes for updating (ordeleting) a file.

In addition, we built in a priority mechanism inthe agent system. Based on this, we can assign differ-ent priority to different agent. For example, we imple-mented an agent to analyze the log file of each servernode for security concern. Such an agent needs to beexecuted continually in the system, but it does notneed to be executed immediately. Consequently, wecan assign a lower priority to this agent. When con-troller receives a request for executing such agent, itwill not dispatch the agent to a server node until thatnode is idle. Such a design will prevent those back-ground jobs (i.e., housekeeping job) from burdeningthe load of server node during the high-traffic periods.

Remote ConsoleFor simplifying the system administration as

much as possible, we implement a management toolcalled remote console from which an administratorcan issue management operations. The remote consoleis a Graphical User Interface (GUI) in the form ofJava applet, thus it can run under any Java-enabledbrowser environment. The remote console applet isbuilt completely on top of Java abstract windowtoolkit (AWT). It will interact with the user and com-municate with the controller over the network. At anygiven time, the Web site manager can download theremote console applet anywhere after proper authenti-cation and then control or monitor the whole system.The applet has a main thread for listening request fromthe user. Upon receiving a request, it will create a newworking thread, and then the main thread is releasedand ready for other user request. The working threadwill talk to the controller for carrying out the requests.We implemented a prototype remote console, whichprovides the following functions:

• Configuration function (e.g., join/remove aserver node, or schedules a down time formaintenance): When one select such function,the virtual console will inform the controller,which will invoke the administration functiondescribed above to accomplish the configura-tion setup.

• Content management (e.g., add/delete ormodify a file): When one selects such a func-tion and fills in the related variables, the remoteconsole will inform the controller, which willdispatch the respective agent to each servernode for performing the job.

• Monitor the activities of the system : One canselect the monitor function to monitor the activ-ities of each node. The controller will invoke

136 1998 LISA XII – December 6-11, 1998 – Boston, MA

Yang and Luo Design and Implementation of an Administration System . . .

the ‘‘read related data’’ function defined in thecontrol interface to report the related data. Theadministrator also can assign some specialevent auditing, and then the controller will dis-patch an agent to each node for analyzing itslog file.

These user selectable functions are available inthe form of buttons at the bottom of the applet’s mainwindow. One can see the remote console and its demooperation from our web site [15].

Discussion

In this section, we discuss some related issuesraised in our system and compare our system withother related works.

SecurityThe security of our system depends fundamen-

tally on the following three layers: the Java languageitself, class loader, and the security manager. First, theJava language has the following important featuresfrom a security standpoint: access control for variablesand methods within classes, lack of pointers as a lan-guage data type, and automatic garbage collection.Second, the class loader performs further securitycheck to verify that the downloaded code does not vio-late the security requirements of the Java language.Finally, the security manager supports runtime secu-rity on host environment.

Recently, the security manager has been aug-mented with fine-grained access control mechanismsthat allow it to make decisions based on who signedthe downloaded code and where it was loaded from[16]. In the future, we will use cryptographic tech-niques (e.g., digital signature) to guarantee theintegrity of code transferred and to provide an identifi-cation of the agent provider.

Load balancingGiven a clustered server, poor performance may

still exist, which is often due to uneven load distribu-tion throughout the system. As a result, a good loadbalancing mechanism is required for allocating incom-ing requests in a way that utilizes the cluster resourceevenly and efficiently. At first, we [4,5] used a roundrobin method as the load-balancing policy for the rea-son of minimizing the cost associated with load bal-ancing mechanism. This simple scheduling mecha-nism suffers from the fact that the processing time ofindividual request is not constant For instance, theload imbalance still happens while one server nodereceives five requests for 3KB HTML files, anothersame server node receives five requests for 3MBMPEG files. Consequently, we implemented an alter-native scheduling mechanism to solve this problem.The new mechanism keeps track of the number of out-standing requests in each server node, and distributesthe request to the node with the least number of out-standing requests. However, the both two mechanismsdo not always reflect the actual load because they do

not track the actual load condition on the server nodes.A potentially better approach is to combine the twomechanisms with some additional information aboutthe actual load. With the administration system, such amulti-level load-balancing can be achieved by monitorthread of broker. The distributor can dispatch theincoming requests to one of the server nodes, usingthe previous scheduling mechanism, unless the peri-odic heartbeat of some of these nodes stops. The peri-odic heartbeat is stopped when the monitor threaddetects the fact that the load of the managed serverexceeds a critical threshold. In such a case, the distrib-utor temporarily excludes the overloaded servers fromscheduling consideration until it receives the heartbeatagain (when the load returns under the given thresh-old).

ComparisonThe distributed server concept has been used by

many research projects to address the scalability andavailability problems of Internet server. In this sec-tion, we compare these works with our system (i.e.,the distributed server coupled with Java-based admin-istration system).

The NCSA scalable HTTP server is described in[17,18]. Their architecture consists of a number ofserver machines cooperating to provide the Web ser-vice, that use the Andrew file system for sharing con-tent provided by the Web site, and using the roundrobin DNS server for distributing accesses. In contrastto our scheme, this architecture has the followingproblems. First, the round robin DNS approach willinevitably suffer from the problems of DNS cachingeffect [4,5]. In comparison, the load balancingachieved using our routing TCP connection approachis significantly better than that achieved using theround robin DNS techniques. Second, it simply dis-tributes incoming requests in a round robin fashion,which does not consider the heterogeneity of eachrequest and existing load of respective machine. Fur-thermore, if the configurations (e.g., CPU type, mem-ory size, etc) of machines in the clustered server aredifferent, it is not considered in making a mapping.Third, this architecture did not consider the problem offailure detection and handling.

The Magic Router [19] provides transparentaccess by placing a modified router before a set ofmachines, which cooperate to provide a service. Theirapproach is very similar to our distributor, which redi-rects incoming Web requests via rerouting TCP con-nection. In the modified router, they use a user levelprocess to intercept all IP packets and possibly modifypackets destined for WWW service for directing it to aselected host. It allocates requests using round robin,random, or incremental system load methods. Themain problem of this scheme is that using user spaceapproach will pay the performance cost of contextswitching delay, and that multiple competing pro-cesses will increase the scheduling delay by over an

1998 LISA XII – December 6-11, 1998 – Boston, MA 137

Design and Implementation of an Administration System . . . Yang and Luo

order of magnitude. Furthermore, it does not considerthe content sharing and management problem.

IBM proposed a prototype scalable and highlyweb server [20] built on an IBM SP-2 [21] system.The system architecture consists of a set of logicalfront-end nodes and a set of back-end nodes. Thefront-end nodes run the web daemons and are con-nected to the external network. The back-end nodefunction as the server node for the sharing file system,used by the frond-end to access the data [22]. Theyuse a TCP router approach to dispatch incomingrequests. This scheme requires dedicated hardware(i.e., SP-2 system) and software. Consequently, it maynot be feasible for all Web sites. In contrast, ourapproach can be built from commodity hardware andsoftware components. It also can be applied to anyexisting web site in a manner that any kind of soft-ware/hardware of the original server does not need tobe replaced.

In addition, several products [23,24,25] havebeen announced for use as front-end nodes that per-form distributing functions across a group of servers.Due to space limitations, we do not describe thedetails of all these products.

All these works do not address the systemadministration problem. Many administration or con-tent management operations must be done manuallyon each node. It should be the nightmare of web sitemanager, in particular, a number of web site cannotafford specialized computer operations staff. In con-trast, we think our system has advantages over otherarchitectures in terms of scalability, high availability,fine-grained load balancing, and powerful and sophis-ticated system administration functions.

Conclusion

In this paper, we demonstrate the design andimplementation of an administration system for dis-tributed server. We exploit the advantages of Java toconstruct this administration system, which providesthe solution for the automation of many tasks in sys-tem administration and content management that usu-ally must be done manually. We also offer an easy-to-use GUI for web site manager to maintain and managethe system. With the proposed system, the administra-tor can perform functions on all nodes at once, moni-tor the activities of Web site, and manage the entiresystem as easy as facing a single host. Furthermore,with the feature of downloaded agent, the proposedsystem provides unlimited possibility to extend itsfunction.

In addition, the proposed system enables multi-level control of incoming requests. The combinationof load-balancing mechanism provided by distributorand fine-grained load balancing archived by theadministration system gives more precise control fordirecting incoming requests.

The system also provides excellent high avail-ability features, including automatically detection offailures, and alerts in the form of log file and e-mail.The end user will be unaware that any failure hasoccurred on the server, although the aggregate capac-ity of the server will temporarily be reduced. In addi-tion, our approach can enable a high performanceserver to be built from commodity hardware and soft-ware components. Otherwise, it also can be applied toany existing web site in a manner that any kind ofsoftware/hardware of the original server does not needto be replaced.

In the future, we will further investigate the secu-rity issues raised by Java language and our system indetail.

Acknowledgements

This work was supported by the National Sci-ence Council, R.O.C., under contract no. NSC86-2213-E-110-026 and NSC 86-2213-E-110-032.

The authors would like to thank David D. H. Linof IBM for his valuable suggestions and help on thiswork. The authors also greatly appreciate GretchenPhillips for her proof reading and comments on thispaper.

Author Information

C. S. Yang received the B.S. degree in engineer-ing science and the M.S. and Ph.D. degrees in electri-cal engineering from National Cheng Kung Univer-sity, Tainan, Taiwan, Republic of China, in 1976,1984, and 1987, respectively. During 1988-1992, hewas with the Department of Electrical Engineering,National Sun Yat-Sen University, Kaohsiung, Taiwan,Republic of China. In 1992, he joined the faculty ofthe Institute of Computer and Information Engineeringat National Sun Yat-Sen University, where he is a pro-fessor and Chairman of the Institute now. His currentresearch interests include mobile computing, paral-lel/distributed programming support environment, andscalable architecture. Reach him at [email protected] .

M. Y. Luo received the B.S. degree in Physicsfrom the National Sun Yat-Sen University, Kaohsiung,Taiwan, Republic of China, in 1995, and the M.S.degree in Computer Science from the Institute ofComputer and Information Engineering at NationalSun Yat-Sen University in 1997. He has been workingtoward his Ph.D. degree in the Institute of Computerand Information Engineering at National Sun Yat-SenUniversity. His research interests are in the areas ofcomputer network, Internet technology, and paralleland distributed system. Reach him at [email protected] .

138 1998 LISA XII – December 6-11, 1998 – Boston, MA

Yang and Luo Design and Implementation of an Administration System . . .

References

[1] T. Berners-Lee, R. Cailliau, A. Luotonen, H.Nielsen, A. Secret. ‘‘The World-Wide Web’’Communications of the ACM, Aug. 1994.

[2] Java-Programming for the Internet. http://java.sun.com .

[3] C. S. Yang, M. Y. Luo, ‘‘Design an Environmentfor Scalable Web Server,’’ Proceedings of 1996Multimedia Technology and Applications Work-shop, pp. 107-114. Dec. 1996.

[4] M. Y. Luo, Design and Implementation of a Scal-able and Highly Available Web Server. M.Sc.Thesis, Institute of Computer and InformationEngineering, National Sun Yat-Sen University,June 1997.

[5] C. S. Yang, M. Y. Luo ‘‘Design and Implementa-tion of a Environment for Building Scalable andHighly Available Web Server.’’ Proceedings of1998 International Symposium on Internet Tech-nology, pp. 124-131, April 29-May 1, 1998.

[6] G. Wright and W. R. Stevens TCP/IP Illustrated,Volume1, Addison-Wesley, Reading, May 1994.

[7] G. Wright and W. R. Stevens TCP/IP Illustrated,Volume2, Addison-Wesley, Reading, May 1995.

[8] T. Berners-Lee, R. Fielding, and H. Frystyk.Hypertext Transfer Protocol – HTTP/1.0, http://www.w3.org/hypertext/WWW/Protocols/ .

[9] T. Berners-Lee, R. Fielding, H. Frystyk, J. Get-tys, J. C. Mogul, Hypertext Transfer Protocol –HTTP/1.1, http://www.w3.org/hypertext/WWW/Protocols/ .

[10] J. Rodely, Writing Java Applets, The CoriolisGroup, Inc.

[11] J. Gosling and H. McGilton. The Java LanguageEnvironment, A White Paper, May 1996. Avail-able via ftp://ftp.javasoft.com/docs/papers/langenviron-ps.zip .

[12] T. Lindholm and F. Yellin, ‘‘The Java VirtualMachine Specification.’’ Addison-Wesley, 1996.

[13] Joseph A. Bank, ‘‘Java Security,’’ Available viahttp://swiss-ftp.ai.mit.edu/˜jbank/javapaper.ps

[14] F. Yellin. ‘‘Low level security in Java.’’ InFourth International World Wide Web Confer-ence, Boston, MA, Dec. 1995. Available viahttp://www.w3.org/pub/Conferences/WWW4/Papers/197/40.html .

[15] http://pds1.cie.nsysu.edu.tw/WebScale/Demo/console.html .

[16] Li Gong and Roland Schemers. ‘‘Implementingprotection domains in the Java Development Kit1.2.’’ In The Internet Society Symposium on Net-work and Distributed System Security, SanDiego, California, March 1998.

[17] E. D. Katz, M. Butler, and R. Mcgrath. ‘‘A Scal-able HTTP Server: The NCSA Prototype,’’ Pro-ceedings of First International WWW Confer-ence. May 1994.

[18] R. McGrath T. Kwan and D.Reed. ‘‘NCSA’sWorld Wide Web Server: Design and Perfor-mance.’’ IEEE Computer, November 1995.

[19] E. Anderson, D. Patterson, and E. Brewer. TheMagicrouter, an Application of Fast PacketInterposing, http://HTTP.CS.Berkeley.EDU/̃ eanders/projects/magicrouter/osdi96-mr-submission.ps .

[20] D. Dias, W. Kish, R. Mukherjee, and R. Tewari,‘‘ A Scalable and Highly Available Web Server,’’COMPCON 1996, pp.85-92, 1996.

[21] T. Agerwala, J. Martin, J. Mirza, D. Sadler andM. Snir, ‘‘Sp2 system Architecture,’’ IBM Sys-tem Journal, Vol. 34, no. 2, pp. 152-184, 1995.

[22] C. R. Attanasio, M. Butrico, C. A. Polyzois, S. E.Smith, and J. L. Peterson. ‘‘Design and Imple-mentation of a Recoverable Virtual SharedDisk.’’ IBM Research report RC 19843, T.J Wat-son Research Center, Yorktown Heights, NewYork, 1994.

[23] IBM Corporation. The IBM Interactive NetworkDispatcher, 1998. http://www.ics.raleigh.ibm.com/netdispatch .

[24] ‘‘Cisco LocalDirector,’’ http://www.cisco.com/warp/public/751/lodir/index.html .

[25] Cisco System. Scaling the Internet Web Servers:A white paper. http://www.cisco.com/warp/public/751/lodir/scale_wp.htm, 1997.

1998 LISA XII – December 6-11, 1998 – Boston, MA 139

140 1998 LISA XII – December 6-11, 1998 – Boston, MA