IRIS: Design and Implementation of an Intentional Resource Indicator Service



Mukundan Venkataraman, Puneet Gupta, Mainak Chatterjee, and Kartik Muralidharan

Abstract: This paper describes the design and implementation¹ of IRIS: an intentional resource indicator service. IRIS springs from the concept that end-users should not be bogged down with network names when looking for a resource on the network, and should have the liberty to freely sketch and describe what they want and not where to go about finding it. IRIS achieves the following in performing network searches for resources: (i) expressiveness in near-English queries, (ii) subjective searches leading to more relevant results, and (iii) complete transparency with increased usage. The IRIS design partitions nodes in the network into resolvers and end-hosts. Resolvers and resources form a network overlay using soft-state communications. Network routing is coupled with user intent to dynamically map requests to available resources using this overlay, thereby allowing a graceful join and leave procedure. We perform an algorithmic analysis of IRIS at an abstract level for the most computationally intensive steps of tree-based recursive searches. We also present implementation results in terms of memory, CPU usage, control signalling, and delay to demonstrate the performance of IRIS.

I. INTRODUCTION

Resource discovery in networks pertains to the ability to extract desired services available directly from the network. With hardware costs diminishing and the ability to connect disparate devices to the network, the potential for resource discovery with a large ensemble of heterogeneous devices is a promise coming true. Traditional searches are rather objective in nature: a user submits a query that is interpreted literally, and a simple list of resources is presented to the user. This omits important criteria: the context and subjectivity of the person issuing a search for a resource.

Theoretically, as far as hard-wired networks are concerned, resource discovery is both a manual and a static process (e.g., discovering something as basic as a printer on a new campus). The process is manual since the user has to explicitly configure his machine to identify available resources; and static because, even if the user had configured a default printer which cannot handle color documents, jobs requiring color prints will be issued to the default one without prior warning. The only workaround possible is for the user to explicitly use his own understanding and judgement in selecting a resource.

This process obviously kills transparency, and one cannot go on to do fancier things like delegate a document to the least loaded printer or the nearest printer (while on the move), or connect only to, say, public or community printers. The takeaway is that these metrics are application specific, and there is only so much that a network layer implementation of resource discovery can achieve.

M. Venkataraman and M. Chatterjee are with the Department of Electrical and Computer Engineering at The University of Central Florida, Orlando, FL 32816. E-mail: {mukundan,mainak}@cpe.ucf.edu. P. Gupta and K. Muralidharan are with Pervasive Computing Wing, Software Engg and Technology Labs, Infosys Technologies Ltd, Bangalore 560100, India. E-mail: {puneet gupta, kartik muralidharan}@infosys.com

¹ For screen shots of IRIS working, please visit http://www.cs.ucf.edu/~mukundan

A. Motivations for this work

The following factors motivate our design:

• Intentions: Computer networking is about two applications talking to each other, and it is the intention of the user sitting on top of the application that needs to be served. One of the foremost visions of pervasive computing is that devices around a user should serve him and his needs rather than him adapting to the idiosyncrasies of the devices. Taking into account the subjectivity and context of a query intuitively results in more meaningful search results.

• Extracting Resources: With an accepted trend that smart spaces and pervasive computing environments of the future will have deeply embedded devices which will make computation transparent to the user, the ultimate challenge will be to extract relevant services efficiently from these environments.

• Low level naming paradigms: It is increasingly observed that in most distributed systems, naming of nodes for low level communication leverages topological location and is in general independent of any application (see, e.g., [7]). In emerging classes of distributed networks, low level communication does not rely upon network topological location. Rather, it is based on names that are external to the network topology and relevant to the application. Since data is now self-identifying, it enables activation of application-specific processing inside the network.

• Names should mean what, not where: IRIS springs from the concept that users should not be bogged down by network names when looking for a resource. Rather, they should be able to express to the network what they want and not where to go about finding it (O’Toole and Gifford in [13]). Since IP addresses are synonymous with geographical locations, the first step is to break this binding to introduce expressiveness, both from the users in what they want and from the resources in what they offer.

B. IRIS overview and contributions

We present IRIS: Intentional Resource Indicator Service for resource discovery in networked environments. The IRIS topology consists of hosts, resolvers and resources. Hosts are conventional end users who interact with the network. “Resolvers” within the network can either be hosts or specialized devices which perform the resource discovery with the aid of a database. We also devise a routing scheme tailored to IRIS for performing request-replies in such a network.

Hosts generate queries that are subjectively parsed at the end hosts to generate an XML description of the user intention. This is sent to resolvers, which act in two steps: (i) identify potential resources that can service this request from a database; and (ii) communicate with the identified resources to get real time information (like level of loading, pricing, accessibility, proximity etc.) to further refine the solution space. The set of resources is then mapped into a reply XML and sent to the host. This is once again parsed at the host to display the resources in a meaningful order.

Apart from request-reply dialogues, IRIS goes a step further in establishing user guidance and complete transparency. When exact match resources are not available, IRIS initiates a negotiation with the user to let him pick the best available alternative, using the history of searches extensively to identify the most important criteria within a search. Also, with extensive usage, IRIS understands a user so completely that it itself disappears from the user’s view (in a sense, explicit queries need not be provided since past data is sufficient to construct intentions), and all that is left are user intentions that directly fetch the correct resource.

We have designed and implemented IRIS on two floors of the Infosys Campus [1], and our code base runs to over 7000 lines of Java. In our quest to design IRIS, we achieve the following:

• We integrate user intention with message routing, and in effect fuse the application with network level indirection, which traditional networks treat separately. Since users do not specify a network address for a resource, they can connect to resources even though the mapping between end nodes and network addresses changes over time.

• IRIS uses a decentralized network of resolvers which perform the actual resource discovery in the network. A resolver overlay network is created entirely using soft state beacons: both to discover/monitor resources and to maintain a resolver overlay. Entries not refreshed periodically are deleted, and this provides for seamless registration and de-registration of services. Resolvers further perform load balancing by spawning/terminating resolvers with growing/waning load, and in effect address scalability problems.

• IRIS has a non-intrusive design which requires minimal changes to the infrastructure already present. Instead of changing the fundamental resource, we create software overlays at the resolver that periodically monitor the status of the resource.

• We keep the design consistent with end-system intelligence (as is seen in the Internet [24]). In other words, intelligence is pushed to the network edges while keeping the network “core” dumb. This has specific advantages: (i) new applications can be seamlessly deployed by a simple host update; (ii) it allows for application level service description, rather than a network layer implementation of it. Ferreting out resources based on hop count, available bandwidth or congestion can never match subjective requirements.

• We provide a basic blueprint for accommodating subjectivity in searches. IRIS keeps extensive track of past searches, allows users to use free form language descriptions in expressing intent, and evolves with and understands the user better in time. With extensive usage, IRIS itself disappears from the user’s perspective and all that is left is user intentions mapping straight to resources, providing complete transparency.

Fig. 1. Schematic representation of what IRIS does.

II. RELATED RESEARCH

Resource discovery has been an issue since the early networking days, and the earliest solutions were implemented in the network layer (in fact, as of today, this is the norm). Approaches that utilize this strategy are not comparable to IRIS since they have no provisions for capturing application specific metrics for resource discovery. Trying to fuse resource discovery with application specific metrics is a very recent concept indeed. One of the earliest proposals for mapping network names to IP addresses is the Domain Name System (DNS) [21] of the Internet.

A. Related Architectures

Sun’s Jini [16] provides “federation of networked devices” over Java’s Remote Method Invocation (RMI) for spontaneous distributed resource discovery. Jini does not address dynamism or resource failure. Universal plug-and-play [20] uses a subset of XML to describe resources provided by devices, which is similar to our concept of using XML documents for “business cards”, but the design and philosophies differ greatly. The Service Location Protocol (SLP) [25], [23] achieves the discovery of heterogeneous network resources using centralized Directory Agents. The Berkeley Service Discovery Service (SDS) [14] extends this concept with secure, authenticated communications. IBM’s “T” Spaces [17] provide a lightweight database over which nodes perform queries. However, this framework is highly optimized for static client-server applications rather than dynamic peer to peer interactions.

Related literature with similar philosophies includes the Information Bus by Oki et al. [22], which allows applications to communicate with each other by only providing the subject of data, as is the case with Salamander [18]. There has been work using such philosophies in entirely different domains, like Jacobson’s multicast based self configuring Web caches [15], Estrin et al.’s diffusion based approaches [6] for data aggregation in sensor networks, and the SPIN [10] protocol, also for sensor networks. Numerous protocols exist that use a resolver-like concept to aid network functionalities. CARD [3] avoids global flooding, and lets local resources (one hop) be discovered easily, while making every node maintain distant “contacts” for discovering remote resources. Kozat et al. propose Network Layer Support [4] for resource discovery, and create virtual backbones for discovering and registering resources. Saeda and Helmy propose Rendezvous Regions [5], dividing a network topology into geographic regions, with each region responsible for a set of keys representing services in that region, and subsequently mapping this information by electing a few nodes to map the entire network. Cheng and Marsik propose piecewise network awareness [8] to leverage service advertisement and discovery as well as network awareness techniques. This work, however, focuses on reducing wireless bandwidth consumption and battery energy of mobile devices. Tilak et al. [9] propose dynamic resource discovery for sensor networks to allow a modular implementation to discover dynamic and heterogeneous resources with a little tradeoff between inter-operability and energy efficiency. Doval and O’Mahony propose NOM [11] for resource discovery in evolving networks, where nodes and resources may switch from on to off or devices move physically. Sailhan and Issarny propose a scalable service discovery for ad hoc networks [12] based on dynamic and homogeneous deployment of cooperating directories within the network. Stann and Heidemann in BARD [19] identify resource discovery to be the most expensive step in data dissemination in sensor networks, and counter global flooding by modeling searching and routing as a stochastic process. Note, however, that none of the foregoing protocols focus on application centered resource discovery, subjective interpretation or an attempt to understand and evolve with the user. Also, to the best of our knowledge, there are no protocols suited or optimized for planned network deployments like enterprise or home networks.

B. How IRIS differs

The design and goals of IRIS strongly overlap with those of INS [2], but the implementation and areas of focus differ greatly: (i) IRIS believes in incremental pervasive ability, and is friendlier to existing infrastructure. Little needs to be modified to set up IRIS (INS requires changes in resource drivers as well as compatibility with Chord, a peer-to-peer lookup service that complements INS); (ii) IRIS uses XML for data exchange, and XML has emerged as a convincing standard for such purposes in the Internet; (iii) INS is highly tuned for self collaborative networks (like ad hoc networks), and established infrastructures would gain little from it. IRIS, on the other hand, has been tuned more towards deployment in enterprise mobility solutions; (iv) IRIS gives the user’s intent foremost focus, and allows him to express himself freely in language. IRIS strives to understand the meaning of a user’s intent in a subjective fashion. INS does not provide any subjective interpretation; (v) IRIS understands the user more and more with time, to a point where IRIS itself disappears from the user’s view, and all that is left is the user’s “understood” intentions and his domain of resources; (vi) IRIS does not propagate changes through the entire network, and makes changes to local nodes within a hop only. This makes the design extremely scalable and avoids excess network traffic; (vii) IRIS proposes the use of a new routing framework that not only leverages an IRIS-like philosophy, but is also well suited to planned deployments of IT infrastructure such as enterprises. Like INS, IRIS can complement such architectures. Unlike DNS, where resolvers form a static overlay, resolvers in IRIS collaborate with each other and prove resilient and dynamic.

III. SYSTEM DESIGN

A. Hosts: Understanding user Intentions

The IRIS host is internally partitioned into an Application layer and a Processing/Intelligence layer. IRIS lets users use free form strings to specify their intent. The text is primarily English to begin with. We present here a basic framework for capturing user intents which adapts itself to the user in time. IRIS comes with an initial built-in vocabulary of keywords, and this collection is based on a survey in which we asked over a hundred people to list things that they think are appropriate when describing a printer resource. Users are free to add keywords to the vocabulary to further customize IRIS with time. In general, we identify a user intent to be of the following format: (adjective, proximity, urgency, flexibility, attr1, attr2, ..., attrN). A tabulation of initial possibilities is listed in Fig. 2. These are further elucidated:

• Adjectives: An adjective is a word that describes in a relative fashion how the resource should be, based entirely on a user’s subjectivity. Its interpretation will differ between users. Take the example of a “good” printer. This might mean a color laser printer to one user and an ink-jet to another. To let IRIS interpret such intentions, we maintain a customization file in an XML format for every user. The file is initially populated with definitions derived from “most probable” understandings. However, as a user continues to use IRIS and makes selections of a resource for a given query, IRIS records the various attributes of the resource chosen by the user against its initial interpretation. This lets the customization evolve according to the user. Adjectives are further classified as strong and weak adjectives. A strong adjective describes many things about a resource that need not be explicitly mentioned.

Keywords            Usage                              Defaults
Strong Adjectives   “excellent, good”                  Color printer, high res.
Weaker Adjectives   no adjectives, “Med, low, any”     black-white, lower resolution
Proximity           “near, nearest, close, closest”    nearest printer
Urgency             “least-loaded, fast, urgent”       least loaded printer
Flexibility         Negotiable: “try”                  flexible
                    NotNegotiable: “hard”              Inflexible
Other Attributes    Color: “color, bw”                 Color capabilities
                    Access: Public, private            Defaults

Fig. 2. Tabular representation of the default customization (stored as an XML) of adjective interpretations. Each field in the left hand column is a node in the XML tree that is parsed according to the inference presented. The definitions, however, change with continuous usage.

• Proximity: The user can explicitly declare whether the resource he wants should be close to him, or whether he is not bothered about its proximity as long as he gets what he wants. Words like “near, nearest, close, closest” etc. trigger this clause. This is an assumed metric by default².

• Urgency: This describes how fast the user expects the turnaround time for the resource to be. Words like “fast, urgent, hurry” etc. indicate such an intent. In the appropriate context of the resource, we understand this to be something which is least loaded, or something which can process a request really fast. For the printer example, this might mean a “least-loaded” printer (one with a small or zero pending queue of jobs to process).

• Flexibility: The user might use several attributes to describe a resource, but not necessarily want all of them. In other words, there is an implicit relative importance amongst the attributes themselves. Based on past selections of resources for given intentions, a reverse priority list of attributes is created (ordered from the least important attribute sought to the most important one; see Section III-C). If an exact match cannot be found, IRIS will keep knocking off the least important parameter from the query and retry the search until all resources of that type (wildcard) are found.

• Other Attributes: These are other attributes that are specific to the resource in question. For example, attributes like black-white or color, a Xerox or an Epson printer and so on would fall in such a category.

B. Explicit Query to IRIS

As IRIS starts, a GUI pops up with a simple text box to accept user queries. When we look for a resource, we first generate a “vague” description of what we are looking for in our heads. It is easy to recall the last time one wanted to go to a restaurant, or a movie, or to buy something, or tried to look for a printer on a new campus. One intuitively decides on what one wants with a simple sentence of the form “I need a good black-white printer, and fast”. The motive of IRIS is to let you stop at that intent generation. This query string is a bunch of unprocessed data. IRIS first extracts from this a set of “tokens”. A token is defined as a word which potentially means something to IRIS. This is done by comparing individual words to known keywords, and the end result is a set of attributes for the resource in question.

² See Lemma 4 in Section IV for the rationale behind this.

Fig. 3. Input query processing modules at the IRIS host

<PrinterQuery>
  <Description>
    <Mode>BlackWhite</Mode>
    <Resolution>1024x680</Resolution>
    <Access>Public</Access>
    <Type>Laser</Type>
    <Location>
      <Building>19</Building>
      <Floor>1</Floor>
    </Location>
  </Description>
</PrinterQuery>

Fig. 4. XML describing the user intent (request-XML) for the example case of “good black white public printer”

For the given query, the keywords recognized at this point are (good, black-white, public, printer). To generate the values associated with the attributes, the customization XML is parsed (in accordance with the current state of definitions, as in Fig. 2). This yields a set of attribute-value pairs used to define a user intent, which in effect is a request-XML. Going with the foregoing example, the request-XML thus constructed is shown in Fig. 4. Note that though the adjective “good” defaults to a color printer, the presence of the token “black-white” is reflected in the “Mode” field in the request.
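The token extraction and request-XML construction described above can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' Java implementation: the dictionaries `ADJECTIVES`, `ATTRIBUTES` and `RESOURCE_TYPES` are hypothetical stand-ins for the per-user customization XML, and the value `High` for `Resolution` is an assumed default.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the per-user customization XML:
# each keyword maps to the (field, value) pairs it implies.
ADJECTIVES = {"good": [("Mode", "Color"), ("Resolution", "High")]}
ATTRIBUTES = {
    "black-white": [("Mode", "BlackWhite")],
    "public": [("Access", "Public")],
}
RESOURCE_TYPES = {"printer": "PrinterQuery"}
KNOWN = set(ADJECTIVES) | set(ATTRIBUTES) | set(RESOURCE_TYPES)


def tokenize(query):
    """Return the words of the query that match a known keyword, in order."""
    words = (w.strip(".,!?").lower() for w in query.split())
    return [w for w in words if w in KNOWN]


def build_request_xml(query):
    """Turn a free-form query into a request-XML string."""
    tokens = tokenize(query)
    types = [RESOURCE_TYPES[t] for t in tokens if t in RESOURCE_TYPES]
    if not types:
        raise ValueError("no known resource type in query")
    fields = {}
    # Apply adjective defaults first so explicit attributes override them.
    for table in (ADJECTIVES, ATTRIBUTES):
        for token in tokens:
            for name, value in table.get(token, []):
                fields[name] = value
    root = ET.Element(types[0])
    description = ET.SubElement(root, "Description")
    for name, value in fields.items():
        ET.SubElement(description, name).text = value
    return ET.tostring(root, encoding="unicode")
```

Processing adjectives before explicit attributes is one simple way to get the override behaviour the text describes, where “black-white” wins over the color default implied by “good”.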

C. IRIS Guidance

It is possible that the user wanted something and no exact match could be made, for one of two reasons: (i) a resource with such parameters is simply not available in the network, or (ii) the user wanted proximity to be important, and such a resource is not available in the immediate vicinity. In such cases, we let IRIS guide the user to a resource.


In the former case, it is not possible to satisfy the user completely. What can be done, however, is to present the user with the nearest match that most closely matches his set of parameters in the right order of priority. IRIS starts to build a reverse priority list with increased usage, which is a list of the most important parameters in a search in reverse order (the least important parameter first). This is built by recording the most frequently occurring attribute and its corresponding value in the choices of resources made by the user in the past for his given intents. This lets IRIS understand the parameters most important to the user for a given resource over a long period of time. For example, it is possible to have a list such as this: proximity, color, resolution, load. When guidance is looped, IRIS will knock off the first attribute (proximity) from the query (if present) and make an attempt, then the second attribute, and so on until only the most important parameter remains.
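A minimal sketch of this guidance loop, assuming resources and queries are flat attribute dictionaries (an assumption made only for illustration; the paper exchanges XML). The function names `matches` and `guided_search` are hypothetical.

```python
def matches(resource, query):
    """A resource matches when every queried attribute has the requested value."""
    return all(resource.get(k) == v for k, v in query.items())


def guided_search(query, resources, reverse_priority):
    """Relax the query least-important-attribute first until something matches.

    reverse_priority lists attributes from least to most important.
    The most important attribute present in the query is never dropped.
    Returns (matching resources, attributes dropped along the way).
    """
    relaxed = dict(query)
    order = [a for a in reverse_priority if a in relaxed]
    dropped = []
    # First pass (None) tries the exact query; later passes drop one attribute each.
    for attr in [None] + order[:-1]:
        if attr is not None:
            del relaxed[attr]
            dropped.append(attr)
        found = [r for r in resources if matches(r, relaxed)]
        if found:
            return found, dropped
    return [], dropped
```

With the example list (proximity, color, resolution, load), a query for a nearby color printer with low load first drops proximity, then color, and stops once only load remains.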

In the latter case, when the user wanted proximity and the resource was not available in the immediate vicinity, the resolver in the zone contacts neighboring resolvers with the user intent. The neighboring resolvers in turn perform a search, and the net result is communicated back to the user (with duplicate suppression). This process could go on with more resolvers being incrementally contacted within a radius of two, three, or more hops until either a match is found or the user selects an appropriate resource. Note that the user’s intent is still served best, because IRIS still tries to find a resource in as few hops as possible.
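The hop-by-hop widening of the search can be illustrated with a breadth-first expansion over the resolver overlay. The sketch below is a simplification under stated assumptions: `neighbors` is a hypothetical adjacency map between resolvers, `local_matches` maps each resolver to the resources it would return for the query, and `max_hops` is an assumed cut-off that the paper does not specify.

```python
def expanding_search(start, neighbors, local_matches, max_hops):
    """Contact resolvers ring by ring and stop at the first radius with a match.

    Returns (hop radius, resources), or (None, []) if nothing is found
    within max_hops. Radius 0 is the user's own resolver.
    """
    visited = {start}
    frontier = [start]
    for hops in range(max_hops + 1):
        results = []
        for resolver in frontier:
            for resource in local_matches.get(resolver, []):
                if resource not in results:  # duplicate suppression
                    results.append(resource)
        if results:
            return hops, results
        # Widen the ring to resolvers not yet contacted.
        next_frontier = []
        for resolver in frontier:
            for nb in neighbors.get(resolver, []):
                if nb not in visited:
                    visited.add(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
        if not frontier:
            break
    return None, []
```

Stopping at the first radius that yields a result keeps the answer within as few hops as possible, which is the property the paragraph above emphasizes.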

D. I “Trust” IRIS

In time, IRIS adapts to the user. It records the queries typed by the user and the state of the query as it was when a selection of a resource was made (this is more relevant when the user was flexible and negotiation continued until a resource was chosen). The “state” of the query is defined in terms of the dimensions of describing a resource: adjective, proximity, urgency, flexibility and other attributes. Note that history is recorded only if the user commits to a resource by selecting it. When extensive profiling has been performed, it is possible to not say anything to IRIS, and let it do its best from what it understands of the user. By tracking history, IRIS itself constructs a query, parses it in the user’s context, and constructs a request-XML.
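One way to picture how a query could be constructed from history alone is to take, for each attribute, the value the user committed to most often. The sketch below is only a guess at such a policy: the paper does not state how the implicit query is derived, and `min_selections` is a hypothetical threshold for "extensive profiling".

```python
from collections import Counter


def implicit_query(history, min_selections=5):
    """Build a query from committed selections, or None if history is too thin.

    history is a list of attribute dictionaries, one per resource the
    user actually selected. Each attribute gets its most frequent value.
    """
    if len(history) < min_selections:
        return None
    attributes = sorted({attr for selection in history for attr in selection})
    query = {}
    for attr in attributes:
        values = Counter(sel[attr] for sel in history if attr in sel)
        query[attr] = values.most_common(1)[0][0]
    return query
```

The resulting dictionary would then go through the same request-XML construction as an explicit query.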

E. Resolvers and Resource Overlays

Resolvers are network entities that aid in the performance of the network. They could either be conventional hosts or devices specialized to this end. Resolvers are passive entities that wait on well known ports both for resources to connect to the network and for hosts to submit queries.

Resolvers also perform resource monitoring via resource overlays. The concept of a resource overlay springs from the requirement that we make few changes to the existing infrastructure and embed the entire logic in software. Probing for real time data about a resource (like the queue length) becomes a lot easier if the resource co-operates with periodic broadcasts of real time information about itself into the network. Taking the printer example, this might require changes to the device driver code, thus changing the fundamental “printer”. To counter this, we design a resource proxy. This is a code snippet that does not reside in the resource, and instead resides in the resolver. As soon as a resource is identified, a piece of code for that resource type is initiated. This performs the necessary monitoring of the resource and keeps the resolver updated.

Fig. 5. Topology layout for IRIS. The given topology is logically partitioned into zones, with at least one resolver per zone.

Resolvers have two main functions: (i) parse incoming request-XMLs from hosts to identify user intention, query the database to find resources that match the request, and communicate with the resources to extract real time information; and (ii) wait for resources to connect to the network. Once a new resource is identified, static information (immutable fields like name, type, capabilities etc.) from the resource is entered into the database. An overlay for that resource is additionally created that continues monitoring it.
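The first resolver function, a static database match followed by refinement with real time information, might look like the following sketch. The data shapes are assumptions for illustration: `database` is a list of dictionaries holding immutable fields, and `proxies` maps a resource name to a callable standing in for its resource proxy, returning a hypothetical status with `online` and `queue_length`.

```python
def resolve(request, database, proxies):
    """Match static fields first, then rank live candidates by queue length."""
    # Step (i): filter on the immutable fields stored in the database.
    candidates = [
        entry for entry in database
        if all(entry.get(k) == v for k, v in request.items())
    ]
    # Step (ii): ask each candidate's proxy for its current status.
    live = []
    for entry in candidates:
        status = proxies[entry["name"]]()
        if status["online"]:
            live.append({**entry, **status})
    return sorted(live, key=lambda r: r["queue_length"])
```

Ranking by queue length is just one possible refinement; pricing, accessibility or proximity could be folded into the sort key in the same way.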

We allow resolvers to spawn additional resolvers if the load on a particular grid exceeds a certain limit, and allow spawned resolvers to terminate themselves if the load falls beneath a threshold. Such designs are popular, and work well as far as load balancing and congestion are concerned.
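The spawn and terminate rule amounts to a pair of thresholds with a gap between them. A toy sketch follows; the threshold values `high` and `low` and the notion of load as requests per interval are assumptions, since the paper gives no concrete numbers.

```python
def next_resolver_count(count, load, high=100, low=20):
    """Return the resolver count for a grid after one load check.

    load is the total request load on the grid; high and low are
    hypothetical per-resolver thresholds.
    """
    per_resolver = load / count
    if per_resolver > high:
        return count + 1  # spawn one more resolver
    if per_resolver < low and count > 1:
        return count - 1  # a spawned resolver terminates itself
    return count
```

Keeping `low` well below `high` avoids resolvers being spawned and terminated repeatedly when the load hovers around a single limit.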

IV. NETWORK TOPOLOGY AND ROUTING

Conventional routing schemes in ad hoc networks or static infrastructures are in general unsuitable for an IRIS-like architecture for a variety of reasons. Since resources are to be discovered within a given radius of the user, we need a mix of controlled flooding and extensive interactions between resolvers and available resources. The IRIS design gains little from schemes like AODV [29] and DSR [30]. Assuming a transmission radius of 50 meters, we conducted a simple experiment with 10 hosts within a typical IRIS grid of 50×50 meters. The number of control messages used by AODV and DSR rises exponentially with the number of hosts. To counter this, we designed a topology control mechanism and routing scheme to better suit the needs of IRIS.


Fig. 6. Host-Resolver-Host interactions

A. Topology maintenance

The IRIS topology is logically partitioned into grids, and we place at least one resolver per grid. Assuming a transmission range of $\lambda$ meters, we choose grid sizes to be $\frac{\lambda}{2} \times \frac{\lambda}{2}$. Resolvers periodically advertise themselves using beacons, which are used by hosts and other neighboring resolvers to maintain a list of proximate resolvers. Since the transmission range extends well into the adjacent grids, hosts end up with a rich list of potential resolvers to service requests. When a host (or a neighboring resolver) fails to receive three consecutive beacons from a resolver, that resolver is marked “down” and its entry is deleted from the neighbor table. In effect, our routing table is simple in that it only requires local state to be maintained; no global routing state is kept. This has direct consequences: (i) resources may join and leave without registering or de-registering with any central repository, and the current list is assured to be stable and correct; (ii) users continue to be mapped to resources even though such a mapping may change over time with network dynamics; and (iii) when a resolver fails, the failure is gracefully handled with simple changes to the neighbor tables in its local vicinity.
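The beacon rule described above (drop a resolver after three consecutive missed beacons, keep only local state) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that time is divided into beacon intervals and that the caller reports which resolvers were heard in each interval.

```python
class NeighborTable:
    """Local list of proximate resolvers, maintained from beacons only."""

    MAX_MISSED = 3  # from the text: three consecutive missed beacons

    def __init__(self):
        self._missed = {}  # resolver id -> consecutive beacon intervals missed

    def tick(self, heard):
        """Process one beacon interval; `heard` is the set of resolver ids heard."""
        for rid in list(self._missed):
            if rid in heard:
                self._missed[rid] = 0
            else:
                self._missed[rid] += 1
                if self._missed[rid] >= self.MAX_MISSED:
                    del self._missed[rid]  # marked "down": entry deleted
        for rid in heard:
            self._missed.setdefault(rid, 0)  # newly discovered resolver

    def neighbors(self):
        """Return the resolvers currently considered alive, sorted by id."""
        return sorted(self._missed)
```

No message is sent when an entry is deleted, which matches the point that a resolver failure only changes neighbor tables in its local vicinity.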

B. Simple queries

Hosts generate a request XML based on user intentions and send it to the nearest resolver over an HTTP connection for parsing. The choice of a resolver is based purely on proximity (link quality, or location information when available), although more sophisticated metrics may be applied (for example, the load on the resolver itself). The resolver extracts available resources that match the query from the database, and contacts the resources for real-time information about them.

C. Routing IRIS Guidance

When an exact match to a query cannot be made, or if the user is not satisfied with the results, IRIS guidance is looped. This usually means that resources available within one hop cannot satisfy the user. In such cases, the gateway resolver that could not service the request contacts its neighboring resolvers with a copy of the query. These resolvers in turn extract available resources in their vicinity. The resulting set is collated at the gateway resolver to suppress duplicate resources, and the list is communicated back to the user. This is depicted in Fig. 5.

Fig. 7. Abstract representation of the parsing file
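A minimal sketch of this one-hop guidance step is given below. The data layout (dictionaries keyed by resolver id, resources as attribute dictionaries with a `name` field) and the function names are assumptions made for illustration; the sketch covers only the "no exact local match" trigger, not the case where the user asks for more results.

```python
def matches(resource, query):
    """True if the resource carries every attribute-value pair in the query."""
    return all(resource.get(attr) == value for attr, value in query.items())


def resolve(gateway, query, resources_at, neighbors_of):
    """Answer locally if possible; otherwise ask one-hop neighbor resolvers.

    resources_at: resolver id -> list of resource dicts
    neighbors_of: resolver id -> list of neighboring resolver ids
    """
    local = [r for r in resources_at[gateway] if matches(r, query)]
    if local:
        return local
    # Guidance: forward a copy of the query and collate the replies,
    # suppressing resources reported by more than one neighbor.
    seen, collated = set(), []
    for neighbor in neighbors_of[gateway]:
        for r in resources_at[neighbor]:
            if matches(r, query) and r["name"] not in seen:
                seen.add(r["name"])
                collated.append(r)
    return collated
```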

V. PERFORMANCE ANALYSIS

The most computationally intensive steps are the ones involving recursive parsing. The following lemmas establish an algorithmic analysis of IRIS at an abstract level of tree-based recursive searches. Refer to Fig. 7 for the abstract model chosen for XML representations. The following table defines the symbols used in our lemmas and corollaries:

$n_a$     Number of attributes present
$d$       One half of the depth of name tree
$r_a$     Range of attributes possible
$r_v$     Range of values possible
$T(d)$    Time to parse at depth $d$

Lemma 1: The number of non-conflicting parameters/attributes required to completely define a resource has a strict upper bound.

Proof: A non-conflicting attribute is one that does not contradict another attribute already present in the query (e.g., a "black-and-white color printer" is conflicting, in the sense that the user wants a color printer after explicitly asking for a black-and-white-only printer). This lemma states that there is a definite number of attributes, k, that can be used to completely define a resource type. When end users describe a resource to IRIS, they use a limited set of attributes to describe what they want. In normal circumstances, it takes a maximum of six attributes to precisely describe an intent to IRIS.

Lemma 2: Recursive tree parsing has a complexity of $\Theta(n_a^d)$ for a tree of depth $d$.

Proof: Parsing of the XML document precisely describing the attribute-value pairs, or av-pairs, is the most computationally intensive task at the end host. An XML document may be abstracted to a k-ary tree, where k is the number of attributes possible at a given node. Refer to Fig. 7 for references to tree parsing in this lemma.

Denote by $T(d)$ the time required to parse a node at depth $d$. We have the following expression describing the recursion:

$$T(d) = n_a \bigl(t_a + t_v + T(d-1)\bigr)$$

For an XML document with strict ordering of attributes, values are only present at the leaves, and there is only one value for an attribute. Intermediate nodes are heads of other nodes. We hence have $t_v = c$, where $c$ is a constant. The expression $t_a + t_v$ can be rephrased as $t_a + c$, or simply $t$, where $t = t_a + c$. Rephrasing the previous recursion, we have:

$$T(d) = n_a \bigl(t + T(d-1)\bigr)$$

Assuming the base case $T(0) = b$, we have on expansion:

$$
\begin{aligned}
T(d) &= n_a \bigl(t + T(d-1)\bigr) \\
     &= n_a t + n_a \Bigl(n_a \bigl(t + T(d-2)\bigr)\Bigr) \\
     &= \Theta\bigl(n_a^{d}\,(t + b)\bigr)
\end{aligned}
\tag{1}
$$
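As a sanity check on Equation (1), the recursion can be evaluated directly and compared with its geometric-series expansion, $T(d) = t\,n_a\,\frac{n_a^d - 1}{n_a - 1} + n_a^d\,b$, which is $\Theta(n_a^d (t + b))$ for fixed $n_a > 1$. The function names and the sample values below are illustrative assumptions, not measurements from IRIS.

```python
def parse_time(d, n_a, t, b):
    """Evaluate T(d) = n_a * (t + T(d-1)) with base case T(0) = b."""
    if d == 0:
        return b
    return n_a * (t + parse_time(d - 1, n_a, t, b))


def closed_form(d, n_a, t, b):
    """Geometric-series expansion of the same recursion (requires n_a > 1)."""
    return t * n_a * (n_a**d - 1) / (n_a - 1) + n_a**d * b


# Example with assumed values n_a = 5, t = 2, b = 1 at depth 3.
print(parse_time(3, 5, 2, 1), closed_form(3, 5, 2, 1))
```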

A linear search results in $T(d) = \Theta\bigl(n_a^{d} \cdot (r_a + r_v + b)\bigr)$, where $r_a$ and $r_v$ are the ranges of the attribute and value fields. This holds because, for the search tree, $t_a \propto r_a$ and $t_v \propto r_v$ for linear searches. This complexity appears to rise with the value of $n_a$, but as observed in Lemma 1, $n_a$ has a finite upper bound. During implementation, we have found $n_a < 6$. XML parsing of this order happens in the following cases: (i) parsing of the customization XML at the end host to understand the semantics of the query, (ii) XML parsing at the resolver to decode the av-pairs of the query, and (iii) parsing of the result XML sent by the resolver to the host. Of the three cases, we have found the second (parsing the query at the resolver) to be the most extensive, since the file may contain all the parameters.

Corollary: IRIS guidance is a linear accumulation of tree-based recursive searches.

Proof: This corollary analyzes IRIS guidance. Assume a user query with $n$ attributes. When IRIS returns a set of results based on this query and the user wants to broaden the search within the existing query, IRIS drops the least important parameter (leaving $n - 1$ parameters). If the user is still not satisfied, it drops one more parameter, and so on until a wildcard "*" match of all the resource types is made. The following expression captures the complexity of this step:

$$T(R) = T(n) + T(n-1) + T(n-2) + \dots + T(n-k)$$

where $k$ is the number of times the user lets the guidance loop (for $k \le n$). Upon expansion using Equation (1), we have:

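$$
\begin{aligned}
T(R) &= \sum_{k=0}^{k \le n} T(n-k) \\
     &= \sum_{k=0}^{k \le n} \left( \frac{(n-k)^{d} - 1}{(n-k) - 1} \cdot t + (n-k)^{d-1}\, b \right) \\
     &= \sum_{k=0}^{k \le n} \Theta\bigl((n-k)^{d}\,(t + b)\bigr) \\
     &= \Theta\bigl(n^{d}\,(t + b)\bigr)
\end{aligned}
\tag{2}
$$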

It is obvious that a single query (with no guidance) is the special case $k = 0$. As the value of $k$ increases, the values of $(n-k)^{d}$ diminish, and the sum converges to the case in Lemma 2.
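The guidance loop analyzed in the corollary can be sketched as a generator that yields the original query and then progressively broader ones. How IRIS ranks attribute importance is not specified in this section, so the `importance` mapping and the use of an empty query to stand for the wildcard "*" are assumptions of this sketch.

```python
def relax(query, importance):
    """Yield the query, then broader versions with the least important
    attribute removed each time, ending with the empty (wildcard) query."""
    # Most important attributes first, so truncation drops the least important.
    ordered = sorted(query, key=lambda attr: importance[attr], reverse=True)
    for keep in range(len(ordered), -1, -1):
        yield {attr: query[attr] for attr in ordered[:keep]}


query = {"type": "printer", "color": "yes", "floor": "2"}
importance = {"type": 3, "color": 2, "floor": 1}  # assumed ranking
for step, q in enumerate(relax(query, importance)):
    print(step, q)
```

Each step corresponds to one term $T(n-k)$ in the sum, since the query parsed at step $k$ has $n - k$ attributes.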

Lemma 3: The more localized the traffic patterns, the greater the savings in bandwidth, memory and energy consumption of participating devices.

Proof: To operate successfully, one needs a resolver for every networked area. If the area to be IRIS-enabled is large, one logically breaks it down into smaller grids that exhaustively cover the entire area, with each such grid having at least one resolver. In other words, every host should have at least one resolver in its proximity. For reasons of load balancing and robustness, and assuming the transmission range of a resolver to be $\lambda$ meters, we will assume the grid sizes³ to be $\lambda/2 \times \lambda/2$.

Consider a topology of $p \times q$ square meters. Assuming the range of a resolver to be about $\lambda$ meters, the minimum number of resolvers we need is given as:

$$R_n = 4 \times \left\lceil \frac{p \times q}{\lambda \times \lambda} \right\rceil \tag{3}$$
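Equation (3) is easy to evaluate for a concrete deployment. The helper below is a direct transcription; the function name and the example dimensions are illustrative assumptions.

```python
import math


def min_resolvers(p, q, lam):
    """R_n = 4 * ceil((p * q) / (lam * lam)), as in Equation (3).

    p, q: side lengths of the deployment area in meters
    lam:  transmission range of a resolver in meters
    """
    return 4 * math.ceil((p * q) / (lam * lam))


# A 100 m x 100 m floor with a 50 m transmission range.
print(min_resolvers(100, 100, 50))  # 16
```

The factor of 4 reflects the $\lambda/2 \times \lambda/2$ grid size: each $\lambda \times \lambda$ square contains four such grids, each with at least one resolver.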

Consider an epoch with $n$ sessions, with each session generating $x_i$ packets, such that the total number of packets generated is given by (the ideal case):

$$P_{ideal} = \sum_{i=1}^{n} x_i \tag{4}$$

$P_{ideal}$ is the total number of packets generated at the hosts; for the most efficient usage of bandwidth and energy (a goodput of one), no more than this number should be transmitted. Resolvers generate duplicates that permeate the entire resolver network only for inter-group communication. In normal cases, they return the nearest resource available for use. Denote by $p$ the probability that a session is local (one hop) and by $\bar{p}$ the probability that the session is not local. Note that:

$$\bar{p} = 1 - p \tag{5}$$

Using Equation (4), we have the total number of packets generated as:

³ For such a configuration, a host is never left without a resolver even if its nearest resolver fails. This is because resolvers in adjacent grids can easily service the host, since their transmission range extends deep into adjacent grids [27]. Also, having multiple candidate resolvers alleviates problems of growing congestion and load balancing.


Fig. 8. Map of the floor where IRIS is implemented

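$$
\begin{aligned}
P_{Total} &= \sum_{i=1}^{n} \left( p_i \cdot x_i + 4 \times \bar{p}_i \cdot \left\lceil \frac{p \times q}{\lambda^{2}} \right\rceil \cdot x_i \right) \\
          &= \sum_{i=1}^{n} \left( 4 \cdot \left\lceil \frac{p \times q}{\lambda^{2}} \right\rceil + p_i \left( 1 - 4 \cdot \left\lceil \frac{p \times q}{\lambda^{2}} \right\rceil \right) \right) x_i \\
          &= \sum_{i=1}^{n} \bigl( R_n + p_i (1 - R_n) \bigr)\, x_i
\end{aligned}
\tag{6}
$$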

The number of excessive packets generated is as follows:

$$P_{\delta} = P_{Total} - P_{ideal} \tag{7}$$

For optimum protocol performance, $P_{\delta}$ should tend towards zero, or in other words, $P_{\delta} \to 0$. Restating this using Equations (4), (6) and (7):

$$\left( \sum_{i=1}^{n} \bigl( R_n + p_i (1 - R_n) \bigr)\, x_i \;-\; \sum_{i=1}^{n} x_i \right) \to 0 \tag{8}$$

The above relation shows that the larger the value of $p$ (in other words, $p \gg \bar{p}$), the fewer duplicates are generated and the closer the system comes to ideal bandwidth and memory usage. This observation can be extended to general observations of traffic patterns.
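The dependence on $p$ can be made explicit by substituting Equations (4) and (6) into Equation (7) and collecting terms. The following is a worked simplification under the same definitions, with $\bar{p}_i = 1 - p_i$ as in Equation (5):

$$
\begin{aligned}
P_{\delta} &= \sum_{i=1}^{n} \bigl( R_n + p_i (1 - R_n) \bigr)\, x_i - \sum_{i=1}^{n} x_i \\
           &= \sum_{i=1}^{n} \bigl( (R_n - 1) - p_i (R_n - 1) \bigr)\, x_i \\
           &= (R_n - 1) \sum_{i=1}^{n} \bar{p}_i\, x_i
\end{aligned}
$$

The excess is therefore zero when every session is local ($\bar{p}_i = 0$) and grows linearly with both the non-local probability and the number of resolvers $R_n$.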

VI. IMPLEMENTATION PERFORMANCE

We have implemented IRIS in specific buildings at the Infosys Bangalore Campus [1]. IRIS was run on standard IBM PCs, each with an Intel P4 processor and 256 MB of memory. We use SQL Server for the database and operate with Xerox DocuPrint printers, along with inkjets, for direct implementation. We analyze IRIS performance for CPU utilization, bandwidth, memory and time.

We profiled IRIS to analyze it in real time. IRIS was run in isolation from all other applications at the time of the tests. However, depending on utilization by other modules already running on the machine (particularly the operating system and background processes), the CPU and memory utilizations return different, albeit extremely close, values on different runs. Each test was run ten times and we recorded the average obtained from the runs.

Fig. 9. Performance of name-lookup v/s number of names in name space. (x-axis: Number of Names in Namespace; y-axis: Number of lookups per sec)

A. Name lookups

One factor governing scalability is the ability to handle a large number of requests. We constructed an XML document by inserting random entries into the customization.xml file, from 1 to 10,000 entries in steps of 20, and recorded the time taken to perform a single name resolution, $t_r$. The inverse, $1/t_r$, indicates the number of queries per second that can be handled at a host (see Fig. 9), since $t_r$ gives a precise estimate for an independent query. The number of name resolutions went from a maximum of around 11000 per second (for a single entry in the customization file) to a minimum of around 948 per second for 10,000 entries. The slope of the graph is around -1.005 lookups per second per name. Since the magnitude of the slope is fairly close to one, performance does not degrade too fast and is almost constant with a minor negative slope.
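The measurement procedure above can be approximated with a short script. This is a sketch only: the schema of customization.xml is not shown in this section, so the `customization` and `entry` element names, the `name` attribute and the linear-scan lookup are all assumptions, and the absolute numbers will differ from those reported for the Java implementation.

```python
import time
import xml.etree.ElementTree as ET


def build_namespace(n):
    """Build an in-memory XML tree with n hypothetical name entries."""
    root = ET.Element("customization")
    for i in range(n):
        entry = ET.SubElement(root, "entry", name=f"name{i}")
        entry.text = f"resource{i}"
    return root


def resolve(root, name):
    """Resolve one name by scanning the entries in document order."""
    for entry in root.iter("entry"):
        if entry.get("name") == name:
            return entry.text
    return None


def lookups_per_second(root, name, repeats=200):
    """Estimate 1 / t_r, where t_r is the mean time of a single resolution."""
    start = time.perf_counter()
    for _ in range(repeats):
        resolve(root, name)
    t_r = (time.perf_counter() - start) / repeats
    return 1.0 / t_r


if __name__ == "__main__":
    for n in (1, 1000, 10000):
        root = build_namespace(n)
        print(n, round(lookups_per_second(root, f"name{n - 1}")))
```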

B. Memory and CPU utilization

For the same experiment, we recorded the memory allocated by the Java interpreter. The memory allocated should be a constant amount greater than the size of the customization.xml file, which houses the name resolution information. The allocation was more or less uniform and exceeded the actual file size by a small constant amount, owing to system pointers and computational workspace. The allocation reached a maximum of 1.48 MB. We have re-plotted the INS result in the same graph for a direct comparison. The difference in values exists because the name-resolution files in INS are much more informative and bulkier.

We also recorded CPU and memory saturation (Fig. 10) for an increasing number of names in the customization file. Our recordings are consistent with those of INS, and we found that the CPU saturated faster than bandwidth. The CPU saturated at around 9800 entries, almost twice the value reported for INS, largely because we use an Intel P4 processor at a different clock speed from that of the INS experiments. Memory utilization was a mere 3.45% (of the available 256 MB) even at 20000+ entries.


Fig. 10. Percentage utilization of CPU, Memory and Bandwidth. (x-axis: Number of Names in Namespace (x10000); y-axis: Saturation (percentage); series: CPU, Bandwidth, Memory)

Fig. 11. Size of the namespace in mega-bytes. (x-axis: Number of Names in Namespace; y-axis: Size of Namespace in interpreter (MB); series: IRIS, INS)

C. Name discovery

We handle network changes with a slightly different philosophy: we consider changes the norm rather than the exception. When a resource can no longer be monitored by a resolver through its overlay, or it stops advertising itself with soft-state information, we do not propagate this to the entire network. Instead, only the resolver(s) that were monitoring the resource directly make a silent note of it. A request to a resolver for a resource is resolved by finding the closest such match to the user, unless the user explicitly asks otherwise. In effect, we try to keep network traffic as local as possible, in accordance with Lemma 3.

We plot the time to discover a resource (not a new name) as a function of the number of hops it is from the host (Fig. 12). We found that the time taken is dominated by message routing more than any other factor. The time to discover resources in the vicinity is the lowest, and such discovery is the most efficient in terms of time, memory, CPU and bandwidth utilization. However, even as the number of hops from the host increases, the time taken remains small. This is because inter-resolver routing is largely based on controlled flooding, which takes the least time compared to any routing protocol. This is evident since the time to route a packet over a cached route is very comparable to routing without a cache. We have reproduced the INS time to discover a new name alongside for a direct comparison.

There is an issue of the number of excess packets that might be generated as the number of hops from the host increases, though the time taken would always be minimal since the method is controlled flooding. However, we have found the average number of packets generated in a typical IRIS session to be far lower than the average generated with other routing protocols, because of a combination of the following reasons: (i) IRIS tries its best to direct a user to nearby resources, since it naturally tries to avoid flooding. This works well in two ways: the user gets a resource for immediate consumption, and excessive inter-resolver traffic is avoided. (ii) IRIS contacts neighbor resolvers incrementally: if a resource is not found immediately and the user lets the guidance loop, IRIS contacts its immediate neighbors first and reports results, and upon more calls for guidance, slowly propagates deeper into the network. In effect, even flooding the resolver network is done slowly and with extreme caution. (iii) Triggered updates typically use a large number and variety of control packets, which have to be transmitted to the entire network. Since this is avoided altogether, a large number of overhead packets is avoided.
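Point (ii) amounts to an expanding search over the resolver overlay, one hop further per call for guidance. The generator below is a sketch of that idea under assumed names; the adjacency map is hypothetical, and real IRIS resolvers would learn their neighbors from beacons rather than from a global table.

```python
def expanding_rings(start, neighbors_of):
    """Yield, hop by hop, the resolvers newly reached from `start`.

    Each next() call corresponds to one more round of user guidance, so the
    overlay is only explored as deeply as the user actually asks for.
    """
    visited = {start}
    frontier = [start]
    while frontier:
        next_frontier = []
        for resolver in frontier:
            for neighbor in neighbors_of.get(resolver, []):
                if neighbor not in visited:   # never contact a resolver twice
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        if next_frontier:
            yield next_frontier
        frontier = next_frontier


overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
           "D": ["B", "C", "E"], "E": ["D"]}
rings = expanding_rings("A", overlay)
print(next(rings))  # first guidance call: ['B', 'C']
print(next(rings))  # second guidance call: ['D']
```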

D. Quantifying user satisfaction

We collected opinions from 35 end users of IRIS to study user satisfaction. Subjects were asked to report a mean opinion score (MOS) in the range of 0 to 5, with 0 being the worst opinion and 5 the best. Particular attention is paid to the variation of this score with increased usage, to capture the effectiveness of IRIS in understanding its end user. For this experiment, we consider 18 printers scattered across two floors. The average MOS obtained at various usage levels is shown in Fig. 13. Users were able to add new words to the IRIS vocabulary with increased interactions, and we found that the MOS value shot up particularly when IRIS understood these new words. Satisfaction being a subjective metric, no two users reported a similar variation in MOS with increased usage. The plot nevertheless shows increased satisfaction with continued usage, establishing the effectiveness of subjectivity-based searches.

VII. CONCLUSION

Resource discovery is an important problem to solve, and with resources becoming deeply embedded in dense networked environments, revenue generation will come from advertising and discovering resources. We believe resource discovery with the user, the user's intention and the user's applications as the nucleus is an excellent way of approaching this. We fully support work like INS, and our design complements such approaches.

We have successfully designed, implemented, tested and deployed IRIS, an intention-based resource discovery mechanism that lets users freely sketch what they imagine, understands them, and evolves to understand more and more of the user with increased usage, to a point where IRIS itself disappears and the user's understood intentions always lead to the right type of resource. We have deployed IRIS for "printers in a building". We have achieved the following in our quest for such a design: precision, user satisfaction, context awareness, free-form English representation of queries, scalability and a stable design. We conclude this paper with the hope that this work spurs more research towards ubiquitous computing.

Fig. 12. Performance versus the number of hops between the host and the desired resource in question. (x-axis: Number of Hops; y-axis: Time (ms); series: IRIS (No Cache), IRIS (Cached), INS)

Fig. 13. Mean Opinion Score as reported by 35 subjects using IRIS to discover “printers” in a campus with varying levels of usage. A score of 0 denotes the worst opinion while 5 denotes the best opinion. (x-axis: IRIS usage (number of times); y-axis: Mean Opinion Score (MOS))

REFERENCES

[1] Infosys Technologies Limited, http://www.infosys.com.

[2] W. Adjie-Winoto, E. Schwartz, H. Balakrishnan and J. Lilley, “The design and implementation of an intentional naming system”, Proc. 17th ACM Symposium on Operating Systems Principles, pp. 186-201, Kiawah Island, SC, Dec 1999.

[3] Ahmed Helmy, Saurabh Garg, Nitin Nahata, Priyatham Pamu, “CARD: A Contact-based Architecture for Resource Discovery in Wireless Ad Hoc Networks”, MONET 10(1-2): 99-113, 2005.

[4] Ulas Kozat, Leandros Tassiulas, “Network Layer Support for Service Discovery in Mobile Ad Hoc Networks”, IEEE INFOCOM 2003.

[5] Karim Seada, Ahmed Helmy, “Rendezvous Regions: A Scalable Architecture for Service Location and Data-Centric Storage in Large-Scale Wireless Networks”, IPDPS 2004.

[6] D. Estrin, R. Govindan, J. Heidemann and S. Kumar, “Next Century Challenges: Scalable Coordination in Sensor Networks”, In Proc. ACM/IEEE MOBICOM, pp. 263-270, August 1999.

[7] J. Heidemann, F. Silva, C. Intanagonwiwat, R. Govindan, D. Estrin and D. Ganesan, “Building Efficient Wireless Sensor Networks with Low Level Naming”, In ACM SOSP’01, Banff, Canada, 2001.

[8] L. Cheng and I. Marsic, “Piecewise Network Awareness Service for Wireless/Mobile Pervasive Computing”, Mobile Networks and Applications (7), Kluwer Publications, pp. 269-278, 2002.

[9] S. Tilak, K. Chiu, N. B. Abu-Ghazaleh and T. Fountain, “Dynamic Resource Discovery for Wireless Sensor Networks”, Network Centric Ubiquitous Systems, 2005.

[10] W. Heinzelman, J. Kulik, and H. Balakrishnan, “Adaptive Protocols for Information Dissemination in Wireless Sensor Networks”, Proc. 5th ACM/IEEE Mobicom Conference, Seattle, WA, Aug 1999.

[11] Diego Doval and Donal O'Mahony, “Nom: Resource Location and Discovery for Ad Hoc Mobile Networks”, Proc. 1st Annual Mediterranean Ad Hoc Networking Workshop, Medhoc-Net, 2002.

[12] F. Sailhan and V. Issarny, “Scalable Service Discovery for MANET”, Proc. 3rd Intl Conf. on Pervasive Computing and Communications (PerCom’05), 2005.

[13] J. O'Toole and D. Gifford, “Names should mean what, not where”, Proc. 5th ACM European Workshop on Distributed Systems, September 1992. Paper No. 20.

[14] S. Czerwinski, B. Zhao, T. Hodes, A. Joseph, and R. Katz, “An Architecture for a Secure Service Discovery Service”, In Proc. ACM/IEEE MOBICOM, pp. 24-35, August 1999.

[15] V. Jacobson, How to Kill the Internet, Talk at the SIGCOMM 95 Middleware Workshop, available from http://www-nrg.ee.lbl.gov/nrg-talks.html, August 1995.

[16] Jini (TM). http://java.sun.com/products/jini/, 1998.

[17] T. Lehman, S. McLaughry, and P. Wyckoff, T Spaces: The Next Wave, http://www.almaden.ibm.com/cs/TSpaces/, 1998.

[18] G. R. Malan, F. Jahanian, and S. Subramanian, “Salamander: A Push-based Distribution Substrate for Internet Applications”, In Proc. USENIX Symposium on Internet Technologies and Systems, pp. 171-181, December 1997.

[19] Fred Stann and John Heidemann, “BARD: Bayesian-Assisted Resource Discovery in Sensor Networks”, USC/ISI Technical Report ISI-TR-2004-593, 2004.

[20] Universal Plug and Play: Background. http://www.upnp.com/resources/UPnPbkgnd.htm, 1999.

[21] P. V. Mockapetris and K. Dunlap, “Development of the Domain Name System”, Proc. of SIGCOMM 88, Stanford, CA, pp. 123-133, August 1988.

[22] B. Oki, M. Pfluegl, A. Siegel, and D. Skeen, “The Information Bus (R) - An Architecture for Extensible Distributed Systems”, In Proc. ACM SOSP, pp. 58-78, 1993.

[23] C. Perkins, Service Location Protocol White Paper, http://playground.sun.com/srvloc/slp white paper.html, May 1997.

[24] J. Saltzer, D. Reed, and D. Clark, “End-to-end Arguments in System Design”, ACM Transactions on Computer Systems, 2:277-288, Nov 1984.

[25] J. Veizades, E. Guttman, C. Perkins, and S. Kaplan, Service Location Protocol, June 1997. RFC 2165 (http://www.ietf.org/rfc/rfc2165.txt).

[26] Mukundan Venkataraman and Puneet Gupta, “Stack Aware Architectures for Mobile Ad Hoc Networks”, Internet Draft, Internet Engineering Task Force (IETF), May 2004.

[27] Mukundan Venkataraman and R. Bhakthavathsalam, “Proximate Runner Protocol for Routing in Mobile Ad hoc Networks”, ACM/IEEE Communication Networks and Distributed Systems (CNDS’04), San Diego, CA, January 2003.

[28] Puneet Gupta and Deependra Moitra, Evolving a Pervasive IT Infrastructure: A Technology Integration Approach, Personal and Ubiquitous Computing Journal, Vol 8(1), Feb 2004.

[29] C. Perkins, “Ad Hoc On Demand Distance Vector (AODV) Routing”, IETF, Internet Draft, draft-ietf-manet-aodv-00.txt, November 1997.

[30] D. Johnson, D. Maltz and J. Broch, “DSR: The Dynamic Source Routing Protocol for Multi-Hop Wireless Ad Hoc Networks”, in Ad Hoc Networking, edited by Charles E. Perkins, Chapter 5, pp. 139-172, Addison-Wesley, 2001.