Published in Proc. of Int. Conf. on Intelligent Robots and Systems (IROS) 2004

Core Technologies for Service Robotics

Niklas Karlsson, Mario E. Munich, Luis Goncalves, Jim Ostrowski, Enrico Di Bernardo, and Paolo Pirjanian

Evolution Robotics, Inc.
Pasadena, California, USA
Email: {niklas,mario,luis,jim,enrico,paolo}@evolution.com

Abstract— Service robotics products are becoming a reality. This paper describes three core technologies that enable the next generation of service robots. They are low-cost, make use of low-cost hardware, and enable a short time-to-market for product development. The first technology is an object recognition system, which can be used by the robot to interact with the environment. The second technology is a vision-based navigation system (vSLAM™), which can simultaneously build a map and localize the robot in the map. Finally, the third technology is a flexible and rich software platform (ERSP™) that assists developers in rapid design and prototyping of robotics applications.

I. INTRODUCTION

Products such as Sony’s entertainment robot Aibo™, Electrolux’s robotic vacuum cleaner Trilobite™, iRobot’s Roomba™, and prototypes from numerous robotic research laboratories across the world forecast a rapid development of service robots over the next several years. In fact, the momentum created by these early products, combined with recent advances in computing and sensor technology, could lead to the creation of a major industry for service robots.

The applications of service robotics range from existing products, where robotic technologies are incorporated to enhance and improve the product functionality, to new products, where robotic technologies enable completely new functionalities. One example of the first case is vacuum cleaners, where the enhancement is reflected in a never before seen degree of autonomy. An example of the second case is companionship robots. Other examples can be found in interactive entertainment and edutainment.

A necessary element to make the industry for service robots grow is to develop core robotic technologies that deliver flexibility and autonomy with low-cost hardware components. At Evolution Robotics (ER) we develop such technologies, and they are available as individual components and at low cost for the consumer market. This allows the robotics community to focus on value-added application development rather than on solving the core robotic problems, which in turn enables the industry to proliferate.

A major challenge in providing core technologies for service robots is that the technologies must be low-cost, yet provide rich and flexible features. In particular, the technologies should enable a short time-to-market. At the same time, they must provide the robots with “intelligent” features, meaning that they should give the robots the ability to interact with the environment and to operate somewhat autonomously.

For example, a robotic vacuum cleaner which retails at 199 USD can only cost about 60 USD in bill of materials. The 60 USD must cover all the hardware components, assembly costs, and typically packaging, manuals, and shipping. This constraint greatly limits the choice of sensors, actuators, and amount of computation that will be available for the autonomous capabilities of the product. Due to this cost constraint, most current robotic products which retail at a price acceptable to consumers provide very limited autonomous functionality. For example, robotic vacuum cleaning products such as Roomba and the like perform random navigation for floor coverage and tactile sensing for collision detection and avoidance.

However, random navigation for floor cleaning is not efficient, since it cannot guarantee complete coverage from wall to wall and from room to room at a reasonable power consumption. Random coverage also cannot guarantee that the robot will be able to return to a home location, e.g., a docking station for self-charging. Localization can enable a robot to perform complete coverage at a close-to-optimal power budget, as well as allow it to navigate from room to room to perform its cleaning task and finally return to its docking station for recharging. Combining cost constraints with requirements on product durability, reliability, and power consumption, and with expectations of autonomous capabilities, creates a major challenge in creating successful products.

This paper is organized as follows. Section II highlights some of the requirements that are placed on the technologies used for service robotics. Each requirement is thereafter addressed by proposing appropriate core technologies. Section III describes ER’s object recognition system. This object recognition system is one component in a novel navigation system, which is described in Section IV. A competitive set of core technologies for service robotics also requires a rich and flexible software architecture; Section V describes such an architecture. A few examples of product concepts enabled by the proposed core technologies are described in Section VI. Finally, Section VII offers some concluding remarks.

II. REQUIREMENTS

The service robotics market imposes a complex set of requirements on potential products: it is a consumer market, so robots should be inexpensive; the market is highly competitive, so products should have a short time-to-market. At the same time, robots need to provide services in an “intelligent” way, so they should be autonomous, adaptive, and able to interact with the environment. This paper describes a set of solutions for these requirements that have been developed at Evolution Robotics, Inc.

The price sensitivity of the market creates the need for the development of low-cost technologies. This is achieved first of all by basing the technologies primarily on hardware components that are relatively inexpensive, for example, cameras and IR sensors. Indeed, a web camera is two orders of magnitude less expensive than, e.g., a SICK™ laser range sensor.

In order for a robot to qualify as a service robot, it must also possess a high level of autonomy. It must be able to operate within its environment with a minimum amount of user interaction. This is achieved by providing a robust navigation system with mapping as well as localization capabilities.

Furthermore, to serve its purpose, a service robot must have the ability to interact with and make inferences about its environment. A strong object recognition system satisfies both these requirements. By extracting relevant information from vision data, the service robot can make conscious and intelligent decisions based on its interactions with the environment and the people in it.

To be competitive, the core technologies for service robotics must also offer a short time-to-market, in that the time from a conceptual idea to a final product must be minimized. This translates into a requirement on the software platform used for development: the software platform should provide an architecture that is rich and flexible. The architecture should be designed in a way that allows the developer to reconfigure the hardware without rewriting more than a very small amount of code.

Some of the above requirements also imply that the core technologies must enable adaptivity of the system. For example, the navigation system must be able to deal with a changing environment.

III. ER VISION

The ER Vision object recognition system offered by Evolution Robotics is versatile and works robustly with low-cost cameras. It addresses the greatest challenge for object recognition systems, which is the development of algorithms that reliably and efficiently perform recognition in realistic settings with inexpensive hardware and limited computing power.

The object recognition system can be used in applications such as navigation, manipulation, human-robot interaction, and security. It can be used in mobile robots to support navigation, localization, mapping, and visual servoing. It can also be used in machine vision applications for object identification and hand-eye coordination. Other applications include entertainment and education. The object recognition system provides a critical building block in applications such as reading children’s books aloud, or automatically identifying and retrieving information about a painting or sculpture in a museum.

The object recognition system specializes in recognizing planar, textured objects. However, three-dimensional objects composed of planar, textured structures, or composed of slightly curved components, are also reliably recognized by the system. Three-dimensional deformable objects, such as a human face, cannot be handled in a robust manner.

The object recognition system can be trained to recognize objects using a single, low-cost camera. Its main strength lies in its ability to provide reliable recognition in realistic environments where lighting can change dramatically. The following are the two steps involved in using Evolution Robotics’ object recognition algorithm:

Training: Training is accomplished by capturing one or more images of an object from various viewpoints. For planar objects, only the front and rear views are necessary. For 3-D objects, several views, covering all facets of the object, are necessary.

Recognition: The algorithm extracts up to 1,000 local, unique features for each object. A small subset of those features and their interrelation identifies the object. The object’s name and its full pose up to scale with respect to the camera are the results of recognition.

The object recognition algorithm is based on extracting salient features from an image of an object [6]. Each feature is uniquely described by the texture of a small window of pixels around it. The model of an object consists of the coordinates of all these features along with each feature’s texture description. When the algorithm attempts to recognize objects in a new image, it first finds features in the new image. It then tries to associate features in the new image with all the features in the database of models. This matching is based on the similarity of the feature texture.

If many features in the new image have good matches to the same database model, that potential object match is refined. The refinement process involves the computation of an affine transform between the new image and the database model, so that the relative position of the features is preserved through the transformation. The algorithm outputs all object matches for which the optimized affine transform results in a small root-mean-square pixel error between the features found in the new image and the corresponding affine-transformed features of the original model.
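
To make the match-then-refine pipeline concrete, the following is a minimal Python sketch of the general technique, not ER's implementation: the descriptor representation, the ratio-test threshold, and the least-squares affine fit are all our assumptions.

```python
import numpy as np

def match_features(query_desc, model_desc, ratio=0.8):
    """Associate query descriptors with model descriptors by texture
    similarity (Euclidean distance), keeping only unambiguous matches
    (a Lowe-style ratio test; the threshold is an assumption)."""
    matches = []
    for i, q in enumerate(query_desc):
        d = np.linalg.norm(model_desc - q, axis=1)
        j0, j1 = np.argsort(d)[:2]
        if d[j0] < ratio * d[j1]:          # best clearly beats second-best
            matches.append((i, j0))
    return matches

def fit_affine(img_pts, model_pts):
    """Least-squares 2x3 affine transform mapping model coordinates
    to image coordinates."""
    P = np.hstack([model_pts, np.ones((len(model_pts), 1))])  # homogeneous
    A, *_ = np.linalg.lstsq(P, img_pts, rcond=None)           # solves P A = img_pts
    return A.T                                                # shape (2, 3)

def rms_error(A, img_pts, model_pts):
    """Root-mean-square pixel error between matched image features and
    the affine-transformed model features."""
    P = np.hstack([model_pts, np.ones((len(model_pts), 1))])
    proj = P @ A.T
    return np.sqrt(np.mean(np.sum((img_pts - proj) ** 2, axis=1)))
```

An object is then reported when enough features agree on one model and the fitted transform’s RMS error is small, which is exactly the acceptance test described above.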

Figure 1 shows the main characteristics of the recognition algorithm. The first row displays the two objects to be recognized, and the other rows present recognition results under different conditions. The main characteristics of the object recognition algorithm are summarized in the following list:

a) Invariant to rotation and affine transformations: The object recognition system recognizes objects even if they are rotated upside down (rotation invariance) or placed at an angle with respect to the optical axis (affine invariance). See second and third row of Figure 1.

b) Invariant to changes in scale: Objects can be recognized at different distances from the camera, depending on the size of the objects and the camera resolution. The recognition works reliably from distances of several meters. See second and third row of Figure 1.

Fig. 1. Examples of object recognition under a variety of conditions. The first row presents the objects to be recognized; the other rows display recognition results.

c) Invariant to changes in lighting: The object recognition system can handle changes in illumination ranging from natural to artificial indoor lighting. The system is insensitive to artifacts caused by reflections or back-lighting. See fourth row of Figure 1.

d) Invariant to occlusions: The object recognition system can reliably recognize objects that are partially blocked by other objects, and objects that are only partially in the camera’s view. The amount of occlusion allowed is typically between 50% and 90%, depending on the object and the camera quality. See fifth row of Figure 1.

e) Reliable recognition: The object recognition system has an 80-95% success rate in uncontrolled settings. A 95-100% recognition rate is achieved in controlled settings.

The recognition speed is a logarithmic function of the number of objects in the database; i.e., the recognition speed is proportional to log(N), where N is the number of objects in the database. The object library can store hundreds or even thousands of objects without a significant increase in computational requirements. The recognition frame rate is proportional to CPU power and image resolution. For example, the recognition algorithm runs at 5 frames per second (fps) at an image resolution of 320x240 on an 850 MHz Pentium III processor, and at 3 fps at 80x66 on a 100 MHz MIPS-based 32-bit processor.
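
The paper does not say how the feature database is indexed, but the logarithmic scaling is consistent with a tree-based index over feature descriptors. The sketch below assumes a k-d tree; the database sizes, descriptor dimension, and voting scheme are all our assumptions, used only to show where the log(N) lookup cost comes from.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical database: 100 objects x 200 features x 128-d descriptors.
rng = np.random.default_rng(0)
n_objects, feats_per_obj, dim = 100, 200, 128
db_desc = rng.normal(size=(n_objects * feats_per_obj, dim)).astype(np.float32)
db_owner = np.repeat(np.arange(n_objects), feats_per_obj)

tree = cKDTree(db_desc)  # built once, offline

def recognize(query_desc):
    """Each query feature votes for the object owning its nearest
    database feature. Each tree lookup costs roughly O(log N), which is
    what keeps recognition speed ~ log(N). (In high dimensions, practical
    systems use approximate variants such as best-bin-first search.)"""
    _, idx = tree.query(query_desc, k=1)
    votes = np.bincount(db_owner[idx], minlength=n_objects)
    return int(np.argmax(votes)), votes

# With random data the winner is arbitrary; this only demonstrates the flow.
obj, _ = recognize(rng.normal(size=(500, dim)).astype(np.float32))
```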

Reducing the image resolution decreases the image quality and, ultimately, the recognition rate. However, the object recognition system allows for graceful degradation with decreasing image quality. Each object model requires about 40 KB of memory.

IV. VSLAM

Evolution Robotics’ Visual Simultaneous Localization and Mapping (vSLAM™) system is robust and requires no initial map. It is the core navigation functionality of the Evolution Robotics Software Platform (ERSP™), and enables simultaneous localization and mapping using a low-cost camera and wheel encoders as the only sensors. Because a web camera is used instead of a SICK laser range sensor or other expensive range measuring device, the vSLAM technology offers a dramatic cost reduction. With vSLAM, a service robot can operate in a previously unknown environment to build a map and to localize itself in this map.

Relying on visual measurements rather than range measurements when building maps allows vSLAM to deal with cluttered spaces easily. Range-based SLAM techniques, involving laser scanners and similar devices, often fail in cluttered environments. The vSLAM algorithm handles dynamic changes in the environment well, including lighting changes and moving objects and/or people, simply because it possesses such a wealth of information about the created landmarks.

Figure 2 illustrates the inputs and outputs of vSLAM. The inputs to the algorithm are dead reckoning information (for example, wheel encoder odometry) and images acquired by a camera. The main components of vSLAM are Visual Localization, SLAM, and the Landmark Database.

Fig. 2. Block diagram of vSLAM.

In the Visual Localization module, the object recognition system described in Section III is used to define new natural landmarks and to recognize previously defined ones. The landmarks are used for localization of the robot. In particular, when a previously defined landmark is recognized, a relative measurement of the robot pose is computed with respect to the pose of the robot when the landmark was first defined. Some related work is described in [6], [8], [11].

In the SLAM module, the relative measurements are fused with odometry data to update the robot pose and the poses of all the landmarks. For related work, see for example [3], [5], [7], [9], [10]. The SLAM algorithm is designed to track multiple hypotheses, and also to maximize robustness to dramatic disturbances such as kidnapping, the scenario where the robot is lifted and moved to a new location without notification.
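
The paper does not publish the filter equations. As a single-hypothesis illustration of the fusion step, the sketch below propagates the robot pose with odometry and corrects it with one visual measurement using an extended Kalman filter, treating the landmark pose as known for brevity (the actual system also estimates landmark poses and tracks multiple hypotheses).

```python
import numpy as np

def wrap(a):
    """Normalize an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def predict(x, P, u, Q):
    """Propagate pose x = [x, y, theta] and covariance P with an
    odometry increment u = [dx, dy, dtheta] given in the robot frame."""
    c, s = np.cos(x[2]), np.sin(x[2])
    x = np.array([x[0] + c * u[0] - s * u[1],
                  x[1] + s * u[0] + c * u[1],
                  wrap(x[2] + u[2])])
    F = np.array([[1, 0, -s * u[0] - c * u[1]],   # motion Jacobian
                  [0, 1,  c * u[0] - s * u[1]],
                  [0, 0,  1]])
    return x, F @ P @ F.T + Q

def update(x, P, z, landmark, R):
    """Fuse a visual measurement z: the current robot pose expressed in
    the frame of the robot pose at landmark creation ('landmark')."""
    lx, ly, lth = landmark
    c, s = np.cos(lth), np.sin(lth)
    h = np.array([ c * (x[0] - lx) + s * (x[1] - ly),   # predicted
                  -s * (x[0] - lx) + c * (x[1] - ly),   # measurement
                  wrap(x[2] - lth)])
    H = np.array([[ c, s, 0],
                  [-s, c, 0],
                  [ 0, 0, 1]])
    y = z - h
    y[2] = wrap(y[2])                   # innovation, angle wrapped
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    x[2] = wrap(x[2])
    return x, (np.eye(3) - K @ H) @ P
```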

In the Landmark Database module, information on the landmarks in the vSLAM map is stored and updated. For example, the unique identifiers are recorded, as well as estimates of the landmark poses and uncertainty measures of the estimates.

When entering a new area or environment, the robot can either load an existing map of that area, or immediately start creating a new map with a new set of landmarks.

During its operation, the robot will attempt to create new visual landmarks. A visual landmark is a collection of unique features extracted from an image. Each landmark is associated with an estimate of where the robot was located (the x, y, and heading coordinates) when the landmark was created. As the robot traverses the environment, it continues creating new landmarks and correcting its odometry as necessary, using these new landmarks and a corresponding map.

vSLAM is constantly improving the robot’s knowledge about its environment. For example, new landmarks are automatically created if the robot is mobile for too long without recognizing any known landmarks in its database. This enables the robot to constantly refine its understanding of its environment and increase its confidence in where it is located.

In order to understand the limitations of vSLAM, it is necessary to understand the different steps that take place in the vSLAM algorithm. These are:

1) Traverse the environment, collect odometry, and acquire an image.

2) Process the image to extract visual features.

3) If the number of visual features is sufficiently large and distinct, then:

a) Determine if the features correspond to a previously stored landmark.

b) If the features do correspond to a previously stored landmark, then:

i) Compute a visual measurement of where the robot is located now, relative to where it was located when the landmark was created.

ii) Based on the visual measurement, the current map, and the collected odometry, improve the localization of the robot.

iii) Based on the visual measurement, localization, and collected odometry, refine the estimated location of the landmarks.

c) If the features do not correspond to a previously stored landmark, then try to create a new landmark. In order to create a new landmark, some of the same visual features must be detected in three consecutive image frames.

i) Assign a unique landmark number to the features, and add this landmark to the previously created landmarks.

ii) Initialize the landmark pose based on odometry information.

4) Return to step 1.

Note that, if the environment does not contain a sufficient amount of visual features, then steps 3a through 3c will never be executed. Hence, no landmark will be created and no visual measurements will be computed. vSLAM will still produce pose estimates, but the estimates will be exactly what is obtained from wheel odometry. This scenario is rarely experienced, but is most likely to happen in environments that are free from furniture, decorations, and texture.
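
Expressed as code, one iteration of the loop above might look like the following sketch. The Landmark fields mirror the description in the text; robot, database, and slam_filter are placeholders for components the paper describes, and all method names and the feature-count threshold are our assumptions, not ERSP APIs.

```python
from dataclasses import dataclass

MIN_FEATURES = 20  # "sufficiently large" threshold; value is an assumption

@dataclass
class Landmark:
    ident: int      # unique landmark number (step 3c-i)
    features: list  # the visual features that define the landmark
    pose: tuple     # (x, y, theta), initialized from odometry (step 3c-ii)

def vslam_step(robot, database, slam_filter):
    """One pass through the enumerated vSLAM loop."""
    odometry = robot.collect_odometry()                 # step 1
    image = robot.acquire_image()
    slam_filter.predict(odometry)                       # dead reckoning
    features = robot.extract_features(image)            # step 2
    if len(features) < MIN_FEATURES:                    # step 3 gate: pose
        return slam_filter.pose()                       # falls back to odometry
    landmark = database.match(features)                 # step 3a
    if landmark is not None:                            # step 3b
        z = slam_filter.visual_measurement(features, landmark)  # 3b-i
        slam_filter.update(z, landmark)                 # 3b-ii: improve pose
        slam_filter.refine_landmarks()                  # 3b-iii
    elif database.seen_in_three_consecutive_frames(features):   # step 3c
        database.add(Landmark(ident=database.next_id(),
                              features=features,
                              pose=slam_filter.pose()))
    return slam_filter.pose()                           # step 4: caller loops
```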

The vSLAM algorithm is typically good at detecting and correcting for kidnapping. Note that dramatic slippage and collisions may be considered kidnapping events, and must be handled by any reliable navigation technology. vSLAM recovers quickly from kidnapping once the map has become dense with landmarks. Early in the mapping procedure, it is very important to avoid kidnapping and disturbances in general.

Figure 3 shows the result of vSLAM after the robot has traveled around in a typical two-bedroom apartment. The robot was driven along a reference path (this path is unknown to the SLAM algorithm). The vSLAM algorithm builds a map consisting of landmarks, marked with blue circles in the figure. The corrected robot path, which uses a combination of visual features and odometry, provides a robust and accurate position determination for the robot, as seen by the red path in the figure.

Fig. 3. Example result of SLAM using vSLAM. Red path (darker gray): vSLAM estimate of robot trajectory. Green path (lighter gray): odometry estimate of robot trajectory. Blue circles: vSLAM landmarks created during operations.

The green path (odometry only) is obviously incorrect, since, according to this path, the robot is traversing through walls and furniture. The red path (the vSLAM-corrected path), on the other hand, consistently follows the reference path.

Based on experiments in various typical home environments on different floor surfaces (carpet and hardwood floor), and using different image resolutions (Low: 320 x 280, High: 640 x 480), the following robot localization accuracy was achieved:

Floor     Image        Median       Std dev      Median        Std dev
surface   resolution   error (cm)   error (cm)   error (deg)   error (deg)
Carpet    Low          13.6         9.4          4.79          7.99
Carpet    High         12.9         10           5.97          7.03
HW        Low          12.2         7.5          5.28          6.62
HW        High         11.5         8.3          3.47          5.41

V. EVOLUTION ROBOTICS SOFTWARE ARCHITECTURE

The Evolution Robotics Software Platform provides basic components and tools for rapid development and prototyping of robotics applications. The software architecture, which is the infrastructure and one of the main components of ERSP, addresses the issue of short product development cycles (time-to-market). The design of ERSP is modular and allows applications to use only the required portions of the architecture.

Figure 4 shows a diagram of the software structure and the relationships among the software, operating system, and applications.

Fig. 4. ERSP structure and relation to application development.

There are five main blocks in the diagram. Three of them, Applications, OS & Drivers, and 3rd Party Software, correspond to components external to ERSP. The other two blocks correspond to subsets of ERSP: the core technology libraries (left-hand-side block) and the implementation libraries (center block). The core technology libraries consist of the following components:

• Architecture: The Architecture component provides a set of Application Programmer’s Interfaces (APIs) for integration of all the software components with each other, and with the robot hardware. The infrastructure consists of APIs to interact with the hardware, to build task-achieving modules that can make decisions and control the robot, to orchestrate the coordination and execution of these modules, and to control access to system resources. The three primary components of the Architecture are the Hardware Abstraction Layer (HAL), the Behavior Execution Layer (BEL), and the Task Execution Layer (TEL).

• Vision: The Vision APIs correspond to the object recognition algorithm described in Section III.

• Navigation: The Navigation APIs provide mechanisms for controlling movement of the robot. These APIs provide access to modules for mapping, localization (described in Section IV), exploration, path planning, obstacle avoidance, occupancy grid mapping, target following, and teleoperation control.

• Interaction: The Interaction APIs support building user interfaces for applications with graphical user interfaces, voice recognition, and speech synthesis.

The architecture is composed of three layers and corresponds to a mixed architecture in which the first two layers follow a behavior-based philosophy [1], [2] and the third layer incorporates a deliberative stage for planning and sequencing [4]. The implementation libraries use the interfaces defined by the architecture to implement the three layers of the architecture for particular hardware and purposes.

The first layer, HAL, provides interfaces to the hardware devices and low-level operating system (OS) dependencies. This assures portability of the Architecture and application programs to other robots and computing environments. At the lowest level, HAL interfaces with device drivers, which communicate with the hardware devices through a communication bus. The descriptions of the resources, devices, busses, their specifications, and the corresponding drivers are managed through configuration files based on the Extensible Markup Language (XML).

The second layer, BEL, provides infrastructure for the development of modular robotic competencies, known as behaviors, for achieving tasks with a tight feedback loop, such as following a trajectory, tracking a person, avoiding an object, etc. Behaviors are the basic, reusable building blocks on which robotic applications are built. BEL also provides techniques for coordination of the activities of behaviors, for conflict resolution, and for resource scheduling. Behaviors are organized into a behavior network to achieve more complex goals, such as avoiding obstacles while following a target. Behavior networks are executed synchronously at a rate set by the user.

BEL defines a common and uniform interface for all behaviors and the basic protocols for interaction among the behaviors, as well as the order of execution for each behavior in a behavior network. The user has the flexibility of selecting the communication ports and the data transference protocols between behaviors, as well as deciding the internal computation in the behavior. A typical behavior network would contain a number of sensory behaviors and a number of actuation behaviors. The sensory behaviors’ outputs are fed as inputs to the actuation behaviors, which then control the robot according to the sensory data. By analyzing the flow of behavior inputs and outputs, the Behavior Manager automatically orders the execution of behaviors in the network so that sensory behaviors execute before actuation behaviors. The coordination of behaviors is transparent to the user.
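
As an illustration of this execution model (all class, function, and driver names below are our inventions, not the actual ERSP BEL API), a behavior network reduces to a uniform behavior interface plus a topological ordering that runs sensory behaviors before the actuation behaviors that consume their outputs:

```python
from graphlib import TopologicalSorter

def read_ir_sensor():
    """Stub standing in for a HAL range-sensor driver call."""
    return 1.0  # meters

def drive(speed, turn):
    """Stub standing in for a HAL actuator driver call."""
    print(f"drive: speed={speed} turn={turn}")

class Behavior:
    """Uniform interface: named inputs in, named outputs out."""
    def __init__(self, name, inputs=()):
        self.name, self.inputs = name, tuple(inputs)
    def compute(self, data):
        raise NotImplementedError

class RangeSensor(Behavior):
    def compute(self, data):
        data[self.name] = read_ir_sensor()   # sensory behavior: produce data

class AvoidObstacles(Behavior):
    def compute(self, data):
        if data[self.inputs[0]] < 0.3:       # obstacle closer than 0.3 m
            drive(speed=0.0, turn=0.5)       # actuation behavior: consume data
        else:
            drive(speed=0.2, turn=0.0)

def run_cycle(behaviors):
    """One synchronous network cycle: order behaviors so that producers
    run before consumers (what the Behavior Manager automates), then
    execute each behavior once."""
    deps = {b.name: set(b.inputs) for b in behaviors}
    order = TopologicalSorter(deps).static_order()
    by_name = {b.name: b for b in behaviors}
    data = {}
    for name in order:
        if name in by_name:
            by_name[name].compute(data)

# ERSP runs the network repeatedly at a user-set rate; one cycle here.
# Note the declaration order does not matter; the sorter fixes it.
run_cycle([AvoidObstacles("avoid", inputs=("ir",)), RangeSensor("ir")])
```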

Finally, the third layer, TEL, provides infrastructure for developing goal-oriented tasks, along with mechanisms for the coordination of task executions. Tasks can run in sequence or in parallel. Execution of tasks can also be triggered by user-defined events. Events are conditions or predicates defined on values of variables within BEL or TEL. Complex events can be defined by logical expressions of basic events.

While behaviors are highly reactive, and are appropriate for creating robust control loops, tasks are a way to express higher-level execution knowledge and coordinate the actions of behaviors. Tasks can run asynchronously using event triggers or synchronously with other tasks. Time-critical modules such as obstacle avoidance are typically implemented in the BEL, while tasks implement behaviors that are not required to run at a fixed execution rate. Tight feedback loops for controlling the actions of the robot according to perceptual stimuli (presence of obstacles, detection of a person, etc.) are typically implemented in the BEL. Behaviors tend to be synchronous and highly data driven. TEL is more appropriate for dealing with complex control flow which depends on context and on conditions that can arise asynchronously. Tasks tend to be asynchronous and highly event-driven.
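
The event mechanism can be sketched in a few lines (again with invented names; the real TEL interface is not shown in the paper): events are predicates over BEL/TEL variables, complex events are logical combinations of basic ones, and tasks fire when their trigger condition holds.

```python
class Event:
    """An event is a predicate over system variables; complex events
    are logical expressions of basic ones."""
    def __init__(self, predicate):
        self.predicate = predicate
    def __and__(self, other):
        return Event(lambda v: self.predicate(v) and other.predicate(v))
    def __or__(self, other):
        return Event(lambda v: self.predicate(v) or other.predicate(v))

# Basic events over hypothetical BEL/TEL variables.
low_battery = Event(lambda v: v["battery"] < 0.2)
near_dock   = Event(lambda v: v["dock_distance"] < 1.0)
should_dock = low_battery & near_dock        # complex event

def dock_task(variables):
    print("docking for recharge")            # placeholder task body

def run_tasks(triggers, variables):
    """Fire each (asynchronous, event-driven) task whose trigger
    condition currently holds."""
    for event, task in triggers:
        if event.predicate(variables):
            task(variables)

run_tasks([(should_dock, dock_task)],
          {"battery": 0.1, "dock_distance": 0.5})
```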

VI. PRODUCT CONCEPTS

Product conceptualization is the step that takes a given technology and designs an application for it. This section describes a few product concepts enabled by the technologies described in this paper. The described concepts are:

• The use of object recognition for command and control of robots

• The use of vSLAM for efficient vacuuming by robotic vacuum cleaners

An important capability of a service robot is its ability to interact with people in a natural, friendly, and robust manner. Traditionally, speech-based interaction has been the natural choice in a variety of designs. However, the shortcomings of speech recognition in terms of recognition robustness and accuracy with respect to user location render this technology unreliable for robotic applications. Therefore, we propose the use of vision-based object recognition to communicate commands to the robot. Commands are issued by showing predefined cards to the robot. The described robustness of the algorithm makes it an excellent technology for reliable recognition of the cards.

The idea of commanding a robot with cards was initially prototyped with Evolution Robotics’ ER2™ robots. However, the idea is now productized in the latest version of Sony’s AIBO™ robotic dog, the ERS-7. This robot is equipped with a set of LEDs to provide visual feedback to the user. Thus, each time the robot recognizes a card, a visual display is shown and a sound is played.

A second important capability of a service robot is its ability to navigate in an unknown environment. For example, an autonomous vacuum cleaner must be able to estimate its own position in order to carry out an efficient sweeping of the floor. Due to the complexity of the problem, most autonomous vacuum cleaners today avoid dealing with the localization problem. Instead, they typically operate under some random walk strategy. This is necessarily an inefficient way of doing coverage, since many spots will be covered multiple times before the last spot is covered. Also, there is no mechanism to guarantee that all of the environment has been covered at all.
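
The inefficiency of random coverage is easy to illustrate in simulation. The sketch below is our own toy experiment, not one from the paper: it counts how many moves a random walk needs before every cell of a small grid has been visited.

```python
import random

def random_walk_cover_time(width=10, height=10, seed=1):
    """Count moves until a random walk has visited every grid cell."""
    random.seed(seed)
    x, y = 0, 0
    visited = {(x, y)}
    moves = 0
    while len(visited) < width * height:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = min(max(x + dx, 0), width - 1)   # walls: clamp to the grid
        y = min(max(y + dy, 0), height - 1)
        visited.add((x, y))
        moves += 1
    return moves

# On a 10x10 grid (100 cells) the walk typically needs on the order of
# a thousand moves to reach the last cell, i.e. the average cell is
# revisited many times before coverage is complete.
print(random_walk_cover_time())
```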

The use of vSLAM to do mapping and localization in support of efficient sweeping has been prototyped on various robot platforms at Evolution Robotics, and has been demonstrated to dramatically improve coverage and efficiency. Experiments have shown that vSLAM offers an increased degree of autonomy and “intelligence” to vacuum cleaners and service robots in general.

VII. CONCLUSIONS

We have described three core technologies developed by Evolution Robotics that we believe constitute key components in the next generation of service robots. They succeed in satisfying the typical requirements on service robotics: low cost, short time-to-market, and the reliability to perform in real-world settings.

The basic requirement is that these technologies have to be low cost to be employed in service robots. At the same time, these technologies must enable the products to have a short time-to-market and provide the products with “intelligent” features.

The proposed technologies have been developed at Evolution Robotics. The cost requirement is met by leveraging low-cost hardware components such as cameras and IR sensors.

• ER Vision: The object recognition system gives the robot the ability to interact with and make inferences about its environment.

• vSLAM: The vision-based navigation system makes it possible for a robot to simultaneously build a map and localize itself in the map. In particular, the vSLAM technology does not require an initial map, but builds a map from scratch.

• ERSP: The proposed software architecture addresses the issue of short product development cycles. The Evolution Robotics Software Platform provides basic components and tools for rapid development and prototyping of robotics applications. The design of ERSP is modular and allows applications to use only the required portions of the architecture. It is designed in a way that allows the developer to reconfigure the hardware without rewriting more than a very small amount of code.

REFERENCES

[1] Arkin, R. C., Behavior-Based Robotics, MIT Press, 1998.

[2] Brooks, R. A., A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation, March 1986.

[3] Fox, D., Burgard, W., and Thrun, S., Markov localization for mobile robots in dynamic environments, Journal of Artificial Intelligence Research 11, pp. 391-427, 1999.

[4] Gat, E., On Three-Layer Architectures, in D. Kortenkamp et al., eds., AI and Mobile Robots, AAAI Press, 1998.

[5] Gutmann, J.-S. and Konolige, K., Incremental Mapping of Large Cyclic Environments, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), San Francisco, CA, 2000.

[6] Lowe, D., Local feature view clustering for 3D object recognition, Proc. of the 2001 IEEE Int. Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, USA, December 2001.

[7] Roumeliotis, S. I. and Bekey, G. A., Bayesian estimation and Kalman filtering: A unified framework for mobile robot localization, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 2985-2992, San Francisco, CA, 2000.

[8] Se, S., Lowe, D., and Little, J., Local and global localization for mobile robots using visual landmarks, Proc. of the 2001 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Maui, Hawaii, USA, October 2001.

[9] Thrun, S., Fox, D., and Burgard, W., A probabilistic approach to concurrent mapping and localization for mobile robots, Machine Learning, 31:29-53, 1998.

[10] Thrun, S., Robotic Mapping: A Survey, Technical Report CMU-CS-02-111, Carnegie Mellon University, Pittsburgh, PA, USA, February 2002.

[11] Wolf, J., Burgard, W., and Burkhardt, H., Robust vision-based localization for mobile robots using an image retrieval system based on invariant features, Proceedings of the 2002 IEEE Int. Conf. on Robotics and Automation, Washington, DC, USA, May 2002.