
ARTICLE

International Journal of Advanced Robotic Systems

Robotic Exploration: New Heuristic Backtracking Algorithm, Performance Evaluation and Complexity Metric

Regular Paper

Haitham El-Hussieny1,3*, Samy F.M. Assal1,4 and Mohamed Abdellatif2

1 Mechatronics and Robotics Engineering Department, School of Innovative Design Engineering, Egypt-Japan University of Science and Technology, Egypt

2 Mechanical Engineering Department, Faculty of Engineering and Technology, Future University, Egypt

3 Electrical Engineering Department, Faculty of Engineering, Benha University, Egypt

4 Department of Production Engineering and Mechanical Design, Faculty of Engineering, Tanta University, Egypt (on leave)

* Corresponding author(s) E-mail: [email protected]

Received 1 April 2014; Accepted 27 October 2014

DOI: 10.5772/60043

© 2015 The Author(s). Licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Mobile robots have been used to explore novel environments and build useful maps for navigation. Although sensor-based random tree techniques have been used extensively for exploration, they are not efficient for time-critical applications since the robot may visit the same place more than once during backtracking. In this paper, a novel, simple yet effective heuristic backtracking algorithm is proposed to reduce the exploration time and distance travelled. The new algorithm is based on the selection of the most informative node to approach during backtracking. A new environmental complexity metric is developed to evaluate the exploration complexity of different structured environments and thus enable a fair comparison between exploration techniques. An evaluation index is also developed to encapsulate the total performance of an exploration technique in a single number for the comparison of techniques. The developed backtracking algorithm is tested through computer simulations for several structured environments to verify its effectiveness using the developed complexity metric and the evaluation index. The results confirmed significant performance improvement using the proposed algorithm. The new evaluation index is also shown to be representative of the performance and to facilitate comparisons.

Keywords Robot exploration, sensor-based random tree technique, backtracking, complexity metric, evaluation index

1. Introduction

There is an increasing need for autonomous robots in hazardous environments, such as disaster sites and nuclear plants, as well as in inaccessible areas such as volcanoes and for space missions. Exploration is essential for robots that are required to move autonomously in novel environments. Therefore, developing efficient strategies for exploration is both interesting and important. Map information is important for path planning and task execution since the availability of a map increases the speed with which the robot can reach areas of interest in the environment. It is important to define the goal and evaluation criteria to judge the exploration performance. Intuitively, the objective of exploration is to gain the maximum amount of accurate information about the environment, represented by the explored space completeness, in the shortest time and over the minimum distance travelled.

Int J Adv Robot Syst, 2015, 12:33 | doi: 10.5772/60043

1.1 Robot exploration algorithms and backtracking techniques

Exploration is usually made using a greedy strategy that plans one step ahead by determining the next-best locations, called 'frontiers', which maximize the acquired information. One of the most popular frontier-based exploration techniques was developed in [1], in which the frontiers are defined as the boundaries between free and unexplored areas. Approaching those frontiers enables the acquisition of more information about the unknown environment. Frontier-based exploration [1] is common to almost all exploration techniques and, depending upon the frontier selection mechanism, the existing techniques can be broadly classified into three categories, namely: optimal-frontier, behaviour-based and randomized motion techniques.

In optimal-frontier exploration, the next frontier is selected based upon a cost function. In [1], this function was selected to be the shortest distance required to reach each frontier. The cost function may take other forms. For example, in [2], two criteria were considered in the evaluation, namely: the travelling cost required to reach a frontier and the expected information gain when performing a sensing action at that frontier. The cost function was controlled by three parameters: the distance cost, the expected utility and localizability (the latter of which is defined by the suitability of a frontier to enhance the robot localization when reaching that frontier, as described in [3]).
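The frontier definition and the distance-based selection described above can be illustrated with a minimal sketch. It is not the implementation of [1]: the cell labels, the function names and the use of straight-line distance in place of the travelled path length are all assumptions made here for illustration.

```python
import math

# Hypothetical cell labels for a small occupancy grid (assumed, not from the paper).
UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def frontier_cells(grid):
    """Return (row, col) of every free cell with at least one unknown 4-neighbour."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def nearest_frontier(grid, robot):
    """Pick the frontier cell closest to the robot (Euclidean distance as a stand-in cost)."""
    cells = frontier_cells(grid)
    if not cells:
        return None  # nothing left to explore
    return min(cells, key=lambda cell: math.dist(cell, robot))


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
target = nearest_frontier(grid, robot=(2, 0))
```

A richer cost function, such as the one in [2], would replace the `key` function with a weighted combination of travel cost and expected gain.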

Optimal frontier strategies have two common problems. First, due to continuous map updates, the currently approached region might become fully explored during navigation before reaching the destination frontier. In this case, the robot will start to explore the next unknown region. This problem occurs when using sensors with a wide perceptual range. Second, the optimal frontier can lie outside the room being explored. This situation may cause the robot to explore the same room twice, travelling unnecessarily long distances. Repetitive re-checking of the frontier during navigation and segmentation of the partially built map were suggested in [4] to solve the first and the second problem respectively. The segmentation process separates rooms from each other, causing the robot to favour visiting frontiers lying inside the currently explored room. However, although such a solution works well in office-like environments, open environments cannot be reasonably segmented, which may impair performance.

The exploration process could be decomposed into smaller, simple, reactive tasks, which leads to the use of behaviour-based exploration approaches [5-11]. The exploration task is divided into simple simultaneous actions. Such actions or behaviours involve repulsive and attractive forces, such as avoiding obstacles and reaching targets, respectively. For instance, in [6] a behaviour-based approach combined three reactive behaviours for exploration, namely: “reach frontiers”, “avoid obstacles” and “avoid other robots”. Another example, in which a simple wall-following behaviour served as the reactive model that led exploration, was proposed in [7]. In [8], repulsive behaviour from previously visited areas was used for exploration. In [9, 10], a complex behaviour architecture was proposed in which several weighted behaviours were fused together for efficient exploration. These behaviours can be generally formulated as: “go to frontier”, “go to unexplored areas”, “avoid other robots” and “avoid obstacles”. However, reactive systems do not perform well in large and complex environments [11]. In such environments, the forces combining those behaviours could cancel each other out in a certain region, causing local minima that trap the robot at a certain point. The local minima problem is common in behaviour-based techniques, not only for exploration but also for normal navigation. Such systems therefore require a local minima detection and recovery mechanism to avoid this problem.

In randomized motion planning [12-14], robots are directed to acquire more information through random steps. Randomized increments of a data structure called 'sensor-based random trees' (SRTs) were generated in [13]. This tree represents the roadmap of the explored area with an associated safe region that describes obstacle-free regions around the robot depending upon its sensor aspects. The nodes of this tree are the explored locations. This basic SRT strategy was later modified to enhance the process. For instance, in frontier-based SRT (FB-SRT) [14], the random selection of target points was biased towards local frontier arcs in the current safe region. This improves efficiency in terms of shorter travelled paths and greater area coverage. Furthermore, in [15] and [16] a sensor-based exploration tree (SET) was constructed for depth-first search exploration. The target configurations were selected to maximize the estimated information utility in forward mode only by calculating the expected utility along the local frontier boundary. This helps the robot to filter out any useless actions that might be executed. In these strategies (SRT, FB-SRT and SET), if there are no more areas to explore, the robot goes back through the previous nodes to find new regions and explore again. This backtrack strategy causes long exploration distances and times, especially in places with wide open spaces.

Several approaches were proposed to solve this backtracking problem. For instance, bridges were added in [17] to the exploration tree so that the robot could plan quick paths without looping through all the previous nodes. A bridge was added between any two adjacent nodes with a common safe region between them and separated by a distance greater than double the sensor range. The drawback of this improvement is the possibility that other nodes might be worth reaching despite having no shared safe region with the currently visited node. Ignoring such a node will reduce the area covered by the robot. In addition, in [18] a short path was planned between the current node, which offers no additional information, and the initial node, assuming that no information will be gathered from the nodes between them. This hypothesis works well in corridor environments rather than wide spaces and office-like environments.

1.2 Performance metrics for robotic exploration

Although robots are designed bearing in mind their working environment, there is no efficient way to compare robot performances in different places. One way to do this is to run a series of simulations of robots working in different structures. However, this is time-consuming and requires much effort. Several complexity metrics (CMs) have been introduced [19-22], namely: space syntax, entropy, and obstacle distribution and compression. Those metrics are suitable for path planning rather than exploration. The space syntax method [19], which is concerned with the connectivity of environmental features rather than distances, uses a labour-intensive axial map to measure environmental complexity. Thus, it is not suitable for exploration. Entropy is regarded as a measure of uncertainty in the working environment [21, 22]. The more free spaces there are, the more decisions the robot should make to reach its target and the more time will be taken. Hence, entropy is a measure of the environmental complexity if the robot wishes to reach a particular goal. For exploration, there is no inaccurate decision, since any given decision will have benefits for exploration. Additionally, the sensor details are not considered in the entropy computations. Therefore, entropy is not an adequate measure of environmental complexity from the exploration point of view. Since the distribution of obstacles in an environment affects the navigation task's complexity, the environment is identified by a unique factor called 'the compression factor' [22]. This factor measures the repeatability of patterns of obstacles in a certain environment, and hence the environment's complexity. In other words, if the obstacles have repeated patterns and, in turn, can be easily described, then this environment has a low CM, and vice versa. Even though this factor considers the sensor properties, it is computationally expensive and not suitable for measuring the environmental complexity for the exploring robot. Therefore, to the best of our knowledge, there is no complexity measure in the published literature that is suitable from the exploration point of view.

Although sensor-based exploration enjoys simplicity and completeness, it can take a long time and involve a greater distance, mainly due to its backtracking strategy [18]. If there are no frontier areas during exploration, the robot must go back to previous locations until a frontier area exists. In some situations, the robot can travel a significant distance until it reaches a position which has frontier regions. In this paper, which extends our preliminary work in [30], a novel, simple yet effective backtracking technique based on a heuristic algorithm is proposed. This new algorithm is based on the selection of the most informative node to approach directly rather than travelling across all previous nodes in order. The most informative node is determined using the ray-casting algorithm. The proposed technique can be regarded as a combination of the optimal frontier and randomized motion planning strategies. Random exploration planning is applied in forward mode, while the optimal node is approached in backward mode. The new technique is suitable for time-critical applications, such as rescue applications. Another contribution of this paper is in devising a new evaluation index EI for exploration that is useful for comparing different techniques. The EI encapsulates the exploration efforts, distance and time, the percentage of the area covered, completeness and the number of nodes in a single number to avoid any trade-off among those metrics. An efficient environmental CM is also proposed to evaluate the degree of complexity of an environment and to compare techniques working in different structured environments.

This paper is organized as follows: In Section 2, the basic SRT exploration strategy is briefly outlined. In Section 3, the new exploration technique using the proposed heuristic backtracking algorithm is described. The proposed environmental CM and EI are introduced in Section 4. Simulations running on different exploration scenarios and a comparison with the basic sensor-based technique are presented in Section 5. Finally, conclusions are drawn in Section 6.

2. Sensor-based random tree exploration

The SRT exploration technique [13] is based on a random selection of robot configurations $q = [x \; y \; \theta]^T$ inside the local safe region (LSR), where x and y represent the position of the robot and θ represents the robot orientation with respect to the local coordinate frame. The LSR represents the free space around the robot in the current configuration qcurr, where its shape depends upon the sensor characteristics as described in [23]. A road map of the visited configurations, with the associated LSRs, is represented by an incremental data structure called the 'SRT'. Each node in the tree T represents an explored location. Pseudo-code of the SRT technique is shown in Algorithm 1. The algorithm is repeated Kmax times, where Kmax is the maximum number of nodes assumed to be found in the environment. The algorithm starts at qcurr by acquiring the sensor measurements through the PERCEPTION function. The LSR, denoted by S(qcurr), and qcurr are added to T. Next, a random angle θrand is generated to select the direction of the path that the robot will travel. The length of this path is the radius r of the LSR in the direction θrand, which is obtained by the RAY function. According to this random direction and radius r, a candidate configuration qcand is obtained inside S through the function DISPLACE. Afterwards, qcand is tested against two conditions. Firstly, it must be farther than a given threshold distance dmin from qcurr. Secondly, it should not belong to the LSR of any other node in the exploration tree T, as shown in Fig. 1. This search for a valid candidate node is done by the VALID function and is repeated until a valid node is found or the maximum number of searches, Imax, has been reached. Note that setting the SRT radius multiplier constant α ≤ 1 guarantees that qcand is within the safe region, and hence collision avoidance is not needed. If no configuration satisfies these conditions, backtracking (the homing step) starts, letting the robot travel back along previous nodes to find a new unexplored region.

Figure 1. Validation of different candidate configurations in SRT: qcand3 is accepted, while qcand1 and qcand2 are not.

Algorithm 1: Pseudo-code for the basic SRT algorithm.

Build_SRT
Input: (qinit, Kmax, Imax, α, dmin)
Output: Roadmap tree T

qcurr = qinit;
for k = 1 → Kmax
    S(qcurr) ← PERCEPTION(qcurr);
    ADD(T, (qcurr, S(qcurr)));
    i ← 0;
    repeat
        θrand ← RANDOM_DIR;
        r ← RAY(S(qcurr), θrand);
        qcand ← DISPLACE(qcurr, θrand, α.r);
        i ← i + 1;
    until (VALID(qcand, dmin, T) or i = Imax)
    if VALID(qcand, dmin, T)
        MOVE_TO(qcand);
        qcurr ← qcand;
    else
        MOVE_TO(qcurr.parent);
        qcurr ← qcurr.parent;
end loop
return T;
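The candidate search inside the repeat loop of Algorithm 1 can be sketched in Python as follows. This is only an illustration under simplifying assumptions that are not in the paper: each LSR is approximated by a disc around its node, the RAY function is passed in as a plain callable, and the names `displace`, `valid` and `pick_candidate` are hypothetical.

```python
import math
import random


def displace(q, theta, step):
    """Move a 2D point q by `step` along direction `theta` (DISPLACE in Algorithm 1)."""
    x, y = q
    return (x + step * math.cos(theta), y + step * math.sin(theta))


def valid(q_cand, q_curr, d_min, other_nodes):
    """Check the two SRT conditions.

    other_nodes is a list of ((x, y), radius) pairs for the other tree nodes;
    modelling each LSR as a disc is an assumption made for this sketch.
    """
    if math.dist(q_cand, q_curr) <= d_min:
        return False  # too close to the current node
    return all(math.dist(q_cand, centre) > radius for centre, radius in other_nodes)


def pick_candidate(q_curr, ray, other_nodes, alpha, d_min, i_max, rng):
    """Try up to i_max random directions; return a valid candidate or None."""
    for _ in range(i_max):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        q_cand = displace(q_curr, theta, alpha * ray(theta))
        if valid(q_cand, q_curr, d_min, other_nodes):
            return q_cand
    return None  # no valid candidate: the caller should backtrack


rng = random.Random(0)
q_next = pick_candidate((0.0, 0.0), ray=lambda theta: 2.0, other_nodes=[],
                        alpha=0.9, d_min=0.5, i_max=10, rng=rng)
```

Returning `None` corresponds to the branch of Algorithm 1 in which the robot moves back to the parent node.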


Algorithm 2: Pseudo-code for the proposed heuristic backtracking algorithm.

Enhanced_SRT
Input: (qinit, Kmax, Imax, α, dmin, Gthresh)
Output: most informative node qtest, roadmap tree T

qcurr = qinit;
for k = 1 → Kmax
    S(qcurr) ← PERCEPTION(qcurr);
    ADD(T, (qcurr, S(qcurr)));
    i ← 0;
    repeat
        θrand ← RANDOM_DIR;
        r ← RAY(S(qcurr), θrand);
        qcand ← DISPLACE(qcurr, θrand, α.r);
        i ← i + 1;
    until (VALID(qcand, dmin, T) or i = Imax)
    if VALID(qcand, dmin, T)
        MOVE_TO(qcand);
        qcurr ← qcand;
    % Modifications done to the backtrack:
    else
        % prepare the parent node to be tested
        qcurr ← qcurr.parent;
        repeat
            qtest = qcurr;
            % calculate the information gain for qtest
            G = GET_GAIN(qtest);
            qcurr ← qcurr.parent;
            % exit if the tested node is valuable
            % or the configurations tree is empty
        until (G ≥ Gthresh or qcurr.parent = NULL)
        % if the tested node is valid
        if (G ≥ Gthresh)
            % plan a shortest path to reach
            APPROACH(qtest);
            qcurr ← qtest;
        else
            % stop at the current node for homing
            qcurr ← qcurr;
        end
end loop
return T;


3. A new heuristic backtracking algorithm

The following are the explicit assumptions for the new exploration approach:

1. Robot localization is provided by a separate module.

2. The exploration environment is planar, i.e., $\mathbb{R}^2$, due to the nature of the planar range sensor used.

3. The exploration environment is static, i.e., it consists of unchanging surroundings in which the robot explores.

4. The robot is holonomic, i.e., it can turn in any direction.

In SRT, when there are no more valid configurations to reach, the robot moves back to the parent node of qcurr, searching for new candidate locations to explore. This backtracking step takes time that is not justified. In the proposed approach, forward-mode exploration is performed in a similar fashion to the basic SRT method, as shown in Algorithm 2. In backtracking mode, when there are no valid configurations to reach, the tree data structure that has been built is tested in reverse order: starting from the current node, the parent nodes, qtest, are checked for whether they can provide more information. A gain G(qtest) is calculated by the function GET_GAIN, which is based on the ray-casting algorithm described in Section 3.2. This gain, measured in terms of free cells, indicates how worthwhile a given node is to visit for further exploration. If the estimated gain of the tested node reaches a given threshold Gthresh, the node is selected as a valuable node. In other words, this node is considered to be the most informative node. Next, the shortest path to this selected node is planned using A* through the APPROACH(qtest) function. If no more valuable nodes are found, the current node is identified as a home node, at which no further moves are made and exploration is considered complete. This is the difference between the proposed algorithm and the basic SRT algorithm: the homing process in the proposed algorithm does not require moving the robot to its starting node as in basic SRT. Instead, homing may take place at any node whenever there are no more valuable nodes to visit.
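The backward search described above can be sketched as a walk up the tree. The sketch below is a simplified reading of the backtracking branch of Algorithm 2, not the authors' code: the `Node` class and `select_backtrack_target` are hypothetical names, the gain estimator is passed in as a callable standing in for GET_GAIN, and every ancestor up to and including the root is tested, which may differ in detail from the loop termination in the listing.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """A tree node holding only what this sketch needs (assumed structure)."""
    name: str
    parent: Optional["Node"] = None


def select_backtrack_target(q_curr: Node,
                            get_gain: Callable[[Node], float],
                            g_thresh: float) -> Optional[Node]:
    """Return the first ancestor of q_curr whose estimated gain is at least g_thresh.

    Returns None when no ancestor qualifies, in which case the robot stays
    where it is and exploration is considered complete (homing).
    """
    q_test = q_curr.parent
    while q_test is not None:
        if get_gain(q_test) >= g_thresh:
            return q_test  # most informative node in the sense of Algorithm 2
        q_test = q_test.parent
    return None


# Toy tree root <- a <- b <- c with made-up gain values (free cells).
root = Node("root")
a = Node("a", root)
b = Node("b", a)
c = Node("c", b)
gains = {"root": 2, "a": 12, "b": 3}

target = select_backtrack_target(c, lambda n: gains[n.name], g_thresh=10)
```

In a full system the returned node would be handed to a path planner such as A*, as APPROACH does in Algorithm 2, instead of retracing every intermediate node.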

The basic idea for enhancing backtracking in SRT is shown schematically in Fig. 2. In the basic SRT strategy, as shown in Fig. 2(a), when there are no new areas the robot backtracks through all the previous nodes until it exits the currently explored room. Using the proposed approach, the robot instead searches for the most informative node, as shown in Fig. 2(b), where the starting node is assumed to be that node. A shortest path is then planned, with the help of the built LSRs, to approach this node of interest, saving both distance and time.

3.1 Map building

A spatial representation of the unknown environment is required for two tasks: environment monitoring (such as in surveillance applications) and helping the robot to navigate the environment in backtracking mode. In this paper, an occupancy grid map is used for this representation. The environment is divided into small grid cells, each containing a value representing the probability of being occupied by obstacles. It is necessary to know for each cell whether it is unknown, free or an obstacle. Initially, the whole map is assumed to be unknown. Given a range scan and the robot pose, the grid cells within the sensor range are updated as follows. Firstly, scan readings are converted into Cartesian coordinates of the occupancy grid map, creating a polygon of points. Secondly, this polygon is identified as the LSR and filled using the flood-fill algorithm. Thirdly, obstacle cells are identified by the sensor readings that are lower than the maximum sensor range Rmax. The process is shown in Fig. 3.
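The three update steps can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: cells carry discrete labels instead of probabilities, the scan is assumed to cover 360 degrees with evenly spaced rays, poses and ranges are expressed in cell units, and an even-odd point-in-polygon test is substituted for the flood fill to keep the example short. All names are hypothetical.

```python
import math

# Hypothetical discrete labels; the paper stores occupancy probabilities.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def inside(px, py, poly):
    """Even-odd rule point-in-polygon test (stand-in for the flood fill)."""
    hit = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                hit = not hit
    return hit


def update_grid(grid, pose, ranges, r_max):
    """Update grid[row][col] in place from one scan taken at pose = (x, y)."""
    x0, y0 = pose
    step = 2.0 * math.pi / len(ranges)
    # Step 1: convert polar readings to Cartesian points forming the LSR polygon.
    poly = [(x0 + min(r, r_max) * math.cos(i * step),
             y0 + min(r, r_max) * math.sin(i * step))
            for i, r in enumerate(ranges)]
    # Step 2: mark cells inside the polygon as free, keeping known obstacles.
    for row in range(len(grid)):
        for col in range(len(grid[0])):
            if grid[row][col] != OCCUPIED and inside(col, row, poly):
                grid[row][col] = FREE
    # Step 3: readings shorter than r_max end on an obstacle cell.
    for (px, py), r in zip(poly, ranges):
        col, row = round(px), round(py)
        if r < r_max and 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            grid[row][col] = OCCUPIED
    return grid


grid = [[UNKNOWN] * 11 for _ in range(11)]
update_grid(grid, pose=(5, 5), ranges=[3, 3, 3, 3], r_max=4)
```

With only four rays the filled region is a coarse diamond; a real scanner with hundreds of rays would trace the free space much more closely.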

3.2 The most informative node

In the proposed approach, when there is no valid configuration to reach, the robot selects the most informative node among the previous nodes. This informative node is expected to have information gain (in terms of free cells) greater than the threshold Gthresh, which depends upon the maximum range of the sensor. In other words, a large sensor range means more regions to explore and, hence, a large threshold is required. This threshold is selected heuristically to achieve a compromise between exploration completeness and the total distance travelled. The higher the threshold, the shorter the exploration time and the lower the level of exploration completeness. In fact, the estimation of the information gain that could be obtained at any point is difficult. The actual gain is hard to predict, as it varies according to the structure of the corresponding


Figure 2. A sketch showing a robot exploring a room using: a) the basic SRT strategy, and b) the proposed approach, where a shortest path, in green, is planned to the most informative node

a)

b)

Figure 2. A sketch showing a robot exploring a room using a)basic SRT strategy and b) the proposed approach, where a shortestpath, in green, is planned to the most informative node.

shown in Fig. 2.a, when there are no new areas, therobot backtracks all the previous nodes till exit from thecurrently explored room. While in the proposed approach,the robot searches for the most informative node as shownin Fig. 2.b in which the starting node is assumed to be thatone. A shortest path, with the help of the built LSRs, is thenplanned to approach this node of interest, saving distanceand time.

3.1. Map Building

A spatial representation for the unknown environmentis required for two tasks, environment monitoring suchas in surveillance applications, and helping the robot tonavigate the environment in the backtracking mode. Inthis paper, the occupancy grid-based map is used for thisrepresentation. The environment is divided into smallgrids, each containing a value representing the probabilityof being occupied by obstacles. It is necessary to knowfor each cell whether it is unknown, free, or an obstacle.Initially, the map is assumed to be unknown. Givena range scan and the robot pose, the occupancy gridswithin the sensor range are updated as follows. Firstly,scan readings are converted into Cartesian coordinates ofthe occupancy grid map, creating a polygon of points.Secondly, this polygon is identified as the LSR and filledusing flood-fill algorithm. Thirdly, obstacle cells areidentified by the sensor readings that are smaller than themaximum sensor range Rmax. The process is shown in Fig.3.

Figure 3. Building occupancy grid map using scanner range data.

3.2. The most informative node

In the proposed approach, when no valid configuration remains to be reached, the robot selects the most informative node among the previous nodes. This informative node is expected to have an information gain, in terms of free cells, greater than the threshold Gthresh, which depends on the maximum range of the sensor. In other words, a large sensor range means more regions to explore, and hence a large threshold is required. This threshold is selected heuristically as a compromise between the exploration completeness and the total distance travelled: the higher the threshold, the shorter the exploration time and the lower the exploration completeness.

In fact, estimating the information gain that could be obtained at any point is difficult. The actual gain is hard to predict, as it varies according to the structure of the corresponding region. In [11], this gain was calculated by counting the number of unknown cells lying in a particular region surrounded by the maximum sensor range. This method did not guarantee a correct estimation, since some unreachable and unknown regions could be counted. In [4], the information gain was approximated as the relative difference between the current map entropy and the expected entropy after the simulated robot step at the candidate location. This approach requires scanning all the cells in the global map.

In our algorithm, a simple heuristic ray-casting [24] method is applied to estimate how valuable a certain node qtest will be. During the ray-casting process, as illustrated in Algorithm 3, the valid configurations traversed by the rays, i.e., those which can contribute to the exploration process, are counted, and their sum is used as a measure of how much information gain can theoretically be obtained from a particular node. This suits the laser scanner used, where the number of scan rays and the angle between them depend upon the characteristics of the actual sensor. Note that sensor noise is not considered when estimating the information gain of each node; noise can affect the estimation and would need to be modelled. Valid configurations are tested through the VALID function to meet the conditions mentioned in Section 2. This is shown in Fig. 4, where the tested node q2 on the left is estimated to have more information gain, G ≥ Gthresh, than the node q2 on the right.
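The ray-casting estimate described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the grid encoding (`FREE`, `OBSTACLE`, `UNKNOWN`), the function names `get_gain` and `is_free`, and the choice to stop a ray at the first obstacle cell are all hypothetical. The validity predicate stands in for the VALID function, whose conditions are defined in Section 2.

```python
import math

# Hypothetical cell labels for an occupancy grid (not taken from the paper).
FREE, OBSTACLE, UNKNOWN = 0, 1, 2


def get_gain(grid, q_test, r_max, is_valid, fov_deg=360, step_deg=1):
    """Count the valid cells sampled along rays cast from q_test = (x, y).

    r_max is given in cells. Each ray is sampled at integer ranges 1..r_max,
    so the node's own cell is skipped. A ray stops when it leaves the map or
    meets an obstacle cell (an assumption added in this sketch).
    """
    x0, y0 = q_test
    gain = 0
    for k in range(0, fov_deg, step_deg):
        theta = math.radians(k)
        for r in range(1, r_max + 1):
            # Convert the polar sample to grid (Cartesian) indices.
            x = int(round(x0 + r * math.cos(theta)))
            y = int(round(y0 + r * math.sin(theta)))
            inside = 0 <= y < len(grid) and 0 <= x < len(grid[0])
            if not inside or grid[y][x] == OBSTACLE:
                break
            if is_valid(grid, x, y):
                gain += 1  # one more configuration that could contribute
    return gain


def is_free(grid, x, y):
    """Stand-in for VALID: here a cell simply counts if it is free."""
    return grid[y][x] == FREE
```

A node would then be accepted as informative when `get_gain(...)` is at least Gthresh. Note that, as in Algorithm 3, a cell crossed by several rays is counted once per ray sample.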

After identifying the most informative node, a shortest path to it is planned using the A* algorithm [25]. This saves exploration distance and time compared with visiting all the previous nodes. The dimensions of the robot are taken into account while planning the path by eroding the partially built map with a disk structuring element. Unknown and obstacle cells are avoided during robot navigation.

Figure 4. Estimation of information gain at different configurations by the use of a ray-casting technique.
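The planning step can be illustrated with a short Python sketch. It assumes a grid in which 0 marks free cells and any other value marks obstacle or unknown cells, a robot radius given in cells, 4-connected motion and a Manhattan heuristic. None of these choices is specified in the text, and the function names are hypothetical. Eroding the free space with a disk is implemented here as the equivalent operation of growing every blocked cell by a disk of the robot radius.

```python
import heapq


def inflate_obstacles(grid, radius):
    """Return a boolean map in which every cell within `radius` cells of a
    non-free cell is blocked (equivalent to eroding free space with a disk)."""
    rows, cols = len(grid), len(grid[0])
    blocked = [[False] * cols for _ in range(rows)]
    disk = [(dx, dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if dx * dx + dy * dy <= radius * radius]
    for y in range(rows):
        for x in range(cols):
            if grid[y][x] != 0:  # obstacle or unknown cell
                for dx, dy in disk:
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < cols and 0 <= ny < rows:
                        blocked[ny][nx] = True
    return blocked


def a_star(blocked, start, goal):
    """4-connected A* with a Manhattan heuristic on a boolean blocked map.
    Cells are (x, y) tuples. Returns the list of cells from start to goal,
    or None if the goal cannot be reached."""
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if cost > best_cost[cur]:
            continue  # stale heap entry
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            x, y = cur[0] + dx, cur[1] + dy
            if not (0 <= y < len(blocked) and 0 <= x < len(blocked[0])):
                continue
            if blocked[y][x]:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((x, y), float("inf")):
                best_cost[(x, y)] = new_cost
                came_from[(x, y)] = cur
                heapq.heappush(open_heap, (new_cost + h((x, y)), new_cost, (x, y)))
    return None
```

In use, the start would be the robot's current cell and the goal the cell of the most informative node. A larger robot radius closes narrow passages, so the planner may correctly report that no path exists.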

4. The Proposed Complexity Metric and Evaluation Index

4.1. Environmental Complexity Metric (CM)

Robots are tested in different environments with different degrees of complexity, which makes it problematic to compare the performance of different exploration strategies. Therefore, it is useful to develop a metric that can quantify the environmental complexity. This will help to reduce the effect of the environmental complexity on the performance comparison of the different exploration strategies. The available complexity measures, such as space syntax, entropy and obstacle compression, are related to path planning rather than robot exploration. From the path-planning perspective, free areas mean more choices for the robot to reach its target, and hence a more complex environment. On the other hand, from the exploration perspective, free areas mean more information to be acquired about the unknown environment, and hence imply a less complex environment. Therefore, a novel environmental complexity metric for the exploration process is proposed in this paper.

Algorithm 3: Ray-casting algorithm to estimate the most informative node.

    GET_GAIN
    Input: qtest
    Output: Node gain G.
    G ← 0;
    % starting from angle 0 to the field of view FOV
    for θray = 0 → FOV
        % cover the total maximum range
        for r = 0 → Rmax
            % convert to Cartesian co-ordinates
            q.x = qtest.x + r ∗ cos(θray);
            q.y = qtest.y + r ∗ sin(θray);
            % testing for validity
            if VALID(q, dmin, T)
                % increment gain by 1
                G ← G + 1;
        loop
    loop
    return G;

Figure 5. Effect of obstacle distribution over complexity spaces. Two environments with the same number of obstacle cells but different distributions. The first, mazy environment is more complex than the open-space environment.

Intuitively, obstacles in the environment could help or hinder the exploration process. The exploration process will be efficient if the sensor range spans the entire environment. The effect of the obstacle density and distribution is illustrated in Fig. 5, where two environments, namely a mazy environment and an open-space environment, have the same number of obstacle cells but different distributions. From the exploration point of view, the mazy environment is more complex than the open-space one. The complexity measurement could be simplified by calculating the relative difference between the actual number of nodes required to cover a certain environment, Nact, and the estimated number of nodes, Nest, required to cover the abstract free area without considering the obstacle distribution, as follows:

$$CM = 1 - \frac{N_{est}}{N_{act}} \qquad (1)$$

where CM ∈ [0, 1] is the complexity metric. A value of zero means that there are no scattered obstacles and that the greatest benefit is gained from the sensor range, while higher values mean that the sensor range is not used effectively to cover the environment, as in the maze-like environment. The estimated number of nodes, Nest, can be calculated as the number of free cells in the environment divided by the area covered by the inner rectangle of the sensor field of view (FOV), which for a 360° FOV is the square of area 2R²max inscribed in the circle of radius Rmax, as follows:

$$N_{est} = \frac{\mathit{FreeSpaceArea}}{2 R_{max}^{2}} \qquad (2)$$

where Rmax is the sensor perceptual range.
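As a small worked illustration of (1) and (2), the following Python sketch computes Nest and CM. The function names and the numerical values (a free area of 160 m² and Nact = 29) are hypothetical and chosen only for illustration. In the paper, Nact is obtained from Algorithm 4.

```python
def estimated_nodes(free_space_area, r_max):
    """Eq. (2): free area divided by the inscribed-square footprint 2 * Rmax^2."""
    return free_space_area / (2.0 * r_max ** 2)


def complexity_metric(n_est, n_act):
    """Eq. (1): CM = 1 - Nest / Nact (no clamping to [0, 1] is applied here)."""
    return 1.0 - n_est / n_act


# Hypothetical example: 160 m^2 of free space, Rmax = 2 m, Nact = 29 nodes.
n_est = estimated_nodes(160.0, 2.0)   # 160 / 8 = 20 nodes
cm = complexity_metric(n_est, 29)     # 1 - 20/29
```

With these assumed numbers, an environment that needs 29 covering nodes where 20 would suffice in obstacle-free space has a complexity of roughly 0.31.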

On the other hand, calculating the actual number of nodes required to fully cover the structured environment while considering the density and distribution of the obstacles, Nact, which was addressed in [26], is not easy. Here, a modification of the art gallery algorithm [27], a well-known algorithm for visibility in computational geometry, is used. A modification is needed because the basic algorithm does not consider sensor details such as the maximum range and the FOV. Furthermore, it assumes a line-of-sight sensor, which is not practical, besides being computationally expensive. Thus, only an approximate solution that depends on the density and distribution of obstacles in the environment can be obtained. Therefore, it is proposed to find the minimum actual number of nodes required to cover the entire space based on the basic art gallery problem. Basically, the algorithm randomly samples the environment to construct a relatively large set Ssam of covering nodes. Sensor aspects, such as the maximum range and the FOV, are considered by using the ray-casting GET_GAIN algorithm. Algorithm 4 is structured as follows:

Algorithm 4: Greedy randomized art gallery algorithm.

    Input: 2D model of the environment's map, Uthresh, n
    Output: Nact, the actual number of nodes required to cover the entire environment.
    Nact = 0;
    repeat
        generate n random samples, si;
        % samples should be taken from the area of
        % the map not yet covered by previous nodes
        for i = 1 → n
            G(si) = GET_GAIN(si);
        loop
        select sample s∗i with max. G;
        PLOT(s∗i);
        Nact ← Nact + 1;
        compute % of coverage U;
    until U ≥ Uthresh
    return Nact;

Figure 6. The complexity metric values, CM, obtained over a single run for different environments with n = 100 random samples, FOV = 360°, a coverage threshold Uthresh = 99% and different sensor ranges. Top row (Rmax = 2 m): CM = 0.1, 0.16, 0.26, 0.31, 0.42. Bottom row (Rmax = 4 m): CM = 0.12, 0.15, 0.24, 0.3, 0.4.

After acquiring the 2D model of the environment, the algorithm computes the estimated number of nodes, Nest, required to cover the entire environment using (2). Afterwards, n samples are generated randomly and distributed over the environment. To reduce the effect of randomness on the output of the algorithm, the number of random samples is selected to be large, for example 100 samples for Rmax = 2 m. The gain G(si) of each sample si is computed using the ray-casting algorithm, in which the sensor aspects are considered. The sample with the maximum gain is selected, and the area covered by this sample is plotted over the map by the PLOT function. This is done to exclude the area from the sample generation in successive iterations. Then, the actual number of nodes, Nact, is incremented by one. These steps are repeated until the coverage reaches the threshold Uthresh. Afterwards, the actual number of required nodes, Nact, is returned. This algorithm is not guaranteed to give an optimal solution for the required number of covering nodes, but it is practical, since the robot cannot be accurately positioned at the optimal nodes.

Near-optimal solutions can be achieved by increasing the number of randomly generated samples. This gives a chance to distribute the nodes fairly over the entire space, at the expense of computing time. Figure 6 shows the complexity metric values, CM, for different environments with n = 100 random samples, FOV = 360°, a coverage termination criterion of Uthresh = 99% and different sensor ranges: Rmax = 2 m on the top row and Rmax = 4 m on the bottom row. Note that these metric values are obtained over one run with a large n to reduce the effect of randomness on the output of the algorithm. Additionally, they are nearly independent of the sensor parameters, since Nest and Nact vary in the same manner for different sensor parameters.
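The greedy randomized procedure described above can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not taken from the paper: the map is a grid in which 0 marks free cells, the gain of a sample is taken as the number of newly covered free cells seen by rays cast from it, the PLOT step is represented by a set of covered cells, and the function names and the `seed` parameter are hypothetical.

```python
import math
import random


def visible_cells(grid, q, r_max, step_deg=5):
    """Free cells reached by rays cast from q = (x, y), each ray stopping
    when it leaves the map or meets a non-free cell."""
    cells = {q}
    for k in range(0, 360, step_deg):
        theta = math.radians(k)
        for r in range(1, r_max + 1):
            x = int(round(q[0] + r * math.cos(theta)))
            y = int(round(q[1] + r * math.sin(theta)))
            inside = 0 <= y < len(grid) and 0 <= x < len(grid[0])
            if not inside or grid[y][x] != 0:
                break
            cells.add((x, y))
    return cells


def greedy_art_gallery(grid, r_max, n_samples=100, u_thresh=0.99, seed=0):
    """Approximate Nact: repeatedly place the best of n random samples drawn
    from the uncovered free area until the coverage reaches u_thresh."""
    rng = random.Random(seed)
    free = {(x, y) for y, row in enumerate(grid)
            for x, v in enumerate(row) if v == 0}
    covered = set()
    n_act = 0
    while len(covered) / len(free) < u_thresh:
        uncovered = sorted(free - covered)
        samples = [rng.choice(uncovered) for _ in range(n_samples)]
        # Gain of a sample = number of not-yet-covered cells it would add.
        best = max(samples,
                   key=lambda s: len(visible_cells(grid, s, r_max) - covered))
        covered |= visible_cells(grid, best, r_max)  # plays the role of PLOT
        n_act += 1
    return n_act
```

Because every selected sample lies in the uncovered area and always covers its own cell, the coverage grows at each iteration and the loop terminates. The returned count depends on the random seed, which is why a large number of samples per iteration is suggested in the text.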

4.2. Performance Evaluation Index (EI)

The performance of an exploration strategy is usually measured by four metrics, namely the distance travelled, D, the exploration time, T, the number of nodes created in the tree, Nnodes, and the completeness, C. The distance travelled is defined as the total distance travelled by the robot after returning to the home position, while the exploration time is defined as the time taken by the robot to complete the exploration process. The completeness is the percentage of the total area covered after the homing step. However, there is always a trade-off among these metrics when the performance of an exploration strategy is evaluated. To avoid this trade-off, a single evaluation index (EI) is proposed here, which encapsulates all the metrics in one number. Intuitively, the proposed index can be formulated based on its relationship with the mentioned metrics. EI is proposed to be directly proportional to the completeness, C, and inversely proportional to the normalized exploration time, T̄, the normalized distance travelled, D̄, and the normalized number of nodes in the tree structure, N̄. This index is useful for comparing the performance of different SRT-like algorithms: the larger its value, the better the performance of a strategy. The proposed index can be formulated as follows:




$$EI = \frac{w_c\, C}{w_t \bar{T} \cdot w_d \bar{D} \cdot w_n \bar{N}} \qquad (3)$$

where wc, wt, wd and wn are proportional weights that are added to measure the importance of each normalized metric to the evaluation index. In the proposed approach, each of these weights is set to unity, since the four metrics are considered equally important. The completeness is defined as:

$$C = \frac{\text{Known Cells}}{\text{Map Area}} \times 100\% \qquad (4)$$

The number of nodes is normalized here with respect to the near-optimal number of actual nodes, Nact, calculated from the greedy randomized art gallery algorithm, so N̄ is defined as:

$$\bar{N} = \frac{N_{nodes}}{N_{act}} \qquad (5)$$

The distance travelled is normalized to the total distance required for a robot to explore the entire environment, which can be estimated using the calculated actual number of nodes. The distance between two nodes in the coverage problem based on graph theory [28] is equal to twice the maximum sensor range, so the total distance required, Dtotal, can be approximated as follows:

$$D_{total} = 2 R_{max} (N_{act} - 1) \qquad (6)$$

Then, the normalized travel distance is given as

$$\bar{D} = \frac{D}{D_{total}} = \frac{D}{2 R_{max} (N_{act} - 1)} \qquad (7)$$

The normalized exploration time is given as

$$\bar{T} = \frac{T}{2 R_{max} (N_{act} - 1)} \cdot \nu \qquad (8)$$

where ν is the average speed of the robot, which is fixed during the exploration process. Note that the exploration time measured here is the total simulation time, including the computation time of the algorithm. This is why the normalized distance travelled differs from the normalized exploration time.
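Equations (3) to (8) can be combined into a single function, sketched below in Python. The function name, the argument names and the example values are hypothetical. The sketch assumes that C is given in percent, as in (4), and that all weights default to unity.

```python
def evaluation_index(T, C, D, n_nodes, n_act, r_max, v,
                     w_c=1.0, w_t=1.0, w_d=1.0, w_n=1.0):
    """Evaluation index EI of Eq. (3) from the raw exploration metrics.

    T: exploration time, C: completeness in percent, D: distance travelled,
    n_nodes: nodes created in the tree, n_act: covering nodes from Algorithm 4,
    r_max: sensor range, v: average robot speed.
    """
    d_total = 2.0 * r_max * (n_act - 1)   # Eq. (6): estimated total distance
    d_bar = D / d_total                   # Eq. (7): normalized distance
    t_bar = T * v / d_total               # Eq. (8): normalized time
    n_bar = n_nodes / n_act               # Eq. (5): normalized node count
    return (w_c * C) / (w_t * t_bar * w_d * d_bar * w_n * n_bar)


# Hypothetical run: 100 s, 100 % coverage, 80 m travelled, 20 tree nodes,
# Nact = 21, Rmax = 2 m and v = 1 m/s.
ei = evaluation_index(T=100.0, C=100.0, D=80.0, n_nodes=20,
                      n_act=21, r_max=2.0, v=1.0)
```

For these assumed inputs, Dtotal is 80 m, so the normalized distance is 1, the normalized time is 1.25 and the normalized node count is 20/21. The index grows when coverage increases or when time, distance or node count decreases.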

5. Simulation Results

Several simulation scenarios have been implemented to validate the proposed exploration approach. Each scenario is identified by its environmental complexity metric and its exploration evaluation index. The 3D mobile robot simulator Webots [29] was used in all our simulations, with a three-wheel omnidirectional mobile robot. The robot has a diameter of 0.2 m and carries a 360° laser range finder with one-degree angular resolution. In the simulations, the parameters for the SRT algorithm were selected as follows: α = 0.9, dmin = 0.7 m, Gthresh = 100 cells and Rmax = 2 m. The performance of the developed approach was compared with that of the basic SRT approach through several metrics, representing the effort paid (the distance travelled, D, and the exploration time, T), the coverage gained (the completeness, C) and the number of nodes created in the tree, Nnodes, as well as the exploration evaluation index. Two environmental scenarios are presented here, namely an office-like environment and a maze-like one.

Figure 7. Office-like environment.

5.1. Office-like environment

The proposed exploration approach with heuristic backtracking and the basic SRT strategy are implemented in the office-like environment shown in Fig. 7. The environmental complexity metric for this environment is calculated using (1) as 0.31. Detailed simulation steps at different simulation times, with the associated roadmap, are shown in Fig. 8. The nodes created and the paths travelled are shown in green and blue, respectively, while the shortest paths planned by the heuristic approach are shown in red. In the basic SRT strategy, red edges mean that the robot has backtracked along them to return to its home position; this does not occur in the proposed backtracking approach, as the robot is not required to go back over all the previous nodes but only to plan a short path to the most informative node. A comparison between the proposed approach and the basic SRT approach is given in Table 1 in terms of the mentioned metrics at different sensor ranges, namely 1 m and 2 m. Due to the random behaviour of SRT, the values are averaged over five simulation runs with different initial robot positions. The standard deviations are also shown for the four metrics.

It can be noted from Table 1 that there is a significant decrease in the total path length and the total exploration time for the proposed approach compared with the basic SRT approach. The reduction in the exploration distance is most pronounced in the scenario with the smaller sensor range (35.3% at Rmax = 1 m against 18.4% at Rmax = 2 m), which can be attributed to the high number of exploration edges required to fill the entire open space. Additionally, the proposed approach provides nearly complete coverage, comparable to that of the original SRT approach, while exerting less exploration effort. Furthermore, the standard deviations of the four metrics are small, which indicates that the results are reliable. The new evaluation index is also shown to be a good representative of the exploration performance, avoiding the trade-off among the metrics. The proposed approach showed higher evaluation indices than the basic SRT at both perceptual ranges. The average speed of the robot is set to ν = 10 m/sec, and each of the proportional weights, which measure the contribution of each metric to the index, is taken as one. It is also worth noting that the complexity of the exploration process is reduced through the decreased number of tree nodes.

Figure 8. Simulation steps at different simulation times T with the associated roadmap. Left: the basic SRT approach (T = 50, 100, 150 and 281 sec.). Right: the proposed approach (T = 50, 100, 150 and 237 sec.).

| Strategy | T (sec.) | C (%) | D (m) | Nnodes | EI |
|---|---|---|---|---|---|
| Perceptual range Rmax = 1 m | | | | | |
| Basic SRT | 536±6 | 97.2±1 | 336.5±5 | 193±3 | 7.3 |
| Proposed approach | 360±4 | 97.1±1 | 217.7±4 | 182±6 | 18.2 |
| % of benefit | 32.8 | -0.1 | 35.3 | 5.7 | 145 |
| Perceptual range Rmax = 2 m | | | | | |
| Basic SRT | 281±3 | 99±1 | 180±2 | 52±3 | 16.12 |
| Proposed approach | 237±3 | 98.5±1 | 146.95±3 | 50±2 | 46.22 |
| % of benefit | 15.7 | -0.5 | 18.4 | 3.8 | 186 |

Table 1. Simulation results of the office-like environment

Here, it is imperative to discuss the advantages of theproposed backtracking approach over the approachespresented in [17] and [18]. The backtracking approachin [17] is based on constructing a bridge betweentwo adjacent nodes having a common safe regionbetween them. This approach is helpful in corridor-likeenvironments which have a low probability of havingin-between nodes with unexplored regions. Also, in

Figure 9. Maze-like environment

(a) (b)

Figure 10. Simulation results of the maze-like scenario (a) thebasic SRT approach (b) the proposed approach

[18] the backtracking approach is based on constructinga shortcut between the initial and the last visited nodeswithout taking into account the opportunity of having avaluable node between them. This shows the advantageof the proposed backtracking approach over these twoapproaches, since the proposed heuristic backtracking isbased on approaching the most informative node whichhas a high probability of having unexplored regionsin office-like or even in open space environments asillustrated in the results.

5.2. Maze-like environment

Similarly, simulations of the proposed explorationapproach with the heuristic backtracking and the basicSRT strategy are conducted in the maze-like environmentas shown in Fig. 9. The environmental complexity metricfor this environment is calculated using (1) as 0.42. Thesimulation result of the basic SRT is presented in Fig. 10(a)while that of the proposed SRT is shown in Fig. 10(b)in which red lines represent the shortest paths followedwhenever there are no valuable nodes to reach. Table 2summarizes the results obtained with a 2 m perceptualrange averaged over five simulation runs with differentinitial robot positions. Similarly, significant benefit in thetotal path length and the total exploration time are shownfor the proposed approach compared with those of thebasic SRT approach. Also, the proposed approach showedhigher evaluation index compared to that of the basic SRT.

Strategy T(sec.) C% D(m) Nnodes EIPerceptual Range Rmax= 2 m

Basic SRT 347±3 99±1 276.71±3 87±2 12.17Proposed Approach 291±2 98.5±1 219.55±2 85±2 18.56

% of Benefit 16 -0.5 20.65 2.3 50

Table 2. Simulation results of the maze-like scenario


9

Figure 8. Simulation steps at different simulation times T with the associatedroadmap. Left: the basic SRT approach. Right: the proposed approach.


compared to those of the basic SRT at different perceptualranges. The average speed of the robot is set to 10 m / s andeach of the proportional weights measuring the contribu‐tion of each metric to the index is taken to be one. Inaddition, it is worth noting that the complexity of explora‐tion process is reduced through the decreased number oftree nodes.

Strategy T (sec.) C (%) D (m) Nnodes EI

Perceptual Range Rmax = 1 m

Basic SRT 536±6 97.2±1 336.5±5 193±3 7.3

ProposedApproach

360±4 97.1±1 217.7±4 182±6 18.2

% of Benefit 32.8 -0.1 35.3 5.7 145


Basic SRT 281±3 99±1 180±2 52±3 16.12

ProposedApproach

237±3 98.5±1 146.95±3 50±2 46.22

% of Benefit 15.7 -0.5 18.4 3.8 186


Here, it is important to discuss the advantages of the proposed backtracking approach over the approaches presented in [17] and [18]. The backtracking approach in [17] is based on constructing a bridge between two adjacent nodes with a common safe region between them. This approach is helpful in corridor-like environments, which have a low probability of having in-between nodes with unexplored regions. In [18], the backtracking approach is based on constructing a shortcut between the initial and the most recently visited nodes without taking into account the possibility of there being a valuable node between them. This demonstrates the advantage of the proposed backtracking approach over these two approaches, since the proposed heuristic backtracking is based on approaching the most informative node, which has a high probability of having unexplored regions in office-like or even in open-space environments, as illustrated in the results.
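The idea of approaching the most informative node during backtracking can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `frontier` score (assumed here to be the length of unexplored boundary seen from a node) and the tie-breaking rule by straight-line distance are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    """A visited tree node (hypothetical structure for this sketch)."""
    name: str
    x: float
    y: float
    frontier: float  # assumed informativeness: unexplored boundary length (m)


def most_informative_node(current: Node, visited: Sequence[Node],
                          min_frontier: float = 0.0) -> Optional[Node]:
    """Pick the visited node with the largest frontier score.

    Ties are broken in favour of the node closest to `current`.
    Returns None when no node is informative, i.e. when there is
    no valuable node left to travel to.
    """
    candidates = [n for n in visited
                  if n is not current and n.frontier > min_frontier]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: (n.frontier, -hypot(n.x - current.x, n.y - current.y)),
    )


if __name__ == "__main__":
    a = Node("a", 0.0, 0.0, 0.0)
    b = Node("b", 1.0, 0.0, 2.5)
    c = Node("c", 5.0, 0.0, 2.5)
    d = Node("d", 2.0, 2.0, 1.0)
    target = most_informative_node(d, [a, b, c, d])
    print(target.name if target else "no informative node")
```

In this sketch the robot at node `d` skips the uninformative node `a` and heads for `b`, which has the same score as `c` but is closer. A real system would measure the travel cost along the roadmap rather than in a straight line.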

5.2 The maze-like environment

Similarly, simulations of the proposed exploration approach with heuristic backtracking and of the basic SRT strategy were conducted in a maze-like environment, as shown in Fig. 9. The environmental CM for this environment, calculated using (1), is 0.42. The simulation results of the basic SRT are presented in Fig. 10(a), while those of the proposed approach are shown in Fig. 10(b), in which the red lines represent the shortest paths followed whenever there are no valuable nodes to travel to. Table 2 summarizes the results obtained with a 2 m perceptual range, averaged over five simulation runs with different initial robot positions. As in the office-like environment, the proposed approach shows a significant benefit in the total path length (20.65%) and the total exploration time (16%) compared with the basic SRT approach. In addition, the proposed approach achieved a higher EI than the basic SRT.

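| Strategy | T (sec.) | C (%) | D (m) | Nnodes | EI |
|---|---|---|---|---|---|
| Perceptual Range Rmax = 2 m | | | | | |
| Basic SRT | 347±3 | 99±1 | 276.71±3 | 87±2 | 12.17 |
| Proposed Approach | 291±2 | 98.5±1 | 219.55±2 | 85±2 | 18.56 |
| % of Benefit | 16 | -0.5 | 20.65 | 2.3 | 50 |

Table 2. Simulation results of the maze-like scenario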
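The "% of Benefit" rows in Tables 1 and 2 appear to be relative changes with respect to the basic SRT. The sketch below recomputes them from the tabulated mean values under that assumption; the function and variable names are hypothetical. The EI column is left out because the rounded EI values in the tables do not reproduce the reported EI benefits exactly.

```python
def percent_benefit(basic: float, proposed: float,
                    lower_is_better: bool = True) -> float:
    """Relative improvement of `proposed` over `basic`, in percent of `basic`."""
    gain = basic - proposed if lower_is_better else proposed - basic
    return 100.0 * gain / basic


# Mean values copied from the tables:
# metric -> (basic SRT, proposed approach, lower_is_better)
office_rmax_1m = {
    "T (sec.)": (536.0, 360.0, True),
    "C (%)": (97.2, 97.1, False),  # coverage: higher is better
    "D (m)": (336.5, 217.7, True),
    "Nnodes": (193.0, 182.0, True),
}
maze_rmax_2m = {
    "T (sec.)": (347.0, 291.0, True),
    "C (%)": (99.0, 98.5, False),
    "D (m)": (276.71, 219.55, True),
    "Nnodes": (87.0, 85.0, True),
}


def benefit_row(results: dict) -> dict:
    """Compute one '% of Benefit' row, rounded to one decimal place."""
    return {
        metric: round(percent_benefit(basic, proposed, lower), 1)
        for metric, (basic, proposed, lower) in results.items()
    }


if __name__ == "__main__":
    print("Office-like, Rmax = 1 m:", benefit_row(office_rmax_1m))
    print("Maze-like, Rmax = 2 m:", benefit_row(maze_rmax_2m))
```

For the Rmax = 1 m rows of Table 1, this gives 32.8, -0.1, 35.3 and 5.7, matching the reported row. The values for Table 2 agree closely with the reported ones, with small differences due to rounding.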


Figure 10. Simulation results for the maze-like scenario: (a) the basic SRT approach, (b) the proposed approach

6. Conclusions

A novel heuristic backtracking algorithm has been developed for sensor-based random tree exploration to reduce the exploration time and the distance travelled so as to cope with time-critical applications. The new approach is based on the selection of the most informative node to approach, rather than backtracking unnecessarily across already explored areas. The enhancement of SRT exploration using the developed backtracking algorithm has been confirmed by conducting several simulations in different exploration scenarios. A new evaluation index has been devised to encapsulate the exploration metrics, namely the exploration time, the distance travelled, the coverage and the number of nodes, in a single number, avoiding trade-offs among these metrics. This index has been shown to be representative of the exploration performance and can be used to compare exploration techniques, especially SRT-like algorithms.

A step towards finding a unique complexity metric has also been taken, with a metric that represents the environment's structural complexity from the exploration point of view. This metric reflects the difference between the ideal number of visits required to fully cover the environment and the actual number of visits required. It has been shown that this complexity metric is dependent upon the disparity among obstacles in the environment. Although the estimated number of nodes ultimately depends upon the sensor properties, such as range, field of view and angular resolution, the complexity metric is nearly independent of these properties: since they are reflected in both the actual and the estimated number of nodes required to cover the environment, the complexity metric will not change. In the future, we will explore the potential of the evaluation index in diverse scenarios as a step towards generalizing it for measuring exploration performance and comparing techniques.

7. References

[1] B. Yamauchi, A frontier-based approach for autonomous exploration, Int. Symp. on Computational Intelligence in Robotics and Automation, CIRA, (1997) 146-151.

[2] H. H. Gonzalez-Banos and J.-C. Latombe, Navigation strategies for exploring indoor environments, The Int. J. of Robotics Research, 21 (2002) 829-848.

[3] T. Tao, Y. Huang, F. Sun, and T. Wang, Motion planning for SLAM based on frontier exploration, Int. Conf. on Mechatronics and Automation, ICMA, (2007) 2120-2125.

[4] D. Holz, N. Basilico, F. Amigoni, and S. Behnke, Evaluating the efficiency of frontier-based exploration strategies, 6th German Conf. on Robotics (ROBOTIK), (2010) 1-8.

[5] R. C. Arkin, Behavior-based robotics, 1998.

[6] H. Lau, Behavioral approach for multi-robot exploration, Australasian Conf. on Robotics and Automation, (2003).

[7] D. Schmidt, T. Luksch, J. Wettach, and K. Berns, Autonomous behavior-based exploration of office environments, Int. Conf. on Informatics in Control, Automation and Robotics, ICAR, (2006) 235-240.

[8] T. Balch, Avoiding the past: a simple but effective strategy for reactive navigation, Int. Conf. on Robotics and Automation, ICRA, (1993) 678-685.

[9] M. Juliá, O. Reinoso, A. Gil, M. Ballesta, and L. Payá, A hybrid solution to the multi-robot integrated exploration problem, Engineering Applications of Artificial Intelligence, 23, 4, (2010) 473-486.

[10] M. Abdellatif, Behavior fusion for visually guided service robots, I-Tech, Vienna, Austria (2008) 1-12.

[11] M. Juliá, A. Gil, and O. Reinoso, A comparison of path planning strategies for autonomous exploration and mapping of unknown environments, Autonomous Robots, 33, 4 (2012) 427-444.

[12] J. Barraquand, L. Kavraki, R. Motwani, J.-C. Latombe, T.-Y. Li, and P. Raghavan, A random sampling scheme for path planning, Robotics Research, Springer, (2000) 249-264.

[13] G. Oriolo, M. Vendittelli, L. Freda, and G. Troso, The SRT method: Randomized strategies for exploration, Int. Conf. on Robotics and Automation, 5, (2004) 4688-4694.

[14] L. Freda and G. Oriolo, Frontier-based probabilistic strategies for sensor-based exploration, Int. Conf. on Robotics and Automation, ICRA, (2005) 3881-3887.

[15] L. Freda, G. Oriolo, and F. Vecchioli, Sensor-based exploration for general robotic systems, Int. Conf. on Intelligent Robots and Systems, IROS, (2008) 2157-2164.

[16] L. Freda, G. Oriolo, and F. Vecchioli, An exploration method for general robotic systems equipped with multiple sensors, Int. Conf. on Intelligent Robots and Systems, IROS, (2009) 5076-5082.

[17] A. Franchi, L. Freda, G. Oriolo, and M. Vendittelli, The sensor-based random graph method for cooperative robot exploration, IEEE/ASME Transactions on Mechatronics, 14, 2, (2009) 163-175.

[18] J. Kim, K. J. Seong, and H. J. Kim, An efficient backtracking strategy for frontier method in sensor-based random tree, Int. Conf. on Control, Automation and Systems, ICCAS, (2012) 970-974.

[19] S. Read, The grain of space in time: the spatial/functional inheritance of Amsterdam centre, Urban Design Int., 5, 3-4, (2000) 209-220.

[20] D. A. Shell and M. J. Mataric, Human motion-based environment complexity measures for robotics, Int. Conf. on Intelligent Robots and Systems, IROS, 3 (2003) 2559-2564.

[21] A. Rényi, On measures of entropy and information, Fourth Berkeley Symp. on Mathematical Statistics and Probability, (1961) 547-561.

[22] G. T. Anderson and G. Yang, A proposed measure of environmental complexity for robotic applications, Int. Conf. on Systems, Man and Cybernetics, SMC, (2007) 2461-2466.

[23] M. Vendittelli, L. Freda, and G. Oriolo. The SRTmethod. http://www.dis.uniroma1.it/labrob/arch/SRT.html. Accessed Jan 2013.

[24] S. D. Roth, Ray casting for modeling solids, Computer Graphics and Image Processing, 18, 2, (1982) 109-144.

[25] T. Ersson and X. Hu, Path planning and navigation of mobile robots in unknown environments, Int. Conf. on Intelligent Robots and Systems, IROS, 2 (2001) 858-864.

[26] M. Cardei and J. Wu, Energy-efficient coverage problems in wireless ad hoc sensor networks, Computer Communications, 29, 4, (2006) 413-420.

[27] H. González-Baños, A randomized art-gallery algorithm for sensor placement, The 17th Annual Symposium on Computational Geometry, ACM, (2001) 232-240.

[28] D. B. West et al., Introduction to graph theory, 2 (2001).

[29] Webots website, http://www.cyberbotics.com, Commercial Mobile Robot Simulation Software. Accessed Jan 2013.

[30] H. El-Hussieny, S. F. M. Assal, and M. Abdellatif, Improved sensor-based mobile robot exploration of novel environments, 6th Int. Conf. on Intelligent Computing and Information Systems, ICICIS, (2013) 43-49.

