
Toward a Competitive Pool Playing Robot:

Is Computational Intelligence Needed to Play Robotic Pool?

Michael Greenspan1,2 Joseph Lam1 Will Leckie1 Marc Godard2

Imran Zaidi2 Ken Anderson1 Donna Dupuis1 Sam Jordan1

1 Dept. Electrical & Computer Engineering, Queen's University, Kingston, Canada
2 School of Computing, Queen's University, Kingston, Canada

Abstract— This paper describes the development of Deep Green, an intelligent robotic system that is currently in development to play competitive pool against a proficient human opponent. The design philosophy and the main system components are presented, and the progress to date is summarized. We also address a common misconception about the game of pool, i.e. that it is purely a game of physical skill, requiring little or no intelligence or strategy. We explain some of the difficulties in developing a vision-based system with a high degree of positional accuracy. We further demonstrate that even if perfect accuracy were possible, it is still beneficial and necessary to play strategically.

Keywords: Computational Intelligence, Robotics, Intelligent Systems, Computer Vision, Games, Pool

I. INTRODUCTION

Computational Intelligence (CI) and Robotics are related disciplines in that they both attempt to emulate human and biological behaviour. Robotic systems often serve as a testing ground where CI concepts are implemented and evaluated in dynamic, noisy, and unstructured environments. One such area where CI and Robotics have co-developed is the field of Robotic Gaming Systems, where intelligent robotic systems are developed to compete against themselves and/or humans in structured tasks. These systems combine the sensing and actuation of robotic devices with the strategic planning of CI methods. Examples of annual Robotic Gaming System tournaments include: the Robocup challenge [1], which involves a number of categories of soccer-playing robots; the AAAI Mobile Robot Contest [2], which comprises a scavenger hunt and other competitive challenges; and the Trinity College Fire Fighting Robot Contest [3].

The first attempt at automating pool1 was The Snooker Machine, which was developed in the late 1980s at the University of Bristol [4]. This system comprised an inverted articulated robot which was positioned over the table by a gantry. A camera positioned on the ceiling was used to analyze the table state and direct the robot to place a shot. More recent attempts include the Roboshark system from Iran [5], systems from Malaysia [6] and Taiwan [7], and our own Deep Green system from Queen's University, Canada [8], [9]. The Automatic Pool Trainer from the University of Aalborg in Denmark does have a vision and CI component, although it does not involve robotic actuation [10].

1 We loosely refer herein to all cue sports (e.g., billiards, carom, snooker, etc.) as pool.

Fig. 1. Deep Green Hardware

The Deep Green system is illustrated in Fig. 1, and comprises: a 3 degree-of-freedom (DOF) spherical robotic wrist attached to a 3 DOF, ceiling mounted gantry robot; a 1 DOF cue end-effector; a ceiling mounted (global) camera; a wrist mounted (local) camera; a PC; and a standard 4′×8′ pool table. While other platforms have been proposed in the literature, such as a mobile robot that circumnavigates the table, the research community has converged upon the gantry as the platform of choice [4], [5], [7], [6], [8]. The end-effector is attached to the wrist flange, and is a custom designed electrical linear actuator. The gantry and wrist are controlled from a single industrial controller, whereas the end-effector has a separate control unit.

As much as possible, the system has been developed using standard off-the-shelf hardware components, which has allowed us to focus our efforts on software processing methods. Research into Deep Green has involved four distinct technical fields: Computer Vision, Robot Calibration, Physics, and Strategy.

This paper continues with a description of Deep Green, including a description of the main technical challenges and the progress that we have made to date. We present some of the challenges in developing a system which has the accuracy required to play well, and we further demonstrate that, even if perfect accuracy were possible, it is still necessary to play intelligently to be competitive. The paper concludes with a discussion of the benefits of machine over human play and some future research problems that it will be necessary to address before we can play credibly against a proficient human.

II. COMPUTER VISION

The Computer Vision subsystem is based upon two cameras and their associated processing routines: the Global Vision System (GVS), which is a ceiling-mounted 8.2 Megapixel digital SLR camera (a Canon 350D); and the Local Vision System (LVS), which is a ∼1 Megapixel compact firewire camera (a Point Grey Flea). There is also an additional ceiling mounted firewire camera (a Point Grey Dragonfly) that is used for realtime image collection during experimentation.

Together, the GVS and LVS serve two functions: identification and localization.

A. Localization

The purpose of localization is to accurately determine the position of each ball in the table coordinate frame. Once the lens radial distortions have been corrected using a calibration method, the main challenge is to rectify the table plane to compensate for perspective distortions. Perspective distortions result from the global camera retinal plane not being exactly parallel to the table surface, which is difficult to achieve manually to the desired accuracy.

The retinal plane and the table are related by a transformation known as a homography, which is a mapping between two planes. The standard technique for determining a homography involves extracting corresponding point locations from a pattern that is imaged on the plane. This technique is awkward to apply in this case, as the pattern must be very flat (typically made of glass plate) and very accurate, which is difficult to achieve using standard printing technologies. As an alternative, we have developed a method that makes use of an invariant property of the projective space which allows us to place a simple target comprising perpendicular lines, such as a large carpenter's square, at various random locations on the table [9], [11].
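Once a homography has been estimated, applying it to a ball center is a single projective transformation. The sketch below is an illustrative stand-in (not the paper's implementation): it maps a pixel coordinate through a 3×3 homography H, with the usual division by the homogeneous scale factor.

```python
def apply_homography(H, x, y):
    """Map an image point (x, y) through a 3x3 homography H.

    H is a row-major 3x3 matrix (nested lists). The homogeneous point
    [x, y, 1] is multiplied by H, then divided through by the resulting
    scale factor w to return an inhomogeneous table-plane point.
    """
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w
```

Because the warping is applied only to ball centers rather than to whole images, each localized ball costs a single such evaluation.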

An example of the processing steps of the GVS is illustrated in Fig. 2. The removal of the effects of perspective distortion is not illustrated in Fig. 2, as the homographic warping is applied directly to the center positions of the balls, rather than to the entire image. The raw image is shown in Fig. 2(a), and (b) illustrates this same image after the camera's intrinsic parameters are used to remove the effects of optical radial distortion. In (c), the (undistorted) image has been compared against a set of statistics (pixel means and variances) acquired from a set of ∼30 background images of the table without any balls present. For each pixel, if the difference between the foreground and background pixel values exceeds some threshold value of the background standard deviation, then the pixel is judged to be foreground, i.e. possibly a ball. It can be seen that, in addition to ball regions, a significant amount of noise is admitted through this process. In (d), a connected components algorithm is applied and only those regions that are large enough are admitted as valid balls; otherwise, they are removed as noise. These ball regions are then processed using circle fitting and best fit routines, leading to an accurate estimate of the center location of each ball.
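The per-pixel background test and the connected-components size filter just described can be sketched as follows. This is a simplified illustration (list-of-lists grayscale images and a hypothetical threshold factor k), not the production code:

```python
def foreground_mask(img, mean, std, k=3.0):
    # A pixel is judged foreground if it deviates from the background
    # mean by more than k background standard deviations.
    h, w = len(img), len(img[0])
    return [[abs(img[y][x] - mean[y][x]) > k * std[y][x]
             for x in range(w)] for y in range(h)]

def filter_small_blobs(mask, min_area):
    # 4-connected components via flood fill; components smaller than
    # min_area are rejected as noise. Returns a cleaned copy of the mask.
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            blob, stack = [], [(sy, sx)]
            seen[sy][sx] = True
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(blob) >= min_area:
                for y, x in blob:
                    out[y][x] = True
    return out
```

The surviving blobs would then be passed on to the circle-fitting stage.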

Once the ball locations have been accurately identified, the circular subregions defining each ball are sent to the color indexing routine to determine identities. With each ball accurately localized and identified, the table state can then be represented in simulation for shot planning, as illustrated in Fig. 2(e).

B. Identification

The purpose of identification is to determine the unique identity (i.e. number) of each ball on the table, as is necessary in most variations of pool. Each ball has a distinct color, and the identities of the balls are determined using color indexing methods [12]. The balls are first segmented from the image using a sequence of background subtraction, noise filtering, and blob detection. The color signature of each ball is then composed into a histogram and compared against a database of color histograms, one for each ball, previously acquired in a training phase.

Identification is challenging because each ball comprises only a few hundred pixels in the global camera image, which is a small sample size. The colors of the balls are also not completely distinct, but are repeated between the stripe and solid sets. For example, the solid 1 ball is the same yellow color as the striped 9 ball, the solid 2 ball is the same blue as the striped 10 ball, etc. The only way to differentiate between the stripes and solids is by the percentage of white that is contained in each ball. Further, the balls need to be identified in random positions and orientations over the table surface. At certain angles, the white section of any striped ball may be facing toward the camera, which reduces the number of colored pixels that can be used to discriminate the ball's identity. Finally, some of the colors are very similar and can be confused using standard color indexing methods. For example, yellow (1, 9) and orange (5, 13) are sometimes confused, as are red (7, 15) and pink (3, 11).

Our solution to the color indexing problem has been to implement a number of parallel histogram comparison methods and color spaces that are then arbitrated in a voting scheme. The balls are first partitioned into stripes and solids, based upon the percentage of white pixels. Each group is then sent to a bank of histogram comparators, and the consensus result is chosen. We have also implemented a novel method to analyze the color signatures based upon nonparametric statistics [13], which has some advantages over the standard histogram comparison methods.

Fig. 2. Acquiring Table State: (a) Raw Image; (b) Radial Distortion Correction; (c) Background Subtraction; (d) Extracted Circles; (e) Recovered Table State; (f) Predicted Table State After Shot; (g) Actual Table State After Shot.
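As a rough sketch of the voting scheme, each color space can cast one vote for its best-matching database ball under histogram intersection, with the most-voted identity chosen as the consensus. The function names and data layout here are illustrative assumptions, not the paper's exact comparators:

```python
def hist_intersection(h1, h2):
    # Normalized histogram intersection: 1.0 for identical distributions.
    s = sum(h2)
    return sum(min(a, b) for a, b in zip(h1, h2)) / s if s else 0.0

def identify_ball(ball_hists, database):
    # ball_hists: {color_space: histogram} for one segmented ball region.
    # database:   {ball_id: {color_space: histogram}} from the training phase.
    # Each color space votes for its best-matching ball; the consensus
    # (most-voted) identity is returned.
    votes = {}
    for space, hist in ball_hists.items():
        best = max(database,
                   key=lambda bid: hist_intersection(hist, database[bid][space]))
        votes[best] = votes.get(best, 0) + 1
    return max(votes, key=votes.get)
```

In the real system the stripe/solid partition would happen first, so the vote runs only within the candidate group.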

The identification and localization tasks described above are performed solely by the GVS. In addition, the LVS is used to improve robot positioning accuracy prior to placing a shot, as described in the following section.

III. ROBOT CALIBRATION

Rather than build our own hardware, we made the decision to base the system on standard commercially available, albeit customized, components. The advantage of this approach is that it was relatively inexpensive, quick to deploy, and allowed us to focus our effort on the computational aspects of the problem.

One challenge in using a standard gantry platform was its limited accuracy. In response to the needs of industry, commercial robots tend to be highly precise and repeatable, but not terribly accurate. It is possible to design a gantry robot that has fine-grain accuracy (∼15 µm) over the desired workspace. For example, Coordinate Measurement Machines (CMMs) have such accuracy over similar working volumes. This high degree of accuracy comes at a cost, however, and such a device would be expensive, delicate, and unlikely to maintain accuracy while absorbing the impacts required when placing shots.

A more reasonable approach is to demand less absolute accuracy from the primary positioning device, and to rely upon the vision system for calibration and correction. Robot calibration refers to a collection of techniques that improve upon the positioning accuracy of a repeatable and precise robot. A series of measurements are taken of the robot in a variety of locations using an accurate external measurement sensor. By comparing the robot joint encoder readings with the external sensor readings, a model of the system parameters can be generated, which can then be referenced to more accurately control the positioning of the robot.

In [8], we described a calibration technique that involved the use of both the LVS and GVS cameras. The robot was repeatedly positioned over a series of circular patterns placed on the table surface, and the correspondence between the robot joint encoder values and the centers of the extracted circles within the GVS image was used to determine the functional relationship between the robot coordinate frame and the table plane. The resulting robot positioning error was reduced from the order of centimeters to within 0.6 mm on average, with a standard deviation of 0.3 mm.

Fig. 3. Local Vision System
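The calibration described above fits a mapping from many encoder/table correspondences. As a minimal illustrative stand-in (not the published procedure), an exact 2D affine map can be recovered from just three correspondences via Cramer's rule:

```python
def affine_from_3_points(src, dst):
    """Exact 2D affine map  (x, y) -> (a*x + b*y + c,  d*x + e*y + f)
    from three point correspondences, solved with Cramer's rule.

    src, dst: lists of three (x, y) pairs, e.g. robot encoder readings
    and the corresponding table-plane positions seen by the camera.
    """
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)

    def row(v0, v1, v2):
        # Solve [x y 1] . [a b c]^T = v for one output coordinate.
        a = (v0 * (y1 - y2) - y0 * (v1 - v2) + (v1 * y2 - v2 * y1)) / det
        b = (x0 * (v1 - v2) - v0 * (x1 - x2) + (x1 * v2 - x2 * v1)) / det
        c = (x0 * (y1 * v2 - y2 * v1) - y0 * (x1 * v2 - x2 * v1)
             + v0 * (x1 * y2 - x2 * y1)) / det
        return a, b, c

    return row(*[p[0] for p in dst]), row(*[p[1] for p in dst])

def apply_affine(coeffs, x, y):
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f
```

A real calibration would use many more correspondences and a least-squares fit, which averages out measurement noise rather than interpolating it exactly.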

A. LVS Correction

The LVS can be used to further improve shot accuracy during play by invoking a correction routine once the robot is in position to take a shot. Consider the nearly perfect straight shot illustrated in Fig. 4. In this GVS image, the inscribed line is defined by the centers of the cue and object balls prior to placing the shot. The rendered circles are a sequence of 3 extracted positions of the object ball once the shot has been placed. The centers of the 3 object ball positions fall on (or very close to) the line, which indicates that the robot was positioned such that the shot was a very accurate straight shot. The illustrated final positions of the cue and object balls also fall on this line, which further supports the quality of this shot.

Fig. 4. Perfect Straight Shot

Fig. 5. LVS Correction: (a) Before Correction; (b) After Correction

From the vantage of the LVS, we call this line the ideal line. When the robot is servoed to its shot position, as determined by the GVS, it accumulates error. By analyzing the LVS image, and comparing the line connecting the current cue and object ball centers with the ideal line, it is possible to calculate transformations which can correct for the robot positioning error [14].

An example is illustrated in Fig. 5. In (a), the robot has been servoed to its shot position using only the information from the GVS. The current (red) and ideal (green) lines are not aligned, indicating that there exists positioning error. Part (b) of the figure shows the lines after an automatic alignment procedure has been executed. In this case, the current and ideal lines are identical, i.e. the ball centers intersect with the ideal line, and the shot will therefore be (very close to) a perfect straight shot.
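The comparison between the current and ideal lines reduces to an angular error plus a lateral offset, which the alignment procedure drives toward zero. A minimal geometric sketch (illustrative only, not the published method of [14]):

```python
import math

def line_alignment_error(cue, obj, ideal_p0, ideal_p1):
    """Angular and lateral error between the current cue -> object-ball
    line and the ideal line in the LVS image.

    Returns (angle_error_rad, offset), where offset is the signed
    perpendicular distance of the cue-ball center from the ideal line.
    """
    ang_cur = math.atan2(obj[1] - cue[1], obj[0] - cue[0])
    ang_ideal = math.atan2(ideal_p1[1] - ideal_p0[1], ideal_p1[0] - ideal_p0[0])
    # Wrap the angle difference into (-pi, pi].
    d = (ang_cur - ang_ideal + math.pi) % (2 * math.pi) - math.pi
    # Signed perpendicular distance of the cue ball from the ideal line.
    ux, uy = math.cos(ang_ideal), math.sin(ang_ideal)
    px, py = cue[0] - ideal_p0[0], cue[1] - ideal_p0[1]
    offset = px * -uy + py * ux
    return d, offset
```

An iterative image-based scheme would re-image, recompute these two errors, and command small corrective motions until both fall below tolerance.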

We have developed two different methods to align the robot position with the LVS ideal line [14]. The simpler of the two, called the image-based method, is an iterative method based purely on 2D LVS image data. The second method is called the position-based method, and uses knowledge of the 3D rigid transformation between the robot end-effector coordinate reference frame and the LVS optical frame. This transformation is commonly known as the TCF matrix [15], and is determined off-line in a calibration stage.

Fig. 6. Straight Shot Test Errors

The results of an experiment designed to characterize the performance of these two methods are plotted in Fig. 6. A total of 90 straight shots were executed. Thirty of these shots positioned the robot using information only from the GVS, thirty more applied positional correction using the image-based method, and the final thirty used the position-based method.

The angular error of each shot was calculated by extracting the object ball center locations at a number (at least 2) of positions along their trajectories using the GVS, and comparing the angle of this line with the line defined by the cue and object balls prior to placing the shot (similar to Fig. 4). The angular errors for each of the 90 shots were calculated and are plotted in ascending order in Fig. 6. It can be seen that the use of LVS correction with either method significantly reduces the angular error of the straight shot. Without any correction routine, i.e. using only the GVS for positioning, the mean absolute error was 1.8 degs. With correction, the error was reduced by more than two thirds, to 0.51 and 0.56 degs. for the image- and position-based methods, respectively. While the accuracy is similar for both methods, the position-based method is ∼40% faster, requiring on average 50 secs. per shot, as compared to 83 secs. for the image-based method. Once the straight shot is aligned accurately, the cue can be further rotated around and offset from the cue ball center to execute a cut shot of any desired angle and spin.

IV. PHYSICS MODEL AND SIMULATION

A physics model is required to predict the state of the table after a shot, so that subsequent shots can be planned. The physics of pool has been well investigated in the literature, and a number of good resources are available [16], [17]. Spin is an essential element of the game, and imparting spin on the cue ball by displacing and angling the cue at impact is a technique used to control the interaction and placement of balls following a shot. The physics model therefore involves conservation of not only linear but also angular components of momentum.

We have developed a physics simulator that predicts the outcome of a shot based upon a derived physics model [18], [19], [20]. Unlike physics simulators which use the more common numerical integration approach, our method operates in the continuous domain, predicting the times of occurrence of pending events, such as collisions or transitions between motion states. Our technique returns an exact analytic solution based on a parameterization of the separation of two moving balls as a function of time. The resulting equation is a quartic polynomial which can be solved either iteratively or in closed form to determine the time of collision. A similar derivation exists for other events, such as ball-rail and ball-pocket collisions, and transitions from sliding to rolling and rolling to stationary states.
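To illustrate the event-prediction idea: if each ball follows p + v·t + ½a·t², then the squared separation minus (2r)² is a quartic polynomial in t. The sketch below (an illustration, not the paper's solver) isolates the first positive root by sampling and bisection; the closed-form quartic solution mentioned above works equally well:

```python
def ball_ball_collision_time(p1, v1, a1, p2, v2, a2, r, t_max, steps=10000):
    """First time in [0, t_max] at which two balls of radius r touch.

    Each ball follows p + v*t + 0.5*a*t^2 (2D tuples), so the squared
    separation minus (2r)^2 is a quartic in t. The smallest positive
    root is bracketed by sampling and refined by bisection.
    Returns None if no collision occurs within [0, t_max].
    """
    d = [p1[i] - p2[i] for i in range(2)]
    dv = [v1[i] - v2[i] for i in range(2)]
    da = [a1[i] - a2[i] for i in range(2)]

    def dot(u, w):
        return u[0] * w[0] + u[1] * w[1]

    # Quartic coefficients of |d + dv*t + 0.5*da*t^2|^2 - (2r)^2.
    c = [dot(d, d) - (2 * r) ** 2,   # t^0
         2 * dot(d, dv),             # t^1
         dot(dv, dv) + dot(d, da),   # t^2
         dot(dv, da),                # t^3
         0.25 * dot(da, da)]         # t^4

    def f(t):
        return (((c[4] * t + c[3]) * t + c[2]) * t + c[1]) * t + c[0]

    lo, step = 0.0, t_max / steps
    for i in range(1, steps + 1):
        hi = i * step
        if f(lo) > 0.0 and f(hi) <= 0.0:  # separation crosses 2r
            for _ in range(60):           # bisection refinement
                mid = 0.5 * (lo + hi)
                if f(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo = hi
    return None
```

The same root-bracketing pattern applies to rail and pocket events, each with its own polynomial.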

The major benefits of this approach over integration are that it is more accurate, requiring no discrete time step, and more time efficient, requiring ∼2 to 3 orders of magnitude fewer computations per shot. The added efficiency is especially important when the physics simulator is used in expanding a game tree, as many different shots may be simulated prior to making a decision, sometimes tens of thousands or more.

An example of the utility of the physics simulator is illustrated in Fig. 2(e)-(g). Part (e) shows the current table state resulting from the image in (a), (f) shows the predicted table state after a shot is executed, and (g) shows the actual table state after the shot. It can be seen that the predicted and actual states are quite close, which supports the notion that the simulation can be used to predict future table states resulting from potential shots, which can in turn be used to strategically evaluate the relative utilities of such shots.

The physics simulator that we developed was used as the basis for the Computational 8 Ball Tournaments at the 10th and 11th International Computer Olympiads [21], [22], [23], [24], [25]. This tournament allowed teams to develop different strategy engines and compete in simulation using the common physics simulator. One consideration in modeling the physics was noise. When a human or robotic player takes a shot, there is error in the cue's position and velocity, which makes each shot non-ideal. To make the simulation more realistic (and make the competition more challenging), it was necessary to add zero-mean random Gaussian noise to each of the five shot parameters. The sigma values of each distribution were empirically determined so as to cause a missed shot on average every 10 shots, which is a similar success rate to advanced human play. When planning a shot for robotic play, a noise model based upon the calibrated positioning accuracy of the robot can be used to determine the probability of success of a given shot.
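Perturbing the five shot parameters with zero-mean Gaussian noise is straightforward; the parameter names below follow the (a, b, θ, φ, V) convention used later in this paper, and the dict layout is an illustrative assumption:

```python
import random

def perturb_shot(params, sigmas, rng):
    """Add zero-mean Gaussian noise to the five shot parameters.

    params / sigmas: dicts keyed by 'a', 'b', 'theta', 'phi', 'V'
    (cue offsets, cue angles, and striking speed); rng is a
    random.Random instance so experiments are reproducible.
    """
    return {k: v + rng.gauss(0.0, sigmas[k]) for k, v in params.items()}
```

Setting all sigmas to zero recovers the ideal "zero noise" player used in the tournaments described below.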

Fig. 7. Pool Game Tree

V. STRATEGY

There is a significant amount of strategy involved in playing pool, and professionals are known to plan 5 or more shots ahead for a given table state. For a robotic system to play competitively, it is therefore necessary to strategize computationally, which involves the interplay of both a physics simulator and a search method.

Approaches to shot planning for computational pool have included the use of fuzzy logic [6] and grey decision making [7]. Our approach is based upon the minimax game tree that is used in games like chess and checkers [25]. While the basic concept is the same as in chess, there is a significant difference when considering pool, in that pool is played in a continuous, rather than a discrete, domain. The size of the search space for any particular shot is therefore truly infinite, rather than the huge but finite search space of chess.

Another consideration in pool is shot noise. There are five parameters that dictate the outcome of a particular shot: 2 angles (θ, φ), 2 offsets (a, b), and the striking speed V [19]. In practice, each of these parameters has an element of uncertainty which can be modeled as a probability distribution. For this reason, we have adapted the *-Expectimax formalism, which has been applied to games like backgammon that have a probabilistic component [26]. The stochastic nature of the game of pool motivates the use of a tree structure similar to the *-Expectimax tree to account for uncertainty in shot outcomes due to the noise model. Since pool is played in a continuous domain, the chosen tree search algorithm incorporates statistical sampling in the form of Monte Carlo simulation in order to account for uncertainty in shot execution and weight the value of future table states by their probability of occurrence.

The game tree structure for the pool strategy algorithm is illustrated in Fig. 7. A given table state is represented by a State node, denoted by St_i. A State node St^j_i at depth j has an infinite number of possible shots, with a subset of n shots chosen by a shot generation algorithm. Each of these shots is represented by a child Shot node, denoted by Sh^j_k, k = 1...n. The shot itself is represented by the arc of the tree between a State node and a Shot node, and is annotated by its set of five shot parameters P^j_k = {a, b, θ, φ, V}.

A player's turn begins at the root State node St^0. The shot generator chooses to explore a set of n shots, each of which arcs to a child Shot node Sh^0_k, k = 1...n. At each Shot node, a Monte Carlo simulation is performed to generate that Shot node's child State nodes. For N_δ samples, the shot parameters P representing the Shot node are perturbed by the addition of zero-mean random Gaussian noise. A sampling of possible outcomes of the shot is obtained from the simulations using the perturbed shot parameters, and the resulting table states are added to the tree as child State nodes of the Shot node. A child State node of a Shot node results in the player either continuing the turn and shooting again after successfully pocketing an object ball, losing the turn after failing to pocket an object ball, fouling and surrendering ball-in-hand, or winning the game by pocketing the 8-ball.

The child State nodes are recursively visited using the same process down to the desired search depth. One ply, or layer, of the tree consists of a layer of State nodes and their child Shot nodes. A depth 2 tree, for example, contains a root State node, a layer of depth 0 Shot nodes, a layer of depth 1 State nodes, a layer of depth 1 Shot nodes, and finally a layer of depth 2 State nodes at the leaves of the tree. In other words, the depth of the tree refers to the number of shots, or resulting table states, ahead in the game that are visited by the tree.

One difference between *-Expectimax for Backgammon and its adaptation for pool follows from the difference in turn ordering between the two games. In Backgammon, players strictly alternate turns at each dice roll, so the *-Expectimax tree contains alternating layers of nodes, with one layer representing the player's turn and the next layer representing the opponent's turn. In contrast, a player's turn in pool continues, and they keep shooting, so long as an object ball is legally pocketed at each shot. To accommodate this difference, the search tree for pool is recursively expanded by preorder traversal as long as a player's shot is legal and successful. The traversal is terminated either when the specified search depth is reached, or at a leaf node. By definition, a leaf node is reached either when a player loses his turn (by failing to pocket an object ball or by fouling), or when a player pockets the 8-ball to win the game. At the leaf nodes of the tree, an evaluation function returns a numerical score which is propagated back up the tree.

The main difference between the scheme described above and *-Expectimax for turn-based games is that, in the case of pool, the opponent's State nodes are not expanded and explored further. In the tree search for pool, no forward modeling of an opponent's possible shots is performed, and the advantage given to an opponent at the loss of a player's turn is estimated by calculating the score of that leaf State node from the opponent's perspective. In Backgammon, enumerating all of a player's possible moves is trivial because the game space is discrete. In pool, however, an opponent's shot is specified by five continuous parameters, which makes enumerating all of an opponent's possible shots impossible. Even if the strategy algorithm used its own "common sense" rules to explore a few of an opponent's expected shots, there is no guarantee that the opponent will select any of those shots, because the opponent's shot selection algorithm will likely be a different program. For example, a shot discovery-style shot selection algorithm would almost certainly fail to predict the shot chosen by an opponent's shot specification-based selection algorithm. Therefore, no time is spent modeling an opponent's potential shots by expanding a leaf State node, because such an expansion is fairly likely to be an inaccurate projection and therefore wasteful. This adaptation of *-Expectimax also simplifies the search algorithm and results in a more compact tree and a faster shot selection time, because far fewer nodes are added to the tree.

At a given node in the tree, the scores of each of the child nodes are combined numerically using averaging and probabilistic weighting. The combined score is propagated back up the tree to the root, where the shot leading to the maximum leaf score is selected. Different schemes for averaging and weighting the scores of child nodes have been explored and compared [27].
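The expansion-and-backup scheme above can be sketched as a small recursive search. The generate_shots, simulate, evaluate, and perturb callables are placeholders for the components described in this section, and the backup here is the plain Monte Carlo averaging variant:

```python
def best_shot(state, depth, generate_shots, simulate, evaluate, perturb, n_samples):
    """One-player Monte Carlo search over the pool game tree (a sketch
    of the *-Expectimax adaptation described above).

    simulate(state, shot) -> (new_state, still_shooting). A node becomes
    a leaf when the shooter loses the turn, so opponent states are scored
    directly by evaluate() rather than expanded further.
    """
    def shot_value(s, shot, d):
        # Monte Carlo sampling over perturbed executions of one shot.
        total = 0.0
        for _ in range(n_samples):
            child, still_shooting = simulate(s, perturb(shot))
            # Expand only while the player keeps the table.
            total += state_value(child, d - 1) if still_shooting else evaluate(child)
        return total / n_samples

    def state_value(s, d):
        shots = generate_shots(s)
        if d <= 0 or not shots:
            return evaluate(s)
        return max(shot_value(s, shot, d) for shot in shots)

    return max(generate_shots(state),
               key=lambda shot: shot_value(state, shot, depth))
```

With noise-free simulation and n_samples = 1, this degenerates to an ordinary depth-limited max search, which is useful for testing.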

A. Empirical Evaluation of Strategic Play

To explore the benefits of strategic play in pool, we have executed a set of experiments using the above described tree search framework. The performance of several algorithmic variations was quantified by playing a series of computational Eight Ball tournaments. There were 19 competitors in the tournament, all with identical shot generation algorithms. Eighteen of the competitors used different tree search depths, tree scoring variations, and evaluation function variations. The 19th player used a "greedy" shot selection algorithm and chose its shots based solely on the probability of success of the current shot, with no regard for the resulting table state or future shots.

Three tournaments were played in this format, with a different noise model in each tournament. The different noise models used reflect the technical skill (precision in executing shots) of the players involved. One tournament used a "zero" noise model, where no noise was added to a player's shot. The second tournament used the "low" noise model and modeled human players with very strong technical skill who missed relatively few shots and had good control of the cue ball placement, while the third tournament used the "high" noise model and modeled human players with less technical skill who missed more shots and had less control of the cue ball placement. Since all players involved in each tournament used the same noise model, the results of a given tournament show the performance versus search depth and tree scoring/evaluation function variant. Comparing the results of the three tournaments illustrates how beneficial a given search depth and tree scoring/evaluation function variant is for a player of a certain technical skill level.

The situation is similar to comparing two human players by categorizing their play in two areas: technical skill (precision in executing shots) and level of strategic play

TABLE I
SUMMARY ACROSS SEARCH DEPTHS FOR ZERO, LOW, AND HIGH NOISE TOURNAMENTS

Noise | Player      | Avg. wins (%) | Avg. pts. | Avg. pt. diff. | Avg. miss (%) | Avg. BIH (%)
------+-------------+---------------+-----------+----------------+---------------+-------------
zero  | greedy      |           9.9 |     771.6 |        -1093.3 |           0.0 |         10.3
zero  | all depth 1 |          61.1 |    1390.8 |          278.4 |           0.0 |          2.5
zero  | all depth 2 |          79.0 |    1622.5 |          814.9 |           0.0 |          0.4
low   | greedy      |          19.9 |     963.1 |         -791.0 |           6.3 |         12.0
low   | all depth 1 |          62.7 |    1458.8 |          323.9 |           2.6 |          3.6
low   | all depth 2 |          67.4 |    1523.9 |          467.1 |           1.6 |          3.4
high  | greedy      |          36.5 |    1301.7 |         -314.6 |          11.8 |         14.3
high  | all depth 1 |          54.8 |    1484.4 |          114.2 |           8.9 |         10.4
high  | all depth 2 |          58.7 |    1519.9 |          200.4 |           9.3 |          9.0

(how far ahead in the game the player looks and how the player controls the cue ball position for the next shot). A human player with relatively low technical skill (or, a strategy algorithm in a computational tournament with relatively high σ values for the noise model) will not play well against any player, no matter how strategically they play (or, how deep the strategy searches in the game tree). Similarly, a human player with very high technical skill (or, an algorithm in a tournament with low σ values for the noise model) will probably not play as well as a player with equally high technical skill who has a greater strategic sense for the game (or, an algorithm that searches deeper in the game tree). In analyzing a player’s performance, it is important to understand which factor limits their overall competitiveness: technical skill or search depth.

A variety of combinations of tree scoring variations (Monte Carlo, Probabilistic, or Success-weighted) and evaluation function variations (Average, Max, or Weighted) were examined, for tree search depths of both 1 and 2. Within each tournament, the players with a common search algorithm/evaluation function (but varying search depth) played 200-game matches against one another and against the greedy player in a round-robin format. Games were scored as follows: the winning player was awarded a total of 10 points, and the losing player was awarded 1 point for each ball of their colour group that was pocketed by the end of the game (i.e. a player pocketing all of their balls but losing on the Eight ball would score 7 points). The match score was simply the sum of all a player’s points from the games in the match.
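The scoring rule above can be transcribed directly; this is a straightforward sketch of the rule as stated, assuming a standard seven-ball colour group.

```python
def game_points(won, balls_pocketed):
    """Score a single Eight Ball game under the tournament rules described:
    the winner receives 10 points; the loser receives 1 point per ball of
    their colour group pocketed by the end of the game (at most 7)."""
    return 10 if won else balls_pocketed

def match_points(game_results):
    """A match score is the sum of a player's points over all games.
    `game_results` is a list of (won, balls_pocketed) tuples."""
    return sum(game_points(won, balls) for won, balls in game_results)
```

Note that this scoring rewards a losing player for progress made, so a match score separates a narrow loser from one who rarely pocketed a ball.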

Although the shot generation routine is capable of identifying direct, bank, combo, kick and safety shots, for all players in all three tournaments the shot generator was configured to return only direct shots. If a player found itself in a position from which no direct shots were available, no attempt at a more complicated shot such as a bank or kick shot was made, and a safety shot was never attempted. In these cases the turn was passed directly to the opponent in the form of ball-in-hand. This format was used to highlight the effects of search depth, search algorithm and evaluation function variant on the players’ success. By allowing only direct shots, the importance of cue ball positioning for the next shot is more pronounced, which allows an easier interpretation and comparison of the various search depth/search algorithm/evaluation function variants. The percentage of shots resulting in ball-in-hand indicates not only how often a player fouled, but more importantly how often the player left itself with no shot. The greedy player was more heavily penalized by this setting, since it never considered the table state resulting from its chosen shot.

The results from these experiments are summarized in Table I. The players are ranked by their overall performance, averaging the percentage of games won, points scored, point differential, miss rate, and percentage of shots resulting in ball-in-hand (BIH). In the zero-noise tournament, the deeper-searching players consistently outplayed their shallower-searching competitors. For a given search type/evaluation function variant, the depth 2 player always defeated the greedy player easily, and then defeated the depth 1 player in turn. The greedy player was defeated in all matches in the zero-noise tournament, winning at best 16.5% of its games in its match against one player. Against the greedy player, all of the depth 2 players scored more wins with a higher point differential than the corresponding depth 1 player.

Positional play in the form of look-ahead is clearly an important consideration in pool. Choosing the easiest shot, or the shot with the highest probability of success, does not result in a competitive player; planning strategically using look-ahead does. These results mirror the expectation for human players similarly characterized by technical skill and level of strategic reasoning. A player is always limited by their technical skill, regardless of how strategically they plan shots. However, for sufficiently skilled players, the benefits of strategic reasoning and cue ball placement in the form of look-ahead always dominate over less strategic play. Whereas these experiments have evaluated look-ahead only to a depth of 2, we expect that the benefits of look-ahead would continue to be apparent for search depths up to 8, at which point all game tree branches will have terminated, i.e. all balls will have been sunk and the game completed.

VI. D ISCUSSION

A. Advantages of Machine Play

In a number of ways, pool is an ideal game for automation. At its core, pool is a game of accuracy, and a great deal of human pool instruction and practice is oriented toward establishing an accurate and repeatable stroke. Machines routinely outperform humans at positioning accuracy and repeatability, and unlike humans, they do so consistently, without the performance-degrading effects of muscle fatigue. Machines also have the advantage of not being susceptible to psychological pressure, which is a significant source of variation and failure in human play.

Another advantage of machine play is the ability to sense the absolute metric locations of the balls in the table coordinate reference frame. Humans can develop a perception of the geometric arrangement of the balls based upon their relative positions on the table. This perceptive ability is often quite impressive, allowing humans to plan and execute challenging shots with little margin for error. There are, however, certain situations where it is difficult for even skilled humans to perceive the correct angles. For example, shots which involve multiple banks are inherently difficult to perceive, and humans often make use of inexact systems involving table landmarks (e.g. diamonds) to augment their perception. In contrast, the machine resolves the metric position, within the measurement accuracy, of all balls and elements of the table (i.e., rails and pockets). This ability allows for more exact geometric planning, and ultimately improves the ability to predict the outcome of a shot.

Another advantage of the machine is its ability to computationally simulate the physics of the table. The majority of human players rely on an intuitive understanding of the underlying physics of the system. Typically with little or no formal knowledge of physics, human players develop heuristics to predict the subsequent table state that results from the multiple interactions of any particular shot. While often useful, these heuristics estimate the physics of the system with limited fidelity. In contrast, the machine has the benefit of an executable physics model and, so long as a handful of parameters have been estimated through calibration, can make use of a physics simulator to predict the resulting table state both accurately and efficiently.
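As a minimal example of the kind of closed-form prediction an executable physics model permits, consider a ball rolling with constant frictional deceleration. The sketch below uses the standard constant-deceleration kinematics; the rolling-friction coefficient is an assumed calibration parameter, not a value from Deep Green's simulator.

```python
# Closed-form prediction of where and when a rolling ball comes to rest,
# assuming constant rolling-friction deceleration a = mu * g.
G = 9.81            # gravitational acceleration, m/s^2
MU_ROLLING = 0.01   # assumed rolling-friction coefficient for the cloth

def roll_distance(v0, mu=MU_ROLLING):
    """Distance (m) a ball rolling at speed v0 (m/s) travels before
    stopping, from v^2 = v0^2 - 2*mu*g*d with v = 0."""
    return v0 ** 2 / (2.0 * mu * G)

def roll_time(v0, mu=MU_ROLLING):
    """Time (s) until the ball stops, from v = v0 - mu*g*t with v = 0."""
    return v0 / (mu * G)
```

An event-based simulator such as that of [18], [19], [20] chains predictions of this kind (motion transitions, collisions) rather than integrating the dynamics step by step.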

The cue end-effector has an advantage over a human in that it provides very precise control of the speed of the stroke. The electrical linear actuator responsible for the forward motion of Deep Green’s stroke has a dedicated digital control unit which can be commanded in either position or velocity mode. The speed of the cue can range from almost stationary to ∼3 m/sec., with an average error of ∼0.1%. Compare this with a human, who tends to strike with one of 6 speeds (slow, medium-slow, medium, medium-fast, fast, and break speed). The finer gradation in the control of the speed of the cue translates to an increased ability to place the cue ball and to predict and control the table state.

Once the mechanics of placing a shot have been mastered, the game of pool becomes one of strategy, and here too the machine has a potential advantage. The essence of pool strategy is the ability to look ahead and predict the state of the table following a potential shot, or a series of potential shots. This is the same capability that allows computers to outperform humans at chess and other games previously believed to be approachable only by humans. There are aspects of pool which differentiate it from most board games. For example, pool balls can be positioned anywhere on the continuous surface of the table, whereas chess pieces are placed only at a small finite number of discrete locations. The search space of pool is therefore truly infinite, as compared to the huge but finite search space of chess. We have found that search-tree constructs similar to those successfully applied to chess play can be used effectively for shot planning in pool [25], [22], [27].
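One standard way to make the continuous shot space searchable is to sample it, drawing candidate shot parameters and keeping only the feasible ones. The sketch below is a generic Monte Carlo sampler under assumed parameter ranges (the ∼3 m/s upper speed bound is taken from the text; the angle range and the `feasible` predicate are placeholders), not the shot generator of any of the cited systems.

```python
import random

def sample_shots(n, rng=random, feasible=lambda shot: True):
    """Draw n candidate shots from a continuous parameter space and
    return those passing the feasibility test (e.g. a legal contact
    exists). Parameter ranges are illustrative assumptions."""
    shots = []
    while len(shots) < n:
        shot = {
            "angle": rng.uniform(0.0, 360.0),  # aiming angle, degrees
            "speed": rng.uniform(0.1, 3.0),    # cue speed, m/s
        }
        if feasible(shot):
            shots.append(shot)
    return shots
```

Each sampled shot can then be simulated and scored, turning the infinite continuous space into a finite set of tree branches.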

Unlike soccer, decisions in pool are made only when all balls have come to rest. This removes real-time planning constraints and eases real-time sensing requirements.

B. The Need for Intelligence

As we have pursued the development of Deep Green, we have received three distinct views on the degree of difficulty required to attain our goal. One view that has been expressed is that it is a trivial task, requiring only standard robotic techniques to provide a solution. The polar opposite view is that playing pool well is a distinctly human activity, requiring human intelligence and skill, and that automating pool is an impossible task. In general the first view tends to be held by people who are familiar with technology but unfamiliar with pool, whereas the second view is often put forward by those who themselves are proficient pool players, but who have no special relationship with technology. Our own view lies somewhere between these two extremes. We believe that developing a robotic system to play pool competitively against a proficient human opponent is a challenging task that is nevertheless achievable through dedicated research. The technical problems are both interesting and sufficiently challenging to motivate advanced research, but not so difficult as to evade a meaningful solution.

Another sentiment that has been expressed questions whether computational intelligence is needed for robotic pool at all: is it not sufficient to have a very accurate positioning system and simple shot planning based purely on geometry? In this paper, we have addressed this question with two arguments. The first is that accurate positioning of a standard gantry robot is itself a challenging goal, requiring sensor-based CI for calibration and correction. The second is that, even if perfectly accurate positioning were possible, it is still advantageous to play strategically and plan ahead a number of shots, as evidenced by our experiments with zero-noise tournaments.

There are a number of future research topics that will need to be addressed to advance the system further. Deep Green currently plays at a good amateur level: it has pocketed runs of 4 consecutive balls, and we feel that it is a matter of time before it advances to the stage where it can consistently run the table. The final challenges will be faced when competing against proficient human opponents. To put it mildly, we humans are crafty competitors, able to very efficiently recognize and exploit weaknesses in our opponents. To play at a professional level will require incorporating machine learning and opponent modeling techniques into our system.

ACKNOWLEDGEMENTS

We would like to thank the Institute for Robotics and Intelligent Systems, Precarn Inc., the Canada Foundation for Innovation, and NSERC for their financial support. We would also like to thank the many students who have contributed long hours to the development of this system.

REFERENCES

[1] Robocup, http://www.robocup.org, 2006.
[2] AAAI, http://www.aaai.org, 2006.
[3] Trinity College, http://www.trincoll.edu/events/robot, 2006.
[4] S. W. S. Chang, “Automating skills using a robot snooker player,” Ph.D. dissertation, Bristol University, 1994.
[5] M. E. Alian, S. Shouraki, M. Shalmani, P. Karimian, and P. Sabzmeydani, “Roboshark: A gantry pool player robot,” in ISR 2004: 35th Intl. Sym. Rob., 2004.
[6] S. Chua, E. Wong, A. W. Tan, and V. Koo, “Decision algorithm for pool using fuzzy system,” in iCAiET 2002: Intl. Conf. AI in Eng. & Tech., June 2002, pp. 370–375.
[7] Z. Lin, J. Yang, and C. Yang, “Grey decision-making for a billiard robot,” in IEEE Int. Conf. Sys. Man Cyb., 2004, pp. 5350–5355.
[8] F. Long, J. Herland, M.-C. Tessier, D. Naulls, A. Roth, G. Roth, and M. Greenspan, “Robotic pool: An experiment in automatic potting,” in IROS 2004: IEEE/RSJ Intl. Conf. Intell. Rob. Sys., 2004, pp. 361–366.
[9] J. Lam, F. Long, G. Roth, and M. Greenspan, “Determining shot accuracy of a robotic pool system,” in CRV 2006: 3rd Can. Conf. Comp. Rob. Vis., June 2006.
[10] L. Larsen, M. Jensen, and W. Vodzi, “Multi modal user interaction in an automatic pool trainer,” in ICMI 2002: 4th IEEE Intl. Conf. Multimodal Interfaces, Oct. 2002, pp. 361–366.
[11] F. Long, “Techniques for planar metric rectification,” Master’s thesis, Queen’s University, 2005.
[12] M. J. Swain and D. H. Ballard, “Color indexing,” Int. Jour. Comp. Vis., vol. 7, no. 1, pp. 11–32, 1991.
[13] I. Fraser and M. Greenspan, “Color indexing by nonparametric statistics,” Image Analysis and Recognition: Lect. Notes on Comp. Sci., vol. 3656, pp. 694–702, Sept. 2005.
[14] J. Lam, “Eye-in-hand visual servoing to improve accuracy in pool robotics,” Master’s thesis, Queen’s University, 2007.
[15] Y. C. Shiu and S. Ahmad, “Calibration of wrist-mounted robotic sensors by solving homogeneous transformation equations of the form AX=XB,” IEEE Transactions on Robotics and Automation, vol. 5, no. 1, 1989.
[16] R. Shepard, Amateur Physics for the Amateur Pool Player. Self-published, 1997.
[17] D. Alciatore, The Illustrated Principles of Pool and Billiards. Sterling, 2004.
[18] W. Leckie and M. Greenspan, “An event-based pool physics simulator,” 11th Advances in Computer Games: Lecture Notes on Computer Science, no. 4250, pp. 247–262, 2006.
[19] ——, “Pool physics simulation by event prediction 1: Motion transitions,” Intl. Comp. Gaming Ass. Journal, vol. 28, no. 4, pp. 214–222, Dec. 2005.
[20] ——, “Pool physics simulation by event prediction 2: Collisions,” Intl. Comp. Gaming Ass. Journal, vol. 29, no. 1, pp. 24–31, Mar. 2006.
[21] J.-P. Dussault and J.-F. Landry, “Optimization of a billiard player - position play,” 11th Advances in Computer Games: Lecture Notes on Computer Science, no. 4250, pp. 263–272, 2006.
[22] M. Smith, “Running the table: An AI for computer billiards,” in AAAI 2006: The 21st Nat. Conf. on AI, July 2006.
[23] M. Greenspan, “UofA wins the pool tournament,” Intl. Comp. Gaming Ass. Journal, vol. 28, no. 3, pp. 191–193, Sept. 2005.
[24] ——, “PickPocket wins pool tournament,” Intl. Comp. Gaming Ass. Journal, vol. 29, no. 3, pp. 153–156, Sept. 2006.
[25] W. Leckie and M. Greenspan, “Monte Carlo methods in pool strategy game trees,” in 5th Intl. Conf. Comp. Games, May 29–31, 2006.
[26] T. Hauk, M. Buro, and J. Schaeffer, “*-Minimax performance in backgammon,” 2004.
[27] W. Leckie, “Continuous-domain stochastic search for computational and robotic pool,” Master’s thesis, Queen’s University, 2007.