Robotic Minimally Invasive Surgical Skill Assessment based on Automated Video-Analysis Motion Studies

Seung-Kook Jun, Madusudanan Sathia Narayanan, Priyanshu Agarwal, Abeer Eddib MD, Pankaj Singhal MD, Sudha Garimella MD and Venkat Krovi, Member, IEEE

The Fourth IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics, Roma, Italy. June 24-27, 2012. 978-1-4577-1200-5/12/$26.00 ©2012 IEEE

Abstract: Assessment of surgical skill, arising from the synthesis of the cognitive and sensorimotor capabilities of the surgeon, has predominantly been a subjective task. Development of quantitative metrics of performance with clinical relevance and other desirable characteristics (repeatability and stability) has always lagged behind. New opportunities for objective and automated assessment frameworks have arisen by virtue of technological advances in computation, video processing, and data acquisition, especially in the robotic Minimally Invasive Surgical (rMIS) realm. Most efforts focus on semi-quantitative (Likert scale) metrics or on inadequately validated, spatially or temporally aggregated quantitative metrics derived from direct physical measurements. In this work we propose an automated method for evaluating surgical expertise by adapting well-established motion-study methodologies specifically for MIS evaluation. This method relies on segmenting a primary task into sub-tasks, which can be evaluated by statistical analyses of micromotions. Motion studies were developed by two methods: (A) a manual annotation process by experts (to serve as a benchmark); and (B) automated kinematic analysis of videos, for economy, repeatability as well as dexterity. The da Vinci SKILLS simulator was used to serve as a uniform testbed. Surgeons with varied levels of expertise were recruited to perform two representative simplified tasks (Peg Board and Pick & Place). The automated kinematic analysis of video was compared with the ground truth data (obtained by manual labeling) using the misclassification rate and the confusion matrix of true classifications. Future studies aimed at analyzing real surgical procedures are already underway.

I. INTRODUCTION

Surgical proficiency entails a merger of sensorimotor and cognitive capabilities, and its systematic assessment has been a topic of considerable importance. In the current 'See One, Do One, Teach One' paradigm [1], novice clinicians typically learn to perform procedures by observing more experienced personnel actually performing them [2, 3]. The biggest challenges to assessment and accreditation of surgeons include (i) creating appropriately rich and diverse clinical settings (real or virtual); and (ii) developing uniform, repeatable, stable, verifiable performance metrics; both at manageable financial levels for ever-increasing cohorts of trainees.

*This work was supported in part by the National Science Foundation under Grants CNS 0751132 and CNS 1135660.

S.-K. Jun, M. Sathianarayanan and P. Agarwal are PhD candidates in the Dept. of Mech. & Aero. Engg. at SUNY Buffalo, NY 14260 USA (e-mails: [email protected], [email protected], [email protected]).

A. Eddib, M.D., is the OB-GYN Robotic Fellow at Millard Fillmore Suburban Hospital, while P. Singhal, M.D., is a Director of the Robotic Surgery Division for the Kaleida Health System, in Buffalo, NY 14221 USA (e-mail: [email protected], [email protected]).

S. Garimella, M.D., is a Clinical Asst. Prof. in Pediatric Nephrology, SUNY Buffalo, NY 14260 USA (e-mail: [email protected]).

V. Krovi is an Associate Professor with the Dept. of Mech. & Aero. Engg., SUNY Buffalo, NY 14260 USA (e-mail: [email protected]).

Due to significant diversity in training, surgical assessment can occur in different settings, ranging from operations on live patients, to surrogate phantoms (cadavers, animal models and plastic mannequins), to, most recently, simulated/virtual environments. Current practice leverages an apprenticeship model and entails subjective or at best semi-objective (Likert-scale) evaluation of surgical performance by an expert surgeon [4, 5]. However, as Satava [6] notes, the concomitant revolutions of objective assessment of procedural skills and the transition from an apprenticeship-based to a criterion-based model are transforming medical training. Over the past decade, the ACGME (Accreditation Council for Graduate Medical Education) [7] has espoused development of a cost-efficient proficiency-based curriculum, with an emphasis on simulation methodologies and quantitative skills-assessment tools, to bypass the limitations of the current apprenticeship-based system. The growth of computer integration and data acquisition in the form of robotic Minimally Invasive Surgery (rMIS) now offers a unique set of opportunities to comprehensively address this situation.

Quantitative metrics of surgical performance are not only critical for distinguishing novice from expert but can also provide the graded basis for expertise necessary to monitor performance. In conjunction with a well-structured curriculum, they would form the core of a comprehensive training program. While quantitative metrics are clearly superior to subjective assessment, it is unclear WHICH data, at WHAT spatial and temporal resolution, need to be collected from the vast choice of possible physical measurements. It is the improper understanding of the underlying relationships, coupled with insufficient computational support, that has led to the present system of quantitative metrics, comprising relatively simplistic (spatially and temporally aggregated) measures such as total time to task completion (TTC) [8-10] and total tool path length (TPL) [11-13]. Even the commercial surgical simulators use only these metrics. For example, the da Vinci Skills simulator (dVSS) uses an MScore™ evaluation system to evaluate robotic surgical expertise [14]. The principal issue in such a case is that the reliability, stability and relevance of these measures have not been established [11-13], creating lingering questions about the utility of the evaluation system to support a pathway to accreditation [7, 15].

Clearly, the need of the hour is a clinically relevant scenario coupled with an automated, quantitative skill assessment system that would enable seamless skill evaluation with minimal manual (expert surgeon) intervention. In this work we propose to adapt and study a variant of motion analysis, used frequently for evaluation of skill and efficacy within the industrial engineering community, for use in the context of robotic surgeries (RS). Within this larger context, we wish to: (i) leverage the machinery of motion studies and process charts to aid characterization of performance; (ii) retain clinical significance by creating a catalog of basic motion elements; (iii) begin automating the evaluation of such motion studies based on 3D motion-capture techniques; and (iv) examine the self-consistency and reliability of the ensuing qualitative measures to characterize skill levels of the user population. In this manuscript, we focus principally on the first three aspects; work is currently underway to address the fourth.

II. BACKGROUND

A. Current Surgical Assessment

A range of techniques exist today based on manual assessment of the technical skills of surgeons, such as: (a) procedure lists with logs; (b) direct observation; (c) direct observation with criteria; and (d) video-based assessment. Yet they lack consistency and reliability due to the subjective nature of experts' intervention [4]. Robust methods of assessing technical skills are sought to establish the high standards required for modern surgical training programs. These methods should be flexible and effective regardless of the training modality used.

B. Objective Assessment

In recent times, standardized objective methods for assessing technical skills have been introduced and accepted for use in surgical training programs. These include the Objective Structured Assessment of Technical Skills (OSATS) [16, 17] as well as the Objective Structured Clinical Examination (OSCE) that emphasizes the quantitative assessment processes. These methods require hardware (measurement devices) such as the Imperial College Surgical Assessment Device (ICSAD) and the Advanced Dundee Endoscopic Psychomotor Trainer (ADEPT) to perform surgical dexterity analysis. A new generation of virtual simulators now provides the ability to:

(1) Control presentation of stimuli to trainees: Varying sets of exercises with increasing complexity and clinical relevance are available, ranging from relatively simple peg-transfer and suturing exercises to realistic clinical scenarios such as resections and anastomoses [10, 18].

(2) Accurately and transparently monitor user responses: Physical measurements of a variety of quantities (tool motions, completion times) can be made using data-acquisition hardware in a transparent manner [13, 19].

C. Motion Studies

Within industrial engineering practice, motion studies are

a well-established method used to characterize, simplify and

improve the efficiency and effectiveness of manual tasks

[20, 21]. For over a half century now, such motion studies

have been employed to characterize expertise as well as to

eliminate inefficient motions. In this work, we seek to

examine the applicability and usefulness of such a technique

to assess surgical performance and help create a viable

quantitative basis for grounding the training process.

Frank Gilbreth cataloged a set of basic motion elements called "Therbligs" that have served as building blocks of all manual manipulative activities, especially on a factory shop floor. At their core, these elements allow for decomposition of a complex manual job sequence into sub-parts that can be individually examined. This segmentation potentially allows for a finite state automaton representation of a complex activity that could form the discrete basis for linguistic representation as well as fault detection and correction, similar to those explained in [22-24].

However, the challenge remains in achieving the temporal discretization systematically, consistently and automatically in a generalized manner. Current practice is highly reliant on the expertise of the manual annotators. For example, multiple manual annotators may create similar but slightly differing process charts for the same task; the discretization remains unable to distinguish between the natural variability in task performance by the user and the capability of the trained annotator. Video recording of task performance for subsequent manual activity labeling (and reviewing) alleviates some of the variability in the temporal discretization.

D. Automated Video Analysis

While markerless video analysis has been used to study hand gesturing and skill [25-27], tracking unconstrained human motions in concise quantitative terms with the precision and fidelity suitable for motion analysis remains a challenging task. Several recent studies showed that segmenting surgical videos into sub-tasks (defined as surgemes in [24]) and identifying kinematic tool poses [28] can aid in analyzing performance and skill. The basis and requirements of surgical task segmentation have not been dealt with in the necessary detail in most of these studies. Such detail is necessary because these sub-tasks are the building blocks on top of which metrics for skill and expertise will be defined. Nonetheless, the problem remains highly complicated due to the high variability and the dependence on multiple parameters involved in inter- and intra-subject studies.

III. MOTION STUDIES

A. Robotic Surgery Therbligs

In an effort to develop a specialized but well-defined set of Robotic Surgery (RS) Therbligs, we base the development on the already established set of basic motion elements. In select cases, such as 'Use Tool', the action was expanded into alternate classes, specifically 'Cut Tissue', 'Open Tissue', 'Scissors' and 'Cauterize Tissue'. At the same time, several of the original Therblig series were deemed inappropriate and not included (refer to Table 1).

B. Process Charts

Traditional motion studies are captured in a process chart

where basic motion elements (Therbligs) are hierarchically

grouped into work elements and then ultimately into

meaningful tasks [29], which has proved adequate to offer a

primary discretization of industrial manipulation tasks. At its

core, the process chart consists of a table as shown in Fig. 1:

Fig. 1: Process Chart

A description of manipulation-related Therbligs can be

found in many standard textbooks [21, 29]. Enhancements to

this basic process chart now involve taking advantage of

bilateral symmetry (left/ right) or increased discretization or

agglomeration of tasks as well as performing varying levels

of statistical analyses for the collected information. The

principle of motion economy can then be applied to analyze,

assess, simplify and improve the efficiency and effectiveness

of a (hand-manipulable) manual task, as widely used by

industrial engineers. In this work, the same principle is

extended to analyze robotic surgical skill assessment.

C. Kinematic Motion Capture and Analysis

3D motion capture techniques are widely used in the fields of biomechanics [30] and robotics [31] to accurately estimate the kinematics of a system, especially when measurements from onboard sensing are inaccurate or unavailable. Among the many methods, marker-based video motion analysis methods build upon the availability of synchronized video streams to estimate 3D motions of moving objects within the scene. Virtual markers can potentially be introduced via manual annotation (or template-matching image processing techniques) even if physical markers are not explicitly present in the videos. However, a minimum of 2 video streams with a calibrated baseline is required to estimate 3D coordinates with considerable accuracy. In our analysis, the two endoscopic video feeds from the dVSS-Si serve this purpose. In order to calibrate the cameras in 2D and 3D prior to digitizing the videos, inanimate objects present in the video feeds were used. In this work, a motion analysis application (SIMIMotion [32]) was used to analyze the videos and estimate 3D positions, velocities and accelerations (both linear and angular) of tool motions (as will be discussed in Section V). This information served as a basis to devise our algorithms for automated recognition of Therblig elements, which are presented in the following section.
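To make the two-view requirement concrete, the sketch below recovers a 3D point from one matched marker in a stereo pair and then estimates tool-tip speed by finite differences. It is an illustration under stated assumptions, not the procedure used by SIMIMotion: it assumes an ideal, rectified pinhole stereo pair, and the function names and all parameters (`focal_px`, `baseline_mm`, `cx`, `cy`, `dt`) are hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def triangulate_rectified(xl: float, yl: float, xr: float,
                          focal_px: float, baseline_mm: float,
                          cx: float, cy: float) -> Point3D:
    """Recover a 3D point (mm, left-camera frame) from a rectified stereo pair.

    Assumes both cameras share the focal length and principal point (cx, cy)
    and differ only by a horizontal baseline (illustrative assumption).
    """
    disparity = xl - xr  # horizontal pixel shift of the marker between views
    if disparity <= 0:
        raise ValueError("non-positive disparity: no valid depth")
    z = focal_px * baseline_mm / disparity  # depth from similar triangles
    x = (xl - cx) * z / focal_px
    y = (yl - cy) * z / focal_px
    return (x, y, z)


def central_difference_speed(points: List[Point3D], dt: float) -> List[float]:
    """Tool-tip speed (mm/s) at interior samples via central differences."""
    speeds = []
    for i in range(1, len(points) - 1):
        dx, dy, dz = (points[i + 1][k] - points[i - 1][k] for k in range(3))
        speeds.append((dx * dx + dy * dy + dz * dz) ** 0.5 / (2.0 * dt))
    return speeds
```

In practice the calibrated baseline and focal length would come from the camera calibration step described above, and the velocities would typically be low-pass filtered before use.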

TABLE 1: ROBOTIC SURGICAL THERBLIGS

Symbol  Therblig*                 Description
RE      Reach                     Reaching for an object with an empty hand
G       Grasp                     Grasping an object by contacting and closing the fingers of the active hand
M       Move                      Moving an object using a hand motion
RL      Release                   Releasing control of an object
PP      Pre-Position              Positioning and/or orienting an object for the next operation
P       Position                  Positioning and/or orienting an object in the defined location
U       Use (UC, UO, US, UZ)      Manipulating and/or applying a tool in the intended way (UC: cutting tissue, UO: cut-open tissues, US: scissors cut, UZ: cauterize tissues)
A       Assemble                  Joining two parts together to form an assembled entity
DA      Disassemble               Separating multiple components that were previously joined in some way
H       Hold                      Holding an object
SH      Search                    Attempting to find an object using the eyes or hand
SL      Select                    Choosing an object from a group
I       Inspect                   Determining the quality or characteristic of an object
PL      Plan                      Deciding a course of action
AD      Avoidable Delay           Waiting that is within the worker's control
R       Rest to overcome Fatigue  Rest to overcome fatigue
UD      Unavoidable Delay         Wait due to factors beyond the control of the worker

*Italicized: ineffective Therbligs, bold: used in the current study

IV. EXPERIMENTAL SETUP

A. Simulation Platform: Da Vinci Si

In order to benchmark the performance of different surgeons for both intra- and inter-subject comparative analyses, and to evaluate improvements over a period of time, it is desirable to conduct these studies in a relatively controlled and standardized testbed. In this particular study, the dVSS-Si was used along with its SKILLS simulator system [14], as in Fig. 2. In addition, using the dVSS-Si enabled recording of stereoscopic video images for post-processing as each task was being performed. Since our objective was to develop skill assessment methods for a generic surgical robotic device, only the video feeds were used as input to our evaluation scheme.

Fig. 2: Da Vinci SKILLS Simulator [14]

Fig. 3(a) Peg Board (PegI) (b) Pick-N-Place (PnP)

Fig. 4. Manual Therblig Labeling Segmentation Application

B. Subject Recruitment and Task Description

Overall, the experiments were conducted using six subjects with varied levels of expertise (2 experts, 2 intermediates, 2 novices). Though the number of subjects is limited in this study, recruitment of more surgeons is currently underway for validation of our metrics. Two representative but simple simulator tasks were chosen to ensure: (i) that only a subset of the entire set of "Therbligs" is required for our analysis; (ii) tractability of the manual labeling segmentation process; and (iii) the possibility of conducting physically simulated tasks to correlate with this analysis in the future. Each surgeon was assigned to perform two simulated tasks, (i) Pick-and-place (PnP) (as in Fig. 3.a) and (ii) Peg board (PegI) (as in Fig. 3.b), which were available within the SKILLS simulator.

Each task was repeated a minimum of 10 times. Two video sequences were recorded from the dVSS while the tasks were performed, for use in the motion study to estimate 3D kinematic poses as well as velocities by kinematic motion analysis. However, implementing this process to estimate these parameters on-the-fly is also possible.

V. RESULTS

The pick-and-place tasks need only 4 Therbligs: Reach (RE), Grasp (G), Move (M) and Release (RL), while the Peg Board tasks require a total of 5 elements: in addition to the four above, the Hold (H) Therblig is included.

Fig. 5: TTC Histogram for 4 Therbligs of Peg Board Tasks

A. Manual Therblig Analysis

For all cases, two-hand charts in the form of text files were generated using Therblig labeling software developed in our lab (refer to Fig. 4). These data files were then analyzed for each subject, each task and each Therblig based on the distributions of time to task completion. In order to anonymize the subject information, the following symbols were assigned during our analysis: experts (E1 and E2), intermediates (I1 and I2) and novices (N1 and N2). A detailed discussion of manual Therblig analysis, dexterity and defective motion detection, and its applicability in real robotic surgical scenarios is available in [33]; hence, only representative results are shown in Fig. 5 and Table 2. The final results of this study revealed that higher task complexity resulted in improved discrimination of surgical efficacy between experts and novices using this method.
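As a rough illustration of how per-Therblig time statistics of this kind can be computed from a labeled chart, the sketch below groups labeled segments by Therblig symbol and reports the mean and standard deviation of their durations. The segment format `(symbol, start_s, end_s)` and the function name are assumptions made for this example; the text files produced by the labeling software may be organized differently, and whether the reported deviations are sample or population values is not stated (sample standard deviation is assumed here).

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, Tuple

Segment = Tuple[str, float, float]  # (Therblig symbol, start time, end time)


def therblig_duration_stats(segments: Iterable[Segment]) -> Dict[str, Tuple[float, float]]:
    """Return {symbol: (mean duration, sample std of duration)}."""
    durations = defaultdict(list)
    for symbol, start, end in segments:
        durations[symbol].append(end - start)
    # A single observation has no sample std; report 0.0 in that case.
    return {
        symbol: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for symbol, values in durations.items()
    }
```

Running this per subject and per task yields one row of a table in the style of Table 2.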

B. 3D Kinematic Motion Analysis

The SIMIMotion motion capture system was used to analyze the videos and determine the Cartesian trajectories of tool motions over a period of time within the simulated task environment. The only inputs to this step are the videos obtained from the endoscopic cameras available in the dVSS-Si and the manual placement of markers in each frame. The outputs from this step comprise the 3D position and filtered velocity trajectories of the tool tip and the distance between the tool jaws (to indicate whether the tool is in the closed or open state). An example of these for a specific case is illustrated in Fig. 6.

C. Automated Therblig Recognition

Based on the 3D trajectories obtained for each task and each subject, a decision-tree type classification scheme was established to study automated recognition of Therbligs. The primary variables found to be relevant for effective automated determination of these Therbligs are tool position and velocity (translational component only) and tool fingertip distance. In addition, in some cases the image coordinates (pixels) were used as secondary variables to detect object motions in these tasks.

Our classification method relies on threshold values that were adapted (or normalized) for each surgeon depending on their motions; a decision (or classification) is made based on whether the considered variable exceeds this optimized threshold value. The relationship between these thresholds and the corresponding Therbligs is shown in Fig. 7. Using this classification algorithm, automated recognition of the sub-tasks was performed directly from the motion analysis trajectories; a few illustrative results are shown in Figs. 8-11.
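A threshold-based decision tree of this kind can be sketched as follows. This is one plausible reading of Fig. 7, not a reproduction of the authors' algorithm: the branch order, the use of the magnitude of R, and the helper names (`Thresholds`, `classify_sample`, `segment`) are assumptions, and the thresholds x1 to x4 would have to be tuned per subject as described above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Thresholds:
    """Subject-specific decision parameters (x1..x4 in Fig. 7)."""
    x1: float  # on R, magnitude of the rate of change of fingertip distance
    x2: float  # on D, separates closed jaws from open jaws
    x3: float  # on V, tool tip speed
    x4: float  # on D, separates Hold from Avoidable Delay when idle


def classify_sample(r: float, v: float, d: float, th: Thresholds) -> str:
    """Label one time sample with a Therblig symbol (assumed tree layout)."""
    if abs(r) > th.x1:  # jaws are actively opening or closing
        return "G" if d < th.x2 else "RL"
    if v < th.x3:       # tool is nearly stationary
        return "H" if d < th.x4 else "AD"
    return "M" if d < th.x2 else "RE"  # tool is travelling


def segment(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-sample labels into (label, first_index, last_index) runs."""
    runs: List[Tuple[str, int, int]] = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i)
        else:
            runs.append((label, i, i))
    return runs
```

Applying `classify_sample` to every frame of a trajectory and then `segment` to the resulting labels gives a process-chart-like sequence that can be compared against the manual annotation.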

In order to evaluate the performance of this method, misclassification rates and the corresponding confusion matrices were determined for the whole set of tasks and subjects. The grey column in the confusion matrices indicates the percentage (true and false positive) measures of individual Therblig recognition. Based on these values, the misclassification rate is found individually for each task, as in Fig. 12, and is given cumulatively in Table 3 for the proposed automated recognition. Since this is a multi-class problem, the confusion matrix provides a compact way of representing the true and false positive (and also negative) detection rates (refer to Figs. 13 and 14).

Fig. 6: 3D Kinematic Motion Analysis to Study rMIS Skills (a) Combined 3D Tracking with Video Overlay (b) Tool tip trajectory (c) Tool tip distance (d) Tool tip Velocities

TABLE 2: MEANS AND STANDARD DEVIATIONS OF TTC FOR 6 THERBLIGS OF PEG BOARD TASKS

Subject  RE            M             G             RL            H             AD
E1       1.54+/-0.51   1.37+/-0.46   0.23+/-0.16   0.17+/-0.06   1.23+/-0.41   1.11+/-0.69
E2       1.69+/-0.98   1.36+/-0.4    0.16+/-0.06   0.13+/-0.07   0.43+/-0.22   1.19+/-1.03
I1       2.1+/-1.43    2.61+/-0.85   0.27+/-0.42   0.17+/-0.1    1.52+/-1.88   2.92+/-2.29
I2       2.03+/-0.93   1.2+/-0.47    0.49+/-0.56   0.3+/-0.16    2.11+/-2.28   1.01+/-0.42
N1       1.5+/-0.71    1.39+/-0.53   0.36+/-0.19   0.22+/-0.13   1.49+/-0.52   1.66+/-1.09
N2       1.75+/-1.96   1.64+/-0.59   0.53+/-0.28   0.27+/-0.13   1.02+/-0.69   2.92+/-2.88

Fig. 7. Automated Therblig Recognition - Decision Tree Algorithm

[Figure: decision tree over the classes RE, M, G, RL, AD and H, with splits on R against x1, V against x3, and D against x2 and x4; the H branch is annotated "After M, G".]

R: rate of change in fingertip distance; V: tool tip velocity; D: tool tip distance; x1, x2, x3, x4: subject-specific decision parameter thresholds.

TABLE 3: AUTOMATED THERBLIG RECOGNITION RATES

Subject  Pick and Place      Peg Board
E1       86.3%    79.6%      77.8%    80.7%
E2       83.1%    88.7%      74.3%    67.8%
I1       77.4%    72.1%      70.4%    74.9%
I2       49.7%    75.8%      52.3%    76.4%
N1       60.0%    64.2%      63.5%    62.5%
N2       67.2%    72.4%      70.2%    67.9%

It is noted that the automated recognition rate degrades as the expertise of the surgeon decreases. This in turn is found to be an indirect indication of increasing 'random' motions exhibited by subjects with decreasing expertise levels. Expert surgeons appear to demonstrate a distinctive start and pause between Therbligs, with smoother and more rhythmic motions than intermediates or novices, which this method captures subtly. Except for one of the novices, for whom performance was very poor, all other subjects were included in this study.

Taking a closer look at Fig. 13 and Fig. 14, we note that the maximum error is due to false recognition of the avoidable delay Therblig. This is mainly because of its definition, which says that any type of ineffective motion that is within the user's control is categorized as avoidable delay. It therefore covers a wider variety of motions and hence has a higher probability of being falsely recognized as another Therblig by our classification method. By eliminating avoidable delay portions, the recognition performance improved beyond 70% for the same datasets. Thus, the proposed approach provides a useful framework to analyze robotic surgical skills. However, it is not desirable for its performance to depend on skill level, which is a subject of future interest.
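The evaluation quantities discussed in this section can be computed from two aligned label sequences, as in the sketch below. It is illustrative only: the function names are hypothetical, and treating "eliminating avoidable delay portions" as dropping every sample whose ground-truth label is AD is an assumption about how that step was carried out.

```python
from collections import Counter
from typing import List, Sequence


def confusion_matrix(truth: Sequence[str], predicted: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are ground-truth classes, columns are predicted classes."""
    counts = Counter(zip(truth, predicted))
    return [[counts[(t, p)] for p in classes] for t in classes]


def misclassification_rate(truth: Sequence[str], predicted: Sequence[str],
                           ignore: Sequence[str] = ()) -> float:
    """Fraction of samples labeled wrongly, skipping ground-truth classes in `ignore`."""
    pairs = [(t, p) for t, p in zip(truth, predicted) if t not in ignore]
    if not pairs:
        raise ValueError("no samples left to evaluate")
    return sum(t != p for t, p in pairs) / len(pairs)
```

The recognition rate reported per subject is then one minus the misclassification rate, and passing `ignore=("AD",)` gives the variant with avoidable delay excluded.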

Fig. 10. Automated Therblig Recognition Rate for E2 in PegI

Fig. 11. Automated Therblig Recognition Rate for I1 in PegI

Fig. 12. Automated Therblig Recognition Rates for each Task and each

Hand for each User

VI. CONCLUSION

The application of Therblig analysis to rMIS was examined, and a preliminary study to validate the process was discussed. The kinematic motion analysis technique was used to estimate 3D Cartesian trajectories of visible bodies (tools and objects) in the scene. A detailed study was conducted based on two simple simulated tasks (namely, pick and place, and peg board), not only to evaluate the skill of surgeons but also to analyze the performance of our automated decision-tree-based classification algorithm. The performance of automated Therblig recognition was studied using the misclassification rate and confusion matrix, which provided reasonable estimates of the accuracy as well as the sensitivity of our algorithm. This classification method was also found to be applicable to real surgical analyses [33]. Collection of video data from comprehensive subject studies will enable us to augment this classification method under a probabilistic framework as well as achieve more uniform performance across different levels of expertise.

Fig. 8. Automated Therblig Recognition Rate for E2 in PnP

Fig. 9. Automated Therblig Recognition Rate for I1 in PnP

Fig. 13. Confusion Matrix (Recognition Rate) for E2 in Pick and Place

Fig. 14. Confusion Matrix (Recognition Rate) for N1 in Peg Board
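As a reading aid for the confusion matrices, the sketch below shows one common way to turn paired manual and automated labels into per-Therblig recognition rates: each row is a manually labeled Therblig, each column is the label assigned by the classifier, and rows are normalized to sum to one. Row normalization is an assumption about how the rates in the figures were computed, and the function name and sample labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def recognition_rate_matrix(
    truth: Sequence[str], predicted: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Row-normalized confusion matrix: rates[t][p] = share of true-t frames labeled p."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have equal length")
    pair_counts = Counter(zip(truth, predicted))  # counts of (true, predicted) pairs
    row_totals = Counter(truth)                   # number of frames per true label
    labels = sorted(set(truth) | set(predicted))
    return {
        t: {p: pair_counts[(t, p)] / row_totals[t] for p in labels}
        for t in labels
        if row_totals[t] > 0  # skip labels that never occur in the ground truth
    }


# Hypothetical per-frame labels, not data from the study.
truth = ["grasp", "grasp", "grasp", "grasp", "position", "position"]
predicted = ["grasp", "grasp", "grasp", "position", "position", "grasp"]

rates = recognition_rate_matrix(truth, predicted)
for true_label, row in rates.items():
    cells = ", ".join(f"{p}: {r:.2f}" for p, r in row.items())
    print(f"{true_label:>10} -> {cells}")
```

The diagonal entry of each row is the recognition rate (sensitivity) for that Therblig, and the off-diagonal entries show which other Therbligs it is confused with.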

REFERENCES

[1] J. Vozenilek, et al., "See One, Do One, Teach One: Advanced

Technology in Medical Education," Academic Emergency Medicine,

vol. 11, pp. 1149-1154, 2004.

[2] M. Bridges and D. L. Diamond, "The financial impact of teaching

surgical residents in the operating room," The American Journal of

Surgery, vol. 177, pp. 28-32, 1999.

[3] E. Harrison. (2006) The Cost of Surgical Training. Association of

Surgeons of Great Britain and Ireland Newsletter. 4-6.

[4] B. M. Schout, et al., "Validation and implementation of surgical

simulators: a critical review of present, past, and future," Surgical

Endoscopy, vol. 24, pp. 536-546, Mar 2010.

[5] J. A. Aucar, et al., "A Review of Surgical Simulation With Attention to

Validation Methodology," Surgical Laparoscopy Endoscopy &

Percutaneous Techniques, vol. 15, pp. 82-89, 2005.

[6] R. Satava, "Historical Review of Surgical Simulation - A Personal

Perspective," World Journal of Surgery, vol. 32, pp. 141-148, 2008.

[7] N. Brown, et al., "The revised ACGME laparoscopic operative

requirements: how have they impacted resident education?," Surgical

Endoscopy, pp. 1-7, Jan 2012.

[8] M. Pellen, et al., "Construct validity of the ProMIS laparoscopic

simulator," Surgical Endoscopy, vol. 23, pp. 130-139, 2009.

[9] K. Tanoue, et al., "Skills assessment using a virtual reality simulator,

LapSim, after training to develop fundamental skills for endoscopic

surgery," Minimally Invasive Therapy & Allied Technologies, vol. 19,

pp. 24-29, 2010.

[10] P. Kanumuri, et al., "Virtual Reality and Computer-Enhanced Training

Devices Equally Improve Laparoscopic Surgical Skill in Novices,"

Journal of the Society of Laparoendoscopic Surgeons, vol. 12, pp. 219-226, 2008.

[11] J. H. Chien, et al., "Accuracy and speed trade-off in robot-assisted

surgery," The International Journal Of Medical Robotics and Computer

Assisted Surgery, vol. 6, pp. 324-329, 2010.

[12] P. A. Kenney, et al., "Face, content, and construct validity of

dV-trainer, a novel virtual reality simulator for robotic surgery," Urology,

vol. 73, pp. 1288-92, Jun 2009.

[13] M. A. Lerner, et al., "Does training on a virtual reality robotic

simulator improve performance on the da Vinci surgical system?,"

Journal of Endourology, vol. 24, pp. 467-72, Mar 2010.

[14] Intuitive Surgical, Inc. da Vinci Surgical Robot: Si and SKILLS

Simulator. Available: www.intuitivesurgical.com/

[15] D. Wagner and M. L. Lypson, "Centralized Assessment in Graduate

Medical Education: Cents and Sensibilities," Journal of Graduate

Medical Education, vol. 1, pp. 21-27, 2009.

[16] J. A. Martin, et al., "Objective structured assessment of technical skill

(OSATS) for surgical residents," British Journal of Surgery, vol. 84,

pp. 273-278, 1997.

[17] K. Moorthy, et al., "Objective assessment of technical skills in

surgery," BMJ, vol. 327, pp. 1032-1037, 2003.

[18] M. P. Fried, et al., "Identifying and reducing errors with surgical

simulation," Quality and Safety in Health Care, vol. 13, pp. i19-i26,

October 1 2004.

[19] J. D. Hernandez, et al., "Qualitative and quantitative analysis of the

learning curve of a simulated surgical task on the da Vinci system,"

Surgical Endoscopy, vol. 18, pp. 372-378, 2004.

[20] Handbook of Industrial Engineering: Technology and Operations

Management, 3rd ed.: John Wiley & Sons, 2001.

[21] Motion and Time Study: Design and Measurement of Work, 7th ed.

Univ. of California, Los Angeles: Wiley, August 1980.

[22] J. Rosen, et al., "Generalized approach for modeling minimally

invasive surgery as a stochastic process using a discrete Markov

model," Biomedical Engineering, IEEE Transactions on, vol. 53, pp.

399-413, 2006.

[23] J. Rosen, et al., "Markov modeling of minimally invasive surgery

based on tool/tissue interaction and force/torque signatures for

evaluating surgical skills," Biomedical Engineering, IEEE

Transactions on, vol. 48, pp. 579-591, 2001.

[24] H. C. Lin, et al., "Towards automatic skill evaluation: detection and

segmentation of robot-assisted surgical motions," Computer Aided

Surgery: Official Journal Of The International Society For Computer

Aided Surgery, vol. 11, pp. 220-230, 2006.

[25] G. Ye, et al., "Gesture Recognition Using 3D Appearance and Motion

Features," presented at the Conference on Computer Vision and

Pattern Recognition Workshop, 2004.

[26] J. Corso, et al., "Analysis of Composite Gestures with a Coherent

Probabilistic Graphical Model," Virtual Reality, vol. 8, pp. 242-52,

2005.

[27] N. Padoy, et al., "Statistical modeling and recognition of surgical

workflow," Medical Image Analysis, vol. 16, pp. 632-641, 2012.

[28] A. Jog, et al., "Towards integrating task information in skills

assessment for dexterous tasks in surgery and simulation," presented at

the 2011 IEEE International Conf. on Robotics & Automation,

Shanghai, China, 2011.

[29] B. Niebel and A. Freivalds, Methods, Standards, and Work Design,

10th ed.: McGraw-Hill, 2003.

[30] M. S. Narayanan, et al., "Parallel Architecture Manipulators for Use in

Masticatory Studies," International Journal of Intelligent Mechatronics

and Robotics, vol. 1, pp. 100-22, 2011.

[31] Q. Fu and V. Krovi, "Articulated Wheeled Robots: Exploiting

Reconfigurability and Redundancy," in 2008 ASME DSCC Dynamic

Systems and Control Conference, DSCC2008-130, Ann Arbor,

Michigan, USA, 2008.

[32] 3D-Motion-Capture. SIMI-Motion. Available:

http://www.simi.com/en/products/motion/overview/analysis/index.htm

[33] S.-K. Jun, et al., "Evaluation of Robotic Minimally Invasive Surgical

Skills using Motion Studies," in Performance Metrics for Intelligent

Systems Workshop (PerMIS'12), 2012.
