
Biomedical Signal Processing and Control 2 (2007) 199–207

Three-dimensional heart motion estimation using endoscopic monocular vision system: From artificial landmarks to texture analysis

Mickaël Sauvée a,b, Aurélien Noce b,*, Philippe Poignet b, Jean Triboulet b, Etienne Dombre b

a Sinters, 5 rue Paul Mesple, 31000 Toulouse, France
b LIRMM, UMR 5506 CNRS UM2, 161 rue Ada, 34292 Montpellier, France

Received 28 February 2007; received in revised form 2 July 2007; accepted 9 July 2007
Available online 27 August 2007

* Corresponding author. Tel.: +33 4 67 14 85 64; fax: +33 4 67 14 85 00. E-mail address: [email protected] (A. Noce).

doi:10.1016/j.bspc.2007.07.006

Abstract

In robot-assisted beating heart surgery, the motion of the heart surface might be virtually stabilized to let the surgeon work as in on-pump cardiac surgery. Virtual stabilization means physically compensating for the relative motion between the instrument tool tip and the region of interest on the heart surface, and offering the surgeon a stable visual display of the scene. To this end, the motion of the heart must be estimated. This article focuses on motion estimation of the heart surface. Two approaches are considered in the paper. The first one is based on landmark tracking allowing 3D pose estimation. The second is based on texture tracking. Classical computer vision methods, as well as a new texture-based tracking scheme, have been applied to track the heart motion and, when possible, to reconstruct the 3D distance to the heart surface. Experimental results obtained on in vivo images show the estimated motion of heart surface points.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Medical applications; Motion estimation; Vision; Robot control; Spectral analysis

1. Introduction

One of the most widespread interventions in cardiac surgery is coronary artery bypass grafting (CABG). Currently most of these procedures are performed using a heart–lung machine and a stopped heart, which allows the surgeon to achieve complex and fine sutures on a motionless heart surface. However, cardiopulmonary bypass (CPB) has deleterious effects; for instance, a systemic inflammatory response has been observed [16]. To avoid the problems of CPB, solutions for operating on the beating heart should be proposed. Among the proposed solutions, passive mechanical stabilizers are used on the heart surface to cancel the motion (e.g. Octopus™ by Medtronic [8]). The idea is to apply a mechanical constraint on the heart surface to stabilize the working area. To ensure contact between the heart surface and the mechanical device, vacuum suction or suture techniques have been developed. However, residual motion inside the stabilized area still exists: Lemma et al. [9] have observed excursions of up to 2.4 mm in the stabilized area, so the surgeon has to manually cancel the movement of the target coronary artery. Another way to perform beating heart surgery is then to compensate for the organ motion with a robotized assistive device.

Telerobotic systems (e.g. the Da Vinci™ system by Intuitive Surgical [6]) should contribute to minimizing the invasiveness of surgery. They also offer the possibility of compensating for physiological motion. Ideally, the surgeon can thus concentrate on his task, his desired movement being superimposed on the trajectory generated by the motion compensation algorithm. Hence the surgeon can work on a virtually stabilized heart as in on-pump surgery. In this robotized context, the key issues are listed below:

(1) Heart motion estimation: Movement must be extracted from the visual feedback provided by an endoscope of a highly deformable and non-structured surface, which requires the development of specific algorithms. It is intended that the estimated motion will be used in a bio-mechanical model. This model will improve the robustness of the robot control architecture; it will also make it possible to model the displacement of the tracked area.

(2) Control system design: The robot manipulator must compensate for high-bandwidth motions (up to 5 Hz) with high precision (suture tasks are performed on 2-mm diameter vessels). Moreover, stability must be guaranteed for motion both in free space and in constrained space when the robot interacts with the environment.

(3) Visual stabilization: A stable view of the suturing area must be displayed on the surgeon's screen.

Fig. 1. Projection model: frame description.

A first solution was proposed by Nakamura et al. [12], making use of a high-speed camera which tracked artificial landmarks placed on the heart surface. Visual feedback control was used to control a lightweight mini-robot. In vivo experiments performed on a porcine model showed good 2D trajectory tracking, but errors of about 1 mm were observed. More recently, motion estimation based on natural landmarks has been presented by Ortmaier et al. [15], who developed a prediction algorithm based on ECG signals and the respiratory pressure signal to improve the robustness of landmark detection. Nevertheless, the experimental evaluation was restricted to tracking landmarks inside a mechanically stabilized area, and the results were expressed in image coordinates. In ref. [5], active landmarks placed on the heart surface and a laser spot are observed by a 500 Hz video sensor, which makes it possible to compute the distance between the instrument tool tip and the reference surface. This information is processed in an adaptive model predictive controller. In vivo evaluation on a porcine model exhibits motion cancellation with a residual tracking error of up to 1.5 mm. However, due to the large distance between the markers, this estimation does not reflect accurately enough the local behavior of the Region of Interest (ROI). Recently, Cavusoglu et al. [3] proposed to combine biological signals (ECG signal, arterial and ventricular blood pressures) and heart motion measurements in a model-based predictive control algorithm to add a feedforward path to the robot motion control. Heart motion measurements are obtained with a sonomicrometric system (manufactured by Sonometrics). This technique is based on ultrasound signals transmitted by small piezoelectric crystals fixed to the heart surface. To perform this experiment, the pericardial sac has to be filled with a saline solution, which is not possible during CABG.

In this paper, we address the issue of heart motion estimation in terms of displacement and acceleration. We use the available information, i.e. the endoscopic image, to perform heart motion estimation in three dimensions, using at first calibrated landmarks placed on the heart surface, then a texture-based approach to avoid the use of such landmarks. In Section 2 we present a method based on artificial landmarks; the 3D reconstruction procedure is based on metric information obtained from landmarks with calibrated dimensions. Section 3 introduces our texture-based approach to track the motion of the heart without landmarks. Experimental evaluation of these methods on artificial and real image sequences is detailed in Section 4. A conclusion is drawn in Section 5, exploring possible developments of the methods.

2. Pose estimation using landmarks

Our approach for extracting visual information and estimating the motion of the ROI consists of three steps:

(1) Image acquisition using a calibrated endoscopic vision system.
(2) Tracking of a geometrically known landmark, small enough to be assumed planar (5 mm × 5 mm).
(3) Pose estimation of the template using the metric of the object.

2.1. Endoscopic vision model

In classical vision systems, a pinhole model, based on the thin lens hypothesis, is applied to describe the image projection. Although the endoscope is a long rigid tube composed of a succession of lenses that carry light from the extremity of the tube to the camera lens, the pinhole model is assumed in the literature to reasonably model the endoscopic image [2,11]. Nevertheless, the pinhole model is a linear approximation of the real camera projection. Therefore, with an endoscopic system that induces high radial distortion, it is necessary to improve the accuracy of the model by adding a nonlinear compensation [18].

The pinhole model is split into two parts. The first part (see Fig. 1) maps the coordinates of point M, defined in the world frame Rw (attached to the plane P corresponding to the landmark), to the coordinates of point m on the plane F at a distance Z = 1 from the camera projection center Oc (center of camera frame Rc) (Eq. (1)). The coordinates $(x_n, y_n)^T$ of point m are the normalized coordinates, obtained from the perspective projection without considering the camera intrinsic parameters. The second part of the model takes into account the intrinsic parameters of the vision system by applying the frame transformation from the camera frame to the image frame (Eq. (2)).

$$ s \begin{pmatrix} x_n \\ y_n \\ 1 \end{pmatrix} = P \, {}^{c}T_{w} \begin{pmatrix} {}^{w}X \\ {}^{w}Y \\ {}^{w}Z \\ 1 \end{pmatrix} \qquad (1) $$

$$ \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = K \begin{pmatrix} x_n \\ y_n \\ 1 \end{pmatrix} \qquad (2) $$

The pinhole model is composed of:

- $s$, the scale factor induced by the perspective projection;
- $K$, the (3 × 3) intrinsic parameter matrix, composed of the optical center coordinates $(u_0, v_0)$ and the focal lengths in the X and Y directions $(f_{c1}, f_{c2})$;
- $P$, a (3 × 4) projection matrix;
- ${}^{c}T_{w}$, the rigid transformation matrix defining the world frame (attached to the object plane) w.r.t. the camera frame.

The distortion model adds extra displacements to the normalized coordinates of point m. It is composed of two components: the first one takes into account the radial distortion (Eq. (3)), where $r$ is the norm of the normalized coordinates ($r^2 = x_n^2 + y_n^2$), and the second one approximates the tangential distortion (Eq. (4)):

$$ \begin{pmatrix} x_d \\ y_d \end{pmatrix} = (1 + k_1 r^2 + k_2 r^4) \begin{pmatrix} x_n \\ y_n \end{pmatrix} + d_x \qquad (3) $$

$$ d_x = \begin{pmatrix} 2 k_3 x_n y_n + k_4 (r^2 + 2 x_n^2) \\ k_3 (r^2 + 2 y_n^2) + 2 k_4 x_n y_n \end{pmatrix} \qquad (4) $$

The intrinsic parameters and the distortion coefficients have been computed using the Matlab Calibration Toolbox (available at http://www.vision.caltech.edu/bouguetj/calib_doc/), based on the algorithm proposed by Zhang [19]. Since the model is an approximation of reality, the calibration procedure must consider several points of the ROI inside the workspace.
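As an illustration of the projection model of Eqs. (1)–(4), the following Python sketch (illustrative only, not the authors' implementation; the function name and the example pose are made up) projects a point of the landmark plane into pixel coordinates, using the calibrated values of Table 2 as an example:

```python
import numpy as np

def project_point(Pw, R, t, K, dist):
    """Project a 3D point Pw (landmark/world frame) to pixel coordinates (illustrative)."""
    k1, k2, k3, k4 = dist
    # Rigid transformation cTw, then perspective division (Eq. (1)).
    Pc = R @ Pw + t
    xn, yn = Pc[0] / Pc[2], Pc[1] / Pc[2]          # normalized coordinates
    # Radial and tangential distortion (Eqs. (3) and (4)).
    r2 = xn ** 2 + yn ** 2
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2
    dx = np.array([2 * k3 * xn * yn + k4 * (r2 + 2 * xn ** 2),
                   k3 * (r2 + 2 * yn ** 2) + 2 * k4 * xn * yn])
    xd, yd = radial * np.array([xn, yn]) + dx
    # Intrinsic parameters map to pixel coordinates (Eq. (2)).
    u, v, _ = K @ np.array([xd, yd, 1.0])
    return u, v

# Example with the calibrated values of Table 2 and a made-up pose (landmark ~55 mm away).
K = np.array([[593.6, 0.0, 279.6],
              [0.0, 594.5, 223.7],
              [0.0, 0.0, 1.0]])
dist = (-0.241, 0.207, 0.005, 0.008)
R, t = np.eye(3), np.array([0.0, 0.0, 55.0])
print(project_point(np.array([2.5, 2.5, 0.0]), R, t, K, dist))
```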

2.2. Pattern tracking algorithm

For each image acquired, we need to track the pattern resulting from the perspective projection of the landmark onto the image plane. We chose a pattern-based algorithm that tracks the whole image of the square object: the Efficient Second-order Minimization (ESM) method proposed by Benhimane and Malis [1] (implementation available on the author's website: http://www-sop.inria.fr/icare/personnel/malis/software/ESM.html). The application of the ESM algorithm to visual tracking allows an efficient homography estimation admitting large inter-frame displacements. In this algorithm, an iterative estimation procedure finds the optimal homography which minimizes the Sum of Squared Differences (SSD) between the reference pattern (defined offline) and the current pattern (which has been reprojected in the reference frame using the current homography). Because an initial prediction of the homography is not available, we start with an initial estimate equal to the identity matrix. The image derivatives of both the template and the current pattern are used to obtain an efficient second-order update. It is an efficient algorithm since only the first derivatives are used and the Hessians are not explicitly computed.

Once we have computed the homography matrix $G(k)$ at time k between the reference and the current patterns, we extract the image coordinates of the four corner points of the square pattern $(u_i(k), v_i(k))^T$ from their image coordinates in the reference pattern $(u_i(0), v_i(0))^T$ through Eq. (5). The coordinates of the four corners in the reference pattern are manually selected during the offline procedure and automatically refined by detecting the nearest corner location at subpixel level.

$$ \lambda \begin{pmatrix} u_i(k) \\ v_i(k) \\ 1 \end{pmatrix} = G(k) \begin{pmatrix} u_i(0) \\ v_i(0) \\ 1 \end{pmatrix}, \qquad \forall i = 1, \ldots, 4 \qquad (5) $$

At the end of this step, we have computed information in the image space. We now have to integrate the metric information available on the pattern to estimate the 3D pose.
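For illustration, Eq. (5) amounts to the following small computation (hypothetical code, with made-up corner coordinates and homography), where the homogeneous scale $\lambda$ is removed by the final division:

```python
import numpy as np

def transfer_corners(G, corners_ref):
    """Apply the homography G(k) to the four reference corners (Eq. (5))."""
    pts = np.hstack([corners_ref, np.ones((4, 1))])   # homogeneous coordinates
    warped = (G @ pts.T).T
    return warped[:, :2] / warped[:, 2:3]             # divide by the scale lambda

# Reference corners of the square pattern (pixels, selected offline) and an example homography.
corners_ref = np.array([[100., 100.], [150., 100.], [150., 150.], [100., 150.]])
G = np.array([[1.02, 0.01, 3.0],
              [-0.01, 0.98, -2.0],
              [1e-5, 0.0, 1.0]])
print(transfer_corners(G, corners_ref))
```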

2.3. Pose estimation

The method is inspired by Zhang [19] and assumes that the intrinsic parameters are known. It is implemented with the OpenCV library (available at http://www.intel.com/technology/computing/opencv/index.htm). Assuming that the world frame is attached to the object plane, the four points satisfy ${}^{w}Z = 0$; thus, without loss of generality, we can rewrite Eq. (1) as:

$$ \begin{pmatrix} x_n \\ y_n \\ 1 \end{pmatrix} = s^{-1} \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \begin{pmatrix} {}^{w}X \\ {}^{w}Y \\ 1 \end{pmatrix} = H(k) \begin{pmatrix} {}^{w}X \\ {}^{w}Y \\ 1 \end{pmatrix} \qquad (6) $$

where $r_1$ and $r_2$ are the first two columns of the rotation matrix and $t$ is the translation vector of the rigid transformation ${}^{c}T_{w}(k)$ at time k.

The image plane coordinates expressed in the world frame coordinates are used to compute the estimate $H(k)$ of the projection matrix between $(x_n, y_n, 1)^T$ and $({}^{w}X, {}^{w}Y, 1)^T$ (see Appendix A of ref. [19]). The scale factor $s$ is retrieved as the inverse of the mean of the norms of the first two columns of matrix $H(k)$, $s = \left(0.5(\|h_1\| + \|h_2\|)\right)^{-1}$. Then, Eq. (6) yields:

$$ r_1 = s h_1, \qquad r_2 = s h_2, \qquad t = s h_3 \qquad (7) $$



The third vector of the rotation matrix is obtained from the orthogonality property of this matrix:

$$ r_3 = r_1 \times r_2 \qquad (8) $$
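A minimal numerical sketch of this decomposition (Eqs. (6)–(8)) is given below; it is illustrative only and omits the orthonormalization refinements discussed in ref. [19]. The example homography is synthetic so that the recovered pose can be checked:

```python
import numpy as np

def pose_from_plane_homography(H):
    """Recover the pose from H(k) = s^-1 [r1 r2 t] (Eqs. (6)-(8))."""
    h1, h2, h3 = H[:, 0], H[:, 1], H[:, 2]
    s = 1.0 / (0.5 * (np.linalg.norm(h1) + np.linalg.norm(h2)))  # scale factor
    r1, r2, t = s * h1, s * h2, s * h3                           # Eq. (7)
    r3 = np.cross(r1, r2)                                        # Eq. (8)
    return np.column_stack([r1, r2, r3]), t

# Example: homography built from a known pose, then recovered.
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 2.0, 55.0])
H = np.column_stack([R_true[:, 0], R_true[:, 1], t_true]) / t_true[2]
R_est, t_est = pose_from_plane_homography(H)
print(np.allclose(R_est, R_true), np.round(t_est, 3))
```

In practice, the recovered $r_1$ and $r_2$ are not exactly orthonormal because of estimation noise, which is why ref. [19] refines the rotation matrix; this sketch omits that step.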

3. Motion tracking without landmarks

The previous approach and its practical set-up raised several issues:

- placing landmarks is practically difficult and time-consuming;
- landmarks hide portions of the heart surface;
- bio-compatibility problems may arise.

To overcome these limitations, we developed a new method that avoids the use of landmarks [13,14]. Such an approach is promising, and recent works focus on applying it to visual servoing and robotics [4]. Our method relies on tracking small regions of the heart based on texture information instead of artificial landmarks, which should provide robust tracking.

3.1. Texture characterization

We characterize the texture of the tracked pattern using a set of statistical descriptors known as texture features. These features can describe the pattern in terms of contrast, luminosity (gray levels), granularity, etc. Each attribute is normalized, so the values are double-precision numbers between 0 and 1. For example, a luminosity value of 0 corresponds to a black image while a value of 1 corresponds to a white image. Naturally, the settings of the endoscope can influence the quality of the texture-based description. But the features have been selected in such a way (see ref. [14]) that good visual feedback corresponds to good texture characterization. Furthermore, the use of several features in the texture vector limits the risk of an inefficient feature bringing down the overall performance.

The texture features $T_i$, $i \in \{1, \ldots, n\}$, form the texture vector $V$, which is characteristic of the ROI:

$$ V = \begin{pmatrix} T_1 \\ T_2 \\ \vdots \\ T_n \end{pmatrix} \qquad (9) $$

This characterization is, as experiments tend to prove, more robust to changes in illumination and to deformations than classical similarity computation algorithms using correlation or SSD. The 8 most efficient features (in Eq. (9) we use n = 8), recalled in Table 1, have been selected through Principal Components Analysis and Discriminant Function Analysis [14]. Using such a reduced set of features has several advantages: faster computation, more efficient characterization, and easier interpretation of the results.

Table 1
Chosen texture features

Approach                 Feature
Co-occurrence matrix     Energy
Co-occurrence matrix     Contrast
Co-occurrence matrix     Cluster shade
Co-occurrence matrix     Cluster prominence
Run-length matrix        Non-uniformity
Run-length matrix        Short low grey level run emphasis
First-order statistics   Skewness
First-order statistics   Kurtosis
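As an indication of how such a texture vector can be assembled, the sketch below computes a small subset of the features of Table 1 (co-occurrence energy and contrast, first-order skewness and kurtosis) with NumPy/SciPy. The exact feature definitions, normalizations and the run-length features used in the paper are those of ref. [14]; the code below is only an illustrative approximation:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def glcm(patch, levels=16, dx=1, dy=0):
    """Grey-level co-occurrence matrix for a single pixel offset, normalized to sum to 1."""
    q = (patch.astype(float) / 256.0 * levels).astype(int).clip(0, levels - 1)
    a = q[max(0, -dy):q.shape[0] - max(0, dy), max(0, -dx):q.shape[1] - max(0, dx)]
    b = q[max(0, dy):q.shape[0] - max(0, -dy), max(0, dx):q.shape[1] - max(0, -dx)]
    M = np.zeros((levels, levels))
    np.add.at(M, (a.ravel(), b.ravel()), 1.0)
    return M / M.sum()

def texture_vector(patch, levels=16):
    """Illustrative texture vector: co-occurrence energy/contrast plus first-order statistics."""
    P = glcm(patch, levels)
    i, j = np.indices(P.shape)
    energy = np.sum(P ** 2)                                   # co-occurrence energy
    contrast = np.sum((i - j) ** 2 * P) / (levels - 1) ** 2   # contrast, scaled towards [0, 1]
    g = patch.ravel().astype(float) / 255.0
    return np.array([energy, contrast, skew(g), kurtosis(g)])

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32))   # stands in for a grayscale heart-surface patch
print(texture_vector(patch))
```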

3.2. Texture-based tracking

To integrate texture characterization into the pattern tracking procedure, we introduce a new composite metric based on both the SSD and the texture features. The resulting Composite Tracking Algorithm, represented in Fig. 2, can be divided into three steps: (1) compute the normalized SSD and the texture-based distance, (2) merge those results in the Composite Metric and (3) run the minimization procedure. Each of these steps is described below:

- The Texture Distance (TD) is the distance between the texture vector (Eq. (9)) of the tracked pattern, $V_I$, and the one of the ROI, $V_{I'}$. The TD $t(I, I')$ is computed using the Euclidean metric (Eq. (10)), $T_i$ and $T'_i$ corresponding to the texture feature components of $V_I$ and $V_{I'}$:

$$ t(I, I') = \sqrt{\sum_{i=1}^{8} (T_i - T'_i)^2} \qquad (10) $$

The distance is then normalized:

$$ t_N(I, I') = \frac{t(I, I')}{\max_{\forall \mathrm{ROI}}(t)} \qquad (11) $$

- Once the TD $t(I, I')$ and the SSD $d(I, I')$ have been computed, the Composite Metric (Eq. (12)) is computed and used as the similarity measurement between the tracked pattern and the ROI:

$$ s(I, I') = \lambda\, d(I, I') + \gamma\, t_N(I, I'), \qquad \lambda + \gamma = 1 \qquad (12) $$

The parameters $\lambda$ and $\gamma$ set the balance (in percent) between texture characterization and correlation-based pattern matching. Those parameters can be set to fixed values, but better performance is obtained through the use of dynamic values. Several possible solutions for those parameters have been evaluated on experimental sequences. From these evaluations, we found good solutions for tuning $\lambda$ and $\gamma$, as given in Eqs. (13) and (14); a short numerical sketch of the resulting metric is given after this list:

$$ \gamma = d(I, I') \qquad (13) $$

$$ \gamma = e^{-d(I, I')/(1 - d(I, I'))}, \qquad \lambda = 1 - \gamma \qquad (14) $$

Fig. 2. Outline of the texture based tracking procedure.


- Finally, we use a minimization procedure to retrieve the homography matrix between images. This part is similar to the one presented in Section 2.2 and in ref. [13].
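The composite metric announced above (Eqs. (10)–(14)) can be sketched as follows; the code is illustrative, assumes the SSD has already been normalized to [0, 1), and uses the exponential weighting of Eq. (14):

```python
import numpy as np

def texture_distance(V, V_prime):
    """Euclidean distance between two texture vectors (Eq. (10))."""
    return np.linalg.norm(np.asarray(V) - np.asarray(V_prime))

def composite_metric(ssd, tex_dist, tex_dist_max):
    """Composite similarity s(I, I') of Eq. (12) with the dynamic weights of Eq. (14).

    ssd          : normalized SSD d(I, I') in [0, 1)
    tex_dist     : texture distance t(I, I')
    tex_dist_max : maximum texture distance over the search region (Eq. (11))
    """
    t_n = tex_dist / tex_dist_max                      # Eq. (11)
    gamma = np.exp(-ssd / (1.0 - ssd))                 # Eq. (14)
    lam = 1.0 - gamma
    return lam * ssd + gamma * t_n                     # Eq. (12)

# Candidate with a low SSD: the texture term dominates (gamma close to 1).
print(composite_metric(ssd=0.05, tex_dist=0.3, tex_dist_max=1.2))
```

With the weighting of Eq. (14), a small SSD yields $\gamma$ close to 1, so the texture term dominates precisely when the correlation-based match is already good.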

4. Experiments

4.1. Precision evaluation

A first step of the experiments concerns the 3D pose estimation from landmarks. Before applying this approach to estimate heart motion, the accuracy of the method has been evaluated using a calibrated measurement system in order to take into account different sources of error, such as the camera model approximation, the pattern tracking accuracy and the 3D reconstruction algorithm.

The endoscopic vision system is composed of a rigid endoscope (Hopkins II from Karl Storz Inc.), a 35 mm focal length lens and a Dalsa 1M75 CMOS camera. A 300 W xenon light source is connected to the rigid endoscope. The frame rate is adjusted to 125 Hz, and a 512 × 512-pixel image size is selected.

Considering that precision is better along Xc and Yc than along Zc, which is normal to the image plane, we evaluate the precision along Zc using a laser measurement device (precision = 0.5 mm) in a dynamic way. An object was manually moved in a plane parallel to the image plane. A maximum error of 0.8 mm was observed w.r.t. the laser measurements (standard deviation = 0.34 mm). Thus, we consider that the complete system (camera + pattern tracking + pose estimation) gives an estimation of motion with a precision up to 1 mm.

4.2. In vivo experimentation

The proposed approach has been applied in vivo on an anesthetized 25 kg pig with the assistance of a trained medical team, in accordance with the ethical and regulatory issues related to animal experiments. The pig was assisted by a pulmonary ventilator with the respiratory cycle constrained to 20 cycles per minute. A thoracotomy was performed to facilitate heart access. In the open chest, the heart is no longer constrained by the ribs; the cardiac cycle then induces non-natural motion of the heart within the thorax, which should be much less important in MIS conditions. Images have been acquired using the endoscopic vision system presented in Section 4.1. The endoscopic vision system is calibrated for a workspace of 20 mm × 15 mm × 15 mm, centered at a distance of 55 mm from the endoscope tip. The intrinsic parameters are reported in Table 2.

Table 2
Intrinsic parameters (in pixels)

Pinhole model:      fc1 = 593.6, fc2 = 594.5, u0 = 279.6, v0 = 223.7
Distortion model:   k1 = −0.241, k2 = 0.207, k3 = 0.005, k4 = 0.008

Artificial passive landmarks have simply been laid on the heart surface, as shown in Fig. 3. Each landmark is assumed to be a planar square with 5 mm sides, drawn on white paper. To avoid a lack of information for the tracking algorithm, variable gray levels have been painted on the pattern. Sequences of 1900 images have been acquired with a sampling time of 8 ms.

Fig. 3. Endoscopic image of the heart with landmarks.

4.3. Motion analysis

Estimation of heart motion is presented in Fig. 4 over the entire acquired sequence of 15 s. A spectral analysis has been performed to evaluate the frequency components of the estimated signal (see Table 3). The first two components ($f_1 = 0.34$ Hz and $f_2 = 0.68$ Hz) must be related to the respiratory activity: $f_1$ is equal to the frequency imposed by the pulmonary ventilator (20 cycles per minute), and $f_2$ may be considered as a harmonic component of the respiratory activity ($f_2 = 2 f_1$). Heart activity provides four other frequencies: $f_3 = 1.19$ Hz represents the heart beat cycle (around 70 beats per minute), and the three others are harmonic components of $f_3$. The spectral analysis shows that the estimated motion is dominated by the cardiac and respiratory activities. From the power spectral density analysis, we can see that the motion in the Xc and Yc directions (parallel to the image plane) is mostly governed by the cardiac activity, whereas the amplitude of the motion along the Zc direction is largely induced by the respiratory activity. By applying a low-pass filter with a 1 Hz cutoff frequency, we can extract the respiratory component from the estimated motion along the Zc direction (see the bold plot in Fig. 4c). The maximum amplitude observed in the Zc direction is about 11 mm and includes a respiratory displacement of 6 mm.

Fig. 4. Motion estimation of patch 1: (a) X direction, (b) Y direction, (c) Z direction.

Table 3
Spectral analysis

Frequency (Hz)               0.34   0.67   1.19   2.38   3.57   4.76
Density in Xc direction (%)   8.7    0     21.8   60      2.8    6.7
Density in Yc direction (%)   3.9    1.9   45.6   33      3.9   11.7
Density in Zc direction (%)  70.9   11.5    5.9    8.7    1.2    1.8

Accelerations have been computed as the second derivatives of the position, after backward and forward low-pass filtering. We obtain accelerations up to 3.67 m s⁻², as reported in Table 4.

Table 4
Maximum acceleration of the estimated motion

              x (m s⁻²)   y (m s⁻²)   z (m s⁻²)
Landmark 1      1.73        1.09        3.33
Landmark 2      1.38        0.94        3.67

Table 5
Error in translation tracking (sequence of 100 images at 100 Hz)

Method          Mean error (pixels)   Standard deviation
Correlation       0                     0
SSD               0.04                  0.19
Optical flow      0.90                  0.73
Composite         0                     0
ESM               0                     0
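The kind of spectral and acceleration analysis described above can be reproduced with standard signal-processing tools. The sketch below is illustrative: a synthetic trajectory stands in for the measured Zc motion, an FFT gives the frequency content, a zero-phase (backward and forward) low-pass filter isolates the respiratory component, and a double numerical derivative gives the acceleration; the smoothing cutoff used before differentiation is an arbitrary choice, not the one used in the paper:

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 125.0                                   # sampling rate in Hz (8 ms period)
t = np.arange(0, 15, 1 / fs)
# Synthetic Zc-like motion standing in for the estimated trajectory:
# 6 mm peak-to-peak respiration at 0.34 Hz plus a cardiac component at 1.19 Hz.
z = 3.0 * np.sin(2 * np.pi * 0.34 * t) + 1.5 * np.sin(2 * np.pi * 1.19 * t)   # mm

# Frequency content of the estimated motion (cf. Table 3).
spectrum = np.abs(np.fft.rfft(z)) ** 2
freqs = np.fft.rfftfreq(z.size, d=1 / fs)
print("dominant frequency: %.2f Hz" % freqs[np.argmax(spectrum[1:]) + 1])

# Respiratory component: zero-phase low-pass filter with a 1 Hz cutoff.
b, a = butter(4, 1.0 / (fs / 2), btype="low")
z_resp = filtfilt(b, a, z)                   # backward and forward filtering
print("respiratory peak-to-peak: %.1f mm" % (z_resp.max() - z_resp.min()))

# Acceleration: second derivative of the smoothed position (cutoff chosen arbitrarily here).
b2, a2 = butter(4, 10.0 / (fs / 2), btype="low")
z_smooth = filtfilt(b2, a2, z) * 1e-3        # mm -> m
accel = np.gradient(np.gradient(z_smooth, 1 / fs), 1 / fs)
print("max |acceleration|: %.2f m/s^2" % np.abs(accel).max())
```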

4.4. Offline evaluation of motion tracking without landmarks

To assess the precision and robustness of the texture-based tracking and to compare the results with other approaches, we used both artificial and natural test sequences.

Fig. 5. Sample artificial sequences: translation.

Fig. 6. Sample trajectory for vertical translation. The motion tracking error is projected on several axes, underlining the inadequacy of classical approaches, such as optical flow, for the considered application.

Fig. 7. Sample trajectory for circular translation and rotation of the pattern. This shows the superior robustness of the Composite approach over other classical algorithms.

- Artificial sequences were generated by moving the tracked pattern over a heart image, as illustrated in Fig. 5, in which a small pattern (the target) is moved across the heart sequence to challenge the tracking algorithms. Each image of the sequence is obtained by superimposing a small textured patch on the original experimental sequence and applying displacements and transformations (such as translation and rotation) to it; a sketch of such sequence generation is given after this list. These sequences enabled us to quantify the performance of our approach in terms of tracking precision, and to compare it with other tracking schemes, namely SSD, correlation [7], optical flow [17] and ESM [10]. Figs. 6 and 7 show sample trajectories and results (our method is termed "Composite"). In Fig. 6 we observe repetitive tracking errors with the optical flow that can be related to the presence of specularities. Fig. 6(a and b) shows that with the SSD and optical flow approaches, the tracked motion is disturbed by small oscillations. These oscillations come from the fact that the algorithms temporarily lose the target (due for instance to specularities). Fig. 7 shows the robustness of our approach during a composite circular trajectory involving translation and rotation of the pattern: a scale factor of up to ±10% has been applied to the pattern along the trajectory. The SSD and the optical flow algorithms are unable to track the trajectory, while the correlation algorithm behaves well at the beginning of the trajectory, then diverges. Two causes are possible: specularity, or scale factor and rotation changes that are too large. The precision of tracking has been computed in the image for each sequence in order to quantify the performance of the approaches. Table 5 gives sample results obtained on the sequence corresponding to Fig. 6 in terms of mean and standard deviation of the tracking error. As mentioned earlier, correlation, ESM and Composite Tracking (our approach) score better than SSD and optical flow. Other sequences have been generated, which confirm the improvements in tracking robustness using textures.

- Qualitative evaluation was also performed using a registered beating heart sequence to tune the algorithm and to perform an application-focused comparison with other approaches. Overall performance is very encouraging, and tracking is visually correct, even if very homogeneous regions of the heart are still difficult to track whichever approach is used, and even if we did not succeed in obtaining a precise 3D metric of the displacement of the heart. Fig. 8 shows the vertical and horizontal motions obtained without landmarks, which can be related to the curves of Fig. 4 (with landmarks), proving the relevance of the approach.

Fig. 8. Displacements in pixels during motion.
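As a rough illustration of how such artificial test sequences can be produced (a guess at a minimal setup, not the generator actually used for Figs. 5–7), a textured patch can be pasted onto each frame of a recorded sequence at prescribed positions and rotations, providing a known ground-truth trajectory:

```python
import numpy as np
from scipy.ndimage import rotate

def make_artificial_sequence(frames, patch, trajectory, angles):
    """Superimpose a textured patch on each frame along a known trajectory.

    frames     : list of grayscale images (the recorded heart sequence)
    trajectory : list of (row, col) top-left positions, one per frame (ground truth)
    angles     : in-plane rotation angles in degrees, one per frame
    """
    sequence = []
    for frame, (r, c), ang in zip(frames, trajectory, angles):
        img = frame.copy()
        p = rotate(patch, ang, reshape=False, order=1, mode="nearest")
        h, w = p.shape
        img[r:r + h, c:c + w] = p            # paste the (transformed) target
        sequence.append(img)
    return sequence

rng = np.random.default_rng(1)
frames = [rng.integers(0, 256, size=(128, 128)).astype(np.uint8) for _ in range(50)]
patch = rng.integers(0, 256, size=(16, 16)).astype(np.uint8)
traj = [(40 + int(10 * np.sin(k / 8)), 40 + k) for k in range(50)]   # known ground truth
seq = make_artificial_sequence(frames, patch, traj, angles=np.linspace(0, 15, 50))
print(len(seq), seq[0].shape)
```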

5. Conclusion and perspectives

We have presented in this paper the estimation of heart motion using the vision system available in the operating room, with both landmark-based and texture-based approaches. The latter showed that an evolution towards fully vision-based motion compensation is realistic and promising.

The approach with landmarks exhibits results with a precision better than 1 mm. In vivo experiments have been carried out on a porcine model. The estimated heart motion is clearly governed by the respiratory and cardiac cycle activities. Depending on the location on the heart surface, accelerations up to 3.67 m s⁻² have been observed.

Several improvements can be expected. Tests of the new texture-tracking approach on experimental sequences revealed that it is quite robust in terms of tracking but, due to the high deformation of the heart, it is still subject to tracking losses in some extreme cases. Further work is necessary to make up for the lack of metric information in the image and to achieve the performance required by the medical context.

Acknowledgement

We gratefully thank Roland Demaria4 for his help in

experimental applications.

References

[1] S. Benhimane, E. Malis, Real-time image-based tracking of planes using efficient second-order minimization, in: Proceedings of IEEE IROS, Sendai, Japan, 2004, pp. 943–948.
[2] J.J. Caban, W.B. Seales, Reconstruction and enhancement in monocular laparoscopic imagery, in: Proceedings of Medicine Meets Virtual Reality, vol. 12, 2004, pp. 37–39.
[3] M.C. Cavusoglu, J. Rotella, W.S. Newman, S. Choi, J. Ustin, S.S. Sastry, Control algorithms for active relative motion cancelling for robotic assisted off-pump coronary artery bypass graft surgery, in: Proceedings of IEEE ICAR, Seattle, WA, USA, 2005.
[4] A.I. Comport, E. Marchand, M. Pressigout, F. Chaumette, Real-time markerless tracking for augmented reality: the virtual visual servoing framework, IEEE Trans. Visual. Comput. Graphics 12 (2006) 289–298.
[5] R. Ginhoux, J. Gangloff, M. de Mathelin, L. Soler, M.M. Arena Sanchez, J. Marescaux, Active filtering of physiological motion in robotized surgery using predictive control, IEEE Trans. Robotics 21 (1) (2005) 67–79.
[6] G.S. Guthart, J.K. Salisbury, The intuitive telesurgery system: overview and application, in: Proceedings of IEEE ICRA, 2000, pp. 618–621.


[7] S. Hutchinson, G.D. Hager, P. Corke, A tutorial on visual servo control, IEEE Trans. Robotics Automat. 12 (5) (1996) 651–670.
[8] E.W.L. Jansen, P.F. Grundeman, C. Borst, F. Eefting, J. Diephuis, A. Nierich, J.R. Lahpor, J.J. Bredee, Less invasive off-pump CABG using a suction device for immobilization: the octopus method, Eur. J. Cardio-Thoracic Surg. 12 (1997) 406–412.
[9] M. Lemma, A. Mangini, A. Reaelli, F. Acocella, Do cardiac stabilizers really stabilize? Experimental quantitative analysis of mechanical stabilization, Interact. Cardiovasc. Thoracic Surg. 4 (2005) 222–226.
[10] E. Malis, Improving vision-based control using efficient second-order minimization techniques, in: Proceedings of IEEE ICRA, New Orleans, USA, 2004.
[11] G. Marti, V. Bettschart, J.-S. Billiard, C. Baur, Hybrid method for both calibration and registration of an endoscope with an active optical marker, in: Proceedings of Computer Assisted Radiology and Surgery, 2004, pp. 159–164.
[12] Y. Nakamura, K. Kishi, H. Kawakami, Heartbeat synchronization for robotic cardiac surgery, in: Proceedings of IEEE ICRA, Seoul, Korea, 2001, pp. 2014–2019.
[13] A. Noce, J. Triboulet, P. Poignet, Composite visual tracking of the moving heart using texture characterization, in: Proceedings of the MICCAI Workshop on Medical Robotics, Copenhagen, Denmark, 2006, pp. 54–65.
[14] A. Noce, J. Triboulet, P. Poignet, E. Dombre, Texture features selection for visual servoing of the beating heart, in: Proceedings of IEEE/RAS-EMBS BIOROB'06, number 183, 2006.
[15] T. Ortmaier, M. Groger, D.H. Boehm, V. Falk, G. Hirzinger, Motion estimation in beating heart surgery, IEEE Trans. Biomed. Eng. 52 (10) (2005) 1729–1740.
[16] A.L. Picone, C.J. Lutz, C. Finck, D. Carney, L.A. Gatto, A. Paskanik, B. Searles, K. Snyder, G. Nieman, Multiple sequential insults cause post-pump syndrome, Ann. Thoracic Surg. 67 (1999) 978–985.
[17] T. Suzuki, T. Kanade, Measurement of vehicle motion and orientation using optical flow, in: Proceedings of IEEE ITSC, 1999, pp. 25–30.
[18] X. Zhang, S. Payandeh, Application of visual tracking for robotic-assisted laparoscopic surgery, J. Robotics Syst. 19 (7) (2002) 315–328.
[19] Z. Zhang, A flexible new technique for camera calibration, Technical Report MSR-TR-98-71, Microsoft Research, 1998.