
IMAGE RECONSTRUCTION IN HIGH-RESOLUTION PET:

GPU-ACCELERATED STRATEGIES FOR IMPROVING IMAGE QUALITY

AND ACCURACY

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Guillem Pratx

December 2009

© 2010 by Guillem Pratx. All Rights Reserved.

Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License: http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/vz692jm2943


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Craig Levin, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Patrick Hanrahan

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

John Pauly

Approved for the Stanford University Committee on Graduate Studies.

Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.


Abstract

Molecular imaging can interrogate subtle molecular disease signatures non-invasively in liv-

ing subjects. Positron emission tomography (PET), one particular molecular imaging modal-

ity, is able to sense molecular signals deep within tissue. Although PET is well suited for

imaging signals associated with cancerous lesions in humans, it is still unable to resolve

very small (<2 mm) structures. Improving the spatial resolution of PET is an active area

of research driven by many potential applications. However, the new generation of high-resolution PET systems raises new challenges for image reconstruction.

In this dissertation, several strategies and algorithms are proposed to enable accurate

and practical image reconstruction for high-resolution PET. Reconstruction is performed di-

rectly from the list-mode data via maximum-likelihood estimation. A shift-varying model of

the imaging process is incorporated in the reconstruction. For fast reconstruction, the calculations are implemented using highly-parallel graphics processing units (GPUs). A Bayesian

sequence reconstruction algorithm is also used to position the annihilation photons that

deposit energy in multiple detection elements.

We show that the reconstruction provides near-uniform spatial resolution throughout the

field-of-view, enhanced trade-off between noise and contrast, and better image quantitative

accuracy. Furthermore, thanks to the computing power of graphics hardware, reconstruction

times are practical for clinical applications.


Acknowledgement

This work would not have been possible without the support of my coworkers, and the help

and love from my friends, family and wife.

In particular, I am very grateful to my dissertation adviser Craig Levin, a passionate

teacher never reluctant to share his knowledge. I thank him for his availability and his

dedication to help me achieve my goals. Also, I was lucky to share the lab with a band

of great people: Alex, Angela, Arne, Eric, Frances, Frezghi, Garry, Hao, Jing-Yu, Jinjiang,

Paul, Peter, Virginia, and Yi. I am grateful to you all for these intense discussions from

which I learned so much, for the help I received conducting experiments, for reviewing my

dissertation, for the basketball games, and for sometimes watching over my cats during the

holidays.

I also gratefully acknowledge the institutional support that I have received while working

on this project. In particular, I thank the Bio-X program for its generous graduate fellowship, which has allowed me to perform research in the best conditions possible.

I also thank the NVIDIA Corporation for providing funds and state-of-the-art equipment

for my research. This work would also not have been possible without support from the

National Institutes of Health.

Last, I want to give special thanks to my family, Jean-Max, Marie-Paule, Thomas, and Anne, whose permanent support has helped me so much. These lines would be incomplete

without a word for my dear wife Lindsey, for her love, patience, and support over the years.


Contents

Abstract iv

Acknowledgement v

1 Introduction 1

1.1 Molecular Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Principles of PET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 High-Resolution PET Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3.2 Pre-Clinical PET System . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.3.3 Breast Cancer PET System . . . . . . . . . . . . . . . . . . . . . . . . 8

1.4 Image Reconstruction Challenges . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.4.1 Image Reconstruction Complexity . . . . . . . . . . . . . . . . . . . . 9

1.4.2 Shift-Varying System Response . . . . . . . . . . . . . . . . . . . . . . 11

1.4.3 Multiple-Interaction Photon Events . . . . . . . . . . . . . . . . . . . 13

1.5 Overview of this Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2 Imaging Model for High-Resolution PET 16

2.1 Principles and Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.1.1 Physics of PET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.1.1.1 Photon Emission . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.1.1.2 Photon Detection . . . . . . . . . . . . . . . . . . . . . . . . 17

2.1.1.3 Photon Transport . . . . . . . . . . . . . . . . . . . . . . . . 19

2.1.1.4 Mathematical Model . . . . . . . . . . . . . . . . . . . . . . . 20

2.1.2 Spatially Variant and Invariant Models for Discrete Image Represen-

tations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.2 Analytical Calculation of the Coincident Detector Response Function . . . . 24


2.3 Approximation for Small Crystals . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.3.1 Fast Calculation of Intrinsic Detector Response Function . . . . . . . . 27

2.3.2 Analytical Scaled Convolution . . . . . . . . . . . . . . . . . . . . . . . 30

2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3 Maximum-Likelihood Image Reconstruction 36

3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.1.1 Analytical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.1.2 Statistical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.1.2.1 Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . 38

3.1.2.2 Maximum A Posteriori . . . . . . . . . . . . . . . . . . . . . 38

3.1.2.3 Other Objective Functions . . . . . . . . . . . . . . . . . . . 39

3.1.3 Existing Optimization Methods . . . . . . . . . . . . . . . . . . . . . . 39

3.1.3.1 Expectation-Maximization for ML reconstruction . . . . . . 39

3.1.3.2 Ordered-Subset Expectation-Maximization for ML reconstruc-

tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

3.1.3.3 List-Mode Processing . . . . . . . . . . . . . . . . . . . . . . 41

3.1.3.4 Gradient Ascent for ML reconstruction . . . . . . . . . . . . 42

3.1.3.5 Conjugate Gradient for WLS Reconstruction . . . . . . . . . 43

3.1.3.6 Conjugate Gradient for ML Reconstruction . . . . . . . . . . 46

3.2 Novel ML Conjugation of Search Directions . . . . . . . . . . . . . . . . . . . 46

3.2.1 Conjugation in ML-CG . . . . . . . . . . . . . . . . . . . . . . . . . . 47

3.2.2 Explicit Conjugation of Search Directions . . . . . . . . . . . . . . . . 48

3.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

3.2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.3 Novel ML Reconstruction via Truncated Newton's Method . . . . . . . . . . 53

3.3.1 Dual Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3.2 Karush-Kuhn-Tucker Conditions . . . . . . . . . . . . . . . . . . . . . 54

3.3.3 Newton Step for a Relaxed Problem . . . . . . . . . . . . . . . . . . . 55

3.3.4 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58


4 Fast Shift-Varying Line Projection using Graphics Hardware 61

4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

4.1.1 The Graphics Processing Unit . . . . . . . . . . . . . . . . . . . . . . . 61

4.1.2 Iterative Reconstruction on the GPU . . . . . . . . . . . . . . . . . . . 64

4.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.2.1 System Response Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.2.2 GPU Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.2.2.1 Data Representation . . . . . . . . . . . . . . . . . . . . . . . 66

4.2.2.2 Line Projection Stages . . . . . . . . . . . . . . . . . . . . . . 67

4.2.2.3 Voxel Identification in Line Forward Projection . . . . . . . 67

4.2.2.4 Voxel Identification in Line Backprojection . . . . . . . . . 70

4.2.2.5 Kernel Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 71

4.2.2.6 Vector Data Update . . . . . . . . . . . . . . . . . . . . . . . 72

4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Applications of GPU-Based Line Projections 75

5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.2 List-Mode OSEM with Shift-Invariant Projections . . . . . . . . . . . . . . . 75

5.2.1 Shift-Invariant System Response Kernel . . . . . . . . . . . . . . . . . 75

5.2.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.2.2.1 Simulation Data . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.2.2.2 Validation: Experimental Pre-Clinical Data . . . . . . . . . . 78

5.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.3 List-mode OSEM with Shift-Varying Projections . . . . . . . . . . . . . . . . 85

5.3.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

5.3.1.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 85

5.3.1.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

5.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

5.3.2.1 Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

5.3.2.2 Contrast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

5.3.2.3 Reconstruction Time . . . . . . . . . . . . . . . . . . . . . . . 90

5.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.4 Time-of-flight PET Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . 93

5.4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

5.4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95


5.4.2.1 System Description . . . . . . . . . . . . . . . . . . . . . . . 95

5.4.2.2 Implementation on the GPU . . . . . . . . . . . . . . . . . . 95

5.4.2.3 Phantom Experiment . . . . . . . . . . . . . . . . . . . . . . 96

5.4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

5.4.3.1 Contrast vs. Noise . . . . . . . . . . . . . . . . . . . . . . . . 98

5.4.3.2 Processing Time . . . . . . . . . . . . . . . . . . . . . . . . . 99

5.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

6 Bayesian Reconstruction of Photon Interaction Sequences 103

6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.1.2 Methods to Position Multiple Interaction Photon Events . . . . . . . . 104

6.1.2.1 Initial Interaction Selection. . . . . . . . . . . . . . . . . . . . 104

6.1.2.2 Unconstrained Positioning. . . . . . . . . . . . . . . . . . . . 104

6.1.2.3 Full Sequence Reconstruction. . . . . . . . . . . . . . . . . . 105

6.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.2.1 Maximum-Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

6.2.2 Maximum A Posteriori . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

6.3 Evaluation Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

6.3.1 Simulation of a CZT PET System . . . . . . . . . . . . . . . . . . . . 112

6.3.2 Positioning Algorithms and Figures of Merit Used . . . . . . . . . . . 113

6.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

6.4.1 Recovery Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

6.4.2 Point-Spread Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6.4.3 Reconstructed Contrast . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6.4.4 Reconstructed Sphere Resolution . . . . . . . . . . . . . . . . . . . . . 121

6.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

6.5.1 Performance of Proposed Scheme . . . . . . . . . . . . . . . . . . . . . 124

6.5.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

6.5.3 Possible Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

7 Concluding Remarks, Future Directions 129


A GPU Line Projections 131

A.1 Data Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

A.1.1 Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

A.1.2 List-Mode Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

A.2 Line Forward Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

A.3 Line Backprojection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

B File Formats 145

B.1 List Mode and Histogram Mode . . . . . . . . . . . . . . . . . . . . . . . . . . 145

B.2 Image Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

B.3 Colormap Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

C User Manual 147

C.1 Command Line Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

C.2 Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

C.3 Interactive-Mode Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

D Gamma Camera Acquisition Software 150

D.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

D.2 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

D.3 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

D.3.1 Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

D.3.2 Flood Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

D.3.3 Automatic Peak Finding . . . . . . . . . . . . . . . . . . . . . . . . . . 153

D.3.4 Peak Manual Adjustment . . . . . . . . . . . . . . . . . . . . . . . . . 154

D.3.5 Automatic Peak Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . 154

D.3.6 Crystal Segmentation and Energy Gating . . . . . . . . . . . . . . . . 154

D.3.7 Camera Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . 157

D.3.8 Real-Time Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

D.3.8.1 Accumulation Mode . . . . . . . . . . . . . . . . . . . . . . . 157

D.3.8.2 Dynamic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . 157

D.4 User's Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

E Analysis of Reconstructed Sphere Size 160

F Glossary of Terms 163


Bibliography 166


List of Tables

3.1 Example of histogram-mode and list-mode datasets . . . . . . . . . . . . . . 42

5.1 Reconstruction time for GPU and CPU . . . . . . . . . . . . . . . . . . . . . 83

5.2 Reconstruction time on GPU with and without shift-varying model . . . . . . 91

5.3 Processing time for list-mode TOF reconstruction . . . . . . . . . . . . . . . . 100

6.1 Recovery rate for MAP and MPD positioning for four datasets . . . . . . . . 117

6.2 Recovery rate as a function of the number of interactions . . . . . . . . . . . . 117

6.3 Recovery rate as a function of system parameters . . . . . . . . . . . . . . . . 117

6.4 Recovery rate for MAP for stochastic and deterministic objectives . . . . . . . 117

C.1 Command-line options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

C.2 Interactive shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149


List of Figures

1.1 Basic principles of PET imaging . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2 High-resolution small-animal PET scanner based on CZT detectors . . . . . . 7

1.3 High-resolution PET camera for breast cancer . . . . . . . . . . . . . . . . . . 8

1.4 Trend in the number of LORs for PET systems . . . . . . . . . . . . . . . . . 10

1.5 Depiction of parallax error in ring-shaped PET systems. . . . . . . . . . . . 11

1.6 Depiction and effect of spatially-varying spatial resolution in a box-shaped

system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.7 Comparison of resolution at the center of a ring and a box-shaped

PET system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.8 Example of mispositioning caused by multiple-interaction photon event . . . . 14

2.1 Depiction of positron emission and positron range . . . . . . . . . . . . . . . 17

2.2 Intrinsic detector response function . . . . . . . . . . . . . . . . . . . . . . . 18

2.3 The three types of coincidences in PET . . . . . . . . . . . . . . . . . . . . . 21

2.4 Depiction of the system matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.5 Geometry used for calculating the CDRF . . . . . . . . . . . . . . . . . . . . 25

2.6 2-D vs 3-D system response kernels . . . . . . . . . . . . . . . . . . . . . . . . 26

2.7 Representation of the detection length and the attenuation length . . . . . . 28

2.8 Comparison of the intrinsic detector response function for a small CZT crystal

and a larger LSO crystal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.9 Decomposition of the coincidence detector response function into the sum of

nine elementary functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.10 Coincidence detector response function for three lines-of-response . . . . . . . 33

2.11 Comparison for a section of the CDRF for a normal LOR, calculated by a full

Monte-Carlo simulation and by the SC+SA approximate method. (a) Section

at the center of the LOR. (b) Section 25 mm from the LOR center. . . . . . 34


3.1 Reconstructed images for ML-CG with Polak-Ribière and with the new formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.2 Rate of convergence for ML-CG reconstruction with Polak-Ribière and with

ML conjugation of search directions . . . . . . . . . . . . . . . . . . . . . . . 51

3.3 Example of calculated values for β using ML conjugation of search direction 51

3.4 Histogram of the diagonal coefficient of Λ . . . . . . . . . . . . . . . . . . . . 52

3.5 Initial log-likelihood gradient and initial Newton search direction for the re-

construction of a noise-free and a noisy Shepp-Logan phantom with Newton's

method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

3.6 Reconstructed Shepp-Logan phantom using the truncated Newton's method . 58

3.7 Convergence rate for reconstruction with the truncated Newton's method . . 59

4.1 Trend in the computational performance for CPUs and GPUs . . . . . . . . . 62

4.2 The graphics pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.3 Example of a parametrization of the system response kernel. . . . . . . . . . . 65

4.4 Depiction of line forward projection on the GPU . . . . . . . . . . . . . . . . 69

4.5 Depiction of line backprojection on the GPU . . . . . . . . . . . . . . . . . . 71

5.1 Rod phantom and sphere phantom . . . . . . . . . . . . . . . . . . . . . . . . 76

5.2 Photos of hot/cold phantom and GE Vista eXplore DR . . . . . . . . . . . . 78

5.3 Reconstructed rod phantom on GPU and CPU . . . . . . . . . . . . . . . . . 79

5.4 Profile through reconstructed rod phantom . . . . . . . . . . . . . . . . . . . 79

5.5 Contrast-noise trade-off for rod phantom . . . . . . . . . . . . . . . . . . . . . 80

5.6 Reconstructed sphere phantom with GPU and CPU . . . . . . . . . . . . . . 81

5.7 Profile through reconstructed sphere phantom . . . . . . . . . . . . . . . . . . 81

5.8 Average error between GPU and CPU reconstructions . . . . . . . . . . . . . 81

5.9 Reconstruction of the hot rod phantom . . . . . . . . . . . . . . . . . . . . . . 82

5.10 Reconstruction of the cold rod phantom . . . . . . . . . . . . . . . . . . . . . 82

5.11 Architecture of the calculation of the coincidence detector response function

on the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

5.12 Depiction of phantoms used for measuring the effect of shift-varying resolution

models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

5.13 Reconstructed sphere phantom with and without shift-varying model . . . . . 88

5.14 Reconstructed sphere size in sphere phantom with and without shift-varying

model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

5.15 Reconstructed contrast phantom, with and without shift-varying model . . . . 90


5.16 Noise-contrast trade-off with and without shift-varying model . . . . . . . . . 91

5.17 Principles of time-of-flight PET . . . . . . . . . . . . . . . . . . . . . . . . . . 94

5.18 TOF kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

5.19 Cylindrical phantom used for time-of-flight PET measurements . . . . . . . . 96

5.20 GPU list-mode reconstructed images with and without TOF information . . . 98

5.21 Influence of voxel size on TOF reconstructed images . . . . . . . . . . . . . . 99

5.22 Noise-contrast trade-off curves for TOF and non-TOF reconstructions . . . . 100

6.1 Position quantization in CZT cross-strip modules . . . . . . . . . . . . . . . . 106

6.2 Effect of detection element size on sequence reconstruction . . . . . . . . . . . 109

6.3 Linear Compton scatter attenuation coefficient for CZT . . . . . . . . . . . . 111

6.4 Phantoms used in the quantitative evaluation of the positioning methods . . . 114

6.5 Success rate in positioning the first interaction with MAP as a function of

the parameter β (6.19). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

6.6 Point-spread functions (1-D) for four positioning methods . . . . . . . . . . . 119

6.7 Point-spread function (2-D) for three beams angles . . . . . . . . . . . . . . . 120

6.8 Reconstructed contrast phantom for four positioning methods . . . . . . . . . 120

6.9 Noise-contrast trade-off curve for four positioning methods . . . . . . . . . . 122

6.10 Reconstructed sphere phantom for four positioning methods . . . . . . . . . . 123

6.11 Sphere size for reconstructed sphere phantom for four positioning methods . . 125

A.1 Image storage on the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

A.2 List-mode storage on the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . 132

A.3 Schematic of the forward projection of a LOR. . . . . . . . . . . . . . . . . . 133

D.1 Photos of the gamma camera prototype . . . . . . . . . . . . . . . . . . . . . 151

D.2 State schematics for the gamma camera software . . . . . . . . . . . . . . . . 152

D.3 Example of flood histogram and individual channel histogram . . . . . . . . . 153

D.4 Peak finder and sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

D.5 Crystal segmentation map, per-crystal energy resolution, and examples of

energy spectra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

D.6 Per-crystal photopeak and efficiency factor . . . . . . . . . . . . . . . . . . . . 156

D.7 Example of real-time imaging for a SLN biopsy . . . . . . . . . . . . . . . . . 158

E.1 Impact of blurring on reconstructed sphere size . . . . . . . . . . . . . . . . . 161

E.2 FWHM size of blurred sphere as a function of the blurring kernel FWHM . . 162



Chapter 1

Introduction

1.1 Molecular Imaging

When researchers began studying living organisms and disease at the molecular level, they

needed more powerful instruments to allow them to quantify these molecular processes

in vivo. Although microscopes and other in-vitro instruments could be used to analyze

molecular markers in small tissue samples, the information gained by performing such studies

was limited. Hence, a new field of instrumentation emerged, called molecular imaging [1,

2]. In molecular imaging, medical imaging techniques are used to visualize and quantify

subtle molecular processes in living subjects. While conventional medical imaging can reveal

the details of the anatomy, molecular imaging can estimate how a given molecular probe

distributes in the body. As a result, researchers can now visualize the molecular signals

associated with disease without perturbing the biological system. The insights resulting from

these studies have led to new ways of detecting and treating diseases. Molecular imaging was

pioneered by one particular imaging modality called positron emission tomography (PET).

PET is now commonly used for imaging cancer [3, 4], the heart [5] and the brain [6].

Because resolving very small (≤ 2 mm) structures with PET is still problematic, improving

the spatial resolution of PET systems is an active area of research. Unfortunately, high-

resolution PET raises new challenges for image reconstruction.

Molecular imaging studies require a molecular probe and a medical imaging instrument.

Imaging endogenous molecules directly would be ideal; however, it is difficult because most

molecules of interest do not naturally produce physical signals that could be detected by

an instrument. Therefore, molecular probes (or tracers) are engineered ex vivo and injected

into the subject. These probes are composed of a biological marker that determines how

the probe interacts with its host, and a label that signals the location of the probe. The



biological marker is designed to answer a specific question, such as "Which cells express a specific receptor?" or "Which cells actively transport a specific molecule?"

Molecular imaging techniques are already widely used to diagnose diseases early and

improve treatment. For example, certain molecular targets can indicate the onset of cancer

before any anatomical changes are detectable. When cancer is diagnosed in stage I, for instance, more treatment options are available and the five-year patient survival rate is greater

than 90% [7]. With conventional anatomical imaging techniques, such as X-ray computed

tomography (CT), cancer can be detected only when tumors have grown larger than 1 cm in

diameter and contain 10⁹ cells [8]. Furthermore, molecular imaging can be used to monitor

treatment for disease in the clinic to efficiently determine whether a patient is responding

to a particular therapy, or whether alternative drugs or treatments are needed.
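The 1 cm / 10⁹-cell figure quoted above follows from simple geometry. As a back-of-the-envelope check (assuming a spherical tumor and a typical cell diameter of about 10 µm; both are round-number assumptions, not values from the text):

```python
import math

# How many ~10 um cells fit in a 1 cm diameter tumor?
tumor_diameter_cm = 1.0
cell_diameter_cm = 10e-4          # 10 micrometers, a typical mammalian cell size

tumor_volume = (math.pi / 6) * tumor_diameter_cm ** 3   # sphere volume, ~0.52 cm^3
cell_volume = cell_diameter_cm ** 3                     # treat each cell as a 10 um cube

n_cells = tumor_volume / cell_volume
print(f"{n_cells:.1e} cells")     # ~5e8, i.e. on the order of 10^9
```

The estimate lands within a factor of two of the 10⁹ figure, which is all such order-of-magnitude arguments require.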

Molecular imaging is also a powerful tool for discovering novel treatments that target

cancer [8, 9], Alzheimer's disease [10], etc. The development of new drugs is expensive

and time-consuming, and clinical trials require many patients. Molecular imaging has the

potential to accelerate drug discovery and reduce development costs. Imaging studies can

be performed on animal models of human disease using a wide range of molecular imaging

techniques. New targets for drugs can be discovered by imaging biomarkers specific to

disease. In clinical trials, the efficacy of the new drugs can be evaluated more quickly using

therapy endpoints based on imaging biomarkers rather than histological analysis of tumor

biopsy samples.

Besides PET, molecular imaging encompasses four major imaging modalities. Each

modality uses a specific mechanism for signaling where the molecular probes are and consequently offers a different trade-off in terms of cost, spatial resolution, biological signal sensitivity, signal penetration depth, and clinical applicability. These four modalities operate as

follows:

• Radionuclide imaging uses radioactive elements to label molecular probes. This

well-established modality is well suited for imaging molecular signals deep within tis-

sue. In addition, the molecular probes can be made very small because the signal

transmitter consists of a single atom. Within this modality, PET is the most sensi-

tive molecular imaging instrument for whole body imaging, as probes can be detected

in concentrations as low as 10⁻¹² mol/L. Besides PET, gamma cameras and single-photon emission computed tomography (SPECT) can image molecular probes built from

gamma-emitting isotopes.

• Optical imaging uses light as a signaling mechanism [11]. Light has several advan-

tages: it is harmless to the patient and relatively inexpensive to produce and to detect.


However, light does not penetrate deep within tissue, which limits the clinical applica-

bility of optical techniques. Light can be produced by fluorescent probes (which need

to be excited by an external light source), bioluminescent probes (which produce light

through a chemical reaction), or other nano-sized probes such as quantum dots.

• Magnetic resonance imaging (MRI) uses strong magnetic fields to align the magnetic moment of protons. Following a short radio-frequency pulse, these protons lose

their alignment and produce radio-frequency signals. MRI is conventionally used for

imaging anatomical structures; however, MRI-specic molecular probes have been de-

veloped, so MRI can now also be used for imaging molecular processes. MRI has high

spatial resolution and new probe discoveries can be readily translated into clinical ap-

plications. However, it remains an expensive tool and the sensitivity of MRI-specic

molecular probes is still limited.

• Ultrasonography uses pressure pulses to image anatomical structures. This inexpensive modality is widely used in many medical applications. Thanks to special micro-bubble contrast agents, ultrasonography is now starting to be used for molecular imaging.

1.2 Principles of PET

PET is a molecular imaging technique that uses positron emission as a signaling mechanism.

In PET, the molecular probe contains a radioactive atom that can decay by emitting a

positron. A PET probe interacts with a living subject in the same way as a chemically

identical molecule made of stable isotopes. This property allows PET to track molecules

without affecting their behavior. Several positron-emitting isotopes have a half-life suitable

for PET imaging: 11C, 15O, 13N, 18F, 64Cu, 82Rb, and 124I.

One of the most successful PET probes has been 2-[18F]fluoro-2-deoxy-D-glucose (FDG) [5, 6, 12]. FDG consists of a modified molecule of glucose in which a radioactive fluorine atom (18F) substitutes for a hydroxyl group (OH). Radioactive fluorine is synthesized in a cyclotron [13]. After intravenous injection into the patient, FDG is transported from the

blood stream into the cells by glucose transporters (Glut-1 in particular). In the cell, FDG

is phosphorylated by a group of enzymes called hexokinases [14]. The additional phosphate

group prevents phosphorylated FDG from leaving the cell, resulting in FDG accumulation

in cells where glucose is transported and utilized. Hence, FDG concentration is a good

surrogate for the local rate at which the cells use glucose. Unlike glucose, FDG is cleared


out of the blood by the kidneys [15]. This results in high contrast between the signal (the

FDG trapped in cells) and the background (the FDG not trapped in cells). Cancerous cells

have abnormally high metabolism and require a lot of energy (in the form of glucose) to

sustain accelerated division. Therefore, in principle, cancerous lesions appear brighter than

normal tissue on PET scans.

The signal of a PET probe is transmitted by a pair of annihilation photons. When the

probe's radioactive label decays, it emits a positron, which annihilates with an electron within

tens of microns to a few millimeters of the decay location. The annihilation results in the

simultaneous production of two anticollinear (i.e. back-to-back) 511 keV¹ photons, called

annihilation photons.

Annihilation photons are detected using radiation detectors that surround the subject.

These photons have high energy and are therefore very penetrating. This is an advantage

for imaging because they can easily escape from the subject in which they are produced,

hence PET can image molecular probes deep within tissue. However, for the same rea-

son, annihilation photons are hard to stop and detect. To stop annihilation photons, PET

radiation detectors must be made from special materials that are dense and have a high

atomic number. Annihilation photons have a higher chance of interacting with such heavy

materials.

Most PET radiation detectors comprise a scintillation crystal, which converts the annihilation photon into light, and a photodetector. Some common scintillation crystals are

Lu2(SiO4)O:Ce, Gd2(SiO4)O:Ce and Bi4(GeO4)3 (abbreviated LSO, GSO and BGO, re-

spectively). These crystals are cut in small, discrete elements and glued together to form

2-D arrays (Figure 1.1). They are then coupled through a light guide to a sensitive photodetector, typically a photomultiplier tube (PMT). In most PET systems, one PMT can

read out multiple crystals and the light generated from one crystal can spread to multiple

PMTs (Figure 1.1). This form of multiplexing allows for fewer PMTs than crystal elements,

so fewer electronic readout channels are required. For this reason, such PET detectors are

called block detectors.

PET detectors can measure where, when and how annihilation photons interact with

them. Each time an annihilation photon is stopped by a PET detector, the system records

the time, location and energy of that interaction. The detection of an interaction with

an energy close to 511 keV is referred to as a single event. Because annihilation photons are

produced in pairs, one can assume that two single events recorded roughly simultaneously

¹By definition, one electron-volt (1 eV = 1.6 × 10⁻¹⁹ J) is the amount of kinetic energy gained by an electron when it accelerates through an electrostatic potential difference of one volt.


Figure 1.1: Basic principles of PET imaging. The radioactive molecular probe, injected in the subject, forms a spatial distribution which correlates with a biological parameter of interest. The signal is produced by the decay of the radioactive label, which leads to the production of two anticollinear 511 keV photons (red line). The photons are detected by a combination of scintillation crystals (yellow) and photomultiplier tubes (PMT). When the electronics records two photons in near coincidence, a coincidence event is generated for the corresponding line-of-response and stored in a computer for image reconstruction.

are the result of a single positron emission. Therefore, PET electronics pair up single events

by comparing their timestamp to extract coincidence events.
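This timestamp pairing can be pictured as a simple coincidence-window sort over the time-ordered stream of single events. The window width, event representation, and greedy pairing policy below are illustrative assumptions for the sketch, not the actual system electronics:

```python
# Illustrative coincidence sorter: pair up time-sorted single events whose
# timestamps fall within a short coincidence window. The window width and
# the (timestamp_ns, detector_id) event format are assumed values.
COINCIDENCE_WINDOW_NS = 10.0

def sort_coincidences(singles):
    """Pair consecutive single events whose timestamps differ by less than
    the coincidence window; each accepted pair defines one LOR."""
    coincidences = []
    i = 0
    while i < len(singles) - 1:
        t0, det0 = singles[i]
        t1, det1 = singles[i + 1]
        if t1 - t0 < COINCIDENCE_WINDOW_NS:
            coincidences.append((det0, det1))  # one LOR per event pair
            i += 2                             # both singles are consumed
        else:
            i += 1                             # unpaired single: discarded
    return coincidences

pairs = sort_coincidences([(100.0, 3), (104.0, 41), (500.0, 7),
                           (900.0, 12), (905.5, 30)])
```

In practice this logic runs in dedicated coincidence electronics, which must also handle cases the sketch ignores, such as three or more singles falling inside one window (multiple coincidences) and accidental pairings of unrelated singles (randoms).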

Coincidence events are reconstructed to estimate the location of positron emissions.

When two annihilation photons are detected roughly simultaneously, it can be inferred that

a positron was emitted in the proximity of the line that connects the two detectors involved,

called the line-of-response (LOR, shown by a red line in Figure 1.1). A typical whole-body

PET scan collects several hundred million coincidence events. From these coincidence events,

the spatial distribution of the PET probe is recovered by applying image reconstruction

algorithms. Image reconstruction uses advanced statistical or analytical methods to produce

the tomographic images, from which radiologists make their diagnosis.

How small a structure a PET system can visualize is quantified by its spatial resolution.

The spatial resolution of PET systems is mainly determined by the size of the scintillation

crystal elements. However, a sufficient number of coincidence events must also be acquired,

which is determined by the acquisition time, probe activity and the photon detection efficiency. Although in current whole-body PET the spatial resolution (5³–8³ mm³) is sufficient for many clinical applications, improvements in this domain are needed to further enhance disease imaging.

1.3 High-Resolution PET Systems

High-resolution PET systems have been designed to visualize molecular signals in more

detail. Imaging breast cancer or small research animals is a particularly active area of

research.

1.3.1 Overview

Conventional PET systems are designed for imaging a wide range of targets in humans.

Their bore is large to accommodate a variety of patient sizes, provide enough room for patients' comfort and avoid spatial resolution variations throughout the field-of-view (FOV).

Therefore, these systems are not optimized to image small subjects, such as small research

animals.

Because of limited spatial resolution, clinical PET systems can detect cancerous tumors that contain more than 10⁸ cells [4]. Improving the spatial resolution of PET can reduce

the number of cancerous cells required to produce a detectable signal, hence helping early

detection and staging. Higher resolution can also enable new applications, such as studying

protein-protein interactions in signal transduction pathways or investigating the interaction

of two populations of cells over time, such as cells of the immune system and tumor cells.

The first requirement for higher resolution in PET is to decrease the crystal size. However, the system must also have high photon detection efficiency to capture a large fraction of the annihilation photons. For small objects, such as research animals or specific organs, it is feasible to design a PET system with a small bore. The coincidence photon detection efficiency can be very high for such a system because of the increase in solid angle coverage.

Because they require a small bore, high-resolution PET systems are rarely based on the

conventional block detector design used in whole-body clinical PET. Some use semiconductor

material that can directly sense the high-energy photons instead of scintillation crystals

[16]. In other designs, the bulky PMTs are replaced by thin photodiodes [17, 18]. Optical fibers have also been used to transport the light from the tightly packed crystal arrays to multichannel PMTs placed away from the bore [19, 20]. High-resolution PET systems have many applications, but the most significant ones are imaging small animals and breast cancer.


1.3.2 Pre-Clinical PET System

Mice and rats are widely used in biomedical research as surrogates for human disease. Small

rodents have been used to develop models for human diseases [21,22]. To study these diseases

in living animals, PET systems dedicated to imaging small animals have been designed, built and even commercialized [19, 20, 23–30]. Such systems offer new opportunities to perform longitudinal studies of molecular markers in living animal subjects. Molecular imaging can reduce the duration and cost of biomedical studies since the animal does not need to be sacrificed to obtain the tracer bio-distribution. In addition, the study can be repeated at multiple time-points. Last, the animal can serve as its own control in experiments designed to evaluate the efficacy of a treatment.

Imaging small animals with PET requires that small structures can be resolved. A mouse

brain is on average 2,700 times smaller (in volume) than a human brain. Therefore, small-animal PET requires spatial resolution better than 0.6³ mm³ to image a mouse with a level of detail equivalent to imaging a human subject with a standard PET system.
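The resolution target above follows from a quick scaling argument, sketched here as arithmetic:

```python
# Back-of-the-envelope check of the resolution scaling argument: a mouse
# brain is ~2,700 times smaller in volume than a human brain, so linear
# dimensions scale by the cube root of 2,700.
linear_ratio = 2700 ** (1 / 3)             # ~13.9

# Whole-body PET resolves volumes of roughly 5^3 to 8^3 mm^3, i.e. a
# linear resolution of 5-8 mm. Dividing by the linear scale factor gives
# the equivalent resolution needed for a mouse.
mouse_equivalent_mm = 8.0 / linear_ratio   # ~0.57 mm, hence the ~0.6 mm target
```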

Figure 1.2: High-resolution small-animal PET scanner based on CZT detector slabs with cross-strip electrodes, with 8 × 8 × 8 cm³ useful FOV.

A small-animal PET system (Figure 1.2)

is under development using detectors based

on a semiconductor material called Cadmium

Zinc Telluride (CZT) [16]. Unlike scintil-

lation crystals, semiconductor detectors di-

rectly produce electronic charges when they

are hit by annihilation photons. In these detectors, a strong electric field is established across the crystal by applying a relatively large potential difference (a few hundred volts) on the two electrodes (anode and cathode) on either face of a monolithic crystal

slab. When an incoming annihilation photon

interacts with the atoms in the semiconduc-

tor crystal, electron-hole pairs are created and

drift toward opposite faces where they are de-

tected by readout electronics. The motion of

the charge induces signals on the respective

electrodes. These signals are used to extract spatial, energy, and temporal information [16].

CZT detectors have high energy and spatial resolution [31]. A detector module in devel-

opment [16] uses a set of parallel, very thin rectangular strips for the anode and an orthogonal


set for the cathode. The x − y coordinate of the interaction (as defined in Figure 1.2) is

determined by the intersection of the strips on either side of the crystal slab that record a

signal above threshold. The pitch with which the electrodes are deposited determines the

intrinsic spatial resolution in that direction. The z coordinate along the direction orthogonal

to the electrode plane is determined using either the ratio of the cathode to anode signal

pulse heights, or the arrival time dierence. In this direction, the intrinsic resolution is below

1 mm full-width half-maximum (FWHM). The eective detection elements are 1×5×1 mm3

(in the coordinate system of Figure 1.2). In this cross-strip electrode design, the signals are

multiplexed, thereby reducing the number of electronic channels required. In addition, the

3-D coordinates of the energy deposition for individual photon interactions can be measured.
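As an illustrative sketch of how such a cross-strip readout might be decoded in software — the strip pitch, threshold, and linear depth model below are assumptions for the example, not the actual detector's calibration:

```python
# Illustrative decoding of one interaction in a cross-strip CZT slab.
# Strip pitch, threshold, and the linear cathode/anode depth model are
# assumed values for this sketch, not the actual system's parameters.
PITCH_MM = 1.0        # anode/cathode strip pitch (assumed)
THRESHOLD = 0.1       # minimum pulse height to accept a strip as hit

def decode_interaction(anode_pulses, cathode_pulses, depth_mm=5.0):
    """Estimate (x, y, z) from anode/cathode strip pulse heights.

    x and y come from the strongest strip above threshold on either face;
    z is estimated from the cathode-to-anode pulse-height ratio, which
    varies with interaction depth (a linear model is assumed here).
    """
    x_strip = max(range(len(anode_pulses)), key=lambda i: anode_pulses[i])
    y_strip = max(range(len(cathode_pulses)), key=lambda i: cathode_pulses[i])
    if anode_pulses[x_strip] < THRESHOLD or cathode_pulses[y_strip] < THRESHOLD:
        return None  # no valid signal on both faces

    ratio = cathode_pulses[y_strip] / anode_pulses[x_strip]
    z = min(max(ratio, 0.0), 1.0) * depth_mm   # assumed linear depth model
    return (x_strip * PITCH_MM, y_strip * PITCH_MM, z)

hit = decode_interaction([0.0, 0.05, 0.9, 0.0], [0.0, 0.8, 0.02, 0.0])
```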

In the small-animal PET system based on CZT detector modules (Figure 1.2), 40 × 40 × 5 mm³ slabs of CZT are arranged edge-on with respect to the incoming annihilation photons to form an 8 × 8 × 8 cm³ FOV [32]. The system has high photon detection efficiency (18% of the positrons emitted at the center of the FOV produce coincident events) and high spatial resolution (1 × 1 × 1 mm³).

1.3.3 Breast Cancer PET System

Figure 1.3: High-resolution PET camera for breast cancer, with adjustable panel separation.

Breast cancer is the most common type of

cancer among women. When it is detected early, new treatments can greatly improve the patient survival rate. Breast cancer management

currently involves whole-body PET imag-

ing. Post treatment, whole-body PET is

used to monitor how cancer responds to

therapy or its possible recurrence. Because

whole-body PET systems have spatial resolution greater than 5³ mm³, they cannot visualize small cancerous lesions. It is particu-

larly important to detect and visualize duc-

tal carcinoma in situ (DCIS), a non-invasive

condition that can lead to invasive ductal carcinoma (IDC), an aggressive form of breast

cancer. In this disease, abnormal cells multiply and form a growth within a milk duct.

High-resolution PET can be a powerful tool in the management of breast cancer. Standard x-ray mammography can visualize the micro-calcifications associated with DCIS. However, 25–30% of x-ray mammography studies produce inconclusive results; therefore, there is a need for more sensitive cancer detection. In addition, many breast-conserving lumpec-

tomies require multiple surgeries when the extent of the disease is underestimated, due to

the presence of residual cancerous cells on the outer surface of the tissue specimen (positive

margins). A breast-dedicated PET system might be able to assess the presence of cancer in

inconclusive mammography studies. In addition, such a system can quantify the margins of

the disease and guide tumor biopsy and resection with more accuracy. Furthermore, it can

be used to monitor local breast cancer recurrence with high sensitivity.

Several high-resolution PET systems specific to breast cancer are being developed [33–36]. Most designs place the detectors close to the breast to improve photon detection sensi-

tivity. In one possible geometry, the PET detectors are arranged in two opposing panels in

a way similar to x-ray mammography. The breast might be slightly compressed for higher

image quality. Alternatively, the panels can be retracted to allow for rotation. Another

possible geometry arranges the detector in a ring that fully encircles the breast.

1.4 Image Reconstruction Challenges

Reconstructing images from high-resolution PET systems presents a number of challenges

that stem from the small crystal size and unique detector geometries.

1.4.1 Image Reconstruction Complexity

Over the years, the number of LORs in PET systems has increased by orders of magnitude

(Figure 1.4 and [37]). This trend has been driven by smaller detector crystals, more accurate

3-D photon positioning, larger solid angle coverage and 3-D acquisition. These advances

have boosted the spatial resolution and the photon detection eciency of PET systems.

However, they have made the task of reconstructing images from the collected data more

difficult. The demand in computation and memory storage for high-resolution PET has

exploded, outpacing the advances in memory capacity and processor performance [38].

By accounting for the stochastic nature of the imaging process, statistical image reconstruction methods [39, 40] offer a better trade-off between noise and resolution in comparison to other methods, such as filtered backprojection [41]. These methods incorporate an accu-

rate imaging model represented by the system matrix, which maps the image voxels to the

LORs. The system matrix is gigantic for high-resolution 3-D PET systems with billions of

LORs. As a consequence, statistical methods are computation and memory intensive.
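For intuition, the core computation of one such statistical method — the maximum-likelihood expectation-maximization (ML-EM) update — can be sketched with a small dense system matrix. The toy dimensions and simulated data below are illustrative; the point of the surrounding discussion is precisely that the full matrix cannot be stored this way at realistic scale:

```python
import numpy as np

# Toy ML-EM reconstruction with an explicit (dense) system matrix A, where
# A[i, j] is the probability that an emission in voxel j is detected on
# LOR i. Real high-resolution systems have billions of LORs, so A cannot
# be stored densely. Matrix and data here are illustrative only.
rng = np.random.default_rng(0)
n_lors, n_voxels = 64, 16
A = rng.random((n_lors, n_voxels))
x_true = rng.random(n_voxels)
y = rng.poisson(A @ x_true * 50)         # simulated noisy counts

x = np.ones(n_voxels)                    # uniform initial image
sens = A.sum(axis=0)                     # sensitivity image
for _ in range(50):
    y_est = A @ x                        # forward projection
    ratio = np.where(y_est > 0, y / y_est, 0.0)
    x = x / sens * (A.T @ ratio)         # backproject ratio and update
```

One useful sanity check: each ML-EM iteration preserves the total measured counts, i.e. the sensitivity-weighted image sum equals the sum of the data.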

The issues arising from the size of the system matrix have been addressed by various

methods. The system matrix can be factored into the product of smaller components that


are stored in memory [42]. Some implementations also compute parts (such as solid angle)

of this factorization on-the-fly, which saves memory but increases the processor workload. The system matrix can also be compressed using symmetries and near-symmetries [43], and extracted only when needed to limit the memory profile. However, all of these methods

degrade the accuracy of the reconstruction because they involve approximating the system

matrix.

Figure 1.4: Trend in the number of LORs for PET systems (adapted from [37] with permission).

Another approach to reduce the com-

plexity of the reconstruction consists in re-

binning the 3-D projections into a stack of

2-D sinograms that can be reconstructed

independently using a 2-D reconstruction

method. Fourier rebinning (FORE), com-

bined with a 2-D iterative reconstruction

method [44], is an order of magnitude

faster than a direct 3-D reconstruction

method. Furthermore, it produces images

that are not significantly degraded compared to 3-D OSEM for whole-body clinical

scanners [45]. However, for high-resolution

pre-clinical PET systems, the average num-

ber of counts recorded per LOR is low (i.e.

the data is sparse). As a consequence, the measured projections do not reflect the ideal line

integral and the potential for resolution recovery is lost with this approach [42].

A better way to deal with the high dimensionality of the measurements is to perform

the reconstruction in list-mode. List-mode is an efficient format to process sparse data

sets, such as dynamic or low count studies. In this format, the LOR index and other

physical quantities (e.g. time, energy, TOF, depth-of-interaction, or incident photon angle)

are stored sequentially in a long list as the scanner records the events. Reconstruction can

be performed directly from the raw list-mode data using on-the-fly calculations, which is

particularly appealing for dealing with the parameter complexity as well as sparseness of

the measured data.
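A list-mode stream can be pictured as a flat sequence of fixed-size event records consumed on-the-fly. The field layout below is a hypothetical example for illustration, not the format of any particular scanner:

```python
import struct

# Hypothetical fixed-size list-mode event record: LOR index, timestamp,
# and deposited energy, packed sequentially as the scanner records events.
# The field widths and layout are illustrative, not any scanner's format.
EVENT = struct.Struct("<IQf")   # uint32 LOR id, uint64 time (ns), float32 keV

def write_events(events):
    """Serialize (lor, time_ns, energy_kev) tuples into a list-mode stream."""
    return b"".join(EVENT.pack(*e) for e in events)

def read_events(stream):
    """Iterate over events directly from the raw stream. No histogramming is
    performed, so memory cost scales with the number of recorded counts,
    not with the number of possible LORs."""
    for lor, t, e in EVENT.iter_unpack(stream):
        yield lor, t, e

stream = write_events([(102334, 1_000, 511.0), (98, 1_450, 498.5)])
events = list(read_events(stream))
```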

Iterative image reconstruction is computationally intensive, whether the data is in list-mode, histogram-mode or sinogram format. Graphics processing units (GPUs) have been used

with success as a practical way of accelerating the reconstruction of sinograms. Yet, these

GPU-based techniques are not directly applicable to list-mode or histogram-mode datasets.


The main challenge in implementing list-mode reconstruction on the GPU is that, unlike

sinograms, the list-mode elements are not arranged in any regular pattern. In addition,

the projection operations must be line driven, which means that the processing must be

performed on individual list-mode elements. Therefore, a new technique is required to allow

GPUs to process individual list-mode elements, described by arbitrary endpoint locations.

1.4.2 Shift-Varying System Response

A reconstructed point source appears smaller when placed at the center of a PET scanner than near the edge of the FOV. This effect, referred to as parallax error, is one example of the

shift-varying response of PET, which depends upon the location of the positron decay and

the orientation of the resulting LOR. Crystals in PET are long and narrow and they are

oriented facing the center of the FOV. When a photon is emitted near the center of the

system, it only sees the narrow dimension of the crystal. However, when a photon is emitted

close to the edge of the FOV, it also sees the long dimension of the crystal (Figure 1.5).

As a result, the reconstructed spatial resolution is not uniform, which is a confounding issue

in PET [46].

Figure 1.5: Depiction of parallax error in ring-shaped PET systems.

The bore diameter of conventional PET systems is designed to be much larger

than the typical patient to ensure that the

spatial resolution remains roughly constant

throughout the useful FOV. This drives the

cost of the system up since more crystal ma-

terial is needed, and it also results in a de-

crease in the solid angle coverage and a sub-

sequent degradation of the photon sensitiv-

ity.

Small-animal PET scanners have a small

bore for high photon sensitivity, and therefore are subject to parallax errors. This eect can

be mitigated by having several layers of shorter detection elements in place of one layer

of long detection elements. For example, the system described in 1.3.2 has eight layers of

5 mm-long crystals. This allows the useful FOV to extend to the edge of the detector.

Yet, the DOI capability is not sufficient to compensate for parallax error completely. Figure

1.6a shows the spatial response of a few LORs. The wide shape of certain oblique LORs

indicates that spatial resolution is degraded on these LORs. Figure 1.6b shows simulated

hot spheres in air, placed on a grid extending to the edge of the FOV and reconstructed


(a) (b)

Figure 1.6: (a) Depiction of the spatially-varying response of five LORs. Horizontal and vertical LORs only cover the small dimension of the crystal (1 mm) and therefore provide the highest spatial resolution. In contrast, oblique LORs suffer from significant parallax errors due to the 5:1 crystal aspect ratio. (b) A simulated PET acquisition of spheres filled with activity (diameters 1, 1.25, 1.5, and 1.75 mm) placed in four quadrants in the system central axial plane.


Figure 1.7: Polar plot of the coincident detector response (full-width half-maximum, mm) as a function of LOR angle, at the center of the FOV, for three CZT systems with 1 × 1 mm² crystal pitch. Two of the systems (blue and black) are arranged in a box geometry (see Figure 1.2) with 5 and 1 mm depth-of-interaction (DOI) layers. The third system (red) is built in a ring geometry, with no DOI capability.


with an iterative method. The spheres closer to the detectors suffer from substantial spatial

resolution degradations. In a box-shaped geometry with 5 mm DOI, parallax errors are not

limited to the edge of the FOV. Figure 1.7 shows that at the center of the FOV, the spatial

resolution (measured as the FWHM of the coincident detector response) can vary from 0.5 to

1.6 mm, depending on the LOR angle. The issue of shift-varying system response is critical

to obtaining high quality images with good spatial resolution. Hence, several approaches

can be implemented to provide uniform reconstructed spatial resolution.

The incorporation of an accurate model of the spatially-variant response of PET has been

shown to help reduce quantitative errors [42,43,47,48] and improve resolution uniformity by

deconvolving the system blurring. Yet, including the contribution of voxels that are o of

the LOR axis increases the number of voxels processed by an order of magnitude. Accurate

reconstruction with a detailed resolution model is computationally demanding and typically

requires large computer clusters. Therefore, a new approach is needed to perform list-mode

reconstruction with shift-varying kernels within practical processing times.

1.4.3 Multiple-Interaction Photon Events

High-resolution PET requires detector modules comprising small detection elements. In

these detectors, Compton scatter and other physical eects cause the annihilation photon to

deposit energy in multiple interaction locations in the detectors. These multiple-interaction

photon events (MIPEs) can produce misidentification of the LOR (as shown in Figure 1.8), which in turn causes loss of contrast, quantitative accuracy, and spatial resolution. For the CZT system presented in 1.3.2 (1 × 1 × 5 mm³ effective detection elements), 93.8% of the recorded coincidences involve at least one MIPE. These events must be used in the reconstruction to maintain high photon efficiency. Unless MIPEs are positioned accurately

to the location of the initial interaction, the potential performance of high-resolution PET

will not be achieved.

In a MIPE, the initial interaction defines the correct LOR for the coincidence event. Subsequent interactions are not aligned with the true LOR because Compton scatter deflects the annihilation photon from its straight trajectory. Reconstructing the complete sequence

of interactions of each photon provides a reliable way to select the initial interaction [49].

This procedure ensures that all the subsequent interactions are consistent with the choice

of the initial interaction.


Figure 1.8: Example of a coincident event recorded by a PET system based on CZT cross-strip electrode modules (Section 1.3.2). The solid red line represents the annihilation photon trajectory. In this example, a coincident pair consists of a pure photoelectric event (left) and a multiple-interaction photon event (MIPE, right). Mis-positioning of the MIPE results in misidentification of the LOR (dotted line).

1.5 Overview of this Work

This work presents a set of novel approaches adapted for high-resolution PET image recon-

struction. The methods are implemented and evaluated for the CZT-based small-animal

PET system presented in Section 1.3.2. These methods are suitable for a number of other high-resolution PET systems, including those dedicated to breast cancer (Section 1.3.3) and, to some extent, standard clinical PET systems (Section 5.4).

The methods are organized in four chapters. Chapter 2 provides some background on

mathematical models of the data collection process in PET. A new way of calculating the

response of the imaging system with very low memory overhead is presented. Owing to

the small size of the crystals, the intrinsic detector response function can be linearized,

which leads to a fast analytical expression for the coincident aperture function. The new

formulation is evaluated against Monte-Carlo simulations.

Chapter 3 contains background information on statistical image reconstruction algorithms for PET. A novel formulation of the conjugate gradient algorithm, specific to the

ML objective, is proposed and evaluated. Newton's method is also investigated for PET

reconstruction.

Chapter 4 details a new implementation of fast, shift-varying line projections using

graphics hardware. Fully 3-D, list-mode OSEM was developed based on this method on a

graphics processing unit (GPU). The iterative reconstruction algorithm was evaluated both

on simulated and real PET datasets.

Chapter 5 presents two applications of GPU-based line projection. The first application uses the GPU framework to calculate the coefficients of the system coincident detector


response function on-the-fly, and incorporates an accurate model of the data acquisition

process within list-mode iterative reconstruction. The second application uses the GPU for

image reconstruction on an existing clinical PET system that has time-of-flight (TOF) capa-

bilities. The TOF information is incorporated within the list-mode iterative reconstruction.

Chapter 6 proposes a new statistical algorithm for positioning photons in small crystal

detectors. The algorithm uses robust Bayesian estimation for reconstructing the full in-

teraction track of the annihilation photon in the detectors. An evaluation of the method,

implemented on the GPU, is performed for a high-resolution PET system made of CZT

detectors.

Chapter 2

Imaging Model for High-Resolution

PET

2.1 Principles and Theory

The measurements in PET involve complex physical processes. An accurate model of the

data collection process is important, not only to better understand the system's performance,

but also to improve the quality and accuracy of the image reconstruction.

2.1.1 Physics of PET

A PET dataset is produced by counting how many coincident photon pairs have been de-

tected for every possible pair of detection elements. We call a line-of-response (LOR) a line

that connects a pair of detector elements (Figure 1.1). In order to describe the imaging

process, the photon emission, transport and detection must be modeled.

2.1.1.1 Photon Emission

Positrons are emitted with a range of initial kinetic energies, the maximum amount of which

depends on the radionuclide. For example, the maximum kinetic energy of the positron

emitted by an 18F atom is 0.64 MeV. The positron can only annihilate with an electron

once it has given up most of its kinetic energy through inelastic collisions with atoms and

molecules. As a result, the positron-electron annihilation does not occur at the location of

the positron emission (Figure 2.1a). This fundamental blurring effect limits the resolution

of PET. For example, the positron range of 18F in water is 0.10 mm FWHM and 1.03 mm

FWTM, one of the lowest among all positron emitters [50]. The distribution of the positron


(a) (b)

Figure 2.1: (a) A radioactive 18F atom decays by emitting a positron with some initial amount of kinetic energy. The positron propagates through matter, losing its kinetic energy through Coulombic interactions until it annihilates with an electron and produces two roughly anticollinear 511 keV photons. (b) Spatial distribution of the positron annihilations for positrons emitted at the origin (from [50]).

annihilation locations is isotropic and well modeled by a cusp-like response function for

homogeneous materials (Figure 2.1b). The distribution of the positron annihilations in a

homogeneous material is obtained by convolving the tracer spatial distribution with the

response function. Modeling of positron range in inhomogeneous material is more complex

because the width of the blurring kernel depends on the density and effective Z of the

material.
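For a homogeneous material, the blurring step described above amounts to a convolution of the tracer distribution with a cusp-like kernel. In the 1-D sketch below, a double-sided exponential is used as a simplified stand-in for the measured annihilation-point distribution, and the 0.1 mm scale loosely mimics the narrow core of the 18F kernel — both are assumptions for illustration:

```python
import numpy as np

# Sketch of positron-range blurring in a homogeneous material: the tracer
# distribution is convolved with a cusp-like kernel. A double-sided
# exponential stands in for the measured annihilation distribution; the
# 0.1 mm scale is an illustrative value, not a fitted kernel.
dx = 0.01                                  # grid spacing (mm)
x = np.linspace(-2.0, 2.0, 401)

kernel = np.exp(-np.abs(x) / 0.1)          # cusp-like response function
kernel /= kernel.sum()                     # normalize to unit area

tracer = np.zeros_like(x)
tracer[len(x) // 2] = 1.0                  # point source at the origin

annihilations = np.convolve(tracer, kernel, mode="same")
```

Because convolution preserves total activity, the blurred distribution still sums to one; only its spread changes.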

In addition to the positron range, the two annihilation photons are not always emitted in

exactly opposite directions. Due to fluctuations in residual positron and electron momenta,

conservation of momentum implies that the summed momentum of the annihilation photons

is also not zero, leading to photon acolinearity. The angle between the two photons is

approximated by a Gaussian distribution with mean 180 degrees and FWHM 0.23 degrees [51].

The contribution of photon acolinearity to spatial resolution depends on the separation D

between the two detectors hit and is approximated by 0.0022 × D [50]. For small-animal

PET systems with small bore diameter, the contribution of photon acolinearity to the spatial

resolution is often neglected. For example, for a bore diameter of 80 mm, the resolution

blurring due to photon acolinearity is less than 0.2 mm FWHM.
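The 80 mm example can be verified directly from the 0.0022 × D approximation:

```python
# Acolinearity blurring approximated as 0.0022 x D (FWHM), where D is the
# separation between the two detectors hit. For an 80 mm small-animal bore:
def acolinearity_fwhm_mm(separation_mm):
    return 0.0022 * separation_mm

blur = acolinearity_fwhm_mm(80.0)   # 0.176 mm, i.e. below 0.2 mm FWHM
```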

2.1.1.2 Photon Detection

One factor that impacts the spatial resolution is the detector geometry. Unlike other physical

processes, the detector geometry is determined by design and can be optimized for a specic

goal. The resolution of a radiation detector is quantified by its intrinsic detector response


Figure 2.2: Intrinsic detector response function gθ(X) (red) for a crystal array without (left: normal photon; middle: oblique photon) and with DOI positioning capabilities (right). The scintillation array with DOI capabilities results in a narrower IDRF, hence better spatial resolution.

function (IDRF). The IDRF describes the response of a single detector to a flux of photons. For a needle beam of 511 keV photons, aimed at a detector with an angle θ and offset X,

the number of photons detected per second in the detector is given by I0gθ(X), where I0 is

the intensity of the beam (in photons per second) and gθ(X) is the IDRF (see Figure 2.2).

The photons detected in PET have high energy (511 keV). Therefore, standard PET

systems use long and narrow crystals to produce both high photon detection efficiency and

high spatial resolution. For this design to work, however, the crystals must always present

the narrow face to the incoming photons. This requires that crystals be arranged in a ring

geometry, far from the subject. Yet, small-animal PET systems place the detectors close

to the animal; therefore, the photons emitted near the edge of the FOV are more likely to

enter the detectors obliquely. In the CZT system described in 1.3.2, the box geometry

further complicates the situation since photons can hit the detector obliquely, regardless

of where they were emitted.

Parallax errors resulting from oblique photons can be mitigated by measuring the 3-D

coordinate of the interaction (or depth of interaction, DOI). This can be achieved by

segmenting the crystal array in the depth dimension (Figure 2.2, right) [16,17], or by other

schemes, such as reading out a continuous scintillation crystal element from both sides [52].

Formulating the response of the detector in terms of the IDRF neglects an important

component of the spatial resolution. A 511 keV photon can either interact with detector

material by undergoing photoelectric conversion or Compton scatter. In photoelectric con-

version, the total energy of the photon is transferred to a bound electron and the photon

disappears. In Compton scatter, the photon interacts with an unbound or loosely bound

electron. Due to conservation of momentum, the photon cannot transfer all of its energy.

The photon instead transfers a portion of its energy to the recoil electron, and is deflected


from its initial trajectory. The scattered photon might then either escape from the system or

interact further with the detectors, leaving behind a track of interactions. The average

number of interactions depends on the photon's initial energy, the detector material and the

size of the detection elements.

In the standard PET detector, the scintillation crystal array is coupled to one or more

light detectors. Because one photon detector typically reads out multiple scintillation crys-

tals, the light signals are multiplexed. Charge can also be multiplexed in the position-sensitive

photo-detector or in the associated readout circuit. This results in a few (typically four)

readout channels. Such detectors estimate the photon interaction coordinates for each event

by determining the weighted mean of the readout signals. Therefore, individual interaction

coordinates and their deposited energies cannot be determined in the standard PET detec-

tor [16]. For these systems, Compton scatter in the detectors is a blurring factor that cannot

be corrected with signal processing algorithms.

Some more recent PET system designs allow readout of multiple interactions [17, 23,

37, 53]. In the CZT cross-strip electrode design (1.3.2), the 3-D coordinates and energy

deposition for every interaction can be recorded. All these systems are able to distinguish the

photons that deposit their energy in a single detection element from those which deposit their

energy in multiple detection elements. For systems where high resolution is a requirement,

these latter photons can be discarded or included, provided that appropriate identification

methods exist (such as the method presented in Chapter 6).

2.1.1.3 Photon Transport

Another complication is the possibility of scatter and absorption of the photon before it

reaches the detector. Even though the photon energy is high (511 keV), some photons

interact in the subject. As a consequence, photons can be absorbed or scattered by the tissue.

Photon absorption always decreases the number of coincidence events measured along a given

LOR. Photon scatter can either increase or decrease the correct number of coincidence events

measured along an LOR because photons might scatter out of the LOR, or into the LOR

(Figure 2.3, middle). The measurement of the photon energy can help reject tissue scattered

events. When the energy of an interaction is measured to be much lower than 511 keV, it

can be inferred that the photon either scattered in tissue, or deposited only a fraction of its

total energy in the detection element. The finite energy resolution of the detector limits how

accurately scattered events can be identified. For a clinical 3-D PET acquisition, 40-60% of

all the recorded coincidences include scattered events [54].

Photon attenuation includes both the effects of photon absorption and photons scattering


out of the LOR. In PET, the attenuation is constant along the LOR because the total

distance traveled by the two annihilation photons does not depend on the location where

these photons were emitted. Therefore, the photon attenuation factor for LOR i is modeled

as

ωi = exp(−∫_LORi µ(r) dr)  (2.1)

where µ(r) is the spatial distribution of the total attenuation coefficient at 511 keV.
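Equation (2.1) amounts to a line integral followed by an exponential. A minimal numerical sketch: the water-equivalent µ ≈ 0.096 cm⁻¹ at 511 keV is a standard figure, while the 20 cm path and the function name are invented for illustration:

```python
import numpy as np

# Numerical sketch of (2.1): the attenuation factor for one LOR is
# exp(-line integral of mu along the LOR). Here mu is sampled on a 1-D
# grid of step ds (cm) along the LOR.
def attenuation_factor(mu_samples, ds_cm):
    return np.exp(-np.sum(mu_samples) * ds_cm)

# 20 cm of water-equivalent tissue along the LOR, sampled every 1 mm:
mu = np.full(200, 0.096)            # cm^-1 at 511 keV
omega = attenuation_factor(mu, 0.1)
print(omega)  # ~0.147: only ~15% of coincidences survive attenuation
```

Note that, as the text explains, this factor is the same for every emission point on the LOR, since the two photons always traverse the full chord in total.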

Tissue scattered events increase the number of coincidence events detected along a given

LOR. The contribution of these events is denoted yri in (2.3). Modeling analytically the

dependence of the scatter distribution on the tracer spatial distribution is difficult. Accurate

scatter distributions can be obtained by Monte-Carlo methods [54]. When an approximate

distribution is acceptable, faster methods such as the single scatter simulation [55] have been

shown to yield reasonably accurate results.

2.1.1.4 Mathematical Model

Mathematically, a PET dataset consists of a non-negative integer vector m ∈ NP , which

represents the number of coincidence events recorded for all P LORs in the system. PET

imaging is a stochastic process due to the limited number of discrete events recorded. There-

fore, a PET dataset is not well modeled by a deterministic quantity. Two scans of the same

object can differ quite substantially. Instead, a random vector Y is used to describe the

stochastic distribution of the measurements. The components Yi (i = 1 . . . P is the LOR

index) are independent and follow a Poisson distribution with mean yi

Yi ∼ Poisson(yi). (2.2)

A sample measurement mi is a realization of Yi. The measurements m are often referred to

as projections because they are roughly equal to the integral of the tracer spatial distribution

along the LOR, also known as the Radon transform [56].
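The stochastic nature of the measurements is easy to demonstrate numerically. In the sketch below, the mean counts are arbitrary illustration values, not data from a real scan; two "scans" are two independent realizations of the Poisson model (2.2):

```python
import numpy as np

# Sketch of the measurement model (2.2): Y_i ~ Poisson(y_i), independently
# across LORs. Two scans of the same object are two draws from Y.
rng = np.random.default_rng(0)
y = np.array([4.0, 25.0, 100.0, 400.0])   # expected counts per LOR (invented)

scan1 = rng.poisson(y)   # first realization
scan2 = rng.poisson(y)   # second realization, same mean vector
print(scan1, scan2)

# Relative noise (std/mean = 1/sqrt(y)) shrinks as counts grow:
print(1.0 / np.sqrt(y))  # [0.5, 0.2, 0.1, 0.05]
```

This is why two acquisitions of the same object differ, and why longer scans (larger y) yield less noisy projections.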

The expected number of coincidence events y on each LOR as a function of the tracer

spatial distribution is well described by a linear model, provided that the amount of activity

in the FOV is not too high. At high count rate, pulse pile-up and dead-time lead to saturated

output. For lower activity levels, the expected measurements recorded by the scanner depend

linearly on the internal tracer distribution. The process of data collection is naturally

represented by a discrete-continuous model that relates the discrete vector of measurements

to the continuous tracer spatial distribution [57]. The volumetric tracer distribution is well

described by a 3-D function f(r) of the spatial variable r. The most general formulation


Figure 2.3: The three types of coincidences in PET. (left) true coincidence; (middle) tissue-scattered coincidence; and (right) random coincidence. The black dashed line represents the incorrect LOR recorded by the system in the case of the scattered and random coincidences.

of the imaging process is based on a spatially-varying response. The contribution from a

point of unit strength located at r to a LOR i is represented by a kernel hi(r), called the

coincidence detector response function (CDRF). Hence, the expected measurement yi on

LOR i can be expressed as

yi = ηi ωi ∫_Ω f(r) hi(r) dr + ysi + yri  (2.3)

where Ω is the support of the tracer spatial distribution. The additive terms ysi and yri account for tissue scattered and random coincidences (see Figure 2.3). Both terms depend

upon the tracer distribution f(r) and various models have been proposed to express this

dependency [55, 58]. The multiplicative factors ηi and ωi model respectively the effect of

small variations in detector efficiency and photon attenuation by the subject along the

LOR. The detector efficiency ηi is calibrated by performing a normalization scan [59, 60].

The photon attenuation ωi is measured either by performing a special transmission scan of

the patient using an external positron emitter, or using a previous scan from an anatomical

imaging modality such as X-ray CT [61].

The spatial response of PET systems is determined by a number of factors. The physical

processes involved in the photon production and detection affect the spatial resolution.

Physical processes involved in photon transport (tissue scatter and photon attenuation)

instead impact the contrast of the reconstructed images.


2.1.2 Spatially Variant and Invariant Models for Discrete Image Representations

Image reconstruction consists of solving for the spatial distribution of tracer f(r) given a set of measurements m. A frequent simplifying assumption is that the tracer distribution can

be expressed as a linear combination of basis functions bj(r). We denote f(r) the resulting approximation and xj (j = 1 . . . N is the voxel index) the basis coefficients, so that

f(r) = ∑_{j=1}^{N} xj bj(r).  (2.4)

The resulting discrete-to-discrete model for PET can be expressed as

yi = ηi ωi ∑_{j=1}^{N} aij xj  (2.5)

where the system matrix coefficients aij satisfy

aij = ∫_Ω hi(r) bj(r) dr.  (2.6)
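As an illustration of the discrete-to-discrete model (2.5), the toy example below forward-projects a three-voxel image through an invented 3 × 3 system matrix; the sizes, matrix entries, and calibration factors are all made up, and the scatter and random terms of (2.3) are omitted:

```python
import numpy as np

# Toy instance of (2.5): y_i = eta_i * omega_i * sum_j a_ij * x_j,
# i.e. y = D A x in the matrix form used later in (2.7).
A = np.array([[1.0, 0.5, 0.0],    # row i = discretized CDRF of LOR i
              [0.0, 1.0, 0.5],
              [0.5, 0.0, 1.0]])
eta = np.array([1.0, 0.9, 1.1])   # detector efficiencies (normalization scan)
omega = np.array([0.8, 0.7, 0.9]) # attenuation factors (transmission scan)
x = np.array([10.0, 0.0, 5.0])    # voxel activities

D = np.diag(eta * omega)
y = D @ A @ x                     # expected counts per LOR
print(y)                          # [8.0, 1.575, 9.9]
```

In a real system A is huge and sparse, so each row ai (the discretized tube-of-response) is either stored in sparse form or recomputed on the fly, as discussed below.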

The response of most PET systems is spatially varying. For a small-animal PET system

made by arranging CZT detectors in a box geometry (Figure 1.2), the response depends

on the position and orientation of the LOR. In addition, the amount of blur varies along

the LOR (Figure 1.6). Such systems require a shift-varying model. The linear relationship

between a tracer distribution and the expected measurements (2.5) can be written in matrix

form:

y = DAx (2.7)

where A = (aij) ∈ RP×N is the system matrix and D = diag(ηi ωi) is a diagonal matrix

obtained by performing a transmission and a normalization scan. In this model, the mea-

surements of the tracer distribution along each LOR can be described by a custom model,

i.e. yi = ηi ωi ai^T x, where ai is the ith row of the system matrix. The vector ai is a discretized

representation of the CDRF hi(r) that takes into account the contribution of every voxel to

LOR i. For a typical PET system, ai is sparse and has non-zero values only inside a volume

centered on the LOR called the tube-of-response (TOR).

The CDRF is not to be confused with the point-spread function (PSF). For a discrete


representation of the tracer spatial distribution, the CDRF is defined for a given LOR and

forms a row of the system matrix A (Figure 2.4). The PSF is the response of the system to

an impulse vector δj0. Therefore, the PSF is defined for a given voxel location j0 in the image

and is equal to the corresponding column of the system matrix A (Figure 2.4). Since the

imaging system is shift-varying, the PSF is different for every voxel. The PSF is inherently

discrete since the output of a PET system consists of a discrete set of measurements. In

contrast, the CDRF is a continuous function that models the kernel of integration used in

the linear shift-varying, discrete-continuous model of the data acquisition process (2.3).

Figure 2.4: Depiction of the system matrix. The PSF for voxel j = 0 and the CDRF for LOR i = 0 are shown.

Spatially-invariant models, also called shift-invariant, have also been used to represent the system matrix. The projection y of a tracer distribution x using a shift-invariant model satisfies

y = D Q X R x  (2.8)

where the factors Q X R together play the role of the system matrix A: X represents the geometrical projector, R the 3-D convolution of the volume with a shift-invariant resolution kernel, and Q the 2-D convolution of every projection view with another resolution kernel. The projector X assumes ideal line integrals and is typically a discrete Radon transform, although various interpolation schemes can be used. A shift-invariant resolution model therefore consists of a 3-D convolution kernel describing spatially-uniform blurring in the image, and a 2-D convolution kernel describing spatially-uniform blurring in the projection space. For example, positron range blurring (see 2.1.1) is well described by the former, and detector cross-talk by the latter.
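The factorization in (2.8) can be sketched in one dimension. The operators and kernels below are invented toys (the "projector" is just the identity); the point is only the order in which R, X, and Q are applied:

```python
import numpy as np

# 1-D sketch of the factored, shift-invariant model (2.8): y = D Q X R x.
def convolve_same(v, k):
    return np.convolve(v, k, mode="same")

x = np.zeros(9); x[4] = 1.0                # point source in the image
r_kernel = np.array([0.25, 0.5, 0.25])     # image-space blur (e.g. positron range)
q_kernel = np.array([0.25, 0.5, 0.25])     # projection-space blur (e.g. cross-talk)

blurred = convolve_same(x, r_kernel)       # R x
proj = blurred.copy()                      # X: identity "projector" in this 1-D toy
y = convolve_same(proj, q_kernel)          # Q X R x  (D omitted: all factors 1)
print(y.round(4))
```

Because the two blurs are small separable convolutions, each application costs only a few multiplies per sample, which is the computational appeal of the factored model compared to storing the full matrix A.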

Shift-varying models are a more accurate representation of the response of PET sys-

tems since the system matrix A can rarely be factored as QXR. Yet, they require more

computation and memory. The simple projection X and the kernel convolutions used in

shift-invariant projections (2.8) require little computation, especially if the kernels are separable. In contrast, the matrix A (2.7), which maps the image voxels to the scanner detectors and models the imaging process, can be gigantic [62] (even after accounting for its sparseness).

2.2 Analytical Calculation of the Coincident Detector Response Function

The response of a PET system can be either measured, simulated or calculated analytically.

In the first case, a point-source (made of a long-lived positron emitting isotope sealed in a

small capsule) is stepped by a robotic arm through the scanner FOV [48]. The PSF of the

system is measured by acquiring a long scan for every point-source position. This process

requires several weeks of acquisition as well as large memory storage. Regularization is

sometimes performed to obtain a smooth PSF from fewer events or point source locations.

Measuring the PSF is labor-intensive. As a result, Monte-Carlo simulations are often

performed instead [43, 63]. The geometry of the system, consisting of the position and size

of all the detection elements, is entered in the Monte-Carlo program. Billions of positrons

are randomly generated throughout the FOV. The resulting photons are tracked using ray-

tracing techniques and physical process simulations. When photons deposit energy in a

detection element, an event is created and recorded. These events undergo further processing

to replicate the generation and processing of a real PET signal. The PSF at a given location

is available by compounding all the events that originated from positrons emitted in the

proximity of that location.

The third class of methods uses analytical models to compute the PSF [47,64]. The spa-

tial resolution in PET is the product of multiple factors (see 2.1.1), and therefore there does

not exist a perfect analytical model that includes everything. Analytical models attempt to

capture the dominant effects. For the box-shaped PET system studied in this work, we have

assumed that the geometrical response of the detectors dominates over all the other blurring

processes. The justification for this is as follows: First, positron range is an image-based

factor that can be factored out of the system matrix. Then, owing to the small diameter

of the bore (80 mm), photon acolinearity is a small blurring effect (∼0.2 mm) [50]. Last,

the resolution-degrading effect of MIPEs will be removed before reconstruction by a special

approach detailed in Chapter 6. In contrast, the detector response blurring is on the order

of W cos θ + T sin θ, where W stands for the crystal width (1 mm for the system studied)

and T for the thickness of the crystal (5 mm) and θ is the photon incidence angle [65].
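Plugging the text's crystal dimensions into the W cos θ + T sin θ expression shows how quickly the geometric blur grows with incidence angle (the function name is ours):

```python
import math

# Geometric detector blur, of order W*cos(theta) + T*sin(theta) [65],
# for the 1 mm wide x 5 mm thick crystals of the system studied.
def detector_blur_mm(theta_deg, W=1.0, T=5.0):
    th = math.radians(theta_deg)
    return W * math.cos(th) + T * math.sin(th)

print(detector_blur_mm(0.0))             # 1.0 mm: normal incidence, width only
print(round(detector_blur_mm(45.0), 2))  # 4.24 mm: oblique photons see the 5 mm depth
```

This steep growth with θ is why the geometrical response, rather than positron range or acolinearity, dominates in the box-shaped geometry.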

An accurate framework for calculating the CDRF was developed in the early days of

PET [66-68] in order to design better systems without requiring computationally expensive

Monte-Carlo simulations. The key finding of the method is that the CDRF can be computed


Figure 2.5: Geometry used for calculating the CDRF. The two crystals (blue) can be oriented arbitrarily with respect to each other. The integration is performed over φ within the integration cone (light red). The CDRF is calculated at a location r in the FOV (light yellow), offset by X with respect to the LOR axis (dashed line), at a distance sA and sB from each crystal.

approximately by convolving a scaled version of the IDRF of each crystal. The IDRF itself

is computed analytically based on linear attenuation of the photons in the detectors.

For any LOR i and any location r in the FOV, the CDRF is obtained by summing the

response of the crystal pair to a pencil beam of oppositely-directed photons over the range of

all admissible angles φ (Figure 2.5). Two rectangular detection elements in coincidence are

rotated by an angle θA and θB, respectively, with respect to the LOR axis (horizontal dashed

line). Coincidences are possible for positron annihilations that occur in the convex hull of

both crystals (area shaded in light yellow). Positrons that annihilate outside that area do

not contribute to the CDRF because of the coincidence criterion. Assuming a positron and

an electron annihilate at a location r, shown on the figure, two anti-colinear photons will be

emitted at an angle φ with respect to the LOR axis. The chance that photon A interacts

with detector A is given by the IDRF for an incident angle of θA + φ. In practice, an

approximation is used [66]. Only a small range of photon angle φ will result in coincidences

(area shaded in light red), especially when the inter-detector distance sA +sB is much larger

than the detector size. Therefore, φ is assumed to be negligible for the purpose of computing

the IDRF. This is equivalent to assuming that the crystal is irradiated by a beam of parallel

photons. This assumption simplifies the calculations greatly since X ↦ gθ(X) has a compact analytical expression, while θ ↦ gθ(X) does not. An approximation of the CDRF can then

be calculated by using

hi(r) = ∫_{−π/2}^{π/2} gθA(X + sA sin φ) gθB(X − sB sin φ) dφ  (2.9)

where sA and sB are the distances indicated on Figure 2.5. Using the small angle approxi-

mation for sinφ, (2.9) can be simplied further to yield the model used in [66], which is an


Figure 2.6: (a) In-plane and out-of-plane components of the resolution model. (b) Variation of the axial blurring as a function of ring difference (blue: data; red: linear fit). The blur (FWHM) was measured for the box-shaped CZT PET system, at the center of LORs with zero in-plane angle and varying out-of-plane angle. In the box-shaped system, a ring difference of 80 mm corresponds to an out-of-plane angle of 45 deg.

analytical scaled convolution:

hi(r) = (1/sA) ∫_I^J gθA(x) gθB((1 + ε)X − εx) dx  (2.10)

where x = X + sA φ is the new integration variable, [I, J] the integration domain and

ε = sB/sA is the ratio of the distances to each detector.
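A numerical sketch of the scaled convolution (2.10): for illustration the two IDRFs are idealized 1 mm rectangles (no attenuation tails, unlike the real IDRF of (2.11)), and the detector distances are invented:

```python
import numpy as np

# Numerical sketch of (2.10): h_i(X) = (1/sA) * integral of
# g_A(x) * g_B((1+eps)*X - eps*x) over x, with eps = sB/sA.
def rect_idrf(x, width=1.0):
    """Idealized rectangular IDRF of a 1 mm crystal (no attenuation tails)."""
    return (np.abs(x) <= width / 2).astype(float)

def cdrf(X, sA, sB, n=2001):
    eps = sB / sA
    x = np.linspace(-2.0, 2.0, n)
    dx = x[1] - x[0]
    return np.sum(rect_idrf(x) * rect_idrf((1 + eps) * X - eps * x)) * dx / sA

print(cdrf(0.0, 40.0, 40.0))   # peak at the LOR axis
print(cdrf(0.5, 40.0, 40.0))   # ~0 at the edge of the response
```

With identical rectangles and sA = sB (midpoint of the LOR), the resulting profile is a triangle with FWHM 0.5 mm, half the 1 mm crystal width, consistent with the normal-LOR behavior reported in the Results section.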

The model based on the analytical scaled convolution only depends on the in-plane

dimensions (Figure 2.6a), i.e. it assumes that both crystals lie in the same axial plane. This

works well for a 2-D PET system in which lead or tungsten septa restrict the acquisition

to in-plane LORs. However, modern PET systems use fully 3-D acquisition to improve the

photon sensitivity. For these systems, coincidences can be acquired between two crystals

located within different rings. Therefore, a complete model takes into consideration the ring

difference. This requires modeling the IDRF as a 2-D function, and calculating the CDRF

through a double convolution (over the in-plane and out-of-plane angles).

This additional parameter would entail a considerable increase in computation, not nec-

essarily justified by image quality gains. In an effort to make modeling practical for image

reconstruction, the 2-D model neglects how the spatial resolution varies with the ring dif-

ference. Blur in the axial direction is modeled by a shift-invariant Gaussian kernel. This is


a common assumption for practical reconstruction in 3D PET [42,43,47].

Neglecting how the resolution blur depends on the ring difference can affect the spatial

resolution and spatial resolution uniformity. Yet, several factors mitigate this issue: Out-of-

plane LORs are redundant because it is possible to reconstruct a dataset using in-plane LORs

only (for example, using septa). Therefore, although out-of-plane LORs have degraded axial

resolution (Figure 2.6b), high axial resolution is available from the in-plane LORs. This is

one of the reasons why, in 3-D PET systems, most reconstructions ignore LORs that have

too high a ring difference [42, 69]. In addition, most LORs in a PET system have lower

ring difference. Because of the finite axial extent of the system (Figure 2.6a), the number

of LORs with a ring difference D is proportional to Dmax − D + 1; therefore, LORs with a

higher D are less numerous. Although Figure 2.6b indicates that the axial blur can be as

high as 1.9 mm, this is only the case for a small number of very oblique LORs. The most

prevalent ring difference is zero.

2.3 Approximation for Small Crystals

2.3.1 Fast Calculation of Intrinsic Detector Response Function

The IDRF can be calculated by considering the photon linear attenuation in the detector

material. Neglecting scatter in the detector, a photon produces a detectable signal if it

interacts with the detector and is not attenuated by any material along its trajectory. For

the calculation of the IDRF, neighboring detectors are considered as attenuating material.

For an array of detectors, such as the one depicted on Figure 2.7, the probability gθ(X) that a photon of initial energy E0 = 511 keV interacts with the highlighted crystal, and does not

interact with other detectors on its trajectory, is given by the exponential attenuation law

gθ(X) = (1 − e^(−µ ddet(X,θ))) e^(−µ datn(X,θ))  (2.11)

where ddet(X, θ) and datn(X, θ) are the lengths of detector and attenuating material traversed, respectively (Figure 2.7). The linear attenuation coefficient µ includes both Compton and

photoelectric attenuation at 511 keV.

The IDRF for a rectangular detection element is a piecewise exponential function. The

in-plane dimensions of the detection element are denoted (W,T ) and its orientation with

respect to the incoming photon beam is denoted by the angle θ. The four interval bound-

aries (or knots) for the piecewise exponential IDRF are denoted (Xl, Yl) where l = 0, . . . , 3.


Figure 2.7: Representation of the detection length ddet(X, θ) and the attenuation length datn(X, θ) as a function of the offset X and the incident angle θ for a linear crystal array. The two functions are piecewise linear and can be evaluated with minimal computation.

The knots' X coordinates can be computed following

Xl = (±W) cos(θ) + (±T) sin(θ).  (2.12)

Let us assume that the knots are sorted X0 < X1 < X2 < X3. The knots are located

symmetrically around zero: X0 = −X3 and X1 = −X2 (Figure 2.7).

The detection element's photon detection efficiency depends upon the thickness of both

the detection element of interest and the attenuating material traversed by the beam. The

thickness ddet(X, θ) of the detection element of interest traversed by the photon beam is

zero outside the outer knots X0 and X3. Between X0 and X1, ddet(X, θ) increases linearly. It is then constant between X1 and X2, and it decreases linearly down to zero between X2

and X3 (Figure 2.7). The peak thickness of the crystal is obtained using

max_X ddet(X, θ) = min(2L/cos θ, 2H/sin θ).

In a standard crystal array, neighboring crystals will cause attenuation of the photon

beam. Mechanical structures required to hold the crystals together, support readout elec-

tronics, or provide heat dissipation in the detectors can also provide some attenuation. In

the following derivation, mechanical structures are neglected and the crystal array is as-

sumed to be infinitely long. Under these assumptions, there is no attenuation for the two

knots that correspond to the front of the crystal (X0 and X1 for the beams depicted in

Figure 2.7). The attenuation length datn(X, θ) then increases linearly until its peak value,

which is attained either at X0 or X3, depending on the incident angle θ (Figure 2.7). The

peak attenuation length is max_X datn(X, θ) = 2L/cos θ.

Following the calculation of the detection and attenuation lengths, the IDRF is computed

using the linear attenuation coefficient µ in the material (2.11).
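At normal incidence the evaluation of (2.11) collapses to a single term, since the photon traverses the full crystal depth inside the face and no neighboring material. The sketch below uses the text's attenuation coefficients and the two crystal depths compared in Figure 2.8; the function name is ours:

```python
import math

# (2.11) at normal incidence: d_det = crystal depth, d_atn = 0, so
# g(X) = 1 - exp(-mu * d_det) inside the crystal face.
def detection_efficiency(mu_cm, depth_mm):
    d_cm = depth_mm / 10.0
    return 1.0 - math.exp(-mu_cm * d_cm)

print(round(detection_efficiency(0.5, 5.0), 3))    # CZT, 5 mm:  ~0.221
print(round(detection_efficiency(0.869, 20.0), 3)) # LSO, 20 mm: ~0.824
```

For the CZT element, µd = 0.25 is small, so the exponential is close to its linear approximation (0.25 vs 0.221); for the deep LSO crystal, µd = 1.74 and linearization would be a poor fit, which is the point made below about the small-crystal approximation.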

Figure 2.8 compares the IDRF for the 1× 5× 1 mm3 CZT detection element used in the


Figure 2.8: Comparison of the IDRF for a 1 × 5 × 1 mm3 CZT crystal and a 4 × 20 × 4 mm3 LSO crystal. Two different incident angles are shown. The IDRF is shown with (red) and without (black) attenuation from surrounding detectors. For the small CZT crystal, the IDRF is well approximated by a piecewise linear function.


high-resolution system under development at Stanford (1.3.2), and for the 4×20×4 mm3 LSO

crystal element used in the Siemens Biograph PET system. In addition to being smaller, the

CZT detection element has lower linear attenuation than LSO: at 511 keV, µCZT = 0.5 cm−1

compared to µLSO = 0.869 cm−1. As a result, the exponential behavior of the IDRF

can be reasonably approximated by a linear function for small CZT detection elements.

For this approximation to be valid, the detection element size and attenuation must satisfy

µT ≪ 1 and µW ≪ 1. Thus, the linear approximation cannot be used for the larger

LSO crystal. Linearizing the IDRF has the advantage of facilitating the computation of the

CDRF by the analytical scaled convolution method (2.10). Furthermore, due to symmetries,

the linearized IDRF can be represented by only four floating-point numbers (X0, X1, Y1,

and Y2) which reduces the storage requirements.

2.3.2 Analytical Scaled Convolution

Linearizing the IDRF has the advantage of facilitating the computation of the CDRF using

(2.10). In general, convolutions can be computed either numerically or analytically. Two

functions can be convolved numerically if they both have a finite support. In contrast,

analytical convolution requires the existence of an analytical function equal to the convolu-

tion. The analytical method does not require the functions to have finite support as long as

they are square integrable. Analytical convolution is exact, while the accuracy of numerical

convolution depends on the number of samples.

An analytical expression was derived for the CDRF based on the small crystal ap-

proximation. This approach requires little computation and memory, thus enabling the

CDRF coefficients to be computed extremely fast, when needed, within the reconstruction.

Let us consider a pair of detectors, denoted A and B, and a point r where the CDRF is

to be evaluated. For each detector, the IDRF g^d_θd(X) is approximated by a linear function

over each interval [X^d_l, X^d_{l+1}], where l ∈ {0, 1, 2} and d ∈ {A, B} is the detector identifier.

We can further express the IDRF as the sum of three linear functions

g^d_θd(X) = k^d_1(X) + k^d_2(X) + k^d_3(X)  (2.13)

where

k^d_l(X) = a^d_l X + b^d_l for X^d_l < X < X^d_{l+1}, and 0 otherwise,  (2.14)

and a^d_l and b^d_l are the coefficients of the linear function.

Using these notations, the CDRF can be decomposed into the sum of nine elementary


convolutions

hi(r) = (1/sA) ∑_{l=1}^{3} ∑_{m=1}^{3} K_{l,m}(X)  (2.15)

where

K_{l,m}(X) = ∫_{I_{l,m}}^{J_{l,m}} k^A_l(x) k^B_m((1 + ε)X − εx) dx

can be further expressed as

K_{l,m}(X) = −(1/3) ε a^A_l a^B_m (J_{l,m}³ − I_{l,m}³)
  + (1/2) [a^A_l a^B_m (1 + ε)X + a^A_l b^B_m − ε b^A_l a^B_m] (J_{l,m}² − I_{l,m}²)
  + [b^A_l a^B_m (1 + ε)X + b^A_l b^B_m] (J_{l,m} − I_{l,m}).  (2.16)

with integration bounds I_{l,m} and J_{l,m} computed using

I_{l,m} = max( X^A_l, ((1 + ε)X − X^B_{m+1}) / ε )  (2.17)

and

J_{l,m} = min( X^A_{l+1}, ((1 + ε)X − X^B_m) / ε ).  (2.18)
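As a sanity check on (2.16)-(2.18), the sketch below evaluates one elementary convolution K_{l,m} both analytically and by numerical quadrature. All coefficients, knots, and the evaluation point are arbitrary test values, not parameters of the actual system:

```python
import numpy as np

# One elementary convolution: integral of k_A(x) * k_B((1+eps)*X - eps*x)
# over the intersection of the two linear pieces' supports.
def K_analytic(aA, bA, aB, bB, XA, XB, X, eps):
    """XA = (X^A_l, X^A_{l+1}); XB = (X^B_m, X^B_{m+1})."""
    I = max(XA[0], ((1 + eps) * X - XB[1]) / eps)   # (2.17)
    J = min(XA[1], ((1 + eps) * X - XB[0]) / eps)   # (2.18)
    if J <= I:
        return 0.0
    c2 = -eps * aA * aB                             # x^2 coefficient of integrand
    c1 = aA * aB * (1 + eps) * X + aA * bB - eps * bA * aB
    c0 = bA * aB * (1 + eps) * X + bA * bB
    return c2 * (J**3 - I**3) / 3 + c1 * (J**2 - I**2) / 2 + c0 * (J - I)

def K_numeric(aA, bA, aB, bB, XA, XB, X, eps, n=400001):
    x = np.linspace(XA[0], XA[1], n)
    dx = x[1] - x[0]
    u = (1 + eps) * X - eps * x
    f = (aA * x + bA) * (aB * u + bB) * ((u > XB[0]) & (u < XB[1]))
    return np.sum(0.5 * (f[:-1] + f[1:])) * dx      # trapezoidal rule

args = (0.3, 0.8, -0.2, 0.9, (-0.5, 0.2), (-0.4, 0.3), 0.05, 1.5)
print(K_analytic(*args), K_numeric(*args))  # the two values agree closely
```

Expanding the quadratic integrand and integrating term by term reproduces exactly the cubic, quadratic, and linear terms of (2.16), which is why the closed form requires only a handful of multiply-adds per component.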

Figure 2.9 shows a section through the CDRF of a sample LOR, as well as its decom-

position into nine elementary convolutions Kl,m(X). Not all components of the CDRF

contribute equally. In particular, a fast approximation can be obtained by neglecting

K1,1(X), K3,3(X), K1,3(X), and K3,1(X).

The small crystal approach favors arithmetic calculation over memory access (high arith-

metic intensity). Only eight floating-point values need to be stored for each LOR (four for

each of the two IDRFs). Graphics processing units (GPU), investigated in Chapter 4, devote

more transistors to the arithmetic logic units than to the cache, and as a result calculating

the CDRF using the approach detailed above is efficient on the GPU.

2.4 Results

Three ways of computing the CDRF were compared. The first method (MC) is a forced-

detection Monte-Carlo simulation of two detectors in coincidence. In order to accelerate the

simulation time, only coincidence events in which both photons interacted with the detectors


were simulated. For that purpose, the location of each interaction within the detector was

randomly generated using a uniform distribution. Each randomly generated coincidence

event was weighted by its probability of occurrence, computed based on the photon linear

attenuation. An estimate of the CDRF was obtained by combining many simulated coincidence

events according to their respective probability weights. The second method (SA) sampled

the accurate IDRFs using 200 samples and produced a numerical convolution according to

the small angle approximation (2.10). The third method (SC+SA) used both the small

angle and the small crystal (SC) approximations to calculate the CDRF analytically (2.15).

Figure 2.9: Decomposition of the CDRF (black) into the sum of nine functions (red) that are calculated analytically using (2.16).

For a LOR normal to the detector (Figure 2.10, first row), the detector response is a trapezoid except at the center where it is a triangle. The three methods for computing the CDRF are in good agreement. The Monte-Carlo approach is indeed more noisy since it relies on the simulation of a limited number of discrete events. Normal LORs provide the highest resolution in the system since they are not subject to parallax error. For these LORs, the FWHM of the CDRF at the center is equal to half of the detector size (0.5 mm).

In the standard ring geometry, the resolution is optimal at the center of the system

because all the LORs that pass through that point are normal to the detectors. In a box-

shaped geometry, there is no such sweet spot. Hence, LORs with degraded resolution

traverse the center of the system. As an example, for a 45 deg angle LOR (Figure 2.10,

second row), the blurring kernel FWHM is equal to 1.8 mm at the LOR center, more than

three times the value for a normal LOR. For a 45 deg LOR, both the SA and the SC+SA

approximations provide accurate CDRF models compared to the Monte-Carlo method. Due

to crystal penetration, the coincident response is asymmetric.

For a very oblique LOR (Figure 2.10, third row), the LOR forms a 9 deg angle with

the leftmost detector and an 81 deg angle with the rightmost one. As a result, the LOR has

fairly good spatial resolution in the proximity of the leftmost detector (profile A), but the

resolution degrades quickly when approaching the rightmost detector (profiles B and C). In

addition, the quality of both analytical models is inferior for short and very oblique LORs.

For such LORs, the SA approximation deviates from the true distribution because the angle


Figure 2.10: CDRF for three LORs. (top row) Normal LOR connecting two 1 × 5 mm2 CZT detectors, shown on a linear intensity scale (inset). The profiles through the CDRF at three locations (center, one quarter, one eighth) are shown (A, B and C, respectively) for three methods: Monte-Carlo simulation, small-angle approximation (SA) and a combination of the SA and the small-crystal (SC+SA) approximation. (middle row) CDRF for a 45 deg oblique LOR going through the center of the FOV. Both detectors are oriented vertically. (bottom row) CDRF for an oblique LOR. The leftmost detector is oriented horizontally and forms a 9 deg angle with the LOR. The rightmost detector is oriented vertically and therefore forms an 81 deg angle with the LOR.


[Figure 2.11 plots: CDRF vs. X (mm); curves: true Monte-Carlo and SC+SA approximation]

Figure 2.11: Comparison for a section of the CDRF for a normal LOR, calculated by a full Monte-Carlo simulation and by the SC+SA approximate method. (a) Section at the center of the LOR. (b) Section 25 mm from the LOR center.

φ (see Figure 2.5) can no longer be assumed to be small. The additional SC approximation

results in further deviation: due to the very oblique angle of the LOR, the IDRF is not well

approximated by a piecewise linear function. However, very oblique LORs only see activity

placed at the very edge of the FOV. For mouse or rat imaging, the animal is placed on a

bed at the center of the system, and therefore there is usually no activity near the edge of

the FOV.

A full Monte-Carlo simulation was also performed to verify that the forced detection

Monte-Carlo is an adequate validation method. This experiment was carried out for a LOR

normal to the plane of the detectors. Using the GATE package [70], a line source of activity,

orthogonal to the LOR, was simulated in the CZT-based box-shaped PET scanner. In the

Monte-Carlo simulation, only events that did not scatter between the detection elements

were selected. Two sections orthogonal to the LOR are shown in Figure 2.11: at the center,

and 25 mm from the center. There is good agreement between both methods for calculating

the CDRF.

2.5 Summary and Discussion

We proposed and evaluated a method for calculating the spatial response of the detectors in

a PET system made of CZT detector modules (1.3.2). These calculations can be performed fast enough to compute the CDRF on demand, as opposed to storing pre-calculated values in a huge look-up table. This approach runs efficiently on GPUs since a large amount of computation is performed for every memory access.

The model we presented can provide an accurate representation of the geometrical component of the detector response for most of the LORs. It does not include effects such as inter-crystal scatter, photon acolinearity, or positron range. Inter-crystal scatter in the CZT-based system is addressed using a sequence reconstruction approach (Chapter 6). The remaining blurring effects are small compared to the geometrical detector response. In Section 5.3, we show that incorporating a model based only on the geometrical detector response corrects for most of the parallax blurring and improves the global accuracy of the reconstruction.

Chapter 3

Maximum-Likelihood Image

Reconstruction

3.1 Background

The goal of a PET scan is to non-invasively estimate the distribution of a radio-labeled

molecular tracer accumulating in the patient organs. The tracer distribution is not directly

available, but it can be inferred from the millions of oppositely-directed annihilation photon

pairs collected by the scanner, using a process known as image reconstruction.

Image reconstruction consists in solving the inverse problem associated with the data

collection process in PET (detailed in Section 2.1). That is, given a set of PET measure-

ments, how to produce an estimate of the tracer distribution consistent with the imaging

model and prior information.

This section presents existing image reconstruction methods. Sections 3.1.3.6 and 3.3

present two novel image reconstruction methods.

3.1.1 Analytical Methods

Early image reconstruction methods were based on a simplified model for the data collection

process in PET. These methods assume that the projection value for each LOR is equal

to the integral of the tracer distribution along an infinitely thin line connecting the two

detectors. For a 2-D PET acquisition, this is equivalent to assuming that the measurements

are produced by the Radon transform [56]. The tracer distribution can then be recovered

from the measurements by using an analytical inverse method called filtered backprojection

[41].



In addition to trans-axial sinograms, 3-D datasets contain oblique LORs. In the ab-

sence of noise, these LORs are redundant because the trans-axial sinograms contain all the

information necessary to reconstruct the exact tracer distribution. However, with limited

counting statistics, using oblique LORs improves the photon sensitivity and SNR. If pro-

jection data is acquired for all angles (θ and φ), then the tracer distribution can also be

reconstructed analytically. However, in practice PET systems have finite axial coverage;

therefore, very oblique LORs cannot be acquired. Several approaches have been devised

to reconstruct images from 3-D sinograms, even when some angles are missing. The Orlov

sufficiency condition [71, 72] determines which spatial sampling patterns provide sufficient

information to reconstruct the images analytically. Several other reconstruction approaches

are based on this condition [73, 74]. Alternatively, missing projections can be estimated by

reprojecting an image reconstructed using the set of 2-D transaxial sinograms only [75]. The

resulting complete 3-D dataset can then be reconstructed using 3-D FBP.

Because most of these 3-D analytical methods are computationally demanding, 3-D

datasets are sometimes reconstructed using rebinning techniques. Briefly, rebinning consists

in converting a 3-D sinogram into a stack of 2-D transaxial sinograms. These 2-D transaxial

sinograms are similar to those obtained in a conventional 2-D PET scan; however, they

are less noisy because they include the complete set of 3-D measurements. Thus, rebinning

decomposes the 3-D reconstruction problem into a set of independent 2-D reconstructions.

Approximate methods were first proposed, such as single slice rebinning [76] and multi-

slice rebinning [77]. Later, exact rebinning techniques were discovered, such as Fourier

rebinning [44].

In summary, analytical reconstruction methods are fast and relatively simple to analyze.

Yet, they can neither account for the statistical nature of the measurements nor model the physical complexity of the data collection process. Statistical methods provide a framework

in which the nature of the noise is taken into consideration. In addition, these methods

are iterative, and can therefore incorporate complex arbitrary models of the data collec-

tion process. These two features can improve reconstructed image quality and quantitative

accuracy.

3.1.2 Statistical Methods

Statistical methods formulate the image reconstruction problem using statistical estimation

techniques such as maximum likelihood (ML) or maximum a posteriori (MAP). They use

optimization to find the image that maximizes a merit function (also known as the objective function) and satisfies a set of constraints. In particular, the image solution is faithful to the


measurements without being too sensitive to the noise, and has other desired characteristics

such as non-negativity and smoothness.

3.1.2.1 Maximum Likelihood

The log-likelihood objective is commonly used to evaluate the agreement of an image candidate with the PET measurements [40]. Using the statistical model described in (2.2), the likelihood of y given m is defined as

p_m(y) = P(Y = m | y) = ∏_{i=1}^{P} ( y_i^{m_i} e^{−y_i} ) / m_i!.

The log-likelihood is a concave function of y and it can be expressed as

log p_m(y) = Σ_{i=1}^{P} [ −y_i + m_i log(y_i) − log(m_i!) ].

The image x that satises the ML criterion is a solution to the convex optimization problem

maximize   f_m(y) = Σ_{i=1}^{P} −y_i + m_i log(y_i)

subject to   Ax − y = 0,   x ≥ 0     (3.1)

where the variables are x ∈ R^N and y ∈ R^P, and the data is m ∈ R^P. The condition y ≥ 0 is implicit in the objective function. The system matrix A ∈ R_+^{P×N} is large but sparse (see 2.1.2).
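Dropping the constant −log(m_i!) term, which does not depend on y, the objective of (3.1) can be evaluated in a few lines. The sketch below uses NumPy and hypothetical toy values:

```python
import numpy as np

def poisson_loglik(y, m):
    """Poisson log-likelihood f_m(y) = sum_i (-y_i + m_i log y_i),
    with the constant -log(m_i!) term dropped."""
    y = np.asarray(y, dtype=float)
    m = np.asarray(m, dtype=float)
    return float(np.sum(-y + m * np.log(y)))

# The objective is maximized when the expected projections match the counts,
# i.e. when y = m componentwise.
print(poisson_loglik([2.0, 3.0], [2.0, 3.0]))
```

Because each term is concave in y_i, the sum is a concave function of y, consistent with the statement above.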

3.1.2.2 Maximum A Posteriori

Typical ML reconstructions exhibit a high level of noise due to ill-conditioning. Some form

of regularization is required to produce images useful for clinical diagnostics. Two of the

most frequent procedures are early termination of the optimization iterations and post-

reconstruction smoothing [78].

Regularization can also be achieved by using a suitable penalty function within the objec-

tive function. The problem can be formulated within a Bayesian framework by introducing a

statistical prior distribution for the object being imaged [42]. Typically, these priors assume

that the object is a smooth random field. The distribution of PET tracers is locally smooth,


except at organ boundaries where it can vary abruptly. The Gibbs distribution, based on

a Markov random field, provides a good model for the local image properties [42]. It is also computationally attractive because the local nature of the model results in an efficient

update strategy. The Gibbs prior results in an additive penalty term to the log-likelihood.

The simplest version of the Gibbs prior is a Gauss-Markov random field and has a quadratic

form. More complex priors can be designed to allow for abrupt changes at organ boundaries,

using anatomical priors derived from a coregistered X-ray CT scan [79].

3.1.2.3 Other Objective Functions

Other objectives have been proposed for PET image reconstruction. Because a Poisson dis-

tribution can be approximated by a normal distribution, the least-square (LS) and weighted

least-square (WLS) objectives are potential alternatives to the Poisson log-likelihood [80].

For quadratic objectives, the gradient is a linear function of the image estimate, and therefore

conjugate gradient optimization methods can be readily applied [80].

3.1.3 Existing Optimization Methods

3.1.3.1 Expectation-Maximization for ML reconstruction

The expectation-maximization (EM) algorithm is a method for finding the ML estimate.

It introduces a set of complete, unobserved variables that relate the projections to the

image. The method alternates between performing an expectation step, which computes

the expectation of the log likelihood with respect to the current estimate of the distribution

for the unobserved variables, and a maximization step, which solves for the parameters that

maximize the expected log likelihood.

The EM algorithm has been successfully applied to PET reconstruction based on the

inhomogeneous Poisson process (3.1) [40]. The procedure results in the following update

strategy

x_j^{n+1} = (x_j^n / N_j) Σ_{i=1}^{P} [ p_{ij} m_i / Σ_{b=1}^{N} p_{ib} x_b^n ]     (3.2)

where x_j^n is the volume estimate after iteration n. The voxels are indexed by j = 1, . . . , N. The system response coefficients p_{ij} model the probability that a positron emitted from

voxel j will generate two annihilation photons detected along the LOR i. The sensitivity


image

N_j = Σ_{i=1}^{P} η_i ω_i p_{ij}     (3.3)

takes into account the non-uniform density of LORs throughout the volumetric image and

the change in sensitivity ηiωi along LOR i caused by tissue attenuation and geometrical

and intrinsic detection efficiency variations. This computation requires the time-consuming

backprojection of all LORs, unless variance reduction techniques are used [81].

The measurements in PET include random and tissue scattered coincidences (as de-

scribed by the mathematical model (2.3) in 2.1.1.4). When an estimate of these background

events is available for every LOR, it can be included in the calculation of the expected

projections by adding these corrections to the forward projection (in the denominator of

(3.2)).

Including the corrections within the reconstruction model is more accurate than subtract-

ing these corrections prior to reconstruction. The latter method alters the Poisson nature

of the measurements, and can also result in negative measurement values. Furthermore, the

subtraction method is not applicable to list-mode processing (see 3.1.3.3).

The EM algorithm involves two main operations. The forward projection computes the

expected number of events measured in each LOR and can be formulated as the matrix-vector multiplication x → Ax. The backprojection is the transpose operation, i.e. y → A^T y. In

a standard implementation of the EM algorithm, the image estimate is initialized with a

uniform image. For each iteration, the forward projection is calculated for the current image

estimate. The ratio of the measured to the estimated projection is then backprojected.

Finally, a multiplicative update is performed on the image.
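The loop just described can be sketched in a few lines of NumPy; the 3-LOR, 2-voxel system matrix and the counts below are hypothetical, illustrative values:

```python
import numpy as np

A = np.array([[1.0, 0.0],      # system matrix: rows are LORs, columns are voxels
              [0.0, 1.0],
              [0.5, 0.5]])
m = np.array([4.0, 2.0, 3.0])  # measured counts per LOR (hypothetical)
N = A.sum(axis=0)              # sensitivity image N_j, assuming uniform efficiencies

x = np.ones(A.shape[1])        # uniform initial image estimate
for _ in range(200):
    y = A @ x                  # forward projection: expected counts per LOR
    x *= (A.T @ (m / y)) / N   # backproject the measured/estimated ratio,
                               # then apply the multiplicative update (3.2)
```

Because the update is multiplicative, a non-negative initial image stays non-negative throughout the iterations.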

One of the benefits of the EM algorithm is that the multiplicative update naturally

enforces the non-negativity constraint (3.1). Other optimization methods must be adapted

to account for the non-negativity constraint, for example, by using a bent line search [80] or

an active set [82].

3.1.3.2 Ordered-Subset Expectation-Maximization for ML reconstruction

It was found that the convergence of the EM algorithm could be accelerated by orders of magnitude by partitioning the data into subsets. The ordered-subset expectation-maximization

(OSEM) algorithm was designed based on this principle [39]. OSEM has established itself

as the standard iterative image reconstruction method in nuclear medicine.

Each subset in OSEM consists of a limited number of projection views. Each update

is performed using the data from a single subset. That way, each update requires far less


computation than if the full data was used. The subsets are processed sequentially, and one

iteration of OSEM is completed once all the subsets have been used. The OSEM algorithm

can be formulated as

x_j^{n,l} = (x_j^{n,l−1} / N_j) Σ_{i∈S_l} [ p_{ij} m_i / Σ_{b=1}^{N} p_{ib} x_b^{n,l−1} ]     (3.4)

where the projections are partitioned into subsets S_l and l = 1, . . . , L is the subset index. The image estimate at the end of the nth iteration is x^{n+1,1} = x^{n,L+1}.

The OSEM algorithm accelerates the EM algorithm by a factor roughly equal to the

number of subsets. However, the acceleration is limited because as the number of subsets

increases, the variance of the reconstructed image increases too. In addition, the OSEM

algorithm is subject to limit cycles and does not converge. As a result, OSEM is best suited

for providing early image iterates using a small number of subsets (32 is a popular choice).
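The subset mechanics can be sketched as follows. The 4-LOR system, the noise-free counts, and the two subsets are hypothetical, and each sub-update is normalized here by the subset sensitivity, a common implementation choice ((3.4) is written with the global N_j):

```python
import numpy as np

A = np.array([[1.0, 0.0],      # hypothetical 4-LOR x 2-voxel system matrix
              [0.0, 1.0],
              [0.5, 0.5],
              [0.2, 0.8]])
x_true = np.array([2.0, 3.0])
m = A @ x_true                 # consistent, noise-free counts for this sketch
subsets = [np.array([0, 2]), np.array([1, 3])]   # two ordered subsets of LORs

x = np.ones(2)
for _ in range(500):           # one pass over all subsets = one OSEM iteration
    for S in subsets:
        y = A[S] @ x                       # forward project subset LORs only
        NS = A[S].sum(axis=0)              # subset sensitivity image
        x *= (A[S].T @ (m[S] / y)) / NS    # EM-style update on the subset
```

With noise-free, consistent data each subset shares the same exact fit, so the sub-iterations converge; with noisy data the limit-cycle behavior described above appears instead.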

3.1.3.3 List-Mode Processing

In list-mode, the processing method used in this work, the LOR index and other physical

quantities (e.g. time, energy, TOF, depth-of-interaction, etc.) are stored sequentially in a

long list as the scanner records the events. The problem of reconstructing directly from the

list-mode data lends itself to a maximum-likelihood formulation based on the EM algorithm.

Despite its computational burden, this processing method is popular [38, 62, 83-87] because it is an efficient format to process sparse data sets, such as dynamic, time-of-flight, or high-

resolution studies. It has additional benets, namely: (1) all the original information can

be stored for each event; (2) natural complete subsets can be formed by splitting the events

chronologically; (3) the symmetries of the system are preserved; (4) image reconstruction

can be started as soon as the acquisition begins; (5) events can be positioned continuously

in space and time; and (6) data can be converted to any other format. Table 3.1 shows a sample dataset with four counts represented both in histogram-mode and list-mode.
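As an illustration of benefit (6), the list-mode events of Table 3.1(b) can be rebinned into the histogram of Table 3.1(a) in a few lines; this sketch uses the unordered crystal pair as the LOR index, so that (3, 2) and (2, 3) map to the same LOR:

```python
from collections import Counter

# List-mode events from Table 3.1(b): one (crystal, crystal) pair per count.
events = [(3, 1), (2, 3), (2, 4), (3, 2)]

# Histogram mode: number of counts per LOR, indexed by the sorted crystal pair.
histogram = Counter(tuple(sorted(pair)) for pair in events)
```

Note that only the LORs that actually received counts appear in the histogram dictionary, which mirrors the sparsity argument made above.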

The OSEM algorithm can be adapted for list-mode data. In list-mode, the vector m

of measurements for every LOR is not readily available (although it could be obtained by parsing the list-mode data). Therefore, the standard OSEM update strategy (3.4) is not

applicable. Instead, each event is processed (forward and back-projected) individually.

The OSEM subsets are formed according to the arrival time of the events. The resulting


Crystal ID   1   2   3   4
1                0   1   0
2                    2   1
3                        0
4

(a)

Event ID   Crystal 1   Crystal 2
1          3           1
2          2           3
3          2           4
4          3           2

(b)

Table 3.1: Sample PET dataset with four counts and six LORs stored in (a) histogram form and (b) list-mode.

list-mode OSEM algorithm can be formulated as follows

x_j^{n,l} = (x_j^{n,l−1} / N_j) Σ_{k∈S_l} [ p_{i_k j} · 1 / Σ_{b=1}^{N} p_{i_k b} x_b^{n,l−1} ]     (3.5)

where L denotes the number of list-mode events recorded and i_k is the LOR index for the kth list-mode event. Unlike the subsets S_l in (3.4), an index might be repeated if multiple events are measured on the corresponding LOR (see Table 3.1). In addition, the vector of measurements m is replaced by 1 in the update equation. The method is efficient for sparse

datasets because empty LOR bins are neither stored nor processed.
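A sketch of the resulting event-by-event update on a hypothetical 3-LOR, 2-voxel system; for brevity, a single subset containing all the events is used, so the loop reduces to list-mode EM:

```python
import numpy as np

A = np.array([[1.0, 0.0],     # hypothetical system matrix (p_ij)
              [0.0, 1.0],
              [0.5, 0.5]])
events = [0, 2, 0, 1, 2, 0]   # LOR index i_k of each event; repeats are allowed
N = A.sum(axis=0)             # sensitivity image computed over ALL LORs

x = np.ones(2)
for _ in range(200):
    y = A @ x
    back = np.zeros_like(x)
    for i in events:          # each event contributes a count of 1 along its LOR
        back += A[i] / y[i]
    x *= back / N
```

This produces the same estimate as histogram-mode EM run on the binned counts m = (3, 1, 2), but without ever forming the measurement vector m.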

3.1.3.4 Gradient Ascent for ML reconstruction

Gradient ascent is a common convex optimization method that can be used to solve (3.1) [88].

For each iteration, the search direction dn is computed from the log-likelihood gradient

d_n = ∇f_m(x_n)     (3.6)

    = A^T ( m / Ax_n − 1 )     (3.7)

where the division m / Ax_n is element-wise.

Next, a 1-D line search is performed along the search direction dn to nd the step size that

maximizes the log-likelihood

α_n = argmax_α f_m(x_n + α d_n).     (3.8)


The line search is performed using an iterative method, such as Newton-Raphson or the

Armijo rule [82]. Last, an additive update is performed

x_{n+1} = x_n + α_n d_n.     (3.9)

Unlike the EM algorithm, gradient ascent is not applicable to list-mode data because the

line search requires knowledge of the measurement vector m. The EM algorithm itself

can also be formulated as a scaled gradient ascent method with a constant step size [80].

The EM update rule can be equivalently written as

x_j^{n+1} = x_j^n + (x_j^n / N_j) [∇f_m(y_n)]_j.     (3.10)
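A sketch of the ascent loop on a hypothetical toy system; the crude backtracking search below (halving α until the likelihood increases and the iterate stays positive) stands in for the Newton-Raphson or Armijo searches mentioned above:

```python
import numpy as np

A = np.array([[1.0, 0.0],          # hypothetical 3-LOR x 2-voxel system
              [0.0, 1.0],
              [0.5, 0.5]])
m = np.array([4.0, 2.0, 3.0])      # hypothetical measured counts

def f(x):                          # log-likelihood with the constant term dropped
    y = A @ x
    return float(np.sum(-y + m * np.log(y)))

x = np.ones(2)
for _ in range(200):
    d = A.T @ (m / (A @ x) - 1.0)  # search direction = gradient (3.7)
    alpha = 1.0
    for _ in range(50):            # shrink the step until the iterate is valid
        if np.all(x + alpha * d > 0) and f(x + alpha * d) >= f(x):
            break
        alpha *= 0.5
    x = x + alpha * d              # additive update (3.9)
```

Note that, unlike the multiplicative EM update, the positivity of the iterate must be enforced explicitly inside the line search.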

3.1.3.5 Conjugate Gradient for WLS Reconstruction

This section reviews the rationale for using the conjugate gradients (CG) algorithm for

quadratic objective functions such as WLS [89, 90]. The WLS estimate x*_wls is the solution

to the quadratic problem with equality constraints

maximize   f_m(y) = −(1/2) Σ_{i=1}^{P} ( (y_i − m_i) / √m_i )^2

subject to   Ax = y.     (3.11)

The gradient of the WLS objective with respect to x is given by

gn = ∇xfm(yn) (3.12)

= ATM−1 (m− yn) (3.13)

where yn = Axn and M = diag(m).

CG is the iterative method of choice to optimize a quadratic objective. This ascent

method alternates the computation of a search direction and a step size, producing a sequence of estimates x_n. The CG search direction d_n combines the gradient of the objective

function fm and the previous search direction

d_n = g_n + β_n d_{n−1}     (3.14)

where β_n defines the relative weight of each term. Several formulations exist for β_n. In this


work, we have used the Polak-Ribière formulation [91]:

β_n = [ (∇f_m(x_n) − ∇f_m(x_{n−1}))^T ∇f_m(x_n) ] / [ ∇f_m(x_{n−1})^T ∇f_m(x_{n−1}) ].     (3.15)

In CG, the image update is performed additively as in (3.9) and a line search determines

the step size α_n (in a way similar to the gradient ascent method; see 3.1.3.4). After the

line search procedure (3.8), the gradient gn+1 is orthogonal to the search direction dn

d_n^T g_{n+1} = 0.     (3.16)

The image residual is defined as the difference between the current image estimate and the

optimal WLS solution

e_{n+1} = x_{n+1} − x*_wls.     (3.17)

The optimal solution x*_wls of (3.11) satisfies ∇_x f_m(Ax*_wls) = 0. Using this property, the

gradient can be expressed as a linear function of the residual

g_n = −A^T M^{−1} A e_n.     (3.18)

From (3.16), the search direction and the image residual satisfy a conjugation relationship

d_n^T C e_{n+1} = 0     (3.19)

where C = A^T M^{−1} A is the conjugation matrix. Following (3.9) and (3.17), the residual

en+1 can be reformulated as

e_{n+1} = x_{i+1} + Σ_{j=i+1}^{n} α_j d_j − x*_wls     (3.20)

        = e_{i+1} + Σ_{j=i+1}^{n} α_j d_j     (3.21)


for i < n. As a result,

d_i^T C e_{n+1} = d_i^T C e_{i+1} + Σ_{j=i+1}^{n} α_j d_i^T C d_j     (3.22)

            = Σ_{j=i+1}^{n} α_j d_i^T C d_j,     (3.23)

since the first term of (3.22) is zero by (3.19).

The key to WLS-CG is to use a Gram-Schmidt orthogonalization procedure to construct

a basis of conjugate search directions (di) that satisfy

d_i^T C d_j = 0,   i ≠ j.     (3.24)

For such a basis of conjugate directions, (3.23) implies that the residual is conjugate to all

the past search directions

e_n^T C d_i = 0,   i < n.     (3.25)

In other words, n − 1 components of e_n are zero in the C-orthogonal basis defined by

(di). As n increases, the image residuals en are constrained to a subspace of decreasing

dimension. Put differently, the nth update does not undo the work achieved during the

previous steps. The exact value of the residual en is never known during the optimization,

yet after N + 1 iterations it is exactly zero (at least in theory). Moreover, it has been observed

that small residuals can be obtained even after a number of iterations much smaller than

N . Convergence is particularly fast when the eigenvalues of the conjugation matrix C are

clustered [92].

For WLS-CG, conjugate search directions can be interpreted as being orthogonal when projected and normalized by the measured standard deviation of each sinogram bin

( M^{−1/2} A d_i ) ⊥ ( M^{−1/2} A d_j ).     (3.26)

The Polak-Ribière formulation recursively builds such a basis of conjugate search directions (3.24). The new search direction d_n is chosen in the subspace spanned by the gradient g_n and all the past search directions:

d_n = g_n + Σ_{j=1}^{n−1} β_{n,j} d_j.     (3.27)


The coefficients β_n must be such that d_n satisfies (3.24), which yields

g_n^T C d_i + β_{n,i} d_i^T C d_i = 0     (3.28)

for i < n. The n − 1 equations are uncoupled and can be solved independently, which yields the Polak-Ribière formulation (3.15).
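The full WLS-CG loop, equations (3.9) and (3.13)-(3.15), can be sketched as follows; the 8×4 system and the counts are random, hypothetical values, and the step size uses the closed-form exact line search available for a quadratic objective, α_n = d_n^T g_n / (d_n^T C d_n):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, (8, 4))        # hypothetical system matrix
m = rng.uniform(1.0, 10.0, 8)            # "measured" counts, all positive
Minv = np.diag(1.0 / m)
C = A.T @ Minv @ A                       # conjugation matrix C = A^T M^{-1} A

x = np.zeros(4)
g = A.T @ Minv @ (m - A @ x)             # WLS gradient (3.13)
d = g.copy()
for _ in range(4):                       # N = 4 iterations: exact in theory
    alpha = (d @ g) / (d @ (C @ d))      # exact line search for the quadratic
    x = x + alpha * d                    # additive update (3.9)
    g_new = A.T @ Minv @ (m - A @ x)
    beta = ((g_new - g) @ g_new) / (g @ g)   # Polak-Ribiere (3.15)
    d = g_new + beta * d
    g = g_new
```

On a quadratic objective with exact line searches, this reaches the WLS optimum after N iterations, which illustrates the residual-elimination argument above.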

3.1.3.6 Conjugate Gradient for ML Reconstruction

The ML objective (3.1) is non-quadratic; therefore, the results derived in 3.1.3.5 do not apply.

In particular, (3.18) is not valid for the ML gradient. Yet, the CG update mechanisms have

been applied with some success to the non-quadratic ML objective [80, 82, 93, 94]. The

ML-CG reconstruction method involves calculating a search direction based on the gradient of the ML objective (3.7) and the Polak-Ribière formulation (3.15). An additive update, including a line search for the step size, is then performed.

In ML-CG, most of the properties of WLS-CG no longer hold. For example, the search

directions are not exactly conjugate because the objective is not quadratic. However, empir-

ical evidence suggests that the ML-CG algorithm performs better than the simpler gradient

ascent method presented in 3.1.3.4 [93]. One reason often cited for this result is that ML-CG

exploits the local quadraticity of the ML objective. Fast convergence can be expected when

the Hessian matrix varies slowly between iterations. Near the optimal value, the objective is

well approximated by a quadratic function, and as a result ML-CG is efficient. However, in

the early iterations, CG takes large steps and the Hessian matrix can change substantially

between iterations. Hence, the search directions are not conjugate and the image residual

e_n is not constrained to a subspace of the full image volume.

In the next section, an alternate formulation of CG that preserves the conjugation of

the search directions is derived for the ML objective. This new formulation replaces the conjugation relationship (3.19) with an approximate conjugation relationship specific to the

ML objective.

3.2 Novel ML Conjugation of Search Directions

The ML estimate x∗ml solves the convex problem described in (3.1). Although the log-

likelihood objective p_m(y) is non-quadratic, it has been observed that (3.1) could be optimized efficiently using CG with the Polak-Ribière formulation (see 3.1.3.6).

A new CG method, specific to the ML objective, was investigated [95]. Inspiration was


drawn from the WLS case to derive a new approximate conjugation relationship specific to the ML criterion, and to design a method to form search directions consistent with that relationship. This new formulation outperforms applying the conventional Polak-Ribière formulation directly to the non-quadratic ML objective.

3.2.1 Conjugation in ML-CG

The image residual is defined as

e_n = x_n − x*_ml     (3.29)

where x*_ml satisfies ∇_x f_m(Ax*_ml) = 0.

The gradient of the ML objective with respect to x is given by

h_n = ∇_x f_m(y_n)     (3.30)

    = A^T Y_n^{−1} (m − y_n)     (3.31)

where y_n = Ax_n and Y_n = diag(y_n). For the ML gradient h_n, the difference between the measured and estimated projection (m_i − y_i) is scaled by the inverse of the estimated variance y_n, while in the WLS gradient g_n, this difference is scaled by the inverse of the measured variance m.

The ML gradient can be expressed as a function of the residual:

h_n = −A^T Y_n^{−1} Λ A e_n     (3.32)

where Λ = diag(m/Ax*_ml). This expression depends upon the unknown optimal solution to the ML problem, therefore Λ is approximated by the identity matrix. This approximation is equivalent to assuming Ax*_ml = m. It should be noted that if this relationship is true, then indeed x*_ml is the ML optimal solution. Alternatively, (3.32) can be written using an additive error term ε:

h_n = −A^T Y_n^{−1} A e_n + ε     (3.33)

where ε = A^T Y_n^{−1} (Ax*_ml − m). Here again, ε can be assumed to be negligible.

An approximate relationship can then be established between the gradient and the image

residual, independently of the optimal solution:

h_n ≈ −A^T Y_n^{−1} A e_n.     (3.34)


The gradient is orthogonal to the search direction when αn is computed by a line search

d_n^T h_{n+1} = 0.     (3.35)

Following (3.34), a new conjugation relationship can be formulated for ML

d_n^T B_{n+1} e_{n+1} ≈ 0     (3.36)

where B_{n+1} = −A^T Y_{n+1}^{−1} A. Unlike in the WLS case, the conjugation matrix B_{n+1} varies

with the iterations. Nevertheless, this conjugation relationship can be exploited to constrain

the residual.

Similarly to (3.23), the image residual e_n satisfies

d_i^T B_{i+1} e_{n+1} ≈ Σ_{j=i+1}^{n} α_j d_j^T B_{i+1} d_i     (3.37)

for i = 1 . . . n − 1. We therefore explored a method that recursively builds a sequence of

search directions such that

d_j^T B_{i+1} d_i = 0,   i < j.     (3.38)

For such a sequence of search directions, the image residuals en are constrained to a subspace

of decreasing dimension as the image estimate x_n approaches the optimal solution

d_i^T B_{i+1} e_{n+1} ≈ 0,   i < n.     (3.39)

The basis of search directions conjugated in the ML sense (3.38) can be interpreted as

follows. The projected search directions A d_i and A d_j (i < j), scaled by the estimate of the variance at the ith iteration, are orthogonal:

Y_{i+1}^{−1/2} A d_i ⊥ Y_{i+1}^{−1/2} A d_j.     (3.40)

3.2.2 Explicit Conjugation of Search Directions

An algorithm similar to the Gram-Schmidt orthogonalization procedure is employed to produce a basis of search directions that satisfy (3.38). The search direction d_n is formed by linearly combining the gradient h_n and all the past search directions

d_n = h_n + Σ_{j=1}^{n−1} β_{n,j} d_j.     (3.41)


Applying conditions (3.38) yields a system of n − 1 linear equations with n − 1 variables β_{n,j}

h_n^T B_{i+1} d_i + Σ_{j=1}^{n−1} β_{n,j} d_j^T B_{i+1} d_i = 0     (3.42)

for i = 1 . . . n − 1. Unlike for the WLS objective, the equations are coupled. They can be

equivalently represented in matrix notation

H^{(n)} β_n = c_n,     (3.43)

where β_n = (β_{n,1}, . . . , β_{n,n−1}), c_n is the vector defined as

c_{n,i} = −h_n^T B_{i+1} d_i     (3.44)

and H^{(n)} is a lower-triangular matrix

H^{(n)}_{ij} = d_j^T B_{i+1} d_i.     (3.45)

The matrix H^{(n)} can be constructed recursively since H^{(n+1)}_{ij} = H^{(n)}_{ij} for i, j ≤ n − 1.

A truncated formulation can also be implemented by assuming β_{n,j} = 0 for j ≤ n − 2. In that case, the non-zero component can be formulated as

β^trunc_{n,n−1} = − ( h_n^T B_n d_{n−1} ) / ( d_{n−1}^T B_n d_{n−1} ).     (3.46)
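The coefficient solve (3.42)-(3.45) can be sketched as follows. The routine below is illustrative only (dense matrices, explicit storage of all past directions), whereas a practical implementation would keep only the projected directions A d_i and work with dot products; B_mats holds B_{i+1} for i = 1, . . . , n−1:

```python
import numpy as np

def ml_conjugate_direction(h_n, past_d, B_mats):
    """Form d_n = h_n + sum_j beta_{n,j} d_j such that d_n^T B_{i+1} d_i = 0
    for all past directions d_i, by solving the linear system (3.42)-(3.45).
    past_d = [d_1, ..., d_{n-1}]; B_mats = [B_2, ..., B_n]."""
    k = len(past_d)
    if k == 0:
        return np.array(h_n, dtype=float)
    H = np.zeros((k, k))
    c = np.zeros(k)
    for i in range(k):                # one equation (3.42) per past direction
        Bd = B_mats[i] @ past_d[i]    # the product B_{i+1} d_i
        c[i] = -(h_n @ Bd)            # right-hand side entries (3.44)
        for j in range(k):            # entries (3.45); the upper triangle
            H[i, j] = past_d[j] @ Bd  # vanishes in exact arithmetic when the
                                      # past directions were built this way
    beta = np.linalg.solve(H, c)
    return h_n + sum(b * d for b, d in zip(beta, past_d))
```

Building each new direction with this routine enforces the conjugacy (3.38) by construction, at the memory cost of storing every past direction, which is the drawback discussed in 3.2.4.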

3.2.3 Results

A 2-D 128 × 128 Shepp-Logan phantom with a positive-valued surrounding background

was simulated. No cold region was present in the phantom to circumvent the issue of the

non-negativity constraint, which exists for all additive update methods. A noise-free dataset

was obtained by computing 192 parallel-beam projection views. A noisy dataset was also

produced by generating a realization of a Poisson random vector parameterized by the noise-

free projections, resulting in 35 million counts.

Reconstructions were performed for both datasets (Figure 3.1). We compared ML-CG with the Polak-Ribière formulation (3.1.3.6) against the new formulation (3.42). The truncated formulation (3.46) was also investigated. The log-likelihood residual, defined as | log p_m − log p* |, where log p* is the value of the objective function at optimality, was the main figure of merit of this study. The value of the objective at optimality was computed



Figure 3.1: The noise-free and noisy reconstructed images are shown for 500 iterations of Polak-Ribière ((a) and (c)) and 500 iterations of the new ML-specific formulation ((b) and (d)).

by running 10,000 iterations of ML-CG with Polak-Ribière.

Figure 3.3 shows typical values of β_{n,j} for n = 30. We have observed that, independently of the iteration number, the last coefficient (here, β_{30,29}) is always far greater than the others.

Yet, accounting for all the search directions improves the convergence of ML-CG both for

noise-free and noisy datasets. The log-likelihood residual converges to zero faster for the new

formulation. For the noisy dataset, it reaches the equivalent of 50 Polak-Ribière iterations in 39 iterations (1.3 times faster), and the equivalent of 2,000 Polak-Ribière iterations in

451 iterations (4.4 times faster) (Figure 3.2).

The new formulation can be truncated as shown in (3.46). The resulting formulation

converges to the ML solution at the same rate as Polak-Ribière. Furthermore, both formulations yield similar values (average difference smaller than 0.3%) for β for all iterations

(Figure 3.3).

The assumption (in 3.2.1) that Λ ≈ I_P (or, equivalently, Ax*_ml ≈ m) was experimentally

studied. Figure 3.4 is a histogram of the diagonal coefficients of Λ, computed from the

estimate of x*_ml obtained by running 10,000 iterations of Polak-Ribière ML-CG for the

dataset with noise. The histogram is centered on 1, with a full-width half-maximum of 0.04.

3.2.4 Discussion

For quadratic objectives, such as WLS, accounting for the last search direction alone is sufficient to form a basis of conjugated search directions. For a non-quadratic objective, such as ML, all the past search directions must be combined when forming a new search direction to ensure conjugation. In the new, ML-specific formulation, the last search direction is typically more heavily weighted than older ones. However, accounting for all the search directions improves the convergence rate. If older search directions are truncated,



Figure 3.2: Progress of reconstruction, measured by | log p_m − log p* |, where p* is the log-likelihood optimal value found by running 10,000 iterations of Polak-Ribière ML-CG. The progress of our new formulation for ML-CG is shown with and without truncation of the search directions. Two datasets were reconstructed: (a) a noise-free dataset and (b) a dataset with Poisson noise based on 35 million counts. For the noise-free case, the log-likelihood of the new formulation exceeded the result of 10,000 iterations with Polak-Ribière after only 485 iterations; therefore, the residual could not be plotted past iteration 485.


Figure 3.3: (a) Typical coefficients β_{n,i} for n = 30 and i = 1 . . . 29. The last search direction d_{n−1} is weighted more heavily than older ones for computing the new search direction d_n, but older search directions also contribute to the final search direction. When the contribution of these older search directions is ignored, the new formulation (3.46) is within 0.3% of Polak-Ribière (3.15), as shown in (b) by plotting β for both formulations for 20 iterations of ML-CG.


the convergence rate reverts to that of standard Polak-Ribière.

Figure 3.4: Histogram of the diagonal coefficients of Λ = diag(m/Ax*_ml). In the derivation of the ML conjugation relationship, Λ is assumed to be the identity matrix.

The CG method based on the Polak-Ribière formulation is surprisingly efficient for the non-quadratic ML objective. Furthermore, it can be shown (by doing a full search on β) that no other formulation converges faster. The reason is that Polak-Ribière applied to the ML objective is equivalent to truncating the optimal ML formulation, which uses all the past search directions.

For the new method, only a small amount of additional computation is required (the evaluation of dot products). The total complexity remains dominated by the cost of the back- and forward projections. The main drawback of the optimal ML formulation is that it requires that all the past search directions be stored in memory, which is impractical for large image volumes. How to reduce the memory footprint while keeping the desirable properties of the new formulation requires further research.

The images produced by ML iterative reconstruction become increasingly noisy as the sequence of estimates reaches optimality. The noise can obscure the main features of the images. Thus, most clinical protocols terminate the iterations early, before convergence is reached. This limits the attractiveness of the reconstruction methods described in this section. When an early ML iterate is sought, the method of choice is the ordered-subsets expectation-maximization (OSEM) algorithm. OSEM can produce images suitable for clinical use in as little as two iterations [96], but it does not converge to the ML solution [97]. Alternatively, a fully converged reconstruction can be performed; in that case, a smoothing filter must be applied to the ML estimate, or a smoothing penalty must be incorporated in the objective. The CG algorithm and the novel methods described in this section are suitable for fully converged reconstruction approaches.

The CG algorithm can be modified to incorporate a preconditioning matrix [90]. Preconditioned CG (PCG), with the right choice of matrix, converges faster than regular CG to the optimal solution [82, 94]. Conventional preconditioners attempt to approximate the inverse of the Hessian matrix. In PET, the EM preconditioner [80] has also been shown to improve convergence and provide a non-linear smoothing effect, similar to that obtained by


running the EM algorithm. The optimal ML formulation for CG can be readily extended to

include preconditioning.

The current ML problem formulation (3.1) does not incorporate a non-negativity constraint for the voxels. The non-negativity of the estimated projections is, however, implicitly enforced by the logarithmic term in the ML objective. Largely negative voxel values are therefore prevented, because they would induce negative projection values. In the absence of a non-negativity constraint, voxels can take slightly negative values. It is very challenging to incorporate non-negativity constraints in CG while preserving the fast convergence [80]. In the early iterations, during which the cold voxels are identified, CG is not faster than a simple gradient ascent method. A suboptimal ML estimate might be found by truncating the negative voxels at the end of the reconstruction (projection onto convex sets). Such truncation cannot be performed within the iterations because it would destroy the delicate sequence of conjugate directions. A logarithmic barrier can also be used to enforce non-negativity. Positive bias in cold regions might result, but the advantage of the barrier method is that the conjugation of the search directions in CG is preserved. The positive bias can be mitigated by reducing the weight of the log barrier, but this negatively affects the conditioning of the objective.
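The trade-off between barrier weight and conditioning can be made concrete with a small sketch of the barrier-augmented objective and its gradient. The operator A, data m, and barrier weight mu below are illustrative stand-ins, not a PET system model:

```python
import numpy as np

def barrier_objective(x, A, m, mu):
    """Poisson ML objective plus a logarithmic barrier enforcing x > 0:
    f(x) = sum_i [-y_i + m_i log y_i] + mu * sum_j log x_j, with y = A x.
    A smaller mu reduces the positive bias in cold regions, but worsens
    the conditioning of the objective."""
    y = A @ x
    return np.sum(-y + m * np.log(y)) + mu * np.sum(np.log(x))

def barrier_gradient(x, A, m, mu):
    """Gradient of the barrier-augmented objective:
    A^T (m/y - 1) + mu / x."""
    y = A @ x
    return A.T @ (m / y - 1.0) + mu / x
```

Because the barrier only adds a separable term, the gradient (and hence the CG conjugation machinery) is unchanged except for the extra mu/x term.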

3.3 Novel ML Reconstruction via Truncated Newton's Method

3.3.1 Dual Problem

A dual problem can be formulated for ML reconstruction. The Lagrangian associated with

problem (3.1) is

\[
\begin{aligned}
\mathcal{L}(x, y, \lambda, \nu) &= \lambda^T x + \nu^T (y - Ax) + \sum_{i=1}^{P} \bigl( -y_i + m_i \log y_i \bigr) && (3.47) \\
&= (\nu - \mathbf{1})^T y + (\lambda - A^T \nu)^T x + \sum_{i=1}^{P} m_i \log y_i && (3.48)
\end{aligned}
\]

where ν ∈ R^P is the dual vector for the linear equality constraint, and λ ∈ R^N is the dual vector for the non-negativity constraint. The Lagrange dual function g(λ, ν) is defined as


the supremum of the Lagrangian over the primal variables:
\[
g(\lambda, \nu) = \sup_{x \in \mathbb{R}^N,\; y \in \mathbb{R}^P} \mathcal{L}(x, y, \lambda, \nu). \tag{3.49}
\]

The dual function is equal to ∞ unless ATν = λ and ν ≤ 1. In that case, the supremum is

attained for yi = mi/(1− νi), and the value of the dual function is

\[
g(\lambda, \nu) = \sum_{i=1}^{P} \bigl( m_i \log m_i - m_i - m_i \log(1 - \nu_i) \bigr). \tag{3.50}
\]

The dual function leads to the formulation of the dual problem, where the constant terms

in the objective have been dropped:

\[
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{P} m_i \log(1 - \nu_i) \\
\text{subject to} \quad & A^T \nu = \lambda \\
& \lambda \ge 0.
\end{aligned} \tag{3.51}
\]

The optimal primal variable y* and the optimal dual variable ν* are orthogonal. This follows from y*^T ν* = x*^T A^T ν* = x*^T λ*, which is zero due to complementary slackness.

Using the relationship between primal and dual variables,

\[
y^{*T} \nu^* = y^{*T} \Bigl( \mathbf{1} - \frac{m}{y^*} \Bigr) = \mathbf{1}^T (y^* - m) = 0. \tag{3.52}
\]

The ML estimate therefore has the interesting property that 1Ty∗ = 1Tm, i.e. the total

number of counts estimated is equal to the total number of counts measured. This condition

can provide a necessary stopping criterion for the reconstruction.
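The count-conservation condition 1^T y* = 1^T m is easy to monitor during a reconstruction. A sketch of such a check, with an illustrative relative tolerance:

```python
import numpy as np

def counts_conserved(A, x, m, tol=1e-3):
    """Necessary (but not sufficient) optimality check for the ML estimate:
    at the optimum, 1^T A x* = 1^T m, i.e. the total estimated counts
    equal the total measured counts within a relative tolerance."""
    total_est = np.sum(A @ x)
    total_meas = np.sum(m)
    return abs(total_est - total_meas) <= tol * total_meas
```

Because the condition is only necessary, it can rule out convergence but cannot confirm it.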

Solving the dual problem (3.51) is not as practical as solving the primal problem (3.1), since x* can only be recovered from ν* by computing A^{-1}(m/(1 − ν*)), which is expensive, or inaccurate if CG is used.

3.3.2 Karush-Kuhn-Tucker Conditions

The primal variables x* and y* are optimal for (3.1) if and only if there exist ν* ∈ R^P and λ* ∈ R^N such that
\[
\begin{bmatrix} 0 \\ \nabla f(y^*) \end{bmatrix} + \begin{bmatrix} -A^T \\ I \end{bmatrix} \nu^* + \begin{bmatrix} I \\ 0 \end{bmatrix} \lambda^* = 0, \qquad Ax^* = y^*, \quad x^* \ge 0, \quad \lambda^* \ge 0. \tag{3.53}
\]


The KKT conditions can be more conveniently formulated as

ATν∗ = λ∗, ∇f(y∗) = −ν∗, Ax∗ = y∗, x∗ ≥ 0, λ∗ ≥ 0. (3.54)

In particular, we have x*^T A^T ∇f(y*) = 0, or equivalently x*^T ∇_x f(Ax*) = 0.

3.3.3 Newton Step for a Relaxed Problem

The inequality constraints can be handled by using a log barrier. However, for simplicity, we relax the original problem (3.1) by dropping the constraints x ≥ 0. The coefficients of A are non-negative, and the objective constrains Ax to be positive. As a result, large negative values of x are penalized. If needed, we can project x onto R^N_+ and perform a few additional iterations.

The Newton step can be obtained by optimizing, under constraints, the quadratic approximation of f around (x, y):
\[
\begin{aligned}
\text{maximize} \quad & f(y) + \nabla f(y)^T v + \tfrac{1}{2} v^T \nabla^2 f(y)\, v \\
\text{subject to} \quad & A(x + u) - (y + v) = 0,
\end{aligned} \tag{3.55}
\]

where u and v are the Newton steps for x and y, respectively. Using Ax = y, the optimal

variables for the quadratic problem satisfy
\[
\begin{bmatrix} 0 & 0 & A^T \\ 0 & -\nabla^2 f(y) & -I \\ A & -I & 0 \end{bmatrix}
\begin{bmatrix} u \\ v \\ w \end{bmatrix}
=
\begin{bmatrix} 0 \\ \nabla f(y) \\ 0 \end{bmatrix} \tag{3.56}
\]

where w is the associated optimal dual variable. The gradient and the diagonal Hessian are,

respectively,

\[
(\nabla f(y))_i = \frac{m_i - y_i}{y_i}, \qquad \bigl(\nabla^2 f(y)\bigr)_{ii} = -\frac{m_i}{y_i^2}. \tag{3.57}
\]

The variables u and w can be eliminated, which yields the more compact equation for the Newton step Δx_nt:
\[
A^T \nabla^2 f(y) A \, \Delta x_{\mathrm{nt}} = -A^T \nabla f(y). \tag{3.58}
\]


The new image estimate x† can be expressed as
\[
\begin{aligned}
x^\dagger &= x + \Delta x_{\mathrm{nt}} \\
&= x + \bigl(A^T \operatorname{diag}(m/y^2)\, A\bigr)^{-1} A^T \frac{m - y}{y} \\
&= \bigl(A^T \operatorname{diag}(m/y^2)\, A\bigr)^{-1} \Bigl( A^T \frac{m}{y} + A^T \frac{m - y}{y} \Bigr) \\
&= \bigl(A^T \operatorname{diag}(m/y^2)\, A\bigr)^{-1} A^T \operatorname{diag}(m/y^2) \bigl( 2y - y^2/m \bigr). \tag{3.59}
\end{aligned}
\]

This equation means that x† is the solution to a weighted least-squares (WLS) regression of

the linear system of equations Ax† = 2y − y2/m, with weights m/y2.
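The WLS interpretation of (3.59) suggests computing the Newton update by solving the normal equations with a few inner linear-CG iterations, as in the truncated Newton scheme evaluated below. The following sketch assumes a dense matrix A and a fixed inner iteration count; it illustrates the idea, not the implementation used in the experiments:

```python
import numpy as np

def newton_update(A, m, x, n_cg=25):
    """One truncated-Newton update for the Poisson ML objective,
    solving the WLS normal equations of (3.59),
        (A^T W A) x_new = A^T W (2y - y^2/m),  with W = diag(m / y^2),
    using a fixed number of inner linear-CG iterations (the truncation)."""
    y = A @ x
    w = m / y**2
    hess = lambda v: A.T @ (w * (A @ v))       # Hessian-vector product
    b = A.T @ (w * (2.0 * y - y**2 / m))
    x_new = x.copy()                           # warm start at current image
    r = b - hess(x_new)
    p = r.copy()
    for _ in range(n_cg):
        rr = r.dot(r)
        if rr < 1e-12:                         # converged early
            break
        Hp = hess(p)
        alpha = rr / p.dot(Hp)
        x_new = x_new + alpha * p
        r = r - alpha * Hp
        p = r + (r.dot(r) / rr) * p
    return x_new
```

Only Hessian-vector products are needed, so the cost per inner iteration stays at one forward and one back projection.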

The best linear unbiased estimator (BLUE) for x is
\[
x_{\mathrm{BLUE}} = (A^T M^{-1} A)^{-1} A^T M^{-1} m, \tag{3.60}
\]
with M = diag(m), because m_i is the ML estimate of the variance of y_i. Therefore the Newton step will give the BLUE for x when m = y. When m is large (i.e. > 30), the Poisson distribution can be approximated by a Gaussian distribution and x_BLUE ≈ x_ML.

For this reason, when the noise is low (m ≈ y), Newton's method provides an excellent search direction even at the first update (Figure 3.5b and Figure 3.5d). In comparison, the ML gradient provides a much blurrier search direction (Figure 3.5a and Figure 3.5c). Higher noise results in very noisy search directions for Newton's method (Figure 3.5d).


Figure 3.5: The image estimate x was initialized with a uniform intensity map. The first search direction was computed for the Shepp-Logan phantom with and without noise. (a) Log-likelihood gradient computed for the noise-free Shepp-Logan phantom; (b) Newton search direction, computed on the same dataset by running 30 iterations of CG (relative residual: 9.9e-4); (c) log-likelihood gradient computed for the Shepp-Logan phantom with noise; (d) Newton search direction, computed on the same dataset by running 50 iterations of CG (relative residual: 3.9e-3).


3.3.4 Preconditioning

A diagonal preconditioner M^diag can be designed by using the diagonal coefficients of the Hessian matrix:
\[
M^{\mathrm{diag}}_{ii} = e_i^T A^T \nabla^2 f(y) A\, e_i. \tag{3.61}
\]

The diagonal preconditioner M^diag depends upon the current estimate y, and therefore must be recomputed at every iteration. Because H = A^T ∇²f(y) A is factored and not stored in memory, the computation of M^diag is costly (order PN²) compared to one CG iteration (order PN). This diagonal preconditioner is therefore impractical. We will instead investigate a constant preconditioner M^cons that is computed once before the reconstruction:
\[
M^{\mathrm{cons}}_{ii} = e_i^T A^T M^{-1} A\, e_i, \tag{3.62}
\]
where m_i/y_i² is approximated by 1/m_i (at optimality, y_i ≈ m_i).
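Because M = diag(m), the diagonal of (3.62) reduces to M^cons_ii = Σ_p A_pi² / m_p. For an explicit matrix A this can be sketched in one line (a dense A is assumed here for illustration; in practice A is factored):

```python
import numpy as np

def constant_preconditioner(A, m):
    """Diagonal of M^cons (3.62): M^cons_ii = sum_p A_pi^2 / m_p,
    i.e. diag(A^T M^-1 A) with M = diag(m), computed once before
    the reconstruction starts."""
    return (A**2).T @ (1.0 / m)
```

Since the expression depends only on A and the measured data m, it is computed once and reused at every iteration.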

3.3.5 Results

To evaluate the method, we used the 2-D Shepp-Logan phantom with 128 × 128 pixels (Figure 3.6). 128 projections, each with 192 samples, were used to generate a noise-free projection dataset. The projections were corrupted by Poisson noise to simulate limited statistics. The noise level (35 million counts in a single slice) is consistent with a clinical PET scan. The matrix A, which models the line projection, was chosen to be the product of two operators: a simple line projector, where the contribution of one voxel to one projection bin is one if the line intersects the voxel and zero otherwise, and a shift-invariant Gaussian kernel that models the limited spatial resolution.

The projection data were reconstructed with a gradient ascent method (3.1.3.4), ML-CG

(3.1.3.6) and truncated Newton. 3, 5, 10, 25 and 50 iterations of CG were run to compute

the Newton direction, with and without preconditioning. The previous search direction was

used as an initial value. All methods used the same constant preconditioner. A line search

was performed using bisection.
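The bisection line search can be sketched as a bracketing loop on the directional derivative; the bracket endpoints and iteration count below are illustrative choices, not the values used in the experiments:

```python
def bisection_line_search(dphi, lo, hi, n_iter=40):
    """Bisection on the directional derivative dphi(a) = d/da f(x + a d).
    Assumes a maximum is bracketed: dphi(lo) > 0 and dphi(hi) < 0.
    Each iteration halves the bracket around the zero of dphi."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if dphi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Each bisection step costs one forward projection (to evaluate the derivative), so the bracket shrinks geometrically at modest cost.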

An order-of-magnitude improvement in the convergence rate can be seen between gradient ascent and non-linear conjugate gradient (Figure 3.7). Newton's method using a low number of CG iterations (3, 5 and 10) provides a better search direction than ML-CG; however, the improvement does not compensate for the increased computational cost. In addition, the method jams and fails to converge. Jamming occurs when a fixed number of CG iterations does not yield a Newton step that has a lower relative residual than the value it is initialized



Figure 3.6: (a) Shepp-Logan phantom, 128 × 128 voxels, with Poisson noise equivalent to 35 million counts recorded. Reconstructed image for (b) 4 iterations of truncated Newton with 50 sub-iterations of linear CG, (c) 200 iterations of non-linear conjugate gradient and (d) 200 iterations of gradient ascent.

with.

For higher CG iteration numbers (25 and 50), Newton's method with preconditioning converges at almost the same rate as ML-CG. Four iterations of Newton's method are comparable to 200 iterations of ML-CG; however, the two methods have the same computational cost.

The use of the constant preconditioner (3.62) improved the convergence rate for all

algorithms (dashed line versus solid line in Figure 3.7). A variable preconditioner computed

at each iteration would further improve the convergence.

3.4 Discussion

Solving problem (3.1) can be approached in many different ways. OSEM has become the preferred method for obtaining an estimate quickly, before convergence is reached. In this chapter, other methods that use more elaborate search directions and step sizes were investigated.

In Section 3.2, we presented a new way to form search directions in ML-CG by enforcing explicit conjugation relationships derived from the expression of the log-likelihood in PET. This new ML conjugation relationship accounts for the non-quadraticity of the objective function. It thus requires that all the past search directions be used when forming a new conjugate search direction.

The new formulation converges faster to the ML objective: it takes 22% fewer iterations to reach the equivalent of 50 Polak-Ribière iterations, and 77% fewer iterations to reach the equivalent of 2,000 Polak-Ribière iterations. To reduce the memory burden, truncating all



Figure 3.7: Progress of reconstruction, measured by | log p_m − p*|, where p* is the log-likelihood optimal value found by running 2,000 iterations of ML-CG. For truncated Newton (TNT), the inner CG loop ran 3, 5, 10, 25 and 50 iterations. Computation is evaluated in terms of the number of operations required to evaluate the gradient (i.e. two matrix-vector multiplications, by A and A^T respectively). We compared reconstruction with (dashed lines) and without (solid lines) preconditioners. The preconditioner used was M^cons (3.62).


but the last search direction was investigated. In that case, the convergence rate reverts to that of the Polak-Ribière formulation. This result provides some insight into the performance of ML-CG with the Polak-Ribière formulation: Polak-Ribière is approximately equivalent to truncating the optimal ML formulation, and for this reason performs relatively well despite the non-quadraticity of the ML criterion.

In Section 3.3, we applied Newton's method to solving the ML estimation problem. Newton's method did not perform as well as the conventional non-linear CG method, regardless of preconditioning and the number of CG iterations (Figure 3.7). Truncated Newton performed better than the gradient ascent method when it did not jam.

We also found that, for Newton's method, the convergence rate (normalized to account for differences in computation) was greatest when 50 CG iterations were run. With fewer CG iterations, the reconstruction did not converge as fast. The opposite result was reported in [98]; this might be due to finer tuning of CG. A variable preconditioner would also improve the performance of our implementation.

Regularization of the objective is probably the key to improving the performance of Newton's method, since it would improve the conditioning of the Hessian around optimality and facilitate the computation of the Newton step. As shown in Figure 3.5, even the first Newton search direction is very noisy when no regularization is used.

Chapter 4

Fast Shift-Varying Line Projection

using Graphics Hardware

4.1 Background

Most of the computation time in ML image reconstruction is spent in the line projection

operations. In addition, list-mode schemes require that LORs be processed individually, in

arbitrary order. We investigated practical ways to implement and accelerate these operations

using programmable graphics hardware, namely the graphics processing unit (GPU) [99].

This chapter describes how graphics concepts can be mapped onto the GPU. A more

detailed presentation is available in Appendix A. A glossary of GPU terms can be found in

Appendix F.

4.1.1 The Graphics Processing Unit

Primarily designed to deliver high-definition graphics for video games in real time, GPUs are now increasingly being used as cost-effective, high-performance co-processors for scientific computing [100]. GPUs are characterized by massively parallel processing, fast clock rates, high-bandwidth memory access, and hardwired mathematical functions. These characteristics make them particularly well suited for on-the-fly schemes with high computational intensity.

As shown in Figure 4.1, over the last five years, GPUs' peak performance P has increased at a faster rate than CPUs': P_GPU ≈ P_CPU^1.4. While Moore's law hypothesizes that the density of transistors on a chip doubles every two years, the peak compute performance of GPUs has grown faster still. GPUs are single-instruction multiple-data (SIMD) processors, but multi-core CPUs are




Figure 4.1: Trend in the computational performance P for CPUs and GPUs over five years, measured in billions of single-precision floating-point operations per second (GFLOPS). GPUs: NVIDIA GeForce FX 5800 (A), FX 5950 Ultra (B), 6800 Ultra (C), 7800 GTX (D), Quadro FX 4500 (E), GeForce 7900 GTX (F), 8800 GTX (G), and Tesla C1060 (H); CPUs: Athlon 64 3200+ (A), Pentium IV 560 (B), Pentium D 960 (C), 950 (D), Athlon 64 X2 5000+ (E), Core 2 Duo E6700 (F), Core 2 Quad Q6600 (G), Athlon 64 FX-74 (H), Core 2 Quad QX6700 (I) and Intel Core i7 965 XE (J).


Figure 4.2: The graphics pipeline. The boxes shaded in light red correspond to stages of the pipeline that can be programmed by the user.

multiple-instruction multiple-data (MIMD) processors. MIMD leads to more complex integrated-circuit designs because multiple instruction-decode blocks, as well as special logic to avoid data read/write hazards, are required. SIMD also dedicates less area to the data cache and more to the arithmetic logic units. As a result, the number of parallel SIMD processing units has been growing faster than the number of MIMD units. It therefore appears likely that GPUs will continue to be increasingly useful for medical image reconstruction, especially as the performance gap with CPUs widens.

Computations can be executed on the GPU using either graphics APIs, such as OpenGL or DirectX, or compute-specific APIs such as CUDA. This work used OpenGL to interface with the GPU; hence we briefly present the graphics pipeline (Figure 4.2).

The role of the GPU in a 3-D graphical application (such as a video game) is to perform the calculations necessary to render complex 3-D scenes in a short amount of time. The 3-D scene is created by the application using polygons (triangles, quadrangles, etc.). Using OpenGL, these polygons are streamed to the GPU, along with instructions on how to render the scene (position of the camera, lighting, textures, etc.). The programmer does not need to know the specifics of the GPU because OpenGL is a standard API that interfaces directly with the GPU driver. Two stages in the GPU are fully programmable: the vertex


and the fragment shaders (in light red on Figure 4.2).

The vertex shaders can perform, in parallel, a programmable sequence of instructions on each individual vertex that passes through the pipeline. In computer graphics, the vertex shader is used to perform the projection of the polygons onto the plane of the display and to calculate per-vertex properties, such as surface normals or texture-mapping coordinates. Properties defined for each vertex are bilinearly interpolated within the polygon. The vertices are then assembled, and the triangles rastered into fragments (i.e. all the data necessary to generate a pixel in the frame buffer).

The fragment shaders' role is to perform programmable computation in parallel on all the fragments. In computer graphics, they are used to calculate the final color of the pixel based on texture and lighting information. General-purpose (i.e. non-graphical) computation can also be performed in this stage, since the output of the fragment shader can be read out directly from the frame buffer. Each fragment is then combined with the frame buffer according to a predefined raster operation (additive blending, etc.).

4.1.2 Iterative Reconstruction on the GPU

Image reconstruction on GPUs has been the focus of previous research. Texture mapping on non-programmable graphics hardware was first proposed in 1994 [101] as a way to accelerate cone-beam FBP for x-ray computed tomography. The same technique was later applied to port sinogram-based OSEM to a consumer-grade graphics architecture [102]. More accurate methods were developed once the GPU became programmable and handled floating-point textures. The general approach was first described for processing sinograms using FBP and EM [103], and the ordered-subset convex reconstruction algorithm [104]. Attenuation correction and the incorporation of a point spread function were also addressed for SPECT [105]. A real-time GPU-based reconstruction framework was developed for X-ray CT [106]. These methods [101–106] have been successful because the GPU is efficient at applying the affine transformation that maps a slice through the volumetric image to any sinogram projection view, and vice versa.

Until now, there has not been any work on executing list-mode iterative reconstruction on GPUs. The main challenge in implementing list-mode OSEM on the GPU is that the list-mode LORs are not arranged in any regular pattern like sinogram LORs. The mapping between the list-mode data and the volumetric image is not affine, and as a result texture mapping cannot be used in this context. The projection operations must be line driven, which means that the back- and forward projections must be performed on a per-LOR basis. This constraint motivates the design and investigation of a novel GPU technique


Figure 4.3: Example of a parametrization of the system response kernel. Two detection elements in coincidence are shown, as well as the projection L_ij of a sample voxel V_j onto the axis of LOR i.

to back- and forward project individual LORs described by arbitrary endpoint locations, even when a shift-varying kernel is used to model the response of the system [107]. No existing GPU projection technique has addressed the specific issues of list-mode processing. These issues also arise when data are processed in histogram mode, in which case a weight, representing the measured projection, is passed to the GPU with each LOR [49]. Even sinogram-based reconstruction can be performed in this new LOR-driven framework by describing each sinogram bin by its value and its two LOR endpoint locations; however, this approach would be less efficient than the GPU texture-mapping technique cited above. We also propose a novel framework to define arbitrary, shift-varying system response kernels that are evaluated on the fly by parallel units within the GPU. This feature is important for correcting the various resolution-blurring factors in emission tomography.

The implementation on the GPU of list-mode 3D-OSEM with shift-varying kernels is challenging because the graphics pipeline architecture does not run efficiently unless the two main components (line back- and forward projections) are reformulated. This reformulation involves handling line backprojection using the GPU rasterizer and decomposing line forward projection into smaller elementary operations that run efficiently in parallel on the GPU. We proved, both mathematically and experimentally, that the reformulated operations replicate the correct line back- and forward projections [107].

4.2 Theory

4.2.1 System Response Kernel

The spatial resolution in PET is degraded by physical processes associated with photon emission, transport and detection. These resolution-blurring factors can be modeled in the system matrix. This provides resolution recovery through deconvolution, on condition that the model is accurate enough, the SNR is high enough and the number of iterations is sufficient. Several experiments have shown that incorporating a model of the system response can improve the performance of the reconstruction for certain tasks [42, 43, 48, 79].

In the GPU line-projection technique we have developed, we generalize the notion of


system matrix by modeling the system response using kernels. Kernels are non-negative, real-valued functions that model the contribution of each voxel to each LOR as a function of multiple variables. These variables include the indices of the current LOR and voxel, which allow any system matrix to be represented with a kernel. Kernels can be described more generally by selecting another parametrization, such as the center V_j of voxel j, the projection L_ij of V_j on LOR i, the distance d_ij between the center of voxel j and LOR i, the distances δ_ij^(1) and δ_ij^(2) between L_ij and each of the two detectors, the orientation u_i and length l_i of LOR i, the time-of-flight τ_i, and the photon depth-of-interaction z_i^(1) and z_i^(2) for each detector (Figure 4.3). Kernels are smooth approximations of the system matrix, independent of the voxel size. They model the system response in a compact way by exploiting the geometrical redundancies in the system.

The kernel is evaluated at all voxels that contribute significantly to LOR i. We call the set of such voxels the tube-of-response (TOR), further defined by a cylindrical volume:
\[
T_i = \{\, j : d_{ij} \le \eta \,\} \tag{4.1}
\]

where η is a user-defined constant which sets an upper bound on the distance d_ij between voxel j and LOR i. While system matrices are implemented by look-up tables, kernels allow for a mix of memory look-ups and on-the-fly computations and lead to a higher computational intensity (defined as the ratio of arithmetic-logic-unit usage to memory usage). Kernels can also be evaluated at each voxel independently, in the GPU's parallel processing units.
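Membership in the TOR (4.1) reduces to a point-to-line distance test. A geometric sketch of the test (the endpoint coordinates and the value of η are illustrative):

```python
import numpy as np

def in_tube_of_response(voxel, p1, p2, eta):
    """Membership test for the TOR (4.1): is the voxel center within
    distance eta of the LOR axis through detector endpoints p1 and p2?"""
    u = p2 - p1
    u = u / np.linalg.norm(u)                # LOR direction u_i
    v = voxel - p1
    d = np.linalg.norm(v - v.dot(u) * u)     # distance d_ij to the axis
    return d <= eta
```

The same few arithmetic operations can be evaluated independently per voxel, which is what makes the test a good fit for parallel GPU units.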

4.2.2 GPU Implementation

In order to use the GPU pipeline efficiently, we reformulated the projections to enhance parallelism and match the pipeline architecture.

4.2.2.1 Data Representation

GPU memory is organized in textures, which in computer graphics are used to store color images. A 2-D color texture forms an array of 32-bit floating-point quadruples that can be accessed randomly by GPU shaders. We stored the volumetric images used for reconstruction in such textures by tiling the stack of slices in 2-D (A.1.1). The list-mode projection data, consisting of LOR endpoints and the projection value, were stored in another 2-D texture using the four color channels. We used the OpenGL frame-buffer object (FBO) extension to enable shaders to write directly to texture [108].


4.2.2.2 Line Projection Stages

The forward projection of the image x_j along LOR i and the backprojection of LOR i with weight ω_i into the volumetric image x_j^old are mathematically represented as, respectively,
\[
f_i = \sum_{j \in T_i} a_{ij} x_j \tag{4.2}
\]
\[
x_j^{\mathrm{new}} =
\begin{cases}
x_j^{\mathrm{old}} + a_{ij}\,\omega_i, & j \in T_i \\
x_j^{\mathrm{old}}, & \text{otherwise.}
\end{cases} \tag{4.3}
\]
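Equations (4.2) and (4.3) describe per-LOR operations. A scalar CPU sketch, with the TOR given as a list of voxel indices and a hypothetical kernel function evaluating a_ij on the fly (this illustrates the operations themselves, not the GPU implementation):

```python
import numpy as np

def forward_project(x, tor, kernel):
    """f_i = sum over j in T_i of a_ij * x_j, as in (4.2); `tor` lists
    the voxel indices of T_i and `kernel(j)` evaluates a_ij on the fly."""
    return sum(kernel(j) * x[j] for j in tor)

def back_project(x, tor, kernel, omega):
    """x_j <- x_j + a_ij * omega_i for j in T_i, as in (4.3);
    voxels outside the TOR are left untouched."""
    out = x.copy()
    for j in tor:
        out[j] += kernel(j) * omega
    return out
```

Both operations touch only the voxels of the TOR, which is what makes a line-driven (per-LOR) formulation natural.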

Both operations can be conceptualized as a sequence of three stages. In the first stage, the voxels T_i that contribute non-negligibly to LOR i are identified. In the second stage, these voxels are further processed: the kernel parameter variables are computed from the attributes of LOR i and voxel j, and then used to evaluate the system response kernel a_ij. In the last stage, the data vector (image or projection data) is updated according to (4.2) and (4.3).

4.2.2.3 Voxel Identification in Line Forward Projection

The voxel identification stage consists of determining the voxels T_i that are to be processed during the line back- and forward projection of LOR i. Because the TOR is a volume, in typical CPU code three levels of nested loops with variable bounds would be performed to cycle through all the voxels.

On the GPU, this stage was the most problematic because nested loops with variable bounds are not efficient unless the same constant number of iterations is executed in each parallel unit. When the number of iterations is constant, all parallel units run the same number of instructions and the loops can be unrolled. The line forward projector was efficiently reformulated so that all loops run a constant number of iterations, even though for some LORs this meant increasing the number of iterations.

Let us assume that the LOR's main direction is along e_z, i.e.
\[
u_i \cdot e_z \ge u_i \cdot e_x, \qquad u_i \cdot e_z \ge u_i \cdot e_y, \tag{4.4}
\]
where u_i denotes the direction vector for LOR i. This relationship can always be satisfied by rotating the coordinate axes if needed. As shown later, (4.4) is important to ensure that the number of iterations in distributed loops is bounded.
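Relabeling the axes so that condition (4.4) holds amounts to putting the dominant component of u_i last. A sketch of this axis permutation (a small helper written for illustration):

```python
import numpy as np

def main_axis_order(u):
    """Return an axis ordering that puts the LOR's dominant direction
    last, so condition (4.4) holds after relabeling the axes: the
    dominant component then plays the role of e_z, which bounds the
    angle between the LOR and the outer-loop axis by pi/4."""
    k = int(np.argmax(np.abs(u)))            # dominant axis index
    return [a for a in range(3) if a != k] + [k]
```

Applying this ordering to the voxel coordinates is equivalent to the axis rotation mentioned above.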

The line forward projection of the volumetric image x_j along LOR i can be described equivalently as
\[
f_i = \sum_{k=1}^{N_z} \Bigl( \sum_{j \in S_{ik}} a_{ij} x_j \Bigr) \tag{4.5}
\]
where
\[
S_{ik} = T_i \cap Q_k, \tag{4.6}
\]
and Q_k represents a slice of the volumetric image along the e_z axis, indexed by k = 1 . . . N_z, where N_z is the number of slices (the total image volume size is N = N_x × N_y × N_z).

In this formulation, the outer loop distributes the computation across the dimension ez while

the inner loop iterates over the two remaining dimensions. In Figure 4.4a, the inner and the

outer loops are represented by vertical and horizontal dashed lines, respectively.

S_ik can be equivalently described by introducing the ellipse E, defined as the set of all the points in slice Q_k that are at a distance η from LOR i (Figure 4.4b).

The computation of the inner loops in (4.5) is distributed over parallel shading units. In the GPU, the computation is done by drawing a horizontal line, N_z pixels long, in a temporary texture while a custom shader is bound (represented in Figure 4.4a by a horizontal line at the bottom). The inner-loop computation is skipped when S_ik is empty.

The direct computation of the inner loop in (4.5) is inefficient because the bounds vary with the LOR and the slice index k (Figure 4.4b). Yet, when conditions (4.4) are satisfied, the number of iterations in the inner loop is bounded by (2√2 η + 1)², because the angle between the LOR and the z axis is less than π/4. Conditions (4.4) can always be met by choosing the main dimension of the LOR to correspond to the outer loop.

Consequently, the inner loop can be performed in exactly ⌈2√2 η + 1⌉² iterations, provided that an indicator function for the TOR T_i is used:
\[
I_{T_i}(j) =
\begin{cases}
1, & j \in T_i \\
0, & \text{otherwise.}
\end{cases} \tag{4.7}
\]

The indicator function I_{T_i} is efficiently evaluated by the GPU. For k such that S_ik is not empty, the inner-loop computation can be equivalently expressed as
\[
\alpha_{ik} = \sum_{j \in S^\dagger_{ik}} I_{T_i}(j)\, a_{ij}\, x_j \tag{4.8}
\]
where S†_ik is the set of voxels shown in Figure 4.4b. The voxel set S†_ik contains S_ik but has a constant number of elements. This technique processes more voxels than strictly needed



Figure 4.4: (a) In the line forward projection, voxels that contribute to LOR i are identified by performing an outer and an inner loop. The former iterates over the main dimension of the LOR (as defined in (4.4), here e_z), while the latter iterates over the two remaining dimensions (only e_y is shown in the figure). The computation of the inner loops is performed simultaneously in parallel shaders within the GPU. To make the computation efficient, the inner-loop bounds are increased so that the number of iterations is constant. In a second pass, the outer-loop sum is computed by a second shader (bottom). (b) Voxel j ∈ S_ik (represented in dark gray) if and only if its center (V_j^x, V_j^y) is inside ellipse E (4.6). The size and shape of S_ik vary with i and k, which prevents efficient GPU loops over this set. However, S_ik is a subset of S†_ik (light + dark gray), whose size is constant. Thus, loops on S†_ik run efficiently on the GPU.


but keeps the bounds of the inner loop constant.
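A CPU sketch of the constant-bound inner loop (4.8) may clarify the idea: the loop bounds are fixed, and the indicator discards voxels outside the TOR. The in-slice distance used below stands in for d_ij, and the window placement and kernel are illustrative:

```python
import numpy as np

def inner_loop_sum(x_slice, center, eta, kernel):
    """Inner-loop sum (4.8) over the fixed-size window S_ik^dagger:
    a constant-size square around the LOR/slice intersection `center`.
    The indicator (distance test) keeps only voxels of S_ik, so the
    result matches the variable-bound sum while every LOR executes
    the same number of iterations."""
    cx, cy = center
    half = int(np.ceil(np.sqrt(2.0) * eta))  # constant loop bound
    acc = 0.0
    for dx in range(-half, half + 1):
        for dy in range(-half, half + 1):
            jx, jy = cx + dx, cy + dy
            if not (0 <= jx < x_slice.shape[0] and 0 <= jy < x_slice.shape[1]):
                continue
            d = np.hypot(dx, dy)             # in-slice stand-in for d_ij
            if d <= eta:                     # indicator I_{T_i}(j)
                acc += kernel(d) * x_slice[jx, jy]
    return acc
```

The extra iterations over voxels outside S_ik contribute nothing to the sum; they are the price paid for loops that can be unrolled on the GPU.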

The translation of this technique into OpenGL/Cg terms is the following: horizontal lines (shown at the bottom of Figure 4.4a) are drawn into a temporary 2-D buffer while a 1-D texture is applied onto these lines by mapping the horizontal line endpoints to the original LOR endpoints. The 1-D mapping generates texture look-up coordinates (shown as white dots in Figure 4.4a). Textures are filtered on-the-fly by custom shaders which perform the inner loop computation described in (4.8). This method generates the αik values and stores them in a temporary 2-D texture. In a second pass, a shader calculates the sum over k (Algorithm 4.1).

4.2.2.4 Voxel Identification in Line Backprojection

A different technique was used to identify the voxels in the line backprojection. The GPU rasterizer was used to identify which voxels belong to the TOR and to distribute the evaluation of the system response kernel.

The GPU rasterizer can convert a 2-D vectorial polygon Γ into a 2-D pixel image. In computer graphics, 2-D polygons come from the projection of 3-D vectorial primitives onto the plane of the display. Pixel j is rastered if its center (Vj^x, Vj^y) belongs to polygon Γ (Figure 4.5). We call

RΓ = { j : (Vj^x, Vj^y) ∈ Γ }    (4.9)

the set of such voxels. A pixel shader Φ can be inserted in the graphics pipeline to compute

the pixel value xj (i.e. color). This yields the raster equation

xj = Φ(j) if j ∈ RΓ, and 0 otherwise.    (4.10)

GPUs can only raster 2-D vectorial objects, which hinders a straightforward implementation of 3-D line backprojection. Yet, it is possible to circumvent this obstacle by performing the line backprojection slice by slice. Color is used to encode the slice index and process four slices simultaneously. For each slice k and LOR i, a polygon Γ is generated and then rastered into the set of voxels RΓ (4.9). The best choice for Γ is the smallest rectangle that covers the ellipse E (Figure 4.5). In that case, RΓ contains Sik and all the voxels in Ti are processed. RΓ can be larger than Sik, so an indicator function is necessary (4.7).

In OpenGL, rectangles are drawn into a 2-D texture while vertex and pixel shaders are bound, respectively, to define Γ's coordinates and to evaluate the value of the system response kernel at each pixel location. The result of the kernel evaluation, aij, is then assigned to the pixel color register and additively blended with the image texture (Algorithm 4.1).

Figure 4.5: Pixels whose center (represented by a black dot) is located within the raster polygon Γ are selected by the GPU rasterizer (light+dark gray). When the coordinates of the raster polygon Γ are chosen to contain ellipse E, the set of such voxels includes Sik. Rastering a rectangle provides an efficient way to identify contributing voxels in the backprojection.
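The slice-by-slice rasterized backprojection can be mimicked on the CPU as follows. This is an illustrative Python sketch, not the OpenGL implementation: the bounding rectangle plays the role of the raster polygon Γ, and `kernel` and `in_tor` are hypothetical callables standing in for the shader's kernel evaluation and the indicator (4.7).

```python
import numpy as np

def backproject_into_slice(img, rect, value, kernel, in_tor):
    """img: 2-D slice updated in place; rect: (u0, u1, v0, v1) bounding
    rectangle standing in for polygon Gamma; value: projection value for
    this LOR; kernel(j) -> a_ij; in_tor(j) -> indicator I_Ti(j)."""
    u0, u1, v0, v1 = rect
    for u in range(max(u0, 0), min(u1 + 1, img.shape[0])):
        for v in range(max(v0, 0), min(v1 + 1, img.shape[1])):
            if in_tor((u, v)):               # mask RGamma voxels outside TOR
                img[u, v] += kernel((u, v)) * value   # additive blending
```

On the GPU, the double loop is replaced by the rasterizer launching one pixel-shader thread per covered voxel, and the `+=` by the hardware blending unit.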

Identifying voxels using the GPU was implemented distinctly in the line forward and backprojections. In the forward projector, we used a constant-size square to bound the set Sik of the voxels that contributed to LOR i (4.2), while in the backprojector we used a variable-size rectangle (Figure 4.5). The latter method was more efficient because fewer voxels were needlessly processed, which was experimentally confirmed: the GPU line backprojector runs 40% faster than the forward projector. Unfortunately, due to GPU architecture constraints, it is not efficient to use the rasterizer in the line forward projector. Another fundamental difference is that parallelization in the forward projection was achieved by running computation simultaneously on multiple slices, while in the backprojection the voxels that belong to the same slice are processed in parallel.

4.2.2.5 Kernel Evaluation

The pixel shaders evaluate the value of the system response kernel. For each LOR, this evaluation is performed twice (once in the forward and once in the backprojection) on all the voxels belonging to the associated TOR.


First, the kernel parameters are calculated using LOR and voxel attributes. LOR attributes are defined in the vertex shader and passed to the pixel shader. The voxel attributes are read from the Cg WPOS register.

For a fixed-width Gaussian system response kernel, the only parameter needed is the distance dij between LOR i and voxel j. This distance can be computed by forming the orthogonal projection of the voxel center Vj onto the LOR defined by a point Pi and a direction vector ui, i.e.

dij = ‖ (Vj − Pi) − ((Vj − Pi) · ui) ui ‖2.    (4.11)

This computation is fast because hardwired GPU functions for dot product and norm are

used.
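In NumPy terms, (4.11) is a two-line computation. This is an illustrative CPU sketch; the GPU version uses the hardwired dot-product and norm intrinsics instead.

```python
import numpy as np

def lor_voxel_distance(P, u, V):
    """Distance between voxel center V and the LOR through point P with
    unit direction u, via orthogonal projection (eq. 4.11)."""
    w = V - P                                # vector from LOR point to voxel
    return np.linalg.norm(w - np.dot(w, u) * u)  # reject component along u
```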

Following the calculation of the parameter variables, the kernel value aij for LOR i and voxel j is evaluated. The kernel evaluation can use texture look-ups and arithmetic functions such as exponentials, powers and linear interpolation. Texture look-ups are useful, for example, to read out the coefficients of spline functions, which represent one parameter of the system response kernel. The kernel value is only computed when needed. This approach allows for implementation of arbitrary shift-varying kernels. The high-level shading language Cg [109] provides an extensive library of mathematical functions that are applicable to both scalar and vectorial floating-point registers.

4.2.2.6 Vector Data Update

The last stage of the projection consists of updating the data vector (either a volumetric

image or a set of list-mode projections).

For the line forward projector, the partial sums αik (4.5) are summed (outer loop):

fi = ∑_{k=1}^{Nz} αik.    (4.12)

The resulting values fi are then inverted and written back to the projection data texture in preparation for the line backprojection.

In the line backprojector, the pixel shader called by the rasterizer directly writes to the correct voxel location. Additive blending was enabled to add the shader output to the previous voxel value (4.10). Additive blending is performed in dedicated 32-bit floating-point units. The last step in OSEM consists of multiplying the update image by the previous volumetric image and dividing it by the sensitivity map (3.5). This is done by running the volumetric image through a pixel shader.
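Conceptually, the full sub-iteration (forward project, invert, backproject, multiplicative update) reduces to a few array operations. A minimal NumPy sketch follows, with a dense matrix `A` standing in for the on-the-fly GPU kernel evaluation; the function and argument names are illustrative, not from the dissertation.

```python
import numpy as np

def osem_subiteration(x, A, sens, eps=1e-12):
    """One list-mode OSEM sub-iteration for one subset of events.
    x: current image estimate; A: (n_events x n_voxels) system matrix for
    the subset; sens: sensitivity map N_j."""
    f = A @ x                          # line forward projection (eq. 4.12)
    ratio = 1.0 / np.maximum(f, eps)   # invert the projection values
    back = A.T @ ratio                 # line backprojection, additive blending
    return x * back / sens             # multiplicative update / sensitivity
```

In the GPU implementation, `A @ x` and `A.T @ ratio` are never formed as matrix products; each event's TOR is traversed on the fly.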


Algorithm 4.1 Simplified schematics for one sub-iteration of list-mode 3D-OSEM on the GPU. (OGL) indicates an OpenGL call; (VS) and (PS) denote programs running in the vertex and the pixel shader, respectively.

Load list-mode events in video memory (OGL)

Line forward projection:

For each event

Choose outer loop dimension (VS)

Compute number of slices traversed (VS)

Draw a horizontal line (outer loop) (OGL)

For each pixel in the horizontal line

Inner loop through a slice (PS)

Evaluate kernel (PS)

Read image value (PS)

Accumulate (PS)

Sum horizontal line voxels (PS)

Update projection value (PS)

Line backprojection:

For each slice

For each event

Raster rectangle (OGL)

Compute rectangle coordinates (VS)

For each voxel in rectangle

Evaluate kernel (PS)

Blend additively with image (OGL)

Update image estimate multiplicatively (PS)

Divide by normalization map (PS)

Algorithm 4.1 summarizes the steps involved in the back- and forward projection of a group of LORs. A more detailed overview, including shader code, can be found in Appendix A.

4.3 Discussion

GPUs and CPUs both aim at executing the workload as fast as possible, but they use different strategies to achieve that goal. CPUs excel at executing one long thread of computation, while GPUs are efficient at running thousands of independent threads. Therefore, it is necessary to adopt different reconstruction strategies on each platform. For example, Siddon's algorithm [110] is well suited to CPU but not GPU architectures because it requires voxels to be processed sequentially, in long threads of computation. In kernel projection techniques,

the system matrix is evaluated at each voxel independently, so the computation can be broken down into many small threads. Besides, kernel projection techniques produce better images because Siddon's algorithm is based on the perfect line integral model, which does not include the contribution of voxels that are off of the LOR axis.

The CUDA library [111] is another interface to the compute engine of the GPU. CUDA has several advantages, including greater ease of development, shared memory, and scattered writes. However, CUDA does not provide access to the rastering engine, which is a critical component of the line backprojection approach we have developed.

The GPU line projection technique presented in this chapter is used to implement three different applications, which are presented in the next chapter. An evaluation of the accuracy of the projection is also presented.

Chapter 5

Applications of GPU-Based Line

Projections

5.1 Overview

Line projections are essential building blocks in tomographic image reconstruction. In this chapter, we present three applications of GPU-based line projections for list-mode reconstruction.

Three reconstruction algorithms were implemented. The first algorithm was implemented for the CZT high-resolution PET system; it used a simple shift-invariant projection kernel and served to validate the accuracy of the GPU-based line projections (Section 5.2). The second algorithm, implemented on the same system, was based on an accurate detector response model that was calculated on-the-fly on the GPU (Section 5.3). The third algorithm was implemented for a clinical system with time-of-flight (TOF) capabilities (Section 5.4).

Owing to the sparseness of the data, list-mode OSEM (3.1.3.3) was chosen for all three

applications. The techniques we have described in Chapter 4 are particularly suitable for

list-mode because the line projections need to be performed individually on the GPU.

5.2 List-Mode OSEM with Shift-Invariant Projections

5.2.1 Shift-Invariant System Response Kernel

The results presented in this section are based on a shift-invariant Gaussian kernel centered

on the LOR axis. The full-width half-maximum (FWHM) was chosen to match the average


system-resolution blurring. The kernel K is parametrized by the distance dij between the center of voxel j and LOR i:

K(dij) = exp(−dij² / (2σ²))    (5.1)

and we have aij = K(dij). This kernel is not a perfect representation of the system response, but it is sufficient to demonstrate the GPU line-projection technique. The next section will demonstrate the use of the GPU line projection with more advanced, shift-varying projection kernels.
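For reference, (5.1) with the FWHM parametrization used below (FWHM = 2.35σ) can be sketched as follows; the function name is illustrative.

```python
import numpy as np

def gaussian_kernel(d, fwhm=1.0):
    """Shift-invariant Gaussian TOR kernel (eq. 5.1), parametrized by FWHM."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM = 2.355 sigma
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))
```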

5.2.2 Methods

5.2.2.1 Simulation Data

This work used data from a simulation of the small-animal PET system design based on cross-strip 3-D CZT detectors described in 1.3.2.

Figure 5.1: (a) Rod phantom used for contrast recovery comparison. (b) Sphere phantom used for resolution evaluation.

The Monte-Carlo package GATE [70] was used to simulate the acquisition of two phantoms. To keep the simulation as realistic as possible, the output from the GATE hits file was used to position each photon event. Due to the low photo-fraction of the CZT material, incoming photon events often interact multiple times in the detectors (Chapter 6). Such photon events were positioned at the estimated location of the first interaction and binned to the nearest 1×5×1 mm³ bin. Consistent with measurements [16], we modeled the energy resolution by adding Gaussian noise with FWHM 3% × √(511/E), where E is the energy of the single interaction in keV.

A phantom comprising two large concentric rods (1 cm and 4 cm diameter) of activity (Figure 5.1a) was simulated to assess the quantitative contrast recovery of the GPU-based reconstruction independent of the system resolution. Two regions of interest (ROI 1 and ROI 2) were defined in the 1 cm and the 4 cm radius rods as shown in Figure 5.1. The activity in each rod was set to create a 10:1 activity concentration ratio between ROI 1 and ROI 2. The contrast C was measured on reconstructed images as a function of iteration

as

C = (xROI 1 − xROI 2) / xROI 2    (5.2)

where xROI 1 and xROI 2 are the average image intensities over each ROI. The spatial variance σ²ROI 2 in ROI 2 was also computed to approximate image noise N. Our figure of merit for noise in the images is

N = σROI 2 / xROI 2.    (5.3)
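The two figures of merit (5.2) and (5.3) amount to simple ROI statistics. A sketch in NumPy (the function and mask names are illustrative; masks are assumed boolean):

```python
import numpy as np

def contrast_and_noise(img, roi1_mask, roi2_mask):
    """Contrast (eq. 5.2) and noise figure of merit (eq. 5.3) from ROI masks."""
    m1 = img[roi1_mask].mean()          # mean intensity in ROI 1
    m2 = img[roi2_mask].mean()          # mean intensity in ROI 2
    contrast = (m1 - m2) / m2           # eq. (5.2)
    noise = img[roi2_mask].std() / m2   # eq. (5.3)
    return contrast, noise
```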

Photons that scattered in the object as well as random coincidences were not included in

the reconstruction to obtain the reconstructed contrast in an ideal case.

The phantom data were reconstructed using list-mode 3D-OSEM on a CPU and a GPU architecture. On the CPU, we used an in-house C++ reconstruction package [112] that was modified to support arbitrary system response kernels. On the GPU, we used the novel technique described in Section 4.2. For both platforms, the FWHM (= 2.35σ in (5.1)) of the fixed-width Gaussian kernel was chosen to be 1 mm, a value roughly equal to the detector pitch. The computation of the sensitivity image Nj followed the same procedure for both reconstructions.

A high-resolution sphere phantom (Figure 5.1b) was simulated to examine the effects of the GPU reconstruction on image resolution. The phantom was composed of four quadrants of spheres, all in one central plane, placed in air. The spheres were 1, 1.25, 1.5 and 1.75 mm in diameter. Their centers were placed twice their diameter apart. Twenty million counts were acquired. The activity was placed all the way up to the edge of the 8 × 8 × 8 cm³ system FOV.

Finally, to provide a global measure of the deviation between images produced using GPU and CPU list-mode 3D-OSEM, we measured the average relative deviation

ε = (1/N) ∑_{j=1}^{N} |xcpu_j − xgpu_j| / xcpu_j    (5.4)

at different sub-iterations for both phantoms.
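Equation (5.4) is a one-liner in NumPy (illustrative function name):

```python
import numpy as np

def avg_relative_deviation(x_cpu, x_gpu):
    """Average relative deviation between two reconstructions (eq. 5.4)."""
    return np.mean(np.abs(x_cpu - x_gpu) / x_cpu)
```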


Figure 5.2: (a) GE Vista eXplore DR small-animal PET system. (b) Picture of the hot rod PET phantom.

5.2.2.2 Validation: Experimental Pre-Clinical Data

The GE eXplore Vista DR [29] is a pre-clinical PET scanner installed at Stanford with two

depth layers of 1.55 mm-pitch crystals. The useful FOV is 6.7 cm transverse and 4.6 cm

axial. Photons can be recorded by 6,084 crystal elements, providing 28.8 million LORs.

Data are acquired in 3-D and stored in LOR histograms. We performed two phantom studies

(hot rod and cold rod phantoms) to evaluate the performance of the GPU reconstruction

on a real dataset.

The hot rod phantom (Micro Deluxe phantom, Data Spectrum, Durham, NC) was filled with 110 µCi of 18F and imaged for 20 minutes. The cold rod phantom was filled with 200 µCi of 18F and imaged for 20 minutes. The rod diameters were 1.2, 1.6, 2.4, 3.2, 4.0 and 4.8 mm. The spacing between the centers was twice the diameter. For both experiments, data were collected in histogram-mode.

Reconstruction was performed on a GPU using 3D-OSEM with a Gaussian kernel (1.4 mm FWHM) and on a CPU using FORE+2D-OSEM, included with the Vista DR installation. For both reconstructions, 32 subsets were formed and two iterations were run, the recommended value for the system. For 3D-OSEM, the subsets were formed by generating a random partition of the LORs. We also modified our GPU-based list-mode reconstruction package to handle histogram-mode data by adding the capability to assign a projection value to each LOR.


Figure 5.3: Reconstruction of the rod phantom using list-mode 3D-OSEM on (a) the GPU and (b) the CPU.

Figure 5.4: Horizontal profile through the center of both images (Figure 5.3).

5.2.3 Results

No significant difference was observed between the images generated using list-mode 3D-OSEM on the GPU and the CPU for the simulated rod contrast phantom (Figure 5.3). This was further confirmed by a horizontal profile through the center of both images (Figure 5.4). The contrast-noise trade-off at different sub-iterations was affected neither by the mathematical reformulation of line projections nor by the use of the GPU as a reconstruction platform (Figure 5.5). The contrast, measured between ROI 1 and ROI 2, converged to 9.426 for the GPU and 9.428 for the CPU. Noise was virtually identical for both reconstructions (0.28440 vs 0.28435 RMS).

Inspection of the sphere phantom images revealed no significant difference between the two implementations (Figure 5.6). Neither did the profile through one row of 1.75 mm spheres. The reconstructed sphere size was evaluated by fitting a sum of Gaussians to 1-D profiles through the center of the 1.75 mm spheres. The sphere size on images reconstructed with 3D-OSEM on both GPU and CPU is 1.36 ± 0.32 mm. The difference in the reconstructed sphere size between the GPU and CPU implementations was on the order of 10⁻⁵ mm.

The global difference between images reconstructed using the GPU and the CPU was quantitatively evaluated by measuring the average relative deviation (5.4). The overall deviation ε between the two implementations was below 0.25% at 20 iterations for both phantoms. It was lower for the rod phantom than for the sphere phantom (Figure 5.8).

Figure 5.5: Contrast-noise trade-off at different sub-iterations for the rod phantom (Figure 5.3). Contrast is evaluated between ROI 1 and ROI 2 (Figure 5.1). Noise is approximated by the spatial standard deviation in ROI 1.

The GPU reconstruction package was benchmarked against an existing standard reconstruction package on high-resolution datasets acquired on the Vista DR. A comparison of GPU histogram-mode 3D-OSEM against CPU FORE+2D-OSEM for the hot rod (Figure 5.9) and the cold rod (Figure 5.10) shows visual differences. All of the nineteen 1.6 mm rods were resolved when 3D-OSEM was used, compared to only ten with FORE+2D-OSEM. The improvement is due to the limited potential of FORE for resolution recovery [42, 43], not the difference in processing between GPU and CPU.

Figure 5.6: Sphere phantom in air reconstructed with 20 iterations of list-mode 3D-OSEM on (a) the GPU and (b) the CPU, using a Gaussian kernel with 1 mm FWHM.

Figure 5.7: Horizontal profile through the 1.75 mm spheres for both reconstructions (Figure 5.6).

Figure 5.8: Average relative deviation between the GPU and the CPU versions of list-mode 3D-OSEM for the rod phantom and the sphere phantom.

Figure 5.9: Micro Deluxe hot rod phantom, acquired on the Vista DR system and reconstructed with (a) histogram-mode 3D-OSEM with a 1.4 mm-FWHM Gaussian kernel on the GPU and (b) FORE+2D-OSEM provided with the system. A single slice is shown. The rod diameters are 1.2, 1.6, 2.4, 3.2, 4.0 and 4.8 mm. Spacing is twice the diameter.

Figure 5.10: Micro Deluxe cold rod phantom, acquired on the Vista DR system and reconstructed with (a) histogram-mode 3D-OSEM with a 1.4 mm-FWHM Gaussian kernel on the GPU and (b) FORE+2D-OSEM provided with the Vista DR system. A single slice is shown. The rod diameters are 1.2, 1.6, 2.4, 3.2, 4.0 and 4.8 mm. Spacing between centers is twice the diameter.

The processing time for each reconstruction method was measured (Table 5.1). CPU-based 3D-OSEM was benchmarked on an Intel Core 2 Duo E6600 (2.4 GHz) CPU. The GPU used for the same task was an NVIDIA GeForce 8800 GT. The image size was 160 × 160 × 160 voxels for the simulated datasets and 175 × 175 × 60 voxels for the Vista DR datasets. The measured time includes Fourier rebinning for FORE+2D-OSEM. A 1 mm-FWHM Gaussian kernel with a TOR cut-off of η = 1 mm was used for 3D-OSEM in the first experiment. In the second one, we chose a 1.1 mm-FWHM kernel with a TOR cut-off of η = 0.8 mm. Reconstruction time is given per million LORs processed (back- and forward projected). For list-mode 3D-OSEM on the simulated PET system, the GPU reconstruction

was 25 times faster than the CPU's. 3D-OSEM on the GPU was 2.3 times slower than FORE+2D-OSEM on the CPU, but potentially more accurate. The computation of the sensitivity map took 7 min 20 s for the simulated dataset and 1 min 14 s for the real dataset on the Vista DR.

Table 5.1: Reconstruction time (seconds per million LORs processed)

System      Algorithm                             Recon. time (s)
CZT PET     GPU 3D-OSEM (160 × 160 × 160)         8.8
            CPU 3D-OSEM (160 × 160 × 160)         224
Vista DR    GPU 3D-OSEM (175 × 175 × 60)          5.3
            CPU FORE+2D-OSEM (175 × 175 × 61)     2.3

5.2.4 Discussion

Despite different projection formulations and hardware architectures, the GPU and the CPU versions of list-mode 3D-OSEM generated virtually identical images. Figure 5.8 indicates that globally, at 20 iterations, the relative deviation ε between the gold-standard CPU implementation and its GPU-based counterpart was, on average, on the order of 0.25%. This level of error is acceptable for PET and well within the accuracy needed. For example, for a scan with 100 million counts, a 100 × 100 × 100 voxel image will have at best 10% noise per voxel (based on Poisson statistics). The deviation between GPU and CPU reconstruction was also smaller for low-resolution phantoms such as the rod phantom (ε < 0.12%).

The agreement between the GPU and the CPU implementations was validated both in terms of the quantitative voxel values (Figure 5.5) and the ability to resolve small features (Figures 5.6 and 5.7). The contrast-noise trade-off and the reconstructed sphere sizes were identical.

The computation of the distance dij between voxel j and LOR i (4.11) is the leading cause of error on the GPU. The inaccuracy in the calculation of dij is around 8.6 × 10⁻⁶ voxel RMS. This error might seem insignificant; however, dij is computed and compared to the cut-off η 10 billion times per sub-iteration. As a result of these errors, 0.002% of the TOR voxels are misclassified. The difference in dij values stems from minuscule errors in the output of floating-point operations on graphics hardware.

Other, less significant sources of deviation between GPU and CPU results occur during the evaluation of the kernel. The numerical values produced by the GPU's hardwired functions, such as exponentials, are slightly different from those produced by CPU math libraries.


The Vista DR study shows that the GPU reconstruction performs well with data measured on an existing high-resolution PET system. We compared GPU 3D-OSEM with a Gaussian kernel to the standard reconstruction algorithm installed on this system, FORE+2D-OSEM, in order to show that the GPU reconstruction produces acceptable results. The quality of the images meets our expectations and matches or exceeds that of the FORE+2D-OSEM reconstruction.

As shown in Table 5.1, FORE+2D-OSEM on a CPU is 2.3 times faster than 3D-OSEM on the GPU, but potentially not as accurate because FORE uses several approximations to rebin the 28.8 million LORs into 1.4 million effective 2-D LORs (sixty-one 2-D sinograms with 175 spatial locations and 128 angles [29]). While FORE+2D-OSEM trades image quality for reconstruction speed, the GPU implementation does not pay a significant penalty for the acceleration.

It is also worth noting that the processing time for FORE+2D-OSEM per million effective LORs is 47.3 s, which is 9 times that of GPU 3D-OSEM. In addition, the rebinned 2-D LORs involve a smaller number of voxels because they are shorter than 3-D LORs and they do not incorporate a broad system kernel. The TORs that were used in Table 5.1 for 3D-OSEM involved on average 10 times more voxels than the LORs used for 2D-OSEM, the volumetric image size being equal. Thus, 3D-OSEM would run around 10 times faster if a narrow (i.e., small η) TOR were used.

A few other qualitative comments can be made. Concerning the hot rod phantom (Figure 5.9), all of the 1.6 mm rods are clearly resolved for the GPU-based reconstruction with the Gaussian kernel. In contrast, some of the 1.6 mm rods at the edge of the FOV are not resolved in the FORE+2D-OSEM image. The background noise is also lower by 27% for the 3-D reconstruction. For the cold rod phantom (Figure 5.10), we observed that 3D-OSEM provided greater uniformity throughout the FOV as well as higher contrast.

The 1.6 mm diameter rods are more difficult to resolve in the cold rod phantom than in the hot rod phantom. This is due to several factors. In the hot rod phantom, the signal (defined as the reconstructed activity in the hot rods) has lower noise than the surrounding background. The situation is reversed for the cold rod phantom, where the signal is constituted by the cold rods. In addition, random and scatter events (uncorrected in this study) tend to obscure the cold regions of the phantom. Positive reconstruction bias, which occurs in cold regions in PET, has a similar effect. Because spatial resolution is not the limitation in a cold rod phantom, the difference between 2-D and 3-D reconstruction is more subtle for the cold rod phantom than for the hot rod phantom.


Figure 5.11: Schematics of the computation architecture used for calculating the CDRF on the GPU. The complete process is divided into three stages: one running on the CPU, one in the GPU vertex shaders, and one in the GPU fragment shaders.

5.3 List-mode OSEM with Shift-Varying Projections

In Chapter 2, we described a method to calculate the geometric detector response for a system based on CZT modules. In this section, this approach is applied to generate an accurate system matrix, on the GPU, within the reconstruction. Because this on-the-fly method relies on GPU computation rather than memory access, it provides a fast alternative to storing the full detector response model. In addition, it is advantageous in cases where the PET system geometry is different for every scan (for example, for a breast-dedicated PET scanner with variable detector separation, such as the one shown in 1.3.3).

5.3.1 Methods

5.3.1.1 Implementation

The coincidence detector response function (CDRF) approach was implemented for the

small-animal PET system based on CZT detectors under development at Stanford. Owing

to the large number of LORs in that system (more than 10 billion), reconstruction was

performed in list-mode using a fully 3-D OSEM algorithm (3.1.3.3). The system matrix

coefficients were calculated on-the-fly. In order to accelerate the computation, we used the GPU to perform the line projections and the online kernel evaluation.

The implementation relies on the basic principles introduced in Chapter 4. The voxels contained in a cylinder of radius η are identified, both in the forward and in the backprojection. For each voxel, the kernel is calculated using the CDRF procedure outlined in Chapter 2.

The calculation of the CDRF is split into three stages (Figure 5.11). The first stage, performed on the CPU, consists of calculating a piecewise linear approximation of both intrinsic detector response functions (IDRFs) for all the LORs in the current subset. Each IDRF is stored using only four floating-point coefficients: X0, X1, Y1 and Y2. These coefficients are transferred to a 2-D texture in the GPU video memory.

In the second stage, which takes place in the GPU parallel vertex shaders, the coefficients adl and bdl are calculated for every LOR. They are then streamed to the fragment shaders. In the third stage, the GPU fragment shaders compute the kernel value for every voxel within the tube-of-response following (2.15).

5.3.1.2 Evaluation

The Monte-Carlo package GRAY [113] was used to simulate the acquisition of two phantoms with the CZT-based PET system. To keep the simulation as realistic as possible, the output from GRAY was used to position each photon event. Due to the low photo-fraction of the CZT material, incoming photon events often interact multiple times in the detectors (Chapter 6). Such photon events were positioned at the estimated location of the first interaction and binned to the nearest 1×5×1 mm³ bin. Consistent with measurements [16], we modeled the energy resolution by adding Gaussian noise with FWHM 3% × √(511/E), where E is the energy of the single interaction in keV.

The high-resolution sphere phantom (Figure 5.12b) was used to study the effects of accurate system modeling on image resolution. The phantom was composed of four quadrants of spheres in air, all in the central axial plane, placed all the way to the edge of the 8 × 8 × 8 cm³ transaxial FOV. The spheres were 1, 1.25, 1.5, and 1.75 mm in diameter. Their centers were placed twice their diameters apart. The phantom had a total of 800 µCi and five seconds of acquisition were simulated, yielding 27.2 million coincident events.

Two reconstructions were performed on the GPU using list-mode 3D-OSEM with 10 subsets. The first reconstruction used a shift-invariant 1 mm-FWHM Gaussian kernel, and the second one a shift-varying model based on the analytical CDRF. The reconstructed sphere FWHM was measured by fitting a Gaussian mixture with offset to 1-D profiles through the reconstructed image. Since the ML estimate is non-linear, the reconstructed sphere FWHM should be analyzed with care and should not be interpreted in terms of modulation transfer function. It should also be noted that the reconstructed sphere FWHM is not expected to be equal to the true sphere diameter (see Appendix E for more details). The 1 mm spheres were also too small relative to the voxel size for a reliable measure of their FWHM.

The contrast phantom (Figure 5.12a) was also used to assess the quantitative contrast recovery. The phantom was composed of a 2.5 cm-radius, 6 cm-long cylinder, filled with a warm solution of activity, in which five hot spheres were placed. The spheres were centered on the central axial plane and their diameters were 1, 1.5, 2, 4, and 8 mm. The activity


Figure 5.12: Depiction of the phantoms used for measuring the effect of shift-varying resolution models. (a) A contrast phantom, consisting of a 2.5 cm-radius, 6 cm-long cylinder filled with a warm solution of activity, in which were placed five hot spheres of diameters 1, 1.5, 2, 4, and 8 mm. The ratio of the activity concentration in the hot spheres to that in the warm background cylinder was 10. (b) A hot sphere resolution phantom, consisting of four sphere patterns, all in the same central plane. The spheres extended to the edge of the 8 × 8 × 8 cm³ FOV and their diameters were 1, 1.25, 1.5, and 1.75 mm. The spacing between the spheres' centers was twice their diameter.

was ten times more concentrated in the hot spheres than in the warm background. The phantom had a total of 800 µCi and five seconds of acquisition were simulated, yielding 14.6 million coincident events. Reconstruction was performed in list-mode with two subsets and attenuation correction. The contrast was measured in the reconstructed image as a function of iteration number (5.2). The mean reconstructed activity was measured in the hot spheres using spherical regions-of-interest (ROIs). The background activity was evaluated by averaging the reconstructed intensity in two cylindrical ROIs placed off of the central axial plane. The noise was approximated by the spatial standard deviation in the background ROI, normalized by the mean background intensity (5.3).

5.3.2 Results

The impact of using a resolution model based on the CDRF in the reconstruction was evaluated both in terms of contrast recovery and spatial resolution.

5.3.2.1 Resolution

Figure 5.13 shows the high-resolution sphere phantom reconstructed with a shift-invariant

Gaussian kernel and a shift-varying model. The image reconstructed with a shift-invariant

model has non-uniform resolution due to parallax errors (defined in Chapter 1). Radial blurring is

88 CHAPTER 5. APPLICATIONS

(a) (b)

Figure 5.13: Hot spheres in air phantom, reconstructed on the GPU with 5 iterations of list-mode 3D-OSEM with 10 subsets and (a) a shift-invariant Gaussian kernel or (b) an accurate model of the system response based on the analytical CDRF. The spheres extend to the edge of the 8 × 8 × 8 cm³ FOV and their diameters are 1, 1.25, 1.5, and 1.75 mm. They are spaced at twice their diameter.

noticeable at the edge of the FOV due to oblique LORs. In contrast, the image reconstructed

using a shift-varying model based on the analytical CDRF shows little sign of resolution

degradation near the edge of the FOV.

This is further confirmed by measuring the reconstructed FWHM of the spheres along

a horizontal profile as a function of sphere position. The results of these measurements

are reported in Figure 5.14, for the (a) 1.75, (b) 1.5, and (c) 1.25 mm spheres. All the

reconstructed spheres are significantly smaller when an accurate shift-varying model is used.

In addition, the spatial resolution is uniform throughout the entire FOV, as evidenced by

the uniform reconstructed sphere size.

5.3.2.2 Contrast

Figure 5.15 shows the reconstructed contrast phantom (Figure 5.12a) after reconstruction

with a shift-invariant Gaussian kernel and the system response model based on the CDRF.

In both cases, reconstruction was performed by running 50 iterations of list-mode OSEM

with two subsets.

Figure 5.16 compares the contrast vs. noise trade-off for reconstruction with a shift-

invariant Gaussian kernel and a shift-varying analytical model. Because high-frequency

components are only recovered in the late iterations, premature termination of the OSEM

iterations was used as implicit regularization to produce the trade-off curve. For all five

spheres (diameters 8, 4, 2, 1.5, and 1 mm), the use of a more accurate model improves


(a) (b) (c)

[Plots of reconstructed sphere FWHM (mm) versus Y (mm); legend: accurate shift-varying model vs. 1 mm FWHM Gaussian TOR.]

Figure 5.14: Reconstructed sphere size (FWHM in mm) as a function of sphere position, for two projection models, measured by fitting a Gaussian mixture with offset to 1D profiles through the reconstructed images (Figure 5.13). (a) 1.75 mm spheres; (b) 1.5 mm spheres; and (c) 1.25 mm spheres.


(a) (b)

Figure 5.15: Contrast phantom, reconstructed with 50 iterations of list-mode 3D-OSEM with two subsets, using (a) a shift-invariant Gaussian kernel and (b) a shift-varying model based on the CDRF. The phantom was composed of a 2.5 cm-radius, water-filled cylinder, in which were placed five hot spheres. The activity was ten times more concentrated in the spheres than in the background. The sphere diameters were 1, 1.5, 2, 4, and 8 mm.

the trade-off between contrast and noise. More specifically, at any given iteration number,

the CR is higher and the noise is lower (except for the 1 mm sphere) for the shift-varying

reconstruction. For the 8 mm sphere, close to full contrast recovery is observed (CR of

95.7% at convergence). In addition, the background variability is lower for the shift-varying

reconstruction.

5.3.2.3 Reconstruction Time

The reconstruction time was measured for the simple shift-invariant Gaussian and the accurate

shift-varying model (Table 5.2). Both measurements were made for the hot sphere phantom

dataset, using a GeForce 285 GTX (NVIDIA). The image size was 160 × 160 × 160 voxels. Consistent with Section 5.2, the Gaussian kernel width was 1 mm, much narrower than

the average width of the shift-varying kernel based on the CDRF. Hence, the TOR cut-off

parameter η was set to 3.5 voxels for the Gaussian projections, and to 5.5 voxels for the

shift-varying projections. More specifically, η = 3.5 voxels means that the diameter of the

TOR is more than eight times the standard deviation of the Gaussian kernel. Likewise, for

the shift-varying kernel, η = 5.5 voxels results in a TOR diameter of 5.5 mm, larger than

the maximum CDRF kernel width of 5.1 mm (= √(5² + 1²)). As a result, the reconstruction

with the accurate, shift-varying model was ten times slower than the simpler method based on

the shift-invariant Gaussian kernel.


Figure 5.16: Contrast recovery (CR) plotted as a function of noise for varying iteration numbers (datapoints) and sphere sizes. The curves are shown for the five sphere sizes (black: 8 mm, red: 4 mm, magenta: 2 mm, blue: 1.5 mm, and cyan: 1 mm) and for two types of reconstruction: accurate projection model (diamond) or shift-invariant Gaussian model (circle).

Table 5.2: Reconstruction time on a GPU (seconds per million LORs processed).

Projection Model                   Recon. time (s)
Shift-invariant Gaussian kernel    3.0
Shift-varying kernel (CDRF)        29.9


It should be noted that the results reported in Table 5.2 for the shift-invariant kernel

are better than those reported in Table 5.1 in Section 5.2, since the value reported in this

section was obtained on a newer computer equipped with a more powerful GPU.

5.3.3 Discussion

The benefits of using a more accurate, shift-varying model are clear and have already been

demonstrated elsewhere [42, 43, 47, 64, 68, 114]. For the CZT system we are developing, we

have shown that a system response model based solely on the detector response brings four

main improvements. First, the reconstructed spatial resolution is more uniform across the

FOV (Figure 5.13 and Figure 5.14). By incorporating accurate shift-varying information in

the system matrix, the spatially-variant blur present in the projections does not propagate

to the reconstructed image. Secondly, the reconstructed spheres are smaller for the

shift-varying model, which suggests that the spatial resolution is globally higher (Figure 5.14)

and hence is being recovered. Thirdly, the reconstructed images are more quantitative and

accurate because the physical processes involved are better modeled. Fourthly, the noise is

lower because using a more accurate system matrix in the reconstruction reduces the amount

of inconsistency between the different projections.

Figure 5.16 illustrates three of these properties. Higher resolution results in lower partial

volume effect and contributes to higher contrast recovery. The noise is also systematically

lower at a fixed contrast, and at a fixed iteration number. For the 8 mm diameter sphere,

which is large enough not to be affected by the partial volume effect, the CR is 95.8% for the

shift-varying model vs. 85.9% for the shift-invariant Gaussian projection. This suggests

that the shift-varying reconstruction is more quantitative and accurate, a property observed

elsewhere [47,48].

For small objects, spatial blurring causes a loss of contrast, also known as the partial volume

effect (PVE). Due to PVE, small spheres have a lower CR than larger spheres. This property

can be observed in Figure 5.16, except for two spheres: the 1.5 mm-diameter sphere has a higher CR

than the 2 mm one. This effect might be a consequence of the spatially varying nature

of the system response. Because the system is not cylindrical, the spatial resolution can

be different at different sphere locations. Therefore, the amount of PVE for each sphere

depends upon its position.

The total reconstruction time is ten times higher when the shift-varying model is used

(Table 5.2). This is due to two factors: an increase in the number of voxels processed, and

an increase in the computation required to evaluate the shift-varying kernel. For the shift-

invariant Gaussian kernel, 7 × 7 voxels are processed within each slice through the TOR,

5.4. TIME-OF-FLIGHT PET RECONSTRUCTION 93

fewer than half as many as for the wider kernels based on the CDRF (11 × 11 voxels per slice). In

addition, each voxel requires the evaluation of nine different kernel functions that are added

together (2.15).

The system response model can be implemented in many different ways. In this thesis,

we have chosen not to store any information but rather to compute the coefficients of the

system matrix every time they are needed. As a result, this approach is useful when

the PET geometry needs to be adjusted to the patient morphology between scans. It is also

a scalable technique which uses a constant amount of computing resources, independent of

the number of LORs in the system.

A shift-varying model can also be stored in memory; however, there exists a trade-off

between the accuracy of the representation and the amount of memory used. Our approach,

based on linearizing the IDRF, is accurate for the majority of the LORs (Figure 2.10) and

uses little memory. In addition, the computation of the kernel on the GPU is partially

hidden by the latency of reading the voxel values from memory.

5.4 Time-of-flight PET Reconstruction

5.4.1 Background

In PET, when two photons are detected in near coincidence, it can be inferred that the

positron annihilated somewhere near the LOR that connects the two detectors (see Figure

1.1 in Chapter 1). When the time difference between two single events falls within a

predetermined time window, the two events are said to be in coincidence. Due to the finite

speed of light, two photons emitted simultaneously by positron annihilation do not reach

the detectors at the same time. For example, for a 60 cm FOV, the time difference between

the two photons caused by travel time can be as large as 2 ns.

The timing uncertainty in existing clinical PET systems can be as good as 585 ps¹ [115],

which makes it possible to estimate the photon time-of-flight (TOF) difference, and to some

extent the rough location of the positron annihilation along the LOR. This information can

be used in image reconstruction to improve the image quality and quantitative accuracy. A

new generation of PET scanners have been designed and commercialized according to this

principle [115]. For the same scan duration, these systems have higher signal-to-noise ratio

(SNR) than similar systems that do not use the TOF information, and, as a result, provide

improved lesion detectability. Alternatively, the scan time can be decreased while providing

the same image quality as non-TOF PET systems.

¹One picosecond (ps) is equal to 10⁻¹² seconds.


Figure 5.17: Principles of time-of-flight (TOF) PET. Due to the finite speed of light, two photons emitted simultaneously by positron annihilation do not reach the detectors at the same time. A measurement of the TOF gives an estimate of the position of the annihilation along the LOR. This information can in turn be used in image reconstruction.

In a non-TOF PET system, the reconstruction assumes a uniform probability distribution for

the location of the positron annihilation along the LOR. When TOF information is

available, a Gaussian distribution is used instead. The width of the Gaussian ∆x (FWHM)

is determined by the system time resolution ∆τ (FWHM) according to

∆x = (c/2) ∆τ

where c is the speed of light [116]. The typical time resolution varies for different LORs and

an optimal reconstruction method should use a custom TOF kernel for each LOR. However,

for simplicity, the system's average time resolution is usually used for the TOF kernel.

Nevertheless, our GPU-based implementation was designed so that custom TOF kernels can

be used if needed.

Using TOF information yields a gain in SNR on the order of D/∆x [117], where D is

the average diameter of the subject imaged. This SNR increase is due to the fact that the

counts collected on an LOR are backprojected over a smaller region rather than the entire

thickness of the patient. TOF is therefore more helpful for large patients than for small

subjects (such as children), and does not impact small-animal imaging at all.
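As a numerical illustration of the two relations above (constants and function names are ours; the SNR gain uses the D/∆x form quoted from [117]):

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond

def tof_fwhm_mm(delta_tau_ps):
    """Spatial FWHM of the TOF kernel: dx = (c / 2) * dtau."""
    return 0.5 * C_MM_PER_PS * delta_tau_ps

def tof_snr_gain(D_mm, delta_tau_ps):
    """SNR gain on the order of D / dx for a subject of diameter D [117]."""
    return D_mm / tof_fwhm_mm(delta_tau_ps)

# A 585 ps timing resolution localizes the annihilation to ~88 mm FWHM,
# so for a 350 mm-diameter subject the gain is on the order of 4.
```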

The LORs of 3-D TOF-PET systems are characterized by four spatial dimensions (two

rotations and two translations) and one additional TOF dimension. As a result, image

reconstruction is more complex when TOF information is incorporated. Owing to the higher

data dimensionality, the measurements are very sparse. The dimensionality of the data can

be reduced from ve to four dimensions by using rebinning methods that account for the


(a) (b)

Figure 5.18: (a) Depiction of a Gaussian TOF kernel. (b) Parametrization of the TOF and projection kernels.

TOF information [118]. However, such methods are approximate. The best image quality

is obtained when the images are directly reconstructed from the raw data. Fortunately,

ML reconstruction can be performed directly from list-mode (3.1.3.3), which is an efficient

format to store unprocessed PET data with TOF information.

In this section, we demonstrate that reconstruction of TOF-PET data can be performed

on a GPU in list-mode. The basic framework introduced in Chapter 4 is applied.

5.4.2 Methods

5.4.2.1 System Description

The Gemini TF (Philips Medical Systems, Highland Heights, OH) is the first commercial

PET system capable of exploiting TOF information. The system comprises 28 modules,

each consisting of a 23 × 44 array of 4 × 4 × 22 mm³ LYSO crystals. The individual modules

are arranged in multiple 90 cm-diameter rings. The useful transverse and axial FOVs are

57.6 and 18.0 cm, respectively. The system timing resolution for the data shown in this

section was 785 ps (FWHM); however, the timing resolution can be as good as 585 ps (for a

point source) [115]. The timing resolution can be affected by factors such as the count rate

and the detector temperature.

5.4.2.2 Implementation on the GPU

The GPU implementation of list-mode reconstruction with TOF information differs slightly

from non-TOF reconstruction. The projections are performed according to the approach

previously described in Chapter 4, with the exception that the TOF kernel was combined

with the projection kernel.


Figure 5.19: Cylindrical phantom used for time-of-flight PET measurements. The phantom is composed of six 10 mm diameter spheres, placed in a single axial plane 4.2 cm away from the central plane. The activity is six times more concentrated in the spheres than in the cylinder.

Within projection operations, the TOF kernel, modeled as a Gaussian with standard

deviation σ, was truncated at ±3σ. Therefore, the LOR endpoints were reassigned to

C± 3σui, where C is the TOF kernel center, and ui is the direction of LOR i (Figure 5.18).

With this transformation, the transfer of the TOF kernel center C and width σ to the GPU

can be avoided, which reduces the amount of memory required on the GPU.

Within both the forward and the back-projection, the TOF kernel parameters were

computed in the vertex shaders. The TOF kernel center and width were recovered by

computing respectively the center and the distance between the two endpoints. Hence, the

only data required on the GPU are the coordinates of the transformed LOR endpoints.
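A minimal sketch of this endpoint encoding and its inverse (illustrative Python; the actual computation runs in the vertex shaders):

```python
import numpy as np

def encode_tof_endpoints(C, u, sigma):
    """Replace the LOR endpoints by C ± 3σ·u so that only two points
    need to be transferred to the GPU (sketch of the scheme in the text)."""
    C = np.asarray(C, float)
    u = np.asarray(u, float)
    u = u / np.linalg.norm(u)
    return C - 3.0 * sigma * u, C + 3.0 * sigma * u

def decode_tof_kernel(p0, p1):
    """Recover the TOF kernel center and width from the endpoints:
    the center is their midpoint and σ is 1/6 of their separation."""
    p0 = np.asarray(p0, float)
    p1 = np.asarray(p1, float)
    return 0.5 * (p0 + p1), np.linalg.norm(p1 - p0) / 6.0
```

The round trip is exact: encoding a kernel and decoding the resulting endpoints recovers the original center and σ.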

The Gaussian projection kernel Kp (5.2.1) and the Gaussian TOF kernel Ktof were

combined, resulting in a single 2-D Gaussian kernel parametrized both by the TOF and the

distance from the voxel center to the LOR

aij = Kp(dij) · Ktof(d^tof_ij)

where dij and d^tof_ij are the distances between Vj and Lij, and between Lij and C,

respectively (as indicated on Figure 5.18).

The projection Lij of the voxel center Vj onto LOR i is computed on the GPU for each

voxel. Next, the distances dij and d^tof_ij are computed and the Gaussian kernel is evaluated.
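Combining the two kernels amounts to the following per-voxel weight (an illustrative sketch with our own function name; the σ values would be derived from the kernel FWHMs):

```python
import numpy as np

def tof_system_weight(V, p0, p1, C, sigma_p, sigma_tof):
    """a_ij = Kp(d_ij) * Ktof(d_tof_ij): project the voxel center V onto
    LOR (p0, p1), then evaluate the projection kernel at the transaxial
    distance and the TOF kernel at the distance along the LOR to C."""
    V, p0, p1, C = (np.asarray(x, float) for x in (V, p0, p1, C))
    u = (p1 - p0) / np.linalg.norm(p1 - p0)
    L = p0 + np.dot(V - p0, u) * u       # projection L_ij of V onto the LOR
    d = np.linalg.norm(V - L)            # distance from voxel center to LOR
    d_tof = np.linalg.norm(L - C)        # distance from L_ij to TOF center
    return np.exp(-0.5 * (d / sigma_p) ** 2) * np.exp(-0.5 * (d_tof / sigma_tof) ** 2)
```

A voxel lying on the LOR at the TOF kernel center receives the maximum weight of 1; the weight falls off with both distances.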

5.4.2.3 Phantom Experiment

PET measurements using the Gemini TF system were performed at the University of Penn-

sylvania using a 35 cm diameter cylindrical phantom (Figure 5.19). Six 10 mm diameter

spheres were placed in the phantom in a single axial plane 4.2 cm away from the central

plane. Within the plane, the spheres were arranged on an 8 cm-radius circle. The spheres

and the cylinder were filled with a solution of radioactive 18F. The activity was six times


more concentrated in the spheres than in the cylinder. The total activity was 6.4 mCi,

corresponding to a background activity concentration of 0.16 µCi/cc. The total scan time

was 5 min.

The images were reconstructed using 15 iterations of list-mode OSEM, with and without

TOF information. Twenty subsets were used for each iteration. The GPU-based image

reconstruction was compared against a CPU-based reconstruction performed at the Univer-

sity of Pennsylvania [119]. While the GPU used a radially-symmetric Gaussian kernel in

the projection and cubic voxels in the image representation (4.2.2), the CPU reconstruction

modeled the tracer spatial distribution as a sparse collection of Kaiser-Bessel (KB) blob

basis functions [120], and the projections as ideal line integrals. The Gaussian kernel and

the KB blobs are similar approaches: both use a kernel parametrized by the distance be-

tween the LOR axis and the center of the voxel. The differences between the two approaches

include the kernel used, the kernel spacing, and the fact that the Gaussian kernel directly

reconstructs image voxels, while the KB blob image must be converted to voxels for display.

The goal of this study is not to compare two image representations, but rather to inves-

tigate the feasibility of performing reconstruction of TOF PET data on a GPU in list-mode.

Therefore, the objective is to show that the improvement achieved by using TOF informa-

tion is consistent across both computing platforms. A comparison of the image quality on

GPU and CPU was presented in Section 5.2.

On the GPU, voxel sizes of 2 × 2 × 2, 4 × 4 × 4, and 8 × 8 × 8 mm³ were investigated. A

4 mm-FWHM Gaussian kernel was used in the projections. A post-reconstruction Gaussian

filter was also applied. The width of the filter was chosen to obtain image quality comparable

with the CPU implementation. A filter width of 2.1 mm FWHM was found to yield the

closest results. On the CPU, the blobs were arranged in an 8 mm body-centered cubic (BCC)

grid. In theory, 8 mm blob spacing is comparable to 4 mm voxels [120].

Both reconstructions were normalized using the same blank and transmission scans.

The blank scan was performed by rotating a positron emitting source around the gantry.

A transmission scan of the phantom was acquired on the Gemini TF system using X-ray

CT to obtain a map of the photon attenuation coecients. In the standard manner, the

attenuation values were subsequently rescaled for 511 keV photons [61]. An estimate of

the random coincidences was also produced by measuring delayed coincidence events within

the emission scan. The random coincidences estimate was smoothed using Casey's method

[58] to improve the SNR. A TOF scatter estimate was generated using the single-scatter

simulation method [121]. The ratio of the normalization over the transmission scan was

incorporated into the sensitivity map (3.1.3.1) as a multiplicative factor (3.3). The randoms


Figure 5.20: Phantom images reconstructed with GPU-based and CPU-based implementations, with and without TOF information. The voxels are 2 × 2 × 2 mm³.

and TOF scatter estimates were corrected for normalization and attenuation, and were then

used as additive terms in the forward projection, as described previously in (2.3).

The contrast recovery (CR), defined as the contrast as a percentage of the original ac-

tivity concentration ratio, was assessed in the reconstructed images. The sphere signal was

computed by averaging the voxel intensity in spherical ROIs for the six spheres. The back-

ground signal was evaluated similarly for six ROIs in a background slice axially opposite

to the sphere plane. The noise was approximated by the spatial variability (RMS) within

the background ROIs (5.3). The CR and noise were averaged over the six spheres present

in the phantom.

5.4.3 Results

5.4.3.1 Contrast vs. Noise

Figure 5.20 shows 2 mm-thick slices taken from the volume reconstructed with and without

TOF information, on the GPU (voxel representation) and the CPU platform (blobs repre-

sentation). All the images are shown for 15 iterations of list-mode OSEM with 20 subsets.

For all images, the pixel size is 2 × 2 mm². In particular, the blob-based images were also

converted to 2 mm voxels for display.


Figure 5.21: Phantom images reconstructed using TOF information on the GPU with varying voxel size.

The image sampling rate impacts the reconstructed image quality, as well as the process-

ing time. Figure 5.21 shows the same TOF dataset reconstructed on the GPU with three

different cubic voxel sizes: 2, 4, and 8 mm. While the 2 and 4 mm voxels result in similar

image quality, 8 mm voxels do not provide sufficient sampling and result in a loss of spatial

resolution.

Figure 5.22 displays the trade-off between the contrast and the noise at different itera-

tions for the GPU and CPU implementations, with and without TOF information. The use

of TOF information within the reconstruction (black and red curves) results in an increase of

the CR compared to non-TOF reconstruction (blue and purple curves), while the noise level

is comparable. While for the non-TOF dataset, GPU and CPU reconstructions resulted in

comparable behavior, the reconstruction of the TOF dataset presented some disagreement

between CPU and GPU implementations. These differences are unavoidable since the two

implementations have some key differences that are described in the next section. However,

the contrast vs. noise trade-off curves show that the improvement achieved by using TOF

information is consistent across GPU and CPU platforms.

5.4.3.2 Processing Time

The processing times for the GPU reconstruction are summarized in Table 5.3. The values

are quoted for one pass through one million events, not including the calculation of the

sensitivity map and the scatter and randoms estimates. Two graphics cards were used: a

GeForce 9800 GX2 and a GeForce 285 GTX.


Figure 5.22: CR vs. noise trade-off curve.

Table 5.3: Processing time as a function of image size for GPU-based list-mode reconstruction (per million prompts reconstructed).

GPU              Voxel size    TOF      non-TOF
GeForce 9800GT   2 mm          6.8 s    11.3 s
                 4 mm          2.3 s    4.9 s
                 8 mm          0.8 s    1.3 s
GeForce 285GTX   2 mm          3.3 s    6.5 s
                 4 mm          1.2 s    1.8 s
                 8 mm          0.3 s    0.5 s

5.5. SUMMARY 101

5.4.4 Discussion

Section 5.2 and Section 5.3 showed that GPUs could be employed for non-TOF list-mode

reconstruction. Furthermore, the GPU implementation was shown to produce images that

were not significantly different from those produced with an equivalent CPU implementation.

In this section, we demonstrate that GPUs can be used for list-mode TOF-PET

reconstruction.

Figure 5.22 showed some discrepancies between the GPU and CPU reconstructions.

These differences are unlikely to be caused by differences in image representation alone. For

the TOF reconstruction, the single scatter simulation estimate was stored using coarser TOF

bins on the GPU to make processing practical. Furthermore, the subsets were organized

chronologically within the GPU implementation while a geometrical ordering was applied for

the CPU implementation. Further dierences might exist between the two implementations

since the goal of this study was not to compare both implementations, but to verify the

feasibility of implementing list-mode TOF PET reconstruction on the GPU.

Three factors determine the speed of the reconstruction: the number of variables in

the image representation (voxels or blobs), the size of the projection footprint (how many

variables are accessed for each line projection), and the amount of computation involved in

the evaluation of the projection kernel. The rst two factors determine the number of voxels

(or blobs) processed, hence the amount of memory that must be accessed for each event

processed. The amount of computation required is a combination of all three factors. As an

example, the GPU reconstruction with a Gaussian projection kernel involves a large number

of small voxels in the image, a relatively large number of voxels per LOR (i.e. the TOR)

and a medium amount of computation. Therefore, adjusting a single parameter (such as in

Table 5.3) only partially characterizes the performance of the method.

Hence, a comparison between the computing performance of two methods that use different

representations of the image and have different projection footprints is difficult. Therefore,

further investigation would be required to better characterize the benefits of each

approach.

5.5 Summary

Three applications have been implemented based on the GPU-based line projection frame-

work introduced in Chapter 4. List-mode iterative reconstruction was performed on the

GPU for a high-resolution PET system with billions of LORs. The GPU framework allows


for a broad range of projection kernels, hence both a simple shift-invariant and a more

complex shift-varying model were studied. The latter model resulted in significant image quality

and accuracy improvements; however, the computation time increased tenfold. These two

examples show the flexibility of the framework.

In addition, the framework was also chosen to implement list-mode reconstruction for

TOF PET. We showed the feasibility of the approach. Further work is required to match

the projection models of both GPU and CPU implementations and establish a comparison

of processing time with commercially-available software.

Chapter 6

Bayesian Reconstruction of Photon

Interaction Sequences

6.1 Background

6.1.1 Motivation

Cadmium Zinc Telluride (CZT) is a semiconductor material that can be used for building

radiation detectors. As a relatively low-Z material, its photo-fraction is low compared to

scintillation crystals [16]. To preserve high photon detection efficiency, the geometry of our

system (described in 1.3.2) is designed such that 511 keV photons traverse a minimum of 4

cm-thick material. Still, a large fraction of all the photons undergoes Compton scatter in

the detectors (see Figure 1.8). Because the effective detection elements are small (1 × 5 × 1 mm³), the scattered photons usually escape into adjacent elements. On average, a 511 keV

photon deposits its energy in 2.2 detection elements.

To exploit the full potential of CZT detector modules, one major challenge needs to be

overcome. The image reconstruction must be able to use coincident events in which at least

one annihilation photon deposits its energy (511 keV) across multiple detection elements.

For the PET system described in 1.3.2, 93.8% of all the recorded coincident events for which

the summed energy is near 511 keV comprise at least one such multiple-interaction photon

event (MIPE). When MIPEs are used, high coincident photon sensitivity can be reached:

17% for 800 µCi at the center of the field of view (FOV) [32], a 16-fold increase compared

to using only events that deposit all their energy in a single detection element. However,

the ability to correctly position these events strongly determines the quality and accuracy

of the images obtained from a CZT-based PET system [49]. Determining the crystal of


104 CHAPTER 6. BAYESIAN SEQUENCE RECONSTRUCTION

entrance for MIPEs is ambiguous (see Figure 1.8). Hence, these events are at risk of being

erroneously assigned to an incorrect line-of-response (LOR), which in turn degrades spatial

resolution and image contrast [122,123].

6.1.2 Methods to Position Multiple Interaction Photon Events

Unlike standard PET detectors (2.1.1.2), the CZT cross-strip electrode design presented

in 1.3.2 can record the 3-D coordinates and energy deposition of individual interactions

for MIPEs. The system is able to distinguish the photons that deposit their energy in a

single detection element from those which deposit their energy in multiple detection elements

through multiple interactions. Positioning schemes have been devised to attribute a position

to MIPEs. These schemes can be broadly divided into three categories:

6.1.2.1 Initial Interaction Selection.

The MIPE position is selected from the finite set of all detected interactions. This class

of methods exploits some form of correlation between the order of the interactions and

properties of their energy and position. Techniques previously investigated include choosing

the interaction with the largest or second-largest signal [124,125], the smallest depth of

interaction [125], or the minimum distance to the other coincident photon [49]. For sequences of

more than two interactions, the order of subsequent interactions is not recovered with those

methods. Several techniques have been developed specically for positioning photons that

deposit energy in exactly two detectors. One method is based exclusively on the energies,

which for 511 keV photons is equivalent to assuming that the initial interaction is the most

energetic [126]. When one of the annihilation photons in a coincident pair scatters once, both

possible LORs can be used in the image reconstruction [127].

6.1.2.2 Unconstrained Positioning.

The positioning problem can be relaxed by allowing the position of MIPEs to be assigned to

any location within the detection volume. For example, the energy-weighted mean scheme

[124] combines the interaction locations linearly using the energy as weight. This is the only

positioning method available for conventional PET systems based on four-channel block

detectors. Because block detectors use a high degree of light and electronic multiplexing,

they cannot position individual interactions within a MIPE.
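As an illustration, the energy-weighted mean scheme reduces to a single weighted average (function name ours):

```python
import numpy as np

def energy_weighted_position(positions, energies):
    """Energy-weighted mean position of a multiple-interaction photon
    event: a linear combination of the interaction locations, weighted
    by the deposited energies [124]."""
    positions = np.asarray(positions, float)   # shape (n, 3)
    energies = np.asarray(energies, float)     # shape (n,)
    return energies @ positions / energies.sum()
```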

6.2. THEORY 105

6.1.2.3 Full Sequence Reconstruction.

The crystal of entry for MIPEs can also be estimated by reconstructing the complete se-

quence of interactions. A number of metrics have been investigated in order to penalize

sequences that violate the kinematics of Compton scatter. These techniques are based on

testing the consistency of redundant information. For example, the cosine of the scatter

angle can be computed using the Compton formula, provided that the order of the sequence

of interactions and the annihilation photon energy are known. This quantity can also be

computed directly from the interaction locations. The sum of the squares of the differences

of the scatter angle cosines [128,129] can be used as a metric to assess the kinematic validity

of a given sequence. This scheme can be refined by weighting the summands by the

positional and energy measurement uncertainties. The weighted sum of the absolute differences

between the scatter angles computed from trigonometry and from Compton kinematics is

another option for forming the objective [130].
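The unweighted form of this consistency metric can be sketched as follows (a sketch under our own conventions, assuming a 511 keV incident photon; function names are illustrative):

```python
import numpy as np

MEC2 = 511.0  # electron rest energy in keV

def kinematic_inconsistency(positions, deposits, E0=511.0):
    """Sum of squared differences between the scatter-angle cosine from
    Compton kinematics and from the interaction geometry, for one
    candidate ordering (unweighted form of the metric in [128, 129])."""
    positions = [np.asarray(p, float) for p in positions]
    total, E = 0.0, E0
    for k in range(1, len(positions) - 1):
        E_out = E - deposits[k - 1]                       # energy after scatter k
        cos_kin = 1.0 - MEC2 * (1.0 / E_out - 1.0 / E)    # Compton formula
        v1 = positions[k] - positions[k - 1]
        v2 = positions[k + 1] - positions[k]
        cos_geo = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
        total += (cos_kin - cos_geo) ** 2
        E = E_out
    return total
```

For a 511 keV photon, a 90° scatter deposits 255.5 keV, so a right-angle track with that deposit scores (near) zero inconsistency, while a collinear track with the same deposit does not.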

The validity of an ordered sequence of interactions can also be measured based on physical

considerations, such as the probability that the annihilation photon follows a particular

trajectory realization. The Klein-Nishina differential cross-section [131] is one component of

the trajectory probability [127]. Other components, such as the photoelectric cross section,

also contribute to the trajectory probability and can be included.
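For reference, the Klein-Nishina differential cross-section itself is straightforward to evaluate (a sketch; the constant and units are our choices):

```python
import numpy as np

RE2 = 7.940787e-26  # classical electron radius squared, cm^2

def klein_nishina(E_keV, theta):
    """Klein-Nishina differential cross-section dσ/dΩ (cm²/sr) for a
    photon of energy E scattering through angle theta; one ingredient
    of the trajectory probability [131]."""
    k = E_keV / 511.0
    P = 1.0 / (1.0 + k * (1.0 - np.cos(theta)))  # ratio E'/E after scatter
    return 0.5 * RE2 * P ** 2 * (P + 1.0 / P - np.sin(theta) ** 2)
```

At theta = 0 the expression reduces to the Thomson forward value r_e², and for 511 keV photons the distribution is strongly forward-peaked.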

The sequence of interactions can also be reconstructed backwards [132]. Instead of

performing a full search over the combinatorial space of all the sequences, the method

recovers the complete sequence of interactions sequentially by first identifying the photo-

electric interaction, whose energy is assumed to be independent of the track properties, and

then retracing the interaction track backwards.

6.2 Theory

We investigated a new sequence reconstruction technique which optimizes agreement with

the measurements while also accounting for the a priori probability of the photon trajectory

[49]. Bayesian estimation provides a natural framework to combine these two goals. The

likelihood component can deal with the consistency of the measurements, while a prior

probability distribution can describe the total trajectory cross-section. The product of

the likelihood with the prior distribution yields the maximum a posteriori (MAP) rule.

Using a statistical framework has the advantage that measurement noise can be explicitly

characterized and its effects accounted for.

One issue often reported with interaction sequence reconstruction methods is that the

106 CHAPTER 6. BAYESIAN SEQUENCE RECONSTRUCTION

Figure 6.1: The true interaction position ri of the energy deposition is quantized to the nearest intersection of the electrodes (di). The quantization error is qi = ri − di. Electronic signals are read out from the anodes and cathodes involved. Effective detection elements are represented by dotted rectangles.

exact location of each interaction within the detection element is uncertain. It is then assumed that the interactions occur at the center of the detection elements, which induces errors in the calculation of the objective. To mitigate this problem, we consider a stochastic Bayesian estimation approach. For every possible sequence ordering, the objective is averaged over many possible interaction paths in which the interaction locations are sampled from 3D uniform distributions over the corresponding detection elements.

Although the general formalism for interaction sequence determination derived in this

work applies in principle to any detector, we focus on its application to the CZT cross-strip

module described in Section 1.3.2.

6.2.1 Maximum-Likelihood

The ML criterion is used to seek the sequence of interactions that has the greatest statistical

consistency with the observations. For an event comprising N interactions, N! hypotheses are tested. Each hypothesis describes a possible sequence of N interactions. The ML

procedure evaluates the likelihood of all the hypotheses and selects the one that has the

greatest likelihood.

Due to the pixelated nature of the detector, the interaction position is quantized to

the center of the nearest detection element. The exact position ri of interaction i can be

expressed as the sum of the detection element center di and the position quantization error

qi, also referred to as the sub-voxel position (Figure 6.1):

ri = di + qi. (6.1)

Let PN denote the set of all the permutations with N elements. PN is a finite set of cardinality N!. We use s = (s1, . . . , sN) to refer to a particular element of PN. For example, s = (3, 1, 2) is an element of P3. The set of all the possible sequences of N interactions can be mathematically represented by PN.
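The finite search space PN can be enumerated directly; a minimal sketch in Python (illustrative only, not part of the original GPU implementation):

```python
from itertools import permutations

# Enumerate P_N, the set of all orderings of N recorded interactions.
# Each tuple s = (s_1, ..., s_N) is one hypothesis for the true sequence.
N = 3
P_N = list(permutations(range(1, N + 1)))

print(len(P_N))          # N! = 6 hypotheses
print((3, 1, 2) in P_N)  # True: the example sequence from the text
```

For N = 5 the same enumeration already yields 120 hypotheses, which is why an exhaustive search becomes impractical for large N.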

6.2. THEORY 107

The recorded energy deposition locations are numbered j = 1, . . . , N, where j is related to i, the true interaction number, by a permutation s ∈ PN. The mapping of index j to the ith interaction is arbitrary and thus does not represent the true order of the interactions. The measurement Ej of the energy deposited at the jth location rj in the detector is subject to zero-mean Gaussian noise ni with variance Σi²,

Ej = εi + ni,   ni ∼ N(0, Σi²),   j = si   (6.2)

where εi denotes the true energy deposition for the ith interaction.

The key to formulating an ML objective is that the energy deposited during a Compton scatter interaction can be computed analytically. Given any permutation s ∈ PN of the order of the interactions, ei(s) is the hypothetical value of the photon energy between the ith and the (i + 1)th interactions, computed from Compton kinematics.

Furthermore, events whose summed interaction energy is not in the energy window are discarded, since these might have been deflected by scatter in the tissue. Therefore, it can

be assumed that the last interaction in the sequence is a photoelectric interaction.

The hypothetical energy deposited by the ith interaction is denoted εi(s). Conservation of energy implies

ei−1(s) = ei(s) + εi(s). (6.3)

The first hypothetical energy deposition can be computed based on the Compton formula

ε1(s) = e0 − e0 / [1 + (e0 / (me c²)) (1 − cos θ1)],   cos θ1 = ⟨rs1 − p | rs2 − rs1⟩ / (‖rs1 − p‖ · ‖rs2 − rs1‖),   (6.4)

where me denotes the mass of the electron and c the speed of light in vacuum. In this

expression, it is assumed that the energy e0 of the incoming photon and the position p of

the other coincident photon in the pair are known. For PET, the incoming energy e0 is set to

511 keV. When the other coincident photon also involves multiple interactions, p is estimated

roughly, for example, by computing the center of mass. The error in estimating p with such

a method is on the order of the distance among interactions within the same cluster, which

is in general much smaller than the distance between interaction clusters.
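The geometric part of (6.4) can be sketched numerically; the helper names and the example coordinates below are hypothetical, and the units are keV and mm:

```python
import numpy as np

ME_C2 = 511.0  # electron rest energy, keV

def cos_scatter_angle(p, r1, r2):
    """Cosine of the scatter angle at r1, for a photon travelling p -> r1 -> r2
    (the inner-product expression of eq. 6.4)."""
    u = r1 - p
    v = r2 - r1
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def compton_deposit(e_in, cos_theta):
    """Energy deposited in a Compton interaction, from the Compton formula:
    e_out = e_in / (1 + (e_in / me c^2)(1 - cos theta))."""
    e_out = e_in / (1.0 + (e_in / ME_C2) * (1.0 - cos_theta))
    return e_in - e_out

# A 511 keV photon from p scattering at r1 towards r2 (hypothetical geometry)
p  = np.array([0.0, 0.0, 0.0])
r1 = np.array([0.0, 0.0, 50.0])
r2 = np.array([10.0, 0.0, 60.0])
cos_t = cos_scatter_angle(p, r1, r2)
eps1 = compton_deposit(511.0, cos_t)   # hypothetical first energy deposition
```

The recursion (6.5) reuses the same two helpers, with the previous two interaction positions in place of p and r1.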

In a similar fashion, for i ≤ N − 1, εi(s) can be evaluated recursively for any permutation s:

εi(s) = ei−1 − ei−1 / [1 + (ei−1 / (me c²)) (1 − cos θi)],   cos θi = ⟨rsi − rsi−1 | rsi+1 − rsi⟩ / (‖rsi − rsi−1‖ · ‖rsi+1 − rsi‖).   (6.5)

For i = N , the energy of the annihilation photon is fully deposited through photoelectric

108 CHAPTER 6. BAYESIAN SEQUENCE RECONSTRUCTION

interaction. Therefore the last energy deposition is

εN (s) = eN−1(s). (6.6)

The correct sequence ssol satisfies εi(ssol) = εi for all i. However, only a noisy measurement of εi is available (6.2); therefore the sequence ssol that maximizes the likelihood of εi(ssol) = Esi, evaluated at s = ssol, is chosen instead. Assuming the positions qi are known, the likelihood Lq is then our objective function and can be expressed as

Lq(s) = P(Es1 = ε1(s), . . . , EsN = εN(s) | s)   (6.7)

      = ∏_{i=1}^{N} [1 / (√(2π) Σi)] exp(−(Esi − εi(s))² / (2 Σi²)),   (6.8)

because ni is a Gaussian-distributed random variable with variance Σi². The hypothetical energy depositions εi(s) are computed using (6.4), (6.5) and (6.6). The variance Σi² is a function of the energy deposition εi. A model of the energy resolution of the detector is used and a hypothetical variance Σi² is computed from the hypothetical energy deposition εi(s). In the evaluation of this method, we used Σi² = Σelec² + Σdet²(εi). This expression is the quadrature sum of white electronics noise Σelec² and detector-specific noise. We assumed 6 keV FWHM for Σelec and used a linear model for the detector noise as a function of the energy. The linear coefficient was adjusted to fit the energy resolution for 511 keV photons. In most of this chapter, the energy resolution was assumed to be 2.5% at 511 keV, but other energy resolutions were also investigated (Table 6.3).
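The Gaussian likelihood of (6.8), with an energy-dependent variance model of the kind just described, can be sketched as follows (illustrative Python; `sigma_model` and the 2.2% linear slope are assumptions for the sketch, not the fitted coefficient of the thesis):

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / 2.35  # Gaussian FWHM -> standard deviation

def sigma_model(eps, elec_fwhm=6.0, det_slope=0.022):
    """Hypothetical noise model: quadrature sum of white electronic noise
    (6 keV FWHM) and a detector term linear in the deposited energy, in keV."""
    var = (elec_fwhm * FWHM_TO_SIGMA) ** 2 + (det_slope * eps * FWHM_TO_SIGMA) ** 2
    return np.sqrt(var)

def log_likelihood(measured, hypothetical):
    """Log of the likelihood (6.8) for one ordering hypothesis.
    measured[i] is E_{s_i}; hypothetical[i] is eps_i(s) from (6.4)-(6.6)."""
    hyp = np.asarray(hypothetical, dtype=float)
    sigma = sigma_model(hyp)
    resid = np.asarray(measured, dtype=float) - hyp
    return float(np.sum(-0.5 * (resid / sigma) ** 2
                        - np.log(np.sqrt(2.0 * np.pi) * sigma)))
```

In the ML scheme, this quantity is evaluated for every s ∈ PN and the ordering with the largest value is retained.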

In practice, the sub-voxel positions qi are unknown. In order to compute the objective, they can be substituted by their expected value E(qi). For a uniform distribution, E(qi) = 0; therefore, the interaction locations are assumed to be at the detection element center di. Because the scatter angle θi is sensitive to the precise location of the interactions, significant errors in the objective can result from position uncertainty (Figure 6.2).

Stochastic optimization can deal with this uncertainty in the problem parameters by

seeking an optimal solution to the expectation of the objective function (which is modeled

as a function of random variables). The objective expectation can be computed via Monte-

Carlo integration by sampling the parameter distributions. This framework was applied to

sequence reconstruction by assuming that the sub-voxel interaction locations q are uniformly

distributed within each detection element. The expectation over q of the likelihood function

6.2. THEORY 109

Figure 6.2: Effect of detection element size: an interaction has occurred in each of the two detection elements, delineated by dashed rectangles. The sub-voxel positions q1 and q2 of each interaction are unknown. If the interactions are assumed to occur at the center of the detection element, one obtains the average trajectory (dotted line). If sub-voxel sampling is used instead, q1 and q2 are generated randomly within the detection elements and the objective is averaged over many possible trajectories, two of which are shown (solid red line). The scatter angle θ is subject to large variations depending on the position of the interaction within each detection element.

was calculated using Monte-Carlo integration

L(s) = E(Lq(s))   (6.9)

and maximized over the finite set of all possible sequence orderings.

A GeForce 9800 GX2 (NVIDIA) graphics processing unit (GPU) and the CUDA li-

brary [111] were used to accelerate the calculation of the expected value by Monte-Carlo

integration. Processing each MIPE involved computing the likelihood objective over 16 384realizations. A Mersenne-Twister random number generator was executed on the GPU to

randomly sub-sample the detection elements using a uniform distribution. The computation

was decomposed into 32 blocks of 128 threads, resulting in a total of 4 096 threads each pro-

cessing four realizations. The number of realizations was chosen to maximize performance,

and can be reduced for faster processing. The 16 384 objective values were subsequently

averaged on the CPU.
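The stochastic objective of (6.9) is a plain Monte-Carlo average; a CPU analogue of the GPU computation might look like this (illustrative Python; `objective_fn` and its interface are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility of the sketch

def expected_objective(objective_fn, centers, sizes, n_samples=16384):
    """Monte-Carlo estimate of E_q[L_q(s)] (eq. 6.9): average the objective over
    sub-voxel positions q_i drawn uniformly within each detection element.
    `centers` is an (N, 3) array of element centers d_i; `sizes` holds the
    element dimensions (e.g. 1 x 5 x 1 mm^3). `objective_fn` maps sampled
    interaction positions r_i = d_i + q_i to a likelihood value."""
    centers = np.asarray(centers, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    total = 0.0
    for _ in range(n_samples):
        q = rng.uniform(-0.5, 0.5, size=centers.shape) * sizes  # sub-voxel offsets
        total += objective_fn(centers + q)
    return total / n_samples
```

On the GPU, the same average was computed by distributing the realizations over thread blocks; the sketch above only conveys the estimator, not the parallel decomposition.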

6.2.2 Maximum A Posteriori

The quality of the estimation can be enhanced by incorporating prior knowledge into the

objective. This particularly helps when the measurements are noisy and unreliable. To

estimate the order of the interactions in a MIPE, a prior probability distribution is obtained

110 CHAPTER 6. BAYESIAN SEQUENCE RECONSTRUCTION

by computing the total cross-section for all N ! trajectories. This cross-section is based on

the physics of γ-ray transport. Before the energy measurements are made, the relative a

priori distribution of the sequence of interactions can be inferred based on the Klein-Nishina

differential cross-section for Compton scatter [131] and the photoelectric absorption cross-section.

The prior probability distribution Pprior(s) comprises three components: the probability Pprop(s) that the annihilation photon travels the distance linking two successive interactions without interacting with matter; the probability Pcomp(s) that the photon Compton scatters N − 1 times, with angle θi, each interaction being localized within a small control volume δVi centered on ri; and the probability Pphot(s) that the photon is absorbed by photoelectric effect within a small control volume δVN centered on rN.

The first component Pprop(s) can be expressed as

Pprop(s) = ∏_{i=1}^{N−1} exp(−µtot(ei) ‖rsi+1 − rsi‖)   (6.10)

where µtot is the linear photon attenuation coefficient in the detector material [133], a function of ei, the energy of the photon after interaction i, which is computed from Compton kinematics using (6.4) and (6.5).

The second component Pcomp(s) can be obtained from the Compton scatter cross-section

Pcomp(s) = ∏_{i=1}^{N−1} [µc(ei) / µkn(ei)] ∫_{φ=0}^{2π} dσkn/dθi,   (6.11)

where

∫_{φ=0}^{2π} dσkn/dθi = ∫_{φ=0}^{2π} (dσkn/dΩi) (dΩi/dθi)   (6.12)

                      = ∫_{0}^{2π} (dσkn/dΩi) sin θi dφ   (6.13)

                      = 2π sin θi (dσkn/dΩi)   (6.14)

is the differential Compton cross-section per unit of angle, which can be computed using the

Klein-Nishina formula [131]:

dσkn/dΩi ∝ (ei+1/ei) − (ei+1/ei)² sin² θi + (ei+1/ei)³.   (6.15)

6.2. THEORY 111

Figure 6.3: Linear Compton scatter attenuation coefficient (mm⁻¹) as a function of photon energy (keV) for CZT. Unlike the XCOM photon cross-section database [133], the Klein-Nishina formula [131] does not accurately predict the Compton cross section for low (≤ 100 keV) energies.

However, the Klein-Nishina model assumes the electron is free and at rest. As a result, σkn is not accurate at low photon energies. For this reason, (6.11) was rescaled by the ratio of the Compton scatter attenuation coefficient µc(ei) obtained from published databases [133]

and that obtained by integrating the Klein-Nishina scatter cross-section over all angles:

µkn(ei) = ∫_{φ=0}^{2π} ∫_{θ=0}^{π} dσkn.   (6.16)

Figure 6.3 compares both attenuation coefficients as a function of photon energy for CZT.
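The denominator µkn of the rescaling ratio can be obtained by numerically integrating the Klein-Nishina expression over the sphere; a minimal sketch (illustrative Python, valid only up to the multiplicative constant dropped in (6.15); the database coefficient µc would still come from tabulated data such as XCOM):

```python
import numpy as np

ME_C2 = 511.0  # electron rest energy, keV

def kn_dsigma_domega(e_kev, theta):
    """Klein-Nishina differential cross-section (eq. 6.15), up to a constant.
    The ratio e_out/e_in follows from the Compton relation at angle theta."""
    r = 1.0 / (1.0 + (e_kev / ME_C2) * (1.0 - np.cos(theta)))
    return r - r**2 * np.sin(theta) ** 2 + r**3

def mu_kn(e_kev, n_theta=2000):
    """Angle-integrated Klein-Nishina cross-section (eq. 6.16), up to the same
    constant: 2*pi from the phi integral, midpoint rule in theta, with
    dOmega = sin(theta) dtheta dphi."""
    dtheta = np.pi / n_theta
    theta = (np.arange(n_theta) + 0.5) * dtheta
    integrand = kn_dsigma_domega(e_kev, theta) * np.sin(theta)
    return 2.0 * np.pi * float(np.sum(integrand)) * dtheta
```

Because the unknown constant cancels in the ratio µc(ei)/µkn(ei) only if µc is expressed in the same units, a practical implementation would carry the full Klein-Nishina prefactor; it is omitted here for brevity.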

The third component can be calculated using a model of the probability of photoelectric

interaction, for an incoming photon energy eN−1. This prior probability is computed for an

arbitrarily small control volume δVN, and is proportional to

Pphot(s) ∝ µphot(eN−1).   (6.17)

The resulting prior distribution is the product of the three components:

Pprior(s) = Pprop(s)× Pcomp(s)× Pphot(s). (6.18)

The MAP objective is then formed by multiplying the likelihood objective with the a priori

probability distribution

PMAP(s) = Lq(s)^(1−β) × Pprior(s)^β,   (6.19)

where β is a parameter weighting the prior probability against the likelihood. The ML estimate is obtained when β is zero.
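Taking logarithms, (6.19) becomes a convex blend of log-likelihood and log-prior, maximized over the N! hypotheses; a minimal sketch (illustrative Python; the dictionary interface is hypothetical):

```python
def map_objective(log_likelihood, log_prior, beta=0.85):
    """Log of the MAP objective (eq. 6.19): (1-beta)*log L_q + beta*log P_prior.
    beta = 0 reduces to ML; beta = 1 uses the prior alone."""
    return (1.0 - beta) * log_likelihood + beta * log_prior

def best_sequence(hypotheses, beta=0.85):
    """Pick the permutation with the highest MAP objective.
    `hypotheses` maps each ordering s to its (log_likelihood, log_prior) pair."""
    return max(hypotheses, key=lambda s: map_objective(*hypotheses[s], beta=beta))
```

The value beta = 0.85 used as a default here anticipates the optimum reported in Section 6.4.1.

```python
# Two competing orderings with hypothetical log-scores
hyps = {(1, 2): (-1.0, -2.0), (2, 1): (-5.0, -1.0)}
best_sequence(hyps)  # the prior term dominates at beta = 0.85
```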


Algorithm 6.1 Simplified schematic of the MAP positioning scheme.

For each event:
    Form the set of all permutations with N elements
    For each permutation:
        Generate 16 384 realizations of the positions in the detection elements
        For each realization:
            Compute the hypothetical energy depositions
            Compute the hypothetical energy variance
            Compute the hypothetical scatter angles
            Evaluate the likelihood
            Compute the prior probability
            Evaluate the MAP objective
        Average the MAP objective
    Select the permutation with maximum objective value
    Position MIPE at the location of the estimated initial interaction

A description of the steps involved in the positioning algorithm is provided in Algorithm 6.1.

6.3 Evaluation Methodology

6.3.1 Simulation of a CZT PET System

We used GRAY, a fast Monte-Carlo package developed in our group [113], to simulate a

PET system based on CZT cross-strip electrode modules (Section 1.3.2). The photoelectric effect

and Compton scattering are the only physical processes included in GRAY. GRAY uses

published databases [133] for computing the interaction cross-sections. The Compton scatter

angle is generated according to the Klein-Nishina formula [131]. Accurate time, energy and

position values of individual interactions in the detector material are stored in list-mode.

After the simulation, the position of each event was processed to account for the limited spatial resolution of the system. Within one module, the spatial coordinates were binned to a grid of 1 × 5 × 1 mm³ effective detection elements. On the rare occasion when two interactions occurred in the same detection element, they were merged and their energies summed. The energy of each individual interaction was blurred by additive Gaussian noise with variance Σi² = (1/2.35²) × ((6 keV)² + (2.2% × εi)²). The order of the interactions was

also randomly permuted, i.e. j = si where s is a random permutation in PN . A lower

energy detection threshold of 10 keV was applied. The time stamp was blurred using an 8

ns FWHM Gaussian noise source [134]. Consistent with maximizing NEC for rat and mouse

phantoms [32], an 8 ns time window was applied for coincidence gating.
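The detector-response steps applied to the GRAY output can be sketched per event as follows (illustrative Python; the function interface is hypothetical, the numerical constants are those given above):

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for a reproducible sketch

def blur_event(energies_kev, times_ns):
    """Apply the post-simulation detector model (a sketch): Gaussian energy
    blur with variance (1/2.35^2)((6 keV)^2 + (2.2% * eps)^2), a random
    permutation of the interaction order (j = s_i), a 10 keV lower energy
    threshold, and an 8 ns FWHM Gaussian time-stamp blur."""
    eps = np.asarray(energies_kev, dtype=float)
    sigma = np.sqrt((6.0 ** 2 + (0.022 * eps) ** 2) / 2.35 ** 2)
    blurred = eps + rng.normal(0.0, sigma)          # per-interaction energy blur
    order = rng.permutation(len(eps))               # random permutation s in P_N
    blurred = blurred[order]
    t = np.asarray(times_ns, dtype=float)[order]
    t = t + rng.normal(0.0, 8.0 / 2.35, size=len(t))  # 8 ns FWHM time blur
    keep = blurred >= 10.0                          # lower detection threshold
    return blurred[keep], t[keep]
```

The spatial binning to 1 × 5 × 1 mm³ elements and the merging of same-element interactions are omitted from the sketch.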


6.3.2 Positioning Algorithms and Figures of Merit Used

For evaluation, the performance of four MIPE positioning schemes was compared:

1. Initial Interaction (II): The interaction with the smallest non-blurred time stamp was selected. This ideal positioning scheme provides the best achievable performance for any positioning algorithm. Because of time resolution limitations, it is only available in Monte-Carlo simulations.

2. Maximum A Posteriori (MAP): The full sequence was reconstructed using the methods

described in sections 6.2.1 and 6.2.2. The event was positioned at the location of the

estimated first interaction.

3. Energy-Weighted Mean (EWM): The event was positioned at the energy-weighted

mean position of the interactions. This scheme is the only one available for con-

ventional PET block detectors, because they cannot position individual interactions

within a MIPE.

4. Minimum Pair Distance (MPD): First, both coincident events are roughly positioned

(for example, using EWM). Then, the interaction closest to the rough location of the other coincident photon event is selected. This scheme is based on the tendency of 511 keV photons to scatter forward, at small angles. Unlike MAP, it does not use the energy measurements.

Positioning only single-interaction photoelectric events has not been considered because,

for the system considered in this study, those only represent a small fraction (6.2%) of all

coincidence photon events.

To identify the initial interaction with MAP, the full sequence of interactions is reconstructed. Unlike MPD, where the search space grows linearly with N, the search space in MAP grows super-exponentially. In particular, for N ≥ 5 the size of the combinatorial search space (N!) is greater than 60. Therefore, the MPD scheme was used instead of MAP for identifying the initial interaction for N ≥ 5. In addition, for large N, the energy depositions in the event are small, which means most Compton scatter interactions have small angles. Hence, the MPD method can recover the first interaction with high probability when N ≥ 5.

The simplest measure of the quality of the positioning is the recovery rate, defined as the fraction of all the processed single-photon events for which the first interaction is correctly identified. This figure of merit is not applicable to unconstrained positioning methods such as EWM. The recovery rate was evaluated in a variety of situations: for β varying between



Figure 6.4: Depiction of phantoms used in the quantitative evaluation of the positioning methods. (a) Three beams, with incident angles 0, 30 and 60 deg., were simulated. (b) A contrast phantom, consisting of a 2.5 cm-radius, 6 cm-long cylinder filled with a warm solution of activity, in which were placed five hot spheres of diameters 1, 1.5, 2, 4, and 8 mm. The ratio of the activity concentration in the hot spheres to that in the warm background cylinder was 10. (c) A hot-sphere resolution phantom, consisting of four sphere patterns, all in the same central plane. The spheres extended to the edge of the 8 × 8 × 8 cm³ FOV and their diameters were 1, 1.25, 1.5, and 1.75 mm. The spacing between the spheres' centers was twice their diameter.

zero and one (defined in (6.19)); for stochastic and deterministic objectives; for MIPEs with the number of interactions varying between one and six; and for varying energy resolution (0.5, 2.5 and 12% FWHM at 511 keV) and spatial resolution (1 × 1 × 1 and 1 × 5 × 1 mm³ detection elements).

The collimated, single-photon, 1-D point-spread function (PSF) was measured for three

incident angles (0, 30 and 60 deg) and for all four positioning methods (II, MAP, MPD,

and EWM). An infinitely thin needle beam of 511 keV photons was simulated in GRAY

and aimed at the center of a detection element in the middle of the panel (Figure 6.4).

The events were positioned and their transverse coordinate (along ey) histogrammed. The

axial coordinate (along ez) was also histogrammed for the normal (0 deg) beam. In order to assess the extent of back-scatter, and to investigate depth-dependent effects, 2-D PSFs were

also generated by histogramming the estimated event positions along both ex and ey.

A contrast phantom (Figure 6.4) was simulated to assess the quantitative contrast recovery. The phantom was composed of a 2.5 cm-radius, 6 cm-long cylinder, filled with a warm solution of activity, in which were placed five hot spheres. The spheres were centered on the central axial plane and their diameters were 1, 1.5, 2, 4, and 8 mm. The ratio of the activity concentration in the hot spheres to that in the warm background cylinder was 10. The phantom had a total of 800 µCi, and five seconds of acquisition time was simulated,


Figure 6.5: Success rate in positioning the first interaction with MAP as a function of the parameter β (6.19).

yielding 14.6 million coincident events. List-mode 3D-OSEM, with 10 million events per

subset, was used for the reconstruction [107]. Attenuation correction was implemented by

calculating analytically the absorption of 511 keV photons through a cylinder of water of

known dimensions. The contrast was measured in the reconstructed image as a function

of iteration number. The mean reconstructed activity was measured in the hot spheres

using spherical 3-D regions of interest (ROIs). The background activity was evaluated by

averaging the reconstructed intensity in two cylindrical ROIs placed off the central axial

plane. The noise was approximated by the spatial standard deviation in the background

ROI, normalized by the mean background intensity (as defined previously in (5.3)). The

peak value of the contrast-to-noise ratio (CNR) was computed over all the iterations.
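The ROI-based figures of merit can be sketched as follows (illustrative Python; the exact CNR convention of the thesis is not restated here, so the (contrast − 1)/noise form below is an assumption):

```python
import numpy as np

def roi_figures(image, sphere_mask, bg_mask):
    """Contrast, noise, and CNR for one hot-sphere ROI (a sketch).
    Contrast: mean sphere ROI over mean background ROI.
    Noise: background spatial standard deviation, normalized by the
    background mean (as in eq. 5.3)."""
    s = float(image[sphere_mask].mean())   # mean reconstructed sphere activity
    b = float(image[bg_mask].mean())       # mean background activity
    noise = float(image[bg_mask].std()) / b
    contrast = s / b
    cnr = (contrast - 1.0) / noise         # assumed contrast-to-noise convention
    return contrast, noise, cnr
```

Applied to each image iteration, these quantities trace out the contrast-versus-noise curves of Figure 6.9; the peak CNR is then taken over iterations.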

A high-resolution sphere phantom (Figure 6.4) was also simulated to investigate the effects

of the positioning algorithm on image resolution. The phantom comprised four quadrants of

spheres in air, all in the central axial plane, placed all the way to the edge of the 8×8×8 cm3

system FOV. The spheres were 1, 1.25, 1.5, and 1.75 mm in diameter. Their centers were

placed twice their diameters apart. The phantom had a total of 800 µCi, and five seconds of

acquisition time was simulated, yielding 27.2 million coincident events. The reconstructed

FWHM sphere size was measured by fitting a sum of Gaussians with an offset to 1-D intensity profiles through the reconstructed image. Note that the 1 mm spheres were too small relative

to the reconstruction voxel size for a reliable measurement of their FWHM size.


6.4 Results

6.4.1 Recovery Rate

The recovery rate is the fraction (%) of all the processed single-photon events for which the first interaction is correctly identified. All recovery rates were evaluated on at least 10 million events, and are subject to statistical measurement error below 0.02%. Some positioning methods, such as EWM, do not identify the first interaction, but rather estimate the position directly. The recovery rate is undefined for such methods.

For the MAP positioning scheme, the recovery rate is shown as a function of β, for β varying between zero and one (Figure 6.5). For the contrast phantom, the highest recovery rate (85.2%) is obtained for β = 0.85. The use of prior information increases the quality of the estimation: for β = 0 (no prior), the first interaction is correctly identified in only 83.5% of the single photons. The worst performance is reached for β = 1, when only the prior probability is optimized. Yet, even in this case, the recovery rate (74.3%) is still larger than for MPD (69.9%).

The recovery rate varies with the photon angle of incidence (Table 6.1). MAP is most

challenged by large photon incident angles (such as 60 deg), where the photon is more likely

to interact with multiple panels in the box-shaped PET system. The MPD scheme performs

best for photons impinging normally on the detector. For a realistic set-up (such as the

contrast phantom, in which the activity extends across the entire axial FOV), the fraction of mispositioned events with MAP is a factor of two lower than with MPD: 14.8% versus 30.6%.

The recovery rate also depends upon the number of times the annihilation photon interacts with the detector (Table 6.2). MAP's ability to find the initial interaction is not

substantially degraded by an increasing number of interactions. The second column of Table 6.2 shows that MIPEs with two interactions are the most challenging to sequence for

both methods in terms of accuracy. In addition to identifying the initial interaction, MAP

also has the ability to recover the full sequence of interactions, which is useful in certain

applications [135,136] (Table 6.2, third row).

The specifications of the detector technology, including the energy resolution and intrinsic spatial resolution, affect the performance of the MAP scheme (Table 6.3). For MAP, higher

detector spatial and energy resolution increases the fraction of MIPEs that are correctly

positioned. Also, the recovery rate is less sensitive to the energy resolution than to the

detection element size.

The estimation procedure uses stochastic optimization via sub-voxel sampling to account


Table 6.1: Recovery rate (%) for MAP and MPD positioning, measured on four simulated datasets.

Positioning method        Beam 0 deg.   Beam 30 deg.   Beam 60 deg.   Contrast phantom
MAP (1st interaction)     84.3          86.2           83.6           85.2
MPD (1st interaction)     76.6          71.9           70.2           69.4

Table 6.2: Recovery rate (%) for MAP and MPD positioning, as a function of the number of interactions. CS: Compton scatter. PE: photoelectric.

Positioning method        PE only   1 CS+PE   2 CS+PE   3 CS+PE   4 CS+PE   5 CS+PE   Global
MAP (1st interaction)     100       76.7      85.0      84.8      –∗        –∗        85.2∗
MPD (1st interaction)     100       52.1      64.1      75.6      82.4      86.4      69.4
MAP (full sequence)       100       76.7      78.3      67.0      –         –         77.6†

∗ For N ≥ 5, the first interaction is estimated using MPD because the large size of the combinatorial search space (≥ 60) makes the identification of the correct sequence computationally impractical.
† Computed only from events with N ≤ 4 interactions.

Table 6.3: Recovery rate (%) for MAP and MPD positioning, as a function of the detection element size and energy resolution.

                          1 × 1 × 1 mm³              1 × 5 × 1 mm³
Positioning method        0.5%†   2.5%†   12%†       0.5%†   2.5%†   12%†
MAP (1st interaction)     90.0    89.6    88.1       85.5    85.2    84.5
MPD (1st interaction)     69.4∗                      69.4∗

∗ The MPD scheme does not use energy information.
† FWHM at 511 keV.

Table 6.4: Recovery rate (%) for MAP using stochastic and deterministic objectives. CS: Compton scatter. PE: photoelectric.

Interaction position      PE only   1 CS+PE   2 CS+PE   3 CS+PE   Global
Sub-voxel sampling        100       76.7      85.0      84.8      85.2
Voxel center              100       68.2      75.9      72.5      78.8


for the geometrical uncertainty arising from the finite size of the detection elements. Table 6.4 shows that the stochastic approach results in a higher recovery rate compared to a

deterministic one (where the interactions are assumed to occur at the center of the detection

element).

6.4.2 Point-Spread Function

Figure 6.6 reports the detector coincident PSFs for different photon incident angles and for all

four positioning methods. For a normal beam (0 deg), the PSF was plotted along the axial

(a) and tangential (b) dimensions. The tangential component of the PSF was also plotted

for (c) 30 deg and (d) 60 deg beams. Due to the limited depth resolution of the detector

design (5 mm), the PSF is wider and asymmetric for photons incoming at an oblique angle.

The PSF is not normalized; therefore, a higher peak value indicates that more counts are

correctly positioned at the PSF center. Note that the worst case error is smaller for EWM

(≈30 mm) than for MAP or MPD (40 mm), while the average error is lower for both MAP

and MPD.

Figure 6.7 provides a 2-D representation of the PSF for the same beam angles (0, 30

and 60 deg). The first column is a histogram of all the interactions as recorded by the

system (raw hits). The second to fth columns show histograms of the position estimates

of the initial interactions by the II, MAP, MPD, and EWM schemes, respectively. The

histograms are shown on a logarithmic scale since MIPEs mostly affect the tails of the PSF.

For the EWM scheme (as well as other unconstrained positioning schemes), some MIPEs

are positioned outside of the detector volume. This occurs when the photon back-scatters

and deposits energy in two opposite detector panels, which places the center of mass of

the interactions towards the center of the system. These interactions have a characteristic

distribution (Figure 6.7, rightmost column) that is produced by the fixed repartition of the 511 keV of energy between the front and back interactions for a given back-scattering angle. As

a result, the distribution of the EWM locations reveals the box-shaped geometry of the

scanner.

6.4.3 Reconstructed Contrast

The system PSF affects the final image quality. The contrast-versus-noise trade-off was therefore investigated by means of a phantom containing five hot spheres in a warm background

of activity (Figure 6.4b). Figure 6.8 shows the images reconstructed using 100 iterations of

list-mode OSEM for all four positioning methods: (a) II, (b) MAP, (c) MPD, and (d) EWM.

The four pictures are shown on the same intensity scale to facilitate comparison. Only two


Figure 6.6: Point-spread functions (PSFs) for four positioning methods: Initial Interaction (II), MAP, Minimum Pair Distance (MPD) and Energy-Weighted Mean (EWM). (a) 1-D axial PSF for a normal beam (i.e. 0 deg incident angle); (b) 1-D tangential PSF for the same normal beam; (c) 1-D tangential PSF for a 30 deg beam; and (d) 1-D tangential PSF for a 60 deg beam.


Figure 6.7: Point-spread function (2-D, log scale) for three beam angles (top: 0 deg, middle: 30 deg, bottom: 60 deg). The first column shows the histogram of all the interactions recorded by the system. The second to fifth columns show histograms of the position estimates of the initial interaction by the II, MAP, MPD and EWM positioning schemes, respectively.


Figure 6.8: Image slice through the contrast phantom, reconstructed with 100 sub-iterations of list-mode 3-D OSEM. Ten million events were included in each subset. The phantom comprised a 2.5 cm-radius, water-filled cylinder, in which were placed five hot spheres. The ratio of the activity concentration in the spheres to that in the background was 10. The sphere diameters were 1, 1.5, 2, 4 and 8 mm. Four positioning schemes were used: (a) II, (b) MAP, (c) MPD, and (d) EWM. No post-processing was performed. The images are displayed using the same intensity scale.


spheres can be resolved for EWM; for the other positioning schemes, however, all but the smallest sphere are resolved.

Figure 6.9 shows the contrast versus noise in the reconstructed images for the five spheres (diameters 1–8 mm). A total of 100 list-mode 3D-OSEM sub-iterations are displayed. The

non-monotonic behavior of the noise is caused by the structural artifacts present in the early

image iterations.

Unlike the contrast recovery, the noise is not affected by mispositioning, as evidenced by Figure 6.9. Independently of the positioning method, the contrast is degraded by small-angle tissue scatter, random coincidences (the randoms rate is 10.6% for the contrast phantom simulation), the partial volume effect and an inaccurate system matrix. For these reasons, it never

reaches its true value (10:1), even for the ideal II positioning scheme which provides the

highest contrast recovery (8.6 to 1). For the largest sphere, the contrast is 10% higher for

MAP (7.5 to 1) than for the MPD scheme (6.8 to 1). The contrast difference is even greater

for the smaller spheres: for the 2 mm sphere, the contrast is 24% higher for MAP than

for MPD. It should also be noted that the contrast of the smallest spheres (1 and 1.5 mm)

did not converge within a hundred iterations; however, the monotonically increasing noise

prevents iterating further if the image is to maintain a reasonable signal-to-noise ratio.

The EWM method demonstrates the worst performance for the contrast recovery task. The contrast is degraded to the extent that the small spheres (≤ 2 mm) cannot be resolved. The contrast of the largest sphere is 46% lower than that achieved with MAP.

The CNR provides a rough estimate of the detectability of hot lesions in a background. Lesions with CNR greater than 4 (shown as a dotted line in Figure 6.9f) are generally detectable, even though observer experience and object shape can also affect the detectability [137]. According to this criterion, the 1 mm sphere can be detected only when the ideal II positioning scheme is used for the chosen reconstruction voxel size, while the 1.5 and 2 mm spheres are not detectable when EWM is used. The peak CNR is systematically higher for MAP than for MPD.
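As a concrete illustration, contrast and CNR can be computed from region-of-interest statistics. The sketch below uses one common definition (mean sphere uptake over mean background for contrast, background standard deviation for noise); the exact definitions used in this chapter may differ:

```python
from statistics import mean, pstdev

def contrast_and_cnr(sphere_roi, background_roi):
    """Hot-sphere contrast and contrast-to-noise ratio from ROI samples.

    Hypothetical helper: contrast is the ratio of mean sphere activity to
    mean background activity; noise is the background standard deviation.
    """
    contrast = mean(sphere_roi) / mean(background_roi)
    noise = pstdev(background_roi)
    cnr = (mean(sphere_roi) - mean(background_roi)) / noise
    return contrast, cnr

def is_detectable(cnr, threshold=4.0):
    """Rose criterion: lesions with CNR above ~4 are generally detectable."""
    return cnr > threshold
```

The Rose criterion then reduces to a simple threshold test on the resulting CNR.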

6.4.4 Reconstructed Sphere Resolution

The hot spheres resolution phantom (Figure 6.4c) was used to evaluate how the spatial resolution is affected by positioning accuracy when iterative 3D-OSEM reconstruction is used. Figure 6.10 shows the reconstructed images at 50 iterations for the four positioning methods (II, MAP, MPD and EWM). The spheres are best resolved by the ideal II scheme. MAP and MPD appear to perform similarly, but EWM shows substantial degradation of the spatial resolution.

122 CHAPTER 6. BAYESIAN SEQUENCE RECONSTRUCTION

[Figure 6.9, panels (a)–(e): contrast vs. noise curves for the II, MAP, MPD, and EWM positioning schemes; panel (f): peak contrast-to-noise ratio by sphere size — see caption below.]

Figure 6.9: Reconstructed hot spheres contrast as a function of noise, at different sub-iteration numbers. The contrast phantom was reconstructed with list-mode 3-D OSEM using ten million events in each subset. Four positioning methods were used: II, MAP, MPD and EWM. The resulting curves are shown for the (a) 8 mm, (b) 4 mm, (c) 2 mm, (d) 1.5 mm, and (e) 1 mm spheres. (f) Peak contrast-to-noise ratio, computed between iteration numbers 10 and 100. The dotted line represents the threshold for the Rose criterion of detectability [137].



Figure 6.10: Hot spheres in air phantom, reconstructed with 50 iterations of list-mode 3-D OSEM. Ten million events were included in each subset. The spheres extend to the edge of the 8×8×8 cm³ FOV and their diameters are 1, 1.25, 1.5 and 1.75 mm. The spacing between the centers is twice the diameter. Four positioning schemes were used: (a) II; (b) MAP; (c) MPD; and (d) EWM. No post-processing was performed. The images are displayed using different intensity scales to maximize the dynamic range.


Further investigation was performed by measuring the reconstructed FWHM size of the spheres. The results of these measurements are reported in Figure 6.11, for the (a) 1.75, (b) 1.5, and (c) 1.25 mm spheres. Since the ML estimate is non-linear, the reconstructed sphere FWHM size should be analyzed with care and should not be interpreted in terms of modulation transfer function. The FWHM size is, however, an interesting figure of merit to study since it defines the ability of the algorithm to distinguish small lesions that are close to each other. It should also be noted that the reconstructed sphere FWHM size is not expected to be equal to the true sphere diameter (see Appendix E for more details).

Due to parallax blurring, and despite 5 mm depth resolution, the reconstructed FWHM sphere size is degraded for spheres near the edge of the FOV. The ideal II scheme provides the best reconstructed FWHM size value, followed by MAP and MPD. Due to fewer mispositioned events, the reconstructed 1.75 mm diameter spheres were, on average, 5.6% smaller for MAP than for MPD. As demonstrated in Figure 6.10d, EWM adds a substantial amount of blur to the reconstructed images. The profile through the 1.75 mm diameter spheres (Figure 6.11d) shows that, besides the FWHM, the contrast of the spheres is affected by the positioning accuracy as well.

6.5 Discussion

6.5.1 Performance of Proposed Scheme

When MAP is used, two times fewer events are mispositioned compared to MPD, a simpler algorithm. As a result, the PSF has lower tails and a higher peak value (Figure 6.6) because more events are positioned to the correct LOR. MAP is also less likely to misposition events in which the photon undergoes back-scatter (Figure 6.7). This directly affects the reconstructed image quality. The contrast is higher for MAP positioning than for MPD (Figure 6.9): mispositioning causes contrast loss because events that originate from the hot lesion are positioned in the background. The peak CNR is greater for MAP than for MPD and EWM, which implies that hot lesions have a better chance of being detected in a clinical setting. In addition, MAP provides better quantitative accuracy in the sense that the reconstructed contrast is a better estimate of the actual tracer concentration ratio. Images reconstructed using the MAP positioning scheme also show higher spatial resolution (Figure 6.11), which facilitates the detection of smaller hot structures.

The full sequence of interactions can also be reconstructed by optimizing the prior distribution alone (i.e. β = 1 in MAP). Like MPD, this approach has the advantage that energy measurements are not needed. Furthermore, MAP with β = 1 outperforms MPD by 4.4% (Figure 6.5).

[Figure 6.11, panels (a)–(d): reconstructed sphere FWHM vs. position for the II, MAP, MPD, and EWM positioning schemes — see caption below.]

Figure 6.11: Reconstructed FWHM sphere size (mm) as a function of sphere position for the four positioning methods, measured by fitting a Gaussian mixture with offset to 1-D profiles through the reconstructed volume. (a) 1.75 mm spheres; (b) 1.5 mm spheres; and (c) 1.25 mm spheres. (d) A profile through the 1.75 mm spheres is shown for the four positioning methods.

Both the number of energy measurements and the size of the search space used for recovering just the first interaction are equal to N, the number of interactions. Therefore, the accuracy performance of the MAP scheme is maintained even as N increases (Table 6.2, first row). When attempting to recover the full sequence, however, the size of the search space increases super-exponentially with N; hence the recovery rate for complete sequences drops sharply for N = 4 (Table 6.2, third row).
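The growth of the two search spaces can be illustrated directly: identifying only the first interaction requires scoring N candidates, whereas recovering the complete order requires scoring all N! permutations. A minimal sketch:

```python
from itertools import permutations
from math import factorial

def search_space_sizes(n_interactions):
    """Candidates scored when recovering only the first interaction (N)
    versus the full interaction order (N!)."""
    first_only = n_interactions
    full_sequence = sum(1 for _ in permutations(range(n_interactions)))
    assert full_sequence == factorial(n_interactions)
    return first_only, full_sequence

# For N = 2, 3, 4 the full-sequence space grows as 2, 6, 24 -- consistent
# with the sharp drop in complete-sequence recovery observed at N = 4.
```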

Uncertainty in the locations of the interactions (within the detection element boundaries) translates into uncertainty in the scatter angle, which in turn yields uncertainty in the energy deposited. The effect of this positional uncertainty in 1 × 5 × 1 mm³ detection elements is equivalent to that of an energy blur far greater than the 12% FWHM energy resolution at 511 keV (Table 6.3). That is to say, when the interactions are constrained to fixed detector voxels, the range of energies that can be deposited is often much larger than the energy resolution of the system (Figure 6.2). Therefore, the correct estimation of the order of interactions depends more strongly on having good spatial resolution than on having good energy resolution.
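This angle-energy coupling follows from standard Compton kinematics (with the electron rest energy m_e c² = 511 keV). The sketch below, not taken from the dissertation's code, computes the energy deposited for a given scatter angle; a small angular shift caused by positional uncertainty translates directly into a shift in deposited energy:

```python
from math import cos, pi

M_E_C2 = 511.0  # electron rest energy, keV

def deposited_energy(e0_kev, theta_rad):
    """Energy deposited by a photon of energy e0 that Compton-scatters
    through angle theta (standard Compton kinematics)."""
    e_scattered = e0_kev / (1.0 + (e0_kev / M_E_C2) * (1.0 - cos(theta_rad)))
    return e0_kev - e_scattered

# A 511 keV photon scattering at 90 degrees deposits half its energy
# (255.5 keV); uncertainty in the interaction position inside a detection
# element shifts theta, and hence the deposited energy.
```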

6.5.2 Limitations

The processing time is problematic for the robust Bayesian method (MAP). Although the algorithm was implemented on a fast GPU, it takes more than 5 seconds to process 1,000 events. MPD runs 43 times faster than MAP, mainly because MAP computes the objective function over 16,384 statistical realizations. Using fewer simulation paths degrades the positioning: when the objective is computed for only one trajectory passing through the center of the detection elements, the recovery rate drops from 85.2% to 78.8% (Table 6.4).

The validation of the MAP scheme was carried out based on simulations performed with GRAY (6.3.1). These simulations included the standard Compton scatter model based on the Klein-Nishina formula, accurate linear attenuation coefficients from published databases, and the standard photoelectric absorption model. In a real system, other physical effects occur, namely characteristic X-ray production, Rayleigh scattering, Bremsstrahlung, and Doppler broadening. Furthermore, the electron is neither free nor at rest; therefore the Klein-Nishina formula is not an accurate model for low-energy (< 100 keV) photons undergoing Compton scatter (Figure 6.3). We therefore applied our method to a dataset generated using GATE [70], a more detailed Monte-Carlo package that incorporates all these effects (except Doppler broadening). For a point source located at the center of the


CZT system, the recovery rate dropped from 85.0% to 82.4% for MAP, and from 70.1% to 67.9% for MPD. This drop is mostly caused by the production of a characteristic X-ray that can propagate beyond the boundaries of the detection element, resulting in an increase in the number of interactions recorded. Doppler broadening was not modeled in any of the simulations. For 511 keV photons, Doppler broadening blur is at most 6 keV FWHM [138]; the energy blurring remains dominated by the finite resolution of the detectors. In addition, an increase in energy blur results in only a modest reduction in MAP's recovery rate (Table 6.3).

In this study, it was assumed that the CZT modules could read out interactions as low as 10 keV. Being able to read such low-energy events is crucial for the MAP positioning scheme. A higher energy threshold (for example, 100 keV [139]) will cause the PET system to drop interactions. As a consequence, some MIPEs will be discarded by energy gating because their total energy will not fall within the energy window. Furthermore, in some cases the PET system will drop one or more interactions, yet the resulting MIPE will still fall within the energy window. In that case, ML sequencing is likely to perform poorly because one or more interactions will be missing from the sequence.
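The gating behavior described above can be sketched as follows; the 10 keV threshold matches the study's assumption, while the energy-window limits are illustrative:

```python
def gate_event(interaction_energies_kev, threshold_kev=10.0,
               window_kev=(450.0, 575.0)):
    """Apply a per-interaction readout threshold, then an event-level
    energy window. Returns the surviving interactions, or None if the
    event is rejected. (Threshold and window values are illustrative.)"""
    kept = [e for e in interaction_energies_kev if e >= threshold_kev]
    total = sum(kept)
    lo, hi = window_kev
    return kept if lo <= total <= hi else None

# With a 100 keV threshold, an event such as [320, 140, 51] keV loses its
# 51 keV interaction yet still sums to 460 keV -- inside the window -- so
# ML sequencing would run on an incomplete sequence.
```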

6.5.3 Possible Extensions

The image-degrading effect of MIPEs has been previously compensated for by using a simple positioning method (EWM) and then reconstructing the images with an accurate model of the PSF [48]. However, by deconvolving the blur caused by MIPEs, this approach amplifies the noise. It is preferable to use an advanced positioning method (such as MAP) to estimate the location of the first interaction prior to reconstruction. Incorporating the resulting PSF in the reconstruction generates less noise amplification because the PSF corresponding to a more precise positioning algorithm is narrower (Figure 6.6). This advanced method is only available for PET systems that can position individual interactions within a MIPE.

In 6.2.1, it is assumed that the photon's incoming energy e0 and the location of the other coincident photon p are available. While true for PET, these assumptions cannot be made for other modalities such as Compton cameras [138]. Nevertheless, the method can be generalized for these modalities. The incoming energy e0 can be estimated by summing the energies of all the interactions. When p is unavailable, the likelihood should not include the first interaction because the scatter angle cos θ1 cannot be computed as in (6.4).

The MAP objective can be readily extended to serve as a criterion for rejecting tissue-scattered events and random coincidences. In its current formulation, it assumes that the


incoming photon energy e0 is 511 keV and that the incident angle is given by the location p of the other photon in the pair. When the photon scatters in tissue or is paired up incorrectly (a random coincidence), these hypotheses do not hold. As a result, no sequence might be attributed a high probability, and the event can be discarded.

MAP positioning can be applied to crystals other than CZT. For instance, high-resolution detectors with depth-of-interaction capabilities can be built from lutetium oxyorthosilicate (LSO) coupled to thin position-sensitive avalanche photo-diodes (PSAPDs [17]). Even though LSO's photo-fraction is higher than CZT's, small crystal elements will cause MIPEs. The MAP scheme can then be used for positioning: based on Table 6.3, MAP is accurate even with 12% FWHM energy resolution.

6.6 Summary

The ability to correctly position MIPEs greatly affects the global performance of PET systems based on high-resolution detectors. Discarding MIPEs is not possible for the CZT system studied, since these events are part of almost every recorded coincident event. Conventional positioning schemes, such as EWM, degrade both contrast and resolution, reduce the image quantitative accuracy, and affect the detectability of hot lesions. Simple approaches, such as MPD, help improve the image quality, but are outperformed by MAP, a more advanced positioning scheme that uses all the information available in a statistically optimal way.

Although inter-crystal scatter is more prevalent in smaller crystals, improved intrinsic spatial resolution will enhance the identification of the photon crystal of entry. Bayesian sequence reconstruction methods will play a key role in ultra-high resolution systems, especially those made from materials with a low photo-fraction such as CZT or germanium.

Chapter 7

Concluding Remarks, Future Directions

The response of a PET system is complex and shift-varying. As a result, images reconstructed with simple schemes have non-uniform spatial resolution and suboptimal quantitative accuracy. This issue can be addressed by the incorporation of an accurate model of the system response within the reconstruction.

For the box-shaped PET system that is one of the foci of this dissertation, the system response is particularly irregular. Accordingly, the images that were reconstructed with a shift-invariant projection kernel suffered from limited spatial resolution near the edge of the FOV and degraded quantitation. The inclusion of an accurate model of the geometrical detector response corrected for the variations in spatial resolution and improved the quantitative accuracy. This suggests that the geometrical response is the main component of the full detector response. Modeling the other components (inter-crystal scatter, photon acolinearity and positron range) would indeed bring further benefits, but would require a considerably more complex model. Therefore, the proposed model (based on the analytical CDRF) is an excellent trade-off between image quality, accuracy and computation. Uniform spatial resolution is achieved throughout the FOV, with manageable memory and computation requirements (especially when GPUs are used).

The measurements obtained from the high-resolution PET and TOF PET systems are sparse. Therefore, a direct reconstruction from the list-mode data is preferable to using sinograms. GPUs have been successfully employed in sinogram-based reconstruction because they are efficient at applying the affine transformation that maps a slice through the volumetric image to any sinogram view, and vice-versa. The main challenge in implementing list-mode OSEM



on the GPU is that the list-mode LORs are not arranged in any regular pattern like sinogram LORs. The mapping between the list-mode data and the volumetric image is not affine, and as a result texture mapping cannot be used in this context. The technique described in this dissertation is unique because it can be used for back- and forward-projecting individual LORs described by arbitrary endpoint locations, even when a shift-varying kernel is used to model the response of the system.

Yet, the appeal of using the computational power of GPUs for list-mode reconstruction is somewhat mitigated by their relatively poor cache performance. Unlike sinogram-based reconstruction, in list-mode the memory is accessed randomly, which makes GPU-based implementations challenging. The memory latency and throughput affect the overall speed of the reconstruction. Still, the memory latency can be hidden by performing computation while the image voxel is being retrieved. In our implementation, the kernel is evaluated while the memory is being accessed. Writing to the image is also pipelined so that the memory latency does not affect the overall performance.

The shift to higher-resolution detectors not only implies that the data is sparser, but also less reliable due to multiple energy depositions for each 511 keV photon. This issue is exacerbated when semiconductor detectors such as CZT are used, because of their lower photo-fraction. In this dissertation, a Bayesian method was introduced for reconstructing the order of a sequence of interactions. This approach is a unifying framework which combines the statistical properties of the measurements with a priori information. Although inter-crystal scatter is more prevalent in smaller crystals, improved intrinsic spatial resolution will facilitate the identification of the photon crystal of entry. Robust sequence reconstruction methods, such as MAP, should play a key role in ultra-high resolution systems.

Beyond considering MIPEs as a source of errors that need to be corrected for, these events have the potential to increase the amount of information available in the reconstruction. The direction of an incoming 511 keV photon that produced a MIPE can be confined to a cone (called the Compton cone). Such information could lead to a paradigm shift in PET: single events (currently discarded) could be used in the reconstruction [136,140,141]. The single events can be included in the reconstruction using techniques borrowed from Compton imaging. In addition, knowledge of the photon's incoming direction can help reject a significant fraction of the random and tissue-scattered coincidences [135]. As a result, in that scenario, MIPEs would actually contribute to improving image quality instead of degrading it.

Appendix A

GPU Line Projections

This appendix describes how the data stored in the GPU on-board memory is represented. Next, a detailed description of the forward and back-projection operations is provided, accompanied by simplified Cg and OpenGL code.

A.1 Data Representation

In OpenGL, data can be stored in the video memory using 2-D textures. A 2-D color texture forms an array of 32-bit floating-point quadruples and can be accessed randomly by GPU shaders. 3-D textures are also available; however, they do not support the render-to-texture operation that is required to write data out. 3-D arrays, such as volumetric images, need to be reshaped into 2-D structures to use render-to-texture capabilities. Writing to a 2-D texture is performed by using OpenGL's frame-buffer object (FBO) extension [108]. The FBO extension can be used to do off-screen rendering, or to output the result of GPU-based computations.

A.1.1 Images

In our implementation, the slice stack was tiled in 2-D into a larger rectangular texture (as shown in Figure A.1). The 3-D index used to access the volumetric image is converted into a 2-D index when needed.
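As a minimal sketch of this address conversion (names and layout parameters are illustrative, not taken from the actual implementation), a voxel index (x, y, z) maps to a texel of the tiled 2-D texture as:

```python
def tile_index(x, y, z, nx, ny, tiles_per_row):
    """Map a 3-D voxel index (x, y, z) to a 2-D texel index (u, v) in a
    texture that stores the slice stack as a grid of nx-by-ny tiles,
    with tiles_per_row slices per texture row. (Illustrative layout.)"""
    tile_col = z % tiles_per_row   # which tile column holds slice z
    tile_row = z // tiles_per_row  # which tile row holds slice z
    u = tile_col * nx + x
    v = tile_row * ny + y
    return u, v
```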

A.1.2 List-Mode Data

The list-mode data is pre-loaded into another 2-D texture before the projections are performed. Another option investigated was to stream the list-mode data while it is being processed.



Figure A.2: List-mode storage on the GPU. The RGB channels (red, green, and blue) are used to store the LOR endpoint coordinates. The alpha channel (black) can be used for storing additional data (per-LOR correction factors for scatter or randoms, number of counts in histogram-mode, etc.).

This second approach requires less memory, but each list-mode event must be streamed twice (once for the forward projection, once for the backprojection). Storing the list-mode data in the video memory also makes it possible to perform some pre-processing steps directly on the GPU. The list-mode data was stored according to Figure A.2.

Figure A.1: Image storage on the GPU using a 2-D tiled representation.

OpenGL and Cg allow for rectangular textures; however, to be readable from the vertex shaders, the vertex arrays have to be square. As a result, a 4-channel 2-D texture can store 2κ² list-mode events, where κ is a positive integer. Therefore, the subset size is rounded to the nearest integer that can be expressed as 2κ².
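The rounding rule can be sketched as follows (a CPU-side illustration, not the actual implementation):

```python
from math import sqrt

def rounded_subset_size(requested):
    """Round a requested subset size to the nearest integer expressible
    as 2 * kappa**2 (kappa a positive integer) -- the capacity of a
    square 4-channel vertex texture holding two texels per LOR."""
    kappa = max(1, round(sqrt(requested / 2.0)))
    # Check the neighboring kappa values to resolve rounding ties.
    candidates = [k for k in (kappa - 1, kappa, kappa + 1) if k >= 1]
    best = min(candidates, key=lambda k: abs(2 * k * k - requested))
    return 2 * best * best
```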

The size of the on-board video memory, as well as the OpenGL specifications, can limit the size of the list-mode dataset. In that case, a multi-pass approach is used. Batches of 2κ² LORs are processed (forward and back-projection) sequentially, without clearing the accumulation buffer between each batch. The number of passes can be specified with the command-line option -m (see Table C.1).


Figure A.3: Schematic of the forward projection of a LOR.

A.2 Line Forward Projection

The line forward projection is described mathematically in 4.2.2.3. This step uses two FBOs for temporary storage. The size of these FBOs is λ (N_FP_SAMP in the code) by κ (N_FP_LINE in the code) pixels. Each pass of the main loop in A.1 iterates over a block of LORs. A block contains κ LORs, stored in the dst object.

For each LOR block, samples are calculated along each LOR (Figure A.3, left). One sample is taken for each slice traversed by the LOR and is stored in a temporary FBO. The volume is sliced in the main direction of the LOR (as defined by the conditions set in (4.4)). The inner loop (in the double sum defined in (4.5)) is performed by a fragment shader (A.5, in blue on Figure A.3). The sample locations are computed by drawing horizontal lines in the temporary FBO while shaders are bound. The vertex shader (A.4) is passed an index (lineIdx) to the LOR endpoint coordinates, as well as a parameter (idx) that indicates where the horizontal LORs should be drawn within the FBO. The sign of idx determines whether a vertex is first or second, and the magnitude of idx provides the vertical location where the horizontal line is to be drawn.
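The packing of the idx parameter can be illustrated with a small decoding sketch (a hypothetical helper mirroring the shader's use of sign(idx) and abs(idx); not part of the actual code):

```python
def decode_idx(idx):
    """Decode the packed vertex parameter: the sign selects the endpoint
    (0 for first, 1 for second) and the magnitude minus one gives the
    row of the horizontal line in the temporary FBO."""
    endpoint = 0 if idx > 0 else 1
    row = abs(idx) - 1
    return endpoint, row
```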

The LOR endpoint coordinates are loaded from the list-mode texture (Figure A.2) and used to generate 3-D texture map coordinates for the two endpoints of the horizontal line. The length of the horizontal line and the texture coordinates are such that one sample will be drawn from the center of each slice traversed by the LOR. Each sample will be mapped through texture mapping to a location on the original 3-D LOR. The width λ of the temporary FBO in which the horizontal line is drawn has to be at least equal to the maximum number of samples expected (i.e. the number of slices). At the end of the first pass, κ × λ samples have been computed along κ LORs.

Geometrical properties (such as the LOR direction vector, length, etc.) are also computed and passed to the fragment shader. The LOR main direction (n3 in the code) and the two other orthogonal directions (n1 and n2) are also identified. The vectors n1 and n2 are used later on in the fragment shader (A.5) as the directions along which the inner loop is performed.

The double inner loop (in A.5) is performed for all the voxels within a square centered on the LOR (see (4.2)b). For each of these voxels, the distance to the LOR is calculated, and if the voxel is inside the TOR, the voxel value is read out and summed using a Gaussian projection kernel. The partial sum is then written to the temporary FBO.
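The geometry of this inner loop — the squared distance from a voxel center to the LOR axis, cut off at the TOR boundary, then a Gaussian weight — can be sketched on the CPU (a plain-Python mirror of the fragment shader's computation; not the actual GPU code):

```python
from math import exp

def kernel_weight(voxel_center, lor_midpoint, lor_direction, sigma_sq, eta_sq):
    """Gaussian TOR weight for one voxel: squared distance from the voxel
    center to the LOR axis, cut off at eta_sq, then exp(-d2 / (2 sigma^2)).
    lor_direction must be a unit vector."""
    cp = [p - c for p, c in zip(voxel_center, lor_midpoint)]
    cp_v = sum(a * b for a, b in zip(cp, lor_direction))
    d2 = sum(a * a for a in cp) - cp_v * cp_v  # squared distance to the axis
    if d2 >= eta_sq:
        return 0.0  # voxel lies outside the tube of response
    return exp(-d2 / (2.0 * sigma_sq))
```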

In a second pass, the sum of all the partial sums for a given LOR is calculated using a fragment shader (A.6). The result is stored as a vertical line (in red on Figure A.3) in a second temporary FBO. Each pixel in that second FBO is the forward projection of the volume along one LOR.

After λ blocks of κ LORs have been processed, the second FBO is full. At that point, the projection values are transferred to the FBO holding the list-mode data (Figure A.3, in yellow) by drawing a rectangle with a texture map (A.7). While the forward projection values are being transferred, they are inverted by a fragment shader (A.8) in preparation for the backprojection (as indicated in (3.5)).

The following OpenGL and Cg code has been simplified for easier reading. In particular, the parameter set-up and parameter activation instructions are not shown.

A.3 Line Backprojection

The line backprojection step is described in 4.2.2.4. Slices are processed four at a time (A.9, outer loop). For every group of four slices, all the LORs are backprojected into an accumulation FBO (using additive blending) in blocks of MAX_LINES lines (A.9, inner loop). After each block of LORs is backprojected, the accumulation FBO is added to a second accumulation buffer. This hierarchical accumulation architecture was designed because in previous GPUs the blending units were limited to 16-bit dynamic range (half). With the GeForce 8 series, the blending units were upgraded to support full 32-bit floating-point, and therefore many more LORs can be accumulated before underflow occurs. On the newer generation of GPUs, millions of LORs can be back-projected in a single pass without any degradation in quality. For the older GPU models, it is preferable to limit the size of the blocks to 5,000 LORs.
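The hierarchical accumulation can be sketched independently of the graphics pipeline: values are summed within fixed-size blocks, and only block sums reach the global accumulator, which bounds the range of magnitudes seen by the limited-precision blending units (a schematic CPU analogue, not the actual implementation):

```python
def blocked_accumulate(values, block_size):
    """Two-level accumulation: sum values in fixed-size blocks, then add
    each block sum to a global accumulator -- mirroring the hierarchical
    accumulation FBOs used to limit precision loss with 16-bit blending."""
    total = 0.0
    for start in range(0, len(values), block_size):
        block_sum = 0.0  # per-block accumulator (the first FBO)
        for v in values[start:start + block_size]:
            block_sum += v
        total += block_sum  # second accumulation buffer
    return total
```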

The coordinates of the LOR endpoints are stored in a vertex texture (Figure A.2). An index to both LOR endpoints is passed to the vertex shader (lineIdx in A.10). Both endpoints are read out from the vertex texture. The vertex shader calculates a number of parameters for the LOR, including its length, middle point and direction vector. This information is then passed to the fragment shader that is responsible for evaluating the projection kernel within the TOR (see (4.1) for a definition).

The role of the fragment shader (A.11) is to evaluate the projection kernel. In A.11, a Gaussian kernel is implemented. First, the squared distance DSQ between the voxel center and the LOR axis is calculated. The calculations for four slices are performed in parallel using 4-D vector operations. Next, the Gaussian kernel is evaluated. The product of the kernel times the LOR weight is returned and accumulated in the accumulation FBO (A.9) using additive blending. A mask is used to prevent the voxels that are outside the TOR from being written to.


Algorithm A.1 OpenGL: Forward projection main loop

glDisable(GL_BLEND);

glViewport( 0, 0, N_FP_SAMP, N_FP_LINE );

glMatrixMode( GL_TEXTURE ); glLoadIdentity( );

glMatrixMode( GL_MODELVIEW ); glLoadIdentity( );

glMatrixMode( GL_PROJECTION ); glLoadIdentity( );

gluOrtho2D( 0, N_FP_SAMP, 0, N_FP_LINE );

fboFP->bind();

fboFP->setWrite(1);

glClear( GL_COLOR_BUFFER_BIT );

//Process all the lines in the buffer in groups of N_FP_LINE

for (i = 0, j=1, k=0; i<dst->number() ; i += N_FP_LINE, j++)

<First pass>: sample the image along LORs (A.2)

<Second pass>: sum the samples (A.3)

if (j==N_FP_SAMP)

fboFP->unbind();

dst->update(k, j, fboFP->texID(1));

fboFP->bind();

fboFP->setWrite(1);

glClear( GL_COLOR_BUFFER_BIT );

k += j;

j = 0;

fboFP->unbind();

if (j != 0)

dst->update(k, j, fboFP->texID(1)); //(A.7)

dst->pingpongFBO();


Algorithm A.2 OpenGL: Forward projection, first pass

fboFP->setWrite(0);

glClear( GL_COLOR_BUFFER_BIT );

cgVertFore->activateProgram();

cgFragFore->activateProgram();

glDrawArrays( GL_LINES, 0, 2*N_FP_LINE );

cgVertFore->desactivateProgram();

cgFragFore->desactivateProgram();

Algorithm A.3 OpenGL: Forward projection, second pass

fboFP->setWrite(1);

cgFragSumLines->activateProgram();

glBegin( GL_LINES );

glTexCoord2f(0, 0); glVertex3f(j-0.5f, 0, 0);

glTexCoord2f(0, N_FP_LINE+1); glVertex3f(j-0.5f, N_FP_LINE+1, 0);

glEnd();

cgFragSumLines->desactivateProgram();


Algorithm A.4 Vertex shader: Line forward projection first pass

void main(

float4 lineIdx : POSITION,

float idx : TEXCOORD0,

uniform samplerRECT sampVert: TEXUNIT0,

uniform float nfp,

out float4 HPosition : POSITION,

out float3 L : TEXCOORD0,

out float3 v : TEXCOORD1,

out float3 n1 : TEXCOORD2,

out float3 n2 : TEXCOORD3,

out float3 C : TEXCOORD4,

out float d : TEXCOORD5 )

float2 uv1 = lineIdx.xy;

float2 uv2 = lineIdx.zy;

float3 vert = texRECT( sampVert, uv1 ).xyz;

float3 vertOther = texRECT( sampVert, uv2 ).xyz;

float3 vD = vert - vertOther;

d = length(vD);

float3 vDabs = abs(vD);

v = sign(idx) * vD / d;

n1 = ( vDabs.x < vDabs.z || vDabs.x <= vDabs.y ) ? float3(1,0,0) : float3(0,0,1);

n2 = ( vDabs.y < vDabs.z || vDabs.x > vDabs.y ) ? float3(0,1,0) : float3(0,0,1);

float3 n3 = float3(1) - n1 - n2;

float v1n3 = dot( vert, n3 );

float v2n3 = dot( vertOther, n3 );

float3 u = vD / ( v1n3 - v2n3 );

float mi = floor(min( v1n3, v2n3 )) - 5;

float ma = ceil(max( v1n3, v2n3 )) + 5;

float pick = (lineIdx.w==0) ? mi : ma;

L = vert - u * ( v1n3 - pick );

C = (vert + vertOther) / 2.0;

float vmax = ma - mi ; //number of slices

float xpos = (lineIdx.w==0) ? (-1) : (2*vmax-nfp)/nfp;

HPosition = float4(xpos, abs(idx)-1, 0, 1);


Algorithm A.5 Fragment shader: Line forward projection first pass

float4 main(

float3 L : TEXCOORD0,

float3 v : TEXCOORD1,

float3 n1 : TEXCOORD2,

float3 n2 : TEXCOORD3,

float3 C : TEXCOORD4,

float d : TEXCOORD5,

uniform float3 vs,

uniform float eta_sq,

uniform float sigma_sq,

uniform samplerRECT samp :TEXUNIT1,

uniform samplerRECT zLookUp :TEXUNIT2

) : COLOR

float s, sum = 0;
for ( float i=-3; i<=3; i++ ) {
    for ( float j=-3; j<=3; j++ ) {
        float3 P = floor( L + i*n1 + j*n2 ) + 0.5;  // candidate voxel center
        if (all(P>0) && all(P<vs)) {
            float3 CP = P - C;                      // midpoint-to-voxel vector
            float CPv = dot(CP,v);
            float d2 = dot(CP,CP) - CPv*CPv;        // squared distance to LOR axis
            if (d2 < eta_sq) {
                float4 t1 = h4texRECT( zLookUp, float2(P.z, 0.5) );
                float kern = exp( -d2 / ( 2*sigma_sq ) );
                s = texRECT( samp, P.xy + t1.xy ).r;
                sum += s*kern;
            }
        }
    }
}
return float4(sum,0,0,0);


Algorithm A.6 Fragment shader: Line summation

float4 main(

float2 p : TEXCOORD0,

uniform samplerRECT samp2,

uniform float scale ) : COLOR

float s, sum = 0;

for (float i=0.5; i<160; i++)

s = texRECT(samp2, float2( i , p.y ) ).r;

sum += s;

return float4(sum,0,0,0);


Algorithm A.7 OpenGL: LOR update

glViewport( 0, 0, 2*SUBSET_X, SUBSET_Y );

gluOrtho2D( 0, 2*SUBSET_X, 0, SUBSET_Y );

fboVert->bind();

fboVert->setWrite(_newFBO);

fragUpdate->activateProgram();

glBegin(GL_QUADS);

glMultiTexCoord2fARB(GL_TEXTURE0_ARB, 0, 0);

glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 0, startIdx);

glVertex3f( 0, startIdx, -1);

glMultiTexCoord2fARB(GL_TEXTURE0_ARB, 0, N_FP_LINE);

glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 2*N_FP_LINE, startIdx);

glVertex3f( 2*N_FP_LINE, startIdx, -1);

glMultiTexCoord2fARB(GL_TEXTURE0_ARB, nUpdate, N_FP_LINE);

glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 2*N_FP_LINE, (startIdx+nUpdate));

glVertex3f( 2*N_FP_LINE, startIdx+nUpdate, -1);

glMultiTexCoord2fARB(GL_TEXTURE0_ARB, nUpdate, 0);

glMultiTexCoord2fARB(GL_TEXTURE1_ARB, 0, (startIdx+nUpdate));

glVertex3f( 0, startIdx+nUpdate, -1);

glEnd();

fragUpdate->desactivateProgram();

fboVert->unbind();


Algorithm A.8 Fragment shader: LOR update

float4 main(
    float2 uv0 : TEXCOORD0,
    float2 uv1 : TEXCOORD1,
    uniform samplerRECT sampVert,
    uniform samplerRECT sampFP
) : COLOR
{
    float4 col;
    float4 d4 = texRECT(sampVert, uv1);
    col.rgb = d4.rgb;
    float a = texRECT(sampFP, uv0).r;
    col.a = 1. / ( a + d4.a );
    return col;
}

Algorithm A.9 OpenGL: Backprojection

glLineWidth( LINE_WIDTH );
glBlendFunc( GL_ONE, GL_ONE );
glViewport( 0, 0, VOX_SIZE_X, VOX_SIZE_Y );
for (z = 0; z < VOX_SIZE_Z; z += 4) {
    fboBP->bind();
    fboBP->setWrite(1);
    glClear( GL_COLOR_BUFFER_BIT );
    glEnable( GL_BLEND );
    for (i = 0; i < src->number(); i += MAX_LINES) {
        fboBP->setWrite(0);
        glClear( GL_COLOR_BUFFER_BIT );
        glMatrixMode( GL_PROJECTION ); glLoadIdentity();
        glOrtho( 0, VOX_SIZE_X, 0, VOX_SIZE_Y, -(z-5), -(z+9) );
        cgVertBack->activateProgram();
        cgFragBack->activateProgram();
        glDrawArrays( GL_LINES, 0, 2*MAX_LINES );
        cgVertBack->desactivateProgram();
        cgFragBack->desactivateProgram();
    }
}


Algorithm A.10 Vertex shader: Line backprojection

struct bindings {
    float4 HPos : POSITION;
    float Col0 : TEXCOORD2;  // LOR weight
    float3 v : TEXCOORD0;    // Direction vector
    float3 C : TEXCOORD1;    // LOR middle point
    float b : TEXCOORD3;     // LOR length
};

bindings main(
    float4 lineIdx : POSITION,
    uniform samplerRECT sampVert : TEXUNIT0,
    uniform float4x4 ModelViewProj
)
{
    bindings OUT;
    float2 uv1 = lineIdx.xy;
    float2 uv2 = lineIdx.zy;
    float4 vert = texRECT( sampVert, uv1 );
    float4 vertOther = texRECT( sampVert, uv2 );
    float flip = (1 - 2*lineIdx.w);
    float3 vline = vert.xyz - vertOther.xyz;
    OUT.HPos = mul(ModelViewProj, float4(vert.xyz, 1));
    OUT.Col0 = vert.w;
    OUT.b = length(vline);
    OUT.v = flip*vline / OUT.b;
    OUT.C = ( vert.xyz + vertOther.xyz ) / 2.0;
    return OUT;
}


Algorithm A.11 Fragment shader: Line backprojection

float4 main(
    bindings IN,
    uniform samplerRECT zRGBA : TEXUNIT1,
    uniform float sigma_sq,
    uniform float slice
) : COLOR
{
    float4 zEnc = IN.C.z - (slice + float4(0.5, 1.5, 2.5, 3.5));
    // IN.WPos: window-space fragment position (WPOS semantic, bound in the fragment stage)
    float2 v2 = IN.C.xy - IN.WPos.xy;
    float4 DP = dot( IN.v.xy, v2 ) + IN.v.z * zEnc;
    float4 D2 = dot(v2, v2) + zEnc*zEnc;
    float4 DSQ = D2 - DP*DP;
    float4 mask = (DSQ < 2);
    if ( !any(mask) ) discard;
    float4 dexp = exp( -DSQ / (2.0*sigma_sq) );
    return dexp * mask * IN.Col0;
}

Appendix B

File Formats

B.1 List Mode and Histogram Mode

A list-mode file is a list of events. For each event, ten properties are stored. For a TOF
PET system, these include the spatial coordinates of both detectors and the TOF value cΔτ.
Optionally, a randoms estimate r_i and two scatter estimates (s_i^TOF with TOF information,
s_i without) can be provided for the LOR on which the event was recorded. The randoms
are uniformly distributed over all TOF bins; therefore, the TOF randoms estimate can be
calculated from the one without TOF. The spatial coordinates can be entered using any unit
of length (mm, cm, inches, etc.); however, the choice must be consistent with the units used
in the FOV definition. The TOF value should also be converted into the same unit of length,
according to the relationship Δx = cΔτ, where c is the speed of light. The randoms and
scatter estimates should provide the number of such events in the duration corresponding to
a subset of data, for a particular LOR or TOF bin. The list-mode data is stored in 32-bit
floating-point format (IEEE 754, little endian). List-mode files can be recognized by their
.bin.f extension. For TOF datasets, an element of a list-mode file is organized as follows:

x_i^(1)  y_i^(1)  z_i^(1)  cΔτ  r_i  x_i^(2)  y_i^(2)  z_i^(2)  s_i^TOF  s_i

When TOF information is not available, the TOF value and the TOF scatter estimate

are set to zero. Similarly, when randoms and scatter estimates are not available, they are

set to zero.

The reconstruction package can also handle PET data in histogram form. Histogram

data is similar to list-mode data, with the only exception that all the events that occurred


on the same LOR are grouped together. The LORs are organized in a list which contains
the coordinates and the number of events (denoted m_i) of every LOR. The structure of a
histogram-mode data file is the following:

x_i^(1)  y_i^(1)  z_i^(1)  m_i  r_i  x_i^(2)  y_i^(2)  z_i^(2)  s_i  N/A

Unlike the histogram-mode format, individual energy and time information can be stored

for each event in list-mode.

B.2 Image Files

The 3-D images used by the application (output, initialization image, and sensitivity map)
can be stored to disk. To avoid conversions when moving images to the GPU, the image
voxels are stored in a 2-D tiled array (see Figure A.1). The first 16 bytes constitute the file
header and define the image size. The header consists of four 32-bit integers. The first
two (vx and vy) determine the tile size, and the other two (tx and ty) the number of tiles
in each dimension. The rest of the file consists of the voxel values, stored in 32-bit IEEE
floating-point format. The resulting file size is 4 × (4 + vx vy tx ty) bytes. Image files saved
according to this scheme have a .vs extension.

B.3 Colormap Files

Custom colormaps can be used to visualize data in the interactive mode. Colormap files are
stored in ~/.GpuOsem/ and are recognizable by the .colormap extension. The colormap file
format is consistent with MATLAB's colormaps. Each colormap has 64 color values, with
each color represented by a 32-bit floating-point quadruple (RGBA). The colormap is
linearly interpolated for better results. The file size is 1024 bytes (= 4 × 4 × 64 bytes).

Appendix C

User Manual

C.1 Command Line Options

Many parameters can be provided to the application through the command line (see Table
C.1). Most of the parameters are optional and have default values. Most of the time, the
application will check for missing parameters or incorrect values. Some parameters must
be provided for the reconstruction application to work, namely: the image size, the FOV
size, the subset size, and the list-mode input file. All the remaining parameters have default
values.

Because the list-mode data is stored in a square texture of size 4κ² (see A.1.2), the
subset size (specified using -s) is rounded off to the closest integer that can be expressed as
2κ². Similarly, the number of tiles in the horizontal dimension (ty) must be a multiple of
four. This is because four slices are processed simultaneously in the back-projection using
the four color channels (see Section A.2).

It is sometimes desirable to use very large subsets in the reconstruction. In practice,

there are two limits to the subset size: the amount of memory available on-board, and the

maximal size of a rectangular texture allowed by OpenGL. Therefore, the subset multiplier

(-m) should be used to perform such reconstruction. For example, instead of specifying a

subset of 10 million events (using -s 10000000), one can specify a subset size of two million

and a multiplier of 5 (using -s 2000000 -m 5).

C.2 Conguration File

Some of the application parameters are set up through a configuration file. On a Linux
OS, the configuration file is stored in ~/.GpuOsem/GpuOsem.conf. While the command line


Flag                  Format  Description
vsize="vx vy tx ty"   INT     Size of the image slice, and number of tiles
fov="fx fy fz"        DOUBLE  Size of the field of view (unit of length)
-d                            LORs are rebinned to the edge of the FOV
-j                            Jitters the line by a random quantity
-r                            Flips x/z dimensions (useful for dual-panel recon)
-u                    DOUBLE  Intensity of the uniform image used for initialization
-s                    INT     Number of events in subset
-m                    INT     Multiplier for subset size
-S                            Saves the image volume after each iteration
-C                            Uses scatter and randoms corrections
-T                    DOUBLE  Time-of-flight kernel width (unit of length, FWHM)
-i                    INT     Number of sub-iterations
-D                            Debug mode on (obsolete)
-N                            Sensitivity map calculation (accumulate mode)
-H                            Input is histogram-mode
-l, listmode          FILE    List-mode input filename
-n, norm              FILE    Sensitivity map filename
-I, init              FILE    Initialization file
-c                    DOUBLE  Rebins the LOR to a cylinder of radius <DOUBLE>
-t                    INT     Truncate dataset to use only the <INT> first events
-g                    DOUBLE  Standard deviation for Gaussian projection kernel
-o, output            FILE    Output filename
-h, help                      Print help and exit

Table C.1: Command-line options

is used to define parameters relating to the geometry of the PET system, the parameters
contained in the configuration file do not affect the reconstruction.

C.3 Interactive-Mode Commands

When no iteration number is set (using -i), the application runs in interactive mode.
Interactive mode allows the user to iterate as needed and visualize the image as it is being
reconstructed. The following keystrokes can be used during the execution of the application.
The colormap used for visualization, as well as the viewport size, can be adjusted in the
configuration file. When the application is run in batch mode (using -i), it will exit after the
specified number of iterations has been performed. Therefore, no interactive visualization is
possible in batch mode.


Key        Action
V          Save volume
W          Switch V1/V2
S          Screen shot
C          Clear V1
RETURN     Iterate
R          Rewind data
ESC        Exit
L          Load current dataset
D          Toggle 2D/3D
F          Forward project
P          Print debug info
B          Backproject
F1         Toggle full screen
F2         Next slice
F3         Previous slice
F4         Toggle fly-through
F5         Decrease alpha blending
F6         Increase alpha blending
F7         Decrease intensity
F8         Increase intensity
1          Display V1
2          Display V2
3          Display N
4          Display F1
5          Display F2
6          Display B1
7          Display B2
8          Display slice V1
9          Display slice N

Table C.2: Interactive shortcuts

Appendix D

Gamma Camera Acquisition Software

D.1 Background

A small hand-held gamma camera (Figure D.1a and [142]) was developed in the Molecular

Imaging Instrumentation Lab to be used for radio-guided surgery. Unlike PET, gamma

cameras use a physical collimator to select the projection view. Most of the incoming photons

that are not incident normal to the radiation detector are stopped by highly attenuating

material (such as lead). A gamma camera produces 2-D images that do not require any

tomographic reconstruction.

Gamma cameras are useful in a number of nuclear medicine applications. They are
usually fairly large and used for whole-body imaging. Small-FOV handheld gamma cameras
do exist, but their use is still under investigation. While large cameras can be used for
surgery planning, small cameras can be useful in the operating room to guide the surgery
when needed. Sentinel lymph node (SLN) biopsy is a very promising application for such
small-FOV cameras. The SLN biopsy procedure consists of removing the lymph nodes that
most likely drain the tumor site, in order to assay for the spread of the disease. In the
United States, the SLN procedure is standard for aggressive breast and melanoma cancers.
The SLNs are found by injecting, at the tumor site, a radioactive colloid (99mTc sulfur colloid)
that accumulates in the lymph nodes.

A software package was developed to interface with the gamma camera prototype. The
software provides camera calibration, real-time imaging, and performance analysis. It runs
on a PC, which is connected to the camera through a National Instruments data acquisition
(DAQ) card. The camera and software were evaluated by conducting a preliminary study
using a porcine model, followed by a small prospective clinical trial on 50 patients (Figure
D.1b). This appendix summarizes the implementation of the software package.



Figure D.1: (a) Gamma camera prototype. (b) Use of the camera for sentinel lymph node biopsy in a melanoma patient.

D.2 Hardware

The gamma camera prototype consisted of a parallel-hole collimator coupled to a pixelated
NaI(Tl) scintillation crystal array, itself coupled to a flat-panel, multi-anode Hamamatsu
H8500 position-sensitive photomultiplier tube (PSPMT) [142]. The collimator was 5 × 5 cm²
in area and 1.5 cm thick, with 1.3 mm hexagonal holes and 0.2 mm septa (15 cps/µCi).
The crystal array had a 1.7 mm pitch and was composed of 29 × 29 individual crystals, each
1.5 × 1.5 × 6 mm³ in size. The PSPMT was read out using a symmetric charge division
circuit [143]. Approximately 3 mm of lead shielding was wrapped around the collimator and
the scintillation crystal.

D.3 Software

The software architecture is based on a state machine. The principal states are summarized
in Figure D.2 and are described in more detail in the following sections.

The gamma camera software was developed in C++ and runs on a Linux OS. Real-time
visualization was achieved using the SDL media library and OpenGL. The camera was
interfaced through the DAQmx Base C API (National Instruments). When an interaction


Figure D.2: Schematics of the seven main states in the gamma camera software. The red arrows indicate the entry points for the software. The circling arrow means that the application stays in that state until some condition is met.

occurs in a scintillation crystal, four signals are produced by the readout electronics. When
the trigger signal (created by summing and delaying the four channels) crosses a certain
threshold, these four voltages are digitized by the DAQ card and stored in a buffer on the
card, in list-mode. Periodically, the application reads out the events from the buffer for
processing. The event energy and the 2-D position of the centroid of the light distribution
are obtained by Anger logic.

Using command line parameters, the application can start at three possible states (red
arrows in Figure D.2). The camera can be started in calibration mode, in which case a
flood source of 99mTc should be placed on the camera. The calibration file, containing the
crystal segmentation map and camera parameters, can be saved to a file and loaded later
for performance analysis (in red). The application can also be started to perform real-time
imaging (in red) directly, in which case it will load a previously saved calibration file.

D.3.1 Initialization

A flood source, made of a plastic container filled with a solution of 99mTc, must be placed
on the camera before the calibration sequence is initiated. Calibration should be performed
monthly to monitor the camera performance and correct for potential drift in its parameters.

Once the DAQ API is initialized, the trigger signal is defined, as well as the size of the
data transfers from the buffer. The data acquisition then starts. At first, a few hundred
thousand events are acquired to estimate the spread of the channel values and maximize
the dynamic range of the flood histogram. The graphical viewport is also initialized. Once
these tasks have been accomplished, the program jumps to the next state.

D.3.2 Flood Acquisition

The flood acquisition sequence is executed in a loop. In this sequence, the DAQ buffer is
polled for new events. If a sufficient number of events are present, they are transferred to


Figure D.3: (a) Flood image obtained while calibrating the gamma camera. The 29 × 29 peaks correspond to discrete crystals in the scintillation array. (b) Histogram for the individual spatial channels.

the RAM for further processing. Each event consists of the four digitized voltages. Anger
logic is applied to extract the energy and 2-D spatial coordinates of the event. The events
are then binned into a 3-D histogram (two spatial dimensions and one energy dimension).
During the execution, the flood image is displayed on the screen (Figure D.3a) for monitoring
purposes. The pulse-height histograms for each of the four acquisition channels can also be
displayed at any time by pressing D (Figure D.3b). The acquisition terminates when the
RETURN key is pressed.

D.3.3 Automatic Peak Finding

Once the calibration acquisition has been completed, the software must segment the 2-D
flood histogram into discrete crystal cells. This requires that the peaks be identified (Figure
D.4a). An automatic peak finder provides a rough estimate of the peak locations. The
peak locations are then refined manually by the user. Automatic peak finding is achieved
by applying a low-pass filter and a lower threshold to the flood histogram and by finding
the local maxima. The width of the filtering kernel and the threshold can be adjusted
interactively. Increasing the amount of smoothing or the threshold reduces the number of
peaks found. Once a satisfactory estimate is attained, the user can press the RETURN key
to move to the next state.


Figure D.4: (a) Peak location, indicated by a red triangle. (b) Peak index after automated 2-D sorting.

D.3.4 Peak Manual Adjustment

The location of the peaks can be adjusted manually to compensate for errors produced by
the automatic peak finding method. An incorrect peak can be deleted by right-clicking on
its location. A peak can be added anywhere by left-clicking. The user can also load the peak
positions from a recent calibration file by pressing L. Once the number of peaks is equal
to the number of crystals in the camera, the user can press the RETURN key to validate
the peak positions.

D.3.5 Automatic Peak Sorting

Once the 841 (29 × 29) peaks have been identified, they must be sorted into a 2-D grid so
that a mapping can be established between the 2-D crystal array and the peaks in the flood
histogram. An example of a sorted peak list is shown in Figure D.4b.

D.3.6 Crystal Segmentation and Energy Gating

The crystal segmentation map is created from the locations of the peaks. The events recorded
by the camera are assigned to the nearest peak in the 2-D flood histogram. To accelerate
the search for the nearest neighbor, a segmentation map is pre-calculated during calibration
(Figure D.5a). Each bin in the 2-D flood histogram is mapped to a crystal in the array.


Figure D.5: (a) Crystal segmentation map (color) and Voronoi graph (black lines). (b) Per-crystal energy resolution. (c) Energy spectrum for a crystal with good energy resolution (13.8%) (center of the FOV). (d) Degraded energy resolution (75.5%) for an edge crystal.


Figure D.6: (a) Per-crystal photopeak. (b) Per-crystal efficiency factor.

For this purpose, the Delaunay triangulation and the Voronoi graph are computed using
the OpenCV library. The Voronoi graph is the dual lattice of the Delaunay triangulation
and provides a tessellation of the flood histogram such that all the bins comprised within a
Voronoi cell are closest to the same peak.

Following crystal segmentation, an energy gate is determined for each individual crystal.
The metastable isotope 99mTc used with the gamma camera decays with a half-life of 6
hours by emitting a single 140 keV gamma ray. Therefore, events with energy significantly
different from 140 keV are either scattered or background radiation. These events can be
eliminated by energy gating. The energy resolution of a radiation detector is defined as the
deviation (FWHM) of the measured energy, expressed as a fraction of the true energy
deposition. For the gamma camera, the energy resolution is measured for each crystal
element by finding the photopeak location and measuring the FWHM of the 140 keV peak
in the energy spectrum. The energy resolution ranges between 12% (near the center of the
FOV) and 90% (at the edge). For edge crystals, the light distribution is compressed and
therefore the energy determination is inaccurate (Figure D.5d) compared to the center
crystals (Figure D.5c). The energy gate is determined for each crystal as [P(1 − Er),
P(1 + Er)], where P is the photopeak and Er the energy resolution.


D.3.7 Camera Performance Analysis

The parameters of the camera calibration can be analyzed interactively. Clicking on a cell
shows the energy histogram for that crystal (Figures D.5c and D.5d). The photopeak value
can be displayed for each crystal (Figure D.6a), as well as the crystal efficiency, i.e., the
number of counts recorded in each crystal during the flood source calibration (Figure D.6b).
If the performance of the calibration is found unsatisfactory, it is possible to repeat the
previous steps (Figure D.2).

After the calibration has been completed, a calibration file is automatically created for
subsequent imaging sessions.

D.3.8 Real-Time Imaging

The real-time imaging state is designed for clinical imaging procedures. In this state, the
software alternately fetches new events from the DAQ buffer and displays the resulting
image frame in the graphical viewport. Imaging can be performed either in accumulation
or dynamic mode.

D.3.8.1 Accumulation Mode

A data frame consists of a 29 × 29 array of pixels that represents the flux of radiation hitting
the crystal array. Each event read out from the DAQ is first assigned to a crystal according
to the crystal segmentation map. If the event is within the energy gate for that particular
crystal, then it is added to the data frame.

The data frame is corrected for non-uniform efficiency (Figure D.6b). For optimal lymph
node detectability, the data frame undergoes several processing steps, including square-root
compression and bilinear interpolation [142].

When the camera is moved, the accumulation frame must be cleared. In the current
software, this is achieved by pressing C or by pressing a foot pedal (Figure D.1).

Figure D.7 shows images taken with the gamma camera from a melanoma patient case.
The first image shows the three injections. A cluster of three SLNs was then imaged intra-
operatively. Two of these SLNs were imaged later, after their excision. After all SLNs were
removed, an image survey confirmed that no SLNs were left in the patient.

D.3.8.2 Dynamic Mode

In dynamic mode, imaging can be performed while the camera is being moved. Because
of the very low statistics, binning the counts into independent time frames does not provide


Figure D.7: Example of real-time imaging for an SLN biopsy in a melanoma case. (a) Image of the injection site. (b) Cluster of three SLNs, imaged in vivo. (c) Two SLNs after removal, imaged ex vivo. (d) Background activity image, after the removal of all the SLNs. The color scale is adaptively adjusted to maximize the imaging dynamic range.


sufficient image quality. Instead, all the events are combined using motion compensation.
The motion between two consecutive data frames is computed using the Lucas-Kanade
optical flow method [144]. The past frame is then motion-corrected and combined with the
new frame. Poisson noise is not taken into consideration; however, the technique has proved
to be quite robust.

The optical flow information can also be used to trigger the clearing of the accumulation
frame in accumulation mode. In the clinical investigation, we instead relied on a manual
clearing signal (foot pedal).

D.4 User's Commands

The gamma camera software can be run in three different modes:

Command Mode

grc c Calibration

grc a Performance analysis

grc i Real-time imaging

The following keystrokes are available when using the application:

Key Function

RETURN Go to the next state

BACKSPACE Go to the previous state

[1 0] Toggle display

F Toggle full screen

D Display channel histogram

P Screen shot

S Save calibration le

L Load calibration le

ESC Exit application

C Clear accumulation frame

SPACE Freeze frame

M Capture movie (obsolete)

K Performs K-means clustering

Appendix E

Analysis of Reconstructed Sphere

Size

In a linear, spatially-invariant system, the spatial resolution can be fully characterized by
the point-spread function (PSF). For a linear shift-varying system, the spatial resolution
can also be studied by looking at the local PSF. However, for a non-linear system (i.e.,
one that does not satisfy the principle of superposition), the spatial resolution is affected by
the distribution of the tracer.

The EM and OSEM reconstruction methods (see Chapter 3) are both non-linear estimators
due to the non-linear update rule. As a result, the PSF cannot be defined because
the superposition principle does not apply. In this work, we looked at the reconstructed
sphere size as a surrogate for spatial resolution. This appendix provides more details on the
interpretation of such measurements.

In the following toy problem, a 1.75 mm-diameter sphere was blurred with a 3-D Gaussian
kernel (Figure E.1). A Gaussian function was then fit to a profile through the blurred sphere,
and its standard deviation measured. The width of the Gaussian kernel was varied (Figure E.2).

For the 1 mm FWHM Gaussian kernel, the blurred sphere was measured to be 1.47
mm FWHM, smaller than the original 1.75 mm-diameter sphere. For a 1.5 mm kernel, the
blurred sphere was measured to be 1.80 mm FWHM. Therefore, the FWHM size of a blurred
sphere can be larger or smaller than the original sphere diameter. Hence, the FWHM size
of a blurred sphere should not be compared to the original sphere diameter. The sphere
blurred with a 1.5 mm FWHM kernel has a FWHM size closer to the original 1.75 mm
sphere, despite more aggressive blur.

More generally, the blurred sphere FWHM size is a monotonically increasing function


Figure E.1: (a) A 1.75 mm-diameter sphere (left) was blurred by a 3-D, 1 mm FWHM Gaussian kernel (middle), resulting in a blurred sphere (right). (b) One-dimensional profiles through the above images. The blurred sphere is narrower at FWHM than the original 1.75 mm one.


Figure E.2: FWHM size of the blurred (or reconstructed) sphere as a function of the blurring kernel FWHM.

of the blurring kernel width. Therefore, lower values indicate lower blur. As a result, if we
assume that imaging followed by reconstruction of a 3-D object is equivalent to applying a
Gaussian blurring kernel, then the FWHM size of reconstructed spheres can be related to
the width of the equivalent blurring kernel (Figure E.2).

Appendix F

Glossary of Terms

• APD: Avalanche photo-diode. Sensitive semiconductor light detector

• Back-projection: y → A^T y, where A is the system matrix

• Blending: In graphics, combining fragments with an existing frame buer by adding

them on a pixel-by-pixel basis, using the alpha channel as a weight

• CDRF: Coincident detector response function

• CG: Conjugate gradient

• CNR: Contrast over noise ratio

• Coincidence: Event selection consisting of the detection of two high-energy photons

within some selected time interval

• CPU: central processing unit, or processor.

• CR (contrast recovery): contrast measured in the reconstructed image, expressed as

a percentage of the original activity concentration ratio

• CZT: Cadmium Zinc Telluride: Semiconductor material used for radiation detection

• DOI: Depth of interaction

• FBO: Frame-buer object

• FDG: 2-[18F]fluoro-2-deoxy-D-glucose. A common tracer used in PET as a marker
for glucose utilization.


• Forward-projection: x→ Ax, where A is the system matrix.

• FOV: Field of view

• Fragment: In graphics, all the data necessary to generate a pixel in the frame buer.

• FWHM: Full-width at half-maximum

• GPU (graphics processing unit): The main processor on a computer graphics
card. It is a specialized processor, optimized for geometrical computing. In a rendering
task, the role of the GPU includes calculating lighting effects, object transformations,
texture mapping, and rastering.

• IDRF: Intrinsic detector response function

• LOR (line of response): In PET, a line of response joins two detection elements

and can measure annihilation photon pairs

• List-mode: Acquisition mode in which the events are stored individually, in the order they are measured. Other acquisition modes include sinogram and histogram mode.

• List-mode OSEM: A variant of OSEM that processes directly individual events

rather than a sinogram

• MAP: Maximum a posteriori

• ML: Maximum likelihood

• MIPE: Multiple-interaction photon event

• OSEM (ordered-subset expectation-maximization): A popular iterative
tomographic image reconstruction algorithm that is used in PET and SPECT

• Parallax error: Loss of spatial resolution caused by the obliquity of 511 keV photons

entering a detector element

• PCG: Preconditioned conjugate gradient

• PET: Positron emission tomography

• PMT: Photomultiplier tube. Sensitive light detector

• Positron: An elementary particle with positive charge; interaction of a positron and

an electron results in annihilation, yielding two oppositely-directed 511 keV photons


• PSF: Point-spread function

• Random coincidence: Background coincidence event formed from two photons
originating from different positron decays

• Rastering: Operation that converts a vectorial primitive (line, triangle, quadrangle)

into a set of fragments

• Reconstructed image: A 3-D array, where each cell is an estimate of the tracer

distribution in the patient

• Shader (fragment / vertex): A program that runs on the GPU. The vertex and
fragment shaders are applied to all the vertices processed, and all the fragments
produced, respectively

• Sinogram: Data structure in which the measurements are ordered by projection angle and radial distance

• Scattered coincidence: Background coincidence event in which one or both photons have scattered one or more times before being detected

• SNR: Signal to noise ratio

• System matrix: Matrix that models the linear relationship between the tracer spatial

distribution and the data measured in PET

• Texture: In computer graphics, a 2D rectangular color image

• Texture mapping: In computer graphics, operation that applies a texture onto a

polygon of arbitrary shape

• TOF: Time of flight

• Vertex: In geometry, a corner point of a polygon

• WLS: Weighted least-squares

• X-Ray CT: X-Ray Computed Tomography

Bibliography

[1] R. Weissleder and U. Mahmood, "Molecular imaging," Radiology, vol. 219, no. 2,
pp. 316–333, 2001.

[2] T. Massoud and S. Gambhir, "Molecular imaging in living subjects: Seeing fundamental
biological processes in a new light," Genes Dev., vol. 17, pp. 45–80, Mar 2003.

[3] P. Som, H. L. Atkins, D. Bandoypadhyay, J. S. Fowler, R. R. MacGregor, K. Matsui,
Z. H. Oster, D. F. Sacker, C. Y. Shiue, H. Turner, C.-N. Wan, A. P. Wolf, and
S. V. Zabinski, "A fluorinated glucose analog, 2-fluoro-2-deoxy-D-glucose (F-18):
Nontoxic tracer for rapid tumor detection," J Nucl Med, vol. 21, no. 7, pp. 670–675,
1980.

[4] S. Gambhir, "Molecular imaging of cancer with positron emission tomography," Nat
Rev Cancer, vol. 2, pp. 683–693, Sep. 2002.

[5] M. Phelps, E. Hoffman, C. Selin, S. Huang, G. Robinson, N. MacDonald, H. Schelbert,
and D. E. Kuhl, "Investigation of 18F-2-fluoro-2-deoxy-glucose for the measure of
myocardial glucose metabolism," J Nucl Med, vol. 19, pp. 1311–1319, 1978.

[6] M. Reivich, D. Kuhl, A. Wolf, J. Greenberg, M. Phelps, T. Ido, V. Casella, J. Fowler,
E. Hoffman, A. Alavi, P. Som, and L. Sokoloff, "The 18F-fluorodeoxyglucose method
for the measurement of local cerebral glucose utilization in man," Circ Res, vol. 44,
pp. 127–137, 1979.

[7] R. Etzioni, N. Urban, S. Ramsey, M. McIntosh, S. Schwartz, B. Reid, J. Radich,
G. Anderson, and L. Hartwell, "The case for early detection," Nat Rev Cancer, vol. 3,
pp. 243–252, Apr 2003.

[8] R. Weissleder, "Molecular imaging in cancer," Science, vol. 312, no. 5777, pp. 1168–1171,
2006.


[9] M. Rudin and R. Weissleder, "Molecular imaging in drug discovery and development,"
Nature Reviews Drug Discovery, vol. 2, pp. 123–131, Feb 2003.

[10] G. D. Rabinovici, A. J. Furst, J. P. O'Neil, C. A. Racine, E. C. Mormino, S. L.
Baker, S. Chetty, P. Patel, T. A. Pagliaro, W. E. Klunk, C. A. Mathis, H. J. Rosen,
B. L. Miller, and W. J. Jagust, "11C-PIB PET imaging in Alzheimer disease and
frontotemporal lobar degeneration," Neurology, vol. 68, no. 15, pp. 1205–1212, 2007.

[11] R. Weissleder and V. Ntziachristos, "Shedding light onto live molecular targets," Nature
Medicine, vol. 9, pp. 123–128, 2003.

[12] T. Ido, C. Wan, and V. Casella, "Labelled 2-deoxy-D-glucose analogs. 18F-labeled
2-deoxy-2-fluoro-D-glucose, 2-deoxy-2-fluoro-D-mannose and 14C-2-deoxy-2-fluoro-D-
glucose," J Labell Comp Radiopharm, vol. 14, pp. 175–183, 1978.

[13] T. Ido, C. Wan, J. Fowler, and A. Wolf, "Fluorination with molecular fluorine. A
convenient synthesis of 2-deoxy-2-fluoro-D-glucose," The Journal of Organic Chemistry,
vol. 42, no. 13, pp. 2341–2342, 1977.

[14] E. Bustamante and P. L. Pedersen, "High aerobic glycolysis of rat hepatoma cells in
culture: Role of mitochondrial hexokinase," PNAS, vol. 74, no. 9, pp. 3735–3739, 1977.

[15] J. K. Moran, H. B. Lee, and M. D. Blaufox, "Optimization of urinary FDG excretion
during PET imaging," J Nucl Med, vol. 40, no. 8, pp. 1352–1357, 1999.

[16] C. S. Levin, "New imaging technologies to enhance the molecular sensitivity of positron
emission tomography," Proceedings of the IEEE, vol. 96, no. 3, pp. 439–467, 2008.

[17] J. Zhang, A. Foudray, P. Olcott, R. Farrell, K. Shah, and C. Levin, "Performance
characterization of a novel thin position-sensitive avalanche photodiode for 1 mm
resolution positron emission tomography," IEEE Trans. Nucl. Sci., vol. 54, pp. 415–421,
June 2007.

[18] A. Foudray, F. Habte, C. Levin, and P. Olcott, "Positioning annihilation photon
interactions in a thin LSO crystal sheet with a position-sensitive avalanche photodiode,"
IEEE Trans. Nucl. Sci., vol. 53, pp. 2549–2556, Oct. 2006.

[19] A. F. Chatziioannou, S. R. Cherry, Y. P. Shao, R. W. Silverman, K. Meadors, T. H.
Farquhar, M. P. and M. E. Phelps, "Performance evaluation of microPET: A
high-resolution lutetium oxyorthosilicate PET scanner for animal imaging," J. Nucl.
Med., vol. 40, pp. 1164–1175, Jul 1999.

[20] A. F. Chatziioannou, Y. C. Tai, N. Doshi, and S. R. Cherry, Detector development for microPET II: a 1 µl resolution PET scanner for small animal imaging, Phys. Med. Bio., vol. 46, pp. 2899–2910, Nov 2001.

[21] R. Weissleder, Scaling down imaging: molecular mapping of cancer in mice, Nat Rev Cancer, vol. 2, pp. 11–18, Jan 2002.

[22] L. A. Green, C. S. Yap, K. Nguyen, J. R. Barrio, M. Namavari, N. Satyamurthy, M. E. Phelps, E. P. Sandgren, H. R. Herschman, and S. S. Gambhir, Indirect monitoring of endogenous gene expression by positron emission tomography (PET) imaging of reporter gene expression in transgenic mice, Mol Imag Bio, vol. 4, no. 1, pp. 71–81, 2002.

[23] M. Bergeron, J. Cadorette, J. F. Beaudoin, J. A. Rousseau, M. Dumoulin, M. D. Lepage, G. Robert, V. Selivanov, M. A. Tetrault, N. Viscogliosi, T. Dumouchel, S. Thorn, J. DaSilva, R. A. deKemp, J. Norenberg, R. Fontaine, and R. Lecomte, Performance evaluation of the LabPET APD-based digital PET scanner, IEEE Nuclear Science Symposium Conference Record, vol. 6, pp. 4185–4191, November 2007.

[24] A. L. Goertzen, A. K. Meadors, R. W. Silverman, and S. R. Cherry, Simultaneous molecular and anatomical imaging of the mouse in vivo, Phys. Med. Bio., vol. 47, pp. 4315–4328, Dec 2002.

[25] A. L. Goertzen, V. Nagarkar, R. A. Street, M. J. Paulus, J. M. Boone, and S. R. Cherry, A comparison of x-ray detectors for mouse CT imaging, Phys. Med. Bio., vol. 49, pp. 5251–5265, Dec 2004.

[26] J. Joung, R. S. Miyaoka, and T. K. Lewellen, cMiCE: a high resolution animal PET using continuous LSO with a statistics based positioning scheme, Nucl. Instr. Meth. Phys. Res., vol. 489, pp. 584–598, Aug 2002.

[27] B. J. Pichler, B. K. Swann, J. Rochelle, R. E. Nutt, S. R. Cherry, and S. B. Siegel, Lutetium oxyorthosilicate block detector readout by avalanche photodiode arrays for high resolution animal PET, Phys. Med. Bio., vol. 49, pp. 4305–4319, Sep 2004.

[28] Y. C. Tai, A. Chatziioannou, S. Siegel, J. Young, D. Newport, R. N. Goble, R. E. Nutt, and S. R. Cherry, Performance evaluation of the microPET P4: a PET system dedicated to animal imaging, Physics in Medicine and Biology, vol. 46, no. 7, pp. 1845–1862, 2001.

[29] Y. Wang, J. Seidel, B. M. W. Tsui, J. J. Vaquero, and M. G. Pomper, Performance evaluation of the GE Healthcare eXplore Vista dual-ring small-animal PET scanner, J Nucl. Med., vol. 47, pp. 1891–1900, 2006.

[30] Y. F. Yang, Y. C. Tai, S. Siegel, D. F. Newport, B. Bai, Q. Z. Li, R. M. Leahy, and S. R. Cherry, Optimization and performance evaluation of the microPET II scanner for in vivo small-animal imaging, Phys. Med. Bio., vol. 49, pp. 2527–2545, Jun 2004.

[31] T. E. Schlesinger, J. E. Toney, H. Yoon, E. Y. Lee, B. A. Brunett, L. Franks, and R. B. James, Cadmium Zinc Telluride and its use as a nuclear radiation detector material, Mat Sci Eng: Reports, vol. 32, no. 4-5, pp. 103–189, 2001.

[32] F. Habte, A. M. K. Foudray, P. D. Olcott, and C. S. Levin, Effects of system geometry and other physical factors on photon sensitivity of high-resolution positron emission tomography, Phys. Med. Bio., vol. 52, pp. 3753–3772, 2007.

[33] N. K. Doshi, Y. Shao, R. W. Silverman, and S. R. Cherry, Design and evaluation of an LSO PET detector for breast cancer imaging, Med. Phys., vol. 27, no. 7, pp. 1535–1543, 2000.

[34] V. H. Tran, R. W. Silverman, A. L. Goertzen, and S. R. Cherry, Design and initial performance of a compact rotating PET scanner for tomographic breast imaging, J. Nucl. Med., 2003.

[35] C. S. Levin, F. Habte, A. M. K. Foudray, J. Zhang, and G. Chinn, Impact of high energy resolution detectors on the performance of a PET system dedicated to breast cancer imaging, Physica Med., vol. 21, pp. 28–34, 2007.

[36] Y. Wu, S. L. Bowen, K. Yang, N. Packard, L. Fu, G. B. Jr, J. Qi, J. M. Boone, S. R. Cherry, and R. D. Badawi, PET characteristics of a dedicated breast PET/CT scanner prototype, Phys Med Bio, vol. 54, no. 13, pp. 4273–4287, 2009.

[37] D. Brasse, P. E. Kinahan, R. Clackdoyle, M. Defrise, C. Comtat, and D. Townsend, Fast fully 3-D image reconstruction in PET using planograms, IEEE Trans Med Imag, vol. 23, no. 4, pp. 413–425, 2004.

[38] A. Rahmim, J. C. Cheng, S. Blinder, M. L. Camborde, and V. Sossi, Statistical dynamic image reconstruction in state-of-the-art high-resolution PET, Phys. Med. Bio., vol. 50, pp. 4887–4912, Oct 2005.

[39] H. Hudson and R. Larkin, Accelerated image reconstruction using ordered subsets of projection data, IEEE Trans Med Imag, vol. 13, pp. 601–609, Dec 1994.

[40] L. A. Shepp and Y. Vardi, Maximum likelihood reconstruction for emission tomography, IEEE Trans Med Imag, vol. 2, pp. 113–122, 1982.

[41] R. Bracewell and A. Riddle, Inversion of fan beam scans in radio astronomy, Astrophys. J., vol. 150, pp. 427–434, 1967.

[42] J. Qi, R. M. Leahy, S. R. Cherry, A. Chatziioannou, and T. H. Farquhar, High-resolution 3D Bayesian image reconstruction using the microPET small-animal scanner, Phys. Med. Bio., vol. 43, pp. 1001–1013, Jul 1998.

[43] J. L. Herraiz, S. Espana, J. J. Vaquero, M. Desco, and J. M. Udias, FIRST: Fast iterative reconstruction software for (PET) tomography, Phys. Med. Bio., vol. 51, pp. 4547–4565, Sep 2006.

[44] M. Defrise, P. Kinahan, D. Townsend, C. Michel, M. Sibomana, and D. F. Newport, Exact and approximate rebinning algorithms for 3-D PET data, IEEE Trans Med Imag, vol. 16, pp. 145–158, Apr 1997.

[45] X. Liu, C. Comtat, C. Michel, P. E. Kinahan, M. Defrise, and D. Townsend, Comparison of 3-D reconstruction with 3D-OSEM, and with FORE+OSEM for PET, IEEE Trans Med Imag, vol. 20, pp. 804–814, Aug 2001.

[46] F. C. Sureau, A. J. Reader, C. Comtat, C. Leroy, M.-J. Ribeiro, I. Buvat, and R. Trebossen, Impact of image-space resolution modeling for studies with the high-resolution research tomograph, J Nucl Med, vol. 49, no. 6, pp. 1000–1008, 2008.

[47] A. Alessio, P. Kinahan, and T. Lewellen, Modeling and incorporation of system response functions in 3-D whole body PET, IEEE Trans Med Imag, vol. 25, pp. 828–837, July 2006.

[48] V. Y. Panin, F. Kehren, C. Michel, and M. E. Casey, Fully 3D PET reconstruction with system matrix derived from point source measurements, IEEE Trans Med Imag, vol. 25, no. 7, pp. 907–921, 2006.

[49] G. Pratx and C. S. Levin, Bayesian reconstruction of photon interaction sequences for high-resolution PET detectors, Phys. Med. Bio., vol. 54, pp. 5073–5094, 2009.

[50] C. S. Levin and E. J. Hoffman, Calculation of positron range and its effect on the fundamental limit of positron emission tomography system spatial resolution, Phys Med Bio, vol. 44, no. 3, pp. 781–799, 1999.

[51] S. DeBenedetti, C. E. Cowan, W. R. Konneker, and H. Primakoff, On the angular distribution of two-photon annihilation radiation, Phys. Rev., vol. 77, pp. 205–212, Jan 1950.

[52] P. A. Dokhale, R. W. Silverman, K. S. Shah, R. Grazioso, R. Farrell, J. Glodo, M. A. McClish, G. Entine, V.-H. Tran, and S. R. Cherry, Performance measurements of a depth-encoding PET detector module based on position-sensitive avalanche photodiode read-out, Phys Med Bio, vol. 49, no. 18, pp. 4293–4304, 2004.

[53] V. Spanoudaki, Development and performance studies of a small animal positron emission tomograph with individual crystal readout and depth of interaction information. Dissertation, Technische Universität München, 2008.

[54] C. Levin, M. Dahlbom, and E. Hoffman, A Monte Carlo correction for the effect of Compton scattering in 3-D PET brain imaging, IEEE Trans Nucl Sci, vol. 42, pp. 1181–1185, Aug 1995.

[55] C. Watson, New, faster, image-based scatter correction for 3-D PET, IEEE Trans Nucl Sci, vol. 47, pp. 1587–1594, Aug 2000.

[56] J. Radon, On the determination of functions from their integral values along certain manifolds, IEEE Trans Med Imag, vol. 5, pp. 170–176, Dec 1986.

[57] H. H. Barrett and K. J. Myers, Foundations of Image Science. Wiley-Interscience, 2003.

[58] M. E. Casey and E. J. Hoffman, A technique to reduce noise in accidental coincidence measurements and coincidence efficiency calibration, J CAT, vol. 10, no. 6, pp. 845–850, 1986.

[59] E. Hoffman, T. Guerrero, G. Germano, W. Digby, and M. Dahlbom, PET system calibrations and corrections for quantitative and spatially accurate images, IEEE Trans Nucl Sci, vol. 36, pp. 1108–1112, Feb 1989.

[60] J. M. Ollinger, Detector efficiency and Compton scatter in fully 3D PET, IEEE Trans Nucl Sci, vol. 42, no. 4, 1995.

[61] P. E. Kinahan, D. W. Townsend, T. Beyer, and D. Sashin, Attenuation correction for a combined 3D PET/CT scanner, Medical Physics, vol. 25, no. 10, pp. 2046–2053, 1998.

[62] R. E. Carson, C. Barker, J. S. Liow, and C. A. Johnson, Design of a motion-compensation OSEM list-mode algorithm for resolution-recovery reconstruction for the HRRT, IEEE Nuclear Science Symposium and Medical Imaging Conference Record, 2004.

[63] M. Rafecas, B. Mosler, M. Dietz, M. Pogl, A. Stamatakis, D. McElroy, and S. Ziegler, Use of a Monte Carlo-based probability matrix for 3-D iterative reconstruction of MADPET-II data, IEEE Trans Nucl Sci, vol. 51, pp. 2597–2605, Oct. 2004.

[64] A. Rahmim, J. Tang, M. A. Lodge, S. Lashkari, M. R. Ay, R. Lautamaki, B. M. W. Tsui, and F. M. Bengel, Analytic system matrix resolution modeling in PET: an application to Rb-82 cardiac imaging, Phys. Med. Bio., vol. 53, no. 21, pp. 5947–5965, 2008.

[65] J. A. Sorenson and M. E. Phelps, Physics in nuclear medicine. Grune & Stratton, New York, 1980.

[66] R. Lecomte, D. Schmitt, and G. Lamoureux, Geometry study of a high resolution PET detection system using small detectors, IEEE Trans Nucl Sci, vol. 31, pp. 556–561, Feb. 1984.

[67] D. Schmitt, B. Karuta, C. Carrier, and R. Lecomte, Fast point spread function computation from aperture functions in high-resolution positron emission tomography, IEEE Trans Med Imag, vol. 7, pp. 2–12, Mar 1988.

[68] V. Selivanov, Y. Picard, J. Cadorette, S. Rodrigue, and R. Lecomte, Detector response models for statistical iterative image reconstruction in high resolution PET, IEEE Trans Nucl Sci, vol. 47, pp. 1168–1175, Jun 2000.

[69] G. Brix, J. Zaers, L.-E. Adam, M. E. Bellemann, H. Ostertag, H. Trojan, U. Haberkorn, J. Doll, F. Oberdorfer, and W. Lorenz, Performance evaluation of a whole-body PET scanner using the NEMA protocol, J Nucl Med, vol. 38, no. 10, pp. 1614–1623, 1997.

[70] S. Jan, G. Santin, D. Strul, S. Staelens, K. Assie, D. Autret, S. Avner, R. Barbier, M. Bardies, P. M. Bloomfield, D. Brasse, V. Breton, P. Bruyndonckx, I. Buvat, A. F. Chatziioannou, Y. Choi, Y. H. Chung, C. Comtat, D. Donnarieix, L. Ferrer, S. J. Glick, C. J. Groiselle, D. Guez, P. F. Honore, S. Kerhoas-Cavata, A. S. Kirov, V. Kohli, M. Koole, M. Krieguer, D. J. van der Laan, F. Lamare, G. Largeron, C. Lartizien, D. Lazaro, M. C. Maas, L. Maigne, F. Mayet, F. Melot, C. Merheb, E. Pennacchio, J. Perez, U. Pietrzyk, F. R. Rannou, M. Rey, D. R. Schaart, C. R. Schmidtlein, L. Simon, T. Y. Song, J. M. Vieira, D. Visvikis, R. V. de Walle, E. Wieers, and C. Morel, GATE: a simulation toolkit for PET and SPECT, Phys. Med. Bio., vol. 49, pp. 4543–4561, Oct 2004.

[71] S. Orlov, Theory of three dimensional reconstruction. I. Conditions for a complete set of projections, Sov. Phys. Crystallography, vol. 20, pp. 312–314, 1976.

[72] S. Orlov, Theory of three dimensional reconstruction. II. The recovery operator, Sov. Phys. Crystallography, vol. 20, pp. 429–433, 1976.

[73] J. G. Colsher, Fully-three-dimensional positron emission tomography, Phys Med Bio, vol. 25, no. 1, pp. 103–115, 1980.

[74] J. G. Rogers, R. Harrop, and P. E. Kinahan, The theory of three-dimensional image reconstruction for PET, IEEE Trans Med Imag, vol. 6, pp. 239–243, Sept. 1987.

[75] P. Kinahan and J. Rogers, Analytic 3D image reconstruction using all detected events, IEEE Trans Nucl Sci, vol. 36, pp. 964–968, Feb 1989.

[76] M. E. Daube-Witherspoon and G. Muehllehner, Treatment of axial data in three-dimensional PET, J Nucl Med, vol. 28, no. 11, pp. 1717–1724, 1987.

[77] R. M. Lewitt, G. Muehllehner, and J. S. Karp, Three-dimensional image reconstruction for PET by multi-slice rebinning and axial image filtering, Phys Med Bio, vol. 39, no. 3, pp. 321–339, 1994.

[78] R. M. Leahy and J. Qi, Statistical approaches in quantitative positron emission tomography, Statistics and Computing, vol. 10, pp. 147–165, Apr 2000.

[79] A. M. Alessio and P. E. Kinahan, Improved quantitation for PET/CT image reconstruction with system modeling and anatomical priors, Med. Phys., vol. 33, no. 11, pp. 4095–4103, 2006.

[80] L. Kaufman, Maximum likelihood, least squares, and penalized least squares for PET, IEEE Trans Med Imag, vol. 12, pp. 200–214, Jun 1993.

[81] J. Qi, Calculation of the sensitivity image in list-mode reconstruction for PET, IEEE Trans Med Imag, vol. 53, pp. 2746–2751, 2006.

[82] E. Mumcuoglu, R. Leahy, and S. Cherry, Bayesian reconstruction of PET images: Methodology and performance analysis, Phys. Med. Bio., vol. 41, pp. 1777–1807, 1996.

[83] R. H. Huesman, List-mode maximum-likelihood reconstruction applied to positron emission mammography (PEM) with irregular sampling, IEEE Trans Med Imag, vol. 19, pp. 532–537, 2000.

[84] A. J. Reader, K. Erlandsson, M. A. Flower, and R. J. Ott, Fast accurate iterative reconstruction for low-statistics positron volume imaging, Phys. Med. Bio., vol. 43, pp. 1001–1013, Jul 1998.

[85] A. J. Reader, S. Ally, F. Bakatselos, R. Manavaki, R. J. Walledge, A. P. Jeavons, P. J. Julyan, S. Zhao, D. L. Hastings, and J. Zweit, One-pass list-mode EM algorithm for high-resolution 3-D PET image reconstruction into large arrays, IEEE Trans Nucl Sci, vol. 49, pp. 693–699, 2002.

[86] L. Parra and H. H. Barrett, List-mode likelihood: EM algorithm and image quality estimation demonstrated on 2-D PET, IEEE Trans Med Imag, vol. 17, pp. 228–235, 1998.

[87] A. Rahmim, M. Lenox, A. Reader, C. Michel, Z. Burbar, T. J. Ruth, and V. Sossi, Statistical list-mode image reconstruction for the high resolution research tomograph, Phys. Med. Bio., vol. 49, pp. 4239–4258, Aug 2004.

[88] T. Budinger and G. Gullberg, Three-dimensional reconstruction in nuclear medicine emission imaging, IEEE Trans Nucl Sci, vol. 21, pp. 2–20, 1974.

[89] R. Fletcher and C. Reeves, Function minimization by conjugate gradients, Computer Journal, vol. 7, pp. 149–154, 1964.

[90] J. R. Shewchuk, An introduction to the conjugate gradient method without the agonizing pain, unpublished paper, Aug 1994.

[91] E. Polak and G. Ribière, Note sur la convergence de méthodes de directions conjuguées, Revue Française d'Informatique et de Recherche Opérationnelle, vol. 16, pp. 35–43, 1969.

[92] J. D. Flores, The conjugate gradient method in the presence of clustered eigenvalues, SIGSMALL/PC Notes, vol. 19, no. 2, pp. 25–29, 1993.

[93] E. Mumcuoglu, R. Leahy, S. Cherry, and Z. Zhou, Fast gradient-based methods for Bayesian reconstruction of transmission and emission PET images, IEEE Trans Med Imag, vol. 13, pp. 687–701, Dec 1994.

[94] G. Chinn and S.-C. Huang, A general class of preconditioners for statistical iterative reconstruction of emission computed tomography, IEEE Trans Med Imag, vol. 16, no. 1, pp. 1–10, Feb 1997.

[95] G. Pratx, A. J. Reader, and C. S. Levin, Faster maximum-likelihood reconstruction via explicit conjugation of search directions, IEEE Nuclear Science Symposium Conference Record, pp. 5070–5075, Oct. 2008.

[96] C. J. Jaskowiak, J. A. Bianco, S. B. Perlman, and J. P. Fine, Influence of reconstruction iterations on 18F-FDG PET/CT standardized uptake values, J Nucl Med, vol. 46, no. 3, pp. 424–428, 2005.

[97] I. Hsiao, P. Khurd, A. Rangarajan, and G. Gindi, An overview of fast convergent ordered-subsets reconstruction methods for emission tomography based on the incremental EM algorithm, Nucl Instr Meth Phys Res, vol. 569, no. 2, pp. 429–433, 2006.

[98] C. A. Johnson, J. Seidel, and A. Sofer, Interior-point methodology for 3-D PET reconstruction, IEEE Trans Med Imag, vol. 19, no. 4, pp. 271–283, Apr 2000.

[99] K. Proudfoot, W. R. Mark, S. Tzvetkov, and P. Hanrahan, A real-time procedural shading system for programmable graphics hardware, Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pp. 159–170, 2001.

[100] J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A. E. Lefohn, and T. J. Purcell, A survey of general-purpose computation on graphics hardware, Computer Graphics Forum, vol. 26, no. 1, pp. 80–113, 2007.

[101] B. Cabral, N. Cam, and J. Foran, Accelerated volume rendering and tomographic reconstruction using texture mapping hardware, Symp. on Volume Visualization, pp. 91–98, 1994.

[102] K. Chidlow and T. Möller, Rapid emission tomography reconstruction, Vol. Graph., pp. 15–26, 2003.

[103] F. Xu and K. Mueller, Accelerating popular tomographic reconstruction algorithms on commodity PC graphics hardware, IEEE Trans Nucl Sci, vol. 52, pp. 654–663, Jun 2005.

[104] J. Kole and F. Beekman, Evaluation of accelerated iterative X-ray CT image reconstruction using floating point graphics hardware, Phys. Med. Bio., vol. 51, pp. 875–889, 2006.

[105] Z. Wang, G. Han, T. Li, and Z. Liang, Speedup OS-EM image reconstruction by PC graphics card technologies for quantitative SPECT with varying focal-length fan-beam collimation, IEEE Trans Nucl Sci, vol. 52, pp. 1274–1280, Oct 2005.

[106] F. Xu and K. Mueller, Real-time 3D computed tomographic reconstruction using commodity graphics hardware, Phys. Med. Bio., vol. 52, pp. 3405–3419, 2007.

[107] G. Pratx, G. Chinn, P. Olcott, and C. Levin, Accurate and shift-varying line projections for iterative reconstruction using the GPU, IEEE Trans Med Imag, vol. 28, pp. 415–422, Mar 2009.

[108] S. Green, The OpenGL framebuffer object extension, Game Developers Conference, 2005.

[109] W. Mark, R. Glanville, K. Akeley, and M. Kilgard, Cg: a system for programming graphics hardware in a C-like language, ACM Trans. Graphics, vol. 22, no. 3, pp. 896–907, 2003.

[110] R. L. Siddon, Fast calculation of the exact radiological path for a three-dimensional CT array, Med. Phys., vol. 12, pp. 252–255, Mar 1985.

[111] J. Nickolls, I. Buck, K. Skadron, and M. Garland, Scalable parallel programming with CUDA, ACM Queue, vol. 6, pp. 40–53, Mar 2008.

[112] G. Chinn, A. M. K. Foudray, and C. S. Levin, Comparing geometries for a PET system with 3-D photon positioning capability, IEEE Nuclear Science Symposium and Medical Imaging Conference Record, 2005.

[113] P. Olcott, S. Buss, C. Levin, G. Pratx, and C. Sramek, GRAY: High energy photon ray tracer for PET applications, IEEE Nuclear Science Symposium Conference Record, pp. 2011–2015, November 2006.

[114] D. Strul, R. B. Slates, M. Dahlbom, S. R. Cherry, and P. K. Marsden, An improved analytical detector response function model for multilayer small-diameter PET scanners, Phys. Med. Bio., vol. 48, no. 8, pp. 979–994, 2003.

[115] S. Surti, A. Kuhn, M. E. Werner, A. E. Perkins, J. Kolthammer, and J. S. Karp, Performance of Philips Gemini TF PET/CT scanner with special consideration for its time-of-flight imaging capabilities, J Nucl Med, vol. 48, no. 3, pp. 471–480, 2007.

[116] N. A. Mullani, J. Markham, and M. M. Ter-Pogossian, Feasibility of time-of-flight reconstruction in positron emission tomography, J Nucl Med, vol. 21, no. 11, pp. 1095–1097, 1980.

[117] S. Surti, S. Karp, L. Popescu, E. Daube-Witherspoon, and M. Werner, Investigation of time-of-flight benefit for fully 3-D PET, IEEE Trans Med Imag, vol. 25, pp. 529–538, May 2006.

[118] M. Defrise, M. E. Casey, C. Michel, and M. Conti, Fourier rebinning of time-of-flight PET data, Phys Med Bio, vol. 50, no. 12, pp. 2749–2763, 2005.

[119] S. Surti and J. S. Karp, Experimental evaluation of a simple lesion detection task with time-of-flight PET, Phys Med Bio, vol. 54, no. 2, pp. 373–384, 2009.

[120] S. Matej and R. Lewitt, Practical considerations for 3-D image reconstruction using spherically symmetric volume elements, IEEE Trans Med Imag, vol. 15, pp. 68–78, Feb 1996.

[121] C. Watson, Extension of single scatter simulation to scatter correction of time-of-flight PET, IEEE Trans Nuc Sci, vol. 54, pp. 1679–1686, Oct. 2007.

[122] C. S. Levin, M. P. Tornai, S. R. Cherry, L. R. MacDonald, and E. J. Hoffman, Compton scatter and X-ray crosstalk and the use of very thin intercrystal septa in high-resolution PET detectors, IEEE Trans. Nucl. Sci., vol. 44, pp. 218–224, Apr 1997.

[123] J. R. Stickel and S. R. Cherry, High-resolution PET detector design: Modelling components of intrinsic spatial resolution, Phys. Med. Bio., vol. 50, no. 2, pp. 179–195, 2005.

[124] K. A. Comanor, P. R. G. Virador, and W. W. Moses, Algorithms to identify detector Compton scatter in PET modules, IEEE Trans. Nucl. Sci., vol. 43, pp. 2213–2218, Aug 1996.

[125] Y. Shao, S. R. Cherry, S. Siegel, and R. W. Silverman, A study of inter-crystal scatter in small scintillator arrays designed for high resolution PET imaging, IEEE Trans. Nucl. Sci., vol. 43, no. 3, pp. 1938–1944, 1996.

[126] C. Lehner, Z. He, and F. Zhang, 4π Compton imaging using a 3-D position-sensitive CdZnTe detector via weighted list-mode maximum likelihood, IEEE Trans. Nucl. Sci., vol. 51, pp. 1618–1624, Aug. 2004.

[127] M. Rafecas, G. Böning, B. J. Pichler, E. Lorenz, M. Schwaiger, and S. I. Ziegler, Inter-crystal scatter in a dual layer, high resolution LSO-APD positron emission tomograph, Phys. Med. Bio., vol. 48, pp. 821–848, 2003.

[128] U. G. Oberlack, E. Aprile, A. Curioni, V. Egorov, and K. L. Giboni, Compton scattering sequence reconstruction algorithm for the liquid xenon gamma-ray imaging telescope (LXeGRIT), Proc. SPIE, vol. 4141, pp. 168–177, 2000.

[129] S. Boggs and P. Jean, Event reconstruction in high resolution Compton telescopes, A&A, vol. 145, pp. 311–321, Aug. 2000.

[130] G. J. Schmidt, M. A. Deleplanque, I. Y. Lee, F. S. Stephens, K. Vetter, R. M. Clark, R. M. Diamond, P. Fallon, A. O. Macchiavelli, and R. W. MacLeod, A γ-ray tracking algorithm for the GRETA spectrometer, Nucl. Instrum. Methods Phys. Res., vol. 430, pp. 69–83, Feb. 1999.

[131] O. Klein and T. Nishina, Über die Streuung von Strahlung durch freie Elektronen nach der neuen relativistischen Quantendynamik von Dirac, Zeitschrift für Physik A Hadrons and Nuclei, vol. 52, pp. 853–868, Nov 1929.

[132] J. van der Marel and B. Cederwall, Backtracking as a way to reconstruct Compton scattered γ-rays, Nucl. Instrum. Methods Phys. Res., vol. 437, pp. 538–551, 1999.

[133] M. J. Berger, J. H. Hubbell, S. M. Seltzer, J. Chang, J. S. Coursey, R. Sukumar, and D. S. Zucker, XCOM: Photon cross sections database, NIST Standard Reference Database 8 (XGAM), 1998.

[134] C. S. Levin, A. M. Foudray, and F. Habte, Impact of high energy resolution detectors on the performance of a PET system dedicated to breast cancer imaging, Physica Medica, vol. 21, no. Supplement 1, pp. 28–34, 2006.

[135] G. Chinn and C. S. Levin, A method to reject random coincidences and extract true from multiple coincidences in PET using 3-D detectors, IEEE Nuclear Science Symposium Conference Record, 2008.

[136] G. Chinn, A. M. K. Foudray, and C. S. Levin, PET image reconstruction with a Bayesian projector for multi-electronic collimation schemes, IEEE Nuclear Science Symposium and Medical Imaging Conference Record, 2007.

[137] A. Rose, Vision: Human and Electronic. Plenum Press, 1973.

[138] Y. F. Du, Z. He, G. F. Knoll, D. K. Wehe, and W. Li, Evaluation of a Compton scattering camera using 3-D position sensitive CdZnTe detectors, Nucl. Instrum. Methods Phys. Res., vol. 457, pp. 203–211, Jan. 2001.

[139] Y. Gu, G. Pratx, F. W. Y. Lau, and C. S. Levin, Effects of multiple photon interactions in a high resolution PET system that uses 3-D positioning detectors, in IEEE Nuclear Science Symposium Conference Record, pp. 3814–3819, Oct. 2008.

[140] G. Chinn, A. Foudray, and C. Levin, A method to include single photon events in image reconstruction for a 1 mm resolution PET system built with advanced 3-D positioning detectors, IEEE Nuclear Science Symposium Conference Record, 2006.

[141] G. Chinn, A. Foudray, and C. Levin, Accurately positioning and incorporating tissue-scattered photons into PET image reconstruction, IEEE Nuclear Science Symposium Conference Record, 2006.

[142] P. Olcott, F. Habte, A. Foudray, and C. Levin, Performance characterization of a miniature, high sensitivity gamma ray camera, IEEE Trans Nucl Sci, vol. 54, pp. 1492–1497, Oct 2007.

[143] P. D. Olcott, J. A. Talcott, C. S. Levin, F. Habte, and A. M. K. Foudray, Compact readout electronics for position sensitive photomultiplier tubes, IEEE Trans Nucl Sci, vol. 52, pp. 21–27, Feb 2005.

[144] B. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, Proceedings of imaging understanding workshop, pp. 121–130, 1981.