UNIVERSIDAD DE CHILE
FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS
DEPARTAMENTO DE CIENCIAS DE LA COMPUTACIÓN

FADRA: A CPU-GPU FRAMEWORK FOR ASTRONOMICAL DATA REDUCTION AND ANALYSIS

THESIS SUBMITTED FOR THE DEGREE OF MAGÍSTER EN CIENCIAS, MENCIÓN COMPUTACIÓN

FRANCISCA ANDREA CONCHA RAMÍREZ

ADVISOR:
MARÍA CECILIA RIVARA ZÚÑIGA

CO-ADVISOR:
PATRICIO ROJO RUBKE

COMMITTEE MEMBERS:
ALEXANDRE BERGEL

JOHAN FABRY
GONZALO ACUÑA LEIVA

This work has been partially funded by FONDECYT Project 1120299

SANTIAGO DE CHILE
2016

Resumen

This thesis sets the bases for FADRA: Framework for Astronomical Data Reduction and Analysis. The FADRA framework was designed to be efficient, simple to use, modular, expandable, and open source. Nowadays, astronomy is inseparable from computing, but some of the most widely used software today was developed three decades ago and is not designed to face the current big data paradigms. The world of astronomical software must evolve not only towards practices that understand and adopt the big data era, but also towards collaborative work within the community.

The work developed consisted of the design and implementation of the basic algorithms for astronomical data analysis, starting the development of the framework. This considered the implementation of data structures that are efficient when working with a large number of images, the implementation of algorithms for the calibration or reduction of astronomical images, and the design and development of algorithms for computing photometry and obtaining light curves. Both the reduction and the light curve algorithms were implemented in CPU and GPU versions. For the GPU implementations, algorithms were designed to minimize the amount of data to be processed, so as to reduce the data transfer between CPU and GPU, a slow process that often eclipses the execution-time gains that can be obtained through parallelization. Even though FADRA was designed with the idea of using its algorithms within scripts, a wrapper module for interacting through graphical user interfaces was also implemented.

One of the main goals of this thesis was the validation of the results obtained with FADRA. For this, reduction and light curve results were compared with results from AstroPy, a Python package with different utilities for astronomers. The experiments were carried out over six datasets of real astronomical images. For the reduction of astronomical images, the Normalized Root Mean Squared Error (NRMSE) was used as the similarity metric between images. For the light curves, the shapes of the curves were shown to be equal by determining constant offsets between the numerical values of each of the points belonging to the different curves.

Regarding the validity of the results, both the reduction and the light curve algorithms, in their CPU and GPU implementations, produced correct results when compared with those of AstroPy, which means that the developments and approximations designed for FADRA deliver results that can be used with confidence for the scientific analysis of astronomical images. Regarding execution times, the data-intensive nature of the reduction process makes the GPU version even slower than the CPU version. However, for light curve obtention, the GPU algorithm shows an important decrease in execution time compared with its CPU counterpart.


Abstract

This thesis sets the bases for FADRA: Framework for Astronomical Data Reduction and Analysis. The FADRA framework is designed to be efficient and easy to use, modular, expandable, and open source. Nowadays, astronomy is inseparable from computer science, but some of the software still widely used today was developed three decades ago and is not up to date with the current data paradigms. The world of astronomical software development must start evolving not only towards practices that comprehend and embrace the big data era, but also towards collaborative work in the community.

The work carried out in this thesis consisted of the design and implementation of basic algorithms for astronomical data analysis, setting the beginning of the FADRA framework. This encompassed the implementation of data structures that are efficient when working with a large number of astronomical images, the implementation of algorithms for astronomical data calibration or reduction, and the design and development of automated photometry and light curve obtention algorithms. Both the reduction and the light curve obtention algorithms were implemented in CPU and GPU versions. For the GPU implementations, the algorithms were designed to minimize the amount of data to be processed, as a means to reduce the data transfer between CPU and GPU, a slow process which in many cases can even overshadow the gains in execution time obtained through parallelization. Even though the main idea is for the FADRA algorithms to be run within scripts, a wrapper module to run Graphical User Interfaces (GUIs) for the code was also implemented.

One of the most important steps of this thesis was validating the correctness of the results obtained with FADRA algorithms. For this, the results from the reduction and the light curve obtention processes were compared against results obtained using AstroPy, a Python package with different utilities for astronomers. The experiments were carried out over six datasets of real astronomical images. For the case of astronomical data reduction, the Normalized Root Mean Squared Error (NRMSE) was calculated between the images to measure their similarity. In the case of light curves, the shapes of the curves were shown to be equal by finding constant offsets between the numerical values of each data point belonging to a curve.

In terms of correctness of results, both the reduction and light curve obtention algorithms, in their CPU and GPU implementations, proved to be correct when compared to AstroPy's results, meaning that the implementations and approximations designed for the FADRA framework provide correct results that can be confidently used in the scientific analysis of astronomical images. Regarding execution times, the data-intensive nature of the reduction algorithm makes the GPU implementation even slower than the CPU implementation. However, for the case of light curve obtention, the GPU algorithm presents an important speedup compared to its CPU counterpart.


Acknowledgements

First, I would like to thank my family for always supporting me and helping me follow my dreams. This work and everything else I've accomplished so far would not have been possible without their love and encouragement. I would also like to thank Fernando Caro for his support and company, not only through the development of this thesis but in life.

I would like to thank my friends for being the best company I could ask for, and for putting up with my long disappearances because "I have to work on my thesis tonight", many nights. Thank you for being so patient and for always cheering for me and supporting me.

I also want to thank Professor Patricio Rojo for all these many years of friendly work and advice. This thesis would not have happened if it weren't for him and his insistence on making better astronomical software. I would also like to thank Professor María Cecilia Rivara for her great support through my years as a student and all through this thesis, which I probably would not have finished yet if it weren't for her relevant advice and comments. Both of my advisors were a fundamental part of my student years and of this work, and I would not have made it this far without them.

Finally, I would like to express my thanks to the members of the revision committee, Professors Alexandre Bergel, Johan Fabry, and Gonzalo Acuña, for their careful reviews of my thesis and for their relevant comments to improve it. Last but definitely not least, I want to thank Ren Cerro for kindly taking the time to proofread this text.


Contents

List of Tables

List of Figures

1 Introduction
   1.1 Astronomical data analysis
   1.2 Astronomical software development
   1.3 Thesis description
      1.3.1 Goals and objectives
      1.3.2 Research questions
      1.3.3 Software architecture
      1.3.4 Use of previous work
      1.3.5 Programming languages
      1.3.6 Validation of results

2 Literature revision
   2.1 Existing software
   2.2 Criticism of existing solutions

3 Astronomical data and analysis
   3.1 Astronomical data
      3.1.1 Astronomical images
      3.1.2 Astronomical spectra
   3.2 Astronomical image acquisition
   3.3 Astronomical image reduction
   3.4 Astronomical image processing
      3.4.1 Image arithmetic and combining
      3.4.2 Filter application
      3.4.3 Photometry
      3.4.4 Light curve or time series generation

4 Introduction to General-Purpose Graphics Processing Unit (GPGPU) computing
   4.1 What is the Graphics Processing Unit (GPU)?
   4.2 General-Purpose GPU computing (GPGPU)
   4.3 GPGPU use in astronomy
      4.3.1 GPGPU use for astronomical data analysis in this thesis

5 Software design and implementation
   5.1 Data handling: AstroFile and AstroDir classes
   5.2 Calibration image combination and obtention of Master fields
   5.3 Astronomical image reduction
      5.3.1 CPU reduction implementation
      5.3.2 GPU reduction implementation
   5.4 Light curve obtention: the Photometry object
      5.4.1 Data handling for light curve obtention
      5.4.2 Obtaining target data stamps
      5.4.3 Reduction process using stamps
      5.4.4 Aperture photometry
      5.4.5 Light curve data handling and visualization: the TimeSeries object
   5.5 Graphical User Interface

6 Experimental settings
   6.1 Validation of results
      6.1.1 Experiment 1: Validation of reduction results
      6.1.2 Experiment 2: Light curve evaluation
      6.1.3 Experiment 3: Comparison between FADRA's CPU and GPU photometry implementations
   6.2 Execution time comparison
   6.3 Platforms

7 Results
   7.1 Validation of FADRA results
      7.1.1 Experiment 1: Validation of reduction results
      7.1.2 Experiment 2: Light curve evaluation
      7.1.3 Experiment 3: Comparison between FADRA's CPU and GPU photometry implementations
   7.2 Execution time comparison
      7.2.1 Reduction
      7.2.2 Light curve generation

8 Conclusions
   8.1 Development of basic algorithms for astronomical data analysis
   8.2 Implementation of algorithms for light curve obtention
   8.3 GPU implementation of algorithms
   8.4 Future work
      8.4.1 Within the scope of this thesis
      8.4.2 The FADRA framework

Bibliography

A Details of results
   A.1 Validation of reduction results
   A.2 Execution time results


List of Tables

6.1 Datasets used for experiments
7.1 Mean, median, and standard deviation for reduction results
7.2 AstroPy and FADRA's CPU light curve results comparison
7.3 FADRA's CPU and GPU light curve results comparison
A.1 NRMSE for reduction result validation
A.2 Reduction execution times
A.3 Light curve obtention execution times
A.4 Average execution time for reduction
A.5 Average execution time for light curve obtention


List of Figures

3.1 Diagram of a spectrograph
3.2 Diagram of a telescope
3.3 Diagram of a CCD detector
3.4 Wavelength filters
3.5 Example of the reduction process
3.6 Gaussian kernel
3.7 Air mass
3.8 Raw vs. differential photometry
3.9 Target, sky annulus, and reference star selection for photometry
4.1 Host and devices on GPU computing
4.2 Memory on a GPU
4.3 Work-items and work-groups on a GPU
5.1 The AstroFile and AstroDir classes
5.2 The Photometry object
5.3 Data stamps for aperture photometry
5.4 Data stamps following targets
5.5 Data stamp parameters
5.6 The TimeSeries class
5.7 GUI for AstroDir creation
5.8 GUI showing loaded AstroDir objects
5.9 GUI for photometry targets selection
5.10 GUI for aperture photometry parameters selection
5.11 Example of light curves for two targets
7.1 NRMSE between AstroPy and FADRA CPU reduction results
7.2 NRMSE between FADRA's CPU and GPU reduction results
7.3 Execution times for reduction algorithms
7.4 Execution times for light curve obtention


Chapter 1

Introduction

1.1 Astronomical data analysis

Starting from the first astronomical observations, when the human eye was the only tool used to examine the Cosmos, ancient astronomers realized that the movement of the objects in the sky was related to the seasons, and thus to cycles in nature relevant to survival. Since the beginning, astronomers have used all the available resources to keep track of the movements on the celestial sphere, from clay tablets to ancient papyri. After the invention of the telescope was made public by Galileo Galilei in 1609, a new era of astronomy emerged, in which much more than meets the eye was to be observed in the sky and subsequently recorded. The only way to document astronomical observations was to spend countless hours behind a telescope, taking notes and making illustrations of what was being seen. Some of the most important astronomical discoveries of all time were carried out during this era.

It was not until the early decades of the 20th century that the use of photographic plates as detectors on telescopes became the standard, granting astronomers the chance to finally and permanently capture exactly what they were seeing through the instrument. Thousands of new objects were discovered through the careful human-eye inspection of these plates. During the 1980s, the use of digital detectors on telescopes became widespread, again revolutionizing the way astronomical analysis could be conducted. With the astronomical images in digital formats, and with the aid of computers, the analysis of astronomical data became more precise, more standardized, and faster.

Nowadays, astronomy is inseparable from computer science. And, today, a new era of observational astronomy is also starting: the survey era, which goes hand in hand with the big data era in technology. The usual observational astronomy paradigm, in which the astronomer applies for nights at an observatory, observes the desired target objects for a few nights, and then goes back home with the data, is starting to be replaced by survey telescopes: instruments devoted to observing the complete night sky, or their specific catalog of objects, all night, every night. Data from these surveys is then released online for astronomers to download the information relevant to their scientific interests and perform analyses without the need of visiting an observatory.


Along with new ways of obtaining data, there is also a need for new ways of processing and analyzing it. Looking for changes pixel by pixel, frame by frame, as was done in the times of photographic plates, is simply unfeasible with the amount of data available today. Right now, astronomy is not only concerned with the science obtained from observations, but also with designing better ways to inspect data. Furthermore, as new telescopes and astronomical detectors are developed all over the world, astronomical data thrives, and so does the science that utilizes it.

1.2 Astronomical software development

The development of astronomical software oriented to astronomers for individual use dates back to the 1980s. As will be further commented on in Chapter 2, a lot of software developed almost three decades ago is still used to this day. Of course, these programs are completely reliable, and when working with them one can be assured that the results will be correct. However, having been developed so long ago, they are not up to date with the current data paradigms. Waiting minutes for one astronomical image to be calibrated, or having to set parameters and analyze each image frame separately, are common occurrences in the most established astronomical software used today.

This has led astronomers to develop their own astronomical software according to their needs. Today, most astronomers have a preferred programming language and implement their own algorithms. Even though this may seem like a good solution, in practice it yields a lot of dissimilar software, prone to human error, since the same algorithms are implemented over and over by different scientists. Users of certain programming languages, such as Python, are working on the development of libraries to make the task of analyzing astronomical data easier. These projects, however, have so far only yielded separate algorithms, and no unified software has been released. Also, by taking care of just very specific functionality, these developments serve more as an aid for programmers than as a software program or framework that is ready for astronomers to use.

Since software development is such a fundamental part of the work of astronomers, doing it in the best possible way should be a priority for them; sadly, this is not always the case. Given the fact that astronomers need to program their own code, they are often not able to fully finish or polish it before they have to begin doing science on their images, much less document it or release it publicly for the rest of their colleagues [81]. These problems could be greatly diminished, for example, by the creation of open source frameworks allowing astronomers to freely add new packages according to their needs. This reduces duplicated efforts and makes it easier for the whole astronomical community to work together on making better software tools.

The world of astronomical software development must start evolving not only towards practices that comprehend and embrace the big data era, but also towards collaborative work in the community.


1.3 Thesis description

The aim of this thesis is to set the bases of the FADRA framework for astronomical image reduction and analysis. The work carried out in this thesis considers setting up the modular structure of FADRA, as well as implementing the basic algorithms needed for astronomical image analysis, along with more advanced procedures for light curve obtention.

Besides the focus on the design of the framework and algorithm implementation, this thesis experiments with GPU implementations of certain algorithms. Investigating the possibilities of using GPU-accelerated algorithms in astronomical software meant for common use could bring about more efficient and faster ways for astronomers to calibrate and analyze their data, a critical point in the big data era.

1.3.1 Goals and objectives

The following goals are achieved with the work implemented in this thesis:

G1: To develop a framework that provides the basic algorithms necessary for astronomical data analysis: data reduction algorithms and image combination algorithms for the obtention of calibration files.

G2: To provide algorithms for automated light curve (time series) generation that require as little user intervention as possible, while remaining efficient in execution time and providing good results.

G3: To provide GPU-accelerated versions of reduction algorithms and of light curve obtention procedures.

G4: To create this framework through code that is modular, expandable, well documented,and open source.

These goals can be further specified through the definition of a set of secondary goals, orobjectives, achieved through the work of this thesis. These objectives consider:

O1: Implementation of data structures that are efficient when working with hundreds or thousands of astronomical images.

O2: Implementation of algorithms to combine images and obtain the calibration files needed for the astronomical image analysis process.

O3: Implementation of algorithms for astronomical image reduction, or calibration, that are easy and direct to use over a large number of images.

O4: Design of novel ways to implement the reduction and analysis processes over several astronomical images for the obtention of light curves, as a means to reduce computation time and make data transfer more efficient when dealing with GPU-accelerated algorithms.

O5: Validation of the quality of the results obtained with the FADRA framework, carried out by comparing said results to the ones obtained with established astronomical software.


O6: Implementation of Graphical User Interface modules that allow the user to access FADRA's functionalities in an interactive way, and which provide visualization of the data and the corresponding calculation results.

All the algorithms and processes to be carried out over astronomical images mentioned in the goals and objectives are further detailed and explained in Chapter 3. The technical design and implementation aspects of the aforementioned objectives can be found in Chapter 5.

1.3.2 Research questions

This thesis develops and evaluates GPU-accelerated versions of some of the algorithms necessary for the analysis of astronomical images. Even though some approaches to GPU implementations for astronomical applications exist, they focus mainly on numerical cosmological simulations [17, 32, 38, 43, 64, 80]. GPU performance analysis of classical astronomical image algorithms is an area that is just now being developed [7, 34, 87]. Chapter 4 presents a more extended review of GPU use in astronomical software.

Since data transfer rates between CPU and GPU are still slow, it is of vital importance to make sure that the transfer is made in the most efficient way possible. Because of this, the implementation of this thesis considers a novel approach in terms of data handling, transferring to the GPU as little data as possible. This process is further explained in section 5.4.

The experiments performed in the context of this thesis seek to answer two questions:

Q1: Is it possible to obtain significant GPU speedup in astronomical algorithms that deal with a large amount of data transfers between CPU and GPU?

Q2: Are these speedups justified? In other words, is the obtained acceleration worth it considering the extra implementation effort that GPU algorithms entail?

This thesis answers these questions by analyzing timing results for the different algorithms. Execution time comparisons were carried out between the GPU and CPU implementations of FADRA, as well as between FADRA algorithms and established astronomical software.
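As an illustration only, the sketch below shows one common way to take wall-clock timings of this kind in Python; the best-of-several-runs convention and the placeholder workload are assumptions for the example, not the measurement protocol used in the thesis (that protocol is described in Chapter 6).

```python
import time

def timed(func, *args, repeats=5):
    """Run func several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder workload standing in for a reduction or photometry call:
print(timed(sum, range(1_000_000)))
```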

1.3.3 Software architecture

FADRA is designed as completely modular software. All algorithms and functions belonging to specific processes were implemented as separate Python packages. This serves two main purposes: first, it allows users to easily find the implementation of the algorithms and functions, in case they wish to edit some of them to better fit their personal needs. Second, it makes it easier for users to create new packages and integrate them into the FADRA code. The organization of modules in Python packages also allows users to simply import certain packages in their own personal implementations, in case they do not need to use FADRA's complete implementation.

1.3.4 Use of previous work

As a starting point for the algorithms of this thesis, previous work by Professor Patricio Rojo, developed in the Astronomy Department of Universidad de Chile, was reviewed. Said work encompasses the development of data structures to work with large numbers of astronomical images at a time (further explained in section 5.1), as well as the approximation for aperture photometry used in the CPU version of said algorithm (further explained in section 5.4.4).

This preceding implementation was adapted to work through the developments carried out in this thesis. Further comments about the existing code and its consequent adaptation are given in the relevant chapters of this thesis.

1.3.5 Programming languages

FADRA was developed using the Python programming language, version 3.4.3. Python is currently one of the most widely used languages for data analysis, especially in astronomy. Beginner-friendly but still powerful, Python currently competes directly with astronomical programming languages such as the commercial IDL or MATLAB. The GUIs were implemented using Tk through its Python package, Tkinter.

The GPU part of the algorithms was implemented in the GPU-oriented programming language OpenCL (Open Computing Language) [36, 73]. Although CUDA (Compute Unified Device Architecture) [52, 58] is a widely used language for GPU programming, it presents the disadvantage of running only on NVIDIA hardware. OpenCL, on the other hand, is open, free, cross-platform, and truly heterogeneous: it runs on any brand of graphics hardware. The implementation of this thesis was done in OpenCL as a means to not restrict the machines where the framework can be used. The version used was OpenCL 1.2.
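For illustration, here is a minimal PyOpenCL sketch of an element-wise OpenCL kernel of the kind used in image arithmetic (subtracting one frame from another). This is not FADRA's actual kernel code; it assumes the pyopencl package and a working OpenCL runtime, and it makes the CPU-GPU buffer transfers explicit, which is precisely the cost that the data-handling design of this thesis tries to minimize.

```python
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()   # pick any available OpenCL device
queue = cl.CommandQueue(ctx)

# Element-wise subtraction kernel, e.g. science frame minus master dark
program = cl.Program(ctx, """
__kernel void subtract(__global const float *img,
                       __global const float *dark,
                       __global float *out) {
    int gid = get_global_id(0);
    out[gid] = img[gid] - dark[gid];
}
""").build()

img = np.random.rand(512 * 512).astype(np.float32)   # stand-in science frame
dark = np.random.rand(512 * 512).astype(np.float32)  # stand-in master dark
out = np.empty_like(img)

mf = cl.mem_flags
img_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=img)    # CPU -> GPU
dark_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=dark)  # CPU -> GPU
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

program.subtract(queue, img.shape, None, img_buf, dark_buf, out_buf)
cl.enqueue_copy(queue, out, out_buf)   # GPU -> CPU
```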

1.3.6 Validation of results

The validation of the results obtained through the algorithms and functions developed for this thesis consists of two different stages: validation of astronomical reduction results, and validation of light curve results. The reduction process for astronomical images is further explained in section 3.3, whereas the process of light curve obtention is explained in section 3.4.4.

Reduction of astronomical images is the first step to be carried out before performing analyses over the data. Because of this, the results from the reduction algorithms must be correct. The best way to demonstrate this is to compare the results from FADRA's reduction implementations, in both their CPU and GPU versions, to the results of reduction carried out with different, established astronomical software. In this case, AstroPy (section 2.1.2) was used, with its module ccdproc, designed to perform basic operations on astronomical images such as calibration and combination.

The similarity metric used to compare the pairs of reduced images was the Normalized Root Mean Squared Error (NRMSE). The NRMSE measures the percentage of difference between a predicted value and a real, observed value. When comparing FADRA's reduction results to AstroPy results, the latter was considered the predicted, or gold-standard, value and the former the observed value. When comparing FADRA's CPU results to its GPU results, the CPU reduction results were considered the standard, and the GPU results were considered the observed values. Further details on the calculation of the NRMSE and the evaluation of reduction results are given in Chapter 6.
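As a sketch, one common formulation normalizes the root mean squared error by the range of the reference image; the exact normalization used in this thesis is the one defined in Chapter 6, so the range convention below is an assumption for illustration.

```python
import numpy as np

def nrmse(reference, observed):
    """RMSE between two images, normalized by the range of the
    reference image (one common convention; others divide by the mean)."""
    rmse = np.sqrt(np.mean((observed - reference) ** 2))
    return rmse / (reference.max() - reference.min())

# Synthetic stand-ins for a pair of reduced images:
ref = np.random.rand(64, 64)
obs = ref + np.random.normal(0.0, 1e-3, ref.shape)
print(nrmse(ref, obs))   # small value: the images are nearly identical
```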

The validation of results also considers evaluating the correctness of light curve obtention results. Again, FADRA's CPU results were compared against established software results, and FADRA's CPU and GPU algorithms were compared against each other. When comparing curves, relationships between the points that compose each curve were found, to check that the differences between the points inside the curves themselves are maintained. This is because the most important feature to evaluate when comparing two light curves is not only that they are similar, but that the variations within the points are exactly the same, since it is these variations that are of importance when studying light curves of variable astronomical objects.
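A minimal sketch of this kind of check: two curves have the same shape if their point-by-point difference is a constant offset, i.e. if the spread of the differences is (near) zero. The tolerance below is an arbitrary placeholder, not the threshold used in the experiments.

```python
import numpy as np

def same_shape(curve_a, curve_b, tol=1e-6):
    """True if the two curves differ only by a constant offset (placeholder tol)."""
    diff = np.asarray(curve_a) - np.asarray(curve_b)
    return np.ptp(diff) < tol   # ptp = max - min of the differences

a = np.array([1.00, 1.02, 0.97, 1.01])
print(same_shape(a, a + 0.5))   # True: constant offset, same shape
print(same_shape(a, a * 1.1))   # False: scaling changes the variations
```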

Further details about the experiments carried out in this thesis, the similarity metrics, and the datasets are given in Chapter 6.

Outline

Chapter 2 of this thesis consists of an exhaustive literature revision of current astronomical software. Every software package that claims to offer utilities similar to the ones in the FADRA framework is described, and an analysis of why none of the existing solutions covers the same extents as FADRA is given.

Chapter 3 presents detailed explanations of astronomical data obtention, processing, and scientific analysis. Said chapter introduces all the basic processes to be carried out over astronomical images, which are implemented in the FADRA framework. Chapter 4 presents an introduction to General-Purpose Graphics Processing Unit computing, or GPGPU computing, the state-of-the-art method for algorithm acceleration. A description of how GPGPU is used to the advantage of analysis in the FADRA framework is also given.

Chapter 5 discusses the software design and implementation steps carried out for the creation of FADRA. The different classes, data structures, and algorithms implemented in this thesis are detailed and explained.

The experimental framework and settings used for the validation of FADRA results are detailed in Chapter 6. The metrics used to compare the different results are introduced, and the details of the datasets used in the experiments are presented.

The results of said experiments and analyses over several astronomical datasets are presented in Chapter 7. Timing analyses of the different FADRA algorithms are also presented, to show the speedup obtained with GPU acceleration compared with FADRA's own CPU algorithm implementations.

Finally, in Chapter 8, the conclusions obtained from the results are presented, as well as a revision of the current functioning of FADRA and an overview of the future work to be implemented for the framework.


Chapter 2

Literature revision

The following is an exhaustive summary of currently existing astronomical software. All the details regarding astronomical image processing and analysis will be further explained in Chapter 3. There are, however, some basic concepts to be introduced before presenting the software review:

• Astronomical image reduction (section 3.3) corresponds to the process of calibrating astronomical images after they are acquired through a telescope. The reduction process for astronomical images is standardized and must be applied to every astronomical image over which scientific analyses are to be carried out. It is a basic, essential tool for astronomical image processing.

• Filter application (section 3.4.2) refers to the application of convolution kernels over the images, such as a Gaussian filter or a mean filter.

• Photometry (section 3.4.3) is the process of measuring the amount of light received from an astronomical object, through the images obtained during the observations of said object. The most common type of photometry performed over astronomical images is called aperture photometry.

• Light curves (section 3.4.4) are curves resulting from the photometry measurements over a series of images of the same astronomical object through a period of time. Light curve obtention is a crucial step for many areas of astronomy, including variable star, supernova, and extrasolar planet studies, among others.

• FITS files (section 3.1.1) are the standard data storage format for astronomical images.

The criteria for a software package to be included in this list were the following:

• The software must focus on the analysis of astronomical images in the visual spectrum. Software dedicated to radio astronomy and spectroscopy was not considered, since this work does not cover those areas.

• Only software that has at least basic astronomical image reduction tools was considered. Software meant for telescope control or image acquisition only, without image processing tools, was not considered.


2.1 Existing software

2.1.1 Aladin Sky Atlas [11]
Launch year: 1999
Latest version: March 2014
License: GPL v3 (https://www.gnu.org/copyleft/gpl.html)
Platforms: Linux, Mac OS X, Windows
Developer team: Centre de Données astronomiques de Strasbourg, Université de Strasbourg

An interactive, cross-platform sky atlas. It allows visualization of astronomical images obtained from the SIMBAD database (http://simbad.u-strasbg.fr/simbad/). It includes photometric information in its latest version; however, the photometry is not calculated from the images, but loaded from the VizieR astronomical catalog (http://vizier.u-strasbg.fr/viz-bin/VizieR), also from Strasbourg University [3]. Its latest version was launched alongside Aladin Lite [10], an HTML5 version to be used in web browsers. In its desktop version, it allows the user to create operation scripts using Aladin's own Application Programming Interface (API). An API is a set of functions and procedures that allow access to the features or data of a piece of software; many programs have their own API definitions, which are the unique commands necessary to run the program.

2.1.2 AstroPy [5]
Launch year: 2011
Latest version: December 2015
License: 3-clause BSD license (https://github.com/astropy/astropy/blob/master/licenses/LICENSE.rst)
Platforms: Linux, Mac OS X, Windows
Developer team: open source

A Python package including different tools of common use in astronomy. For now, it has tools for opening, reading, and writing FITS files, astronomical coordinate handling, and some astrometry tools (astrometry being the measurement of the exact position of astronomical objects in the sky and their variations through time). Analysis functions, image visualization, and photometry packages are mentioned as "planned", but have not been developed yet [4].

2.1.3 The Starlink Project [48]
Launch year: 1980
Latest version: June 2015
License: part GPL, part commercial (original Starlink license)
Platforms: Linux, Mac OS X
Developer team: from 1980 to 2005, the Joint Astronomy Centre, Hawaii University; from 2005 to present, the East Asian Observatory

A group of software packages developed for general astronomical use. It provides tools for reduction, aperture photometry, and statistical analysis of the images. The aperture photometry implementation requires the user to input the target object coordinates for each frame of the series. It provides a graphical interface through the GAIA-Skycat software (2.1.17). Since the administration change in the year 2005, its focus has been sub-millimetric (radio) data reduction [20].

2.1.4 IRAF: Image Reduction and Analysis Facility [77]
Launch year: 1984
Latest version: March 2012
License: free for public domain use
Platforms: Linux, Mac OS X, Windows (through Cygwin)
Developer team: NOAO, National Optical Astronomy Observatory; AURA, Association of Universities for Research in Astronomy. Tucson, Arizona.

One of the most widely used astronomy software tools. Its functioning is based on various method packages, developed by different institutions. Each user can define their own packages, which must be written in the native IRAF command language, SPP. It possesses image reduction packages, as well as packages for stellar and aperture photometry [21]. It does not have any visualization or graphical tools of its own, and all interaction must be carried out through the command line. To visualize images, it must be used together with different software; the commonly used one is DS9 (2.1.15).

2.1.5 STSDAS: Space Telescope Science Data Analysis System
Launch year: 1994
Latest version: March 2014
License: STScI license (http://www.stsci.edu/institute/software_hardware/pyraf/pyraf-license); free for public domain use
Platforms: Linux, Mac OS X
Developer team: Science Software Branch of the Space Telescope Science Institute

An IRAF-based (2.1.4) astronomical software suite. It contains tools for reduction and analysis of images, both for general use and specific to Hubble Space Telescope (HST) data. It is designed as a series of enhancements for IRAF. The user interface and graphical terminals are given by IRAF, so they are just as minimal as in that software. It possesses tools specifically for aperture photometry [15]. It can also be used through the IRAF Python package, PyRAF (2.1.14).

2.1.6 IRIS [14]
Launch year: 1999
Latest version: September 2014
License: free for non-commercial use
Platforms: Windows
Developer team: Christian Buil

Software designed mostly for astronomical image acquisition. It also contains tools for basic reduction and analysis, including rudimentary aperture photometry, which must be performed by the user manually over each frame.


2.1.7 CCDSoft
Launch year: -
Latest version: January 2001
License: commercial
Platforms: Windows
Developer team: Software Bisque (http://www.bisque.com/sc/), together with the Santa Barbara Instrument Group (SBIG, http://www.sbig.com)

Designed together with the astronomical instrumentation company SBIG, CCDSoft is an image acquisition program that also contains reduction and analysis tools. It has an interactive graphical interface, which includes interactive options for photometry. However, just like in Aladin (2.1.1), the photometry is not calculated directly from the images, but obtained from catalogs. The program allows loading and accessing different photometric catalogs, including the US Naval Observatory CCD Astrograph Catalog [84] and the VizieR catalog from Strasbourg University [57]. Software Bisque, the developer company, specializes in camera and telescope control software.

2.1.8 Mira Pro
Launch year: 1988
Latest version: December 2012
License: commercial
Platforms: Windows
Developer team: Mirametrics (http://www.mirametrics.com)

Promoted as software "with no peer for speed, features, and efficiently integrating a rich collection of tools for image display, plotting, processing, measurement, and analysis". It possesses a graphical interface, with tools for image reduction and for obtaining and plotting the photometry of the images, to be executed one image at a time. This program stands out for its good handling of images of great size. It allows the user to create and run scripts in the Lua programming language (https://www.lua.org/). A version designed for the amateur public, Mira AL, is available and also commercial; it does not have complicated analysis tools. The software was developed by Mirametrics, a company dedicated to imaging software for science and engineering, mainly for astronomy and the medical sciences.

2.1.9 MaxIm DL
Launch year: 1993
Latest version: 2013
License: commercial
Platforms: Windows
Developer team: Diffraction Limited (http://www.cyanogen.com/)

Integrated software with tools for both telescope control and image acquisition, as well as image reduction and basic analysis. It provides interactive tools for basic photometry, to be done one image at a time, through a graphical interface. It was developed by Diffraction Limited, a Canadian company dedicated to astronomical, biomedical, and laboratory software.

2.1.10 AIP4Win
Launch year: 2000
Latest version: 2006
License: commercial
Platforms: Windows
Developer team: Willmann-Bell, Inc. (http://www.willbell.com/aip4win/aip.htm)

Software originally designed to accompany the book "The Handbook of Astronomical Image Processing" by Richard Berry and James Burnell [9]. It provides a graphical visualization interface and tools for image reduction and analysis. It has basic photometry tools, which obtain the numerical value for the studied object, but it does not have tools for plotting or visualizing that information.

2.1.11 CCDOps
Launch year: -
Latest version: November 2011
License: commercial
Platforms: Windows; only the first version available for Linux and Mac OS X
Developer team: Diffraction Limited (http://www.cyanogen.com/)

Similar to CCDSoft (2.1.7). Used mainly for astronomical image acquisition from SBIG cameras. It provides a graphical interface and tools for image reduction and basic image enhancement [66]. It does not provide tools for photometry, filter application, or complex image analysis, since it is mainly aimed at image acquisition.

2.1.12 AstroArt
Launch year: 1998
Latest version: February 2015
License: commercial
Platforms: Windows
Developer team: MSB Software (http://www.msbsoftware.it/, http://www.msb-astroart.com/)

Software designed for astronomical image reduction. It provides catalog-assisted astrometry and photometry tools. It also provides basic filters for image enhancement.

2.1.13 IDL Astronomy User's Library [78]
Launch year: 1990
Latest version: May 2016
License: free download, but requires the commercial programming language IDL
Platforms: Linux, OS X, Windows
Developer team: Astrophysics Science Division (ASD) of NASA (http://idlastro.gsfc.nasa.gov/)

A low-level astronomical routine repository, developed in the programming language IDL, which is only distributed under a payware license. It is not an integrated package, but separate routines which can be used independently by the users [46, 47]. It contains aperture photometry routines, similar to the DAOPHOT package [22] from IRAF.

2.1.14 PyRAF [33]
Launch year: 2000
Latest version: November 2015
License: STScI license (http://www.stsci.edu/institute/software_hardware/pyraf/pyraf-license)
Platforms: Linux, OS X, Windows
Developer team: Science Software Branch of the Space Telescope Science Institute

A Python package developed to work with IRAF (2.1.4) commands. It gives users the ability to run the different IRAF packages, taking advantage of Python's flexibility. It provides access to all of IRAF's reduction and analysis packages, including the photometry package DAOPHOT. All instructions must be given through IRAF commands. Since IRAF does not provide a graphical user interface, neither does PyRAF. Some plotting packages have been planned and designed, to show IRAF plots on Python's graphical interfaces [23, 24], but PyRAF does not come with its own GUI, nor does it allow interactive performing of operations such as aperture photometry.

2.1.15 SAOImage DS9 [72]
Launch year: 1999
Latest version: December 2015
License: combination of GPL, LGPL, and BSD, depending on the package
Platforms: Linux, OS X, Windows
Developer team: Smithsonian Astrophysical Observatory (SAO) Center for Astrophysics, Harvard University

A tool focusing on image visualization [40]. With a graphical interface based on simplicity [41], its premise is visualization only: although it provides the option of having different image frames, scale changes, zoom, and geometrical markers, it does not come with image reduction tools or any other type of operation to be performed on images. This is why it is used together with IRAF (2.1.4), giving that analysis software the visual interface it does not possess.

2.1.16 MIDAS: Munich Image Data Analysis System [6]
Launch year: 1983
Latest version: September 2015
License: GPL
Platforms: Linux, OS X
Developer team: European Southern Observatory (ESO)

Software developed by ESO with general tools for reduction and analysis of astronomical images. It provides mathematical and statistical tools, as well as packages for astrometry and photometry. Instructions are given through the command line, and they must be written in MIDAS's own command language, MIDASCL.

2.1.17 GAIA-Skycat: Graphical Astronomy and Image Analysis [2, 25]
Launch year: 1997
Latest version: 2014
License: GPL
Platforms: Linux, OS X
Developer team: Very Large Telescope (VLT) project at ESO

Visualization software belonging to the Starlink Astronomical Software Project (2.1.3). It provides a graphical user interface, besides basic tools for image reduction and photometry.

2.1.18 MOPEX: MOsaicking and Point-source EXtraction [55]
Launch year: 2006
Latest version: December 2014
License: GPL
Platforms: Linux, OS X, Windows
Developer team: Spitzer Science Center, California Institute of Technology

Reduction and analysis software, designed by the Spitzer Space Telescope team (http://www.spitzer.caltech.edu/) and specialized to work on data acquired by this instrument. Even though it works on general data, the team recommends checking the data parameters, and they do not guarantee that it will work as well with data from telescopes other than Spitzer. It also provides a command line interface. The GUI does not give access to all the available functionality, only the most common reduction tasks; more complex analysis must be carried out through the command line interface. It has basic aperture photometry tools. Its strength is in its generation of astronomical image mosaics. (A mosaic must be produced when a large object spans a series of images; these images must be carefully aligned to make sure they fit perfectly and thus that the final image is correct.)

2.1.19 THELI [70]
Launch year: 2005
Latest version: February 2016
License: GPL v2
Platforms: Linux
Developer team: Gemini Observatory, University of Bonn

THELI is a package designed for automated astronomical data reduction. It provides tools for background calibration and astrometry, and basic photometry tools for flux calibration only, not for light curve obtention. It does not have filter application functions. The GUI version of THELI gives the user a graphical interface to select the input data and insert the necessary parameters. It offers acceleration through parallel CPU implementations of some algorithms.


2.1.20 ATV.PRO [8]
Launch year: 1998
Latest version: January 2016
License: free download, but requires the commercial programming language IDL
Platforms: Linux
Developer team: Aaron Barth, University of California Irvine

Just like DS9 (2.1.15) serves as a visualization tool for IRAF (2.1.4), ATV.PRO serves as a visualization tool for the IDL Astronomy User's Library (2.1.13). It provides image visualization, scaling, color scales, and world coordinate systems. It also provides a very simple aperture photometry tool, for calibration purposes only; it does not allow for light curve obtention. It provides no other analysis or filter application functions.

2.1.21 Aperture Photometry Tool (APT)
Launch year: 2012
Latest version: May 2016
License: free for research and educational purposes
Platforms: Linux, OS X, Windows
Developer team: Infrared Processing and Analysis Center (IPAC), California Institute of Technology, on behalf of the National Aeronautics and Space Administration (NASA)

As its name says, APT is designed to perform GUI-based aperture photometry analysis. It does not provide any kind of reduction, calibration, or filtering functions. APT is designed for manually analyzing one image at a time, and light curve obtention is not supported.

2.2 Criticism of existing solutions

Even though there is a varied availability of reduction and analysis software, the offerings are very dissimilar in terms of efficiency, functionality, and availability for users. Some points can be directly noted:

• None of the free software options possesses automated photometry, light curve obtention, or time series generation tools.

• None of the software options previously mentioned provides GPU support.

• Software that allows scripting requires that it be done through its own API. Although this is understandable, it can also pose problems when the user wants to use the software or add new functionality.

• Software from scientific institutions, even though it is mostly free for public use and open source, focuses on its own APIs, or is optimized for the institution's own specific data. It is available for general use by astronomers in their personal projects, but the specificity of the software may carry usage problems.

• In general, astronomical software development focuses on mathematical analysis, leaving simplicity for users behind, since in many cases scripts must be written using the APIs of the program. This sets a gap between the software and individual users.


• The best applications, those that integrate the largest number of analysis tools at the highest level with interactive GUIs, are all payware and without free access.

• The best software for image visualization and editing is mostly Windows-based. This is because such software is aimed at amateur astronomy and astrophotography, not at scientific astronomy.

In terms of performance, none of the previously mentioned software exploits the GPU as a way to speed up the processing time of algorithms. Nowadays, many applications take advantage of GPU computing, including everyday applications such as computer games, which do not require specific, state-of-the-art machines to run. Current astronomical software solutions have not yet taken advantage of this area.

It can be said that there is currently no scientific astronomical software framework that covers reduction, data analysis, and automated light curve obtention, and that is free of charge for users and open source.


Chapter 3

Astronomical data and analysis

These days, astronomy itself and astronomical data are very diverse. Data comes from many different sources and, as such, takes many different forms. Astronomical data can be split into two main categories: images and spectra. Even though both have their own subcategories, the finer distinctions are only relevant in terms of the science to be carried out on the data; that is why only the two main categories will be discussed and presented as such. Since the analysis of astronomical spectra is not considered in this thesis, the following sections of this chapter (acquisition, calibration, and processing) will be regarded only in terms of astronomical images.

3.1 Astronomical data

3.1.1 Astronomical images

By far, images are the most widely recognized type of astronomical data, and the one most commonly associated with this discipline. The standard accepted format for astronomical images is .FITS (Flexible Image Transport System) [82]. FITS is an open standard for scientific data transport and storage. A .FITS file consists of one or more blocks, each one composed of a Header and a Data Unit. Each of these blocks is called an HDU for short. All .FITS files must contain at least one HDU, which is called the primary HDU. More HDUs are optional, and they depend on the type of data stored. These secondary HDUs are known as extensions. Each part of an HDU is composed as follows [37]:

• Header: an ASCII-formatted unit containing metadata about the Data Unit. Each Header contains a sequence of 80-character keyword records, with the format KEYWORD = value / comment string. In the case of astronomical data, the Header can contain information such as the name and position of the observed object; the exposure time of the image; the location, date, and time of the observation; the instrument used; and the weather conditions of the night, among others. The Header also contains information about the celestial coordinate system used to find the object and obtain the image. This way, image pixel positions can be mapped to coordinates and positions in the sky.

• Data Unit: the data array, usually containing a 1-dimensional spectrum, a 2-dimensional image, or a 3-dimensional data cube. The Data Unit can contain arrays, of dimension 1 to 999, of integers of 1, 2, or 4 bytes, or floating point real numbers of 2 or 4 bytes, in IEEE representation. The Data Unit may also correspond to tabular data, in ASCII or binary format. These binary Data Units are usually stored as extensions of another HDU, to be used in relational database systems. (A minimal example of inspecting both components follows this list.)
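As a sketch of how these components can be inspected with AstroPy's astropy.io.fits module; the file name and the header keywords shown are placeholders, since the keywords actually present depend on the instrument.

```python
from astropy.io import fits

# "observation.fits" is a placeholder file name
with fits.open("observation.fits") as hdulist:
    hdulist.info()                   # list every HDU in the file
    primary = hdulist[0]             # the primary HDU
    header = primary.header          # Header: KEYWORD = value / comment records
    print(header.get("EXPTIME"))     # e.g. exposure time, if present
    print(header.get("DATE-OBS"))    # e.g. observation date, if present
    data = primary.data              # Data Unit: a NumPy array (or None)
    if data is not None:
        print(data.shape, data.dtype)
```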

During the last decade, since digital cameras and portable telescopes have become more easily available to the public, amateur astronomical imaging has also seen an increase, in what is usually known as astrophotography. Astrophotography is the process of obtaining astronomical images for purely aesthetic purposes, with no intention of carrying out scientific analysis. These images are usually stored in the usual .JPG and .RAW formats, as in any digital camera, and processing is done using common image editing software, such as Photoshop or GIMP. In the last few years, however, new software has been developed specifically for astrophotography image editing. These programs, although not containing detailed analysis tools, bring the editing of astrophotography images closer to the world of scientific astronomical image editing, by providing tools for the reduction and correction of images similar to the ones in scientific software, though not as precise.

3.1.2 Astronomical spectra

Spectroscopy provides information about astronomical objects that cannot easily be obtained from images, such as density, temperature, and chemical composition. An electromagnetic spectrum is a plot of light intensity, or power, as a function of frequency, wavelength, temperature, or another physical property. Spectra are used in astronomy to measure three main bands: optical, radio, and X-ray.

Optical spectra are obtained through telescopes, just like astronomical images. The difference is that a spectrograph must be used instead of a camera. Essentially, what the spectrograph does is make the light pass through a dispersion device, for example a prism, before it reaches the image acquisition system (Figure 3.1). This way, the light coming from the astronomical object is not captured directly, but rather "fanned out" into the complete light spectrum. Slits are usually used to make sure that only the light from the desired object enters the spectrograph. In this way, information at the different wavelengths can be obtained.

Figure 3.1: Simple diagram of a single-slit spectrograph.


Spectra, just like astronomical images, can be stored in .FITS files. The science performed on spectra, however, is completely different from that done on images. The software packages used to analyze spectra and images are different and do not usually come together.

3.2 Astronomical image acquisition

There are two necessary tools for acquiring astronomical images: telescopes and detectors. Telescopes serve as the zoom lens of a camera: it is within their internal structure that light gets captured. Most telescopes these days, from portable amateur telescopes to the giant structures that can be found around the world, are based on a combination of mirrors and lenses that collect light from distant astronomical sources. The mirrors and lenses help direct the light to the detection device. Figure 3.2 shows a basic diagram of how light is directed to the observer using mirrors.

Figure 3.2: Simple diagram of a Newtonian telescope, which uses only mirrors to direct the light to the observer. Light enters the tube of the telescope through the left side of the diagram, where it travels all the way to the right to meet the primary mirror. This mirror redirects light to the smaller, secondary mirror, where it is then directed to the eye-piece of the observer, shown in red at the top. Image source: [44].

The amount of light collected by a telescope is proportional to the size of the primary mirror. This is why bigger telescopes are needed to see the fainter and more distant objects in the sky. Once the light passes through the telescope and onto the detector, however, the process remains the same, even across different instruments.

Professional telescopes use different optical array devices to capture the light and transform it into a digital image. The most common ones are Charge-Coupled Devices (CCDs). A CCD allows the transformation of electrical charge into a digital value. CCDs are widely used in digital cameras and, since the 1980s, have been the type of detector used in telescopes, replacing the old photographic plates, which had to be examined through human visual inspection. A CCD is usually a square array of pixels: light-sensitive circuit elements built on silicon. Photons reaching each CCD pixel generate an electric charge, due to the photoelectric effect. This charge is then transformed into a digital copy of the light patterns coming into the device.


Figure 3.3: Simple diagram of a CCD detector. Arriving photons are turned into an electric current due to the photoelectric effect. This current is then turned into a numerical value in the computer.

An array of pixels on a CCD can be imagined as an array of buckets that capture photons as if they were rain water. All "buckets" are exposed to the photons for the same amount of time. The buckets fill up with a varying amount of "water", depending on the field the telescope is observing: the areas of the CCD array where light from an astronomical object is hitting will fill up faster than the surrounding areas. Each "bucket" of the CCD is then read and transformed into a digital signal, which is turned into a digital image in the computer.

Even though color CCDs are available and used in digital cameras, telescopes work with black and white CCDs; otherwise, a lot of important luminosity information could be lost. Different filters can be placed between the telescope and the CCD in order to select the wavelengths to be observed. These filters can be used to enhance certain characteristics of the observed objects, since they selectively leave out colors and wavelengths that are not of interest to the observations. Filters allow astronomers to select different pass-bands, ranges of the electromagnetic spectrum between certain wavelengths [39], without the drawbacks of color cameras.

Figure 3.4: Example of the use of U, B, and V filters. The plot shows the amount of light detected through the different filters as a function of wavelength. Each filter defines a band of wavelengths to be observed, leaving out the parts of the spectrum that are not of interest to the observations. Source: [39].


3.3 Astronomical image reduction

After acquisition, astronomical images are analyzed with different algorithms to obtain the relevant scientific information. The process is not standard: scientists looking for different information can carry out many different procedures over the same image. There are, however, certain algorithms which must be executed every time an astronomical image is to be analyzed, independently of the further work to be done on it. This process is known as astronomical image reduction. The reduction process considers three main steps: removing systematic electron count errors generated by the acquisition instrument, calibrating the light sensitivity of each CCD pixel, and removing defective pixels from the image.

To remove the electrons generated by the temperature of the instrument, special types of images, biases and darks, are obtained with the same CCD camera that will be used to acquire the astronomical images. Bias frames are 0-second exposure images, taken with the camera shutter closed, to capture only the electronic background inherent to the camera and to the transmission process from the camera to the computer. Dark frames are also obtained with the camera shutter closed, but with the same exposure time that will be used for the real astronomical images. In this way, the amount of thermal electrons added to the image during acquisition is sampled. Bias or dark frames are subtracted from the original image, since the goal is to remove these counts from it. Given the wavelengths and filters commonly used in astronomy (section 3.2), usually only one of these fields, a bias or a dark frame, is used for the reduction of the images. From now on, only the dark frame will be considered, since it contains the bias correction.

Flat fielding corresponds to the correction and calibration of the CCD pixels according to their sensitivity to the light received. Not all CCD pixels interact with photons in the same way: some of them may have their sensitivity altered by different causes, and this will be reflected in every image obtained through that CCD and camera. Consequently, the images need to be calibrated considering this factor. For this purpose, flat field images are obtained, which correspond to a homogeneously illuminated image. These can be obtained artificially, by imaging a uniformly illuminated screen inside the dome, or with images of the sky at sunset or sunrise, known as sky flats. It is of vital importance to make sure that no stars or other objects appear on the image. The variation of light sensitivity across pixels has a multiplicative effect, so the original image has to be divided by the flat.

Finally, if the CCD has defective pixels, these could be reflected in the dark and flat fields. A mask of the image can be obtained to know which pixels have to be ignored or treated exceptionally in further analysis.

Because of the Poisson noise inherent to the photon arrival process (further explained in section 3.4.1), each image obtained with a camera will have an associated error value. This means that if mathematical operations are to be performed between images, which is the case in image reduction, these errors have to be considered and correctly propagated through the operations. As a means to simplify the operations to be performed between the science images and the calibration ones, the error corresponding to the dark and flat fields can be minimized by obtaining several fields of each kind, and then combining them to obtain the final calibration files to be used in the procedure, usually known as Master calibration fields: one MasterDark and one MasterFlat.

In the case of the MasterFlat, it can be obtained by simply combining the corresponding images as explained in section 3.4.1. In the case of the MasterDark, many times it cannot be obtained directly from a combination. This is because it is important that the exposure time of the MasterDark used in the reduction be equivalent to the exposure time of the science images to be reduced. Usually, a series of dark frames is obtained, with a different exposure time for each. If the exposure time needed for the reduction is not present, a MasterDark can be interpolated from the rest of the images.

Once the MasterDark and MasterFlat are obtained, the MasterDark is subtracted from all fields and the reduction of an astronomical image is obtained as the result of the following operation:

\[ \text{Reduced image} = \frac{\text{Original image} - \text{MasterDark}}{\text{MasterFlat} - \text{MasterDark}} \tag{3.1} \]

In terms of astronomical image processing, the original, pre-reduction image is referred to as a raw image. The image resulting from the reduction is referred to as the science image or the reduced image. Scientific analysis is performed over the reduced image.
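Equation (3.1) translates directly into array arithmetic. A minimal sketch with SciPy/NumPy arrays (assuming the raw image and both Master fields are already loaded as 2-D arrays of the same shape; names are illustrative):

    import numpy as np

    def reduce_image(raw, master_dark, master_flat):
        """Apply equation (3.1) pixel by pixel to a raw science image."""
        # Both the numerator and the denominator are dark-subtracted.
        return (raw - master_dark) / (master_flat - master_dark)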

Figure 3.5: Example of the reduction process. The left image shows a portion of a raw astronomical image. The middle image shows the result after subtracting the dark field, and the right image shows the result after dividing by the flat field. That final image corresponds to the reduced image over which scientific analyses are to be carried out.

3.4 Astronomical image processing

After astronomical images are reduced, scientific information can begin to be extracted from them. Depending on the data to be studied and the object observed, different processes will have to be carried out over the images in order to obtain the relevant information. There are some standard procedures common to many different types of astronomical image analysis:


3.4.1 Image arithmetic and combining

Often astronomical images have to be combined or stacked in different ways. Combining or stacking multiple exposures of the same object is a common method for noise reduction in astronomical images. The quality of an astronomical image can be defined in terms of the Signal-to-Noise Ratio (SNR), which corresponds to the ratio between the number of photons belonging to the observed light source (the signal) and the noise, the total contribution of photons from various random sources that affect the signal:

\[ \text{SNR} = \frac{\text{Object (signal) photons}}{\text{Standard deviation of image photons}} \]

The SNR is a reflection of how well an object is measured in the image. Assuming a Gaussian distribution for photon arrival, the SNR value can be translated into standard deviation (σ) values of a Gaussian. Values between 2σ and 3σ mean that only about 68% of the photons come from the astronomical object of interest. A value of 4σ means that 95% of the photons are signal instead of noise, while 6σ means that 99.7% of the incoming photons correspond to signal.

More strictly, the number of photons N arriving at a CCD detector follows a Poisson distribution [35]:

\[ \Pr(N = k) = \frac{e^{-\lambda t}(\lambda t)^k}{k!} \]

This is a standard Poisson distribution with rate parameter λt, which corresponds to the expected incident photon count N. λ is the expected number of photons per unit time interval (hence the t). The random noise of an image can then be represented by the standard deviation of the Poisson distribution:

\[ \sigma = \sqrt{N} \]

One of the easiest ways to stack astronomical images to reduce noise is by simply summing them up. The sum is done pixel by pixel. This increases the SNR, since the signal is constant across the images while the noise is random. However, just as the SNR increases, the noise also does, although at a slower rate. When summing N images, the SNR follows:

\[ \text{SNR} \propto \sqrt{N} \]

To reduce the background noise, it is best to combine the images using average or median combining functions. A mean or average combining consists of taking the average value, pixel by pixel, over the N stacked images. A median combining consists of taking the median pixel value instead of the average. A median combining tends to work better than an average combining, since extreme pixels are canceled out of the final image. For an average or median combining of N images, the SNR follows:

\[ \text{SNR} \propto \sqrt{\frac{2N}{\pi}} \]

All of these operations are done pixel by pixel over each of the N images to be combined. While noise reduction is one of the main reasons for image combining, the process can also be carried out when different filters are used for image acquisition, to merge the different layers into one real-color image. This is commonly done for aesthetic purposes in astrophotography.
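The three stacking strategies described above amount to reductions along the z axis of an image stack. A short NumPy illustration (the simulated Poisson images below stand in for N exposures of the same field):

    import numpy as np

    # Simulate N = 50 noisy exposures of a constant signal of 100 counts,
    # stacked as a 3-D array of shape (N, rows, columns).
    rng = np.random.default_rng(0)
    images = rng.poisson(100.0, size=(50, 64, 64)).astype(float)

    summed = images.sum(axis=0)           # pixel-by-pixel sum: SNR grows as sqrt(N)
    averaged = images.mean(axis=0)        # mean (average) combining
    medianed = np.median(images, axis=0)  # median combining: robust to extreme pixels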

It is important to note that the stacking methods mentioned here are to be used only for obtaining Master calibration files. When images containing actual astronomical objects are to be stacked, the process is not so straightforward. Since there can be pixel or even sub-pixel differences between images, due to telescope movement or atmospheric interference, the stacking of astronomical images with objects must make sure that the pixels of the images are correctly aligned. This implies that some transformations, such as rotations or position changes, will have to be carried out on some of the images before stacking, yielding a process much more complex than simply calculating the mean or median values along the z axis.

3.4.2 Filter application

Combining or stacking images is not the only way to reduce noise. Smoothing filters can also be applied for this purpose. Unlike the previous case of image combination, where the average and median are calculated between different images, the application of a mean or median filter uses the pixels of the input image only. Blurring an image can also be used to remove unnecessary details, such as smaller objects around a larger target. A smoothing filter for an image consists essentially of successive convolutions of the image with a specific kernel, which depends on the type of filter. The convolution is performed pixel by pixel: the output value for each new image pixel corresponds to the sum of the input pixels in the kernel's neighborhood, each multiplied by the corresponding kernel coefficient.

In the case of Gaussian smoothing, the kernel consists of a discrete approximation of a Gaussian curve. An example of this kind of kernel is shown in figure 3.6.

Applying such a kernel over an image approximates convolving each pixel with a Gaussian curve. The degree of smoothing is given by the standard deviation of the curve. What the Gaussian convolution outputs is a weighted average of the neighborhood of each pixel, giving a larger weight to the pixels near the center of the Gaussian.

A median filter is not based directly on a kernel, but rather on a pixel window. The filter takes each individual pixel and looks at its neighbors inside the window. Then, the value of the pixel is replaced by the median value of the neighborhood pixels. This median is obtained by sorting all the pixels of the window in numerical order.


Figure 3.6: Example of a 3 × 3 Gaussian kernel to be used in Gaussian smoothing. The multiplying factor in front of the mask is equal to one over the sum of its coefficients, as is required to compute an average.


A Gaussian filter generates a much softer smoothing than a median filter. Median filters, however, are better for removing bad pixels and random extreme-value pixels, since very unrepresentative values in the pixel neighborhood will not affect the median.
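Both filters are available as ready-made routines in SciPy, the numerical library on which the CPU side of this thesis relies; a brief sketch (the input array is illustrative):

    import numpy as np
    from scipy import ndimage

    image = np.random.default_rng(1).poisson(100.0, size=(64, 64)).astype(float)

    # Gaussian smoothing: convolution with a discrete Gaussian kernel,
    # where sigma sets the degree of smoothing.
    smoothed = ndimage.gaussian_filter(image, sigma=1.5)

    # Median filter over a 3 x 3 window: each pixel is replaced by the
    # median of its neighborhood, which suppresses extreme-valued pixels.
    cleaned = ndimage.median_filter(image, size=3)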

3.4.3 Photometry

Images alone are not enough to gather all the important information about astronomical objects. While images can give us information about the morphology (shape) of the objects, quantitative information is needed to get estimates of energy output, temperature, size, and other physical properties. One way to obtain this information is through the process of photometry.

Photometry corresponds to the measurement of the luminous flux, in terms of the amount of energy received as electromagnetic radiation from an astronomical object. Usually, photometry refers to measurements of flux over specific bands of electromagnetic radiation, using filters such as the ones shown in figure 3.4. The measurement of this luminous flux requires the extraction of the raw image magnitude of the target object.

Absolute photometry is a complicated process that directly measures the luminous flux from the target. Many factors can interfere with the amount of photons captured from each light source: mainly, the size of the telescope (a telescope with a bigger mirror will capture more photons than a smaller one) and, when using images from Earth-based telescopes, the effects of the atmosphere. The atmosphere produces extinction, meaning some of the photons from the source get absorbed or scattered in the sky before they hit the telescope mirror. It also produces seeing, which is the technical name for the effect normally referred to as the "twinkling" of stars, caused by light being refracted several times as it passes through the different turbulent layers of the atmosphere.


Figure 3.7: Air mass and how it affects astronomical observations. Depending on the position of the observed object in the night sky, the light will go through different amounts of air mass, which can be understood as travelling through different lengths of atmosphere. In the image, air mass C is greater than air masses B and A. This affects what is known as the extinction of the objects: the longer the path light has to travel through the atmosphere, the more photons will be absorbed and/or scattered, never reaching the telescope. Thus, a bigger air mass means fewer photons from the object will reach the observer. Mathematically, the air mass X corresponds to the secant of the angle z formed by the zenith and the direction of the object: X = sec z.

If absolute photometry is to be performed, the observer must take into account all the mentioned effects. Observation parameters must be determined, such as the nightly extinction due to air mass (Figure 3.7), and calibration equations must be obtained and applied to the observations. Most of the time, photometric nights are needed for absolute photometry: nights with completely cloudless skies, where the extinction is a simple function of the air mass. These nights only happen a few times per year at observatories, and absolute photometry should not be carried out on other nights. All of these factors make absolute photometry a very difficult process, and the type of photometry that yields the worst precision for the magnitude values.

Differential photometry or relative photometry consists in measuring not only the flux from the target object, but also that from other stars, along with the atmospheric effects that might be changing the amount of photons received. This can be done in two different ways:

1. Using photometric standard stars. Standard stars are objects whose absolute flux has been carefully measured and calculated, on a standard photometric system, over a long period of time; these measurements are usually available in specific astronomical catalogs. If one of these standard stars is located in the field of the image to be used, comparison between the measured flux of the standard star(s) and its known, absolute flux can be used to calibrate the atmospheric effects on the image, thus providing the means to calculate the actual absolute flux of the target object, with results close to standard photometric systems.


2. Using comparison stars from the same field of view as the target. The magnitude of the target object is calculated relative to the magnitude of comparison stars in the field. This way, all atmospheric effects are removed from the images, since the variation will be the same for all the stars in the field. This is the simplest method for differential photometry, and also the one that yields the highest precision, especially when several comparison stars are used. The drawbacks of this method are that the magnitude obtained will not necessarily be close to a standard photometric system (which is not important if only the variations of flux are to be studied), and that one must be very careful when choosing comparison stars, since some of them may be variable stars. The flux from the stars used to calculate the photometry should be normalized, to make sure that the results do not depend on the comparison stars, and so that different ones can be used on different images. Figure 3.9(b) shows an example of a target object, along with several selected comparison stars.

Figure 3.8 shows the difference between raw and differential photometry measurements for the same object. It is easy to see how differential photometry removes all the atmospheric effects from the measurements. The graphs are the result of performing photometry over a series of images of the same object. These curves show the variation of the light of the object over a period of time.

(a) Raw instrumental magnitude (b) Differential photometry magnitude

Figure 3.8: Comparison between raw photometry and differential photometry. Image (a) shows the raw photometric measurements of the target star's luminosity on images taken at different times during the night. The slight upward bow in the measurements is caused by the atmospheric extinction decreasing as the object moved higher in the sky, reached its zenith, and then started getting lower again. The significantly lower measurements at some points were probably caused by clouds moving through the observation field at that time. Image (b) shows photometry for the same star, using the differential magnitudes between the target and a comparison star for every image. It is clear to see how the effects of extinction and clouds are removed, giving a much clearer result for the luminosity of the target object without atmospheric interference. Images source: [12].

Whatever method of photometry is used, the process to obtain the actual measurements from the images is the same. It is important to note that, because of the atmosphere, stars do not look like point light sources, but actually span an area of the image. This is why the most widely used photometry technique, known as aperture photometry, is carried out by adding the pixel counts within a circumference centered on the target object, whose radius is known as the aperture radius. However, photons coming from the target object will not be the only ones detected in this area: the background sky of the image also contributes photons, and they must be accounted for and removed from the final photon count of the target. To do this, not only the aperture radius is selected, but also a sky annulus is defined around the object. In this annulus, an average value of the nearby sky's photon count is calculated, which is afterwards removed from the total photon count of the target. Figure 3.9 shows a target object, with its aperture radius and sky annulus selected.


Figure 3.9: Aperture radius (red) and sky annulus (green) selected for a target object. Figure (b) shows the same target, along with several comparison stars selected. Although it is not pictured, the measurement of flux for the comparison stars is carried out the same way as for the target, meaning there will also be a sky annulus defined for each one of them.

Selection of both the aperture radius and the sky annulus is a very sensitive part of the photometry process. The aperture radius must be wide enough to contain the whole object, but with as little sky as possible, and it should not include objects that might be close to the target, such as another star. For the sky annulus, one must make sure that there are no stars inside it, and that the area contains only background sky surrounding the target. Also, different combinations of aperture radius and sky annulus sizes can yield different results for the photometry measurements.
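The technique can be sketched with NumPy boolean masks as follows (an illustration of aperture photometry in general, not FADRA's implementation, which is described in Chapter 5; all names and radii are examples):

    import numpy as np

    def aperture_photometry(image, cx, cy, ap_radius, sky_in, sky_out):
        """Sum the counts inside the aperture and subtract the mean sky level.

        (cx, cy) is the target center; sky_in and sky_out are the inner
        and outer radii of the sky annulus."""
        ys, xs = np.indices(image.shape)
        r = np.hypot(xs - cx, ys - cy)  # distance of every pixel to the center

        aperture = r <= ap_radius                 # pixels inside the aperture
        annulus = (r >= sky_in) & (r <= sky_out)  # pixels in the sky annulus

        sky_mean = image[annulus].mean()          # average sky count per pixel
        return image[aperture].sum() - sky_mean * aperture.sum()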

Photometry error

Given the fact that photometry deals with the amount of photons arriving at a particular area during a certain time, it is very important that the light curves obtained consider the associated uncertainty of the measurements. These errors can give astronomers an idea of the certainty of the observations performed.

As was explained in section 3.4.1, every astronomical image has an associated error value. The arrival of photons at the detector follows a Poisson distribution, where the random noise is represented by the standard deviation:

\[ \sigma = \sqrt{N} \]


In this case, N is the expected number of photons. The fact that photon arrival follows a Poisson distribution has strong implications when aperture photometry is performed. Aperture photometry takes an aperture radius and a sky annulus and subtracts the amount of photons, translated into digital counts, inside the annulus from the amount inside the aperture.

Since each one of the photon counts has an associated error, the sums of photons inside the aperture radius and inside the annulus will each have an error value. These errors must be propagated when the counts inside the sky annulus are subtracted from the counts inside the aperture radius.
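This propagation can be sketched as follows, under the simplifying assumption that every summed count carries Poisson variance equal to its value, so that variances add in quadrature through the subtraction (a sketch extending the hypothetical aperture_photometry example above):

    import numpy as np

    def photometry_with_error(ap_sum, n_ap, sky_sum, n_sky):
        """Propagate Poisson errors through the sky subtraction.

        ap_sum: total counts in the aperture (n_ap pixels);
        sky_sum: total counts in the sky annulus (n_sky pixels)."""
        sky_mean = sky_sum / n_sky
        flux = ap_sum - n_ap * sky_mean
        # Var(ap_sum) = ap_sum and Var(sky_mean) = sky_sum / n_sky**2,
        # so the subtracted sky term contributes n_ap**2 * Var(sky_mean).
        variance = ap_sum + (n_ap ** 2) * sky_sum / n_sky ** 2
        return flux, np.sqrt(variance)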

3.4.4 Light curve or time series generation

Many times, obtaining photometry measurements from just one image frame is not enough for a complete analysis of the target. Variations of the photometry measurements of the same target through time can provide detailed insights into the behavior of the observed system. With multiple photometry measurements over time, a type of time series called a light curve can be obtained. Light curves are valuable tools for many different areas of astronomy, such as variable star studies, supernova studies, and the analysis of stars believed to have extrasolar planets around them [60, 74].

Time series analysis tools [13] can also be applied to light curves: the application of statistical and mathematical tools to a set of variable data, as a way to discover the behavior of the system. Time series analysis works by studying the variations within the points that make up a curve, looking for trends and correlations, among others.


Chapter 4

Introduction to General-Purpose Graphics Processing Unit (GPGPU) computing

This chapter serves as an introduction to General-Purpose Graphics Processing Unit (GPGPU) computing. The architecture that differentiates a GPU from a common Central Processing Unit (CPU) will be presented, as well as what makes this architecture suitable for non-graphical algorithm programming. The restrictions imposed by said architecture will also be presented, followed by a discussion of what makes an algorithm appropriate for GPU implementation.

A review of GPGPU use in astronomy is given as a means to introduce the work developed in this thesis regarding GPU implementations of certain astronomical processing algorithms. Even though a description of the general GPU implementation in this thesis is presented in this chapter, a more in-depth characterization of the algorithms, as well as the technical details, data management operations, and actual implementation, can be found in Chapter 5.

4.1 What is the Graphics Processing Unit (GPU)?

The Graphics Processing Unit (GPU) is an electronic circuit specialized in and designed for the acceleration of image rendering in a display [53]. The GPU came into existence in the 1980s as a means to reduce the workload of the CPU in terms of image rendering. Initially, GPUs were used to accelerate computationally intensive work such as polygon rendering [1], texture mapping [71], and anti-aliasing [56], among many others. Modern GPUs also support the acceleration of geometry calculations through their handling of vertices.

Regarding scene rendering, the GPU handles the complete process of turning a CPU description of an image into an actual image that is ready to display. This process ranges from transforming vertices to their homogeneous coordinate representation [50] and the triangulation of polygons [31], to the application of the lighting model [19], camera position simulations, texturing, pixel rasterization [61], and hidden surface handling. Not to mention that, for example in computer games, all of this must be done in real time.

It would be simply unfeasible for a sequential CPU pipeline to do all this work fast enough for a computer game to run properly. GPUs were specifically designed to handle said processing in a parallel fashion. Because of this, the architecture of a GPU is completely different from that of a CPU. Modern commercial CPUs are composed of around 10 cores, each one with the capacity to handle a limited number of software threads at a time. The cores of a CPU are optimized for sequential, serial processing. On the other hand, modern user-level commercial GPUs can have hundreds or even thousands of simpler, smaller, and more efficient cores, giving GPUs the ability to process thousands of software threads at the same time.

Usually, when referring to GPU computing, the CPU receives the name of host and can be connected to several GPUs, or devices, as can be seen in Figure 4.1. This programming paradigm gives the GPU another very attractive trait over CPUs: direct scalability. Since GPU computing is data parallel, every point of data receives the same instruction to be executed, through a special program called the kernel. Every thread on the GPU will execute the instructions given by the kernel. That means that if the user wants to add more computing power to a machine, more devices or GPUs can be added to the host without requiring any changes to the code. The data will be properly distributed across the different GPU devices and the kernel will be executed in the same way as before. Such direct scaling of a CPU program would not be possible without several changes to the code.

Figure 4.1: Host (CPU) and devices (one or more GPUs) model on GPU computing.

Even though GPUs from different manufacturers vary in their hardware architecture, a general model is maintained across practically all graphics cards [49, 85]. Using the Khronos Group's OpenCL Specification [42] nomenclature, this model consists of a memory hierarchy formed by the global memory, constant memory, local memory, and private memory. In this same denomination, each thread is called a work-item, and work-items are grouped in work-groups. Each work-item runs one instance of the corresponding kernel. A scheme of the memory model inside the devices is shown in Figure 4.2.


Figure 4.2: Memory inside a GPU device.

The global memory is shared by the whole multiprocessor, meaning all work-items can access it, no matter which work-group they belong to. Work-items can read from or write to any element of the global memory. Access to global memory is very slow, sometimes even hundreds of times slower than access to local memory, and thus it should be used carefully. Constant memory is a special region of the global memory and, as its name says, it remains constant throughout the execution of the program. Work-items can only read from constant memory, not write to it.

Local memory is shared within work-groups. All work-items inside the same work-group can access the group's local memory, but work-items in other work-groups cannot access it. This memory can be used to allocate variables that will be used throughout the execution of a kernel. Access to local memory is much faster than access to global memory, so it should be selected whenever the program allows it. Private memory corresponds to a memory area that is private to a work-item. Variables stored in private memory cannot be accessed by any other work-item, even one belonging to the same work-group.

Work-items are accessed through the index that they occupy in their work-group. Every kernel must obtain the index of the work-item that it is going to be running on. If every work-item is mapped to a data point, one pixel in an image for example, no guarantees are given about the order of execution, since it is not known which pixel will be assigned to which work-item. Because of this, it is strictly necessary that the instructions executed through the kernel be the same for each data point, reflecting the GPU's intrinsic data-parallel nature. A scheme of work-items and work-groups is shown in Figure 4.3.
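These concepts can be made concrete with a minimal example (a sketch using the PyOpenCL bindings, which are not among the dependencies described in this thesis; the kernel simply scales an array, and each work-item fetches its own index with get_global_id):

    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    # Every work-item runs this same kernel over exactly one array element.
    program = cl.Program(ctx, """
    __kernel void scale(__global const float *in_data,
                        __global float *out_data, float factor) {
        int i = get_global_id(0);   /* index of this work-item */
        out_data[i] = in_data[i] * factor;
    }
    """).build()

    data = np.arange(16, dtype=np.float32)
    mf = cl.mem_flags
    d_in = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=data)
    d_out = cl.Buffer(ctx, mf.WRITE_ONLY, data.nbytes)

    # One work-item per element; the work-group size is left to the runtime.
    program.scale(queue, data.shape, None, d_in, d_out, np.float32(2.0))

    result = np.empty_like(data)
    cl.enqueue_copy(queue, result, d_out)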

Because of the GPU's unique architecture, and as can be inferred from this basic description, modifying a CPU algorithm to run on the GPU is not always a direct procedure.


Figure 4.3: Work-items and work-groups on a GPU can be accessed through indices.

The algorithm must first be evaluated to check whether a data-parallel implementation is possible and useful. Then, whatever sequential code was inside the program must be transformed into chunks to be executed within a kernel. This is one of the main limitations for GPU implementations: not all algorithms are designed in a way that can be accelerated using a GPU.

4.2 General-Purpose GPU computing (GPGPU)

As was explained in the previous section, GPUs were designed and built to perform data-parallel computations of graphics functionalities for faster rendering. However, there are many other problems similar in nature, but not necessarily related to graphics processing, that can make use of the GPU's properties to obtain important accelerations in execution time. General-Purpose GPU programming, or GPGPU, is the term for when the GPU is used for computations that would usually take place on the CPU.

The kinds of algorithms that are well suited for GPGPU implementations are the ones which are data parallel and throughput intensive. An algorithm being data parallel means that operations can be performed over several different data elements simultaneously, and that the output of some operations does not affect the outcome of others. Being throughput intensive concerns the fact that there will be a lot of data involved, meaning a lot of procedures to be executed in parallel.


Nowadays, GPUs are used for tasks which used to be in the realm of high-performance CPUs. Many areas of science are taking great advantage of GPU use for their computations. For example, in the realms of biology and chemistry, parallelization has allowed faster implementations of evolutionary algorithms [54], DNA sequencing [76], and protein synthesis simulations [59], among many others. Applications have also surged in the area of mathematics, including linear algebra libraries [75] and applications closer to computer science, such as encryption algorithms [63].

Even though the GPU has proven to be immensely useful for calculations other than the rendering of graphics, there are still some limitations in the development of general GPU algorithms. The greatest one is the data transfer rate. The process of transferring data from the CPU to the GPU and back is still very slow, to the point that some algorithms that could highly benefit from GPU acceleration end up working faster on the CPU. It is crucial that an efficient data transfer model be implemented when GPGPU is to be used. Besides the data transfer rates and the need for an algorithm design that allows for data parallelization, there are still some precision differences in the results obtained on a GPU. Even though the differences are minimal and take several decimal places to show up, in cases where extreme accuracy of results is needed, GPGPU may still not be a reliable computational method.

Although there are still limitations in GPGPU programming, scientists across all fieldsare gaining knowledge about this technology that may help them achieve calculations thatwould have taken days on a CPU.

4.3 GPGPU use in astronomy

During the last decade, astronomy has slowly but steadily started to incorporate GPU acceleration for different calculations [7, 28]. Starting from the reviews by Fluke [28, 29] and Xiao et al. [83], it is N-body dynamical simulations that make up most of the GPU use in astronomy, with the first implementations dating back to 2006 [26]. N-body simulations in astronomy are mostly dedicated to solving cosmological problems [16, 17, 32, 38, 43, 64, 65].

GPU-accelerated methods for spectra extraction have also been developed in the last decade, mostly through the implementation of data pipeline software for spectrographs [30, 83]. Radio astronomy has also benefited from GPU implementations, mostly for the reduction and combination of signals coming from many different antennas in large radio astronomy arrays [18, 67-69].

In terms of image analysis, deconvolution and source extraction methods are currently being implemented using GPU acceleration [51, 62, 79, 86]. These processes are needed when the image field is too crowded with objects to perform photometric analyses directly, as detailed in section 3.4.3.

Astronomical data is thriving, observation and data analysis paradigms are changing, and astronomical software has not been left behind in terms of taking advantage of GPU acceleration. However, the astronomical community has been slow to accept this new way of parallel programming, and there is still a lot of work to do and many applications to design that make use of this technology.

4.3.1 GPGPU use for astronomical data analysis in this thesis

As was mentioned in section 4.2, the algorithms best fit to be implemented on a GPU are streaming algorithms, with highly data-parallel computations but little reuse of input data [27].

As a first approach, the matrix operations involved in the astronomical data reduction process, such as the subtraction and division of Master calibration files (section 3.3), seem like procedures that could be directly implemented through a GPU kernel. The fact that these operations are performed pixel by pixel and do not depend on data changes makes the algorithms highly data parallel and throughput intensive. However, it is likely that the need for intense data transfer between the CPU and the GPU in the reduction process makes this process not so fit to run on the GPU.

This thesis also analyzes the subject of accelerating light curve obtention (section 3.4.4) through GPU implementations. For this, a way to reduce the amount of data transferred between the CPU and the GPU was designed. Taking advantage of the fact that light curve obtention only requires a small portion of each image to be analyzed, a GPU approach is implemented and tested in this thesis. The details of the algorithm implementation for this purpose can be found in section 5.4.4.

It is probable that the reason why the GPU has not yet been exploited much in terms of astronomical data processing has to do with the currently slow data transfer rates between devices. However, considering the quick development of the big data era and the jump to the new, survey-driven observational approach in astronomy, quicker ways to process and analyze data will have to be developed and incorporated into astronomical data pipelines. Considering the GPU's great scalability properties discussed in the previous section, even though right now the advantages of using GPU acceleration for small datasets may not be substantial, the future will require redesigned astronomical software to enforce parallelization. It is within this context that this thesis analyzes GPU implementations for reduction and light curve obtention, seeking to answer the two research questions posed in section 1.3.2:

Q1: Is it possible to obtain significant GPU speedup in astronomical algorithms that deal with a large amount of data transfers between CPU and GPU?

Q2: Are these speedups justified? In other words, is the obtained acceleration worth it, considering the extra implementation effort that GPU algorithms convey?


Chapter 5

Software design and implementation

This chapter presents an outline of the complete functioning of FADRA's currently implemented data structures and functions. Since every section of the pipeline was implemented as a separate Python module, FADRA can be used as an imported package through an interactive Python shell (such as iPython), or the packages can be imported into a separate Python script.

The present chapter begins by introducing the objects used by FADRA to handle the data from the astronomical images. Then, the processes of calibration image obtention and astronomical image reduction are presented, followed by the light curve obtention procedure designed for FADRA.

5.1 Data handling: AstroFile and AstroDir classes

Data within the FADRA framework is handled using the implemented AstroFile and AstroDir objects. AstroFile and AstroDir objects do not store the image data itself, but rather the filename and path to the corresponding .FITS image, working more as pointers than as data containers. This design looks to save memory by not keeping every .FITS image file open at all times. Files are opened and data is accessed only when operations over the images are to be carried out.

The AstroFile and AstroDir objects are part of the dataproc Python package, a previous implementation developed by Professor Patricio Rojo, as was mentioned in section 1.3.4. The dataproc package is a requisite for FADRA, since the latter uses AstroFile and AstroDir objects directly to handle files. The implementation of these objects was not modified; the details of how they operate within the framework follow below.

An AstroFile object contains the path to the original .FITS file, in string format. When an AstroFile is initialized, the referred file is checked to correspond to a valid .FITS file. An AstroFile can also contain one flat and one dark field, corresponding to the calibration files of the specific image. The AstroFile class also contains functions to read and assign values from the image's .FITS header, and to read the data from the image.

The data of an AstroFile can be read through the reader function of the class. When the data or header of an AstroFile is to be read, the Python package PyFits (http://www.stsci.edu/institute/software_hardware/pyfits) is used. The PyFits package opens the .FITS file whose filename is contained in the AstroFile. Both the data and the header of the open file are then available as SciPy arrays, allowing for immediate processing in Python. The PyFits package is also used when writing data to a .FITS file or editing a header value.

The AstroDir object serves as a container for AstroFile objects. An AstroDir can be initialized directly with a list of AstroFile objects, or it can be given a path to a directory, in which case an AstroFile object is created for every file in the directory and added to the AstroDir. An AstroDir can also have associated dark and flat fields, which can be used to reduce all the contained AstroFiles at once. Only one file of each calibration field type is allowed for each AstroDir. In the case that whole directories of calibration images are to be used, the corresponding MasterDark and MasterFlat fields must be obtained first and then added to the desired AstroDir. The AstroDir class provides the readdata function, which reads the data from every AstroFile through AstroFile's reader function, explained above.

The AstroDir class also provides a sorting function, to sort the contained AstroFile objects according to a given header parameter. A filter function is also provided, returning only the AstroFile objects that match a given condition on header parameters.

The direct addition of two or more AstroDir objects will yield a new AstroDir object which contains the AstroFile objects of all the added AstroDir objects. Other mathematical operations between AstroDir objects are not permitted.
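Based on the behavior just described, a typical interaction would look roughly like this (a hypothetical sketch: the exact names and signatures of the dataproc API are not reproduced in this document, and the header keywords are examples):

    import dataproc as dp  # package providing AstroFile and AstroDir

    # Pointing an AstroDir at a directory creates one AstroFile per file.
    science = dp.AstroDir("/data/night1/science/")

    # Sort and filter according to .FITS header parameters.
    science.sort("DATE-OBS")
    v_band = science.filter(FILTER="V")

    # Data is read from disk only on demand, as SciPy arrays.
    frames = v_band.readdata()

    # Adding AstroDirs yields a new AstroDir with all their AstroFiles.
    all_frames = science + dp.AstroDir("/data/night2/science/")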

5.2 Calibration image combination and obtention of Master fields

As mentioned in section 3.3, astronomical image reduction requires two kinds of calibration images: dark and flat fields. Although in most astronomical observation runs many images of each of these kinds are obtained, the reduction process requires only one dark field and one flat field. These are usually referred to as the MasterDark and MasterFlat, respectively, and are obtained from the combination of all the images of each kind.

As explained in section 3.3, there are different ways to combine said images and obtain the Master calibration fields. In the case of the flats, they can simply be combined using mean or median combining functions (or any other desired combining function). In the case of dark fields, they can be combined directly or calculated for a specific exposure time. This thesis implements both the direct combination of calibration fields and the interpolation of dark fields according to a given exposure time.



Figure 5.1: The AstroFile and AstroDir classes

This thesis implements two strategies to combine calibration fields and obtain the corresponding Masters: a mean combining function and a median combining function. Both are implemented in the CPUmath package, as mean_combine and median_combine respectively. Each one of these functions receives as input a SciPy array with the data of each file. If the files are originally contained in an AstroDir, the data must be extracted from the AstroDir and then passed to the combine function as a parameter. This is because the CPUmath package is intended to be useful for any kind of file, not just AstroFile or AstroDir objects.

The input files are first checked to see if they are all of the same shape; otherwise, a warning is issued to the user and the combination is not carried out. When all files are of the proper shape, the mean or median combining procedure is executed along the z axis. This stacks all the images together according to the selected function. The combined Master image is returned as a SciPy array, which can be saved and stored as an AstroFile.

Even though mean and median combining functions were implemented for this purpose, the Master field obtention routines, get_masterdark and get_masterflat respectively, allow the user to explicitly pass a combination function as a parameter. This allows users to define their own way of combining the images, with no limitations or restrictions other than that the function should return files of the same shape as the raw science images. This thesis recommends mean_combine and median_combine, but the user is also allowed to pass their own combining function as a parameter of the routines. If no function is given, mean_combine is used by default.
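For example, a user-defined combining function could be passed along the following lines (a hypothetical sketch consistent with the description above; the keyword argument name and import path are illustrative, not FADRA's documented interface):

    import numpy as np
    from fadra import reduction  # illustrative import path

    def clipped_mean(stack):
        """Custom combine: mean along the z axis, dropping each pixel's
        minimum and maximum values across the stack."""
        low = stack.min(axis=0)
        high = stack.max(axis=0)
        return (stack.sum(axis=0) - low - high) / (stack.shape[0] - 2)

    flat_data = np.ones((10, 64, 64))  # stand-in for ten flat frames
    # If the combining function is omitted, mean_combine is the default.
    master_flat = reduction.get_masterflat(flat_data, combine_fn=clipped_mean)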

The case of the MasterDark field is different from the previous ones, as was explained in section 3.3. Even though the option to directly combine all the dark fields is given in the get_masterdark function, most of the time the astronomer wants to interpolate or calculate a specific dark field for a given exposure time, since this would be the dark field that correctly reflects the amount of instrumental noise during the image acquisition time. For this purpose, the get_masterdark function receives the desired exposure time as one of its input parameters. First, all the given dark field files are scanned to get their corresponding exposure times. If one file is found to have the same exposure time as the desired value, this file is immediately returned as the MasterDark. If, however, no file with the exact wanted exposure time is found, a linear interpolation is carried out to artificially generate a dark field of said exposure time, which is then returned as the MasterDark. If the exposure time argument is not given to get_masterdark, all the dark field images will be combined as in the case of the MasterFlat, and exposure time will not be considered. It is up to the user to decide which method is best suited to their specific data.
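The per-pixel linear interpolation can be sketched as follows (an illustration of the idea using two dark frames that bracket the desired exposure time; this is not the exact routine implemented in FADRA):

    import numpy as np

    def interpolate_dark(dark_a, t_a, dark_b, t_b, t_target):
        """Linearly interpolate, pixel by pixel, between two dark frames
        taken with exposure times t_a < t_target < t_b."""
        w = (t_target - t_a) / (t_b - t_a)
        return (1.0 - w) * dark_a + w * dark_b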

Since the combining functions are used within the routines that obtain the Master files, the functions get_masterdark and get_masterflat receive as input: an AstroDir or SciPy array with the files, the combining function to be used (mean_combine by default if no option is given) and, in the case of get_masterdark, the desired exposure time as an optional parameter.

The routines get_masterdark and get_masterflat can be found in the reduction package; mean_combine and median_combine can be found in the CPUmath package.

5.3 Astronomical image reduction

Image subtraction and element-wise division are image arithmetic procedures vital to the image reduction process, as explained in section 3.3. These can be thought of as operations between matrices, since each grayscale astronomical image is essentially a matrix where each value corresponds to the gray level of the corresponding pixel.

This thesis implements astronomical image reduction for full images, in CPU and GPU versions. Both of them receive as input: an AstroDir with the raw science files to be reduced, an AstroDir related to the path where the reduced files are to be saved, and the Master calibration files. If the Master files have already been calculated, they can be passed as separate parameters or associated with the input raw images AstroDir. If AstroDir objects are passed as parameters for the Masters, instead of AstroFile objects or SciPy arrays, the files in those AstroDir objects will be used to calculate the Masters through the procedures explained in section 5.2. In this case, the function to use for the combination and the desired exposure time for the MasterDark must also be given as input parameters. The default combining function, used if no other specific function is given, is the previously introduced mean_combine.

The use of the CPU or GPU reduction procedure is determined by a GPU flag given to the reduce function: GPU=True will run the GPU version of the reduction, while the value False (default) will run the CPU reduction.


5.3.1 CPU reduction implementation

The CPU implementation of arithmetic operations between images is found in the CPUmath package. Subtraction and element-wise matrix division were implemented using SciPy's operations between n-dimensional arrays. The data from the AstroFiles is read as needed and stored as a SciPy array while in use. Each file is opened only when needed and closed immediately after the relevant data has been fetched.

An additional step performed on the results of the reduction corresponds to sigma-clipping: iterating through the data and rejecting data points that are above or below a specified number of standard deviations (σ); an ε value can also be given as a threshold for the process. Sigma-clipping is a method used to prevent high peaks in the data. In the case of astronomical images, extremely high or low values in certain pixels might correspond to defective pixels in the detectors, or to cosmic rays hitting the detector during the image acquisition process. These pixels should be removed before carrying out scientific analysis over the images. After the images are reduced as explained in section 3.3, they go through a sigma-clipping function, with a default cutoff value of 3σ.
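Such a function can be sketched as follows (a minimal iterative version using NumPy masked arrays; FADRA's exact clipping routine may differ in how rejected pixels are handled):

    import numpy as np

    def sigma_clip(image, sigma=3.0, eps=1e-3, max_iter=10):
        """Iteratively mask pixels lying more than `sigma` standard
        deviations from the mean, until the mean moves less than `eps`."""
        data = np.ma.masked_invalid(image.astype(float))
        for _ in range(max_iter):
            mean, std = data.mean(), data.std()
            clipped = np.ma.masked_outside(data, mean - sigma * std,
                                           mean + sigma * std)
            if abs(clipped.mean() - mean) < eps:
                return clipped
            data = clipped
        return data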

5.3.2 GPU reduction implementation

The GPU reduction algorithm receives an AstroDir of raw science images as input. Each raw science image is then flattened into a one-dimensional SciPy array. This can be done because knowing the (pixel) size of the images allows each work-item to be mapped to a pixel of the raw images and of the calibration files. The number of images to be concatenated and passed to the kernel is defined by the global memory size of the device. If it is not possible for all the images to fit in the device's memory, the kernel is called in as many iterations as needed. Each one of the calibration fields is also flattened into a one-dimensional array separately. This flattening process is done on the CPU and the result is then given as input to the GPU kernel.

The reduce.cl OpenCL kernel receives these arrays as input, as well as the original dimensions of the images, to correctly advance through the array indexes and make sure that the 1D operation is exactly equivalent to the direct, 2D one. The calibration arrays are passed to the GPU's constant memory, while the image arrays are passed to global memory. Each work-item is then mapped to one pixel (one value in the one-dimensional array of images) and performs the subtraction and division with the calibration constants.
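A kernel following this description might look as shown below (a hypothetical sketch of the idea, not the actual reduce.cl source; n_pix is the pixel count of a single frame, and the dark-subtracted flat is assumed to be precomputed on the CPU, following equation (3.1)):

    # Hypothetical reduce.cl-style kernel, kept as a Python string for use
    # with the OpenCL bindings. Each work-item reduces one pixel of the
    # concatenated 1-D image array; i % n_pix indexes the calibration
    # arrays, which reside in constant memory.
    REDUCE_KERNEL = """
    __kernel void reduce(__global const float *raw,  /* concatenated raw images */
                         __constant float *dark,     /* MasterDark, n_pix values */
                         __constant float *flat,     /* MasterFlat - MasterDark */
                         __global float *out,
                         const int n_pix) {
        int i = get_global_id(0);
        int p = i % n_pix;   /* pixel position within a single frame */
        out[i] = (raw[i] - dark[p]) / flat[p];
    }
    """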

After the results are copied back to the CPU, the arrays are reshaped to the original shape of the images. These are then run through the sigma-clipping algorithm explained in the previous section, saved to disk, and added to a new AstroDir, which is returned as the output of the reduction process.



5.4 Light curve obtention: the Photometry object

The light curve obtention algorithm implemented in this thesis does not make use of the astronomical image reduction process explained in the previous section. Reducing hundreds of whole astronomical images results in wasted memory and longer computation time, since for light curve obtention only a few pixels of each image are needed. For this reason, if the user wishes only to reduce full astronomical images, the procedures described in section 5.3 should be used. If, however, the user does not need to reduce all the images but only wishes to obtain the light curve from their data, a different procedure is carried out, which is detailed in the present section.

To handle light curve obtention, the Photometry object was created. A Photometry object is initialized with all the parameters needed to carry out aperture photometry through the process devised in this thesis. A scheme of the Photometry class is shown in Figure 5.2. The parameters will be clarified as the different stages of the process are explained in the following sections.

Figure 5.2: The Photometry object. The functions and parameters will be explained in the following sections.


5.4.1 Data handling for light curve obtention

The current astronomical image processing paradigm, implemented by IRAF (currently the most widely used astronomical data analysis software) and others, considers the following steps:

1. Read raw image from disk.

2. Reduce image following steps in section 3.3.

3. Save reduced science image to disk.

4. Each time scientific analysis is to be carried out over a reduced image, read said image from disk.

This approach presents several problems:

1. Each image is read from disk, reduced, and saved to disk. In cases where lots of images are available (which is usually the standard for astronomical observation runs), this pre-processing can take a long time and use up a lot of memory during computation.

2. There might not be sufficient disk space available to save all the reduced images.

3. Some images might be reduced but never considered for analysis, thus wasting computation time on images which are never used.

The software implemented in this thesis follows a paradigm designed for faster and more efficient processing and analysis. The obtention of light curves is based on the use of stamps: to obtain a light curve, only a few targets from the complete astronomical images are used. This software does not pre-process the whole images, but only the "stamps" surrounding the target stars. This makes it possible to handle light curves with lots of image frames, something that may not be possible when the full astronomical images have to be available in memory. Reduction is also done as needed: once the user has selected the images to use and the corresponding targets, reduction is performed only on the target stamps.

This approach is based on the system used by the Kepler space telescope (http://kepler.nasa.gov/). The Kepler space telescope is dedicated exclusively to the observation of stars believed to have extrasolar planets around them. The telescope has a very large field of view, of 105 square degrees. In comparison, the fields of view of even the biggest telescopes in the world are around one square degree. This means that each image obtained with the Kepler space telescope may contain hundreds of targets. Because of this, the Kepler Data Search and Retrieval interface (http://archive.stsci.edu/kepler/data_search/search.php) used to obtain scientific data from the telescope returns only a handful of pixels surrounding each star on the target list given by the user [45]. The users obtain only stamps with their relevant targets, making data transfer and further analysis more efficient. A similar stamp approach is used in [51] to implement GPU-accelerated source extraction in astronomical images.

In the case of this thesis, the stamp approach also serves as a way to optimize the GPU implementations of the algorithms. Data transfer between the host and the device is a time-critical operation, and should be reduced as much as possible.



Figure 5.3: Example of data stamps for aperture photometry. Instead of using the full 1024×1024-pixel image, stamps of size 100×100 pixels each are used to obtain the photometry measurement.

When using full astronomical images, a lot of information that is not useful to the process is passed back and forth between host and device, making the process very inefficient; in many cases the data transfer time can even overshadow the gains in execution time obtained with the use of the GPU. By using only the data around the relevant targets, each data transfer to the GPU moves only the data that is strictly necessary.

5.4.2 Obtaining target data stamps

For the specific case of light curve obtention through aperture photometry, the use of the complete astronomical images is not necessary. Aperture photometry requires only the area of the image that contains the target star, and a border around it to obtain the measurement of the sky background. Given that the coordinates of the targets must be introduced by the user to do aperture photometry, images can be pre-processed to obtain only the relevant “stamps” around the desired targets. This way the reduction and photometry procedures deal only with the strictly necessary amount of data. This decrease in the amount of data to be read and analyzed is highly significant for GPU processing.


The data transfer rates between host and device can be very slow, so moving the least amount of data, or moving it more efficiently, is optimal. Using only the data from the stamps around the needed targets is a way to optimize the GPU computations implemented in this thesis.

Initially, the user has to input the coordinates corresponding to the desired target, as well as those of other reference stars for differential aperture photometry (section 3.4.3). In an ideal setup, these initial coordinates would be enough to obtain the targets' stamps for all the other images in the observation run. However, telescopes are not perfect instruments, and many times there are shifts in the positions of the targets. Most of the time these shifts are only a few pixels in total, but for photometry we must be assured that the target will always be centered on the corresponding stamp and inside the given aperture radius. If, for example, a big enough shift causes part of the target to fall outside the aperture radius, the photometry measurement will be incorrect.

To deal with this problem, the initial coordinates given by the user are used only as a starting point: the first stamp is extracted around them, and its centroid is computed. Each subsequent stamp is centered on the centroid found in the previous image, so the process follows not the given coordinates but the centroid of the target on each image. This way, even if there are pixel shifts between images, the aperture radius and sky annulus will always be centered on the target.
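A minimal sketch of this centroid-following loop is shown below, assuming square stamps and an intensity-weighted centroid; the helper name is illustrative, not FADRA's exact implementation.

import numpy as np

def follow_centroid(frames, coord, stamp_rad):
    # The user-given coordinate seeds the first stamp; every later stamp
    # is centered on the centroid measured in the previous frame.
    y, x = coord
    stamps, centers = [], []
    for frame in frames:
        stamp = frame[y - stamp_rad:y + stamp_rad, x - stamp_rad:x + stamp_rad]
        yy, xx = np.indices(stamp.shape)
        cy = (stamp * yy).sum() / stamp.sum()  # intensity-weighted centroid
        cx = (stamp * xx).sum() / stamp.sum()
        y = int(round(y - stamp_rad + cy))     # back to full-frame coordinates
        x = int(round(x - stamp_rad + cx))
        stamps.append(stamp)
        centers.append((y, x))
    return np.array(stamps), centers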

Figure 5.4: Series of data stamps following the centroids of two different targets

As mentioned in section 3.4.3, aperture photometry requires the definition of a sky annulus around the desired target. Sufficient space for the sky annulus must be considered when obtaining the stamps. The user must define not only the coordinate where the initial stamp will be centered, but also the “square radius” of the stamp, which is the space left between the center of the stamp and each of its borders. This radius must be large enough for the sky annulus to fit completely inside the stamp.

Before doing the data reduction and photometry, the pipeline obtains the stamps for every target through all of the image frames. Each target generates the same number of stamps, taken at the same time stamps. If there are n targets and m total images, n data cubes are returned, where each cube has m elements, each of shape (2 · stamp_rad, 2 · stamp_rad).
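The resulting layout can be pictured with a short NumPy sketch (the sizes below are arbitrary examples, not values from the thesis):

import numpy as np

n_targets, m_frames, stamp_rad = 2, 150, 50
# One cube per target; each cube holds one stamp per image frame.
cubes = [np.empty((m_frames, 2 * stamp_rad, 2 * stamp_rad))
         for _ in range(n_targets)]
assert cubes[0].shape == (150, 100, 100)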


Figure 5.5: Parameters of the data stamp for aperture photometry

5.4.3 Reduction process using stamps

As was explained in section 3.3, all astronomical images must be reduced before carrying out scientific analyses over them. Since the reduction is a pixel-wise operation, it is critical that each one of the stamps is reduced with the exactly matching stamps of the calibration files.

For this, the coordinates used as centers of the stamps obtained in the previous step (by following the centroid) are stored in a list. To reduce an individual stamp, a stamp of the same size, centered at the exact same position, is obtained from each calibration frame. These calibration stamps are then used for the reduction of the corresponding data stamps.
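The idea can be sketched as follows, assuming a hypothetical helper and the standard (raw − dark) / flat reduction of section 3.3:

def reduce_stamp(raw_stamp, center, stamp_rad, master_dark, master_flat):
    # Cut calibration stamps of the same size, centered at the same
    # position as the data stamp, then reduce pixel-wise.
    y, x = center
    window = (slice(y - stamp_rad, y + stamp_rad),
              slice(x - stamp_rad, x + stamp_rad))
    return (raw_stamp - master_dark[window]) / master_flat[window]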

The reduction process is carried out right before the photometry calculation. This way, if the user decides that a stamp or target is not to be used for some reason, such stamps are not unnecessarily reduced. The reduction process is executed only when it is certain that the corresponding stamp will be used for the light curve to be obtained.

5.4.4 Aperture photometry

Aperture photometry is then carried out over the reduced stamps, in the same way that it would be done if a full image were used. Slightly different processes are implemented for CPU and GPU photometry.


CPU Photometry

The first step is executing the data reduction, with the corresponding stamps from the MasterDark and MasterFlat calibration files.

After a stamp is reduced, the photometry measurement inside the aperture radius is obtained. To estimate the sky background, a polynomial fit is calculated inside the sky annulus. The least squares method is used to fit a polynomial of a degree given by the user (the default degree is 1) in the annulus. This fit is implemented using SciPy's optimize module. Once the sky background has been estimated, its value is subtracted from the photometry measurement obtained inside the aperture radius, so as to retain only the luminous flux coming from the target object.
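A minimal sketch of this measurement for the default degree-1 sky model follows; the helper name is hypothetical, and details such as error estimation are omitted.

import numpy as np
from scipy import optimize

def cpu_photometry(stamp, ap_rad, ann_in, ann_out):
    yy, xx = np.indices(stamp.shape)
    cy, cx = (stamp.shape[0] - 1) / 2.0, (stamp.shape[1] - 1) / 2.0
    r = np.hypot(yy - cy, xx - cx)
    ann = (r >= ann_in) & (r <= ann_out)   # sky annulus pixels
    ap = r <= ap_rad                       # aperture pixels

    # Degree-1 sky model sky(x, y) = p0 + p1*x + p2*y, fitted by least squares
    def residuals(p):
        return stamp[ann] - (p[0] + p[1] * xx[ann] + p[2] * yy[ann])

    p, _ = optimize.leastsq(residuals, x0=[np.median(stamp[ann]), 0.0, 0.0])
    # Subtract the fitted sky under the aperture from the aperture flux
    sky = (p[0] + p[1] * xx[ap] + p[2] * yy[ap]).sum()
    return stamp[ap].sum() - sky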

The implementation developed to obtain the aperture photometry with the CPU algorithm was part of the previous work developed by Professor Patricio Rojo, as commented in section 1.3.4. The changes made to this previous implementation correspond to the adaptation to work with the data stamps instead of full images, which included adding a specific reduction process for each stamp (since for each stamp, corresponding stamps from the calibration images are obtained, as explained in section 5.4.3), and the integration of the algorithm into the Photometry module designed and implemented for this thesis.

GPU Photometry

All the stamps for the corresponding target, along with one MasterDark file and one MasterFlat file, are given as input to the GPU kernel. The kernel also receives the desired aperture for the photometry, the radii for the sky annulus, and the size of the stamp. Passing data to the GPU is done in the exact same way as for the data reduction explained in section 5.3.2: the number of stamps given to the GPU in each iteration is calculated depending on the global memory of the device. The data stamps and calibration files are then flattened into one-dimensional arrays and passed to the GPU.

Inside the kernel, the photometry measurement is calculated by adding the counts of the pixels inside the aperture radius. A pixel is said to be inside the aperture radius if its distance to the center of the stamp is less than said radius. The value of the sky is calculated as the average count number inside the sky annulus. This average, multiplied by the number of pixels inside the aperture radius, is subtracted from the photometry measurement so as to remove the effect of the sky background:

\[
\text{Photometry value} = \sum_{\text{aperture}} \text{Counts} \;-\; \left( \frac{\sum_{\text{annulus}} \text{Counts}}{N_{\text{annulus}}} \right) \cdot N_{\text{aperture}} \tag{5.1}
\]

where \(N_{\text{annulus}}\) is the number of pixels inside the sky annulus and \(N_{\text{aperture}}\) is the number of pixels inside the aperture radius.

The output of this GPU kernel corresponds to a four-element vector where the values of interest are saved:


the photometry calculated inside the aperture radius, the number of pixels inside the aperture radius, the total photometry calculated inside the sky annulus, and the number of pixels inside the sky annulus. These four values are returned for every stamp, and equation 5.1 is then evaluated on the CPU once the results are copied back. However, since many work-items write to the output buffer at the same time, synchronization between work-items is necessary to avoid data race conditions during the calculations. In this case, since the operations performed by the kernel correspond only to subtraction and division (for the reduction part of the process) and addition (to accumulate the photometry values), atomic functions5 were used inside the kernel. OpenCL's atomic functions provide synchronization and avoid data race conditions. The only drawback is that OpenCL supports atomic functions only for 32-bit integers; because of this, the photometry measurements calculated on the GPU are integer values.
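The accumulation scheme can be sketched with a PyOpenCL kernel like the one below. This is a simplified illustration of the idea only: the actual FADRA kernel also folds in the dark/flat reduction of each stamp and batches many stamps per transfer, and the stamp here is randomly generated for the example.

import numpy as np
import pyopencl as cl

KERNEL_SRC = """
__kernel void phot(__global const int *stamp, const int side,
                   const float ap_rad, const float ann_in,
                   const float ann_out, __global int *out)
{
    int gid = get_global_id(0);
    if (gid >= side * side) return;
    float c = (side - 1) / 2.0f;
    float x = gid % side, y = gid / side;
    float r = sqrt((x - c) * (x - c) + (y - c) * (y - c));
    /* 32-bit integer atomics accumulate the four output values:
       [aperture sum, aperture npix, annulus sum, annulus npix] */
    if (r <= ap_rad) {
        atomic_add(&out[0], stamp[gid]);
        atomic_add(&out[1], 1);
    } else if (r >= ann_in && r <= ann_out) {
        atomic_add(&out[2], stamp[gid]);
        atomic_add(&out[3], 1);
    }
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL_SRC).build()

side = 100
stamp = np.random.poisson(100, (side, side)).astype(np.int32)  # fake stamp
mf = cl.mem_flags
stamp_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=stamp.ravel())
out = np.zeros(4, dtype=np.int32)
out_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=out)

prg.phot(queue, (stamp.size,), None, stamp_buf, np.int32(side),
         np.float32(10), np.float32(14), np.float32(20), out_buf)
cl.enqueue_copy(queue, out, out_buf)

# Equation 5.1, evaluated on the CPU from the four returned values
flux = out[0] - float(out[2]) / out[3] * out[1]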

5.4.5 Light curve data handling and visualization: the TimeSeries object

After aperture photometry has been carried out for a series of targets and images, the resulting light curves for the different targets can be visualized. As explained in section 3.4.3, differential aperture photometry contemplates performing operations between the light curves of different targets, as a means to reduce atmospheric effects on the curves.

In order to easily store and access the results of the light curves, and to allow for operations between different target light curves, a separate class for the handling of aperture photometry results, TimeSeries, was implemented.

Figure 5.6: The TimeSeries class

A TimeSeries object can contain several channels, where each channel is the result of aperture photometry carried out on one target over a period of time.

5 https://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/atomicFunctions.html


The number of channels corresponds to the number of targets in the image over which aperture photometry was carried out, and the number of elements in one channel corresponds to the number of images in the light curve. All channels within one TimeSeries object must have the same length.

Besides having data channels, a TimeSeries object can also contain a group of error channels, where each element of an error channel corresponds to the error on the corresponding photometry measurement. Each data channel has its own corresponding error channel. Both kinds of channels are implemented simply as SciPy arrays. A TimeSeries object also receives an optional labels parameter, which should be a list with the names or tags of the targets to be referenced. There should be as many names in the labels list as data channels in the TimeSeries.

Data channels within a TimeSeries can be accessed as items on a list. For example, if we have the TimeSeries object ts, calling ts[0] would return the first data channel of the TimeSeries as a SciPy array. If the labels parameter is given, the data channel can also be accessed using the target name, for example ts['target1'].

The data channels on a TimeSeries can be divided into two groups, since most of the time the main target will be one of the data channels, and the other targets will be used to perform differential aperture photometry, as explained in section 3.4.3. By default, a newly initialized TimeSeries comes with the first data channel (assumed to be the first target) as the first group, and the rest of the channels together as the second group. The user can assign the groups to their own liking by using a mask. For example, if ts is a TimeSeries with five channels, the expression:

ts.set_group([1,1,1,0,0])

...assigns the first three channels to the first group, and the other two channels to the second group. The 1s in the mask always correspond to the first group. set_group also sets the grouping for the corresponding error channels. Groups in a TimeSeries can be accessed with ts.group1() and ts.group2(), and the error groups with ts.errors_group1() and ts.errors_group2(), correspondingly.

Mathematical operations can be carried out between the channels of one group. This thesis implements the calculation of the mean and the median of the channels of a group. The results are stored in two “hidden” channels, ts[-1] and ts[-2], corresponding to the results of the operations on the first and the second group, respectively. The mean and median functions receive the group id (1 or 2) as input. The data and error channels of the groups are not modified by these operations; instead, corresponding “hidden” error channels with the errors of the operations are also obtained.

Operations are also allowed between channels. For example, ts[0]/ts[1] returns the division of the first channel by the second channel of the TimeSeries. These results, however, are not stored in hidden channels like the group operations' results. These operations allow for the removal of atmospheric effects by the use of differential photometry, as explained in section 3.4.3.
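The following sketch illustrates the intended use of the class; the constructor signature and the channel values are assumed for the example, while set_group, the hidden channels, and label indexing follow the description above.

import numpy as np

ts = TimeSeries([np.array([10.0, 11.0, 9.5]),    # main target
                 np.array([20.0, 21.5, 19.0]),   # reference star 1
                 np.array([15.0, 16.0, 14.5])],  # reference star 2
                labels=['target1', 'ref1', 'ref2'])

ts.set_group([1, 0, 0])   # target alone in group 1, references in group 2
ts.mean(2)                # average the reference channels into ts[-2]
differential = ts['target1'] / ts[-2]   # differential light curve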


5.5 Graphical User Interface

The FADRA framework also provides a wrapper class to access the light curve obtention process through an interactive GUI. The GUI allows the user to choose the raw and calibration files to be used, as well as the aperture radius and the other parameters relevant to light curve obtention explained in previous sections. When starting from scratch (the case when no working directories have yet been created or loaded), the user begins by creating:

• An AstroDir object with the science image files

• Masterdark and Masterflat field files, as AstroFiles

• A TimeSeries object, initialized as empty

• A path to store the work environment files described above

When initializing a work environment, the user is prompted to select the corresponding paths for science, dark, and flat files. The Master calibration files will be obtained from the corresponding dark and flat files contained in the given paths. The user can then select the method to combine the calibration files and obtain the Masters. A drop-down menu contains all the combination options for the user to choose from. For now, the default combination modes are the mean and median functions; however, if users create their own functions and add them to the CPUMath package, these will be shown in the menu as well. These functions, and how calibration files are combined to obtain the Master calibration images, are further explained in section 5.2.

Figure 5.7: GUI for AstroDir creation

In the case of Masterdark obtention, the exposure time of the images to be analyzed is needed, as explained in section 3.3. For this, the exposure time of the first file in the science images folder is obtained and used for the calculation.

If the Master calibration files have already been generated, the user can select the files instead of paths. The selection of an output path to save the reduced astronomical images is also optional, and only needed if the user wants to reduce the complete set of astronomical images, as was explained in section 3.3.


If the user wishes only to obtain a light curve from the images, giving this path is not necessary.

After an AstroDir has been created or loaded, the window shows a list of the currently open AstroDir objects. Next to each AstroDir entry, three different buttons are shown: one to view the TimeSeries (if the corresponding TimeSeries object has already been calculated), one to obtain the TimeSeries, and one to delete the selected AstroDir.

Figure 5.8: GUI showing loaded AstroDir objects

When the user chooses to obtain the TimeSeries from a given AstroDir, a new window opens showing the first image from the given science images path. This window gives the user the option to show the image in different scales. In this GUI, the user can click on as many target points as they wish. Every time a point is selected, the list of points to the right is updated with the coordinates of the point and a text box where the user can assign a name to the corresponding target. The user can also delete points through the interface.

Figure 5.9: GUI for photometry targets selection


The first point on the list will be considered the target, and the rest of the points will be considered reference stars. After all the desired targets have been selected, pressing the ‘CONTINUE’ button opens a new window showing the selected target. This GUI lets the user select the photometry and sky annulus radii for the aperture photometry (as explained in section 3.4.3), as well as the size of the stamp to be used in the aperture photometry algorithm. The radii and stamp sizes selected here will be used for every other target in the image.

Figure 5.10: GUI for aperture photometry parameters selection

After the parameters have been selected, pressing the ‘Obtain light curve’ button starts the calculations. After aperture photometry has been performed and the corresponding light curve obtained, a plot of the curve is shown, and the values are saved in a file associated with the current AstroDir. The light curve itself is handled by the TimeSeries class.

Figure 5.11: An example of the visualization of the light curves obtained for one set of images with one target star and one reference star.


Chapter 6

Experimental settings

The experiments carried out in this thesis serve three different purposes. The first is to validate that the results obtained with FADRA's implementation of the algorithms are correct. The second is to compare the light curves obtained using FADRA's CPU and GPU implementations, to make sure that the algorithm for GPU light curve obtention described in section 5.4.4 yields correct results. Finally, the third purpose of the experiments is to compare the execution times of the CPU and GPU implementations of FADRA's algorithms. These experiments address the research questions laid out in section 1.3.2.

The different settings for the experiments are detailed in the following sections, as well as the metrics used to compare the results in the different cases and the details of the datasets used. Results of the experiments carried out within these settings are presented in Chapter 7.

6.1 Validation of results

The first stage of the experimental part of this thesis corresponds to the validation of results obtained with FADRA. This stage is composed of three different experiments: the first one deals with the validation of the results obtained in the reduction process (section 3.3). The second experiment deals exclusively with light curves, and encompasses the comparison of the FADRA light curves against curves obtained with established astronomical software. The third and last experiment compares light curves obtained using FADRA's CPU implementation with the ones obtained using FADRA's GPU photometry implementation. The following sections describe the three experimental stages as well as the metrics used to compare the results.


Dataset ID   Number of images   Image size (pixels)   Image size (MB)   Exposure time (s)
    1               73             1024 x 1024              2.3                200
    2              140             1024 x 1024              2.3                 20
    3              162             1024 x 1024              2.3                 30
    4              172             1024 x 1024              2.3                 25
    5              254             1024 x 1024              2.1                 25
    6              297             1024 x 1024              2.1                 10

Table 6.1: Datasets used for experiments

6.1.1 Experiment 1: Validation of reduction results

To confirm that the results obtained with the algorithms implemented in FADRA are correct, they must be compared against results obtained with established astronomical software. The algorithms tested in this stage correspond to the reduction algorithms, in both their CPU and GPU implementations.

To obtain a gold standard for comparison against FADRA's results, the processes were first carried out using AstroPy's ccdproc module (section 2.1.2). The ccdproc1 module is designed specifically to perform basic processes over astronomical images, such as reduction and image combination for Master calibration file obtention. The validation experiments were carried out over six datasets of real astronomical images. The details of each dataset can be found in Table 6.1.

The results from FADRA's CPU reduction implementation were compared against the reduction results obtained with ccdproc and against the results obtained with FADRA's GPU reduction algorithm.

Similarity metrics

To test whether the results obtained with FADRA are consistent with the ones obtained with AstroPy, the similarity between the resulting images in both cases must be measured. For this purpose, the Root Mean Square Error (RMSE) was used.

The RMSE value measures the difference between a predicted value and a real, observed value. When comparing FADRA results against established software, the results obtained with AstroPy were considered the predicted values, or gold standard, for the comparison, and the results obtained with FADRA were considered the observed values. When comparing FADRA's CPU and GPU implementations, the CPU results were used as the predicted values, and the GPU results were considered the observed values. The RMSE is calculated as follows:

1 http://ccdproc.readthedocs.io/en/latest/


\[
\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{6.1}
\]

where \(\hat{y}_i\) corresponds to the predicted value and \(y_i\) corresponds to the observed value.

However, this definition of the RMSE yields values that are scale-dependent. Since the intent behind these experiments is not to measure the specific value of the RMSE for each different dataset, but to understand the general behavior of the FADRA implementation, the normalized version of the RMSE (NRMSE) was used. For each frame, the NRMSE is calculated as follows:

\[
\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{y_{\max} - y_{\min}} \tag{6.2}
\]

This way, the values obtained for the similarity of the images within the same dataset do not depend on differences between the datasets themselves, providing a broader view of FADRA's performance. The NRMSE is scale-independent and is expressed as a percentage, where low values indicate less residual variance.

The NRMSE between each pair of images to be compared was calculated: first, between the corresponding images of the reduction results obtained with ccdproc and with FADRA CPU; then, between the corresponding images obtained with FADRA's CPU and GPU implementations. For each dataset, the NRMSE values obtained for each pair of images were averaged.
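A short sketch of this computation; the two frame lists are placeholders standing in for the reduction results being compared.

import numpy as np

def nrmse(predicted, observed):
    # Equations 6.1 and 6.2: RMSE normalized by the range of the frame
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    return rmse / (predicted.max() - predicted.min())

pair_scores = [nrmse(a, b) for a, b in zip(ccdproc_frames, fadra_frames)]
dataset_nrmse = np.mean(pair_scores)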

6.1.2 Experiment 2: Light curve evaluation

The first experiment regarding light curves consists in their evaluation against results obtained with established astronomical software. In this case, AstroPy was again used to obtain the photometry measurements, through its photutils2 package. The photutils package is an in-development module associated with AstroPy which provides source extraction3 and aperture photometry tools. Its aperture photometry function was used within a Python script, since it only allows for photometry calculation of one image at a time, for which the aperture and coordinates must be given in every iteration.

The datasets used for this experiment are the same as for experiment 1, and their details can be found in Table 6.1.

2 http://photutils.readthedocs.io/en/latest/index.html
3 Source extraction corresponds to the process of identifying the astronomical objects before performing aperture photometry. This must be done when the image field is very crowded and stars cannot be easily identified or separated from their neighboring stars.


Similarity metrics

When evaluating the similarity of two light curves, what matters most is not that the numerical difference between the curves is small, but that the variations between the points forming each curve are consistent. Since time series analysis deals with the variations inside the curve, it is crucial that the differences between points belonging to each curve are the same.

As a first approach, the standard deviation would seem like a good way to measure the similarity between the curves. However, two curves could have the same standard deviation and still be very different. A better way to prove that the differences between the points of the two curves are consistent is to show that one curve simply presents an offset displacement from the other one. If all the points in the second curve turn out to be an additive or multiplicative offset of the points in the first curve, then the two curves can be considered similar enough, in terms of internal behavior, for the same astronomical analyses to be carried out over them.

This means that what should be confirmed is the following relationship:

\[
(\hat{y}_i = y_i + \alpha) \;\lor\; (\hat{y}_i = \beta\, y_i) \tag{6.3}
\]

where \(\hat{y}_i\) are the points in the first curve (in this case, the one obtained with AstroPy) and \(y_i\) are the points in the second curve (the one obtained with FADRA's CPU implementation). The values α and β correspond to the additive and multiplicative offsets, respectively. It is important to mention that the α or β value must be the same for all the points of the two curves being compared, but can differ between pairs of curves from other comparisons. A dataset must have a consistent α or β value throughout its two light curves, but said values can change between datasets.

If such an α or β offset parameter can be found to account for the difference between them, then the two curves are considered the same in terms of what astronomers need.
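A sketch of this criterion: if the standard deviation of the difference vector is near zero, an additive offset α exists; if the standard deviation of the ratio vector is near zero, a multiplicative offset β exists.

import numpy as np

def offset_check(curve_a, curve_b):
    diff = curve_a - curve_b    # constant if an additive offset exists
    ratio = curve_a / curve_b   # constant if a multiplicative offset exists
    return {'alpha': diff.mean(), 'alpha_std': diff.std(),
            'beta': ratio.mean(), 'beta_std': ratio.std()}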

6.1.3 Experiment 3: Comparison between FADRA's CPU and GPU photometry implementations

Once the results obtained with FADRA's CPU photometry implementation are proven to be similar to AstroPy's, the next step is to compare said results against FADRA's GPU implementation, to confirm that the aperture photometry approximation explained in section 5.4.4 also yields correct results. For this, the same datasets from the previous experiments (Table 6.1), but now considering all the image frames, were used to obtain their corresponding light curves, both with the CPU and the GPU algorithms. These results were then compared with each other.


Similarity metrics

The similarity metrics for the curves in this experiment are exactly the same as presented for experiment 2. The only change is that, instead of comparing AstroPy results to FADRA's CPU results, the results from FADRA's CPU and GPU implementations of the light curve obtention algorithm are compared against each other. An additive or multiplicative offset is to be found between the curves to prove that they have the exact same behavior. This is a good enough indicator that the same astronomical analyses can be carried out over the curves.

6.2 Execution time comparison

Once the results are validated, the speedup provided by the GPU implementation of the algorithms is to be calculated. For this, the execution time of the reduction and light curve obtention procedures will be measured for FADRA's CPU and GPU implementations.

The execution times will be measured using Python's clock function from the time module. The clock function returns processor time and is recommended in Python's documentation as the function to use when benchmarking Python or timing algorithms4.
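A minimal timing pattern of this kind (reduce_dataset is a hypothetical call standing in for the procedure being measured):

import time

start = time.clock()              # processor time (Python 2)
result = reduce_dataset(dataset)  # procedure under measurement
elapsed = time.clock() - start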

6.3 Platforms

The experiments carried out in this thesis were performed on an Intel Core i5-3337U CPU @ 1.80GHz x 2 with 3.6 GB of memory, and an Intel 3rd Gen Core GPU running Intel HD Graphics 4000 with 2048 MB of memory. OpenCL was run on the GPU using the drivers provided by the Beignet project5.

4 https://docs.python.org/2/library/time.html#time.clock
5 https://01.org/beignet


Chapter 7

Results

This chapter introduces the results of the experiments detailed in Chapter 6. The first stage of the experimental part of this thesis corresponded to the validation of the results obtained with FADRA's implementation. For this, experiments were carried out to compare reduction results obtained with FADRA against reduction results obtained with established astronomical software. The results of FADRA's photometry and light curve obtention algorithms were also evaluated: the results of FADRA's CPU photometry implementation were compared against light curves obtained with other astronomical software, and then FADRA's CPU and GPU light curve obtention algorithms were compared against each other to make sure that the photometry approximation given in section 5.4.4 is correct.

Once the results from the FADRA implementation were proven to be correct, experiments were run to compare the execution times of FADRA's CPU and GPU algorithms. This indicates whether GPU implementations of reduction, photometry, and light curve obtention algorithms are justified, taking into account the intensive data transfer between host and device that working with astronomical data entails.

The conclusions and analyses of the results presented here are further expanded and commented on in Chapter 8.

7.1 Validation of FADRA results

7.1.1 Experiment 1: Validation of reduction results

As was detailed in section 3.3, reduction of astronomical images is a vital step before any kind of scientific analysis can be performed over said data. Because of this, it is necessary that FADRA's reduction functions return very exact results when compared to established astronomical software, to make sure that the further processes to be carried out in the framework are run on a correct base.


The software used to evaluate FADRA's reduction results, as mentioned in section 6.1.1, was AstroPy's ccdproc package. The Normalized Root Mean Square Error (NRMSE) detailed in section 6.1.1 was used as the metric to compare the images. The results from FADRA's CPU reduction implementation were also compared against FADRA's GPU reduction results, to make sure that both algorithms provide correct outcomes.

The calculated NRMSE for the comparison between ccdproc results and FADRA CPU results is presented in Figure 7.1. The comparison between FADRA's CPU and GPU reduction results is presented in Figure 7.2. The NRMSE was calculated between each corresponding pair of images in the datasets being compared, and the average of said value for each dataset is presented in the mentioned figures. The detailed results used to obtain the figures can be found in Appendix A.

Figure 7.1: Normalized Root Mean Squared Error between the reduction results obtained with AstroPy's ccdproc package and FADRA's CPU reduction algorithm. The datasets on the x-axis are arranged from lower (left) to higher (right) number of images.

As can be seen in Figure 7.1, the NRMSE between AstroPy's ccdproc and FADRA CPU results is below 1% for every dataset, meaning that the average difference between the compared images in each dataset is less than 1%. Considering that the astronomical image reduction process is very straightforward and does not allow for substantially different implementations (section 3.3), such small differences were expected, due only to precision differences between calculations. It can be seen, however, that only datasets 2, 4, and 6 present NRMSE values that could be explained by precision differences alone, while datasets 1, 3, and 5 present higher NRMSE values. To make sure that the reduction process works properly, and that those higher NRMSE values are due to properties intrinsic to the images, the mean, median, and standard deviation values for each image were calculated and averaged for each dataset.


The results are presented in Table 7.1.

FADRA CPU (pixel flux counts)

Dataset          Mean                     Median                 Standard deviation
   1        320.711 ± 62.208         305.555 ± 59.887           695.251 ± 108.873
   2        511.098 ± 72.426         514.768 ± 79.218           268.098 ± 191.583
   3      4,064.732 ± 1,574.698    4,183.901 ± 1,625.461        680.334 ± 247.617
   4        256.025 ± 80.558         291.836 ± 82.051         3,354.797 ± 11.696
   5         37.689 ± 13.208          30.001 ± 11.075           291.861 ± 76.142
   6          2.529 ± 4.458            1.977 ± 7.234            104.480 ± 42.192

AstroPy (pixel flux counts)

Dataset          Mean                     Median                 Standard deviation
   1            NaN                  301.805 ± 58.583                NaN
   2        531.575 ± 26.033         514.490 ± 80.094           221.245 ± 116.328
   3            NaN                4,062.611 ± 1,577.868             NaN
   4        256.029 ± 80.558         291.836 ± 82.051         3,358.125 ± 11.579
   5            NaN                   31.921 ± 10.723                NaN
   6          2.528 ± 4.457            1.977 ± 7.234            104.659 ± 42.243

Table 7.1: Mean, median, and standard deviation of the reduction results for all datasets. The values calculated for each image were averaged for each dataset.

Table 7.1 shows that the datasets presenting NRMSE values higher than expected from precision differences are those in which AstroPy's reduction process leaves NaN values in the images. As explained in section 5.3.1, reduction results obtained with FADRA's algorithms are run through a sigma-clipping function to delete abnormal values, a process that apparently is not directly handled by AstroPy's ccdproc package. However, looking at the median values for the datasets where this happens, we see that the AstroPy and FADRA results are close, with the AstroPy results probably being influenced by the presence of incorrect values in the images.

The next comparison corresponds to the results obtained with FADRA's CPU and GPU reduction implementations, shown in Figure 7.2. Here, the very small NRMSE between the images reflects only the differences in precision that occur when transferring data between CPU and GPU and executing GPU kernels, as was discussed in section 4.2. In this case, no big dissimilarities in NRMSE results were found between datasets, because the results of both the CPU and GPU reduction implementations in FADRA go through the sigma-clipping process, unlike in the previous case.

7.1.2 Experiment 2: Light curve evaluation

An important validation step in this thesis corresponds not only to the evaluation of the reduction results, but also of the light curves obtained from the different datasets. To do this, as was explained in section 6.1.2, the light curves obtained with AstroPy's photutils package and FADRA's CPU photometry implementation were compared for similarity.


Figure 7.2: Normalized Root Mean Squared Error between the reduction results obtained with FADRA's CPU reduction algorithm and FADRA's GPU reduction algorithm. The datasets on the x-axis are arranged from lower (left) to higher (right) number of images.

An additive or multiplicative offset between the curves shows that they are the same in terms of the astronomical information that can be obtained from them. The process to compare the different light curves was the following:

For each dataset:

• Two light curves were obtained: one with AstroPy's photutils and one with FADRA's CPU light curve obtention algorithm.

• The difference (subtraction) between the two curves was calculated. The result is a vector of the same length as the light curves. If a direct additive offset is present, all the values of this resulting vector should be equal.

• The photutils curve was divided by FADRA's CPU curve. The result is a vector of the same length as the light curves. If a direct multiplicative offset is present, all the values of this resulting vector should be equal.

• The mean and standard deviation of the resulting difference and division vectors were calculated, to check whether the values are all the same.

The final results of this calculation for each dataset are presented in Table 7.2.


                     Subtraction                        Division
Dataset         Mean       Standard deviation      Mean    Standard deviation
   1          6,445.370        4,707.175           1.001        0.001
   2        -26,917.033        2,503.898           0.975        0.002
   3          6,086.900        7,070.096           1.032        0.096
   4         -7,851.717          876.498           0.957        0.001
   5          2,772.740            0.05            1.003        0.0002
   6            716.289            0.18            0.953        0.0007

Table 7.2: Results of the subtraction and division of the two light curves calculated for each dataset: one with AstroPy's photutils package, and one with FADRA's CPU algorithm. The means of the resulting subtraction and division vectors are shown, as well as the standard deviations of said vectors.

From the results shown in Table 7.2 it can be noted that a direct additive offset is not present between the photutils and FADRA CPU curves. This can be seen in the fact that the standard deviation values of the subtraction result vector are very high; in some cases (such as datasets 1 and 3) they are even close to the mean. This means that there are some big differences among the points resulting from the subtraction of one curve from the other, so the differences between the curves cannot be represented simply as an additive offset.

The results from the division of the two curves, however, do point in the direction of a multiplicative offset accounting for the differences between the curves. Even though the mean values of the result of dividing the curves are small, what is more important here is that the standard deviation values are also small, meaning that dividing the two curves results in a vector of values that are very similar to each other, in almost all cases up to the third decimal. This strongly suggests the presence of a multiplicative offset between the curves.

7.1.3 Experiment 3: Comparison between FADRA's CPU and GPU photometry implementations

As was explained in section 6.1.3, the light curves obtained with FADRA's CPU and GPU photometry implementations were compared for similarity. The presence of an additive or multiplicative offset shows that the shapes of the curves are the same. The process to obtain these results was the following:

For each dataset:

• Two light curves were obtained: one with the CPU algorithm, and one with the GPU algorithm.

• The difference (subtraction) between the two curves was calculated. The result is a vector of the same length as the light curves. If a direct additive offset is present, all the values of this resulting vector should be equal.

• The GPU curve was divided by the CPU curve. The result is a vector of the same length as the light curves. If a direct multiplicative offset is present, all the values of this resulting vector should be equal.

• The mean and standard deviation of the resulting difference and division vectors were calculated to check whether the values are all the same.

The final results of this calculation for each dataset are presented in Table 7.3.

                     Subtraction                        Division
Dataset ID      Mean       Standard deviation      Mean    Standard deviation
    1           4.999            3.522             1.011        0.007
    2          11.273            5.373             1.921        0.035
    3          35.385            1.680             1.423        0.006
    4         -17.951            1.286             0.846        0.003
    5           2.624            3.957             1.038        1.021
    6          -4.025            1.180             1.843        0.016

Table 7.3: Results of the subtraction and division of the two light curves calculated for each dataset: one with the CPU algorithm, and one with the GPU algorithm. The means of the resulting subtraction and division vectors are shown, as well as the standard deviations of said vectors.

Just as in the previous experiment, the results suggest that an additive offset is not present between FADRA's CPU and GPU curves, mainly because of the big standard deviation values of the results of subtracting one curve from the other. Also as in the previous experiment, a multiplicative offset does seem to be present between the two curves. Even if the mean values show that the GPU curve almost doubles the CPU curve in some cases (datasets 2 and 6), the standard deviation values show that the differences between the points inside the curves are consistent. Dataset 5 shows a somewhat higher standard deviation value for this experiment, which might indicate the existence of an outlier point in one of the light curves. Considering that it is the only dataset that presents a higher standard deviation value, the effect is probably due to variations inside the dataset's images, and not a reflection of the global functioning of the algorithm.

7.2 Execution time comparison

To measure execution times, each algorithm was run four times under the same conditions. The final running time was calculated as the average of the results of the four runs. The results are presented below. The detailed execution times obtained for each dataset on each experimental run can be found in Appendix A.

7.2.1 Reduction

The execution times of the reduction algorithms on CPU and GPU are presented in Figure 7.3.


Figure 7.3: Execution times for the astronomical image reduction algorithms in their CPU and GPU implementations. The datasets on the x-axis are arranged from lower (left) to higher (right) number of images.

These results show directly that the GPU implementation of the reduction algorithm takes longer to execute than the CPU implementation for all datasets, coming close to equal for dataset 3, but never below the CPU execution times. This means that the astronomical image reduction process does not benefit from this GPU implementation, as was expected and commented in section 4.3.1. This is likely due to the fact that reducing complete astronomical images requires intensive data transfer between the host and the device.

Something useful to note from these results is that the GPU execution time does not grow along with the number of images in the dataset. It would be expected that, as the number of images in each dataset increases, so would the execution times, considering that in this case the datasets used contain images of around the same size. However, this is only seen in the CPU execution times, not in those of the GPU, as can be specifically seen in the results for datasets 3 and 5. This suggests that there might be other parameters in play in the timing of the GPU reduction process, besides the number of images. It is possible that some datasets exhibit internal properties, such as the standard deviation within the images themselves, that make them better or worse suited for being reduced on the GPU. Dataset 5 presents a decrease in reduction time compared to the previous dataset in both the CPU and GPU implementations of the algorithm, also suggesting the presence of intrinsic properties of the datasets that might affect execution times.


7.2.2 Light curve generation

The execution times of the light curve obtention procedures in their CPU and GPU implementations are presented in Figure 7.4.

Figure 7.4: Execution times for the light curve obtention algorithms in their CPU and GPU implementations. The datasets on the x-axis are arranged from lower (left) to higher (right) number of images.

Unlike the previous case, these results show that the light curve obtention process benefits greatly from the GPU implementation: there are differences of several seconds in the execution times for all datasets. Though in the case of light curve obtention this outcome was hoped for, it was not necessarily expected. These execution times show that the data stamp approach described in section 5.4 does work as a way to significantly reduce data transfer between host and device, enough so that the time taken by this data transfer does not overshadow the speedup obtained by the GPU implementation of the photometry algorithm, yielding significant acceleration in all cases.

Dataset 3 exhibits an odd behavior, seen as an inflation of the execution time for this experiment, which is evident in the CPU implementation and slight, but still noticeable, in the GPU implementation. Together with the previous results for the image reduction execution times, where dataset 3 also shows unexpected behavior in the GPU reduction times, we can again draw the conclusion that there are properties intrinsic to this dataset that affect the execution times of the algorithms run over its images.


Chapter 8

Conclusions

The following chapter aims, first, to review the goals stated for this thesis in section 1.3.1, and then to relate said goals to the results obtained and presented in Chapter 7. The answers found to the research questions stated in section 1.3.2 are also presented and discussed. From this inspection of the goals of this thesis and the obtained results, conclusions are derived about the performance of the algorithms implemented in this thesis for the FADRA framework, as well as about its applicability for use in astronomical data processing pipelines.

Along with the conclusions obtained from the work developed in this thesis, this chapter presents a discussion regarding FADRA's future advancements and research aspects, both within the scope of the experiments and implementations developed in this thesis, and regarding the new capabilities to be added to the framework in the future.

8.1 Development of basic algorithms for astronomical data analysis

The first goal of this thesis corresponds to the implementation of the very basic algorithms necessary for astronomical image analysis: algorithms for image reduction and, as part of that process, algorithms to combine images and obtain the Master calibration files needed for the reduction process. All of this promotes as little user intervention as possible, making it simple to quickly set up all the needed parameters to swiftly perform the reduction of great amounts of astronomical images.

Given the fact that the reduction process is the base for every further astronomical analysis to be performed over the images, it is of great importance not only that the algorithms run quickly and without needing input from the user throughout the process, but also that the results are proven to be correct. In the case of astronomical image reduction, the validation of the results is a vital step before using the algorithms to obtain scientific information. As stated in section 5.3, only the reduction algorithms themselves were tested in the performed experiments, since the image combination processes are based on SciPy's implementations.


The details of the comparison carried out between the different results can be found in the description of Experiment 1, section 6.1.1.

In this regard, the FADRA implementation of the reduction algorithms developed in this thesis matches the standard of established astronomical software such as AstroPy, which was used for the comparison. As can be seen in section 7.1 for the first experiment, the comparison of FADRA's CPU reduction algorithm with AstroPy's ccdproc reduction process shows a very close match between the results. Three out of six datasets show differences that correspond only to precision errors between the results. This was expected, since the reduction process is standard and very different results would indicate an incorrect implementation of one of the algorithms.

In the cases where the NRMSE value was somewhat higher than mere differences in precision, detailed inspection of the datasets shows that the reduction results obtained with ccdproc present NaN values in some of the images, while the results obtained with FADRA do not. Invalid values are common occurrences in astronomical images, and can be caused by defective pixels on the detector or by cosmic rays hitting the detector while the image was being acquired. The differences in the values of the two reduced datasets are probably evidence of the different ways in which FADRA and AstroPy handle these bad pixels. In the case of FADRA, bad pixels are sigma-clipped until they reach a certain acceptable value defined by the user, as was explained in section 5.3.1. AstroPy's documentation shows that the package does not automatically deal with NaN values the way FADRA does. The user can provide masks for AstroPy arrays to deal with NaN values1, which was not explicitly done in this experiment. However, the differences obtained between the reduction results of the two procedures are still well below 1%, and thus can be considered acceptable in the context of this work.

Regarding the comparison of the reduction results obtained with FADRA's CPU and GPU implementations, it is clear that the differences here are only due to precision changes when transferring data to the GPU and computing on it, as was commented on in section 4.2. The very low NRMSE values between the results for every dataset show that choosing one of these implementations over the other does not yield significant differences, other than precision divergences that show up from the third decimal on. This outcome was expected and is acceptable within the context of this work.

Overall, the results obtained with FADRA's implementation of the reduction algorithms are correct, in the sense of being very similar to the ones obtained with AstroPy, both for the CPU and GPU implementations. This shows that the basic process of astronomical image analysis is well performed by the FADRA framework, and that the subsequent processes to be carried out over the data operate on correctly calibrated data.

Considering this, the reduction process implementation, along with the image combination algorithms used for Master file obtention, accomplishes the first goal stated in section 1.3.1 of this thesis: to develop a framework that provides the basic algorithms necessary for astronomical data analysis, corresponding to data reduction algorithms and image combination algorithms for the obtention of calibration files.

1 http://docs.astropy.org/en/stable/table/masking.html


8.2 Implementation of algorithms for light curve obtention

The next step regarding the validation of FADRA results consisted in analyzing whether the photometry and, eventually, light curve obtention algorithms yielded correct results. For this, two experiments were implemented: one to compare FADRA's CPU light curve obtention results against the ones obtained with AstroPy's photutils package (Experiment 2, section 6.1.2), and one to compare the results between FADRA's CPU and GPU implementations for light curve obtention (Experiment 3, section 6.1.3). In both cases, the curves were compared against each other to find a fixed offset between their points, as is further detailed in section 6.1.2.

It is important to note that in most cases of light curve analysis in astronomy, the most important characteristic of the curves is not the actual numerical values, but the difference between points of the curve at different observation times. Even though extreme differences in the values of the curves are not desired in this context, small numerical differences, along with the certainty that the shapes of the two curves being compared are the same, is a perfectly acceptable result for astronomical analysis of the variations of observations through time.

From the results of Experiment 2, shown in section 7.1.2, we see that a multiplicative offset exists between the results obtained with AstroPy and the ones obtained with FADRA's CPU implementation. This means that, even though the actual values of the points of the curves are different, both curves have the exact same shape. Given the fact that the standard deviation values of the vectors resulting from dividing the two curves are very small, it can be concluded from these results that the points inside each curve have extremely similar behaviors. This means that the characteristics intrinsic to the curves are maintained between the light curves obtained with AstroPy and the ones obtained with FADRA. Even though there are small differences in the photometry measurement values of the curves, the internal behavior of the points being the same allows for the same time series analyses to be carried out. These results show that FADRA's CPU algorithm for light curve obtention provides results that are up to the standards of established astronomical software, and thus can be used confidently in scientific applications.

It is interesting to see, in the results of Experiment 2, that the mean values obtained for the division of the curves oscillate around 1, with very small deviations from said value. This is a good result, since it shows that the actual numerical differences between the points of the two curves are not excessive. It also gives a hint about what could be causing the differences. Let us remember how aperture photometry is performed (detailed in section 3.4.3): an aperture radius is defined to cover the target object completely, and an annulus is then defined around the object. It is important that the aperture radius envelops only the target, with as little sky as possible. Likewise, the annulus has to envelop only background sky and contain no other stars or objects; in case this is not possible, it must encompass the lowest possible area of other objects. After these shapes have been defined, the sum of the luminous flux inside the aperture radius is calculated and, in most cases, a polynomial fit of the background sky inside the annulus is obtained, which is then used to estimate the amount of background sky present inside the aperture radius. This value is then subtracted from the flux inside the aperture, obtaining the estimated value of the luminous flux coming only from the target astronomical object.

There are, however, many different ways to estimate the sky background inside the annulus. In the case of FADRA's CPU implementation, as is further detailed in section 5.4.4, the fit was obtained using a polynomial of the degree defined by the user (1 is the default). Different implementations of the aperture photometry algorithm can model the sky background of the images in many different ways. Because of this, it is likely that the small numerical differences between the points in the two curves are due to the fact that AstroPy's implementation uses a different way to model the sky background that is subtracted from the flux inside the aperture radius.

The same analysis can be performed for the comparison between the results obtained with FADRA's CPU and GPU implementations in Experiment 3 (section 7.1.3). Again, an additive offset does not seem to be present between the curves, but the results suggest the existence of a multiplicative one. In this case, the numerical differences between the values inside the curves (the average values of the division result vectors) are higher than in the previous case. A bigger numerical difference was expected between the results obtained with CPU and GPU, given the fact that the GPU photometry measurements are done through the approximation presented in equation 5.1. The approximation used to estimate the sky background in the GPU photometry algorithm is expected to yield results much less exact than the ones obtained with the CPU algorithm. Also, as was explained in section 5.4.4, the use of atomic operations in the GPU photometry kernel generates precision errors as well. Determining whether a pixel is inside the aperture radius or the sky annulus is based, in the GPU version, only on integer operations, which could bring about small variations between the CPU and the GPU results.

In general, the light curves obtained with the GPU photometry implemented in FADRA are well within the acceptable range to perform scientific analyses. The small standard deviations in the results of dividing the two curves mean the CPU and GPU curves have the same shape.

Taking this into consideration, the light curve obtention algorithms implemented in FADRA fulfill the second goal established for this thesis: to provide algorithms for automated light curve generation with as little user intervention as possible, which are also efficient in execution time and provide good results.

8.3 GPU implementation of algorithms

After the results obtained with FADRA's different implementations were proven to be correct, the next step was to evaluate the GPU algorithms to analyze whether there are significant speedups in the execution times. For this, two research questions to be answered with the work of this thesis were established:


Q1: Is it possible to obtain significant GPU speedup in astronomical algorithms that deal with a large amount of data transfers between CPU and GPU?

Q2: Are these speedups justified? In other words, is the obtained acceleration worth it considering the extra implementation effort that GPU algorithms entail?

Looking at the execution time results obtained from the reduction algorithms, we see that this process does not benefit from a direct GPU implementation. Reducing complete astronomical images for big datasets requires an intensive amount of data transfers between host and device, which overshadows any gains in execution time that might be obtained by performing the operations on the GPU. However, it is interesting to note that some datasets show lower execution times than previous datasets containing fewer images, as is the case with datasets 3 and 5 (Figure 7.3). The expected outcome would be that as the number of images in each dataset increases, so would the execution times, especially in this case where the images from all datasets are about the same size. This suggests the presence of certain characteristics within each dataset that might make it better or worse suited for being processed on the GPU. This is also supported by the fact that a slight decrease in execution time can be seen for these datasets in the CPU results as well. In any case, considering the intensive data transfer required by the nature of the reduction process, poor execution times were expected for the GPU implementation of the algorithm.

For the reduction algorithms, slower execution times were expected for the GPU version because of the large amount of data transfer between host and device that this process requires. The light curve obtention process, however, was different, and initially no real guesses could be made about the outcome regarding execution times. While the amount of data used by the light curve obtention algorithms was significantly reduced compared to the amount used for the reduction process, there were no guarantees that the data stamp approach would work faster on the GPU than on the CPU: the data transfer process could still overshadow the computational gains.

However, as can be seen in the results (Figure 7.4), in this case there are important gains in execution time when using the GPU version of the light curve obtention algorithm. This shows that the stamp approach, designed to reduce the amount of data used for light curve obtention, does turn out to be an efficient way to reduce data transfer between host and device and, as such, it takes advantage of the acceleration provided by GPU parallelization. A sketch of the idea is shown below.
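The following minimal numpy sketch illustrates the stamp idea under the assumption that the target coordinates are already known and lie away from the frame edges; the function name and signature are hypothetical and do not reproduce FADRA’s actual implementation:

    import numpy as np

    def extract_stamps(image, coords, half_size):
        # Cut a small square stamp around each target star. Only these
        # stamps, instead of the full frame, are later transferred to
        # the GPU, keeping host-device data movement small.
        stamps = []
        for x, y in coords:
            stamps.append(image[y - half_size:y + half_size + 1,
                                x - half_size:x + half_size + 1])
        return np.stack(stamps)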

In terms of the second research question, the light curve obtention algorithm shows a strong decrease in execution time. This demonstrates that developing a GPU version of the algorithm is worthwhile, in terms of speed gains, despite the extra implementation effort. Using the aperture photometry approximation presented in section 5.4.4, the implementation of the GPU kernel turns out to be straightforward; therefore, developing a GPU version of this algorithm is worth considering.

In combination with the comments about the results yielded by the GPU light curve obtention algorithm discussed in the previous section, this implementation successfully fulfills the second and third goals established for this thesis, which were: to provide algorithms for automated light curve generation with as little user intervention as possible, while being efficient in execution time and generating good results; and to provide GPU-accelerated versions of the reduction and light curve obtention algorithms.

8.4 Future work

It is important to keep in mind that the work carried out in this thesis aims to set the bases for the FADRA framework, which must keep being developed, implemented, and improved. The work presented here does not represent the full scope of the FADRA framework, but only the first, basic, and vital implementations needed to start this scheme for GPU-powered analysis of astronomical data.

With this said, this thesis should be considered the starting point for the development of the FADRA framework, but there is still plenty to be done. Future endeavors with FADRA not only encompass enhancing the algorithms developed for the scope of this thesis, but also adding new functionalities to the framework. These new additions should be designed and implemented within the careful experimental method developed for the present project, as a means to make sure that the expansions are correct and that the results yielded by FADRA are always kept up to the standards of established astronomical software.

Future work is separated into two scopes: first, work to be developed over the implementations carried out and presented in the context of this thesis, and second, the future developments to be added to FADRA and the expectations for the framework in the short and long term.

8.4.1 Within the scope of this thesis

One of the priorities for future work on the algorithms implemented for this thesis is fixing the precision differences obtained between CPU and GPU results. Even though these differences are still acceptable in the context of the analyses to be carried out over these results, it is critical that in the future FADRA can offer the certainty that the results obtained with the framework are exact, across all its different implementations.

While it is true that some precision differences are inherent to the use of the GPU to perform calculations, the use of atomic operations for the photometry approximation adds even more precision problems by being available only for integers. Designing new approximations for the photometry calculation and/or new implementations of the current kernel could bring about more exact approximations and, consequently, results closer to the ones obtained with the CPU implementation of the photometry algorithms. Also concerning GPU implementations, a next step in the development of FADRA would be to design GPU algorithms for image combination and the obtention of Master calibration files, as well as for filter application over images.
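To illustrate the kind of error introduced by integer-only atomic accumulation, the following CPU-side Python sketch emulates a fixed-point scheme; it is only an illustration of the effect, not FADRA’s kernel, and the scale factor is an arbitrary assumption:

    import numpy as np

    SCALE = 1000  # fixed-point scale; a coarser scale means larger rounding error

    def integer_atomic_sum(pixel_fluxes):
        # Emulate accumulating aperture flux with an integer-only
        # atomic add: each flux value is truncated to fixed point
        # before being accumulated, losing the fractional remainder.
        acc = 0
        for flux in pixel_fluxes:
            acc += int(flux * SCALE)
        return acc / SCALE

    fluxes = np.random.default_rng(0).uniform(0.0, 50.0, 10_000)
    # The truncated sum falls systematically below the float sum.
    print(integer_atomic_sum(fluxes), fluxes.sum())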

Another useful feature to develop in relation to the GPU algorithms is the identification of characteristics of the datasets that might make them better or worse candidates for processing on the GPU. For example, in the execution times for the reduction process (Figure 7.3), some datasets yielded lower execution times than other datasets containing fewer images. Considering only the number of images as a distinctive characteristic of a dataset, and considering that all images of all datasets are similar to each other, the number of images should be the only influence on the execution time. However, as can be seen from the results, this is not always the case. The different execution times are not the only result that reveals differences between datasets: the NRMSE values, and even the standard deviation results obtained in the comparison of light curves, could also reflect characteristics intrinsic to each dataset.

Studying the features that characterize each dataset could make it possible to implement a test that determines, for example, whether a dataset will be processed faster by the CPU reduction algorithm or by the GPU reduction algorithm. This is relevant considering that, for some data points, the GPU reduction time gets very close to the CPU reduction time. It would be very useful to analyze whether a feature exists that makes some datasets faster to transfer to and operate on the GPU. Some characteristics worth analyzing are, for example, the internal standard deviation of each image in the dataset, the standard deviation of the dataset as a whole, and how crowded with stars each image of the dataset is, among others. Studying the execution times not only in terms of dataset size but also in terms of these characteristics would bring a different perspective on what makes a dataset fit for the GPU. In the future, FADRA may be capable of automatically deciding whether to execute the CPU or the GPU implementation of a certain algorithm, making the process even more automated and thus more suitable to be executed within larger scripts for astronomical data processing; a toy version of such a dispatcher is sketched below.
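The following toy Python dispatcher sketches the idea; the features, score, and threshold are placeholders, since identifying the features that actually predict GPU-friendliness is precisely the open question:

    import numpy as np

    def choose_backend(dataset, threshold=0.5):
        # dataset: a list of 2-D numpy arrays (one per image).
        n_images = len(dataset)
        # Hypothetical features: dataset size and the mean internal
        # standard deviation of the images.
        mean_std = np.mean([img.std() for img in dataset])
        score = np.log1p(n_images) * mean_std  # placeholder score
        return "gpu" if score > threshold else "cpu"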

Regarding light curves, the natural step after the obtention procedure is the implementation of algorithms to fit and perform statistical analysis over said curves. Time series analysis requires the application of statistical methods over a series of data in order to obtain information and make predictions about it. Correlations, trends, and seasonal variations are just some of the internal structures that scientists look for in their curves when performing such analyses. This would be the next development step for FADRA’s light curve procedures; a small example of such an analysis is sketched below.
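As one small example of this kind of analysis, the following sketch computes the autocorrelation of a light curve, which can reveal periodic structure; the function is a generic illustration, not an existing FADRA routine:

    import numpy as np

    def autocorrelation(flux, max_lag=50):
        # Mean-subtracted autocorrelation (max_lag must be shorter
        # than the curve); peaks at nonzero lags hint at periodicity.
        flux = np.asarray(flux, dtype=float) - np.mean(flux)
        var = np.dot(flux, flux)
        return np.array([np.dot(flux[:len(flux) - k], flux[k:]) / var
                         for k in range(max_lag + 1)])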

Even though the main aim of the FADRA framework is to run smoothly inside bigger scripts for data analysis, a wrapper class was developed to provide a Graphical User Interface (GUI) for the determination of photometry parameters and the obtention of light curves. The idea is to allow users to get a visual feel for the data being analyzed, and to help them select the proper parameters for each dataset. Considering this, an attractive addition to this GUI would be the presentation of quality checks at different milestones of the process. For example, after the Master calibration files have been obtained, the images themselves in different visualizations, along with statistical information about them, could be shown to the user to help spot any possible problems along the way. If one of the images used to obtain the Masters had bad pixels, for example, this would be reflected in the final result and the user would be able to see it before continuing with the scientific processes.

Another addition that could help make sure that the data and parameters to be used are correct is to show the user not only the image on which to select the target star and the aperture photometry parameters, but also the radial profile of the selected star. The radial profile corresponds to a “sliced” visualization of the star, shown as a Gaussian-like curve, which makes it easy to determine at which distance from the center the luminous flux stops corresponding to the star and begins showing only background sky.
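A minimal numpy sketch of such a radial profile, assuming a small stamp around the star and a known center (names hypothetical):

    import numpy as np

    def radial_profile(stamp, center):
        # Median flux in concentric 1-pixel-wide rings around the
        # star's center; the profile falls off to the sky level.
        y, x = np.indices(stamp.shape)
        r = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2).astype(int)
        return np.array([np.median(stamp[r == k])
                         for k in range(r.max() + 1)])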

8.4.2 The FADRA framework

FADRA is expected to encompass not only the basic algorithms for astronomical data processing, but also implementations of more complicated and detailed procedures to be carried out over different types of astronomical data. Today, the FADRA framework provides the basic tools for the analysis of astronomical images, but it was conceived to be easily expanded by its users. To this end, the FADRA framework is expected to be publicly released soon, both to receive feedback from real users and to begin adding new functionalities.

The new modules added to the FADRA framework are expected to maintain the focus on developing new approaches to known astronomical algorithms, as a means to provide GPU implementations for common processes in astronomical data analysis. This is because, with the survey era of astronomy approaching, faster ways to analyze data will become more necessary as the years go by. In the future, the astronomer will not go to the observatory to obtain data and then reduce it at home: survey telescopes will observe the complete night sky, and the astronomer will have access to an online database from which the data of interest will simply be downloaded, already calibrated and ready for scientific analyses. The process may evolve even further, and the astronomer will not have access to the images themselves, but to the light curves or the relevant data already calculated. Because of this, and in order to remain a useful tool for decades of astronomy, the developments added to FADRA should always be designed keeping in mind that fast handling of big amounts of data must be a priority.

The previous comment also relates to the scalability of GPU systems, as discussed in Chapter 4. To add more computing power to a GPU cluster, all that has to be done is to add more GPUs to the machine; only minimal interventions to the code are needed, if any. GPU implementations not only have the benefit of being many times faster (when the dataset is appropriate and the algorithm is well implemented), but their benefits can also be experienced all the way from the user level, with just a few images, up to big clusters processing hundreds of astronomical images from survey telescopes per minute.

Astronomy is evolving, and so should the software used to practice it. If this is kept in mind when developing new additions for FADRA, or for any other astronomical software, the tools used by astronomers will evolve in the same direction in which data and technology lead.



Appendix A

Details of results

A.1 Validation of reduction results

Table A.1 presents the values of the Normalized Root Mean Square Error (NRMSE) obtained after averaging the NRMSE over all the images in each dataset. The first two columns show the NRMSE between AstroPy results using ccdproc and FADRA’s CPU reduction results. The next two columns show the NRMSE between FADRA’s CPU and GPU reduction results.

              NRMSE (%)                          NRMSE (%)
              AstroPy vs. FADRA                  FADRA CPU vs. FADRA GPU
    Dataset   Average     Standard deviation     Average     Standard deviation
    1         0.0348      0.0005                 0.0031      0.0002
    2         2.562E-9    2.022E-9               3.132E-9    2.383E-9
    3         0.5801      0.3293                 0.0057      0.0010
    4         8.471E-9    7.106E-11              1.190E-8    2.314E-10
    5         0.0380      0.0187                 0.0046      0.0006
    6         1.026E-9    3.27E-10               1.157E-9    4.303E-10

Table A.1: Normalized Root Mean Square Error (%) for validation of reduction results using AstroPy’s ccdproc package and FADRA’s CPU and GPU implementations.
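For reference, a straightforward way to compute the NRMSE between two images is sketched below; note that the normalization convention used here (the pixel value range of the reference image) is an assumption, as several variants of the metric exist:

    import numpy as np

    def nrmse_percent(reference, result):
        # Root mean square error normalized by the dynamic range
        # of the reference image, expressed as a percentage.
        ref = np.asarray(reference, dtype=float)
        res = np.asarray(result, dtype=float)
        rmse = np.sqrt(np.mean((ref - res) ** 2))
        return 100.0 * rmse / (ref.max() - ref.min())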

A.2 Execution time results

The following tables correspond to the results of running the timing experiments. Each experiment was run four times. The results from each independent run are shown in Table A.2 for the timing of the reduction process, and in Table A.3 for the timing of light curve obtention. The values finally plotted and shown in section 7.2 correspond to the mean and standard deviation obtained from the experimental runs. Those values can be found in Table A.4 for the reduction algorithms, and in Table A.5 for light curve obtention.


              Run 1              Run 2              Run 3              Run 4
    Dataset   CPU (s)  GPU (s)   CPU (s)  GPU (s)   CPU (s)  GPU (s)   CPU (s)  GPU (s)
    1         1.58     1.74      1.61     1.81      1.58     1.79      1.59     1.71
    2         2.42     3.80      2.47     3.82      2.63     3.88      2.69     4.58
    3         2.93     3.13      3.30     3.38      3.00     3.22      3.21     3.42
    4         4.06     6.15      4.04     6.18      4.10     6.09      3.99     6.04
    5         4.00     4.27      3.94     4.23      3.94     4.60      3.87     4.39
    6         6.36     9.73      6.50     9.90      6.80     11.02     6.65     10.85

Table A.2: Execution times for four runs of the astronomical image reduction algorithms.

              Run 1              Run 2              Run 3              Run 4
    Dataset   CPU (s)  GPU (s)   CPU (s)  GPU (s)   CPU (s)  GPU (s)   CPU (s)  GPU (s)
    1         4.66     0.52      4.66     0.47      4.65     0.53      4.66     0.53
    2         1.66     0.50      1.68     0.42      1.68     0.51      1.66     0.51
    3         11.97    0.98      12.04    0.85      12.06    0.97      12.32    1.01
    4         5.44     0.78      5.51     0.83      5.46     0.75      5.57     0.81
    5         5.43     1.06      5.44     0.86      5.48     1.01      5.45     1.01
    6         14.88    1.93      15.09    1.72      15.04    1.87      11.64    1.73

Table A.3: Execution times for four runs of the light curve obtention algorithms.

              CPU (s)                          GPU (s)
    Dataset   Average  Standard deviation      Average  Standard deviation
    1         1.59     0.01                    1.76     0.04
    2         2.55     0.12                    4.02     0.37
    3         3.11     0.17                    3.29     0.13
    4         4.05     0.04                    6.11     0.06
    5         3.94     0.05                    4.37     0.16
    6         6.58     0.18                    10.38    0.65

Table A.4: Average execution time for the astronomical image reduction algorithms.

              CPU (s)                          GPU (s)
    Dataset   Average  Standard deviation      Average  Standard deviation
    1         4.65     0.01                    0.51     0.02
    2         1.67     0.01                    0.48     0.04
    3         12.09    0.15                    0.95     0.07
    4         5.49     0.05                    0.79     0.03
    5         5.45     0.02                    0.98     0.08
    6         14.16    1.68                    1.81     0.10

Table A.5: Average execution time for the light curve obtention algorithms.
