
MNRAS 000, 1–31 (2022) Preprint 29 June 2022 Compiled using MNRAS LATEX style file v3.0

The PAU Survey: Narrow-band image photometry

S. Serrano1,2,3∗, E. Gaztañaga1,2, F. J. Castander1,2, M. Eriksen4†, R. Casas1,2, A. Alarcon5, A. Bauer1,2, L. Cabayol4, J. Carretero4†, E. Fernandez4, D. Navarro-Gironés1, C. Neissner4†, P. Renard1,2,6, P. Tallada-Crespí4†, N. Tonello7†, I. Sevilla-Noarbe8, M. Crocce1,2, J. García-Bellido9, H. Hildebrandt10, H. Hoekstra11, B. Joachimi12, R. Miquel4, C. Padilla4, E. Sánchez8 and J. de Vicente8

1 Institute of Space Sciences (ICE, CSIC), Campus UAB, Carrer de Can Magrans, s/n, 08193 Barcelona, Spain
2 Institut d’Estudis Espacials de Catalunya (IEEC), 08193 Barcelona, Spain
3 Satlantis, University Science Park, Sede Bld 48940, Leioa-Bilbao, Spain
4 Institut de Física d’Altes Energies (IFAE), The Barcelona Institute of Science and Technology, 08193 Bellaterra (Barcelona), Spain
5 HEP Division, Argonne National Laboratory, Lemont, IL 60439
6 Department of Astronomy, Tsinghua University, Beijing 100084, China
7 Barcelona Supercomputing Center (BSC), Plaça Eusebi Güell 1-3, 08034-Barcelona, Spain
8 Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
9 Instituto de Fisica Teorica UAM/CSIC, Universidad Autonoma de Madrid, 28049 Madrid, Spain
10 Ruhr University Bochum, Faculty of Physics and Astronomy, Astronomical Institute (AIRUB), German Centre for Cosmological Lensing, 44780 Bochum, Germany
11 Leiden Observatory, Leiden University, Niels Bohrweg 2, 2333 CA, Leiden, the Netherlands
12 Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, UK

Accepted XXX. Received YYY; in original form ZZZ

ABSTRACT

PAUCam is an innovative optical narrow-band imager mounted at the William Herschel Telescope built for the Physics of the Accelerating Universe Survey (PAUS). Its set of 40 filters results in images that are complex to calibrate, with specific instrumental signatures that cannot be processed with traditional data reduction techniques. In this paper we present two pipelines developed by the PAUS data management team with the objective of producing science-ready catalogues from the uncalibrated raw images. The Nightly pipeline takes care of all image processing, with bespoke algorithms for photometric calibration and scatter-light correction. The Multi-Epoch and Multi-Band Analysis (MEMBA) pipeline performs forced photometry over a reference catalogue to optimize the photometric redshift performance. We verify against spectroscopic observations that the current approach delivers an inter-band photometric calibration of 0.8% across the 40 narrow-band set. The large volume of data produced every night and the rapid survey strategy feedback constraints require operating both pipelines in the Port d’Informació Científica data centre with intense parallelization. While alternative algorithms for further improvements in photo-z performance are under investigation, the image calibration and photometry presented in this work already enable state-of-the-art photometric redshifts down to 𝑖AB = 22.5.

Key words: cosmology: observation – cosmology: large-scale structure of Universe – methods: data analysis – techniques: image processing – techniques: photometric – instrumentation: detectors

∗ E-mail: [email protected]
† Also at Port d’Informació Científica (PIC), Campus UAB, C. Albareda s/n, 08193 Bellaterra (Cerdanyola del Vallès), Spain

© 2022 The Authors

arXiv:2206.14022v1 [astro-ph.IM] 28 Jun 2022

1 INTRODUCTION

Current cosmological studies have increased in data volume and complexity to the point that traditional analysis methods are no longer practical. In the past, the data obtained by an astronomer could be processed at the telescope itself or on a personal computer at the research institute. Today, the massive volume and complex analysis require processing in a data center, or even in a grid of computing centers. In 2000, when the Sloan Digital Sky Survey (SDSS) (York et al. 2000) started its observations, the volume of data produced in a single week was larger than all previous data collected in the history of astronomy. The whole SDSS data set will be negligible compared to the 60 PB of raw observations that will be produced by the Vera Rubin Observatory (Ivezić et al. 2019).

Fortunately, the increased volume of observations came with new and more powerful computing technologies that enabled its necessary analysis. The era of big data arrived in time, providing the data management tools we need. The data reduction techniques used in previous surveys cannot simply be reapplied, and new scalable algorithms had to be designed to meet the stricter needs of today’s studies. Higher-level languages such as Python enable fast and versatile program development that was not possible with older programming languages like Fortran or C. Even standard astronomy libraries such as IRAF (Tody 1986), used for decades, are becoming obsolete next to more flexible astronomical Python libraries like Astropy (Astropy Collaboration et al. 2018) or Photutils (Bradley et al. 2020). The combination of parallel processing in High Throughput Computing data centers with these new advanced libraries has changed the paradigm in which data reduction pipelines are being built.

In this paper we describe the particular image processing and analysis required for a large-scale narrow-band cosmology survey, the Physics of the Accelerating Universe Survey (PAUS). To achieve the scientific goals of the survey, we built PAUCam (Padilla et al. 2019), a large field-of-view camera equipped with a large set of narrow-band filters that enables low-resolution spectra for all the sources in the sky. This massive camera was mounted at the prime focus of the William Herschel Telescope (WHT). Its 4.2 m diameter mirror allows us to observe fainter objects through the reduced transmission of the particular narrow-band filters of PAUS. Both the camera and the data reduction system are designed to optimize the photometric redshift precision, delivering complete and homogeneous galaxy catalogues down to a magnitude of 𝑖AB = 22.5.

This paper describes the imaging data set we are dealing with (§2), the raw instrumental detrending (§3), the astrometric (§4) and photometric (§5) calibration, the particular forced photometry process (§6) that enables the science-ready catalogues, and the validation process (§7). We emphasise the specific challenges of processing narrow-band images that prevent us from using generic software. We also describe the validation procedures and intensive operations at the computing center.

2 IMAGING DATA

Here we describe the image data used in this work. First, in §2.1, we describe the uncalibrated raw data that come directly from the camera; then, in §2.2, we define the reduced data produced by the PAUS data management system (PAUdm) after the instrumental detrending process detailed in §3.

2.1 PAUCam Raw exposures

PAUCam is an 18-detector imaging camera (Padilla et al. 2019) with 40 narrow-band filters, covering the range from 450 nm to 850 nm in steps of 10 nm. After passing through the optical system, the incident photon fluxes on the Charge-Coupled Device (CCD) detectors are converted into photo-electrons, which are then stored in the individual CCD pixel potential wells. During the read-out process of the detectors, the charge in each pixel is passed sequentially (clocked) to an on-chip amplifier that converts the charge into a voltage and amplifies this voltage before sending it to an Analog-to-Digital Converter (ADC). To reduce read-out time, PAUCam CCDs have four output amplifiers, one for each read-out region consisting of 4096 rows and 512 columns.

Figure 1. A sky exposure of the full 18-detector PAUCam mosaic. As a raw image, all instrumental signatures are present, and the 72 amplifiers can be identified as well as the vignetting from the WHT prime focus corrector.

The data produced by the mosaic array are packed into multi-extension FITS (MEF) files (Wells et al. 1981) containing the pixel imaging data for the various types of frames, with associated metadata in the header of each extension. Separate header-data units (HDUs) are created for each amplifier, resulting in a 72-extension MEF file of about 670 MiB.
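The numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below counts the extensions and estimates the size of the imaging area alone. The 4 bytes per pixel is an assumption (the text does not state the storage type), and overscan columns and headers are not counted, so the result is only a lower bound to compare against the quoted 670 MiB.

```python
# Back-of-the-envelope bookkeeping for one raw PAUCam exposure.
N_DETECTORS = 18          # from the text
AMPS_PER_DETECTOR = 4     # from the text
ROWS, COLS = 4096, 512    # imaging region read by each amplifier (from the text)
BYTES_PER_PIXEL = 4       # ASSUMPTION: 32-bit storage, not stated in the text

# One FITS extension per amplifier.
n_extensions = N_DETECTORS * AMPS_PER_DETECTOR

# Four 512-column regions side by side give the 2K x 4K detector format.
detector_shape = (ROWS, COLS * AMPS_PER_DETECTOR)

# Size of the imaging pixels only (no overscan, no headers), in MiB.
imaging_mib = n_extensions * ROWS * COLS * BYTES_PER_PIXEL / 2**20

print(n_extensions, detector_shape, imaging_mib)
```

Under these assumptions the imaging pixels alone account for 576 MiB. The rest of the quoted file size would have to come from data not counted here, which the text does not itemise.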

During the afternoon, bias and dome flats are observed for calibration purposes. In the twilight, when the sky is too bright for science observations, a high-altitude standard star is observed with each detector-filter set for photometric calibration and total system throughput calibrations. Once the sky is dark enough, the main scientific field observations are taken.

The PAUCam raw data are organized in observation sets that group exposures from a single night. Typically, for an observing night, the calibration frames (bias and dome flat-fields) and the science exposures are stored in separate observation sets. Observation sets also allow PAUS observations to be separated from community observations on shared nights.

The three main types of raw exposures are

• Bias frames: zero exposure time images for electronic pattern calibration.

• Flat-field frames: images of a screen with a flat illumination at the dome for total throughput calibration.

• Science frames: the scientific exposures with the target sources on sky. An example of a raw science exposure is shown in Figure 1.

There are also other types of exposures (e.g. stacked focus or Photon Transfer Curve (PTC) sequences) but these are not part of the regular data reduction process. So far we have observed 240 nights at the WHT, producing a total of 68,000 raw exposures.

2.2 PAUdm reduced images

After the image processing that will be described in the following sections, five more types of images are created:

• Master Bias: stack of corrected bias images for electronic bias calibration (described in §3.2).

• Master Flat: stack of processed flat-field frames to compensate for throughput variations (described in §3.4.1).

• Reduced Science: the instrumentally calibrated science images (described in §3).

• Reduced Mask: the associated mask image with flag values per pixel (described in §3.6).

• Reduced Weight: the associated weight image (described in §3.6).

All reduced images contain only 8 extensions, one per detector on the 8 most central and illuminated CCDs. Additionally, the Point Spread Function (PSF) models and astrometry catalogues are stored in the archive, together with other associated data that are recorded in the PAUS database, such as image zero-points and single-epoch detections.

3 INSTRUMENTAL DETRENDING

Raw images contain significant instrumental signatures that need to be neutralized or masked before carrying out any photometric measurement. This section describes the correction flow for all raw science exposures from PAUCam.

3.1 Gain calibration and PTC analysis

During the assembly of the PAUCam instrument (Casas et al. 2012), we performed the first Photon Transfer Curve (PTC) analysis with the method described in Janesick (2001) to estimate the gain for each of the four amplifiers in all detectors. Once on the mountain, with PAUCam mounted at the Prime Focus of the William Herschel Telescope under a stable installation, we repeated the test, as the environmental conditions were different (low temperature, pressure, and humidity) and the camera now interacted with other devices and a different grounding. We have repeated this test on each observing run since 2015.

To obtain the PTC we take dome flats with a broad-band filter, usually 𝑔 or 𝑟, as pairs of images with scaled exposure times from 1 to 30 seconds, covering the range from very low counts up to saturation. Due to the strong vignetting caused by the WHT prime focus optical system, PAUCam allocates narrow-band filters only on the eight central detectors. The external detectors use broad-band filters for calibration and guiding purposes. For this work we focus on the central detectors with narrow-band filters. An analysis of the gain for the full focal plane (the 18 detector set) is described in Padilla et al. (2019). Each 2𝐾 × 4𝐾 Hamamatsu CCD of PAUCam has four outputs, and each one must be analysed independently for the PTC analysis, so we analyse 32 regions from the eight central CCDs.

Figure 2. The relation between the mean flux value and its variance used to infer the gain as described in the PTC process for the first amplifier of CCD 1 in a test during the 19A period. The gain is measured as the inverse of the first order component of the parabolic fit (green line). A linear fit over the whole flux range (yellow line) cannot be used to estimate the gain as it delivers a biased value due to the non-linear response of the detector. [Plot axes: Signal (ADU) against Variance (ADU²), with a colour scale of histogram counts; legend: Linear fit: 0.82, Parabolic fit: 0.69.]

To compute the PTC we subtract the median overscan value from each image and average the pairs of images with the same exposure time to remove possible patterns and reduce the noise. A random selection of small squares of 100 × 100 pixels in the subtracted and averaged images is used to determine the mean signal (in Analog-to-Digital Units, or ADU) and the variance (in ADU²). The PTC represents the variance as a function of the signal, as shown in Figure 2. The gain can be evaluated from a fit of the variance against the signal: the slope gives the inverse of the gain, with the gain expressed in 𝑒−/ADU. As the detectors do not respond linearly at the high end of the flux range, as shown in Figure 2, we measure the slope in the linear regime of the detector range.
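As an illustration of the fit described above and in the caption of Figure 2, the following sketch builds a synthetic PTC and recovers the gain as the inverse of the first-order coefficient of a parabolic fit. Everything about the synthetic data is an assumption made for the example: pure Poisson noise, a perfectly linear detector, a true gain of 0.7 e−/ADU, and 20 boxes per illumination level. The overscan subtraction and image pairing of the real procedure are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(42)
true_gain = 0.7      # e-/ADU, hypothetical value close to the average quoted in the text
box = 100            # side of each square region in pixels (from the text)
n_boxes = 20         # ASSUMPTION: number of random boxes per illumination level

signal, variance = [], []
for electrons in np.linspace(2_000, 70_000, 15):      # mean level per exposure, in e-
    # Poisson photo-electrons converted to ADU with the true gain.
    regions = rng.poisson(electrons, size=(n_boxes, box, box)) / true_gain
    signal.append(regions.mean())                                 # ADU
    variance.append(regions.var(axis=(1, 2), ddof=1).mean())      # ADU^2

# Parabolic fit: variance = c2 * signal**2 + c1 * signal + c0.
c2, c1, c0 = np.polyfit(signal, variance, deg=2)

# The first-order coefficient is the inverse gain.
gain_estimate = 1.0 / c1
print(f"estimated gain: {gain_estimate:.3f} e-/ADU")
```

With a linear synthetic detector the quadratic coefficient is consistent with zero; on real data it is this term that absorbs the non-linear response at the high end of the flux range.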

Since the first installation of PAUCam at the Prime Focus of the WHT in 2015 we have run 16 PTC tests. The value of the gain varies from amplifier to amplifier but has remained stable in time across the various observing runs.

3.2 Overscan, gain, bias and flat-field correction

The electronic readout process adds an artificial pedestal signal across the whole image, biasing the values of each pixel by a constant number. In order to estimate this bias value, which varies slightly from one amplifier to another, we use the overscan section, where only the bias signal is present. We estimate the value of the bias in each amplifier by computing the median row by row in the overscan region. To correct for low-frequency oscillations in the vertical readout, an 8-row Gaussian filtering is applied, allowing a more accurate subtraction of the electronic bias, as the single-row median did not deliver accurate statistics of the varying bias value. The gain estimated from the PTC analysis is applied to each amplifier array, and ADUs are transformed to electrons at this early stage of the process. To compensate for readout patterns that are present in all exposures, a master bias frame is produced, combining around 10 individual zero-exposure bias frames. As we have identified residual patterns in the first 2 or 3 exposures after a full readout system restart, the individual bias frames are analyzed and those with abnormal levels of noise are removed from the median average. Finally, a master flat is produced from the individual dome-flat exposures that are taken every afternoon, prior to the night-time observations. We perform a median average of at least 5 individual flat-field exposures, to reduce noise fluctuations and cosmic ray hits. Scatter-light residuals found in the raw flat images need to be removed, as this signature in the flat images is an additive component of the light, while the flat-field needs to contain only multiplicative factors of the main optical path. The process of removing the scatter-light is described in §3.4.1, together with the correction of the scientific sky images. Once a clean master flat has been produced, we divide the science exposures from the same night by it, flattening the response across the field of view.
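The overscan and gain steps at the start of this chain can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline code: the function names are hypothetical, the "8-row Gaussian filtering" is interpreted here as a Gaussian kernel with a standard deviation of 8 rows (an assumption), and the bias drift, noise levels and gain are invented for the example.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=3.0):
    """Normalised 1-D Gaussian kernel truncated at `truncate` sigma."""
    half = int(truncate * sigma + 0.5)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def overscan_correct(data, overscan, gain, sigma_rows=8.0):
    """Subtract a smoothed row-by-row overscan bias and convert ADU to electrons."""
    row_bias = np.median(overscan, axis=1)            # one bias value per row
    kernel = gaussian_kernel(sigma_rows)              # ASSUMPTION: sigma = 8 rows
    half = len(kernel) // 2
    padded = np.pad(row_bias, half, mode="edge")      # avoid shrinking at the edges
    smooth_bias = np.convolve(padded, kernel, mode="valid")
    return (data - smooth_bias[:, None]) * gain

# Synthetic amplifier: slowly oscillating bias, flat 100 ADU signal, read noise.
rng = np.random.default_rng(1)
n_rows, n_cols, n_over = 512, 64, 32
rows = np.arange(n_rows)
bias = 1000.0 + 5.0 * np.sin(2 * np.pi * rows / 200.0)
data = bias[:, None] + 100.0 + rng.normal(0.0, 3.0, (n_rows, n_cols))
overscan = bias[:, None] + rng.normal(0.0, 3.0, (n_rows, n_over))

electrons = overscan_correct(data, overscan, gain=0.7)
print(electrons.mean())   # close to 100 ADU * 0.7 e-/ADU
```

Smoothing the per-row medians along the readout direction reduces the noise of each row estimate while still following the slow oscillation of the bias level.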

3.3 Cross-talk calibration and correction

Since the four amplifiers are read in parallel, it is possible that a current is induced through magnetic fields between the different channels read at the same time, an effect commonly known as crosstalk (Freyhammer et al. 2001). The charges from the same row are read simultaneously in all four amplifiers of each detector and in all 18 detectors. Amplifiers 1 and 3 read the pixels from right to left while amplifiers 2 and 4 read in the opposite direction, due to their arrangement in the detector. This allows us to identify unambiguously the pixels that are read simultaneously.

Another relevant instrumental effect occurs when too many electrons in a given pixel cause electrons to overflow to the nearby pixel wells in the same column. The average capacity of these detectors is ∼210,000 e−. This effect is known as saturation bleeding, and produces an elongated shape around very bright stars. If the pixel signal exceeds the 18-bit range (the current depth of the PAUCam ADCs), it saturates at 𝑁sat = 2^18 − 1 (or 262,143) ADU. With an average gain of 0.7, the final saturation value is limited by the ADC conversion limit and not by the full well. In the top panel of Figure 3 one can see ghost images parallel to the elongated star image due to the crosstalk effect. Even though we can only see the ghost image from the saturated pixels, the crosstalk effect happens at all levels but is only visible when the signal is strong enough to stand out from the noise in the background.
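The statement that the ADC, rather than the full well, sets the saturation level follows from a short calculation. The worked arithmetic below takes the average gain of 0.7 to be in units of 𝑒−/ADU, which is the convention used for the gain elsewhere in the text.

```latex
N_{\rm sat} = 2^{18} - 1 = 262\,143~\mathrm{ADU}

N_{\rm sat} \times g \approx 262\,143~\mathrm{ADU} \times 0.7~e^{-}/\mathrm{ADU}
            \approx 183\,500~e^{-} \; < \; 210\,000~e^{-}~\text{(average full well)}
```

The ADC therefore reaches its maximum value before the average pixel well is full.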

Figure 3. Top: An image crop showing an 8.5 magnitude star that saturates the detector, producing bleeding and a visible crosstalk signal on the mirrored positions of the remaining three amplifiers. Bottom: The same crop with crosstalk correction enabled. The 3 ghost signals from the saturated star mirrored on the other amplifiers are completely removed after applying the corresponding ratios and subtraction.

To remove the crosstalk signal we need to estimate the induction ratios 𝑟xy_ij between each pair of amplifiers, from detector 𝑥 amplifier 𝑦 to detector 𝑖 amplifier 𝑗. The signal of each amplifier is then the sum of its direct integration signal 𝐼int plus the signal of all the remaining amplifiers in the mosaic being read out at the same time, scaled by the induction ratio:

$$ I^{\rm tot}_{xy} = I^{\rm int}_{xy} + \sum_{i}^{18} \sum_{j}^{4} I^{\rm int}_{ij} \, r_{xy\_ij} \tag{1} $$

for a given set of detector 𝑥 and amplifier 𝑦.

Although crosstalk analyses were made at the facility lab in Barcelona, the crosstalk ratios may vary once the camera is mounted on the telescope, and thus we had to estimate them with sky data obtained when the instrument was mounted at the prime focus. In order to estimate the induction ratios between amplifiers 𝑖 and 𝑗 of exposure 𝑧, we measured the average image background level in the target amplifier, 𝑏𝑔i, and subtracted it from the average level in the mirrored positions of the saturated pixels, 𝑓j, looking for any change in flux with respect to the rest of the image background due to crosstalk:

$$ r^{z}_{ij} = \mathrm{median}\{ f_j(x, y) - bg_i \}, \tag{2} $$

where 𝑥, 𝑦 are saturated pixels in amplifier image 𝑗 of exposure 𝑧. If the mirrored position contains sources or is not flat enough, we discard the measurement. We combine the 𝑖, 𝑗 ratio measured in all available images, weighting by the number of saturated pixels available in each ratio:

$$ r_{ij} = \frac{\sum_z r^{z}_{ij} \, N^{z}_{\rm sat}}{\sum_z N^{z}_{\rm sat}}. \tag{3} $$
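Equation (3) is a weighted mean of the per-exposure ratios, with the number of saturated pixels as the weight. A minimal sketch is given below; the function name and the three example measurements are hypothetical and serve only to show the weighting.

```python
def combine_ratios(per_exposure):
    """Weighted mean of per-exposure crosstalk ratios, as in Equation (3).

    per_exposure: list of (ratio, n_saturated_pixels) tuples for one amplifier pair.
    """
    total_weight = sum(n_sat for _, n_sat in per_exposure)
    if total_weight == 0:
        raise ValueError("no saturated pixels available for this amplifier pair")
    return sum(ratio * n_sat for ratio, n_sat in per_exposure) / total_weight

# Hypothetical measurements of one amplifier pair in three exposures.
measurements = [(4.0e-4, 100), (5.0e-4, 300), (2.0e-4, 100)]
combined = combine_ratios(measurements)
print(combined)
```

Exposures with more saturated pixels contribute more to the combined ratio, which suits cases where the per-exposure estimate is noisier when few pixels are available.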

We compute the ratios for the 8 central detectors (where the narrow-band filters are located) against all 18 detectors and their 4 amplifiers, resulting in a total of 2272 pairs (32 target amplifiers, each paired with the other 71). To remove the crosstalk signal, each amplifier needs to subtract the flux of the other 71 amplifiers scaled by the corresponding ratio. Figure 3 illustrates how the ghost produced by the bright star on the right is removed after applying the calibrated crosstalk ratios.
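The subtraction can be sketched on a toy mosaic of four amplifiers. The sketch assumes that the amplifier arrays are stored in readout order, so that pixels read simultaneously share the same array index (on real images the mirrored readout directions would have to be undone first), and it uses a single hypothetical ratio of 4 × 10⁻⁴ for every pair. The function names are illustrative, not those of the pipeline.

```python
import numpy as np

# Pair count quoted in the text: 32 central amplifiers against the other 71.
N_AMPS_TOTAL = 18 * 4
N_AMPS_CENTRAL = 8 * 4
n_pairs = N_AMPS_CENTRAL * (N_AMPS_TOTAL - 1)

def add_crosstalk(intrinsic, ratios):
    """Equation (1): observed = intrinsic + sum over other amplifiers of ratio * intrinsic."""
    return intrinsic + np.einsum("ts,sij->tij", ratios, intrinsic)

def remove_crosstalk(observed, ratios):
    """First-order correction: subtract the observed signal of the other amplifiers."""
    return observed - np.einsum("ts,sij->tij", ratios, observed)

rng = np.random.default_rng(3)
n_amps, shape = 4, (64, 32)
intrinsic = 100.0 + rng.normal(0.0, 1.0, (n_amps, *shape))
intrinsic[0, :, 10] = 262_143.0            # saturated, bleeding column in amplifier 0

ratios = np.full((n_amps, n_amps), 4e-4)   # ASSUMPTION: same ratio for every pair
np.fill_diagonal(ratios, 0.0)              # an amplifier does not induce on itself

observed = add_crosstalk(intrinsic, ratios)
corrected = remove_crosstalk(observed, ratios)

ghost_before = np.abs(observed[1:] - intrinsic[1:]).max()
ghost_after = np.abs(corrected[1:] - intrinsic[1:]).max()
print(n_pairs, ghost_before, ghost_after)
```

Because the correction uses the observed rather than the intrinsic signal, a residual of second order in the ratios remains, which is far below the ghost level in this toy case.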

To verify the correct implementation of the crosstalk calibration, we have measured the same crosstalk ratios over a dataset that has already been crosstalk corrected. Figure 4 illustrates the matrix of crosstalk ratios. It can be seen how the intra-detector crosstalk ratios are significantly higher than the inter-detector ones, as expected due to the larger wiring distance in the electronics. We measure the crosstalk ratios in a large dataset with more than 430,000 images, and apply those ratios to a different dataset (as applying them to the same dataset would give a zero residual by construction). We can see how the crosstalk ratios reduce from ∼0.04% to less than 0.002% once the crosstalk calibration has been applied.

3.4 Scatter-light correction

The large narrow-band set of PAUS required engineering an efficient design to include all 46 filters, as described in Padilla et al. (2019). The final solution was to distribute multiple small filters, each covering a single detector, in filter trays, such that with 5 trays and 8 narrow-bands in each, covering the central most illuminated detectors, we could cover the entire wavelength range. This design caused a side effect in the quality of the images, as significant reflections from the lateral edge of the filter and the filter trays themselves created localized scatter-light at the edge of each detector image (see the top panels of Fig. 7 and 8). This effect was identified quickly and the camera was reopened in mid 2016 to redesign the filter trays and minimize the problem. After the camera intervention, the background caused by scatter-light was reduced by a factor of 4, as can be seen in Figure 5. The scatter-light was not fully eliminated and specific processing (described in the following sections) had to be developed to correct the areas affected by this issue, without compromising the light from the sources we need to measure.

Figure 4. Left: The crosstalk induction ratios used to apply the correction. Each box contains the 4x4 amplifiers of each detector against another detector. Right: The residual crosstalk ratios measured on a different dataset after crosstalk correction.

Figure 5. The mean background in the reduced PAUS images. With the original filter tray design, a substantial fraction of the light was reflected at the edges of the filter pieces and filter tray, causing severe levels of scatter-light in the science images. An intervention was carried out in mid 2016 and the mean background was reduced from ∼4 𝑒−/𝑠 to less than 1 𝑒−/𝑠, as can be seen after 16B observations and beyond. [Plot axes: Observation date (2015 to 2020) against Mean Background (e/s).]

Figure 6. Left: The raw flat-field exposure with severe scatter-light signal. Right: The scatter-light corrected flat-fields preserving both the low frequency component of the vignetting and the high frequency of the pixel-to-pixel variations.

3.4.1 Implementation in flat-fields

Flat-fields can be divided into two main frequency components across the focal plane: a low-frequency component due to vignetting and a high-frequency component due to pixel-to-pixel variations (e.g. dust, dead and hot pixels). Fortunately, the scatter-light lies in between, with a mid-scale frequency variation, so we are able to isolate and correct for it. First, we use the broad-band flat-fields, made with a single large filter that does not produce scatter-light, to construct a vignette image, mostly caused by the prime focus corrector optics. With low-pass filtering we isolate the vignette component and discard the high-frequency variations. Dividing the narrow-band flats by the vignette profile flattens the image and leaves the scatter-light as the lowest-frequency component in the image, so that filtering out the high-frequency pixel-to-pixel variations leaves only the scatter-light component, which can then be subtracted from the flat. The resulting correction can be seen in Figure 6.
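The separation by spatial frequency can be illustrated with a one-dimensional toy model. The sketch below is not the pipeline implementation: it works on a single row of a synthetic flat, uses simple moving averages as the low-pass filters, and takes the median of the smoothed, flattened profile as the scatter-free level. The window sizes, vignetting profile, scatter-light shape and noise levels are all assumptions chosen for the example.

```python
import numpy as np

def smooth(profile, width):
    """Moving average with reflected edges (a simple low-pass filter)."""
    half = width // 2
    padded = np.pad(profile, half, mode="reflect")
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    return np.convolve(padded, kernel, mode="valid")

rng = np.random.default_rng(7)
x = np.arange(2000)
vignette = 1.0 - 0.3 * ((x - 1000) / 1000.0) ** 2        # low frequency, multiplicative
pixels = 1.0 + 0.01 * rng.normal(size=x.size)            # high frequency, multiplicative
scatter = 0.2 * np.exp(-0.5 * ((x - 150) / 40.0) ** 2)   # mid frequency, additive, near the edge

broad_flat = vignette * (1.0 + 0.01 * rng.normal(size=x.size))   # no scatter-light
narrow_flat = vignette * pixels + scatter

# 1) Vignette profile from the broad-band flat with a wide low-pass filter.
vignette_est = smooth(broad_flat, 101)
# 2) Flatten the narrow-band flat with the vignette profile.
flattened = narrow_flat / vignette_est
# 3) Filter out the pixel-to-pixel variations; what remains above the
#    typical level is attributed to scatter-light.
low = smooth(flattened, 31)
scatter_est = (low - np.median(low)) * vignette_est
# 4) Subtract the additive component from the flat.
clean_flat = narrow_flat - scatter_est

err_before = np.abs(narrow_flat - vignette * pixels).max()
err_after = np.abs(clean_flat - vignette * pixels).max()
print(err_before, err_after)
```

The cleaned flat keeps both multiplicative components (vignetting and pixel response) while the additive bump is largely removed; a two-dimensional version would follow the same steps with image filters.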

3.4.2 Implementation in science images

In the case of science images, the process is more complicated, as large extended sources (e.g. large galaxies or nearby objects) may have a similar spatial frequency to the scatter-light. In the cases where the target sources are distant and small galaxies, a low-pass filtering, sigma-clipping the sources, isolates the scatter-light without affecting the photometry of the small sources. This is similar to background subtraction techniques, with the difference that the codes used take into account the preferred direction of the scatter-light across the edges of the detector-filter system.

To preserve the flux from extended sources, a sky-flat correction was the most effective solution. To produce the sky flats, a large set of images was combined in a median stack. We found that around 50 images were enough to provide an accurate model of the background, including scatter-light. A combination of multiplicative factors (residuals from the dome flat) and additive components (mostly scatter-light) is present in the sky flats, and these therefore need to be separated so that the images can be divided and subtracted accordingly to detrend the effects properly. The complication of this method is that it requires multiple epochs to detrend an image, and the stacked images need to have similar background levels, which might not always be possible with shorter observations. However, when there are enough exposures from the same filter tray under similar sky conditions, this method provides a more accurate illumination correction than the traditional dome flats.
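The median stack at the core of the sky-flat method can be sketched with synthetic images. In this toy example the background, its edge glow (standing in for scatter-light), the noise level and the point sources are all invented; only the stack size of 50 comes from the text. The separation into multiplicative and additive parts is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(11)
n_images, shape = 50, (64, 64)            # ~50 images, as quoted in the text
yy, xx = np.indices(shape)

# Common background: flat sky plus a glow decaying from one detector edge.
background = 100.0 + 20.0 * np.exp(-xx / 10.0)

stack = np.empty((n_images, *shape))
for k in range(n_images):
    image = background + rng.normal(0.0, 1.0, shape)
    # Sources fall on different pixels in each exposure.
    ys = rng.integers(0, shape[0], 40)
    xs = rng.integers(0, shape[1], 40)
    image[ys, xs] += 500.0
    stack[k] = image

# The median over the stack rejects the sources and keeps the common background.
sky_model = np.median(stack, axis=0)
median_error = np.abs(sky_model - background).max()
mean_error = np.abs(stack.mean(axis=0) - background).max()
print(median_error, mean_error)
```

A plain mean would keep a fraction of every source in the model, whereas the median is insensitive to them as long as a given pixel is contaminated in only a minority of the exposures, which is why images with similar background levels and different pointings are needed.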


Figure 7. Top: A typical reduced science frame without scatter-light correction. Bottom: The flat-field science image preserving the flux from the stars and small galaxies, but affecting the larger extended sources and halos.

3.5 Cosmic Ray detection, rejection and masking

The fully depleted detectors are also very sensitive to cosmic rays impacting the sensitive silicon area and leaving a trace that will affect the photometry if it overlaps with the target source. The typical exposure times for PAUS are between 2 and 3 minutes, which keeps the number of cosmic ray hits per exposure low. However, the camera is capable of performing long exposures thanks to its integrated auto-guiding system, and in these cases a proper cosmic ray identification is critical. In any case, we perform cosmic ray detection, rejection and masking, following a Laplacian filtering algorithm known as L.A.Cosmic (van Dokkum 2001). The main idea is to take advantage of the sharp profile of the cosmic ray hits: unlike the photons in the image, the particle is not blurred by the atmospheric PSF, so its trace can be highlighted with a Laplacian filter. For the noise model described in the algorithm, we provide the measurement of the readout noise taken in the overscan regions of the raw image. Figure 9 illustrates how cosmic ray hits over a PAUS image disappear after being identified and masked with neighbouring information.
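The idea of using sharpness to separate cosmic rays from seeing-blurred sources can be shown with a much-simplified stand-in for the algorithm. The sketch below is not L.A.Cosmic itself, whose published form has further steps that are not reproduced here: it applies a plain 4-neighbour Laplacian to a synthetic image and flags pixels that are both significant against the noise and sharper than a blurred source could be. The thresholds, star profile and noise level are assumptions.

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian, signed so that sharp peaks are positive."""
    return (4.0 * img
            - np.roll(img, 1, 0) - np.roll(img, -1, 0)
            - np.roll(img, 1, 1) - np.roll(img, -1, 1))

rng = np.random.default_rng(5)
shape = (64, 64)
yy, xx = np.indices(shape)
sky, read_noise = 100.0, 5.0

# A star blurred by a Gaussian PSF (sigma = 2 pixels) and a one-pixel cosmic ray.
star = 500.0 * np.exp(-0.5 * ((yy - 20) ** 2 + (xx - 20) ** 2) / 2.0 ** 2)
image = sky + star + rng.normal(0.0, read_noise, shape)
cr_pos = (45, 40)
image[cr_pos] += 500.0

lap = laplacian(image)
lap_noise = np.sqrt(20.0) * read_noise       # noise of the 5-pixel Laplacian stencil
above_sky = image - np.median(image)

# Flag pixels that are significant AND sharper than a blurred source can be.
cr_mask = (lap > 6.0 * lap_noise) & (lap > 2.0 * above_sky)

# Replace flagged pixels with the median of their unflagged neighbours.
cleaned = image.copy()
for y, x in zip(*np.nonzero(cr_mask)):
    window = (slice(max(y - 1, 0), y + 2), slice(max(x - 1, 0), x + 2))
    cleaned[y, x] = np.median(image[window][~cr_mask[window]])

print(np.argwhere(cr_mask))
```

The star peak also has a significant Laplacian, but it is small compared with its flux, so only the single-pixel hit passes both conditions.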

3.6 Image Masking

Figure 8. Top: A rare science frame with a very extended source (M101) without scatter-light correction. Bottom: A sky flat corrected science image preserving the light from extended sources.

Figure 9. Left: A crop of an image without cosmic ray masking. Right: The same image crop after the Laplacian filter algorithm, successfully identifying, masking and interpolating the cosmic ray hits in a single-epoch image. Red circles highlight the location of cosmic rays before and after detection and masking.

In order to keep track of the history of each pixel, we attach a mask to each exposure image, of the same size as the original. The pixels in the mask are mapped into a bitmap where each bit corresponds to a certain issue. Bits 1 to 5 are reserved for SExtractor (Bertin & Arnouts 1996) flags, and specify when a detected source is crowded, merged, under a halo, truncated or deblended. Bits 6 to 13 are set at the pixel or image level in the mask image to flag crosstalk-corrected pixels, detector cosmetics (dead or hot pixels), saturated pixels, rejected cosmic rays, highly vignetted areas, proximity to the edge and very high distortion. Other bits are reserved for later photometry, such as pixels or sources affected by scatter-light background, very high extinction, discordant measurements, astrometry issues and noisy background. The full list of flags can be found in Appendix D.

To classify the cosmetics in the detector, a flattening process is applied over the master flats, and pixels with less than 40% of the normalized flux are added to the mask and flagged as bad pixels. For each science frame, we also produce the associated weight map, built from the master flat-field, assigning a very low weight value to the pixels with mask values larger than 0. In this way we assign a higher weight to pixels that have a better system response and neglect the masked pixels.
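The cosmetics classification and the weight map can be illustrated with a short sketch. This is not the PAUdm code: the function names, the bit value `BAD_PIXEL`, the list-of-lists image layout and the tiny weight `1e-6` are assumptions made for illustration. Only the 40% threshold and the rule that pixels with a non-zero mask receive a very low weight come from the text.

```python
# Hypothetical bit value for the cosmetics (bad pixel) flag.
BAD_PIXEL = 1 << 6

def cosmetics_mask(norm_flat, threshold=0.40):
    """Flag pixels whose normalized flat-field response is below `threshold`."""
    return [[BAD_PIXEL if value < threshold else 0 for value in row]
            for row in norm_flat]

def weight_map(norm_flat, mask, low_weight=1e-6):
    """Weight follows the flat-field response; masked pixels get a tiny weight."""
    return [[low_weight if m > 0 else value
             for value, m in zip(flat_row, mask_row)]
            for flat_row, mask_row in zip(norm_flat, mask)]

# Toy normalized master flat with two poorly responding pixels.
flat = [[1.02, 0.98, 0.35],
        [0.99, 0.10, 1.01]]
mask = cosmetics_mask(flat)
weights = weight_map(flat, mask)
```

In this toy case the two pixels with responses 0.35 and 0.10 are flagged and effectively removed from any weighted statistic, while the remaining pixels keep a weight equal to their response.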


Figure 10. Top: The stars in the COSMOS field from Gaia in DR1 showing non-homogeneous areas. Bottom: A more uniform and dense coverage in DR2 enabling a better astrometric reference for calibration.

4 ASTROMETRIC CALIBRATION

Once the image has been cleaned and detrended for all known instrumental effects described in §3, we can proceed to correct for telescope pointing inaccuracies and optical distortions. There are three key elements in this process: the reference catalogue (§4.1), the calibration of the WCS (§4.2) and the astrometry matching and correction of the single-epoch images (§4.3).

4.1 Gaia reference catalogue

As of 2021, Gaia’s latest release is the Data Release 2 (DR2) of their public dataset (Gaia Collaboration et al. 2018). This dataset provides the most accurate stellar catalogue, complete between 12 < G < 17 and with a limiting magnitude of G = 21. It also includes proper motions, which can be critical to find an accurate astrometric solution in some situations. Nevertheless, the observation periods of PAUS and Gaia overlap in time, so accurate proper motion corrections are not as critical as with observations that are far apart in time. In early versions of the pipeline we tested the astrometric calibration with reference catalogues that were observed more than 10 years before PAUS, such as SDSS (Alam et al. 2015) or USNO-B (Monet et al. 2003), which resulted in complicated and less accurate solutions. While the first release of Gaia had some inhomogeneous completeness patterns, as illustrated in Figure 10, the second release ended up more complete and homogeneous, delivering better and more stable astrometric solutions. The whole Gaia DR2 was ingested into the PAUS database, enabling a high-quality astrometric reference for any PAUCam observation in the sky.

4.2 WCS calibration

The raw mosaic exposures come with a base World Coordinate System (WCS) (Calabretta & Greisen 2002) in their header that approximates the plate scale of PAUCam at the WHT. However, this default WCS is not enough to calibrate single-epoch exposures and we need to compute a more accurate WCS solution for the implemented focal plane geometry. Furthermore, the solution for each detector position, rotation and scale needs to be defined independently, as the base WCS comes from the mechanical layouts and not from precise measurements of the built focal plane. For this purpose we use SCAMP (Bertin 2006) in a particular configuration (MOSAIC_TYPE LOOSE), giving freedom to each detector to move, scale and rotate around the focal plane for a perfect match between the overlapping dithered exposures. We provide SCAMP with all overlapping measurements in the PAUS reference sky location, the 2 deg$^2$ COSMOS field (Capak et al. 2007), where additional dithers were observed by the survey for validation purposes. This allows us to compute a precise solution with enough stars across the focal plane, accurately determining the position, rotation and scale of the entire detector mosaic. Figure 11 shows the measured errors in both coordinates when calibrating the WCS with 100 overlapping exposures, with respect to the reference Gaia catalogue and internally to PAU. From this validation analysis we estimate an absolute astrometric accuracy of ∼40 milli-arcseconds (mas) Root Mean Square (RMS) and an internal consistency of ∼30 mas RMS. This process only needs to be computed once (unless the astrometric reference catalogue is updated), producing a calibrated WCS of the PAUCam focal plane that is used during the nominal image processing.

4.3 Single-Epoch astrometry and PSF modelling

At this stage, we perform the first extraction of sources in the single-epoch image, using SExtractor in a configuration mode specific for astrometry, with extended centroid measurements of windowed positions and a vignette matrix for each source.

To compute the astrometric solution of each exposure, we again use SCAMP, but in a configuration that takes into account the pre-calibrated WCS (MOSAIC_TYPE SAME_CRVAL), helping the global mosaic solution and increasing the performance even with individual exposures. It provides an updated WCS header for the FITS file that includes the offset and distortion correction under a third-degree polynomial fit. Even though SCAMP works best with overlapping exposures, we perform the astrometry solution on a single-epoch basis, as it already delivers 50 mas RMS of astrometric accuracy, below what is required to perform forced photometry (<100 mas). We have simulated a 50 mas aperture position error over an average galaxy size (1.22″) and an average PSF size (1.2″ Full Width at Half Maximum, or FWHM), resulting in a tolerable 0.02% error in the flux. This is only possible when using the base WCS calibration of the focal plane explained previously, serving as a first guess reference for the single-epoch corrections. Even though there are different filters over each CCD in the detector array, the global mosaic solution performs well and no chromatic distortions remain from SCAMP’s solutions. Figure 12 illustrates the estimated pixel scale variations across the focal plane, for the WCS calibration run with 100 overlapping exposures and for the individual single-epoch solutions. The difference between the two analyses is minor, indicating a good solution even when using a single exposure. The focal plane exhibits a strong distortion pattern with variations of 4% in the pixel scale between the center and the edges of the 8 central detectors.


Figure 11. The position residuals corresponding to the difference between the measured position of each source after the astrometric calibration against its reference in the two coordinate axes (1: Right Ascension, 2: Declination). The plot highlights the axis histogram distributions (color: all sources, grey: high SNR sources). Top (green): The astrometric residuals between internal overlapping sources. Bottom (red): The astrometric residuals against the reference catalogue.

Using the same astrometric catalogue with additional centroiding and profile information, we model the PSF across the focal plane with PSFEx (Bertin 2011), which delivers a variable PSF model that can be reconstructed at any position in the focal plane. When there are not enough stars in a single detector to properly model the variations, the average PSF FWHM value is given. This information is key to obtain accurate aperture photometry as described in §6.3. The current aperture scaling uses the average PSF FWHM at each detector.

5 PHOTOMETRIC CALIBRATION

In this section we briefly describe a key step in the data reduction process: the photometric calibration of the narrow-band images. The process is described in detail in Castander et al. (2022).

5.1 Overview

In a photometric night, the atmospheric extinction correlates linearly with airmass. Traditionally the calibration of astronomical images is performed taking observations of photometric standards (Landolt 1992) observed at different values of the airmass to compute the extinction coefficient. After the extinction model of the night is computed, it can be applied to the rest of the science exposures. Due to the non-standard filter set of PAUS and the variety of observing conditions in the survey, we had to design a particular process that allows us to calibrate the fluxes in each image observed.

Figure 12. Top: The spatial variation of the pixel scale found by SCAMP in a single-epoch exposure. Bottom: The spatial variation of the pixel scale found by SCAMP with 100 overlapping exposures.

The approach presented in Castander et al. (2022) is to infer stellar templates from the SDSS broad band data and compute synthetic narrow-band photometry from the stellar templates. Synthetic photometry is then compared to the PAUS measurements of the same stars to obtain a zero-point (ZP) for each star. Even though the stellar templates inferred from the SDSS broad bands for a particular star can be wrong, combining zero-points from multiple stars cancels the errors of individual star zero-points and delivers an accurate narrow-band calibration.

As we compute the photometric zero-point for every PAUS image independently and directly against the already calibrated SDSS synthetic photometry, we can observe under all sky conditions, even in non-photometric nights. This has been essential to maximize the use of the available time in the observatory, which is always limited and precious.

5.2 Implementation

The photometric calibration code is implemented inside the Nightly pipeline. It has two main steps: the star photometry and the zero-point calibration. For the first step, we run SExtractor on the instrumentally detrended and astrometrically calibrated images. We choose calibration stars that are moderately bright, with magnitudes between 14 and 19, which typically deliver a signal-to-noise ratio higher than 20. For such bright stars we do not need to optimize the aperture with complex and PSF-dependent methods that could be sensitive to the observation conditions or optical distortions in the focal plane. Instead we use a constant large aperture (∼4″ radius) that gathers all the light from the star independently of the image PSF, ensuring that the truncated flux left outside the aperture is negligible even in the worst conditions tolerated by the survey. From simulations we verified that for the average survey PSF of FWHM 1.1″ the loss in flux is 0.03%, and in the worst case of 1.8″ seeing the loss in flux is below 0.5%. We tested various configurations of aperture sizes, background modelling and scatter-light correction, and the method using aperture photometry with a large aperture size was the most reliable across the different observing conditions. Once the photometry is processed, we perform a spatial matching with the SDSS DR12 catalogue (Alam et al. 2015), as only those stars are of interest for the photometric calibration of the narrow-band images. We make use of the standard SExtractor flags described in its own documentation¹ and only clean measurements (flag=0) are used to determine the star zero-points.

The second step is to compute the zero-point for each star, as well as the combined zero-point for each detector image by combining the individual star zero-points. For this process we provide to the calibration code the narrow-band fluxes measured in the uncalibrated image, attached to the ID of the reference SDSS catalogue with its broad band photometry. The calibration code computes the synthetic narrow-band fluxes for all the stellar templates and compares them to the observed uncalibrated fluxes of the selected stars in each image. It then computes the zero-point for each particular set of star-templates. Finally the code returns a single zero-point per star, weighted by the $\chi^2$ value of the stellar template fit and the synthetic broad band fluxes of the stellar templates used.

The image zero-point is computed as the median of all the stars available. The median is more robust than a SNR-weighted average, as it reduces the weight of underestimated errors in the brightest stars that would otherwise dominate and possibly bias the final measurement. Weighting all the stars equally also produces a more uniform spatial sampling of the zero-point throughout the detector than inverse-variance weighting, which typically determines the global zero-point with just the brightest stars, sampling the detector at only a few points. Even though we just use the image zero-point in the calibration of the scientific catalogue, we store in the database all the data for stellar photometry, individual star zero-points and the combined image zero-points for validation purposes. In §6 we detail how to use the computed zero-points to obtain calibrated fluxes and propagate the corresponding error.
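The robustness argument for the median can be illustrated with a toy comparison. The sketch below is not the calibration code of Castander et al. (2022): the function names and the star values are hypothetical, and the inverse-variance mean is included only to show how one bright star with an underestimated error can dominate it.

```python
import statistics

def image_zero_point(star_zps):
    """Median of the per-star zero-points (every star weighted equally)."""
    return statistics.median(star_zps)

def weighted_zero_point(star_zps, star_errors):
    """Inverse-variance weighted mean, shown only for comparison."""
    weights = [1.0 / err ** 2 for err in star_errors]
    return sum(w * zp for w, zp in zip(weights, star_zps)) / sum(weights)

# Toy values: the last (brightest) star has a biased zero-point and an
# underestimated error, so it dominates the weighted mean.
zps = [1.00, 1.02, 0.98, 1.01, 0.99, 1.20]
errs = [0.02, 0.02, 0.02, 0.02, 0.02, 0.002]

zp_median = image_zero_point(zps)
zp_weighted = weighted_zero_point(zps, errs)
```

With these numbers the median stays close to the bulk of the stars, while the weighted mean is pulled towards the single discrepant star.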

6 FORCED PHOTOMETRY

The most important aspect that maximizes photometric redshift accuracy is preserving the colors of the measured bands. The narrow-band images of PAUS deliver a low SNR (<5) at the target magnitude of $i_{\rm AB} = 22.5$ and therefore we need to perform the photometry based on external reference catalogues. With the information from the reference catalogues, we can compute Forced Photometry for the PAUS images, measuring the same fraction of light in each band. This would not be possible with the PAUS images themselves for the faintest objects, as the shape could not be properly estimated on sources with such low SNR. From the reference catalogue we establish a consistent location, shape and scale of each object and define an aperture that preserves the flux fraction at all wavelengths. The only constraint for this technique is to have good astrometric accuracy, as the positions of the apertures for the sources are defined blindly (no object detection or centroiding is done in the PAUS images). As we showed in §4, we have a consistent astrometry at the sub-pixel level, enough for the purpose of Forced Photometry. As PAUS images have different seeing and PSF sizes, it is also important to model the PSF such that apertures are scaled accordingly for a constant flux fraction. This process is described in §6.3.

¹ https://sextractor.readthedocs.io/en/latest/Flagging.html

6.1 Reference catalogues

Performing a forced photometry technique requires a reference catalogue that overlaps with the images observed by PAUS. There are two main aspects that the reference catalogues must have: the necessary parameters to perform the forced photometry accurately and the complementary galaxy lensing measurements that, in combination with the outstanding redshift accuracy of PAUS, offer a unique scientific opportunity. Moreover, the reference catalogue must be complete down to the magnitude limit of PAUS such that the final combined catalogue has no target selection that could bias the cosmological measurements.

The selected PAUS fields are the Canada-France-Hawaii-Telescope Lensing Survey (CFHTLenS) fields W1, W3 and W4, the GAMA G09 field over the Kilo-Degree Survey (KiDS) North field and COSMOS. In the CFHTLenS catalogue (Heymans et al. 2012) we have combined state-of-the-art reduction with THELI (Erben et al. 2013), shear measurement with lensfit (Miller et al. 2013), and photometric redshift measurements with PSF-matched photometry (Hildebrandt et al. 2012). In the case of KiDS we use its latest release DR4 (Kuijken et al. 2019), also with outstanding cosmological lensing measurements (Kuijken et al. 2015). Finally, in COSMOS we built a merged catalogue from Laigle et al. (2016) and the Zurich Structure & Morphology Catalog² for the accurate shape information. COSMOS has been our main validation sample and, with so many multi-wavelength observations, it provided interesting photo-z tests with more than 70 bands (PAUS + COSMOS).

Furthermore, all these catalogues contain the information needed to perform the forced photometry measurements, such as sky coordinates, a star-galaxy classification, a reference $i_{\rm AB}$ magnitude, the scale of the source deconvolved from its observed PSF, the axis ratio to estimate its intrinsic ellipticity, the position angle and its Sérsic index (or an equivalent parameter that allows us to infer the Sérsic profile).

6.2 Background modelling

Accurate background subtraction is a key step to achieve precise photometry. The fluxes of the sources we need to measure sit on top of a floor of counts produced by the brightness of the sky, plus residuals of electronic bias and scatter-light. The latter is particularly important due to the configuration of the filter trays in PAUCam, causing significantly more light to be reflected and scattered in the edges of the narrow-band filter glass (see §3.4). An additional complication of the scatter-light is that it can produce a non-homogeneous background, which is much harder to estimate and subtract. A wrong estimate of the background will have a stronger impact on faint sources with few counts, as it adds a bias that scales with the size of the aperture.

² Zurich COSMOS catalogue https://irsa.ipac.caltech.edu/data/COSMOS/gator_docs/cosmos_morph_zurich_colDescriptions.html

Figure 13. The annulus used to estimate the background of a particular source. It can be seen how pixels present in the image mask are discarded from the annulus and will not enter into the sigma clipping statistics.

For all the reasons stated above, a careful background estimation needed to be implemented. Amongst the different background subtraction options, we chose an annulus around each source where we want to perform forced photometry. This provides an accurate estimate that takes into account large-scale variations of the scatter-light. The annulus had to be placed close enough to pick up smaller scale variations, but not so close as to include flux from the source itself. Taking into account the typical size of the sources we aim to measure, we set a fixed inner radius for the annulus at 30 pixels from the center of the target source. To get enough pixel statistics, the outer radius was set at 45 pixels. All bad pixels that fall into the background annulus are removed from the median statistics. A sample of an annulus with real pixels can be seen in Figure 13. Additionally, to prevent neighbouring sources from affecting the estimate, we perform a sigma clipping on the remaining pixels of the annulus, leaving only source-free background pixels in the calculation. The median of the available pixels provides our estimate of the background for each source. This per-pixel value is multiplied by the area of the aperture and subtracted from the measurement in the main photometry process. The standard deviation and the number of pixels used in the estimate are kept, as they are used in the flux error estimate.

An alternative Neural Network method to estimate and subtract the background of PAUCam narrow-band images, proposed in Cabayol-Garcia et al. (2020), seems to deliver very accurate modelling of the background. It is not yet implemented in production and is planned to be integrated in the main pipeline for future releases.
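The annulus estimate described above can be sketched as follows. This is a minimal illustration and not the MEMBA implementation: the function name, the list-of-lists image layout, the 3σ clipping threshold and the number of clipping iterations are assumptions. Only the 30 and 45 pixel radii, the removal of masked pixels, the sigma clipping and the use of the median come from the text.

```python
import math
import statistics

def annulus_background(image, mask, xc, yc, r_in=30.0, r_out=45.0,
                       nsigma=3.0, iterations=5):
    """Return (median, std, n_pixels) of unmasked, sigma-clipped annulus pixels."""
    values = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(x - xc, y - yc)
            # Keep only annulus pixels that carry no flag in the mask image.
            if r_in <= r < r_out and mask[y][x] == 0:
                values.append(value)
    for _ in range(iterations):
        med = statistics.median(values)
        std = statistics.pstdev(values)
        kept = [v for v in values if abs(v - med) <= nsigma * std]
        if len(kept) == len(values):
            break  # converged: nothing else to clip
        values = kept
    return statistics.median(values), statistics.pstdev(values), len(values)

# Toy image: flat background of about 10 counts with a small deterministic ripple.
size = 101
image = [[10.0 + 0.1 * ((x + y) % 3 - 1) for x in range(size)] for y in range(size)]
mask = [[0] * size for _ in range(size)]
image[50][85] = 500.0   # unflagged neighbour inside the annulus: removed by clipping
image[90][50] = 1000.0  # flagged pixel inside the annulus: skipped through the mask
mask[90][50] = 1

bkg, bkg_std, n_bkg = annulus_background(image, mask, xc=50, yc=50)
```

The three returned values play the roles of $B$, $\sigma_{\rm bg}$ and $N_{\rm bg}$ in the flux and error estimates of §6.4.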

6.3 Aperture scaling

We correct here for the impact of an elliptical PSF on an elliptical aperture for a galaxy. For simplicity, we present the results in circular coordinates ($a = b = r_0$), but these results can be extended to elliptical apertures by just re-scaling the coordinate $(x, y)$ units using elliptical coordinates:

$$ x' \to x = r_0\, q^{1/2} \cos\theta \; ; \quad y' \to y = r_0\, q^{-1/2} \sin\theta , \tag{4} $$

where $b \equiv q a$ is the smaller axis of the ellipse. This scaling is tested later on in Fig. 16. The goal is to measure the fluxes in an elliptical aperture that corresponds to the same fraction of the total light after taking into account the effects of the PSF at the time and sky location when the image was taken.

Sérsic profiles and aperture fluxes. We assume a circular Sérsic profile of slope $n$ and scale $r_0$ for the surface brightness distribution:

$$ I(r) = I(0) \exp\left[-(r/r_0)^{1/n}\right] . \tag{5} $$

The total luminosity in an aperture of radius $r = A$ is:

$$ L(A) = 2\pi \int_0^{A} {\rm d}r \; r \, I(r) . \tag{6} $$

For a Sérsic profile, we can relate $r_0$ to the effective radius $r_{50}$, defined as the aperture which contains half the total light ($L(r_{50}) = L(\infty)/2$): $r_0 = [(0.86n - 0.142) \ln 10]^{-n} \, r_{50}$.
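As a quick numerical check of the relation between $r_0$ and $r_{50}$, the sketch below evaluates the enclosed light fraction of Eqs. 5 and 6 at $r = r_{50}$. The function names are hypothetical and the integration scheme is a choice made here: the substitution $t = (r/r_0)^{1/n}$ turns the aperture integral into $\int_0^x t^{2n-1} e^{-t}\,{\rm d}t$, which is integrated with Simpson's rule and normalized by $\Gamma(2n)$.

```python
import math

def sersic_scale_from_r50(r50, n):
    """Scale r0 from the effective radius, using the approximation in the text."""
    b = (0.86 * n - 0.142) * math.log(10.0)
    return r50 * b ** (-n)

def sersic_light_fraction(radius, r0, n, steps=2000):
    """L(radius) / L(infinity) for I(r) = I(0) exp[-(r/r0)^(1/n)].

    Uses t = (r/r0)^(1/n), so the fraction is the integral of
    t^(2n-1) e^(-t) from 0 to x divided by Gamma(2n) (Simpson's rule).
    """
    x = (radius / r0) ** (1.0 / n)
    k = 2.0 * n

    def integrand(t):
        return t ** (k - 1.0) * math.exp(-t)

    h = x / steps  # `steps` must be even for Simpson's rule
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / math.gamma(k)

# Fraction of light inside r50 for a few Sersic indices (expected close to 0.5).
fractions = {}
for n in (1, 2, 4):
    r0 = sersic_scale_from_r50(1.5, n)
    fractions[n] = sersic_light_fraction(1.5, r0, n)
```

For these indices the enclosed fraction at $r_{50}$ comes out close to one half, as expected from an approximate fitting formula.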

Convolved profiles. The observed surface brightness profile $I_o(r)$ will result from the convolution of the intrinsic $I(r)$ and the PSF kernel $W_{\rm PSF}$. This is a 2D convolution, so even when the image is originally symmetric, $I(\vec{x}) = I(x)$, the convolved image $I_o$ might not be symmetric:

$$ I_o(\vec{r}) = \int_{\rm Image} {\rm d}\vec{x} \; I(x) \, W_{\rm PSF}(\vec{r} - \vec{x}) . \tag{7} $$

For a circular PSF, $W_{\rm PSF}(\vec{r} - \vec{x}) = W_{\rm PSF}(|\vec{r} - \vec{x}|)$, we have:

$$ I_o(r) = \int_0^{\infty} {\rm d}x \; x \, I(x) \int_0^{2\pi} {\rm d}\theta \; W_{\rm PSF}\left(\sqrt{r^2 + x^2 - 2 x r \cos\theta}\right) . \tag{8} $$

Moffat PSF. We will use a radial Moffat PSF profile

$$ W_{\rm PSF}(r) = \frac{\beta - 1}{\pi \alpha^2} \left[1 + (r/\alpha)^2\right]^{-\beta} , \tag{9} $$

which has a FWHM $= 2\alpha\sqrt{2^{1/\beta} - 1}$. For a Gaussian PSF, the FWHM is $2.355\sigma$.
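The FWHM relation can be inverted to obtain the Moffat scale $\alpha$ from a measured FWHM and then checked directly: the profile at $r = {\rm FWHM}/2$ should be half of its central value. The sketch below uses hypothetical function names and assumes the usual 2D normalization $(\beta - 1)/(\pi\alpha^2)$; the FWHM of 1.2″ and $\beta = 4.75$ are example values quoted elsewhere in the text.

```python
import math

def moffat_alpha(fwhm, beta):
    """Invert FWHM = 2 * alpha * sqrt(2**(1/beta) - 1)."""
    return fwhm / (2.0 * math.sqrt(2.0 ** (1.0 / beta) - 1.0))

def moffat_profile(r, alpha, beta):
    """Radial Moffat profile, normalized to unit total flux in 2D."""
    norm = (beta - 1.0) / (math.pi * alpha ** 2)
    return norm * (1.0 + (r / alpha) ** 2) ** (-beta)

alpha = moffat_alpha(fwhm=1.2, beta=4.75)
# Ratio of the profile at half the FWHM to its central value.
half_max_ratio = moffat_profile(0.6, alpha, 4.75) / moffat_profile(0.0, alpha, 4.75)
# Gaussian case: FWHM / sigma = 2 * sqrt(2 ln 2), the 2.355 quoted in the text.
gaussian_fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))
```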

Effect of seeing on Aperture. To obtain the same aperture flux as in Eq. 6 with a different PSF, we need to change the aperture to $r = A_o$ by solving

$$ L(A) = L_o(A_o) = 2\pi \int_0^{A_o} {\rm d}r \; r \, I_o(r) , \tag{10} $$

where $I_o$ is the convolved profile in Eq. 8.
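Solving Eq. 10 means finding the radius at which the convolved profile encloses the same flux fraction as the intrinsic one. The sketch below illustrates the idea with a deliberately simplified stand-in: a Gaussian source and a Gaussian PSF, for which the convolution is again a Gaussian whose width is the quadrature sum of both. This is not the Sérsic plus Moffat case used in the pipeline, and the function names and the bisection solver are assumptions made for illustration.

```python
import math

def gaussian_enclosed_fraction(radius, sigma):
    """Fraction of the total flux of a circular 2D Gaussian inside `radius`."""
    return 1.0 - math.exp(-0.5 * (radius / sigma) ** 2)

def scaled_aperture(flux_fraction, sigma_src, sigma_psf, tol=1e-10):
    """Bisect for the radius enclosing `flux_fraction` of the convolved profile."""
    # Gaussian widths add in quadrature under convolution.
    sigma_obs = math.hypot(sigma_src, sigma_psf)
    lo, hi = 0.0, 20.0 * sigma_obs
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_enclosed_fraction(mid, sigma_obs) < flux_fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Aperture enclosing 62.5% of the light without and with seeing.
a_intrinsic = scaled_aperture(0.625, sigma_src=1.0, sigma_psf=0.0)
a_observed = scaled_aperture(0.625, sigma_src=1.0, sigma_psf=0.5)
```

In this toy case the aperture simply grows by the ratio of the observed to the intrinsic width; for the real profiles no closed form exists, which is why a numerical solution is needed.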

Stellar Apertures. A point source convolved with a Gaussian profile of width $\sigma$ gives a Gaussian $I_o(r)$ and an aperture flux,

$$ L(A) = \sqrt{2\pi}\,\sigma \left[1 - e^{-0.5 A^2/\sigma^2}\right] = \sqrt{2\pi}\,\sigma \left[1 - e^{-4 A_F^2 \ln 2}\right] , \tag{11} $$

where $A_F \equiv A / (2\sigma\sqrt{2\ln 2})$ is the aperture radius in units of the FWHM. The fraction of light is then $L(A)/L(\infty) = 1 - e^{-4 A_F^2 \ln 2}$, which for $A_F = 1/2$ gives 50% of the light (FWHM). For a Moffat profile the fraction is only a bit smaller.
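The light fraction of Eq. 11 and its inverse are simple enough to evaluate directly. The helper names below are hypothetical; the formulas are the Gaussian expressions given above, with the aperture radius expressed in units of the FWHM.

```python
import math

def gaussian_star_fraction(aperture_in_fwhm):
    """Enclosed flux fraction for a Gaussian PSF; the radius is in FWHM units."""
    return 1.0 - math.exp(-4.0 * aperture_in_fwhm ** 2 * math.log(2.0))

def aperture_for_fraction(fraction):
    """Inverse relation: aperture radius (in FWHM units) enclosing `fraction`."""
    return math.sqrt(-math.log(1.0 - fraction) / (4.0 * math.log(2.0)))

half_light = gaussian_star_fraction(0.5)    # aperture radius of half a FWHM
radius_625 = aperture_for_fraction(0.625)   # radius enclosing 62.5% of the light
```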


Figure 14. The relation between bulge fraction and Sérsic index assumed for CFHTLenS and KiDS reference catalogues. [Plot axes: bulge fraction (0 to 1), Sérsic index (1 to 4).]

Implementation. Computing the convolution integrals for the 2D Sérsic profile with a Moffat PSF model for each measurement (120 per galaxy on average) is prohibitive, as this operation may take several seconds when it is not optimized. We take advantage of the highly optimized implementation of the fraction of light radius calculation in Galsim (Rowe et al. 2015). Even though Galsim claims this is only accurate down to a few percent, with these smooth profiles we achieve a precision better than 1% in most cases. This is precise enough, as our PSF model and the estimated effective radius will dominate the error budget and will determine the final precision we can reach in estimating the aperture.

The Sérsic profile is generated with the effective radius of the reference catalogue and the Sérsic index estimate. In case the Sérsic index is not available in the reference catalogue, we assign an index of 4 to elliptical galaxies and an index of 1 to spiral galaxies. In the case of CFHTLenS and KiDS we derive the Sérsic index from the bulge fraction as shown in Figure 14.

The PSF is modelled as a Moffat profile with $\beta \sim 4.75$ (Trujillo et al. 2001) and the FWHM measured by PSFEx for each detector image. The Galsim function calculateHLR (Calculate Half Light Radius) is applied over the convolved object and allows us to compute either the HLR or the radius for a specific flux fraction. Typically we run the photometry over a variety of flux fractions, from 0.5 to 0.9. On average, the signal-to-noise is maximized around 0.625, even if the optimal SNR depends on each source. An example of how aperture size relates to flux fraction can be seen in Figure 15.

As $r_{50}$ is defined as the effective radius on the major axis, we apply the same procedure for the minor axis, multiplying $r_{50}$ by the axis ratio defined in the reference catalogue.

We verified the aperture scaling method, including Eq. 4, with a simple error-free simulation, rendering the models described previously (Sérsic profile convolved with a Moffat PSF) at a reference flux and performing aperture photometry. For multiple combinations of PSF FWHM, galaxy scales and Sérsic indices, the reconstructed flux was always accurate to better than 1%. Figure 16 shows an example of a convolved galaxy with a circular and an elliptical aperture at 62.5% of the light. Both cases estimate and recover the flux fraction accurately, but the elliptical aperture delivers a higher SNR.

Figure 15. The aperture radius as a function of flux fraction for various Sérsic indices. The sample source is a galaxy with an effective radius $r_{50}$ of 1.5″ and a Moffat PSF with a FWHM of 0.7″ and a beta parameter of 4.75. [Plot axes: flux fraction (0 to 100), aperture radius in arcsec (0 to 17.5); curves for n = 1, 2, 3, 4, 5.]

Figure 16. A simulation of an elliptical Sérsic profile of 1.9″ and b/a = 0.2 convolved with a Moffat profile equivalent to a PSF of FWHM = 1.2″. Dashed line: A circular aperture containing 62.5% of light accurate to 0.3%. Solid line: An elliptical aperture at the same flux fraction with equivalent accuracy but with 60% less area, improving the SNR of the measurement. [Plot axes: Y pixels (0 to 40), X pixels (0 to 40).]

6.4 Flux measurement and error estimate

In the forced photometry process we blindly place the aperture in the image, without any additional centroiding. As the broad-band reference catalogue is significantly deeper than the narrow-band PAUS images, we can rely on good and complete positions and shapes from the reference catalogue. At this stage the narrow-band images have successfully passed all astrometry quality cuts and therefore we can use the calibrated single-epoch WCS solution to determine the pixel positions of the source with sub-pixel precision.

After computing both the major and minor aperture radii from the previous step, we can perform aperture photometry with elliptical apertures. For that we make use of the PhotUtils library (Bradley et al. 2020), an affiliated package of Astropy (Astropy Collaboration et al. 2018). The aperture flux $f_{\rm tot}$ is measured on the reduced image, still containing the component of the background. The bit-wise information of all pixels inside the aperture is propagated to the flux measurement flag.

At the same location we estimate the background with an annulus as described in §6.2. The flagged pixels in the annulus do not enter into the combined flagging of the source; they are simply ignored in the statistics of the background calculation. The background modelling function estimates the median background flux $B$, the standard deviation $\sigma_{\rm bg}$ and the number of valid samples in the annulus $N_{\rm bg}$.

The background subtracted flux of the source $f_{\rm src}$ is therefore:

$$ f_{\rm src} = f_{\rm tot} - B \cdot N_{\rm ap} , \tag{12} $$

where $N_{\rm ap}$ is the area of the aperture in pixels, taking into account the fractional overlap of the aperture and each pixel region.

The model of the error in the measurement $\sigma_{\rm tot}$ is composed of the source flux error ($\sigma_{\rm src}$), the noise in the background ($\sigma_{\rm bg}$) and the uncertainty in the estimation of the background.

The source flux error is assumed to follow a Poisson process, and it is estimated from the variance of the flux inside the aperture. As we perform the measurement over single-epoch images, we assume there are no pixel-to-pixel correlations, as would appear on a remapped image. Therefore the flux error of the source in electrons is:

$$ \sigma_{\rm src}({\rm e^-}) = \sqrt{g \cdot f_{\rm raw}} , \tag{13} $$

where $g$ is the amplifier gain in the detector readout system and $f_{\rm raw}$ is the raw detector counts of the source itself. As the measured units are flux rate in e$^-$/s, we need to scale the variance and the error as:

$$ \sigma_{\rm src}({\rm e^-/s}) = \frac{\sqrt{f_{\rm src}\, t_{\rm exp}}}{t_{\rm exp}} = \sqrt{\frac{f_{\rm src}}{t_{\rm exp}}} . \tag{14} $$

Finally, we compute the total measurement uncertainty by adding in quadrature the flux error, the independent background noise and the background estimate error in the annulus:

$$ \sigma_{\rm tot} = \sqrt{\frac{f_{\rm src}}{t_{\rm exp}} + \left(N_{\rm ap} + k\,\frac{N_{\rm ap}^2}{N_{\rm bg}}\right) \sigma_{\rm bg}^2} , \tag{15} $$

where $k = \pi/2$ is the efficiency correction for the median we used in our background estimation method.
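Equations 12 and 15 translate directly into a few lines of code. The sketch below is an illustration with hypothetical function names and toy numbers, not the MEMBA code. Clipping negative source fluxes to zero in the Poisson term is an assumption added here so that the square root stays defined for noisy faint sources; the text does not specify how that case is handled.

```python
import math

def source_flux(f_tot, background, n_ap):
    """Eq. 12: subtract the per-pixel background times the aperture area."""
    return f_tot - background * n_ap

def total_flux_error(f_src, t_exp, n_ap, n_bg, sigma_bg, k=math.pi / 2.0):
    """Eq. 15: Poisson term plus background noise and background-estimate error."""
    # Assumption: a negative measured flux contributes no Poisson variance.
    poisson_var = max(f_src, 0.0) / t_exp
    background_var = (n_ap + k * n_ap ** 2 / n_bg) * sigma_bg ** 2
    return math.sqrt(poisson_var + background_var)

# Toy measurement: fluxes in e-/s, areas in pixels, exposure time in seconds.
f_src = source_flux(f_tot=150.0, background=1.0, n_ap=100.0)
sigma_tot = total_flux_error(f_src, t_exp=100.0, n_ap=100.0,
                             n_bg=3000.0, sigma_bg=0.2)
```

Because $N_{\rm bg}$ is much larger than $N_{\rm ap}$ in this example, the uncertainty of the background estimate adds only a small correction to the background noise inside the aperture.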

6.5 Flux co-addition

So far all measurements were made at the single-epoch level. However, PAUS observes each area on the sky multiple times for every filter. On average we perform 3 passes in the main fields (W1, W2 and W3) and 5 in our calibration field (COSMOS, Laigle et al. 2016). Observing the same area multiple times has some advantages, such as covering the gaps between detectors, increasing the signal-to-noise, rejecting outliers, reducing the density of cosmic rays and increasing the dynamic range of the sources, as shorter exposure times will allow brighter stars not to saturate. This is done at the expense of an increased volume of data and a slower observing rate due to a constant readout time of 20 s.

Most surveys combine their multiple layers at the image level, which is convenient for cosmic ray rejection (a median average almost completely removes all cosmic hits), but the stacking requires resampling the images, causing correlated noise which is complicated to model. Due to our objective of measuring very low signal-to-noise galaxies, we decided to stack the measurements at the catalogue level, performing all image measurements on individual exposures at their original pixel sampling. The combined measurements are called coadd fluxes.

Before combining the single-epoch aperture measurements, they need to be corrected to a standard system so that all fluxes are consistent. The light rate from the same source may vary due to particular observing conditions, like variations in the atmospheric extinction on non-photometric nights, different telescope elevations resulting in different observing airmasses, or any other effect that varies the transmission with time. For this purpose we have calibrated each image and assigned a multiplicative factor ZP and its corresponding calibration error $\sigma_{\rm zp}$.

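The calibrated single-epoch flux is simply defined as:

$$ f_{\rm cal} = f_{\rm src} \cdot {\rm ZP} . \tag{16} $$

For its calibrated error $\sigma_{\rm cal}$, assuming non-linear error propagation with independent and non-negligible variances, we derive:

$$ \sigma_{\rm cal} = \sqrt{\sigma_{\rm src}^2 \sigma_{\rm zp}^2 + \sigma_{\rm src}^2 {\rm ZP}^2 + f_{\rm src}^2 \sigma_{\rm zp}^2} . \tag{17} $$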

To avoid too small numbers that could require special data types, we added a magnitude offset of 26 in the calculation of the zero-point and thus one can derive the AB magnitude from the PAUS calibrated flux as:

$$ m_{\rm AB} = -2.5 \log_{10}(f_{\rm cal}) + 26 . \tag{18} $$

However, with narrow-band photometry we are often dealing with sources that have flux close to zero, and magnitudes are inconvenient at that level. All the processing and archiving of source brightness is done at the flux level.
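The calibration step of Eqs. 16 to 18 can be written compactly. In this sketch the function names and the numerical values are hypothetical. Returning `None` for non-positive fluxes is a choice made here to illustrate why magnitudes are inconvenient for very faint sources; it is not a behaviour described in the text.

```python
import math

def calibrate(f_src, sigma_src, zp, sigma_zp):
    """Eqs. 16 and 17: calibrated flux and its propagated error."""
    f_cal = f_src * zp
    sigma_cal = math.sqrt(sigma_src ** 2 * sigma_zp ** 2
                          + sigma_src ** 2 * zp ** 2
                          + f_src ** 2 * sigma_zp ** 2)
    return f_cal, sigma_cal

def ab_magnitude(f_cal):
    """Eq. 18; undefined (None) when the flux is zero or negative."""
    if f_cal <= 0.0:
        return None
    return -2.5 * math.log10(f_cal) + 26.0

f_cal, sigma_cal = calibrate(f_src=100.0, sigma_src=5.0, zp=1.2, sigma_zp=0.01)
mag = ab_magnitude(f_cal)
```

A noisy measurement of a faint source can easily scatter to a negative flux, for which no magnitude exists, whereas the flux and its error remain well defined and can be co-added without bias.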

Now that we have a calibrated flux and its associated error for the individual measurements, we can proceed to combine all the repeated measurements of the same source and band into a coadd flux and error using an inverse-variance weighted average:

$$ f_{\rm coadd} = \frac{\sum_i f_{{\rm cal},i}\, \sigma_{{\rm cal},i}^{-2}}{\sum_i \sigma_{{\rm cal},i}^{-2}} , \tag{19} $$

where only non-flagged sources (§6.6) will enter into the combined measurements.

Assuming that the overlapping measurements are independent, we estimate the coadd error as:

$$ \sigma_{\rm coadd}^2 = \frac{1}{\sum_{i}^{N} \sigma_{{\rm cal},i}^{-2}} , \tag{20} $$

where $N$ is the number of unflagged measurements to be combined. Additionally we compute the reduced chi-square $\chi^2$ as a measurement of consistency for the multiple measures:

$$ \chi_{\rm coadd}^2 = \sum_{i}^{N} \frac{(f_{{\rm cal},i} - f_{\rm coadd})^2}{\sigma_{{\rm cal},i}^2} \Big/ (N - 1) . \tag{21} $$

All three forced photometry coadd parameters are stored in the database for further processing and quality analysis.
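The three coadd quantities of Eqs. 19 to 21 can be computed together from the list of unflagged single-epoch measurements. The sketch below uses a hypothetical function name and toy values. Returning `None` for the reduced chi-square when only one measurement is available is an assumption, since Eq. 21 is undefined for $N = 1$.

```python
def coadd(fluxes, errors):
    """Inverse-variance coadd flux, its error and the reduced chi-square."""
    weights = [1.0 / err ** 2 for err in errors]
    weight_sum = sum(weights)
    f_coadd = sum(w * f for w, f in zip(weights, fluxes)) / weight_sum
    sigma_coadd = (1.0 / weight_sum) ** 0.5
    n = len(fluxes)
    if n > 1:
        chi2 = sum(w * (f - f_coadd) ** 2
                   for w, f in zip(weights, fluxes)) / (n - 1)
    else:
        chi2 = None  # assumption: no consistency estimate from one exposure
    return f_coadd, sigma_coadd, chi2

# Three unflagged exposures of the same source and band (toy values).
f_coadd, sigma_coadd, chi2 = coadd([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
```

With equal errors the coadd reduces to the plain mean and the error shrinks as $1/\sqrt{N}$; a reduced chi-square far above one would point to an inconsistent exposure or to underestimated errors.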


6.6 Flagging

Throughout the whole processing of an image from its original raw state, we identify any possible cause that may affect the confidence of its value. To track each possible cause of problems we use flags both in the Nightly processing and in the MEMBA pipeline. In the image calibration process of the Nightly pipeline we track the flags at the pixel level. Thus we created a mask image where each pixel contains the flag values of the corresponding pixel in the science image. In order to track all possible flag combinations in a single value, we have mapped each flag condition to a bit in the value of the pixel, allowing for 16 different flags in a 16-bit depth image. In PAUdm we have defined the following image-type flags:

• Cosmetics: pixels not responding correctly to light, either hotones that deliver constant high values or dead pixels that do notreact to light inputs. Dust or imperfections in the detector mosaicor filter may appear here too.

• Saturated: pixels with so much flux that reached the ADC limit(18-bit in the case of PAUCam)

• Cosmic Rays: pixels identified as cosmics rays in the Laplacianfiltering algorithm (van Dokkum 2001). Even though pixel valuesare interpolated from the neighbouring ones and cosmic rays mayseem to have disappeared, the mask will keep track and the pixelvalue will not be used for science.

• Vignetted: areas in the focal plane with low transmission dueto optical vignetting. The default value is set to 40%.

• Crosstalk: pixels contaminated by a strong signal of crosstalkfrom a related amplifier or detector (§3.3).
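The bit mapping described above can be illustrated with a 16-bit mask array. This is a sketch only: the bit positions below are hypothetical (the actual mapping is the one listed in Appendix D), and `aperture_flag` is an illustrative helper that combines the pixel flags of a rectangular cut-out rather than a real photometric aperture.

```python
import numpy as np

# Hypothetical bit assignment, for illustration only (see Appendix D for the real one).
COSMETICS, SATURATED, COSMIC_RAY, VIGNETTED, CROSSTALK = (1 << b for b in range(5))

# One 16-bit mask value per science pixel: up to 16 independent flags.
mask = np.zeros((4, 4), dtype=np.uint16)
mask[0, 0] |= SATURATED      # several conditions can be set on the same pixel
mask[0, 0] |= CROSSTALK
mask[1, 2] |= COSMIC_RAY

def aperture_flag(mask, rows, cols):
    """Bitwise OR of the flags of all pixels inside a rectangular cut-out."""
    return int(np.bitwise_or.reduce(mask[rows, cols].ravel()))

flag = aperture_flag(mask, slice(0, 2), slice(0, 3))
has_cosmic = bool(flag & COSMIC_RAY)     # test a single condition
has_vignetting = bool(flag & VIGNETTED)
```

Because each condition owns one bit, any combination of flags fits in a single integer and individual conditions can be tested with a bitwise AND.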

In contrast, in the MEMBA pipeline we perform photometry of sources and flagging takes place at the catalogue level for each source measurement. The flags in the image that overlap the aperture are propagated to the measurement flag. Additionally we have defined the following catalogue-type flags:

• Edge: source too close to the edge (< 80 pixels) or partially out of the image array.

• Distortion: source in an area with strong optical distortion (> 50 arcmin from the focal plane center) or with an elongated PSF such that the flux ratio in the aperture scaling may be inaccurate.

• Scatter-light: source with intense and spatially dependent scatter-light that could compromise the background subtraction. We estimated the presence of scatter-light in the background with two methods: variance ratio and ellipticity ratio. In the first method, we simply compared the variance in the annulus around each source to a global variance in the background of the whole image. If the ratio was above a certain threshold (typically 5%) we flagged the source. This method was effective at flagging sources in scatter-light areas but was not efficient, as it over-flagged sources that were simply in noisier areas. In the second method, we compute the ellipticity of the background image using the second order brightness moments of the sigma-clipped stamp around each source, defined as:

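$$ q_{xx} = \sum_{x,y} I(x,y)\,(x - \bar{x})^{2}\,\Delta x\,\Delta y , \tag{22} $$

$$ q_{xy} = \sum_{x,y} I(x,y)\,(x - \bar{x})(y - \bar{y})\,\Delta x\,\Delta y , \tag{23} $$

$$ q_{yy} = \sum_{x,y} I(x,y)\,(y - \bar{y})^{2}\,\Delta x\,\Delta y \tag{24} $$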

and from these quadrupole moments we build the ellipticity of the stamp as:

$$ \epsilon = \frac{q_{xx} - q_{yy} - 2i\,q_{xy}}{q_{xx} + q_{yy} + 2\sqrt{q_{xx}q_{yy} - q_{xy}^{2}}} , \tag{25} $$

where the real component measures deviations from a circle along the axes and the imaginary component along the main diagonals (Bridle et al. 2009).

To obtain a reference ellipticity value, we compute the median ellipticity of the whole image, scanning the detector in steps of 25 by 25 pixels, and we flag those measurements with a background ellipticity larger than 10 times the median of the image. This has proved to be an efficient method to track scatter-light residuals, flagging only sources with unreliable background subtraction.
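The ellipticity test of Eqs. 22 to 25 can be sketched as follows. Several choices here are assumptions rather than details given in the text: the centroid is taken as the flux-weighted mean position, the pixel area Δ𝑥Δ𝑦 is set to one, the stamp is assumed to hold non-negative background values that are already sigma-clipped, and the comparison with the image median uses the modulus of the complex ellipticity. The function names are hypothetical.

```python
import numpy as np

def background_ellipticity(stamp):
    """Complex ellipticity of a stamp from its second-order brightness moments."""
    img = np.asarray(stamp, dtype=float)
    y, x = np.indices(img.shape)
    total = img.sum()
    xbar = (img * x).sum() / total               # flux-weighted centroid (assumption)
    ybar = (img * y).sum() / total
    qxx = (img * (x - xbar) ** 2).sum()          # Eq. 22 with unit pixel area
    qxy = (img * (x - xbar) * (y - ybar)).sum()  # Eq. 23
    qyy = (img * (y - ybar) ** 2).sum()          # Eq. 24
    root = np.sqrt(max(qxx * qyy - qxy ** 2, 0.0))   # guard against rounding below zero
    return (qxx - qyy - 2j * qxy) / (qxx + qyy + 2.0 * root)   # Eq. 25

def scatter_light_flag(eps_source, eps_grid, factor=10.0):
    """Flag a source whose background ellipticity exceeds `factor` times the image median."""
    reference = np.median(np.abs(eps_grid))
    return bool(np.abs(eps_source) > factor * reference)

# A flat background has zero ellipticity; a horizontal streak is maximally elongated.
flat = np.ones((21, 21))
streak = np.zeros((21, 21))
streak[10, 5:16] = 1.0
eps_flat = background_ellipticity(flat)
eps_streak = background_ellipticity(streak)
```

A structured background, such as a scatter-light gradient or streak crossing the stamp, moves the ellipticity away from zero, which is what makes it usable as a flagging criterion.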

As described in §6.5 we skip all measurements that contain any flag inside their aperture when combining the coadd measurement. The full list of flags and their bit mapping values can be found in Appendix D.

6.7 Survey mask

The particular layout of the narrow-band filters in the camera trays and the fact that the CCD detectors are separated leave gaps in the focal plane and result in a non-homogeneous coverage of the sky for each pass band. Additionally there are telescope pointing errors that result in an even more inhomogeneous sky coverage. In large cosmology surveys that intend to identify statistical correlations of galaxy positions and densities, it is mandatory to accurately identify how the survey has tiled the sky with its thousands of exposures.

For this purpose we have built a survey mask with two levels of information. First we generate the exposure mask, where we define for each filter how many times we observed each area in the sky for an entire field with a resolution of 5 arc-seconds. The mask is built taking into account variations in the system response from the flat-field and flagged pixels. This creates a complex mask that includes effects like vignetting, bright stars that saturate or corners in the detector not visible due to mechanical pieces in the optical path.

The second level of mask is the bands mask and it is built from the combination of all 40 exposure masks, one for each band. It represents the same area in the sky as the exposure masks but contains the number of bands available in each location of the sky with one or more effective units of exposure. An example of both levels of mask can be seen in Figure 17.

The survey masks are associated with MEMBA runs, as the resulting coadd catalogue is built with a set of images and this same set is the one used to build the masks. As we are performing forced photometry from an external catalogue, the selection of sources does not depend on PAUS observations and we need access to the survey mask provided by the external survey. However, the final survey mask must be created as the intersection between the external mask and the PAUS survey mask.
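The two mask levels and the final intersection can be sketched with array operations. This is an illustrative sketch, assuming the exposure masks are already resampled onto a common sky grid; the function names and the `min_bands` requirement are hypothetical and not taken from PAUdm.

```python
import numpy as np

def bands_mask(exposure_masks, min_exposure=1.0):
    """Number of bands with at least `min_exposure` effective exposures per sky pixel.

    `exposure_masks` has shape (n_bands, ny, nx), one exposure mask per band.
    """
    stack = np.asarray(exposure_masks, dtype=float)
    return (stack >= min_exposure).sum(axis=0)

def final_survey_mask(bands, external_mask, min_bands):
    """Intersection of the external survey mask with the PAUS band coverage.

    `min_bands` is a hypothetical requirement on the number of available bands.
    """
    return (np.asarray(bands) >= min_bands) & np.asarray(external_mask, dtype=bool)

# Toy example with three bands on a 2x2 patch of sky.
exposure_masks = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 1.0], [0.0, 0.5]],
    [[0.0, 0.0], [5.0, 1.0]],
]
bands = bands_mask(exposure_masks)
external = [[True, True], [False, True]]
survey = final_survey_mask(bands, external, min_bands=2)
```

A pixel with only half an effective exposure in a band does not count for that band, and a pixel outside the external survey footprint is rejected regardless of its PAUS coverage.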

We used SWARP (Bertin et al. 2002) from Astromatic³ to remap and build image stacks. We process the science images of an entire field with SWARP, using their corresponding flat-field maps as weight maps. We obtain the exposure mask as the resulting combined weight map provided by SWARP in "MAP_WEIGHT" mode. As a by-product we obtain the science-stacked image even though this is not used as part of the main science processing.

3 https://www.astromatic.net/


Figure 17. The PAUS Survey mask in the 25 deg² W3 field. Top: The exposure mask of narrow-band NB655 indicating the number of overlapping observations of this filter on the sky. Bottom: The bands mask indicating the number of narrow-bands with at least one observation in each part of the sky.

7 QUALITY ASSURANCE & VALIDATION

One of the major challenges in PAUdm is the volume and complexity of the data to be processed and analyzed. Contrary to spectroscopic surveys that can obtain a spectrum in a single observation (or by stacking a small number of individual spectra), each galaxy in PAUS is composed of more than 120 measurements. Furthermore a single problematic image can impact thousands of galaxies, altering their 40 narrow-band spectra and causing catastrophic outliers in their photometric redshift determination.

Tuning an algorithm to process a small dataset is simple, as one can manually verify the correct behaviour of the processing and its output. However with such a large and complex dataset we need to build automatic control systems that verify that a particular code or configuration worked for the entire volume of data and raise an alarm or discard the data that did not meet some specific requirement.

7.1 Quality controls

At the time of writing this publication, the PAUS data management system has processed more than 7 million images of PAU, and the number is increasing with additional reprocessing of data. The complexity and volume of this set requires an automated data quality control system to ensure that the data products meet the expected requirements. Even the most imaginative and cautious developer will miss the variety of circumstances that data from an observatory can contain: from closed petals of the main mirror, to dust from the Sahara desert or even vapor condensation in the entry window of the camera in extreme weather conditions. These are some of the unpredictable conditions that we must catch to reject bad quality exposures and request observations to be repeated.

With this particular aim, we built a quality control system associated with the Nightly pipeline. We define the following quality control tests with the corresponding tolerance limits in each metric to classify an image as valid:

• Readnoise: check that electronic readnoise is under specification. This is measured in the overscan region of each amplifier. The default limit is set to 20 e−.

• Flat-field level: check that the flat-field image is illuminated in the correct range of values. Too bright illumination could result in saturation and too faint illumination would increase the noise of the master flat-field. The default range is set between 1,000 and 120,000 ADUs.

• Saturation: check that the science images do not contain too many saturated pixels. A certain number of saturated pixels is expected due to bright stars in the field. However too many saturated pixels are indicative of an issue in the exposure time, electronics or target selected. The default limit is 0.1% of saturated pixels (average is ∼0.02%).

• Cosmic rays: check that the cosmic ray detection algorithm does not classify too many cosmic ray pixels. Issues in the electronics or very noisy images may affect the sensitive CR detection algorithm and end up with too many pixels being classified as cosmic rays. The default limit to reject an image is 1% of pixels (average is ∼0.05%).

• Astrometry: check that the contrast (the ratio of the amplitude of the detected peak to the amplitude of the second highest peak found in the cross-correlation) and 𝜒² to the reference catalogue are good enough to ensure SCAMP found a reliable solution. This is critical as we rely on the single-epoch astrometry and images with high extinction may end up with too few stars to deliver a solution. The default limits are contrast greater than 3 and reference 𝜒² below 50.

• Seeing: check that the average image PSF FWHM measured by PSFEx is below a certain value. Large PSFs reduce the signal-to-noise and limit the target sources of interest. The default limit is 1.8″.

• Calibration stars: check that there are enough stars matched with SDSS to be used for the photometric calibration. The default limit is set to 5 stars.

• Zero-point error: check that the estimated error in the photometric zero-point is constrained. An unusually high ZP error may be due to a non-uniform response across the detector. The default limit is set to 0.2 (flagging ∼3% of the images).

The quality controls processed in each job are aggregated and propagated to the parent jobs so quality issues in large processing sets with many dependencies can be tracked easily.
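The default limits listed above can be written as a small table of checks. This is a sketch under stated assumptions: the metric names are hypothetical, whether each limit is inclusive is a guess, and the aggregation rule (a parent job is valid only if all of its images pass) is one plausible reading of how results propagate, not the documented PAUdm behaviour.

```python
# Default limits quoted in the list above; metric names are hypothetical.
QC_LIMITS = {
    "readnoise_e": lambda v: v <= 20.0,
    "flat_level_adu": lambda v: 1000.0 <= v <= 120000.0,
    "saturated_fraction": lambda v: v <= 0.001,
    "cosmic_fraction": lambda v: v <= 0.01,
    "astrometry_contrast": lambda v: v > 3.0,
    "astrometry_chi2": lambda v: v < 50.0,
    "seeing_arcsec": lambda v: v <= 1.8,
    "n_calibration_stars": lambda v: v >= 5,
    "zeropoint_error": lambda v: v <= 0.2,
}

def failed_checks(metrics):
    """Return the names of the quality tests that an image fails."""
    return [name for name, passes in QC_LIMITS.items()
            if name in metrics and not passes(metrics[name])]

def job_is_valid(images):
    """Aggregate per-image results: valid only if every image passes (assumption)."""
    return all(not failed_checks(metrics) for metrics in images)

good_image = {
    "readnoise_e": 9.5, "flat_level_adu": 45000.0, "saturated_fraction": 0.0002,
    "cosmic_fraction": 0.0005, "astrometry_contrast": 12.0, "astrometry_chi2": 8.0,
    "seeing_arcsec": 1.1, "n_calibration_stars": 40, "zeropoint_error": 0.03,
}
bad_image = dict(good_image, seeing_arcsec=2.1, n_calibration_stars=3)
problems = failed_checks(bad_image)
```

Keeping the limits in one table makes it straightforward to report which specific test rejected an exposure, which is the information an observer needs when deciding whether to repeat it.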

7.2 Nightly report

Periods of observation typically last for 1 to 2 weeks. During this time it is very important to provide feedback to the astronomers in the observatory in the shortest possible time. This was one of the key constraints in the design of the PAUS data management system and we managed to process the whole night's data set before the next night starts (∼8 hours).


Figure 18. Some of the quality control plots that the nightly report produces as a result of the nightly pipeline processing. This data corresponds to one of the last nights of PAUS before the COVID-19 pandemic break. Each color represents a different narrow-band filter tray. Top: Evolution of the atmospheric seeing, which got quite high before midnight and stabilized at a low value before the end of the night. Bottom: Evolution of the transparency, where some clouds entered at around 4 am, reducing the absolute transparency to almost 10%.

The Nightly report is a web-based application that provides feedback to the astronomers on the quality of the data from the previous night so that observers can reschedule observations that did not meet a certain quality. It has also been critical to identify issues in the camera or telescope that were fixed with minimal delay for the remaining observing run. The Nightly report has default quality limits necessary for the survey and generates a report file that can be ingested directly into the PAUCam control system for re-scheduling targets. The application also displays statistics and evolution plots for each night, such as the ones in Figure 18, with options to adjust the metrics to be analyzed and their time span. A total of 35 parameters can be displayed to help PAUS astronomers understand the atmospheric, weather and instrument behaviour from any previous night.

In addition to the quality checks described in §7.1, the Nightly report displays the processing status of the main blocks in the Nightly pipeline: detrending, astrometry, PSF modelling and photometric calibration. The status in each block can be used as a quality cut (e.g. repeating all observations where PSF modelling failed in any detector). Approximately 40% of the exposures did not meet the image quality requirements imposed by the survey and had to be rejected due to bad weather or other issues.

7.3 Forced aperture inspector

With such a large narrow-band set and with the additional overlapping exposures, each object depends on the correct reduction and calibration of hundreds of images. There are many things that can go wrong, even if all quality tests passed in the processing of an image. It is critical to identify the source of issues that may end up as outliers in PAUS spectra and that the photo-z code could misinterpret as a physical feature of the galaxy, resulting in a catastrophic redshift determination.

A very reliable reference to validate PAUS measurements is high-resolution spectra from external surveys. The selection of the calibration fields was in fact driven by the overlap with datasets of carefully calibrated spectra. Our main validation field COSMOS contains ∼17,000 sources with good reference spectra. Additionally our main fields contain other datasets such as VIPERS (Le Fèvre et al. 2013) in W1 or DEEP2 (Newman et al. 2013) in W3, complementing the already extensive set in COSMOS. From the calibrated spectra, we can infer the synthetic narrow-band photometry and perform a direct comparison to the PAUS measurements. Some examples are shown in Appendix C. Galaxy sources with a spectrum provide accurate spectroscopic redshift estimations which allow us to know where the expected emission and absorption lines are, providing additional confidence in a particular spectral shape.

We built another quality control web application where PAUS measurements for a particular source are displayed together with the synthetic measurements from a reference spectrum. When the redshift is available, we also display the emission and absorption lines at the expected wavelength position. This allows us to visually inspect PAUS data and easily identify outliers or discrepancies with the reference spectrum. The application allows the user to click on a particular suspicious PAUS band, displaying the measurements that contribute to that band. Clicking again on a single measurement displays the portion of that particular image together with the aperture used in MEMBA and all quality parameters associated with the image or observing conditions. Very rapidly we can trace issues from the final PAUS spectrum back to the original image that contributed to every measurement.

As this proved to be a very powerful tool and many scientists from the PAUS collaboration contributed to it, we added a reporting system where people inspecting the sources could specify the issue that caused a particular outlier. The list of possible issues grew as we learned more and more about the PAUS data, and we ended up with a list of 18 possible issues, such as scatter-light, blended source, crosstalk, astrometry issue and more. Even though this is a subjective test that required some training, we could extract valid statistics and correct for multiple systematics that originally caused trouble. An example of the 40 bands of a galaxy in the Forced Aperture Inspector can be seen in Figure 19.

7.4 Duplicate observations test

Repeated exposures over the PAUS fields have been used to validate the final MEMBA photometry. We have between 3 and 10 independent fluxes for the same object in each narrow-band. We use these catalogs to build a sample of over a million pairs of duplicate (repeated) measurements of the same object. There are 45 pairs of measurements for the same object and about 40,000 separate objects, before


[Figure 19 plot. AB flux (e) versus wavelength (Å), 4500 to 8500 Å, showing the convolved spectra, emission lines, absorption lines and coadded flux, with the main spectral lines labelled. Colour scale: coadd 𝜒².]

Figure 19. A bright galaxy at z=0.2 from the COSMOS field. The aperture inspector displays the coadd measurements from MEMBA and overlays synthetic narrow-band photometry from the corresponding SDSS high-resolution spectrum (when available). With the emission and absorption lines depicted, the redshift solution can be confirmed, especially for emission-line galaxies such as the example above. The redder points correspond to higher 𝜒² values, caused by discrepant single-epoch fluxes, suggesting possible issues in the combined measurement.

cuts and masking. We identified duplicates as objects with the same reference ID (which means apertures with the same position in MEMBA). We select pairs with 𝑆𝑁𝑅 > 3. The goal is to test if the uncertainties in fluxes produced by MEMBA are consistent with repeated measurements for the same object as a function of different properties of the object and observation.

The top panels of Figure 20 show two examples of histograms of the values of normalized flux differences

$$ df \equiv \frac{f_1 - f_2}{\sigma} = \frac{f_1 - f_2}{\sqrt{\sigma_1^{2} + \sigma_2^{2}}} \tag{26} $$

of duplicate measurements with fluxes 𝑓1 and 𝑓2. The error is simply the MEMBA errors of 𝑓1 and 𝑓2 added in quadrature. In general, the filled histograms of normalized differences follow a normal distribution (shown by dashed red lines), but the width is typically a bit larger than unity (black line), which is the value we would expect if MEMBA errors were perfectly accurate: 𝜎68 ≃ 1.028 and 𝜎68 ≃ 1.175 in these two cases.
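The duplicate test can be sketched on simulated pairs. The definition of 𝜎68 used here, half the width of the central 68.27% interval, is an assumption since the text does not spell it out, and the function names and simulated numbers are purely illustrative.

```python
import numpy as np

def normalized_differences(f1, s1, f2, s2):
    """Normalized flux difference of duplicate measurements (Eq. 26)."""
    return (f1 - f2) / np.sqrt(s1 ** 2 + s2 ** 2)

def sigma68(d):
    """Half-width of the central 68.27% interval (assumed definition of sigma_68)."""
    lo, hi = np.percentile(d, [15.865, 84.135])
    return 0.5 * (hi - lo)

# Simulated duplicates whose true scatter is 20% larger than the reported error.
rng = np.random.default_rng(42)
n_pairs = 200_000
reported = np.ones(n_pairs)
f1 = 100.0 + rng.normal(0.0, 1.2, n_pairs)
f2 = 100.0 + rng.normal(0.0, 1.2, n_pairs)
width = sigma68(normalized_differences(f1, reported, f2, reported))
# `width` is expected to come out close to 1.2, i.e. errors underestimated by about 20%.
```

A width of one means the reported errors fully account for the scatter between repeated measurements, while a width above one measures by what factor they are underestimated.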

The bottom panels of Figure 20 show the width of the normalized duplicate distributions, 𝜎68, as a function of the total broad 𝑖-band magnitude of the reference catalog (W3 in CFHTLS), which we call 𝐼auto. We can see that there is a strong dependence on 𝐼auto, which indicates that MEMBA errors are correct at the faint end but are underestimated at the bright end. Similar results are found for other PAUS fields. The machine learning background modelling prototype presented in Cabayol-Garcia et al. (2020) seems to improve this trend for brighter sources, suggesting there is a background subtraction problem in higher SNR sources.

We have seen similar tendencies in the photometry of duplicates for other broad band surveys. This happens for both galaxies and stars, and using either aperture photometry or SExtractor total magnitudes (e.g. in the PAUS zero-point calibration with SDSS stars). We speculate that these results may come from small biases in the zero-point calibration values, which have spatial variations (see e.g. Fig. 8 in Castander et al. (2022)). These variations affect the brighter fluxes more strongly and are not accounted for by the error propagation of zero-points to flux in Eq. 17. A zero-point bias of only 1% can produce up to 20% mis-estimation in the flux errors of bright galaxies, which have larger fluxes and smaller relative errors.
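A rough way to see how a small zero-point bias can translate into a much larger error mis-estimation is the following back-of-the-envelope sketch. It is an illustration under simplifying assumptions that are not taken from the text: both duplicates have the same flux 𝑓 and reported error 𝜎, and each carries an independent multiplicative zero-point error of fractional size 𝛿 that is not included in 𝜎.

```latex
\mathrm{Var}(f_1 - f_2) = 2\sigma^{2} + 2\,\delta^{2} f^{2}
\quad\Longrightarrow\quad
\sigma_{68} \simeq \sqrt{\frac{2\sigma^{2} + 2\,\delta^{2} f^{2}}{2\sigma^{2}}}
= \sqrt{1 + \delta^{2}\,\mathrm{SNR}^{2}} ,
\qquad \mathrm{SNR} \equiv f/\sigma .
```

Under these assumptions, with 𝛿 = 0.01 a width of 𝜎68 = 1.2 is reached at SNR = √(1.2² − 1)/0.01 ≈ 66, whereas at SNR = 7 the same bias only gives 𝜎68 ≈ 1.002. The effect is therefore negligible for faint sources and grows quickly for bright ones, in line with the trend with 𝐼auto.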

In Figure 21 we show 𝜎68 for all other narrow-band wavelengths in PAUS. The background green color intensity variations represent the relative density of pairs as a function of λ and 𝜎68. The standard deviation is typically larger than 𝜎68 = 1, which indicates that errors are underestimated by up to 20% in some narrow-bands. We find similar tendencies for all wavelengths. The yellow arrows point to NB575 and NB755, which correspond to the two extreme cases shown in more detail in Figure 20. We conclude from the figure that the variations shown in Figure 20 span the values seen in the other narrow-bands.

7.5 Comparison with SDSS and VIPERS spectra

We use spectroscopic data from SDSS and VIPERS to predict the MEMBA flux measurements. We then compare the predictions with the actual measurements to test MEMBA and the previous data calibration steps.

7.5.1 Synthetic narrow-band photometry

In order to compare spectra with PAUS narrow-band photometry, we need to generate synthetic photometry from the high-resolution spectra. This provides a high quality reference to compare with any passband, especially valuable with the non-standard PAUS filter system. This comparison is only possible for objects that are observed both by a spectroscopic survey and by PAU.

The first step in the process of generating the synthetic bands is to retrieve and homogenize the spectral data. In our case we have converted all fluxes to the more common 𝑓λ with units of erg/cm²/s/Å. Generally, the data set contains the wavelengths where the spectrum is sampled, the fluxes, the noise (or inverse variance) and a mask. Optionally SDSS also includes a measurement of the sky, which allows us to identify possible contamination in strong emission or absorption lines. Second, we interpolate the bandpass response 𝑅(λ) to the sampling of the spectral data, as these two


Figure 20. Top: Duplicates for W3 galaxies (run 941) with NB = 575 nm (left) and NB = 755 nm (right). Histograms of values of 𝑑𝑓 ≡ (𝑓2 − 𝑓1)/𝜎. There are ≃ 1.1 · 10⁶ pairs with mean S/N of 6.82 and mean flux of 24.78 for NB = 575 nm. Bottom: 𝜎68 as a function of the 𝐼auto total 𝑖-band magnitude. Continuous lines correspond to all duplicates. Dashed red line shows values with smaller half light radius 𝑟50 < 0.616 arcseconds. The histogram (below the lines) shows the relative number of duplicates, which increases sharply at the faint end and dominates the statistics for the full population.

Figure 21. Normalized error (corresponding to 𝜎68 in Figure 20) in 𝑑𝑓 ≡ (𝑓2 − 𝑓1)/𝜎 as a function of narrow-band wavelength. Red continuous line corresponds to the value of 𝜎68 for all duplicates. Dashed yellow (blue) lines show 𝜎68 for the half of the sample with brighter (fainter) broad band reference 𝐼auto magnitudes. The background color shows the number of pairs in each narrow-band (lighter color means a higher number).

are not necessarily in the same space. Then we mask both the spectral fluxes and the passband with the flagged measurements from the spectrum mask. At this point we can compute the integrated average flux density of the source at the specific passband in erg/cm²/s/Hz as:

$$ \langle F_\nu \rangle = \int f_\lambda\, R(\lambda)\, \frac{\lambda^{2}}{c}\, {\rm d}\lambda \tag{27} $$

and its associated integrated response:

$$ R_{\rm i} = \int R(\lambda)\, {\rm d}\lambda . \tag{28} $$

Finally we can compute the synthetic magnitudes in the AB system with the following transformation:

$$ m_{\rm syn} = -2.5 \log\left( \frac{\langle F_\nu \rangle}{R_{\rm i}} \right) - 48.6 . \tag{29} $$

It is also important for the statistical analysis to estimate the error of each synthetic band. As the flux in the spectrum has been weighted by the response of the transmission, we must weight the noise in the spectrum by the relative transmission throughout the entire passband:

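$$ \sigma_{f_\nu}^{2} = \int \frac{R(\lambda)^{2}\, \sigma_\lambda^{2}\, \lambda^{2}}{c^{2} R_{\rm i}^{2}}\, {\rm d}\lambda , \tag{30} $$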


[Figure 22 plot, SDSS spectrum 0267-51608-0459 (SN: 22.69, z: 0.14). Top panel "Input spectra": flux (erg/cm²/s/Å, scale 1e−16) and SN versus wavelength (Å), showing flux, noise, flux error and SN. Middle panel "Target bands": transmission versus wavelength (Å). Bottom panel "Synthetic Photometry": AB mag versus wavelength (Å).]

Figure 22. Top: A sample galaxy at z=0.14 from SDSS used as input for synthetic photometry. The left axis (in red) represents the flux and its error while the right axis (in orange) represents the SNR. Middle: The target bands are the 40 narrow-band set from PAUS plus the two broad band systems from SDSS and CFHT. Bottom: The computed synthetic photometry from the high-resolution spectrum. The bands without enough unmasked samples from the spectrum are marked in red.

where 𝜎λ is the noise in the high-resolution spectrum. We can approximate the magnitude error as:

$$ \sigma_{m_{\rm syn}} \approx 1.0857\, \frac{\sigma_{f_\nu}^{2}}{\langle F_\nu \rangle / R_{\rm i}} . \tag{31} $$

Following the previous procedure we compute the photometry over all VIPERS spectra and all SDSS spectra that overlap with PAUS, for the 40 PAUS narrow-band set and the SDSS and CFHT broad band systems. We have flagged all measurements where the overlap between the system response and the unmasked spectrum is below 70%. An example of synthetic photometry from an SDSS spectrum over the PAUS narrow-bands and other broad bands is shown in Figure 22. More examples together with PAUS real observations are shown in Appendix C.
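The synthetic magnitude of Eqs. 27 to 29 can be sketched for a single band as follows. This is a minimal illustration rather than the PAUdm code: the function and argument names are hypothetical, the integrals use a simple trapezoidal rule, masked samples are handled by zeroing both the flux and the response, and the noise propagation of Eq. 30 is not included.

```python
import numpy as np

C_ANGSTROM_PER_S = 2.99792458e18   # speed of light in Angstrom per second

def _trapezoid(y, x):
    """Trapezoidal integration of y(x) on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def synthetic_ab_mag(wave, f_lambda, band_wave, band_resp, good=None):
    """Synthetic AB magnitude of a spectrum through one passband.

    wave [Angstrom] and f_lambda [erg/cm^2/s/Angstrom] sample the spectrum;
    band_wave, band_resp sample the response R(lambda); `good` is an optional
    boolean array marking unmasked spectral samples.
    """
    wave = np.asarray(wave, dtype=float)
    f_lambda = np.asarray(f_lambda, dtype=float)
    # Interpolate the response onto the spectral sampling (zero outside the band).
    resp = np.interp(wave, band_wave, band_resp, left=0.0, right=0.0)
    if good is not None:
        keep = np.asarray(good, dtype=bool)
        f_lambda = np.where(keep, f_lambda, 0.0)   # mask the spectrum ...
        resp = np.where(keep, resp, 0.0)           # ... and the passband alike
    flux_nu = _trapezoid(f_lambda * resp * wave ** 2 / C_ANGSTROM_PER_S, wave)  # Eq. 27
    r_int = _trapezoid(resp, wave)                                              # Eq. 28
    return -2.5 * np.log10(flux_nu / r_int) - 48.6                              # Eq. 29

# Sanity check: a flat f_nu = 3631 Jy spectrum has AB magnitude ~0 in any band.
wave = np.linspace(5000.0, 6500.0, 1501)
f_nu = 3.631e-20                                   # erg/cm^2/s/Hz
f_lambda = f_nu * C_ANGSTROM_PER_S / wave ** 2
mag = synthetic_ab_mag(wave, f_lambda, [5680.0, 5690.0, 5810.0, 5820.0], [0.0, 1.0, 1.0, 0.0])
```

The flat-spectrum check is a convenient test of the unit handling, since the AB zero point is defined so that a source with constant 𝑓ν of 3631 Jy has magnitude zero in every band.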

7.5.2 Re-calibration of spectra

To account for remaining aperture or PSF effects in the measured spectra we use total broad band (BB) photometry in the corresponding reference survey (SDSS or VIPERS) to re-calibrate each individual spectrum. To do this we first estimate synthetic broad-bands from the spectra, 𝐹𝑆(𝐵𝐵), as shown in the previous section. We then use the BB measured flux 𝐹𝑂(𝐵𝐵) to find a multiplicative zero-point, 𝑍𝑃, which is in general different for each BB:

$$ ZP(BB) = \frac{F_O(BB)}{F_S(BB)} . \tag{32} $$

We use ZP to re-scale each individual spectrum. In the cases where we have 2 (or 3) BB measurements fully within the spectral wavelength coverage we combine them using a fit to a linear (or cubic) function ZP = ZP(λ), where λ is the mean wavelength of the bandpass response 𝑅(λ). Each synthetic narrow-band λNB from the spectrum is re-scaled by ZPNB = ZP(λNB).

Fig. 11 in Castander et al. (2022) shows the histogram of values of ZPNB for all 40 narrow-bands in 25644 independent measurements of 170 different SDSS calibration stars. The mean re-calibration is only a 2% offset with a 5% scatter: ZP = 1.02 ± 0.05. Similar results are found for VIPERS.
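The re-calibration of Eq. 32 can be sketched for the one-band and two-band cases. This is an illustrative sketch with hypothetical names and numbers; it covers only a constant zero-point (one broad band) and a linear ZP(λ) (two broad bands), and leaves out the three-band case mentioned above.

```python
import numpy as np

def zp_function(bb_wave, f_obs, f_syn):
    """Return a callable ZP(lambda) built from broad-band zero-points (Eq. 32).

    bb_wave: mean wavelengths of the broad bands; f_obs, f_syn: measured and
    synthetic broad-band fluxes. One band gives a constant, two bands a line.
    """
    bb_wave = np.asarray(bb_wave, dtype=float)
    zp = np.asarray(f_obs, dtype=float) / np.asarray(f_syn, dtype=float)
    if zp.size == 1:
        return lambda lam: np.full_like(np.asarray(lam, dtype=float), zp[0])
    if zp.size == 2:
        slope, intercept = np.polyfit(bb_wave, zp, 1)
        return lambda lam: slope * np.asarray(lam, dtype=float) + intercept
    raise ValueError("this sketch only handles one or two broad bands")

# Hypothetical example: two broad bands at 6200 and 7600 Angstrom.
zp_of_lambda = zp_function([6200.0, 7600.0], f_obs=[1.1, 1.3], f_syn=[1.0, 1.0])
nb_wave = np.array([6200.0, 6900.0, 7600.0])     # narrow-band central wavelengths
nb_synthetic = np.array([2.0, 2.0, 2.0])         # synthetic narrow-band fluxes
nb_recalibrated = zp_of_lambda(nb_wave) * nb_synthetic
```

Each synthetic narrow-band flux is multiplied by the zero-point evaluated at its own wavelength, so a wavelength-dependent mismatch between the spectrum and the broad-band photometry is removed smoothly rather than band by band.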

7.5.3 Aperture corrections

Once the SDSS spectra are re-calibrated with Eq. 32, we also perform an aperture correction of the amplitude of each individual spectrum (𝑆) to the PAUS measurements. This is a fit to a linear constant 𝐴 = 𝐴(𝑆)

$$ A(S) = \frac{\sum_{i} f_{\rm PAUS}(S, i)\, f_{\rm SDSS}(S, i)}{\sum_{i} f_{\rm SDSS}^{2}(S, i)} \tag{33} $$

between PAUS raw fluxes 𝑓PAUS and SDSS re-scaled synthetic spectral fluxes 𝑓SDSS (including the spectral recalibration). The sum is over individual PAUS measurements 𝑖 in a given spectrum (𝑆) and it uses inverse variance weighting 𝑤𝑖 = 1/𝜎𝑖², where 𝜎𝑖 is the joint error (from SDSS and PAUS) added in quadrature. Typically there are 200 PAUS independent measurements (40 narrow-bands times 5 exposures) for each SDSS spectrum.
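The amplitude fit of Eq. 33, with the inverse variance weights written out explicitly, can be sketched as a weighted least-squares estimate of a single scale factor. The function and argument names are hypothetical and the numbers are toy values.

```python
import numpy as np

def aperture_correction(f_paus, f_sdss, sigma_paus, sigma_sdss):
    """Weighted least-squares amplitude A such that f_paus ~ A * f_sdss (Eq. 33)."""
    f_paus = np.asarray(f_paus, dtype=float)
    f_sdss = np.asarray(f_sdss, dtype=float)
    # Joint error added in quadrature, used as inverse-variance weight.
    w = 1.0 / (np.asarray(sigma_paus, dtype=float) ** 2
               + np.asarray(sigma_sdss, dtype=float) ** 2)
    return float(np.sum(w * f_paus * f_sdss) / np.sum(w * f_sdss ** 2))

# Toy spectrum: PAUS fluxes are exactly 0.98 times the synthetic ones.
f_sdss = np.array([1.0, 2.0, 3.0, 4.0])
f_paus = 0.98 * f_sdss
amplitude = aperture_correction(f_paus, f_sdss,
                                sigma_paus=[0.1, 0.1, 0.2, 0.2],
                                sigma_sdss=[0.05, 0.05, 0.05, 0.05])
```

This is the closed-form minimum of the weighted 𝜒² between the PAUS fluxes and the scaled synthetic fluxes, so no iterative fit is needed. The per-band zero-point of Eq. 34 has the same form, with the sum taken over all measurements in one narrow-band instead of over one spectrum.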

Fig. 13 in Castander et al. (2022) shows the distribution of values of 𝐴 for different SDSS star calibration spectra and 42420 independent measurements for PAUS run #955 in COSMOS. We find a mean value and scatter of 𝐴 = 0.98 ± 0.02, which indicates that PAUS data is in very good agreement with the SDSS calibration, within 2% overall scatter.

Figure 23 shows a comparison of SDSS, VIPERS and PAUS estimated narrow-band fluxes for two typical SED examples of galaxies (at 𝑧 = 0.4 and 𝑧 = 0.5) with 𝑖AB ≃ 20. One can see the corresponding figure for stars in Fig. 12 of Castander et al. (2022).

7.5.4 Color terms

We now check for any residual differences as a function of narrow-band wavelength λ using galaxies with SDSS and VIPERS spectra. Figure 24 shows the mean and scatter of the zero-point difference for each narrow-band λ

$$ {\rm ZP}(\lambda) = \frac{\sum f_{\rm PAUS}(\lambda)\, f_{\rm SDSS}(\lambda)}{\sum f_{\rm SDSS}^{2}(\lambda)} \tag{34} $$

between PAUS raw fluxes 𝑓PAUS and SDSS or VIPERS re-scaled synthetic spectral fluxes 𝑓SDSS (including the aperture correction 𝐴 in Eq. 33). The sum is over all individual PAUS measurements (42420 in total) and uses inverse variance weighting 𝑤 = 1/𝜎², where 𝜎 is the joint error added in quadrature.


Figure 23. Example of validation spectra for the photometric calibration in PAUS (points with errorbars) with synthetic narrow-band photometry from SDSS (blue) and VIPERS (red) galaxy spectra. SDSS (VIPERS) spectra have been multiplied by 𝐴1 (𝐴2), as shown in the labels, to account for possible differences in the aperture used in each observation. The "chi2" label shows the normalized 𝜒² as compared to PAUS data.

In the top panel of Figure 24 we use SDSS calibration stars, which show a much smaller scatter than SDSS galaxies (middle). This is because we use fainter galaxies and also because the aperture correction becomes more important for extended objects.

We find a small residual color tilt between the SDSS and PAUS narrow-band systems when using SDSS star spectra for the comparison

$$ {\rm ZP}(\lambda) = 1.05 \pm 0.04 - (0.05 \pm 0.04) \left( \frac{\lambda}{650\,{\rm nm}} \right) \tag{35} $$

which is consistent with unity within errors. The scatter between the 40 bands after correcting for this linear residual slope is just 0.8%. The scatter increases from 0.8% to 1.1% without this linear color correction.
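To get a feeling for the size of this tilt, the central values of Eq. 35 can be evaluated across the narrow-band set. The sketch below ignores the quoted uncertainties, and the wavelength range of 455 to 845 nm for the bluest and reddest PAUS narrow-bands is an assumption not stated in this section.

```python
def zp_tilt(lam_nm):
    """Central value of the linear colour term of Eq. 35 (uncertainties ignored)."""
    return 1.05 - 0.05 * (lam_nm / 650.0)

# Assumed central wavelengths of the bluest, pivot and reddest narrow-bands, in nm.
zp_blue, zp_pivot, zp_red = zp_tilt(455.0), zp_tilt(650.0), zp_tilt(845.0)
```

With these central values the zero-point equals unity at the 650 nm pivot and departs from it by about ±1.5% at the two ends of the wavelength range, which is small compared with the quoted uncertainties on the fitted coefficients.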

Figure 24. A validation study of the photometric calibration in PAUS (ZP = PAU/SDSS) using synthetic narrow-band photometry from SDSS star spectra (Top), SDSS galaxy spectra (Middle) and VIPERS galaxy spectra (Bottom). Blue errorbars indicate the scatter in the values, which is much larger for galaxies.

8 KNOWN LIMITATIONS & FURTHER WORK

Since the first light of PAUCam we have been constantly improving both the Nightly and MEMBA algorithms, as well as our understanding of the systematics and the instrument behaviour. Even though we reached outstanding photometry and photo-z accuracy, there are still steps in the overall process that can be improved. However in this paper we refer only to those algorithms that were implemented in the main pipeline and that delivered scientific results in already published papers. We plan to improve and present further algorithms with the corresponding upgrade in scientific results once these are stabilized and validated.

Regarding the flux calibration step in the Nightly pipeline, we know the current dome flat-fields do not reproduce the sky illumination accurately. Therefore the detector response after the flat-field calibration is not homogeneous. A possible solution is to process sky flats and keep the dome flats only for small-scale pixel variations, not for correcting the large-scale vignetting and illumination patterns. The difficulty here is that scatter-light is an additive component, which complicates the sky flat processing. The solution proposed in §3.4.2 has only been applied to particular studies involving extended objects such as M101 in Figure 8. Once we verify that the fluxes of the main target galaxies (smaller and fainter, down to 𝑖AB < 22.5) are preserved, we will implement the sky-flat scatter-light subtraction in the main processing.


The background modelling implemented in MEMBA is a simple but reliable method. Due to the complex varying background in PAUS images resulting from the flat-field and scatter-light residuals, a more complex background estimator that understands trends of the background can provide significant benefits. This is especially important for low SN sources where small residuals of the background can bias the measurement. A machine learning technique has been studied and published in Cabayol-Garcia et al. (2020) and will soon enter into the main processing of PAUdm. There is also an innovative method developed for PAUS (Cabayol et al. 2021) that provides flux estimates from neural network algorithms, improving the accuracy and increasing the signal-to-noise of the measurements.

The areas in the focal plane with most distortion are currently flagged. There is a possibility to increase the survey efficiency by accurately modelling the PSF at the edges of the focal plane and adapting the apertures to include a larger area of the unvignetted mosaic.

Modelling the PSF and computing the astrometric solution is done at the single-epoch level, independently for each exposure, for simplicity and because it delivers good enough precision. However more stable and accurate solutions can be obtained by computing astrometry and PSF models with multiple overlapping exposures, at the expense of complicating the Nightly processing by adding dependencies between single-epoch reductions. The astrometric precision would only improve marginally at the sub-pixel level, which could have some benefits for particular applications. But a more accurate and stable varying PSF model could improve the photometry and increase the area efficiency, in contrast to the simple model we currently use.

Similar to the previous point, the photometric calibration delivers zero-points independently for each detector image. There are algorithms such as the Übercalibration that compute a global solution from the overlapping measurements between images, homogenizing the calibration and ensuring a flat response across a wide area in the sky. However, as we calibrate against SDSS stars and those have been globally calibrated in a similar process, we would expect this improvement to be minor for PAU, assuming a good match with SDSS photometry.

9 SUMMARY & CONCLUSIONS

The PAU data management system described here has been able to provide the most accurate photo-z catalogues available down to i_AB < 22.5, dealing with very particular aspects of the narrow-band photometry and specifics of the instrument. Custom algorithms were designed to calibrate narrow bands down to 1% accuracy. Subtle systematic effects had to be modelled and corrected, such as the specialized processing to deal with scatter-light residuals caused by the unusual filter tray disposition of PAUCam. The technical implementation presented has also been challenging due to the large volume and complexity of the data set for this survey. It has been key to orchestrate the processing and metadata around a powerful database, with the flexibility to modify and extend the processing as needed and allowing very complex analyses that reinforced the scientific exploitation of the data. Although some of the adopted solutions are bound to the infrastructure of the data center, the system can be adapted to different surveys or hardware configurations with similar volumes of data (below 100 TB and 10^10 database entries).

The current PAUdm implementation has some limitations that we are currently working to improve. Even under these limitations, the photometric catalogues published by PAUS deliver the most precise photo-z down to i_AB < 22.5. PAUS data are available in the Early Data Release (Eriksen et al. 2019), in the PAUS+COSMOS photo-z catalog (Alarcon et al. 2021) and in Soo et al. (2021). With its large-scale imaging survey of narrow-band photometry, where each pixel is a low-resolution spectrum, these measurements open new windows in various areas of astronomy, as published in Stothert et al. (2018), Tortorelli et al. (2021), Johnston et al. (2021) and Renard et al. (2021).

ACKNOWLEDGEMENTS

The PAU Survey is partially supported by MINECO under grants CSD2007-00060, AYA2015-71825, ESP2017-89838, PGC2018-094773, PGC2018-102021, SEV-2016-0588, SEV-2016-0597, MDM-2015-0509, PID2019-111317GB-C31 and Juan de la Cierva fellowship and LACEGAL and EWC Marie Sklodowska-Curie grant No 734374 and No 776247 with ERDF funds from the EU Horizon 2020 Programme, some of which include ERDF funds from the European Union. IEEC and IFAE are partially funded by the CERCA and Beatriu de Pinos program of the Generalitat de Catalunya. Funding for PAUS has also been provided by Durham University (via the ERC StG DEGAS-259586), ETH Zurich, Leiden University (via ERC StG ADULT-279396 and Netherlands Organisation for Scientific Research (NWO) Vici grant 639.043.512), Bochum University (via a Heisenberg grant of the Deutsche Forschungsgemeinschaft (Hi 1495/5-1) as well as an ERC Consolidator Grant (No. 770935)), University College London, Portsmouth support through the Royal Society Wolfson fellowship and from the European Union’s Horizon 2020 research and innovation programme under the grant agreement No 776247 EWC. The results published have been also funded by the European Union’s Horizon 2020 research and innovation programme under the Maria Skłodowska-Curie (grant agreement No 754510), the National Science Centre of Poland (grant UMO-2016/23/N/ST9/02963) and by the Spanish Ministry of Science and Innovation through Juan de la Cierva-formacion program (reference FJC2018-038792-I). The PAUS data center is hosted by the Port d’Informació Científica (PIC), maintained through a collaboration of CIEMAT and IFAE, with additional support from Universitat Autònoma de Barcelona and ERDF. P.R. is supported by National Science Foundation of China (grant No. 12073014).

DATA AVAILABILITY

The data underlying this article are available on the PAUS website, under the Data Releases section, at https://pausurvey.org. The available catalogues include the Early Data Release (EDR) catalogue and the PAUS+COSMOS photo-z catalogue. The EDR corresponds to PAUS data obtained in the COSMOS field, described and used in Eriksen et al. (2019). The PAUS+COSMOS photo-z catalogue contains accurate and precise photometric redshifts in the ACS footprint from the COSMOS field for objects with i_AB < 23, combining all 40 PAUS bands with 26 broad bands from the COSMOS2015 catalogue (Alarcon et al. 2021). Further data releases are expected with the photometry presented in this article.

REFERENCES

Alam S., et al., 2015, ApJS, 219, 12
Alarcon A., et al., 2021, MNRAS, 501, 6103
Astropy Collaboration et al., 2018, AJ, 156, 123
Bertin E., 2006, in Gabriel C., Arviset C., Ponz D., Enrique S., eds, Astronomical Society of the Pacific Conference Series Vol. 351, Astronomical Data Analysis Software and Systems XV. p. 112
Bertin E., 2011, in Evans I. N., Accomazzi A., Mink D. J., Rots A. H., eds, Astronomical Society of the Pacific Conference Series Vol. 442, Astronomical Data Analysis Software and Systems XX. p. 435
Bertin E., Arnouts S., 1996, A&AS, 117, 393
Bertin E., Mellier Y., Radovich M., Missonnier G., Didelon P., Morin B., 2002, in Bohlender D. A., Durand D., Handley T. H., eds, Astronomical Society of the Pacific Conference Series Vol. 281, Astronomical Data Analysis Software and Systems XI. p. 228
Bradley L., et al., 2020, astropy/photutils: 1.0.0, doi:10.5281/zenodo.4044744, https://doi.org/10.5281/zenodo.4044744
Bridle S., et al., 2009, Annals of Applied Statistics, 3, 6
Cabayol-Garcia L., et al., 2020, MNRAS, 491, 5392
Cabayol L., et al., 2021, MNRAS, 506, 4048
Calabretta M. R., Greisen E. W., 2002, A&A, 395, 1077
Capak P., et al., 2007, ApJS, 172, 99
Casas R., et al., 2012, in Holland A. D., Beletic J. W., eds, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series Vol. 8453, High Energy, Optical, and Infrared Detectors for Astronomy V. p. 845326, doi:10.1117/12.924640
Castander F. J., Serrano S., Eriksen M., Gaztanaga E., Casas R., 2022, AJ, 157, 246
Erben T., et al., 2013, MNRAS, 433, 2545
Eriksen M., et al., 2019, MNRAS, 484, 4200
Freyhammer L. M., Andersen M. I., Arentoft T., Sterken C., Nørregaard P., 2001, Experimental Astronomy, 12, 147
Gaia Collaboration et al., 2018, A&A, 616, A1
Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M., 2005, ApJ, 622, 759
Heymans C., et al., 2012, MNRAS, 427, 146
Hildebrandt H., et al., 2012, MNRAS, 421, 2355
Ivezić Ž., et al., 2019, ApJ, 873, 111
Janesick J. R., 2001, Scientific charge-coupled devices. SPIE
Johnston H., et al., 2021, A&A, 646, A147
Kuijken K., et al., 2015, MNRAS, 454, 3500
Kuijken K., et al., 2019, A&A, 625, A2
Laigle C., et al., 2016, ApJS, 224, 24
Landolt A. U., 1992, AJ, 104, 340
Le Fèvre O., et al., 2013, A&A, 559, A14
Miller L., et al., 2013, MNRAS, 429, 2858
Monet D. G., et al., 2003, AJ, 125, 984
Newman J. A., et al., 2013, ApJS, 208, 5
Padilla C., et al., 2019, AJ, 157, 246
Renard P., et al., 2021, MNRAS, 501, 3883
Rowe B. T. P., et al., 2015, Astronomy and Computing, 10, 121
Soo J. Y. H., et al., 2021, MNRAS, 503, 4118
Stothert L., et al., 2018, MNRAS, 481, 4221
Tody D., 1986, in Crawford D. L., ed., Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series Vol. 627, Instrumentation in astronomy VI. p. 733, doi:10.1117/12.968154
Tonello N., et al., 2019, Astronomy and Computing, 27, 171
Tortorelli L., et al., 2021, J. Cosmology Astropart. Phys., 2021, 013
Trujillo I., Aguerri J. A. L., Cepa J., Gutiérrez C. M., 2001, MNRAS, 328, 977
Wells D. C., Greisen E. W., Harten R. H., 1981, A&AS, 44, 363
York D. G., et al., 2000, AJ, 120, 1579
van Dokkum P. G., 2001, PASP, 113, 1420

APPENDIX A: OPERATION AND TECHNICAL PERFORMANCE

The PAUS data management system has been designed to operate in the infrastructure at the Port d’Informació Científica (PIC). In this section we describe the technical aspects of the main scientific pipelines and the tools required to operate the pipeline under the available infrastructure. The solution presented below was not the original design, as both the infrastructure and the project required changes since the beginning of the operation. A more technical and infrastructure-oriented description of PAUdm is given in Tonello et al. (2019). This section also includes updated data and pipeline flows from the ones presented in the technical paper.

A1 Archive

The PAUS camera produces ∼300 GB of raw data per observing night. These are mostly FITS files that contain the exposure images and additional metadata in their headers. The data are processed in the main pipelines, where more sub-products are generated, multiplying the volume of raw data. The current size of the raw archive (until 20A observations) is 42 TB and it is safely archived in two copies on tape and a third copy on disk. The processed data are significantly larger due to the increased bit depth and the various sub-products generated per exposure (science, mask, weights, PSF models, etc). For the reduced data we have a single copy on disk, except for published releases, where we include an additional copy on tape. The raw archive tree organizes the data in observation sets, following the schema in the PAUCam temporary archive at the observatory. The reduced tree adds an additional layer to account for multiple reprocessings of the same data, which we call productions.

Even though most of the access to the archive system is provided by the nodes in the computing farm, we wanted to make both raw and reduced data available to the PAUS Collaboration. For that purpose we set up a WebDAV server that allows web access to the entire archive in a user-friendly format.

A2 Database

With such a large and complex dataset, where millions of galaxies, measurements and images are related, it has been key to set up a relational database that tracks all the information and metadata of the survey. We have chosen a PostgreSQL database running on a powerful 12-core, 96 GB server in a twin configuration for reliability and performance. The database can be accessed by the pipelines via an object relational mapper (ORM) for better integration and reliability under the pipeline environment. It is also accessed by the different web applications, such as the Nightly report and the forced aperture inspector described in previous sections. As PAUS is a large collaboration, we make available via the PAUdm website a dynamic view of the database to browse the schema and perform simple queries. Additionally, for development and validation purposes, it can be accessed via Python notebooks under a read-only role, dumping queries directly into dataframes with all the flexibility and potential that these objects provide.

The database model was designed to allow for reprocessing of data at any level, tracked by the production table. Careful constraints were set on each table to ensure unique entries under each production set. We defined 4 main pipelines. The Pixel Simulation pipeline produces survey and pixel image simulations for development and assessment of performance. The Nightly pipeline (§A4.1) is where the main image reduction and calibration happens; it can process input productions from the pixel simulation or real observations from PAUCam. The MEMBA pipeline (§A4.2) performs forced photometry over the Nightly images. Finally, the Photo-z pipeline is a wrapper to BCNz (Eriksen et al. 2019) that estimates photometric redshifts from MEMBA photometry. Each pipeline can be processed independently, allowing a given input production to be processed multiple times. For instance, one can run different aperture photometry in MEMBA under various configurations with the same set of image reductions made by the Nightly pipeline.
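The per-production uniqueness constraint can be illustrated with a minimal sketch. It uses an in-memory SQLite database as a stand-in for the PostgreSQL server, borrows a few column names from the schema in Appendix B, and assumes a uniqueness key of (production_id, ref_id, band); the actual constraints and the flux values below are illustrative assumptions, not taken from PAUdm.

```python
import sqlite3

# In-memory SQLite stand-in for the survey database (assumption: the real
# system is PostgreSQL and its exact constraints are not reproduced here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE production (
    id INTEGER PRIMARY KEY,
    input_production_id INTEGER REFERENCES production(id),
    pipeline TEXT NOT NULL
);
CREATE TABLE forced_aperture_coadd (
    production_id INTEGER NOT NULL REFERENCES production(id),
    ref_id INTEGER NOT NULL,
    band TEXT NOT NULL,
    flux REAL,
    flux_error REAL,
    UNIQUE (production_id, ref_id, band)  -- one entry per production, source and band
);
""")

# One Nightly production feeding two independent MEMBA productions.
conn.execute("INSERT INTO production VALUES (1, NULL, 'nightly')")
conn.execute("INSERT INTO production VALUES (2, 1, 'memba')")
conn.execute("INSERT INTO production VALUES (3, 1, 'memba')")

# The same source and band can be stored once per production (dummy fluxes).
rows = [(2, 1001, "NB455", 12.3, 0.8), (3, 1001, "NB455", 12.9, 0.7)]
conn.executemany("INSERT INTO forced_aperture_coadd VALUES (?, ?, ?, ?, ?)", rows)

# A second entry for the same production, source and band is rejected.
try:
    conn.execute("INSERT INTO forced_aperture_coadd VALUES (2, 1001, 'NB455', 0.0, 0.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

n_rows = conn.execute("SELECT COUNT(*) FROM forced_aperture_coadd").fetchone()[0]
print(duplicate_rejected, n_rows)
```

Keying every measurement on its production is what lets several MEMBA configurations coexist over the same Nightly reduction without overwriting each other.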

In addition to the data-related tables, we have a survey strategy database, synchronized with the one at the observatory, containing all information about fields, exposure status and observation progress. These tables are used throughout the night to schedule targets that need to be observed by PAUCam. They are also necessary in the regular PAUdm processing to select the exposures that are associated with a survey field and have been classified as valid.

Finally, the database has all data operation tables, with information about job configuration, status and dependencies. Their use is described in the next subsection (§A3). Additionally, the whole database schema with the most relevant columns in each table can be found in Appendix B.

A3 Processing

The high volume of data and its complex analysis require processing the pipelines in a data center with enough computing power, such as the Port d’Informació Científica (PIC). This is a High-Throughput Computing data center and therefore we had to split each pipeline into smaller tasks that can be processed independently with limited consumption of memory and CPU time. Consequently, a pipeline may result in hundreds or even thousands of jobs, each with its configuration and inter-dependencies, that need to be launched, monitored and operated.

For that purpose we designed a job orchestration tool named Brownthrower (BT) that gives us the flexibility to operate the pipelines at PIC. With BT we can create jobs, add dependencies between them so that certain jobs do not begin processing until others are complete, share configuration between jobs and monitor the status of a pipeline and its sub-jobs. To process these jobs we submit to the computing farm, via HTCondor, a set of pilot jobs that continuously grab free jobs to be processed (in status Queued and without pending dependencies) until all jobs have been processed.
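The pilot behaviour described above can be sketched in a few lines. This is not the Brownthrower API: the Job class, the status strings and the helper functions are hypothetical, and the job names are borrowed from the MEMBA pipeline (§A4.2) purely for illustration. A single pilot repeatedly takes any queued job whose parents are all done.

```python
from dataclasses import dataclass, field

QUEUED, DONE = "QUEUED", "DONE"  # hypothetical status labels


@dataclass
class Job:
    name: str
    parents: list = field(default_factory=list)  # names of jobs that must finish first
    status: str = QUEUED


def next_free_job(jobs):
    """Return a queued job whose parents are all done, or None if there is none."""
    for job in jobs.values():
        if job.status == QUEUED and all(jobs[p].status == DONE for p in job.parents):
            return job
    return None


def run_pilot(jobs):
    """Process free jobs one at a time until none is left; return the execution order."""
    order = []
    while (job := next_free_job(jobs)) is not None:
        # A real pilot would run the job payload here before marking it done.
        job.status = DONE
        order.append(job.name)
    return order


jobs = {j.name: j for j in [
    Job("coadd", parents=["forced_phot_1", "forced_phot_2"]),
    Job("forced_phot_1"),
    Job("forced_phot_2"),
    Job("mask"),
]}
order = run_pilot(jobs)
print(order)
```

In production many pilots run concurrently, so picking a job would also need an atomic status change (for example a database transaction) to avoid two pilots grabbing the same job; that part is omitted here.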

A Jupyter Lab web service was recently set up at PIC, running on actual nodes from the computing farm (with memory up to 32 GB) or even GPU nodes. This service has been of great help to develop and deploy new algorithms, as well as for validation and quick tests.

A4 Data flow and orchestration

The data flow in PAUdm is divided between the main PAUdm archive, where large files are stored, and the database, where metadata and information that requires complex selections are uploaded, orchestrated by the different pipelines. A summary of the data flow is depicted in Figure A1. First, the raw data are transferred from the observatory on La Palma to the archive at PIC in Barcelona (detailed in Tonello et al. 2019). Immediately, the exposure metadata are registered in the database. Next, the Nightly pipeline begins its image calibration and archives the clean images, their PSF models and the astrometric solutions in WCS. Photometric measurements and their calibration ZPs are uploaded to the database. Once enough sky area has been processed by the Nightly, MEMBA can begin the galaxy photometry and reingest its coadd catalogue once all measurements have been done. MEMBA is also in charge of producing survey masks, which it stores in the archive. Finally, the photo-z pipeline obtains MEMBA’s measurements and computes the photo-z for each galaxy. The photo-z values and estimated errors enter the database, while the large redshift probability distribution files are archived in the storage.

A4.1 Nightly Pipeline

This is the pipeline in charge of the image calibration, as described in §3. It begins with a set of raw exposures, including flat and bias calibration images. It ends with the science exposures astrometrically and photometrically calibrated, ready for the flux measurements.

The Nightly pipeline has two main steps: first the production of master bias and master flats, and second the single-epoch reduction of science images. During observation periods, we process the pipeline in batches of observation sets. Typically an observation set contains the exposures from a single night. However, as PAUCam is also available at the WHT as a community instrument, there can be more than one observation set per night when observations belong to different surveys. When we operate in observing mode, the Nightly pipeline tree starts with the master bias, then continues with the master flats of each filter tray and, associated with each tray, the corresponding sky images. This means that the image calibration will not start until the master bias and master flat jobs are successfully completed. The pipeline tree can be seen in Figure A2.
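The ordering implied by this tree can be made concrete with a small sketch that groups jobs into stages that may run in parallel. The job names and the two-tray layout are hypothetical, and the standard-library graphlib module is used here only for illustration; it is not part of PAUdm.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical observation set: each job maps to the set of jobs it waits for.
nightly_tree = {
    "master_bias": set(),
    "master_flat_tray1": {"master_bias"},
    "master_flat_tray2": {"master_bias"},
    "science_tray1_exp1": {"master_flat_tray1"},
    "science_tray1_exp2": {"master_flat_tray1"},
    "science_tray2_exp1": {"master_flat_tray2"},
}


def parallel_stages(tree):
    """Group jobs into successive stages; jobs within a stage are independent."""
    sorter = TopologicalSorter(tree)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every job whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(nightly_tree)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

With this layout the master bias runs alone, the master flats of both trays run next, and all science exposures become available only once the flat of their own tray is finished.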

There is a second mode to operate the Nightly pipeline, meant to process entire fields from multiple nights. This is used when we improve the Nightly code and want to reprocess a given subset. First we process all calibration frames (master bias and master flats) from the nights with science exposures that we plan to process. At a second stage, once all the calibration frames are available, we analyze all the science images in parallel.

The pipeline has evolved significantly since the beginning, as we had to deal with heterogeneous data sets such as very cloudy skies, observation sets without calibration images, saturated flats, etc. A major effort was required to automatically detect any possible situation (detailed in §7.1) and either correct for it or classify the faulty data as invalid.

To allow for the 8-hour rapid feedback during an observation period, we launch 50 BT pilots to the computing farm that process all images in time. Reprocessing entire fields involves a much greater set of images and therefore we increase the number of pilots to 100. Master bias and master flat jobs are processed in 10 minutes, while single-epoch exposure reduction can take up to 20 minutes per job.

The Nightly pilots require intense I/O access to the archive system to retrieve raw data and ingest reduced images, and for this reason we do not allow more than 100 Nightly pilots to run in parallel.

A4.2 MEMBA Pipeline

The Multi-Epoch and Multi-Band Analysis pipeline is intended to perform the photometry over the reduced images across the different bands and overlapping exposures from different epochs. It can be divided into three main steps: the forced photometry, the coadd catalogue and the production of survey masks. The pipeline tree can be seen in Figure A3.

Each forced photometry job takes care of running the photometry of a single detector image. It will load the reference catalogue, the corresponding reduced image and mask, and will upload the measurements to the database once completed.

Figure A1. A simplified schema of the PAUS data flow, where interactions between pipeline, storage and database are shown.

Figure A2. The dependency chart of the Nightly pipeline. Empty boxes define jobs creating sub-jobs while filled boxes refer to jobs processing data at the computing farm. The dependency at the filter level starts with the master flat-fields and the single-epochs are related to their own masters.

The coadd jobs are divided into different areas of the sky. We select the areas by HEALPix pixels (Górski et al. 2005) of N_side = 128. This approach allows us to limit the load of each job, retrieving only the overlapping measurements of a reduced area, defined by the pixel size. Coadding a larger field only increases the number of jobs, while each job always remains constrained in memory and processing time.
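As a rough guide to the size of each coadd area, the number and area of HEALPix pixels follow directly from N_side, since the sphere is divided into 12 N_side^2 equal-area pixels. The short sketch below evaluates this for N_side = 128; the function name is illustrative and no HEALPix library is required.

```python
import math


def healpix_pixel_stats(nside):
    """Return (number of pixels, pixel area in deg^2) for a given HEALPix N_side."""
    npix = 12 * nside**2  # 12 base pixels, each subdivided into nside^2 pixels
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2
    return npix, full_sky_deg2 / npix


npix, area_deg2 = healpix_pixel_stats(128)
side_arcmin = math.sqrt(area_deg2) * 60.0  # side of a square with the same area
print(npix, round(area_deg2, 4), round(side_arcmin, 1))
```

For N_side = 128 this gives 196608 pixels of about 0.21 deg^2 each, so a field of a few tens of square degrees is split into a few hundred independent coadd jobs.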

The dependency of jobs is set between the forced photometry tasks and the coadds, as the latter require the photometry to be complete at all bands and layers prior to the combination. In contrast, mask jobs can be processed independently, as these do not require inputs from the forced photometry or coadd tasks.

As MEMBA jobs do not have such intense access to the archive and interact mostly with the database, we can increase the number of parallel pilots up to 200. The CPU time of a MEMBA run is dominated by the forced photometry jobs. The largest fields are made of > 30,000 images. Each job lasts approximately 15 minutes, resulting in a total processing time of one day for every 10 deg^2.
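The figures quoted above can be combined into an idealized wall-time estimate. The sketch below assumes one forced photometry job per image, identical job durations, pilots that are always busy, and no scheduling, database or coadd overhead, so it is only a back-of-the-envelope estimate and not a measured performance; the function name is hypothetical.

```python
import math


def wall_time_hours(n_jobs, minutes_per_job, n_pilots):
    """Idealized wall time: jobs run in rounds of n_pilots, each round lasting one job."""
    rounds = math.ceil(n_jobs / n_pilots)
    return rounds * minutes_per_job / 60.0


# Numbers quoted in the text: 30,000 images, 15 minutes per job, 200 pilots.
hours = wall_time_hours(n_jobs=30_000, minutes_per_job=15, n_pilots=200)
print(hours)
```

Under these assumptions the forced photometry of a 30,000-image field takes 37.5 hours of wall time, the same order of magnitude as the quoted rate of one day per 10 deg^2.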

APPENDIX B: DATABASE SCHEMA

In this appendix we describe the PAUS database schema. The current database contains the following tables in the following groups:

• Photometric Calibration

– image_zp: Contains the image zero-point measurements for each photometry-calibration method.
– phot_method: Method used during photometry for image calibration.
– phot_zp: Photometric zero-points.
– star_photometry: Contains the individual photometry measurements for each star matched with the reference catalogue during the Nightly photometry.
– star_zp: Contains the individual zero-point measurements for each star matched with the reference catalogue during the Nightly photometry.

Figure A3. The dependency chart of the MEMBA pipeline. Empty boxes define jobs creating sub-jobs while filled boxes refer to jobs processing data at the computing farm. The parallelization takes place at three levels: forced photometry, coadds and survey masks.

• MEMBA

– forced_aperture: Single-epoch forced aperture photometry.
– forced_aperture_coadd: Coadd forced aperture fluxes per band.
– forced_aperture_report: Reports for forced aperture inspector.
– mask_image: Contains the list of mask images (band and field).
– memba_ref_cat: Reference catalogue used in each MEMBA production.

• Crosstalk

– crosstalk_diff: Crosstalk differences between raw images.
– crosstalk_ratio: Crosstalk ratios between amplifiers.

• Photo-z

– photoz_bcnz: Photometric redshifts from BCNz code.

• Nightly

– detection: Contains the detections measured directly on the image after the Nightly data reduction.
– image: Contains the list of images associated to the mosaics (CCD and single amplifier images).
– mosaic: Contains the list of mosaic exposure images (raw and reduced).
– obs_set: Contains the list of observation sets registered in the database.
– obs_set_project: Projects associated to observation sets.
– project: List of projects associated to PAUCam observations.
– quality_control: Contains quality control entries measured during the data reduction process.

• Survey Strategy

– ss_target: Survey Strategy targets from observations.

• External

– cfhtlens: External CFHTLenS catalogue for forced photometry.
– cosmos: External table from zCOSMOS (DR3). Sources with accurate redshifts for forced photometry and validation.
– deep2: The DEEP2 DR4 redshift catalog.
– gaia_dr2: Gaia DR2 stellar catalogue.
– kids: KiDS KV450-G9 reference catalogue.
– sdss_spec: SDSS Spectra catalogue.
– sdss_spec_photo: External table from SDSS DR12 (Spec_Photo view). Sources with spectrum for forced photometry and validation.
– sdss_star: External table from SDSS DR12 (Star view). Stars for simulation and calibration.

• Synthetic Photometry

– match_to_spec: Match table between forced aperture catalogues and spectra catalogues.
– spec_conv: Contains the convolved fluxes derived from spectra observations from external surveys (SDSS, COSMOS and DEEP2).
– synth_sdss: Synthetic photometry over SDSS spectra (over COSMOS and W1).
– synth_vipers: Synthetic photometry over VIPERS spectra (over W1).

• Brownthrower (Operation tables)

– dependency: Tracks the dependency between Brownthrower jobs.
– job: Tracks the list of Brownthrower computing jobs (Operation table).
– tag: Contains tags for Brownthrower jobs (Operation table).

• Production

– production: Tracks the different processing production runsfor all pipelines.

APPENDIX C: SYNTHETIC SPECTRA AGAINST PAUS PHOTOMETRY EXAMPLES

In this section we include some interesting reference synthetic spectra samples against the PAUS narrow-band measurements after all the processing described in this study. This highlights only a star, two galaxies and a QSO, but it illustrates the possibilities of PAUS and the validation with this synthetic reference method.

[Figure B1 content: tables with their most relevant columns. Legend of table groups: photo calibration, memba, crosstalk, photo-z, nightly, survey strategy, external, synthetic photometry, brownthrower, production.]

mosaic: id, production_id, archivepath, filename, kind, exp_num, ra, dec, mean_psf_fwhm, detrend_status, astro_status, psf_model_status, photo_status, ...
image: id, mosaic_id, archivepath, filename, image_num, ccd_num, amp_num, filter, zp_nightly, zp_nightly_err, psf_fwhm, bkg_mean, psf_stars, transparency, ...
quality_control: id, job_id, ref, qc_pass, ...
detection: id, image_id, flux_auto, flux_err_auto, ...
obs_set: id, operator, rjd_start, rjd_stop, ...
project: id, name, description, contact_name, ...
obs_set_project: project_id, obs_set_id
star_zp: id, star_photometry_id, zp, zp_error, chi2, calib_method
image_zp: id, image_id, zp, zp_error, phot_method, calib_method, transparency, n_stars
star_photometry: id, image_id, ref_cat, ref_id, x_image, y_image, flux, flux_err, flags, bg, bg_err, phot_method_id
phot_zp: id, production_id, zp, band, date
phot_method: id, extraction_code, extraction_method, background_method, ...
ss_target: target_id, field, pointing_id, dither_step, filter_tray, ra, dec, status, exp_num, relative_sn, ...
production: id, input_production_id, pipeline, release, software_version, job_id, comments, created
photoz_bcnz: production_id, ref_id, zb, odds, pz_width, zb_mean, chi2, n_band, ebv, qz, best_run, iteration
job: id, super_id, name, status, token, config, input, output, ts_created, ...
tag: job_id, name, value
dependency: super_id, parent_id, child_id
crosstalk_ratio: ccd_num_orig, amp_num_orig, ccd_num_dest, amp_num_dest, production_id, ratio
crosstalk_diff: image_orig_id, image_dest_id, production_id, background_orig_all, background_dest_all, background_dest_sat, npix_orig_sat
forced_aperture: production_id, image_id, ref_id, pixel_id, aperture_x, aperture_y, aperture_a, aperture_b, aperture_theta, flux, flux_error, flag, annulus_a_in, annulus_a_out, annulus_b_in, annulus_b_out, ...
mask_image: production_id, archivepath, filename, field, band, n_images, pixel_scale
forced_aperture_coadd: production_id, ref_id, band, flux, flux_error, chi2, n_coadd, run
forced_aperture_report: id, fac_id, fa_id, band, report_status, user, insert_date
memba_ref_cat: production_id, ref_cat
synth_sdss: mjd, fiberid, specobj_id, ra, dec, sn_median_r, z, zerr, class, flux_pau_NB455, flux_err_pau_NB455, band_fraction_pau_NB455, ...
synth_vipers: id_IAU, num, alpha, delta, selmag, errselmag, zspec, zflg, photoMask, flux_pau_NB455, flux_err_pau_NB455, band_fraction_pau_NB455, ...
match_to_spec: ref_id, spec_id, ref_cat, None
spec_conv: id, spec_id, spec_cat, band, instrument, flux, flux_err
cfhtlens: paudm_id, seqnr, flux_auto_theli, fluxerr_auto_theli, mag_auto, magerr_auto_theli, alpha_j2000, delta_j2000, ...
cosmos: paudm_id, ra, dec, zp_gal, I_auto, zspec, r50, sersic_n_gim2d, ...
deep2: objno, ra, dec, obj_type, magi, magierr, zbest, zerr, ...
kids: paudm_id, seqnr, flux_auto_theli, fluxerr_auto_theli, mag_auto, magerr_auto_theli, alpha_j2000, delta_j2000, ...
sdss_star: objID, thingId, ra, dec, raErr, decErr, clean, psfMag_i, psfMagErr_i, ...
gaia_dr2: source_id, duplicated_source, ra, ra_err, dec, dec_err, pmra, pmdec, phot_g_mean_mag, phot_g_mean_flux, ...
sdss_spec: id, survey, plate_id, mjd, fiber, ra, dec, redshift, sn, type_class
vipers_spectro_pdr2: id_iau, num, alpha, delta, selmag, errselmag, pointing, quadrant, zspec, zflg, ...
sdss_spec_photo: objID, specObjID, mjd, plate, fiberID, survey, ra, dec, z, zErr, zWarning
gaia: source_id, ra, dec, ra_err, dec_err, phot_g_mean_mag, phot_g_mean_flux, phot_g_mean_flux_error, ref_epoch

Figure B1. PAUS database schema.

[Figure C1 panels, all against wavelength (Å) from 4000 to 9000: input spectra (flux in erg/cm^2/s/Å); target bands (transmission); synthetic + PAUS photometry (AB mag; series: spectra, synth, synth (masked), PAUS); SN (series: flux, noise, flux error, sn). Plot title: SDSS spectra 0501-52235-0461 - SN: 8.96 z: 0.00.]

Figure C1. An M3 star observed by PAUS (COSMOS-79081) with reference SDSS Spectrum (Plate 501 - MJD 52235 - FiberID 461 - SNR 8.95).

[Figure C2 panels, all against wavelength (Å) from 4000 to 10000: input spectra (flux in erg/cm^2/s/Å); target bands (transmission); synthetic + PAUS photometry (AB mag; series: spectra, synth, PAUS); SN (series: flux, noise, flux error, sn). Plot title: SDSS spectra 4737-55630-0058 - SN: 8.37 z: 0.36.]

Figure C2. A red galaxy observed by PAUS (COSMOS-3956) with reference SDSS Spectrum (Plate 4737 - MJD 55630 - FiberID 58 - SNR 8.37) at redshift 0.362.

[Figure C3 panels, all against wavelength (Å) from 4000 to 9000: input spectra (flux in erg/cm^2/s/Å); target bands (transmission); synthetic + PAUS photometry (AB mag; series: spectra, synth, PAUS); SN (series: flux, noise, flux error, sn). Plot title: SDSS spectra 0500-51994-0587 - SN: 19.41 z: 0.12.]

Figure C3. An Hα star-forming galaxy observed by PAUS (COSMOS-67024) with reference SDSS Spectrum (Plate 500 - MJD 51994 - FiberID 587 - SNR 19.41) at redshift 0.122.

[Figure C4 panels, all against wavelength (Å) from 4000 to 9000: input spectra (flux in erg/cm^2/s/Å); target bands (transmission); synthetic + PAUS photometry (AB mag; series: spectra, synth, synth (masked), PAUS); SN (series: flux, noise, flux error, sn). Plot title: SDSS spectra 0501-52235-0462 - SN: 9.31 z: 2.00.]

Figure C4. A QSO observed by PAUS (COSMOS-80935) with reference SDSS Spectrum (Plate 501 - MJD 52235 - FiberID 462 - SNR 9.31) at redshift 2.00.


APPENDIX D: FLAGGING

In this section we describe the list of flags that any source can have throughout the whole data processing of the PAUS data management system. Each flag is assigned to a bit, so that a single integer encodes the unique list of flags affecting each source.

This paper has been typeset from a TEX/LATEX file prepared by the author.

Flag            Value (bit)    Origin               Level
Crowded         1 (1)          SExtractor           Source
Merged          2 (2)          SExtractor           Source
Halo            4 (3)          SExtractor           Source
Truncated       8 (4)          SExtractor           Source
Deblended       16 (5)         SExtractor           Source
Crosstalk       32 (6)         Nightly Mask         Pixel
Scatter-light   64 (7)         Nightly Mask         Pixel
Extinction      128 (8)        Nightly Photometry   Image
Zero-point      256 (9)        Nightly Photometry   Image
Cosmetics       512 (10)       Nightly Mask         Pixel
Saturated       1024 (11)      Nightly Mask         Pixel
Cosmics         2048 (12)      Nightly Mask         Pixel
Vignetted       4096 (13)      Nightly Mask         Pixel
Discordant      8192 (14)      MEMBA Photometry     Source
Edge            16384 (15)     MEMBA Photometry     Source
Distortion      32768 (16)     MEMBA Photometry     Source
Noisy           65536 (17)     MEMBA Photometry     Source
Astrometry      131072 (18)    MEMBA Photometry     Source

Table D1. The list of flags used across the PAUS data management system. The table specifies the flag reason, the assigned value and bit, the software/pipeline origin and the level at which the flag is defined.
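Since each flag occupies one bit, the integer stored for a source can be unpacked with bitwise operations. The following minimal sketch uses the values of Table D1; the dictionary and function names are illustrative and are not PAUdm identifiers.

```python
# Flag values as listed in Table D1 (name -> integer value of its bit).
FLAG_BITS = {
    "Crowded": 1, "Merged": 2, "Halo": 4, "Truncated": 8, "Deblended": 16,
    "Crosstalk": 32, "Scatter-light": 64, "Extinction": 128, "Zero-point": 256,
    "Cosmetics": 512, "Saturated": 1024, "Cosmics": 2048, "Vignetted": 4096,
    "Discordant": 8192, "Edge": 16384, "Distortion": 32768, "Noisy": 65536,
    "Astrometry": 131072,
}


def decode_flags(value):
    """Return the names of all flags whose bit is set in the integer value."""
    return [name for name, bit in FLAG_BITS.items() if value & bit]


def has_flag(value, name):
    """Check a single flag without decoding the whole integer."""
    return bool(value & FLAG_BITS[name])


combined = FLAG_BITS["Crowded"] | FLAG_BITS["Scatter-light"] | FLAG_BITS["Saturated"]
print(combined, decode_flags(combined))
```

A source carrying the Crowded, Scatter-light and Saturated flags is stored as 1 + 64 + 1024 = 1089, and decoding 1089 recovers exactly those three names.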
