Whales Habitat mapping

GREEN MATRIX UPLOADED: A NEW ECOSYSTEM VARIABLE FOR MARINE RESOURCES SECTOR

GREENUP use case: Whales Habitat mapping

Deliverable: Final Report

Date: January 5, 2017

Lead GREENUP partner:

Dr Mónica Almeida e Silva MARE – Marine and Environmental Sciences Centre IMAR - Institute of Marine Research Rua Frederico Machado, 4 9901-862 Horta Portugal Phone: (+351) 292200400 Email: [email protected]; [email protected] http://www.whales.uac.pt/

Authors: S Pérez Jorge, M Tobeña, MA Silva

The objective of GREENUP is to extend the CMEMS products catalogue by developing a new product covering a key ecosystem component at the mid-trophic level (MTL), i.e., the micronekton, to better address the Marine Resources area of benefit. Two applications benefiting from the CMEMS products and the newly proposed MTL variables illustrate the value of these developments for the research, management and monitoring of marine resources. This report describes the use case developed by the Institute of Marine Research at the University of the Azores.

Content

1 Executive Summary
2 Introduction
3 Methodology
3.1 Whale sighting data
3.1.1 Whale sightings and effort data
3.1.2 Effort segments
3.1.3 Habitat and Micronekton variables
3.1.4 Species Distribution Models (SDMs)
3.2 Whale tracking data
3.2.1 Telemetry data
3.2.3 Simulated correlated random walks (CRWs)
3.2.4 Sample environmental variables
3.2.5 Telemetry-based habitat models
3.3 Predictive performance of the SDM-i vs SDM-t
4 Results
4.1 Whale sighting data
4.1.1 Contemporaneous models
4.1.1.1 Fin whales
4.1.1.2 Blue whales
4.1.1.3 Sei whales
4.2 Whale tracking data
4.2.2 Climatological models
4.2.2.1 Fin whales
4.2.2.2 Blue whales
4.2.2.3 Sei whales
5 Discussion
5.1 Predictive performance of SDM-i versus SDM-t
5.1.1 Whale sighting data
5.1.2 Whale tracking data
5.2 Importance of SEAPODYM variables
5.3 Limitations of the SEAPODYM product to improve models’ predictive performance
5.4 SEAPODYM-MTL versus SEAPODYM-ARMOR3D
6 Conclusions
7 References

GREENUP – Whales habitat mapping

1 Executive Summary

The main objective of the Whales Habitat Mapping use case was to demonstrate the potential of enhanced SEAPODYM-MTL products developed by GREENUP to study and monitor the distribution and habitat use of large marine predators. More specifically, the use case aimed to use the outputs of the micronekton numerical model to develop geostatistical models predicting the ocean-basin to regional-scale distribution of baleen whales, a major group of micronekton predators with a broad distribution and transoceanic movements across the North Atlantic Ocean. The predictive power of these models was then compared to that of models built only with physiographic and oceanographic predictors.

These objectives were achieved through a number of key activities:

1. Creating a dataset of occurrences of target whale species

We created two datasets of whale occurrences, one from sighting data and one from satellite telemetry data. Whale sightings and effort information collected around the Azores by fisheries observers over the period 2001-2015 were collated, cleaned and integrated into a Geographic Information System. Because fisheries observer data suffer from biased sampling across space and time, we developed a framework to standardize survey effort information and compute the number of whale groups and individuals sighted per unit of effort. Switching state-space models were applied to raw satellite whale tracks to account for errors in the location estimates and provide a position every 12 h, creating true whale occurrences (case points). A correlated random walk model was then applied to each whale track to simulate “pseudo-absences” (control points).

2. Creating a dataset of environmental and micronekton predictors

Values of environmental (physiographic, oceanographic) and micronekton predictor variables for the study areas and periods were assembled from different sources. The original products were projected to a 10 km resolution grid. We computed 8-day climatologies for the dynamic products by averaging the available time series. Finally, we obtained covariate values at the centroids of 10-km survey segments for the whale sighting data, and at all case and control points for every tracked whale.

3. Developing Species Distribution Models (SDMs)

Generalized Additive Models (GAMs) and Generalized Additive Mixed Models (GAMMs) were fitted to the whale sighting and whale tracking data, respectively, to develop SDMs. For each whale species and data type (sightings and tracking), we fitted 8 separate models combining two predictor sets (environmental variables only, or environmental and micronekton variables), two versions of the SEAPODYM products (MTL and ARMOR3D), and two temporal resolutions (contemporaneous and 8-day climatological estimates). For each of the 48 model formulations, we plotted the functional relationships of covariates, the predicted density surfaces/spatial distribution and the corresponding standard deviation estimates (representing the variability in the spatial model).

4. Assessing the predictive ability of novel micronekton products

We used predictive statistics and inspected modelled density/distribution maps to analyse the relative performance of models incorporating micronekton products and to quantify the improvement in predictive power over models containing only physiographic and oceanographic variables. This comparative analysis was carried out for the two versions of SEAPODYM, and for models based on contemporaneous and climatological data.

2 Introduction

Understanding current and predicting future distributions of marine megafauna is essential to gain insights into their ecology and population dynamics, to implement conservation and management measures and assess their effectiveness, and to project the future of the ecosystem services and goods they provide (food provisioning, top-down control of the food web, carbon regulation, tourism and leisure).

Species distribution models (SDMs), that statistically relate animal observations to environmental variables, have been successfully used to explain and predict distribution patterns of marine predators in a variety of ecosystems and at multiple spatio-temporal scales. Developing SDMs that can effectively forecast animals’ distribution depends on the availability of ecologically meaningful environmental data at the appropriate spatial and temporal resolutions.

Marine megafauna is supported by the productivity of primary and secondary consumers and their distribution and movements are tied closely to prey availability (Benoit-Bird et al., 2013; Boyd et al., 2015). Yet, most SDMs rely on physiographic variables and a handful of remotely sensed or in situ oceanographic measurements as proxies for prey distribution (Guisan and Zimmermann, 2000; Scales et al., 2015; Mannocci et al., 2017), because prey data are rarely available at the scales relevant for these predators. In addition, the spatial and temporal resolution of the data used to build SDMs is a critical aspect, as animals interact with dynamic oceanographic and biological processes that vary at scales from meters to hundreds or thousands of kilometres and from seconds to decades (Mannocci et al., 2017).

Recently developed ecosystem models can simulate the ocean-basin-scale spatial distribution and dynamics of epi- and mesopelagic prey, providing an opportunity to improve distribution models of pelagic predators. SEAPODYM-MTL is a multi-species dynamic model that uses coupled biophysical-biogeochemical models to predict the biomass and production of six mid-trophic level (micronekton) functional groups in the water column down to ~1000 m depth (Lehodey et al., 2010). SEAPODYM-MTL has been used successfully to model the dynamics of tuna populations (Lehodey et al., 2008), the habitat use and movements of loggerhead turtles (Abecassis et al., 2015), and the foraging habitats and density of several cetaceans (Lambert et al., 2014; Roberts et al., 2016). Although all these authors claimed that incorporating model-derived prey data improved predator distribution models, no attempt was made to test this claim specifically.

The main objective of the Whales Habitat Mapping use case was to demonstrate the potential of novel SEAPODYM-MTL products developed by GREENUP to study and monitor the distribution and habitat use of large marine predators. Large baleen whales are an excellent group to evaluate the predictive potential of micronekton products. They feed on zooplankton and small epipelagic and mesopelagic schooling fish and squids, major components of the micronekton community. Their spatial distribution and movements are closely linked to the distribution of their prey, with individuals concentrating in areas of predictable prey aggregations and where prey density is higher (Croll et al. 2005; Block et al. 2011, Prieto et al. 2016).

We assessed whether novel SEAPODYM-MTL products improved forecasts of the distribution of three baleen whale species (Balaenoptera musculus, B. physalus, B. borealis), by comparing the predictive ability of SDMs with micronekton products (hereafter referred to as integrative SDMs, SDM-i) relative to models with only oceanographic and physiographic predictors (traditional SDMs, SDM-t). We tested two micronekton products: SEAPODYM-MTL, which uses physical forcing variables produced from the GLORYS.2v3 ocean reanalysis, and SEAPODYM-ARMOR3D, which uses outputs from ARMOR3D, a 3D thermohaline field computed from in situ temperature and salinity profiles and satellite altimeter data and SST. We evaluated the predictive power of these micronekton products at different spatial and temporal scales and resolutions, by examining spatial congruence and performance statistics of SDMs developed for regional and cross-regional scales, and from contemporaneous and climatological data.

3 Methodology

3.1 Whale sighting data

3.1.1 Whale sightings and effort data

Whale sightings and effort data were recorded by the Azorean Fisheries Observer Program (POPA) on board tuna fishing vessels from 2001 to 2015 (Silva et al., 2002, 2014) (Fig. 1). Cetacean surveying is conducted only when the vessels are cruising or searching for tuna and the sea state is 3 or less on the Beaufort scale. The observer searches for cetaceans from the ship’s flying bridge (8 m above the water) by naked eye and using binoculars. During on-effort periods, observers record the time, position and speed of the vessel every 30 minutes or every time the vessel changes course (>30°). At each cetacean sighting, observers record the time, location, species (including the level of identification reliability), number of individuals and behaviour.

Fig. 1. Cetacean on-effort positions of POPA transects between 2001 and 2015 around the Azores.

Data were collected from May to November, but for this study we only analysed data from May to August, due to the low number of sightings of baleen whales outside this period. We also discarded sightings outside the study area (see below the section ‘Effort segments’ for the definition of the study area) and sightings that were clearly erroneous (due to transcription typos or recording errors). Some sightings were also discarded as a result of splitting the on-effort track lines into equal length segments (refer to ‘Effort segments’ below).

3.1.2 Effort segments

The study area was defined between 35°–41° N and 22°–33° W, containing 97% of on-effort vessel positions (117,391 of a total of 120,874 positions collected) during the study period (May-August, 2001-2015). Effort outside that box was not considered for this study (Fig. 1).

To minimize spatial and temporal biases arising from the unequal allocation of survey effort, vessel track lines were split into continuous-effort segments of 10 km. We started by computing the length of all on-effort boat tracks and excluding all tracks shorter than 8 km. Tracks 8 to 10 km long were retained as segments, and tracks longer than 10 km were split into ‘n’ segments of 10 km, until only one segment <10 km remained. When the remainder was <5 km, it was distributed uniformly among the other n segments (resulting in segments slightly longer than 10 km). This protocol resulted in 33,995 segments with a mean length of 9.95 km (SD = 1.29 km) and a range of 8-15 km (Fig. 2).
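The splitting rule described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function name and parameters are hypothetical, and the handling of remainders of 5 km or more is an assumption, since the text only describes what happens to remainders shorter than 5 km.

```python
def split_track(length_km, target=10.0, min_len=8.0, spread_below=5.0):
    """Return the segment lengths (km) for one continuous on-effort track."""
    if length_km < min_len:
        return []                       # tracks shorter than 8 km are excluded
    if length_km <= target:
        return [length_km]              # 8-10 km tracks are kept as one segment
    n = int(length_km // target)        # number of full 10 km segments
    remainder = length_km - n * target
    if remainder < spread_below:
        # Remainders under 5 km are spread evenly over the n full segments.
        return [target + remainder / n] * n
    # ASSUMPTION: a remainder of 5 km or more is kept as its own segment;
    # the report does not state how these remainders were treated.
    return [target] * n + [remainder]


print(split_track(34.0))   # three segments of about 11.33 km each
```

Under this rule, a 14 km track becomes a single 14 km segment, which is in line with the reported upper limit of 15 km for segment length.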

Fig. 2. Effort segments retained for developing models, with the sightings of baleen whale target species.

The total number of sightings and individuals of each whale species was computed for each ~10-km effort segment. Subsequently, to avoid duplication of sightings made by different fishing vessels, we selected those segments that had contiguous segments at <5 km within the same day. From this selection, we only retained the segments that, in decreasing order of priority, had more blue whale sightings, higher species diversity, or a larger number of sightings or individuals, so as to retain the maximum number of sightings available for modelling. Blue whale sightings were prioritised because there were fewer sightings than for the other two species. Finally, we calculated the centroid of each effort segment, which was used to extract values of habitat and micronekton variables.

3.1.3 Habitat and Micronekton variables

A set of 32 candidate variables (Table 1) was selected based on the species’ ecological preferences and on regional oceanography and physiography (Lehodey et al., 2010; Mannocci et al., 2014). Depending on the model run, micronekton and zooplankton were derived either from SEAPODYM-MTL or from SEAPODYM-ARMOR3D. Descriptions and sources of all other variables are presented in Table 1.

Baleen whales forage mainly on prey available within the epipelagic layer (Croll et al., 2005). Therefore, biomass and production outputs from the SEAPODYM ocean model pertaining to the bathypelagic and migrant bathypelagic prey, which are not available within the upper layer of the water column, were not included as candidate variables in the models.

Table 1. Description of candidate variables for habitat modelling.

Environmental variables

| Environmental variable | Acronym | Transformation | Spatial/temporal resolution | Units | Source |
|---|---|---|---|---|---|
| Depth | Depth | none | 1 arc-minute/static | m | National Geophysical Data Center (NGDC), National Oceanic and Atmospheric Administration (NOAA), http://www.ngdc.noaa.gov/mgg/global/global.html (Amante & Eakins 2008) |
| Seamounts | None | none | 10 meters | unitless | www.int-res.com/articles/suppl/m357p017_app.pdf (Morato et al. 2008) |
| Euclidean distance to seamount | d-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Euclidean distance to large seamount | dL-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Euclidean distance to small seamount | dS-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Sea surface temperature | sst | | 0.25 degrees | degrees_celsius | CMEMS ocean reanalysis (1998-2015)* |
| Sea water potential temperature | temperature | | 0.25 degrees | degrees_celsius | CMEMS ocean reanalysis (1998-2015)* |
| Euphotic depth | ZEU | | 0.25 degrees | m | CMEMS ocean reanalysis (1998-2015)* |
| Mass concentration of chlorophyll a in sea water (glo glorys2 vgpm 025x7d) | CHLA | | 0.25 degrees | mg/m3 | CMEMS ocean reanalysis (1998-2015)* |
| Net primary production of carbon (glo glorys2 vgpm 025x7d) | NPP | | 0.25 degrees | mmolC/m2/d | CMEMS ocean reanalysis (1998-2015)* |

Derived environmental variables

| Derived environmental variable | Acronym | Transformation | Spatial/temporal resolution | Units | Original variable |
|---|---|---|---|---|---|
| Slope within a 3x3 pixel kernel | Slope | log10 | 1 arc-minute/static | degrees from the horizontal | Depth |
| Euclidean distance to shoreline | Distance to shore | square root | 1 arc-minute/static | m | Depth |
| Euclidean distance to 200 meters isobath | Dist(200) | square root | 1 arc-minute/static | m | Depth |
| Euclidean distance to 2000 meters isobath | Dist(500) | square root | 1 arc-minute/static | m | Depth |
| Euclidean distance to seamount | d-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Euclidean distance to large seamount | dL-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Euclidean distance to small seamount | dS-Seamounts | none | 1 arc-minute/static | m | Seamounts |
| Time-lagged Chlorophyll-a concentration (_1 month) | CHLA (_1 m) | | 0.25 degrees | mg/m3 | CHLA |
| Time-lagged Chlorophyll-a concentration (_2 month) | CHLA (_2 m) | | 0.25 degrees | mg/m3 | CHLA |
| Time-lagged NPP (_1 month) | NPP (_1 m) | | 0.25 degrees | mmolC/m2/d | NPP |
| Time-lagged NPP (_2 month) | NPP (_2 m) | | 0.25 degrees | mmolC/m2/d | NPP |
| Euclidean distance to SST fronts | Ed_Fronts | Canny algorithm, MGET | 0.25 degrees | m | Sea surface temperature |
| Euclidean distance to Eddies | Ed_Eddies | Okubo-Weiss, MGET | 0.25 degrees | m | Sea surface height, AVISO, MADT |

Outputs from SEAPODYM ocean model

| Output | Acronym | Transformation | Spatial/temporal resolution | Units | Source |
|---|---|---|---|---|---|
| Epipelagic micronekton (pb) | epi_pb | | 0.25 degrees | g/m2 | SEAPODYM ocean model* |
| Epipelagic micronekton (pp) | epi_pp | | 0.25 degrees | g/m2/d | SEAPODYM ocean model* |
| Mesopelagic micronekton (pb) | meso_pb | | 0.25 degrees | g/m2 | SEAPODYM ocean model* |
| Mesopelagic micronekton (pp) | meso_pp | | 0.25 degrees | g/m2/d | SEAPODYM ocean model* |
| Migrant mesopelagic micronekton (pb) | mmeso_pb | | 0.25 degrees | g/m2 | SEAPODYM ocean model* |
| Migrant mesopelagic micronekton (pp) | mmeso_pp | | 0.25 degrees | g/m2/d | SEAPODYM ocean model* |
| High migrant bathypelagic micronekton (pb) | hmbathy_pb | | 0.25 degrees | g/m2 | SEAPODYM ocean model* |
| High migrant bathypelagic micronekton (pp) | hmbathy_pp | | 0.25 degrees | g/m2/d | SEAPODYM ocean model* |
| Lower trophic level plankton (pb) | pk_pb | | 0.25 degrees | g/m2 | SEAPODYM ocean model* |
| Lower trophic level plankton (pp) | pk_pp | | 0.25 degrees | g/m2/d | SEAPODYM ocean model* |

* For models using the new SEAPODYM-ARMOR3D micronekton simulation, the source of the physical variables is the ARMOR3D reanalysis.

3.1.4 Species Distribution Models (SDMs)

We used Generalized Additive Models (GAMs) to examine the relationship between whale sightings and explanatory variables. GAMs have been used extensively and with good results to predict the non-linear relationships between top predators and environmental covariates (Ferguson et al., 2006; Becker et al., 2010; Forney et al., 2012). Our approach was to develop models containing SEAPODYM prey outputs and relevant environmental variables commonly used in cetacean SDMs (SDM-i), and to test these models against those containing only environmental variables (SDM-t), to assess if and to what extent inclusion of model-derived prey data improved the performance of SDMs.

We fitted both “contemporaneous models”, i.e., models using dynamic explanatory variables within the same week of any given effort segment, and “climatological models”, i.e., models with dynamic variables based on weekly averages from the study period (2001-2015). Overall, we fitted 8 separate models for each whale species, combining different sources of variables and model outputs (see Table 1 for interpretation of variable and source acronyms):

1) Contemporaneous model with CMEMS environmental variables;
2) Climatological model with CMEMS environmental variables;
3) Contemporaneous model with CMEMS environmental variables + SEAPODYM-MTL outputs;
4) Climatological model with CMEMS environmental variables + SEAPODYM-MTL outputs;
5) Contemporaneous model with ARMOR-3D environmental variables;
6) Climatological model with ARMOR-3D environmental variables;
7) Contemporaneous model with ARMOR-3D environmental variables + SEAPODYM-ARMOR3D outputs;
8) Climatological model with ARMOR-3D environmental variables + SEAPODYM-ARMOR3D outputs.

The following scheme (Fig. 3) summarizes the types of models created:

Fig. 3. Workflow of model building and process to analyse the SEAPODYM-MTL and SEAPODYM-ARMOR3D outputs.
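The eight model formulations per species are the full crossing of three two-level choices: the physical forcing, whether prey outputs are included, and the temporal resolution. A minimal sketch of that design is given below; the function and variable names are illustrative only and do not come from the study's code.

```python
from itertools import product

FORCINGS = {"CMEMS": "SEAPODYM-MTL", "ARMOR-3D": "SEAPODYM-ARMOR3D"}
RESOLUTIONS = ["contemporaneous", "climatological"]


def model_grid():
    """List the (type, temporal resolution, predictors) of every formulation."""
    grid = []
    for forcing, with_prey, resolution in product(FORCINGS, [False, True], RESOLUTIONS):
        kind = "SDM-i" if with_prey else "SDM-t"
        predictors = f"{forcing} environmental variables"
        if with_prey:
            predictors += f" + {FORCINGS[forcing]} outputs"
        grid.append((kind, resolution, predictors))
    return grid


for row in model_grid():
    print(row)
```

Eight formulations for each of three species and two data types (sightings and tracking) give the 48 model formulations mentioned in the Executive Summary.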

Prior to running the models, we investigated the collinearity between pairs of covariates (first for the SEAPODYM-MTL variables and a second time for the new SEAPODYM-ARMOR3D variables), since collinearity has been shown to increase type II errors (Zuur et al., 2010). First, we used Pearson’s correlation coefficient to identify highly correlated covariates (coefficient higher than 0.7). Second, we ran Generalized Linear Models (GLMs) with one variable at a time and used the Akaike Information Criterion (AIC) to check how well each predictor explained the observed response variable. From each pair of highly correlated variables, the variable with the lowest AIC value was retained for analysis, so as to keep the variables with the most explanatory ability. This procedure was applied for each species (with the SEAPODYM-MTL and SEAPODYM-ARMOR3D variables), as each had a different set of predictor variables (Table 1; Supplementary material 1.1.1).

We then ran GAMs using the total number of individual whales per segment as the response variable, the natural logarithm of the survey effort as an offset, and the covariates as predictors. The smoothing splines were limited to a maximum of 5 degrees of freedom to avoid overfitting and preserve the ecological interpretability of functional relationships. A shrinkage process was used to select a subset of predictors, dropping terms on the basis of their estimated degrees of freedom (<0.85) and p-value (>0.05) (Roberts et al., 2016). We used a Tweedie distribution, which is useful for datasets with a high proportion of zeros (Shono, 2008; Peel et al., 2012). The restricted maximum likelihood (REML) optimization method was used to fit the models (Wood, 2011).
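The collinearity screening step can be illustrated with a short sketch. It assumes that the single-variable AIC values have already been obtained from the GLMs (the numbers below are invented for illustration), that the 0.7 threshold applies to the absolute value of the correlation, and that pairs are processed in a simple greedy pass; none of these details are stated in the report.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_collinear(covariates, aic, threshold=0.7):
    """From each highly correlated pair, drop the covariate with the higher AIC."""
    dropped = set()
    for a, b in combinations(sorted(covariates), 2):
        if a in dropped or b in dropped:
            continue
        # ASSUMPTION: the threshold is applied to |r|.
        if abs(pearson(covariates[a], covariates[b])) > threshold:
            dropped.add(a if aic[a] > aic[b] else b)
    return [name for name in covariates if name not in dropped]


# Hypothetical covariate values at five segment centroids and made-up AICs.
covariates = {
    "sst": [15.0, 16.0, 17.0, 18.0, 19.0],
    "temperature": [14.8, 16.1, 17.0, 18.2, 18.9],
    "depth": [1000.0, 300.0, 2500.0, 800.0, 1500.0],
}
aic = {"sst": 520.0, "temperature": 505.0, "depth": 530.0}
retained = screen_collinear(covariates, aic)
print(retained)
```

In this toy example `sst` and `temperature` are almost perfectly correlated, so only the one with the lower AIC is kept, while `depth` is retained because it is only weakly correlated with both.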

We used a cross-validation procedure to assess the accuracy of our final models for each species, splitting our dataset into training (75% of records, 2001-2011) and test (remaining 25% of records, 2012-2015) data. The predictive performance of the SDM-i and SDM-t was validated using the concordance index (C-index), which can be applied to continuous and categorical data. For a binary response, this index is equal to the area under the Receiver Operating Characteristic curve (AUC). It is generally assumed that a C-index of 0.7–0.8 indicates “moderate discrimination”, 0.8–0.9 “good discrimination” and 0.9–1 “excellent discrimination” (Harrell et al., 1996). Models were built within the R environment using several R packages (R Core Team). The modelling procedure is summarized in Fig. 4.
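As an illustration of what the C-index measures, the sketch below computes it directly from pairs of observations: among all pairs with different observed values, it is the proportion in which the model ranks the pair in the same order, with tied predictions counted as one half. This is a generic textbook formulation written for this example, not the R routine used by the authors.

```python
def c_index(observed, predicted):
    """Concordance index between observed values and model predictions."""
    concordant = 0.0
    comparable = 0
    n = len(observed)
    for i in range(n):
        for j in range(i + 1, n):
            if observed[i] == observed[j]:
                continue                      # tied outcomes are not comparable
            comparable += 1
            hi, lo = (i, j) if observed[i] > observed[j] else (j, i)
            if predicted[hi] > predicted[lo]:
                concordant += 1.0             # pair ranked in the right order
            elif predicted[hi] == predicted[lo]:
                concordant += 0.5             # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no pairs with different observed values")
    return concordant / comparable


# Hypothetical whale counts on four test segments and model predictions.
print(c_index([0, 0, 1, 2], [0.1, 0.4, 0.3, 0.9]))
```

A value of 0.5 corresponds to random ranking and 1 to perfect ranking, which is why the thresholds quoted above start at 0.7.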

Fig. 4. Workflow of the habitat modelling procedure applied to the whale sighting data.

3.2 Whale tracking data

3.2.1 Telemetry data

We used data from a total of 33 location-only satellite tags (SPOT5, Wildlife Computers, Redmond, Washington, USA) attached to fin, blue and sei whales (16, 10 and 7, respectively) off Faial Island, in the Azores, between June 2008 and June 2016 (Silva et al., 2013; Prieto et al., 2014) (see Supplementary material 1.2.1 for more information on each individual tag). A Bayesian switching state-space model (SSSM) was fitted to the raw, unfiltered locations to obtain regularized time steps for each track, incorporating the measurement errors of the Argos data to correct position estimates (Jonsen et al., 2005) (Fig. 5). We fitted the SSSM within a hierarchical framework to estimate parameters jointly across multiple individual tracking datasets. We tested multiple time steps (2 h, 5 h, 12 h and 24 h) with the aim of finding the most appropriate parameters to reduce sample autocorrelation. For each model, we ran two Markov Chain Monte Carlo (MCMC) chains for 50,000 iterations, dropping the first 45,000 samples as burn-in and retaining every 5th sample from the remaining 5,000 assumed post-convergence samples to reduce sample autocorrelation. Model convergence and sample autocorrelation were checked visually in the diagnostic plots, and the 12 h time step was selected as the most suitable due to its lower sample autocorrelation (Fig. 6).
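The burn-in and thinning settings above determine how many posterior draws are left for inference. The short sketch below works through that arithmetic; it assumes thinning starts at the first post-burn-in iteration, which the report does not specify, and the function name is illustrative.

```python
def retained_draws(n_iter=50_000, burn_in=45_000, thin=5, n_chains=2):
    """Return (draws kept per chain, draws kept in total)."""
    # 0-based indices of the iterations kept after burn-in and thinning
    kept = range(burn_in, n_iter, thin)
    return len(kept), len(kept) * n_chains


per_chain, total = retained_draws()
print(per_chain, total)
```

With the values reported in the text, each chain contributes 1,000 draws, for 2,000 posterior draws per model.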

Fig. 5. Workflow of the habitat modelling procedure applied to the whale tracking data.

Fig. 6. Maps showing the original fin whale track (left, tag ID: 80704) and the hierarchical switching state-space model track with a time step of 12h (right).

3.2.3 Simulated correlated random walks (CRWs)

As the track locations obtained through the SSSM only provide information on the presence of the animals, it was necessary to create simulated tracks, called “pseudo-absences”, for each original track as a measure of habitat availability (Phillips et al., 2009; Aarts et al., 2012). The simulated tracks were built using a correlated random walk (CRW) model, taking the first transmitted position of each satellite track as the starting point. For each of the 33 baleen whale tracks, 200 CRW tracks were created with the same duration as the original track, and with turning angles and step lengths randomly sampled from the distribution of the original tracks (Willis-Norton et al., 2015). The similarity between the original and the simulated tracks was ranked based on the overall direction and distance by assigning a flag value between 0 and 4, 0 being the most similar tracks and 4 the most dissimilar. This flag value was estimated as the normalized difference between the original and the simulated track length distance, d, summed with the normalized difference in net angular displacement, θ, of the original and simulated track:

$$\text{Flag value} = 2 \times \frac{d_{\text{original}} - d_{\text{sim}}}{d_{\text{original}}} + \frac{\theta_{\text{original}} - \theta_{\text{sim}}}{90}$$

Simulated tracks within the upper quartile of flag values and tracks crossing land were excluded from the analysis.
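A minimal sketch of the flag value and of the upper-quartile exclusion is given below. Two points are assumptions made for this illustration: the differences are taken as absolute values so that the flag cannot be negative (the formula above is written without them), and the upper quartile is defined with the default method of Python's `statistics.quantiles`. Function names and the example numbers are hypothetical, and the land-crossing check is not covered.

```python
from statistics import quantiles


def flag_value(d_orig, d_sim, angle_orig, angle_sim):
    """Dissimilarity between an original and a simulated track."""
    # ASSUMPTION: absolute differences, so that 0 means identical tracks.
    distance_term = 2 * abs(d_orig - d_sim) / d_orig
    angle_term = abs(angle_orig - angle_sim) / 90
    return distance_term + angle_term


def keep_similar(flags):
    """Indices of simulated tracks that are not in the upper quartile of flags."""
    q3 = quantiles(flags, n=4)[2]       # third quartile of the flag values
    return [i for i, f in enumerate(flags) if f <= q3]


# Original track: 1000 km long, net bearing 40 degrees; simulation: 800 km, 85 degrees.
print(flag_value(1000.0, 800.0, 40.0, 85.0))
print(keep_similar([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]))
```

In the second call, the two simulated tracks with the largest flag values are dropped and the remaining six are kept as pseudo-absence tracks.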

3.2.4 Sample environmental variables

The candidate variables for the whale tracking data were:

1. Environmental variables: sea surface temperature, sea water potential temperature, euphotic depth, chlorophyll-a, net primary production, eddy kinetic energy, distance to coast, slope and depth.

2. SEAPODYM model variables: epipelagic biomass, epipelagic production, mesopelagic biomass, mesopelagic production, migrant mesopelagic biomass, migrant mesopelagic production, lower trophic level biomass and lower trophic level production.

These variables were extracted at each original and simulated track point using the R software (version 2.15.3). As with the analyses of whale sightings, prior to running the models we examined collinearity between pairs of covariates (Zuur et al., 2010) using Pearson’s correlation coefficient, eliminating variables from pairs with a coefficient >0.7.

3.2.5 Telemetry-based habitat models

Based on the original and simulated track points, we applied Generalized Additive Mixed Models (GAMMs) to determine the probability of whale occurrence and predict monthly distributions of each whale species. GAMMs were fitted using a binomial family with a logit link function and a residual maximum likelihood estimator (mgcv 1.8.7; Wood 2006), with whale ID included as a random effect. To model our data as a binary response variable, original points were assigned a value of 1 and simulated points a value of 0 (Aarts et al., 2012). The best model was chosen based on the lowest corrected AIC, after checking that the retained variables had low levels of concurvity, i.e., of nonlinear dependencies among predictor variables.

A sensitivity analysis was also carried out to determine whether the simulated CRW tracks produced outputs that were robust to parameter uncertainty. For this process, we randomly selected two of the 200 CRW tracks per whale and assessed whether the predictors identified for the best fitting model were also significant for the different simulated points, repeating this procedure 40 times. For each of the 40 runs, we calculated the predictive performance through the concordance index (C-index) and the variance explained (R²). Two examples of these 40 runs for fin whales are shown in Fig. 7. Examples for blue and sei whales can be found in Supplementary material 1.2.
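The resampling scheme behind this sensitivity analysis can be sketched as follows. The example only draws the track subsets; refitting the GAMM on each subset is left out. The function name, the seed and the whale identifiers are hypothetical, and the sketch samples from all 200 simulated tracks although, in practice, tracks already excluded for high flag values or for crossing land would presumably not be available.

```python
import random


def sensitivity_draws(whale_ids, n_crw=200, per_whale=2, n_runs=40, seed=0):
    """For each run, pick `per_whale` distinct simulated tracks for every whale."""
    rng = random.Random(seed)           # fixed seed so the draws are reproducible
    runs = []
    for _ in range(n_runs):
        runs.append({w: rng.sample(range(n_crw), per_whale) for w in whale_ids})
    return runs


runs = sensitivity_draws(["whale_A", "whale_B", "whale_C"])
print(len(runs), runs[0])
```

Each of the 40 dictionaries maps a whale to the indices of the two simulated tracks that would serve as its pseudo-absences in that run.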

Fig. 7. Two examples of the original (red) and correlated random walk (blue) tracks simulated for fin whales that were used to assess parameter uncertainty in telemetry-based habitat analysis.

Based on the best model, high spatial resolution (0.1°) monthly predictions of habitat preferences were produced for each whale species over the tracking period. For this, environmental and SEAPODYM model data were compiled at a monthly scale.

3.3 Predictive performance of the SDM-i vs SDM-t We compared the predictive performance of the models with (SDM-i) and without (SDM-t)

SEAPODYM-derived prey data using different statistics and metrics. For the whale sighting data, we

used deviance explained, AIC, and the C-index with its standard deviation (SD). For the whale tracking data, we

used the coefficient of determination (R²) and the C-index with its

respective SD. For both datasets, we performed a visual check of the spatial predictions.
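For a binary presence/absence response, the C-index is equivalent to the area under the ROC curve: the proportion of presence-absence pairs in which the presence point receives the higher predicted probability. The sketch below illustrates this and the usual definition of deviance explained; the function names and example values are hypothetical and are not taken from the models in this report.

```python
def c_index(labels, scores):
    """Concordance index for 0/1 labels.

    Share of (presence, absence) pairs in which the presence point has the
    higher score; tied pairs count as 0.5.
    """
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    concordant = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                concordant += 1.0
            elif p == a:
                concordant += 0.5
    return concordant / (len(presences) * len(absences))


def deviance_explained(null_deviance, residual_deviance):
    """Fraction of the null deviance accounted for by the fitted model."""
    return 1.0 - residual_deviance / null_deviance


# Hypothetical example: three of the four presence-absence pairs are ordered correctly.
print(c_index([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

A value of 0.5 corresponds to random discrimination and 1.0 to perfect discrimination between presence and absence points.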

4 Results

4.1 Whale sighting data

A total of 670 sightings of our three target species were recorded by POPA during the study period

(blue whale: n=105; fin whale: n=327; sei whale: n=238). These species were distributed all over the

study area without any clear pattern (Fig. 2). After data quality control, we retained a total of 326

sightings for analysis (blue whale: n=60; fin whale: n=152; sei whale: n=114) (Fig. 2). The results of the

first process and the comparison of models with and without the SEAPODYM-MTL outputs are

presented in Supplementary material 2. Below, we present in detail the results from the

contemporaneous SEAPODYM-ARMOR3D models, as these were the SDMs with the most predictive

power. SDMs fitted to climatologies derived from SEAPODYM-ARMOR3D outputs are shown in

Supplementary material 2.1.4.

4.1.1 Contemporaneous models

4.1.1.1 Fin whales

The best SDM-t model (without SEAPODYM-ARMOR3D prey data) included only two variables and

explained 11.6% of the total deviance (Table 2). Temperature had the strongest effect, and the highest

probabilities of whale occurrence were found at temperatures between 15°C and 17.5°C (Fig. 8). Depth

had a linear, positive relationship with fin whale occurrence. Model evaluation showed that SDM-t

had a good ability to discriminate between areas where fin whales were present and absent (C-index

values ~ 0.76; Table 2).

When we added the SEAPODYM-ARMOR3D variables to the best environmental model above, the

deviance explained increased by 3 percentage points, to a total of 14.6%. This was due to a significant, positive effect of

the Lower trophic level plankton biomass (pk_pb). Although the deviance explained was higher with

the addition of the SEAPODYM data and the AIC was slightly lower, the predictive ability (C-index) of

the SDM-i for train (0.78) and test (0.76-0.79) data was almost the same as that obtained for the

best environmental model.


Table 2. Summary of the habitat modelling output and model evaluation. C-index values are given as mean and SD for the train and test data.

Species     Model  Variables in best model               Deviance explained (%)  AIC    Train mean  Train SD  Test mean  Test SD
Fin whale   SDM-t  Best environmental model              11.6                    21423  0.76        0.003     0.76       0.007
Fin whale   SDM-i  Best environmental model + SEAPODYM   14.6                    21419  0.78        0.036     0.79       0.063
Blue whale  SDM-t  Best environmental model              22.8                    21229  0.87        0.039     0.85       0.119
Blue whale  SDM-i  Best environmental model + SEAPODYM   25.3                    21229  0.89        0.035     0.85       0.127
Sei whale   SDM-t  Best environmental model              14.1                    21299  0.74        0.05      0.71       0.09
Sei whale   SDM-i  Best environmental model + SEAPODYM   14.5                    21293  0.75        0.05      0.72       0.07

The model predictions for both the SDM-t and SDM-i matched the pattern of observed fin whale

sightings. The SDM-t predicted fin whale density for the study area ranged from 4.43×10⁻⁶ to 1.47×10⁻³

individuals per 100 km², with the highest densities occurring in the eastern side of the study area,

mainly in the channel between Terceira and São Miguel, northeast of São Miguel, and southeast of

Santa Maria, with another important area north of Corvo, in the western side of the study area (Fig.

9). These same higher-density areas were highlighted in the SDM-i models, but these models predicted

slightly higher densities across the study area (4.78×10⁻⁶ to 2.42×10⁻³ individuals per 100 km²). These

areas also showed the highest variability (SD) in predictions. Monthly variations in predicted density

are shown in Supplementary material 2.1.3.1.


Fig. 8. Functional relationships between fin whale occurrence probability and predictors based on

SDM-t and SDM-i outputs from contemporaneous models.


Fig. 9. Top panel: Fin whale predicted mean density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models. Bottom panel: Variability (SD) in fin whale predicted density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models.

4.1.1.2 Blue whales

The best SDM-t model explained 22.8% of total deviance (Table 2). Net primary production of carbon

(NPP) was the most important predictor for blue whales, with the highest probability of whale occurrence

at 1.4-2.2 mmol C/m²/d (Fig. 10). Blue whale probability of occurrence also increased nonlinearly with

Temperature, peaking at 15-17.5°C, and linearly with Depth. When we added the SEAPODYM-ARMOR3D

variables to the best environmental model, the following variables were retained:

Epipelagic micronekton biomass (epi_pb), Mesopelagic micronekton productivity (meso_pp), and

Lower trophic level plankton biomass (pk_pb). The SDM-i had a higher deviance explained (25.3%),

with AIC and C-index values also indicating that the inclusion of the SEAPODYM-ARMOR3D variables improved

the ability to predict the density of blue whales in the area.


Fig. 10. Functional relationships between blue whale occurrence probability and predictors based on SDM-t and SDM-i outputs from contemporaneous models.


Mean predicted blue whale density based on the best SDM-t model across the study area ranged from

8.08×10⁻¹¹ to 1.14×10⁻³ individuals per 100 km², which was the lowest density of all three whale species.

As for fin whales, predictions obtained by the SDM-i were similar to but slightly higher than the

predictions from the SDM-t, ranging from 2.94×10⁻¹⁰ to 1.8×10⁻³ individuals per 100 km². Both models

predicted slightly higher blue whale densities in the eastern islands, especially northeast of São Miguel

and southeast of Santa Maria.

Fig. 11. Top panel: Blue whale predicted mean density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models. Bottom panel: Variability (SD) in blue whale predicted density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models.

4.1.1.3 Sei whales

In the case of sei whales, the best SDM-t model explained 14.1% of the total deviance (Table 2). Net

primary production of carbon (NPP) was the most important variable describing sei whale probability

of occurrence, with the highest probabilities close to 1.9 mmol C/m²/d (Fig. 12). Sei whale probability

of occurrence decreased slightly with increasing Slope and Temperature. The only variables derived

from SEAPODYM-ARMOR3D that were retained in the SDM-i were Lower trophic level plankton

biomass (pk_pb) and Epipelagic micronekton biomass (epi_pb). Deviance explained, AIC and C-index

for both train and test data were nearly equal for SDM-t and SDM-i, indicating no improvement from

the incorporation of the SEAPODYM-ARMOR3D variables.


Fig. 12. Functional relationships between sei whale occurrence probability and predictors based on SDM-t and SDM-i outputs from contemporaneous models.


Sei whale predicted density was almost uniform within the study area (Fig. 13), with slightly higher

values around Corvo island. The mean predicted density from the SDM-t model ranged from 4.25×10⁻⁷

to 8.01×10⁻³ individuals per 100 km². The predicted density maps and values (8.80×10⁻⁷ to 1.05×10⁻³

individuals per 100 km²) of the SDM-i were similar to those from the SDM-t (Fig. 13).

Fig. 13. Top panel: Sei whale predicted mean density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models. Bottom panel: Variability (SD) in sei whale predicted density (individuals per 100 km²) over the 2001–2011 period by the SDM-t (left) and SDM-i (right) based on contemporaneous models.

4.2 Whale tracking data

Sixteen fin whales were tracked between March and October, from 2009 to 2016 (Table

3). This species had the shortest mean tracking time and distance travelled compared to blue and sei whales,

with averages of 17.75 days and 1953 km, respectively. The blue whale was the species with the shortest

tracking period, only from April to July, but had the most homogeneous distance travelled among

tracked individuals. The sei whale, on the other hand, had the highest mean track time, number of

locations received and distance travelled, but this information came from only seven tracks.

Below, we present in detail the results from the climatological SEAPODYM-ARMOR3D models.

Contemporaneous models including SEAPODYM-ARMOR3D and whale tracking data are shown in

Supplementary material 2.2.3.


Table 3. Summary of tracking data used to fit habitat models for each whale species.

Data collection                      Fin whale       Blue whale     Sei whale
Total number of tracks               16              10             7
Years interval                       2009-2016       2009-2016      2008-2009
Months interval                      March-October   April-July     May-October
Mean duration in days (SD)           17.75 (12.72)   27 (15.59)     35.85 (21.83)
Mean locations received (SD)         280 (487)       223 (163)      306 (372)
Mean distance travelled in km (SD)   1953 (1895)     2389 (1367)    3690 (2798)

4.2.2 Climatological models

4.2.2.1 Fin whales

The best SDM-t for fin whales incorporating climatological data selected Depth, Slope and

Temperature as the most important variables (Table 4; Fig. 14). This model showed a good predictive

performance, with a mean C-index of 0.84 across the 40 runs, and the highest R² among all

climatological SEAPODYM-ARMOR3D models.

Table 4. Summary of the habitat modelling output and model evaluation from climatological models.

Model                                Species     Variables in best model                        R² mean (SD)  C-index mean (SD)
ENVIRONMENTAL MODEL (SDM-t)          Fin whale   Depth, sqrt_slope, temperature                 0.34 (0.06)   0.84 (0.03)
ENVIRONMENTAL MODEL (SDM-t)          Blue whale  temperature                                    0.20 (0.08)   0.78 (0.03)
ENVIRONMENTAL MODEL (SDM-t)          Sei whale   Depth, sqrt_slope, temperature                 0.19 (0.06)   0.84 (0.02)
SEAPODYM MODEL + BEST ENVI (SDM-i)   Fin whale   Depth, log_epi_mnk_pp, log_pk_pb               0.29 (0.05)   0.83 (0.03)
SEAPODYM MODEL + BEST ENVI (SDM-i)   Blue whale  log_epi_mnk_pp, log_pk_pb                      0.18 (0.06)   0.79 (0.03)
SEAPODYM MODEL + BEST ENVI (SDM-i)   Sei whale   Depth, log_epi_mnk_pp, log_pk_pb, sqrt_slope   0.11 (0.05)   0.79 (0.02)

n-significant models (number of the 40 runs in which each retained variable was significant; values listed in source order, not aligned to variables): 37, 36, 40, 40, 37, 38, 40, 40, 40, 40, 1, 40, 40, 34, 23, 40


Fig. 14. Functional relationship between fin whale predicted probability and predictors based on SDM-t and SDM-i outputs from climatological models.


Regarding the response from the model output, Temperature and Depth were significant for all runs,

with Temperature being the covariate with the strongest negative relationship. The best SDM-i model

had a similar predictive performance to the SDM-t, but a slightly lower R² (close to 0.29),

retaining Epipelagic micronekton productivity (epi_pp), Lower trophic level plankton biomass (pk_pb)

and Depth as the most important predictors (Table 4; Fig. 14).

The monthly predictions of the best SDM-t model identified the area with the highest occurrences

above 50°N of latitude, except for the period between March and June when the mid-Atlantic Ridge

was also relevant (Fig. 15). On the other hand, the best SDM-i model determined a northward

movement from March to June, and a concentration in northern latitudes during the rest of the

months (July to October), selecting nearly the same area as the SDM-t model (Fig. 16).

The best SDM-t incorporating SEAPODYM-MTL and climatological data had a lower R² and predictive

performance than the same SDM-t using SEAPODYM-ARMOR3D. Variables selected were also

different, with SEAPODYM-MTL retaining primary production instead of the Temperature retained by

SEAPODYM-ARMOR3D. The SDM-i had the same R² and predictive ability for both datasets,

SEAPODYM-MTL and SEAPODYM-ARMOR3D (Supplementary material 2.2.2).


Fig. 15. Fin whale predicted distribution from March to October for SDM-t.


Fig. 16. Fin whale predicted distribution from May to October for SDM-i.


4.2.2.2 Blue whales

The best climatological SDM-t model retained Temperature as the most important covariate, being

significant in all 40 runs, with a C-index close to 0.78 in all of them and an R² around 0.20 (Table

4; Fig. 17). This predictive performance and R² were very similar when we added the SEAPODYM-ARMOR3D

data, selecting Epipelagic micronekton productivity (epi_pp) and Lower trophic level

plankton biomass (pk_pb) for the best SDM-i model. For the SDM-t, Temperature showed a strong

negative effect between 5 and 12°C, and a small variation for the rest of the Temperature interval.

The SDM-i model identified a negative relationship with Epipelagic micronekton productivity (epi_pp)

and a peak in Lower trophic level plankton biomass (pk_pb) between 1.8 and 2.0 g/m² (Fig. 17).

Fig. 17. Functional relationship between blue whale predicted probability and predictors based on SDM-t and SDM-i outputs from climatological models.

The monthly spatial predictions for SDM-t determined the area with the highest occurrence in

northern latitudes, between 50 and 62° of latitude. This area was constrained during the months of

June and July (Fig. 18). For these two months, the predictions of the SDM-i were very similar to the

SDM-t. However, the SDM-i model identified the highest occurrence values in lower latitudes during

the months of April and May (Fig. 19).


Fig. 18. Blue whale predicted distribution from April to July for SDM-t.


Fig. 19. Blue whale predicted distribution from April to July for SDM-i.

4.2.2.3 Sei whales

For this species, the SDM-t model identified Depth, Slope and Temperature as the most important

covariates (Table 4; Fig. 20), showing a good ability to discriminate between areas where

sei whales were present and those areas where they were absent. On the other hand, the SDM-i model

had a lower predictive performance and R², selecting Epipelagic micronekton productivity (epi_pp)

and Lower trophic level plankton biomass (pk_pb) in addition to Depth and Slope (Fig. 20). In both

models, the species had a preference for deeper areas (>1500 m) with flat bottoms, lower

Temperatures for the SDM-t model, and a high variability in values of Epipelagic micronekton

productivity (epi_pp) and Lower trophic level plankton biomass (pk_pb) for the SDM-i model.

These differences led to slight variations in the monthly predictions, with SDM-t identifying only

northern latitudes as the main potential habitat for the whole study period, and SDM-i selecting a

similar distribution during May and June, and adding a southern area from July to October (Fig. 21 &

Fig. 22).


Climatological models using SEAPODYM-MTL data retained the same variables as the climatological

SEAPODYM-ARMOR3D models, but there was a slight increase in the R² and predictive performance

(see Supplementary material 2.2.2).

Fig. 20. Functional relationship between sei whale predicted probability and predictors based on SDM-t and SDM-i outputs from climatological models.


Fig. 21. Sei whale predicted distribution from May to October for SDM-t.


Fig. 22. Sei whale predicted distribution from May to October for SDM-i.


5 Discussion

For this project, we used various statistics and metrics to estimate the predictive performance of the

SEAPODYM models, comparing models containing only environmental variables with models with

additional SEAPODYM data. This was done at two different spatial scales, sighting data collected

around Azores and tracking data collected from the Azores to higher latitudes, and at different

temporal resolutions, contemporaneous versus climatological features, representing a total of 48

different species distribution models.
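The total of 48 models is consistent with a full factorial design. The sketch below enumerates one such design; treating the two SEAPODYM products (SEAPODYM-MTL and SEAPODYM-ARMOR3D) as a fifth factor is an assumption made here to reproduce the count, and the variable names are illustrative.

```python
from itertools import product

# Factors described in the report; the product factor is assumed to be crossed
# with the others in order to reach the stated total.
species = ["fin whale", "blue whale", "sei whale"]
datasets = ["sightings", "tracking"]
temporal_scales = ["contemporaneous", "climatological"]
seapodym_products = ["SEAPODYM-MTL", "SEAPODYM-ARMOR3D"]
model_types = ["SDM-t", "SDM-i"]

models = list(product(species, datasets, temporal_scales,
                      seapodym_products, model_types))
print(len(models))  # 3 * 2 * 2 * 2 * 2 = 48
```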

5.1 Predictive performance of SDM-i versus SDM-t

5.1.1 Whale sighting data

SDM-i models, with SEAPODYM-ARMOR3D prey data, usually had a higher predictive performance

than models including only environmental data (SDM-t from SEAPODYM-ARMOR3D), except for sei

whales where the forecast ability was basically the same (Table 5). In general, our models had a good

ability to predict the distribution of the three baleen whales, as suggested by the high concordance

index. Blue whale models had the highest predictive performance, in most cases reaching C-index

values >0.85. Most of the fin whale models were close to a good accuracy, C-index around 0.8, and

the lowest values were found in sei whale models, with a moderate ability to predict the species

distribution. These results based on the train dataset, including data from years 2001 to 2011, were

in line with the outputs obtained from the test data, 2012 to 2015, showing nearly equal predictive

performance values. The patterns described for the concordance index were very similar to the

deviance explained, with highest values for blue whales.

Table 5. Comparison of the predictive ability of SDM-i in relation to SDM-t fit to whale sighting data: situations where SDM-i > SDM-t are shown in green; where SDM-i < SDM-t shown in red; where SDM-i = SDM-t shown in black.

Species      ARMOR3D contemporaneous   ARMOR3D climatological
Fin whale    SDM-i > SDM-t             SDM-i = SDM-t
Blue whale   SDM-i > SDM-t             SDM-i > SDM-t
Sei whale    SDM-i = SDM-t             SDM-i = SDM-t

Contemporaneous models consistently performed better than climatological ones, regardless of the

inclusion or not of SEAPODYM prey outputs, emphasizing the importance of choosing the appropriate

temporal scale for dynamic variables. In areas where inter-annual variability is high, SDMs of migratory

marine species incorporating contemporaneous dynamic variables data usually perform better than

models using climatologies (Manocci et al., 2017). We could also see the effect of this variability on

the annual density of whales predicted by the models, which showed a clear decreasing trend from 2001

to 2011 (not shown), linked to the reduction in available primary production in the area.

In general, our models explained little variability in the observed whale sighting data. The low deviance

explained is common in GAMs used in cetacean distribution studies (Forney et al. 2012) and may be

the consequence of modelling datasets with many absences (Welsh et al. 1996). Nevertheless, the

deviance explained obtained for SDM-t and SDM-i for whale sighting data was comparable with that


reported by other studies using only micronekton data (Lambert et al., 2014). Predictive

performance of models based on whale sightings was very similar to that of a previous study that reported

values of 0.82, 0.79 and 0.7 for blue, fin and sei whales, respectively (Prieto et al., 2017). Our models

highlighted more or less the same areas favoured by blue and fin whales around the Azores as those

suggested by Prieto et al. (2017) using Maxent models. Model outputs for sei whales were also similar

between these two studies, suggesting that prey availability may not be an important determinant of

sei whale distribution because the species mainly uses the area for travelling.

5.1.2 Whale tracking data

SDM-i using contemporaneous data had a lower capacity to discriminate between sites of whale

presence and absence than SDM-t (Table 6). Although this trend was the same for all the species,

predictive performance of SDM-i was closer to that of SDM-t for blue and fin whales than for sei

whales, for which SDM-t performed much better. SDM-t and SDM-i built with climatological data

exhibited similar explanatory performance for blue and fin whales, but not for sei whale, because this

species was much more influenced by instantaneous covariates.

The best predictive performances were estimated for fin and sei whales, with C-index scores close to

0.84 for the original tracks and also for the majority of the 40 simulated tracks, as was evident by the

low standard deviation values of these tracks. Lower C-index values (<0.8) were obtained for blue

whales, both for contemporaneous and climatological models. The poorer predictive performance for

this species could be due to the smaller sample size, or it may indicate more complex behavioural

processes or ecological relationships not adequately captured by our models and environmental

variables. However, considering the better performance of sei whale models, which were based on a

smaller sample size than that used in blue whale models, it seems plausible that the environmental

variables used in the models were not effective at capturing blue whale distribution.

Table 6. Comparison of the predictive ability of SDM-i in relation to SDM-t fit to whale tracking data: situations where SDM-i > SDM-t are shown in green; where SDM-i < SDM-t shown in red; where SDM-i = SDM-t shown in black.

Species      ARMOR3D contemporaneous   ARMOR3D climatological
Fin whale    SDM-i < SDM-t             SDM-i = SDM-t
Blue whale   SDM-i < SDM-t             SDM-i = SDM-t
Sei whale    SDM-i < SDM-t             SDM-i < SDM-t

The same predictive performance was estimated for SDM-t using contemporaneous and

climatological data. On the other hand, SDM-i models with climatological data showed a higher

predictive performance than SDM-i with contemporaneous data, suggesting that whale distribution

was better described by persistent oceanographic and prey features than by ephemeral ones. The

contrast to the results of the models based on whale sighting data may be explained by the

hierarchical structure of the ocean processes that drive animals’ movement at multiple spatio-

temporal scales (Benoit-Bird, Battaile, Nordstrom & Trites, 2013; Fauchald & Tveraa, 2006) and by

the resolution of the data. At intermediate scales such as around the Azores, whales and other marine

predators tend to associate with fine-scale to mesoscale ocean features generating appropriate


foraging habitats. Average conditions in climatologies are probably unable to reproduce these small-

scale and short-lived features that drive prey availability and whale distribution locally. On the other

hand, the long-distance northward migration of baleen whales described by the satellite tracking data

appears strongly linked to the temporal and latitudinal change in biotic and abiotic conditions

characteristic of the North Atlantic spring bloom. Thus, at the broader scale, whale distribution is likely

dominated by oceanographic features with large spatial extent and longer time scales, which are well

captured by climatological models.

Regarding the blue and fin whale tracking data, the GAMM outputs were similar in predictive

performance and coefficient of determination to those obtained in other studies (Hazen et al., 2016;

Scales et al., 2017b). Models for sei whales performed slightly worse when compared to the results

obtained by Manocci et al. (2016) for the western North Atlantic.

5.2 Importance of SEAPODYM variables

In our comprehensive ecological modelling approach, we found that the dynamic predictors Net

primary production of carbon (NPP) and Temperature had a strong influence on the whales’ spatial

ecology. However, the response and strength of these predictors varied between whale species. On

the other hand, all best SDM-i models retained Lower trophic level plankton biomass (pk_pb), showing

the importance of this variable for the three species. The strong positive correlation between this

variable and whales’ presence, using both whale sighting and tracking data, suggests that whales tend

to associate with more productive areas when they are foraging around the Azores, as well as during

migration. In addition, the relationship with lower trophic level prey, only found in the first hundred

meters of the water column, is in accordance with the known foraging behaviour of these species.

Other variables that were also important, but to a lesser extent, in SDM-i models were Epipelagic

micronekton productivity (epi_pp), Migrant mesopelagic micronekton productivity (mmeso_pp) and

Mesopelagic micronekton productivity (meso_pp).

Regarding static predictors, most of the time they were retained in the best SDM-i models, showing

the importance of these features in whales’ distribution. The most important static covariates were

Depth and Slope, each whale species showing a different response depending on the dataset (sighting

and tracking data) and temporal scale used (contemporaneous and climatological).

5.3 Limitations of the SEAPODYM product to improve models’ predictive performance

As we described in previous sections, the predictive performance of SDMs containing SEAPODYM

products differed depending on the spatial scale and temporal resolution of the data. We think there

are at least two possible causes limiting the improvement on the predictive performance of the

models using SEAPODYM products: 1) correlation between variables; 2) spatial and temporal scale of

the SEAPODYM products.

The first limitation was the difficulty of choosing the most appropriate SDM-i variables among environmental and

SEAPODYM variables that were strongly correlated. An example of this was the high collinearity

(>0.91) between the Lower trophic level plankton biomass (pk_pb) and the Net primary production of

carbon (NPP). This was a problem when the latter variable was one of the most important predictors

retained in the best SDM-t model, but could not be kept in the SDM-i. As NPP and pk_pb explained

similar degrees of variance in the whale sighting data, improvement in the predictive performance of

SDM-i including only pk_pb was moderate. In addition, in the analysis of whale tracking data, there


was a robust correlation between NPP and Temperature and a moderate correlation with pk_pb, as

all covariates followed a strong seasonal pattern from low to high latitudes. This forced the exclusion

of two of these variables from the models, probably affecting their predictive performance as they failed to

capture more variability in the data. Nonetheless, exclusion of collinear variables is a mandatory

procedure in SDMs, also followed by other studies using SEAPODYM data (Manocci et al., 2016;

Roberts et al., 2016).
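The screening for collinear predictors described above can be sketched as a pairwise Pearson correlation check. This is a minimal illustration only: the 0.7 threshold is an assumed, commonly used cut-off (the threshold applied in this study is not stated in this section), and the predictor values are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    names = sorted(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(predictors[a], predictors[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical predictor values sampled at five locations.
data = {
    "npp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pk_pb": [2.0, 4.0, 6.0, 8.0, 10.5],
    "slope": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(collinear_pairs(data))
```

Only one variable from each flagged pair would then be offered to a given model, which is why NPP and pk_pb could not both be retained.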

The second limitation was the spatial resolution of the SEAPODYM data, at 0.25°, and we expect this

limitation to mostly affect smaller-scale models based on whale sighting data. The scale-match

between local or regional oceanographic dynamics and animal movement in SDMs is critical and can

have a major influence in the models’ predictive performance. The highly dynamic environment of the

Azores may not be adequately captured by grid cells of more than 500 km², eventually leading to a lack of

accuracy in the models. This probably represents a limitation for models at both temporal resolutions:

contemporaneous and climatological. Recent studies revealed that the use of coarse resolution

climatological data in habitat-based models for migratory marine predators can decrease models’

predictive performance, mainly in highly dynamic regions (Scales et al., 2017a). In addition, 50% of

the SDM-i models based on sighting data explained more variation in whale distribution

than environmental variables alone. In the remaining 50%, however, the predictive performance was

equal, suggesting that whales may associate with environmental and prey features at spatial and

temporal scales finer than the resolution of SEAPODYM outputs.
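The size of a 0.25° grid cell can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km and a representative Azores latitude of 38.5° N; both values, and the function name, are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def cell_area_km2(lat_center_deg, resolution_deg):
    """Area of a lat/lon grid cell centred on lat_center_deg, on a sphere."""
    half = resolution_deg / 2.0
    lat_south = math.radians(lat_center_deg - half)
    lat_north = math.radians(lat_center_deg + half)
    delta_lon = math.radians(resolution_deg)
    return EARTH_RADIUS_KM ** 2 * delta_lon * (math.sin(lat_north) - math.sin(lat_south))


# 0.25-degree SEAPODYM cell versus the 0.1-degree prediction grid, near the Azores.
print(round(cell_area_km2(38.5, 0.25)), round(cell_area_km2(38.5, 0.1)))
```

Under these assumptions a 0.25° cell covers roughly 600 km², in line with the figure of more than 500 km² quoted above, whereas a 0.1° cell covers a little under 100 km².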

5.4 SEAPODYM-MTL versus SEAPODYM-ARMOR3D

SEAPODYM-ARMOR3D had a higher predictive performance than SEAPODYM-MTL for the whale

sighting data around the Azores. In the case of the whale tracking data, covering a much wider area,

differences in performance were less evident but SEAPODYM-ARMOR3D still performed better. The

SEAPODYM variables selected in the best SDM-t and SDM-i models were nearly the same for both

products, and the response of these variables was generally very similar.

Based on our results, ARMOR3D has a higher accuracy and allows a better discrimination between

presence and absence using whales’ data. For this reason, this product that incorporates physical

fields derived from altimetry is suggested for future use with baleen whale data.

6 Conclusions

In this project, we tested the ability of two SEAPODYM products (SEAPODYM-MTL and SEAPODYM-ARMOR3D)

to predict the fine- and large-scale distribution of three whale species and quantified the

improvement in predictive performance of SDMs using these products relative to models using only

static and commonly available oceanographic variables. The use of these novel micronekton products

as predictors gave encouraging results. Although the results differed depending on the temporal and

spatial scale of the data, our study demonstrates that simulated micronekton distributions enable a

better understanding of the whales’ distributions.

In addition, the outputs from this project represent a step further in the complexity of the species

distribution modelling, not only by including an integrative modelling approach, but also providing a

protocol to evaluate the potential impact of modelling data at different spatial and temporal scales.

This modelling process has allowed us to have a better understanding of the environmental drivers

determining the movements and habitat use of three highly migratory marine species, which is crucial

to implement appropriate management and conservation measures.


Our models suggest that blue and fin whales stop their northward migration to exploit increased prey

biomasses in the Azores that develop after the spring bloom. Whales migrate towards more

productive areas, following the seasonal cycle in primary production, as indicated by the inclusion of

prey variables but also temperature in most models. This result is in accordance with Visser et al.

(2011) and Silva et al. (2013), who suggested that baleen whale presence in the Azores coincides with

the presence of the spring bloom in the region, with whales interrupting their migration to feed

around the area for several weeks. Whale distribution patterns at the Atlantic Ocean scale also seem

constrained by physiographic factors, but their importance differed amongst the species. The mid-

Atlantic Ridge was highlighted as a relevant habitat for fin and blue whales during their northward

movement, but not for sei whales. This latter species showed two separate core habitats, one at mid-

Atlantic latitudes between 30° and 45° N, and a second from 45° N onwards, between Canada and

Greenland. These habitats likely represent a northern latitude main feeding ground, and a cross-

Atlantic migration corridor between this feeding ground and southern areas (highlighted in one of the

models at 25 to 35° N).

7 References Aarts, G., Fieberg, J. & Matthiopoulos, J. (2012) Comparative interpretation of count, presence–

absence and point methods for species distribution models. Methods in Ecology and Evolution, 3, 177–187.

Abecassis, M., Senina, I., Lehodey, P., Gaspar, P., Parker, D., et al. (2013) A Model of Loggerhead Sea

Turtle (Caretta caretta) Habitat and Movement in the Oceanic North Pacific. PLoS ONE 8, e73274.

Becker, E. A., Forney, K.A., Foley, D.G., Smith, R.C., Moore, T.J. & Barlow, J. (2014). Predicting

seasonal density patterns of California cetaceans based on habitat models. Endangered Species Research 23, 1–22.

Benoit-Bird, K. J., Battaile, B. C., Nordstrom, C. A., & Trites, A. W. (2013). Foraging behavior of

northern fur seals closely matches the hierarchical patch scales of prey. Marine Ecology Progress Series, 479, 283–302.

Boyd, C., Castillo, R., Hunt, G.L., Punt, A.E., VanBlaricom, G.R., Weimerskirch, H. & Bertrand, S.

(2015). Predictive modelling of habitat selection by marine predators with respect to the abundance and depth distribution of pelagic prey. J Anim Ecol, 84: 1575–1588. doi:10.1111/1365-2656.12409

Croll, DA., Marinovic, B., Benson, S., Chavez, FP., Black, N., Ternullo, R. & Tershy, BR. (2005) From

wind to whales: Trophic links in a coastal upwelling system. Marine Ecology Progress Series 289, 117–130.

Elith, J., Kearney, M. & Phillips, S. (2010) The art of modelling range-shifting species. Methods in

Ecology and Evolution, 1, 330–342.

Fauchald, P., & Tveraa, T. (2006). Hierarchical patch dynamics and animal movement pattern.

Oecologia, 149, 383–395.

Ferguson M.C., Barlow J., Fiedler P., Reilly S.B., & Gerrodette T. (2006). Spatial models of delphinid (family Delphinidae) encounter rate and group size in the eastern tropical Pacific Ocean. Ecological Modelling.

Forney, K.A., Ferguson, M.C., Becker, E.A. & Fiedler, P.C. (2012) Habitat-based spatial models of

cetacean density in the eastern Pacific Ocean. Endangered Species Research 16, 113–133.

Gregr, E. J., Baumgartner, M. F., Laidre, K. L. & Palacios, D. M. (2013). Marine mammal habitat

models come of age: the emergence of ecological and management relevance. Endangered Species

Research 22, 205–212.

Guisan, A. & Zimmermann, N.E. (2000). Predictive habitat distribution models in ecology. Ecol.

Model. 135:147–86

Harrell, F. E., Lee, K. L. & Mark, D. B. (1996). Multivariable prognostic models: issues in developing

models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in

Medicine 15, 361–38

Hazen, E. L., Palacios, D. M., Forney, K. A., Howell, E. A., Becker, E., Hoover, A. L., & Bailey, H. (2016).

WhaleWatch: A dynamic management tool for predicting blue whale density in the California Current.

Journal of Applied Ecology. https://doi.org/10.1111/1365-2664.12820

Jonsen, I.D., Flemming, J.M. & Myers, R.A. (2005) Robust state-space modeling of animal

movement data. Ecology, 86, 2874–2880.

Lambert, C., Mannocci, L., Lehodey, P. & Ridoux, V. (2014). Predicting cetacean habitats from their

energetic needs and the distribution of their prey in two contrasted tropical regions. PLoS ONE 9,

e105958.

Lehodey, P., Murtugudde, R. & Senina, I. (2010). Bridging the gap from ocean models to population

dynamics of large marine predators: A model of mid-trophic functional groups. Progress in

Oceanography 84, 69–84.

Mannocci, L., Laran, S., Monestiez, P., Dorémus, G., Van Canneyt, O., Watremez, P. & Ridoux, V.

(2014). Predicting top predator habitats in the Southwest Indian Ocean. Ecography 37, 261–278.

Mannocci, L., Roberts, J. J., Miller, D. L., & Halpin, P. N. (2016). Extrapolating cetacean densities to

quantitatively assess human impacts on populations in the high seas. Conservation Biology.

Mannocci, L., Boustany, A. M., Roberts, J. J., et al. (2017). Temporal resolutions in species distribution models of

highly mobile marine animals: Recommendations for ecologists and managers. Diversity and

Distributions 23, 1098–1109.

Peel, D., Bravington, M.V., Kelly, N., Wood, S.N. & Knuckey, I. (2012) A Model-Based Approach to

Designing a Fishery-Independent Survey. Journal of Agricultural, Biological, and Environmental

Statistics, 18, 1–21

Phillips, S.J., Dudik, M., Elith, J., Graham, C.H., Lehmann, A., Leathwick, J. & Ferrier, S. (2009) Sample

selection bias and presence-only distribution models: implications for background and pseudo-

absence data. Ecological Applications, 19, 181–197.

Prieto, R., Silva, M.A., Waring, G.T. & Gonçalves, J.M.A. (2014). Sei whale movements and

behaviour in the North Atlantic inferred from satellite telemetry. Endangered Species Research 26, 103–113.

Prieto, R., Tobeña, M. & Silva, M. A. (2017). Habitat preferences of baleen whales in a mid-latitude

habitat. Deep Sea Research Part II: Topical Studies in Oceanography.

Redfern, J. V., Ferguson, M. C., Becker, E. A., Hyrenbach, K. D., Good, C., Barlow, J., Kaschner, K.,

Baumgartner, M. F., Forney, K. A., Ballance, L. T., Fauchald, P., Halpin, P., Hamazaki, T., Pershing, A. J.,

Qian, S. S., Read, A., Reilly, S. B., Torres, L. & Werner, F. (2006). Techniques for cetacean–habitat

modeling: a review. Marine Ecology Progress Series 310, 271–295.

Roberts, J. J., Best, B. D., Mannocci, L., Fujioka, E., Halpin, P. N., Palka, D. L., Garrison, L. P., Mullin,

K. D., Cole, T. V. N., Khan, C. B., McLellan, W. A., Pabst, D. A. & Lockhart, G. G. (2016). Habitat-based

cetacean density models for the U.S. Atlantic and Gulf of Mexico. Scientific Reports 6, 22615.

Scales, K. L., Hazen, E. L., Jacox, M. G., Edwards, C. A., Boustany, A. M., Oliver, M. J. & Bograd, S.

J. (2017a). Scale of inference: on the sensitivity of habitat models for wide-ranging marine predators

to the resolution of environmental data. Ecography, 40: 210–220. doi:10.1111/ecog.02272

Scales, K.L., Schorr, G.S., Hazen, E.L., Bograd, S.J., Miller, P.I., Andrews, R.D., Zerbini, A.N. & Falcone,

E.A. (2017b). Should I stay or should I go? Modelling year-round habitat suitability and drivers of

residency for fin whales in the California Current. Diversity and Distributions, 1–12.

doi:10.1111/ddi.12611

Shono, H. (2008) Application of the Tweedie distribution to zero-catch data in CPUE analysis.

Fisheries Research 93, 154–162.

Silva, M.A., Feio, R., Prieto, R., Gonçalves, J.M., & Santos, R.S. (2002). Interactions between

cetaceans and the tuna-fishery in the Azores. Marine Mammal Science 18, 893–901.

Silva, M.A., Prieto, R., Jonsen, I., Baumgartner, M.F., & Santos, R.S. (2013). North Atlantic blue and

fin whales suspend their spring migration to forage in middle latitudes: building up energy reserves

for the journey? PLoS ONE 8, e76507.

Silva, M. A., Prieto, R., Cascão, I., Seabra, M. I., Machete, M., Baumgartner, M. F. & Santos, R.S.

(2014). Spatial and temporal distribution of cetaceans in the mid-Atlantic waters around the Azores.

Marine Biology Research 10, 123–137.

Tobeña, M., Prieto, R., Machete, M. & Silva, M. A. (2016). Modeling the potential distribution and

richness of cetaceans in the Azores from fisheries observer program data. Frontiers in Marine Science.

Visser, F., Hartman, K.L., Pierce, G.J., Valavanis, V.D. & Huisman, J. (2011). Timing of migratory

baleen whales at the Azores in relation to the North Atlantic spring bloom. Marine Ecology Progress Series 440, 267–279.

https://doi.org/10.3354/meps09349

Welsh, A. H. et al. (1996). Modelling the abundance of rare species: statistical models for counts with

extra zeros. Ecol. Model. 88, 297–308.

Willis-Norton, E., Hazen, E.L., Fossette, S., Shillinger, G., Rykaczewski, R.R., Foley, D.G., Dunne, J.P.

& Bograd, S.J. (2015) Climate change impacts on leatherback turtle pelagic habitat in the Southeast

Pacific. Deep Sea Research Part II: Topical Studies in Oceanography, 113, 260–267.

Wood, S. (2006) Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC Press,

Boca Raton, FL, USA.

Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation

of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B

(Statistical Methodology). 73, 3–36.

Yuan, Y., Bachl, F.E., Lindgren, F., Borchers, D.L., Illian, J.B., Buckland, S.T., Rue, H. & Gerrodette,

T. (2017). Point process models for spatio-temporal distance sampling data from a large-scale survey

of blue whales. Ann. Appl. Stat. 11(4), 2270–2297. doi:10.1214/17-AOAS1078.

https://projecteuclid.org/euclid.aoas/1514430286

Zuur, A.F., Ieno, E.N. & Elphick, C.S. (2010). A protocol for data exploration to avoid common

statistical problems. Methods in Ecology and Evolution 1, 3–14.