GHSL application in Europe: Towards new population grids


EUROPEAN FORUM FOR GEOGRAPHY AND STATISTICS KRAKOW CONFERENCE 2014, 22-24 October, Krakow, Poland

PAPER

Title: GHSL application in Europe: Towards new population grids

Author 1: Sergio Freire

European Commission Joint Research Centre, Italy

Author 2: Matina Halkia

European Commission Joint Research Centre, Italy

Keywords: population disaggregation, GHSL, Europe

ABSTRACT

In the context of the Urban and Regional Built-up Analysis (URBA) project, which uses the Global Human Settlement Layer (GHSL) technology to map buildings in Europe, population data of different sources and types are being used to produce population grids, using the built environment as the reference for disaggregation.

GHSL technology relies on automatic analysis of satellite imagery to produce unprecedented fine-scale maps quantifying built-up structures. Among the outputs of GHSL technology is a raster geo-dataset representing, for each cell, the ratio of coverage by building structures.

Population censuses, although accurate, are not uniformly and readily available across European Union member states. The use of related ancillary data and spatial modeling allows disaggregating and refining population distributions. Making these data available as regular grids facilitates spatial analysis and mitigates biases such as the modifiable areal unit problem and the related ecological fallacy issue. The use of the European GHSL data in the process of producing refined and consistent population distribution grids for Europe, at 100 and 1000 m resolution, will be presented.

Preliminary results suggest a more consistent and improved depiction of the spatial distribution of resident population. These data will improve analyses involving population distribution, at a range of scales, and will benefit environmental studies, risk and emergency management, planning of public facilities, and policy assessment in general.

INTRODUCTION  

Better geoinformation on the spatial distribution of population is increasingly required for various applications. Population censuses provide accurate data on the characteristics and number of residents for administrative or enumeration areas. However, these data are not uniformly and readily available across European Union member states. Additionally, they are available as total counts for units that vary widely in size and shape, and residents frequently occupy only specific zones of these units, at different densities (Freire, 2010).

Ancillary data such as land use, if compatible in time, space, and semantics, can be useful to inform the disaggregation process by allowing the discrimination of distinct functional areas and respective densities. The most effective way to take advantage of these datasets is through dasymetric mapping and areal interpolation approaches. Dasymetric mapping is a cartographic technique, originally used for population mapping, which aims at limiting the distribution of a variable to the areas where it is present, by using related ancillary information in the process of areal interpolation (Eicher and Brewer 2001).
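The principle of dasymetric mapping can be illustrated with a minimal Python sketch. It is not the implementation used in the project; the function name `dasymetric_allocate`, the flat list-of-cells layout, and the uniform fallback for zones without any ancillary evidence are assumptions made for illustration only. Each source zone's count is spread over its cells in proportion to an ancillary weight, so cells with zero weight receive no population while zone totals are preserved.

```python
def dasymetric_allocate(zone_population, cell_zone, cell_weight):
    """Spread each zone's count over its cells in proportion to an ancillary weight.

    zone_population: dict mapping zone id -> population count
    cell_zone:       list giving the zone id of each cell
    cell_weight:     list giving a non-negative ancillary weight for each cell
    """
    weight_sum = {}
    cell_count = {}
    for zone, weight in zip(cell_zone, cell_weight):
        weight_sum[zone] = weight_sum.get(zone, 0.0) + weight
        cell_count[zone] = cell_count.get(zone, 0) + 1

    cell_population = []
    for zone, weight in zip(cell_zone, cell_weight):
        if weight_sum[zone] > 0:
            share = weight / weight_sum[zone]
        else:
            # Assumption: with no ancillary evidence in a zone,
            # fall back to a uniform spread so the zone total is kept.
            share = 1.0 / cell_count[zone]
        cell_population.append(zone_population[zone] * share)
    return cell_population


# Toy example: zone "a" has three cells, zone "b" has one.
grid = dasymetric_allocate(
    zone_population={"a": 100, "b": 10},
    cell_zone=["a", "a", "a", "b"],
    cell_weight=[0.0, 1.0, 3.0, 0.0],
)
```

In the toy example, the first cell of zone "a" has no ancillary weight and therefore receives no population, while the remaining two cells share the zone's 100 residents in a 1:3 ratio.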

Recent approaches to population disaggregation in Europe include those implemented by Steinnocher et al. (2010) and Batista e Silva et al. (2013), using European-wide spatial information and dasymetric approaches. Steinnocher et al. disaggregated census population to the European Soil Sealing Layer (ESSL), after masking this raster layer with a combination of selected CORINE Land Cover (CLC) classes and transportation data. Results showed underestimation in urban centers and rural areas. Batista e Silva et al. produced a 100-m population layer by disaggregating census data to a set of selected refined CLC classes (‘Urban fabric’), using weights derived from the average soil sealing degree per land use/land cover (LULC) class. The overall performance of this approach was quite similar to that of the former, although it showed a greater tendency to commission errors and a lesser tendency to omission errors.

The Global Human Settlement Layer (GHSL), recently produced for the whole of Europe, uses remote sensing source data and innovative approaches to provide unprecedented mapping and quantification of built-up areas (Pesaresi et al., 2013). Among the outputs is a seamless 10-m resolution raster mosaic representing, for each cell, the ratio of area covered by building structures (Ferri et al., 2014).

The current project aims at producing the best disaggregation of residential population from the 2011 round of censuses, taking advantage of the ability of GHSL to detect, map, and quantify current built-up areas at high resolution. The technical requirements are i) to preserve the total volume of population at country level, and ii) to use a functional definition of population (based on residence). The GHSL method relies on the geolocation of population counts in built-up structures, taking into account only the built-up surface area of each structure and not its built-up volume. LULC data are used as ancillary data in one of the methods tested, which weights the application of the GHSL method according to the LULC class. The other two methods tested are masking GHSL with LULC data and using GHSL without ancillary data.

This paper discusses potential approaches to this task and presents the work performed so far towards the production of a new population grid at 1 km resolution for Europe, based on GHSL. Census tract data, the GEOSTAT 1 km grid, or statistics at the level of municipalities (LAU2) will be used, depending on which is the best available in each country.
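The paper does not state how the 100 m and 1 km products relate to each other. As a hedged illustration only, the following Python sketch assumes that a 1 km grid is obtained by summing aligned 10 x 10 blocks of a 100 m population grid; the function name `aggregate_sum` and the nested-list grid layout are hypothetical. Summation, unlike averaging, preserves the total volume of population.

```python
def aggregate_sum(grid, factor=10):
    """Sum a fine population grid into coarser cells of factor x factor fine cells.

    grid:   list of rows, each a list of population values
    factor: number of fine cells per coarse cell along each axis
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        # Assumption: the fine grid is aligned with the coarse grid.
        raise ValueError("grid dimensions must be multiples of factor")

    coarse = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for r in range(rows):
        for c in range(cols):
            coarse[r // factor][c // factor] += grid[r][c]
    return coarse


# Toy example: a 2 km x 2 km area with one resident in every 100 m cell.
fine = [[1.0] * 20 for _ in range(20)]
coarse = aggregate_sum(fine, factor=10)
```

Each 1 km cell in the toy example collects the 100 residents of its 100 fine cells, and the overall total of 400 is unchanged.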

The paper presents the preliminary results of this work, namely i) disaggregating population data in Portugal from Census Tract and LAU2 data using the GHSL method; ii) comparing the results of the GHSL disaggregation method with those of the method used by AIT in Portugal; and iii) testing methods for obtaining weights per Land Use/Land Cover (LULC) class.

USE OF GHSL FOR POPULATION DISAGGREGATION

Our approach assumes that the GHSL layer mapping built-up percentage can be used to inform both the location and the density of built-up so as to support the disaggregation and spatial refinement of the distribution of population from administrative or enumeration areas to effective populated areas. In this framework, the use of GHSL as proxy for the distribution of residential population can potentially be augmented by including LULC maps as true ancillary data in the disaggregation process.

Batista e Silva et al. (2012) have produced a European-wide 100-m resolution LULC layer, named CLC06_r, by combining CLC2006 with additional geoinformation to increase thematic and spatial detail. This refined CLC has a minimum mapping unit of 1 ha for the Artificial surfaces and Inland waters classes and 25 ha for the remaining classes.

However, while GHSL mostly represents 2011 and 2012, CLC06_r maps LULC in 2006.

Given the requirements and datasets available, three main approaches for the disaggregation are possible (Fig. 1):

A. Use GHSL alone

B. Masking of GHSL with LULC ancillary data

C. Weighting of GHSL with LULC ancillary data

Figure 1. Basic generic workflow for population disaggregation with GHSL

All approaches have advantages and shortcomings, which also depend on the characteristics and level of the census data used as input. A brief discussion of the main challenges, advantages, and drawbacks of each approach follows.

A. Use GHSL alone: this approach poses no challenges and requires no decisions, relying entirely on GHSL as a proxy for resident population distribution. It allows changes after 2006 to be fully represented. The main drawback is the allocation of significant population to non-residential areas, due to the absence of functional discrimination of buildings in GHSL.

B. Masking of GHSL with LULC ancillary data: this approach requires decisions on which LULC classes to mask and how best to select them. While it may limit the allocation of population to non-residential areas, it limits the representation of changes after 2006 and restricts the residential detection value of GHSL in rural areas.

C. Weighting of GHSL with LULC ancillary data: this approach requires decisions on how to derive the weights, and which weights to use when sampling is not possible (i.e., how to export weights to other areas). This approach adapts the Intelligent Dasymetric Mapping (IDM) proposed by Mennis (2003) and Mennis and Hultgren (2006). The weighting scheme is based on empirical sampling of the census with the containment method, as recommended by these authors, to learn the relation between population distribution and LULC classes.

For application purposes, since not all classes were sampled, a decision was made not to exclude (mask) any LULC class; instead, unsampled classes were attributed a weight lower than the lowest weight obtained for sampled classes (i.e., a residual weight), thus preserving the power of GHSL to detect built-up areas. In this experimental design, LULC information is used to inform the average population densities, while GHSL determines the location and the specific densities of population.

WEIGHTING LULC CLASSES

Using the sampling approach of C., a database of population-CLC06_r weights was produced using census-tract level data from 5 countries: Portugal, Italy, Netherlands, UK, and Spain.
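A minimal Python sketch of containment sampling follows, assuming a strict reading in which a census tract is sampled only when all of its area falls within a single LULC class; the function name `sample_class_densities`, the input layout, and the class labels are hypothetical, and practical implementations may instead accept tracts above a percent-cover threshold. The representative density of a class is taken as the total population of its sampled tracts divided by their total area.

```python
def sample_class_densities(tracts):
    """Estimate a representative population density per LULC class.

    tracts: iterable of (population, area_by_class) pairs, where
            area_by_class maps LULC class -> area of the tract in that class
    Returns a dict of class -> density for classes with at least one sampled tract.
    """
    population_sum = {}
    area_sum = {}
    for population, area_by_class in tracts:
        present = [c for c, area in area_by_class.items() if area > 0]
        if len(present) != 1:
            # Tract spans several classes (or none): not usable as a sample.
            continue
        lulc_class = present[0]
        population_sum[lulc_class] = population_sum.get(lulc_class, 0.0) + population
        area_sum[lulc_class] = area_sum.get(lulc_class, 0.0) + area_by_class[lulc_class]

    return {c: population_sum[c] / area_sum[c] for c in population_sum}


# Toy tracts; areas in hectares, class names hypothetical.
tracts = [
    (100, {"urban_dense": 2.0}),
    (300, {"urban_dense": 2.0}),
    (50, {"urban_medium": 5.0}),
    (80, {"urban_dense": 1.0, "forest": 3.0}),  # mixed tract, skipped
]
densities = sample_class_densities(tracts)
```

Classes that never fully contain a tract, such as the forest class in the toy data, obtain no sampled density and would need a residual or exported weight.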

Experiments performed in different geographic areas of Europe, and at different geographic scales (regional and country-wide), confirm that dense urban LULC classes weigh proportionately more in residential population than middle-density LULC classes, as expected. However, the ratio between the weights of the dense and middle-density urban fabric classes differs significantly across countries or regions in Europe.

MODEL TESTING IN PORTUGAL

To test the three approaches, a pilot study was implemented in Portugal. The study area encompassed seven LAU2 administrative areas (23,504 ha) close to Lisbon, displaying a wide range of population densities and high LULC heterogeneity (presence of 21 of the 45 CLC06_r classes).

The 2011 census population was disaggregated from LAU2 units (parishes) using different methods, and results were assessed using census blocks, the finest census zoning, as reference. The Total Absolute Error (TAE) and the Relative Total Absolute Error (RTAE) were used as overall accuracy metrics, as presented and adopted in Gallego et al. (2011) and Batista e Silva et al. (2012).
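The two metrics can be sketched in Python as follows, assuming the commonly used definitions in which TAE is the sum of absolute differences between estimated and reference counts over all reference units, and RTAE is TAE divided by the total reference population; the function names are hypothetical and the cited papers remain the authoritative source for the exact formulation.

```python
def total_absolute_error(estimated, reference):
    """Sum of absolute differences between estimated and reference counts."""
    if len(estimated) != len(reference):
        raise ValueError("inputs must cover the same reference units")
    return sum(abs(e - r) for e, r in zip(estimated, reference))


def relative_total_absolute_error(estimated, reference):
    """TAE divided by the total reference population."""
    total = sum(reference)
    if total <= 0:
        raise ValueError("reference population must be positive")
    return total_absolute_error(estimated, reference) / total


# Toy example with three census blocks.
estimated = [10, 0, 30]
reference = [20, 10, 10]
tae = total_absolute_error(estimated, reference)
rtae = relative_total_absolute_error(estimated, reference)
```

When the disaggregation preserves the total population, RTAE under these definitions ranges from 0 (perfect agreement) to 2 (all population placed in blocks that are empty in the reference).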

Tests performed so far indicate that approach C. performs best because it exploits the full building-detection power of GHSL, which confirms GHSL as a good proxy for the location and distribution of residential population (Fig. 2). In the pilot study, this approach has proven superior to disaggregating to ESSL masked by LULC, as implemented by Steinnocher et al. (2010). In particular, it has proven superior at limiting major population re-allocation errors, such as population assigned to unpopulated areas (complete commission) and the converse (complete omission).
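The two major re-allocation errors can be quantified with a short Python sketch. The paper does not define how they were measured, so the reading used here is an assumption: complete commission is taken as the estimated population placed in reference units that have no residents, and complete omission as the reference population of units that received no estimated population. The function name `complete_errors` is hypothetical.

```python
def complete_errors(estimated, reference):
    """Return (complete_commission, complete_omission) as population totals.

    estimated: estimated population per reference unit (e.g. census block)
    reference: census population per reference unit
    """
    if len(estimated) != len(reference):
        raise ValueError("inputs must cover the same reference units")
    # Population assigned to units that are unpopulated in the reference.
    commission = sum(e for e, r in zip(estimated, reference) if r == 0 and e > 0)
    # Reference population of units that received no estimated population.
    omission = sum(r for e, r in zip(estimated, reference) if e == 0 and r > 0)
    return commission, omission


# Toy example with three census blocks.
commission, omission = complete_errors([5, 0, 20], [0, 12, 25])
```

In the toy example, the third block is misestimated but counts towards neither total, since it is populated in both the estimate and the reference.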

Approach A. has performed the worst, mostly due to errors of commission, i.e., the assignment of population to non-residential built-up areas.

Figure 2. 100-m population grid produced with approach C. in pilot study area

To confirm these results, a country-wide study is currently ongoing, the outcome of which will be reported soon.

CONCLUSIONS  

Regular grids are increasingly preferred as the structure to map, report, and analyze population distribution and other socio-economic variables. The recent availability of a pan-European GHSL layer mapping built-up percentage at high spatial resolution offers a good proxy for the distribution of population. The use of these data for population disaggregation is being tested using different methodological approaches. Preliminary results are very encouraging, confirming the potential of GHSL to refine and improve population disaggregation models. This new population distribution grid will also be useful to refine the current settlement classification model used in GHSL to analyze settlement types in Europe. They will be used in the analysis of the relationship between administrative boundaries and settlement types.

Future tests will explore the combined use of weighting and masking selected LULC classes in the disaggregation process.

ACKNOWLEDGEMENTS    

The work reported here is part of a team effort of the Unit on Global Security and Crisis Management of the Joint Research Centre, European Commission. The authors would like to acknowledge the contribution of Martino Pesaresi and Daniele Ehrlich, who contributed their experience and ideas at all stages of this work, and of Stefano Ferri, who offered technical support.

BIBLIOGRAPHY    

Batista e Silva, F., Gallego, J., & Lavalle, C. (2013). A high-resolution population grid map for Europe. Journal of Maps, DOI:10.1080/17445647.2013.764830.

Batista e Silva, F., Lavalle, C., & Koomen, E. (2012). A procedure to obtain a refined European land use/cover map. Journal of Land Use Science, http://www.tandfonline.com/doi/abs/10.1080/1747423X.2012.667450.

Eicher, C.L., & Brewer, C.A. (2001). Dasymetric mapping and areal interpolation: Implementation and evaluation. Cartogr. Geogr. Inf. Sci., 28, 125–138.

Ferri, S., Syrris, V., Florczyk, A., Scavazzon, M., Halkia, S., & Pesaresi, M. (2014). A new map of the European settlements by automatic classification of 2.5-m resolution SPOT data. Proceedings of IGARSS 2014, Quebec, Canada, 13-18 July 2014.

Ferri, S., et al. (2014). GHSL for Copernicus Spot-5 data: Customization and optimization for Core_003. JRC Technical Report, January 2014.

Freire, S. (2010). Modeling of spatio-temporal distribution of urban population at high-resolution – value for risk assessment and emergency management. In: Konecny, M., Zlatanova, S., Bandrova, T.L. (eds.), Geographic Information and Cartography for Risk and Crisis Management. Lecture Notes in Geoinformation and Cartography. Springer Berlin Heidelberg, pp. 53-67.

Gallego, F. J., Batista, F., Rocha, C., & Mubareka, S. (2011). Disaggregating population density of the European Union with CORINE Land Cover. International Journal of Geographical Information Science, 25(12), 2051–2069.

Halkia, S., et al. (2014). Built-up detection for Regional Policy. JRC Technical Report.

Mennis, J. (2003). Generating surface models of population using dasymetric mapping. The Professional Geographer, 55(1), 31–42.

Mennis, J., & Hultgren, T. (2006). Intelligent dasymetric mapping and its application to areal interpolation. Cartogr. Geogr. Inf. Sci., 33, 179–194.

Pesaresi, M., et al. (2013). A Global Human Settlement Layer from optical HR/VHR RS data: Concept and first results. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 6, No 6.

Steinnocher, K., Kaminger, I., Köstl, M., & Weichselbaum, J. (2010). Gridded Population – new data sets for an improved disaggregation approach. E-proceedings of the 3rd European Forum for Geostatistics Conference, 5-7 October 2010, Tallinn, Estonia.