Open data: A review of the state of the art - Shift2Rail projects


Contract No. H2020 – 730539

This project has received funding from the Shift2Rail Joint Undertaking under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 730569

IN2SMART

Project Title: Intelligent Innovative Smart Maintenance of Assets by integrated Technologies

Starting date: 01/09/2016

Duration in months: 36

Call (part) identifier: H2020-S2RJU-CFM-2016-01-1

Grant agreement no: 730569

Open data: a review of the state-of-the-art

D7.1

Due date of deliverable: Month 12

Actual submission date: 31-08-2017

Leader of this deliverable: DLR

Dissemination level: CO

Revision: Issued

Revision History Table

Version Reason for change Issue Date

V1.0 Initial Issue 18/07/2017

V2.0 Requested Revision 28/06/2018

Ref. Ares(2018)3457727 - 29/06/2018

GA 730569 D7.1

Open data: a review of the state-of-the-art

IMS-WP7-D7.1-DLR-006-02-I 2 of 133

Details of contribution

Author(s)

DEUTSCHES ZENTRUM FUER LUFT- UND RAUMFAHRT EV (DLR)
Elmar Brockfeld, Jörn Christoffer Groos, Rüdiger Ebendt, Christian Rahmig, Lucas Schubert, Michael Scholz
- Deliverable coordination
- Main contents in chapters 1, 2, 3, 5, 7.1, 8.1.2, 8.5, 9, 10, 11
- Discussions about document structure and contents
- Complete document review

Contributor(s)

ANSALDO STS S.p.A. (ASTS)
Fabrizio Cosso, Matteo Pinasco
- Main contributions to chapters 4, 6.5, 7.2
- Discussions about document structure and contents
- Complete document review

NETWORK RAIL INFRASTRUCTURE LIMITED (NR)
Caroline Lowe
- Main contributions to chapter 8.1
- Discussions about document structure and contents
- Complete document review

BOMBARDIER TRANSPORTATION SWEDEN AB (BT)
Zbigniew Dyksy, Mikael Danielsson, Martin Karlsson
- Main contributions to chapters 8.1.3 – 8.1.6
- Discussions about document structure and contents
- Complete document review

SIEMENS AKTIENGESELLSCHAFT (SIE)
Sven Adomeit, Andreas Bolm, Frank Aust, Jochen Grühser
- Main contributions to chapters 6.3, 6.4, 8.3
- Discussions about document structure and contents
- Complete document review

THALES GROUND TRANSPORTATION SYSTEMS UK LTD (THA)
David Tickem
- Main contributions to chapters 6.3.2, 7.3
- Discussions about document structure and contents
- Complete document review


Kompetenzzentrum - Das Virtuelle Fahrzeug, Forschungsgesellschaft mbH (VIF)
Josef Fuchs, Alexander Meierhofer
- Main contributions to chapters 6.1.1, 6.1.2, 6.1.3, 8.2
- Discussions about document structure and contents
- Complete document review

FCP FRITSCH, CHIARI & PARTNER ZIVILTECHNIKER GMBH (FCP)
Gerald Julian Rajasingam
- Main contributions to chapters 8.1.4, 8.4
- Discussions about document structure and contents
- Complete document review

WIENER LINIEN GMBH & CO KG (WL)
Simon Wallner
- Main contributions to chapter 7.2.2
- Discussions about document structure and contents
- Complete document review

LULEA TEKNISKA UNIVERSITET (LTU)
Mustafa Aljumaili, Matti Rantatalo, Karim Ramin
- Main contributions to chapters 6.1, 6.4, 7.4, 7.5
- Discussions about document structure and contents
- Complete document review


TABLE OF CONTENTS

1 EXECUTIVE SUMMARY
1.1 Acronyms and Abbreviations
2 BACKGROUND
3 OBJECTIVE/AIM
4 SUMMARY OF RELEVANT IN2RAIL RESULTS
4.1 IN2RAIL Description
4.2 IN2RAIL Deliverables
4.3 D9.1 Asset Status Representation [3]
4.3.1 Deliverable content
4.3.2 Deliverable Conclusions
4.4 D8.1 Requirements for the Integration Layer
4.4.1 Deliverable content
4.4.2 Deliverable Conclusions
4.5 D8.5 Requirements for the Generic Application Framework
4.5.1 Deliverable content
4.5.2 Deliverable Conclusions
4.6 Annex to D8.3: Description of the Canonical Data Model
4.7 Conclusions
5 ONLINE SURVEY
5.1 Questionnaire
5.2 Feedback
5.2.1 Participants’ domains of service
5.2.2 Extent of use and use cases of Open Data Exchange formats
5.2.3 How generic/specialised are Open Data Exchange formats?
5.2.4 “Best” example of an Open Data Exchange format suitable for one of several sources of information
5.2.5 The extent to which a participant’s company or institution participates in Open Data Exchange initiatives and communities
5.2.6 Optional mindset questions: Open Data Exchange policy and attitude towards Open Data


5.2.7 Strengths and weaknesses of Open Data Exchange formats
5.2.8 Licensing and/or legal issues hampering application of Open Data Exchange formats
5.3 Conclusions
6 OPEN DATA EXCHANGE: TECHNOLOGIES
6.1 Files
6.1.1 General formats
6.1.2 Specific formats
6.2 Modeling Languages and Tools
6.3 Communication protocols
6.3.1 OPC UA
6.3.2 Queue/topic based messaging systems
6.4 Web services / APIs
6.4.1 Web Services
6.4.2 Web APIs
6.4.3 Comparison Web Services vs. Web APIs
6.5 In memory data grid technologies
6.5.1 In-Memory Data Grid Overview
6.5.2 Infinispan
6.5.3 Redis
7 OPEN DATA EXCHANGE: APPLICATIONS
7.1 Geodata (DLR, LTU)
7.1.1 Vector Data
7.1.2 Raster Data
7.1.3 Geo Web Services
7.1.4 Open Street Map, Open Railway Map
7.2 Sensor / Measurement data
7.2.1 sensorML
7.2.2 LAS file format
7.3 Maintenance
7.3.1 Building Information Modeling – BIM
7.3.2 Maintenance Management
7.3.3 Asset Condition


7.3.4 Alarms Systems
7.4 Process Mining / business process analytics
7.5 Business Process Data Exchange Standards
7.5.1 Business Process Definition Metamodel (BPDM)
7.5.2 XML Process Definition Language (XPDL)
7.5.3 B2B Information Exchange Standards
7.6 Strengths and weaknesses of interchange standards
8 OPEN DATA EXCHANGE: USAGE IN DOMAINS AND RELEVANT COMMUNITIES
8.1 Railway
8.1.1 Open data provided by European infrastructure Managers
8.1.2 railML®
8.1.3 TAF/TAP TSI
8.1.4 UIC 407-1
8.1.5 RINF – Register of Infrastructure
8.1.6 EULYNX
8.2 Automotive
8.2.1 Automotive centered formats
8.3 Industry and Home Automation
8.3.1 Industry
8.3.2 Home Automation
8.3.3 OPC Foundation: The Interoperability Standard for Industrial Automation
8.4 Civil engineering / construction
8.5 Traffic Management
8.5.1 OpenLR (Location Referencing)
8.5.2 Transport Protocol Experts Group TPEG
8.5.3 Traffic Message Channel (TMC)
8.5.4 DATEX II
8.5.5 General Transit Feed Specification (GTFS)
8.5.6 The TRIAS interface
8.5.7 Mobilitätsdatenmarktplatz (MDM) and mCLOUD
9 REFERENCED DOCUMENTS
10 APPENDIX A-QUESTIONNAIRE
10.1 Appendix A-questionnaire: participants’ domains of service


10.2 Appendix A-questionnaire: extent of use and use cases of open data exchange formats
10.2.1 Appendix A-questionnaire: extent of use and use cases of railway formats
10.2.2 Appendix A-questionnaire: extent of use and use cases of maintenance formats
10.2.3 Appendix A-questionnaire: extent of use and use cases of other formats
10.3 Appendix A-questionnaire: how generic/specialised are open data exchange formats?
10.3.1 Appendix A-questionnaire: how generic/specialised are railway formats?
10.3.2 Appendix A-questionnaire: how generic/specialised are maintenance formats?
10.3.3 Appendix A-questionnaire: how generic/specialised are other formats?
10.4 Appendix A-questionnaire: “best” example of an open data exchange format suitable for one of several sources of information
10.5 Appendix A-questionnaire: optional mindset questions: Open Data Exchange policy and attitude towards Open Data
10.6 Appendix A-questionnaire: strengths and weaknesses of open data exchange formats
10.7 Appendix A-questionnaire: licensing and/or legal issues hampering application of open data exchange formats


TABLE OF FIGURES

Figure 1: Generic architecture overview
Figure 2: Domains of the participating companies or institutions
Figure 3: How generic/specialised are railway formats on average?
Figure 4: Tag cloud of "best" formats and the respective criteria
Figure 5: Extent of participation in Open Data Exchange initiatives and communities
Figure 6: Object representation in JSON
Figure 7: Array representation in JSON
Figure 8: Value representation in JSON
Figure 9: String representation in JSON
Figure 10: Number representation in JSON
Figure 11: OPC UA Concepts
Figure 12: Typical Messaging System - Logical Model
Figure 13: ISO Standards related to BIM
Figure 14: Effectiveness of asset maintenance methodologies on
Figure 15: MIMOSA – Open Asset Information Model
Figure 16: MIMOSA Open System Architecture for Enterprise Application Integration (OSA-EAI)
Figure 17: OSA-CBM functional blocks
Figure 18: IEC 62682 Alarm State Model
Figure 19: Principle of common interface for TAF/TAP TSIs
Figure 20: Common interface for TAF TSI
Figure 21: Principle of common interface for RINF
Figure 22: EULYNX System architecture
Figure 23: Tag cloud of use cases/comments for railway formats
Figure 24: Tag cloud of use cases/comments for maintenance formats
Figure 25: Tag cloud of use cases/comments for other formats
Figure 26: How generic/specialised are maintenance formats?
Figure 27: Top 15 generic other formats
Figure 28: Top 15 specialised other formats
Figure 29: How generic/specialised are other formats on average?
Figure 30: Policy of the company/institution
Figure 31: Attitude of the company/institution: Mean attitude
Figure 32: Tag cloud of strengths and weaknesses of Open Data exchange formats
Figure 33: Tag cloud for licensing and/or legal issues hampering application of Open Data Exchange formats


TABLE OF TABLES

Table 1: IN2RAIL deliverables
Table 2: Extent of use of railway formats (frequencies of answers)
Table 3: OPC UA Standards
Table 4: Vendor Messaging system protocols
Table 5: Vendor Messaging system: Message Exchange Patterns Support
Table 6: Comparison Web Services vs. Web APIs [29]
Table 7: Data feeds in France
Table 8: Data feeds in Germany
Table 9: Data feeds in Switzerland
Table 10: Data feeds in the United Kingdom
Table 11: railML® versions [77]
Table 12: Primary purpose of RINF
Table 13: Use cases/comments for railway formats
Table 14: Extent of use of maintenance formats (frequencies of answers)
Table 15: Use cases/comments for maintenance formats
Table 16: Extent of use of other formats (frequencies of answers)
Table 17: Use cases/comments for other formats
Table 18: "Best" formats for several information sources with the respective criteria
Table 19: Strengths and weaknesses of Open Data exchange formats
Table 20: Licensing and/or legal issues hampering application of Open Data Exchange formats


1 EXECUTIVE SUMMARY

This deliverable provides a current, focused and brief overview of today’s most relevant data formats and technologies. Such a snapshot is needed because IT and data technologies evolve quickly: all mature industry sectors are currently investing significant effort in digitalization and automation, which leads to rapid developments. This document serves as a working resource for the data standardization work in task 7.2. Furthermore, D7.1 extends the list of standards suitable for data representation beyond the work done in the lighthouse project IN2RAIL. These newly identified standards should be considered for integration with the data representation being defined in IN2RAIL, which focuses on TMS, in order to cover maintenance requirements and needs.

The state-of-the-art analysis presented here cannot yet provide a short-list to select from. Without anticipating the results of the WP7.2 work, it appears that a combination of several standard data formats, rather than a single one, may be recommendable. Detailing the final requirements and implementing the prototype in task 7.2 will be necessary to finally select a suitable combination of open data exchange formats. In addition to using flexible cross-domain formats and technologies as far as possible, the ongoing IN2RAIL activities in communities relevant for railways (e.g. railML) should also be followed during the next steps in IN2SMART. Because data exchange is developing rapidly in all industry domains, available and emerging technologies must be carefully reviewed and monitored throughout the project.

Section 4 summarizes the IN2RAIL results that are relevant for IN2SMART work package 7. To broaden the basis for the state of the art, a questionnaire was designed and conducted as an online survey, “Open Data Exchange formats”, together with all project partners from March to May 2017. Section 5 presents its results to give first impressions of relevant formats and opinions; detailed information can be found in section 10, “Appendix A – Questionnaire”. Section 6 lists and describes the generally relevant technologies, ranging from basic file formats through modeling languages and tools, communication protocols and Web Services to in-memory data grid technologies. Section 7 describes applications and application areas, giving insight into more specialized use in the fields of geodata, sensor and measurement data, maintenance, business process analytics and business data exchange standards. Section 8 gives insight into more domain-specific concepts and assesses the most important ones and the main trends, which are partly driven by the relevant initiatives and communities in the respective domains; these, and the central places to turn to, are mentioned there as well.


1.1 ACRONYMS AND ABBREVIATIONS

The following tables provide definitions for acronyms and abbreviations and for terms used in this document.

Definition

AD Active Directory

ANSI American National Standards Institute

AP Access Point

API Application programming interface

ASCII American Standard Code for Information Interchange

ASPRS American Society for Photogrammetry and Remote Sensing

AWS Automatic Warning System

BIM Building Information Modeling

BP (“Betriebsprotokoll”) operations log

BMWi German Federal Ministry for Economic Affairs and Energy

BMVi German Federal Ministry for Transport and Digital Infrastructure

BPM Business Process Management

BPML Business Process Modeling Language

BPMI Business Process Management Initiative

BPMS Business Process Management System

B2B Business to Business

C “C” programming language

C++ “C++” programming language

CAD Computer-Aided Design

CC Control Component of ACC (Active Cruise Control)

CDF Common Data Format

CDM Canonical Data Model

CIF The Common Interface File (CIF) format is the industry standard for transfer of schedules electronically from Network Rail's Integrated Train Planning System (ITPS) to downstream operational and information systems.

CM Configuration Management

CM Counting Monitoring error


CM Coupling Mode

COM Communication

COM Component Object Model

CPO Code of PLM Openness

COTS Commercial off-the-shelf

CSV Comma-Separated Values

DB Data Bus signal

EB Emergency Brake

EC Element Controller

EC European Community

EC Evaluation Computer

EN European standard

ERA European Railway Agency

ERTMS European Rail Traffic Management System

EV (“Endverbinder”) terminal bond

F Fail-safe

FIFO First In, First Out

FM Function Module

FS Full Supervision

FTP File Transfer Protocol

GIS Geographic Information System

GML Geography Markup Language

GPS Global Positioning System

GTFS General Transit Feed Specification

GWT Google Web Toolkit

HDF Hierarchical Data Format

HTML Hypertext Markup Language

HTTP Hypertext Transfer Protocol

HVAC Heating, Ventilation and Air-Conditioning


I/O Input/Output

ID Identifier

IDMVU (“Infrastruktur-Daten-Management für Verkehrsunternehmen”) infrastructure data management for transportation companies

IDS Intrusion detection system

IEC International Electrotechnical Commission

IEEE Institute of Electrical and Electronics Engineers

IFC InterFace Connection

IL Integration Layer

INSPIRE Infrastructure for Spatial Information in Europe

IM Interface Module

IMDG In-Memory Data Grid

IP Ingress Protection (class)

IP Internet Protocol (RFC791)

IS Information Security

IS Isolation Mode

ISO International Organization for Standardization

IT Information Technology

JMS Java Message Service

JSON JavaScript Object Notation

KM (“Kilometrierung”) mileage, kilometrage

LCC Life-Cycle Costs

LDP Linked Data Platform

MDM (“Mobilitäts-Daten-Marktplatz”) Mobility data market place

ML Delete Reminder Note

MP MegaPixel

MS Mini-main Signal

MS Modular Standard

MS Microsoft

N Neutral conductor


NetCDF Network Common Data Form

NR Noise speed Reduction

NR Not Responsible mode

NURBS Non-Uniform Rational B-Spline

O&M Operation and Maintenance

O&M Operation and Monitoring logic

ÖBB Austrian Federal Railways

OGC Open Geospatial Consortium

OMG Object Management Group

OOP Object-Oriented Programming

OpenLR Open Location Referencing

openCRG Open Curved Regular Grid

OPC UA OPC Unified Architecture

OS On-Sight (mode)

OS Operating System

OSA Operating System Adaptor

OSA-EAI Open System Architecture for Enterprise Application Integration

OSM Open Street Map

OSLC Open Services for Lifecycle Collaboration

PA Passenger Announcement

PA Possession Area

PA Proceed Authority

PC Personal Computer

PDF Portable Document Format

PDM Product Data Management

PLM Product Lifecycle Management

PMI Product and Manufacturing Information

PNG Portable Network Graphics

POI Points Of Interest


QoS Quality of Service

railML Rail Markup Language

RAM Random Access Memory

RAM Reliability, Availability, Maintainability

RAMS Reliability, Availability, Maintainability and Safety

RBAC Role Based Access Control

RDF Resource Description Framework

REST Representational State Transfer

RIF Requirements Interchange Format

RINF Register of infrastructure model

RF Radio Frequency

RST Reset button

RST Rolling STock

RTM RailTopoModel

RU Regional Unit

SCADA Supervisory Control and Data Acquisition

SD SecureDigital Memory (card)

sensorML Sensor Markup Language

SGML Standard Generalized Markup Language

SIG Signal information

SIG “Signaling equipment supplier”; “signaling and safety systems”

SIL Safety Integrity Level

SIL Siemens Interlocking Language

SMTP Simple Mail Transfer Protocol

SNMP Simple Network Management Protocol

SOA Service-Oriented Architecture

SOAP Simple Object Access Protocol

SOS Sensor Observation Service

SPS Sensor Planning Service

SQL Structured Query Language

SRS Speed Restriction Section

SRS System Requirements Specification

SSN Semantic Sensor Network

STS Security Translator System

SWE Sensor Web Enablement

SWEET Semantic Web for Earth and Environmental Terminology

SysML Systems Modeling Language

S&C Switches and Crossings

TAF Track Ahead Free

TC Track Circuit

TC Train Consist

TCP Transmission Control Protocol

TCP/IP Transmission Control Protocol/Internet Protocol

TD Maximum permissible data transmission duration

TMC Traffic Message Channel

TMS Traffic Management System (same as OCS)

TMS Train Management System

TPEG Transport Protocol Experts Group

TR Technical Report

TSI Technical Specifications for Interoperability

TSR Temporary Speed Restriction

UIC Union internationale des chemins de fer (international union of railways)

UML Unified Modelling Language

UN Non-provided mode

UN Unfitted

URL Uniform Resource Locator

V Voltage

VB Visual Basic

VDV Association of German Transport Undertakings

W3C World Wide Web Consortium

WCF Windows Communication Foundation

WCS Web Coverage Service

WFS Web Feature Service

WFMC Workflow Management Coalition

WFMS Workflow Management System

WP Work Package

WMS Web Map Service

WS Web Service

WSDL Web Services Description Language

XML eXtensible Markup Language

2 BACKGROUND

This deliverable summarizes state-of-the-art information gathered within task 7.1 of work package 7 as background for the work in the subsequent task 7.2. The main purpose of this document is to provide a source of information for the upcoming work within work package 7 on defining standards for data formats and data exchange throughout IN2SMART and within the Shift2Rail ecosystem. Because IT/data technologies evolve quickly, this document can only provide a current, focused and brief spotlight on today’s most relevant formats and technologies. Furthermore, all mature industry sectors are concurrently putting significant effort into digitalization and automation, which leads to rapid developments. Figure 1 illustrates a preliminary generic architecture for seamless diagnostic data gathering from multiple signalling and telecom systems, each characterized by a proprietary interface. For the middle of this architecture, labelled as proxy level in the figure, the objective of WP7 is to develop a guideline for Open Standard Interfaces for maintenance data, including models and data exchange.

Figure 1: Generic architecture overview.

3 OBJECTIVE/AIM

The aim of D7.1 is to analyse the state of the art in open data exchange formats in order to provide a comprehensive overview as a resource for the work on data standardization in IN2SMART task 7.2. The analysis covers not only the railway domain but also other domains, in order to identify applications, ideas and concepts that can be used to fulfill requirements in the IN2SMART project. The main purpose is to support the use of established and emerging data exchange standards in the domain of railway asset management, so that sustainable solutions can be implemented efficiently instead of creating costly, isolated, railway-specific solutions. The other WPs are also involved in defining the needs and requirements for task 7.2. D7.1 focuses on the data exchange formats themselves; the evaluation of data exchange formats in combination with big data approaches such as data ingestion will be addressed in task 7.2.

Because IT/data technologies evolve quickly, this document can only provide a current, focused and brief spotlight on today’s most relevant formats and technologies. Furthermore, all mature industry sectors are concurrently putting significant effort into digitalization and automation, which leads to rapid developments. Therefore, careful reviewing and monitoring of available and emerging technologies has to be maintained throughout the project.

4 SUMMARY OF RELEVANT IN2RAIL RESULTS

4.1 IN2RAIL DESCRIPTION

The IN2RAIL project [1] is one of the “lighthouse” projects of Shift2Rail and is contributing to Innovation Programme 2 “Advanced Traffic Management and Control Systems” and 3 “Cost-Efficient and Reliable High-Capacity Infrastructure”.

IN2RAIL aims to set the foundation for a resilient, cost-efficient, high capacity, and digitalised European rail network and to make advances towards Shift2Rail objectives:

enhancing the existing capacity fulfilling user demand of the European rail system,

increasing the reliability delivering better and consistent quality of service of the European rail system,

reducing the Life Cycle Cost (LCC) increasing competitiveness of the European rail system and European rail supply industry.

IN2RAIL has been organized into three technical sub-projects:

1. Smart Infrastructure: Smart Infrastructure adopts a whole system approach which addresses the fundamental design of critical infrastructure assets – switches and crossings (S&C), and the track system. It will research infrastructure components capable of meeting the demands of future rail transport and will utilise modern development technologies such as rapid prototyping and integrated virtual testing in the process. Risk and condition-based LEAN approaches to optimise RAMS and lifecycle costs in asset maintenance activities will be created to tackle the root causes of degradation and target known problem areas.

2. Intelligent Mobility Management sub-project (I2M): I2M researches advanced traffic management systems that are automated, interoperable and inter-connected; scalable and upgradable. Utilising standardised products and interfaces enables easy migration from legacy systems. The research targets the wealth of available data and transforms it into harmonised, useable information to improve and fully exploit network capacity. Currently the data is distributed over a wide range of information systems of differing standards. A standard ICT environment supporting transport operations with standard interfaces and protocols will be developed, enabling an open, integrated Traffic Management System (TMS). Advances will be made to the state of the art of asset information management systems, adding the capability of ‘nowcasting’ and forecasting of critical asset status.

3. Rail Power Supply and Energy Management: The Rail Power Supply and Energy Management sub-project provides solutions to improve the energy performance of the railway system. The research focuses on new power systems characterised by reduced losses and capable of balancing energy demands, along with innovative energy management systems that enable accurate and precise estimates of energy flows within the railway. This should result in reduced energy consumption and costs, optimised asset management and better use of the railway capacity.

IN2SMART WP7 “DRIMS Open Standard Interfaces” topics are mainly addressed by the I2M sub-project within the following WPs:

IN2RAIL WP8 – Intelligent Mobility Management (I2M) – Integration Layer: WP8 addresses and develops a standardised integrated ICT environment capable of supporting diverse TMS dispatching services and operational systems. WP8 includes standard interfaces to external systems outside TMS/dispatching (for other railway management systems and transport modes) with a plug and play framework for TMS/dispatching applications.

IN2RAIL WP9 – Intelligent Mobility Management (I2M) – ‘Nowcasting’ and Forecasting: WP9 focuses on the design and development of an advanced asset information system with the ability to ‘nowcast’ and forecast network asset status with the associated probabilities. This should allow TMS/dispatching systems to seamlessly access heterogeneous data sources. WP9 bases its work on the findings of WP7 and complements the standardised integrated ICT environment of WP8.

4.2 IN2RAIL DELIVERABLES

The following table lists all deliverables that should be considered for the review of the state-of-the-art [2].

Table 1: IN2RAIL deliverables

| Number | Title | Dissemination Level | Due date (project months) | Delivered |
|---|---|---|---|---|
| D8.1 | Requirements for the Integration Layer | Public | 18 | Y |
| D8.2 | Requirements for Interfaces | Public | 27 | N |
| D8.3 | Description of Integration Layer and Constituents | Public | 36 | N |
| D8.4 | Interface Control Document for Integration Layer Interfaces, external/Web interfaces and Dynamic Demand Service | Public | 36 | N |
| D8.5 | Requirements for the Generic Application Framework | Public | 15 | Y |
| D8.6 | Description of the Generic Application Framework and its constituents | Public | 27 | N |
| D8.7 | Interface Control Document (ICD) for Application-specific Interfaces | Public | 27 | N |
| D8.8 | Integration Test Plan for Application Framework and Constituents | Public | 36 | N |
| D9.1 | Asset status representation | Public | 18 | Y |

The IN2RAIL project has a duration of 36 months, starting on 01/05/2015.

For this reason the delivery date of the IN2SMART deliverable D7.1 corresponds to IN2RAIL M27.

Consequently, not all IN2RAIL deliverables will have produced their final results in time for evaluation within the review of the state-of-the-art.

4.3 D9.1 ASSET STATUS REPRESENTATION [3]

4.3.1 Deliverable content

The “Asset Status Representation” document aims to describe a data representation for the status of assets within the railway infrastructure.

The logical steps followed by the document are:

Identification of the attributes needed to represent the operational status of a set of nine railway assets relevant to the TMS (defined in other work packages within the IN2RAIL project). The considered assets are:

o Switch
o Crossing
o Track (Rail)
o Catenary
o Bridge
o Tunnel
o Embankments
o Line sections
o Level crossing

Each of them has been described by distinguishing its attributes into:

o Static data: related to static characteristics of the asset under examination, with values that never change or change infrequently,
o Dynamic data: with values that change frequently and are related to operational state; they are further classified into Internal, Asset-related, External, Diagnostic and Maintenance.

A review of existing modelling approaches to the problem area, and production of recommendations for modelling of assets as described in the previous step. Different models have been considered for both static and dynamic attributes: railML, railML2, RailTopoModel/railML3, Register of Infrastructure (RINF) model, Infrastructure for Spatial Information in Europe (INSPIRE), Open Geospatial Consortium’s (OGC) Sensor Web Enablement (SWE) framework, Semantic Web for Earth and Environmental Terminology (SWEET), Semantic Sensor Network (SSN).

Production of proof-of-concept examples illustrating the use of the proposed approach for some of the identified assets: level crossing and switch.

4.3.2 Deliverable Conclusions

None of the reviewed existing models was able to adequately represent both static and dynamic attributes on its own.

For this reason a hybrid approach has been proposed:

railML has been proposed to describe static elements

OGC/sensorML has been proposed to describe dynamic elements

The proof-of-concept examples mentioned in the previous section have been produced using the combined railML/sensorML approach.
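The split between static and dynamic attributes can be illustrated with a minimal Python sketch. It is not taken from D9.1 and does not reproduce the railML or sensorML schemas; the class names, the field names and the example values (`SW-001`, `motor_current_a`) are hypothetical. It only shows the idea of keeping rarely changing master data separate from time-stamped observations that are classified into the five dynamic categories named above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class DynamicCategory(Enum):
    """The five classes of dynamic data named in the text."""
    INTERNAL = "Internal"
    ASSET_RELATED = "Asset-related"
    EXTERNAL = "External"
    DIAGNOSTIC = "Diagnostic"
    MAINTENANCE = "Maintenance"


@dataclass(frozen=True)
class StaticAssetData:
    """Rarely changing master data (the part railML is proposed for)."""
    asset_id: str
    asset_type: str
    attributes: dict  # hypothetical attribute names, not railML elements


@dataclass(frozen=True)
class DynamicObservation:
    """One time-stamped value (the part sensorML is proposed for)."""
    asset_id: str
    category: DynamicCategory
    observed_property: str
    value: float
    timestamp: datetime


@dataclass
class AssetStatus:
    """Hybrid view: one static record plus a growing list of observations."""
    static: StaticAssetData
    observations: list = field(default_factory=list)

    def add(self, obs: DynamicObservation) -> None:
        # Observations are linked to the static record via the asset id.
        if obs.asset_id != self.static.asset_id:
            raise ValueError("observation belongs to a different asset")
        self.observations.append(obs)

    def latest(self, observed_property: str):
        """Return the most recent observation of a property, or None."""
        matching = [o for o in self.observations
                    if o.observed_property == observed_property]
        return max(matching, key=lambda o: o.timestamp, default=None)


# Usage with made-up values for a switch
switch = AssetStatus(StaticAssetData("SW-001", "Switch", {"track": "T1"}))
switch.add(DynamicObservation(
    "SW-001", DynamicCategory.DIAGNOSTIC, "motor_current_a", 4.2,
    datetime(2017, 5, 1, 12, 0, tzinfo=timezone.utc)))
switch.add(DynamicObservation(
    "SW-001", DynamicCategory.DIAGNOSTIC, "motor_current_a", 4.8,
    datetime(2017, 5, 1, 12, 5, tzinfo=timezone.utc)))
```

In this sketch the only link between the two parts is the shared asset identifier, which is one plausible way of combining two independent standards without merging their schemas.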

4.4 D8.1 REQUIREMENTS FOR THE INTEGRATION LAYER

4.4.1 Deliverable content

D8.1 summarises the work done in the first part of task 8.1, Integration Layer (IL), to produce a system requirements specification (SRS) for a standardised information exchange layer to be provided to TMS and external systems.

Regarding IL purpose, WP8 focused mainly on:

providing communication based on a standardised data model between railway services, applications, and interface plug-ins communicating to external systems,

providing a standard communication medium between the business applications (i.e. TMS applications) running in the context of the Generic Application Framework (D8.5 scope).

Among others, D8.1 introduces the concepts of:

Canonical Data Model: it contains data types for exchanged data between TMS and external systems and also contains the definitions of relations between them

Information Item: it is a unit of information exchanged within the TMS (i.e. between TMS business applications/services) or between the TMS and the external systems. Information Item has the following properties: it may be structured data, and it is atomic (i.e. irreducible in fields without loss of meaning)

The requirement list has been organized into the following categories:

Communication

Messaging

Topic Tagging

Security and Accounting

Availability and QoS

IT Management

Data Access Patterns

Implementation of IL

Compliance with existing standards

4.4.2 Deliverable Conclusions

The D8.1 output is a requirements list that should be used as a reference document for all following deliverables related to Integration Layer design.

4.5 D8.5 REQUIREMENTS FOR THE GENERIC APPLICATION FRAMEWORK

4.5.1 Deliverable content

D8.5 summarises the work done in the first part of task 8.2, Generic Framework for Application, to produce a system requirements specification (SRS) for a standardized generic application framework allowing plug-and-play of service application modules.

The Generic Application Framework, which is the subject of IN2RAIL task 8.2, comprises TMS core applications managing highly dynamic service-related processes, associated communications and the system services required to enable plug-and-play functionality.

The long-term objective is to provide a standardised integrated ICT environment supporting diverse TMS applications that are connected to other multimodal operational systems.

The standardisation includes the specification of the interfaces to external systems and of plug-and-play mechanisms for the TMS applications inside the Application Framework.

The requirement list has been organized into the following categories:

Communication

Availability, Performance

Data Management system

Security of information system

Requirements on Data model

Safety Integrity Level (SIL) Requirements

Start-up, Shut-down

Synchronization/Time Management

Directory (naming service, identifying services)

Requirements related to API-Type

Requirements related to operating environment

Monitoring/profiling the Applications

Scalability

Scheduling

Transactions

Logging and traceability

Alarm and Events

Workflow, module orchestration

Backwards/version compatibility

Applicable standards

Diagnostics and System Maintenance

Manuals and documentation

4.5.2 Deliverable Conclusions

The D8.5 output is a requirements list that should be used as a reference document for all following deliverables related to Generic Application Framework design.

4.6 ANNEX TO D8.3: DESCRIPTION OF THE CANONICAL DATA MODEL

The requirement analysis of D8.1 and D8.5 identified the concept of Canonical Data Model as one of the patterns to be used within the framework for message data modeling.

A Canonical Data Model defines message formats that are independent from any specific application so that all applications can communicate with each other in this common format.
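The pattern can be sketched in a few lines of Python. This is an illustration of the general Canonical Data Model idea only, not of the IN2RAIL CDM: the `CanonicalMessage` fields, the two application formats (`app_a`, `app_b`) and all values are assumptions made for the example. Each application contributes one decoder into and one encoder out of the common format, so any application can reach any other without a dedicated pairwise translator.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CanonicalMessage:
    """Application-independent message format (hypothetical fields)."""
    asset_id: str
    quantity: str
    value: float
    unit: str


# One decoder (native -> canonical) and one encoder (canonical -> native)
# per application.
DECODERS: Dict[str, Callable[[dict], CanonicalMessage]] = {}
ENCODERS: Dict[str, Callable[[CanonicalMessage], dict]] = {}


def register(app: str, decoder, encoder) -> None:
    DECODERS[app] = decoder
    ENCODERS[app] = encoder


def translate(message: dict, source: str, target: str) -> dict:
    """Convert a native message of `source` into the native format of
    `target` by passing through the canonical form."""
    return ENCODERS[target](DECODERS[source](message))


def translators_needed(n_apps: int) -> tuple:
    """(point-to-point translators, canonical-model translators) for n apps."""
    return n_apps * (n_apps - 1), 2 * n_apps


def _encode_app_a(c: CanonicalMessage) -> dict:
    # Hypothetical application A only understands temperatures in degC.
    if c.quantity != "temperature" or c.unit != "degC":
        raise ValueError("app_a cannot represent this message")
    return {"id": c.asset_id, "temp_c": c.value}


register(
    "app_a",
    lambda m: CanonicalMessage(m["id"], "temperature", float(m["temp_c"]), "degC"),
    _encode_app_a,
)
register(
    "app_b",
    lambda m: CanonicalMessage(
        m["asset"], m["measurement"]["name"],
        float(m["measurement"]["value"]), m["measurement"]["unit"]),
    lambda c: {"asset": c.asset_id,
               "measurement": {"name": c.quantity, "value": c.value, "unit": c.unit}},
)

converted = translate({"id": "SW-001", "temp_c": 21.5}, "app_a", "app_b")
```

With five applications, direct pairwise translation would need 20 directed translators, whereas the canonical approach needs 10 (one decoder and one encoder each), which is the usual argument for this pattern.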

Since no specific deliverable for this important topic has been indicated within the project, a dedicated Annex to deliverable D8.3 “Description of Integration Layer and Constituents” will be prepared.

The annex will describe the data model to be used within IN2RAIL taking into account existing data models and analyzing their characteristics.

The current plan is to base the data model on railML 3. A collaboration with railML.org has been established in order to expand the scope of railML to cover the needs of TMS, in particular real-time data. A major risk is that railML 3 has not yet been released, so the choice to base the CDM on railML 3 may have to be re-evaluated.

4.7 CONCLUSIONS

At the time of writing, the above-mentioned deliverables represent the work related to data management/data exchange that should be available before the due date of IN2SMART D7.1.

IN2RAIL D9.1 should be taken into account for data modeling and data format analysis since it analyses a subset of existing standards and makes an explained choice of the best fitting ones.

The CDM document, previously planned as an Annex to D8.6 and to be refined as an Annex to D8.3, will not be delivered with D8.6; it will therefore not be available before the due date of this deliverable. Nevertheless, the ongoing work on the Canonical Data Model document within IN2RAIL WP8 could be taken into account as an emerging data model.

The D9.1 approach is to provide a subset of relevant assets and generate some examples of how they could be mapped into selected standards. The Annex will concentrate on providing a generic data model.

IN2RAIL D8.1 and D8.5, the requirements for the Integration Layer and the Generic Application Framework, are less oriented towards data standardization, but they provide requirements, e.g. for communication patterns, that should be considered when analysing existing communication protocols to be used for data exchange.

All considerations and conclusions of IN2RAIL are focused on a TMS application. This must be kept in mind, since the scope of IN2SMART is Intelligent Asset Management, for which the requirements may differ.

5 ONLINE SURVEY

5.1 QUESTIONNAIRE

From March 20, 2017 to May 2, 2017, experts on Open Data Exchange formats were invited to complete an online questionnaire on these formats. The purpose of the questionnaire was to collect information

about the use of Open Data Exchange formats in the participating companies or institutions,

about the extent of their participation in the respective Open Data Exchange format communities,

about their general mindset and attitude towards Open Data Exchange formats, and

about other aspects of formats for Open Data Exchange, e.g. how generic or specialised they are perceived to be by the users.

More details can be found in Appendix A.

5.2 FEEDBACK

A summary of the obtained feedback follows; details can be found in Appendix A.

5.2.1 Participants’ domains of service

As shown in Figure 2, most participants came from the Railway, Automotive, and Traffic Management domains (each accounting for about the same fraction of the sample).

Figure 2: Domains of the participating companies or institutions (values shown in the chart: Railway 8; Automotive 7; Traffic management 7; Industry automation 3; Tunnel 1; Building monitoring 1; Energy 1)
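The statement that the three largest domains account for about the same fraction of the sample can be checked with a short calculation. The sketch below assumes that the values shown in Figure 2 are counts of participating companies or institutions and that each participant is counted in exactly one domain; the function name `domain_shares` is chosen for illustration only.

```python
# Values as shown in Figure 2 (assumed to be counts per domain)
counts = {
    "Railway": 8,
    "Automotive": 7,
    "Traffic management": 7,
    "Industry automation": 3,
    "Tunnel": 1,
    "Building monitoring": 1,
    "Energy": 1,
}


def domain_shares(domain_counts: dict) -> dict:
    """Return each domain's fraction of the total sample."""
    total = sum(domain_counts.values())
    return {domain: n / total for domain, n in domain_counts.items()}


shares = domain_shares(counts)
for domain, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{domain:20s} {share:6.1%}")
```

Under these assumptions the sample size is 28, with Railway at roughly 29% and Automotive and Traffic management at 25% each, so the three domains together make up about 79% of the sample.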

5.2.2 Extent of use and use cases of Open Data Exchange formats

Railway formats:

Table 2 shows the extent of use of 19 well-known railway formats, given as frequencies of answers in the respective categories (the darker the green cell colour, the more frequently the respective category has been chosen as answer). IDMVU, LandXML, and the rail track database of ÖBB, an Austrian mobility services provider (“ÖBB Gleisdatenbank”), are all formats used within the context of the traffic planning software PROVI. RTM (RailTopoModel), railML v 3 and RINF are all being tested in pilots, whereas TAF TSI is already in use for Train Composition. Figure 23 in Appendix 10.2.1 depicts a tag cloud (generated with a tool such as Wordle [4]) of use cases/comments for railway formats given by the participants. According to Wikipedia, a tag cloud is “[…] a visual representation of text data […]. The importance of each tag is shown with font size or colour. This format is useful for quickly perceiving the most prominent terms and for locating a term alphabetically to determine its relative prominence”. Table 13 in Appendix 10.2.1 gives their answers in full detail.

Maintenance formats:

According to the feedback, the alarm management standards in industrial asset management, EN 62682:2015 and EEMUA 191, while quite new for rail markets, are well established in plant, manufacturing and materials processing markets. MIMOSA OSA-CBM is used for rail remote condition monitoring of infrastructure assets, for data acquisition, manipulation and state detection. Newer uses are in preparation for health assessment and for prognostic assessment. SensorML is currently being tested in pilots.

Table 2: Extent of use of railway formats (frequencies of answers)

| Format | of interest in future | considered, not used | in preparation | in operation |
|---|---|---|---|---|
| RINF | 1 | 2 | 2 | 2 |
| TAF TSI | 3 | 2 | 0 | 2 |
| UIC 407-1 | 0 | 0 | 1 | 1 |
| LandXML | 0 | 0 | 0 | 1 |
| CIVIL3D | 0 | 0 | 0 | 1 |
| csv | 0 | 0 | 0 | 1 |
| Signalling Data Exchange Format (SDEF) | 0 | 0 | 0 | 1 |
| DATEX II | 0 | 0 | 0 | 1 |
| OSLC | 0 | 0 | 0 | 1 |
| FMI | 0 | 0 | 0 | 1 |
| OPC-UA | 0 | 0 | 0 | 1 |
| MQTT | 0 | 0 | 0 | 1 |
| ICE870-5 | 0 | 0 | 0 | 1 |
| RTM / railML v 3 | 3 | 2 | 7 | 0 |
| TAP TSI | 2 | 1 | 3 | 0 |
| ÖBB Gleisdatenbank | 0 | 1 | 0 | 0 |
| FRAME | 0 | 1 | 0 | 0 |
| OpenScenario | 0 | 0 | 1 | 0 |
| OpenSimulation | 0 | 1 | 0 | 0 |

Table 14 in Appendix 10.2.2 shows the extent of use of 8 well-known maintenance formats, given as frequencies of answers in the respective categories (the darker the green cell colour, the more frequently the respective category has been chosen as answer). Figure 24 in Appendix 10.2.2 depicts a tag cloud of use cases/comments for maintenance formats given by the participants. Table 15 in Appendix 10.2.2 gives their answers in full detail.

Other formats:

According to the feedback, the formats ASCII-GRID, Simple Feature Access, GeoPackage, Esri Shape and RoadXML are used for Noise Maps and Road Noise Maps; LandXML, CityGML, OSM, and Esri Shape are used in conjunction with software for traffic or infrastructure planning / Building Information Modeling (BIM) such as PROVI, InfraWorks, and Civil 3D; formats such as UML, GML, CityGML, and Simple Features OLE/COM (OpenGIS) play a major role in tasks like import, export and modelling; OSM and OpenWeather-Maps are applied in Web services; GeoTIFF and GML in JPEG 2000 are used for overlays. Moreover, SNMP is used for Standard Server Monitoring and Product Monitoring. Table 16 in Appendix 10.2.3 shows the extent of use of 96 other (miscellaneous) formats, given as frequencies of answers in the respective categories (the darker the green cell colour, the more frequently the respective category has been chosen as answer). Figure 25 in Appendix 10.2.3 depicts a tag cloud of use cases/comments for the other formats given by the participants. Table 17 in Appendix 10.2.3 gives their answers in full detail.

5.2.3 How generic/specialised are Open Data Exchange formats?

Railway formats:

Figure 3 shows how generic/specialised 14 prominent railway formats were perceived, averaged over the participating experts. According to the obtained feedback, the ÖBB Gleisdatenbank, the comma-separated values (CSV) file format in its special use case for time tables, and the RTM / railML v 3 formats are perceived as the most specialised formats.

Maintenance formats:

According to the obtained feedback, the NR-L2-SIG-30036-Issue1 and the comma-separated values (CSV) file format in its special format/use case for time tables are perceived as the most specialised formats. Figure 26 in Appendix 10.3.2 shows how generic/specialised 8 well-known maintenance formats were perceived on average over the participating experts.

Other formats:

OpenLR, OpenSCENARIO, TMS, TPEG, WKT CRS, Coordinate Transformation, KNXbus, RoadXML, ONVIF, OpenWeather-Maps, BACnet, RDS, TMC, DATEX2, and ASCII-GRID have been named as the top 15 most specialised miscellaneous formats. Figure 27 and Figure 28 in Appendix 10.3.3 show the top 15 generic/specialised other formats, averaged over the participating experts.

Figure 3: How generic/specialised are railway formats on average?

5.2.4 “Best” example of an Open Data Exchange format suitable for one of several sources of information

Figure 4 depicts a tag cloud for best formats and the respective criteria given by the answers of the participants. Table 18 in Appendix 10.4 gives their answers in full detail.

In most of the answers, railML has been named as the best format for modelling Rail infrastructure and Rolling-stock, and also for Rail operation plans. railML is also the format most frequently named as best overall, preferred because it is an open format and because of its large user base.

[Figure 3 chart content: “How generic/specialised are railway formats on average?”, rated on a scale from 1 to 5 (0: not answered), for the formats UIC 407-1, IDMVU, OJP, LandXML, Civil 3D, SDEF - Network Rail, TAF TSI, IP-KOM-ÖV, EULYNX, railML, RINF, TAP TSI, RTM / railML v 3, csv, and ÖBB Gleisdatenbank]

For asset condition, alarms and events, and asset maintenance, the majority of answers gave SensorML as the best format, e.g. because of a good user experience. For alarms and events, OPC-UA or OPC-AE was named as best format just as often as SensorML.

Among other formats like SysML and BPMN, BPML has been named as best format for a business process notation model because it is generic, widely used, and well known.

Figure 4: Tag cloud of "best" formats and the respective criteria

5.2.5 The extent to which a participant’s company or institution participates in Open Data Exchange initiatives and communities

Figure 5 shows the extent to which a participant’s company or institution participates in Open Data Exchange initiatives and communities. According to the feedback, a significant part of the companies or institutions of the participating experts were contributors and/or active developers in OpenDRIVE, OpenLR, railML.org, the RailTopoModel Expert Group, OpenSCENARIO, OpenCRG, SysML, OSLC, and in Road2Simulation.

5.2.6 Optional mindset questions: Open Data Exchange policy and attitude towards Open Data

The feedback for a set of optional general mindset questions is shown in Figure 30 and Figure 31 in Appendix 10.5. Answers to this optional set of questions were given for half of the participating companies or institutions. According to the obtained feedback, all companies or institutions are willing to contribute to Open Data Exchange initiatives/portals. Two out of three companies or institutions define Open Data strategies, and 40% of the companies or institutions even own an Open Data portal.

Figure 5: Extent of participation in Open Data Exchange initiatives and communities

5.2.7 Strengths and weaknesses of Open Data Exchange formats

General weaknesses have been seen in possible misinterpretations, in the potential confusion arising from the fact that there are too many formats and too many solutions in total, and finally in the relatively high risk of misuse of data with universal formats. A strength of railML and RailTopoModel is the fact that they are defined with the involvement of the main European railway actors, and that they are standardised open formats. On the other hand, railML has been criticised because it is not yet completed for all railway assets, because it is “uglily” (/nasty) hierarchical and huge, and because the tools interpret the data differently (i.e. only a subset of the format is implemented and not all data format versions are supported). Figure 32 in Appendix 10.6 depicts a tag cloud for strengths and weaknesses of Open Data Exchange formats as given by the participants. Table 19 in Appendix 10.6 gives their answers in full detail.

[Figure 5 chart content: “Extent of participation in Open Data Exchange initiatives and communities”; axis: frequency of answer (0 to 8); answer categories: No activities, Following/applying, Contributing/developing]

5.2.8 Licensing and/or legal issues hampering application of Open Data Exchange formats

Issues include unclear adoption policies (railML), a confusing tangle of different uses, programs and policies (SHP), the obstacle of horrendous costs for joining the consortium prior to access (NDS), and the general problem that data from business projects usually is not royalty-free and therefore cannot be provided as Open Data. Figure 33 in Appendix 10.7 depicts a tag cloud for licensing and/or legal issues hampering application of Open Data Exchange formats as named and explained by the participants. Table 20 in Appendix gives their answers in full detail.

5.3 CONCLUSIONS

Regarding use cases of Open Data Exchange formats, RTM (RailTopoModel), railML v 3 and RINF are all being tested in pilots, whereas TAF TSI is already in use for Train Composition. EN 62682:2015 and EEMUA 191, while quite new for rail markets, are well established in plant, manufacturing and materials processing markets. MIMOSA OSA-CBM is used for rail remote condition monitoring of infrastructure assets, for data acquisition, manipulation and state detection. SensorML is currently being tested in pilots.

Regarding the question of how specialised or generic experts perceive Open Data Exchange formats to be, the formats perceived as most specialised are RTM / railML v 3, NR-L2-SIG-30036-Issue1, OpenLR, OpenSCENARIO, TMS, TPEG, WKT CRS, Coordinate Transformation, KNXbus, RoadXML, ONVIF, OpenWeather-Maps, BACnet, RDS, TMC, DATEX2, and ASCII-GRID.

Concerning the question for the best formats for various sources of information, in most of the answers, railML has been named as the best format for modelling Rail infrastructure and Rolling-stock, and also for Rail operation plans, preferred because it is an open format, and due to its large user base. For asset condition, alarms and events, and asset maintenance, the majority of answers gave SensorML as the best format, e.g. because of a good user experience.

Regarding participation in Open Data Exchange initiatives/communities and regarding optional mindset questions, a significant part of the companies or institutions of the participating experts were contributors and/or active developers in OpenDRIVE, OpenLR, railML.org, the RailTopoModel Expert Group, OpenSCENARIO, OpenCRG, SysML, OSLC, and in Road2Simulation. According to the obtained feedback, all companies or institutions are willing to contribute to Open Data Exchange initiatives/portals. Two out of three companies or institutions define Open Data strategies, and 40% of the companies or institutions even owned an Open Data portal.

As for strengths and weaknesses of Open Data Exchange formats, general weaknesses have been seen in possible misinterpretations, in the potential confusion arising from the fact that there are too many formats and too many solutions in total, and finally in the relatively high risk of misuse of data with universal formats. A strength of railML and RailTopoModel is the fact that they are defined involving the main European railway actors, and that they are standardised open formats. On the other hand, railML has been criticised because it is not yet completed for all railway assets, and because it is “uglily” (/nasty) hierarchical and huge. In this regard it is worth noting that in the planned version 3 of railML the structure will already be simpler and "flatter", and thus less hierarchical.

Finally, concerning licensing and/or legal issues hampering application of Open Data Exchange formats, issues include unclear adoption policies (railML), a confusing tangle of different uses, programs and policies (SHP), and the obstacle of horrendous costs for joining the consortium prior to access (NDS). A general problem is that data from business projects usually is not royalty-free and therefore cannot be provided as Open Data.

6 OPEN DATA EXCHANGE: TECHNOLOGIES

The survey of the partners described in section 5 gives a first sketch of, and valuable hints about, the state of the art in open data exchange. As a first step into the details, section 6 describes the commonly used relevant technologies, ranging from basic file formats through modeling languages and tools, communication protocols and Web Services to special in-memory data grid technologies. Section 7 then covers the use in application areas, where some further formats are described as well, and section 8 covers more domain-specific technologies, applications and relevant communities.

6.1 FILES

6.1.1 General formats

6.1.1.1 JSON

JSON (JavaScript Object Notation) is a lightweight data-interchange format that is very easy for any programming language to read. Because of its simplicity and its lightweight but strict structure, it is generally easier for computers to process than XML (and, of course, than proprietary formats). It is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON a highly suitable data-interchange language for many applications (see [5]). JSON is built on two structures:

a) A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

b) An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures. In JSON, they take on these forms:

An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).

Figure 6: Object representation in JSON.

An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).

Figure 7: Array representation in JSON.

A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.

Figure 8: Value representation in JSON.

A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. A character is represented as a single character string. A string is very much like a C or Java string.

Figure 9: String representation in JSON.

A number is very much like a C or Java number, except that the octal and hexadecimal formats are not used.

Figure 10: Number representation in JSON.

Whitespace can be inserted between any pair of tokens. Excepting a few encoding details, that completely describes the language.
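The structures described above can be illustrated with a minimal sketch using Python's standard `json` module. The asset record and all of its field names are hypothetical and serve only to show one value of each kind (string, number, true/false, null, array, nested object).

```python
import json

# Hypothetical asset record; the field names are illustrative assumptions.
record = {
    "assetId": "SW-0042",                     # string
    "positionKm": 12.75,                      # number
    "inService": True,                        # true / false
    "lastFault": None,                        # null
    "sensors": ["temperature", "current"],    # array (ordered list of values)
    "location": {"line": "L1", "track": 2},   # nested object (name/value pairs)
}

# Generate JSON text from the in-memory structure.
text = json.dumps(record, indent=2, sort_keys=True)

# Parse the text back: objects become dicts, arrays become lists.
parsed = json.loads(text)

print(text)
print(parsed == record)
```

The round trip works without a schema because the format carries its own structure; the meaning of each name still has to be documented separately.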

6.1.1.2 XML

The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more. It was derived from an older standard format called SGML (ISO 8879), in order to be more suitable for Web use. XML is a widely used format for data exchange because it gives good opportunities to keep the structure in the data and the way files are built on, and allows developers to write parts of the documentation inside data files without interfering with the reading of them ([6]).
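As a minimal sketch of the point about documentation inside data files, the following example parses a small XML document that contains a comment. The element and attribute names are hypothetical. It uses Python's standard `xml.etree.ElementTree`, whose default parser skips comments, so the documentation does not interfere with reading the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical track description; element and attribute names are assumptions.
document = """<?xml version="1.0"?>
<track id="T1">
  <!-- lengths are given in metres (documentation inside the data file) -->
  <section name="A" length="350"/>
  <section name="B" length="420"/>
</track>"""

root = ET.fromstring(document)

# The comment is ignored by the default parser; only elements are visited.
lengths = {s.get("name"): int(s.get("length")) for s in root.iter("section")}
total_length = sum(lengths.values())

print(root.get("id"), lengths, total_length)
```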

6.1.1.3 RDF

RDF, a format recommended by the W3C, makes it possible to represent data in a form that makes it easier to combine data from multiple sources. RDF is a framework for describing resources on the web and is part of the W3C's Semantic Web Activity. It is designed to be read and understood by computers, not to be displayed to people. RDF data can be stored in XML and JSON, among other serializations. RDF encourages the use of URLs as identifiers, which provides a convenient way to directly interconnect existing open data initiatives on the Web. RDF is still not widespread, but it has been a trend among Open Government initiatives, including the British and Spanish Government Linked Open Data projects. The inventor of the Web, Tim Berners-Lee, has recently proposed a five-star scheme that includes linked RDF data as a goal to be sought for open data initiatives ([7]).
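The idea of combining sources through shared URL identifiers can be sketched in plain Python, without any RDF library. Statements are modelled here as (subject, predicate, object) tuples; the URLs under example.org and the vocabulary terms are purely hypothetical.

```python
# Two hypothetical data sources describing the same resource.
# Each statement is a (subject, predicate, object) triple.
STATION = "http://example.org/station/1"

source_a = {
    (STATION, "http://example.org/vocab/name", "Example Station"),
}
source_b = {
    (STATION, "http://example.org/vocab/tracks", "8"),
}

# Because both sources use the same URL for the station,
# combining them is a plain set union.
merged = source_a | source_b


def describe(graph, subject):
    """Collect all predicate/object pairs known about one subject."""
    return {p: o for s, p, o in graph if s == subject}


description = describe(merged, STATION)
print(description)
```

This only illustrates the data model; a real deployment would use an RDF toolkit and one of the standard serializations.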

6.1.1.4 SPREADSHEETS

Many authorities keep information in spreadsheets, for example in Microsoft Excel. This data can often be used immediately, provided that there are correct descriptions of what the different columns mean. However, in some cases spreadsheets contain macros and formulas, which may be somewhat more cumbersome to handle. It is therefore advisable to document such calculations next to the spreadsheet, since this is generally more accessible for users to read ([8]).

6.1.1.5 COMMA SEPARATED VALUES

CSV is a compact format and can thus be very useful for transferring large sets of data with the same structure. However, the format is so spartan that data are often useless without documentation since it can be almost impossible to guess the significance of the different columns. It is therefore particularly important for the comma-separated formats that documentation of the individual fields is accurate. Furthermore, it is essential that the structure of the file is respected, as a single omission of a field may disturb the reading of all remaining data in the file without any real opportunity to rectify it, because it cannot be determined how the remaining data should be interpreted ([9]).
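The two risks named above (undocumented columns and omitted fields) can be made concrete with a short sketch based on Python's standard `csv` module. The column names, their descriptions and the sample rows are hypothetical; the reader checks the header against the documented fields and reports rows with a wrong field count instead of silently misreading them.

```python
import csv
import io

# Hypothetical field documentation that would accompany the file.
FIELDS = {
    "asset_id": "identifier of the asset",
    "timestamp": "measurement time, ISO 8601, UTC",
    "temperature_c": "temperature in degrees Celsius",
}

# Sample data; the last row is missing one field on purpose.
data = (
    "asset_id,timestamp,temperature_c\n"
    "SW-0042,2017-07-18T10:00:00Z,21.5\n"
    "SW-0043,2017-07-18T10:00:00Z\n"
)


def read_rows(text):
    """Return (valid rows as dicts, line numbers of malformed rows)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != list(FIELDS):
        raise ValueError(f"unexpected header: {header}")
    good, bad = [], []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            bad.append(line_no)   # structure violated: do not guess
            continue
        good.append(dict(zip(header, row)))
    return good, bad


good_rows, bad_lines = read_rows(data)
print(good_rows, bad_lines)
```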

6.1.1.6 TEXT DOCUMENT

Classic documents in formats like Word, ODF, OOXML, or PDF may be sufficient to show certain kinds of data - for example, relatively stable mailing lists or equivalent. It may be cheap to exhibit data in these formats, as this is often the format the data is born in. The format gives no support to keep the structure consistent, which often means that it is difficult to enter data by automated means. Be sure to use templates as the basis of documents that will display data for re-use, so that it is at least possible to pull information out of documents. It can also support the further use of data to use typographic markup as much as possible, so that it becomes easier for a machine to distinguish headings (of any specified type) from the content and so on. Generally it is recommended not to exhibit data in word processing format if the data exists in a different format ([9]).

6.1.1.7 PLAIN TEXT DOCUMENTS (.TXT)

These are very easy for computers to read. However, they generally exclude structural metadata from inside the document, meaning that developers will need to create a parser that can interpret each document as it appears. Some problems can be caused by switching plain text files between operating systems: MS Windows marks the end of a line with a carriage return followed by a line feed, whereas Mac OS X and other Unix variants use a single line feed ([9]).
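A minimal sketch of how a reader can be made independent of the line-ending convention: the hypothetical helper `read_lines` decodes the raw bytes and relies on Python's `str.splitlines`, which treats `\r\n`, `\n` and `\r` (among other line boundaries) as line ends. The sample content is invented for illustration.

```python
def read_lines(raw: bytes, encoding: str = "utf-8") -> list:
    """Decode raw file content and split it into lines,
    regardless of which line-ending convention was used."""
    return raw.decode(encoding).splitlines()


# The same two-line content as written on different systems.
samples = {
    "windows": b"km;speed\r\n12.5;80\r\n",   # CR LF
    "unix": b"km;speed\n12.5;80\n",          # LF
}

results = {name: read_lines(raw) for name, raw in samples.items()}
print(results)
```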

6.1.1.8 HTML

Nowadays much data is available in HTML format on various sites. This may well be sufficient if the data is very stable and limited in scope. In some cases, it could be preferable to have data in a form that is easier to download and manipulate, but as it is cheap and easy to refer to a page on a website, it might be a good starting point in the display of data. Typically, it would be most appropriate to use tables in HTML documents to hold data, and then it is important that the various data fields are displayed and are given IDs which make it easy to find and manipulate data ([7]).
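To show why IDs on table cells help, here is a small sketch using Python's standard `html.parser`. The page fragment, the IDs and the class name `CellCollector` are hypothetical; the parser simply collects the text of every `td` element that carries an `id`.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of all <td> cells that carry an id attribute."""

    def __init__(self):
        super().__init__()
        self.cells = {}
        self._current_id = None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._current_id = dict(attrs).get("id")
            if self._current_id is not None:
                self.cells[self._current_id] = ""

    def handle_data(self, data):
        # Only text inside an identified cell is kept.
        if self._current_id is not None:
            self.cells[self._current_id] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._current_id = None


# Hypothetical page fragment with identified data fields.
page = """<table>
<tr><td id="station-name">Example Station</td><td id="station-tracks">8</td></tr>
</table>"""

collector = CellCollector()
collector.feed(page)
print(collector.cells)
```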

6.1.1.9 SCANNED IMAGE

This is probably the least suitable form for most data, but TIFF, JPEG-2000 and PNG all at least allow images to be marked with documentation of what is in the picture - right up to marking up an image of a document with the full text content of the document. Displaying data as images may be relevant for data that were not born electronically - an obvious example is old church records and other archival material - and a picture is better than nothing ([9]).

6.1.1.10 PROPRIETARY FORMATS

Some dedicated systems, etc. have their own data formats that they can save or export data in. It can sometimes be enough to expose data in such a format - especially if it is expected that further use would be in a similar system as that which they come from. Where further information on these proprietary formats can be found should always be indicated, for example by providing a link to the supplier’s website. Generally it is recommended to display data in non-proprietary formats where feasible ([9]).

6.1.2 Specific formats

6.1.2.1 HDF5

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data. HDF5 is portable and is extensible, allowing applications to evolve in their use of HDF5. The HDF5 Technology suite includes tools and applications for managing, manipulating, viewing, and analysing data in the HDF5 format. The HDF5 technology suite includes (see [10]):

A versatile data model that can represent very complex data objects and a wide variety of metadata.

A completely portable file format with no limit on the number or size of data objects in the collection.

A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces.

A rich set of integrated performance features that allow for access time and storage space optimizations.

Tools and applications for managing, manipulating, viewing, and analysing the data in the collection.

6.1.2.2 NETCDF

NetCDF is an abstraction that supports a view of data as a collection of self-describing, portable objects that can be accessed through a simple interface. Array values may be accessed directly, without knowing details of how the data are stored. Auxiliary information about the data, such as what units are used, may be stored with the data. Generic utilities and application programs can access netCDF datasets and transform, combine, analyse, or display specified fields of the data. The development of such applications has led to improved accessibility of data and improved re-usability of software for array-oriented data management, analysis, and display (see [11]).

The netCDF software implements an abstract data type, which means that all operations to access and manipulate data in a netCDF dataset must use only the set of functions provided by the interface. The representation of the data is hidden from applications that use the interface, so that how the data are stored could be changed without affecting existing programs. The physical representation of netCDF data is designed to be independent of the computer on which the data were written.

6.1.2.3 JUPITER TESSELLATION (JT)

JT (Jupiter Tessellation) is an ISO-standardized 3D data format that is used in industry for product visualization, collaboration, CAD data exchange, and in some cases also for long-term data retention. It can contain any combination of approximate (faceted) data, boundary representation surfaces (NURBS), Product and Manufacturing Information (PMI), and Metadata (textual attributes) either exported from the native CAD system or inserted by a product data management (PDM) system. ([15])

6.1.2.4 OASIS OSLC LIFECYCLE INTEGRATION CORE (OSLC CORE) TC

The OSLC (Open Services for Lifecycle Collaboration) initiative supports integration between a heterogeneous set of products and components from various sources using an architecture that is minimalist, loosely coupled, and standardized. OSLC applies World Wide Web and Linked Data principles, such as those defined in the W3C Linked Data Platform (LDP), to create a cohesive set of specifications that can enable products, services and other distributed network resources to interoperate successfully. The OSLC Core TC is responsible for specifications that expand W3C LDP concepts, as needed, to enable integration. ([16])

6.1.2.5 REQUIREMENTS INTERCHANGE FORMAT (RIF/REQIF)

RIF/ReqIF (Requirements Interchange Format) is an XML file format that can be used to exchange requirements, along with their associated metadata, between software tools from different vendors. The requirements exchange format also defines a workflow for transmitting the status of requirements between partners. ([17])

6.1.2.6 OPEN DATA PROTOCOL (ODATA)

OData (Open Data Protocol) is an ISO/IEC approved, OASIS standard that defines a set of best practices for building and consuming RESTful APIs. OData helps you focus on your business logic while building RESTful APIs without having to worry about the various approaches to define request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats, query options, etc. ([18])
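The URL conventions and query options mentioned above can be sketched as follows. The service root, the entity set `Assets` and its properties are hypothetical; `$select`, `$filter` and `$top` are standard OData system query options. The helper `build_query` is an illustrative assumption, not part of any OData library.

```python
from urllib.parse import quote, urlencode

SERVICE_ROOT = "https://example.org/odata"  # hypothetical service root


def build_query(entity_set, select=None, filter_expr=None, top=None):
    """Build a request URL for an entity set using OData query options."""
    options = {}
    if select:
        options["$select"] = ",".join(select)
    if filter_expr:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = str(top)
    url = f"{SERVICE_ROOT}/{entity_set}"
    if options:
        # Keep "$" and "," readable; percent-encode everything else.
        url += "?" + urlencode(options, quote_via=quote, safe="$,")
    return url


url = build_query(
    "Assets",
    select=["Id", "Name"],
    filter_expr="Status eq 'open'",
    top=5,
)
print(url)
```

A client would send an HTTP GET to this URL and would typically receive a JSON payload in return.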

6.2 MODELING LANGUAGES AND TOOLS

6.2.1.1 BUSINESS PROCESS MODELING LANGUAGE (BPML)

Business Process Modeling Language (BPML) is an XML-based language for business process modeling. It was maintained by the Business Process Management Initiative (BPMI) until June 2005 when BPMI and OMG (Object Management Group) announced the merger of their respective Business Process Management (BPM) activities to form the Business Modeling and Integration Domain Task Force (BMI DTF). ([12])

6.2.1.2 CODE OF PLM OPENNESS (CPO)

The Code of PLM Openness (CPO) is a worldwide unique approach and runs under the patronage of the German Federal Ministry for Economic Affairs and Energy (BMWi). CPO is a prostep ivip initiative, for establishing a common understanding on openness of IT systems in the context of PLM between IT customers, IT vendors and IT service providers. Thereby, the CPO goes far beyond the requirement to provide IT standards and related interfaces. It defines measurable criteria (‘shall’, ‘should’, ‘may’) for the following categories: interoperability, infrastructure, extensibility, interfaces, standards, architecture as well as partnership. ([13])

6.2.1.3 SYSTEMS MODELING LANGUAGE (SYSML)

The Systems Modeling Language (SysML) is a general-purpose modeling language for systems engineering applications. It supports the specification, analysis, design, verification and validation of a broad range of systems and systems-of-systems. ([19])

6.2.1.4 UNIFIED MODELING LANGUAGE (UML)

The Unified Modeling Language (UML) is a general-purpose, developmental, modeling language in the field of software engineering that is intended to provide a standard way to visualize the design of a system.

The OMG's Unified Modeling Language™ (UML®) helps to specify, visualize, and document models of software systems, including their structure and design. (UML can be used for business modeling and modeling of other non-software systems too.) Using any one of the large number of UML-based tools on the market, the requirements of a future application can be analysed and a solution that meets them can be designed, with the results represented using UML 2.0's thirteen standard diagram types.

Models of any type of application, running on any type and combination of hardware, operating system, programming language, and network, can be built in UML. Its flexibility enables modelling of distributed applications that use just about any middleware on the market. Built upon fundamental OO concepts including class and operation, it is a natural fit for object-oriented languages and environments such as C++, Java, and the recent C#, but it can be used to model non-OO applications as well in, for example, Fortran, VB, or COBOL. UML Profiles (that is, subsets of UML tailored for specific purposes) help to model Transactional, Real-time, and Fault-Tolerant systems in a natural way. ([20], [21])

6.2.1.5 GOOGLE WEB TOOLKIT (GWT)

Google Web Toolkit (GWT) or GWT Web Toolkit, is an open source set of tools that allows web developers to create and maintain complex JavaScript front-end applications in Java. Other than a few native libraries, everything is Java source that can be built on any supported platform with the included GWT Ant build files. It is licensed under the Apache License version 2.0. ([14])

6.3 COMMUNICATION PROTOCOLS

A communication protocol is a system of rules in telecommunications that allows two or more entities of a communications system (e.g. M2M, Machine to Machine) to transmit information via any kind of variation of a physical quantity. These rules or standards define the syntax, semantics and synchronization of communication and possible error recovery methods. In addition, data definitions have to be provided in terms of data types, structure and semantics. Protocols may be implemented by hardware, software, or a combination of both.

6.3.1 OPC UA

6.3.1.1 INTRODUCTION

OPC UA answers the increasing need for interoperability and communication in Industry 4.0.

A state-of-the-art interface definition has to provide important features beyond transmitting data, including means for standardized definition of data and functions, standardized transmission, security, availability, and others. The following sections are based on contributions of ascolab GmbH ([22]).

With the new Unified Architecture (UA), the OPC Foundation ([23]) addresses today's and future requirements of industrial communication. Based on the functionality of all previous OPC Specifications (DA, A+E, HDA, Commands, Complex Data), the newly defined standard is completely realized using a service-oriented architecture (SOA). This new approach is platform independent, scalable and high-performance. Its use in small devices of process and measurement technology with their specialized operating systems is just as possible as its use in enterprise applications on Unix/Linux machines or mainframes ([22]).

Figure 11: OPC UA Concepts

6.3.1.2 OPC UA APPROACH

Definition ([22]):

OPC Unified Architecture, or OPC UA for short, is a TCP/IP based communication technology developed by the OPC Foundation to allow a manufacturer independent exchange of information in the field of industrial automation. OPC UA is also referred to as a machine to machine (M2M) communication protocol. Due to its generic information model, OPC UA has been adapted to other sectors as well, e.g. building automation, power generation and distribution, oil and gas exploration.

Data Model ([22]):

The OPC Information Model is not just a hierarchy based on folders, items and properties anymore, but a so-called Full Mesh Network based on Nodes instead. This network of Nodes can additionally transmit all varieties of meta information and diagnostic data. The closest image of a node would be an object, known from object-oriented programming (OOP). It can own attributes for read access (Data Access (DA), Historical Data Access (HDA)), methods which can be called (Commands), and triggered events which can be fired (AE, DA DataChange) to exchange certain information between devices. An Event contains among other things a time of notification, a message and a severity. Nodes are used for process data as well as for all other types of meta data. The newly modelled OPC namespace now contains the Type Model used to describe all possible data types as well.
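The analogy between a Node and an object can be sketched in plain Python. This is a conceptual illustration only and does not use or reproduce any OPC UA library API; the class names, the node names and the severity value are hypothetical. A node owns readable attributes, callable methods and references to other nodes, and it can fire events that carry a time, a message and a severity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    time: datetime      # time of notification
    message: str
    severity: int


@dataclass
class Node:
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    methods: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    references: List["Node"] = field(default_factory=list)   # mesh, not a tree
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)

    def read(self, attribute: str) -> Any:
        return self.attributes[attribute]

    def call(self, method: str, *args: Any) -> Any:
        return self.methods[method](*args)

    def fire(self, message: str, severity: int) -> Event:
        event = Event(datetime.now(timezone.utc), message, severity)
        for notify in self.subscribers:
            notify(event)
        return event


# Hypothetical example: a point heater referenced by a switch node.
heater = Node("PointHeater", attributes={"Temperature": 4.5})
heater.methods["SwitchOn"] = lambda: "on"
switch = Node("Switch42", references=[heater])

received = []
heater.subscribers.append(received.append)
heater.fire("Temperature below threshold", severity=500)

print(heater.read("Temperature"), heater.call("SwitchOn"), received[0].message)
```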

Transport ([22]):

The transport layer transforms these methods into a protocol, which means it serializes/deserializes the data and transmits it over the network. Currently there are two TCP/IP based protocols specified for this purpose. One is a binary TCP protocol optimized for high performance, and the second is a web service based protocol. The binary protocol is mandatory and is supported by all UA stacks. In addition, there is a combination of both protocols, the so-called hybrid protocol. Here, a binary encoded message that is itself unencrypted is sent using an encrypted channel (HTTPS). Additional protocols are possible and may be added when necessary.

OPC UA Implementation

The OPC Foundation provides the communication stack for its members. Developers of OPC UA products can choose between three implementations: C, .NET, or Java. All stacks provide the same functionality and, within the limits of the programming languages, the APIs can be applied similarly. The OPC Foundation maintains these implementations and integrates innovations if necessary. For members, the source code is available as well. All three implementations are tested against each other to ensure compatibility of protocol implementations.

6.3.1.3 STANDARDS

A set of released standards covers all main aspects. All standards referred to are currently valid.

Table 3: OPC UA Standards

Document ID Issued Title

IEC/TR 62541-1 2016-10-01 OPC Unified architecture - Part 1: Overview and concepts

IEC/TR 62541-2 2016-10-01 OPC Unified architecture - Part 2: Security Model

IEC 62541-3 2015-03-01 OPC Unified Architecture - Part 3: Address Space Model

IEC 62541-4 2015-03-01 OPC Unified Architecture - Part 4: Services

IEC 62541-5 2015-03-01 OPC Unified Architecture - Part 5: Information Model

IEC 62541-6 2015-03-01 OPC Unified Architecture - Part 6: Mappings

IEC 62541-7 2015-03-01 OPC Unified Architecture - Part 7: Profiles

IEC 62541-8 2015-03-01 OPC Unified Architecture - Part 8: Data Access

IEC 62541-9 2015-03-01 OPC Unified Architecture - Part 9: Alarms and conditions

IEC 62541-10 2015-03-01 OPC Unified Architecture - Part 10: Programs

IEC 62541-11 2015-03-01 OPC Unified Architecture - Part 11: Historical Access

IEC 62541-13 2015-03-01 OPC Unified Architecture - Part 13: Aggregates

6.3.2 Queue/topic based messaging systems

Before describing message based systems it is worth giving an architectural overview which incorporates some main messaging pattern types. Understanding these pattern types will help understand the differences of the vendors’ products.

Consider the real-world example of a letter being sent from one responsible party to another via a postal delivery service. The letter is the message and is contained within an envelope. The envelope defines the addressee’s information as well as the destination address amongst other things. This letter may be sent by recorded delivery, whereby a receipt is needed. This letter may traverse several channels and delivery hubs until it reaches its destination, where the letter is signed for. This signature recording the delivery of the letter will eventually let the original responsible party know that their letter has been received.

6.3.2.1 MESSAGING – OVERVIEW

Messaging provides the ability for one system to send a self-contained message to another. The sender and the recipient do not need to be aware of each other; this is called loose coupling. The receiving system cannot guarantee its availability; thus, the message needs to be sent asynchronously.

Messaging is structured according to the following logical model; implementations may vary, and some product vendors introduce further variations.

Figure 12: Typical Messaging System - Logical Model (elements shown: Sender System, Messaging System, Receiver System, Message, Channel)

Channels
o Messages are transmitted through a Message Channel that connects a Sender System to a Receiver System. This Channel will need to be determined; for example, the Channel could be TCP or HTTP.

Message
o A Message is a textual or binary package to be sent or received. To transmit a Message, the contents must be encoded for the Channel. To receive a Message from the Channel, the system must be able to decode the Message.
o Message Header (the envelope): information regarding the sender and receiver; time sent and time received; sending system information including protocol; size of the Message
o Message Body (the letter): the information to be sent and received
o Message Attachment: any other MIME type data can also be attached to the message

Routing
o A Message may traverse many Channels, which are triaged by Routing systems. The original Sender does not need to be aware of all the Channels, only of the one to which they submit the message for sending. The Routing system is then responsible for ensuring that the Message is delivered, by using Pipes/Filters, to the receiver or to the next Routing System.

Transformation
o Systems that do not support a common message format will need the Message to be translated in transit so that the receiving system can decode the Message.

Endpoints
o An Endpoint is an interface between the sending/receiving system and the messaging system.
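The message structure listed above (header, body, attachment, and encoding for the channel) can be sketched as follows. The JSON wire layout, the field names and the example addresses are assumptions made for illustration; real messaging products define their own formats.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    sender: str
    receiver: str
    time_sent: str
    protocol: str
    body: str
    attachments: List[Dict[str, str]] = field(default_factory=list)


def attach(message: Message, mime_type: str, data: bytes) -> None:
    """Attach arbitrary MIME-typed data, base64-encoded for transport."""
    message.attachments.append(
        {"mimeType": mime_type, "data": base64.b64encode(data).decode("ascii")}
    )


def encode(message: Message) -> bytes:
    """Encode the message for the channel: header (envelope) plus body (letter)."""
    envelope = {
        "header": {
            "sender": message.sender,
            "receiver": message.receiver,
            "timeSent": message.time_sent,
            "protocol": message.protocol,
            "size": len(message.body.encode("utf-8")),
        },
        "body": message.body,
        "attachments": message.attachments,
    }
    return json.dumps(envelope).encode("utf-8")


def decode(raw: bytes) -> Message:
    """Decode a message received from the channel."""
    envelope = json.loads(raw.decode("utf-8"))
    header = envelope["header"]
    return Message(
        header["sender"], header["receiver"], header["timeSent"],
        header["protocol"], envelope["body"], envelope["attachments"],
    )


original = Message("sensor-17", "maintenance-hub", "2017-07-18T10:00:00Z",
                   "TCP", "Rail temperature 41.2 C")
attach(original, "text/plain", b"raw log line")
wire = encode(original)
received_message = decode(wire)
print(received_message == original)
```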

6.3.2.2 ENTERPRISE MESSAGING – OVERVIEW

Building on the principles for Messaging, Enterprise Messaging seeks to address more of the business aspects of Messaging such as the following:

Data Structure formats
o XML
o JSON

Messaging protocols
o AMQP – Advanced Message Queuing Protocol
o DDS – Data Distribution Service
o MSMQ – Microsoft Message Queuing
o JMS – Java Message Service
o ZMQ – Zero Message Queue

Security
o Encryption
o Signing
o Altered Content
o Point to Point Security (Transport)
o Non-repudiation
o Replay protection
o Persistence

Routing
o Efficiency
o QoS – First Class, Second Class etc.

Metadata
o Message Expiration
o Receipt required
o Message Correlator
o Message Identification
o others

Enterprise Policies
o Policies pertinent to the organization sending and receiving messages
o RBAC

Message Patterns
o Synchronous – Recipient expected to be available and operating at time of request – the originator will wait for a response
o Asynchronous – Recipient may be offline, may delay processing until later – the originator does not wait for a response
o Publish Subscribe – Subscribe to Message topics that match a pattern, such as latest news
o Distribution: one to one, one to many and many to many patterns
o Queues – Messages utilizing FIFO
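Two of the message patterns above, publish/subscribe with topic patterns and FIFO queues, can be combined in a small in-memory sketch. The `Broker` class and the topic names are hypothetical and no real middleware is involved; shell-style wildcards from Python's `fnmatch` module stand in for whatever pattern syntax a product actually uses.

```python
from collections import deque
from fnmatch import fnmatchcase


class Broker:
    """Toy in-memory broker: one FIFO queue per subscription."""

    def __init__(self):
        self.subscriptions = []   # list of (topic pattern, queue)

    def subscribe(self, pattern):
        """Register interest in all topics matching the pattern."""
        queue = deque()
        self.subscriptions.append((pattern, queue))
        return queue

    def publish(self, topic, message):
        """Deliver to every matching subscription (one to many)."""
        delivered = 0
        for pattern, queue in self.subscriptions:
            if fnmatchcase(topic, pattern):
                queue.append(message)   # publisher does not wait (asynchronous)
                delivered += 1
        return delivered


broker = Broker()
news = broker.subscribe("news.*")
everything = broker.subscribe("*")

first = broker.publish("news.latest", "A")
second = broker.publish("alarms.track", "B")

# Subscribers consume later, in first-in, first-out order.
print(news.popleft(), list(everything), first, second)
```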

6.3.2.3 MESSAGING SYSTEMS – MIDDLEWARE

To address the varied complexities of Messaging, vendors created their own implementations of the Messaging Protocols all of which have their own complexities.

The following tables attempt to distill the main capabilities of the main vendors' products and their allegiance to protocols.

Table 4: Vendor Messaging system protocols

Product | Vendor | License | Technology / Protocol: Proprietary, AMQP 0.9, AMQP 1.0 (ISO/IEC 19464:2014), JMS, CORBA, MQTT (ISO/IEC PRF 20922)

ActiveMQ Apache OSS X X X

ZeroMQ LGPL X

HornetMQ OSS X

Service Bus Microsoft Commercial X X

MSMQ Microsoft Commercial X

Simple Queue Service Amazon AWS Commercial X

RabbitMQ Rabbit Commercial X

DDS X

WebSphereMQ IBM Commercial X

Tibco Commercial X

Table 5: Vendor Messaging system: Message Exchange Patterns Support

Product | Vendor | License | Message exchange pattern (MEP): Broker, Request-Reply (Synchronous), Request-Response (Asynchronous), Publish-Subscribe (Asynchronous)

ActiveMQ Apache OSS X X X X

ZeroMQ Zero LGPL X X X

HornetMQ Redhat OSS X X X X

Service Bus Microsoft Commercial X X X X

MSMQ Microsoft Commercial X X X X

Simple Queue Service Amazon AWS Commercial X X X X

RabbitMQ Rabbit Commercial X X X X

DDS RTI Commercial X X

WebSphereMQ IBM Commercial X X X X

Tibco Commercial X X X

6.3.2.4 FUTURE OF MESSAGING SYSTEMS

Whilst the message exchange patterns stay consistent, the goals of the messaging technology depend on the domain being addressed. Domains such as messaging of medical information need to be reliable, whereas financial data works on a principle where each transaction needs to ensure synchronicity in preference to reliability.

It is fair to say that each industry sector has its own challenges in terms of choosing the appropriate messaging technology.

IoT – Internet of Things – Consumer grade devices

IIoT – Industrial Internet of Things - Industrial grade devices

Cloud – Storage and Compute

Smart Devices – Users devices

Edge – Devices at the edge of the enterprise network

Fog/Mist – Extending the enterprise and processing closer to the assets

All the above drivers indicate a push towards a de-centralised cloud delivered messaging architecture, where point to point and point to brokers still exist. Low latency and throughput become more of a concern when delivering global services.

The conclusion is that more effort and research should be invested in the emerging technologies that will advance messaging systems and possibly negate the need for discrete messaging middleware. These areas are seen from the perspective of IIoT, whereby edge and fog processing is required in an atomic way before a message reaches its intended target.

6.4 WEB SERVICES / APIS

There is some ambiguity regarding the terms. An API (application programming interface) is a basic concept of software architecture which enables use/reuse of functionality (e.g. by including libraries). In the context of the internet and machine-to-machine communication, protocol stacks are considered as well as data specifications.

Therefore, these terms are used here in the context and meaning as follows.

6.4.1 Web Services

Almost all aspects of internet communication are coordinated ([25]). Web services are defined as follows: [Definition: A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.] ([26])

The IT systems perform services that are defined and described in the context of the enterprise’s business activities with Service-Oriented Architecture (SOA). At a business level of abstraction, services are offered, which renders the interface a business interface, i.e. a contract. The contract is a platform-neutral and standard way of describing what the service does. This principle enables the use of techniques such as service composition, message-based communication, discovery, and model-driven implementation, which allow fast development of effective and flexible solutions. They are important features of SOA. Their benefits – especially that of enterprise agility – are the most frequently quoted reasons for SOA adoption ([25]).

Service-Oriented Architecture is an architecture model or design approach, which states that the system should be composed of several independent, loosely-coupled services. It is recommended that SOA infrastructure implementations use open standards to realize interoperability and location transparency. Therefore, key concepts of SOA are loose coupling, high interoperability and services. SOA aims to enhance the agility, efficiency and productivity of an enterprise system. SOA can be implemented by using web services, for example with Windows Communication Foundation (WCF) services. WCF is configurable to communicate with web services using both SOAP and XML messages. Because WCF can communicate using web service standards, interoperability is straightforward with other platforms that also support SOAP. Interoperability is thus gained through a set of XML-based open standards, such as WSDL, SOAP, and UDDI. These standards provide a common approach for defining, publishing, and using web services ([25]).

Web services are XML-based software systems, delivered over the web or cloud, which are designed to support interoperable machine-to-machine interaction. Standards, acting as a series of protocols, support the communication between the various nodes in a network ([22]).

The Web service protocol stack is a collection of open standards that are used to make Web services interact with each other ([28]):

• Discovery Protocol: This layer provides a directory for storing information about web services. Service providers use the Universal Description, Discovery, and Integration (UDDI) specification to advertise the existence of their services, and requesters then use it to search for and discover already registered services.

• Description Protocol: This protocol is used to describe and locate web services. The Web Services Description Language (WSDL) is used to describe what type of message a Web Service accepts and generates. For a service it can be thought of as the overall technical interface specification. It serves not only as the definition of the interface but also contains technical information such as the allowable operations for a service and its endpoint address.

• Messaging Protocol: This protocol is responsible for encoding messages so that they can be understood at either end of a network connection by using XML format. Extensible Markup Language (XML) has become the fundamental message form for SOA consumers and services. In an SOA based on Web services, the message has a structure to allow for deeper integration and cross-platform collaboration. A key part of which is an enveloping scheme known as Simple Object Access Protocol (SOAP), which includes the message content, and is also encoded using XML. Thus, SOAP is the specific format for exchanging Web Services data over HTTP.

• Transport Protocol: This protocol is responsible for the transport of messages between network applications. Web services typically use HTTP (HyperText Transfer Protocol) as the underlying protocol for the transport layer.

Because the service interface hides the implementation logic from the users, the service can be used on different platforms, and any application capable of communicating through the standard XML messaging protocol can use the service through the standard interface. The main advantage of Web services is that the service can be used remotely without the user’s actual involvement, thus eliminating the need for constant updates to locally installed software.
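The messaging layer described above can be illustrated with a minimal sketch that wraps a request in a SOAP 1.1 envelope and reads it back, using only the Python standard library. The operation name `GetAssetStatus`, the parameter `assetId` and the application namespace are hypothetical and serve only as an illustration; a real service would take them from its WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/asset"  # hypothetical application namespace


def build_soap_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def parse_soap_body(message):
    """Return the operation name and parameters found in the SOAP body."""
    root = ET.fromstring(message)
    call = root.find(f"{{{SOAP_NS}}}Body")[0]  # first child of the body is the call

    def local_name(tag):
        # Strip the "{namespace}" prefix that ElementTree puts before tag names.
        return tag.split("}", 1)[1]

    return {
        "operation": local_name(call.tag),
        "params": {local_name(child.tag): child.text for child in call},
    }


request = build_soap_request("GetAssetStatus", {"assetId": "SW-1042"})
print(request.decode("utf-8"))
print(parse_soap_body(request))
```

In a deployed web service the resulting bytes would be sent as the body of an HTTP POST request; the transport step is omitted here so that the sketch runs without a server.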

Uses of Web Services

o Standardized Protocol: For communication, Web Services use standardized industry protocols. In the Web Services protocol stack all four layers (Service Transport, XML Messaging, Service Description and Service Discovery) use standardized protocols. This standardization of the protocol stack gives the business many advantages, such as an increase in quality and a reduction in cost due to competition.

o Exposing existing functions on the network: A Web service is a unit of managed code that can be activated using HTTP requests. Web Services therefore allow the functionality of existing code to be exposed over the cloud. Once it is exposed on the cloud, other applications can use the functionality of the program.

o Interoperability, i.e. connecting different applications: Web services are used to make applications platform and technology independent by allowing different applications to talk to each other and share data and services among themselves. For example, a VB or .NET application can talk to Java web services and vice versa.

o Low cost of communication: The existing low-cost Internet can be used for implementing Web Services because they use SOAP over HTTP for the communication. This solution is much less costly compared to proprietary solutions. Besides SOAP over HTTP, Web Services can also be implemented on other reliable transport mechanisms such as FTP.

6.4.2 Web APIs

Representational State Transfer (REST) is a concept (and implementation) for communication via the internet. Compared to WS* it is easier to implement. Services following this concept are often referred to as RESTful services; they offer simplified access via HTTP(S) requests and responses. The basic HTTP requests (GET, POST, PUT, DELETE) are used to transmit the data and/or function requests (e.g. searching a database with given parameters).

Identification of the server, functions, and parameters for execution is provided in the HTTP request itself. Example of accessing Google:

https://www.google.de/?gfe_rd=cr&ei=EYhzWKH9E8nb8Af0iZrICg&gws_rd=ssl
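How the server, the resource and the parameters are carried by the request itself can be shown by taking the example URL apart with the Python standard library. This is only an illustrative sketch; no request is actually sent, and the meaning of the individual query parameters is specific to the service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "https://www.google.de/?gfe_rd=cr&ei=EYhzWKH9E8nb8Af0iZrICg&gws_rd=ssl"

parts = urlsplit(url)
server = parts.netloc      # which server is addressed
resource = parts.path      # which resource or function on that server
# parse_qs returns a list per key; each key occurs once here, so take the first value
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print("server:", server)
print("resource:", resource)
print("parameters:", params)

# The same three pieces are enough to rebuild the request URL.
rebuilt = f"{parts.scheme}://{server}{resource}?{urlencode(params)}"
print(rebuilt == url)
```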

6.4.3 Comparison Web Services vs. Web APIs

Table 6: Comparison Web Services vs. Web APIs [29]

Criterion | WS* | Web API
Specification of data and functions | Standardized | Application specific
Complexity | High | Low(er)
Standards | All aspects of application of web services are standardized | Available for basic communication (http) and data formats (xml, json)
Support | .Net, Java | Many languages, easy to implement
Security | Included as part of the protocol stack | Depending on application

WS* - the web services protocol stack

6.5 IN MEMORY DATA GRID TECHNOLOGIES

6.5.1 In-Memory Data Grid Overview

An In-Memory Data Grid (IMDG) is a distributed in-memory (RAM) data structure. IMDG is typically implemented using a key-value data structure.

The advantages of using IMDG are mainly related to:

Enhanced performance in terms of read/write speed,

Easy scaling and upgrading,

High availability (fault tolerance) thanks to distributed data,

Persistent storage caching.

The IMDG can also be thought of as a data exchange platform/middleware between heterogeneous systems in the scope of the open data exchange standardization.
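The idea of a distributed key-value structure with fault tolerance can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the design of any particular product: the class names `Node` and `DataGrid`, the hash-based placement and the replica count of two are all illustrative choices, and real IMDGs add networking, rebalancing and consistency protocols.

```python
import hashlib


class Node:
    """One grid member holding its share of the entries in RAM."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.alive = True


class DataGrid:
    """Toy in-memory grid: keys are hashed onto nodes and copied to replicas."""

    def __init__(self, node_names, replicas=2):
        self.nodes = [Node(name) for name in node_names]
        self.replicas = min(replicas, len(self.nodes))

    def owners(self, key):
        # A stable hash picks the primary node; the following nodes hold copies.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.replicas)]

    def put(self, key, value):
        for node in self.owners(key):
            if node.alive:
                node.store[key] = value

    def get(self, key):
        # Any surviving owner can answer, which is what gives fault tolerance.
        for node in self.owners(key):
            if node.alive and key in node.store:
                return node.store[key]
        return None


grid = DataGrid(["node-a", "node-b", "node-c"])
grid.put("track/42/temperature", 21.5)

primary = grid.owners("track/42/temperature")[0]
primary.alive = False  # simulate the failure of one node
print(grid.get("track/42/temperature"))
```

Scaling corresponds to adding nodes to the list; this sketch does not redistribute existing entries when the node list changes, which a real grid would have to do.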

In the following paragraphs some technologies implementing IMDG are listed.

6.5.2 Infinispan

Infinispan [30] is an open source distributed in-memory key/value data store implementing an IMDG.

It is written in Java and implements the JSR 107 (JCache) specification.

The main applications of Infinispan are:

Local cache: providing a fast in-memory cache of frequently accessed data,

Clustered cache: in case a single node is not enough for data storage,

Remote cache: in case a decoupling between application and stored data is needed,

Data grid: using advanced features such as transactions and notifications.

A communication mechanism can be implemented by using Listeners and Notifications: clients can register and are notified when an event takes place; events trigger a notification which is dispatched to listeners.
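The listener and notification mechanism can be sketched generically as follows. This is a plain-Python illustration of the pattern and not the Infinispan API; the class `ObservableCache`, its method names and the event names `created`, `modified` and `removed` are assumptions made for this example.

```python
from collections import defaultdict


class ObservableCache:
    """Key-value cache that notifies registered listeners about changes."""

    def __init__(self):
        self._data = {}
        self._listeners = defaultdict(list)

    def add_listener(self, event_type, callback):
        """Register a callback for one event type."""
        self._listeners[event_type].append(callback)

    def _notify(self, event_type, key, value):
        # Dispatch the event to every listener registered for this event type.
        for callback in self._listeners[event_type]:
            callback(event_type, key, value)

    def put(self, key, value):
        event_type = "modified" if key in self._data else "created"
        self._data[key] = value
        self._notify(event_type, key, value)

    def remove(self, key):
        value = self._data.pop(key)
        self._notify("removed", key, value)


received = []
cache = ObservableCache()
cache.add_listener("created", lambda *event: received.append(event))
cache.add_listener("modified", lambda *event: received.append(event))

cache.put("switch-17", "normal")    # triggers a "created" notification
cache.put("switch-17", "degraded")  # triggers a "modified" notification
cache.remove("switch-17")           # no listener registered for "removed"
print(received)
```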

6.5.3 Redis

Redis [31] is an open source, in memory data structure store, used as a database, cache and message broker.

It supports different kinds of data structures such as string, hashes, lists, sets, sorted sets, bitmaps, hyperloglogs and geospatial indexes.

Some of Redis's features:

replication: master-slave replication allows slave Redis servers to be exact copies of master servers

clustering: automatically sharding data across multiple Redis nodes

on-disk persistence: dumping the dataset to disk periodically or by appending each command to a log

Publishing/Subscribing: implementing the Publish/Subscribe messaging paradigm

Redis clients exist for most programming languages.

7 OPEN DATA EXCHANGE: APPLICATIONS

With the basic, commonly used technologies described in section 6, section 7 describes how these technologies are used in different application areas and, among other things, describes application-specific formats. Section 8 then covers the usage of these technologies and applications in different domains and relevant communities.

7.1 GEODATA (DLR, LTU)

7.1.1 Vector Data

Vector data represent geographic features with discrete coordinates as points, lines and polygons which could be stations, tracks and land use areas, for instance. Most of the geographic vector formats implement the OGC Simple Feature Access ([32]). Each geographic feature is characterised by an arbitrary number of specific attributes usually stored in tables. Vector datasets are either stored in stand-alone files on file-system-level or in databases. The Geospatial Data Abstraction Library (GDAL, [33]) offers functionality for conversion between different vector data formats.

7.1.1.1 ESRI SHAPEFILES (SHP)

The shapefile format is a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products. The shapefile format can spatially describe vector features: points, lines, and polygons, representing, for example, water wells, rivers, and lakes. Each item usually has attributes that describe it, such as name or temperature. Spatial reference system information is usually included as textual description ([39]).

7.1.1.2 GEOJSON (.JSON, .GEOJSON)

GeoJSON is based on the popular JSON format (see section 6.1.1.1) with added support for geometries in form of Point, LineString, Polygon, MultiPoint, MultiLineString and MultiPolygon. Commonly GeoJSON is found in light-weight web mapping applications such as Leaflet/OpenLayers. Spatial reference system information can be included as spatial reference identifier (SRID).
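A minimal sketch of what such a document looks like, built and round-tripped with the Python standard library. The coordinates, feature names and property keys are invented for illustration; GeoJSON positions are given in longitude, latitude order.

```python
import json

# A point feature (e.g. a station) and a line feature (e.g. a track segment).
station = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [10.5268, 52.2689]},
    "properties": {"name": "Example station", "kind": "station"},
}
track = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[10.5268, 52.2689], [10.5400, 52.2750]],
    },
    "properties": {"name": "Example track", "kind": "track"},
}
collection = {"type": "FeatureCollection", "features": [station, track]}

text = json.dumps(collection, indent=2)  # serialise for exchange
decoded = json.loads(text)               # any JSON parser can read it back

for feature in decoded["features"]:
    print(feature["geometry"]["type"], feature["properties"]["name"])
```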

7.1.1.3 WELL-KNOWN TEXT (WKT)

Well-known text (WKT) is a text-based representation of geographic features which can be of the type Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon and GeometryCollection. It is often used for small-sized, quick and human-readable exchange of geodata. Spatial reference system information is usually not included and has to be provided separately.
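For the three basic geometry types the text form is simple enough to be generated by hand, as the following sketch shows. The helper `to_wkt` is a hypothetical function written for this illustration and covers only Point, LineString and Polygon in two dimensions; the coordinates are invented.

```python
def _pair(coordinate):
    """Format one coordinate as 'x y'."""
    return f"{coordinate[0]} {coordinate[1]}"


def to_wkt(geometry_type, coordinates):
    """Return the WKT string for a 2D Point, LineString or Polygon."""
    if geometry_type == "Point":
        return f"POINT ({_pair(coordinates)})"
    if geometry_type == "LineString":
        return "LINESTRING (" + ", ".join(_pair(c) for c in coordinates) + ")"
    if geometry_type == "Polygon":
        # A polygon is a list of rings; each ring repeats its first point at the end.
        rings = ", ".join(
            "(" + ", ".join(_pair(c) for c in ring) + ")" for ring in coordinates
        )
        return f"POLYGON ({rings})"
    raise ValueError(f"unsupported geometry type: {geometry_type}")


print(to_wkt("Point", (10.5, 52.25)))
print(to_wkt("LineString", [(0, 0), (1, 1)]))
print(to_wkt("Polygon", [[(0, 0), (4, 0), (4, 3), (0, 0)]]))
```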

7.1.1.4 SPATIAL DATABASE

Many common database management systems, including Oracle, PostgreSQL, MySQL, SQLite, offer support for geographic vector data through the implementation of the OGC Simple Feature Access with spatial reference system information.

7.1.2 Raster Data

Raster data represent geographic features as continuous grids of information which can be thematic base maps, aerial imagery, elevation models and arbitrary geo-referenced sensor measurements. Many common image formats such as GIF, JPEG, TIFF and PNG can be turned into geographic raster data by adding a textual meta data file (world file [34]) defining the pixel extent/scale and coordinate system origin of the dataset. The Geospatial Data Abstraction Library (GDAL) offers functionality for conversion between different raster data formats.
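The role of the world file can be made concrete with a short sketch. A world file holds six numbers, one per line, in the order A, D, B, E, C, F, which define an affine mapping from pixel position (column, row) to map coordinates: x = A·col + B·row + C and y = D·col + E·row + F. The sample values below are invented, and the function names are hypothetical.

```python
def parse_world_file(text):
    """Read the six world file parameters in file order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(token) for token in text.split())
    return a, d, b, e, c, f


def pixel_to_world(col, row, params):
    """Map a pixel position to map coordinates with the affine transform."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Invented example: 0.5 map units per pixel, no rotation, y decreasing downwards.
world_file = "0.5\n0.0\n0.0\n-0.5\n600000.25\n5800000.75\n"
params = parse_world_file(world_file)

print(pixel_to_world(0, 0, params))    # centre of the upper-left pixel
print(pixel_to_world(10, 20, params))  # 10 columns right, 20 rows down
```

The world file does not state which coordinate reference system the numbers refer to; that information has to be supplied separately.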

7.1.2.1 GEOTIFF

GeoTIFF, an extension of TIFF, can be considered the most popular and versatile raster exchange format. It supports different bands/channels with compression and includes spatial reference system information which is natively supported by most GIS.

7.1.3 Geo Web Services

Geographic web services serve geodata from vector and raster sources through standardised interfaces for easy access and distribution independent from the underlying and often heterogeneous raw data backends.

7.1.3.1 WEB MAPPING SERVICES (WMS)

Web mapping services offer tile-based images of vector and raster data as layers which are rendered on server side for each request. Such tiles can be easily included in client web mapping applications and desktop GIS. Each layer can support different styles for custom visualisation of geographic raw data. For better performance in large-scale applications such tiled mapping services can pre-render tiles (WMS-C, WMTS, TMS) which are then served from a tile store instead of dynamic re-rendering of tiles on each request.

More recently, vector tiles are also supported as an output format, offering better integration in mobile applications, smoother rendering and better performance. One drawback of vector tiles is that the styling must be known on the client side for client-based rendering.

WMS also support spatio-temporal data offering easy/standardised access to time series of geographic features.
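A request to such a service is an HTTP GET with a standardised set of parameters. The sketch below assembles a WMS 1.3.0 GetMap request URL; the server address, the layer name and the bounding box are hypothetical, and the optional TIME parameter is only meaningful for layers that the server publishes with a time dimension.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="EPSG:3857", style="", image_format="image/png", time=None):
    """Assemble a WMS 1.3.0 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,  # empty string requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    if time is not None:
        params["TIME"] = time  # for spatio-temporal layers
    return base_url + "?" + urlencode(params)


url = build_getmap_url(
    "https://example.org/wms",      # hypothetical server
    layer="railway_tracks",         # hypothetical layer name
    bbox=(1160000, 6840000, 1180000, 6860000),
    width=512,
    height=512,
    time="2017-06-01",
)
print(url)
```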

7.1.3.2 WEB FEATURE SERVICES (WFS)

Web feature services offer direct, standardised raw data access to geographic vector data, hiding the different data backend implementations of the various data sources. Different vector geodata output formats (see section 7.1.1) are supported depending on the server implementation. Modification of raw data is additionally realised through transactions (WFS-T).

7.1.3.3 WEB COVERAGE SERVICES (WCS)

Web coverage services offer direct, standardised raw data access to geographic raster data, hiding the different data backend implementations of the various data sources. Different raster geodata output formats (see section 7.1.2) are supported depending on the server implementation.

7.1.4 Open Street Map, Open Railway Map

7.1.4.1 OPENSTREETMAP (OSM)

OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world. The creation and growth of OSM has been motivated by restrictions on use or availability of map information across much of the world, and the advent of inexpensive portable satellite navigation devices. OSM is considered a prominent example of volunteered geographic information. OSM datasets contain Points, LineStrings and Polygons representing different geographic features. Each geo-feature is tagged with specific attributes. No strict tagging rules exist, only rough guidelines, which can lead to problems when the same type of feature is tagged in different ways by different people. ([35])

7.1.4.2 POINTS OF INTEREST (POI)

OSM provides a huge database of common points of interest (POI) in many different domains such as leisure, tourism, traffic, and transport ([36]). These elements can be queried via different APIs online or extracted directly from raw data which is freely available for everyone. The data basically consists of spatial objects with attached attributes describing the features by an arbitrary number of tags ([37]). The data export can be obtained in standardized XML files and also converted to many other vector-based spatial data formats, as described in section 7.1.1. This allows OSM data to be easily used for heterogeneous analysis scenarios.
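Extracting points of interest from the XML export can be sketched with the Python standard library. The snippet uses the OSM XML structure of `node` elements with `tag` children carrying a key `k` and a value `v`; the node identifiers, coordinates and names are invented, and `extract_pois` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample in the structure of an OSM XML export.
OSM_XML = """
<osm version="0.6">
  <node id="1" lat="52.2689" lon="10.5268">
    <tag k="railway" v="station"/>
    <tag k="name" v="Example Hbf"/>
  </node>
  <node id="2" lat="52.2700" lon="10.5300"/>
  <node id="3" lat="52.2710" lon="10.5310">
    <tag k="railway" v="signal"/>
  </node>
</osm>
"""


def extract_pois(xml_text, key):
    """Return all nodes that carry a tag with the given key."""
    root = ET.fromstring(xml_text.strip())
    pois = []
    for node in root.iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if key in tags:
            pois.append({
                "id": int(node.get("id")),
                "lat": float(node.get("lat")),
                "lon": float(node.get("lon")),
                "tags": tags,
            })
    return pois


for poi in extract_pois(OSM_XML, "railway"):
    print(poi["id"], poi["tags"]["railway"], poi["lat"], poi["lon"])
```

Untagged nodes such as node 2 are typically vertices of lines or polygons and are skipped by the filter.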

7.1.4.3 OPENRAILWAYMAP

The OpenRailwayMap is a collaborative project to create a map of the world’s railway infrastructure. This map is based on the OpenStreetMap project but extended for the railway domain. This map can be used to display railway-specific information such as signals, infrastructure elements and its meta information. It includes diverse rail-mounted vehicles such as railways, subways and trams.

This project was founded in 2011, previously known as “Bahnkarte” and since 2013 known as OpenRailwayMap under the URL ([38]).

As in OSM, the available data was uploaded by individuals, companies and institutions that are willing to share their data with the rest of the community. Depending on how and when the data was recorded, the information could be outdated or might not represent the exact real-world position of an element.

The main motivations of the project are:

Worldwide coverage

Open source and open data

Up-to-date and detailed

OpenStreetMap

There is also a Tagging scheme that is country specific, so that elements or signals that are country specific can also be modelled.

7.2 SENSOR / MEASUREMENT DATA

The integration of measurement data coming from distributed sensors, often referring to the same system or asset, is becoming increasingly important with the advances in sensor technology and network technology.

Sensors provide measurements that could be used both for dynamic monitoring and for analytics purposes.

Sensor measurement data can be associated to additional properties (metadata) that can be used for discovery and for understanding the nature of the object and for qualifying the output.

7.2.1 sensorML

7.2.1.1 OGC - SENSOR WEB ENABLEMENT

The Open Geospatial Consortium (OGC) [40] is an international consortium of industry, academic and government organizations that collaboratively develop open standards for geospatial and location services.

The OGC standardization activities focusing on sensors, sensor networks and the Sensor Web are known as Sensor Web Enablement (SWE) [41].

The functionalities targeted by OGC within SWE include:

Discovery of sensor systems, observations, and observation processes;

Establishing of a sensor’s capabilities and quality of measurements;

Access to sensor parameters that automatically allow software to process and geo-locate observations;

Retrieval of real-time or time-series observations and coverages in standard encodings

Tasking of sensors to acquire observations of interest;

Subscription to and publishing of alerts to be issued by sensors or sensor services

To achieve its objectives the SWE initiative has created a framework including several OGC standards harmonized with other OGC standards for geospatial processing:

Sensor Model Language (SensorML) – Standard models and XML Schema for describing the processes within sensor and observation processing systems.

Observations & Measurements (O&M) –The general models and XML encodings for observations and measurements.

Sensor Observation Service (SOS) – Open interface for a web service to obtain observations and sensor and platform descriptions from one or more sensors.

Sensor Planning Service (SPS) – An open interface for a web service by which a client can 1) determine the feasibility of collecting data from one or more sensors or models and 2) submit collection requests.

SWE Common Data Model – Defines low-level data models for exchanging sensor related data between nodes of the OGC Sensor Web Enablement (SWE) framework.

SWE Service Model – Defines data types for common use across OGC Sensor Web Enablement (SWE) services. Five of these packages define operation request and response types.

PUCK Protocol Standard – Defines a protocol to retrieve a SensorML description, sensor "driver" code, and other information from the device itself, thus enabling automatic sensor installation, configuration and operation.

7.2.1.2 SENSORML

SensorML [42] is one of the implementation standards included in the SWE suite. It defines conceptual models and XML Schema encoding for describing sensors and measurement processes.

The primary focus of SensorML is to provide a framework for defining processes and processing components associated with measurement and post-measurement transformation of observations.

The common framework provided by SensorML is particularly well-suited for the description of sensors and systems and the processes underlying the act of measurement and subsequent processing of observations. Sensor and transducer components (detectors, transmitters, actuators and filters) are all modeled as physical processes interconnected and equally participating within a system.

The basic entities of the model are processes which take one or more inputs and produce one or more outputs, through the application of well-defined methods and configurable parameters.

The SensorML process model also allows explicit linking between processes using a composite pattern to define aggregate processes (e.g. chains, networks and workflows).
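The process concept and the composite pattern can be mirrored in a small data structure. The sketch below is a conceptual illustration in Python and not a SensorML encoding (which would be XML); the class names, the two example processes and all numeric parameters are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Values = Dict[str, float]


@dataclass
class Process:
    """A process: inputs are turned into outputs by a method with parameters."""
    name: str
    method: Callable[[Values, Values], Values]
    parameters: Values = field(default_factory=dict)

    def execute(self, inputs: Values) -> Values:
        return self.method(inputs, self.parameters)


@dataclass
class AggregateProcess:
    """Composite pattern: a chain of processes that itself acts as a process."""
    name: str
    components: List[Process]

    def execute(self, inputs: Values) -> Values:
        data = dict(inputs)
        for component in self.components:
            # Outputs of one component become available as inputs of the next.
            data.update(component.execute(data))
        return data


# Hypothetical detector: converts raw counts into a voltage.
detector = Process(
    "detector",
    lambda inputs, p: {"voltage": inputs["counts"] * p["volts_per_count"]},
    {"volts_per_count": 0.125},
)
# Hypothetical post-measurement transformation: voltage into temperature.
calibration = Process(
    "calibration",
    lambda inputs, p: {"temperature": (inputs["voltage"] - p["offset"]) * p["gain"]},
    {"offset": 0.5, "gain": 100.0},
)

sensor_system = AggregateProcess("temperature_sensor", [detector, calibration])
print(sensor_system.execute({"counts": 8}))
```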

The current version of SensorML is version 2.0. SensorML is heavily dependent on the SWE Common Data Model standard for defining inputs, outputs, and parameters, as well as for specifying characteristics, capabilities, interfaces, and event properties. The SWE Common Data Models, which were originally defined within the version 1.0 SensorML specification, are in version 2.0 defined as a separate specification and are utilized throughout the SWE family of encoding and web service specifications.

The SWE Common Data Model is intended to be used for describing static data (files) as well as dynamically generated datasets (on the fly processing), data subsets, process and web service inputs and outputs and real time streaming data.

UML is used to describe both SensorML and SWE Common Data models.

Within IN2RAIL WP9, SensorML has been selected for dynamic data representation; among different options for the encoding of the sensor data the decision taken was to enable access to the data via RESTful web service interfaces and simple text serialization.

7.2.2 LAS file format

The LAS file format is an open file format for the interchange of 3-dimensional point cloud data between users of different systems. It was mainly developed for the exchange of LiDAR (Light detection and ranging) or other point cloud data but supports the exchange of any 3-dimensional data tuple. This binary file format is an alternative to proprietary systems or generic ASCII file interchange systems used by many companies. ([43])

The LAS 1.4 Specification was approved by the ASPRS in November 2011 and is the most recent approved version of the document ([44]). It stores an x, y, z coordinate set per point and additional information such as the intensity or magnitude of the return value and user-defined values, such as point classification.

Though the LAS format is widely adopted and used, it is not spatially indexed and does not provide generalizations which are problems when working with very large datasets. Spatial indexing allows locating all the points within a given area quickly without scanning the whole file and generalization allows for a representative subset of the points to be used for visualization at small scales.
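What spatial indexing and generalization add on top of a flat list of points can be sketched with a simple uniform grid. This is one possible technique chosen for illustration, not part of the LAS specification; the points are invented, the function names are hypothetical, and reading an actual LAS file is left out.

```python
from collections import defaultdict


def build_grid_index(points, cell_size):
    """Map each grid cell (in x/y) to the indices of the points inside it."""
    index = defaultdict(list)
    for i, (x, y, _z) in enumerate(points):
        index[(int(x // cell_size), int(y // cell_size))].append(i)
    return index


def query_box(points, index, cell_size, xmin, ymin, xmax, ymax):
    """Find points in a box by visiting only the cells that overlap it."""
    hits = []
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            for i in index.get((cx, cy), ()):
                x, y, _z = points[i]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(i)
    return sorted(hits)


def generalize(index):
    """Keep one representative point per cell for small-scale display."""
    return sorted(indices[0] for indices in index.values())


points = [(0.5, 0.5, 1.0), (0.7, 0.6, 1.1), (5.2, 5.1, 2.0), (9.9, 9.9, 3.0)]
index = build_grid_index(points, cell_size=1.0)

print(query_box(points, index, 1.0, 0.0, 0.0, 1.0, 1.0))
print(generalize(index))
```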

Potential use-cases in the project could be the exchange of sensor data gathered by drones or laser scanners. Also two-dimensional image sequences, gathered by drones, could be used to derive 3d point-clouds by using structure from motion (SfM) photogrammetric range imaging techniques that may be coupled with local motion signals and be shared using the LAS format.

7.3 MAINTENANCE

In this section the topic of open data exchange for maintenance is explored. This subject includes the areas of:

asset design information – how open data exchange is achieved both during the design and construction of assets, and, the transfer of asset information from construction projects to the “operate and maintain” phase of the asset life (see 7.3.1),

maintenance management – how open data exchange is achieved during the operational phase of the asset life (see 7.3.2),

asset condition – how open data exchange is achieved for asset condition assessment and evaluation (see 7.3.3)

asset alarms – standardization around events and notifications of asset state (7.3.4)

7.3.1 Building Information Modeling – BIM

7.3.1.1 BACKGROUND

Building Information Modeling has its roots within the construction and building industry. The term is used to refer to software tools, design processes and structured data models used throughout the design, construction and maintenance phases of the asset lifecycle.

Unlike CAD tools, which optimize traditional pen and paper design and drawing processes, the BIM approach to design is model driven. Complex assets are modeled from real construction elements such as slabs, windows, walls and roofs. The model is stored as structured data and is a digital representation of the physical and functional characteristics of a facility.

Drawings are generated using rendering software that takes the BIM data model and combines it (for example) with a camera position and definition, and automatically generates a drawing of the asset from that perspective.

Through the use of models rather than drawings, the effect of change on elements within a design can be rapidly evaluated for impact on other areas and the construction drawings automatically updated. This delivers a significant reduction in effort required for all phases of asset life - design, build and operation/maintenance. The use of a model also enables rapid iterations of designs between teams of different disciplines, significantly reducing the risk of design conflict and the incidence of conflict discovery during construction.

Realizing the potential benefits of having a full BIM model for assets requires that the model is maintained throughout the asset life. For example, this enables decisions to be made against the current asset without the need for re-surveying.

The use of BIM approaches and modeling is growing within the rail industry with governments and administrations mandating the use of BIM in projects.

Examples include:

1. Norwegian National Rail Administration – the rail administrator’s design manual defines the types of models required and the content of the models. It does not define the tools or methods to be used. The overall coordination model is a combination of base models and rail discipline models. Base models include map data, surveyed data, rail data, water and services pipelines and underground data. The discipline models include track, superstructures, signaling, telecoms, electrical distribution and substructures such as tunnels. ([45])

2. UK Rail Industry – Crossrail and Brighton Mainline upgrade investment projects are two examples where BIM is being used to significantly derisk project development and implementation. In addition, the UK Government has mandated BIM level 2 on all centrally procured HM Government projects by 4th April 2016, and is currently on track in delivering a strategy for Level 3. ([46])

7.3.1.2 OPENBIM AND BUILDINGSMART

From the OpenBIM web site ([47]), OpenBIM is described as:

OpenBIM is a universal approach to the collaborative design, realization and operation of buildings based on open standards and workflows. OpenBIM is an initiative of buildingSMART and several leading software vendors using the open buildingSMART Data Model.

The buildingSMART core is based around a common model called IFC that enables the storage and exchange of BIM information between software applications. These models are captured as ISO standards – as illustrated below (reproduced from http://buildingsmart.org/ifc/).

Figure 13: ISO Standards related to BIM

7.3.2 Maintenance Management

7.3.2.1 BACKGROUND

Asset maintenance is used within the Rail Industry to mitigate the safety risks associated with rail undertaking. It is also used to maximize asset availability to deliver the service promised.

There are many approaches to asset maintenance; an example of these as a measure of maintenance maturity is illustrated below.

[Figure 14 axes: Equipment Reliability and Availability versus Maintenance Approach. Approaches shown: Fault repair only, Inspect and service, Preventive maintenance, Systematic planning and scheduling, Predictive Maintenance, Diagnostics, Reliability engineering.]

Figure 14: Effectiveness of asset maintenance methodologies on asset reliability and availability

In summary, the methodologies illustrated are:

Fault repair only – maintenance interventions are only performed when an asset has failed to perform the requested function

Inspect and service – an unstructured approach to asset condition management, ad-hoc inspections and service actions

Preventive maintenance – following a time or use based regime, usually stipulated by the manufacturer to mitigate in-warranty failure.

Systematic planning and scheduling – following a time or use based regime, designed around risk assessment and barriers to hazards, threats and consequences. Most common in developed rail operators.

Predictive maintenance – the use of condition monitoring to detect symptoms of emerging failure modes, initiating a maintenance intervention prior to asset failure, with the goal of preventing in-service failures (Predict and prevent).

Diagnostics – using derived diagnoses to reduce the time spent maintaining an asset in response to an alarm on asset condition (emerging fault or full failure) and increasing the effectiveness of predictive maintenance through better knowledge prior to attending site

Reliability engineering – combining asset knowledge derived from asset performance with asset design and manufacture to design for dependability for the planned life and usage of the asset.

Open standards for interfaces and data that support these maintenance methodologies have been in development over a number of years. One example of this is from MIMOSA – the Open System Architecture for Enterprise Application Integration.

7.3.2.2 MIMOSA OSA-EAI

MIMOSA is an operations and maintenance information open system alliance. It is a non-profit industry association that is focused on solutions that leverage supplier neutral, open standards to establish an interoperable industrial ecosystem for Commercial Off The Shelf (COTS) solution components provided by major industry suppliers.

MIMOSA maintains a specification: Open System Architecture for Enterprise Application Integration (OSA-EAI). The OSA-EAI specification provides an information exchange standard to allow sharing asset registry, condition, maintenance and reliability information between enterprise systems; and a relational database model to allow storage of the same asset information.

The specification is maintained as a UML model and is freely downloadable from the MIMOSA web site. It is aligned to the Condition Monitoring and Diagnosis Information Architecture as set out in ISO 13374-2:2007.

The MIMOSA site illustrates the information scope with the following diagram: (reproduced from [48])


Figure 15: MIMOSA – Open Asset Information Model

The specification is described in four main areas:

1. Open object registry management
2. Open maintenance management
3. Open reliability management
4. Open condition management

The exchange of information is supported through the definition of an XML schema that can be exchanged over a variety of transport options – including files, HTTP and SOAP web services.
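As a rough illustration of XML-based exchange, the sketch below builds and parses a small asset measurement message with the Python standard library. The element and attribute names (`AssetMeasurement`, `assetId`, `Quantity`, `Value`) and the sample asset are hypothetical placeholders chosen for this example; they are not taken from the OSA-EAI schema, which should be obtained from the MIMOSA specification.

```python
import xml.etree.ElementTree as ET


def build_measurement(asset_id, quantity, value, unit):
    """Serialise one measurement as an XML string (hypothetical element names)."""
    root = ET.Element("AssetMeasurement", attrib={"assetId": asset_id})
    quantity_el = ET.SubElement(root, "Quantity")
    quantity_el.text = quantity
    value_el = ET.SubElement(root, "Value", attrib={"unit": unit})
    value_el.text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_measurement(xml_text):
    """Parse the XML string produced by build_measurement back into a dict."""
    root = ET.fromstring(xml_text)
    value_el = root.find("Value")
    return {
        "asset_id": root.get("assetId"),
        "quantity": root.findtext("Quantity"),
        "value": float(value_el.text),
        "unit": value_el.get("unit"),
    }


# The same string could be written to a file or sent over HTTP.
message = build_measurement("PM-0042", "motor_current", 3.2, "A")
record = parse_measurement(message)
```

Because the message is plain text, the sender and receiver only need to agree on the schema, not on the transport.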

The MIMOSA site illustrates the overall OSA-EAI architecture using the diagram below (reproduced from [48]).


Figure 16: MIMOSA Open System Architecture for Enterprise Application Integration (OSA-EAI)

The database model (CRIS) is represented as a logical and physical model, with direct Oracle and Microsoft SQL Server support for both table creation and reference data population.

7.3.3 Asset Condition

7.3.3.1 BACKGROUND

Rail infrastructure maintainers have traditionally relied on a prescriptive, time based preventive maintenance strategy to manage asset condition. These methodologies include formal inspection and asset condition assessment executed by maintenance teams.

Asset condition monitoring systems have been implemented by infrastructure maintainers to supplement the preventive maintenance strategies with predictive maintenance and to inform Reliability Centred Maintenance strategies.

Traditional condition monitoring systems are often based on the SCADA systems model: they are tightly integrated applications in which the concept of modularity is not applied to the processing blocks within the system. This tight coupling results in applications that are not easily extended from a processing perspective. In an environment where new algorithms and insights are still being developed (as in the rail industry), algorithms need to be quickly evaluated and de-risked before being generally applied to predict and prevent failures across a rail operator’s estate.


An open, modular, loosely coupled approach to integrating data and condition processing blocks into a predictive maintenance system enables the rapid evaluation of algorithms without introducing significant risk to existing processing and benefits.

7.3.3.2 MIMOSA OSA-CBM

In recognition that, historically, Condition Monitoring and Diagnostics (CM&D) systems are often tightly integrated, ISO 13374 (parts 1-3) sets out requirements for an open CM&D processing architecture. The purpose of this open architecture is to enable asset condition data to be processed and communicated in a plug-and-play manner.

MIMOSA OSA-CBM (Open System Architecture for Condition Based Maintenance) is an implementation of the ISO 13374 CM&D processing architecture, adding data structures and interface method definitions for the blocks defined in the standard. It is modeled using UML and is distributed under a non-exclusive, royalty free, perpetual license:

http://www.mimosa.org/sites/default/files/policies-charters/MIMOSA_License_Agreement.pdf

The processing architecture is identified as having six functional blocks – Data Acquisition (DA), Data Manipulation (DM), State Detection (SD), Health Assessment (HA), Prognostic Assessment (PA) and Advisory Generation (AG).

Figure 17: OSA-CBM functional blocks

[Figure: input from "Sensor / Transducer / manual entry" feeds six stacked blocks: Data Acquisition, Data Manipulation, State Detection, Health Assessment, Prognostic Assessment and Advisory Generation. Side panels: "External systems, data archiving and block configuration" and "Technical displays and information presentation".]

1. Data Acquisition blocks are responsible for transforming the output of a transducer or sample test to a scaled digital representation.

2. Data Manipulation blocks calculate descriptors and identify features of interest from sampled sensor data, other descriptors or the output of computations (for example, an average or a calculated duration).

3. State Detection blocks categorize data and generate descriptors for a measurement, component or system as normal or abnormal, including the degree of abnormality in the associated operational context (for example, is the average current greater than a threshold?).

4. Health Assessment blocks assess a component’s or system’s current health state with associated diagnoses of discovered abnormal states in the associated operational context (for example, a point machine is at 60% health and is showing symptoms of an emerging brush fault with 80% confidence).

5. Prognostic Assessment blocks assess a component’s or system’s future health state with the associated predicted abnormal states and remaining life for a projected operational context (for example, a point machine will reach critical health of 35% in two days, based on normal timetable operation and current planned maintenance, with a confidence of 70%).

6. Advisory Generation blocks integrate information to generate advisories to operations and maintenance and to respond to capability forecast assessment requests (for example, a recommendation to maintain a point machine in tonight's overnight maintenance window, as the asset has enough remaining useful life to support normal timetable operation until then but not enough to reach the next scheduled maintenance intervention, with a confidence of 60%).
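The first three blocks can be sketched in a few lines of Python. This is a minimal illustration of the block responsibilities only, not the OSA-CBM data structures or interfaces; the function names, the scale factor, the sample values and the 3.0 A threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StateResult:
    descriptor: str
    value: float
    abnormal: bool
    severity: float  # degree of abnormality relative to the threshold, 0.0 when normal


def data_acquisition(raw_counts, scale):
    """DA: convert raw transducer counts into scaled engineering units."""
    return [count * scale for count in raw_counts]


def data_manipulation(samples):
    """DM: derive a descriptor (here, the average) from sampled data."""
    return mean(samples)


def state_detection(avg_current, threshold):
    """SD: classify the descriptor as normal or abnormal against a threshold."""
    excess = max(0.0, avg_current - threshold)
    return StateResult("avg_current", avg_current, avg_current > threshold, excess / threshold)


raw = [300, 320, 340, 360]                    # hypothetical ADC counts
samples = data_acquisition(raw, scale=0.01)   # amps
avg = data_manipulation(samples)
state = state_detection(avg, threshold=3.0)
```

Health Assessment, Prognostic Assessment and Advisory Generation would consume `state` (and further context) in the same way, each adding interpretation while reducing data volume.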

This model can be considered as a form of maturity model, where the most data and least value is at the DA layer and the least data of highest value is derived at the AG layer.

Although diagrammatically presented as a sequence from DA, to DM, to SD, to HA, to PA and finally to AG, the model allows for a functional block at any level to ingest and process data from a functional block at any other level. For example, a SD block may consume data directly from a DA functional block.

When realized through implementation on an appropriate software architecture, such as a service bus or middleware, this model enables a plug-and-play approach to algorithm evaluation and implementation at any of the six levels in a loosely coupled manner. This enables the system owner to introduce new functions at any of the conceptual levels as and when they are available, greatly reducing the risks and costs associated with change to a tightly coupled system.
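The loose coupling described here can be illustrated with a small in-process publish/subscribe bus. This is a sketch under stated assumptions: the `Bus` class, the topic names and the thresholds are hypothetical and stand in for a real service bus or middleware product. Note that one State Detection block consumes Data Manipulation output while another reads Data Acquisition output directly.

```python
from collections import defaultdict


class Bus:
    """Minimal in-process publish/subscribe bus (stand-in for real middleware)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Copy the list so handlers may subscribe further handlers safely.
        for handler in list(self._subscribers[topic]):
            handler(payload)


bus = Bus()
results = {}


def dm_average(samples):
    """DM block: publish the average of the DA samples."""
    bus.publish("DM/avg_current", sum(samples) / len(samples))


def sd_threshold(avg):
    """SD block fed by DM: flag an average above 3.0 A."""
    bus.publish("SD/over_current", avg > 3.0)


def sd_spike(samples):
    """SD block fed directly by DA: flag any single sample above 5.0 A."""
    bus.publish("SD/spike", max(samples) > 5.0)


bus.subscribe("DA/current", dm_average)
bus.subscribe("DM/avg_current", sd_threshold)
bus.subscribe("DA/current", sd_spike)  # added without touching existing blocks
bus.subscribe("SD/over_current", lambda flag: results.__setitem__("over_current", flag))
bus.subscribe("SD/spike", lambda flag: results.__setitem__("spike", flag))

bus.publish("DA/current", [2.0, 3.0, 6.0])
```

Adding `sd_spike` required only a new subscription; none of the existing blocks had to change, which is the property the plug-and-play approach relies on.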

7.3.4 Alarms Systems

7.3.4.1 BACKGROUND

EEMUA 191 defines an alarm as a means of indicating to an operator an equipment or process malfunction or an abnormal condition. Alarm systems support operators in generating and handling alarms and in managing abnormal situations.


7.3.4.2 ALARM STATE MODEL

IEC 62682 is an international technical standard on the management of alarm systems for the process industries. It does not define an open data exchange format or model; it does, however, define an alarm state model that is directly applicable to alarm management in any industry, including rail.

Using a common alarm state model across alarm systems can greatly reduce the complexity of integration and alarm definition.

Figure 18: IEC 62682 Alarm State Model

[Figure: state diagram. States (process / alarm / acknowledgement): A Normal (normal / not active / acknowledged); B Unacknowledged alarm (abnormal / active / unacknowledged); C Acknowledged alarm (abnormal / active / acknowledged); D RTN unacknowledged (normal / not active / unacknowledged); E Shelved (N/A / not active / N/A); F Suppressed by design (N/A / not active / N/A); G Out of service (N/A / not active / N/A). Transition labels: abnormal condition, acknowledge, re-alarm, return to normal condition, shelve, un-shelve, designed suppression, designed un-suppression, remove from service, return to service.]

The alarm states are defined in IEC 62682 as:

Normal state (A) The normal (NORM) alarm state is defined as the state in which the process is operating within normal specifications, the alarm is inactive and past alarms have been acknowledged.

Unacknowledged state (B) The unacknowledged alarm (UNACK) state is the initial state of an alarm becoming active due to abnormal conditions. In this state the alarm is unacknowledged. Previously acknowledged alarms can be designed to re-alarm, causing a return to this state.

Acknowledged state (C) The acknowledged (ACKED) alarm state is the state in which the alarm is active and the operator has acknowledged the alarm.

Return to normal unacknowledged state (D) In the returned to normal unacknowledged (RTNUN) alarm state, the process is within normal limits and the alarm becomes inactive before an operator has acknowledged the alarm condition.

Shelved state (E) In the shelved (SHLVD) alarm state an alarm is temporarily suppressed using a controlled methodology, and not annunciated. An alarm in the shelved state is under the control of the operator. The shelving function can automatically unshelve alarms.

Suppressed-by-design state (F) In the suppressed-by-design (DSUPR) alarm state an alarm is suppressed based on operating conditions or plant states, and not annunciated. An alarm in the suppressed-by-design state is under the control of logic that determines the relevance of the alarm.

Out-of-service state (G) In the out-of-service (OOSRV) alarm state an alarm is manually suppressed (e.g., control system functionality to remove alarm from service) when it is removed from service, typically for maintenance, and not annunciated. An alarm in the out-of-service state is under the control of maintenance.
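The core of this state model can be expressed as a small transition table. The sketch below is one reading of states A to D and of the acknowledge, abnormal condition, return to normal and re-alarm transitions; the event names are invented for the example, and the exact transition endpoints should be checked against IEC 62682 itself. Shelving, suppression by design and removal from service are omitted because the state an alarm returns to afterwards depends on the process condition at that time.

```python
from enum import Enum


class AlarmState(Enum):
    NORM = "A"    # normal
    UNACK = "B"   # unacknowledged alarm
    ACKED = "C"   # acknowledged alarm
    RTNUN = "D"   # returned to normal, unacknowledged


# (current state, event) -> next state; an assumed reading of the state model.
TRANSITIONS = {
    (AlarmState.NORM, "abnormal_condition"): AlarmState.UNACK,
    (AlarmState.UNACK, "acknowledge"): AlarmState.ACKED,
    (AlarmState.UNACK, "return_to_normal"): AlarmState.RTNUN,
    (AlarmState.ACKED, "return_to_normal"): AlarmState.NORM,
    (AlarmState.ACKED, "re_alarm"): AlarmState.UNACK,
    (AlarmState.RTNUN, "acknowledge"): AlarmState.NORM,
    (AlarmState.RTNUN, "abnormal_condition"): AlarmState.UNACK,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state.name}") from None


def run(events, state=AlarmState.NORM):
    """Apply a sequence of events, starting from the normal state by default."""
    for event in events:
        state = next_state(state, event)
    return state


final = run(["abnormal_condition", "acknowledge", "return_to_normal"])
```

Sharing one such table between alarm systems is what makes integration simpler: each system only has to map its own events onto the common event names.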

7.4 PROCESS MINING / BUSINESS PROCESS ANALYTICS

The need for companies to learn more about how their processes operate in the real world is a major driver behind the development and increasing use of process-mining techniques. The practice of business process mining derives from the field of data mining. Data mining refers to the extraction of knowledge from large data sets through identification of patterns within the data. Data mining practice has been developed and adapted to create the business process-mining techniques that are now being used to mine data logs containing process execution data to reconstruct actual business processes. Business process-mining techniques use execution logs of business processes. These are typically hosted within business process management (BPM) systems, though they may also be accessible through other process-related systems installed within a company (see [49]).
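A simple way to see how an execution log leads back to a process structure is to count which activity directly follows which within each case. The sketch below computes this "directly-follows" relation, a common first step in process discovery; it is an illustration only and not a method prescribed by the sources cited here. The log layout (case id, timestamp, activity) and the maintenance work-order events are invented for the example.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity b directly follows activity a within a case.

    event_log: iterable of (case_id, timestamp, activity) tuples.
    """
    traces = defaultdict(list)
    # Order events by case and then by time to rebuild each case's trace.
    for case_id, _timestamp, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)

    counts = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


log = [
    ("wo-1", 1, "fault reported"),
    ("wo-1", 2, "inspection"),
    ("wo-1", 3, "repair"),
    ("wo-2", 1, "fault reported"),
    ("wo-2", 2, "repair"),
    ("wo-3", 1, "fault reported"),
    ("wo-3", 2, "inspection"),
    ("wo-3", 3, "repair"),
]
dfg = directly_follows(log)
```

The resulting counts form the edges of a graph of the process as it was actually executed, which can then be compared with the intended process model.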

There are many techniques that may be used to perform mining of business processes:

Genetic algorithms.

General algorithmic approach.

Markovian approach.

Neural network.

Cluster analysis.


The BPM life cycle consists of (see [50]):

Process design. In this stage, fax- or paper-based as-is business processes are electronically modeled into BPMS. Graphical standards are dominant in this stage.

System configuration. This stage configures the BPMS and the underlying system infrastructure (e.g. synchronization of roles and organization charts from the employees’ accounts in the company’s active directory). This stage is hard to standardize due to the differing IT architectures of different enterprises.

Process enactment. Electronically modeled business processes are deployed in BPMS engines. Execution standards dominate this stage.

Diagnosis. Given appropriate analysis and monitoring tools, the BPM analyst can identify and improve on bottlenecks and potential fraudulent loopholes in the business processes. The tools to do this are embodied in diagnosis standards.

The main issues still encountered in business process mining are (see [50]):

Noise. Logged data may be incorrect or incomplete creating problems when data is being mined.

Hidden tasks. Tasks that exist but cannot be found in the data.

Duplicate tasks. Two process nodes may refer to the same process model.

Non-free choice constructs. These are controlled choices that depend on choices made in other parts of the process model.

Mining loops. A process may be executed several times; loops may be simple involving one or more events or more complex.

Different perspectives. Process events may be appended with additional information for mining purposes.

Delta analysis. Comparison of process model and reference model to check for similarity/disparity.

Visualising results. The results of process mining may be presented in graphical form in terms of a management panel.

Heterogeneous results. Access to information systems based on different platforms.

Concurrent processes. Mining of processes occurring at the same time.

Local/global search. Local strategies restrict the search space and are less complex, global strategies are complicated but have a better chance of finding the optimal solution.

Process re-discovery. The selection of a mining algorithm which can rediscover a class of process models from a complete workflow log.

7.5 BUSINESS PROCESS DATA EXCHANGE STANDARDS

With intensified globalisation, the effective management of an organisation’s business processes has become ever more important. Contributing factors include ([51]):

the rise in frequency of goods ordered;

the need for fast information transfer;

quick decision making;

the need to adapt to change in demand;


more international competitors; and

demands for shorter cycle times

Many new BPM terminologies and technologies are often not well defined and understood by many practitioners and researchers using them. New languages and notations proposed often contain duplicating features for similar concepts.

Standardisation groups (e.g. OMG) which pioneered interchange standards often claim their creations as the missing link between the business analyst and the IT specialist.

There are currently some prominent interchange standards:

(1) BPDM by OMG.

(2) XPDL by the WfMC.

(3) B2B information exchange standards.

7.5.1 Business Process Definition Metamodel (BPDM)

The BPDM is an XML-based proposal by the OMG. It was initiated following a request for proposals (RFP) issued on 31 January 2003 and is still in its formative years. At the time of writing, the finalization of the specifications is underway (see [70]). BPDM provides the capability to represent and model business processes independent of notation or methodology, thus bringing different approaches together into a cohesive capability.

As its name suggests, the BPDM was meant to be the authoritative meta-object facility (an abstract modelling language by the OMG) metamodel for the common elements in process definitions (see [44]).

The metamodel behind BPDM captures business processes in a very general way and provides an XML syntax for storing and transferring business process models between tools and infrastructures. Various tools, methods and technologies can then map their own way of viewing, understanding and implementing processes to and through BPDM (see [70]).

This means that BPDM works like a multi-lingual standards translator with a common platform. BPDM is not as concerned with graphical notation as with semantics. It is conceivable that vendors will choose to maintain their existing notations but use the OMG BP metamodel to facilitate the transfer of information to other tools and models. In other words, a variety of different notations can continue to thrive in the OMG BP metamodel. In the long run, however, the OMG will probably move most companies toward UML AD. BPDM has been criticised as a complex and user-unfriendly standard, and it is relatively immature, with no software tool using it.

7.5.2 XML Process Definition Language (XPDL)

The XML-based XPDL has stood the test of time, with its tenth anniversary falling in 2008. XPDL started in 1995 when the WfMC published the workflow reference model identifying five key interfaces necessary for any WfMS. One of the interfaces was for defining business processes. It includes a process definition expression language developed via a programmatic interface (i.e. process definition tool) to transfer the process definition to/from the workflow management system (see [71]).

From 2002 to 2004, XPDL was an influential standard for the interchange of process design. This was especially so after WfMC endorsed BPMN as a graphical standard in 2004, after it was enhanced to represent the concepts present in a BPMN diagram in XML.

This extension made XPDL ideal not only as a definition (i.e. execution) standard for business processes, but also as an interchange format between BPMN and XML-based execution standards (e.g. BPEL). The third revision of XPDL (XPDL 2.0) was released by the WfMC in 2005.

As its flow control features cannot be compared to those of BPEL and BPML ([122]), the main strength of XPDL, and its selling point, remains its interchange capability.

There are currently over 70 products and applications that leverage XPDL on Java, Microsoft .NET Framework or Linux. Examples include Oracle 9i Warehouse Builder, IDS Scheer Business Architect, and BEA Enterprise Repository and BPM Suite.

7.5.3 B2B Information Exchange Standards

Electronic data interchange. Electronic data interchange (EDI), one of the early B2B information exchange standards, was created for communications between different proprietary formats of collaborating partners. There are two predominant forms of EDI: the American National Standards Institute X12 standards and the European UN/EDIFACT standards. In 1987, the International Organization for Standardization (ISO) adopted the EDIFACT standard. EDI serves to facilitate document exchange between companies. It is a medium for exchanging business documents with external entities, and integrating the data from those documents into the company’s internal systems. This is done via a value-added network, which is like a post office that forwards the data bundles to their designated businesses for a service fee (see [72]).

ebXML BPSS. Electronic Business using eXtensible Markup Language (ebXML) was formalised in 2001 as a joint initiative between the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) and OASIS. Presently, it is a full set of ISO standards maintained by its two contributing organisations. ebXML’s stated objective was to make it possible for any business of any size in any industry to do business with any other company anywhere in the world. The initial hope was that the presence of an accepted international e-business standard would motivate small business software developers to support ebXML. ebXML is a collection of general standards that are not specific to any business (i.e. horizontal standards), whereas RosettaNet comprises specific standards that cover their target domain thoroughly (i.e. vertical standards). ebXML can be adopted at a much lower cost than RosettaNet (see [51]).

RosettaNet. RosettaNet, launched in June 1998, aims to standardise supply chain interactions by creating interoperable collaborative business processes. Member companies transact billions of dollars within their trading networks using partner interface process (PIP) specifications. PIPs are system-to-system, XML-based dialogues that represent operational-level collaborative business processes. Each PIP defines how two specific processes, running in two different partners’ organizations, are standardized and interfaced across the entire supply chain. PIPs include all business logic, message flow, and message contents to align the two business processes. The entire scope of RosettaNet processes is divided into seven clusters containing all supply chain processes: partner product and service review, product information, order management, inventory management, marketing information management, service and support, and manufacturing (see [51]).

Universal Business Language. Universal Business Language (UBL) is a royalty-free library of XML-based, commonly used business documents such as purchase orders, invoices and legal documents. It is an international effort by OASIS, designed to eliminate the re-keying of data in existing fax- and paper-based business correspondence and provide an entry point into electronic commerce for small and medium-sized businesses. Its second version, UBL 2.0, was released in 2006 (see [51]).

7.6 STRENGTHS AND WEAKNESSES OF INTERCHANGE STANDARDS

The strengths of interchange standards include:

interchange standards offer a “globally accepted” file format in which to save process definitions, so that business process models in different BPMS are, ideally, compatible; and

XPDL is well-accepted and stable, having had a ten-year history.

The shortcomings of interchange standards include (see [51]):

Owing to fundamental differences in graph-oriented graphical and block-oriented execution standards, the quality of transformation of the interchange standards is limited by different syntax and structures. For instance, a cyclical and temporal implication in a graphical standard cannot be easily transformed into an execution standard. The translation of recursive capabilities from an execution standard to a graphical standard is an even more challenging task.

Currently in the industry, translation from graphical to execution is easier than that from execution to graphical standards. This applies to XPDL and even BPDM. This limitation raises doubts as to whether the “bridge between the business analyst and the IT specialist” is near in sight.


8 OPEN DATA EXCHANGE: USAGE IN DOMAINS AND RELEVANT COMMUNITIES

Section 8 describes how the most important of the aforementioned technologies and applications are used in different domains, and sketches the relevant communities and sources of further information. The railway domain is covered first, since the state of the art in this field provides a basis for anyone who works, or will work, in it. Usage in other domains may give hints on formats and applications that could be adopted, and on where to look for further details.

8.1 RAILWAY

8.1.1 Open data provided by European Infrastructure Managers

There is a range of open data feeds available from European railways. The tables below present an overview of some of these. Typically this type of information can be classified into the following categories:

Health & Safety: Metrics that describe the numbers and types of safety and occupational health incidents.

Operational: Information associated with the current timetable and the operational movements of trains.

Operational Performance: Train delays, cancellations and other quality of service indicators.

Network & Asset Characteristics: Physical description of the laydown of the network including asset registries and the location of assets. Some railways provide information about the condition of the assets.

Network Usage: Information about the extent of services and the numbers of passengers or quantity of freight that is carried.

Corporate: Information about finance, human resources, carbon emissions and other indicators where there is a responsibility to make data available to the public.

With the exception of “operational” data, most of the data does not change frequently. Consequently, whilst some operational data feeds are provided in real time (via messaging services or application programming interfaces), the remainder are relatively static and may be updated quarterly or even less frequently.

France

Access to the real-time data feeds is via https://data.sncf.com/api [52] and https://ressources.data.sncf.com/explore/?sort=modified [53].


Table 7: Data feeds in France

Theme | Feed | Type of data | Updating rate | Description and comments
Operational | Train journeys - best duration | Static | Yearly | Best (lowest) duration evaluated for certain journeys
Operational | Train journeys | Real-time | Automatically updated 5 times a week | Calculates a multi-train itinerary that goes through multiple train stations. Provides planned scheduled times for TGV and Regional trains
Operational | Timetables | Real-time | Automatically updated 5 times a week | Consults a line’s scheduled route (and stops). Provides planned scheduled times for TGV and Regional trains
Operational | Scheduled stops | Real-time | Automatically updated 5 times a week | Looks up scheduled stops in each station. Provides planned scheduled times for TGV and Regional trains
Operational | SNCF Transilien Real-Time Departures | Real-time | Once a week | Provides SNCF Transilien network departures in real time for given stations
Network and Assets characteristics | List of lines with general information | Static | Yearly | List of lines including type of line (e.g. high speed)
Network and Assets characteristics | List of stations | Static | Yearly | List of all the stations along the network, including the type e.g. passenger station, marshalling yard, etc.
Network and Assets characteristics | Technical and operating characteristics of the lines | Static | Yearly | Data provided per homogeneous line section including operating status, maximum speed, electrified, speed control system implemented, links line/regional areas
Network and Assets characteristics | Lists of specific assets | Static | Yearly | Individual lists of assets (provided per asset type), with their location on the network including: level crossings, track circuits, hotbox detectors, bridges, tunnels, substations, earthworks.
Network and Assets characteristics | List of private siding | Static | Yearly | List of private sidings locations and network availability
Network and Assets characteristics | Technical and operating characteristics of the tracks | Static | Yearly | Data provided per homogeneous track section including curves, grade, specific operating rules (equipment for occasional wrong-track working)
Maintenance | Track possessions | Static | Monthly | Maintenance and renewals activities on the lines, including the kind of assets concerned (catenaries, track etc.). Data shared per month and per line (no detailed location and maintenance dates).
Operational performance | Delays and Quality of service indicators | Static | Monthly | Performances indicators provided per service line (TGV, Paris regional lines, other regional lines, etc.)
Usage of the network | Annual statistics - Passengers and Freight | Static | Yearly | National indicators expressed in Passenger km/year and gross ton-km/year
Corporate | National Financial indicators | Static | Yearly | Data and information published as part of SNCF Réseau commitment to transparency. Indicators describing the financial status and performances of SNCF Réseau, as presented each year in the Annual Production and Activities Report. Includes sales revenue, taxes, debt, annual volumes of OPEX and CAPEX
Corporate | Corporate Social Responsibility | Static | Yearly | Data and information published as part of SNCF Réseau commitment to transparency that includes workforce and organisation, staffing / retirement, effective annual working duration, CO2 emissions, etc.
Corporate | Safety incidents | Static | Weekly | Description and location of events with significant safety issues (precursors of accidents)
Corporate | Passengers' accidents | Static | Yearly | Annual counting of events and related consequences

Germany

Table 8: Data feeds in Germany

Theme | Feed | Type of data | Description and comments
Network and Assets characteristics | Rail network DB | Static | National rail network provided in XML or GeoJSON format. http://data.deutschebahn.com/dataset/data-streckennetz [54]
Network and Assets characteristics | Station data | Static | The API provides station addresses, GPS and additional information (including the length of platforms). http://data.deutschebahn.com/dataset/data-stationsdaten [55]
Operational | Target timetable Fernverkehr | Static | Target timetable for long-distance trains. http://data.deutschebahn.com/dataset/api-fahrplan [56]
Operational | Berlin Brandenburg API | Real-time | The transport association Berlin-Brandenburg provides an API for real-time data for all suburban railways (S-Bahn) and metro trains (U-Bahn). http://www.vbb.de/de/article/fahrplan/webservices/schnittstellen-fuer-webentwickler/5070.html [57]

Switzerland

Table 9: Data feeds in Switzerland

Theme | Feed | Type of data | Description and comments
Network and Assets characteristics | Actual data | Real-time | The actual service provided is displayed. The final forecast is used where no actual data are available. The "quality" is shown in the appropriate Status fields. https://opentransportdata.swiss/en/dataset/istdaten [58]
Operational | Timetable 2017 (GTFS) | Real-time | The timetable contains the essential topological and temporal elements that enable timetable display and information. https://opentransportdata.swiss/en/dataset/timetable-2017-gtfs [59]
Network and Assets characteristics | DiDok | Static | DiDok stands for “Dienststellendokumentation” (location documentation). The data are an extract from all of the operating points in Switzerland, including all of the stops. https://opentransportdata.swiss/en/dataset/didok [60]
Corporate | Business Organisations | Static | The business organisations display transport companies organised structurally by billing-related and customer-information-related features. https://opentransportdata.swiss/en/dataset/goch [61]
Network and Assets characteristics | Station list | Static | The station list consists of two files taken from the timetable (HRDF). https://opentransportdata.swiss/en/dataset/bhlist [62]
Operational | Timetable 2017 (HRDF) | Yearly | The annual timetable contains the timetable data that are primarily communicated in the form of printed materials. https://opentransportdata.swiss/en/dataset/timetable-2017-hrdf [63]
Network and Assets characteristics | GTFS Realtime | Real-time | GTFS Realtime is an expansion to GTFS static. It offers the “Trip Updates” feed for transport companies supplying real-time information. https://opentransportdata.swiss/en/dataset/gtfsrt [64]
Operational | Trip forecast | Real-time | API allows the user to retrieve real-time data about a specific trip. https://opentransportdata.swiss/en/dataset/fahrtprognose [65]
Operational | Departure/arrival display | Real-time | The departure/arrival display’s API allows you to search for the departures/arrivals from/to a stop at a specific time. Real-time information is given where applicable. https://opentransportdata.swiss/en/dataset/aaa [66]
Operational | Timetable overview | Operational | The file provides an overview of the available timetable data as well as its status, its validity and the corresponding permalink. https://opentransportdata.swiss/en/dataset/timetabeloverview [67]

United Kingdom

The real-time data feeds presented in Table 10 are targeted at software developers (see [68]). In addition, the Office of Rail & Road provides high-level statistics about the UK railway ([69]).

Table 10: Data feeds in the United Kingdom

Theme Feed Type of Data Updating rate Description and Comments

Operational RTPPM Real-time 1 per minute Real Time Public Performance Measure. This shows the performance of trains against the timetable, measured as the percentage of trains arriving at their destination on time, and is updated every minute.

Operational Train Movements

Real-time Up to 600 per minute

Messaging from the TRUST system, containing reports of train movements past timetabled calling and passing points. Note: Messages are batched to reduce network overheads.

Operational TD Real-time Up to 1000 per minute

Berth-level data from the Train Describer system, showing raw data with train movements in more detail than the Train Movements feed. Note: Messages are batched to reduce network overheads.

Operational VSTP Real-time Low volume Late-notice train schedules which are not available through the SCHEDULE feed

Operational TSR Low-volume <10 per week on Fridays

Temporary Speed Restriction data as published in the Weekly Operating Notice

Operational SCHEDULE Static Daily Extracts of train schedules from the Integrated Train Planning System in CIF and JSON format

Operational Reference Data

Static Infrequent Reference data which can be used to help analyse other data feeds: SMART - train describer berth offset data used for train reporting; Corpus - location reference data (JSON format); BPLAN - train planning data, including locations and sectional running times (Public Interface Format “PIF”); Train Planning Network Model - contains very detailed information on the network model used by ITPS, the Integrated Train Planning System.

Summary Rail Statistics Compendium

Static Yearly Annual compendium publication contains a summary of the statistical releases published by ORR.

Usage of the network

Freight rail usage - freight moved, freight lifted, normalised freight delay

Static Quarterly All information on rail freight usage in Great Britain.

Usage of the network

Estimates of station usage

Static Yearly All information on station usage in Great Britain.

Usage of the network

Passenger rail usage - Passenger train KM, Passenger KM, journeys, revenue

Static Quarterly All information on rail passenger usage in Great Britain.

Corporate Passenger Rail service complaints - Complaints, Appeals,NRE

Static Quarterly This release contains information on complaints made by passengers regarding rail services in Great Britain.

Corporate Disabled Person’s Railcard (DPRC) and assisted journeys data

Static Quarterly Rail passenger assists and bookings.

Usage of the network

Regional rail usage

Static Yearly Provides passenger journeys data for each Region of Great Britain, covering the volume of journeys between Regions and within Regions. It also looks at cross-border flows between England, Scotland and Wales.

Operational performance

Passenger & freight rail performance - PPM, CaSL, FDM

Static Quarterly This section includes reports on the punctuality of passenger and freight services and the reliability of passenger service.

Health & Safety indicators

Signals passed at danger (SPADS) - Official Statistics

Static Quarterly Number of signals passed at danger (SPADs) without authority on the mainline

Corporate UK rail industry financial information - Official Statistics

Static Yearly UK rail industry financial information presents ORR's analysis of the latest financial data from across the industry

Corporate Rail fares index

Static Yearly Shows average change in price of rail fares and by ticket type.

Health & Safety indicators

Occupational Health - Official Statistics

Static Yearly Provides occupational health indicators for rail including manual handling, shock or trauma incidents, assaults and verbal abuse.

Corporate Rail finance Static Yearly Includes government support, subsidy, private investment

Health & Safety indicators

Rail safety - Key Safety Statistics

Static Yearly Provides safety indicators for rail including broken and buckled rail, passenger, public, workforce, road-rail interface, injuries at level crossings and near misses.

Network and Assets characteristics

Rail infrastructure, assets and environmental

Static Yearly Asset management information including asset renewals, remediation projects, asset failures, asset condition, carbon emissions, late possessions, station and station stewardship data.
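The RTPPM feed listed in Table 10 reports a simple ratio: the percentage of trains arriving at destination on time. The following is a minimal sketch of that arithmetic only. The function name, the sample delays and the on-time threshold are assumptions made for illustration; the feed itself delivers values that have already been computed.

```python
def public_performance_measure(delays_min, threshold_min):
    """Return the percentage of trains whose arrival delay is below the threshold."""
    if not delays_min:
        raise ValueError("no trains to evaluate")
    on_time = sum(1 for delay in delays_min if delay < threshold_min)
    return 100.0 * on_time / len(delays_min)


# Hypothetical arrival delays in minutes for eight trains.
sample_delays = [0, 2, 4, 7, 12, 1, 0, 3]

# The threshold is an assumed parameter, not a value taken from the feed.
ppm = public_performance_measure(sample_delays, threshold_min=5)
print(f"PPM: {ppm:.1f}%")
```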

A number of further initiatives are being considered to make data openly available to academia and suppliers in order to enable increased exploitation. These activities focus on providing specific datasets associated with:

Asset inventories

Asset faults and incidents

Track and OLE monitored data (by NR fleet and also passenger-based monitoring)

Environmental data as available

Train delay data

The data made available is intended to support the development of methodologies that respond to specific NR challenges. These activities are being progressed through research and development programmes and are at an early stage of development.

Use of standard open formats

In general, most data is provided through one of two channels:

A web service that is accessed to download “static” data. Much of this data is provided in open comma-separated values (CSV) or Excel format.

Real-time data streams provided via an Application Programming Interface (API). A relatively narrow range of formats is employed, including JSON and XML.

There is limited use of railway/transport-specific open formats: the Common Interface Format (CIF) is used to exchange schedules in the UK, and the General Transit Feed Specification (GTFS) is used in France and Switzerland.

RailML and SensorML are not actively used by the Infrastructure Managers that have been

involved in this review (Network Rail and SNCF).
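The two delivery channels described above, static downloads and real-time streams, are typically combined by a consumer. The following is a minimal sketch using only the Python standard library. The column names, the message fields and the sample values are hypothetical and do not reproduce any of the feeds reviewed here.

```python
import csv
import io
import json

# Hypothetical static extract, as it might be downloaded as a CSV file.
STATIONS_CSV = """station_id,name,entries_and_exits
ABC,Example Central,120000
XYZ,Sample Junction,4500
"""

# Hypothetical real-time message, as it might arrive from a JSON stream.
MOVEMENT_JSON = '{"train_id": "1A23", "location": "ABC", "delay_min": 3}'


def load_static(text):
    """Parse CSV text into a dict keyed by station_id."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["station_id"]: row for row in reader}


def enrich(message_text, stations):
    """Attach the station name from the static data to a real-time message."""
    message = json.loads(message_text)
    station = stations.get(message["location"])
    message["location_name"] = station["name"] if station else None
    return message


stations = load_static(STATIONS_CSV)
event = enrich(MOVEMENT_JSON, stations)
print(event)
```

The static reference data only needs to be loaded once, whereas each real-time message is decoded and joined as it arrives.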

8.1.2 railML®

8.1.2.1 INTRODUCTION

railML® is a data exchange format based on the Extensible Markup Language (XML) focusing on railway applications ([75]). At the same time, railML.org is an open source initiative working constantly on the development of this data exchange format for railway applications.

Currently, railML includes the following data schemes ([76]):

Timetable

Infrastructure

Rolling stock

Interlocking

The latest version of railML®, railML® v2.3, was released in March 2016. The following table provides an overview of the versions that are currently supported and of the planned future releases ([77]):

Table 11: railML® versions [77]

Additionally, railML® v2.4 was announced at the most recent railML® Conference in Berne on 22.03.2017 ([79]). According to this announcement, the changes to the railML® infrastructure schema will be minor ([80]).

railML is a user-driven standard for railway data exchange that is based on an open development process. A number of tools support the interaction between users and developers:

The railML Website is the central point of information. From here, railML users, developers and interested people are directed to all the other information they are searching for ([81]).

The railML Forum is the discussion platform where users can discuss modelling and application aspects with other users and with developers ([82]). The different scheme-specific forum topics are moderated by the railML scheme coordinators.

The railML Wiki is the open usage documentation that complements the scheme documentation (cf. [83]). In particular, railML beginners will find here useful information about the different elements and attributes.

The railML Trac is a ticket system for tracking defects and enhancements that have been discussed and consolidated in the railML forum and which shall be solved / implemented in the future (cp. [84]).

railVIVID is an open-source tool for validation and viewing of railML files. It can be downloaded from the railML Website (cf. [85]).

The railML website [81] lists more than 100 companies as railML partners. 24 of these companies are categorized as “developers”. 42 companies are categorized as “users”. The remaining companies are tagged as “supporters”.

In order to use railML in projects and products, the license terms listed in [87] have to be obeyed. If you intend to use railML productively, you are obliged to certify the railML interface(s) that you want to promote or sell. A detailed description of the certification process is given in [88].

The data exchange format is used in domains such as:

exchange of the track geometry

capacity operational simulation

timetable information

exchange of train formation data

schematic track plan

exchange of infrastructure, interlocking, timetables and rolling stocks information.

8.1.2.2 INFRASTRUCTURE SCHEME

The railML infrastructure schema has its focus on the description of the railway network and related infrastructure ([89]).

Topology. The track network is described as a topological node edge model at the level of tracks and switches.

Coordinates. All railway infrastructure elements can be located in an arbitrary 2- or 3-dimensional coordinate system, e.g. the WGS84 that is widely used by today's navigation tools and software. It is further possible to define a separate height coordinate system.

Geometry. The track geometry can be described in terms of radius and gradient change points along the track.

Railway infrastructure elements enclose a variety of railway relevant assets that can be found on, under, over or next to the railway track, e.g. balises, platform edges and level crossings.

Further located elements encompass elements that are closely linked with the railway infrastructure, but that "cannot physically be touched", e.g. speed profiles and track conditions.
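The topological node edge model described above can be illustrated with a small sketch that reads track elements from XML and builds an adjacency map. The XML fragment is deliberately simplified and is not valid railML; its element and attribute names are illustrative assumptions and have not been checked against the railML schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, hypothetical XML. NOT valid railML; names are illustrative only.
SIMPLIFIED_XML = """
<infrastructure>
  <track id="t1" from="n1" to="n2" length="1200"/>
  <track id="t2" from="n2" to="n3" length="800"/>
  <track id="t3" from="n2" to="n4" length="650"/>
</infrastructure>
"""


def build_graph(xml_text):
    """Return an adjacency map: node id -> list of (neighbour, track id, length in m)."""
    root = ET.fromstring(xml_text.strip())
    adjacency = defaultdict(list)
    for track in root.iter("track"):
        start, end = track.get("from"), track.get("to")
        length = float(track.get("length"))
        # Tracks are edges; they can be traversed in both directions.
        adjacency[start].append((end, track.get("id"), length))
        adjacency[end].append((start, track.get("id"), length))
    return adjacency


graph = build_graph(SIMPLIFIED_XML)

# Node n2 plays the role of a switch: three tracks meet there.
neighbours_of_switch = sorted(neighbour for neighbour, _, _ in graph["n2"])
print(neighbours_of_switch)
```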

8.1.2.3 TIMETABLE SCHEME

The railML timetable schema focuses on the description needed to exchange any kind of timetable for operational or conceptual purposes. It covers the following information ([83]):

Operating Periods. The operating days for train services or rostering.

Train Parts. The basic parts of a train with the same characteristics such as formation and operating period. The train part includes the actual information regarding the path of the train as a sequence of operation or control points together with the corresponding schedule information.

Trains. One or more train parts make up a train and represent either the operational or the commercial view of the train run.

Connections. The relevant connections/associations between trains at a particular operation or control point.

Rostering. Train parts can be linked to form the circulations necessary for rostering (rolling stock schedules).
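The relationship between operating periods, train parts and trains listed above can be sketched as a small data structure. This is an illustration of the concepts only: the class names, the attributes and the weekday encoding are assumptions and do not reproduce the railML timetable schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OperatingPeriod:
    id: str
    weekdays: str  # assumed encoding: 7 characters, Monday first, "1" = runs


@dataclass
class Stop:
    location: str   # operation or control point
    arrival: str    # "HH:MM", empty at the origin
    departure: str  # "HH:MM", empty at the destination


@dataclass
class TrainPart:
    id: str
    operating_period: OperatingPeriod
    stops: List[Stop] = field(default_factory=list)


@dataclass
class Train:
    id: str
    kind: str  # "operational" or "commercial"
    parts: List[TrainPart] = field(default_factory=list)

    def path(self):
        """Concatenate the locations of all parts, merging the shared boundary points."""
        locations = []
        for part in self.parts:
            for stop in part.stops:
                if not locations or locations[-1] != stop.location:
                    locations.append(stop.location)
        return locations


weekdays = OperatingPeriod("op1", "1111100")
part1 = TrainPart("tp1", weekdays, [Stop("A", "", "08:00"), Stop("B", "08:30", "")])
part2 = TrainPart("tp2", weekdays, [Stop("B", "", "08:35"), Stop("C", "09:10", "")])
train = Train("tr1", "commercial", [part1, part2])
print(train.path())
```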

8.1.2.4 ROLLING STOCK SCHEME

The rolling stock schema focuses on the description of rail vehicles, including locomotives, multiple units, and passenger and freight wagons, as well as the combination of single vehicles into formations. Scheme features are listed below ([83]):

separate parts for vehicles and for train parts or complete trains

possible specification of vehicle families and individual vehicles using the common features of the family

different level of detail for data

1. vehicle as black box (with respect to dynamic characteristics) with only mean values

2. vehicle as black box (with respect to dynamic characteristics) with curves for particular values being variable within the operating range

3. vehicle as white box with details about the internal propulsion system

vehicles with motive power, for passenger or freight use

combination of vehicles to formations, i.e. train parts or complete trains
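Two of the features listed above, vehicle families with shared characteristics and the combination of vehicles into formations, can be sketched as follows. The class names, attributes and figures are hypothetical and chosen for illustration; they are not taken from the railML rolling stock schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vehicle:
    id: str
    length_m: Optional[float] = None
    mass_t: Optional[float] = None
    family: Optional["Vehicle"] = None  # common features are looked up here

    def resolved(self, name):
        """Return the vehicle's own value, falling back to the family value."""
        value = getattr(self, name)
        if value is None and self.family is not None:
            return self.family.resolved(name)
        return value


@dataclass
class Formation:
    id: str
    vehicles: List[Vehicle]

    def total(self, name):
        """Sum a characteristic over all vehicles (every value must be resolvable)."""
        return sum(vehicle.resolved(name) for vehicle in self.vehicles)


coach_family = Vehicle("coach_family", length_m=26.5, mass_t=45.0)
loco = Vehicle("loco1", length_m=19.0, mass_t=84.0)
coach1 = Vehicle("coach1", family=coach_family)               # inherits both values
coach2 = Vehicle("coach2", mass_t=47.0, family=coach_family)  # overrides the mass

formation = Formation("f1", [loco, coach1, coach2])
print(formation.total("length_m"), formation.total("mass_t"))
```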

8.1.2.5 INTERLOCKING SCHEME

The railML interlocking scheme has its focus on the description of information that infrastructure managers usually maintain in signal plans and route locking tables. A few scenarios are listed below ([83]):

Data transfer. A standard data exchange format will allow the automation of data transfer, which is the process of adapting a railway interlocking and signalling system to a specific yard.

Simulation programs. The railML® IL schema allows modellers to quickly absorb information about the interlocking systems, such as timing behaviour and routes, and to analyse the impact on railway capacity.

The Interlocking scheme is not available for the railML Version 2.x, but will be implemented in the railML Version 3.

8.1.2.6 RAILML® V2.3

railML version 2.3 is the latest release of the data exchange format; it was published in March 2016. Like the previous versions, railML v2.3 contains infrastructure, timetable and rolling stock elements and attributes. A detailed list of changes between railML v2.2 and v2.3 can be found in [90]. For infrastructure, the modifications are not very extensive, since the railML infrastructure development is already actively working on the new baseline of the model, railML v3.

8.1.2.7 RAILTOPOMODEL AND RAILML® V3

UIC RTM Feasibility Study

Commissioned by UIC, the Swiss IT company TrafIT Solutions carried out a feasibility study for a common railway infrastructure data model. The results were presented at the 24th railML.org Conference in Paris on 18.09.2013 ([91]). The study concluded that about 95% of all the elements and attributes of the different existing data models used by European railway companies are very similar to each other, because they all refer to the built railway network. The feasibility study postulates central requirements for a generic railway infrastructure topology model:

The model must be scalable: a generic core may be extended by various user specific themes.

Topology is the core of the data model.

The model shall support different levels of detail and these levels are linked with each other.

Depending on the specific user application, the relevant information is stored at the matching level of detail.

RTM Development

Based on the results of the feasibility study, the railway infrastructure managers organized in the UIC working group ERIM (European Railway Infrastructure Masterplan) started the development of a generic railway infrastructure data model, the RailTopoModel (RTM). In 2016, the RailTopoModel approach was released by the UIC as International Railway Solution IRS 30100 ([92]). Thus, the RailTopoModel can be seen as a standardized modelling approach for railway infrastructure data models that is recommended for use by the International Union of Railways. In addition to IRS 30100, UIC published a wiki in order to provide complementary information about the modelling concepts, practical model usage and model constraints ([93]).

RTM Modelling concepts

The foundation for RTM is a mathematical graph model that is described in detail in [95]. It allows modeling the topology of a railway network of any complexity. Further, RTM defines a generic concept for locating railway infrastructure elements at the railway network described by the graph: A generic “NetEntity” is working as an anchor element for any objects referencing the topology network. The anchor element may be modelled punctiform, linear or as sub-network. For further information about the RTM modelling concepts, please read the relevant wiki pages in [101].
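The anchoring concept described above, where an object is located on the topology graph as a point, as a linear stretch or as a sub-network, can be sketched as follows. Only the term “NetEntity” comes from the text; all other class names, attributes and sample objects are assumptions for illustration and are not taken from IRS 30100.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class SpotLocation:
    element: str     # id of the graph element the object sits on
    position: float  # assumed: relative position along the element, 0.0 to 1.0


@dataclass(frozen=True)
class LinearLocation:
    element: str
    start: float
    end: float


@dataclass(frozen=True)
class AreaLocation:
    elements: Tuple[str, ...]  # a sub-network given as a set of graph elements


Location = Union[SpotLocation, LinearLocation, AreaLocation]


@dataclass(frozen=True)
class NetEntity:
    name: str
    location: Location


def touches(entity, element):
    """Return True if the entity is anchored on the given graph element."""
    location = entity.location
    if isinstance(location, AreaLocation):
        return element in location.elements
    return location.element == element


entities = [
    NetEntity("signal S1", SpotLocation("e1", 0.8)),
    NetEntity("speed profile 80 km/h", LinearLocation("e1", 0.0, 1.0)),
    NetEntity("station area", AreaLocation(("e2", "e3"))),
]
on_e1 = [entity.name for entity in entities if touches(entity, "e1")]
print(on_e1)
```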

railML® v3

The RTM forms the basis for the development of the new version of the data exchange format, railML v3. It can be seen as a first implementation of the RTM with a special focus on data exchange using XML syntax. Like the RTM, railML v3 has been developed on the basis of UML class diagrams using the software Enterprise Architect. The railML v3 schema files (XSD) are generated from the UML using integrated and proprietary export tools. A first official release of railML v3 schema files is scheduled for autumn 2017 (cp. [77]).

8.1.2.8 RAILML® IN IN2RAIL

railML in IN2RAIL WP9

IN2RAIL (Innovative Intelligent Rail) is the main predecessor project to IN2SMART. In its work package WP 9 the focus was on design and development of an advanced asset information system that is able to analyse and predict the status of the network assets. In this context, the data structure of the asset status data to be collected is of major interest. The deliverable D9.1 lists all the data that are necessary for the asset information system and its interfaces (cf. [3]). Further, it is analysed how existing data exchange formats like railML cover these requirements. As a result of this investigation it has been decided to use railML for exchanging static railway infrastructure data and to use a different format like SensorML for all the dynamic information.

IN2RAIL requires railML v3

Since the current railML version 2.3 does not cover all the aspects required for the asset status representation, the IN2RAIL project is awaiting the upcoming new version, railML v3. To ensure that all the required elements and attributes are implemented in it, a railML data exchange use case was derived from the report D9.1 and officially submitted to railML.org. This use case is now available in the railML wiki [102]. It currently has priority 2, which means that the use case is going to be implemented not with the very first railML v3 release (railML v3.1), but with the following one (railML v3.2). According to railML.org [78], the release of railML v3.2 is planned for the end of 2018.

railML in IN2RAIL WP7

The focus of WP 7 is the development of a prototype based on the IN2RAIL WP 9 described above. The deliverable D7.1 describes that the import format of the topology data could be railML 2.3 or the new railML version 3. The railML 3 version uses the UIC RailTopoModel for modelling the railway infrastructure. The format to be used is still under discussion and is to be decided in WP8.

8.1.2.9 RAILML COMMUNITY

railML® is a community-driven project that is coordinated by the non-profit organization railML.org, which is a registered association under German law. railML.org is fully independent from railways, manufacturers and authorities ([129]). Two coordinators, Vasco Paul Kolmorgen and Dr. Daniel Hürlimann, are in charge of the main coordination at railML.org. The ongoing development of the different railML data subschemes is managed by four scheme coordinators who come from scientific railML partners as well as from industry using the railML standard.

The following two sections describe the sequence of steps for railML® scheme development. The first one is directed to the use case driven railML development, which is quite new and specifically set up for the new railML version 3. The second section addresses the process of incorporating small changes in the railML schema as it has been done in the past for previous and current versions of railML 2.x.

Use case view

Initial situation: you have a specific data exchange issue for which you want to use the railML data exchange format. Such an issue is called a railML use case.

Step 1: Review the lists of use cases in the railML wiki in order to find out whether your use case has already been recorded ([95]). There exists a list of use cases for every railML subschema. The use cases for railML based data exchange of railway infrastructure data can be found in [97].

Step 2a: If your use case is already listed, review it with respect to your specific task to find out whether there are relevant aspects missing in the use case. If that’s the case, bring your issues to the railML forum ([81]) and discuss it there together with the railML community. The responsible railML scheme coordinator will lead the process of use case modification. Finish.

Step 2b: If your use case is not yet listed, inform the responsible scheme coordinator by email including a very brief use case description. The scheme coordinator reviews the use case description and asks you to formulate the use case according to the structured template either directly in the railML wiki ([83]) or using a Word document. The use case description comprises:

o A precise description of the data exchange application.

o A brief analysis of the relevant data flows and interfaces for the data exchange.

o A brief summary of characteristics of the data to be exchanged via the interface.

o If possible, you may further add a list of functional elements that are included in your data exchange use case.

Step 3: The responsible railML scheme coordinator leads the task of consolidating the use case user input. In particular, requirements are derived and formulated as tickets in the railML Trac ticket system ([84]).

Step 4: The responsible railML scheme coordinator leads the implementation of the requirements and the resulting elements and attributes in the railML data model. Usually, the work is done by the railML scheme coordinator and a specific scheme working group. The implementation comprises:

o Changes in the railML data model (UML or XSD)

o Tracking the changes in the railML Trac ticket system

o Documenting the changes in the source code

o Documenting the changes in the railML Wiki

Step 5: The responsible railML scheme coordinator leads the task of use case element specification. The aim of this step is to specify which elements and attributes of the railML data model are mandatory considering the given use case and which elements and attributes are optional.

Step 6: The responsible railML scheme coordinator leads the work of writing an official use case document that brings all the use case facets mentioned before together. The official use case document will be entitled “Use Case Definition” and released on the railML website. Thus, the use case definition is the reference document for certification of railML interface implementations.

Element view

Initial situation: you discover a bug in the existing railML® schemes or you want to enhance the model at a specific point.

Step 1: Discuss the issue with users and developers of the railML® community in the railML forum ([81]). There is one forum for each railML subschema and one forum for common aspects.

Step 2: The responsible scheme coordinator summarizes the outcome of the forum discussion and consolidates the solution / result. If the solution comprises a modification or extension of the existing railML data model, the scheme coordinator will create a ticket using the railML Trac ticket system ([84]). Each ticket is linked with a future version of railML. Thus, users can see when to expect which modifications being implemented in the schema.

Step 3: The railML scheme coordinator leads the implementation of the scheme modification or enhancement and tracks the state in the railML Trac ticket system. The implementation comprises:

o Changes in the railML data model (UML or XSD)

o Tracking the changes in the railML Trac ticket system

o Documenting the changes in the source code

o Documenting the changes in the railML Wiki

o Releasing the new version of railML in the railML subversion repository ([98]) and publishing information about the changes

o If necessary, adapting the use cases that are affected by the changes

o Presenting the summary of changes at the next railML conference

In the meantime: Modifications and enhancements of the railML data model require some time for implementation. If you cannot wait that long, you may want to make use of the “any element” and “any attribute” to attach your own temporary scheme extensions to the railML model.
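The extension mechanism mentioned above relies on elements and attributes from a separate XML namespace being attached to the standard model. The following is a minimal sketch of how a reader could separate such extensions from the core content. Both namespace URIs are placeholders and the element and attribute names are hypothetical; none of them is taken from the railML schema.

```python
import xml.etree.ElementTree as ET

CORE_NS = "https://example.org/core-schema"     # placeholder, not the real railML namespace
EXT_NS = "https://example.org/asset-extension"  # hypothetical private extension namespace

DOCUMENT = f"""
<track xmlns="{CORE_NS}" xmlns:ext="{EXT_NS}" id="t1" ext:inspectionClass="B">
  <ext:lastTamping date="2017-05-12"/>
</track>
"""


def split_extensions(xml_text):
    """Return the root element plus its foreign-namespace attributes and children."""
    root = ET.fromstring(xml_text.strip())
    core_prefix = "{" + CORE_NS + "}"
    # ElementTree stores qualified names as "{namespace}local"; plain attributes have no brace.
    ext_attrs = {
        key: value
        for key, value in root.attrib.items()
        if key.startswith("{") and not key.startswith(core_prefix)
    }
    ext_children = [child for child in root if not child.tag.startswith(core_prefix)]
    return root, ext_attrs, ext_children


root, ext_attrs, ext_children = split_extensions(DOCUMENT)
print(root.get("id"), ext_attrs, [child.tag for child in ext_children])
```

A reader that does not know the extension namespace can ignore these parts and still process the core content.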

Following the sequences of steps above, the schema is continually extended, modified and enhanced.

8.1.3 TAF/TAP TSI

The Telematic Applications for Freight / Passenger Services Technical Specifications for Interoperability ([TAF], [TAP]) are EU regulations specifying the exchange of information between relevant stakeholders, in order to enable cross-border rail services.

All RU/IM messages described in TAP are common with TAF (which contains additional messages specific for freight traffic). The common processes are related to path allocation, train readiness, train running reporting and service interruption. Consequently, messages are harmonised between TAP and TAF and gathered in the same data model.

The Technical Specification for Interoperability on “Telematics Applications for Passengers” (TAP TSI) prescribes protocols for the data exchange of

timetables

tariffs

reservations, fulfilment

information to passengers in station and vehicle area

train running information

etc.

which must be respected by the European rail sector (railways, infrastructure managers, ticket vendors etc.) according to the European Rail Passengers’ Rights Regulation EC/1371/2007 and to the Interoperability Directive EC/2008/57.

Figure 19: Principle of common interface for TAF/TAP TSIs

Data formats used in TAP TSI:

EDIFACT (timetabling)

Fixed length text files (tariff data)

Binary messages (reservation messages)

XML-messages (home printed tickets, PRM reservation)

The Technical Specification for Interoperability on “Telematics Applications for Freight” (TAF TSI) drafted by ERA prescribes protocols for the data exchange of:

Path request

Train Running Forecast

Service Disruption Information

Shipment Estimated Time of Interchange / Arrival

Etc.

TAF TSI prescribes furthermore databases which must be implemented by European RUs, IMs, or Freight Customers:

Reference Files (such as location ID, company ID etc.)

Rolling Stock Reference Databases

Wagon and Intermodal Unit Operational Database

Trip plan for wagon / Intermodal unit

TAF TSI prescribes the use of a so-called “common interface”, which is mandatory for all RUs and IMs:

Figure 20: Common interface for TAF TSI

The Common Interface supports the following connectors, which are used by existing legacy systems:

IP socket connector (customized application protocol)

JMS connector

MQ (IBM)/JMS connector

FTP Connector

FS Connector

SMTP Connector

Web Service Connector

Supported data formats are:

XML

Text

CSV

UIC 407-1

(see [103])

8.1.4 UIC 407-1

UIC 407-1, “Standardized data exchange for the execution of train operations, including international punctuality analysis”, is a standard developed by the International Union of Railways (UIC).

The objective of the standard is to automate as far as possible the exchange of operationally necessary information between the parties (IMs and/or RUs) involved in a train movement and to overcome language barriers in international rail traffic.

Messages defined in this standard contain information / data needed to carry out the most important processes in train operations. These are in the first instance processes of operative train running (for instance: traffic regulations and the planning of resource deployment) but also of planning and quality control. Where planning is concerned, however, this only applies in the case of alterations at short notice to current train utilization (e.g. scheduling of a special train or cancellation of a train to take effect within 24 hours).

Messages are functionally designed to facilitate exchanges between different IMs and between IMs and RUs. Exchanges between RUs only do not form part of this standard.

From a technical point of view, messages are suitable for exchange both between process-driven telematics systems and PC-based mail systems. Furthermore, an exchange of messages is also feasible between both systems variants (i.e. between telematics systems on the one hand and mail systems on the other).

The type and exact content of messages to be transmitted depend on the information requirements of the RUs involved in the exchange of messages as well as on the potential of their telematics or mail systems.

From an architectural point of view, an essential distinction is made between:

- train related messages (the primary statement always concerns an individual train)

- event related messages (the primary statement always concerns a specified event, for instance a strike or bomb threat)

- messages between Quality Monitoring Centres

Train related messages are in turn divided into:

- those that are not replied to by the recipient (unidirectional messages)

- those that require a reply from the recipient (bidirectional messages)

Train-related messages can be designed and subsequently issued or received, displayed or further processed either within a process-controlling telematics system (traffic control system) or via a suitably equipped mail system.

In telematics systems, a message can be designed and sent either automatically, triggered by an event in the ongoing process, or manually, triggered by the operator. The latter approach is always required in a mail system.

Messages are defined as text strings with fields of fixed length. In this standard, messages have been numerically coded to reflect the aforementioned principles. The coding pattern is as follows:

2001 to 2099 – unidirectional messages between IMs

2101 to 2199 – unidirectional messages from IMs to RUs

2201 to 2299 – unidirectional messages from RUs to IMs

2301 to 2399 – bidirectional messages between IMs

2401 to 2699 – bidirectional messages between IMs and RUs

2701 to 2799 – event related messages

2801 to 2899 – messages defined and structured in a national context in accordance with this standard

2901 to 2999 – messages between Quality Monitoring Centres

Unidirectional messages are numbered in rising sequence within a given number block, commencing with “01” in each case.

Bidirectional messages comprise an enquiry and a reply or several possible replies. The enquiries and their respective reply messages are numbered in rising sequences of tens, the units digit for enquiries in each case being “0”.

Each message within a number block has a set textual designation. Identical messages retain the same text from block to block. They are merely distinguished by their differing message numbers. These differing message numbers are nevertheless structured in such a way that the identical nature of a given message can be inferred. For example, the “running forecast” between IMs bears the message number “2001” whilst the running forecast from IMs to RUs bears the message number “2101”.
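The numbering scheme described above lends itself to a simple lookup. The following is a minimal sketch, not part of the standard. The block boundaries are the ones listed in the text, while the category labels and function names are illustrative. The rule that counterpart messages share their last two digits is an assumption inferred from the “2001”/“2101” example.

```python
# Number blocks as listed in the text; the labels are shortened for illustration.
BLOCKS = [
    (2001, 2099, "unidirectional, IM to IM"),
    (2101, 2199, "unidirectional, IM to RU"),
    (2201, 2299, "unidirectional, RU to IM"),
    (2301, 2399, "bidirectional, IM to IM"),
    (2401, 2699, "bidirectional, IM and RU"),
    (2701, 2799, "event related"),
    (2801, 2899, "national"),
    (2901, 2999, "Quality Monitoring Centres"),
]


def classify(number):
    """Return the category label of a message number."""
    for low, high, label in BLOCKS:
        if low <= number <= high:
            return label
    raise ValueError(f"{number} is outside all defined number blocks")


def is_enquiry(number):
    """Bidirectional enquiries have the units digit 0; replies use other digits."""
    return classify(number).startswith("bidirectional") and number % 10 == 0


def same_message_type(first, second):
    """Assumed rule: counterpart messages share their last two digits."""
    return first % 100 == second % 100


print(classify(2001), "|", classify(2101))
print(is_enquiry(2310), is_enquiry(2311))
print(same_message_type(2001, 2101))
```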

The standard provides the message frames to be exchanged; however, it does not define the protocols used to exchange the data.

Other concepts, such as TAF/TAP TSI, can now replace UIC 407-1.

(see [104])

8.1.5 RINF – Register of Infrastructure

The European Register of Infrastructure refers to Article 49 of Directive (EU) 2016/797 and provides for transparency concerning the main features of the European Railway infrastructure. The common technical specifications are set out in a Commission Implementing Decision (RINF Decision).

The most recent RINF Decision (Decision 2014/880/EU of 26 November 2014) repeals the previous Decision 2011/633/EU and introduces a computerised common user interface (CUI), which simplifies queries of infrastructure data. This interface, set up and managed by the European Railway Agency, is publicly available.

Furthermore, the RINF Decision obliges each Member State to nominate an entity (NRE) in charge of setting up and maintaining its register of infrastructure and to notify an implementation plan.

The primary purpose of RINF is to support technical compatibility between fixed installations and rolling stock within the European community.

For that purpose, the railway network is considered to be at the macro-level a series of operational points and sections of line. At the micro-level, subsystem features are assigned to infrastructure elements, such as tracks and sidings. Ultimately macro- and micro-level should be presented in terms of digital maps.

Railway network structure for RINF

For the purpose of RINF:

- the railway network is considered to be a series of operational points (OPs) connected by sections of line

- a line is a sequence of one or more sections, which may consist of several tracks

- a section of line is the part of a line between adjacent OPs and may consist of several tracks

- operational points are locations for train service operations, for example where train services can begin and end or change route, and where passenger or freight services are provided

- stopping points for passengers on plain line are also regarded as OPs

- operational points may be locations where basic parameters of a subsystem change, for example track gauge, voltage and frequency, or signaling system

- operational points may be at boundaries between MSs or IMs

- passing loops and meeting loops on plain line, or track connections only required for train operation, do not need to be published (however, if parameters change at the connection, it is considered an operational point and included in the register)

- sidings are all tracks not used for train service movements
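
The macro-level structure described above (operational points connected by sections of line, with lines as sequences of sections) can be sketched as a small data model. This is only an illustrative sketch: the class and field names are hypothetical and do not reproduce the RINF XML schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OperationalPoint:
    op_id: str   # hypothetical identifier
    name: str
    im: str      # infrastructure manager responsible for this OP


@dataclass
class SectionOfLine:
    line_id: str                 # line this section belongs to
    start_op: str                # id of the OP at one end
    end_op: str                  # id of the adjacent OP at the other end
    tracks: List[str] = field(default_factory=list)


@dataclass
class Network:
    ops: Dict[str, OperationalPoint] = field(default_factory=dict)
    sections: List[SectionOfLine] = field(default_factory=list)

    def add_op(self, op: OperationalPoint) -> None:
        self.ops[op.op_id] = op

    def add_section(self, section: SectionOfLine) -> None:
        # A section of line always connects two known operational points.
        for op_id in (section.start_op, section.end_op):
            if op_id not in self.ops:
                raise ValueError(f"unknown operational point: {op_id}")
        self.sections.append(section)

    def line(self, line_id: str) -> List[SectionOfLine]:
        """A line is the sequence of sections sharing one line id."""
        return [s for s in self.sections if s.line_id == line_id]

    def boundary_sections(self) -> List[SectionOfLine]:
        """Sections whose end points belong to different IMs."""
        return [s for s in self.sections
                if self.ops[s.start_op].im != self.ops[s.end_op].im]


network = Network()
network.add_op(OperationalPoint("OP-A", "Station A", im="IM 1"))
network.add_op(OperationalPoint("OP-B", "Border B", im="IM 1"))
network.add_op(OperationalPoint("OP-C", "Station C", im="IM 2"))
network.add_section(SectionOfLine("L1", "OP-A", "OP-B", tracks=["T1", "T2"]))
network.add_section(SectionOfLine("L1", "OP-B", "OP-C", tracks=["T1"]))
```

Keeping sections as separate objects between adjacent OPs mirrors the register's view that parameters can change at every operational point.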

Figure 2 shows an example of the railway network structure of RINF, whose elements belong to different IMs.

Figure 2: Structure of the railway network for the register

Items collected in RINF have to be accessible to end users (process of data retrieval). This requires an IT implementation and therefore a harmonized model of the railway system.

The use cases listed in Table 12 represent the primary purposes of RINF, which have influenced the selection of items in the database.

Table 12: Primary purpose of RINF

| Title | Description | Quality demand |
| --- | --- | --- |
| Technical compatibility for route allocation | RUs to retrieve technical characteristics of a specific route for the check of line specific technical compatibility between fixed installations and rolling stock. | Mission critical |
| Technical compatibility for EC verification | NoBos/DeBos to retrieve technical characteristics of a MS for the conformity assessment within the process of EC verification. | Process critical |
| RST design | Rolling stock manufacturers to retrieve technical characteristics for a certain part of the network in order to achieve compliance when designing and authorizing vehicles for placing in service on “type”-level. | Financial critical |
| Interoperability progress | EC/ERA/MSs/NSAs to retrieve characteristics for specific parts of the networks to follow up regularly the progress towards a European interoperable network in terms of key performance indicators. | Financial critical |

Regarding the transition period for infrastructures placed in service, the RINF WG decided to set the final deadline for transition to five years. All types of network shall be included in RINF within five years of the entry into force of the RINF specification (1 January 2015).

Figure 21: Principle of common interface for RINF

Data can be provided to RINF via the web application under “Data management”, where users can upload XML datasets compressed within a .zip file. The XML file is then validated against the corresponding XML Schema Definition (XSD). If the XSD validation fails, the data will be rejected and a report listing the encountered errors will be dispatched to the NRE.

The communication between the users and the RINF system is performed through the internet. Thus, in both cases the RINF architecture is transparent to the users. The users (i.e. RUs, IMs, Manufacturers, NSAs, etc.) open a web browser and connect to the RINF system via HTTP (or HTTPS for extra security if necessary). The RINF system provides access to the provided infrastructure information, as well as any additional functionalities and services. The RINF system queries the central RINF database and returns the requested RINF information to the users.
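
A data provider may want to run a local check before uploading. The following Python sketch is an assumption about how such a pre-check could look and is not part of the RINF application: it opens a .zip archive, parses every XML member and collects the errors into a report. It tests well-formedness only. The Python standard library has no XSD validator, so the schema validation performed by RINF itself would require a schema-aware library and the official XSD. The sample element names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, List


def precheck_upload(zip_bytes: bytes) -> List[str]:
    """Return an error report; an empty list means every XML member parsed."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile:
        return ["upload is not a valid .zip archive"]

    errors: List[str] = []
    with archive:
        xml_names = [n for n in archive.namelist() if n.lower().endswith(".xml")]
        if not xml_names:
            errors.append("archive contains no .xml dataset")
        for name in xml_names:
            try:
                ET.fromstring(archive.read(name))  # well-formedness check only
            except ET.ParseError as exc:
                errors.append(f"{name}: not well-formed XML ({exc})")
    return errors


def make_zip(files: Dict[str, str]) -> bytes:
    """Helper for the example: build a .zip archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Hypothetical dataset content; real RINF datasets follow the official XSD.
good = make_zip({"dataset.xml": "<Dataset><OperationalPoint/></Dataset>"})
bad = make_zip({"dataset.xml": "<Dataset><OperationalPoint></Dataset>"})
```

Calling `precheck_upload(good)` returns an empty report, while `precheck_upload(bad)` returns one entry naming the faulty file.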

More information can be found under RINF application guide or RINF User Manual (see [130]).

8.1.6 EULYNX

The EULYNX (European Initiative Linking Interlocking Subsystems) project is an initiative of several European infrastructure managers with the common goal of standardizing interfaces.

The following Infrastructure Managers are currently involved as partners in the initiative:

Société Nationale des Chemins de Fer Luxembourgeois (CFL)

DB Netz AG (DB)

S.A. Infrabel

Bane NOR

Liikennevirasto (FTA)

Network Rail

ProRail B.V.

Société Nationale des Chemins de Fer Français (SNCF)

SŽ-Infrastruktura, d.o.o. (SŽ)

Trafikverket

The project aspires to a mutually shared vision of harmonized rail signaling systems, including their technical architecture, functions and interfaces. The work breakdown structure of the EULYNX project includes items such as system architecture, modelling and testing, data preparation, interfaces between interlockings, and interfaces to track vacancy detection and to adjacent interlocking or signaling subsystems.

The rail infrastructure managers intend to benefit from the project by being able to change, maintain, renew and update the systems in a competitive way.

This would place the IMs, as the system integrators, in a position that gives them a choice of various suppliers for different subsystems during the system's life cycle. The goal is to reduce costs for new projects and for modifications of existing system functionality or infrastructure layouts. Maintenance-related activities should also benefit from this.

Results of previous European initiatives concerning interlocking system standardization (e.g. Euro Interlocking, INESS and ERTMS) provide the basis for the project.

The EULYNX initiative acts as a cooperation based on a mutually accepted agreement following democratic principles and membership fees. The project community expects to have different kinds of partners: rail infrastructure managers as core members, and other active members such as signaling or industrial partners, engineering bureaus and universities. Observers such as associations and regulators may also join.

The current phase of the project will provide a full set of specifications. The project started on 19 February 2014, with a three-year lifespan for this stage. After three years, the project organization will evolve into a standing organization for the standardization of interfaces, based on a full set of specifications to be published in 2017: Baseline Set 1 was partly released in March 2017, with the remaining documents planned for release in June 2017, and Set 2, including formal models, is scheduled for December 2017. ([123])

Figure 22: EULYNX System architecture

8.2 AUTOMOTIVE

A multitude of different formats is available for use in the automotive industry. As it is not possible to mention every used format within the scope of this research, some examples of important formats are listed below. To help readability, this section is separated into two parts: in the first one, automotive centered formats are discussed. While these formats were developed for automotive specific tasks, some of them may be applied in other domains as well. The second part focuses on general formats that were not specifically developed with the automotive industry in mind and are often used in many different domains outside of automotive. They range from data processing and management to non-specific formats like csv and UML.

8.2.1 Automotive centered formats

8.2.1.1 ASAM ODS

ODS (Open Data Services) focuses on the persistent storage and retrieval of testing data. The standard is primarily used to set up a test data management system on top of test systems that produce measured or calculated data from testing activities. Components of a complex testing infrastructure can store data or retrieve data as needed for proper operation of tests or for test data post-processing and evaluation. A typical scenario for ODS in the automotive industry is the use of a central ODS server, which handles all testing data produced by vehicle test beds. The major strength of ODS as compared to non-standardized data storage solutions is that data access is independent of the IT architecture and that the data model of the database is highly adaptable yet still well-defined for different application scenarios. ([105])

8.2.1.2 AUTOMOTIVE DATA AND TIME-TRIGGERED FRAMEWORK (ADTF)

EB Assist ADTF (Automotive Data and Time-Triggered Framework) is a tool for the development, validation, visualization and testing of driver assistance and automated driving features. It is intended to support the delivery of advanced driver assistance systems (ADAS) and highly automated driving (HAD) features. The vendor describes the tool as flexible, efficient, extendable and stable. ([106])

8.2.1.3 AUTOMOTIVE OPEN SYSTEM ARCHITECTURE (AUTOSAR)

AUTOSAR (AUTomotive Open System ARchitecture) is a worldwide development partnership of automotive interested parties founded in 2003. It pursues the objective of creating and establishing an open and standardized software architecture for automotive electronic control units (ECUs) excluding infotainment. Goals include the scalability to different vehicle and platform variants, transferability of software, the consideration of availability and safety requirements, a collaboration between various partners, sustainable utilization of natural resources, and maintainability throughout the whole "Product Life Cycle". ([107])

8.2.1.4 FUNCTIONAL MOCK-UP INTERFACE (FMI)

Functional Mock-up Interface (FMI) is a tool-independent standard to support both model exchange and co-simulation of dynamic models using a combination of XML files and compiled C code. It is used to develop virtual products based on components that interchange their data. The variables for data exchange are defined in XML files and the components (applications) are provided as compiled C functions. ([108])

8.2.1.5 STEP AP 242 XML (ISO 10303-242)

The standard STEP AP 242 (ISO 10303-242) "Managed model based 3D engineering" merges two ISO standards: the aerospace standard STEP AP203 "Configuration controlled 3D design" and the automotive standard STEP AP214 "Core data for automotive mechanical design processes". ([109])

8.2.1.6 OPENDRIVE

OpenDRIVE is an XML-based road description format that defines road networks topologically, by linking road elements, and topographically, by using mathematical descriptions in three dimensions. Infrastructure is defined as road objects or signals in a relative coordinate system. OpenDRIVE can now be seen as the de facto standard within the driving simulator domain and is used for testing purposes as well. It is developed and maintained by a core team consisting of car manufacturers, map makers and research institutes. OpenDRIVE is royalty-free. ([110])

8.2.1.7 NAVIGATION DATA STANDARD (NDS)

Navigation Data Standard is a data format for road networks used for navigation purposes. It has a strong focus on interoperability and minimal usage of resources. It defines road networks in a topological and topographical way (at lane level, based on mathematical functions) and also defines road infrastructure, city and terrain models in a rudimentary way. NDS is developed by a consortium consisting of car manufacturers, tier-1 suppliers and map makers. ([111])

8.2.1.8 OPENSCENARIO

OpenSCENARIO is an XML-based traffic scenario description format that defines entities, environmental conditions and the interaction or behavior of traffic participants. These interactions can be described relative to other entities, relative to the road or in absolute terms. For the road description, a link to a corresponding OpenDRIVE file is used. OpenSCENARIO is developed and maintained by a core team consisting of car manufacturers, simulation tool providers and research institutes. It is royalty-free. ([112])

8.2.1.9 SENSORIS

SENSORIS is an initiative founded by HERE to define a car-to-cloud data standard. Sensor data and information about environmental conditions are to be shared in a defined way between vehicles and the backend systems of different manufacturers. Currently, information about hazard warnings, street parking and traffic conditions is modelled. ([113])

8.2.1.10 ADASIS

ADASIS (Advancing map-enhanced driver assistance systems) provides a so-called “electronic horizon” in the vehicle to different assistance systems. This horizon contains, for example, information about the map, vehicle position and speed in a standardized data model. The system is developed in a consortium consisting of car manufacturers, tier-1 suppliers and map makers. ([114])

8.2.1.11 CAR 2 CAR COMMUNICATION CONSORTIUM

The Car 2 Car Communication Consortium (C2C CC) defines and establishes standards for cooperative intelligent transportation systems (C-ITS). Part of this task is the specification of, and contribution to, vehicle-to-vehicle and vehicle-to-infrastructure communication using various message types to describe the state of vehicles and to distribute information and warnings as well as information about intersection layouts and signal phases and timing. The consortium cooperates closely with the European Telecommunications Standards Institute. ([115])

8.3 INDUSTRY AND HOME AUTOMATION

8.3.1 Industry

See chapter 6.3.1 OPC UA.

8.3.2 Home Automation

8.3.2.1 BACNET

The communication protocol BACnet was specially developed for the requirements of buildings. It is suited for both the automation and the management level. The emphasis is placed on building automation and control with a view to HVAC plants, fire control panels, intrusion detection and access control systems. BACnet is continually being extended for additional building-specific systems such as escalators and elevators. By integrating new IT technologies such as IPv6 and Web services, the BACnet standard is further developing into a modern, IT-friendly and multidisciplinary building protocol. At the same time, standardized ASHRAE or AMEV device profiles ensure a high level of quality and planning reliability with a strict testing and certification procedure. ([116])

Standard: ISO 16484-5

8.3.2.2 KNX

KNX is an open, worldwide standard used for more than 20 years, conforming to EN 50090 and ISO/IEC 14543, which is supported by more than 300 vendors. With KNX technology, advanced multi-discipline solutions as well as simple solutions can be implemented to satisfy individual requirements in room and building automation in a flexible way. KNX products for the control of lighting systems, shading and room climate, as well as energy management and security functions, are designed for ease of installation and commissioning. A vendor-independent tool (ETS) is available for commissioning. KNX can use twisted pair cables, radio frequency (RF) or data transmission networks in connection with the Internet Protocol for communication between the devices. Coordinated room and building management often demands the integration of other technologies and systems. Hence, KNX links and interfaces for connection to Ethernet/IP, RF, lighting control with DALI and building automation and control systems are provided. ([117])

Standard: EN 50090 and ISO/IEC 14543

8.3.2.3 LONWORKS

The LonWorks-based communication protocol is one of the most widely deployed technologies worldwide. Using the protocol, complete networks made up of interoperable products can be created. This is proven by the fact that more than 700 LonMark®-certified products from more than 400 companies in the fields of building automation and control, traffic and energy supply are used. Owing to its worldwide use and being a global standard, LonWorks is focusing on HVAC functions in room automation and at the field level. The protocol conforms to ISO/IEC 14908 (worldwide), EN 14908 (Europe), ANSI/CEA-709/852 (U.S.) and is also standardized in China. ([118])

Standard: ISO/IEC 14908 (worldwide), EN 14908 (Europe), ANSI/CEA-709/852 (U.S.)

8.3.2.4 DALI

DALI (Digital Addressable Lighting Interface) is a standardized interface for lighting control. Electronic ballasts for fluorescent lamps, transformers and sensors of lighting systems communicate with the building automation and control system via DALI. ([119])

Standard: IEC 62386-101:2009-06, Part 101: System; IEC 62386-102:2009-06, Part 102: Control gear

8.3.2.5 ENOCEAN

Worldwide leading companies operating in the field of building infrastructure have joined to form the EnOcean Alliance, aimed at implementing innovative RF solutions for sustainable building projects. Core technology is the self-powered RF technology developed by EnOcean for maintenance-free sensors, which can be installed wherever desired. The EnOcean Alliance stands for the incremental development of the interoperable standard and for a secure future of the innovative RF sensor technology. ([120])

Standard: ISO/IEC 14543-3-10 (EnOcean radio)

8.3.3 OPC Foundation: The Interoperability Standard for Industrial Automation

The mission of the OPC Foundation (https://opcfoundation.org/) is to manage a global organization in which users, vendors and consortia collaborate to create data transfer standards for multi-vendor, multi-platform, secure and reliable interoperability in industrial automation.

To support this mission, the OPC Foundation:

- Creates and maintains specifications

- Ensures compliance with OPC specifications via certification testing

- Collaborates with industry-leading standards organizations

8.4 CIVIL ENGINEERING / CONSTRUCTION

8.4.1.1 CITY GML

City Geography Markup Language (CityGML) is a format for 3D models of cities and agglomerations. It is widely used for contour lines, roof areas of buildings, land use attributes, roads, railways and vegetation.

It can be imported in calculation programs for 3D noise mapping, where the areas are split into parts with different noise related attributes. Road and Railway imports are used for defining the sources and the acoustic properties. The vegetation attributes lead to different reflection coefficients, which have to be taken into account in large free field calculations.

8.4.1.2 GEOTIFF

GeoTIFF is primarily used by government bodies for satellite pictures or pictures from overflights, which may also capture other attributes and geometries. The geo-referencing makes it easy to import the pictures and adapt them to the required coordinate system. TIFF pictures are typically stored uncompressed or with lossless compression, so printed plans retain the highest available resolution. The pictures can also be used for modelling in a 3D noise mapping program. The picture resolution is usually sufficient for manual modelling of areas where no scanned 3D models exist. It also helps to improve imprecise or incorrectly scanned 3D models.

8.4.1.3 SIMPLE FEATURE ACCESS (OPENGIS)

Simple Feature Access is a format for storing 2D objects for geographical modelling. 2D objects like lines, areas and points can also be linked with attributes, such as population numbers or street names, and, if needed, with the height of the object. It is used in calculation programs for large-scale free-field noise mapping and modelling. It is also used in GIS programs like ArcGIS or QGIS. These programs are used for cutting and relabeling noise maps before uploading them to a public GIS platform, which typically uses the same format, as it is widely supported by GIS applications.
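
Simple Feature Access also defines a textual encoding for such geometries, Well-Known Text (WKT). The following Python sketch shows how points, lines and areas with attached attributes might be written in WKT. The helper names and the attribute keys are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float]


def _coords(points: Sequence[Coord]) -> str:
    # WKT separates x and y by a space and consecutive points by a comma.
    return ", ".join(f"{x} {y}" for x, y in points)


def point_wkt(point: Coord) -> str:
    return f"POINT ({point[0]} {point[1]})"


def linestring_wkt(points: Sequence[Coord]) -> str:
    return f"LINESTRING ({_coords(points)})"


def polygon_wkt(ring: Sequence[Coord]) -> str:
    closed: List[Coord] = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])  # a polygon ring must end where it starts
    return f"POLYGON (({_coords(closed)}))"


def feature(geometry_wkt: str, **attributes: object) -> Dict[str, object]:
    """Link a geometry with attributes such as a street name or a height."""
    return {"geometry": geometry_wkt, "attributes": dict(attributes)}


street = feature(linestring_wkt([(0, 0), (1, 1)]), street_name="Example Street")
building = feature(polygon_wkt([(0, 0), (4, 0), (4, 3)]), height_m=12)
```

Here `building["geometry"]` is a closed polygon ring, and the height is carried as an ordinary attribute rather than as a third coordinate.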

8.4.1.4 GEO JAVASCRIPT OBJECT NOTATION (GEOJSON)

JSON (short for JavaScript Object Notation) is a text-based file format for data exchange. The format is well structured and its syntax is derived from JavaScript object literals. JSON is widely used in web-based and mobile applications. Its file extension is *.json.

GeoJSON is built upon JSON. Its file extension is *.geojson. It allows representing geographical data by means of point, line string and polygon geometries as well as sets of them (multi-point, multi-line string and multi-polygon). GeoJSON files describing countries or smaller regional entities are freely available on the internet; country boundaries are usually represented by polygons. In many cases the open source JavaScript library D3.js is used to visualize them.
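
As an illustration, the following Python sketch builds a small GeoJSON document with one polygon feature and serializes it with the standard `json` module. The region name and coordinates are made up. Positions are written in longitude, latitude order, and the polygon ring is closed by repeating its first position.

```python
import json
from typing import Dict, List, Sequence, Tuple


def make_polygon_feature(name: str, ring: Sequence[Tuple[float, float]]) -> Dict:
    """Build a GeoJSON Feature with a single-ring Polygon geometry."""
    positions: List[List[float]] = [list(p) for p in ring]
    if positions[0] != positions[-1]:
        positions.append(positions[0])  # linear rings must be closed
    return {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Polygon", "coordinates": [positions]},
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        # Made-up square region; positions are (longitude, latitude).
        make_polygon_feature("Region A", [(0, 0), (1, 0), (1, 1), (0, 1)]),
    ],
}

geojson_text = json.dumps(collection, indent=2)
```

The resulting text could be saved with the *.geojson extension and read back with `json.loads`.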

The GeoJSON extension TopoJSON encodes topology, which makes it possible to eliminate redundancies. In GeoJSON format, the common border of two countries is stored twice (once in the polygon of country one and once in the polygon of country two). TopoJSON instead allows the definition of “arcs”: the common border is stored as a single arc, and arcs are then combined to form polygons. Thus topological relations are considered. As a result TopoJSON is substantially more compact than GeoJSON, frequently offering a reduction of ≥ 80% in file size without a loss in accuracy.
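
The idea of shared arcs can be shown with a simplified Python sketch. It is not the TopoJSON encoder and omits its other features. Two made-up regions share one border, which is stored once and referenced by both. A negative index written as `~i` denotes arc `i` traversed in reverse, following the convention TopoJSON uses.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Regions A (left) and B (right) share the border from (2, 0) to (2, 4).
arcs: List[List[Point]] = [
    [(2, 0), (2, 1), (2, 2), (2, 3), (2, 4)],  # arc 0: shared border
    [(2, 4), (0, 4), (0, 0), (2, 0)],          # arc 1: remainder of A
    [(2, 0), (4, 0), (4, 4), (2, 4)],          # arc 2: remainder of B
]

# Each polygon is a list of arc references instead of a list of points.
polygons: Dict[str, List[int]] = {"A": [0, 1], "B": [2, ~0]}


def resolve_arc(index: int) -> List[Point]:
    """Return the points of an arc; ~i means arc i in reverse order."""
    if index >= 0:
        return arcs[index]
    return list(reversed(arcs[~index]))


def ring(arc_indexes: List[int]) -> List[Point]:
    """Stitch arcs into a closed ring of points."""
    points: List[Point] = []
    for index in arc_indexes:
        part = resolve_arc(index)
        # The first point of a following arc repeats the previous end point.
        points.extend(part if not points else part[1:])
    return points


points_in_arcs = sum(len(a) for a in arcs)
points_in_rings = sum(len(ring(refs)) for refs in polygons.values())
```

In this small example the arc representation stores 13 points, whereas the two expanded rings together hold 16. The saving grows with the length and number of shared borders.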

8.5 TRAFFIC MANAGEMENT

In the traffic management domain, “classic data formats” such as spreadsheets, csv files, plain text and scanned images are sometimes still used for the exchange of static data (e.g. traffic signal plans, loop detector metadata). However, for static data as well as for highly dynamic data, structured and more flexible formats such as XML are increasingly used, and for real-time data web service technologies such as WMS and WFS are often used.

In addition to HERE and TomTom / Tele Atlas, OpenStreetMap (OSM) is becoming more and more popular as a map basis and is even used by public authorities. When setting up, for example, traffic information or traffic management systems, large amounts of data often have to be processed and combined, and as a first step may be placed on a central base map. The location referencing system OpenLR, which is described below, can be used for this purpose.

An important topic in traffic management is “real-time traffic information”. The commonly used formats TMC, TPEG and DATEX II are described below as well. In public transport, the General Transit Feed Specification (GTFS) is used to distribute schedule and real-time information, and journey planners can be accessed through interfaces such as TRIAS, which is used in Germany.

8.5.1 OpenLR (Location Referencing)

OpenLR is a royalty-free open standard for "procedures and formats for the encoding, transmission, and decoding of local data irrespective of the map" developed by TomTom. The format allows locations localised on one map to be found on another map to which the data have been transferred. ([122], [123])

OpenLR requires that coordinates are specified in WGS 84 and that the lengths of route links are given in meters. In addition, every route needs to be assigned a "functional road class".
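
These three requirements can be expressed as a small validation sketch in Python. The class and field names are hypothetical and do not reproduce the OpenLR physical formats. The range 0 to 7 for the functional road class is stated here as an assumption and should be checked against the OpenLR specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePoint:
    """Hypothetical container for one point of a location reference."""
    lon: float                 # WGS 84 longitude in degrees
    lat: float                 # WGS 84 latitude in degrees
    frc: int                   # functional road class (assumed range 0..7)
    distance_to_next_m: float  # length of the route link in meters

    def __post_init__(self) -> None:
        if not -180.0 <= self.lon <= 180.0:
            raise ValueError("longitude outside WGS 84 range")
        if not -90.0 <= self.lat <= 90.0:
            raise ValueError("latitude outside WGS 84 range")
        if not 0 <= self.frc <= 7:
            raise ValueError("functional road class outside assumed range 0..7")
        if self.distance_to_next_m < 0:
            raise ValueError("link length in meters must not be negative")


# Made-up example values.
start = ReferencePoint(lon=13.40, lat=52.52, frc=2, distance_to_next_m=350.0)
```

Validating at construction time means that a decoder working on such points never has to handle out-of-range coordinates.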

The specification is licensed under a Creative Commons license ([123]). TomTom has published a library for the format under the GPLv2.

8.5.2 Transport Protocol Experts Group TPEG

The Transport Protocol Experts Group (TPEG) is a data protocol suite for traffic and travel related information. TPEG can be carried over different transmission media (bearers), such as digital broadcast or cellular networks (wireless Internet). TPEG applications include, among others, information on road conditions, weather, fuel prices, parking or delays of public transport. ([124])

8.5.3 Traffic Message Channel (TMC)

Traffic Message Channel (TMC) is a technology for delivering traffic and travel information to motor vehicle drivers. It is digitally coded using the ALERT C protocol into RDS Type 8A groups carried via conventional FM radio broadcasts. It can also be transmitted on Digital Audio Broadcasting or satellite radio. TMC allows silent delivery of dynamic information suitable for reproduction or display in the user's language without interrupting audio broadcast services. Both public and commercial services are operational in many countries. When data is integrated directly into a navigation system, traffic information can be used in the system's route calculation. ([125])

8.5.4 DATEX II

Delivering European Transport Policy in line with the ITS Action Plan of the European Commission requires co-ordination of traffic management and development of seamless pan European services. With the aim to support sustainable mobility in Europe, the European Commission has been supporting the development of information exchange mainly between the actors of the road traffic management domain for several years. In the road sector, the DATEX standard was developed for information exchange between traffic management centres, traffic information centres and service providers and constitutes the reference for applications that have been developed in the last 10 years. The second generation DATEX II specification now also pushes the door wide open for all actors in the traffic and travel information sector.

Much investment has been made in Europe, both in traffic control and information centres over the last decade and also in a quantum shift in the monitoring of the trans-European transport network (TEN-T). This is in line with delivering the objectives of the EasyWay programme for safer roads, reduced congestion and a better environment. Collecting information is only part of the story – to make the most of the investment data needs to be exchanged both with other centres and, in a more recent development, with those developing pan-European services provided directly to road users. DATEX was originally designed and developed as a traffic and travel data exchange mechanism by a European task force set up to standardise the interface between traffic control and information centres. With the new generation DATEX II it has become the reference for all applications requiring access to dynamic traffic and travel related information in Europe. The aim of the DATEX II organisation is that in 2020 DATEX II is THE information model for road traffic and travel information in Europe. ([126])

8.5.5 General Transit Feed Specification (GTFS)

The General Transit Feed Specification (GTFS), also known as GTFS static or static transit to differentiate it from the GTFS real time extension, defines a common format for public transportation schedules and associated geographic information. GTFS "feeds" let public transit agencies publish their transit data and developers write applications that consume that data in an interoperable way.

A GTFS feed is composed of a series of text files collected in a ZIP file. Each file models a particular aspect of transit information: stops, routes, trips, and other schedule data. The details of each file are defined in the GTFS reference.

An example feed can be found in the GTFS examples. A transit agency can produce a GTFS feed to share their public transit information with developers, who write tools that consume GTFS feeds to incorporate public transit information into their applications. GTFS can be used to power trip planners, time table publishers, and a variety of applications, too diverse to list here, that use public transit information in some way. ([127])
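
Because a feed is a ZIP file of comma-separated text files, it can be read with standard tools. The following Python sketch reads `stops.txt` from a feed held in memory. The feed built by `make_demo_feed` and its stop names are made up for the example. The column names `stop_id`, `stop_name`, `stop_lat` and `stop_lon` are those defined for `stops.txt` in the GTFS reference.

```python
import csv
import io
import zipfile
from typing import Dict, List


def read_stops(feed_bytes: bytes) -> List[Dict[str, object]]:
    """Read the stops of a GTFS feed given as the bytes of its ZIP file."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as feed:
        with feed.open("stops.txt") as raw:
            # utf-8-sig tolerates an optional byte order mark at file start.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return [
                {
                    "stop_id": row["stop_id"],
                    "stop_name": row["stop_name"],
                    "lat": float(row["stop_lat"]),
                    "lon": float(row["stop_lon"]),
                }
                for row in csv.DictReader(text)
            ]


def make_demo_feed() -> bytes:
    """Build a tiny, made-up feed containing only stops.txt."""
    stops_txt = (
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "S1,Central Station,52.525,13.369\n"
        "S2,Market Square,52.520,13.405\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as feed:
        feed.writestr("stops.txt", stops_txt)
    return buffer.getvalue()


stops = read_stops(make_demo_feed())
```

The same pattern applies to the other files of a feed, such as `routes.txt` or `trips.txt`, each with its own columns.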

8.5.6 The TRIAS interface

The TRIAS interface (corresponding to the German VDV-431-2) makes it possible to connect to different journey planners. The interface emerged from the research and standardisation project IP-KOM-ÖV of the German BMVI. The information provided via TRIAS is as up to date as the information provided by the original journey planner ([128]).

8.5.7 Mobilitätsdatenmarktplatz (MDM) and mCLOUD

The Mobility Data Marketplace (Mobilitätsdatenmarktplatz, MDM) is a neutral B2B platform for providers and users of traffic data ([132]). It offers defined standards for data exchange and, above all, the most comprehensive information in Germany on traffic flows, congestion, road works, parking facilities and more. The Mobility Data Marketplace is where stakeholders, information and opportunities meet. The MDM …

- is a neutral platform, on which real time road traffic data of the public authorities and the private sector are offered and exchanged

- is the national single point of access for road traffic data in Germany, following the European ITS Directive.

- offers a defined service level for the data exchange and undertakes the data distribution to customers.

- relieves providers and customers of organisational and technical effort. For example, road operators can offer their traffic data on a central platform and do not have to find individual solutions for different customers.

The mCLOUD is a data platform of the BMVI (German Federal Ministry of Transport and Digital Infrastructure) which started in May 2016. It makes the data holdings of the ministry and its subordinate agencies, including millions of mobility, geo and weather data, searchable at one central point. The mCLOUD is a growing system and is also open to data from science and industry. The mCLOUD

- is a search platform for open data from the business domain of the BMVI.

- works as a search engine, which makes searching and finding simple.

- does not distribute the data itself, but instead refers to data interfaces and download links of the supplying providers / organisations.

At its start in 2016, the mCLOUD included data sets on mobility (roads, railways, waterways, air transportation), weather and climate, and water bodies, e.g.:

- Data of the 1,700 count locations of the German Bundesanstalt für Straßenwesen (BASt) (road utilisation, traffic density)

- Flood times and water levels in the German Bight

- Real-time data and water levels of navigable waterways

- Time series of more than 1,000 climate stations of the German Weather Service

- Timetables of Deutsche Bahn, including data about the parking situation at railway stations

Both the MDM and the mCLOUD are part of the BMVI's strategy to support intelligent mobility in Germany. While the mCLOUD is a free open data portal for searching for data, the MDM offers its users comprehensive functions for offering, subscribing to and exchanging real-time data.


9 REFERENCED DOCUMENTS

[1] http://www.in2rail.eu/, accessed 2017.

[2] http://www.in2rail.eu/Page.aspx?CAT=DELIVERABLES&IdPage=69d2e365-3355-45d4-bb3c-5d4ba797a3ac, accessed 2017.

[3] IN2RAIL: Deliverable D9.1 – Asset status representation.

[4] http://www.wordle.net/create, accessed 2017.

[5] JSON.org. (2017). Introducing JSON. Retrieved from json.org: http://json.org/json-en.html, accessed 2017.

[6] XML ESSENTIALS. (2017). Retrieved from w3.org: https://www.w3.org/standards/xml/core, accessed 2017.

[7] W3Schools. (2017). XML RDF. Retrieved from w3schools.com: https://www.w3schools.com/xml/xml_rdf.asp, accessed 2017.

[8] Computer-Hope. (2017). Spreadsheet. Retrieved from computerhope.com: https://www.computerhope.com/jargon/s/spreadsh.htm, accessed 2017.

[9] Opendata-handbook. (2017). File Formats. Retrieved from Opendatahandbook.org: http://opendatahandbook.org/guide/en/appendices/file-formats/, accessed 2017.

[10] HDF Group, https://www.hdfgroup.org/, accessed 2017

[11] An Introduction to NetCDF, Unidata, http://www.unidata.ucar.edu/software/netcdf/docs/netcdf_introduction.html, accessed 2017

[12] https://en.wikipedia.org/wiki/Business_Process_Modeling_Language, accessed 2017

[13] http://www.prostep.org/en/projects/code-of-plm-openness/, accessed 2017

[14] https://en.wikipedia.org/wiki/Google_Web_Toolkit, accessed 2017

[15] https://en.wikipedia.org/wiki/JT_%28visualization_format%29, accessed 2017

[16] https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=oslc-core, accessed 2017

[17] https://en.wikipedia.org/wiki/Requirements_Interchange_Format, accessed 2017

[18] http://www.odata.org/, accessed 2017

[19] https://en.wikipedia.org/wiki/Systems_Modeling_Language, accessed 2017

[20] https://en.wikipedia.org/wiki/Unified_Modeling_Language, accessed 2017

[21] http://www.uml.org/, accessed 2017

[22] Website of ascolab GmbH. (2017) http://www.ascolab.com/, last access: 15.05.2017.

[23] https://opcfoundation.org/, accessed 2017


[24] BOSAK, J., 1997. XML, Java, and the future of the Web. World Wide Web Journal, 2(4), pp. 219-227.

[25] http://www.w3.org/Consortium/, accessed 2017

[26] http://www.w3.org/TR/2004/NOTE-ws-arch-20040211/#introduction

[27] CÂNDIDO, G., JAMMES, F., DE OLIVEIRA, J.B. and COLOMBO, A.W., 2010. SOA at device level in the industrial domain: Assessment of OPC UA and DPWS specifications, Industrial Informatics (INDIN), 2010 8th IEEE International Conference on 2010, IEEE, pp. 598-603.

[28] DISCENZO, F.M., NICKERSON, W., MITCHELL, C.E. and KELLER, K.J., 2001. Open systems architecture enables health management for next generation system monitoring and maintenance. Development program white paper, OSA-CBM Development Group.

[29] http://www.w3.org/TR/tr-technology-stds, accessed 2017.

[30] http://infinispan.org, accessed 2017.

[31] https://redis.io/, accessed 2017.

[32] http://www.opengeospatial.org/standards/sfa, accessed 2017.

[33] http://gdal.org/, accessed 2017.

[34] https://en.wikipedia.org/wiki/World_file, accessed 2017.

[35] https://en.wikipedia.org/wiki/OpenStreetMap, accessed 2017.

[36] http://wiki.openstreetmap.org/wiki/Map_Features, accessed 2017.

[37] http://wiki.openstreetmap.org/wiki/Tags, accessed 2017.

[38] http://www.openrailwaymap.org, accessed 2017.

[39] https://en.wikipedia.org/wiki/Shapefile, accessed 2017.

[40] http://www.opengeospatial.org/, accessed 2017.

[41] http://docs.opengeospatial.org/wp/07-165r1/, accessed 2017.

[42] http://www.sensorml.com/, accessed 2017.

[43] LAS Specification, Version 1.4-R13, American Society for Photogrammetry and Remote Sensing, 15 July 2013. Retrieved from http://www.asprs.org/wp-content/uploads/2010/12/LAS_1_4_r13.pdf, accessed 2017.

[44] asprs.org. The imaging & geospatial information society. https://www.asprs.org/committee-general/laser-las-file-format-exchange-activities.html; accessed 09.06.2017.

[45] http://www.novapoint.com/sets-standard-bim-railway-projects, accessed 2017.

[46] http://www.bimtaskgroup.org/, accessed 2017.

[47] http://buildingsmart.org/standards/technical-vision/, accessed 2017.

[48] http://www.mimosa.org/mimosa-osa-eai, accessed 2017.

[49] Tiwari, A., Turner, C. J., & Majeed, B. (2008). A review of business process mining: State-of-the-art and future trends. Business Process Management Journal, 14(1), 5-22.

[50] Van der Aalst, Wil MP, van Dongen, B. F., Herbst, J., Maruster, L., Schimm, G., & Weijters, A. J. (2003). Workflow mining: A survey of issues and approaches. Data & Knowledge Engineering, 47(2), 237-267.


[51] Ko, R. K., Lee, S. S., & Wah Lee, E. (2009). Business process management (BPM) standards: A survey. Business Process Management Journal, 15(5), 744-791.

[52] https://data.sncf.com/api, accessed or used 2017.

[53] https://ressources.data.sncf.com/explore/?sort=modified, accessed or used 2017.

[54] http://data.deutschebahn.com/dataset/data-streckennetz, accessed or used 2017.

[55] http://data.deutschebahn.com/dataset/data-stationsdaten, accessed or used 2017.

[56] http://data.deutschebahn.com/dataset/api-fahrplan, accessed or used 2017.

[57] http://www.vbb.de/de/article/fahrplan/webservices/schnittstellen-fuer-webentwickler/5070.html, accessed 2017.

[58] https://opentransportdata.swiss/en/dataset/istdaten, accessed or used 2017.

[59] https://opentransportdata.swiss/en/dataset/timetable-2017-gtfs, accessed or used 2017.

[60] https://opentransportdata.swiss/en/dataset/didok, accessed or used 2017.

[61] https://opentransportdata.swiss/en/dataset/goch, accessed or used 2017.

[62] https://opentransportdata.swiss/en/dataset/bhlist, accessed or used 2017.

[63] https://opentransportdata.swiss/en/dataset/timetable-2017-hrdf, accessed or used 2017.

[64] https://opentransportdata.swiss/en/dataset/gtfsrt, accessed or used 2017.

[65] https://opentransportdata.swiss/en/dataset/fahrtprognose, accessed or used 2017.

[66] https://opentransportdata.swiss/en/dataset/aaa, accessed or used 2017.

[67] https://opentransportdata.swiss/en/dataset/timetabeloverview, accessed or used 2017.

[68] http://nrodwiki.rockshore.net/index.php/Main_Page, accessed 2017.

[69] http://dataportal.orr.gov.uk/browsereports, accessed 2017.

[70] OMG website, http://www.omg.org/bpdm/, accessed 2017.

[71] WFMC website, http://www.wfmc.org/XPDL.htm, accessed 2017.

[72] Edibasics website, http://www.edibasics.com, accessed 2017.

[73] W3Schools. (2017). XML RDF. Retrieved from w3schools.com: https://www.w3schools.com/xml/xml_rdf.asp, accessed 2017.

[74] XML ESSENTIALS. (2017). Retrieved from w3.org: https://www.w3.org/standards/xml/core, accessed 2017.

[75] Nash, A.; Huerlimann, D.; Schütte, J.; Krauss, V.P. (2004): RailML – a standard data interface for railroad applications. In: Computers in Railways IX, pp. 233-240.

[76] railML.org: The railML subschemas. https://www.railml.org/en/user/subschemes.html, accessed 04.04.2017.

[77] Wikipedia: railML. https://en.wikipedia.org/wiki/RailML, accessed 30.03.2017.

[78] railML.org: Version Planning. https://www.railml.org/en/developer/version-timeline.html, accessed 26.06.2018.


[79] railML.org: 31st railML Conference. https://www.railml.org/en/event-reader/31st-railml-conference-berne.html; last access: 04.04.2017.

[80] railML.org: (Forum) railML alpha version 3.0.5 available / railML 2.4 announced. http://www.railml.org/forum/index.php?t=msg&th=506&start=0&; last access: 04.04.2017.

[81] railML.org: Official railML website. https://www.railml.org/en/; last access: 30.03.2017.

[82] railML.org: railML Forum. http://forum.railml.org/; last access: 30.03.2017.

[83] railML.org: railML Wiki. https://wiki.railml.org/; last access: 30.03.2017.

[84] railML.org: railML Trac ticket system. http://trac.railml.org/; last access: 30.03.2017.

[85] railML.org: railVIVID – The railML Viewer & Validator powered by UIC. https://www.railml.org/en/user/railvivid.html; last access: 29.04.2017.

[86] railML.org: railML® partners. https://www.railml.org/en/introduction/partners.html; last access: 04.04.2017.

[87] railML.org: Licence terms. https://www.railml.org/en/user/licence.html; last access: 29.04.2017.

[88] railML.org: Certification of your railML® interface. https://www.railml.org/en/user/certification.html; last access: 29.04.2017.

[89] railML.org: Infrastructure. https://www.railml.org/en/user/subschemes/infrastructure.html; last access: 04.04.2017.

[90] railML.org: (Wiki) CO: changes / 2.3. http://wiki.railml.org/index.php?title=CO:changes/2.3; last access: 25.04.2017.

[91] UIC: Rail TopoModel and railML® - The foundation for an universal Infrastructure Data Exchange Format. In: 24th railML.org Conference, Paris, 18.09.2013; http://documents.railml.org/events/slides/2013-09-17_uic_nissi-erim_presentation.pdf, accessed 2017.

[92] UIC: International Railway Solution IRS 30100 – RailTopoModel; 1st edition September 2016 (IRS 30100:2016).

[93] UIC, railML.org: RailTopoModel Wiki. http://wiki.railtopomodel.org/; last access: 28.04.2017.

[94] railML.org: Organisation. https://www.railml.org/en/organisation.html; last access: 04.04.2017.

[95] Gély, L.; Dessagne, G.; Vanderbeck, F. (2010): A multi scalable model based on a connexity graph representation. In: Computers in Railways XII, pp. 193-204.

[96] railML.org: (Wiki) Dev: Use cases. http://wiki.railml.org/index.php?title=Dev:Use_cases; last access: 04.04.2017.

[97] railML.org: (Wiki) IS: Use Cases. http://wiki.railml.org/index.php?title=IS:UseCases; last access: 04.04.2017.


[98] railML.org: railML Subversion repository. https://svn.railml.org; last access: 04.04.2017.

[99] railML.org: Schema railML.xsd. https://www.railml.org/files/download/schemas/2016/railML-2.3/documentation/railML.html; last access: 24.04.2017.

[100] UIC, railML.org: Official RailTopoModel Website. http://www.railtopomodel.org/en/; last access: 25.04.2017.

[101] UIC, railML.org: (Wiki) RTM modelling concepts. http://wiki.railtopomodel.org/index.php?title=RTM_modelling_concepts; last access: 28.04.2017.

[102] railML.org: (Wiki) IS:UC:Asset status representation. http://wiki.railml.org/index.php?title=IS:UC:Asset_status_representation; last access: 02.05.2017.

[103] http://www.era.europa.eu/Document-Register/Pages/TAF-TSI.aspx, accessed 2017.

[104] http://www.uic.org/com/IMG/pdf/UIC_Leaflet_407-1.pdf, accessed 2017.

[105] https://wiki.asam.net/display/STANDARDS/ASAM+ODS, accessed 2017.

[106] https://www.elektrobit.com/products/eb-assist/adtf/, accessed 2017.

[107] https://en.wikipedia.org/wiki/AUTOSAR, accessed 2017.

[108] http://fmi-standard.org/, accessed 2017.

[109] http://www.ap242.org/, accessed 2017.

[110] http://www.opendrive.org/project.html, accessed 2017.

[111] http://nds-association.org/#thestandard, accessed 2017.

[112] http://openscenario.org/project.html, accessed 2017.

[113] https://here.com/en/innovation/sensoris, accessed 2017.

[114] http://adasis.org/, accessed 2017.

[115] https://www.car-2-car.org, accessed 2017.

[116] http://www.big-eu.org, accessed 2017.

[117] http://www.knx.org, accessed 2017.

[118] http://www.lonmark.org, accessed 2017.

[119] http://www.dali-ag.org, accessed 2017.

[120] http://www.enocean-alliance.org, accessed 2017.

[121] http://www.opengeospatial.org/standards/sensorml

[122] http://web.archive.org/web/20110807055627/http://www.h-online.com/open/news/item/Open-format-for-local-map-data-743315.html, accessed 2017.


[123] http://www.openlr.org, accessed 2017.

[124] https://en.wikipedia.org/wiki/TPEG, accessed 2017.

[125] https://en.wikipedia.org/wiki/Traffic_message_channel, accessed 2017.

[126] http://www.datex2.eu/content/datex-background, accessed 2017.

[127] https://developers.google.com/transit/gtfs/, accessed 2017.

[128] http://www.connect-fahrplanauskunft.de/unsere-services/open-service.html, accessed 2017.

[129] railML.org: Organisation. https://www.railml.org/en/organisation.html; last access: 20.06.2017.

[130] http://www.era.europa.eu/Core-Activities/Interoperability/Pages/RINF.aspx, accessed 2017.

[131] http://www.eulynx.eu, accessed 2017.

[132] http://www.mdm-portal.de/, accessed 2017.


10 APPENDIX A-QUESTIONNAIRE

The questionnaire divided the relevant formats into

- Railway formats,

- Maintenance formats, and

- miscellaneous other formats.

10.1 APPENDIX A-QUESTIONNAIRE: PARTICIPANTS’ DOMAINS OF SERVICE

Figure 2 showed how the participants' companies and institutions are distributed over the domains of their respective business or institutional services. Most participants came from Railway, Automotive and Traffic Management (each accounting for roughly the same share of the sample). In addition, some participants work in the domains of Industry automation, Tunnel, Building monitoring, and Energy.

10.2 APPENDIX A-QUESTIONNAIRE: EXTENT OF USE AND USE CASES OF OPEN DATA EXCHANGE FORMATS

10.2.1 Appendix A-questionnaire: extent of use and use cases of railway formats

Figure 23 depicts a tag cloud (generated with a tool such as Wordle, cf. http://www.wordle.net/create) of the use cases of railway formats and the comments given by the participants. Table 13 gives their answers in full detail.

According to the feedback, the Infrastructure Data Management (IDMVU) interface standard, the civil engineering and survey measurement data format LandXML, and the ÖBB Gleisdatenbank (the rail track database of ÖBB, an Austrian mobility services provider) are all used within the context of the traffic planning software PROVI (Programmsystem für Verkehrs- und Infrastrukturplanung) from OBERMEYER Planen + Beraten GmbH.

RTM (RailTopoModel) is a logical object model that standardises the representation of railway infrastructure-related data. railML v 3 is version 3 of the Railway Markup Language for data exchange for infrastructure managers and railway companies. RINF, the Register of Infrastructure, refers to Article 49 of Directive (EU) 2016/797 and provides transparency concerning the main features of the European railway infrastructure. All three are being tested in pilots, whereas TAF TSI (Telematics Applications for Freight Service) is already in use for train composition.


Figure 23: Tag cloud of use cases/comments for railway formats

Table 13: Use cases/comments for railway formats

Format: | Use case/comment:

IDMVU | PROVI (program)

LandXML | PROVI (program)

ÖBB Gleisdatenbank | PROVI (program)

CIVIL3D | CIVIL3D

csv | Timetable

Signalling Data Exchange Format (SDEF) | Network Rail XML model for capture of network models, initially for signalling. Supports model of network at multiple levels of detail, linked between levels for cross-referencing.


FRAME | European Union Transport

OpenScenario | http://www.openscenario.org/

DATEX II | Roads Format

OSLC | https://open-services.net/

OpenSimulation | https://github.com/OpenSimulationInterface

FMI | https://www.fmi-standard.org/

ICE870-5 | old but still working

IDMVU | PROVI (Program)

IP-KOM-ÖV | no; to be verified

OJP | moovit?

railML | Tested in pilots; Tested in pilot

RTM / railML v 3 | Tested in pilots; Tested in pilot

RINF | Tested in pilot; Used customized version; A first description of the infrastructure was sent.

TAF TSI | Train Composition is already in use.

RTM | Tested in pilots; Tested in pilot; SNCF RESEAU is part of the network RailML. Not used for the moment.

10.2.2 Appendix A-questionnaire: extent of use and use cases of maintenance formats

Table 14 shows the extent of use of 8 well-known maintenance formats, given as frequencies of answers in the respective categories (the darker the green cell colour, the more frequently the respective category has been chosen as answer). Figure 24 depicts a tag cloud for the use cases and comments given by the participants. Table 15 gives their answers in full detail.

Table 14: Extent of use of maintenance formats (frequencies of answers)

Format: | of interest in future | considered, not used | in preparation | in operation

OPC UA 0 1 0 5

ISO 13372 3 1 0 2

MIMOSA OSA-CBM 3 2 0 2

EN 62682:2015 & EEMUA 191 2 1 0 1

NR-L2-SIG-30036-Issue1 0 0 0 1

MIMOSA OSA-EAI 3 2 0 1

Sensor ML 2 0 5 1

csv 0 0 0 1
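The frequencies in Table 14 lend themselves to a simple tally. The following is a minimal sketch that transcribes the table and ranks the formats by one answer category; the names `CATEGORIES`, `TABLE_14`, `rank_by` and `total_answers` are illustrative and not part of the questionnaire evaluation itself.

```python
# Answer frequencies transcribed from Table 14.
# The tuple order follows the table columns listed in CATEGORIES.
CATEGORIES = (
    "of interest in future",
    "considered, not used",
    "in preparation",
    "in operation",
)

TABLE_14 = {
    "OPC UA": (0, 1, 0, 5),
    "ISO 13372": (3, 1, 0, 2),
    "MIMOSA OSA-CBM": (3, 2, 0, 2),
    "EN 62682:2015 & EEMUA 191": (2, 1, 0, 1),
    "NR-L2-SIG-30036-Issue1": (0, 0, 0, 1),
    "MIMOSA OSA-EAI": (3, 2, 0, 1),
    "Sensor ML": (2, 0, 5, 1),
    "csv": (0, 0, 0, 1),
}


def rank_by(category, table=TABLE_14):
    """Return format names sorted by the answer count in one category, highest first."""
    idx = CATEGORIES.index(category)
    # sorted() is stable, so formats with equal counts keep their table order.
    return sorted(table, key=lambda name: table[name][idx], reverse=True)


def total_answers(name, table=TABLE_14):
    """Return the number of answers given for one format across all categories."""
    return sum(table[name])


if __name__ == "__main__":
    for name in rank_by("in operation"):
        in_operation = TABLE_14[name][CATEGORIES.index("in operation")]
        print(f"{name}: {in_operation} in operation, {total_answers(name)} answers in total")
```

Ranking by "in operation" puts OPC UA first, while ranking by "in preparation" puts Sensor ML first, which matches the remark that Sensor ML is currently tested in pilots.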

According to the feedback, the alarm management standards used in industrial asset management, EN 62682:2015 and EEMUA 191, are quite new to rail markets but well established in plant, manufacturing and materials processing markets.

ISO 13372 provides common terminology for condition monitoring and diagnostics of machines and serves as a reference point for EEMUA 191, IEC 62682 and ISO 13374.

The Network Rail Interlocking Log Standard NR-L2-SIG-30036-Issue1 is a common record format for interlocking incident recorders: it is used to capture sequences of events and actions, supports fault investigation and incident investigation (e.g. a signal passed at danger).

MIMOSA OSA-CBM, the open system architecture for moving information in a condition-based maintenance system of MIMOSA, an Operations and Maintenance Information Open System Alliance, is an implementation of ISO 13374 and is used for rail remote condition monitoring of infrastructure assets, for data acquisition, manipulation and state detection. Newer uses are in preparation for health assessment and for prognostic assessment.

Likewise, MIMOSA OSA-EAI, the Open System Architecture for Enterprise Application Integration is of interest for Enterprise application integration, and used in defence markets.

Use cases for OPC UA, a machine-to-machine communication protocol for industrial automation, are seen in the field of security-related applications. According to the feedback, the format was at the time considered too specific to local (plant) data collection and not suitable for wide-area / national data acquisition and test automation.

Finally, the Sensor Model Language Sensor ML is currently tested in pilots.


Figure 24: Tag cloud of use cases/comments for maintenance formats

Table 15: Use cases/comments for maintenance formats

Format: | Use case/comment:

EN 62682:2015 & EEMUA 191 | Quite new for rail markets, well established in plant, manufacturing and materials processing markets.

ISO 13372 | Common terminology for condition monitoring and diagnosis. Reference point for EEMUA 191, IEC 62682 and ISO 13374.

NR-L2-SIG-30036-Issue1 | Common record format for interlocking incident recorders. Used to capture sequences of events and actions, supports fault investigation and incident investigation (e.g. signal passed at danger).


MIMOSA OSA-CBM | For rail remote condition monitoring of infrastructure assets, data acquisition, manipulation and state detection. Newer uses in preparation for health assessment and prognostic assessment.

MIMOSA OSA-EAI | Of interest for enterprise application integration. Used in defence markets.

OPC UA | At the time was too specific for local (plant) data collection, not suitable for wide area / national data acquisition and test automation; Just for security related applications

Sensor ML | Tested in pilots; Tested in pilot

csv | Time table

10.2.3 Appendix A-questionnaire: extent of use and use cases of other formats

Table 16 shows the extent of use of 96 well-known miscellaneous open data exchange formats, given as frequencies of answers in the respective categories (the darker the green cell colour, the more frequently the respective category has been chosen as answer). Figure 25 depicts a tag cloud for the use cases and comments given by the participants. Table 17 gives their answers in full detail.

The feedback on use cases for the 96 other formats was extensive, so only a few key statements are given here. According to the feedback, the formats ASCII-GRID, Simple Feature Access, GeoPackage, Esri Shape and RoadXML are used for noise maps and road noise maps. LandXML, CityGML, OSM and Esri Shape are used in conjunction with software for traffic or infrastructure planning / Building Information Modelling (BIM) such as PROVI, Infra Works and Civil 3D. Formats such as UML, GML, CityGML and Simple Features OLE/COM (OpenGIS) play a major role in tasks like import, export and modelling. OSM and OpenWeather-Maps are applied in web services, while GeoTIFF and GML in JPEG 2000 are used for overlays. Moreover, SNMP is used for standard server monitoring and product monitoring.

Table 16: Extent of use of other formats (frequencies of answers)

Format: | of interest in future | considered, not used | in preparation | in operation

UML 2 2 0 14

SNMP 0 2 0 10

GUID, UUID, … 1 0 0 8


Esri Shape 2 2 0 7

OSM 0 4 1 7

GWT 4 2 0 6

Modbus 0 2 0 6

GeoJSON 3 0 1 5

BACnet 0 0 0 4

GML 3 2 0 4

GeoTIFF 2 0 1 4

KML 0 1 1 4

OpenDRIVE 0 0 0 4

OpenLayers 0 0 0 4

OWS Context 0 1 0 4

Simple Feature Access 0 0 0 4

TMC 1 1 0 4

Coordinate Transformation 3 3 0 3

DATEX2 0 0 0 3

XES 1 0 0 3

GeoAPI 4 1 0 3

GML in JPEG 2000 4 0 0 3

OpenCRG 0 1 0 3

Simple Features SQL (OpenGIS) 1 0 0 3

WKT CRS 0 0 0 3

Filter Encoding 0 0 0 2

HDF5 5 0 0 2

INSPIRE 1 0 0 2

ISO 8601 1 0 0 2

KNXbus 0 0 0 2

ONVIF 0 0 1 2

OpenSCENARIO 0 0 2 2

OpenWeather-Maps 2 1 2 2

PubSub 1 1 0 2

Road2Simulation 1 0 0 2

Web Feature Service 0 0 0 2

Web Map Service 0 0 0 2

Web Map Tile Service 0 0 0 2

Earth Observation Products 0 0 0 1

CityGML 0 2 1 1

GeoPackage 0 1 1 1


GeoSparql 2 0 0 1

GeoXACML 1 0 0 1

GUF 1 0 0 1

LandXML 0 3 0 1

Moving Features 2 0 0 1

OpenLR 0 0 0 1

RDS 1 2 1 1

RoadXML 0 4 0 1

Simple Features OLE/COM (OpenGIS) 0 0 0 1

Styled Layer Descriptor 0 0 0 1

TMS 0 0 0 1

Web Coverage Service 1 0 0 1

ASCII-GRID 0 0 0 1

ARML2.0 1 1 1 0

Catalogue Service 2 1 1 0

GeoSciML 1 0 0 0

IndoorGML 1 0 0 0

ISO 6709 2 0 0 0

OpenLS 1 0 0 0

NetCDF 0 1 0 0

Observations and Measurements 1 0 0 0

Open GeoSMS 1 0 0 0

Ordering Services Framework for Earth Observation 0 1 0 0

PUCK 1 1 0 0

Sensor Observation Service 1 0 0 0

Sensor Planning Service 1 0 0 0

SENSORIS 1 1 0 0

SWE Common Data Model 1 0 1 0

SWE Service Model 0 0 1 0

TPEG 1 0 0 0

WaterML 1 0 0 0

Web Coverage Processing Service 2 0 0 0

Web Map Context 1 0 0 0

Web Processing Service 0 0 1 0

Web Service Common 1 0 0 0


Navigation Data Standard (NDS) 0 0 1 0

Figure 25: Tag cloud of use cases/comments for other formats

Table 17: Use cases/comments for other formats

Format: | Use case/comment:

GUID, UUID, … | IT infrastructure / inventory; NR datafeeds; ARIANE Model

BACnet | Building Control Management

CityGML | Data Import, Modelling; Infra Works

Coordinate Transformation | Likely to be used in RINM; Dominion, Virtuelle Welt, Road2Simulation, wherever spatial data is processed

DATEX2 | for Highways England


Esri Shape | Export; Import; Noise Mapping; Civil 3D, Infra Works; Used in multiple systems for geospatial information; Dominion, Virtuelle Welt, Road2Simulation, wherever spatial data is processed

XES | Used in a number of software systems

GeoAPI | Webservices; General use of ARCGIS

GML | Import; Modelling

GeoJSON | SQL; presentation of interactive contents

GeoPackage | Import; Noise Mapping

GeoTIFF | Webservices; Overlays; Civil 3D, Automap; Dominion, Virtuelle Welt, Road2Simulation, wherever spatial data is processed

GML in JPEG 2000 | Webservices; Overlays

GWT | tested in pilots

IndoorGML | Room Acoustics

ISO 6709 | Cross system

ISO 8601 | cross system

KML | project-specific application in the course of a Libyan project; Noise Maps; Dominion, Virtuelle Welt, Road2Simulation, wherever spatial data is processed

LandXML | Import, Noise Maps; Infra Works, Civil3D, PROVI

Modbus | device configuration; Many customers

ONVIF | for Highways England cameras

OpenCRG | Road2Simulation

OpenDRIVE | Dominion, Virtuelle Welt, Road2Simulation, ...

OpenLayers | DAT-GDV-Wiki, Bahnserver, lot more; superb for quick, accessible visualisation of geodata

OpenSCENARIO | Dominion

OSM | Webservices; Navigation; Overlays; InfraWorks; nearly in every project

OpenWeather-Maps | Webservices

OWS Context | the institute's geodata infrastructure

PUCK | Measurements Devices

Road2Simulation | We develop it :-p

RoadXML | Road Noise Maps

Simple Feature Access | Data Handling; Noise Mapping; Nearly all geo-libraries build on it: Oracle Spatial, PostGIS, GeoTools, GDAL/OGR, Spatialite, ESRI stuff, QGIS

Simple Features OLE/COM (OpenGIS) | Import; Export

Simple Features SQL (OpenGIS) | Import; Export; Oracle Spatial, PostGIS, SpatiaLite ...

SNMP | Standard Server Monitoring; Product Monitoring; device-dependent; many customers

Styled Layer Descriptor | Used in OGC OWS

SWE Common Data Model | Industry common model

TMC | for Highways England

UML | Software Development; Import, Modelling

Web Coverage Service | the institute's geodata infrastructure

Web Feature Service | the institute's geodata infrastructure, Geo-Bug-Tracker, ...

ASCII-GRID | Noise Mapping, Height Points

10.3 APPENDIX A-QUESTIONNAIRE: HOW GENERIC/SPECIALISED ARE OPEN DATA EXCHANGE FORMATS?

10.3.1 Appendix A-questionnaire: how generic/specialised are railway formats?

Figure 3 showed how generic or specialised 14 prominent railway formats were perceived to be, averaged over the participating experts. According to the feedback, the ÖBB Gleisdatenbank, the comma-separated values (CSV) file format in its specific use for timetables, and the RTM / railML v 3 formats are perceived as the most specialised formats.
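The charts in this section are labelled with a rating scale of 1-5, where 0 means "not answered". The report does not state whether unanswered entries were excluded before averaging, so the following minimal sketch shows both conventions. The function name `average_rating`, its `skip_unanswered` parameter and the sample ratings are hypothetical and serve only as illustration.

```python
from statistics import mean


def average_rating(ratings, skip_unanswered=True):
    """Average ratings on the 1-5 scale, where 0 means "not answered".

    With skip_unanswered=True the zeros are dropped before averaging;
    with skip_unanswered=False they are counted as 0 (an assumption,
    the report does not say which convention was used).
    Returns None if there is nothing to average.
    """
    for r in ratings:
        if r not in range(0, 6):
            raise ValueError(f"rating out of range: {r}")
    values = [r for r in ratings if r != 0] if skip_unanswered else list(ratings)
    return mean(values) if values else None


if __name__ == "__main__":
    # Hypothetical answers of five experts for one format.
    sample = [5, 0, 4, 0, 3]
    print(average_rating(sample))                         # zeros ignored
    print(average_rating(sample, skip_unanswered=False))  # zeros counted
```

The two conventions can differ considerably for formats that only a few experts rated, which is worth keeping in mind when comparing formats with very different numbers of answers.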


10.3.2 Appendix A-questionnaire: how generic/specialised are maintenance formats?

Figure 26 shows how generic or specialised 8 prominent maintenance formats are perceived to be, averaged over the participating experts. According to the feedback, NR-L2-SIG-30036-Issue1 and the comma-separated values (CSV) file format in its specific use for timetables are perceived as the most specialised formats.

Figure 26: How generic/specialised are maintenance formats?

10.3.3 Appendix A-questionnaire: how generic/specialised are other formats?

Figure 27 shows the top 15 generic other (miscellaneous) formats, while Figure 28 shows the top 15 specialised other formats (Figure 29 shows how generic/specialised all of the other formats are perceived, averaged over the participating experts).

[Chart for Figure 26, "How generic/specialised are maintenance formats?". Formats in plotted order: MIMOSA OSA-EAI; EN 62682:2015 & EEMUA 191; ISO 13372; Sensor ML; OPC UA; MIMOSA OSA-CBM; NR-L2-SIG-30036-Issue1; csv. Axis: how generic/specialised, 1-5; 0: not answered.]


Figure 27: Top 15 generic other formats

[Chart, "Top 15 generic other formats". Formats in plotted order: OpenLayers; ARML2.0; GUF; LonMark; Moving Features; NetCDF; Observations and Measurements; Sensor Observation Service; Sensor Planning Service; SensorThings; Simple Features CORBA; Symbology Encoding; Table Joining Service; Web Map Context; Web Service Common. Axis: how generic/specialised, 1-5; 0: not answered; plotted range 0 to 1.5.]


Figure 28: Top 15 specialised other formats

[Chart, "Top 15 specialised other formats". Formats in plotted order: OpenLR; OpenSCENARIO; TMS; TPEG; WKT CRS; Coordinate Transformation; KNXbus; RoadXML; ONVIF; OpenWeather-Maps; BACnet; RDS; TMC; DATEX2; ASCII-GRID. Axis: how generic/specialised, 1-5; 0: not answered.]


Figure 29: How generic/specialised are other formats on average?

[Chart, "How generic/specialised are other formats on average?". Formats in plotted order: ARML2.0; GUF; LonMark; Moving Features; NetCDF; Observations and Measurements; Sensor Observation Service; Sensor Planning Service; SensorThings; Simple Features CORBA; Symbology Encoding; Table Joining Service; Web Map Context; Web Service Common; OpenLayers; GUID, UUID, …; Earth Observation Products; GeoXACML; PubSub; Simple Features OLE/COM (OpenGIS); Styled Layer Descriptor; XES; LandXML; UML; Web Processing Service; Catalogue Service; GeoPackage; GeoSciML; GeoSparql; GWT; HDF5; LandInfra; OpenLS; Open GeoSMS; OpenMI; OpenSearch Geo; OWS Context; tsml; Web Coverage Processing Service; Web Coverage Service; Web Feature Service; Web Map Service; Web Map Tile Service; SNMP; GeoTIFF; ISO 6709; PUCK; OpenCRG; Simple Feature Access; GeoJSON; Esri Shape; Filter Encoding; Ordering Services Framework for Earth Observation; SENSORIS; SWE Common Data Model; SWE Service Model; WaterML; GML in JPEG 2000; GeoAPI; GML; Simple Features SQL (OpenGIS); OpenMTC; INSPIRE; KML; Road2Simulation; OSM; CityGML; ALERT C; IndoorGML; ISO 8601; Modbus; OpenDRIVE; OpenLR; OpenSCENARIO; TMS; TPEG; WKT CRS; Coordinate Transformation; KNXbus; RoadXML; ONVIF; OpenWeather-Maps; BACnet; RDS; TMC; DATEX2; ASCII-GRID. Axis: how generic/specialised, 1-5; 0: not answered.]


10.4 APPENDIX A-QUESTIONNAIRE: “BEST” EXAMPLE OF AN OPEN DATA EXCHANGE FORMAT SUITABLE FOR ONE OF SEVERAL SOURCES OF INFORMATION

One question of the questionnaire asked for the (one) “best” example of an Open Data Exchange format suitable for the following information sources:

- Asset model – Rail infrastructure, including asset definition, geospatial, logical and relational model

- Asset model – Rolling-stock, including asset definition, logical and relational model

- Asset condition

- Alarms and events

- Asset maintenance activity – past and future schedule

- Asset fault history

- Rail operation history – past rolling stock movements and information

- Rail operation plan – timetable data

- Business process notation model (including standard operating procedures format)

Moreover, it was asked with respect to which criterion the chosen format is “best”.

Figure 4 depicted a tag cloud of the best formats and the respective criteria given in the participants' answers. Table 18 gives their answers in full detail.

Most answers named railML as the best format for modelling rail infrastructure and rolling stock, and also for rail operation plans. railML is also the most frequently named best format overall, preferred because it is an open format with a large user base.

For asset condition, alarms and events, and asset maintenance, the majority of answers named SensorML as the best format, e.g. because of a good user experience. For alarms and events, OPC-UA or OPC-AE was named as best format just as often as SensorML.

For a business process notation model, the named formats include Systems Modeling Language (SysML, based on UML), Business Process Model and Notation (BPMN), and Business Process Modeling Language (BPML). BPML was named as best format because it is generic, widely used, and well known.

Table 18: "Best" formats for several information sources with the respective criteria

Information source: Rail infrastructure; Rolling-stock; Asset condition; Alarms and events; Asset maintenance; Rail operation plan; Business process notation model; Road infrastructure

Format: SHP, ASCII, XML, GML

railML, sensorML

OSA-CBM sensorML sensorML XML SysML/UML OpenDRIVE

Criterion: Applicability and definition clear.

For: acquisition, manipulation, state detection, health assessment, prognosis and advisory generation.

user experience

Specific meaning open format, large user base

Format: PROVI, Civil 3D

railML sensorML OPC-UA or OPC-AE

CIF UML, BPMN

Criterion: open format, large user base

user experience

widely used user experience

Format: (1) RailML, (2) SDEF

railML SMTP railML BPML

Criterion: for network models, at many layers, interlinked, with time domain.

wide use user experience

widely used, generic, well known

Format: RailML sensorML railML

Criterion: XML based so easy to understand

open format, large user base

Format: railML + RailTopoModel

sensorML railML

Criterion: user experience

Format: railML, sensorML

Criterion:

Format: railML

Criterion: open format, large user base

Format: railML

Criterion: openness


Format: railML

Criterion:

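Most frequently: Rail infrastructure: railML; Rolling-stock: railML; Asset condition: sensorML; Alarms and events: sensorML, OPC-UA or OPC-AE; Asset maintenance: sensorML; Rail operation plan: railML; Business process notation model: UML; Road infrastructure: OpenDRIVE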
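A "most frequently named" row such as the one closing Table 18 could be derived by counting format names per information source. The following is only a minimal sketch of one possible tally, not the procedure used for the questionnaire: the function name `most_frequent` and the sample answers are hypothetical. It assumes that one answer may name several formats separated by commas, and that spelling variants such as "RailML" and "railML" should count as the same format.

```python
from collections import Counter


def most_frequent(answers):
    """Return the most frequently named format(s) for one information source.

    Each answer may name several formats separated by commas. Names are
    compared case-insensitively; ties are all returned, sorted by name.
    """
    names = [
        name.strip().lower()
        for answer in answers
        for name in answer.split(",")
        if name.strip()  # skip empty answers
    ]
    if not names:
        return []
    counts = Counter(names)
    top = max(counts.values())
    return sorted(name for name, count in counts.items() if count == top)


# Hypothetical answers for two information sources (illustrative only).
infrastructure = ["RailML", "railML, sensorML", "PROVI, Civil 3D", ""]
alarms = ["sensorML", "OPC-UA", "sensorML", "OPC-UA", "SMTP"]

print(most_frequent(infrastructure))  # ['railml']
print(most_frequent(alarms))          # ['opc-ua', 'sensorml']
```

Returning all tied names mirrors the way a source with two equally frequent formats lists both of them.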

10.5 APPENDIX A-QUESTIONNAIRE: OPTIONAL MINDSET QUESTIONS: OPEN DATA EXCHANGE POLICY AND ATTITUDE TOWARDS OPEN DATA

In Figure 30, percentages of answers “yes” are given for three questions regarding the Open Data Exchange policy of the company or institution:

Does your company or institution define Open Data strategies?

Does your company or institution own Open Data portals?

Is your company or institution willing to contribute to Open Data Exchange initiatives/portals?

These three questions belong to a set of optional general mindset questions, the feedback for which is shown in Figure 30 and Figure 31. For exactly half (50%) of the participating companies or institutions, a participating expert chose to answer this set of questions.

According to the obtained feedback, all companies or institutions with an expert answering on their behalf (100%) are willing to contribute to Open Data Exchange initiatives/portals. Two out of three (66.6%) define Open Data strategies, whereas 40% own an Open Data portal.
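The percentages above are shares among the companies that answered the optional questions, not among all participants. As a minimal sketch of that arithmetic (the function names and the sample answers are hypothetical and do not reproduce the questionnaire data), unanswered questions can be represented as `None` and excluded from the denominator:

```python
from typing import Optional, Sequence


def response_rate(answers: Sequence[Optional[bool]]) -> float:
    """Percentage of companies that answered the question at all."""
    return 100.0 * sum(a is not None for a in answers) / len(answers)


def yes_share(answers: Sequence[Optional[bool]]) -> Optional[float]:
    """Percentage of "yes" among the companies that answered.

    None marks a company without an answer; such entries are ignored.
    Returns None if nobody answered.
    """
    given = [a for a in answers if a is not None]
    if not given:
        return None
    return 100.0 * sum(given) / len(given)


# Hypothetical answers to one yes/no question, one entry per company.
strategies = [True, None, False, True, None, None]

print(response_rate(strategies))        # 50.0
print(round(yes_share(strategies), 1))  # 66.7
```

Keeping the two rates separate makes explicit which base a reported percentage refers to.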


Figure 30: Policy of the company/institution

Figure 31 shows the attitude of the company or institution of the expert, averaged over the participating experts. According to the obtained feedback, the mean attitudes towards using non-proprietary data formats in products and towards Open Data are both very positive (between 4 and 5 on a scale ranging from 1 (strongly negative attitude) to 5 (strongly positive attitude)).

(Figure 30 chart: "Policy of the company/institution: Percentage of answers "yes"". Axis from 0% to 100%; one bar for each of the three questions listed above; series: percentage of answers "yes" among all companies with an expert answer.)


Figure 31: Attitude of the company/institution: Mean attitude

10.6 APPENDIX A-QUESTIONNAIRE: STRENGTHS AND WEAKNESSES OF OPEN DATA EXCHANGE FORMATS

Figure 32 depicts a tag cloud of the strengths and weaknesses of Open Data Exchange formats as given by the participants. Table 19 gives their answers in full detail. Note that the strengths and weaknesses presented in one row of the table may come from different participants and may therefore contradict each other.

According to the obtained feedback, general strengths of Open Data Exchange formats are

open documentation,

universal use of data,

the fact that no vendor logins exist, and

the fact that no country-specific data is involved.

General weaknesses were seen in possible misinterpretations, in potential confusion arising from too many formats and too many solutions, and in the relatively high risk of data misuse with universal formats.

(Figure 31 chart: "Attitude of the company/institution: Mean attitude". Axis: mean attitude (1: strongly negative; 5: strongly positive), shown from 4 to 4.5. Bars: "What is your company's/institution's attitude towards Open Data?" and "What is your company's/institution's attitude towards using non-proprietary data formats in products?")


Figure 32: Tag cloud of strengths and weaknesses of Open Data exchange formats

Moreover, strengths of OSA-CBM include covering applications ranging from data acquisition through to advisory generation in a structured, well defined format. A strength of railML and railTopoModel is the fact that they are defined involving the main European railway actors, and that they are standardised open formats. On the other hand, railML has been criticised because it is not yet completed for all railway assets, and because it is “uglily” (/nasty) hierarchical and huge. Other weaknesses of railML and OpenDRIVE include their slow development and innovation cycle, the fact that development of supporting tools is uncoupled, and that the tools interpret the data differently (i.e. only a subset of the format is implemented and not all data format versions are supported).

Table 19: Strengths and weaknesses of Open Data exchange formats

Format: OSA-CBM
Strengths: Covers data acquisition through to advisory generation in a structured, well defined format.
Weaknesses: Verbose. Somewhat ambiguous use of XML inheritance.

Format: OPC-UA (and OPC-DA)
Strengths: Very broadly used
Weaknesses: Proprietary, inflexible, limited structure and functionality in DA (i.e. just Data Acquisition, no higher level functional support - e.g. health assessment, diagnosis, prognosis)

Format: XML
Strengths: clear definition
Weaknesses: Creates large data files

Format: time/location
Strengths: ease of exchange and use
Weaknesses: none

Format: OGC OWS (WMS, WFS, WCS, ...)
Strengths: versatile, standardised for decades already, today easy to use
Weaknesses: quite generic; only suitable to transport simple/flat geodata without sophisticated data model dependencies

Format: OpenDRIVE
Strengths: detailed road topology and topography description, well established, standardised open formats
Weaknesses: in version <= 1.4 very specific; representation of lanes other than driving lanes very poorly possible, e.g. in intersection areas; mathematical representation of geometries is pain in the ass for data exchange and should be switched to something like OGC Simple Features. Do the bloody (/nasty) smoothing of your trajectories on the application's side, (for hell's sake)! slow innovation cycle slow development, development of supporting tools uncoupled and interpret the data different (only subset of the format is implemented, not all data format versions are supported)

Format: OpenCRG
Strengths: full open source
Weaknesses: none

Format: OpenSCENARIO
Strengths: great potential
Weaknesses: slow innovation cycle

Format: railTopoModel
Strengths: defined involving the main European railway actors
Weaknesses:

Format: railML
Strengths: defined involving the main european railway actors, standardised open formats
Weaknesses: not yet completed for all railway assets, uglily (/nasty) hierarchical, huge, you should split in into smaller formats, too generic in asset description, slow development, development of supporting tools uncoupled and interpret the data different (only subset of the format is implemented, not all data format versions are supported), slow innovation cycle

Format: Road2Simulation
Strengths: more generic road description in OGC Simple Features
Weaknesses: still under development and not known well, yet; offers possibility to derive other formats from, like OpenDRIVE, NDS, …

Format: sensorML
Strengths: good feedback from other sectors
Weaknesses:

Format: CityGML
Strengths: you can represent nearly everything with it
Weaknesses: you can represent nearly everything with it; huge, you should split in into smaller formats

10.7 APPENDIX A-QUESTIONNAIRE: LICENSING AND/OR LEGAL ISSUES HAMPERING APPLICATION OF OPEN DATA EXCHANGE FORMATS

Figure 33 depicts a tag cloud for licensing and/or legal issues hampering application of Open Data Exchange formats as named and explained by the participants. Table 20 gives their answers in full detail.

Named issues include unclear adoption policies (railML), a confusing tangle of different uses, programs and policies (SHP), the obstacle of horrendous costs for joining the consortium prior to access (NDS), and the general problem that data from business projects usually is not royalty-free and therefore cannot be provided as Open Data.


Figure 33: Tag cloud for licensing and/or legal issues hampering application of Open Data Exchange formats

Table 20: Licensing and/or legal issues hampering application of Open Data Exchange formats

Format: SHP
Issue: many different programs and different uses

Format: Navigation Data Standard (NDS)
Issue: no access to specification unless joining the consortium/development group for horrendous costs!

Format: railML
Issue: policy for adoption not clear

Format: general
Issue: data from projects is not royalty-free and cannot be provided as Open Data