International Conference on
Informatics in Economy
Proceedings of the 14th International Conference on
INFORMATICS in ECONOMY (IE 2015)
Education, Research & Business Technologies
Bucharest, Romania
April 30 – May 03, 2015
Published by Bucharest University of Economic Studies Press
www.conferenceie.ase.ro
ISSN 2284-7472
ISSN-L = 2247-1480
Cover: Assoc. Prof. Cătălin BOJA
Copyright © 2015, by Bucharest University of Economic Studies Press
All the copyright of the present book belongs to the Bucharest Academy of Economic Studies Press.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or
otherwise, without the prior written permission of the Editor of Bucharest University of Economic
Studies Press.
All papers of the present volume were peer reviewed by two independent reviewers. Acceptance was
granted when both reviewers' recommendations were positive.
See also: http://www.conferenceie.ase.ro
ISSN 2284-7472
ISSN-L 2247-1480
Conference organized by:
Bucharest University of
Economic Studies
Department of Economic Informatics
and Cybernetics
INFOREC Association
Editors: Assoc. Prof. Cătălin Boja, Bucharest University of Economic Studies, Romania
Lect. Mihai Doinea, Bucharest University of Economic Studies, Romania
Lect. Cristian Ciurea, Bucharest University of Economic Studies, Romania
Prof. Paul Pocatilu, Bucharest University of Economic Studies, Romania
Assoc. Prof. Lorena Bătăgan, Bucharest University of Economic Studies, Romania
Lect. Alina Ion, Bucharest University of Economic Studies, Romania
Lect. Vlad Diaconiță, Bucharest University of Economic Studies, Romania
Lect. Madalina Andreica, Bucharest University of Economic Studies, Romania
Assist. Camelia Delcea, Bucharest University of Economic Studies, Romania
Assist. Alin Zamfiroiu, Bucharest University of Economic Studies, Romania
Assist. Madalina Zurini, Bucharest University of Economic Studies, Romania
Assist. Oana Popescu, Bucharest University of Economic Studies, Romania
International Program Committee Members:
Frederique Biennier, INSA de Lyon, FRANCE
Wladimir Bodrow, University of Applied Sciences, Berlin, GERMANY
Ewa Bojar, Lublin University of Technology, POLAND
Pino Caballero-Gil, University of La Laguna, SPAIN
Hans Czap, Trier University, GERMANY
Florin Filip, Romanian Academy, ROMANIA
Howard Duncan, Dublin City University, IRELAND
Manfred Fischer, Wirtschaftsuniversität Wien, AUSTRIA
Janis Grundspenkis, Riga Technical University, LATVIA
Timothy Hall, University of Limerick, IRELAND
Luca Iandoli, University Federico II, ITALY
Ivan Jelinek, Czech Technical University in Prague, CZECH REPUBLIC
Jones Karl, Liverpool John Moores University, UNITED KINGDOM
Karlheinz Kautz, Copenhagen Business School, DENMARK
Wong Wing Keung, National University of Singapore, SINGAPORE
Yannis Manolopoulos, Aristotle University of Thessaloniki, GREECE
Lynn Martin, University of Central England, Birmingham, UNITED KINGDOM
Antonio Jose Mendes, University of Coimbra, PORTUGAL
Mihaela I. Muntean, West University of Timisoara, ROMANIA
Peter Nijkamp, Free University of Amsterdam, NETHERLANDS
Maria Parlinska, Warsaw University of Life Sciences, POLAND
Boris Rachev, Bulgarian Chapter of the ACM, BULGARIA
George Roussos, Birkbeck, University of London, UNITED KINGDOM
Frantz Rowe, University of Nantes, FRANCE
Doru E Tiliute, "Stefan cel Mare" University of Suceava, ROMANIA
Eduardo Tome, Universidade Lusiada de Famalicao, PORTUGAL
Michael Tschichholz, Fraunhofer eGovernment Center, GERMANY
Giuseppe Zollo, University Federico II, ITALY
Preface
This year, the 14th International Conference on INFORMATICS in ECONOMY (IE 2015),
Education, Research & Business Technologies, was held in Bucharest, Romania, between April 30th
and May 3rd, 2015. The Conference promoted research results in Business Informatics and related
Computer Science topics: Cloud, Distributed and Parallel Computing; Mobile-Embedded and
Multimedia Solutions; E-Society, Enterprise and Business Solutions; Databases and Data
Warehouses; Audit and Project Management; Quantitative Economics; Artificial Intelligence and
Data Mining. The Conference was a meeting point for participants from all over the world, both
from academia and from industry.
The conference was first organized in 1993 in collaboration with researchers from the Institut
National des Sciences Appliquées de Lyon (INSA de Lyon), France. From 1993 to 2011 the
conference was organized every two years, publishing high-quality papers across ten editions and
bringing together specialists from around the world. Since 2012 the conference has taken place
annually; the 11th and 12th edition volumes were indexed by Thomson Reuters in the ISI
Proceedings directory, and the 13th edition volume is under evaluation.
The International Conference on Informatics in Economy is one of the first scientific events
on this subject in Romania and has gained international scientific recognition over the last ten
years. At the national level, it remains one of the most important scientific events, gathering the
entire Romanian Economic Informatics community.
The conference has established partnerships with international journals such as the Journal of
Economic Computation and Economic Cybernetics Studies and Research (http://ecocyb.ase.ro),
Informatica Economica (http://revistaie.ase.ro), Economy Informatics
(http://economyinformatics.ase.ro), the Journal of Applied Quantitative Methods
(http://www.jaqm.ro), the Database Systems Journal (http://www.dbjournal.ro/), the Journal of
Mobile, Embedded and Distributed Systems (http://www.jmeds.eu) and the International Journal of
Economic Practices and Theories (www.ijept.org) to publish extended versions of the conference's
best papers.
A conference such as this can only succeed as a team effort, so the Editors thank the
International Scientific Committee and the Reviewers for their excellent work in reviewing the
papers, as well as for their invaluable input and advice.
The Editors
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
i
Table of Contents
SECTION Cloud & Distributed/Parallel Computing
GENERAL PURPOSE SYSTEM FOR GENERATING EVALUATION
FORMS (GPS4GEF) ............................................................................................................... 1
Daniel HOMOCIANU, Dinu AIRINEI
UPON A MULTI CRITERIA OFFER-DEMAND CONTINUOUS MATCHING
ALGORITHM .......................................................................................................................... 7
Claudiu VINŢE, Amelia CRISTESCU
STOCHASTIC PROCESSES AND QUEUEING THEORY FOR CLOUD
COMPUTER PERFORMANCE ANALYSIS .................................................................... 13
Florin-Cătălin ENACHE
AN EVALUATION OF THE FUZZY VAULT SCHEME DIFFUSION POINTS
ORDER OF MAGNITUDE .................................................................................................. 20
Marius-Alexandru VELCIU, Victor-Valeriu PATRICIU, Mihai TOGAN
EVOLUTION OF TELECOM BSS: FROM VOICE SERVICES TO COMPLEX IT
SERVICES.............................................................................................................................. 26
Ioan DRĂGAN, Răzvan Daniel ZOTA
HIERARCHICAL DISTRIBUTED HASH TABLES FOR VIDEO RECOGNITION IN
CONTENT DELIVERY NETWORKS ............................................................................... 32
Alecsandru PĂTRAȘCU, Ion BICA, Victor Valeriu PATRICIU
DRIVING BUSINESS AGILITY WITH THE USE OF CLOUD ANALYTICS ............ 38
Mihaela MUNTEAN
ORGANIZING SECURE GROUPS OF RELIABLE SERVICES IN DISTRIBUTED
SYSTEMS ............................................................................................................................... 44
Cezar TOADER, Corina RĂDULESCU, Cristian ANGHEL, Graţiela BOCA
AN OVERVIEW STUDY OF SOFTWARE DEFINED NETWORKING ...................... 50
Alexandru STANCU, Simona HALUNGA, George SUCIU, Alexandru VULPE
CONSUMER RIGHTS IN THE CONTEXT OF CLOUD BASED PROCESSING OF
OPEN DATA .......................................................................................................................... 56
Lorena BĂTĂGAN, Cătălin BOJA, Mihai DOINEA
SECURE CYBER SECURITY THREAT INFORMATION EXCHANGE .................... 63
Mihai-Gabriel IONITA
ADOPTION OF CLOUD COMPUTING IN THE ENTERPRISE .................................. 68
Floarea NĂSTASE, Carmen TIMOFTE
HEALTHY AGEING MOBILE GIS APPLICATIONS DEVELOPMENT AND AUDIT
FOR THE ACHIEVEMENT OF SOCIAL SUSTAINABILITY ...................................... 73
Cosmin TOMOZEI, Cristian AMANCEI
CLOUD–BASED ARCHITECTURE FOR PERFORMANCE MANAGEMENT
SYSTEMS FOR SMES.......................................................................................................... 79
Alexandra RUSĂNEANU
SOFTWARE TOOLS AND ONLINE SERVICES THAT ENABLE
GEOGRAPHICALLY DISTRIBUTED SOFTWARE DEVELOPMENT OF WEB
APPLICATIONS ................................................................................................................... 84
Mihai GHEORGHE
THE IMPORTANCE OF JAVA PROGRAMMING LANGUAGE IN
IMPLEMENTING DISTRIBUTED SYSTEMS OF DECISION FOR ONLINE
CREDITING .......................................................................................................................... 90
Robert-Madalin CRISTESCU
PERSONAL DATA VULNERABILITIES AND RISKS MODEL .................................. 95
Gheorghe Cosmin SILAGHI
SECTION Mobile-Embedded & Multimedia Solutions
A BUSINESS MODEL FOR THE INTERACTION BETWEEN ACTORS OF
CULTURAL ECONOMY ................................................................................................... 102
Cristian CIUREA, Florin Gheorghe FILIP
NEAR FIELD COMMUNICATION - THE STATE OF KNOWLEDGE AND USE OF
NFC TECHNOLOGIES AND APPLICATIONS BY USERS OF SMART MOBILE
DEVICES .............................................................................................................................. 108
Mihaela Filofteia TUTUNEA
HOW TO VISUALIZE ONTOLOGIES. A STUDY FROM AN END USER’S
POINT OF VIEW ................................................................................................................ 116
Bogdan IANCU
STUDY ON STUDENTS MOBILE LEARNING ACCEPTANCE ................................ 122
Daniel MICAN, Nicolae TOMAI
CONDUCTING PROCESS MODEL UNDERSTANDING CONTROLLED
EXPERIMENTS USING EYE-TRACKING: AN EXPERIENCE REPORT ............... 128
Razvan PETRUSEL, Cristian BOLOGA
A TEST DATA GENERATOR BASED ON ANDROID LAYOUT FILES .................. 135
Paul POCATILU, Sergiu CAPISIZU
EMV/BITCOIN PAYMENT TRANSACTIONS AND DYNAMIC DATA
AUTHENTICATION WITH SMART JAVA CARDS .................................................... 141
Marius POPA, Cristian TOMA
ACCURATE GEO-LOCATION READING IN ANDROID .......................................... 152
Felician ALECU, Răzvan DINA
SECTION E-Society, Enterprise & Business Solutions
ABOUT OPTIMIZING WEB APPLICATIONS ............................................................. 158
Marian Pompiliu CRISTESCU, Laurentiu Vasile CIOVICA
THE RELATIONSHIP BETWEEN ENVIRONMENTAL AND ENERGY
INDICATORS. THE CASE STUDY OF EUROPE ......................................................... 164
Titus Felix FURTUNĂ, Marian DÂRDALĂ, Roman KANALA
IS THE INTEGRATED MANAGEMENT SYSTEM OF QUALITY, ENVIRONMENT
AND HEALTH AND SAFETY A SOLUTION FOR SME’S PERFORMANCE? .......... 170
Dorin MAIER, Adela Mariana VADASTREANU, Andreea MAIER
A FRAMEWORK FOR DESIGNING AN ONTOLOGY-BASED E-LEARNING
SYSTEM IN HEALTHCARE HUMAN RESOURCE MANAGEMENT ..................... 176
Lidia BAJENARU, Ion Alexandru MARINESCU, Ion SMEUREANU
PROJECT MANAGEMENT COMPETENCY ASSESSMENT FOR IT
PROFESSIONALS: AN ANALYTIC HIERARCHY PROCESS APPROACH ........... 182
Elena-Alexandra TOADER
BUSINESS SUCCESS BY IMPROVING THE INNOVATION
MANAGEMENT ................................................................................................................. 189
Adela Mariana VADASTREANU, Dorin MAIER, Andreea MAIER
RECOMMENDER SYSTEMS, A USEFUL TOOL FOR VIRTUAL MARKETPLACE;
CASE STUDY ...................................................................................................................... 195
Loredana MOCEAN, Miranda VLAD, Mihai AVORNICULUI
METHODOLOGY FOR THE COHERENT ROMANIAN LINKED OPEN
GOVERNMENT DATA ECOSYSTEM............................................................................ 202
Codrin-Florentin NISIOIU
QUALITY METRICS FOR EVALUATING INTERACTIVITY LEVEL OF CITIZEN
ORIENTED WEB BASED SOFTWARE SOLUTIONS ................................................. 207
Emanuel Eduard HERȚELIU
BRIDGE PKI ........................................................................................................................ 212
Eugen Ștefan Dorel COJOACĂ, Mădălina Elena RAC-ALBU, Floarea NĂSTASE
AN ENTERPRISE APPROACH TO DEVELOPING COMPLEX
APPLICATIONS ................................................................................................................. 218
Alexandru-Mihai MARINESCU, Anca ANDREESCU
A PRELIMINARY ANALYSIS OF BUSINESS INFORMATION SYSTEMS MASTER
PROGRAMME CURRICULUM BASED ON THE GRADUATES SURVEY ............ 224
Marin FOTACHE, Valerică GREAVU-ȘERBAN, Florin DUMITRIU
TEAMWORK CULTURE IN IASI IT COMPANIES .................................................... 231
Doina FOTACHE, Luminița HURBEAN
CRM- PHENOMENOLOGICAL AND CONCEPTUAL ANALYSIS IN REFERENCE
TO THE “STATE OF ART” .............................................................................................. 238
Mihaela IONESCU
CONTINUOUS INTEGRATION IN OPEN SOURCE SOFTWARE
PROJECTS ........................................................................................................................... 244
Mihai GEORGESCU, Cecilia CIOLOCA
SERVICE-ORIENTED MODELING AND ARCHITECTURE FOR AN
E-FINANCIAL ASSISTANT INTEGRATION WITHIN THE
BANKING SYSTEM ........................................................................................................... 250
Mirela TURKEȘ, Irina RAICU, Alexandra RUSĂNEANU
RISK ASSESSMENT FRAMEWORK FOR SUCCESSFUL
E-GOVERNMENT PROJECTS ........................................................................................ 256
Otniel DIDRAGA
AN ASSESSMENT OF THE MAIN BENEFITS AND CHARACTERISTICS
OF BUSINESS SOFTWARE FROM THE PERSPECTIVE OF
ROMANIAN SMEs ............................................................................................................. 261
Victor LAVRIC
ECONOMIC VALUE EXCHANGES IN MULTI-PARTY COLLABORATIONS:
USING E3-VALUE TO ANALYSE THE M-PESA ECOSYSTEM ............................... 267
Caroline KINUTHIA, Andrew KAHONGE
ASPECTS OF INTER-ORGANIZATIONAL KNOWLEDGE MANAGEMENT IN
COLLABORATIVE NETWORKS ................................................................................... 273
Marinela MIRCEA
4D(ATA) PARADIGM AND EGOVERNMENT ............................................................. 279
Bogdan GHILIC-MICU, Marian STOICA, Cristian USCATU
AUTOMATIC USER PROFILE MAPPING TO MARKETING SEGMENTS IN A
BIG DATA CONTEXT ....................................................................................................... 285
Anett HOPPE, Ana ROXIN, Christophe NICOLLE
DECISION-MAKING PROCESS ASSISTANCE USING PRECISION TREE
MODULE OF PALISADE DECISION TOOLS SUITE ................................................. 292
Georgeta SOAVA, Mircea Alexandru RADUTEANU, Catalina SITNIKOV
RoaML: AN INNOVATIVE APPROACH ON MODELING WEB SERVICES ......... 299
Cătălin STRÎMBEI, Georgiana OLARU
INFORMATION SYSTEMS IMPLEMENTATION IN THE JIU
VALLEY SME’S .................................................................................................................. 307
Eduard EDELHAUSER, Lucian LUPU DIMA
META-INSTRUCTION IN E-EDUCATION ................................................................... 314
Gabriel ZAMFIR
SEMANTIC WEB TECHNOLOGIES FOR IMPLEMENTING COST-EFFECTIVE
AND INTEROPERABLE BUILDING INFORMATION MODELING........................ 322
Tarcisio MENDES de FARIAS, Ana-Maria ROXIN, Christophe NICOLLE
PRELIMINARY RESULTS OF AN EMPIRICAL INVESTIGATION ON BLENDED
LEARNING IMPLEMENTATION IN A ROMANIAN HEI ......................................... 329
Iuliana DOROBĂȚ
BUSINESS PROCESS MANAGEMENT DRIVEN BY DATA GOVERNANCE ........ 335
Liviu CIOVICĂ, Răzvan Daniel ZOTA, Ana-Maria CONSTANTINESCU
M-LEARNING AND LIFELONG LEARNING............................................................... 341
Alina-Mihaela ION, Dragoș VESPAN
SECTION Databases & Data Warehouse
THE PROBLEM OF DATA CONSISTENCY IN ANALYTICAL SYSTEMS ............ 347
Oleksandr SAMANTSOV, Olena KACHKO
BIG DATA ANALYSIS AS FUNDAMENT FOR PRICING DECISIONS ................... 352
Anca APOSTU
ON A HADOOP CLICHÉ: PHYSICAL AND LOGICAL MODELS
SEPARATION ..................................................................................................................... 357
Ionuț HRUBARU, Marin FOTACHE
BIG DATA CHALLENGES FOR HUMAN RESOURCES MANAGEMENT ............ 364
Adela BÂRA, Iuliana ȘIMONCA (BOTHA), Anda BELCIU, Bogdan NEDELCU
ARCHITECTURE OF SMART METERING SYSTEMS .............................................. 369
Simona-Vasilica OPREA, Ion LUNGU
DATA WAREHOUSE PYRAMIDAL SCHEMA ARCHITECTURE - SUPPORT FOR
BUSINESS INTELLIGENCE SYSTEMS ......................................................................... 375
Aida-Maria POPA
BUSINESS INTELLIGENCE FOR HEALTHCARE INDUSTRY ............................... 381
Mihaela IVAN, Manole VELICANU, Ionut TARANU
STREAMLINING BUSINESS PROCESSES IN ACADEMIA BY BUILDING AND
MANIPULATING A BUSINESS RULES REPOSITORY ............................................. 387
Alexandra Maria Ioana FLOREA, Ana-Ramona BOLOGA, Vlad DIACONIȚA, Razvan BOLOGA
ENHANCING THE ETL PROCESS IN DATA WAREHOUSE SYSTEMS ................ 392
Ruxandra PETRE
SECTION Audit and Project Management
SOFTWARE DEVELOPMENT METHODOLOGY FOR INNOVATIVE PROJECTS -
ISDF METHODOLOGY .................................................................................................... 398
Mihai Liviu DESPA
AGILITY IN THE IT SERVICES SECTOR: A STUDY FOR ROMANIA ................. 410
Eduard-Nicolae BUDACU, Constanta-Nicoleta BODEA, Stelian STANCU
IMPROVEMENT OPPORTUNITIES BY USING REMOTE AUDIT IN THE
MARITIME TRANSPORT ................................................................................................ 418
Costel STANCA, Viorela-Georgiana STȊNGĂ, Gabriel RAICU, Ramona TROMIADIS
AUDIT REPORTS VALIDATION BASED ON ONTOLOGIES .................................. 422
Ion IVAN, Claudiu BRANDAS, Alin ZAMFIROIU
APPLICATION OF BUSINESS INTELLIGENCE IN PROJECT
MANAGEMENT ................................................................................................................. 428
Mihaela I. MUNTEAN, Liviu Gabriel CABᾸU
SECTION Quantitative Economics
ANALYSIS OF THE MACROECONOMIC CAUSAL RELATIONSHIPS WHICH
CAN DETERMINE A COUNTRY TO ENTER A SOVEREIGN DEBT CRISIS ....... 433
Alexandra Maria CONSTANTIN, Adina CRISTEA
INVESTING OPTIMALLY IN ADVERTISING AND QUALITY TO MITIGATE
PRODUCT-HARM CRISIS ............................................................................................... 437
Francesco MORESINO
COUPLING TECHNO-ECONOMIC ENERGY MODELS WITH A SHARE OF
CHOICE ............................................................................................................................... 443
Francesco MORESINO, Emmanuel FRAGNIÈRE, Roman KANALA, Adriana REVEIU, Ion SMEUREANU
A GENDER ANALYSIS OF THE MINIMUM WAGE EFFECTS UPON
EMPLOYMENT IN ROMANIA........................................................................................ 449
Madalina Ecaterina POPESCU, Larisa STANILA, Amalia CRISTESCU
JUNCTIONS BETWEEN EVALUATION THEORY AND GAMES THEORY ......... 455
Marin MANOLESCU, Magdalena TALVAN
ROMANIAN EDUCATION SYSTEM – EFFICIENCY STATE ................................... 461
Madalina Ioana STOICA, Crisan ALBU
GST FOR COMPANY’S REPUTATION ANALYSIS IN ONLINE
ENVIRONMENTS .............................................................................................................. 467
Camelia DELCEA
DETERMINANTS OF EU MIGRATION. PANEL DATA ANALYSIS ....................... 473
Costin-Alexandru CIUPUREANU, Elena-Maria PRADA
EUROPEAN COUNTRIES AND THE SUSTAINABILITY CHALLENGE: FOCUS
ON TRANSPORTATION ................................................................................................... 479
Georgiana MARIN, Alexandra MATEIU
THE EVALUATION AND STRENGTHENING OF THE FREIGHT TRANSPORT
SYSTEM, AS A SOLUTION FOR SUSTAINABLE DEVELOPMENT
IN ROMANIA ...................................................................................................................... 485
Georgiana MARIN, Alexandra MATEIU
INNOVATION – CONTENT, NATIONAL INNOVATION STRATEGIES
AND MODELLING INNOVATION USING THE MICROECONOMIC
APPROACH ......................................................................................................................... 491
Stelian STANCU, Constanţa-Nicoleta BODEA, Oana Mădălina POPESCU, Orlando Marian VOICA, Laura Elly NAGHI
RISK AWARENESS AS COMPETITIVE FACTOR FOR PUBLIC
ADMINISTRATION - A GERMAN CASE STUDY ....................................................... 497
Markus BODEMANN, Marieta OLARU, Ionela Carmen PIRNEA
SOVEREIGN RISK DEPENDENCE PATTERN IN EMERGING EUROPE ............ 504
Gabriel GAIDUCHEVICI
MEASURE YOUR GENDER GAP: WAGE INEQUALITIES USING BLINDER
OAXACA DECOMPOSITION .......................................................................................... 510
Radu-Ioan VIJA, Ionela-Catalina ZAMFIR
THE CONSUMPTION CHANNEL OF NON-KEYNESIAN EFFECTS. SOME
EMPIRICAL EVIDENCES FOR ROMANIA ................................................................. 517
Ana ANDREI, Angela GALUPA, Sorina GRAMATOVICI
FEEDBACK ANALYSIS AND PARAMETRIC CONTROL ON PROCESS OF
DISPOSABLE INCOME ALLOCATION – A DYNAMIC MODEL ON PORTUGAL’S
NATIONAL ECONOMY .................................................................................................... 523
Bianca Ioana POPESCU, Emil SCARLAT, Nora CHIRIȚĂ
INEQUALITY OF INCOME DISTRIBUTION IN ROMANIA. METHODS OF
MEASUREMENT AND CAUSES ..................................................................................... 529
Malina Ionela BURLACU
WAR GAMES AND A THIRD PARTY INTERVENTION IN CONFLICT ................ 534
Mihai Daniel ROMAN
MACROECONOMIC FACTORS OF SMEs PERFORMANCE IN ROMANIA IN THE
PERIOD 2005-2013. A TIME SERIES APPROACH ...................................................... 540
Marușa BECA, Ileana Nișulescu ASHRAFZADEH
EFFICIENCY OF THE EUROPEAN STRUCTURAL FUNDS INVESTED IN
EDUCATIONAL INFRASTRUCTURE ........................................................................... 546
Monica ROMAN
RESOURCES ALLOCATION MODEL IN A CLUSTERED CLOUD
CONFIGURATION ............................................................................................................. 552
Mioara BANCESCU
UPON DECISION-MAKING IN ALTERNATIVE DESIGN PROBLEMS ................. 558
Dimitri GOLENKO-GINZBURG
SECTION Artificial Intelligence & Data-mining
ARTIFICIAL NEURAL NETWORK APPROACH FOR DEVELOPING
TELEMEDICINE SOLUTIONS: FEED-FORWARD BACK PROPAGATION
NETWORK .......................................................................................................................... 563
Mihaela GHEORGHE
NEURAL NETWORK-BASED APPROACH IN FORECASTING FINANCIAL
DATA .................................................................................................................................... 570
Cătălina-Lucia COCIANU, Hakob GRIGORYAN
SEMANTIC HMC FOR BUSINESS INTELLIGENCE USING CROSS-
REFERENCING .................................................................................................................. 576
Rafael PEIXOTO, Thomas HASSAN, Christophe CRUZ, Aurélie BERTAUX, Nuno SILVA
MULTI-DOMAIN RETRIEVAL OF GEOSPATIAL DATA SOURCES
IMPLEMENTING A SEMANTIC CATALOGUE .......................................................... 582
Julio Romeo VIZCARRA, Christophe CRUZ
EMOTIONAL ROBO-INTELLIGENCE CREATION PROCESS ............................... 587
Dumitru TODOROI
MODELING THE RELATIONSHIPS NETWORKS INSIDE GROUPS AS
GRAPHS ............................................................................................................................... 601
Diana RIZESCU (AVRAM), Vasile AVRAM
A SEMANTIC MOBILE WEB APPLICATION FOR RADIATION SAFETY IN
CONTAMINATED AREAS ............................................................................................... 607
Liviu-Adrian COTFAS, Antonin SEGAULT, Federico TAJARIOL, Ioan ROXIN
PREDICTING EFFICIENCY OF JAPANESE BANKING SYSTEM USING
ARTIFICIAL NEURAL NETWORKS (ANN): DATA ENVELOPMENT ANALYSIS
(DEA) APPROACH ............................................................................................................. 613
Ionut-Cristian IVAN
SEMANTIC RELATIONS BETWEEN AUTHORSHIP, DOMAINS
AND CULTURAL ORIENTATION WITHIN TEXT DOCUMENT
CLASSIFICATION ............................................................................................................. 618
Mădălina ZURINI
GENERAL PURPOSE SYSTEM FOR GENERATING EVALUATION
FORMS (GPS4GEF)
Daniel HOMOCIANU
Department of Research, Faculty of Economics and Business Administration,
“Alexandru Ioan Cuza” University of Iasi
[email protected]
Dinu AIRINEI
Department of Accounting, Business Information Systems and Statistics, Faculty of
Economics and Business Administration, “Alexandru Ioan Cuza” University of Iasi
[email protected]
Abstract. The paper introduces a general-purpose model used to dynamically and randomly
generate on-line evaluation forms, starting from a simple data source format containing
questions, answers and links to interactive materials that embed documentation and/or
simulations, with the aim of increasing the productivity of evaluation and assessment. It also
underlines the advantages of using such a model for any teacher, professor or other user involved
in assessment and evaluation processes, and presents a short description of the components
designed to make it functional.
Keywords: On-Line Forms, GAS (Google Apps Script), GPS4GEF
JEL classification: C88, I25, Y10
1. Introduction
The first implementations of forms using the HTML language appeared in 1995 and 1996. Dave
Raggett had been working for some time on his new ideas for HTML, and at last he formalized
them in a document published as an Internet Draft in March 1995; it dealt with HTML tables
and tabs, footnotes and forms. In December 1996, the HTML ERB became the HTML Working
Group and began work on `Cougar', the next version of HTML, with completion planned for late
spring 1997, eventually to become HTML 4. With all sorts of innovations for the disabled and
support for international languages, as well as style sheet support, extensions to forms,
scripting and much more, HTML 4 breaks away from the simplicity and charm of the HTML of
earlier years [1].
PHP as known today is the successor of a product named PHP/FI (Personal Home Page Tools /
Forms Interpreter). In the beginning it was used by its original creator, Rasmus Lerdorf (1994),
to track the visits to his on-line CV. It was then developed to provide interaction with
databases and a framework for developing simple, dynamic web applications [2]. One of the
most powerful features of PHP is the way it handles HTML forms [3].
The history of forms is longer if we take into account the traditional programming languages
that evolved into event-driven languages with visual interfaces built on forms, icons and many
other objects. A simple example is Microsoft Visual Basic (MS VB), which included forms as core
objects as early as its 2.0 version (1992). That seems to be related to the fact that throughout
the history of VB the focus has always been on rapid application development, which is
what makes it such a widely used programming environment [4].
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
2
In fact, considering telephone questionnaires or printed and distributed forms, this history
covers a long period during which forms were processed manually (typewritten/edited) and
only later scanned and recognized.
In this paper we present the components of a general-purpose system designed to automatically
and randomly generate evaluation forms needed in education and beyond.
2. Reasons for using Google Apps Script (GAS)
GAS is a JavaScript cloud scripting language that provides easy ways to automate tasks across
Google products and third-party services and to build web applications [5].
In terms of ease of use and speed of programming we consider GAS to be as good as Visual
Basic for Applications (VBA), given that VB is recognized as a RAD (Rapid Application
Development) environment [6]. There are many examples available on-line [7] proving GAS’s
ability to interact with data sources [8] and to automatically generate results.
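As a minimal illustration of this point (our own sketch, not code from the paper; the function name, parameter names and sheet layout are hypothetical), reading all rows of one question sheet through the standard GAS SpreadsheetApp service takes only a few lines:

```javascript
// Hypothetical sketch: fetch the raw rows of one question sheet via the
// Google Apps Script SpreadsheetApp service. openById, getSheetByName and
// getDataRange are standard GAS calls; everything else is illustrative.
function readQuestionRows(sourceId, sheetName) {
  var sheet = SpreadsheetApp.openById(sourceId).getSheetByName(sheetName);
  // getValues() returns a two-dimensional array: one inner array per row.
  return sheet.getDataRange().getValues();
}
```

Inside Apps Script such a function runs as-is once the script is authorized to access the spreadsheet; there is no build or deployment step, which is a large part of the RAD appeal compared above to VBA.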
3. The GPS4GEF architecture
The structure of a data source accepted by the GPS4GEF form generator was designed
as multiple sheets in a Google Spreadsheet file (Figure 1). Every sheet contains records
(lines) with question id, type, title, help text, number of answers, all answers, all correct
answers and the URL of an interactive (and/or video) support file, if necessary (Figure 1).
Figure 1. The general structure of a simple data source format (Google Spreadsheets) designed by authors [9]
as input for GPS4GEF
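To make the record layout concrete, the sketch below (ours, not the authors' code; the exact column order and the separator used for multi-valued answer cells are assumptions) maps one spreadsheet row onto a question object:

```javascript
// Illustrative mapping of one data-source row to a question object. The
// column order follows the fields listed in the text (id, type, title, help
// text, number of answers, answers, correct answers, support URL); the "|"
// separator for answer lists is our own assumption.
function rowToQuestion(row) {
  return {
    id: row[0],
    type: row[1],
    title: row[2],
    helpText: row[3],
    answerCount: Number(row[4]),
    answers: String(row[5]).split("|"),
    correctAnswers: String(row[6]).split("|"),
    supportUrl: row[7] || null   // empty cell -> no support material
  };
}
```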
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
3
A data source of this kind (Figure 1 shows an example on Information Technologies for
Business) is specified by its forty-four-character ID in the GPS4GEF Google form (Figure 2,
upper left), which is responsible for generating evaluation forms with automatic feedback. The
simple structure of the data sources is meant to increase productivity when a large number of
questions and support files must be created, made available to a group of users, and integrated
into randomly and automatically generated forms used for feedback and evaluation. Controlling
the persistence of URLs is another important reason for this approach, because after a test is
completed (the forms have been distributed and filled in), the associated data collectors must
be downloaded and the forms deleted.
Figure 2. The GPS4GEF form and its corresponding data collector, designed by the authors [10] to accept
simple data sources in order to generate evaluation forms and their corresponding data collectors, embedding
score computation and feedback scripts

In association with the “On form submit” event (Figure 3), the GPS4GEF form uses a function
named onThisFormSubmit (Figure 4), responsible for interacting with the specific data source
constructed following the recommended pattern [11] (Figure 1) in order to dynamically
generate evaluation forms (Figure 5).
Figure 3. The function onThisFormSubmit associated to the corresponding event
This function is defined using GAS in the script section (Tools / Script editor…) of the data
collector spreadsheet (bottom of Figure 2) associated with the GPS4GEF form and has sixty-eight
commented lines of code (top of Figure 4) needed: (1) to open the source spreadsheet suggested
to GPS4GEF, made and/or used by the instructor according to the indicated pattern (Figure 1 and
Figure 2 – upper left), (2) to automatically and randomly generate a new evaluation form and a
corresponding data collector spreadsheet with a submit trigger needed to give participants
an evaluation feed-back (e-mail) and (3) to send to the instructor the links of the newly created
form and data collector (e-mail).
Figure 4. The onThisFormSubmit function as edited by authors [12] behind the GPS4GEF form’s data collector
and the other eleven functions [13] needed to support it
In addition, we have defined two other commented functions (middle of Figure 4 – code lines
70-75 and 77-96) needed to support onThisFormSubmit, namely: createSubmitTriger (a function
that programmatically creates a submit trigger associated with the dynamically generated
evaluation form’s data collector spreadsheet) and oFs (a function used by createSubmitTriger
to compute the evaluation scores after each submit). There are also nine commented functions
(bottom of Figure 4 – code lines 98-134) used to easily manipulate strings: Cstr, Len, Mid,
Left, Right, Search, DebugAssert, IsMissing and isUndefined. The last three of these are used
for controlling exceptions in strings.
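For illustration only – the paper’s helpers are implemented in GAS – the VB-style semantics that functions such as Left, Right and Mid emulate can be sketched in Python; the 1-based Mid convention is an assumption based on the Visual Basic namesakes:

```python
def Left(s, n):
    """First n characters of s."""
    return s[:n]

def Right(s, n):
    """Last n characters of s."""
    return s[len(s) - n:] if n else ""

def Mid(s, start, length):
    """length characters of s starting at 1-based position start (VB convention)."""
    return s[start - 1:start - 1 + length]

print(Left("GPS4GEF", 3), Right("GPS4GEF", 3), Mid("GPS4GEF", 4, 1))  # GPS GEF 4
```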
When generating an evaluation form, the onThisFormSubmit function checks every single sheet
of the source spreadsheet indicated by the ID used when filling in the GPS4GEF form and
randomly extracts just one line (row) per sheet, holding the information about a single question.
Thus GPS4GEF automatically and randomly generates evaluation forms with a number of
questions equal to the number of sheets in the source spreadsheet (Figures 1 and 5).
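The row-selection logic described above can be sketched as follows; the paper’s implementation is in Google Apps Script, so this Python sketch, with each sheet modeled as a list of row tuples and purely hypothetical question data, is only illustrative:

```python
import random

def pick_questions(sheets):
    """Randomly extract one row per sheet, mirroring how onThisFormSubmit
    builds a form with as many questions as the source has sheets."""
    return [random.choice(rows) for rows in sheets if rows]

# Hypothetical two-sheet source; each row: (id, type, title, ...)
sheets = [
    [("q1", "choice", "What does RAM stand for?"),
     ("q2", "choice", "What does ROM stand for?")],
    [("q3", "text", "Define Google Apps Script.")],
]
form = pick_questions(sheets)
print(len(form))  # 2 questions, one per sheet
```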
Figure 5. Example of evaluation form generated [14] using GPS4GEF
For the moment there is a limitation: the instructor will not be able to delete the generated form
and its corresponding data collector. This can be solved by replacing code line no. 67 with two
lines (Figure 4 – the addEditor method, both for the form [15] and for the spreadsheet [16]).
Additional restrictions to be set behind the evaluation form generator (the onThisFormSubmit
function) are those necessary to prevent a participant from completing the same form more
than once.
The name chosen for this approach also suggests a new direction (GPS) for increasing the
productivity of evaluation tools available to any instructor (teacher, professor – code name
Jeff / GEF).
4. Conclusions

The general conclusion of the paper underlines the context of knowledge, education and
lifelong learning, and the importance of assessment and evaluation supported by tools able to
move the focus from technical and implementation details to patterns to be followed in order
to increase the productivity of evaluation.
The examples describe the components of a functional model implemented with minimum
effort, using technologies freely available to a large target group.
The paper does not claim completeness although the approach was defined after many tests
made using various educational methods and tools for simulation, learning and evaluation (Tata
Interactive Systems TOPSIM [17], Blackboard, Moodle and Microsoft SharePoint [18]) most
of them finalized with dissemination of results.
Acknowledgment

This paper was funded by “Alexandru Ioan Cuza” University of Iasi (UAIC) within the research
grant no. GI-2014-17, in the competition named “Grants for Young Researchers at UAIC”, and
also by the Department of Research of the Faculty of Economics and Business Administration
(FEAA), UAIC.
References

[1] D. Raggett. A history of HTML: http://www.w3.org/People/Raggett/book4/ch02.html, 1998, [Feb. 20, 2015].
[2] History of PHP, http://php.net/manual/en/history.php.php, [Feb. 20, 2015].
[3] Dealing with Forms, http://php.net/manual/en/tutorial.forms.php, [Feb. 20, 2015].
[4] History of Visual Basic, http://www.max-visual-basic.com/history-of-visual-basic.html, 2010, [Feb. 20, 2015].
[5] Build web apps and automate tasks with Google Apps Script, http://www.google.com/script/start/, [Feb. 20, 2015].
[6] Janssen. Rapid Application Development (RAD): http://www.techopedia.com/definition/3982/rapid-application-development-rad, [Feb. 20, 2015].
[7] Google Apps Script Tagged Questions, http://stackoverflow.com/questions/tagged/google-apps-script, [Feb. 20, 2015].
[8] Top 10 Google Apps Scripts for Education, https://www.synergyse.com/blog/top-10-google-apps-scripts-for-education/, July 1, 2013, [Feb. 20, 2015].
[9] sites.google.com/site/supp4ie2015/downloads/spsh_source.tiff?attredirects=0&d=1
[10] docs.google.com/forms/d/1hhhRPaG07P47VFwbVMIejmQd0j6LQXFVosb2S_2BdiI
[11] sites.google.com/site/supp4ie2015/downloads/BTR.xlsx?attredirects=0&d=1
[12] sites.google.com/site/supp4ie2015/downloads/OTFS.tiff?attredirects=0&d=1
[13] sites.google.com/site/supp4ie2015/downloads/OFS_other_f.tiff?attredirects=0&d=1
[14] docs.google.com/forms/d/1Ljf0c-EAXOhsvOpFGR3NjXaNHK09pc0j7blUgzOhR9o
[15] Google Apps Script, Class Form, https://developers.google.com/apps-script/reference/forms/form, [Feb. 20, 2015].
[16] Google Apps Script, Class Spreadsheet, https://developers.google.com/apps-script/reference/spreadsheet/spreadsheet, [Feb. 20, 2015].
[17] D. Homocianu and D. Airinei, “Design of a Simulation Environment for Group Decisions”, in Proc. The 19th IBIMA Conference, Barcelona, 2012, pp. 1944-1950.
[18] D. Homocianu et al. (2014, September). Multimedia for Learning in Economy and Cybernetics, Journal of Economic Computation and Economic Cybernetics Studies and Research [Online]. 3(48). Available: http://www.ecocyb.ase.ro/eng/articles_3-2014/homocianu%20daniel,%20cristina%20necula.pdf
UPON A MULTI CRITERIA OFFER-DEMAND CONTINUOUS
MATCHING ALGORITHM
Claudiu VINŢE
Bucharest University of Economic Studies
Amelia CRISTESCU
Bucharest University of Economic Studies
Abstract. Our ongoing research intends to identify a mechanism for continuously matching
the offers and the demands for job positions posted by companies and candidates, respectively,
on a dedicated web portal: job2me. This paper briefly presents our focus on the matching
algorithm, and the environment that we developed purposely for simulating the load with
requests that may come from both sides: offers and demands for job positions. Our approach
for the job2me portal involves a collection of distributed services interconnected through a
message oriented middleware (MoM). The whole architecture of the distributed system is a
service oriented one. The services that deliver the system functionality are glued together
through a proprietary message oriented API based on JMS. This paper concerns the data
model and the methodology that we conceived and implemented for obtaining a fast and fair
matching algorithm, along with the test results that we obtained within the simulation
environment.
Keywords: Job Market, Demand and Offer, Multi Criteria Matching Algorithm, Messaging.
JEL classification: C610, C630, C880
1. Introduction
In an ever more competitive social and business environment, the ability to find a job, from a
potential employee’s perspective, and, correspondingly, to identify a suitable candidate for a
certain open position, from an employer’s point of view, are two sides of the same coin that
urge each other toward unification in a timely fashion. There are indeed two aspects of this
problem:
- creating the opportunity for the job offer side to enter into contact, to meet, and to set up an
interview with a candidate, the potential employee;
- making this link as quickly as possible, and in a cost-effective way.
The stable matching problem, and the algorithm that Gale and Shapley proposed in 1962 for
making all marriages stable for any equal number of men and women [1], inspired numerous
research efforts in various domains where the process of bringing together offer and demand
can be performed in an algorithmic manner.
There are many web portals – see www.ejobs.ro for instance – where job offers and demands
can be posted, and where screening tools are provided for both sides of the market. These
platforms generally create a meeting place for offer and demand, but fail to play an active role
in identifying suitable matches.
The SAUGE project, on the other hand, whose name stands for Semantic Analyses for
Unrestricted Generalized Employment, aims at providing a technology that could capture the
key aspects contained in a manually written CV and transform them into structured information
that would subsequently be connected within the context of a Linked Open Data initiative [2].
In this context, our research aims to conceive and implement a deterministic algorithm that
matches multiple criteria provided by, or collected from both offer and demand sides [3].
2. The simulation environment for the matching engine
The premises from which we embarked on our research project are the following:
I. regardless of how the key aspects of a job offer (requirements) or of a job demand (skills)
are collected and supplied to the matching engine as input data, the algorithm should
complement and complete the process of creating a direct link between the offers and
demands placed into the system;
II. the output of the matching algorithm should be in the form of asynchronous messages
that inform the sides regarding a potential match, and open the gateway for arranging job
interviews;
III. the matching algorithm has to embed enough relaxation not to exclude interview
opportunities – more interviews are preferred to missing a potentially suitable candidate.
The first premise means that whether the key aspects, identified by each side as defining
selection criteria, are collected explicitly via the various forms supplied within the web portal,
or are deduced through semantic analysis of descriptive files containing job requirements and
CVs respectively, the matching eventually has to be attempted on a common set of criteria for
both offer and demand.
The simulation environment that we conceived for testing the matching algorithm has a
service-oriented architecture and comprises the following components, interconnected through
a proprietary message oriented API based on JMS:
- DOME – Demand-Offer Matching Engine – the component that encapsulates the matching
algorithm;
- PRODS – Pseudo-Random Offer-Demand Simulator – the service responsible for feeding
the matching algorithm quasi-continuously with job offers and demands constructed based
on the instructions provided in the configuration files.
The simulation environment is a distributed software solution intended to provide a framework
that can easily accommodate, later on, the additional services required by a fully functional
web portal dedicated to collecting and matching job offers and demands [4]. The components
of the service-oriented architecture communicate by passing messages via a JMS message
broker [5], [6], [7]. PRODS plays the role of message producer and places the newly generated
job offers and demands on the offer queue (DOME_OFFER_QUEUE) and the demand queue
(DOME_DEMAND_QUEUE), respectively.
Once a match is identified by DOME, a message is generated and stored in the system database.
Within the job2me software solution, the matching message is to be transmitted asynchronously
to both sides via email. The simulation framework is illustrated in Figure 1.
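The producer/consumer flow between PRODS and DOME can be sketched as follows. The paper uses JMS destinations; here Python’s standard queue module stands in for them, the matching work inside DOME is stubbed out, and everything beyond the queue names taken from the paper is a simplifying assumption:

```python
import queue

# Stand-ins for the JMS destinations named in the paper
DOME_OFFER_QUEUE = queue.Queue()
DOME_DEMAND_QUEUE = queue.Queue()
matches_db = []   # stand-in for the job2me database

def prods_publish(item):
    """PRODS: place a newly generated offer ('O') or demand ('D') on its queue."""
    target = DOME_OFFER_QUEUE if item["side"] == "O" else DOME_DEMAND_QUEUE
    target.put(item)

def dome_consume():
    """DOME: drain both queues; the real engine would run the matching
    algorithm here and persist any match message to the database."""
    for q in (DOME_OFFER_QUEUE, DOME_DEMAND_QUEUE):
        while not q.empty():
            matches_db.append(q.get())

prods_publish({"id": "O-1", "side": "O", "skills": ["Java"]})
prods_publish({"id": "D-1", "side": "D", "skills": ["Java"]})
dome_consume()
print(len(matches_db))  # 2
```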
Figure 1. The architecture of simulation environment employed to test the matching algorithm
A job offer or a demand consists of a collection of attributes, such as:
- ID – a unique identifier of a company offer or of a demand formulated by a candidate;
- job demand – “D”, or job offer – “O”;
- type: unpaid, internship, job;
- job period: 1 month, 3 months, 6 months, one year, indefinite period;
- start date;
- end date;
- number of working hours per day;
- city;
- district;
- country;
- years of experience (required by the position offered, respectively possessed by the
candidate).
In addition to the above criteria, a job offer or demand has an associated list of skills. Our
simulation targeted jobs chiefly in the information technology sector, and therefore we grouped
the skills into five categories, or sections:
- programming languages and technologies;
- database systems;
- operating systems;
- foreign languages;
- development, administrative tools and others.
Within each section, a particular skill may have a certain level:
- beginner,
- intermediate,
- advanced.
It is worth noticing that a greater number of categories, or more granular levels associated with
skills, do not change the fundamentals of the matching algorithm, hence they are subject to
customisation. Each category of skills has an associated relevance, or weight, and these
weighting values are fetched from the configuration files during the initialization of the
algorithm environment.
3. The offer-demand continuous matching algorithm
The matching algorithm that we propose is designed to attempt to find a match anytime a new
job offer or a new job demand is entered in the system, against the existing demands,
respectively offers, previously placed in the system, and employed as patterns to be matched
[8]. In order to achieve this readiness, the algorithm needs to dynamically maintain a data
structure illustrated in Figure 2.
Figure 2. The driving data structure conceived for the matching algorithm
Prior to any matching attempt, a hash table is created based on the identified criteria, with their
associated levels, from each section (category of criteria); its keys are strings obtained by
concatenating Criteria-Level-Section (C-L-S).
The values pointed to by these keys contain two lists: the offer IDs that required, and the
demand IDs that possessed, the given C-L-S key.
The algorithm implies searches in the hash table of criteria, but not in the list of offers or
demands. It computes for any given ID in the system, offer or demand on the market, a
matching percentage against an existing data pattern from the other side, based on the
frequency of retrieving the given ID in the lists corresponding to each criterion, C-L-S key [9].
The algorithm penalizes the IDs retrieved in the lists corresponding to the adjacent keys from
the targeted C-L-S key.
The categories of skills are conceived to have different relevance in selecting the candidates
for an offer or in recommending certain job offers to a candidate. This degree of relevance is
modelled in the algorithm by assigning to each category of criteria a certain weight [10]. For
example, in the context of an IT job market, the programming language skills are assigned a
higher relevance (weight) than the development and administrative tools.
It is important to note that other attributes of a job offer or demand, such as job type, period,
number of working hours per day etc., can also be included as keys in the hash table.
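A minimal sketch of the driving data structure and of the weighted matching percentage might look as follows; the section weights, the example C-L-S keys and the omission of the adjacent-key penalty are all simplifying assumptions, not the paper’s actual implementation:

```python
from collections import defaultdict

# Hypothetical weights per section (fetched from configuration files in the paper)
WEIGHTS = {"S1": 3.0, "S2": 2.0, "S3": 1.0, "S4": 1.5, "S5": 0.5}

def build_index(offers, demands):
    """Hash table keyed by 'Criterion-Level-Section' strings; each value holds
    two lists: offer IDs requiring, and demand IDs possessing, that C-L-S key."""
    index = defaultdict(lambda: {"O": [], "D": []})
    for oid, keys in offers.items():
        for k in keys:
            index[k]["O"].append(oid)
    for did, keys in demands.items():
        for k in keys:
            index[k]["D"].append(did)
    return index

def match_percentage(offer_keys, did, index):
    """Weighted fraction of the offer's C-L-S criteria under which the demand
    ID is retrieved (the adjacent-level penalty is omitted for brevity)."""
    total = sum(WEIGHTS[k.split("-")[2]] for k in offer_keys)
    hit = sum(WEIGHTS[k.split("-")[2]] for k in offer_keys if did in index[k]["D"])
    return 100.0 * hit / total if total else 0.0

offers = {"O-ID1": ["Java-L2-S1", "SQL-L1-S2"]}
demands = {"D-ID1": ["Java-L2-S1"], "D-ID2": ["Java-L2-S1", "SQL-L1-S2"]}
idx = build_index(offers, demands)
print(match_percentage(offers["O-ID1"], "D-ID1", idx))  # 60.0
print(match_percentage(offers["O-ID1"], "D-ID2", idx))  # 100.0
```

Searching the hash table rather than scanning the full lists of offers or demands is what keeps each matching attempt cheap as the system grows.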
4. Test results and further research
This paper aims to briefly present the results of our ongoing research on multi criteria offer-
demand continuously matching algorithm, along with the environment that we developed
purposely for simulating the load with requests that may come from both sides: offers and
demands for job positions.
Since in the real job market, an offer and a demand may stay listed for a reasonably long period
of time, we needed to verify the impact that a continuously growing data structure, employed
by the algorithm, would have on the required matching time [11].
Within the simulation environment we recorded the time required by the algorithm to identify
the best match for a new job offer entering a system that contains a growing number of demand
patterns to compare against. The evolution of the average matching time, in milliseconds, is
presented in Figure 3 below.
Figure 3. The evolution of the average matching time in milliseconds,
function of the existing number of demands in the system
The preliminary test results indicate a polynomial evolution of the average matching time as a
function of the number of corresponding patterns existing in the system [12]. The findings are
corroborated by the test results obtained when matching a new demand against the existing
offer patterns in the system (Figure 4).
Figure 4. The evolution of the average matching time, in milliseconds,
function of the existing number of offers in the system
Data underlying Figures 3 and 4:
Patterns in system:                 50   100   250   500   1000   2500   5000
Offer average matching time (ms):   15    15    16    16     21     26     30
Demand average matching time (ms):  15    15    16    16     22     24     28
The above results were obtained on a platform with the following characteristics: Intel® Core™
i5 CPU, dual core, 4GB RAM, running Windows 7 Home Premium, 64-bit operating system.
Our ongoing research aims to refine the multi criteria offer-demand continuous matching
algorithm that we succinctly presented in this paper, and to integrate it within the job2me web
portal, in order to offer the graduates of the Economic Informatics Faculty a dedicated platform
that would facilitate their moves on the job market.
References
[1] D. Gale, L. Shapley, "College Admissions and the Stability of Marriage", American
Mathematical Monthly, Vol. 69, pp. 9–14, 1962
[2] SAUGE project – Semantic Analyses for Unrestricted Generalized Employment. Internet:
http://sauge-project.eu/overview/
[3] C. Vinţe, “The Informatics of the Equity Markets - A Collaborative Approach”, Informatica
Economica, Vol. 13, Issue 2/2009, pp. 76-85, ISSN 1453-1305, Available at:
http://revistaie.ase.ro/content/50/009%20-%20Vinte.pdf
[4] A. S. Tanenbaum, M. van Steen, Distributed Systems – Principles and Paradigms, Vrije
Universiteit Amsterdam, The Netherlands, Prentice Hall, New Jersey, 2002, pp. 99-119,
414-488, 648-677
[5] M. Richards, R. Monson-Haefel, D. A. Chappell, Java Message Service (Second Edition),
O’Reilly Media Inc., Sebastopol, California, 2009.
[6] C. Vinţe, “Upon a Message-Oriented Trading API”, Informatica Economica, Vol. 14, No.
1/2010, pp 208-216, ISSN 1453-1305, Available at:
http://revistaie.ase.ro/content/53/22%20Vinte.pdf
[7] S. Maffeis, Professional JMS Programming, Wrox Press 2001, pp. 515-548, Available:
http://www.maffeis.com/articles/softwired/profjms_ch11.pdf
[8] W. J. Cook, W. H. Cunningham, W. R. Pulleyblank, A. Schrijver, Combinatorial
Optimization, John Wiley & Sons, Inc., New York, 1998
[9] D. E. Knuth, The Art of Computer Programming - volume 3, Sorting and Searching,
Addison-Wesley Publishing Company, 1973
[10] D. L. Kreher, D. R. Stinson, Combinatorial Algorithms: Generation, Enumeration, and
Search, CRC Press LLC, New York, 1999
[11] G. L. Nemhauser, L. A. Wolsey, Integer and Combinatorial Optimization, John Wiley &
Sons, Inc., New York, 1999
[12] C. H. Papadimitriou, K. Steiglitz, Combinatorial Optimization – Algorithms and
Complexity, Dover Publications, Inc., Mineola, New York, 1998
STOCHASTIC PROCESSES AND QUEUEING THEORY FOR CLOUD
COMPUTER PERFORMANCE ANALYSIS
Florin-Cătălin ENACHE
Bucharest University of Economic Studies [email protected]
Abstract. The cloud business has grown exponentially over the last 5 years. Capacity managers
need to concentrate on a practical way to simulate the random demands a cloud infrastructure
could face, even though there are not many mathematical tools for simulating such demands.
This paper presents an introduction to the most important stochastic processes and queueing
theory concepts used for modeling computer performance. Moreover, it shows the cases where
such concepts are applicable and where they are not, using clear programming examples of how
to simulate a queue, and of how to use and validate a simulation when there are no mathematical
concepts to back it up.
Keywords: capacity planning, capacity management, queueing theory, statistics, metrics
JEL classification: C02, C15, C61
1. Introduction

During the last years, the types and complexity of people’s needs have increased rapidly. In
order to face all these changes, technology had to develop new ways to fulfill the new demands.
Therefore, I take a deeper look into the basic terms needed for understanding the stochastic
analysis and queueing theory approaches to computer performance models. The most important
distribution for analyzing computer performance models is the exponential distribution, while
the most representative distribution for statistical analysis is the Gaussian (or normal)
distribution. For the purposes of this article, an overview of the exponential distribution will
be discussed.
2.1 The Poisson Process
In probability theory, a Poisson process is a stochastic process that counts the number of events
and the time points at which these events occur in a given time interval. The time between each
pair of consecutive events has an exponential distribution with parameter λ and each of these
inter-arrival times is assumed independent of other inter-arrival times. Considering a process
for which requests arrive at random, it turns out that the density function that describes that
random process is exponential. This derivation will turn out to be extremely important for
simulations, in particular for applications modeling computer performance. A typical example
is modeling the arrival of requests at a server. The requests come from a large unknown
population, but the rate of arrival, λ, can be estimated as the number of arrivals in a given
period of time. Since it is not reasonable to model the behavior of the individuals in the
population sending the requests, it can safely be assumed that the requests are generated
independently and at random.
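Such a stream of independent random arrivals can be sketched by summing exponential inter-arrival gaps; the rate and observation horizon below are arbitrary illustration values:

```python
import random

random.seed(7)
lam, horizon = 4.0, 1000.0   # hypothetical: 4 requests/s observed for 1000 s

# A Poisson arrival stream: arrival times are cumulative sums of exponential gaps
t = 0.0
arrivals = 0
while True:
    t += random.expovariate(lam)   # inter-arrival time ~ Exp(lam)
    if t > horizon:
        break
    arrivals += 1

# The observed arrival rate estimates lam
print(arrivals / horizon)
```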
Modeling such a process can help answer the question of how a system should be designed
when requests arrive at random time points. If the system is busy, the requests queue up; if the
queue gets too long, the users might experience bad delays, or requests may be dropped if the
buffers are not big enough. From a capacity planner’s point of view, it is important to know
how to build a system that can handle requests that arrive at random and are unpredictable,
except in a probabilistic sense.
To understand and to simulate such a process, a better understanding of its randomness is
required. Considering a time axis, the random arrivals can be represented as in the figure below.
Figure 1. Random arrivals in time
If X is the random variable representing the time between two consecutive arrivals (arrows),
then, according to the PASTA theorem (Poisson Arrivals See Time Averages) [1], it is safe to
assume that all the X-es are probabilistically identical. Describing this randomness is equivalent
to finding the density function of X, the time distance between two consecutive arrows.
Figure 2. Interval of length 𝑡 divided into n intervals.
The problem described above needs to be transformed so that it can be handled with known
mathematical tools. Suppose an arbitrary interval of length t is chosen; the probability that the
time until the first arrival is longer than t is P(X > t), which is by definition 1 − FX(t), where
FX(t) is the distribution function to be calculated. If time were discrete, then by dividing the
interval between 0 and t into n sub-intervals, calculating FX(t) would reduce to calculating the
probability of no arrow in the first n sub-intervals, and then switching back to the continuous
case by taking n → ∞.
Let p be the probability that an arrow lands in any one of the n time intervals – the same p for
all n intervals, since each is as likely as any other to have an arrow in it. Then
P(X > t) = (1 − p)^n, which is the probability of no arrow in the first n intervals. When taking
n → ∞, p → 0 and np → λt. The equality np = λt says that the average number of arrows in
the n intervals – np – equals the average number of arrows calculated as λt, the arrival rate
multiplied by the length of the interval. Switching to the continuous case, it is derived that:

P(X > t) = lim(n→∞) (1 − p)^n = lim(n→∞) (1 − λt/n)^n = e^(−λt)   (1)

which is equivalent to P(X ≤ t) = 1 − P(X > t) = 0 for t < 0, and 1 − e^(−λt) for t ≥ 0, and

fX(t) = dFX(t)/dt = 0 for t < 0, and λe^(−λt) for t ≥ 0.   (2)
2.2 The exponential distribution.
The random variable X derived from the Poisson process studied in section 2.1 of this paper is
called exponential with parameter λ ( X~Exp(λ) ). The probability density function (PDF) of X
is defined as fX(t) = 0 if t < 0, and fX(t) = λe^(−λt) if t ≥ 0, which plots as in the figure below
for different values of the parameter λ.
Figure 3. PDF for 𝜆 in (0.5, 1.0, 1.5)
Integrating by parts, it is easy to demonstrate the property that ∫0∞ λe^(−λt) dt = 1, which is
actually obvious, since the sum of all probabilities of a random variable X has to add up to 1.
If X~Exp(λ), then the following properties are true [2]:
- the expected value of the random variable X, E(X) = ∫0∞ t·λe^(−λt) dt = 1/λ   (3),
- the expected value of X², E(X²) = ∫0∞ t²·λe^(−λt) dt = 2/λ²   (4), and
- the variance of X, V(X) = E(X²) − [E(X)]² = 2/λ² − 1/λ² = 1/λ²   (5).
When used in simulating computer performance models, the parameter λ usually denotes the
arrival rate. From the properties of the exponential distribution, it can be deduced that the
higher the arrival rate λ, the smaller the expected value – E(X) – and the variance – V(X) – of
the exponentially distributed random variable X.
3.1. Introduction to the Queueing Theory M/G/1 Problem – FIFO Assumption
Considering a system where demands are coming at random, but the resources are limited, the
classic queueing problem is how to describe the system as a function of random demands.
Moreover, the service times of each request are also random, as in figure 4:
Figure 4. Random arrivals with random service times
From a request’s point of view, when a new request arrives, there are two possibilities:
- it arrives and the server is available; then it keeps the server busy for a random amount of
time until the request is processed, or
- in the typical case, the arriving request finds a queue in front of it and needs to wait.
Queueing theory helps answer questions like: what is the average time that a request spends
waiting in the queue before it is serviced? The time a request must wait is equal to the sum of
the service times of every request in the queue in front of it, plus the remaining partial service
time of the customer that was in service at the time of the current request’s arrival.
Calculated mathematically, the expected waiting time of the new request would involve the
convolution of the density functions of the service time requirements of each of the requests
in the queue – which could be any number of convolutions – plus the convolution with the
remaining partial service time of the customer that was in service at the time of the arrival of
the current request. Furthermore, the number of terms in the convolution, meaning the number
of requests waiting in the queue, is itself a random variable [1].
On the other hand, looking at the time interval between the arrival and the departure of the nth
request helps in developing a recursive way of estimating the waiting times. The nth request
arrives at time Tn and, in general, waits for a certain amount of time, noted in the figure below
as Wn. This will be 0 if the request arrives when the server is idle, because the request is served
immediately. To underline the need for queueing theory: in real life, a request typically arrives
when the server is busy, and it has to wait. After waiting, the request gets serviced for a length
of time Xn, and then leaves the system.
Figure 5. Representation for calculating the waiting time, depending on the arrival of the (n+1)th customer
Recursively, when the next customer arrives, there are two possibilities:
- the arrival occurs after the nth request has already been serviced, therefore Wn+1 = 0
(explained in the grey-boxed right part of Figure 5), or
- the arrival occurs after Tn but before the nth request leaves the system. From Figure 5, the
waiting time of the (n+1)th request is deduced as the distance between its arrival and the
moment when the nth request leaves the system, mathematically represented as
Wn+1 = Wn + Xn − IAn+1, where IAn+1 is the inter-arrival time between the nth and (n+1)th
requests. This can easily be translated into a single instruction that can be computed
recursively using any modern programming language.
3.2. Performance measurements for the M/G/1 queue
If λ is the arrival rate and X is the service time, the server utilization is given by:

ρ = λ·E(X), if λ·E(X) < 1; ρ = 1, otherwise.   (6)
Moreover, if the arrivals are described by a Poisson process, the probability that a request must
wait in a queue is P(W > 0) = ρ (7), and the mean waiting time is given by the Pollaczek-
Khintchine formula [3]:

E(W) = [ρ·E(X) / (1 − ρ)] · [1 + V(X)/E(X)²] / 2   (8)
In addition, if the service times are exponentially distributed and the service follows the FIFO
principle (“first-in-first-out”, also known as FCFS, “first-come-first-served”), then the
distribution function of the waiting time is given by the following formula [1]:

FW(t) = 0, for t < 0; FW(t) = 1 − ρ·e^(−(1−ρ)·t/E(X)), for t ≥ 0.   (9)
There is no simple formula for FW(t) when the service times are not exponentially distributed,
but computer simulation can help develop such models, after validating classic models such as
the one above.
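Formula (8) is straightforward to evaluate. A small sketch (the arrival rate and mean service time are hypothetical illustration values) also shows the classic consequence that deterministic service, V(X) = 0, halves the mean wait relative to exponential service, where V(X) = E(X)²:

```python
def pk_mean_wait(lam, ex, vx):
    """Pollaczek-Khintchine mean waiting time, formula (8):
    E(W) = rho*E(X)/(1 - rho) * (1 + V(X)/E(X)**2) / 2, with rho = lam*E(X)."""
    rho = lam * ex
    if rho >= 1:
        raise ValueError("queue is unstable when utilization >= 1")
    return rho * ex / (1 - rho) * (1 + vx / ex ** 2) / 2

lam, ex = 1.0, 0.5   # hypothetical: 1 arrival/s, 0.5 s mean service -> rho = 0.5

# Exponential service times: V(X) = E(X)**2
print(pk_mean_wait(lam, ex, ex ** 2))   # 0.5

# Deterministic service times: V(X) = 0 halves the mean wait
print(pk_mean_wait(lam, ex, 0.0))       # 0.25
```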
4.1. Software simulation of the Queueing Problem
As described previously, the M/G/1 queue can be modeled using a recursive algorithm,
generating the inter-arrival times and the service times with the Inverse Transform Method [4].
The following lines, written in the BASIC programming language, simulate such an algorithm,
although almost any programming language could be used.
100 FOR I=1 TO 10000
110 IA= ? ‘inter-arrival times to be generated
120 T=T+IA ‘time of the next arrival
130 W=W+X-IA ‘recursive calculation of waiting times
140 IF W<0 THEN W=0
150 IF W>0 THEN C=C+1 ‘count all requests that wait
160 SW=SW+W ‘sum of waiting times for calculating E(W)
170 X= ? ‘service times to be generated
180 SX=SX+X ‘sum of service times for calculating Utilization
190 NEXT I
200 PRINT SX/T, C/10000, SW/10000 ‘print Utilization, P(W) and E(W)
4.2. Generating random service and inter-arrival times using the Inverse Transform
Method
Assuming that the computer can generate independent identically distributed values that are
uniformly distributed in the interval (0,1), a proper method of generating random variable
values according to any specified distribution function is using the Inverse Transform Method.
To generate a random number X, it is enough to input the computer-generated random number
on the vertical axis and to project the value onto the distribution function G, where G is the
desired distribution to be generated. Projecting the point from the graph of G further down onto
the horizontal axis delivers the desired random values distributed according to G. In practice,
this method reduces to finding the inverse of the distribution function according to which the
numbers are generated: by plugging in the computer-generated random numbers, a new random
variable is obtained which has distribution function G(u) [4]. This procedure is schematically
described in the figure below.
Figure 6. Illustration of the Inverse Transform Method
For example, consider a Poisson arrival process whose inter-arrival times are exponentially
distributed with parameter λ, where λ is the arrival rate and E(IA) = 1/λ. A value of λ = 1.6
arrivals per second is used, equivalent to an average inter-arrival time of 1/λ = 5/8 seconds.
For G(u) = 1 − e^(−λu) = R, with u ≥ 0, the Inverse Transform Method gives
G⁻¹(R) = −(1/λ)·ln(1−R),
where R is the computer-generated value. Therefore, instruction 110 from section 4.1 of this
paper becomes: 110 IA=-(5/8)*LOG(1-RND), where RND is the BASIC function that
generates values uniformly distributed between 0 and 1. Of course, any programming language
that is able to generate independent identically distributed random numbers between 0 and 1
can be used for the simulation.
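Putting sections 4.1 and 4.2 together, the BASIC program translates directly into other languages. A Python sketch of our own (variable names mirror the BASIC listing; exponential service times with E(X) = 0.5 are assumed, anticipating the first case of section 5):

```python
import math
import random

random.seed(2015)                    # fixed seed, for reproducibility only
lam, EX = 1.6, 0.5                   # arrival rate and mean service time
n = 100_000                          # number of simulated requests

T = W = SW = SX = 0.0
C = 0
X = 0.0                              # service time of the previous request
for _ in range(n):
    IA = -(1 / lam) * math.log(1 - random.random())  # inter-arrival time (ITM)
    T += IA                          # time of the next arrival
    W = max(0.0, W + X - IA)         # recursive calculation of waiting times
    if W > 0:
        C += 1                       # count all requests that wait
    SW += W                          # sum of waiting times for E(W)
    X = -EX * math.log(1 - random.random())          # service time (ITM)
    SX += X                          # sum of service times for utilization

print(SX / T, C / n, SW / n)         # utilization ~0.8, P(W) ~0.8, E(W) ~2
```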
5. Comparing the mathematical solution of the queueing problem with the computer
simulation
To illustrate the applicability of the software simulation, four different service-time
distributions are analyzed:
1. Exponential service times, with mean service time E(X)=0.5
2. Constant service time, X=0.5
3. Uniformly distributed service times between 0 and 1, X~U(0,1)
4. Discrete service times: 1/3 with probability 90% and 2 with probability 10%
For all four simulations, exponentially distributed inter-arrival times with λ=1.6 are used, as
derived in section 4.2. All calculations in the following table are done according to the formulas
presented in section 3.2.
Table 1. Comparison between the mathematical and simulated results

X | Formula of X | ρ Theory | ρ Simulation | P(W>0) Theory | P(W>0) Simulation | P(W>0.5) Theory | P(W>0.5) Simulation | E(W) Theory | E(W) Simulation
1 | -0.5*LOG(1-RND) | 0.8 | 0.799436 | 0.8 | 0.799817 | 0.65498 | 0.654924 | 2 | 1.991853
2 | 0.5 | 0.8 | 0.799724 | 0.8 | 0.799895 | NA | 0.55622 | 1 | 0.997296
3 | RND | 0.8 | 0.800048 | 0.8 | 0.800103 | NA | 0.622625 | 1.(3) | 1.332808
4 | q=RND: IF q<=0.9 THEN X=1/3 ELSE X=2 | 0.8 | 0.804667 | 0.8 | 0.799336 | NA | 0.616419 | 2 | 1.999094
All four simulations have been chosen in such a way that E(X)=0.5; they differ only in the
distribution of the service times. Since the utilization depends only on the arrival rate and the
mean service time, it is equal to 80% in all four cases. According to (7), the probability of
waiting is also equal to 80% in all four cases.
In this simulation, the mean waiting time, as deduced from the Pollaczek-Khinchine formula
(8), confirms the accuracy of the simulation model, and also gives insights into the other cases,
offering a clear approximation of the behavior of the designed system. It is interesting to
observe that the mean waiting time for exponential service times is double the mean waiting
time for constant service times, although the mean service time, the utilization and the
probability of waiting are equal in both cases.
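The theoretical E(W) column of Table 1 follows directly from formula (8). A quick numerical check (our own sketch; the variance values of the four assumed service-time distributions are computed in the comments):

```python
def pk_wait(EX, VX, lam=1.6):
    """Mean waiting time E(W) via the Pollaczek-Khinchine formula (8)."""
    rho = lam * EX
    return (rho * EX / (1 - rho)) * (1 + VX / EX**2) / 2

# the four service-time distributions of section 5, all with E(X) = 0.5
cases = {
    "exponential":       0.5 ** 2,                       # V(X) = E(X)^2
    "constant":          0.0,                            # no variance
    "uniform(0,1)":      1 / 12,                         # variance of U(0,1)
    "discrete 1/3 or 2": 0.9 * (1/3)**2 + 0.1 * 2**2 - 0.5 ** 2,
}
for name, VX in cases.items():
    print(name, pk_wait(0.5, VX))   # 2.0, 1.0, 1.333..., 2.0, as in Table 1
```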
6. Conclusions
Based on all information presented in this paper, I can conclude that computer simulation is an
important tool for the analysis of queues whose service times have any arbitrary specified
distribution. In addition, the theoretical results for the special case of exponential service times
(8) are extremely important because they can be used to check the logic and accuracy of the
simulation, before extending it to more complex situations.
Moreover, such a simulation gives insight into how such a queue would behave under different
service times. Further, I consider that it offers a methodology for looking into more
complicated cases, where a mathematical approach cannot help.
References
[1] R. B. Cooper, Introduction to Queueing Theory, Second Edition. New York: North Holland, 1981, pp. 208-232.
[2] S. Ghahramani, Fundamentals of Probability with Stochastic Processes, Third Edition. Upper Saddle River: Pearson Prentice Hall, 2005, pp. 284-292.
[3] L. Lakatos, "A note on the Pollaczek-Khinchin Formula", Annales Univ. Sci. Budapest., Sect. Comp., 29, pp. 83-91, 2008.
[4] K. Sigman, "Inverse Transform Method". Available at: http://www.columbia.edu/~ks20/4404-Sigman/4404-Notes-ITM.pdf [January 15, 2015].
[5] K. Sigman, "Exact Simulation of the stationary distribution of the FIFO M/G/c Queue", J. Appl. Spec., Vol. 48A, pp. 209-213, 2011. Available at: http://www.columbia.edu/~ks20/papers/QUESTA-KS-Exact.pdf [January 20, 2015].
AN EVALUATION OF THE FUZZY VAULT SCHEME DIFFUSION
POINTS ORDER OF MAGNITUDE
Marius-Alexandru VELCIU
Military Technical Academy, Computer Science Department, Bucharest, Romania
Advanced Technologies Institute, Bucharest, Romania
[email protected]
Victor-Valeriu PATRICIU
Military Technical Academy, Computer Science Department, Bucharest, Romania
Mihai TOGAN
Military Technical Academy, Computer Science Department, Bucharest, Romania
Abstract. The continuous growth in the use of biometrics has also significantly increased the
importance of their security, more specifically, of the biometric templates stored within each
traditional biometric system. Bio-cryptography represents a safer alternative for their use, as
its algorithms eliminate the need for storing biometric templates within the system. Still, there
are some major concerns regarding the computational complexity exhibited by bio-
cryptographic algorithms, since they perform several mathematical processing steps. Fuzzy
Vault is the most well-known bio-cryptographic algorithm. One of the most resource-
consuming processes of its encryption stage is diffusion point generation. This paper aims to
evaluate the relationship between the diffusion points' order of magnitude and the conferred
security level, as the usage of more diffusion points requires greater computational resources.
Keywords: Biometric Encryption, brute-force polynomial reconstruction, diffusion points, Fuzzy Vault, Lagrange Interpolation.
JEL classification: C02, C6.
1. Introduction
Bio-cryptography represents a relatively new domain, which proposes the combination of
classical cryptography with the usage of biometrics. The secure binding of a secret value,
usually a cryptographic key, with biometric data, in such a way that none of them can be
recovered from the resulted bio-cryptogram, in the absence of a new biometric sample,
sufficiently similar to the one used during the encryption stage, is called Biometric Encryption
and represents the most reliable operating mode used within this domain.
Bio-cryptographic systems have the potential to confer a higher degree of security, since the
only information stored within the system is the resulting bio-cryptograms. On the other hand,
their high consumption of computational resources represents an important drawback, since
the response time of a bio-cryptographic system might be inconvenient, depending on its
available resources.
For example, the Fuzzy Vault bio-cryptographic algorithm performs a large amount of
mathematical processing, including polynomial projections, diffusion point generation and
exhaustive polynomial reconstruction using Lagrange Interpolation.
This paper aims to evaluate the order of magnitude for the diffusion points generated during
the biometric encryption stage, since some papers in domain recommend at least ten times more
diffusion points than genuine ones [1][2], but without a rationale for that choice. The
evaluation process will consist of measuring resistance to brute-force attacks, as an attempt
to relate the diffusion points order of magnitude to the security level of the entire encryption
scheme.
The rest of the paper is organized as follows: Section 2 comprises the theoretical background
of this article, describing the Fuzzy Vault scheme and its main parameters; Section 3 describes
our proposed evaluation method for determining the optimal order of magnitude for diffusion
points and Section 4 presents the experimental results that were obtained.
2. Background information
Although Bio-cryptography proposes several different modes of combining the usage of
cryptography and biometrics, including the key generation or key unlocking scenarios, the most
representative bio-cryptographic mode is key binding, also known as the Biometric Encryption
process. This is due to the fact that key regeneration is an unstable process, as biometric data
tends to exhibit high variability, even for the same user's characteristics.
The most widely used bio-cryptographic algorithm is the Fuzzy Vault scheme, described in the
following section of our paper.
2.1 Fuzzy Vault biometric encryption and decryption
The Fuzzy Vault scheme represents a biometric encryption algorithm based on mathematical
principles regarding polynomial calculus in a finite field. Its security relies on the difficulty of
the polynomial reconstruction problem [3], as the initial secret key is encoded under the form
of a low-degree polynomial.
Figure 1. Fuzzy Vault enrollment and verification principles
During the enrollment stage, where biometric encryption takes place, the initial secret key is
mapped to the encoding polynomial construction, which has the following generic form:
P(x) = c_k·x^k + c_(k-1)·x^(k-1) + ... + c_2·x^2 + c_1·x + c_0 , (1)
where k represents the degree of the encoding polynomial and c_0 ... c_k represent its
coefficients, usually obtained by mapping each 16-bit chunk of the key to the corresponding
decimal value.
The evaluation of each user's digitized biometric template values on the polynomial curve
derived above will generate the associated original encoding points, also called genuine points
[4], that will represent the only information related to the initial secret key to be stored within
the system database.
Since storing the genuine points in plaintext would make the polynomial reconstruction
problem trivial (any polynomial of degree k can be reconstructed using Lagrange Interpolation
and any k + 1 points residing on its curve), the random diffusion point generation mechanism
is used in order to spread the original information across the final bio-cryptogram. An
important condition needs to be imposed on each of these point pairs: they must not belong to
the encoding polynomial curve, because they could otherwise generate false acceptance
points [3]:
RDP = {(a_i, b_i)}, with b_i ≠ P(a_i) (2)
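The enrollment steps above can be sketched as follows. This is our own toy illustration, not the authors' implementation: the prime field size, the key chunks, the template values and the tenfold chaff ratio are all assumed for the example.

```python
import random

random.seed(1)
P = 65537                            # toy prime field (assumed size)
coeffs = [0x1234, 0x5678, 0x9ABC]    # 16-bit key chunks -> coefficients, eq. (1)
k = len(coeffs) - 1                  # encoding polynomial degree

def poly(x):
    """Evaluate the encoding polynomial over GF(P) by Horner's rule."""
    y = 0
    for c in coeffs:
        y = (y * x + c) % P
    return y

template = [3, 17, 42, 101]          # toy "digitized biometric template" values
genuine = [(a, poly(a)) for a in template]

chaff = []
while len(chaff) < 10 * len(genuine):    # ten times more diffusion points [1][2]
    a, b = random.randrange(P), random.randrange(P)
    if b != poly(a):                 # condition (2): must not lie on the curve
        chaff.append((a, b))

vault = genuine + chaff
random.shuffle(vault)                # hide which points are genuine
print(len(vault))                    # 44 points stored in the bio-cryptogram
```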
During the verification stage, in order to successfully complete the biometric decryption
process and recover the initial secret key, each user must provide a fresh biometric sample,
which will be used to identify genuine points within the bio-cryptogram. If it sufficiently
resembles the one used during enrollment, polynomial reconstruction and secret key recovery
are feasible.
The checksum mechanism is required as a method of identifying the correct encoding
polynomial among all the candidates [5], since extracting diffusion points from the bio-
cryptogram will result in erroneous polynomial reconstructions.
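Polynomial reconstruction from k + 1 curve points can be sketched with Lagrange interpolation over a finite field. This is our own illustration; the prime and the sample points are arbitrary:

```python
P = 65537                       # toy prime field (assumed)

def lagrange_eval(points, x):
    """Evaluate at x the unique degree <= k polynomial through `points`, over GF(P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

secret = lambda x: (7 * x * x + 3 * x + 9) % P   # hidden encoding polynomial
pts = [(x, secret(x)) for x in (1, 2, 5)]        # any k + 1 = 3 genuine points
print(lagrange_eval(pts, 0))                     # recovers c0 = 9
```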
2.2 Fuzzy Vault specific operating parameters
The main parameters used for configuring the Fuzzy Vault encryption scheme were already
mentioned in the algorithm description and they comprise the number of genuine points, the
order of magnitude of the diffusion points and the encoding polynomial degree.
The number of genuine points represents a quantifier for the initial secret value that is encoded
and secured within the final bio-cryptogram. Its value depends directly on the biometric
template content, being influenced by the digitizing parameters that were used to derive it from
the biometric sample. It is usually denoted by n.
The order of magnitude of the diffusion points is the distinctive aspect of the Fuzzy Vault
biometric encryption scheme, as these points are randomly generated with the purpose of
securing the encoded secret information, and they directly influence the security level of the
entire scheme. It is usually denoted by m.
The encoding polynomial degree is another important parameter of the Fuzzy Vault scheme,
as it directly influences the genuine points acceptance threshold. A higher-degree polynomial
can lower the threshold value, since it adds more precision to the evaluation of genuine points
on the encoding polynomial curve [6]. It is usually denoted by k.
As already stated in the previous sections, the usage of more diffusion points confers a higher
security level, but it also affects the performance of the encryption algorithm, as their
generation is a highly resource-consuming operation. Considering that a large-scale bio-
cryptographic system would handle a large amount of user input and output, its response time
represents an important metric. This brings the designers of such a system to another
compromise, similar to the false acceptance / false rejection one, which requires finding an
optimal balance between the conferred security level and the system response time.
3. Our proposed evaluation method
In our study, we aimed to evaluate the strength of the Fuzzy Vault construction by measuring
its resistance against brute-force attacks, for different biometric encryption scenarios, with
various orders of magnitude for the diffusion points.
Our evaluation starts from the premise that the entire encryption algorithm is public, including
the encoding polynomial degree (related to the size of the secret key) and the addition of a
checksum appendix calculated over the encoding polynomial coefficients. These aspects are
essential for a successfully conducted attack, since the degree of the polynomial indicates the
size of the candidate point sets used for polynomial reconstruction, and the checksum
verification represents the stopping condition for the brute-force exhaustive search.
Basically, a brute-force attack on a Fuzzy Vault bio-cryptogram reduces to exhaustive
Lagrange interpolation over all point sets of dimension (k + 1), where k represents the encoding
polynomial degree. The attack is successful when the genuine polynomial is recovered, as
confirmed by the checksum verification, at which point the initial secret key is decrypted.
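Such an attack can be sketched on a toy vault. This is our own illustration: the field, degree, points and chaff ratio are assumed for speed, and since the paper does not fix a concrete checksum, we stand in for it with a SHA-256 hash of the interpolated polynomial's values at k + 1 fixed abscissae (which uniquely identify a degree-k polynomial):

```python
import hashlib
import itertools
import math
import random

P = 65537                                      # toy prime field (assumed)
k = 2                                          # toy encoding polynomial degree
f = lambda x: (5 * x * x + 11 * x + 17) % P    # secret encoding polynomial

def interp(points, x):
    """Lagrange-evaluate at x the unique degree <= k polynomial through `points`."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def checksum(points):
    # stand-in for the paper's checksum: hash the interpolated polynomial's
    # values at the fixed abscissae 0..k, which determine it uniquely
    vals = [interp(points, x) for x in range(k + 1)]
    return hashlib.sha256(str(vals).encode()).hexdigest()

random.seed(7)
genuine = [(x, f(x)) for x in (3, 8, 21, 30)]      # n = 4 genuine points
chaff = []
while len(chaff) < 12:                              # small m/n ratio, for speed
    a, b = random.randrange(1, P), random.randrange(P)
    if b != f(a) and all(a != x for x, _ in genuine + chaff):
        chaff.append((a, b))
vault = genuine + chaff
random.shuffle(vault)
target = checksum(genuine[: k + 1])                 # stored with the vault

attempts = 0
for combo in itertools.combinations(vault, k + 1):  # exhaustive search
    attempts += 1
    if checksum(combo) == target:                   # stopping condition
        break
print(attempts, "of", math.comb(len(vault), k + 1), "reconstructions")
```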
In order to simulate a larger-scale brute-force attack, we developed a client-server architecture
for the distribution of computational demand associated with the exhaustive polynomial
reconstruction process.
Figure 2. Brute-force attack client-server architecture
In the architecture described in Figure 2, the server acts as a distributor and centralizer of the
computational workload associated with the exhaustive polynomial reconstruction process. It
pre-computes the candidate point sets using the targeted bio-cryptogram, passes them to the
clients and centralizes their responses. Essentially, each client acts as a worker, reconstructing
the unique polynomial associated with each candidate point set and verifying it against the
checksum in order to identify the genuine polynomial. The brute-force attack completes
successfully when the first client manages to crack the encoding polynomial.
4. Experimental results
Our strength analysis for the Fuzzy Vault scheme targeted the evolution of the time required to
successfully brute-force its bio-cryptograms, using the above-described architecture and
varying the number of diffusion points used for biometric encryption. Input data was
represented by bio-cryptograms derived using our voice-based Fuzzy Vault implementation
from [7].
Our implementation was written in the Java programming language and uses TCP sockets for
client-server communication. The hardware resources used include an Intel Core 2 Duo E6750
processor-based PC, running at 2.66 GHz per core, and two AMD A4-
5000 APU quad-core laptops, running at 1.5 GHz per core. The PC was employed as the
server, and both laptops were used as clients. The brute-force mechanism was parallelized
using Java threads, meaning that we had eight active worker threads at our disposal.
Still, it is important to mention that our exhaustive polynomial reconstruction attempts started
from the statistical assumption that genuine points are distributed relatively uniformly within
the bio-cryptogram, so that roughly one point in (m + 1) is genuine, where m represents the
diffusion points order of magnitude.
This is due to the fact that even for the smallest bio-cryptogram used, the total number of points
contained was greater than 5000, resulting, for a k = 7 degree encoding polynomial, in a total
number of C(5000, k+1) = C(5000, 8) = 9.63399119864E+24 polynomial reconstructions,
infeasible to compute in a reasonable time by any existing computing infrastructure.
In other words, for a biometric encryption scheme with a diffusion points order of magnitude
of three, we can assume that one out of every four points is genuine. This way, the entire range
of values used for candidate point set combinations can be reduced to an 8·4 = 32 point subset,
for the same encoding polynomial mentioned above. Still, this approach does not guarantee
recovery of the genuine polynomial using only the partial subset, but it reduces the exhaustive
search to a finite-time one, with a decent probability of a successful attack. Table 1 summarizes
the experimental results obtained using our proposed architecture.
Table 1. Experimental results

Nr. of original points | Nr. of diffusion points | Diffusion points order of magnitude | Restrained analysis subset dimension | Nr. of total polynomial reconstruction attempts | Required time to break the bio-cryptogram
1342 | 4026 | 3 | 32 | 10,518,300 | 22 min.
1342 | 5368 | 4 | 40 | 76,904,685 | 2 h 45 min.
1342 | 6710 | 5 | 48 | 377,348,994 | 10 h 25 min.
1342 | 8052 | 6 | 56 | 1,420,494,075 | 1 day 18 h 30 min.
1342 | 9394 | 7 | 64 | 4,426,165,368 | 4 days 10 h
1342 | 10736 | 8 | 72 | 11,969,016,345 | Unevaluated
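The search-space sizes above can be checked directly. A small sketch of ours reproducing the full space and the restrained subset counts, assuming one genuine point per (m + 1) vault points:

```python
import math

k = 7                                    # encoding polynomial degree from the paper
full = math.comb(5000, k + 1)            # smallest vault: C(5000, 8) reconstructions
print(f"{full:.5e}")                     # ~9.634e+24, infeasible exhaustively

for m in range(3, 9):                    # diffusion points order of magnitude
    subset = (k + 1) * (m + 1)           # one point in (m + 1) assumed genuine
    print(m, subset, math.comb(subset, k + 1))   # matches Table 1 attempt counts
```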
Experimental results show that Fuzzy Vault bio-cryptograms are vulnerable to a statistical-
based brute-force attack, even though we were only able to evaluate the system up to a
diffusion points order of magnitude of seven. By extrapolating our attack success rates and
time durations, we can estimate that our infrastructure would require about three months to
crack an ordinary Fuzzy Vault bio-cryptogram with the recommended order of magnitude of
ten. If we take into consideration our limited computational resources and the improvised
character of our architecture, far below a large-scale grid computing infrastructure, we can
state that the Fuzzy Vault algorithm can be cracked in a matter of weeks by a statistical-based
brute-force attack.
Still, we want to emphasize the fact that our starting presumption assumed a uniform
distribution of genuine points within the bio-cryptograms, considerably reducing our analysis
subsets. Our attack success rates were possible because our input data was generated according
to this assumption. That is why we consider that, besides the order of magnitude of the
diffusion points, another important aspect is the distribution of genuine points within the
final bio-cryptogram, which should be as irregular as possible, making it harder for a statistical-
based brute-force attack to succeed.
5. Conclusions
Bio-cryptography could represent an important breakthrough in the field of biometrics,
considering the elimination of the need for storing biometric templates. Still, there are some
important concerns regarding its algorithms' computational complexity and their resistance
against bio-cryptogram-oriented attacks.
Our paper proposes an evaluation of the relationship between the order of magnitude of the
diffusion points used within the Fuzzy Vault bio-cryptographic algorithm and its conferred
security level, since many papers in the domain recommend the usage of at least ten times
more diffusion points than genuine ones.
We have implemented a statistical-based brute-force simulation infrastructure, in order to
draw some conclusions regarding the targeted biometric encryption scheme. Experimental
results showed that Fuzzy Vault bio-cryptograms are quite vulnerable to such a statistical-
based attack, denoting the fact that the most commonly recommended value for the diffusion
points order of magnitude, ten, does not confer a high security level. Moreover, we emphasized
the fact that the distribution of points within the Vault has major importance too, as its degree
of irregularity might confer more security than simply increasing the ratio of diffusion points
to genuine points.
Acknowledgment
This paper has been financially supported within the project entitled "Horizon 2020 – Doctoral and Postdoctoral Studies: Promoting the National Interest through Excellence, Competitiveness and Responsibility in the Field of Romanian Fundamental and Applied Scientific Research", contract number POSDRU/159/1.5/S/140106. This project is co-financed by the European Social Fund through the Sectoral Operational Program for Human Resources Development 2007-2013. Investing in people!
References
[1] K. Nandakumar, A. K. Jain and S. Pankanti, "Fingerprint-Based Fuzzy Vault:
Implementation and Performance”, in IEEE Transactions on Information Forensics and
Security, vol. 2, pp.744–757, December 2007.
[2] N. Radha, S. Karthikeyan, “Securing Retina Fuzzy Vault System using Soft Biometrics”,
in Global Journal of Computer Science and Technology, vol. 10, pp.13–18, September 2010.
[3] P. Sood, M. Kaur, “Methods of automatic alignment of fingerprint in fuzzy vault”, in Recent
Advances in Engineering and Computational Sciences (RAECS), pp.1-4, Chandigarh, India,
March 2014.
[4] A. Mitas, M. Bugdol, “Strengthening a cryptographic system with behavioural biometric”,
Third international conference on Information Technologies in Biomedicine, pp. 266-276,
2010
[5] C. Orencik, T. Pedersen and E. Savas, “Securing Fuzzy Vault schemes through biometric
hashing”, in Turkish Journal of Electrical Engineering & Computer Sciences, vol. 18, 2010.
[6] G. Eskander, R. Sabourin and E. Granger, “A bio-cryptographic system based on offline
signature images”, Information Sciences, vol. 259, pp.170–191, 2014.
[7] M. A. Velciu, V. V. Patriciu, “Methods of reducing bio-cryptographic algorithms
computational complexity”, at The 15th International Symposium on Computational
Intelligence and Informatics (CINTI 2014), Budapest, November 2014.
EVOLUTION OF TELECOM BSS: FROM VOICE SERVICES TO
COMPLEX IT SERVICES
Ioan DRĂGAN
Bucharest University of Economic Studies [email protected]
Răzvan Daniel ZOTA
Bucharest University of Economic Studies [email protected]
Abstract. Although less known outside strictly specialized environments, Business Support
Systems (BSS) are highly complex, and the subject of their installation in cloud
implementations is rarely addressed. This paper presents a study based on direct interviews
with representatives of telecom operators about their vision of future BSS solutions, depending
on the services they will provide. Installation in cloud environments is a subject little
approached by the providers of such solutions. Moreover, this area has a number of challenges
that require collaboration between providers and operators.
Keywords: Business Support Systems, Cloud implementations, Telecom providers
JEL classification: L86, D83, L15
1. Introduction
In one of its simplest forms, business support systems (BSS) represent the "connection point"
between external relations (customers, suppliers and partners) and an enterprise’s products and
services. Moreover, products and services are correlated with corresponding resources, like
networking infrastructure, applications, contents and factories [1].
Basically, a BSS has to handle order taking, payment issues, revenue and customer
management, etc. According to the eTOM Framework, it supports four processes: product
management, order management, revenue management and customer management [2].
- Product management supports product development, sales and management of products, offers and bundles addressed to businesses and regular customers. Product management regularly includes offering product discounts, appropriate pricing and managing how products relate to one another.
- Customer management. Service providers require a single view of the customer and need to support complex hierarchies across customer-facing applications, also known as customer relationship management. Customer management also covers partner management and 24x7 web-based customer self-service.
- Revenue management is focused on billing, charging and settlement.
- Order management involves taking and handling the customer order. It encompasses four areas: order decomposition, order orchestration, order fallout and order status management.
Figure 5. BSS Functions [3]
In order to identify the main characteristics, several market research methods have been used:
- Research of existing BSS software providers and analysis of their top-selling products
- Interviews with telecom operators' representatives
- Other existing studies based on market-available products
- The eTOM standard [3]
By following these methods, some quantitative and qualitative key performance indicators
were applied that would help us understand the operators' needs and how we can develop this
as a collaborative system [4]:
- Qualitative: user experience, ability to adopt new services, operator onboarding experience, interconnection between on-premise equipment and cloud-based software, quality of this interoperation and coordination;
- Quantitative: number of customers it can support, number of operators it can support, handling of processing peaks and data retention policy.
2. Traditional BSS services
Traditional telecom networks made money by providing technology to connect users, and
services that were derived from that technology. To achieve optimum return on investment,
network equipment and service investments were made with a very long life cycle. Products
were expected to be in service from five up to 20 years, but technology evolution proved this
assumption wrong.
Initially, BSS systems supported a limited number of services:
- Customer relationship management – used by telecom operators to manage their customers, mostly based on human intervention from qualified personnel
- Billing services – based on service usage, the customers were billed and an invoice was sent to them monthly
- Order handling – orders were mostly handled manually by operators; this was only a matter of registering and tracking them
Services supported by traditional BSS solutions:
- Voice calls – initially, voice calls were billed based on connection time, without any reference to the source or destination of the call
- SMS – billed based on the number of text messages, without any source/destination considerations
- Data – in 1991, 2G became available and also provided data services, which were a challenge to bill. Initially, data services were billed based on connection minutes
Billing services evolved into complicated price plans based on source, destination, roaming,
time-based triggers and other constraints, but billing was still based on strict usage and
monthly fees. It did not provide any means of charging other experience-based services.
Customer management systems evolved, to some degree, into self-service portals, but they still
do not offer any means of dynamically assigning services based on the customer profile.
The next generation of BSS is considered to be linked to the launch of 3G, which provided
high-speed internet connections and add-on services based on the new service layer.
3. Evolution towards multi system integrated services
Figure 6. BSS evolution [5]
The introduction of 4G/LTE networks created a multitude of new opportunities for wireless
operators, but also presented a set of new challenges. It gives telecom operators the chance to
develop new, differentiated wireless services and potentially new sources of revenue, such
as [6]:
- Content distribution: audio/video
- Software services for enterprises, e.g. mobile device management, subscription-based office solutions
- Mobile payment solutions
According to telecom operators, the top telecom industry trends in BSS systems and
architecture are being driven by service layer architecture and the need to manage customer
experiences rather than subscription services. The major changes include the following:
- Transforming from a supply-side to a demand-side vision of the business;
- Transforming from human personnel support to automated support;
- Transforming from "management as an overlay" to "management as service logic".
In the telecom operators' vision of the network of the future, the "services" will involve
experiences built from a dynamic composition of transport/connection resources, content and
processing resources, and subscriber knowledge based on the customer's behavior and location.
This new vision is also reflected in the transformation of the eTOM through the addition of
process support like the Service Delivery Framework (SDF), the Telecommunications
Interface Program (TIP) and the Integration Framework.
These will provide a fresh new dimension to the older eTOM processes and link long-cycle
service and product planning with shorter-cycle, software-driven service creation. Over the
next four years, these new views of BSS systems will dominate and eventually replace the old
static, product-based BSS systems.
The future services will be created through software processes and aimed at supporting
experiences instead of customers and will be more complex and much more numerous than
static products of the past.
New methods of handling this are being developed, because operations costs cannot be allowed
to scale proportionally to the number of services or to grow exponentially with the number of
component relationships. That means that the human-driven provisioning processes supported
by the BSS systems of the past must be replaced by automated ones, included in a self-service
package with software-based dynamic services.
Figure 7. New BSS model [7]
4. Cloud deployment concerns
In the study performed on telecom operators some common concerns have been raised
regarding the deployment of BSS solutions on cloud environments:
Multitude of BSS products providers and usually no simple migration means from one
to another.
This concern was raised by telecom operators who were forced to migrate from one vendor to another and had no options for a simple migration. Since service providers that offer cloud solutions will at best choose an existing vendor, or more probably develop a new product suitable for this kind of deployment, migrating data from on-premise solutions will be a challenge from a technical point of view, as well as a matter of costs that must be covered. Currently, no available BSS cloud service offers data migration.
No carrier grade cloud BSS services available on the market.
Existing cloud BSS services offer at most 99.9% availability and scalability up to a medium-sized MVNO (Mobile Virtual Network Operator), but none of them offers carrier-grade service availability of 99.999% (“the five 9s”, as telecom operators call it) with scalability up to tens of millions of subscribers. Since this is a “niche” market, such products have not been viable until now from a cost perspective. The only “promise of delivery” is Ericsson’s “BSS as a service”, which should be available at the beginning of 2016. It will be based on Ericsson products only and might offer some degree of compatibility with on-premise deployments.
No options to pay for occasional extensions of capacity (e.g., monthly bill runs, New Year’s Eve, other planned events).
Current pay-as-you-grow subscriptions (or pay-for-what-you-use) allow limited capacity extension and on some occasions require complex setups. Telecom operators currently use “ready to deploy” virtual machines for capacity expansion during special events such as monthly bill runs, New Year’s Eve, discounts or historical bill adjustments. In the meantime, these VMs are turned off and their capacity is used for other activities. Deploying and configuring VMs every time they are needed on a public/shared cloud infrastructure would add extra complexity and human intervention. Currently, there are no software solutions available for “event-aware”, self-defining infrastructure.
5. Conclusions
Nowadays, telecom operators are looking for new sources of revenue based on customer experience rather than fixed-price products. New business models based on revenue sharing are now used in partnerships between telecom operators and software vendors. Customers now get software for free and pay only for usage/data plans/data consumption.
The introduction of 4G/LTE networks and high-speed Wi-Fi hotspots has allowed telecom operators to distribute media content such as music or videos. Even though this media content is distributed free of charge, the data consumption is charged accordingly.
This multitude of services has to be managed by the new BSS model. Since there are numerous
combinations of services, human operators can’t handle them on a regular basis. Based on the
customer’s profile, services have to be offered as an add-on experience or via a self-service
approach. Integrating with other software or media vendors is a must and cloud deployments
would enable a separation of concerns: BSS solution suppliers would handle the enterprise
integration and telecom operators would develop new experiences for their customers.
Nevertheless, cloud deployments present numerous problems, since this niche software has not presented a financial interest for cloud solution providers. Solving these problems is ultimately a matter of processes, and of convincing telecom operators to invest time and effort and to work together with their software or media partners and cloud service providers.
Future research will be directed towards identifying the most suitable components to accommodate telecom operators’ requests and developing a proof-of-concept cloud deployment. This will be presented to the same audience in order to study it and obtain the following feedback:
Qualitative analysis of the system: does it fit your current needs? How would you
improve it? Does it provide the expected processing speed? Can it accommodate your
new services?
Quantitative analysis of the system: does it scale enough to your needs? Can it
accommodate more than one operator?
Acknowledgment
Part of the present work is done under the auspices of the doctoral program in Economic Informatics – the doctoral school of Bucharest University of Economic Studies.
References
[1] L. Angelin, U. Ollson, P. Tengroth, Business Support Systems. Internet, Available at: http://www.ericsson.com/res/thecompany/docs/publications/ericsson_review/2010/business_support_systems.pdf [Feb, 2010].
[2] eTOM – The Business Process Framework, pp. 41-49, GB921B [Mar, 2014]
[3] T. Poulos. “The BSS/OSS Best,” TMforum. Internet, Available at:
http://www.tmforum.org/ArticleTheBSSOSS/9835/home.html [Dec, 2010]
[4] C. Ciurea, “A Metrics Approach for Collaborative Systems”, Informatica Economica, Vol.
13, No. 2/2009.
[5] Evolution of OSSBSS – Telcordia workshop, Ericsson Review [Dec, 2013] [Online].
Available: http://www.slideshare.net/Ericsson/next-generation-ossbss-architecture
[6] “Integrated platform for financial transactions and electronic banking services made
available on mobile devices using the technology with widespread” – SERAFIMO, Contract
PN II nr. 3039/01.10.2008
[7] “Ericsson BSS vision” in Mobile World Congress, Barcelona, [Mar, 2013].
HIERARCHICAL DISTRIBUTED HASH TABLES FOR VIDEO
RECOGNITION IN CONTENT DELIVERY NETWORKS
Alecsandru PĂTRAȘCU
Military Technical Academy, Computer Science Department, Bucharest, Romania [email protected]
Ion BICA
Military Technical Academy, Computer Science Department, Bucharest, Romania [email protected]
Victor Valeriu PATRICIU
Military Technical Academy, Computer Science Department, Bucharest, Romania [email protected]
Abstract. Content delivery networks are large distributed systems of servers deployed in
various places across networks. Nowadays, one of their main fields of application is the
delivery of video content. Furthermore, various transformations can be applied to the video
stream as it is delivered to the end user or other video processing software. The system
presented in this paper aims to deliver video content with additional features, such as
automatic tag generation resulting from a pattern recognition process. It is based on an
improved version of Distributed Hash Tables, in an OpenChord and OpenCV implementation.
Aside from the classical hash level associated with splitting content across different nodes, we introduce a second hash level, based on tags, that links the video content to its tags.
Keywords: hierarchical distributed hash tables, content delivery network, peer-to-peer,
OpenChord, OpenCV
JEL classification: C61, C63
1. Introduction
Traditionally, P2P file sharing and P2P video streaming have functioned very differently from
one another, and have been based on distinct criteria. Peer-to-peer systems and applications are
distributed systems without any centralized control or hierarchical organization, where the
software running on each node has equivalent functionality. A review of recent peer-to-peer
applications yields a long list of common features: redundant storage, dependability, selection
of nearby servers, anonymity, search, authentication, and hierarchical naming.
Decentralized structured P2P file sharing systems have traditionally employed Distributed
Hash Tables (DHTs) that map participating nodes to a virtual geometric structure (unit circle,
torus and butterfly). Based on their position within the geometric structure, nodes become
responsible for a specific portion of the overall dataset.
A decentralized DHT retrieval system does not come without its share of difficulties. Although lookup latency can be scaled down from an unsustainable O(n) to a much more efficient O(log₂ n), each node participating in the DHT becomes an overlay router, participating in data transfer as well as in routing lookup requests, so higher resource requirements, such as memory and processor speed, must be accounted for. While recent studies have devised
network protocols with O(1) lookup time, this usually comes at the cost of more resources in
the individual nodes. Other difficulties associated with DHTs include the poor relationship
between the node position on the overlay and its actual physical (geographic) location. Two
nodes whose overlay addresses may be virtually close to one another can easily be continents
apart, from a physical standpoint. Moreover, the churn process specific to the Chord protocol,
in which participating nodes join and leave the DHT rapidly, can result in partitioning and slow
recovery of the overlay.
Multimedia streaming and file sharing have very different approaches. The goals of media
streaming strategies include minimizing jitter and latency, while maximizing bandwidth usage
and visual quality. Popular methodologies for media streaming usually employ direct client-
server relationships, or otherwise leverage the strengths of P2P transfers in the form of
multicast trees. The benefits of a tree based system include minimal protocol overhead in
individual nodes, implementation simplification, and predictable video stream arrival latencies.
It is possible for many of the difficulties associated with multicast tree transmissions to be
mitigated by the strengths of the P2P file sharing system if a video stream can, in part, be
managed like a file. Ideally, the strengths of a decentralized file system and a multicast
broadcast system can be combined to form a decentralized P2P video streaming protocol. The
DHT can be used to provide tracking for sections of the video stream and thus enable VCR-
type fast forwarding and rewind-type functionality, by allowing nodes to locate other nodes of
interest.
This paper is structured as follows. In section 2 we present some general notions about DHTs related to our topic; in section 3 we detail the proposed architecture. The implementation progress so far is presented in section 4. Section 5 concludes the paper.
2. DHT Generalities
Recent work on DHTs [1] has resulted in algorithms such as CAN, which better correlate
overlay positioning with geographical positioning. The application of a latency-aware DHT
can aid in the formation of a multicast tree, by grouping nodes that are geographically closer
to one another, in the virtual overlay network. Furthermore, a DHT structure that handles churn
well can help nodes in a multicast tree to find new parent nodes when they are disrupted by
their peers leaving the network. For this purpose, we introduce our application - an application
layer protocol for streaming and routing, that runs on top of a structured DHT overlay. The
overlay that we have chosen is Chord [8].
The Chord protocol supports just one operation: given a key, it maps the key onto a node.
Depending on the application using Chord, that node might be responsible for storing a value
associated with the key. Chord uses a variant of consistent hashing to assign keys to Chord
nodes. Consistent hashing tends to balance load, since each node receives roughly the same
number of keys, and involves relatively little movement of keys when nodes join and leave the
system.
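As an illustration of the successor rule behind consistent hashing (our own sketch, not the Chord authors' code; node names, key names and the SHA-1 choice are assumptions), the key-to-node mapping can be expressed in a few lines:

```python
import hashlib

RING_BITS = 160  # SHA-1 identifier space, as used by Chord

def ring_id(name: str) -> int:
    """Map a node name or key onto the identifier circle."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** RING_BITS)

def successor(key: str, node_names: list) -> str:
    """Return the first node clockwise from the key's position.
    A real Chord node finds this in O(log n) hops via finger tables;
    here we scan the whole ring for clarity."""
    key_id = ring_id(key)
    nodes = sorted(node_names, key=ring_id)
    for n in nodes:
        if ring_id(n) >= key_id:
            return n
    return nodes[0]  # wrap around the circle
```

Note how removing a node other than a key's successor leaves that key's assignment unchanged, which is exactly the "little movement of keys" property mentioned above.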
Video streaming over best-effort, packet-switched networks is challenging due to a number of factors, such as high bit rates and delay and loss sensitivity [2-4]. Thus, transport protocols such as TCP are not suitable for streaming applications. To this end, many solutions based on
different approaches have been proposed. From the Chord perspective, there are
implementations based on multicast and TCP-friendly protocols for streaming multimedia data
over the Internet. Multicast reduces network bandwidth by not sending duplicate packets over
the same physical link, but it is only appropriate for situations where there is one sender and many receivers.
The rapid increase in computer processing power, combined with the fast-paced improvement
of digital camera capabilities [5], has resulted in equally rapid advances in computer vision
capability and use. Computer vision software is supported by the free Open Source Computer
Vision Library (OpenCV)[6] that can (optionally) be highly optimized, by using the
commercial Intel Integrated Performance Primitives (IPP). This functionality enables development groups to deploy vision applications and provides basic infrastructure for vision experts.
3. Architecture
Chord simplifies the design of peer-to-peer systems and applications that are based on it, by
addressing the following difficult problems: load balancing (Chord acts as a distributed hash
function, spreading keys evenly over the nodes), decentralization (Chord is fully distributed,
meaning that no single node is more important than another), scalability (the cost of a Chord
lookup grows as a logarithmic function of the number of nodes, so even very large systems are
feasible), availability (Chord automatically adjusts its internal tables to reflect new nodes as
well as node failures, ensuring that, barring major failures in the underlying network, the node
responsible for a key can always be found) and flexible naming (Chord places no constraints
on the structure of the keys that it looks up).
As shown in Figure 1, the Chord-based application is responsible for providing any desired
authentication, caching, replication, and user-friendly naming of data. Chord's flat key space
eases the implementation of these features. For example, an application could authenticate data
by storing it under a Chord key derived from a cryptographic hash of the data. Similarly, an
application could replicate data by storing it under two distinct Chord keys derived from the
data's application level identifier.
Figure 1. Structure of a Chord-based distributed storage system
Figure 2. System architecture
The architectural diagram is depicted in Figure 2. Here we can see the “SuperPeer” node, which is responsible for computing the content hash value and the tag hashes; it also splits the content and distributes it to nodes, along with the hash values. The first DHT layer is ”Content
storage”, which will retain the results of pattern recognition and the tags newly obtained from
the SuperPeer server. The second DHT layer is “Search tags”, and it is used for storing search
tags. A brief overview of the entire application is presented over the course of the following
paragraphs. Our application includes the following components: an upload module, a file
splitting module, a chord interaction module and a face recognition module.
The upload utility is a simple and friendly web interface. It consists of a “Browse” button and
an “Upload” button. Clicking on the former shows a modal window that lets the user choose a
video file. Clicking on the latter starts the upload to the server and the video file analysis
process. After the file is temporarily stored on the server, a process called “VideoSplitter” is
started. Its input is the uploaded video file and its output is a file containing the following
information on each line: the name of the chunk and a unique hash that will identify the chunk
in the DHT.
After the splitting has finished, the file created by the “VideoSplitter” is read and parsed line
by line. Each chunk is stored in the DHT on the first layer. In order to save the chunks, a Chord server must first be started. This process binds to a local address on the running computer, to which every peer connects. When inserting the chunk into the DHT, a
completion function is created, which instructs the receiving peer what to do with the content
that it has been given.
When a peer from the DHT receives the chunk it asynchronously starts a process called
“FaceRecognition” that analyzes the chunk and outputs a list of names of the people appearing
in the video chunk. Each recognized person’s name is then inserted in the DHT on the second
layer. In order for the user to get the files fast, the entire DHT uses a custom video content implementation. The Chord library stores generic objects; to handle each request better, we created a custom content object. Each object stores the chunk as-is, along with the hashes of the previous and following chunks, which is useful when a user asks for specific content. All the chunks that make up the original video file are thus linked into a doubly linked list, which the system can easily walk to find all the chunks. In our proposed application, we use two DHT layers, both based on the Chord protocol.
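The completion-function mechanism described above can be sketched as a simplified Python analogue of the modified Java Chord library (all names here are hypothetical, for illustration only):

```python
import threading

class ChunkStore:
    """Toy peer-side store: runs user-supplied completion functions
    asynchronously when a chunk arrives, mirroring the behavior
    described in the text."""
    def __init__(self):
        self.chunks = {}
        self.callbacks = []  # completion functions to run per received chunk
        self.lock = threading.Lock()

    def on_chunk(self, callback):
        """Register a command to run after a data chunk is received."""
        self.callbacks.append(callback)

    def receive(self, chunk_hash: str, data: bytes):
        """Store the chunk, then fire completion functions on worker
        threads so the receive path stays responsive."""
        with self.lock:
            self.chunks[chunk_hash] = data
        threads = [threading.Thread(target=cb, args=(chunk_hash, data))
                   for cb in self.callbacks]
        for t in threads:
            t.start()
        return threads
```

A peer would register, for example, a callback that launches the face recognition job on each received chunk.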
3.1 The First Layer
The first layer is used for content storage and its peers mainly do pattern recognition and
distribute the newly obtained tags to the second layer of DHT, also based on Chord. To be part
of this layer, a peer has to meet several criteria. First, because it is required to store video content, it must have a large amount of storage space. Second, because it is required to do face recognition, which involves a lot of image processing, it must have decent computing power. However, the most important criterion is the availability and reliability of the peer.
For performance reasons, the video content uploaded by peers, which is to be stored in the system, will be split into chunks. For every chunk, a hash is computed using one of the known hash algorithms. The chunks are then inserted in the DHT, with their computed hashes as keys. Aside from the actual video content, each DHT entry will have two associated hash lists: one containing hashes of previous chunks in that video, and another containing hashes of chunks
that follow. An application retrieving video content from the system can then easily start
buffering operations for several of the chunks ahead.
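A minimal sketch of this chunking scheme follows (illustrative only; the 1 MiB chunk size and the SHA-1 hash are our assumptions, not values fixed by the system):

```python
import hashlib

CHUNK_SIZE = 1 << 20  # 1 MiB, an assumed chunk size

def make_entries(video: bytes):
    """Split a video into chunks keyed by their SHA-1 hash; each entry
    carries the hashes of the chunks before and after it, so a player
    can buffer ahead, as described in section 3.1."""
    chunks = [video[i:i + CHUNK_SIZE] for i in range(0, len(video), CHUNK_SIZE)]
    hashes = [hashlib.sha1(c).hexdigest() for c in chunks]
    entries = {}
    for i, (h, c) in enumerate(zip(hashes, chunks)):
        entries[h] = {
            "data": c,
            "previous": hashes[:i],       # hashes of earlier chunks
            "following": hashes[i + 1:],  # hashes of later chunks
        }
    return hashes, entries
```

An application retrieving the video can thus follow the "following" list of any entry to start buffering the next chunks.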
3.2 The Second Layer
The second layer is introduced to improve the search operation. A DHT entry in this layer will
contain an association between a tag (its hash key) and a list of hashes for videos in the first
DHT layer. The list of tags for each video is obtained from two main sources: user entered tags
and tags obtained in the face recognition process.
When a new [tag:video_hash] association is introduced by a certain peer, the peer computes
the hash of the tag and performs an insertion operation in the DHT. The peer responsible for
the hash key detects if the key already exists in the DHT, in which case it simply appends the
video_hash to the existing list. If it does not, it creates a new entry in the DHT, with a new list,
to which it appends the video_hash.
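The append-or-create insertion on the second layer reduces to the following sketch (the DHT is modelled as a plain dictionary, and case-folding tags before hashing is our own assumption):

```python
import hashlib

def tag_key(tag: str) -> str:
    """Hash a tag to obtain its key in the second DHT layer.
    Lower-casing first is an assumption, so 'Alice' and 'alice' match."""
    return hashlib.sha1(tag.lower().encode()).hexdigest()

def insert_tag(second_layer: dict, tag: str, video_hash: str):
    """Append the video hash to the tag's list, creating the entry
    if the responsible peer does not hold the key yet (section 3.2)."""
    key = tag_key(tag)
    videos = second_layer.setdefault(key, [])
    if video_hash not in videos:
        videos.append(video_hash)
```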
3.3 Search Operation
A typical search operation consists of two steps. First, a peer searches videos based on some
tags. For each tag, and for each combination of tags, the peer computes a hash, which is then
used to search in the second DHT layer. This yields several lists of hash keys which will be
used to search in the first DHT layer, ordered by relevance. The second step of a search
operation is to identify the videos associated with the hashes obtained in the first step.
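The two steps above can be expressed as a sketch (again modelling both layers as dictionaries; ranking by the number of matched tags is our reading of "ordered by relevance"):

```python
def search(tags, second_layer, first_layer, tag_key):
    """Step 1: resolve each tag to video hashes via the second layer,
    scoring each hash by how many requested tags it matches.
    Step 2: fetch the matching entries from the first layer."""
    score = {}
    for tag in tags:
        for vh in second_layer.get(tag_key(tag), []):
            score[vh] = score.get(vh, 0) + 1
    ranked = sorted(score, key=lambda vh: -score[vh])  # relevance order
    return [first_layer[vh] for vh in ranked if vh in first_layer]
```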
3.4 Face recognition
The face recognition module of our system uses OpenCV. This module needs to be a standalone
application that is present on every peer in the first layer. Once a peer has new video content,
it will start a face recognition job. This module attempts to identify people in the video and
return their names. Those names are then stored in the system, within the second DHT layer.
The face recognition process has two steps: face detection, which decides which parts of an
image are faces and which are not, and actual face recognition, which attempts to identify the
persons to whom the detected faces belong.
OpenCV uses a type of face detector called a Haar Cascade classifier. Given an image, which
can come from a file or from a live video stream, the face detector examines each image
location and classifies it as “Face” or “Not Face”. Classification assumes a fixed scale for the
face, say 50x50 pixels. Since faces in an image might be smaller or larger than this, the
classifier runs over the image several times, searching for faces across a range of scales. This
may seem like an enormous amount of processing, but thanks to some algorithmic tricks,
classification is very fast, even when it's applied using several scales. The classifier uses data
stored in an XML file to decide how to classify each image location.
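The multi-scale scan can be illustrated schematically (this is not the OpenCV implementation; window size, stride and scale step are assumed values, and the cascade itself is omitted):

```python
def scan_locations(width, height, window=50, scale_step=1.25):
    """Enumerate the (x, y, size) locations a fixed-scale face
    classifier would examine across a range of scales, as described
    in the text. A real detector runs its cascade at each location."""
    size = window
    while size <= min(width, height):
        step = max(1, size // 10)  # stride grows with the window size
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield (x, y, size)
        size = int(size * scale_step)
```

In OpenCV itself this scanning, together with the cascade evaluation, is what a call such as `detectMultiScale` performs internally.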
The actual face recognition is the process of figuring out whose face it is. The eigenface
algorithm is used to this end. Eigenface is a simple face recognition algorithm that's easy to
implement. It's the first face-recognition method that computer vision students learn, and it's
one of the standard methods in the computer vision field. Turk and Pentland published the
paper that describes the Eigenface method in 1991.
The steps used in eigenface are also used in many advanced methods. One reason eigenface is
so important is that the basic principles behind it - PCA and distance-based matching - appear
over and over in numerous computer vision and machine learning applications.
Here's how recognition works: given example face images for each of several people, plus an
unknown face image to recognize:
1. Compute a “distance” between the new image and each of the example faces
2. Select the example image that's closest to the new one as the most likely known person
3. If the distance to that face image is below a threshold, “recognize” the image as that person;
otherwise, classify the face as an “unknown” person.
Distance, in the original eigenface paper, is measured as the point-to-point distance (the
Euclidean distance).
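The three recognition steps map directly onto a nearest-neighbour test. In the sketch below the projection into eigenface space is omitted, faces are plain feature vectors, and the threshold value is arbitrary:

```python
import math

def recognize(unknown, examples, threshold):
    """Steps 1-3 from the text: compute the Euclidean distance to each
    example face, pick the closest, and accept it only if the distance
    falls below the threshold; otherwise report 'unknown'."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    name, best = min(((n, dist(unknown, v)) for n, v in examples.items()),
                     key=lambda item: item[1])
    return name if best < threshold else "unknown"
```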
4. Implementation so far
Up until now, we have implemented the basic structure of the application. The source is freely
available at [7]. The upload utility, the file splitting, online face recognition, face recognition
and video recorder utilities are all working properly.
During development, we ran into a series of problems. One big problem we encountered was the Chord library’s lack of support for detecting that a video file chunk has been received. In order to make it work, we had to modify the library and add this feature. To make it more
interesting, we added the possibility for the framework to run more than one specific command
when receiving data chunks. The user can customize the command in order to choose what to
do after the chunk is received, which is basically a completion function. This is executed
asynchronously, in order to make the system responsive and scalable.
Another problem was communication between the three different platforms we used: C++ for OpenCV, PHP for the upload utility and Java for the DHT network. The solution was to check every action executed by each component against the operating system when launching the processes. Also, many “try-catch” blocks were used, in order to catch errors that
could appear in such a large system implementation.
OpenCV was used to develop the automatic tagging functionality. The development was done
on a Windows 7 machine with Visual Studio 2010. Even though OpenCV has a Windows installer that places all required DLLs, the utilities that relied on it crashed when run. The solution was to recompile all the libraries from source.
5. Conclusion and Future Work
In this paper we have proposed a system consisting of a content delivery network that uses
distributed hash tables on top of Chord, for storing and providing content, to achieve an entirely
decentralized and unmanaged form of peer-to-peer video streaming. To this end, we introduced
basic file splitting and content analysis across the network.
Research on our video content distribution network is currently in an early phase, but results in this direction look encouraging. We are already considering improvements to
the whole system, especially the DHT modules. This means further development of the Chord
library, to make it more secure and reliable, and implementation of new features. We also aim
to improve path optimizations for the video chunks, so that the parts travel the least amount of
time in our DHT in order to reach the destination peer, as well as load balancing between peers,
so that the network doesn’t become unresponsive in case of massive reception of video files.
All of these are going to help us deliver video content between network nodes at much greater
speed.
Acknowledgment
This paper has been financially supported within the project entitled “Horizon 2020 - Doctoral
and Postdoctoral Studies: Promoting the National interest through Excellence, Competitiveness
and Responsibility in the Field of Romanian Fundamental and Applied Scientific Research”,
contract number POSDRU/159/1.5/S/140106. This project is co-financed by European Social
Fund through the Sectoral Operational Programme for Human Resources Development 2007 -
2013. Investing in people!
References
[1] W. Tan and A. Zakhor, “Real-time INTERNET video using error resilient scalable
compression and tcp-friendly transport protocol”, IEEE Transactions on Multimedia, vol.
1, pp. 172-186, June 1999.
[2] I. Clarke, “A distributed decentralised information storage and retrieval system”, Master’s
thesis, University of Edinburgh, 1999.
[3] F. Dabek, E. Brunskill, M. F. Kaashoek, D. Karger, R. Morris, I. Stoica, H. Balakrishnan,
“Building peer-to-peer systems with Chord, a distributed location service”, In Proceedings
of the 8th IEEE Workshop on Hot Topics in Operating Systems (HotOS-VIII),
Elmau/Oberbayern, Germany, May 2001, pp. 71-76.
[4] W. Poon, J. Lee, and D. Chiu, “Comparison of Data Replication Strategies for Peer-to-Peer
Video Streaming”, Fifth International Conference on Information Communications and
Signal Processing, pp. 518-522, December 2005.
[5] C. Y. Fang, C. S. Fuh, P. S. Yen, S. Cherng, and S. W. Chen, “An Automatic Road Sign
Recognition System based on a Computational Model of Human Recognition Processing”,
Computer Vision and Image Understanding, Vol. 96 , Issue 2, November 2004.
[6] OpenCV library, http://www.intel.com/research/mrl/research/opencv
[7] https://github.com/apatrascu/dvcdn
[8] I. Stoica, "Chord: A scalable peer-to-peer lookup service for internet applications," ACM
SIGCOMM Computer Communication Review Vol. 31, No. 4, 2001, pp. 149-160.
DRIVING BUSINESS AGILITY WITH THE USE OF CLOUD
ANALYTICS
Mihaela MUNTEAN
Bucharest University of Economic Studies, Bucharest [email protected]
Abstract. In a global and dynamic economy, businesses must adapt quickly to changes that appear continuously; they must be agile. Businesses that are agile will be able to compete in a dynamic global economy. Also, it is common knowledge that business intelligence is a crucial factor in business success. But traditional BI is at odds with frequently changing business requirements and “big data”. The purpose of this paper is to investigate how business intelligence and cloud computing can be used together to provide agility in business.
Also, the paper gives an overview of the current state of cloud-based business intelligence and
presents briefly the different models for cloud-based BI such as BI SaaS and BI PaaS. Finally,
the paper identifies the strengths and weaknesses of cloud-based BI.
Keywords: agile business, business intelligence, cloud-based business intelligence
JEL classification: C88, L86
1. Introduction
Considering the current situation, businesses must adapt quickly to changes that appear continuously in a global and dynamic economy; they must be agile. In a world that changes permanently, the leadership position is temporary; only agility creates a competitive advantage for companies. Also, there is too much information, and it changes faster than the information
systems. Information is a strategic resource for companies, and decisions must be taken based
on a huge amount of real time information, from a high variety of internal and external sources,
unstructured and structured sources. In the article “The ten dimensions of business agility” [2], Craig Le Clair, from Forrester Research, identified the main factors that influence business agility. These factors, presented in figure 1, are grouped into three categories: marketing, organization and IT technologies. Figure 1 also presents the main IT technologies that affect business agility, directly or indirectly. We can see that cloud computing and BI are two important factors that can influence the agility of a business. Also, during 2010-2015,
according to the consulting company Gartner Group [4], BI and cloud computing were considered high-priority technologies for CIOs. In 2014, the market survey included 2,339 CIOs from 77 countries, with a total of 300 billion dollars in revenue. We can observe that BI has been ranked first from 2012 until today; 50% of those interviewed considered BI technology very important for company activity (figure 2). Cloud computing ranked first in 2011 and
since 2012, it has been constantly ranked third until today. Also, the top three IT technologies
which will be subjected to massive investment in 2015 are: BI systems and advanced analytics,
cloud computing and mobile technologies. Cloud computing and business intelligence are part
of the core technological platform for digital businesses, named by Gartner Group “the nexus
of forces”. This technological platform will change the way we see society and businesses and will create new business models. It will also modify the way businesses interact with customers, change collaboration with employees and partners, and it
will improve business agility. The information will be accessible, shareable and usable by
anyone, anytime and anywhere.
Figure 1. The main factors that influence business agility
Figure 2. Cloud and BI in CIO’s technology priorities
The main characteristics of cloud computing are the following: it uses Internet technologies, offers a scalable and elastic infrastructure, offers shared resources and fault tolerance, and offers metered services accessible through a standardized interface (for example, a web client) over the Internet [1]. The services are offered on the customer’s demand and are flexible, and the resources are dynamically supplied and can be shared by a large group of clients.
[Figure 1 detail: business agility factors by category. Marketing: market responsiveness (social networks produce rapid changes in customer behavior and increase company visibility; social marketing is equivalent to direct marketing, but faster and for a very large audience; social CRM) and channel integration (online with offline channels; big data analytics, mobile technologies). Organization: knowledge dissemination (easy access to knowledge through organizational restructuring; improving collaboration using new collaboration software and social networks as collaboration software), digital psychology (advanced digital skills for business users using self-service tools for creating new services) and change management (awareness of change; behavior change: brands and services must continuously change to be competitive). IT technologies: business intelligence (self-service BI, smart data discovery, advanced analytics, in-memory), infrastructure elasticity (cloud computing), business process architecture (new working patterns, rules and templates using new BPM tools, BRM tools, BPM PaaS), software innovation (new information systems using agile development methodologies) and sourcing and supply chain (feedback mechanisms for continuous adaptation of the supply chain using agile SCM).]
Therefore, cloud computing has the potential to help BI systems to become more agile, more
flexible and more responsive to changing business requirements. The following paragraph
presents the concept of cloud-based BI and the models for cloud-based BI.
2. Cloud-based BI
According to Gartner's definition, cloud-based BI refers to "any analytics effort in which one
or more of these elements is implemented in the cloud, be it public or privately owned. … The
six elements are data sources, data models, processing applications, computing power,
analytic models, and sharing or storing of results"
[http://searchbusinessanalytics.techtarget.com/news/2240019778/Gartner-The-six-elements-of-cloud-analytics-and-SaaS-BI].
According to [3], cloud-based BI refers to "the BI
technologies and solutions that employ one or more cloud deployment models". Cloud-based
BI is a relatively new concept, which refers to the components of a BI system delivered as
services, but also to the data used by the BI system, which can be stored in the cloud. The
components of a traditional BI system (ETL instruments, data warehouse, BI tools and business
analytics solutions, business performance management tools and BPM applications) can be
delivered as cloud services. As shown in figure 3, any combination is possible, depending on
the company requirements and objectives.
Figure 3. Location of BI data and BI components
For example, data sources can be loaded on the client servers to ensure their security, and the
applications and instruments for business analysis can be stored in the cloud. However, data
security can be compromised because data must be accessed and analyzed over the Internet.
This is a hybrid deployment model for cloud-based BI. Other deployment models for cloud-
based BI are: public (all data in the cloud) and private. Cloud-based BI solutions are much
more flexible than traditional BI solutions. Therefore, a cloud-based BI solution may be a
feasible answer to the challenges of a dynamic global economy. Cloud-based BI refers to: BI
SaaS (BI software as a service), BI for PaaS, BI for SaaS and BA PaaS (business analytics
platform as a service). BI SaaS is also known as on-demand BI and includes: a) BI SaaS tools
that can be used to develop BI applications for deployment in a cloud; b) packaged BI SaaS
applications that can be deployed in a cloud environment (for example, applications for
business analysis or business performance management applications); c) data integration
[Figure 3 content: the company decides where data and BI components are stored, choosing for each layer between on-premise and cloud deployment: on-premise ETL vs. ETL in the cloud (data integration services); an on-premise DW vs. a DW in the cloud (DW as a service); on-premise business analytics/BI tools, applications for business analysis and BPM tools/applications (installed on client servers, managed and customized by clients) vs. business performance management tools/BPM applications and business analytics/BI applications/BI tools in the cloud, known as BI SaaS.]
services for BI; d) development/testing services for BI. BI for SaaS refers to the inclusion of
BI functionality in a SaaS application (for example, Microsoft Dynamics CRM Online, a SaaS
solution, includes a dashboard capability). BI for PaaS is a set of analytical services/information
delivery services integrated into a platform (PaaS) and managed by PaaS. For example, Oracle
BI Cloud Service is part of the Oracle Cloud PaaS. A platform as a service (PaaS) is "a broad
collection of application infrastructure (middleware) services (including application platform,
integration, business process management and database services)" [http://www.gartner.com/it-
glossary/platform-as-a-service-paas]. PaaS makes the development, testing, and deployment of
applications quick, simple and cost-effective. The public PaaS marketplace includes:
application PaaS (for example, force.com), integration PaaS (for example, IBM WebSphere,
BOOMI), business process management/BPM PaaS (for example, Appian), Database PaaS (for
example, database.com), business analytics PaaS, etc. A business analytics PaaS (BA PaaS)
represents a shared and integrated analytic platform in the cloud and delivers the following
services: BI services, DW services, data integration services and infrastructure services (figure
4). BA PaaS is designed for developers, unlike BI SaaS which is designed for business users.
Figure 4. BA PaaS
For example, Microstrategy Cloud Platform is a public BA PaaS that includes BI services, DW
services, data integration services that enable customers to move data into the MicroStrategy
Cloud Data Warehouse environment and infrastructure services which provide storage,
network and compute infrastructure. Also, Microstrategy offers BI SaaS known as
Microstrategy Analytics Express. According to Gartner Magic Quadrant for Business
Intelligence and analytics platforms - 2015 [5], the BI market leaders are: Tableau, Qlik,
Microsoft, IBM, SAP, SAS, Oracle, MicroStrategy and Information Builders. However, the
main leaders for BI SaaS solutions are those from the Challengers quadrant and the niche players:
Birst (Birst Enterprise Cloud BI, a pioneer in cloud-based BI), GoodData and Actuate (Open
Text), but also those from the Leaders quadrant, such as MicroStrategy, Information Builders, Oracle,
SAS and SAP (SAP BusinessObjects On Demand). Other companies that offer BI SaaS
solutions are: Jaspersoft (Jaspersoft BI for the Amazon Web Services marketplace), Cloud9 (C9
predictive sales platform), Bime (a cloud-based BI solution which allows access to sources like
Google BigQuery, Google Analytics, Salesforce, Facebook, Excel files, Amazon DB, SAP
HANA, relational databases and Web services), Host Analytics (a leader in performance
management at the corporate level in the cloud and in financial applications in the cloud), etc.
[Figure 4 content: BA PaaS comprises four service layers. Infrastructure services: storage, compute and network services; public cloud; industry standards for audit and security; management services such as security, user management and resource management. Information delivery services: interactive reporting services, ad-hoc query services, dashboard and scorecard services, self-service BI. DW services, data management services and data integration services: data acquisition from varied data sources, data movement, data transformation and data loading; relational and multidimensional database support; connectors to on-premise applications, cloud databases and cloud applications; big data support. Analytical services: OLAP services, data discovery services, advanced data visualization services; prediction, simulation and optimization services; data mining/text mining services; models as services (clustering models, Bayesian models, etc.).]
Figure 5 presents the different models for cloud-based BI together with a few vendors and
their solutions.
Figure 5. Cloud-based BI models
The main factors that determine the implementation of a cloud-based BI solution by companies
and the main problems which appear during the implementation of a cloud-based BI solution
are presented in Table 1. The importance of cloud-based BI solutions has increased significantly
every year from 2012 until today. The major driver of cloud-based business intelligence
market growth is the huge volume of structured and unstructured data. Usually, it is small
companies that want to implement a cloud-based BI solution. The departments most
interested in cloud-based BI are: the sales department (with most public BI cloud
implementations), marketing and executive management (with most private BI cloud
implementations). Also, the Gartner Magic Quadrant [5] shows that the primary interest is in
hybrid and private cloud-based BI.
[Figure 5 content: cloud-based BI models with example vendors and solutions.
- BI SaaS: applications for business analysis in the cloud (Birst, PivotLink BI, Cloud9 Analytics, IBM Cognos Analytic Applications, SAP BusinessObjects BI OnDemand on a public cloud, MicroStrategy Analytics Express, etc.); BI tools in the cloud (SAP Lumira Cloud, Tibco Spotfire Cloud, IBM Watson Analytics in the cloud, etc.); BPM services (Appian cloud-based BPM, Pega BPM, Cordys Operational Intelligence); data integration services (IBM Cast Iron, Informatica PowerCenter Cloud Edition, SnapLogic Dataflow, Talend Integration Suite on the RightScale managed platform, etc.); development/testing services (IBM Rational software delivery services, Oracle Java public cloud services, etc.).
- BI for SaaS: Oracle Transactional BI (embedded SaaS analytics in Oracle Fusion SaaS applications); Power BI in Microsoft Office 365 in the cloud, etc.
- BI for PaaS: Jaspersoft for Red Hat OpenShift PaaS; Yellowfin + Elastic Intelligence + Connection Cloud PaaS.
- BA PaaS: Teradata Aster Discovery PaaS (DW as a service, data discovery as a service, data management as a service); GoodData Open Analytics Platform; MicroStrategy Cloud Platform; Oracle Cloud Platform as a Service.
- DW as a service (Vertica, Kognitio, Amazon Redshift as a PaaS DW service) and data management services (Oracle DB public backup services, Oracle DB public cloud services).
- IaaS for BI (SAP BI for cloud deployments, SQL Server 2012 BI in the cloud on a Microsoft Azure VM, MicroStrategy Cloud, IBM Cloud Managed Services, Oracle Storage Cloud Services, Oracle Compute Cloud Services, etc.), accessed through Web clients and mobile applications.]
Table 1. The strengths and the weaknesses of cloud-based BI
Strengths:
- companies can implement a BI service/SaaS BI solution faster and more easily than an on-premise BI solution, and the costs of implementing and maintaining software/hardware are reduced;
- lower level of effort and lower risks;
- by reducing costs, small companies can use the same IT technologies as big ones; paying for the service used is financially more effective than investing in hardware and software acquisitions;
- immediate access to hardware resources, without any additional investments, which reduces the time to develop BI solutions;
- increased speed of deployment of BI solutions;
- easy sharing of information (only a Web browser is needed);
- self-service BI, which requires reduced IT skills;
- the SaaS provider is obliged to offer the latest software versions and to configure them, so the SaaS BI solution can be updated continuously, which gives much more flexibility;
- improved business agility.

Weaknesses:
- the costs and time needed for big data transfers into the cloud; in a public cloud, either only some of the data stored in the client data warehouse is replicated, or the entire data warehouse is uploaded into the cloud (a solution used if the transactional applications are in the cloud, meaning all resources are uploaded to the cloud);
- data security, protection against authenticity fraud and cyber-attacks, and security standards;
- integrating data from cloud and on-premise sources;
- the lack of a strategy for how to combine and integrate cloud services with on-premise capabilities;
- auditing (risk assessment, prevention, detection, response to attacks) is hard to accomplish because the data are outside the organization;
- legal issues (who is responsible for regulatory compliance if the cloud provider subcontracts the services of another cloud provider).
5. Conclusions
In this article, the author examined how cloud and BI can provide agility in business. Also, the
article briefly presented the different models for BI in the cloud. The combination of cloud
computing and business intelligence can provide a more flexible BI solution that aligns with
business objectives. Cloud computing has the potential to help BI to become BI for everyone.
Also, cloud analytics provides decision makers the ability to quickly make predictions and
decisions that influence performance in business.
References
[1] M. S. Gendron, Business Intelligence and the Cloud: Strategic Implementation Guide, chapter 2, pp. 23-46, chapter 7, pp. 130-148, Wiley, 2014
[2] C. Le Clair, J. Bernoff, A. Cullen, C. Mines, J. Keenan, The 10 Dimensions of Business Agility. Enabling Bottom-Up Decisions in a World of Rapid Change, Internet: http://searchcio.techtarget.com/tip/Forrester-Achieve-business-agility-by-adopting-these-10-attributes, 2013 [Dec. 2014]
[3] H. Dresner, Wisdom of Crowds Cloud Business Intelligence Market Study, Dresner Advisory Services, LLC, Internet: http://www.birst.com/why-birst/resources/analyst-reports/2013-wisdom-crowds-cloud-business-intelligence-market-study, 2013 [Nov. 2014]
[4] Gartner Executive Programs' Worldwide Survey, Business Intelligence, Mobile and Cloud Top the Technology Priority List for CIOs, Internet: http://www.gartner.com/newsroom/id/1897514, 2010-2015 [Dec. 2014]
[5] R. L. Sallam, B. Hostmann, K. Schlegel, et al., Magic Quadrant for Business Intelligence and Analytics Platforms, 23 February 2015, ID: G00270380, Internet: http://www.qlik.com/ [Mar. 2015]
ORGANIZING SECURE GROUPS OF RELIABLE SERVICES IN
DISTRIBUTED SYSTEMS
Cezar TOADER
Technical University of Cluj-Napoca
[email protected]
Corina RĂDULESCU
Technical University of Cluj-Napoca
[email protected]
Cristian ANGHEL
Technical University of Cluj-Napoca
[email protected]
Graţiela BOCA
Technical University of Cluj-Napoca
[email protected]
Abstract. This paper refers to service-oriented architectures in which replication of services is
used to increase system reliability. A protocol for secure message exchange is proposed,
based on organizing groups of replicated services and on using specific identifiers in the
proposed security protocol.
Keywords: distributed systems, services, SOA, replication, dependability, security protocol.
JEL classification: C65, C88, L86
1. Introduction
Modern companies rely on various network technologies to communicate with clients,
partners, and institutions. IT specialists need to use all their creativity to change the structure
and shape of their systems according to modern concepts such as service orientation, reliability,
readiness and so on. Analysts and IT professionals agree that key concepts and trends like
Cloud Computing and Advanced Technologies all rely on Applications Architecture, and this
means, mainly, Service-Oriented Architectures and Web-based applications [1].
Following this trend in application architecture, this paper refers to service-oriented
architectures where service replication is used in order to increase system dependability.
Dependability is a comprehensive concept which incorporates several components:
availability, reliability/safety, security, privacy, integrity and maintainability [2]. A very important
means to achieve reliability is fault tolerance. This term refers to specific techniques able to
provide a correct service even in the presence of errors. In reliable systems, replication is a
widely accepted technique to avoid system failures.
Consider now a distributed service-oriented architecture. Consider the case when the main
service and its replicas are running on separate machines. Within this system, the replicated
service initiates and controls the execution of operations on the remote services. For this reason,
in this paper the replicated service is called the service manager, denoted WS Manager, and its
replicas are called workers, denoted Worker 1, … Worker n, as in Figure 1 [3].
The communication between the service manager and the worker services can be considered a
problem of secure message distribution to a group of legitimate receivers. In the absence of
specific mechanisms to ensure secure communication, it is possible for an attacker to
intercept messages and illegally obtain data about a worker service and its operations.
Figure 1. Distribution of unencrypted messages, M, from a service manager to worker services
Figure 2. The secure distribution of messages to a group of legitimate recipients
2. The proposed approach
A protocol able to ensure the security of the messages exchanged between the service manager,
denoted WS Manager, and the worker services is mandatory for a reliable system.
The proposed protocol is required to meet the following major objectives, denoted O.1 to O.5:
O.1. The content of the messages transmitted between the system services must be
encrypted using a crypto-system with a public or private key.
O.2. An accidental decryption of a message sent to any of the service workers must not
give an attacker sufficient data to decrypt subsequent messages.
O.3. A possible compromise of a service worker must not provide sufficient
information to the attacker to compromise other workers.
O.4. A possible compromise of a service manager must not provide sufficient
information to the attacker to compromise the service manager.
O.5. The algorithm used to protect sensitive information must use a series of secret data
which are never sent over the network, such as: the identifier of the service which issued
the message, the identifier of the legitimate recipient of the message, and the identifier of
the service group that includes the transmitter and the legitimate receiver.
Consider now for analysis a distributed system based on Web services, in which replication is
used to increase reliability. Messages travel between the service manager and the service
workers. Essentially, these are XML documents complying with specific protocols [4].
In the approach proposed in this paper there are no restrictions on the XML document schemas,
which allows the proposed method to be suited for all systems based on Web services.
In the normal operating mode of the system, when all services are functional, the manager
sends out specific messages. The security of communications between the manager and the
workers is realized by using crypto-systems with public or private key.
The method proposed in this paper refers to the broadcasting of secure messages to multiple
recipients, which is called secure message broadcasting. To broadcast secure messages, a
session key is used in the case of a private key crypto-system, or a pair of session keys in the
case of a public key crypto-system. In this approach, a secure method should be established to
transmit the session decryption key to the recipient, simultaneously with the encrypted
message. This session decryption key must be protected by a locking algorithm.
The secure lock algorithm is known by both parties that exchange messages. This algorithm uses
two categories of data: (a) secret and constant data; (b) non-secret and variable data. Secret
data are never sent over the network and are part of the service configuration.
They are constant until a major reconfiguration of the system is made. The secrecy of this
information is essential. On the other hand, the algorithm also uses non-secret data, which are
transmitted over the network as part of the messages, but are changed in every message.
An attacker cannot decrypt a message intercepted on the network, because he does not know the
secret data necessary to unlock the message. And even if he finds out the non-secret data from a
certain message, this is useless for the next messages.
3. Related work
There can be different ways to implement the secure lock. There are methods for secure
distribution based on the Chinese Remainder Theorem [5]. In this paper, a different scheme for
the secure lock algorithm is proposed. It is based on Lagrange interpolation polynomials.
4. The proposed protocol
Starting from the major objectives O.1 to O.4 stated above, a specific organization of the replication
services is proposed. Each service within the system is assigned a security identifier.
Furthermore, the services are organized into security groups, by taking into account their role
within the system, or other criteria chosen by the administrators. Each group is assigned a group
security identifier, GID, established by the system administrators. Services communicate with
each other only within a security group. A message has a sender and a legitimate receiver in
the same group. Their identifiers are denoted SID (Sender ID) and RID (Receiver ID). These
identifiers, SID, RID, and GID, are used in the algorithm.
The major objective O.1 is achieved if a crypto-system with a public or private key is
implemented within the distributed system. The use of the service security identifiers, SID for
the message sender and RID for the message receiver, determines a separate protection of messages,
and thus the major objectives O.2 and O.3 are achieved. By using the group identifiers, GID,
the lock algorithm determines a separation of the security problems based on service groups. Thus
the major objective O.4 is achieved.
In this approach, after defining the security groups, the next goal is to obtain a higher degree
of security by “locking” the encrypted value of the session key used for messages. The locking
algorithm used in the distributed system is based on using different Shamir's threshold sharing
schemes [6] within different security groups. This scheme should be used in a specific way, by
taking into account the identifiers of services and group, SID, RID, and GID. The rules are
presented below. This way the major objective O.5 is achieved.
The service manager, WS Manager, broadcasts to many worker services the same encrypted
message SM together with the session key, obviously not in clear text. First, this key is
encrypted, and then the encrypted value is protected by a secure lock.
Notations in Figure 2:
WS Manager – the manager of the worker services (i.e. the replication manager);
Worker 1, … Worker 3 – the worker services (i.e. the replication executors);
C – the encrypted form of M (i.e. the message payload);
e, d – the encryption session key, and the decryption session key;
e1, e2, e3 – the secret keys of the services Worker 1, Worker 2, and Worker 3;
D1, D2, D3 – the results of the encryption of key d using the secret keys of workers;
Y1(D1), Y2(D2), Y3(D3) – the “locked” forms of the encrypted values D1, D2, D3;
SM – the secure message (having two parts, C and Y) sent over the network.
Every legitimate recipient of the messages knows all the information required to reconstruct
the session decrypting key and, subsequently, to decrypt the secure message SM.
The technique presented below is used to protect the decryption session key with a computed
lock. This locked value can be "unlocked" by any of the legitimate recipients of the message.
The operation of lock removal is based on secret reconstruction via Lagrange polynomial
interpolation, where the secret is the value of the session key for decryption. This
decryption key, denoted d, allows the legitimate recipient of the message to obtain the original
message M, in clear text. Every worker i must be able to compute the secret necessary to
remove the lock and obtain the decryption key d. The method is presented below.
5. The proposed secure lock method
The secure lock method is based on a specific algorithm. The system components compute
different parts of the same algorithm: a) the message sender knows a secret value and uses this
secret to lock information that needs to be sent over the network, and b) the message receiver
has to compute the secret value and, based on it, has to determine the decryption key; only
after this step can it determine the clear message M.
The structure of the secure message SM can be seen in Figure 2. It has two parts. The secure
lock is used to protect only the second part and does not affect the message payload, C.
Firstly, the decryption session key, d, is itself encrypted using the encryption key of the
recipient, ei, in order to obtain the encrypted value Di. This encrypted value will not be transmitted
in this form on the network. The message sender performs a "locking" procedure.
Using a specific algorithm based on Lagrange polynomial interpolation, the message sender
takes the value Di and computes the second part of the secure message SM. That message is
destined for a specific receiver, i.e. the worker i (i = 1, 2, 3). Protecting the values Di using a
specific algorithm is just like the protection given by a "secure lock".
The message receiver (the worker i, where i = 1, 2, 3), using specific information found in the
message body, performs the unlocking procedure and recovers the encrypted value Di. After
this step, the recipient decrypts the value Di using its own key ei and obtains the session key,
d, which is required to decrypt the first part of the secure message, C, as seen in Figure 2.
At the level of worker i, the process of obtaining the value Di is the process of obtaining the
secret S in Shamir's (k, n) threshold sharing scheme [6]. The worker service needs to know
some parameters. If the threshold k is higher, then the number of parameters necessary to
compute the secret S increases. A short description of the scheme is given below.
In order to share a secret among n participants, the following values have to be computed:

y_i = f(x_i), where i = 1, 2, … n   (1)

where the polynomial f(x) is given by the relation:

f(x) = a_0 + a_1 x + a_2 x^2 + … + a_{k-1} x^{k-1}   (2)

The term a_0 is the secret S, and the other coefficients are non-null integers, randomly chosen.
In order to re-compute the secret using only k parts, k < n, the Lagrange interpolation
polynomial will be used. The Lagrange polynomial is given by the relation:

L(x) = \sum_{i=0}^{k-1} y_i l_i(x)   (3)

where:
- y_i are known values, previously calculated with (1);
- l_i(x) are the Lagrange basis polynomials, given by the relation:

l_i(x) = \prod_{j=0, j \neq i}^{k-1} (x - x_j) / (x_i - x_j)   (4)

The secret S is the free term in the Lagrange polynomial given by (3).
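Equations (1)-(4) can be exercised with a short script. This is an illustrative sketch, not the paper's implementation: as an assumption not stated here, arithmetic is done modulo an arbitrarily chosen prime P (the usual setting for Shamir's scheme), and the secret value and share counts are made-up examples.

```python
import random

P = 2_147_483_647  # an arbitrary prime modulus; all arithmetic is mod P

def make_shares(secret, k, n):
    """Split `secret` into n shares, any k of which recover it (eq. 1, 2)."""
    coeffs = [secret] + [random.randrange(1, P) for _ in range(k - 1)]
    def f(x):
        return sum(a * pow(x, i, P) for i, a in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Recover the secret as L(0) via Lagrange interpolation (eq. 3, 4)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * (-xj) % P          # factor (0 - x_j)
                den = den * (xi - xj) % P      # factor (x_i - x_j)
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(123456789, k=3, n=5)
assert reconstruct(shares[:3]) == 123456789    # any 3 shares suffice
assert reconstruct(shares[2:5]) == 123456789
```

Evaluating L(x) only at x = 0 avoids building the full polynomial: the secret is exactly the free term, as the text notes. (`pow(den, -1, P)` computes the modular inverse and needs Python 3.8+.)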
The proposed approach presented in this paper is based on the case where the threshold k = 3.
The scheme becomes Shamir's threshold (3, n) secret sharing scheme.
In order to compute the value S, three pairs of values must be known:

(x_0, y_0); (x_1, y_1); (x_2, y_2)   (5)
If all these values were written in the secure message sent over the network, then too much
information would be exposed on the network. Thus, an attacker analyzing the network traffic
would be able to use these six values to compute the secret S for his own use. This
possibility is eliminated by using the method presented below.
The proposed method avoids sending over the network all the values in (5) at once.
In order to do that, a special organization of the parameters is needed:
The values x0, x1 and x2 must never be sent over the network.
These values are, in fact, the identifiers of the services and of the group, respectively:
– x0 = GID (Group ID – the Group Identifier);
– x1 = SID (Sender ID) – the Sender Identifier;
– x2 = RID (Receiver ID) – the Receiver Identifier.
The values y0, y1 and y2 are to be sent over the network, included in the secure
messages SM. These values depend on the secret S and on two other randomly chosen
coefficients, a1 and a2, which are changed at every message broadcast session.
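The organization above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: the prime modulus P and the identifier values GID, SID and RID are made-up examples, and Di is just a sample integer standing in for the encrypted session key.

```python
import random

P = 2_147_483_647  # arbitrary prime modulus for the field arithmetic

# Secret, never-transmitted identifiers (part of the service configuration)
GID, SID, RID = 1009, 2003, 3001           # x0, x1, x2 in the text

def lock(D, x0, x1, x2):
    """Lock D with f(x) = D + a1*x + a2*x^2 (mod P); a1, a2 are
    fresh random coefficients for every broadcast session."""
    a1, a2 = random.randrange(1, P), random.randrange(1, P)
    f = lambda x: (D + a1 * x + a2 * x * x) % P
    return f(x0), f(x1), f(x2)             # y0, y1, y2 -> sent on the network

def unlock(ys, x0, x1, x2):
    """Recover D = L(0) by Lagrange interpolation over (x0,y0),(x1,y1),(x2,y2)."""
    xs = (x0, x1, x2)
    D = 0
    for i in range(3):
        num, den = 1, 1
        for j in range(3):
            if j != i:
                num = num * (-xs[j]) % P
                den = den * (xs[i] - xs[j]) % P
        D = (D + ys[i] * num * pow(den, -1, P)) % P
    return D

Di = 987654321                             # encrypted session key (example value)
y = lock(Di, GID, SID, RID)                # only y0, y1, y2 travel on the wire
assert unlock(y, GID, SID, RID) == Di      # a legitimate receiver recovers Di
```

An eavesdropper who captures y0, y1, y2 is missing the x-values, so the interpolation cannot even be set up; and because a1, a2 change per session, the y-values from one message tell him nothing about the next.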
6. The protocol for building secure messages
The secure messages are built by the message sender, WS Manager, and are meant to be
decrypted by the receiver, Worker i, where i = 1, 2, 3, as shown above in Figure 2.
The layout and the steps of the proposed protocol are presented below.
Input data:
The original message M (clear text);
The encrypting session key, e (needed to obtain encrypted form C of M);
The decrypting session key, d (needed to obtain M from C);
The secret keys of receivers, noted ei , i = 1, 2, 3;
The values GID, SID, and RID, noted here as x0, x1, x2.
Output data:
The encrypted form, C, of the initial message M.
The value Yi(Di) = {y0, y1, y2}, which is basically an array of values obtained by
applying the locking algorithm to the encrypted form Di of the decrypting session key.

Algorithm: BuildSecureMessage
Class: WSManager {
    GenerateKeys();
    EncryptMessage(ClearMessage M, EncryptionKey e);
    EncryptKey(DecryptionKey d, SecretKey ei);
    Lock(EncryptedKey D, GID x0, SID x1, RID x2);
    Build(EncryptedMessage C, LockedEncryptedKey Y);
}
Implements: Class: WSManager, instance name: wsManager
STEP 1: { e, d } := wsManager.GenerateKeys();
STEP 2: C := wsManager.EncryptMessage(M, e);
STEP 3: Di := wsManager.EncryptKey(d, ei);
STEP 4: {y0, y1, y2} := wsManager.Lock(Di, x0, x1, x2);
STEP 5: SM := wsManager.Build(C, {y0, y1, y2});
7. The protocol for decrypting secure messages
In order to extract the clear text M, the message receiver must remove the lock to obtain the
encrypted form of the decrypting session key d. Further, the clear text M can be obtained. The
steps of the proposed protocol are presented below.
Input data:
The encrypted form, C, of the original message, M;
The value Yi (Di) = {y0, y1, y2} which is basically a set of values used to compute the
encrypted form Di of the session key d;
The secret keys ei , i = 1, 2, 3, of every message receiver (i.e. workers);
The values GID, SID, and RID, noted here as x0, x1, x2.
Output data:
The decrypting session key, d (necessary to obtain the message M from C);
The original message M (clear text).

Algorithm: DecryptSecureMessage
Class: Worker {
    ExtractMessageParts();
    Unlock(LockedEncryptedKey {y0, y1, y2}, GID x0, SID x1, RID x2);
    DecryptKey(EncryptedKey D, SecretKey ei);
    DecryptMessage(EncryptedMessage C, DecryptionKey d);
}
Implements: Class: Worker, instance name: worker
STEP 1: { C, {y0, y1, y2} } := worker.ExtractMessageParts(SM);
STEP 2: Di := worker.Unlock(y0, y1, y2, x0, x1, x2);
STEP 3: d := worker.DecryptKey(Di, ei);
STEP 4: M := worker.DecryptMessage(C, d);
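The two protocols can be chained end to end in a runnable sketch. This is an illustrative toy model, not the paper's implementation: the symmetric cipher is a SHA-256-derived XOR keystream (a stand-in for a real crypto-system), and the identifiers GID, SID, RID, the worker key, the message M and the prime modulus are all made-up example values.

```python
import hashlib
import random

P = 2_147_483_647
GID, SID, RID = 1009, 2003, 3001           # secret identifiers, never transmitted

def xor_stream(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (not secure)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def lock(D, *xs):
    a1, a2 = random.randrange(1, P), random.randrange(1, P)
    return tuple((D + a1 * x + a2 * x * x) % P for x in xs)

def unlock(ys, *xs):
    D = 0
    for i in range(3):
        num, den = 1, 1
        for j in range(3):
            if j != i:
                num = num * (-xs[j]) % P
                den = den * (xs[i] - xs[j]) % P
        D = (D + ys[i] * num * pow(den, -1, P)) % P
    return D

# --- BuildSecureMessage (sender: WS Manager) ---
e = d = random.randrange(1, 1 << 30)       # STEP 1: symmetric toy key, so e == d
M = b"replicate operation on worker"
C = xor_stream(e, M)                       # STEP 2: C = encrypted form of M
e_i = 555_555                              # secret key of Worker i (example)
D_i = d ^ e_i                              # STEP 3: toy encryption of d with e_i
ys = lock(D_i, GID, SID, RID)              # STEP 4: y0, y1, y2
SM = (C, ys)                               # STEP 5: the secure message

# --- DecryptSecureMessage (receiver: Worker i) ---
C_rx, ys_rx = SM                           # STEP 1: extract message parts
D_rx = unlock(ys_rx, GID, SID, RID)        # STEP 2: remove the secure lock
d_rx = D_rx ^ e_i                          # STEP 3: decrypt with worker's key
assert xor_stream(d_rx, C_rx) == M         # STEP 4: recover clear text M
```

Note that only C and the y-values (SM) cross the network: the identifiers used as interpolation abscissas and the worker's key stay in the local configuration, matching objective O.5.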
8. Conclusions
A protocol for increasing the security of the message exchange between a service manager and
the managed worker services is proposed in this paper. The problem is defined and the logical
links between the proposed solution and system dependability are shown.
Major objectives were defined, and an original approach based on secure groups of services is
presented. The proposed objectives are achieved one by one. The paper gives details about the
security protocol and shows the security identifiers used in the locking algorithm.
The proposed protocol for replication systems organized as groups of services keeps a high level
of security for the message exchange between legitimate services. The existence of security
groups and of a security protocol which uses secret identifiers never sent over the network brings
many benefits to the security strategies of replication-based distributed systems. The security
protocol presented in this paper could be a significant contribution to increasing system
dependability.
References
[1] R. Altman, K. Knoernschild, "SOA and Application Architecture Key Initiative Overview," Gartner, Internet: https://www.gartner.com/doc/2799817, July 16, 2014.
[2] A. Avizienis, J.C. Laprie, B. Randell, "Fundamental Concepts of Dependability," in: Research Report no. 1145, LAAS-CNRS, 2001.
[3] C. Toader, "Increasing Reliability of Web Services", Journal of Control Engineering and Applied Informatics, Vol. 12, No. 4, pp. 30-35, ISSN 1454-8658, Dec. 2010.
[4] T. Bray, J. Paoli, C.M. Sperberg-McQueen, E. Maler, F. Yergeau, Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C Recommendation, Internet: http://www.w3.org/TR/xml/, Nov. 26, 2008.
[5] G.H. Chiou and W.T. Chen, "Secure Broadcasting Using the Secure Lock", IEEE Transactions on Software Engineering, Vol. 15, No. 8, pp. 929-934, Aug. 1989.
[6] A. Shamir, "How to share a secret", Communications of the ACM, Vol. 22, Issue 11, pp. 612-613, Nov. 1979.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
50
AN OVERVIEW STUDY OF SOFTWARE DEFINED NETWORKING
Alexandru STANCU
University “POLITEHNICA” of Bucharest
Simona HALUNGA
University “POLITEHNICA” of Bucharest
George SUCIU
University “POLITEHNICA” of Bucharest / BEIA
Alexandru VULPE
University “POLITEHNICA” of Bucharest
Abstract. Recent technological advances have exposed several limitations of traditional networks. This situation led the networking industry to rethink network architectures in order to solve these issues. This paper presents a short overview and history of the new paradigm that appeared in this context, Software Defined Networking (SDN). First, the concept is briefly presented. After that, the ideas that form the foundations of SDN are introduced; they emerged from previous research in the industry: active networks, data and control plane separation and the OpenFlow protocol. Afterwards, the motivation behind SDN is presented by outlining the limitations of traditional networks and the concepts that promote this new paradigm. Next, we present a few details about the SDN paradigm and some techno-economic aspects. The paper concludes with future research directions, which are related to the definition of use cases for Wireless & Mobile applications and possibly the development and even implementation of SDN applications that optimize wireless networks.
Keywords: Internet of Things, OpenFlow, Open Networking Foundation, programmable
networks, Software Defined Networking.
JEL classification: O30, O31, O33
1. Introduction
Software Defined Networking (SDN) is a new paradigm in networking that has its roots in the work and ideas behind the OpenFlow project, which was started at Stanford University around 2009 [1]. Many of the concepts and ideas used in SDN, however, have evolved over the last 25 years and now fit perfectly into this new paradigm, which proposes to change the manner in which networks are designed and managed.
Software defined networks represent a network architecture where the forwarding state of the
data plane is managed by a distant control plane, decoupled from the data plane. This network
architecture is based on the following four concepts [2]: (i) data and control plane decoupling;
(ii) forwarding decisions are based on data flows, instead of the destination address; (iii) the
control logic moves to an external entity, the SDN controller, that runs a network operating
system; (iv) the network is programmable through software applications that run in the network
operating system and interact with the devices from the data plane.
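Concept (ii), forwarding based on data flows, can be sketched as a match-action table: rules match on several header fields of a flow rather than only the destination address. The rule format, field names and actions below are invented for illustration and do not reproduce any real controller API.

```python
# Sketch of flow-based forwarding (concept (ii) above): a rule matches
# on several header fields, not just the destination address.
# All names, fields and actions are invented for this illustration.

from typing import Optional
from dataclasses import dataclass

@dataclass(frozen=True)
class Match:
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None

    def covers(self, pkt: dict) -> bool:
        # A None field acts as a wildcard; set fields must match exactly.
        return all(
            getattr(self, f) is None or getattr(self, f) == pkt.get(f)
            for f in ("src_ip", "dst_ip", "dst_port"))

# Flow table installed by a (hypothetical) controller: first match wins.
FLOW_TABLE = [
    (Match(dst_port=80), "forward:port2"),       # all web flows
    (Match(dst_ip="10.0.0.5"), "forward:port3"),
    (Match(), "send_to_controller"),             # table-miss entry
]

def forward(pkt: dict) -> str:
    for match, action in FLOW_TABLE:
        if match.covers(pkt):
            return action
    return "drop"

print(forward({"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "dst_port": 80}))
```

Because the decision is keyed on the whole flow description, all packets of one flow receive the same treatment, while packets matching no installed rule fall through to the controller, which is exactly the split between data plane and control plane described above.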
SDN emerged to satisfy some of the needs that appeared in traditional networks: innovation in the management of networks and ease of introducing new services into the network [3]. These needs are not new in networking; they were studied in the past as well, but now, through SDN, they can be satisfied in a viable manner that does not imply major changes to the existing network infrastructure.
In the next section a brief history of SDN is presented, along with the ideas that form the foundations of this new paradigm and the research that led to the emergence of these ideas: active networks, the separation of the data and control planes and the OpenFlow protocol. The
following section illustrates the motivation behind SDN and the limitations of the traditional
networks that caused the rethinking of network architectures. The penultimate section specifies
a few details about the SDN concept, such as its fundamental abstractions and a few techno-
economic aspects. The article concludes with some future research directions.
2. History of SDN
SDN history can be divided into three stages, each influencing this new paradigm through the concepts it proposed [1]:
Active networks – introduced programmable functions in the network, enhancing the degree of innovation (mid-1990s to early 2000s);
Data and control plane separation – led to the development of open interfaces between the data and the control planes (approximately 2001 – 2007);
Development of the OpenFlow protocol and network operating systems – represents the first large-scale adoption of an open interface, making the data and control plane separation practical and scalable.
2.1 Active networks
Active networks are networks where switches can perform computations or operations on the data packets. They introduced a radical concept for the control of the network, proposing a programming interface that exposed resources of individual nodes in the network and supported the building of specific functionalities that could be applied to a subset of the packets transiting that node [1].
The main motivation for the active networks was that of accelerating innovation. At that
moment, introducing a new concept, service or technology in a wide area network, such as the
Internet, could take up to ten years, from the prototype phase until large scale implementation.
It was intended for the active nodes from the network to allow routers/switches to download
new services in the existing infrastructure, while coexisting without problems with legacy
devices in the network.
Even though active networks were not implemented on a large scale, some of the ideas they proposed were taken up by SDN [1]:
Programmable network functions, which ease innovation;
Network virtualization and the possibility of demultiplexing packets to software programs based on their headers;
Attention to middleboxes (e.g. firewalls, deep packet inspection devices etc.) and the manner in which their functions are composed.
2.2 Data and control planes separation
Networks have had, since the beginning, integrated data and control planes. This led to some disadvantages: difficulties in network management, in debugging the network configuration and in controlling or predicting the forwarding behavior.
The first attempts at separating the data and control planes date from the 1980s [2]. At that time, AT&T proposed discarding in-band signaling and introducing a Network Control Point (NCP), thus achieving the separation of the data and control planes. More recent initiatives also propose this separation: Ethane [4], NOX [5], ForCES [6], OpenFlow. These have the advantage of not needing substantial changes in the forwarding devices, which translates into an easier adoption by the networking industry.
The ideas behind the data and control plane separation that were taken up in SDN are [1]:
A logically centralized control that uses an open interface to the data plane;
Distributed state management.
2.3 OpenFlow protocol and network operating systems
Before the OpenFlow protocol emerged, the ideas behind SDN suffered from a tension between the vision of completely programmable networks and the pragmatism that would allow deployment in real networks. OpenFlow found a balance between these two objectives by being implementable on existing devices (existing hardware support) while offering more functions than its predecessors. Even though relying on existing hardware implied certain limitations, OpenFlow was immediately ready for deployment on existing production networks.
Initially, the OpenFlow protocol was deployed on campus networks. After its success in this type of network, it began to be deployed in other types of networks, such as data centers.
The ideas that emerged in SDN from the research conducted while developing the OpenFlow protocol are:
Generalizing network devices and their functions;
The vision of a network operating system;
Techniques for managing distributed state.
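The vision of a network operating system can be hinted at with a toy "learning switch" application running on top of a controller; the callback shown is a hypothetical simplification invented for this sketch, not the actual OpenFlow or NOX interface.

```python
# Toy "learning switch" application on top of a hypothetical controller
# callback API -- illustrative only, not the real OpenFlow/NOX interface.

class LearningSwitch:
    """Learns which port each MAC address lives on; floods unknowns."""

    def __init__(self):
        self.mac_to_port = {}

    def on_packet_in(self, in_port, src_mac, dst_mac):
        # Learn the sender's location from the incoming packet.
        self.mac_to_port[src_mac] = in_port
        # Forward to the learned port, or flood if the destination
        # has not been seen yet.
        return self.mac_to_port.get(dst_mac, "FLOOD")

sw = LearningSwitch()
print(sw.on_packet_in(1, "aa:aa", "bb:bb"))  # destination unknown yet
print(sw.on_packet_in(2, "bb:bb", "aa:aa"))  # "aa:aa" was learned on port 1
```

The application holds only abstract state (a MAC-to-port map); pushing the resulting forwarding decisions down to switches as flow rules would be the job of the network operating system underneath it.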
3. Motivation behind SDN
The explosion of mobile devices and the content they access, the introduction of cloud services and also server virtualization led the networking industry to rethink the architecture of networks [7]. Limitations of traditional networks were thus identified and, together with the needs determined by general technological evolution, led to the conclusion that a new paradigm in networking is necessary: Software Defined Networking.
Satisfying today's market requirements with traditional network architectures is almost impossible. Operational costs for such networks are very high, because network devices must be managed individually when implementing new policies and because equipment from different vendors must be handled differently. In addition to the operational costs, the capital expenditures for a network have also increased, because of the middleboxes that must be introduced into the network to ensure security or to perform traffic engineering tasks. Some of the limitations of traditional networks that led to the emergence of this new paradigm are [7]:
Complexity – leads to network stagnation. In order to introduce a new device in the network, administrators need to reconfigure several hardware and software entities using management tools, while considering several factors, such as the software versions of the devices, the network topology etc. This complexity of traditional networks therefore implies slow evolution and difficulty in innovating, as operators avoid changes that risk service disruption. It also leads to an inability of the network to adapt dynamically to changing traffic patterns, user requests or applications;
Vendor dependency – corporations nowadays require a fast response for the changes in
business or customer needs. However, this fast response is delayed by the product cycle
of the equipment vendors, which can span over multiple years;
Scalability issues – offering more bandwidth than the connection can support, or over-subscription, based on predictable traffic patterns, is no longer a solution in today's networks. In big data centers that rely on virtualization, traffic patterns are very dynamic and thus hard to predict. This also implies configuring hundreds of network elements, which is impossible to do manually;
Network policies inconsistency – for implementing network-wide policies in a
production network, up to thousands of devices need to be configured. Thus, because
of this complexity, ensuring such policies for quality of service, security or access is
very difficult.
There are some ideas that promote this new networking paradigm, such as:
The need for higher bandwidth – nowadays large data volumes require parallel
processing on up to thousands of interconnected servers, which need direct connections.
The growth of these data volumes translates into the need of higher bandwidth from the
networks. The data center operators need to create a network that scales to remarkable
dimensions and maintains connectivity between any two network nodes;
The need for flexible access to IT resources – lately, employees demand to be able to join the enterprise network through a series of heterogeneous devices, such as laptops, smartphones or tablets;
Cloud services development – enterprises began utilizing cloud services, both public
and private, leading to a massive growth of this type of services. Companies now desire
access to applications, infrastructure and other IT resources on demand and at any time.
In order to implement these demands, scalability is needed for computing power,
storage and also for network resources and it is advisable to be able to operate these
modifications from a common point and using common tools;
Dynamic traffic models – with the emergence of data centers, traffic patterns have changed drastically. Newer applications access many more servers and databases, implying an avalanche of east-west traffic between different machines before the information gets back to the user through a traditional north-south traffic pattern.
Software defined networks are proving to be well suited to the context of the emergence of a new concept, the Internet of Things (IoT), satisfying exactly its needs: the need for higher bandwidth, dynamic network reconfiguration and a simplified network architecture that facilitates innovation [8].
4. Software Defined Networking paradigm
SDN represents a new paradigm in networking architectures and it has four fundamental
concepts: (i) the decoupling of the data and control planes, (ii) forwarding decisions are based
on data flows, instead of destination address, (iii) the control plane moves into an external
logical entity, the SDN controller, which runs a network operating system and (iv) the network
is programmable through software applications.
Software defined networks can be defined through three fundamental abstractions [2], as
illustrated in Figure 8.
Forwarding abstraction;
Distribution abstraction;
Specifications abstraction.
Ideally, the forwarding abstraction allows any forwarding behavior that the software applications desire, with the support of the control plane, without needing awareness of the hardware capabilities of the underlying infrastructure. An example of such an abstraction is the OpenFlow protocol.
The distribution abstraction shields SDN applications from distributed-state issues in the network, transforming the problem of a distributed control plane, as in traditional networks, into that of a logically centralized control plane. These issues are solved through a common distribution layer, the network operating system.
The specification abstraction is the ability of a software application to express a certain behavior of the network without also being responsible for implementing this behavior. It can be achieved through virtualization solutions and through network programming languages.
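The specification abstraction can be hinted at with a toy "policy compiler": the application states what the network should do, and a lower layer works out where the corresponding rules go. All names, the topology format and the rule format are invented for this sketch.

```python
# Toy illustration of the specification abstraction: the application
# states a high-level policy; a lower layer compiles it into per-switch
# rules. All names and formats are invented for this sketch.

def compile_policy(policy, topology):
    """Turn abstract (src, dst, action) policies into per-switch rules."""
    rules = {}
    for sw, hosts in topology.items():
        for (src, dst, action) in policy:
            if dst in hosts:  # install the rule at the destination's edge switch
                rules.setdefault(sw, []).append(
                    {"match": {"src": src, "dst": dst}, "action": action})
    return rules

topology = {"s1": ["h1"], "s2": ["h2"]}            # which hosts hang off which switch
policy = [("h1", "h2", "allow"), ("h2", "h1", "deny")]
print(compile_policy(policy, topology))
```

The application never mentions switches at all; the compilation step decides that each policy belongs on the destination's edge switch, which is exactly the "express the behavior, do not implement it" split described above.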
From an economic point of view, SDN offers a reduction of network costs. In traditional networks, both operational costs and capital expenditures have lately increased: the latter because, as the needs for security and traffic engineering appeared, middleboxes had to be purchased and introduced into the networks; the operational expenditures because network equipment comes from different vendors, each with different management methods and tools. SDN provides a solution to these increasing costs: it is cheaper to hire a team of software engineers to develop complex software applications for controlling the network than to purchase devices that offer the same capabilities in a proprietary manner.
Figure 8 - SDN architecture and its fundamental abstractions [1]. (The figure shows applications running on top of the network operating system (SDN controller) through an open northbound interface; the controller, which holds the global view of the network and forms the control plane, manages the forwarding devices of the network infrastructure - the data plane - through an open southbound interface.)

5. Conclusions and discussions
SDN represents the most important paradigm to emerge in networking in recent years. It appeared as a solution to the pressing needs that surfaced in the industry, caused by the
evolution of other technologies, such as cloud infrastructures, mobility, big data applications
or the concept of the Internet of Things [9]. Thereby, SDN tries to address and satisfy these needs by accelerating innovation in the network and by simplifying and automating the management of large networks.
Because SDN is a research field that is not yet mature, standards and use cases for this paradigm are still emerging. While, for the southbound interface, the OpenFlow protocol is accepted by the majority as the most suitable, for the northbound interface a unanimously accepted proposal has not yet emerged. An interesting direction is represented by the research of the "Wireless & Mobile" Working Group of the Open Networking Foundation (ONF). This organization promotes the adoption of SDN through the development of open standards.
A future research direction is the study of use cases and, subsequently, the development and implementation of SDN applications in the context of optimizing wireless transport networks. These applications should not be mistaken for Software Defined Radio
(SDR). SDR is intended for replacing classical radio hardware with radios that are
reconfigurable through software [10]. In contrast, SDN applications that reside in the Wireless
& Mobile category run over the SDN network operating system and are able to reconfigure the
elements of a wireless network through the OpenFlow interface.
Acknowledgement
The work has been funded by the Sectoral Operational Programme Human Resources
Development 2007-2013 of the Ministry of European Funds through the Financial Agreement
POSDRU/159/1.5/S/134398 and supported in part by the SARAT-IWSN project.
References
[1] N. Feamster, J. Rexford and E. Zegura, "The Road to SDN: An intellectual history of
programmable networks," ACM Queue, vol. XI, no. 12, 2013.
[2] D. Kreutz, F. M. V. Ramos, P. Verissimo, C. E. Rothenberg, S. Azodolmolky and S.
Uhlig, "Software-Defined Networking: A Comprehensive Survey," 2014.
[3] J. Tourrilhes, P. Sharma, S. Banerjee and J. Pettit, "The Evolution of SDN and OpenFlow:
A Standards Perspective," ONF, 2014.
[4] M. Casado, M. Freedman, J. Pettit, J. Luo and N. McKeown, "Ethane: Taking Control of
the Enterprise," in SIGCOMM, Kyoto, 2007.
[5] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado and N. McKeown, "NOX: towards
an operating system for networks," Comp. Comm. Rev., 2008.
[6] A. Doria, J. Salim, R. Haas, H. Khosravi, W. Wang, L. Dong, R. Gopal and J. Halpern,
"Forwarding and Control Element Separation (ForCES) Protocol Specification," Internet
Engineering Task Force, 2010.
[7] Open Networking Foundation, "Software-Defined Networking: The New Norm for
Networks," White Paper, 2012.
[8] R. Vilata, R. Munoz, R. Casellas and R. Martinez, "Enabling Internet of Things with
Software Defined Networking," CTTC, 2015.
[9] G. Suciu, A. Vulpe, O. Fratu and V. Suciu, "Future networks for convergent cloud and
M2M multimedia applications," in Wireless Communications, Vehicular Technology,
Information Theory and Aerospace & Electronic Systems (VITAE), 2014.
[10] B. Bing, "Software-Defined Radio Basics," IEEE Computer Society, 2005.
CONSUMER RIGHTS IN THE CONTEXT OF
BIG AND OPEN DATA SOCIETY
Lorena BĂTĂGAN
The Bucharest University of Economic Studies, Romania [email protected]
Cătălin BOJA
The Bucharest University of Economic Studies, Romania [email protected]
Mihai DOINEA
The Bucharest University of Economic Studies, Romania [email protected]
Abstract. We live in a digital world, and access to information is ubiquitous whether we are at work, at home or shopping. Although the costs of accessing sources of information are becoming smaller, consumers still encounter a lack of information about the products and services they purchase when trying to make their best decision. Consumers have the right to be informed so they can make good decisions about their acquisitions. This paper examines the relationship between the efforts made by EU structures and by organizations at national level towards an Open Data society and a data-driven economy, in which real-time access to knowledge allows consumers to find out almost immediately everything they can about what they buy and how they can be protected against frauds of all sorts. The paper highlights the advantages of integrating technologies and digital standards into public policies for protecting consumer rights.
Keywords: Big Data, innovation, SMEs, consumer rights, cloud processing.
JEL classification: O31, O33, L25
1. Introduction
We are living in a fully digitalized era in which every economic process can be automatically recorded and measured, and its information can be delivered instantaneously to the concerned parties. Moreover, technology development costs are lowered each day, making possible the production of various sensors and autonomous electronic boards that can be embedded easily in consumer and industrial products, allowing constant monitoring of their quality characteristics.
Since the adoption of the Consumer Protection Guidelines by the United Nations in 1985 and their review in 1999 (CI - Consumers International, 2013), the "right to be informed" through accurate information about goods and services and access to proper information has been one of the consumer's seven basic legitimate needs.
Because of the globalization of production chains, in the food industry and beyond, the many parties involved in the process of producing a single product or a category of products distribute their goods across the entire globe, making it harder to track them and to monitor their quality. For example, in the United States, 60% of fruits and vegetables on the market
are imported from more than 150 countries [1]. Production chains became so complex and so
large that it is very expensive and time consuming to record all the related information and it
is impossible to make it accessible to the consumer. Even providing a reduced set of
information becomes a challenge, given the limited space that is available on printed labels.
The large volume of information gathered for a product will require a redesign of product labels. Today there are different policies for product labeling, based on text and visible logos, like the ones for ecological products [2]. Also, the large number of industry and economy
parties that influence directly and indirectly the production, delivery, storage and selling stages
for a particular product requires an interoperability framework between e-government services
and the private sector ones [3].
The low level of adoption of information technology by agricultural farms in many countries, mostly emerging and developing ones [4], is an obstacle to reaching this objective. Despite that, the benefits for both producers and society are too valuable not to be included in strategies for future development [5].
2. Big and Open Data
The worldwide preoccupation with open data started in January 2004, when ministers of science and technology of the OECD - Organization for Economic Cooperation and Development - countries, which include most developed countries of the world, met in Paris and discussed the need for international guidelines on access to research data [6]. The specific aims and objectives of these principles and guidelines [6] are:
to inform about the meaning of openness and sharing of research data among the
public research communities within member countries and beyond;
to promote the exchange of good practices in data access and sharing;
to inform the public about the potential costs and benefits;
to highlight the regulations regarding data access and sharing;
to establish a framework of operational principles for research data access
arrangements in member countries;
to inform the member countries on how to improve the international research.
In 2007 the OECD signed a declaration that essentially states that all publicly funded archive data should be made publicly available. Earlier, in 2006, the OKF - Open Knowledge Foundation [7] had proposed a definition of open content, namely: "A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike". With respect to data, as stated by [8],
it is required that a dataset be accessible (usually by being online) at no cost and with no
technical restrictions, to facilitate its reuse. The European Union (EU) has for many years been interested in the issue of open data as a resource for innovative products and services and as a means of addressing social challenges and fostering government transparency [9]. It has been observed, as highlighted in the EU's report, that a better use of data, including
government data, can help to power the economy, serving as a basis for a wide range of
information products and services and improving the efficiency of the public sector and of
different segments of industry [9], [10].
Because the EU wants to promote openness, in the Open Data Charter - the report regarding open data - it committed to [9]:
identify and make available core and high-value datasets held at EU level;
publish data on the EU Open Data Portal;
promote the application of these principles of Open Data Charter in all 28 EU
Member States;
support activities, outreach, consultation and engagement;
share experiences of work in the area of open data.
From the beginning, the most important use of open data has been in the governmental area. Based on Open Government Data, firms and institutions can freely reuse available government information in innovative ways. Vivek Kundra from Harvard College highlighted in his paper [11] that the Weather Channel, an American television network, and Garmin, a firm that develops navigation, aviation and marine technologies (with a market cap of over $7 billion at the end of January 2013), were built using raw government data. In this case we can say that open data has significant importance for both the public and private sectors (table no. 1).
Table 1. Economic Benefits of Open Data
Adapted from: The Open Data Economy - Unlocking Economic Value [12]

Public sector:
- Drive revenue through multiple areas: increased tax revenues through increased economic activity; revenues through selling high-value-added information for a price.
- Cut costs and improve efficiency: reduction in transactional costs; increased service efficiency through linked data.
- Generate employment and develop skills: create jobs in the current challenging times; encourage entrepreneurship.
- Build a transparent society: transparency is essential for the public sector to improve its performance.

Private sector:
- Drive revenue through multiple areas: drive new business opportunities.
- Cut costs and improve efficiency: reduced costs by not having to invest in the conversion of raw government data; better decision making based on accurate information.
- Generate employment and develop skills: gain a skilled workforce.
- Build a transparent society: transparency is an important raw material for innovative new business ventures.
The availability of open data has grown significantly in recent years [13]. Some of the main motivations for using open data are that open access to publicly funded data provides greater returns on the public investment [14], improves business efficiency and can help customers access large quantities of datasets.
Open data is often indispensable for public system development and service delivery, but can also be valuable for other areas, such as traffic information, healthcare, education or markets [14]. It is essential for the market to use open data because this will offer: diversity of opinion (each person has their own opinions), independence (people's opinions are independent), decentralization (each person is able to draw conclusions using local knowledge) and aggregation (similar solutions can be aggregated).
3. Open data impact
The ODI - Open Data Institute has highlighted the impact of open data in its Open Data Barometer 2013 global report. It mentions [15] that in cities where open data has been used, important changes can be found in areas like government transparency, government efficiency, environmental sustainability, social inclusion, economic growth and entrepreneurial activity (figure no. 1).
Figure 1. Open data impact
Data source: Open Data Barometer, 2013 Global Report– ODI
Open data initiatives can be found in different domains. CCA - Capgemini Consulting Analysis, through its vice-president Dinand Tinholt [12], highlighted that open data can enable an increase in business activity by allowing the creation of new firms, new products and services. A number of countries, including the USA, France, the UK, Denmark, Spain and Finland, have observed that open data has a tangible impact on the volume of business activity. It is evident that businesses that use open data in their activities can generate high returns through the development of new products and services.
In every successful open data ecosystem [16] we can identify three principal components
(figure no. 2): government (open data produced or collected by the public sector), business
(open data produced or collected by private sector) and customers (personal or non-personal
data of individual customers/citizens published on open domain).
Figure 2. Open data ecosystem
For the moment we can find open data solutions implemented in domains like transport, business and geographic data, and these are based on open data produced by the public sector. Small and medium companies build new businesses and jobs on open data from different domains. The CCA [12] in its analysis identified the most popular open data domains (figure no. 3).
Figure 3. The most important open data domains: socio-demographic, transport, legal, meteorological, geographic, business, cultural files
Data source: The Open Data Economy, 2013 Capgemini Consulting
A thorough analysis performed in [17] underlines that, by using big and open data, nations can more easily shift towards a data-driven economy by implementing the characteristic features established by the European Commission.
The countries can be classified in three main categories based on their usage of open data. The results of the CCA research [12] indicate that only very few countries, around 22%, can be classified as "trend setters"; 78% of countries do not use the data to its real potential. The countries are thus classified as beginners, followers and trend setters (table no. 2).

Table 2. Open Data Initiatives
Adapted from: The Open Data Economy, 2013 Capgemini Consulting; European Public Sector Information Platform, 2013; The Global Competitiveness Report 2013–2014, World Economic Forum, 2013

Beginners Followers Trend Setters
Portugal Italy USA
Ireland Denmark UK
Belgium New Zealand Canada
Saudi Arabia Spain Australia
Greece Finland France
Turkey Norway Germany
Romania Hong Kong Sweden
Ukraine Estonia
The big volumes, the complexity and the openness of data determine a new mode of management and new technological safeguards for privacy. Once the data is recorded and publicly available in Open Data repositories, accessing and querying it on the web is a matter of seconds and of consumer culture and responsibility. The consumer will then have all the data needed to make a fully informed decision on whether or not to buy a product.
4. Conclusions
In the next years open data will help us fight natural disasters, personalize products and improve the environment, the quality of care and people's satisfaction. These are going to be major changes. Open data will also help consumers make more informed choices. Current solutions could increase consumers' involvement in the business process. In this case all the data can be brought together and made open. Using a solution for food
safety, consumers will choose products that are good for their own health and for the environment. Open data plays an important role between consumers and businesses by offering consumers more information. For the consumer, the right to be informed will be backed up by the right information.
There are three milestones that once achieved will open new horizons in our society concerning
consumer protection and rights. One is the technology milestone that requires the development
of an autonomous sensor infrastructure that will acquire data. The second milestone is the
implementation of an access to knowledge infrastructure based on public policies and open
data repositories. The last milestone is modelling the consumers’ culture in order to increase
responsibility and awareness.
Acknowledgment
This paper is supported by the Sectoral Operational Programme Human Resources Development (SOP HRD), financed by the European Social Fund and by the Romanian Government under contract number SOP HRD/159/1.5/S/136077.
References
[1] U.S. Food and Drug Administration, 2007. Food Protection Plan: An integrated strategy
for protecting the nation’s food supply; [pdf] Washington: Department of Health and
Human Services. Available at: < http://www.ntis.gov>.
[2] Dinu, V., Schileru, I. and Atanase, A., 2012. Attitude of Romanian consumers related to
products’ ecological labelling. Amfiteatru Economic, XIV (31), pp.8-24.
[3] Constantinescu, R., 2013. Interoperability Solutions for E-Government Services, In
INFOREC, Proceedings of the 12th International Conference on INFORMATICS in
ECONOMY (IE 2013), Bucharest, Romania, 25-28 April 2013. Bucharest: ASE Publishing
House
[4] Moga, L.M., Constantin, D.L., Antohi, V.M., 2012. A Regional Approach of the
Information Technology Adoption in the Romanian Agricultural Farms, Informatica
Economică, 16(4), pp. 29-36
[5] Commission of the European Communities COM(2009) 278 final of 18 June 2009 on
Internet of Things — An action plan for Europe. [online] Available at: <http://eur-
lex.europa.eu/LexUriServ/LexUriServ.do?uri=COM:2009:0278:FIN:EN:PDF>,
[6] Organization for Economic Co-operation and Development, 2007. OECD Principles and
Guidelines for Access to Research Data from Public Funding, Retrieved January 25, 2014,
Available online at http://www.oecd.org/sti/sci-tech/38500813.pdf
[7] Open Knowledge Foundation, 2006. Open Knowledge Definition. [online] Available at:
<http://opendefinition.org>
[8] Davies, T., Perini, F. and Alonso, J. M., 2013a. Researching the emerging impacts of open
data, [online] ODDC (Open Data in Developing Countries) conceptual framework,
Available at: <http://www.opendataresearch.org/sites/default/files/posts/> [Accessed 23
November 2013]
[9] EU, 2013. EU implementation of G8 Open Data Charter, 2013. [pdf] Brussels: EU.
Available at: <http://ec.europa.eu/digital-agenda/en/news/eu-implementatio>
[10] G8UK, 2013. G8 Open Data Charter, [pdf] Available at
<http://www.diplomatie.gouv.fr/fr/IMG/pdf/Open_Data_Charter_FINAL_10_June_2013
_cle4a3a4b.pdf>
[11] Kundra, V., 2011. Digital Fuel of the 21st Century: Innovation through Open Data and
the Network Effect, [pdf] Harvard: Harvard University Press, 2011, Available at:
<http://www.hks.harvard.edu/presspol/publications/papers/discussion_papers/d70_kundr
a.pdf>
[12] Tinholt, D., 2013. The Open Data Economy – Unlocking Economic Value by Opening
Government and Public Data, [online] Capgemini Consulting, Available at:
<http://www.capgemini-consulting.com/resource-file-
access/resource/pdf/opendata_pov_6feb.pdf>
[13] Bătăgan, L., 2014. The Role of Open Government Data in Urban Areas Development,
Informatica Economică, 18(2), Available at:
<http://www.revistaie.ase.ro/content/70/08%20-%20Batagan.pdf>
[14] Janssen, M., Charalabidis, Y. and Zuiderwijk, A., 2012. Benefits, Adoption Barriers and
Myths of Open Data and Open Government. Information Systems Management, [e-
journal] 29(4), pp.258-268, Available at:
<http://www.tandfonline.com/doi/full/10.1080/10580530.2012.716740>
[15] Davies, T., Farhan, H., Alonso, J., Rao, B. and Iglesias, C., 2013b. Open Data Barometer,
2013 Global Report – ODI (Open Data Institute) conceptual framework, Available at:
<http://www.opendataresearch.org/dl/odb2013/Open-Data-Barometer-2013-Global-Report.pdf>
[16] Deloitte, 2013. Open data - Driving growth, ingenuity and innovation, [pdf] Deloitte,
Available at: <http://www.deloitte.com/assets/dcom-unitedkingdom/> [Accessed 2
December 2013]
[17] Filip, F.G. and Herrera-Viedma, E., 2014. Big Data in the European Union, The Bridge,
44(4), pp. 33-37.
SECURE CYBER SECURITY THREAT INFORMATION EXCHANGE
Mihai-Gabriel IONITA
Military Technical Academy, Bucharest, Romania
Abstract. The following paper tackles what is, in our opinion, one of the most important fields of current cyber security: threat information exchange. Without information exchange, a cyber-security system's functionality is severely hampered. A single event might not trigger a specific danger threshold if attacks are stealthy and targeted, but the same attack, when information is gathered and correlated from different sources around an organization's network, might hit that threshold and raise an alarm far more visible to a human operator. Different studies demonstrate that a single hit can make the difference between an incident that is categorized as important and treated in a timely manner and one that is categorized as usual traffic and left uninvestigated. Information regarding cyber threats, when exchanged between entities involved in the same field of action, permits transforming information into intelligence. The present paper is focused on intelligent threat exchange, which performs various checks and decisions before sending information in a secure manner; if discovered, any attack detail can be used by a third party to exploit vulnerable resources of the protected organization. Another thorny problem of the current cyber security state is standardizing the way security incident information is normalized and packed for transport; this problem is also delved into in the current article. The experimental setup is built on top of a neural network and an evolved SIEM-like infrastructure for collecting, analyzing and sharing threat information.
Keywords: cyber security, intelligent threat exchange, neural networks, SIEM, HIDS
JEL classification: C63, C88
1. Introduction
In today's cyber security world, evaluating incidents without knowing what happens to your neighbor, or without having full visibility into your own organization, is unimaginable and a sure way towards failure. There have been different initiatives in this field, but a huge problem keeps the domain from evolving: the lack of standardization of event information definitions and of the message format for exchanging information regarding cyber security events.
The organization which invests large amounts of money in any important initiative which can return the investment is the Department of Homeland Security (DHS) of the United States of America (USA). The interest of the DHS is to keep the pole position in this field, which is of huge interest to the civil, governmental and military forces of the USA. Cyber threat exchange through a standardized, reliable, tested and, not least, secure protocol is of the utmost importance, because the USA has a large base of security information collectors which are geographically distributed and administered by different entities, and these entities may not be willing to share or give away all their collected security incident information to entities from other fields of activity. As an example, the public sector might be unwilling to share information with the governmental entities which are involved in intelligence collection
activities. In the same direction, it may be possible that militarized structures would not want to give away attack information to civil organizations in the governmental hierarchy.
In this respect, there is high interest in selective security information sharing based on preset relationships with other organizations. Another drawback is the fact that log information has to be standardized when shared; otherwise, computational resources and time are lost interpreting, correlating and integrating the received information into one's own database. This of course leads to delays in the cross-correlation of events and in decision making when quick action is needed.
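The normalization step argued for above can be sketched as a mapping from heterogeneous, source-specific log records into one shared event schema, so that correlation works on uniform fields. The sources and field names below are hypothetical, chosen only for illustration:

```python
# Minimal sketch: normalizing heterogeneous log records into a common
# schema before cross-correlation. Sources and field names are hypothetical.

def normalize(record: dict, source: str) -> dict:
    """Map a source-specific log record to a shared event schema."""
    if source == "firewall":
        return {"src_ip": record["src"], "dst_ip": record["dst"],
                "severity": record["prio"], "event": record["action"]}
    if source == "hids":
        return {"src_ip": record["remote_addr"], "dst_ip": record["host"],
                "severity": record["level"], "event": record["rule_name"]}
    raise ValueError(f"unknown source: {source}")

fw_event = normalize({"src": "10.0.0.5", "dst": "10.0.0.9",
                      "prio": 3, "action": "drop"}, "firewall")
hids_event = normalize({"remote_addr": "10.0.0.5", "host": "10.0.0.9",
                        "level": 3, "rule_name": "ssh_brute_force"}, "hids")

# Both records now share one schema and can be correlated directly,
# e.g. by grouping on src_ip without per-source parsing logic.
assert set(fw_event) == set(hids_event)
```

Once every collector emits the same schema, correlation and clustering no longer pay the interpretation cost described above.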
2. Protocols for the common definition of cyber threat information, incidents and
Indicators of Compromise (IOC)
As stated above, the DHS, one of the most active sponsors of the standardization initiatives, has pushed through MITRE the Common Vulnerabilities and Exposures (CVE) standard, which was adopted by more than 75 vendors and quickly developed into the de facto standard for defining vulnerabilities. It has since been used for comparing vulnerabilities from different vendors, and it is really helpful in comparing the severity of different exposures.
Another standardization initiative the MITRE organization is pursuing is the Structured Threat Information eXpression (STIX), which is "a collaborative community-driven effort to
define and develop a standardized language to represent structured cyber threat information.
The STIX Language intends to convey the full range of potential cyber threat information and
strives to be fully expressive, flexible, extensible, automatable, and as human-readable as
possible. [1]”
3. Protocols for securely exchanging cyber incidents and security information
As in the previous section, MITRE is also working on standardizing the Trusted Automated
eXchange of Indicator Information (TAXII), alongside STIX. “TAXII defines a set of services
and message exchanges that, when implemented, enable sharing of actionable cyber threat
information across organization and product/service boundaries. TAXII, through its member
specifications, defines concepts, protocols, and message exchanges to exchange cyber threat
information for the detection, prevention, and mitigation of cyber threats. TAXII is not a
specific information sharing initiative or application and does not attempt to define trust
agreements, governance, or other non-technical aspects of cyber threat information sharing.
Instead, TAXII empowers organizations to achieve improved situational awareness about
emerging threats, and enables organizations to easily share the information they choose with
the partners they choose. [2]”
This protocol is flexible, as it supports the major models for exchanging information in a graph architecture:
Source-subscriber – one-way transfer from the source to the subscriber, used in public/private bulletins, alerts or warnings.
Peer-to-peer – both push and pull methodology for secret sharing, usually used in collaboration on different attacks. It permits an entity to establish trust relationships directly with its partners, for exchanging only the needed information.
Hub-and-spoke – similar to the previous model, but here the dissemination of information happens through a central entity, the hub. Different checking and vetting operations can be performed on the information received from the spokes before sending it on to the other spokes.
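The hub-and-spoke model above can be sketched in a few lines: spokes submit indicators to a central hub, which vets them before fanning them out to every other spoke. The vetting rule here (required fields plus a minimum confidence score) is a hypothetical placeholder for whatever checks a real hub would apply:

```python
# Sketch of the hub-and-spoke dissemination model. The vetting rule is
# hypothetical; a real hub would apply organization-specific checks.

class Hub:
    def __init__(self):
        self.spokes = {}                      # spoke name -> inbox (list)

    def register(self, name):
        self.spokes[name] = []

    def submit(self, sender, indicator):
        """Vet an indicator, then fan it out to all other spokes."""
        if not self._vet(indicator):
            return False
        for name, inbox in self.spokes.items():
            if name != sender:                # do not echo back to the sender
                inbox.append(indicator)
        return True

    @staticmethod
    def _vet(indicator):
        # Hypothetical checks: required fields present and a confidence floor.
        return ({"ip", "type"} <= indicator.keys()
                and indicator.get("confidence", 0) >= 50)

hub = Hub()
for name in ("branch_a", "branch_b", "branch_c"):
    hub.register(name)

hub.submit("branch_a", {"ip": "198.51.100.7", "type": "c2_server",
                        "confidence": 90})    # vetted, fanned out
hub.submit("branch_b", {"ip": "198.51.100.8", "type": "scanner",
                        "confidence": 10})    # rejected by vetting
```

The central vetting point is what distinguishes this model from peer-to-peer: low-quality indicators are filtered once, at the hub, instead of by every recipient.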
Another strong initiative in this domain is that of NATO countries. They have come up with
different frameworks for exchanging threat data in a secure manner.
The Cyber Defense Data Exchange and Collaboration Infrastructure (CDXI) [3] is one of the proposals which can be used at an international level for cooperation. In a similar manner, the Internet Engineering Task Force (IETF) has a set of standards for cooperation: Real-time Inter-network Defense (RID) and the Incident Object Description Exchange Format (IODEF), as further described in our article [4]. CDXI is one of the better documented proposals for an information sharing architecture for NATO countries. Its author [3] outlines the major problems of this domain:
"there are no mechanisms available for automating large-scale information sharing." Such mechanisms are a must-have in the context of the proposed architecture.
"many different sources of data containing inconsistent and in some cases erroneous data exist." For a system which processes thousands of data streams, any delay can be considered catastrophic.
"incompatible semantics using the same or similar words are used in different data sources covering the same topics." This only increases repository size without adding any value, and makes it harder for a clustering algorithm to provide correct results. Once again, in this context it is very important to have a clear Quality Assurance algorithm for the data received from partners.
4. The proposed implementation
The system depicted in Figure 1 is the one used for information sharing. The design illustrates a typical distributed system with a head office and multiple branch offices.
Figure 1. The proposed implementation
All of these systems have installed and running a custom version of the popular Host Intrusion Detection System (HIDS) OSSEC. These act as micro Security Information and Event Management (SIEM) systems in their environment. They collect logs from the systems they reside upon and exchange information with other similar agents in their branch. If instructed by the headquarters' full-blown SIEM, they can also exchange information between branches when the situation calls for quick action in a specific area. Usually, however, they only exchange events inside the same branch, because each agent only has specific keys preset: those of the other agents in its branch and that of the central authority, depicted as AlienVault in Figure 1.
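The preset-key restriction described above can be illustrated with a small HMAC sketch: an agent accepts only events whose authentication tag verifies under a key it holds, so events signed under another branch's key are dropped. The key values and event format are hypothetical:

```python
# Sketch of key-based event acceptance between agents. Keys and the
# event payload are hypothetical; OSSEC's actual wire format differs.
import hmac
import hashlib

BRANCH_KEY = b"branch-1-shared-secret"       # preset key held by branch agents

def sign_event(event: bytes, key: bytes) -> bytes:
    """Tag an event with an HMAC-SHA256 over the shared branch key."""
    return hmac.new(key, event, hashlib.sha256).digest()

def accept(event: bytes, tag: bytes, key: bytes) -> bool:
    # Constant-time comparison; events signed under a different branch
    # key fail verification and are dropped by the receiving agent.
    return hmac.compare_digest(tag, sign_event(event, key))

event = b'{"alert": "ssh brute force", "host": "srv-02"}'
tag = sign_event(event, BRANCH_KEY)

assert accept(event, tag, BRANCH_KEY)                      # same-branch peer
assert not accept(event, tag, b"branch-2-shared-secret")   # other branch
```

This is why inter-branch exchange requires mediation by the central SIEM: only it holds (or distributes) the keys needed to verify events across branch boundaries.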
The custom OSSEC agents are based upon a neural network, better described in our article [5].
As depicted in Figure. 2, “the proposed architecture implies a Feed-Forward Backward-
Propagating neural network based on two input layers, ten hidden layers and one output layer.
The training was done using 1000 input values and 1000 output values captured from a network
of sensors formed by local agents, based on the processed security events. The training was
done using the Levenberg-Marquardt method. Performance was calculated using the Mean
Square Error approach. [5]”
Figure 2. The proposed architecture, based on a Feed-Forward Backward-Propagating Neural Network
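Reading the quoted architecture as two inputs, a hidden layer of ten tanh neurons and one linear output, the forward pass and the Mean Square Error performance measure can be sketched in plain Python. This is an illustration only; the original training used Levenberg-Marquardt, which is not reproduced here, and the weights below are random rather than trained:

```python
# Sketch of a 2-10-1 feed-forward network and its MSE performance measure.
# Weights are randomly initialized for illustration, not trained.
import math
import random

random.seed(0)

W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(10)]  # input->hidden
b1 = [random.uniform(-1, 1) for _ in range(10)]
W2 = [random.uniform(-1, 1) for _ in range(10)]                      # hidden->output
b2 = random.uniform(-1, 1)

def forward(x):
    """One forward pass: tanh hidden layer, linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2

def mse(samples):
    """Mean Square Error over (input, target) pairs, as quoted above."""
    return sum((forward(x) - t) ** 2 for x, t in samples) / len(samples)

samples = [((0.2, 0.9), 1.0), ((0.8, 0.1), 0.0)]
error = mse(samples)
```

Training would iteratively adjust W1, b1, W2 and b2 to drive this error down; Levenberg-Marquardt does so using second-order curvature information, which is why it converges quickly on small networks like this one.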
We use the same experimental criteria as in [6] for defining risk assessment metrics, these are
described below, in Table 1.
Table 1. Risk calculated for different types of attacks
Asset                 Determined risk using Neural Net   Probability   Harm   Calculated Risk
Network info          0.0026                             5             0      Null – 0
User accounts         12.0014                            3             4      High – 12
System integrity      12.0013                            4             3      High – 12
Data exfiltration     12.0009                            2             6      High – 12
System availability   15.0007                            3             5      High – 15
The results in Table 1 are obtained after comparing the output of the neural network with the result calculated by the following formula:
Risk = (Probability × Harm) × (Distress_signal + 1) (1)
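Reading the juxtaposition in formula (1) as multiplication and taking the distress signal as zero (an assumption on our part, since distress values are not listed in Table 1), the "Calculated Risk" column can be reproduced directly:

```python
# Formula (1) applied to the (Probability, Harm) pairs of Table 1.
# Distress_signal = 0 is assumed; the table does not list its values.

def risk(probability, harm, distress_signal=0):
    """Risk = (Probability x Harm) x (Distress_signal + 1)."""
    return (probability * harm) * (distress_signal + 1)

assets = {
    "Network info":        (5, 0),   # Null
    "User accounts":       (3, 4),   # High
    "System integrity":    (4, 3),   # High
    "Data exfiltration":   (2, 6),   # High
    "System availability": (3, 5),   # High
}
calculated = {name: risk(p, h) for name, (p, h) in assets.items()}
```

With a distress signal of zero the multiplier is 1, so the calculated values (0, 12, 12, 12, 15) match both the table's right-hand column and, to within a fraction of a unit, the neural network's outputs.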
5. Conclusions and future research
As stated above, threat information exchange is crucial for the development of the cyber security field. The detection of current, sophisticated cyber-attacks is impossible without proper sharing of an organization's current attack information. If introduced as input to our neural network, this information allows correlations to be made which could detect orchestrated sophisticated attacks, or even APT (Advanced Persistent Threat) campaigns that would go undetected if callbacks to C&C (Command and Control) servers are not registered. The key aspect and "take away" idea of this paper is that without standardizing and normalizing events, all this collaboration between organizations with heterogeneous communication infrastructures would be impossible.
To extend this implementation, we are currently working on bringing the application into the "complicated" world of the Internet of Things.
References
[1] MITRE, "the STIX Language," [Online]. Available: https://stix.mitre.org/.
[2] MITRE, "Trusted Automated eXchange of Indicator Information," [Online]. Available:
http://makingsecuritymeasurable.mitre.org/docs/taxii-intro-handout.pdf.
[3] L. Dandurand and O. S. Serrano, "Towards Improved Cyber Security," 5th International
Conference on Cyber Conflict, 2013.
[4] M.-G. Ionita and V.-V. Patriciu, "Autoimmune Cyber Retaliation Supported by Visual
Analytics," Journal of Mobile, Embedded and Distributed Systems, vol. VI, no. 3, pp.
112-121, 2014.
[5] M.-G. Ionita and V.-V. Patriciu, "Biologically inspired risk assessment in cyber security
using neural networks," IEEE Xplore, vol. 10.1109/ICComm.2014.6866746, no. 5, pp.
1 - 4, 2014.
[6] M.-G. Ionita and V.-V. Patriciu, "Achieving DDoS resiliency in a software defined
network by intelligent risk assessment based on neural networks and danger theory,"
in 2014 IEEE 15th International Symposium on Computational Intelligence and
Informatics (CINTI), DOI: 10.1109/CINTI.2014.7028696, pp. 319-324, 19-21 Nov. 2014.
[7] S. Wohlgemuth, "Resilience as a New Enforcement Model for IT Security Based on
Usage Control," in Security and Privacy Workshops (SPW), 2014 IEEE, San Jose, CA,
2014.
[8] K. R. Hofmann Stefan, "Towards a security architecture for IP-based optical
transmission systems," in Bell Labs Technical Journal (Volume:16 , Issue: 1 ), DOI:
10.1002/bltj.20491, 2011.
[9] J. M. R. M. Sultana S., "Improved Needham-Schroeder protocol for secured and
efficient key distributions," in Computers and Information Technology, 2009. ICCIT
'09. 12th International Conference on, DOI: 10.1109/ICCIT.2009.5407301, 2009.
[10] T. T. M. J. Spyridopoulos T., "Incident analysis & digital forensics in SCADA and
industrial control systems," in System Safety Conference incorporating the Cyber
Security Conference 2013, 8th IET International, DOI: 10.1049/cp.2013.1720, 2013.
ADOPTION OF CLOUD COMPUTING IN THE ENTERPRISE
Floarea NĂSTASE
Department of Economic Informatics and Cybernetics
Bucharest University of Economic Studies, Romania
Carmen TIMOFTE
Department of Economic Informatics and Cybernetics
Bucharest University of Economic Studies, Romania
Abstract. The majority of companies have Internet access, but the number of those using cloud computing is relatively low. This paper is about the companies that use cloud computing and about those that should implement it in order to increase efficiency. We discuss the risks and the immense benefits of adopting cloud computing in enterprises.
Keywords: cloud computing, enterprises, ICT.
JEL classification: L86, L81, L84
1. Introduction
Information and communication technology (ICT) is a major factor of innovation and productivity improvement in all sectors of society. In recent years, ICT has made a significant contribution to labor productivity growth [1]. However, information and communication technologies are expensive and require specialized skills and system maintenance effort to store and process data correctly. Not all companies or individuals are able to acquire and operate the latest dedicated systems.
The solution: instead of building their own IT infrastructure, companies can access computing resources hosted on the Internet by third parties. Using hardware and software accessible via the Internet or a network, often delivered as services, is known as cloud computing. Cloud technology will generate new business opportunities and will influence future work, from the allocation of physical space for corporate offices to the possibility of developing new concepts and the way audits are carried out.
2. Cloud Computing in enterprises of the European Union
In September 2012, the European Commission adopted a strategy for exploiting the potential of cloud computing in Europe [2]. The strategy highlights actions meant to achieve a net gain of 2.5 million new jobs in Europe and an annual increase of 160 billion euros in EU GDP (approximately 1%) by 2020. These gains are estimated to be achieved if the EU cloud strategy fully materializes. The aim of the strategy is to stimulate the adoption of cloud computing solutions in Europe, both in the public and private sectors, by providing a safe and reliable framework.
Access to the Internet is the cornerstone of e-business, making it possible to connect people and businesses around the world. The percentage of enterprises in the European Union using computers connected to the Internet seems to have reached a saturation level. According to Eurostat, the authority dealing with the processing and publication of statistical information in the European Union, in 2013 96% of enterprises in the EU28 had access to the Internet (Fig. 1). The share of enterprises with access to the Internet is similar in most countries.
Figure 1. Enterprises with Internet access in the EU28 in 2013 (source:
http://ec.europa.eu/eurostat/statistics-explained/index.php/E-business_integration)
Although the majority of companies have Internet access, the number using cloud computing solutions is relatively low. In 2014, Eurostat published a study on the use of cloud technology in enterprises in Europe, showing that [3]:
19% of EU companies were using cloud computing, especially for hosting email systems and for file storage (Fig. 2).
46% of these companies (of the 19%) were using advanced cloud services, such as financial accounting software, customer relationship management or computing power for running business applications.
Twice as many companies used public cloud solutions (12%) as private cloud ones (7%).
4 out of 10 companies (39%) cited the risk of a security breach as the main factor limiting their use of cloud computing services.
42% of those not using the cloud reported lacking sufficient knowledge about cloud computing and were therefore reluctant to use it.
Figure 2. Using cloud computing services to businesses in EU28, 2014 (% of firms)
(source:http://ec.europa.eu/eurostat/statistics-explained/index.php/Cloud_computing_-
_statistics_on_the_use_by_enterprises)
The highest percentages of businesses using cloud computing in 2014 were observed in Finland (51%), Italy (40%), Sweden (39%) and Denmark (38%). In contrast, cloud computing services were used by less than 10% of enterprises in Romania (5%), Latvia (6%), Poland (6%), Bulgaria (8%), Greece (8%) and Hungary (8%). Companies in Romania thus use cloud computing services the least, ranking last.
In 16 member states of the European Union, cloud computing is mainly used for e-mail, as in Italy (86%), Croatia (85%) and Slovakia (84%). In 11 states, cloud computing services were mainly used for file storage, for example in Ireland (74%), the UK (71%), Denmark (70%) and Cyprus (70%), while the Netherlands stood out for hosting business databases (64%).
Figure 3. Using cloud computing services in European companies, 2014
(source: http://ec.europa.eu/eurostat/documents/2995521/6208098/4-09122014-AP-EN.pdf)
3. Benefits of using cloud computing for businesses
There are already enterprises and government agencies that turn with confidence to the opportunities of cloud computing systems. The benefits of cloud computing will fundamentally change the possibilities of small organizations (small-scale businesses and non-profit organizations) to acquire ICT capabilities, enhance productivity and foster innovation.
Through cloud computing, a company can configure a virtual office that provides the flexibility of connecting to the business from anywhere and at any time, making access to business data very easy. The connection can be made from the growing number of devices used in today's business environment, such as smartphones and tablets. Cloud computing offers businesses a number of benefits, such as:
Reduced IT costs: Migrating a business to cloud computing can reduce the costs of managing and maintaining IT systems. Rather than purchasing expensive equipment and software for a particular type of business, costs can be reduced by using the resources of a cloud computing service provider. Operating costs will be reduced because:
o the cost of system upgrades and of new hardware and software may be included in the contract;
o wages for skilled in-house IT personnel no longer have to be paid;
o energy consumption costs will be reduced;
o For example, Amazon Web Services (AWS)
(http://aws.amazon.com/ecommerce-applications/) provides a cloud computing
solution for small and large ecommerce. The solution for online and retail
sales is flexible, secure, highly scalable and competitively priced.
Scalability: The business can quickly obtain the necessary resources, with high flexibility in allocating or withdrawing resources according to business needs. This task falls to the cloud computing service provider.
Business continuity: Protecting data and systems is an important part of business continuity planning. In case of a natural disaster, power outage or other undesirable event, data stored in the cloud are protected in a secure location, with the potential to be accessed again quickly, minimizing any loss of productivity.
Effective collaboration: collaboration in a cloud environment offers a business the ability to communicate and share data and information much more easily than traditional methods. For example, a project team working in different locations can use cloud computing to give all participants access to the same files.
The flexibility of working practices: Cloud computing allows employees to be more
flexible in how they work. For example, it is possible to access data from home, when
employees are on vacation or while commuting to and from work (provided there is an
internet connection).
Access to automatic updates: Access to automatic updates for IT business requirements
may be included in the contract with the service provider. Depending on cloud
computing service provider, the system will be regularly updated with the latest
technology. This may include updated versions of software and upgrades for servers
and processing power.
4. Risks of using cloud computing
The main risks of using cloud computing solutions are the loss of control over data and over information processing.
Environmental security: The concentration of computing resources and users in a cloud computing environment also concentrates security threats. Because of their size and importance, cloud environments are often the target of attacks. Before entering into a contract for a service, it is desirable to be informed by the cloud service provider about how access control is achieved, what its vulnerability assessment practices and configuration management controls are, whether patches are applied, and whether the data is properly protected.
Security and privacy: hosting the organization's confidential data with a cloud service provider involves transferring data security controls to the service provider. The service provider needs to understand the organization's privacy and data security requirements and to comply with the data security and privacy regulations that apply, such as HIPAA (Health Insurance Portability and Accountability Act), PCI DSS (Payment Card Industry Data Security Standard), FISMA (Federal Information Security Management Act) or the privacy provisions of the Gramm-Leach-Bliley Act.
If the cloud computing service is used only for data storage, it is recommended to encrypt the data before transferring it to the cloud, using strong encryption systems under the exclusive control of the cloud services consumer. This ensures data confidentiality [4].
If the data stored in the cloud are to be processed and distributed, the cloud service provider must have access to them. In this case, security is ensured by an agreement on how the cloud computing provider may use the data.
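The recommendation above (encrypt before upload, with keys held only by the cloud consumer) can be sketched as follows. The SHA-256-based XOR keystream stands in for a real authenticated cipher such as AES-GCM and is used here only to keep the example dependency-free; do not use it in production:

```python
# Sketch of client-side encryption before cloud upload. The XOR keystream
# is an illustrative stand-in for a real cipher such as AES-GCM.
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random stream from the key (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt                      # XOR is its own inverse

key = secrets.token_bytes(32)          # stays under the consumer's control
record = b"customer ledger 2014"
blob = encrypt(record, key)            # only this ciphertext goes to the cloud

assert blob != record                  # the provider sees ciphertext only
assert decrypt(blob, key) == record    # recoverable only with the local key
```

Because the key never leaves the consumer, the storage provider holds only ciphertext, which is exactly the confidentiality property the paragraph above describes.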
Data availability and business continuity: A major risk to business continuity in the cloud computing environment is the loss of the Internet connection. The cloud computing provider must have controls for checking and ensuring connectivity to the Internet. Sometimes, if a vulnerability is identified, the cloud computing provider may interrupt access in order to correct the problem.
Disaster recovery: The cloud provider's ability to recover in case of disaster is vital to the recovery plan of any company that hosts its computing resources and data with that provider.
5. Conclusion
Cloud computing is a technology that will be used by both small businesses and large enterprises to develop and deploy their applications. Moreover, current European policy initiatives, such as data protection law reform and the Common European Sales Law, reduce barriers to the adoption of cloud computing technology in the EU.
References
[1] OECD Science, Technology and Industry Scoreboard 2013. Internet:
http://www.oecd.org/sti/scoreboard-2013.pdf, April 27, 2014 [Oct. 20, 2014];
[2] Marnix Dekker, Dimitra Liveri, Certification in the EU Cloud Strategy, European Union
Network and Information Security Agency (ENISA), November 2014;
[3] Konstantinos Giannakouris, Maria Smihily, Cloud computing - statistics on the use
by enterprises, November 2014. Internet: http://ec.europa.eu/eurostat/statistics-explained/index.php/Cloud_computing_statistics_on_the_use_by_enterprises#Methodology_.2F_Metadata,
November 27, 2014 [Oct. 20, 2014].
[4] CERT-RO, Cum să abordezi protecția datelor în cloud. Internet: http://www.cert-
ro.eu/files/doc/886_20141001111032007308300_X.pdf, April 27, 2014 [Oct. 20, 2014];
[5] Opinion no. 05/2012 on "cloud computing". Internet: http://ec.europa.eu/justice/data-
protection/article-29/documentation/opinion-recommendation/files/2012/wp196_ro.pdf,
July 27, 2012 [Jan. 20, 2015]
HEALTHY AGEING MOBILE GIS APPLICATIONS DEVELOPMENT
AND AUDIT FOR THE ACHIEVEMENT OF SOCIAL
SUSTAINABILITY
Cosmin TOMOZEI
Vasile Alecsandri University of Bacău, Romania
Cristian AMANCEI
Bucharest University of Economic Studies, Romania
Abstract. The objective of this paper is to present a reliable way to engineer, reengineer and audit mobile health applications. This research involved the constructive analysis of several types of health applications, by means of the international scientific databases in which these types of applications were presented. Secondly, the use of mobile devices and GIS applications by senior citizens has been taken into account, so as to create efficient ways of communicating and processing medical and spatial data. These types of applications should provide help and support for people in need and assist decision makers, both patients and health specialists, in assuring quality of life as well as the seniors' autonomy.
Keywords: mobile health, GIS, sensors, data analysis, audit.
JEL classification: L86 – computer software
1. Introduction
Some of the best-known aspects of sustainability achievement in modern society are
age-friendly communities and the ease of access to resources for senior citizens. In papers
such as [1] the age-friendliness of cities is analyzed, especially by taking into account the
allocation of health resources. The knowledge-based society has as a main objective to offer
people in need the appropriate health resources and services in a suitable time and,
furthermore, to support the autonomy of the elderly and their possibility to live in a secure,
healthy and friendly environment. This presumes that they should be able to work with the
new information and communication technologies so as to provide valuable information to the
medical staff and to their families.
In order to accomplish this social goal, it is very important that the data resulting from the
interaction of senior citizens with technological devices be subjected to analytical
procedures, and that the most significant information be extracted by specialists through
data modelling and estimation procedures. Such procedures are identified in [2], in which
exposure to environmental risk factors is assessed. The thematic maps are based on the
geocoding process and on the utilization of GPS devices, sensors and databases. Valuable
data obtained by smart devices and sensors are processed and transmitted to the health
specialists.
On the one hand, geographic information systems offer reliable support for data
representation and analysis by integrating specific procedures, such as spatial analysis and
regression, time series, charts and geo-coordinates, which are very helpful in the development
of graphical health representations on thematic maps. As tablets and intelligent devices are
widely used in healthcare, as mentioned in [3], by general practitioners and patients aiming to
reduce the incidence of strokes and monitor cardiovascular diseases, additional
functionalities may be taken into account for the spatial analysis of diseases and for the
creation of estimation models. Smart devices are well suited to mapping software, and by
means of touchscreens, gestures and sensors a set of new indicators can be developed.
A novel approach to healthy ageing application development builds on the events implemented
by software functionalities through which elderly people interact with their devices: tap,
double tap, tap-and-hold, pan or slide, which are very usable and comfortable. These events
or gestures, combined with geolocation facilities and data obtained through sensors, become
more and more beneficial in the development of mHealth [4] applications.
It is stated that the mHealth phenomenon has received significant attention in recent years,
because the devices that support health objectives have transformed the health services and
their accessibility for patients. Complex communication functionalities have been created by
connecting medical devices via Wi-Fi and Bluetooth for the transmission of data regarding the
physical condition of the patients, such as blood pressure, glycaemia, body temperature and
heart rate, but further analyses are needed for more elaborate examinations and
calculations.
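As an illustration of such a transmission step, the vital-sign readings listed above could be bundled into a single payload before being sent to the health specialists' service. The sketch below is not from the paper; all field names and the flat JSON schema are hypothetical (a real mHealth system would define its own schema, e.g. based on HL7 FHIR resources):

```python
import json
from datetime import datetime, timezone

def package_reading(patient_id, blood_pressure, glycaemia, body_temperature, heart_rate):
    """Bundle one set of vital-sign measurements as a JSON payload.

    All field names are illustrative, not a real mHealth schema.
    """
    return json.dumps({
        "patient_id": patient_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "vitals": {
            "blood_pressure_mmHg": blood_pressure,   # e.g. [systolic, diastolic]
            "glycaemia_mg_dl": glycaemia,
            "body_temperature_c": body_temperature,
            "heart_rate_bpm": heart_rate,
        },
    })

# One reading, as it might be sent over Wi-Fi or Bluetooth after serialization.
payload = package_reading("p-001", [128, 82], 95, 36.7, 71)
record = json.loads(payload)
```

The receiving service would deserialize the payload and append it to the patient's data series for the analyses discussed below.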
2. Citizen assistance, healthcare and quality of life
The assurance of a good public health system represents a continuous aim of researchers and
practitioners who seek to define the characteristics and the necessity of well-being for
elderly people. Measures have been taken [5] for building healthy living environments by
combining medical treatments with movement and physical exercise. Furthermore, studies
and policies about "aging-in-place" have been created in order to contribute to people's
autonomy in life.
Mobile applications and remote sensors, e.g. smart wearable devices and pedometers, support
the development of healthy communities and age-friendly environments. Several factors, such
as walkability, air quality and green space as well as reduced levels of danger, have been
considered vital for age-friendly environments.
These factors, as well as personal health information, should be modelled with the support of
smart devices and customized for the personal health requirements, in order to assist [6]
seniors in maintaining their quality of life and health condition. Citizen-oriented health
assistance services are provided continuously throughout life and give seniors the possibility
of managing their needs and treatments, with the approval of health practitioners' advisory
services.
Mobile health and the integration of sensors and wearable technologies offer elderly people
new ways of connectivity and interaction within a smart distributed medical system for
uninterrupted healthcare assistance. Furthermore, clinical professionals use such assistance
systems to support elderly patients' home care.
These services are accessible via dedicated protocols and their functionalities are exposed to
the mobile devices which the patients carry in their daily lives. An important issue is
represented by the success in communication between the clinical specialists, the software
development teams and the beneficiaries concerning the objectives, the personal health plan
structures, the data formats and the data exchange between the actors by means of specialized
services. Another important issue is related to the data storage and the distribution between
specialized servers and Cloud machines, for large files and objects such as medical imaging
files, and personal mobile devices, with a direct effect on how data is queried and structured in
the following stages for the specialized medical decision support activities. Automated
reasoning procedures may be implemented and adapted for the creation of personalized
recommendations based on each person's activity history and health records. Service-oriented
architectures and service composition models for health management are being implemented
and invoked by mobile health applications.
3. mHealth applications engineering for senior citizens
A reliable taxonomy of mHealth applications is presented in [4] and offers a holistic image
of the objectives, engineering, maintenance and reengineering of health applications.
Organizations increasingly adopt mHealth apps to raise patients' satisfaction with the medical
and social services. These applications deeply contribute to the autonomy of the patients and
support the achievement of a high quality of life.
In this section, we present a simple example of a mobile application for healthy ageing, with
particular regard to the quality of life of senior citizens.
For elderly people, the consumption of water, salt, coffee and prescription medication, as well
as the movement per day, represent key elements for a healthy and autonomous life. These
factors, identified as exogenous variables in a regression model, may be straightforwardly
modelled by means of mobile devices and services, in a mobile-first, cloud-first strategy,
according to [10].
Blood pressure is taken into account as an endogenous variable, modelled in terms of the
identified exogenous variables, by means of classes in the mobile health applications. The
classes and the mHealth services are instantiated on the mobile devices. The identified linear
regression model will be useful in estimating the blood pressure based on the quantities of
products which the patients have consumed, and consequently in-place recommendations will
be given to the patients.
A secondary phenomenon which assures the quality of life of senior citizens is represented by
the movement-per-day variable. The movement a citizen makes per day is measured by
pedometers and wearable activity trackers and is easily modelled by mobile devices via
Bluetooth connections. The movement per day depends on the weather, and especially on the
temperature, because it is advisable for elderly people to avoid outdoor activities in very
cold or very hot weather. Another aspect regarding the values of the movement-per-day factor
is determined by the geographical location of the citizen's town of residence. This leads to
the adoption of GIS modules and, furthermore, to the integration of map modelling procedures
in the mHealth application through a process of software reengineering.
Time series and regression analysis are key elements in the analysis of health indicators and in
the estimation of health data models. Limits should be set by medical specialists and those
limits should not be exceeded by the patients in their daily life.
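Such a limit check can be sketched as follows. The threshold values below are invented placeholders for illustration only, not medical guidance; in the system described here they would be set by the medical specialists:

```python
# Specialist-set daily limits for the exogenous factors discussed above.
# All numbers are illustrative placeholders, not medical guidance.
DAILY_LIMITS = {
    "salt_g": 5.0,        # maximum grams of salt per day
    "coffee_cups": 2,     # maximum cups of coffee per day
    "water_l_min": 1.5,   # minimum litres of water per day
}

def check_daily_intake(intake):
    """Return the list of factors whose specialist-set limit is violated."""
    violations = []
    if intake.get("salt_g", 0) > DAILY_LIMITS["salt_g"]:
        violations.append("salt_g")
    if intake.get("coffee_cups", 0) > DAILY_LIMITS["coffee_cups"]:
        violations.append("coffee_cups")
    if intake.get("water_l", 0) < DAILY_LIMITS["water_l_min"]:
        violations.append("water_l")
    return violations

# One day of readings collected through the mobile device.
alerts = check_daily_intake({"salt_g": 6.2, "coffee_cups": 1, "water_l": 1.0})
```

Violations detected this way could trigger the in-place recommendations mentioned earlier.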
In (1), (2) and (3) the regression models for the blood pressure and the movement per day are
analytically defined and described as simultaneous equations models.

Blood_pressure_k = f(salt, coffee, water, medication)_k                  (1)

Movement_per_day_k = f(temperature, geo_position)_k                      (2)

BP_k = η_k1 + Σ_{j=1}^{NrEx} η_kj · VarEx_kj + u_k                       (3)

where:
BP_k – the dependent, endogenous variable of equation k within the model,
identified as blood pressure;
η_k1 – the intercept of the simultaneous equations model;
η_kj – the coefficient of the independent variable VarEx_kj of the model;
u_k – the error term.
The optimization of the model leads to personalized recommendations of quantities for each of
the exogenous variables, e.g. salt, medication, water and coffee, in order to keep the blood
pressure within normal limits. The models should be structurally consistent and avoid
multicollinearity. The size of the model is determined by the number of exogenous variables
and the number of data series. The multicollinearity analysis checks whether there is
correlation between the explanatory variables, which would lead to the inconsistency of the
model.
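Equation (3) and the multicollinearity check can be sketched with ordinary least squares on synthetic data. The variable names follow the model above, but the data series, the chosen coefficients and the use of two exogenous factors are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic exogenous series (VarEx): daily salt and coffee intake.
salt = rng.normal(5.0, 1.0, n)
coffee = rng.normal(2.0, 0.5, n)

# Synthetic endogenous series following equation (3):
# BP_k = eta_k1 + eta_k2 * salt + eta_k3 * coffee + u_k
bp = 90.0 + 4.0 * salt + 3.0 * coffee + rng.normal(0.0, 1.0, n)

# Design matrix with an intercept column, as in equation (3).
X = np.column_stack([np.ones(n), salt, coffee])
eta, *_ = np.linalg.lstsq(X, bp, rcond=None)

# Multicollinearity check: correlation between the explanatory variables.
corr = np.corrcoef(salt, coffee)[0, 1]
```

The estimated coefficients `eta` recover the intercept and the two slopes, while a high absolute value of `corr` would flag the correlated explanatory variables that make the model inconsistent.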
An additional aspect which is significant for the quality of life is represented by electronic
prescriptions, which are sent directly to mobile devices, backed by a cloud service, with push
notifications. An electronic prescription sent to the mobile device with a notification will
considerably decrease the time spent by elderly people, as well as the effort, to obtain the
medication necessary for the assurance of their quality of life.
4. mHealth application audit approach
In this section, we present the audit approach proposed for our mobile application
architecture.
The audit starts with a security risk analysis focused mainly on analyzing vulnerabilities and
threats to the mHealth information resources, in order to decide what controls should be
implemented in the future development phases. Due to the complex interactions among the
components of the information system, a single vulnerability may have multiple propagation
paths, leading to different security risks for our application [11].
Based on a self-assessment of the patient information manipulated by the application, we have
developed a list of main risk factors and proposed mitigation controls presented in Table 1.
Table 1. mHealth main risk factors and proposed mitigating controls

Application security
- Application access control: Access to the analysis results will be permitted through
accounts validated by the application administrator. Access to personal data will be permitted
only through authentication with digital certificates.
- Application processing history: Logs will be maintained for the processing history of the
application at a high level of detail.
- Capability of fault tolerance: All client requests will be parsed and the application will
respond only to a predefined set of utilization cases; for all other requests, general
messages will be presented.

Data security
- Data confidentiality: Access to the database will be restricted to trusted connections
enforced by digital certificates, due to the sensitivity of the stored data.
- Data integrity: Data upload will be permitted only for the administrative users, who remain
responsible for ensuring the integrity of the data. The client application interface will
permit only operations that do not involve data upload.

Communications and operation security
- Communication integrity: Communication integrity will be protected by encryption. The level
of encryption will be determined by the sensitivity of the communicated information.
- Protection of log information: The application and system log files will be stored on a
dedicated machine with a different permission set than the production environment.
After application development is finished, we will perform detailed testing of the main risk
factors identified, in order to check whether the proposed mitigating controls are appropriate
and sufficient for our application.
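The fault-tolerance control from Table 1 — answering only a predefined set of utilization cases and returning a general message for everything else — can be sketched as follows. The case names and handler table are hypothetical, not part of the audited application:

```python
# Predefined utilization cases the application is willing to serve
# (illustrative names only).
ALLOWED_CASES = {"view_results", "list_recommendations", "sync_readings"}

# The general message returned for any request outside the predefined set.
GENERIC_MESSAGE = {"status": "error", "message": "Request not supported."}

def handle_request(case, handler_table):
    """Dispatch only whitelisted cases; everything else gets a generic reply."""
    if case not in ALLOWED_CASES or case not in handler_table:
        return GENERIC_MESSAGE
    return handler_table[case]()

handlers = {"view_results": lambda: {"status": "ok", "data": [1, 2, 3]}}
ok = handle_request("view_results", handlers)
rejected = handle_request("drop_table", handlers)
```

Responding with the same generic message for every unknown request also avoids leaking information about the application's internals to a probing client.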
5. Conclusions
Mobile devices are accessible, powerful and useful for data processing and transmission
without a substantial effort by the elderly people, by means of specific mobile software
applications. Mobile health applications become more and more present in people’s lives and
provide useful information to the health specialists.
Our paper's intention is to reflect a novel utilization of these types of devices in support
of senior citizens' autonomy and quality of life, through software engineering and
reengineering strategies. Health services and service-oriented architecture play a decisive
role in this context and assure the communication between mobile device applications and
specialized servers.
The engineering of the mobile apps presumed the definition of models in which endogenous
variables were explained in terms of exogenous, independent variables in the regression
analysis, based on the data collected by mobile devices and wearable sensors connected via
Bluetooth and Wi-Fi.
The knowledge-based society has as a main objective to offer people in need the appropriate
health resources and services in a suitable time, and therefore to provide the health
facilities, medication and accurate information that lead to an increase in personal autonomy
and the possibility of living in a secure, healthy and friendly environment. Mobile health
apps make this desire achievable and accessible to many people.
References
[1] J. Ruza, J. I. Kim, I. Leung, Ca. Kam and S. Y. Man Ng, “Sustainable, age-friendly cities:
An evaluation framework and case study application on Palo Alto, California”, Sustainable
Cities and Society, vol. 14, pp. 390 – 396, 2014.
[2] D. Fecht, L. Beale and D. Briggs, “A GIS-based urban simulation model for environmental
health analysis”, Environmental Modelling & Software, vol. 58, pp. 1–11, 2014.
[3] M. Radzuweit and U. Lechner, “Introducing tablet computers into medical practice: design
of mobile apps for consultation services”, Health and Technology, vol. 4, No. 1, pp 31– 41,
2014.
[4] P. Olla and C. Shimskey, “mHealth taxonomy: a literature survey of mobile health
applications”, Health and Technology, vol. 4, No. 1, pp 31-41, 2015.
[5] E. J. Burton, L. Mitchell and C. B. Stride, "Good places for ageing in place: development
of objective built environment measures for investigating links with older people's
wellbeing", BMC Public Health, vol. 11, 2011. Available:
http://www.biomedcentral.com/1471-2458/11/839.
[6] S. C. Christopoulou, “A smart citizen healthcare assistant framework”, Health and
Technology, vol. 3, No. 3, pp 249-265.
[7] J. Kerr, S. Duncan and J. Schipperjin, "Using Global Positioning Systems in Health
Research: A Practical Approach to Data Collection and Processing", American Journal of
Preventive Medicine, vol. 41, No. 5, pp. 532–540, 2011.
[8] A. K. Lyseen, C. Nøhr, E. M. Sørensen, O. Gudes, E. M. Geraghty, N. T. Shaw, C. Bivona-
Tellez, “A Review and Framework for Categorizing Current Research and Development in
Health Related Geographical Information Systems (GIS) Studies”, Yearbook of Medical
Informatics, Available: http://www.ncbi.nlm.nih.gov/pubmed/25123730
[9] N. Vercruyssen, C. Tomozei, I. Furdu, S. Varlan, C. Amancei, “Collaborative
Recommender System Development with Ubiquitous Computing Capability for Risk
Awareness”, Studies in Informatics and Control, vol. 24, No. 1, pp. 91-100, 2015.
[10] Satya Nadella: Mobile First, Cloud First Press Briefing, Available:
http://news.microsoft.com/2014/03/27/satya-nadella-mobile-first-cloud-first-press-
briefing/
[11] N. Feng, H. J. Wang and L. Li, "A security risk analysis model for information systems:
Causal relationships of risk factors and vulnerability propagation analysis", Information
Sciences, vol. 256, pp. 57-73, 2014.
CLOUD–BASED ARCHITECTURE FOR PERFORMANCE
MANAGEMENT SYSTEMS FOR SMES
Alexandra RUSĂNEANU
Bucharest University of Economic Studies
Abstract. Performance is a continuous state that a company or individual desires to achieve.
In order to reach this result, a company needs to prioritize and manage its activities with
less effort and fewer resources. For small companies such as SMEs, resources are limited,
especially financial resources. This paper proposes a performance management system hosted
in a Cloud environment. Basically, the performance management application will run on a
SaaS-based architecture.
Keywords: cloud, SME, performance management, architecture
JEL classification: M20
1. Introduction
Performance is a concept used in all economic sectors; it is a general state that all economic
entities desire to reach, whether they are companies, employees or shareholders. "Performance
is associated with two key processes: performance management and performance
measurement" [1].
Performance management is a set of analytical and managerial processes that help the
organization achieve its strategic, operational and individual objectives. The three main
activities that define the performance management concept are: setting the strategic,
operational and individual objectives; consolidating the measurement information relative to
the objectives which are relevant for the progress of the company; and managers' decisions
regarding enhancements of the activity based on the measurement indicators. Because in large
companies most of the managerial activities involve managing a large amount of data, many
software vendors offer Business Intelligence and Business Performance Management software to
support this process.
"Business Intelligence concept involves raw data that needs to be condensed from different
data sources and then transformed in information. Performance Management concept uses this
information and transpose it into an intuitive decision making format. “To differentiate BI from
PM, performance management can be viewed as deploying the power of BI, but the two are
inseparable. Think of PM as an application of BI. PM adds context and direction for BI.”[2]
Performance measurement is a concept defined as "a way to collect, analyze and report
information regarding a group, an individual, an organization, a system or a component. This
involves the study of the processes or the strategies of a company. The performance is the
criteria on which a company can determine its capacity to prevail."[3]
2. Cloud computing for SMEs
Cloud computing is one of the most revolutionary technologies helping governments and
companies to enhance their services and performance. Cloud computing technology has helped
many companies to reduce their IT infrastructure and IT equipment maintenance costs, to
benefit from the flexibility and scalability of the Cloud by having access to basically
unlimited storage, and to increase their processing capacity, IT efficiency and agility. Also, this
technology has eliminated the need for external storage devices, such as software discs and
physical servers, for the installation of updates and upgrades, and for specialized employees
to maintain the IT infrastructure.
A survey-based study by Reza Sahandi, Adel Alkhalil and Justice Opara-Martins explored the
requirements and concerns that small businesses have regarding Cloud Computing technology.
The most important premises of the study were the following:
- the factors that encouraged small businesses to migrate to the Cloud;
- the adopted strategies for Cloud services utilization.
The study was conducted in Great Britain and the survey was completed by IT decision-makers
and managers. The results show that 45.5% of the SMEs want to implement Cloud-based
services to reduce costs and 44.9% to benefit from mobility and convenience when accessing
cloud-based applications. These figures demonstrate that small businesses are aware of the
advantages of adopting such a technology. The study also highlights other aspects that SMEs
find appealing: the "ubiquity and flexibility of Cloud Computing (38.9%), increasing
computing capacity (32.9%) and providing greater IT efficiency (31.7%)" [4].
The results for the second premise emphasize important aspects regarding the operational
activities of SMEs: 32.5% of the respondents are planning to use Cloud-based services for
current operational activities. This percentage shows that small businesses know the
advantages of this technology and that Cloud technology can interoperate with other systems.
On the other hand, 27% of the respondents mentioned that they do not plan to use Cloud-based
services yet.
The main concerns of SMEs about adopting cloud-based solutions primarily involve privacy and
data protection. This is the number one reason why SMEs do not want to migrate to a
cloud-based solution. Confidentiality, data integrity and vendor lock-in are major concerns
as well. "Still, Cloud Computing is a winsome venture for SMEs but it certainly takes a good
business sense and steps in order to fully reap its benefits" [4].
Another representative study, conducted by Ashwini Rath, Sanjay Kumar, Sanjay Mohapatra and
Rahul Thakurta in India, a developing country, was made in order to reveal the degree of
interest in, and understanding of, Cloud Computing services among SMEs. This study covered
issues like:
"- The level of awareness among SMEs regarding Cloud;
- The level of willingness among SMEs to invest in Cloud;
- The potential reasons behind possible engagement with Cloud;
- Which layer of cloud computing architecture is most likely to be used?" [5]
The results of the above study show that 93% of the respondents already have information
about Cloud Computing and its benefits. Regarding the potential reasons behind possible
engagement with Cloud Computing, 19% responded that cost reduction is the most important
factor and 18% that controlling marginal profit is the second most important. Also, 45% of
the respondents were willing to use individual software packages (Software as a Service) and
43% were willing to use a complete operating system on which they can add custom software,
which is called Platform as a Service.
3. Cloud-based architecture for SMEs
Both studies have emphasized that small businesses are interested in Cloud Computing
solutions, even though some operate in a developed country and some in a developing country.
Most small businesses are interested in Software as a Service (SaaS) applications. There are
several advantages of using this type of architecture compared to traditional ones, such as:
- Lower costs. This type of application resides in a shared or multitenant environment
where the software and hardware costs are covered by the service provider. A small company
only pays for access to it.
- No installation and configuration time. Basically, the small company has instant access
to the application; installation and configuration are done by the service provider.
- Scalability and integration. A SaaS solution is hosted in a cloud environment that is
scalable and can be integrated with other cloud solutions. With a traditional solution,
whenever you need to integrate it with other software you need to buy hardware, software and
assistance to integrate them, but in a cloud environment with SaaS software all the work is
done by the service provider.
- Pay as you go. This is the main concept behind cloud solutions because it gives companies
the benefit of predictable costs for all the services they want. For small businesses a
predictable cost is essential for their budget. Also, service providers have personalized
offers in order to be aligned to the different budget types and needs that small businesses
have.
- No responsibility for upgrades, updates, uptime and security. The service provider
will take care of these maintenance activities that are necessary for managing software.
A performance management system is essential for tracking the progress of the organization,
but small businesses do not have the infrastructure to host and use a powerful performance
management system, even though they need one. SaaS technology and a PM system are a perfect
combination for obtaining great results with less effort. This paper proposes a SaaS
architecture for a performance management system that will have access to different
on-premise databases, if the company does not want to move all its data sources into the
Cloud environment, and to different Social Media data sources, from which it retrieves
information in order to apply marketing analytics.
The following SaaS architecture, depicted in Figure 1, is composed of a SaaS platform on
which all the business-related information is stored and processed. The performance
management system is hosted on the platform and has access to the on-premise applications of
the company in order to retrieve financial, staff or other department-related data. The PM
application has access to Social Media platforms and mines them for significant data that
will allow it to create relevant analytics. The customer, represented in the diagram as a
manager, will access the Performance Management application using a browser. The application
can be accessed using only an Internet connection. The PM software is protected by a firewall
and by the security management tool hosted on the SaaS platform. The user will have access to
the content depending on his role. Each type of user will have a different role associated
with his account, depending on the type of information he is accredited to access.
Figure 1. SaaS architecture
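The role-based access rule described above can be sketched as a simple role-to-content mapping. The role names and content areas below are illustrative; the actual roles would be defined by the application administrator:

```python
# Illustrative role-to-content mapping; real roles and content areas
# would be configured by the application administrator.
ROLE_PERMISSIONS = {
    "manager": {"dashboards", "scoreboards", "strategy_maps", "reports"},
    "analyst": {"dashboards", "reports"},
    "admin": {"dashboards", "scoreboards", "strategy_maps", "reports",
              "administration"},
}

def accessible_content(role):
    """Return the content areas a user's role is accredited to access.

    Unknown roles receive no access at all (deny by default).
    """
    return ROLE_PERMISSIONS.get(role, set())

manager_content = accessible_content("manager")
```

Denying by default for unknown roles keeps the access decision conservative, in line with the security controls discussed for the platform.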
The SaaS platform structure can be seen in the Figure 2 diagram. The Presentation Layer
consists of multiple UI modules, such as the Reporting module, which will be used for
generating performance reports on different areas or departments. The User Interface
components are composed of Dashboards, Scoreboards, Strategy Maps and an
Administration-related interface. The reporting module is implemented using the Apache POI
API, which generates Office documents such as Excel and Word files or even PowerPoint
presentations. The dashboards, scoreboards and other types of data presentation will be
generated using technologies such as jQuery, Ajax and JavaScript.
The Business Application Layer is composed of business-related modules: HR, CRM, Inventory,
SCM, Financial and Sales. The data used by these modules can be retrieved from the data
sources available in the cloud or from on-premises data sources. The KPI Generator will
process all the existing data and will calculate the key performance indicators and other
indicators that will be sent to the presentation layer.
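The KPI Generator step can be sketched as a function that aggregates module data into indicators passed on to the presentation layer. The record shapes and metric names below are invented for illustration; the actual KPIs would depend on each company's configuration:

```python
def generate_kpis(sales_records, hr_records):
    """Compute a few illustrative KPIs from business-module data.

    sales_records: list of {"revenue": float, "cost": float}
    hr_records: list of {"headcount": int}
    """
    revenue = sum(r["revenue"] for r in sales_records)
    cost = sum(r["cost"] for r in sales_records)
    headcount = sum(r["headcount"] for r in hr_records)
    return {
        "total_revenue": revenue,
        "gross_margin_pct": 100.0 * (revenue - cost) / revenue if revenue else 0.0,
        "revenue_per_employee": revenue / headcount if headcount else 0.0,
    }

# Data as it might arrive from the Sales and HR modules.
kpis = generate_kpis(
    [{"revenue": 1000.0, "cost": 600.0}, {"revenue": 500.0, "cost": 300.0}],
    [{"headcount": 3}, {"headcount": 2}],
)
```

The resulting dictionary is what the presentation layer would render as dashboards or scoreboards.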
The Service Layer is mostly composed of component management modules which will administer
the internal core of the platform and will provide security for the communication between
components.
Figure 2. SaaS platform overview
5. Conclusions
A SaaS architecture-based application can be a real option for small businesses due to their
limited resources. Also, it can be personalized for each company based on its available
budget. In order to encourage SMEs to adopt a Cloud-based architecture, issues such as
security, data privacy and vendor lock-in need to be addressed. Cloud computing providers
should invest more in network and physical security in order to offer customers a secure
software solution.
Acknowledgment
This work was cofinanced from the European Social Fund through Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/142115 „Performance and excellence in doctoral and postdoctoral
research in Romanian economics science domain”.
References
[1] A. Brudan, ”Rediscovering performance management: systems, learning and integration”
Measuring Business Excellence, vol. 14, issue 1, 2010.
[2] A. Rusăneanu, ”Comparative analysis of the main Business Intelligence solutions”,
Informatică Economică, Vol. 17, No.2/2013, pp.148.
[3] A. Rusăneanu, ”Rules for selecting and using key performance indicators for the Service
Industry”, Vol. 2, Issue 2(4) /2014, pp. 661-666.
[4] R. Sahandi, A. Alkhalil and J. Opara-Martins, "Cloud computing from SMEs perspective: a
survey based investigation", Journal of Information Technology Management, Volume XXIV,
Number 1, 2013.
[5] A. Rath, S. Kumar, S. Mohapatra, R. Thakurta, ”Decision points for adoption Cloud
Computing in SMEs”, available at:
http://www.academia.edu/3139478/Decision_points_for_adoption_Cloud_Computing_in_S
MEs (accessed on March 14th 2015).
SOFTWARE TOOLS AND ONLINE SERVICES THAT ENABLE
GEOGRAPHICALLY DISTRIBUTED SOFTWARE DEVELOPMENT
OF WEB APPLICATIONS
Mihai GHEORGHE
Bucharest University of Economic Studies
Abstract. Geographically Distributed Software Development (GDSD) has seen increased
popularity in recent years, mainly because of the fast Internet adoption in countries with
emerging economies, correlated with the continuous search for reduced development costs as
well as with the rise of Software as a Service (SaaS) platforms which address planning,
coordination and various development tasks. However, the implementation of an efficient GDSD
model has proven challenging due to cultural and legal differences, communication and
coordination issues and software quality concerns. This study identifies the basic
development operations and the interactions among team members, as well as the software tools
and online services that successfully cope with them. Two perspectives are reviewed:
software-development-specific operations and interactions, and the connections between the
employer and the remote employees in terms of communication, billing and payments. The study
focuses on the development process of web applications. In the end the author analyses the
impact of using these solutions on cost, duration and quality compared to conventional
in-house software development processes.
Keywords: Freelancing, Global Software Development, Human Resources Management, IT
Project Management, Software Development Tools
JEL classification: F63, O31, O33
1. Introduction
Depending on both the complexity of the software product and the management model, the
Software Development Process may define multiple roles for its team members [1]. A typical
role hierarchy is described in Figure 1. The dashed lines illustrate informal interactions.
A Geographically Distributed Software Development process occurs when the team members
are not collocated and span different time zones, languages, cultures and legislations.
Mainly as a result of distance, communication, coordination and control are challenging in
GDSD. Communication is reported to be one of the key processes in software development
and is heavily linked to the effectiveness of coordination and control [2].
From his current professional position, as the owner and manager of a software company since
2006, the author has frequently dealt with offshore development scenarios, mostly as a provider
and occasionally as a customer. Based on the literature review and the author’s experience, this
study aims to answer the following two research questions:
RQ1: Is there any set of software tools and online services that can address the
communication, coordination and control inconveniences generated by the distributed
approach in the GDSD?
RQ2: How does the use of these software tools and online services influence the cost,
duration and quality of each activity from the Software Development Process?
Figure 1. A typical role hierarchy in a Software Development Process [1]
2. Software Development related operations
In an ever-growing demand for software products, numerous development models have arisen,
from heavily planned Waterfall models to lightweight Agile variations. Determining the most
suitable ones for developing web applications is not covered by the current study. Regardless
of their order, duration and allocated importance, there are some elementary activities which
reside in all software development methodologies (Figure 2). The dashed arrows suggest that
the sequence of the operations can vary with the methodology. Coding is usually an individual
activity, so it should not differ significantly in GDSD from conventional in-house development.
Figure 2. Typical Software Development activities
Knowledge transfer is a crucial activity in GDSD, and it can span across the entire development
process [3]. Knowledge transfer doesn’t happen only between stakeholders and the
development team but inside the team as well, for instance when a senior programmer mentors
junior staff.
In the following subsections, software tools and online platforms that support the identified
operations are showcased.
2.1. Requirements management
Issuing requirements in a manner that can be handed down the hierarchy of development roles
may vary from a brief document to an exhaustive set of specifications, use-case diagrams, test
scenarios, and performance and security restrictions.
Conventional in-house development processes can benefit from direct communication among
team members, internal meetings, documents and quick revisions.
Wireframes and Prototyping
The functional and layout requirements for web applications can be described using wireframes
and live, world-wide accessible prototypes. Both the literature and a simple web search reveal
numerous free and paid online services that transform static designs into clickable interactive
prototypes which can be published and shared with the development team and stakeholders.
InVision [4] is an online prototyping service for websites and mobile apps that, among other
features, allows simultaneous work on the same project, enhances presentations with gesture,
transition and animation support, stores projects on its own cloud infrastructure or with
other services such as Dropbox, supports versioning and comes with a collection of prototype
templates. According to their website [4], IBM, Adobe, Twitter, HP, PayPal, Intel, Yahoo,
Salesforce and Nike are among the companies that use the service. Basic features are free of
charge, but the company also offers an enterprise plan with unlimited projects, team
members and storage.
Similar services (Justinmind, Axure, iRise, Microsoft Visio) are widely used for managing
requirements and distributing them across the team regardless of location, making them
suitable for GDSD. Requirements errors are the largest contributor to change requests, and the
cost of mitigating these errors grows exponentially through the lifecycle [1][5]. Therefore,
using specialized prototyping services can increase the efficiency of requirements management
and reduce the duration of this activity; although it usually comes with a licensing or
subscription cost, it can reduce the overall cost of the development process.
2.2. Task assignment
Assigning tasks to a geographically distributed development team, setting dependencies
between tasks, monitoring progress and other development metrics is more difficult in a GDSD
framework compared to an in-house development scenario [6].
To address this issue, a great number of integrated project management SaaS platforms
have evolved. Basecamp, JIRA, Pivotal Tracker, Asana, Teamwork and Producteev are just a few
products the author has successfully worked with so far. Features like team management,
collaborative reporting, issue tracking, time tracking, document management, software
development metrics and reports, budget management and invoicing are fairly common. For less
complex projects, basic features, a limited number of projects, limited team members or
trial use usually come free of charge. Monthly or annual subscriptions can be contracted,
with monthly costs ranging from $15 to $750 for projects that have hundreds of team members.
For an organization that large, the costs are insignificant, while the benefits of a real-time,
world-wide accessible project management framework are considerable.
2.3. Management of the source code
Assuring source code coherency when multiple programmers contribute to the same product
has been an issue for a while. The Source Code Control System (SCCS), created in 1970, is the
first Version Control System (VCS) to be mentioned [7]. Since then, code complexity has
dramatically increased.
In a GDSD environment, for web applications that evolve from one day to the next, there is a
need to commit source code changes frequently, often simultaneously by different team
members, without the risk of unintentionally overwriting someone else's work.
Managing code changes manually is very inefficient and error-prone, practically
impossible even for projects of medium complexity. Since SVN appeared in 2001 [7], developers
have been able to work on the same code file at the same time, with the system dealing with
conflicts and automatically merging contributions into a single file. Unlike SVN, which has a
centralized repository, Git is a distributed VCS that requires each contributor to have his
own repository [8]. This means developers can work even without a permanent Internet
connection. Along with the use of acknowledged web frameworks (ASP.NET MVC, Zend for
PHP, Spring for Java, or similar), the adoption of a mature VCS can address security
concerns by restricting the access of a certain group of developers to core functionality. For
instance, a Front End Developer's access to the application's controllers can be restricted
while still granting him permissions to test his own work [9].
The most popular modern VCS such as SVN, Git, Mercurial, Bazaar are free of charge.
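The automatic-merge behaviour mentioned above can be illustrated with a toy sketch. The following is a deliberately simplified, line-by-line three-way merge, where changes on different lines of the same base file merge cleanly and changes to the same line are flagged as a conflict. All names are illustrative, and real VCS merge algorithms (Git's or SVN's) are far more sophisticated, operating on diffs rather than fixed-length line lists:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the automatic-merge idea behind a VCS: two developers
// edit the same base file; non-overlapping changes merge cleanly, while
// changes to the same line are marked as a conflict. This is a simplification,
// not Git's or SVN's actual merge algorithm.
public class ToyMerge {

    static List<String> merge(List<String> base, List<String> ours, List<String> theirs) {
        List<String> merged = new ArrayList<>();
        for (int i = 0; i < base.size(); i++) {
            String b = base.get(i), o = ours.get(i), t = theirs.get(i);
            if (o.equals(b)) merged.add(t);        // only "theirs" changed (or nobody)
            else if (t.equals(b)) merged.add(o);   // only "ours" changed
            else if (o.equals(t)) merged.add(o);   // both made the same change
            else merged.add("<<CONFLICT>> " + o + " | " + t);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> base = List.of("header", "body", "footer");
        List<String> ours = List.of("header v2", "body", "footer");
        List<String> theirs = List.of("header", "body", "footer v2");
        // Each developer changed a different line, so both edits survive.
        System.out.println(merge(base, ours, theirs));
    }
}
```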
2.4. Testing and debugging
Due to their nature, web applications are accessible from virtually any place in the world as
long as there is an Internet connection and a browser. However, testing web applications is a
vast field which can be divided into several domains.
In the GDSD configuration, Functional Testing and User Interface and Usability Testing can
be addressed as crowdsourced operations in order to validate the product on as many computing
devices, operating systems, browsers and human behaviors as possible. Compared to in-house
development, the quality and speed of this activity are increased. uTest.com is a service
that supports crowdsourced testing with more than 150,000 testers and Quality Assurance
specialists [10]. Efficient automated testing of source code compliance can be performed with
http://validator.w3.org/. Suggestions for correcting the source code are provided on the fly.
Also, automated load and performance testing with the Google PageSpeed and Yahoo! YSlow
algorithms can be achieved on http://gtmetrix.com/, http://tools.pingdom.com/fpt/ or similar
services.
2.5. Knowledge transfer
Notes on Knowledge Transfer (KT) for GDSD were made in section 2 in relation to
Requirements Management. However, in conventional in-house development processes, KT is
also an information exchange between various team members, mostly through direct informal
communication. In the GDSD environment, due to time-zone, language and cultural barriers,
informal communication is kept to a minimum, so KT is hard to achieve in a proper manner.
3. Non-development related operations
Setting up a GDSD framework does not consist exclusively of dealing with technical and
procedural challenges. The relationship between the employee (the provider) and the employer
(the customer) needs to be supported as well.
3.1. Communication
Informal communication is usually carried out through instant messaging systems, conference
calls and video calls. Platforms such as Skype, Apple iChat and Google+ Hangouts are easy to
use but require increased bandwidth for multiple attendants [11]. This can either have a
negative impact on the quality of the transmissions or lead to increased costs for upgrading
the Internet connection.
3.2. Effort evaluation and Billing
Unlike in-house development, in a GDSD configuration concerns may arise regarding the
amount of time each team member spends for the project's benefit. Software products have
been developed to monitor the activity of remote team members, create various reports and
even automatically generate invoices. Elance's Work View is a desktop software client that
counts the time spent on a project and sends random screenshots to the client; each week,
automated billing for the tracked hours is performed [12]. RescueTime is a different
time-management product, focused on increasing individual productivity by providing more
detailed reports and blocking distracting websites. It records time based on keystrokes and
mouse movements [13].
3.3. Payments
Convenient international payments are no longer a novelty. Various platforms have gained
popularity and become trusted for many businesses as well as for peer-to-peer payments.
PayPal is reported to be the most used in terms of users and transaction volumes. It is
available in 203 countries and 26 currencies, it features invoicing and it charges fees from
0.2% to 3.9% depending on the account type and payment [14]. For premier accounts,
withdrawals to a credit card can be processed in less than 24 hours. Other, less popular
platforms such as Skrill or Payoneer feature lower fees. Payoneer supports a fast withdrawal
which transfers the money to an international self-issued credit card within 2 hours of
initiation [15].
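As a rough worked example of what the quoted fee range means in practice, the sketch below computes the percentage fee on a $1,000 invoice at both ends of the 0.2%-3.9% range. Fixed per-transaction components and currency-conversion charges are deliberately ignored, so the numbers are illustrative only:

```java
// Illustrative fee computation for the 0.2%-3.9% range quoted above.
// Fixed per-transaction fees and conversion charges are intentionally ignored.
public class FeeRange {
    static double fee(double amount, double rate) {
        return amount * rate;
    }

    public static void main(String[] args) {
        double invoice = 1000.0;
        // At 0.2% the platform keeps about $2; at 3.9% it keeps about $39.
        System.out.println("low fee:  " + fee(invoice, 0.002));
        System.out.println("high fee: " + fee(invoice, 0.039));
    }
}
```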
4. Conclusions
A series of activities have been identified as requiring a different approach in a GDSD
configuration compared to in-house software development. Online services and software tools
that respond to the specific needs of these activities have been evaluated.
RQ1: The set of software products and online platforms which have been evaluated can address
the inconveniences generated by the GDSD approach and, in some cases, can even perform
better than conventional in-house development activities.
RQ2: The effects on activity cost, quality and duration due to the use of the proposed
solutions are described in Table 1 and Table 2, where + means an increase, - a decrease,
0 no significant impact and n/a not applicable.
Table 1. GDSD Software Development Activities and the impact of using Online Services and other tools
Activity                         Impact on cost   Impact on quality   Impact on duration
Managing requirements                  +                 ++                   -
Assigning tasks                        0                 +                    +
Managing source code coherency         0                 +                    -
Testing and debugging                  +                 +                    -
Knowledge transfer                     0                 --                   +++
Table 2. Non-Development activities in GDSD and the impact of using Online Services and other tools
Activity Impact on cost Impact on quality Impact on duration
Communication + - +
Billing 0 + -
Payment 0 n/a -
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectorial Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields -EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] V. Mikulovic and M. Heiss "How do I know what I have to do?: the role of the inquiry
culture in requirements communication for distributed software development projects", in
ICSE '06: Proceedings of the 28th international conference on Software engineering, 2006
[2] B. Fernando, T. Hall, A. Fitzpatrick "The impact of media selection on stakeholder
communication in agile global software development: a preliminary industrial case study",
in Proceedings of the 49th SIGMIS annual conference on Computer personnel research,
2011
[3] F. Salger and G. Engels "Knowledge transfer in global software development: leveraging
acceptance test case specifications" in ICSE '10: Proceedings of the 32nd ACM/IEEE
International Conference on Software Engineering - Volume 2, 2010
[4] InVision | Free Web & Mobile Prototyping (Web, iOS, Android) and UI Mockup Tool,
http://www.invisionapp.com/#tour, [Mar. 09, 2015]
[5] iRise | Rapid Collaborative Prototyping: Build better software faster, http://www.irise.com/,
[Mar. 09, 2015]
[6] G. Wiredu "A framework for the analysis of coordination in global software development"
in: International workshop on Global software development for the practitioner, 2006
[7] J. Loeliger "Version Control with Git", O’Reilly Media, Inc., 2009, pp. 1-6
[8] C. Brindescu, M. Codoban, S. Shmarkatiuk, D. Dig "How do centralized and distributed
version control systems impact software changes?" in ICSE 2014: Proceedings of the 36th
International Conference on Software Engineering, 2014
[9] B. Collins-Sussman, B. Fitzpatrick, C. Pilato "Version Control with Subversion For
Subversion 1.7", California, 2011, pp. 202 – 205
[10] uTest – Software Testing Community, http://www.utest.com/, [Mar. 09, 2015]
[11] Y. Xu, C. Yu, J. Li, Y. Liu "Video telephony for end-consumers: measurement study of
Google+, iChat, and Skype" in IMC '12: Proceedings of the 2012 ACM conference on
Internet measurement conference, 2012
[12] Elance Tracker with Work View™,
https://www.elance.com/php/tracker/main/trackerDownload.php, [Mar. 09, 2015]
[13] RescueTime: Feature, https://www.rescuetime.com/features, [Mar. 09, 2015]
[14] Paypal Global – All Countries, https://www.paypal.com/ro/webapps/mpp/country-
worldwide, [Mar. 09, 2015]
[15] Global Payments, Payout Services & Money Transfer | Payoneer,
http://www.payoneer.com/home-b.aspx, [Mar. 09, 2015]
THE IMPORTANCE OF JAVA PROGRAMMING LANGUAGE IN
IMPLEMENTING DISTRIBUTED SYSTEMS OF DECISION FOR
ONLINE CREDITING
Robert-Madalin CRISTESCU
Bucharest University of Economic Studies, Romania
Abstract. Distributed systems for online crediting are modern and very important
nowadays. Such systems can approve a credit online without much effort: all procedures,
from paperwork to credit approval, are carried out online with the help of a distributed
system implemented in the Java programming language. Java is used more and more nowadays,
and it helps implement systems with a high level of implementation difficulty. Such a system
is very useful because it reduces the effort that customers make, such as travelling to the
office and obtaining the necessary papers for the credit decision. The crediting decisions
this system makes are accurate and quick, and it can inform customers in a short time whether
their request was successful or not. Distributed decision systems implemented in Java are
easy to use and have a high level of responsiveness, especially on the client-server side,
which is the most important part because requests are sent there and the credit decisions
are made there.
Keywords: decision systems, distributed systems, java programming language, online
crediting.
JEL classification: D81
1. Introduction
‘Nowadays, with the broadening usage of distributed systems and Grid, the need for
cooperation between many different heterogeneous organizations occurs. Such cooperation
usually requires sharing of access to data, service and other tangible or intangible resources.’
[10].
The objective is to demonstrate that a distributed decision system for online crediting
implemented in the Java programming language is a technology of the future and will be
increasingly requested and used.
The term distributed systems refers to ‘the collaboration among connected devices, such as
tiny, stand-alone, embedded microcontrollers, networking devices, embedded PCs, robotics
systems, computer peripherals, wireless data systems, sensors, and signal processors resulting
in networked systems of embedded computing devices whose functional components are nearly
invisible to end users.’ [7]
The banking institutions are managed by a ‘distributed database management system’ [8].
Distributed systems are more and more important nowadays because they help in making
decisions where client-server crediting is concerned. Distributed systems of the
client-server type implemented in Java make it possible to find out crediting decisions
online in a very short time. Also, the Java programming language is used for creating the
web services that are used in the distributed decision system. Without these web services,
the application cannot work at normal parameters. The most important related technologies
are SOAP, ESB and WSDL.
Many people prefer this type of crediting because they do not have time to go to the bank
because of their work; many also work in areas with no banks, for whom this type of crediting
is very useful. Persons who want an online credit save time with respect to the crediting
decision and have a very easy way to send their files. All the papers necessary for the
credit request file can be scanned, and the system can make a crediting decision once the
clients and the data from each client's papers are introduced into the database. The system
makes the right decisions because it is designed to transmit a positive decision only if a
client meets the conditions implemented in the system. The conditions
are similar to those we meet in banks.
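As a sketch of how such conditions might be encoded, the following hypothetical decision rule approves a request only when every condition is met. The thresholds (minimum age of 18, a 40% debt-to-income ceiling) are invented for the example and do not come from the paper:

```java
// Hypothetical credit-decision rule: every condition must hold for approval.
// The thresholds below are invented for illustration only.
public class CreditRule {
    static boolean approve(double monthlyIncome, double monthlyDebt, int ageYears) {
        if (ageYears < 18) return false;          // applicant must be an adult
        if (monthlyIncome <= 0) return false;     // must have a declared income
        // keep total debt service under 40% of monthly income
        return monthlyDebt / monthlyIncome <= 0.40;
    }

    public static void main(String[] args) {
        System.out.println(approve(3000, 900, 30));  // prints true: within the 40% ceiling
        System.out.println(approve(3000, 1500, 30)); // prints false: exceeds the ceiling
    }
}
```

A real banking system would combine many more conditions (credit history, collateral, scoring models), but the shape is the same: a positive decision is transmitted only when all implemented conditions are satisfied.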
‘The Internet explosion and the possibility of direct digital interaction with large numbers of
home consumers presents tremendous opportunities, and challenges, to the financial services
community. ‘ [6]
2. Distributed systems of decision for online crediting
‘Along with banking sector development and increasing demands regarding customer care
improvement, performance providing, transaction efficiency and security, internal processes
optimization, increasing complexity level of products and service, banking information
systems have known a continuous evolution.’[4]
‘To address that new world traditional reductionist vision of information systems is becoming
to be replaced by sociomaterial construction a complex adaptive systems approach. Virtual
organization represents that approach.’ [2]
In order to be efficient, distributed decision systems have to be implemented in a robust
and modern language. The Java programming language is used more and more often for
implementing such systems, and it has the power to decide whether online credits are approved
or not depending on the parameters introduced into the system. Banking institutions need
these systems because, by buying such a system, they gain the advantage of winning a bigger
number of clients who want credits. Any bank employee can easily use such a system because
it has a modern interface.
The distributed decision system has modules implemented on both the client side and the
server side. Client-server communication can take place very quickly if so-called EJBs
(Enterprise Java Beans) are used. The distributed system can make several decisions at the
same time. The number of clients can be very large because the system can be implemented so
that it handles many requests while performing the calculations rapidly.
‘The development of a robust and secure communications infrastructure is a key factor to
building up an electronic banking system.’ [9] Online crediting with the help of distributed
systems can be achieved by following certain stages: contacting the banking institution that
has such a system, scanning the papers and sending them to the respective bank. Contacting
can be done through a simple phone call. The papers are scanned by the client, who then sends
them to the banking institution for evaluation via a well-secured e-mail address.
Figure 1. The architecture of the modules that are part of the distributed system for online
crediting
3. The role of Java programming language in implementing distributed systems for online
crediting
The Java programming language has a very important role in implementing distributed systems
for online crediting. Distributed systems implemented in Java have request-reply protocols
at their base and can offer support for issuing requests remotely. The most used technologies
for implementing a distributed system are RMI (Remote Method Invocation) and CORBA. ‘Java is
often used to develop portable grid applications, with programs being sequences
(compositions) of remote method calls.’ [1]. ‘CORBA run-time system works as the executing
environment for Ag2D application.’ [11] Distributed systems implemented in J2EE use
web-service technologies like SOAP, ESB and WSDL. Implementing a web service proceeds as
follows: an .xml file with the respective data is created, it is then configured in another
.xml file, and the existence of the respective WSDL is declared. The corresponding Java
classes for the web service can then be generated, and it can be observed where the request
and the response are. The next steps concern strictly the implementation of the web service
in Java so that it becomes functional. A web service needs the following components:
builders, a DTO (Data Transfer Object), an interface, a client, a service, a transformer and
a controller. The builders are practically used for making the links on the request header.
The DTO contains the fields from the response, which rely on getters and setters. The
interface contains methods and fields of static and final type. The client can be of Stateful
or Stateless type and is used for calling the builders. ‘Stateful aspects can react to the
trace of a program execution; they can support modular implementations of several
crosscutting concerns like error detection, security, event handling, and debugging.’[5] The
service is used for receiving requests and sending them to the ESB (Enterprise Service Bus).
The transformer takes information from the Response and puts it into the DTO. The controller
makes the connection between Request and Response and calls the web service. The distributed
system for online crediting can be updated using Maven Update from the development menu.
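To make the roles of the DTO and the transformer concrete, the sketch below shows a minimal, hypothetical version of the two components: the DTO exposes response fields through getters and setters, and the transformer copies information from a raw response into it. All class and field names are invented for illustration; a real implementation would work with the classes generated from the WSDL:

```java
// Hypothetical sketch of the DTO and transformer roles described above.
// Class and field names (CreditDecisionDto, CreditResponse) are illustrative only.
public class CreditDecisionDemo {

    // DTO: carries response fields, exposed through getters and setters.
    static class CreditDecisionDto {
        private boolean approved;
        private double amount;
        public boolean isApproved() { return approved; }
        public void setApproved(boolean approved) { this.approved = approved; }
        public double getAmount() { return amount; }
        public void setAmount(double amount) { this.amount = amount; }
    }

    // Raw response as it might arrive from the web service.
    static class CreditResponse {
        final String decision;
        final double amount;
        CreditResponse(String decision, double amount) {
            this.decision = decision;
            this.amount = amount;
        }
    }

    // Transformer: copies information from the Response into the DTO.
    static CreditDecisionDto transform(CreditResponse response) {
        CreditDecisionDto dto = new CreditDecisionDto();
        dto.setApproved("APPROVED".equals(response.decision));
        dto.setAmount(response.amount);
        return dto;
    }

    public static void main(String[] args) {
        CreditDecisionDto dto = transform(new CreditResponse("APPROVED", 5000.0));
        // prints: approved=true, amount=5000.0
        System.out.println("approved=" + dto.isApproved() + ", amount=" + dto.getAmount());
    }
}
```

In the architecture described above, the controller would invoke the web service, pass the raw response through such a transformer, and hand the resulting DTO back to the caller.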
Figure 2. The architecture of the distributed decision system for online crediting
4. Demonstrative application
This demonstrative application presents the future efficiency of a distributed system for
online crediting implemented in the Java programming language.
Table 1. Demonstrative application with the employees of some companies that will request a
credit in the future
Name and firstname | Company | Prefers going to the bank to request a credit | Prefers a bank having the distributed system for online crediting | If the bank does not have the distributed system, chooses to change the banking institution or not
Selariu Ramona | Accenture | No | Yes | Chooses to change the banking institution
Tarbuc Constantin | Else Digital Solutions | No | Yes | Chooses to change the banking institution
Chirita Georgian Catalin | Vodafone | No | Yes | Chooses to change the banking institution
Irimia Alexandra | CSM Bucuresti | No | Yes | Chooses to change the banking institution
Popa Alexandru | Perfect Bite Dentistry | No | Yes | Chooses to change the banking institution
5. Conclusions
Considering the questionnaires that I applied, the distributed system is very useful for a
banking institution because it can win many clients in a short time. If the banking
institution has more and more clients who ask for credits, the profit obtained from interest
will increase significantly, and the bank will increase its turnover and be able to develop
further. ‘Electronic banking systems provide us with easy access to banking services.’ [3]
To buy this distributed system, the banking institution has to make a serious investment and
to replace the system that it
has at the respective time with this one. If more and more banks buy such a system in a
short period of time, clients will be able to obtain this convenience from more banking
institutions. Banking institutions that do not want to buy such a system will lose market
share to the institutions that have such an information system. The interface of the system
is easy to use for every employee of the banking institution. A training period for learning
to use this system is not necessary, because it has an easy-to-use and up-to-date interface.
Nowadays, technology improves, and more and more banking institutions make major investments
in their internal information systems. If these investments are not made, there is a great
risk that the systems the institutions have at that time will not be able to handle the
processes that an employee has to carry out in them, because they have an older
implementation. The amortization of the investment can be achieved in a very short time
because the number of clients will rise and the bank's profit will be bigger. In the future,
more questionnaires based on this information will be applied, and the advantages of this
system will be clearly seen.
References
[1] Alt, M., Gorlatch, S. (2005). Adapting Java RMI for grid computing, Future Generation
Computer Systems, Volume 21, Issue 5, pp. 699–707.
[2] Amorim, B.S.R, Sousa, J.L.R. (2014). Information System conceptualization drive of
unique business process through virtual organizations, Procedia Technology, 16, pp. 867
– 875.
[3] Claessens, J., Dem, V., De Cock, D. et al. (2002). On the Security of Today’s Online
Electronic Banking Systems, Computers & Security, Volume 21, Issue 3, pp. 253–265.
[4] Georgescu, M., Jefleab, V. (2015). The particularity of the banking information system,
Procedia Economics and Finance, 20, pp. 268 – 276.
[5] Leger, P., Tanter, E., Fukuda, H. (2015). An expressive stateful aspect language, Science
of Computer Programming, Volume 102, pp. 108–141.
[6] Leong, S.K., Srikanthan, T., Hura, G.S. (1998). An Internet application for on-line
banking, Computer Communications, Volume 20, Issue 16, pp. 1534–1540.
[7] Salibekyan, S., Panfilov, P. (2015). A New Approach for Distributed Computing in
Embedded Systems. Procedia Engineering, 100, pp. 977 – 986.
[8] Sene, M., Moreaux, P., Haddad, S. (2006). Performance evaluation of distributed
Database-A Banking system case study, A volume in IFAC Proceedings Volumes, pp.
351–356.
[9] Sklira, M., Pomportsis, A.S., Obaidat, M.S. (2003). A framework for the design of bank
communications systems, Computer Communications, Volume 26, Issue 15, pp. 1775–
1781.
[10] Stelmach, M., Kryza, B, Slota, R., Kitowski, J. (2011). Distributed Contract Negotiation
System for Virtual Organizations, Procedia Computer Science, 4, pp. 2206–2215.
[11] Wang, L. (2008). Implementation and performance evaluation of the parallel CORBA
application on computational grids, Advances in Engineering Software, Volume 39, Issue
3, pp. 211–218.
PERSONAL DATA VULNERABILITIES AND RISKS MODEL
Gheorghe Cosmin SILAGHI
Babeș-Bolyai University Cluj-Napoca
Abstract: Nowadays, huge concern has arisen about the capabilities of intelligence forces to
perform mass surveillance and about the extent of these capabilities. Intercepting personal
data is possible because, within society, a lot of sensitive data flows between various
systems. This paper draws a picture of the vulnerabilities and risks regarding personal data
and discusses issues regarding data interception and mining for legal purposes.
Keywords: mass surveillance, personal data, risks and vulnerabilities
JEL classification: Z18, K14, K36
1. Introduction
With the emergence of the Snowden affair, much concern has arisen about the capabilities of
the intelligence forces to perform mass surveillance and dataveillance and about the extent
of these activities. From the technological point of view, scientists and legal experts
acknowledge that the technical capabilities of existing hardware equipment and software
exceed the perceived limit of fair collection and usage of data for security purposes, from
a privacy point of view. We are now aware that national agencies like the National Security
Agency (NSA) of the US, the Australian Signals Directorate (ASD) [1], the Government
Communications Headquarters (GCHQ) [2] of the UK, European intelligence agencies and the
Communications Security Establishment Canada (CSEC) [3] are programmatically using
sophisticated tools to perform mass surveillance on all of us, intercepting Internet and
phone communications and breaching basic privacy principles, regardless of whether the target
is legitimately intercepted or not, in the hope of proactively responding to the various
security threats of our global world, such as terrorism. Are such behaviours legitimate,
acceptable and proportionate? How can such operations happen in a democratic world which
abides by the principles of human rights, including privacy? Smart surveillance of real-time
communication of all sorts is not possible without advanced data mining techniques allowing
for face and voice recognition and the identification of individuals. However, the extracted
information is useless unless correlated with personal data already stored in police and
administrative databases operated worldwide.
This paper presents a model concerning vulnerabilities and risks regarding personal data. The
model builds on the general picture concerning data usage by Law Enforcement Agencies
(LEA) and Security and Intelligence Services (SIS) and depicts the data flows where sensitive
information could be collected. Furthermore, we list the relevant databases containing personal
sensitive data for the usage of detection, prevention and prosecution of crimes and some tools
that can be used for Open Source Intelligence (OSINT) purposes.
2. Data vulnerabilities and risks model
This section develops a vulnerabilities and risks model for the tracking of users' sensitive
personal information worldwide. Nowadays, people communicate more and more, making
extensive use of novel technologies like broadband, while traditional communication
stagnates. The latest data [4] released by the International Telecommunication Union (ITU) –
the United Nations specialized agency for information and communication technologies – clearly
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
96
shows1 the advance of mobile telecommunications over fixed telephony and the increased
stake of mobile broadband. As people go mobile and onto the Internet, they leave traces of
their existence by revealing their personal data on various occasions, voluntarily and
involuntarily. Mass surveillance of communications and the Internet leads to the collection of
huge amounts of data, making it possible to identify people and their daily activity through
these traces.
Based on the above motivation, we argue for the usefulness of a high-level picture displaying
the important points where people leave traces of their personal data and daily activities.
Identifying the data sources and the data flows during communication helps us point out
the relevant vulnerabilities for personal data leakage. We emphasize that these vulnerabilities
can be exploited for various purposes, including lawful interception for crime detection and
prosecution – carried out by Law Enforcement Agencies (LEA) or Security and Intelligence
Services (SIS), profiling for private commercial usage and for governmental reasons, or simply
(organized) criminal activities, like stealing money, that directly affect the people identified by
those data.
First of all, we need to point out the relevant data which are vulnerable to various attacks. As
we can see in Figure 1, citizens' data, infrastructure data and corporate data are targets of mass
surveillance by LEAs and SISs.
Figure 1 - Relevant data under surveillance [5]
1 In 2013, at the world level, ITU reported 1.17 billion fixed-telephony subscriptions, compared with 6.83 billion
mobile-cellular subscriptions. From 2005 to 2013, mobile-cellular subscriptions more than tripled. In 2013 ITU
reported more than 2 billion mobile-broadband subscriptions, three times more than fixed (wired) broadband
subscriptions.
Our focus is on citizens' data, or other data which can reveal citizens' personal information.
On one side, part of citizens' data is produced for citizens' identification by the government –
an administrative purpose; on the other side, people communicate, and these data are different,
as they fall under individuals' right to privacy and confidentiality. Mass surveillance collects
especially this second sort of data and matches it against the personal data existing in
governmental databases.
In the next section, we enumerate the governmental databases specifically used for crime
detection by LEAs, as well as other databases that collect personal data for administrative
purposes.
Figure 2 presents how citizens’ data is produced and how it flows on various communication
environments, including Internet. This figure allows us to identify the vulnerable points where
personal data can be “stolen”.
Figure 2 - Personal and communication data flows
Users store their personal data on computing devices like laptops, PCs, tablets and
smartphones. With these devices they carry out their daily activity on the Internet, in the form
of emails, web searches, participation in social networking, and generating user content on
blogs, wikis, clouds or other applications. Tracing back a user's daily activity on the Internet,
one can reach the device used to post this activity and, given access to the device, one can
access the personal data.
Users communicate via phone calls. With the movement from analogue telephony towards
digital and with the emergence of novel mobile phone technologies like 3G and 4G, the
telephony network is becoming an IP network, and is thus vulnerable to the same risks as the
Internet.
Therefore, all communication lines are high-risk: both those between the user and the devices
he/she uses – like the (mobile) telephone – and those between the devices and
the remote applications cooperating with those devices. If this communication is intercepted,
all personal data, user-generated content and personal expression become vulnerable.
In general, we assume that the (Internet) applications we use perform the intended
functionality and nothing else. We trust the applications we use. However, many of them
collect data from us and use it for purposes – such as profiling – not specified in the
service-level agreement between us and the application or service provider. Therefore, the
applications we use represent another risk to our data: either their operators use the data for
purposes not known to the user, or they use the applications as backdoors into our devices.
Users perform their daily activities in public and private places where sensors are deployed –
like CCTV cameras. These sensors register user activity and behaviour and communicate it to
some central control room over the same communication lines. Thus, they represent another
source of information capable of revealing details important for person identification.
Besides their daily activity and behaviour, people need to register with governmental
structures, their identification data being stored in governmental administrative databases.
Information contained in governmental databases is the most reliable with respect to user
identification. Thus, if LEAs and SISs intend to exploit the data-related vulnerabilities of our
behaviour, they need to synchronize gathered intelligence with existing facts stored in the
police and administrative databases.
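As a minimal illustration of this kind of synchronization, the sketch below matches an intercepted identity against a toy administrative database. The records, field names and the exact-match rule are all hypothetical simplifications; real systems use far richer record-linkage techniques.

```python
import unicodedata

def normalize(s: str) -> str:
    """Lowercase, strip accents and collapse whitespace, so that
    'José  PÉREZ' and 'jose perez' compare equal."""
    s = unicodedata.normalize("NFKD", s)
    s = "".join(c for c in s if not unicodedata.combining(c))
    return " ".join(s.lower().split())

def match_against_database(intercepted, database):
    """Return database records whose (name, birth_date) pair matches
    the intercepted identity after normalization."""
    key = (normalize(intercepted["name"]), intercepted["birth_date"])
    return [rec for rec in database
            if (normalize(rec["name"]), rec["birth_date"]) == key]

# Toy administrative database (records invented for illustration).
db = [
    {"name": "Ion Popescu", "birth_date": "1980-05-01", "id": "RO-001"},
    {"name": "Maria Ionescu", "birth_date": "1975-11-23", "id": "RO-002"},
]
hits = match_against_database({"name": "ION  popescu", "birth_date": "1980-05-01"}, db)
print([h["id"] for h in hits])  # ['RO-001']
```

Even this toy example shows why administrative data matters to surveillance: the intercepted traces alone identify a pattern of behaviour, while the administrative record anchors it to a verified identity.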
We notice that the majority of communication passes over some networking infrastructure,
regardless of whether it is wired – like fibre optics – or satellite communication. Therefore,
investigating the networking infrastructure in detail is a must.
All networking communication abides by technical standards well known to technical people.
The vulnerabilities and risks of networking communication hold both for positive usage within
lawful interception and for misuse as part of criminal activities.
We also notice that mobile communication tends to converge on the same computer networking
standards. This implies that mobile communication and Internet traffic fall under the coverage
of the same networking standards, enhancing the effectiveness of both lawful interception
activities and, on the other side, of (organized) criminal activities.
We also notice that personal data is matched against the content of the administrative or police
databases enumerated in the next section.
Besides user-generated content on the Internet, a lot of information is produced and broadcast
by the media, and this information source represents a valuable asset for surveillance.
Automated systems intercept openly broadcast media data – like TV streams – and the
extracted data is matched against citizens' data, either that obtained from lawful interception
or that stored in databases. We cover the Open Source Intelligence concept in Section 4.
3. Databases and data storage systems
In this section, without being exhaustive, we enumerate several databases used by LEAs and
governments to store citizens' personal data for law enforcement or administrative purposes.
LEAs of different countries use their own custom databases and coordinate their content with
the general database of Interpol.
Interpol (ICPO) manages the nominal data database ICIS (International Criminal Information
System), used by 190 countries for the detection and prosecution of crimes in general. It
contains more than 153,000 records on known international criminals, missing persons and
dead bodies, with their names, birth dates, physical descriptions, photographs, criminal
histories, fingerprints, etc.
LEAs of different countries manage specific or specialised (police) databases, including
specific structured data (e.g. DNA, fingerprints, stolen passports) or data recorded for specific
types of crime (e.g. theft of works of art, terrorist activities). For example, many countries
implement AFIS (Automated Fingerprint Identification System) – an integrated computer
system with a database of millions of personal records used for the identification of
fingerprints. Many countries use national DNA databases, which often have match rates of
between 20% and 50% for linking a crime-scene profile to a previously stored person. The
INTERPOL DNA Database [6] introduces a data matching system with the capacity to connect
its international profile matching system to all member states using DNA profiling.
Participating countries actively use the DNA Gateway as a tool in their criminal investigations,
and it regularly detects potential links between DNA profiles submitted by member countries.
Other widely used databases cover stolen works of art, stolen motor vehicles, arms registers,
missing persons, stolen and lost travel documents, and stolen administrative documents.
Administrative databases held by public authorities are kept for specific administrative
purposes, not for police purposes. On a case-by-case basis, authorization can be given under
certain conditions to LEAs to access, directly or indirectly, topical data in the context of a
criminal investigation. These include the personal information file (in the form of a register of
residents or civil registry), passports, national IDs, drivers' licenses, visa information, the
asylum seekers' information systems (Eurodac), the Interagency Border Inspection System,
criminal records, the consular lookout and support system, the consolidated consular database, etc.
4. Open source intelligence
According to the US Department of the Army, open source intelligence (OSINT) is defined [7]
as the discipline that gathers intelligence from "publicly available information that is collected,
exploited and disseminated in a timely manner to an appropriate audience for the purpose of
addressing a specific intelligence and information requirement". We emphasize two essential
properties of OSINT: (i) it exploits publicly available information, in the sense that any data
produced for the general public in various environments and with various purposes falls under
the OSINT definition, and (ii) the source of information is open, in the sense that data can be
gathered openly, without any expectation of privacy. Open information sources include
electronic environments that can be interrogated automatically, as well as humans. We notice
that the above definition covers various environments, such as:
- media, including written media such as newspapers or magazines, and broadcast media like
radio or TV;
- the WWW, with all publicly available information on it, including websites, social
networking, wikis, blogs, video sharing, search engines, images, maps, geospatial information,
etc. In general, web-based user-generated content and the deep and dark web fall under this
category;
- professional and academic communities that spread their knowledge in workshops and
conferences, publishing reports, academic articles, books, etc.;
- official governmental sources, releasing various reports such as demographic reports,
official statistics, laws and regulations, official speeches, press conferences and debates,
budgets, and public information emerging from e-government systems;
- information that can be gathered by interrogating computer machines and other networked
elements available over the Internet, including publicly available sensors;
- geospatial information collected from satellites, GIS systems, maps, atlases, airspace and
maritime navigation data, and environmental data.
OSINT assumes that all information is collected from open sources, which precludes any
expectation of privacy. Those who use OSINT assume they do not have to abide by privacy
concerns, as all the data is available to everyone.
Countries like the USA or Great Britain justify their controversial mass surveillance programs,
like PRISM or Tempora, as fulfilling some of the goals of OSINT. Beyond such programs,
many simple or sophisticated OSINT tools are used in practice.
Several simple tools are based on open-source software or public services. We enumerate here
tools like the Wayback Machine, Who.is, Maltego, public translation services like Google
Translate, IP2Location, NewsNow, SocialMention, the Google Hacking Database, and the
Social Engineering Toolkit. Being based on open-source technologies, these and other tools
are equally useful to investigators and to criminals.
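As a small example of how such tools can be driven programmatically, the sketch below builds a query URL for the Wayback Machine's public availability API; fetching the resulting URL with any HTTP client returns JSON describing the closest archived snapshot of the page.

```python
from urllib.parse import urlencode

# Documented public endpoint of the Internet Archive availability API.
WAYBACK_API = "https://archive.org/wayback/available"

def wayback_query_url(page_url, timestamp=None):
    """Build a query URL for the Wayback Machine availability API.
    The optional timestamp (YYYYMMDD) asks for the snapshot closest
    to that date."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)

print(wayback_query_url("example.com", "20060101"))
# https://archive.org/wayback/available?url=example.com&timestamp=20060101
```

Automating lookups like this one – against archives, DNS, geolocation or social media – is precisely what turns open sources into a systematic intelligence pipeline.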
Besides these open-source software tools, there is a massive market of companies that build
integrated and expensive software for LEA and SIS usage, especially for OSINT purposes. We
enumerate here companies like Sail Labs Technology, GlimmerGlass, Verint, Trovicor, SS8,
Mobiliaris, Ipoque, and others2. Such companies gather at huge exhibitions to present their
developments to LEAs and SISs, where most presentations are held behind closed doors, on
an invitation-only basis.
5. Conclusion
This paper has presented a vulnerabilities and risks model concerning sensitive personal data.
Given the current extent of dataveillance and mass surveillance programs operated by countries
all over the world, a controversial question arises concerning the legitimacy of these programs
and whether they violate human rights, including the right to privacy.
We have drawn attention to the weak points where sensitive personal data are vulnerable and
listed several tools used by law enforcement agencies and intelligence services for data
surveillance. The scientific community in IT needs to be aware that the technologies it develops
are used by both sides of the game: by governmental investigators and by criminals.
Acknowledgement
We acknowledge support from the European Union’s Seventh Framework Programme for
research, technological development and demonstration under FP7 RESPECT project, grant
agreement no. 285285.
References
[1] Tim Leslie and Mark Corcoran, Explained: Australia's involvement with the NSA, the US
spy agency at the heart of a global scandal, ABC News, 19 Nov 2013,
http://www.abc.net.au/news/2013-11-08/australian-nsa-involvement-explained/5079786
[2] Julian Borger, GCHQ and European spy agencies worked together on mass surveillance,
The Guardian, 1 Nov 2013, http://www.theguardian.com/uk-news/2013/nov/01/gchq-
europe-spy-agencies-mass-surveillance-snowden
[3] Greg Weston, Glenn Greenwald, Ryan Gallagher, Snowden document shows Canada set
up spy posts for NSA, CBC News, 9 Dec 2013, http://www.cbc.ca/news/politics/snowden-
document-shows-canada-set-up-spy-posts-for-nsa-1.2456886
2 The enumeration is not exhaustive and our intention is not to advertise those companies.
[4] International Telecommunication Union, World Telecommunication / ICT Indicators
database 2013, key 2006-2013 data, available at http://www.itu.int/en/ITU-
D/Statistics/Pages/stat/default.aspx
[5] Joseph A. Cannataci, Defying the logic, forgetting the facts: the new European proposal for
data protection in the police sector, European Journal of Law and Technology, vol. 4(2),
2013, http://ejlt.org/article/view/284/390
[6] http://www.interpol.int/INTERPOL-expertise/Forensics/DNA
[7] Headquarters Department of Army, Open-source intelligence, ATP 2-22.9, July 2012,
available at http://www.fas.org/irp/doddir/army/atp2-22-9.pdf
A BUSINESS MODEL FOR THE INTERACTION BETWEEN ACTORS
OF CULTURAL ECONOMY
Cristian CIUREA
Bucharest University of Economic Studies, Bucharest, Romania
[email protected]
Florin Gheorghe FILIP
Romanian Academy- INCE & BAR, Bucharest, Romania
Abstract. The paper proposes a business model for optimizing the efficiency of the interaction
between all actors involved in the cultural heritage sector, such as galleries, libraries, archives
and museums (GLAM). The implementation of virtual exhibitions on mobile devices is
described and analyzed as a key factor for increasing the visibility of cultural heritage. New
perspectives on the development of virtual exhibitions for mobile devices are considered.
Keywords: cultural heritage, virtual exhibitions, business model, mobile applications.
JEL classification: Z1, O1
1. Introduction
We are living in the era of the internet and mobile technologies, where information is accessible
at a touch or click and, most importantly, access is possible anytime and from anywhere. Mobile
technologies have evolved at an exponential rate in recent years, and this evolution will
continue. Rapid changes in mobile devices' hardware and software have made it possible to
replace old computers and laptops with smartphones and tablets for internet surfing, email
checking and so on [1].
Young people represent the majority of mobile device users, having grown up with these
gadgets and being familiar with their use. A big issue with young people's anytime-anywhere
access to information is that they want to discover everything online: they have no time to go
to libraries to read classic books, and they are not willing to visit museums and art galleries
to explore exhibitions of paintings, sculptures and so on. Thus, their knowledge of culture,
literature and history is not as rich as schools and universities expect.
In order to aid the young generation and to increase their own visibility and number of visitors,
libraries and museums have decided to present their exhibitions and collections online. The
idea of implementing virtual exhibitions on mobile devices, as native mobile applications, has
been embraced by many museums and libraries that already have digitized material available
online. A virtual exhibition goes beyond digitizing a collection, which is primarily meant for
preservation [2].
The potential offered by mobile technologies and the increased number of mobile devices
processing multimedia content, both offline and online, facilitate the implementation of virtual
exhibitions on mobile devices [3]. In [4] and [5], a cybernetic model for the computerization
of cultural heritage is proposed in order to study the relationship between the organizations
involved.
2. Cultural economy
The idea of cultural economy originated in the 90s, when De Michelis said that “Europe's
cultural heritage is a strategic resource similar to oil for Arab countries”. Through the
digitization process of multimedia collections from libraries, museums and other cultural
institutions, the knowledge and development of national and European cultural heritage
elements are ensured [4].
The cultural economy is a combination of technological developments, human factor, new
business models and geopolitical and economic evolutions [6], as shown in Figure 1.
Figure 1. Components of Cultural Economy
When discussing the cultural economy, we must consider the academic world's perspective
on democratizing access to knowledge and on preserving the original objects of cultural
heritage.
At the same time, heritage and culture in a globalized world are exposed to a wide range of
demands of consumption and communication [7].
The evolution of information and communication technologies must ensure not only better
knowledge and preservation of cultural heritage items, but also an increased number of direct
visitors to cultural institutions. Software applications for creating virtual exhibitions, in
addition to being used as tools to prepare and build content for virtual visits, may help the
staff of a museum or library conceive and create exhibitions [8].
Technology has offered museums and libraries the means to create more vivid and attractive
presentations, communicating their message to visitors more effectively [9].
3. The proposed business model
The model proposed for explaining the interaction between actors of the cultural heritage
sector aims to highlight the direct relation between investment in digitization and visibility of
cultural heritage exhibits on one side, and the number of visitors and revenues of cultural
institutions on the other [10]. The model is intended to increase the visibility and
attractiveness of the different exhibits and collections stored in libraries and museums by
making them available online, with the support of mobile applications implementing virtual
exhibitions [11], [12].
The objective of implementing virtual exhibitions on mobile devices is not to replace physical
exhibitions, but to bring art consumers closer to what they love, to make collections accessible
at any time and from any place, to keep visitors informed about new exhibitions and events in
the field, and to whet people's appetite for knowledge [13].
Here is the simplified business model:

x1(k) = Σi=1..w di(k) + Σi=1..w vi(k) + x0(k)

y1(k) = Σi=1..z mi(k) + Σi=1..z hi(k) + y0(k)
where:
k – the current month;
x1 – the number of estimated physical visitors of cultural institutions;
x0 – the number of existing physical visitors of cultural institutions (who visit them
repeatedly);
di – the increase of the number of visitors resulting from investments in digitization;
vi – the increase of the number of visitors resulting from investments in visibility
(implementation of virtual exhibitions for mobile devices);
y1 – the estimated revenue of cultural institutions;
y0 – the current revenue of cultural institutions;
mi – the revenue resulting from investments in visibility (development of virtual
exhibitions for mobile devices);
hi – the revenue resulting from data reuse (digitized collections reuse);
w – the number of exhibits selected for digitization;
z – the number of digitized exhibits selected for reuse.
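With the definitions above, the two equations can be evaluated directly. The sketch below implements them; the monthly figures are invented purely for illustration and are not data from the paper.

```python
def estimate_visitors(d, v, x0):
    """x1(k) = sum(d_i) + sum(v_i) + x0: estimated physical visitors for
    month k, from digitization gains d_i, visibility gains v_i and the
    existing repeat visitors x0."""
    return sum(d) + sum(v) + x0

def estimate_revenue(m, h, y0):
    """y1(k) = sum(m_i) + sum(h_i) + y0: estimated revenue, from
    visibility-investment revenue m_i, data-reuse revenue h_i and the
    current revenue y0."""
    return sum(m) + sum(h) + y0

# Illustrative month with w = 3 exhibits selected for digitization and
# z = 2 digitized exhibits selected for reuse (all numbers invented).
x1 = estimate_visitors(d=[120, 80, 50], v=[200, 150, 90], x0=1000)
y1 = estimate_revenue(m=[500.0, 300.0], h=[250.0, 100.0], y0=10000.0)
print(x1, y1)  # 1690 11150.0
```

The model is linear by construction: every digitized or reused exhibit contributes additively to next month's visitor and revenue estimates.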
Figure 2 below presents the effect of implementing virtual exhibitions for mobile devices on
the number of visitors and the revenues of galleries, libraries, archives and museums. The
figure shows a simplified diagram of the proposed business model, in which users of mobile
applications, such as virtual exhibitions, come to visit cultural institutions as a result of
exploring the virtual exhibition on their mobile device.
As a precondition of the proposed business model, we consider that data should also be
reusable for other virtual exhibitions or in other online forms. This gives rise to the concept of
a permanent universal cultural depot, which is totally distributed, infinite in size, accessible
anytime and anywhere, and which implements the vision of the British novelist H. G. Wells [14].
Figure 2. Influence of business model on cultural institutions indicators
The business model impacts several other domains as well, such as education (increased
quality and attractiveness), tourism (diversification of the offer of services), e-commerce with
digital content, and the consumer goods industry.
4. A mobile application for virtual exhibitions
We consider a mobile application implementing a virtual exhibition accessible on mobile
devices, such as tablets and smartphones running the Android© operating system. The mobile
application is designed to allow the reuse of its digital content to implement other virtual
exhibitions as well. All digital content, such as images, movies, sound and text descriptions,
is stored on a server and not locally on the mobile devices. This feature allows the developer
to change the content of the virtual exhibition without requiring the user to update or reinstall
the mobile application.
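This server-driven approach can be sketched as follows: the client downloads a content manifest and renders whatever it describes. The manifest format and field names here are hypothetical, not the actual format used by the application.

```python
import json

# A hypothetical exhibition manifest, as the content server might
# deliver it; updating this file server-side changes the exhibition
# for every installed client, with no app update needed.
MANIFEST = """
{
  "exhibition": "Historical documents",
  "exhibits": [
    {"title": "Manuscript 1", "image": "https://example.org/m1.jpg",
     "description": "19th-century letter"},
    {"title": "Manuscript 2", "image": "https://example.org/m2.jpg",
     "description": "First edition page"}
  ]
}
"""

def load_exhibits(manifest_text):
    """Parse the manifest text downloaded from the server and return
    the exhibition title and its list of exhibits."""
    data = json.loads(manifest_text)
    return data["exhibition"], data["exhibits"]

title, exhibits = load_exhibits(MANIFEST)
print(title, len(exhibits))  # Historical documents 2
```

The design choice to keep content server-side trades offline availability for centralized curation, which matches the paper's goal of reusing the same digitized material across several virtual exhibitions.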
The mobile application created was meant to present a virtual exhibition with historical
documents from the Romanian Academy Library (BAR). Some screen captures from the
mobile application are displayed in Figure 3.
The mobile application has the following objectives:
- attract new online visitors, who will be transformed into physical visitors of cultural
institutions;
- present the most important pieces of the collections extracted from real exhibitions;
- increase the visibility of the collections shown in virtual exhibitions;
- estimate users' behavior, in order to create categories of visitors and to discover users'
preferences in terms of the exhibits visited.
Providing educational and cultural information through virtual exhibitions on mobile devices
is not enough to arouse the interest of young people in culture, as the collections need to be
presented to the public in an attractive manner.
Figure 3. Mobile application for virtual exhibition implementation
The authors intend to implement the proposed business model in the mobile-based solution by
making the mobile application available to a great number of museums and libraries in
Romania, in order to collect and analyze information about the evolution of the indicators.
We must agree that virtual exhibitions are extraordinarily difficult to design and develop,
mainly because of the number of various stakeholders involved in the process [15].
The good news is that several platforms are already available for creating multilingual virtual
exhibitions. A good example is the MOVIO tool [16], [17], which is largely utilized as a de
facto standard in the AthenaPlus project (www.athenaplus.eu).
5. Conclusions
The implementation of mobile applications for virtual exhibitions aims to show rare and
valuable collections, and it will see significant development in the coming years, considering
the impact such applications have in the educational and cultural fields.
In the coming period, the evolution of mobile technologies will be significant in terms of device
capabilities, operating systems, mobile applications and number of users. Taking this into
consideration, it is crucial for any cultural institution to develop at least one mobile application
to increase the visibility and attractiveness of cultural heritage.
Future work on the proposed topic will include research on the evolution of the number of
visitors of cultural institutions resulting from investments in digitization and in the
implementation of mobile applications for virtual exhibitions, as well as on the relations
between actors of the cultural economy.
Acknowledgment
This paper is supported by the Sectorial Operational Programme Human Resources
Development (SOP HRD), financed from the European Social Fund and by the Romanian
Government under contract number SOP HRD/159/1.5/S/136077, and by the AthenaPlus
project (CIP Framework Programme 2007-2013, Grant agreement no. 325098).
References
[1] P. Clarke, “Tablets: Will They Replace PCs?”, Nemertes Research, 2013, Available at:
http://i.crn.com/custom/INTELBCCSITENEW/WhitePaper_Tablets_ReplacePCs.pdf
[2] G. Dumitrescu, F. G. Filip, A. Ioniţă, C. Lepădatu, “Open Source Eminescu’s Manuscripts:
A Digitization Experiment,” Studies in Informatics and Control, Vol. 19, No. 1, pp. 79-84.
[3] F. G. Filip, C. Ciurea, H. Dragomirescu, I. Ivan, “Cultural Heritage and Modern Information
and Communication Technologies,” Technological and Economic Development of
Economy, Vol. 21, Issue 3, 2015.
[4] F. G. Filip, “Information Technologies in Cultural Institutions,” Studies in Informatics and
Control, Vol. 6, No. 4, 1996, pp. 385-400.
[5] F. G. Filip, D. A. Donciulescu, C. I. Filip, “A Cybernetic Model of Computerization of the
Cultural Heritage,” Computer Science Journal of Moldova, Vol. 9, No. 2(26), 2001, pp.
101-112.
[6] F. G. Filip, “Catre o economie a culturii si o infrastructura informationala intelectuala,”
ACADEMICA, 12 (132), pp. 12-13.
[7] A. Alzua-Sorzabal, M. T. Linaza, M. Abad, L. Arretxea, A. Susperregui, “Interface
Evaluation for Cultural Heritage Applications: the case of FERRUM exhibition,” The 6th
International Symposium on Virtual Reality, Archaeology and Cultural Heritage (VAST
2005), The Eurographics Association, 2005.
[8] J. Gomes, M. B. Carmo, A. P. Cláudio, “Creating and Assembling Virtual Exhibitions from
Existing X3D Models,” Docs.DI, 2011.
[9] G. Lepouras, A. Katifori, C. Vassilakis, D. Charitos, “Real exhibitions in a Virtual
Museum,” Virtual Reality Journal, Springer-Verlag, Vol 7, No 2, 2003, pp. 120-128.
[10] C. Ciurea, C. Tudorache, “New Perspectives on the Development of Virtual Exhibitions
for Mobile Devices,” Economy Informatics, Vol. 14, No. 1/2014, pp. 31-38.
[11] C. Ciurea, C. Coseriu, C. Tudorache, “Implementing Mobile Applications for Virtual
Exhibitions using Augmented Reality,” Journal of Mobile, Embedded and Distributed
Systems, Vol. 6, No. 3, 2014, pp. 96-100.
[12] C. Ciurea, A. Zamfiroiu, A. Grosu, “Implementing Mobile Virtual Exhibition to Increase
Cultural Heritage Visibility,” Informatica Economică, Vol. 18, No. 2/2014, pp. 24-31.
[13] C. Ciurea, G. Dumitrescu, C. Lepadatu, “The Impact Analysis of Implementing Virtual
Exhibitions for Mobile Devices on the Access to National Cultural Heritage,” Proceedings
of 2nd International Conference ‘Economic Scientific Research - Theoretical, Empirical
and Practical Approaches’, ESPERA 2014, 13-14 November 2014, Bucharest, Romania.
[14] H. G. Wells, World Brain, Methuen & Co., London, 1938, Available at:
https://ebooks.adelaide.edu.au/w/wells/hg/world_brain/
[15] S. Foo, Y. L. Theng, H.L.D. Goh, J. C. Na, “From Digital Archives to Virtual Exhibitions,”
Handbook of Research on Digital Libraries: Design, Development and Impact, IGI Global,
Hershey, PA, pp. 88-101.
[16] S. H. Minelli, M. T. Natale, B. Dierickx, P. Ongaro, D. Ugoletti, R. Saccoccio, M. Aguilar
Santiago, “MOVIO: A semantic content management and valorisation approach for
archives and cultural institutions,” Girona 2014: Arxius i Indústries Culturals, 2014,
Available at: http://www.girona.cat/web/ica2014/ponents/textos/id234.pdf
[17] M. T. Natale, S. H. Minelli, B. Dierickx, P. Ongaro, M. Piccininno, D. Ugoletti, R.
Saccoccio, A. Raggioli, “Exhibiting Intangible Cultural Heritage using MOVIO: a
multilingual toolkit for creating curated digital exhibitions, made available by the
AthenaPlus project,” ICOM 2014 - Access and Understanding – Networking in the Digital
Era: intangible Cultural Heritage, 2014, Available at:
http://www.cidoc2014.de/images/sampledata/cidoc/papers/H-2_Natale_Minelli_et-
al_paper.pdf
NEAR FIELD COMMUNICATION - THE STATE OF KNOWLEDGE
AND USE OF NFC TECHNOLOGIES AND APPLICATIONS BY USERS
OF SMART MOBILE DEVICES
Mihaela Filofteia TUTUNEA
Babeș-Bolyai University of Cluj-Napoca, Faculty of Business
Abstract. The evolution of mobile technologies over the past decade has established NFC as the
new technology for proximity communication, while the launch of NFC-enabled smart devices
has opened new opportunities for conducting daily activities using contactless technologies.
The development of these technologies has caused dedicated applications and software
solutions to grow in number to the point where they now cover more and more areas of daily
activity. From this perspective, the present paper is divided into two parts: the first provides an
overview of NFC technologies and their applications, while the second presents a study of the
state of knowledge and use of NFC technologies and applications by users of smart mobile
devices. The findings provide useful information for producers of smart mobile devices, for
developers of mobile applications dedicated to new technologies and intelligent devices and,
obviously, for their users.
Keywords: NFC, mobile applications, smart mobile devices
JEL classification: L86, M15
1. Introduction and literature review
The continuing development of mobile technologies, applications and related software
solutions has created new trends in consumer orientation toward smart mobile devices.
In recent years we have all witnessed the fast development of these devices, generation after
generation, and their permanent enrichment with new communication functions and features.
In these evolutionary trends we can also include the use of NFC (Near Field Communication)
technologies, considered some of the newest facilities in proximity communication.
Based on these considerations, the present paper is divided into two parts: the first presents a
general perspective on NFC technologies and the main related applications, while the second
focuses on the study of the state of knowledge and use of NFC technologies and applications
by users of smart mobile devices.
Given that the study can be considered singular in the landscape of NFC technology use, the
findings are also very useful for understanding the NFC ecosystem, defined as „different
applications (marketing, payments, identity, access, transit, peer-to-peer, posters, gaming,
product labels) and different end markets (e.g., consumer, medical, enterprise) – especially
beyond payment” [23].
a. NFC – general aspects
According to the NFC Forum, which was founded in 2004 by Nokia, Philips and Sony, Near
Field Communication (NFC) is „a standards-based short-range wireless connectivity
technology” [16]. Over time, the NFC Forum has continued its mission of promoting the use
of NFC technology and at present includes 190 members, among them equipment and electronic and
mobile device manufacturers, software solution developers, financial service suppliers, etc.
According to the NFC Forum, NFC technology allows bidirectional communication and
contactless transactions between electronic devices, thereby expanding the capabilities of
wireless technologies [17].
NFC is defined as „a radio device, on a frequency of 13.56 MHz that can establish
communication between two objects up to 20 cm apart. The data exchange speed can reach a
maximum of 424 kbit/s and the time needed for establishing the connection is less than 1/10
seconds” [20]. Since NFC is viewed as an extension of RFID (Radio Frequency Identification),
communication and data exchange involve, in this case, an initiator and a target, just as with
RFID, with the specification that NFC devices are often programmable (e.g. mobile phones)
[11, p 14].
The use of NFC involves both active and passive communication, whereby the devices
involved in a communication process fulfill the role of initiator or recipient/target.
In active communication, both devices (initiator and target) have their own power sources and
alternately generate the RF signal for data transmission (e.g. mobile phone, NFC reader). In
passive communication, the initiator device (reader) generates an RF field that also serves as
the power source for the target device, which thus becomes a transponder (e.g. tag) [8, p 1449].
In the case of NFC communication, a mobile device user interacts with a smart device/object:
„an NFC-enabled mobile phone, an NFC reader (allows data transfer with another NFC
object/device, e.g. a contactless point of sale (POS) terminal), an NFC tag (an RFID tag without
its own power source)” [5, p 13].
The data exchange between NFC devices/objects has specific formats. NDEF (NFC Data
Exchange Format) provides the most important difference between RFID and NFC, in the sense
that each NDEF message includes multiple records with formats recognized by NFC devices.
The most common types of NDEF records are: “simple text records (text, metadata, encoding
scheme), URI (network addresses), smart posters (specific information for a poster),
signatures” [11, p 15].
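To make the NDEF structure more concrete, here is a minimal JavaScript sketch (our own illustration based on the NFC Forum NDEF record layout, not code from the paper) that encodes a single short-record NDEF text record: a header byte with the message-begin, message-end, short-record and TNF flags, followed by the type length, payload length, type, and payload:

```javascript
// Encode one short-record NDEF Text record (well-known type "T").
// Header byte flags: MB (0x80), ME (0x40), SR (0x10), TNF well-known (0x01).
function encodeTextRecord(text, lang = "en") {
  const enc = new TextEncoder();
  const langBytes = enc.encode(lang);
  const textBytes = enc.encode(text);
  // Payload = status byte (UTF-8 flag 0 + language code length) + language + text
  const payload = new Uint8Array(1 + langBytes.length + textBytes.length);
  payload[0] = langBytes.length & 0x3f;
  payload.set(langBytes, 1);
  payload.set(textBytes, 1 + langBytes.length);

  const record = new Uint8Array(4 + payload.length);
  record[0] = 0x80 | 0x40 | 0x10 | 0x01; // MB | ME | SR | TNF = well-known
  record[1] = 1;                         // type length ("T" is one byte)
  record[2] = payload.length;            // payload length (short record: 1 byte)
  record[3] = 0x54;                      // type field: "T"
  record.set(payload, 4);
  return record;
}
```

A record produced this way could be written to a tag by any NDEF-capable stack; the multi-record messages mentioned above simply concatenate such records, clearing MB/ME on the inner ones.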
The NFC Forum has defined four types of NFC tags that specify the communication between
NFC devices and NFC tags and the way NDEF messages are defined (Table 1).
Table 1 – NFC Tag types – Operation Specification (Source: [22])
Depending on the type of interaction between the mobile device user and a smart object, NFC
technologies operate in three ways: reader/writer, peer-to-peer and card emulation. Each
operating mode has specific communication interfaces “(ISO/IEC 14443, FeliCa, NFCIP-1) on
the RF layer, having different technical, operational, and design requirements” [5, p 13] (Table
2).
Tag type            | Type 1 Tag               | Type 2 Tag               | Type 3 Tag (FeliCa)              | Type 4 Tag
Standard            | ISO 14443A               | ISO 14443A               | Japanese Industrial Standard     | ISO 14443A and ISO 14443B
                    |                          |                          | (JIS) X6319-4                    |
R/W capability      | read and re-write        | read and re-write        | read and re-write, or read-only  | read and re-write, or read-only
Memory availability | 96 bytes to 2 kilobytes  | 48 bytes to 2 kilobytes  | up to 1 megabyte per service     | up to 32 kilobytes per service
Data speed          | 106 kbit/s               | 106 kbit/s               | 212 kbit/s or 424 kbit/s         | 212 kbit/s or 424 kbit/s
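As a small illustration (our own sketch, with field names chosen for readability), the specification data from Table 1 can be kept as a lookup structure in application code and queried, for example, for tag types that support a given minimum data rate:

```javascript
// NFC Forum tag type operation parameters, as summarized in Table 1.
const NFC_TAG_TYPES = {
  1: { standard: "ISO 14443A", memory: "96 bytes to 2 KB", speed: "106 kbit/s" },
  2: { standard: "ISO 14443A", memory: "48 bytes to 2 KB", speed: "106 kbit/s" },
  3: { standard: "JIS X6319-4 (FeliCa)", memory: "up to 1 MB per service", speed: "212/424 kbit/s" },
  4: { standard: "ISO 14443A/B", memory: "up to 32 KB per service", speed: "212/424 kbit/s" },
};

// Return the tag type numbers whose base data rate is at least `kbits` kbit/s.
// parseInt reads the leading number of the speed string (e.g. 212 from "212/424 kbit/s").
function tagTypesWithSpeedAtLeast(kbits) {
  return Object.entries(NFC_TAG_TYPES)
    .filter(([, t]) => parseInt(t.speed, 10) >= kbits)
    .map(([type]) => Number(type));
}
```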
Table 2 – NFC operating modes (Source: [5, p 15-18])
Reader/Writer:  Read request (MU → SO); Data transfer (SO → MU); Write request (MU → SO);
                Acknowledgment (SO → MU); Processing within device; Additional service usage (MU → SP)
Peer-to-Peer:   Data request/transfer (MU ↔ MU); Additional service usage (MU → SP)
Card Emulation: Service request (MU → SP); Background services (SP → TPS);
                Service usage + data (optional) (SP → MU)
(Mobile user – MU; Smart object (NFC tags) – SO; Service provider – SP; Third-Party Services – TPS)
From a statistical perspective, IHS Technology estimates that by 2018, 64% of the mobile
phones shipped worldwide will be NFC-enabled, up from only 18.2% in 2013; and while in
2013 Android phones represented 93% of all shipped NFC phones, by 2018 Android will
represent only 75% of the NFC phone market [18].
The main benefits of using NFC technology are „ease of use, versatility and security”, while
the main disadvantages usually refer to „compatibility, costs, security” [15].
Regarding security in NFC technologies, the following are the most common attacks:
“eavesdropping, man-in-the-middle-attack, denial of service, phishing, relay attack, data
modification” [24].
b. Application Fields for NFC technology
Eight categories were identified for classifying NFC applications: „healthcare services, smart
environment services, mobile payment, ticketing and loyalty services, entertainment services,
social network services, educational services, location based services, work force and retail
management services” [4].
Healthcare Services
The applicability of NFC technologies in medicine covers different solutions, devices and
monitoring systems for various ailments, including use in a home-based regime.
Thus, healthcare companies have created numerous biometric devices using NFC technologies,
such as Ergonomidesign's „Minimum”, a biometric device that monitors different vital
parameters using an NFC cloud, and Impak Health's „RhythmTrack”, which monitors patients'
daily activity and sleep.
A new trend of developing NFC wearable devices has established itself on the market; Fitbit
is just one of the companies that offer NFC applications for monitoring intelligent bracelets [9].
In 2013, Harvard Medical School developed an NFC medication tracking system, „a bedside
system” that uses Google Nexus 7 NFC devices [10]. NFC applications can also be found in
passive systems that use NFC tags to monitor alerts such as medical appointments, meetings,
events, etc. [21].
Security problems for mobile health monitoring applications refer mainly to ”denial of
services, phishing and lost property” [8, p 1450].
Proximity Payments
ITU defines proximity payments as „the main payment method for B2C transactions that
implies the use of a mobile phone as equivalent to a credit or debit card (mobile wallet) or
(POS)”. The best known and most used „mobile wallets” are: Google Wallet on NFC-enabled
devices; cloud-based payment platforms using the NFC contactless card emulation service
from YESpay International Limited and YES-wallet.com; Apple's Passbook for iOS 6;
China UnionPay, the largest provider of NFC mobile payments in China; and NTT DoCoMo,
which dominates the market for NFC mobile payments in Japan [12].
Marketing
NFC technologies are associated with a new interaction model called the „touch paradigm”,
according to which „the user is offered context-aware services in a smart environment”. The
authors propose a model that integrates NFC and QR codes with smart environments, one that
allows the management ”of all kinds of mobile coupons” [3]. Some of the areas in which „the
NFC tap and touch marketing” can be applied are: “discount store products, public
transportation and tourism, payment methods, NFC social media, NFC smart posters, Smart
home NFC settings, car NFC settings, office NFC settings, Smart City settings-NFC utility
meters” [14].
Access, authorization and monitoring
In some universities, for example in Spain, NFC technologies are used in professor and student
identification applications. These technologies are integrated with NFC smart cards, but they
can be extended to other smart objects like cards, bracelets, cell phones, etc. [19]. The
Mobile-Campus application is another example of using NFC technologies, in this case for
providing students with information related to the university campus [2]. Other universities
have implemented autonomous systems that monitor students' attendance using NFC
technologies [13].
Military
NFC has ”a vital role in the army, particularly in the area of cordless communication where
short range communication is essential” [7].
Tourism
A number of possible NFC applications were identified also in tourism: “information supply,
access and authorization (hotels, museums, events), mobile ticketing, loyalty management,
bonus and membership cards” [6]. Arcese et al. emphasize that “NFC technology can be
regarded as one of the recent ICT developments that has a huge potential for travel and
tourism” [1].
2. Smart device users – a study of the state of knowledge and use of NFC technologies and applications
The aim of this study was to identify the state of knowledge regarding NFC technologies and
the state of use of related applications, starting from users' NFC-enabled
infrastructure. It was based on the use of primary and secondary information sources, the
former resulting from the administration of a questionnaire and the latter from the collection
of statistical data and research studies from the online and offline environments. The
questionnaire had a modular design and comprised 12 closed-ended questions. It was created
in a mobile and web-based format and was structured as follows:
- the first module – identification of respondents' socio-demographic information such as age,
sex, education, country of residence;
- the second module – identification of respondents' mobile infrastructure, such as the type
of mobile devices used [phone, smartphone, tablet, wearables (smartwatch, bracelet, etc.)],
the mobile operating system and the telephone/mobile device model;
- the last module, focused on NFC – determination of respondents' awareness level regarding
the possession of an NFC-enabled device, the level of use of NFC communication facilities
and applications, and the type and frequency of use of NFC applications.
The study was conducted in the time span February - December 2014 on a random sample. The
population under investigation counted 823 subjects, of which 601 formed the final sample.
The questionnaire was administered online, on social networks; by the end of the study period
567 valid questionnaires had been collected, and these constitute the subject of the present analysis.
3. Results and discussions
The analysis of the data collected through the first question module allowed sketching the
respondents' socio-demographic profile: most users are males (58.6%) aged between 25 and
34 years (29.12%) [the other age categories, in descending order, were: 35-44 (26.10%), 18-24
(22.05%), 45-54 (18.05%), 55-64 (4.23%), over 65 (0.45%)], holding an academic degree
(41.32%) [the other educational categories indicated were: college (29.57%), high school
(17.06%), graduate (6.98%)]. According to the declared country of residence, 57.98% of the
respondents were from Romania and the remaining 42.02% came from 29 countries around
the world, the best and least represented being Spain (8.01%) and China (0.98%), respectively;
for Italy, UK, France, USA, Germany, Greece and the Republic of Moldova the representation
was between 6 and 8% (Figure 1).
Figure 1 - Respondents by country of residence
The second module of questions revealed the following regarding respondents’ mobile
infrastructure: 19.24% of the respondents have three mobile devices, while 46.12% have only
two; 78.65% have smartphones, 47.22% tablets and 28.11% wearable gadgets; the predominant
operating system is Android (40.22%), followed by iOS (33.31%), Windows Phone (11.01%),
BlackBerry (9.03%) and others (< 3.56%).
The analysis of the last question module allowed:
- identifying respondents' awareness level regarding possession of an NFC-enabled device,
which was tested by two questions, one referring to the model of the owned mobile device
(here the respondents were given a list of NFC-enabled mobile devices) and the other to
whether the mobile device was NFC-enabled or not. The respondents were given further
details for each question so that they fully understood it and could provide an informed
answer. The results were contradictory in the sense that, although 36.05% of the respondents
selected NFC-enabled mobile device models, only 10.78% of them actually knew that;
64.89% did not know what to answer and 24% said their device was not NFC-enabled. Of
the Romanian respondents, only 3.17% said they knew they had an NFC-enabled device;
- identifying the level of use of NFC communication facilities and applications – of the
respondents aware of possessing NFC-enabled devices, 65.47% reported using NFC
facilities, 21.77% said they never used them and 12.76% said they had tried them out of
curiosity; none of the Romanian respondents indicated using NFC applications;
- determining the types of NFC applications used – the highest percentage of respondents
declared themselves non-users of NFC applications (32.89%); the applications used by over
20% of the respondents are health and appointment monitoring (24.12%), data sharing and
collaboration (23.76%) and mobile printing and scanning (20.12%); the only application
category used by over 10% of the respondents was touristic information and guidance
(10.34%); the other application categories were used by less than 10% of the respondents
(Figure 2).
Figure 2 - Used NFC applications
- identifying the frequency of using NFC applications – this brings to the foreground the
segment of those who use these applications rarely (29.43%), followed by non-users
(24.12%); permanent users and those using NFC applications only for job-related purposes
represent 23.67% and 22.78%, respectively; a smaller percentage (9.34%) declared
themselves only occasional users of NFC applications (Figure 3).
Figure 3 - Frequency of using NFC applications
4. Conclusions
In a world where words like digital, online and mobile have become intrinsic to everyday life,
and manufacturers of mobile devices constantly create real trends in the consumption of such
products (from those designed for common use to gadgets and wearables), the possibility
emerges of a trend toward synchronization between users' capacity to absorb new technologies
and the latter's evolution and launch speed. If mobile users were able to familiarize themselves
with the permanent use of Bluetooth and Wi-Fi, it is now high time that NFC technologies be
assimilated. Although NFC and NFC-enabled devices have been present on the market for
some years now, current and potential users are little aware, if at all, of their existence and the
benefits they bring.
From this perspective, the present study aimed to identify aspects related to users’ awareness
of NFC-enabled mobile devices and the facilities offered by these technologies, as well as the
level of using related applications.
The findings revealed a faint user profile regarding NFC facilities and applications: only
10.78% of the respondents were aware of possessing NFC-enabled mobile devices, while
65.47% of the sample said they used NFC facilities, health monitoring being the most common;
however, most respondents reported using NFC applications very rarely.
An important aspect brought out by the study is that there is a delay in users' adaptation to
smart mobile devices equipped with new communication facilities and technologies (NFC, in
our case) beyond their common features. It is possible that users perceive the necessity of
having a smart device only as a matter of gadget fashion and not as a need or desire to use new
features and applications routinely.
Therefore, the author believes that the results of the present research may provide useful
information for manufacturers of both smart mobile devices and mobile applications for new
technologies and smart devices. Last but not least, they can guide smart device users in the
buying process and help them benefit from the enhanced facilities of the purchased items.
References
[1] G. Arcese, G. Campagna, S. Flammini, O. Martucci, “Near Field Communication:
Technology and Market Trends”, Technologies 2014, 2, pp 143-163, p. 154, available:
http://www.mdpi.com/2227-7080/2/3/143
[2] Bhattacharya Sagnik, Panbu, M. B., Design and Development of Mobile Campus, an
Android based Mobile Application for University Campus Tour Guide, International
Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075,
2013, Volume-2, Issue-3, pp. 25-29, available:
http://www.ijitee.org/attachments/File/v2i3/C0405022313.pdf
[3] F. Borrego-Jaraba, P. G. Castro, C. G. García, L. I. Ruiz, A. M. Gómez-Nieto, “A
Ubiquitous NFC Solution for the Development of Tailored Marketing Strategies Based on
Discount Vouchers and Loyalty Cards”, Sensors 2013, 13, pp 6334-5354, p 6334, 6335,
available: http://www.mdpi.com/1424-8220/13/5/6334
[4] O. Busra, A. Mohammed, O. Kerem and C. Vedat, “Classification of NFC Applications in
Diverse Service Domains”, International Journal of Computer and Communication
Engineering, 2013, Vol. 2, No. 5, pp: 614, available: http://www.ijcce.org/papers/260-
F00028.pdf
[5] V. Coskun, K. Okay, B. Ozdenizci, “NFC Professional Development Application For
Android”, 2013, John Wiley & Sons, Ltd., pp 1-283, p 13,14, 15-18, available: http://it-
ebooks.info/book/2218/
[6] R. Egger, “The impact of near field communication on tourism, Journal of Hospitality and
Tourism Technology”, 2013, Vol. 4, Issue: 2, pp 119-133, p. 122, available:
http://www.emeraldinsight.com/action/doSearch?AllField=egger&SeriesKey=jhtt
[7] G. Gopichand, T. Chaitanya, Krishna, R. Ravi Kumar, “Near Field Communication and Its
Applications in Various Fields”, International Journal of Engineering Trends and
Technology (IJETT), 2013, Volume4, Issue4, pp 1305-1309, p 1307, available:
http://www.ijettjournal.org/volume-4/issue-4/IJETT-V4I4P359.pdf
[8] P. S. Halgaonkar, N. S. Daga, V. M. Wadhai, “Survey on Near Field Communication in
Healthcare”, International Journal of Science and Research (IJSR), 2014, Volume 3 Issue
12, p 1449-1450, available http://www.ijsr.net/archive/v3i12/U1VCMTQ3NTk=.pdf
[9] HCL, Near Field Communication in Medical Services, Whitepaper, April 2013, [Online],
http://www.hcltech.com/sites/default/files/near_field_communication_in_medical_device
s.pdf, pp. 9, [September 6, 2014]
[10] R. Boden, Harvard Medical School develops NFC medication tracking system, April
2013, http://www.nfcworld.com/2013/04/04/323325/harvard-medical-school-develops-
nfc-medication-tracking-system/, [December 9, 2014]
[11] T. Igoe, D. Coleman, B. Jepson, Beginning NFC Near Field Communication with Arduino,
Android, and PhoneGap, O’Reilly Media, Inc., 2014, pp: 1-232, p:14,15, available:
http://it-ebooks.info/book/3199/
[12] ITU-T Technology Watch Report May 2013 The Mobile Money Revolution. Part 1:
NFC Mobile Payments, available: http://www.itu.int/dms_pub/itu-
t/oth/23/01/T23010000200001PDFE.pdf
[13] L. Kumari, N. M. Midhila, E. Blessy, B. Karthik, “Automatic Wireless Attendance
Recording and Management Using Near Field Communication (NFC)”, International
Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 2013,
Volume 2, Issue 4, p 1642-1645
[14] D. Mitrea, “Near Field Communication - From Touch To Tap Marketing Empirical
Studies”, SEA - Practical Application of Science, 2014, Volume II, Issue 2 (4), pp 623-630,
p 625-626, available: http://www.sea.bxb.ro/Article/Sea_4_72.Pdf
[15] R. N. Nagashree, R. Vibha, N. Aswini, “Near Field Communication”, I.J. Wireless and
Microwave Technologies, 2014, 2, pp 20-30, available:
http://www.mecs-press.org/ijwmt/ijwmt-v4-n2/IJWMT-V4-N2-3.pdf
[16] NFC Forum 1, What Is NFC?, http://nfc-forum.org/what-is-nfc/ [January, 21, 2015]
[17] NFC Forum 2, Our Mission & Goals, http://nfc-forum.org/about-us/the-nfc-forum/
[January 21, 2015]
[18] S. Clark, Two in three phones to come with NFC in 2018, NFC world,
http://www.nfcworld.com/2014/02/12/327790/two-three-phones-come-nfc-2018/, [June,
6, 2014]
[19] D. Palma, J. E. Agudo, H. Sánchez, M. Macías, “An Internet of Things Example:
Classrooms Access Control over Near Field Communication”, Sensors 2014, 14, pp 6998-
7012, p 6999, 7003, available: http://www.mdpi.com/1424-8220/14/4/6998
[20] D. Popescu, M. Georgescu, “Internet Of Things - Some Ethical Issues, The USV Annals
Of Economics And Public Administration”, Volume 13, Issue 2(18), 2013, pp 208-214, p:
210, available:
http://www.seap.usv.ro/annals/arhiva/USVAEPA_VOL.13,ISSUE_2%2818%29,2013_fu
lltext.pdf
[21] M. Quratulain, F. Munazza, R. Iqbal, A. Awais, “NFC Tags-Based Notification System
for Medical Appointments”, International Journal of Automation and Smart Technology ,
2014, pp. 191-195
[22] C. Sathya, M. Usharani, “A Survey of Technologies to Enable Security in Near-Field
Communication Tag Design”, International Journal of Innovative Research in Advanced
Engineering (IJIRAE) ISSN: 2349-2163, 2014, Volume 1, Issue 10, pp 412-415
[23] Smart Card Alliance Mobile & NFC Council Webinar, NFC Application Ecosystems:
Introduction, Peer-to-Peer, NFC Tags/Posters and Product Label Applications, 2012, pp. 4,
http://www.smartcardalliance.org/resources/webinars/nfc_app_ecosystem/20120927_NF
C_Application_Ecosystems.pdf [December, 12, 2014]
[24] P. Suthar, N. Pandya, “Near Field Communication - An Enhanced Approach Towards
Contactless Services”, International Journal of Advanced Research in Computer Science,
2013, Volume 4, No. 2, pp 139-142
HOW TO VISUALIZE ONTOLOGIES. A STUDY FROM AN END
USER’S POINT OF VIEW
Bogdan IANCU
The Bucharest University of Economic Studies [email protected]
Abstract. This working paper aims to present a new approach for visualizing ontologies' data,
dedicated especially to non-technical end users. The graphical representation formats widely
available for semantic web data are in general dedicated to experienced users that have the
necessary Web 3.0 background. They use notations and rules that can be hard for final users
to understand, notions that are somewhat unnecessary for them. In this short study we propose
a new use for the already existent open source JavaScript-based chart libraries. Even if they
were designed for other usages, like an alternative to bar charts (the case of the bubble chart)
or for the first Web 2.0 sites (the case of tag clouds), they can be easily adapted to represent
complex ontologies. The first part of this paper presents the main known graphical
representations used by semantic web experts. The ways in which we can adapt other charts to
display ontologies, together with some graphical examples, are presented in the final part of
the study.
Keywords: charts, JavaScript, ontologies, Web 3.0
JEL classification: C55, C88, Y10
1. Introduction
The current generation of the semantic web uses formats like RDF, RDFS or OWL for
internally representing the structure, the constraints or the individuals of an ontology. Even
though these formats are widely used by experts in the research field, they are not well known
by non-technical and sometimes even by technical users. But there are cases when websites
that use ontologies want to display them directly to the final user. In this case the already known
graphical representations of ontologies are not very useful, because they were designed by
experts for experts and make use of notations and rules that are not just hard to understand, but
also unnecessary for a user of the website.
The first part of this paper presents the existing graphical representations of ontologies, with
their pros and cons. We will see where they stop being useful for end users and start being
annoying instead, by displaying unnecessary or hard-to-understand information.
The second part proposes ways of using JavaScript-based chart types dedicated to other kinds
of representations as a base that can be adapted to represent ontologies in a more user-friendly
way.
There are a lot of open source JavaScript-based chart libraries on the web, like Chart.js [8],
D3.js [9], amCharts [10] or Google Charts [11]. They offer graphical representations like the
bubble chart, the flower chart or the chord diagram that can be easily adapted to display an
ontology's individuals and relations.
This whole study is in fact research conducted in order to find a solution for representing a
media description ontology. To be more explicit, this ontology contains data regarding the
knowledge that can be extracted from eLearning videos posted on YouTube. The challenge
consists in how to display the videos that contain the information the user is interested in,
in such a way that he can understand how related the videos are to his interests. A simple list
wouldn't be enough because it is unidimensional and can order the videos based on just a single
criterion. We want a way to display how much needed information the videos contain, together
with how the videos are related to each other. In other words, we want to build an Augmented
Intelligence system similar to the one presented in [4].
2. Known graphical representations of ontologies
The first method used to represent ontologies, and the simplest one, which was available even
prior to W3C's standard formats, was topic maps. Topic maps were originally developed in the
late 1990s as a way to represent back-of-the-book index structures so that multiple indexes
from different sources could be merged. However, the developers quickly realized that with a
little additional generalization, they could create a meta-model with potentially far wider
application. The ISO standard is formally known as ISO/IEC 13250:2003 [5].
If at the beginning the topic map graphical representations were really simple to understand,
after they were adopted for ontologies their structure changed. The current graphical form
(Figure 1) is called Topic Maps Martian Notation (TMMN) and uses symbols like "blob",
"label", "line", "dotted line" and "arrow" to represent the relationships and basic elements of
the Topic Maps model, namely Topics, Names, Associations and Roles, Scope, and
Occurrences (including Subject Identifiers and Subject Locators).
These symbols and elements aren't needed by the final user of an ontology-based application.
Such a user just wants to see how search results or other similar elements link to each other,
or at least a simple hierarchy between them. The end user is not interested in scopes, in how
relations or entities are called, or in other specific elements.
Figure 1. The Topic Maps Martian Notation
Another method widely used for representing ontologies is called OntoGraf. It is the main
graphical representation in the Protégé editor and offers support for interactively navigating
the relationships of the created OWL ontologies. It also incorporates additional features such
as focus on home, grid alphabet, radial, spring, tree vertical & horizontal hierarchies, zoom-in,
zoom-out, no-zoom, node type, arc type and search (contains, starts with, ends with, exact
match, reg exp) [2]. As we can see in Figure 2, this form of graphical representation is
unfortunately as hard (or even harder) for a non-technical user to understand as topic maps,
because it includes specific notations and legends.
Figure 2. The OntoGraf representation of a Traffic Signs ontology
So the problem to solve is how to represent ontologies in such a way that even the least
technical user could understand them. A solution is using already known graphical
representations that can be adapted to semantic web formats. The way in which we can adapt
them is presented in the next part.
3. Adapting JavaScript based charts for ontology representation
We are in the era of mobile devices and technologies, heading toward the Internet of Things
(IoT), the network of physical objects or "things" embedded with electronics, software, sensors
and connectivity that enable them to achieve greater value and service by exchanging data with
the manufacturer, operator and/or other connected devices [7].
The majority of sites now have dedicated mobile applications or at least a responsive design.
Responsive web design is an approach to web design aimed at crafting sites to provide an
optimal viewing experience: easy reading and navigation with a minimum of resizing, panning,
and scrolling, across a wide range of devices (from desktop computer monitors to mobile
phones) [6]. This means that the new ways of representing ontologies for the end user should
be not only easy to understand, but also adapted to different resolutions and computing powers.
JavaScript is a programming language that can be used on virtually any kind of device because it is
supported natively by every major browser. Together with HTML5 it has the advantage of being a
lightweight solution for displaying different kinds of charts. There are many responsive chart
libraries, such as those mentioned in the introduction: Chart.js, D3.js, amCharts, Google
Charts, etc. From these open-source libraries we choose four types of graphs that can
easily be adapted to represent semantic data: the bubble chart, the tag cloud, the flower chart
and the chord diagram.
The bubble chart (Figure 3) is generally used as an alternative to bar charts. Instead of drawing
bars whose heights display the values scaled to the chart size, rounded shapes are drawn with
sizes that reflect the values. But the bubble chart has an advantage the bar chart
does not: it can also display how the values are related to each other. By grouping bubbles into
clusters we can show that they belong to some category or that a relation
exists between them.
Figure 3. Different types of JavaScript based bubble charts
We can use this chart for ontology representation by adapting some of its features.
First of all, an ontology does not necessarily contain numeric values, so before adapting
the bubble chart we must decide what the bubbles will represent.
Ontologies consist, at their lowest level, of triples (subject-predicate-object). These triples are
considered the fundamental building blocks of semantic representations. The subject in a triple
corresponds to an entity, a "thing" for which we have a conceptual class; predicates are
properties of the entity to which they are attached; and objects fall into two classes: entities that
can be the subject of other triples, and literal values such as strings or numbers [3].
Thus, in our case the bubbles will be entities, and the value that determines each bubble's
size will be the number of relations (properties) that the entity has:

$V_i = \frac{N_s^i + N_o^i}{\max_j \left( N_s^j + N_o^j \right)} \cdot M$ (1)

Equation (1) states that the size of bubble $i$ is the number of triples in which entity $i$
appears as subject ($N_s^i$), plus the number of triples in which it appears as object ($N_o^i$),
divided by the maximum of this sum over the entire ontology, and multiplied by the maximum
physical size of a bubble, $M$.
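As a sketch, equation (1) can be computed directly from a triple list. The triples below are hypothetical placeholders, not taken from any particular ontology:

```javascript
// Sketch of equation (1): an entity's bubble size is its number of triple
// appearances (as subject or object), normalised by the maximum appearance
// count in the ontology, times the maximum physical bubble size M.
// The triples are hypothetical placeholders.
const triples = [
  { s: "StopSign",  p: "rdf:type", o: "RegulatorySign" },
  { s: "StopSign",  p: "hasShape", o: "Octagon" },
  { s: "YieldSign", p: "rdf:type", o: "RegulatorySign" },
];

function bubbleSizes(triples, maxSize) {
  const counts = {};
  for (const t of triples) {
    counts[t.s] = (counts[t.s] || 0) + 1; // N_s: appearances as subject
    counts[t.o] = (counts[t.o] || 0) + 1; // N_o: appearances as object
  }
  const max = Math.max(...Object.values(counts));
  const sizes = {};
  for (const entity of Object.keys(counts)) {
    sizes[entity] = (counts[entity] / max) * maxSize; // equation (1)
  }
  return sizes;
}
```

With M = 50, StopSign and RegulatorySign (two triple appearances each) get size 50, while Octagon and YieldSign get 25.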
In this way the most important entities in our ontology will be bigger and the less important
ones smaller. Furthermore, we can draw linked entities in the same color to show that
they are somehow related. Entities that are linked to more than one other entity could be colored
with all the related colors, or with the dominant one (the color of the entity to which it has the
most relations). This provides a simple visualization of the ontology for any
kind of user: one does not have to know what a triple is or what all those arcs represent.
The chart is self-explanatory: the bigger, the more important; the smaller, the less
important; same color, somehow related.
A similar approach is used for the tag cloud, but here, instead of bubbles, we display
the label object of each entity. The label is written in a rectangle-shaped box whose size
depends on the number of the entity's appearances in triples (Figure 4).
Figure 4. A tag cloud that uses the terms’ numbers of appearances in text
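The same normalisation can drive the tag cloud. A minimal sketch, with illustrative size bounds, of scaling each label's box from its appearance count:

```javascript
// Sketch: scale a tag-cloud label's font size linearly between minPx and
// maxPx according to the entity's triple-appearance count. The bounds and
// counts are illustrative, not taken from any particular library.
function labelFontSize(count, maxCount, minPx, maxPx) {
  return minPx + (count / maxCount) * (maxPx - minPx);
}
```

An entity with the maximum appearance count gets the largest box, and every other label scales down proportionally.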
The flower chart (Figure 5) is somewhat similar to the ontology graphical
representations presented earlier. But, besides being based on HTML5 and JavaScript, it has an
advantage the others lack: it can also be displayed in 3D. The user can interact
with the chart and rotate it on any axis, getting the full benefit of the representation.
Figure 5. Different types of JavaScript based flower charts
Because this kind of chart is very similar to topic maps, it is not hard to adapt it
to represent semantic data. For the first chart presented above (Figure 5, left side),
the only changes needed are to represent just the most important property of each
entity by a line and to grow the bubbles based on the total number of triples in which they
appear, as we did for the bubble chart.
Obtaining the graphic on the right side of Figure 5 is even simpler: we display
all entities as dots and all properties as lines, without any legend or notation. In practice,
though, the first representation is, in our view, more expressive.
The chord diagram (Figure 6) allows us to represent a large amount of data in a single chart,
which makes it an ideal resource for large ontologies too. It is typically used in situations
where different types of data are related to each other, just as in ontologies, so little
needs to change in order to display semantic data in this format. We represent
all entities as labels (using the corresponding rdfs:label object) around a circle, and all
properties as chords that link the entities appearing together in triples. The result has great
impact for the end user, who can easily see how things are related just by hovering the mouse
over the entities of interest.
Figure 6. Different types of JavaScript based chord diagrams
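To feed a chord layout such as d3.chord, the triples must first be turned into a square adjacency matrix. The sketch below, with hypothetical entities, increments one cell per entity pair:

```javascript
// Sketch: turn triples into the square matrix that a chord layout
// (e.g. d3.chord) consumes. Entities become matrix indices; each triple
// linking two entities increments the corresponding cells. The triples
// are hypothetical placeholders.
const triples = [
  { s: "StopSign",  p: "hasShape",  o: "Octagon" },
  { s: "YieldSign", p: "hasShape",  o: "Triangle" },
  { s: "StopSign",  p: "similarTo", o: "YieldSign" },
];

function chordMatrix(triples) {
  const entities = [...new Set(triples.flatMap((t) => [t.s, t.o]))];
  const index = new Map(entities.map((e, i) => [e, i]));
  const n = entities.length;
  const matrix = Array.from({ length: n }, () => new Array(n).fill(0));
  for (const t of triples) {
    const i = index.get(t.s);
    const j = index.get(t.o);
    matrix[i][j] += 1; // one chord per triple linking subject and object
    matrix[j][i] += 1; // symmetric, so the ribbon is drawn both ways
  }
  return { entities, matrix };
}
```

The `entities` array supplies the labels around the circle, in the same order as the matrix rows.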
4. Conclusions
The graphical representations of ontologies widely used in research are not a good way
of presenting semantic data to non-technical end users. These kinds of
representations usually rely on notations and symbols known only to people in the
Semantic Web field. To address this problem, this paper presented ways to
use classical chart formats, available as responsive JavaScript/HTML5 libraries,
for displaying ontology data. We saw that, with some adaptations, the meaningful data from
any ontology can be shown to the final user in an easily understandable way.
Although there is other research on this theme, such as [1], it tends to focus on finding
new ways of displaying ontologies for experienced users rather than for non-technical ones. Future
work includes the full development of the media description ontology mentioned in the
introduction and the adaptation of one of the presented charts to display search results.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] I. C. S. da Silva and C. M. Dal Sasso Freitas, "Using visualization for exploring
relationships between concepts in ontologies," in Information Visualisation (IV), 2011 15th
International Conference on. IEEE, 2011, pp. 317-322.
[2] V. Swaminathan and R. Sivakumar, "A Comparative Study of Recent Ontology
Visualization Tools with a Case of Diabetes Data," International Journal of Research in
Computer Science [Online], vol. 2, no. 3, pp. 31-36, April 2012. Available:
http://www.ijorcs.org/uploads/archive/Vol2_Issue3_06.pdf
[3] T. Segaran, C. Evans and J. Taylor, Programming the Semantic Web. Sebastopol: O’Reilly
Media, 2009, pp 19-22.
[4] Big Data and the Rise of Augmented Intelligence: Sean Gourley at TEDxAuckland,
Internet: http://tedxtalks.ted.com/video/Big-Data-and-the-Rise-of-Augmen, December 7,
2012, [March 10, 2015]
[5] Topic Maps – Wikipedia, the free encyclopedia, Internet:
http://en.wikipedia.org/wiki/Topic_Maps, November 23, 2014, [March 12, 2015]
[6] Responsive web design – Wikipedia, the free encyclopedia, Internet:
http://en.wikipedia.org/wiki/Responsive_web_design, March 9, 2015, [March 12, 2015]
[7] Internet of Things – Wikipedia, the free encyclopedia, Internet:
http://en.wikipedia.org/wiki/Internet_of_Things, March 11, 2015, [March 12, 2015]
[8] Chart.js | Open source HTML5 Charts for your website. Internet: http://www.chartjs.org/,
March 10, 2015 [March 10, 2015]
[9] D3.js – Data-Driven Documents, Internet: http://d3js.org/, March 10, 2015 [March 10,
2015]
[10] JavaScript Charts and Maps | amCharts, Internet: http://www.amcharts.com/, March 10,
2015 [March 10, 2015]
[11] Google Charts – Google Developers, Internet, https://developers.google.com/chart/,
February 25, 2015 [March 10, 2015]
STUDY ON STUDENTS' MOBILE LEARNING ACCEPTANCE
Daniel MICAN
Babeş-Bolyai University, Business Information Systems Department
Nicolae TOMAI
Babeş-Bolyai University, Business Information Systems Department
Abstract. Mobile learning has become very popular recently, due to the spectacular
development of the technology embedded in mobile devices. This study captures the
perceptions of bachelor and master students from FSEGA, Babeș-Bolyai University, regarding
the use of mobile devices. As a first step we ran a prospective study, using a questionnaire
as the main instrument; 40 respondents participated.
After analyzing the results we observed that reading lecture materials tops the list of
preferences, followed by listening to lectures and watching presentations. Receiving lecture
materials and marks also plays an important role, with the majority of the
respondents confirming these activities. Moreover, we observed that the vast majority of the
respondents consider that software and mobile learning applications improve the overall success
of a lecture and increase the quality of the e-learning process. Finally, to fulfill all
the lecture objectives, it is suggested that traditional learning methods be combined
with mobile learning ones.
Keywords: mobile devices, mobile learning, mobile learning activities, mobile technology
JEL classification: Z19
1. Introduction
Mobile learning refers to delivering the educational process to students through
mobile devices. Students can access the educational content anytime, from any location,
via the Internet. Due to the high penetration of mobile devices, a series of learning
paradigms intended for remote use have been designed. Scientists have therefore
developed an increased interest in the usage of mobile devices and users' perceptions
of learning activities [3], [5], [6], [7]. First, we review the
directions of mobile device research in the literature; second, we present the results obtained
in the framework of this study. We also indicate the particularities resulting from
analyzing the data gathered from the respondents with regard to both preferred mobile
device usage and preferred mobile learning activities. We end this paper by presenting
the conclusions and directions for future work.
2. Related work
The literature contains a series of studies that focus on the use of mobile
devices in the learning process. These studies examine the desired
functionalities, perceptions and user attitudes with regard to the usage of new devices in
mobile learning. Stockwell [3] conducted a study with a group of 75 students
from Waseda University, Tokyo, Japan, analyzing the preference for using
either a mobile device or a desktop computer in the language learning process. The users'
preferences were captured using questionnaires and server logs. The study reveals
that the students show a generally positive attitude towards using
mobile devices in the language learning process, even though they are aware of some of the
devices' limitations. More than two-thirds of the subjects taking part in the experiment
showed an increased interest in using mobile phones in the language learning process, over
both short and long periods of time. In his study [6], Wang investigated the relevant
determinants of mobile learning acceptance based on the Unified Theory of Acceptance and
Use of Technology (UTAUT). One goal of Wang's study was to determine potential
differences caused by gender or age. The results obtained from 330 Taiwanese subjects
indicated that the relevant factors in mobile learning adoption were performance expectancy,
effort expectancy, social influence, perceived playfulness and self-management of learning.
The study also found that age differences moderate the effects of both effort expectancy
and social influence, while gender differences moderate the effects of social influence
and self-management of learning.
An extensive study of mobile learning is presented in [4]. That study maps the state of the
technology, taking into consideration papers published in journals, technical reports and
research projects. Learning requires at least two types of subjects: the person to be
trained and the content itself [2]. A person can draw on three important types of learning:
auditory, visual and tactile. New technology offers innovative learning methods that can be
used together with traditional ones. The increasing spread of Internet-connected devices,
especially mobile devices, has improved the quality and flexibility of learning, thus
providing a new learning paradigm: mobile learning [1]. Based on the system developed and
implemented at Shanghai Jiaotong University, Wang [5] showed that mobile learning
activities can involve students much more deeply in the learning process: the students
changed their status from passive learners to learners involved not only emotionally but also
intellectually in the learning tasks. A global view of the mobile learning field belongs to
Wu [7], who undertook a literature review offering a comprehensive summary of
164 studies from 2003 to 2010. The main findings reveal that most studies focused on
efficiency, on designing learning systems, or on experiments and surveys.
3. Case study
The goal of this prospective study was to test the trends in the field of mobile learning. In order
to demonstrate functional links among the presented elements, a larger sample
would be required, which would allow the use of econometric models. After collecting and
processing the data, we obtained the series of results illustrated in the following. In terms of
owned mobile devices, we obtained the results illustrated in Figure 1. Smart phones and laptops
are the most popular mobile devices, owned by 75% and 57% of the respondents, respectively.
The big losers are classic mobile phones, owned by only 25% of the respondents; they are
on the purchasing preference list of only 5% of the students taking part in this study.
Even though none of the respondents owns an eReader, eReaders are a purchase preference for
12.5% of the respondents. The winner on the mobile device market remains the
smart phone: even though it already leads the possession category, it also continues
to lead the respondents' purchase preferences, with 37% of them wanting to purchase one.
The smart phone is followed, somewhat surprisingly, by the tablet PC, preferred by 35%
of the respondents and owned by 30% of them. In terms of Internet connectivity, the vast
majority of owned mobile devices are connected to the Internet. For laptops,
the percentage of devices connected to the Internet is higher than the percentage of owned devices
because some respondents use a laptop to connect to the Internet without owning
one, generally at work. Ultra-books do not seem to have gained a significant market share,
due to their very high prices; people still prefer laptops precisely because of their
lower prices.
Figure 1. Owned mobile devices, desired and connected to the Internet
Wi-Fi is used by 97.5% of the respondents and is preferred as the primary method of connecting
to the Internet by 72.5% of them. Mobile networks have gained ground lately and are used
to connect to the Internet by 60% of the respondents, with a significant part, namely
27.5%, preferring them. These results are shown in Figure 2.
Figure 2. Connecting mobile devices to the Internet via Wi-Fi and mobile networks
With regard to the time the subjects spend using mobile devices, we obtained the following
results: the vast majority of the respondents, namely 95%, use mobile devices daily. Of these,
50% use mobile devices between 1 and 3 hours a day, and 37.5% use them more than 4 hours
daily. This is illustrated in Figure 3.
Figure 3. The time spent using mobile devices
Regarding the most common locations where people go online, we obtained the following
results: the vast majority of the respondents, namely 72.5%, connect to the Internet from home. The
rest of the respondents connect from school, in transit, and at work, with almost equal
percentages: 32.5%, 35%, and 32.5%. The distribution can be seen in
Figure 4.
Figure 4. The most common locations to go online
In terms of using mobile learning and its impact on e-learning, we came to the following
conclusions: 85% of the respondents estimate that software and mobile learning applications
improve the learning process. Asked whether mobile learning increases the quality of the
e-learning process, 77.5% of the respondents tip the balance towards yes. Conversely, the
respondents tend to be more skeptical about meeting course objectives through mobile learning
alone: 20% tend to reject this hypothesis, 12.5% are not sure about it, and 67.5% tend to confirm it.
These results are illustrated in Figure 5.
On the basis of these answers we conclude that the vast majority of the
respondents consider that software and mobile learning applications improve the overall learning
process and increase the quality of e-learning. Moreover, for fulfilling the objectives of
the course and of the learning process, it is preferable to use mobile learning in conjunction
with traditional learning methods.
Figure 5. Influence of mobile learning in the learning activities
In order to outline the feedback regarding mobile learning activities, we collected and analyzed
the answers relating to: receiving marks, taking exams and quizzes, playing educational
games, watching presentations, listening to lectures, reading course materials, voice recording
and receiving course materials. The results show that the respondents have a clear
preference for reading lecture/course materials: 67.50% completely agree
and 22.50% somewhat agree with reading course materials on mobile
devices. This activity is followed by listening to lectures and watching presentations, with
which 90% and 82.5% of the respondents, respectively, agree.
Figure 6. Feedback on mobile learning favorite activities
Receiving marks and lecture materials also continues to be preferred by a large number of the
respondents. At the end of the list the respondents placed voice recording,
taking exams and quizzes, and playing educational games. Thus, 42.5% of the
respondents seem not to be sure about voice recording. On the other hand, surprisingly,
15% of the respondents are not sure about playing educational games, and 32.5% do not agree
with this activity.
4. Conclusions
In this paper we conducted a prospective study to test the trends in the field of mobile learning.
After collecting, processing and analyzing the data, we can make the following statements:
smart phones and laptops are the most popular owned mobile devices. These are in the
possession of 75% and 57% of the respondents, respectively, and lead the purchasing
preferences together with the tablet PC. The big losers are classic mobile phones, which no
longer appear on the purchasing preference list. The respondents use at least one mobile device
connected to the Internet, and 95% of them use mobile devices daily.
With respect to the use of mobile learning and its impact on e-learning, we observed that the
majority of the respondents consider that software and mobile learning applications improve
the overall learning process within a lecture or course and increase the quality of
e-learning. Moreover, we reached the conclusion that, in order to fulfill all the
course objectives, it is preferable to combine traditional learning methods with mobile
learning methods. Furthermore, regarding mobile learning activities, we obtained the
following results: reading course materials tops the preference list, with
67.50% of the respondents completely agreeing and 22.50% somewhat agreeing to read course
materials on mobile devices. This activity is followed by listening to lectures and watching
presentations. Receiving marks and course materials also continues to be preferred by a large
number of the respondents. At the end of the list the respondents placed voice recording,
taking exams and quizzes, and playing educational games: 42.5% of
the respondents seem not to be sure about voice recording; surprisingly,
15% are not sure about playing educational games, and 32.5% do not agree
with this activity. In future studies, we aim to increase the sample size, which would allow the
use of econometric models and the establishment of certain links and dependencies.
References
[1] J. Donner, "Research approaches to mobile use in the developing world: A review of the
literature," The Information Society, vol. 24, no. 3, pp. 140-159, 2008.
[2] J. L. Moore, C. Dickson-Deane, K. Galyen, "e-Learning, online learning, and distance
learning environments: Are they the same?," The Internet and Higher Education, vol. 14,
no. 2, pp. 129-135, 2011.
[3] G. Stockwell, "Investigating learner preparedness for and usage patterns of mobile
learning," ReCALL, vol. 20, no. 3, pp. 253-270, 2008.
[4] O. R. Pereira and J. J. Rodrigues, "Survey and analysis of current mobile learning
applications and technologies," ACM Computing Surveys (CSUR), vol. 46, no. 2, article
27, 35 pages, 2013.
[5] M. Wang, R. Shen, D. Novak, X. Pan, "The impact of mobile learning on students' learning
behaviours and performance: Report from a large blended classroom," British Journal of
Educational Technology, vol. 40, no. 4, pp. 673-695, 2009.
[6] Y. S. Wang, M. C. Wu, H. Y. Wang, "Investigating the determinants and age and gender
differences in the acceptance of mobile learning," British Journal of Educational
Technology, vol. 40, no. 1, pp. 92-118, 2009.
[7] W. H. Wu, Y. C. J. Wu, C. Y. Chen, H. Y. Kao, C. H. Lin, S. H. Huang, "Review of trends
from mobile learning studies: A meta-analysis," Computers & Education, vol. 59, no. 2,
pp. 817-827, 2012.
CONDUCTING PROCESS MODEL UNDERSTANDING
CONTROLLED EXPERIMENTS USING EYE-TRACKING: AN
EXPERIENCE REPORT
Razvan PETRUSEL
Babeș-Bolyai University of Cluj-Napoca, Romania [email protected]
Cristian BOLOGA
Babeș-Bolyai University of Cluj-Napoca, Romania [email protected]
Abstract. The research introduced in this paper is placed in the area of Business Process
Management (BPM). BPM is a major concern for industry and research, with process models
at its core. The paper approaches controlled experiments that employ eye-tracking as the
observation method for researching process model understanding. Previous research in the
field was conducted using only indirect observation methods. Eye-tracking has been employed in
several related fields, from which we distill the major points of interest related to experimental
design (e.g. metrics, tasks, participant selection, etc.). The main contribution of the paper is a
set of guidelines based on the state-of-the-art review and our own experience with eye-tracking
based controlled experiments.
Keywords: business process model understanding, eye-tracking empirical research, eye-tracking controlled experiment.
JEL classification: C80, C90, L23.
1. Introduction
Business process management (BPM) is a mature area for both industry and research. In
industry, all major business software vendors (SAP, Oracle, IBM) sell process-oriented suites.
There is also a wealth of implementations from smaller companies supporting the major stages
of BPM projects (e.g. modeling, operational support, reporting and controlling). This push
by software developers is driven by the fact that most medium and large companies are migrating
to process-oriented architectures. Research on BPM spreads across all continents [1], [2]; its
venues are high-profile journals and dedicated tracks in several major information systems
conferences.
At the core of BPM is the business process model (or process model for short). There is a
standard notation for drawing such models (BPMN), as well as a few others widely known in
industry (EPC, UML Activity Diagrams) or in research (Petri Nets). No matter how the process
model is graphically depicted, the underlying assumption is that a human reader will be able to
understand it. So far, there is a body of knowledge on how process model understanding can
be improved. However, it all rests on research performed using indirect methods (i.e.
interviews, questionnaires and controlled experiments). This paper discusses how process
model understanding can be researched by applying direct observation methods. Eye-tracking
is such a direct method, given the so-called 'eye-mind relationship', which emphasizes the
strong connection between human thinking and the object the eyes focus on.
This paper unfolds as follows. First, we introduce the background of eye-tracking as a research
method. Then, we establish the state of the art in eye-tracking based research in several areas
closely related to process model understanding. This is the basis for a synthesis of the major
points to be observed while executing an eye-tracking controlled experiment. Finally, we
present our own experience in organizing and running such empirical research.
2. Background
2.1. Eye-tracking
Eye-tracking is a technique that has been employed in research for over 50 years. At first, most
research effort went into refining the technique itself; its use then spread to a large number of
research topics, ranging from medicine to computer science. The most influential works
(by citation count) on eye-tracking research methodology are those of Duchowski [3]
and, more recently, Holmqvist et al. [4].
The human brain is built to filter the information that could potentially be processed (this is
commonly called attention). Medical research has shown that there is a so-called 'eye-mind
relationship', which basically means that we can accurately perceive something only if we
fixate it with our eyes and focus our attention on it, and that there is a direct relationship between
eye movements and the cognitive load of the brain [5]. Therefore, the easiest way to detect the
object of attention is to measure where the eyes are fixated. The human eye has a very limited
area where visual stimuli (e.g. images) are registered and sent to the brain with the highest
acuity (the foveal region, about 8% of the visual field), so the eye is
built to move around and then pause such that the object of interest is placed in the
fovea. Eye-tracking is a technique that pinpoints a subject's object of interest based on these
eye mechanics. A relevant insight from medical research is that an image must be seen for
around 80 milliseconds before the brain registers it (under normal lighting
conditions); this delay differs for reading (50-60 ms) and for seeing pictures (150
ms). There are several techniques that can be employed to detect eye movements. The latest
is remote and non-intrusive, and is called Pupil Centre Corneal Reflection. Such eye-
tracking hardware commonly consists of one or more video cameras and an infrared
illuminator. The cameras film the eyes, and software then calculates the gaze vector based
on the angle of the illuminator's reflection on the pupil. Thus, the position of the eye in space
and the direction of gaze can be estimated with high accuracy. In our experiments, the subjects
needed to examine a business process model drawn using the Business Process Model and
Notation (BPMN) standard, in order to answer a comprehension question. The shift of
attention from one model element to another is captured as the eyes fixate the different elements.
The main eye functions that can be measured are: fixations (pauses of eye movement on a
specific area of the visual field), saccades (rapid movements between two fixations) and drift
(fine adjustments that keep a stationary object in perception). In scientific
research based on the eye-tracking observation method, several metrics are
commonly recorded [3], [4]: the number of fixations, the duration of each fixation, the
saccades, and the sequence of fixations. The most common metric is the fixation count.
How are eye metrics calculated by an eye-tracking system? During a recording, raw eye
movement data points are collected every 16.6 ms (given that our eye-tracker works at a
frequency of 60 Hz). Each data point has a timestamp and a coordinate (x, y). The analysis
software processes this log and aggregates the data points into fixations. This is basically done
using two thresholds: a distance threshold (to be aggregated into a fixation, data points must
be less than a certain number of pixels apart) and a fixation threshold (to detect
a fixation, data points must remain in close proximity of each other over a certain period of time).
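The two-threshold aggregation described above can be sketched as follows; the threshold values and the centroid reporting are illustrative assumptions, not the analysis software's actual algorithm:

```javascript
// Sketch of two-threshold fixation detection: raw (x, y, t) gaze samples
// are grouped while consecutive points stay within maxDist pixels of each
// other; a group lasting at least minDuration milliseconds is reported as
// a fixation at the centroid of its samples. Assumes a non-empty sample list.
function detectFixations(samples, maxDist, minDuration) {
  const fixations = [];
  let group = [samples[0]];
  const flush = () => {
    const duration = group[group.length - 1].t - group[0].t;
    if (duration >= minDuration) {
      const cx = group.reduce((sum, p) => sum + p.x, 0) / group.length;
      const cy = group.reduce((sum, p) => sum + p.y, 0) / group.length;
      fixations.push({ x: cx, y: cy, duration });
    }
  };
  for (let i = 1; i < samples.length; i++) {
    const prev = group[group.length - 1];
    const cur = samples[i];
    if (Math.hypot(cur.x - prev.x, cur.y - prev.y) <= maxDist) {
      group.push(cur); // still inside the distance threshold
    } else {
      flush();         // gaze jumped away: close the current group
      group = [cur];
    }
  }
  flush();
  return fixations;
}

// Hypothetical samples every 16.6 ms (60 Hz), as in the recording above.
const samples = [
  { x: 100, y: 100, t: 0.0 },  { x: 102, y: 101, t: 16.6 },
  { x: 99,  y: 100, t: 33.2 }, { x: 101, y: 102, t: 49.8 },
  { x: 100, y: 99,  t: 66.4 }, { x: 400, y: 300, t: 83.0 },
  { x: 401, y: 299, t: 99.6 },
];
```

With a 30-pixel distance threshold and a 50 ms fixation threshold, the first five samples form one fixation and the final short group is discarded.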
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
130
2.2. Controlled experiments on understanding, using eye-tracking as an observation method
We performed an analysis of the papers listed in the first 15 hit pages of Google Scholar for
the keywords “eye-tracking” and “eye tracking” (duplicate hits were removed). A second
search was done considering the first 15 pages of Google Scholar filtered by papers published
after 2010. All papers were classified according to their title and abstract into four categories:
medicine (approx. 44% of the papers), research on eye-tracking methodology (~22%), human
factors and behavior (~17%), usability (~16.5%), while papers on other issues amounted to
about 2%. Only papers on the usability topic were investigated further. All the abstracts of the
papers in this pool were read, and papers that appeared relevant during this screening were read
in full. Then, we further divided the papers into three sub-categories, for which we extended our
search to the papers they cite and are cited by. The first category that matches our topic is
eye-tracking papers related to BPM. We found only a handful of such papers. A close match
to our field is represented by eye-tracking papers on other types of model understanding (e.g. UML
diagrams, geographical maps, etc.). Finally, our topic also matches research on
learning, more specifically on the influence of visual enhancements on understanding graphical
depictions of concepts.
When it comes to eye-tracking papers related to BPM, our previous research [6] was the only
one approaching process model understanding with eye-tracking. We investigated the
relevance of the notion of Relevant Region on business process model comprehension using
27 participants. We concluded that the correctness of the comprehension question answers can
be predicted based on the number of Relevant Region elements fixated by the reader, and on
the percentage of time spent fixating the Relevant Region. The other noticeable paper
employing eye-tracking in BPM [7] investigates how process models are created based on a
textual description. The experiment was run on a population of 25 students, but only 2 instances
were used in the analysis. The eye-tracking metrics used were fixation counts and fixation
durations. A third paper [8] briefly introduces eye-tracking as a possible tool for
BPM research. The single hypothesis under investigation postulates that eye-tracking is a suitable
method to assess requirements for user satisfaction in business process modeling. A fourth
paper concludes that current low-cost eye-tracking systems are suited for research in BPM [8].
One of the most interesting papers from the second sub-category attempts to identify the
influence of layout, color and stereotypes on the comprehension of UML Class diagrams [9].
Comprehension effort was measured by the total number of fixations (i.e. the more fixations
on a diagram, the worse the layout). The assumption is that more fixations indicate an inefficient
exploration, one that makes the subject spread attention inefficiently over more model elements
than necessary. The study was performed using 12 subjects of various expertise (faculty, Ph.D.,
master and bachelor students as well as novices with no UML background).
The spark of [10] was gathering more insights through eye-tracking than a previous
questionnaire-based study had obtained. The study targeted understanding UML Class
diagrams. The 15 academic participants were presented with diagrams in an attempt to
compare two model layouts. Metrics used in the study were grouped in three classes: fixation
count (total number of fixations on the model), fixation rate (number of fixations on relevant
classes (i.e. tasks in a process model), number of fixations on relevant classes and associations
(i.e. tasks and edges in a process model), number of fixations on other model elements), and
fixation duration (average duration of all fixations, average fixation duration on classes,
average fixation duration on classes and associations, and average fixation duration on other
model elements).
Jeanmart et al. [11] used eye-tracking to study UML Class diagram comprehension (more
specifically, a certain design pattern). The approach was to conduct experiments on diagrams
in 3 conditions: no patterns, with patterns in canonical layout and with patterns in a modified
layout. The study was conducted on a population of 24 students.
For the third sub-category, there is an excellent review of studies that use eye-tracking to
explain and enhance learning (with words and graphics) [12]. Most interesting for us is a
comparison of 6 papers with regard to content, independent variables, eye-tracking metrics,
dependent/outcome variables as well as the main research contribution. Some interesting facts
revealed by this comparison are: a) when it comes to eye-tracking measures, all studies relied
on the time spent looking at the relevant areas; b) the dependent variables measured
comprehension as accuracy at answering test questions; and c) in 25% of the papers, the factor
was visual cues, thus making it the most researched one. Eye-tracking was employed to test
the influence of cues on learning in [13]. Two experiments are reported on a population of 57
undergraduate students. The independent variables were the number of fixations and the
duration of fixations (as there was a high correlation between the two, the latter was not
reported in the paper). The dependent variables were learning time and a comprehension measure
(a composite between binary answer questions and one descriptive answer question that
calculated the percentage of an expected number of 15 items).
The influence of visual cues on learning performance was also assessed in [14]. Two factors
(cues and no cues) were evaluated using questionnaires and computer support, but no
eye-tracking. Overall learning efficiency (measured both as learned concepts and as learning
time) was assessed.
In [15], step-by-step guidance is provided by highlighting the steps required for performing a
task in a software interface. Menu and toolbar items are dynamically colored in
an e-mail application while the user performs a task. The evaluation metrics were user
responses and time performance. There were 64 participants in the experiment.
Reading a process model requires the user to employ some sort of visual search strategy. In
this regard, the efficiency of map-reading visual strategies is investigated using eye-tracking
in [16]. The experimenters set the sequence of activities that must be performed (e.g. explore
map, select appropriate map layer, locate area, zoom in to relevant area, identify correct piece
of information for the response). That led to the need to compare the ‘ideal’ visualization with
the subject’s fixations. In [16], string-matching metrics such as the Levenshtein distance were
used to cluster subjects.
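A standard Levenshtein (edit) distance over scanpaths encoded as strings of area-of-interest labels can be sketched as below. This is an illustrative implementation of the generic metric, not the exact analysis pipeline of [16]:

```java
// Levenshtein (edit) distance between two scanpaths, each encoded as a string
// of area-of-interest labels (one character per fixated AOI). Classic dynamic
// programming over an (m+1) x (n+1) cost table: each cell holds the minimum
// number of insertions, deletions and substitutions to align the prefixes.
public class ScanpathDistance {
    public static int levenshtein(String a, String b) {
        int m = a.length(), n = b.length();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) d[i][0] = i; // i deletions
        for (int j = 0; j <= n; j++) d[0][j] = j; // j insertions
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // delete from a
                                            d[i][j - 1] + 1),  // insert into a
                                   d[i - 1][j - 1] + sub);     // match/substitute
            }
        }
        return d[m][n];
    }
}
```

A low distance between a subject's scanpath and the ideal sequence indicates an efficient visual strategy; clustering subjects by pairwise distances then groups similar strategies.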
3. State-of-the-art summary: main points to consider when running eye-tracking controlled experiments
When a new controlled experiment is designed, the researcher needs to focus on several main
issues: stimuli and tasks (what kind of activities need to be executed under observation and on
what objects), participants (how many persons are observed and what is their level of
expertise), experimental procedure (how is the experiment organized), experimental instrument
(how is the eye-tracking data collected, how is raw data turned into useful outputs), measures
(what data is collected so that the hypotheses can be tested).
Given the short review of several papers related to eye-tracking controlled experiments on
process model understanding, researchers addressed those points as follows:
- Participants: between 12 and 64, with most experiments around 25 participants; mostly students at various levels and some experts (academic and very rarely industry).
- Stimuli (factors of the experiment): model layout (e.g. canonical design patterns vs. random), visual signaling of elements (coloring of model elements, visual cues on the model, animations on the model).
- Tasks in the experiment: between 1 and 6 tasks to be performed on between 1 and 3 models; mostly not more than 3 tasks to be performed in one experiment.
- Eye-tracker settings:
  - most papers use 60 Hz bright-pupil systems;
  - fixation duration threshold: not reported in most papers; when reported, between 0.05 and 0.5 seconds;
  - pixel threshold (used to group collected raw focus coordinates into fixations): not reported in most papers; when reported, between 20 and 50 pixels.
- Metrics: fixation count, fixation duration, task time, correctness of answers (in one instance a more complex approach to correctness).
- Data analysis: data consistency test, ANOVA and/or ANCOVA, rarely regression analysis.
4. Guidelines based on our experience with designing and running eye-tracking based controlled experiments
So far, we have organized several eye-tracking experiments in an attempt to shed more light on how
humans make sense of business process models. Considering the points reviewed before, our
insights are:
- Participants: the expertise level of participants is very important. Generality can be achieved by a balanced mix of academic (researchers and/or students) and industry participants.
- Stimuli: careful consideration is needed when choosing the modeling notation and the complexity of the models (e.g. measured as the number of elements in the model). If the experiment is within-subjects, the same model needs to be shown at least twice. Our experience shows that mirroring a model is enough to mitigate the learning effect.
- Tasks: the typical task is answering comprehension questions. For within-subjects experiments, the same task needs to be performed twice. To mitigate the learning effect, our solution was to manipulate the graphical representation (e.g. mirror the model, re-label it). When asking comprehension questions about the control-flow of process models, one needs to cover at least sequence, concurrency, and exclusive choice. Questions can be asked about the model itself (e.g. find the ‘shortest path’) or about the resource and data perspectives.
- Experimental procedure: given that eye-tracking is a direct observation method, an experiment has the disadvantage that it can be run with only one person at a time. The average time needed to complete the experiment is rarely disclosed in papers. It is essential that the researcher balances a lengthy procedure (e.g. one that involves more models and thus increases the generality of the conclusions) against the impact of fatigue on the participants. Our experience shows that striking this balance needs several iterations of re-design and testing.
- Eye-tracker settings: in our experiments there was no statistically significant difference between the various thresholds used for aggregating raw data coordinates into fixations. For example, by changing the duration setting from 0.05 to 0.1 and then to 0.15 seconds, we noticed a polynomial increase in the number of fixations. However, when expressed as percentages, the changes were insignificant between the three settings. This holds true for the distance threshold as well. Therefore, standard eye-tracker settings should be fine for most cases.
- Metrics: fixation count and average fixation duration must be used, as they are at the core of eye-tracking outputs. Additional data can be recorded on task duration or on the participant’s confidence in the comprehension question answer (in our experiments so far, confidence did not provide further insights). We employed efficiency as well, calculated as answer correctness over task time. Also, if there is a notion of ‘relevant element’ in a model, classic search metrics such as precision and recall can be used.
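Where a Relevant Region is defined, precision and recall over the fixated elements can be computed as in this sketch (the class name, method names and element identifiers are our own illustrations):

```java
import java.util.HashSet;
import java.util.Set;

// Classic search metrics over model elements:
//   precision = fraction of fixated elements that belong to the Relevant Region
//   recall    = fraction of Relevant Region elements that were fixated
public class RelevantRegionMetrics {

    public static double precision(Set<String> fixated, Set<String> relevant) {
        if (fixated.isEmpty()) return 0.0;
        return (double) intersect(fixated, relevant).size() / fixated.size();
    }

    public static double recall(Set<String> fixated, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        return (double) intersect(fixated, relevant).size() / relevant.size();
    }

    private static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> r = new HashSet<>(a); // copy so the inputs are not modified
        r.retainAll(b);
        return r;
    }
}
```

For instance, a reader who fixates four elements of which two are in a three-element Relevant Region achieves a precision of 0.5 and a recall of 2/3.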
- Data analysis: in the case of within-subjects experiments, paired-samples tests are very relevant. Depending on the normality of the data, one can choose Student’s t-test (normally distributed data) or the Wilcoxon signed-rank test (non-normal distribution).
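As a reminder of the paired test, the Student t statistic is computed on the per-subject differences: t = mean(d) / (sd(d) / sqrt(n)), with sd(d) the sample standard deviation. A minimal computation (the class and method names are illustrative):

```java
// Paired-samples Student t statistic: form the per-subject differences d_i
// between the two conditions, then t = mean(d) / (sd(d) / sqrt(n)),
// where sd(d) is the sample standard deviation (divisor n - 1).
public class PairedT {
    public static double tStatistic(double[] a, double[] b) {
        int n = a.length; // a and b must hold the two conditions for the same subjects
        double[] d = new double[n];
        for (int i = 0; i < n; i++) d[i] = a[i] - b[i];
        double mean = 0;
        for (double v : d) mean += v;
        mean /= n;
        double ss = 0;
        for (double v : d) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1)); // sample standard deviation
        return mean / (sd / Math.sqrt(n));
    }
}
```

The resulting t value is then compared against the t distribution with n - 1 degrees of freedom to obtain the p-value.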
5. Conclusions
We introduced the state of the art in controlled experiments using eye-tracking in the area of
model understanding. Then, considering our experience in the field, we put forward a set of
essential guidelines to consider when designing and running eye-tracking experiments in
process model understanding.
The main contribution of the paper is the set of guidelines, which is aimed at researchers that will
attempt to investigate process models. We believe that this is a useful summary of our
experience as pioneers in applying eye-tracking for directly investigating process model
understanding.
References
[1] W. M. van der Aalst, “Business process management: A comprehensive survey”, ISRN
Software Engineering, vol. 2013, Article ID 507984, 37 pages, 2013.
doi:10.1155/2013/507984
[2] R. K. Ko, S. S. Lee, and E. W. Lee, “Business process management (BPM) standards: a
survey”, Business Process Management Journal, vol. 15, no. 5, pp. 744-791, 2009.
[3] A. Duchowski, Eye tracking methodology: Theory and practice, Berlin: Springer, 2007.
[4] K. Holmqvist, M. Nyström, R. Andersson, R. Dewhurst, H. Jarodzka, and J. Van de Weijer,
Eye tracking: A comprehensive guide to methods and measures, Oxford University Press,
2011.
[5] M. A. Just and P. A. Carpenter, “Eye fixations and cognitive processes”, Cognitive
Psychology, vol. 8, no. 4, pp. 441–480, 1976.
[6] R. Petrusel, and J. Mendling, “Eye-Tracking the Factors of Process Model Comprehension
Tasks” Lecture Notes in Computer Science, vol. 7908, pp. 224-239, 2013
[7] J. Pinggera, M. Furtner, M. Martini, P. Sachse, K. Reiter, S. Zugal, and B. Weber,
“Investigating the Process of Process Modeling with Eye Movement Analysis”, Lecture
Notes in Business Information Processing vol. 132, pp 438-450, 2013.
[8] F. Hogrebe, N. Gehrke, and M. Nüttgens, “Eye Tracking Experiments in Business Process
Modeling: Agenda Setting and Proof of Concept” in Proc. EMISA 2011, pp. 183-188.
[9] Y. Shehnaaz, K. Huzefa, and J. I. Maletic, “Assessing the Comprehension of UML Class
Diagrams via Eye Tracking” In Proc. 15th IEEE International Conference on Program
Comprehension, 2007, pp. 113-122.
[10] B. Sharif, and J. I. Maletic, “An Eye Tracking Study on the Effects of Layout in
Understanding the Role of Design Patterns”, In Proc. ICSM 2010, pp. 1-10.
[11] S. Jeanmart, Y. G. Gueheneuc, H. Sahraoui, and N. Habra, “Impact of the visitor pattern
on program comprehension and maintenance” In Proc 3rd IEEE International Symposium
on Empirical Software Engineering and Measurement 2009, pp. 69-78.
[12] R. E. Mayer, “Unique contributions of eye-tracking research to the study of learning with
graphics”, Learning and instruction, vol. 20 no. 2, 167-171, 2010.
[13] J. M., Boucheix, and R. K. Lowe, “An eye tracking comparison of external pointing cues
and internal continuous cues in learning with complex animations”. Learning and
instruction, vol 20, no. 2, pp. 123-135, 2010.
[14] L. Lin, and R. K. Atkinson, “Using animations and visual cueing to support learning of
scientific concepts and processes”, Computers & Education, vol. 56, no. 3, pp. 650-658,
2011.
[15] L. Antwarg, T. Lavie, L. Rokach, B. Shapira, and J. Meyer, “Highlighting items as means
of adaptive assistance’, Behaviour & Information Technology, vol. 32, no. 8, pp. 761-777,
2013.
[16] A. Çöltekin, S. I. Fabrikant, and M. Lacayo, “Exploring the efficiency of users' visual
analytics strategies based on sequence analysis of eye movement recordings”, International
Journal of Geographical Information Science, vol. 24, no. 10, pp. 1559-1575, 2010.
A TEST DATA GENERATOR BASED ON ANDROID LAYOUT FILES
Paul POCATILU
Bucharest University of Economic Studies
Sergiu CAPISIZU
Bucharest Bar
Abstract. Test data generation represents an important step for a high quality testing process
for any software, including mobile applications. As proposed in previous works, an interesting source
for random data generation is represented by the UI layout files. This paper presents a system
dedicated to Android layout files that uses these files as input and generates an XML-based file
used by the test data generator to obtain test data sets.
Keywords: mobile applications, layout files, software testing, test data generators, software
quality
JEL classification: C49, C61, L86
1. Introduction
Software testing represents an important step in software development [1], [2]. The testing
process is thoroughly presented in books like [3] and [4]. Like other applications, mobile
applications require testing in order to achieve a required level of quality. This can be done
using similar tools and frameworks and also specific and dedicated tools, depending on the
platform. Some of the mobile application testing types are briefly described in [5].
During the testing process, test data generation has its own role for testing success. Test data
generation is made using different tools and techniques. The paper continues the research
presented in [6] and [7] and focuses on template generation for test data based on Android
layout files. Test data templates are XML-based files written using DSL (Data Specification
Language). The generated test data can be used by our own testing tools or frameworks or can be
used as inputs for existing testing frameworks and tools.
The paper is structured as follows. The section Android testing frameworks and tools presents
the most important aspects related to Android application testing, together with a short
presentation of the Android testing instruments. The section Test data generation describes the
proposed system for a test data generator based on Android layout files. The section Data
Specification Language (DSL) details the XML-based language used for test data specification.
The proposed parser for Android layout files is presented in the section Android layout files
parser. The paper ends with conclusions and future work.
2. Android testing frameworks and tools
Since Android applications are developed using the Java programming language, the JUnit
testing framework is suitable for the automated testing of functional issues. JUnit is a
framework for regression unit testing of Java programs [8]. The main Java classes used by the
framework are associated with test cases and suites.
The Android platform includes several tools and frameworks. Also, third-party developers have
built such tools and frameworks for Android application testing. The fundamentals of Android
application testing are presented in [9].
Table 1 summarizes the most used Android testing tools and frameworks. Many of these are
based on JUnit.

Table 1 - Android testing tools and frameworks
Framework/Tool                Included in Android API   Testing level   Testing approach   Type
Espresso                      Yes                       UI testing      Black box          Framework
Android Instrumentation API   Yes                       Unit testing    White box          Framework
Monkey                        Yes                       UI testing      Black box          Tool
monkeyrunner                  Yes                       UI testing      Black box          Tool
Robotium                      No                        UI testing      Black box          Framework
Robolectric                   No                        Unit testing    White box          Framework
UI Automator                  Yes                       UI testing      Black box          Framework
Espresso is a testing framework used for UI testing. It is based on JUnit and is included in the
Android Support Repository.
Android Instrumentation is a framework developed by Google for testing Android applications.
It allows controlling the life cycle of Android applications and components during the tests.
Monkey is a testing tool that runs within the emulator or on the mobile device. It is used to send
random events (user or system) to the Android device.
monkeyrunner provides an API that can be used to control an Android device by installing
applications, sending commands, and taking and saving screenshots.
Robotium is a testing framework for Android and hybrid applications. It is based on Android
Instrumentation and is used to automate UI testing.
Robolectric is a framework that allows tests to run on a JVM on a computer, which speeds up
the testing process.
The UI Automator framework provides an API that allows controlling user and system
applications for UI testing.
In [10] there is a short presentation of several other Android testing frameworks and tools:
- Mockito – a framework for testing Java and Android applications; it allows the creation of mock objects for testing and is used in unit testing;
- EasyMock – a testing framework used in unit testing; it uses mock objects;
- PowerMock – a framework based on Mockito and EasyMock;
- Infinitest – a testing plugin for Eclipse and IntelliJ; it is a continuous test runner.
In order to automate the testing process, some of these tools can run using data generated by
dedicated tools.
Specific Android testing approaches are presented in [11], [12] and [13].
3. Test data generation
In order to generate test data, test data generators (TDG) can be based on random functions or
can use inputs related to the application under test (specifications, source files, data constraints,
lists of values, layout files etc.).
The test data will be used either for white-box testing (as in [14]) or for functional testing.
White-box testing requires deeper knowledge of the source code, and a prior analysis of it is
required [15].
Our proposed solution takes as input an XML-based file that includes a description of each field
for which test data will be generated.
Figure 1 depicts the architecture of the test data generator system. Android layout files are used
as inputs for the parser. The parser generates a DSL file that is used as input for the test data
generator. Finally, the test data generator will provide the test data.
Figure 1 - Test data generator system
The DSL file provides the required information to the test data generator and allows it to
generate test data for the analyzed software under test (SUT). The generated test data could be
stored in memory or in files (XML, binary or any other specific format).
4. Data Specification Language (DSL)
An XML-based language for test data generation was proposed in [6] and [7]. The current
version includes more nodes for better control of data generation. The root node of any DSL
file is dataset. Each field for which data will be generated is represented by a field node. Each
field includes the nodes:
- type (could be string, number, boolean etc.);
- generation (could be random or a list of values);
- maxLength, with the attribute fixed used for a required length.
The fields that require values from a list of values will include the lov node with the values used
for selection.
The XSD schema of DSL files is presented in Listing 1.
Listing 1. DSL files XSD schema
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="dataset">
<xs:complexType>
<xs:sequence>
<xs:element name="field" maxOccurs="unbounded" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="type"/>
<xs:element type="xs:string" name="generation"/>
<xs:element name="maxLength" minOccurs="0">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute type="xs:string" name="fixed" use="optional"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="lov" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element
type="xs:string" name="item" maxOccurs="unbounded" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute type="xs:string" name="type" use="optional"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The data length will be deduced from the Android UI layout files (e.g. from the
android:maxLength attribute).
The data type could be determined based on android:inputType and the initial field values. For
inputs that include numbers in a specific format (like phone numbers, dates etc.), the format
should be included as well.
Also, for numeric fields, the nodes minValue and maxValue could be added, with their
corresponding values obtained from the layout file or the specifications, or added later.
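One possible way to derive the DSL type node from android:inputType is sketched below; the exact mapping is our own assumption, not the paper's rule set (format handling for dates and phone numbers would be recorded separately):

```java
// Sketch of mapping an android:inputType attribute value to a DSL type node.
// The mapping below is an illustrative assumption; controls with no inputType
// hint default to "string".
public class InputTypeMapper {
    public static String toDslType(String inputType) {
        if (inputType == null || inputType.isEmpty()) return "string";
        switch (inputType) {
            case "number":
            case "numberDecimal":
            case "numberSigned":
            case "phone":               // number plus a format, per the text above
                return "number";
            case "date":
            case "time":
            case "datetime":
                return "date";          // would also carry a format node
            default:
                return "string";        // text, textCapWords, textEmailAddress, ...
        }
    }
}
```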
5. Android layout files parser
The Android layout parser uses the XML-based files available in the res/layout folder of the
Android project. Several sources, such as [16], present the content and structure of Android
layout files. In order to exemplify the DSL template generation, the XML layout from Listing 2 was used.
Listing 2 - Android layout file used as example
<?xml version="1.0" encoding="utf-8"?>
<ScrollView xmlns:android="http://schemas.android.com/apk/res/android"
android:layout_width="match_parent"
android:layout_height="match_parent" >
<LinearLayout
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="vertical" >
<!-- Author label here-->
<EditText
android:id="@+id/editAutor"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:inputType="textCapWords"/>
<!--Title label here -->
<EditText
android:id="@+id/editTitlu"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:inputType="text"/>
<!--Date label here -->
<EditText
android:id="@+id/data"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:inputType="date"/>
<!-- Publisher label here -->
<EditText
android:id="@+id/editEditura"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:inputType="text"/>
<!--ISBN label here-->
<EditText
android:id="@+id/editIsbn"
android:layout_width="match_parent"
android:layout_height="wrap_content"/>
<!--Price label here -->
<EditText
android:id="@+id/editPret"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:text="0"
android:inputType="number"/>
<!-- -->
<Spinner
android:id="@+id/spinGen"
android:layout_width="match_parent"
android:layout_height="wrap_content"/>
<!--Status label here -->
<CheckBox
android:id="@+id/checkUzata"
android:layout_width="wrap_content"
android:layout_height="wrap_content"/>
<Button
android:id="@+id/buttonSalveaza"
android:layout_height="wrap_content"
android:layout_width="wrap_content"
android:layout_gravity="center"
android:text="Salveaza" />
</LinearLayout>
</ScrollView>
The layout includes eight controls for which test data need to be generated: six EditText
controls, one Spinner control and one CheckBox control. Five EditText controls include
android:inputType attributes, with the values textCapWords, text, date, and number. Figure 2
presents the actual layout used as an example, running on a real device.
Figure 2 - Example layout running on a real device
The generated test data could be used by existing tools and frameworks to fill the controls and
to activate the submission button.
Based on the layout from Listing 2, the generated DSL file is presented in Listing 3. The current
version includes mostly random value generation, plus lists of values (checked and unchecked)
for CheckBox controls.
Listing 3 - Generated DSL file
<dataset>
<field type="EditText">
<type>string</type>
<generation>random</generation>
<maxLength fixed="No" />
</field>
<field type="EditText">
<type>string</type>
<generation>random</generation>
<maxLength fixed="No">20</maxLength>
</field>
<field type="EditText">
<type>string</type>
<generation>random</generation>
<maxLength fixed="No" />
</field>
<field type="EditText">
<type>string</type>
<generation>random</generation>
<maxLength fixed="No" />
</field>
<field type="EditText">
<type>string</type>
<generation>random</generation>
<maxLength fixed="No" />
</field>
<field type="EditText">
<type>number</type>
<generation>random</generation>
<maxLength fixed="No" />
</field>
<field type="Spinner">
<type>string</type>
<generation>lov</generation>
</field>
<field type="CheckBox">
<type>boolean</type>
<generation>lov</generation>
<lov>
<item>checked</item>
<item>unchecked</item>
</lov>
</field>
</dataset>
This DSL file represents an input for the test data generator. At this stage, the DSL file does
not fully automate test data generation. It could require manual intervention, or additional
parsers or editors that narrow data boundaries, add other constraints, or provide the lists of
values for list-based controls. For example, for the Spinner control, the list of values has to be
filled in before data generation.
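The parsing step itself can be sketched with the JDK's DOM API. This is a simplified illustration, not the authors' implementation: it collects the input controls and derives a (control kind, DSL type) pair for each, using android:inputType as the type hint:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Simplified layout parser sketch: with the default (namespace-unaware) DOM
// parser, the android:inputType attribute can be read by its qualified name.
public class LayoutParser {

    public static List<String[]> parse(String layoutXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(layoutXml)));
            List<String[]> fields = new ArrayList<>();
            for (String tag : new String[] {"EditText", "Spinner", "CheckBox"}) {
                NodeList nodes = doc.getElementsByTagName(tag);
                for (int i = 0; i < nodes.getLength(); i++) {
                    Element e = (Element) nodes.item(i);
                    String inputType = e.getAttribute("android:inputType"); // "" when absent
                    String dslType = "CheckBox".equals(tag) ? "boolean"
                            : inputType.startsWith("number") ? "number" : "string";
                    fields.add(new String[] {tag, dslType});
                }
            }
            return fields;
        } catch (Exception ex) {
            throw new RuntimeException("cannot parse layout", ex);
        }
    }
}
```

In a full implementation, each (control kind, type) pair would be serialized as a DSL field node like those in Listing 3.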
6. Conclusions and future work
The proposed system can be integrated with many testing frameworks and tools available for
the Android platform. The presented format of the DSL files is a preview and will be improved
in future research.
The next steps include the development of the test data generator that will generate test data
based on DSL files.
References
[1] R. S. Pressman, Software Engineering: A Practitioner’s Approach. 7th ed., New York:
McGraw-Hill, 2009
[2] I. Sommerville, Software Engineering. 9th ed., Boston: Addison-Wesley, 2011
[3] G. J. Myers, C. Sandler, T. Badgett, The Art of Software Testing, 3rd Edition, Wiley, 2011
[4] M. Roper, Software Testing, McGraw-Hill Book, 1994
[5] M. Kumar and M. Chauhan, "Best Practices in Mobile Application Testing (White Paper),"
Infosys, Bangalore, 2013
[6] P. Pocatilu, F. Alecu and S. Capisizu, "A Test Data Generator for Mobile Applications," in
Proc. of the IE 2014 International Conference, Bucharest, Romania, May 15-18, 2014, pp.
116-121
[7] P. Pocatilu and F. Alecu, "An UI Layout Files Analyzer for Test Data Generation,"
Informatica Economica, vol. 18, no. 2/2014, pp. 53-62
[8] J. Langr, A. Hunt and D. Thomas, Pragmatic Unit Testing in Java 8 with JUnit, The
Pragmatic Programmers, 2015
[9] Testing Fundamentals | Android Developers, available at:
http://developer.android.com/tools/testing/testing_android.html
[10] P. Pocatilu, I. Ivan et al, Programarea aplicațiilor Android, Bucharest: ASE Publishing
House, 2015
[11] S. Yang, D. Yan and R. Rountev, "Testing for poor responsiveness in Android
applications," in Proc. of the 1st International Workshop on the Engineering of Mobile-
Enabled Systems (MOBS), 2013, pp. 1 – 6
[12] A. Gupta, Learning Pentesting for Android Devices, Packt Publishing, 2014
[13] W. Choi, G. Necula and K. Sen, "Guided GUI testing of Android apps with minimal restart
and approximate learning," in Proc. of the 2013 ACM SIGPLAN international conference
on Object oriented programming systems languages & applications (OOPSLA '13). ACM,
New York, NY, USA, pp. 623-640
[14] S. Jiang, Y. Zhang and D. Yi, "Test Data Generation Approach for Basis Path Coverage,"
ACM SIGSOFT Software Engineering Notes, vol. 37, no. 3, pp. 1-7, 2012
[15] A. Zamfiroiu, "Source Code Quality Metrics Building for Mobile Applications," in proc.
of the IE 2014 International Conference, Bucharest, pp. 136-140
[16] R. Meier, Professional Android 4 Application Development, Wiley, 2012
EMV/BITCOIN PAYMENT TRANSACTIONS AND DYNAMIC DATA
AUTHENTICATION WITH SMART JAVA CARDS
Marius POPA
Department of Economic Informatics & Cybernetics
Bucharest University of Economic Studies
[email protected]
Cristian TOMA
Department of Economic Informatics & Cybernetics
Bucharest University of Economic Studies
[email protected]
Abstract. This paper presents EMV and Bitcoin payment transactions. For EMV, both the contact-based and the contactless transaction flows are presented in detail, together with the card-to-terminal authentication procedures. The Bitcoin section covers only the usage aspects of payment transactions, but with details useful for future development of an open source implementation based on Java Card technology.
Keywords: EMVCo, Bitcoin, DDA – Dynamic Data Authentication, e-payment transaction, cryptographic security.
JEL classification: C88, L86, Y80
1. Introduction
A smart card payment is the transfer of an item of value, expressed as a money amount, using a specific hardware device called a smart card, in order to obtain goods or services or to fulfill legal obligations.
A smart card has the small size of a classic bank card and embeds integrated circuits for various purposes such as identity, authentication and application processing. There are many applications of the smart card, some of which are presented below [12]:
Financial – the smart card is used as a credit card in the classic way, with improved transaction security; electronic wallets can also be deployed as smart card applications and pre-loaded with funds, avoiding a connection to the bank at payment time.
Subscriber Identity Module – the smart card is used to securely store the international mobile subscriber identity and the related key for mobile telephony users (mobile phones, tablets, computers and so forth).
Identification – smart cards are used to authenticate citizens' identity; there are examples of identity systems implemented at governmental level in which citizens use smart cards to access public services or to comply with legal requirements; this kind of application requires a Public Key Infrastructure (PKI).
Public transit – smart cards are used in integrated ticketing infrastructures implemented and operated by public transit operators; applications may also include financial or identity features for small payment transactions or other public services; for instance, the public transit operator in Bucharest provides two kinds of cards, depending on whether the card application includes an electronic wallet; a card may or may not be nominal, a nominal card also serving as an identity card during ticket inspections on buses.
Computer security – the smart card is used to store certificates that authenticate computer operations such as secure web browsing, disk encryption, single sign-on and so forth; in this case, the smart card acts as a security token.
Schools – the smart card is used to support services offered to students (small payments within the campus, public transportation and so forth), their tracking and monitoring within the school infrastructure, and access control to the school's or college's facilities.
Healthcare – the smart card is used to improve the security and privacy of patient information, to avoid health care fraud, to provide immediate access to the patient's medical information in case of emergency, and to better support data migration between medical personnel; the Romanian National Health Insurance Agency runs a national program for distributing health care cards in order to gain the benefits of a smart card-based health system; the Agency will use the collected data to reduce health care fraud, improve the national health care programs and monitor the health of the citizens, information with a positive impact on future budgets and on the distribution of public funds to the services provided to citizens.
Other applications – the smart card may be used in any area where the advantages presented above apply; for instance, economic organizations can protect their businesses using a smart card-based system for exchanges of goods and services; the requirement is critical in high-technology fields and for companies that invest large amounts in research and development, since such organizations must have the chance to recover their expenses.
Multiple-use systems – a smart card may be used for multiple purposes: identity, health care, payment, public transportation and so forth; in such cases, some restrictions and legal requirements apply in order to make the smart card-based system functional.
Smart card-based systems have evident benefits, but some problems can appear during their exploitation. The main threats are malware and security attacks that may compromise the smart card or the system.
The physical design of a smart card, regarding its size, is stated by the ISO/IEC 7810 standard. The design also considers how the smart card interacts with a card reader in order to send and receive data to and from the back-end systems.
Regarding the communication channel between the smart card and the card reader, the following types of smart cards are considered [12]:
Contact smart cards – communication is made through a contact area that has contact pads providing electrical connectivity; the smart card is powered by the card reader.
Contactless smart cards – communication and powering are made by radio-frequency (RF) induction; the smart card requires proximity to the card reader in order to power its circuits and communicate via a radio channel.
Dual smart cards – both communication interfaces (contact-based and contactless) are implemented on the same smart card.
Universal Serial Bus (USB) smart cards – communication is made over USB by attaching the smart card to a computer; in this case, the smart card becomes a security token and authenticates some operations (such as connecting the computer to the Internet via a USB dongle containing a SIM smart card).
In the banking area, the smart card is used to support financial transactions among people, public institutions and economic organizations. Some applications are developed according to industry standards elaborated and implemented by professional and commercial associations
such as EMVCo. EMVCo operates in the banking area and standardizes the interoperability of smart cards (Integrated Circuit Cards – IC Cards), points of sale (POS) and automated teller machines (ATMs) in order to provide better authentication of payment transactions.
2. Contact-based and Contactless EMV Transactions
EMV contact-based and contactless transactions need security. Java Card technology, together with the JCVM (Java Card Virtual Machine) and the security counter-measures of the embedded OS/HAL (Hardware Abstraction Layer), is able to provide it. The EMV concept is to secure payment transactions through computational cryptography, built around the following major items:
Card verification – using public key cryptography (RSA algorithm) for SDA / DDA
process; it is something that the card-holder has.
Cardholder validation – based on various methods (mostly used is PIN – Personal
Identification Number, but could be handwriting/biometry); it is something that the
card-holder knows.
Card cryptogram computation for the transaction – using symmetric key cryptography (the TripleDES algorithm in ISO 9797 mode) to process, among other data, a terminal random/unpredictable number plus the transaction amount and date, in order to provide the merchant terminal with non-repudiation proof that the transaction was validated by the card.
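The cryptogram computation above can be sketched in plain Java. This is a minimal illustration only: the 24-byte key, the transaction data layout and the MAC variant (a TripleDES CBC-MAC with ISO 9797-1 padding method 2) are assumptions made for the example, not the EMV session-key derivation or the exact ISO 9797 algorithm a real card uses.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class AcSketch {
    // Returns an 8-byte MAC over the transaction data: TripleDES in CBC
    // mode with a zero IV, ISO 9797-1 padding method 2 (0x80 then zeros),
    // keeping the last cipher block as the cryptogram.
    public static byte[] cryptogram(byte[] tdesKey, byte[] txData) {
        try {
            int padded = ((txData.length + 8) / 8) * 8;   // always adds padding
            byte[] buf = Arrays.copyOf(txData, padded);
            buf[txData.length] = (byte) 0x80;             // mandatory pad marker
            Cipher c = Cipher.getInstance("DESede/CBC/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(tdesKey, "DESede"),
                   new IvParameterSpec(new byte[8]));
            byte[] out = c.doFinal(buf);
            // The MAC is the last 8-byte cipher block
            return Arrays.copyOfRange(out, out.length - 8, out.length);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[24];
        for (int i = 0; i < 24; i++) key[i] = (byte) (i + 1);
        // Illustrative data: amount, date and terminal unpredictable number
        byte[] tx = "AMT=000000010000;DATE=150430;UN=1A2B3C4D".getBytes();
        System.out.println(cryptogram(key, tx).length);   // 8
    }
}
```

The same key and data always produce the same 8-byte value, which is what allows the issuer to re-compute and check the cryptogram on its side.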
Figure 1 presents a mobile NFC Android application that obtains all the Java Card applet AIDs (Application Identifiers) from a contactless EMV banking card (Visa):
Figure 1. List of AIDs of the Java Card applet instances installed in a Visa payWave contactless EMV banking card
The most important thing in Java Card is applet management. One of the most important applet management systems on the market, in terms of specifications and implementations, is GlobalPlatform. The GlobalPlatform Card Manager application – the Issuer Security Domain (ISD) – has the AID A0 00 00 00 03 00 00. A complete contact-based EMV payment transaction
[4], with the place of the offline data authentication (ODA – red text) mechanisms within the EMV transaction flow, is emphasized in Figure 2:
Figure 2. Complete Contact-based EMV transaction and the place of DDA
In order to reduce the card-terminal interaction time in the proximity radio field for the contactless payment transaction, all the phases from Figure 2 are compressed, according to [5], as follows:
1. Application Selection – for the contact-based case, the PSE Java Card applet with AID "1PAY.SYS.DDF01" is selected, then the payment applet via the Global Platform Card Manager Issuer Security Domain (ISD); for the contactless case, the PPSE Java Card applet with AID "2PAY.SYS.DDF01" is selected, then the payment applet (e.g. Visa payWave contactless: A0000000031010) via the GP ISD. See Figure 1 for the AID values and the complete names of the ISD, PSE and PPSE; for each ISO SELECT APDU, the PSE/PPSE and EMV payment applets respond with FCI – File Control Information.
2. Initiate Application Processing & Read Application Data – initiate application: the terminal sends GPO (the Get Processing Options EMV APDU command) and receives from the card Data Objects encoded in TLV (Tag-Length-Value); read application parameters: the terminal reads records (with the READ RECORD APDU command) from the card file system in order to obtain the necessary TLV-encoded Data Objects (e.g. PAN – Primary Account Number, cardholder name, etc.).
3. Offline Data Authentication – SDA (Static Data Authentication) or DDA (Dynamic Data Authentication, detailed in the next sections). This authentication is used for M2M (Machine-to-Machine) authentication, proving that the card is genuine (mainly card-to-terminal authentication), using the INTERNAL AUTHENTICATE APDU command.
4. Restrictions Processing – performed only by the terminal, in order to compare the terminal data with the Data Objects read from the card (e.g. does the terminal support the payment application version? Is the terminal's current date within the card validity period? Is the card black-listed?).
5. Cardholder Verification – depending on the CVM (Cardholder Verification Method: a) no PIN – Personal Identification Number, b) offline PIN – in clear or encrypted with public keys, c) online PIN, d) handwriting, e) biometry, f) a combination of handwriting and any of the others), the terminal sends the VERIFY PIN APDU command in order to check that the person knows the secret (PIN) and is genuine. Before the offline PIN try-counter and check procedure, triggered by VERIFY PIN and processed inside the smart card, a GET DATA APDU could be used immediately after step 4, executed only by the terminal.
6. Terminal Risk Management – also uses the GET DATA APDU command and response, in order to obtain from the card the IAC – Issuer Action Code, and then triggers the update of the ATC – Application Transaction Counter / counter value of the last online transaction.
7. Terminal Action Analysis – after obtaining the IAC from the smart card in the previous step, rules are applied inside the terminal in order to: a) approve the transaction offline, b) decline the transaction offline, or c) process the transaction online with the card issuer bank. After the terminal gets the IAC from the card with GET DATA/READ RECORD APDUs, sample rules are: i) if the offline PIN verification fails, the terminal goes online; ii) if the merchant presses a special button on the terminal, the terminal goes online; iii) if the transaction amount exceeds the transaction limit, the terminal goes online. After this step is performed, the terminal is able to send the first GENERATE AC (Application Cryptogram) APDU.
8. Card Action Analysis – the main outcome here is an 8-byte cryptogram generated by the card with the DES algorithm, in order to: a) accept the transaction offline (TC – Transaction Certificate), b) accept the transaction only online (ARQC – Application Request Cryptogram) and wait for the ARPC (Application Response Cryptogram) from the issuer bank, or c) refuse the offline transaction directly (AAC – Application Authentication Cryptogram).
Figure 3. Kernel 6 Contactless EMV transaction flow, Copyright [5]
In the contactless scenario, for the first tap the interaction time should be less than 0.5 seconds (1 second at most); therefore, the DDA (Dynamic Data Authentication) signature is encapsulated by the card in the GPO (Get Processing Options) APDU (Application Protocol Data Unit) response, or can be stored as an EMV Data Object (TLV – Tag-Length-Value) in the card "file system" if the signature is longer than 1024 bits. The second tap is optional and is recommended when a PIN is required, in order to update the remaining tries in case of CVM failure (wrong PIN).
3. Dynamic Data Authentication Process in Payment Transactions
A financial smart card-based application may or may not comply with the EMV specifications. The EMVCo efforts to ensure secure interoperability of the participating
components of a financial transaction have materialized into four books of specifications, as follows [8]:
1. Application Independent IC Card to Terminal Interface Requirements.
2. Security and Key Management.
3. Application Specification.
4. Cardholder, Attendant, and Acquirer Interface Requirements.
For applications compliant with the EMV specification, Dynamic Data Authentication (DDA) has the role of detecting fake/altered/duplicated IC Cards during an offline transaction. An offline transaction is performed when online/real-time authorization is not required. For this, the terminal (card reader) must support such transactions, and the bank must also agree that the IC Card may accept and implement offline transactions in its applet. The bank can also establish some offline transaction parameters at IC Card personalization time, such as the maximum amount per offline transaction, the maximum number of consecutive offline transactions, the maximum cumulative amount and so forth, as security barriers against possible transaction fraud. According to the EMV specifications, offline data authentication is performed through the following mechanisms [3]:
Static Data Authentication (SDA) – detection of data alteration after the IC Card
personalization.
Dynamic Data Authentication (DDA) – detection of data alteration received from the
terminal or generated by the IC Card.
Combined DDA and application cryptogram generation (CDA) – includes verification of
the signature (DDA and application cryptogram) by the terminal.
The application cryptogram acts as a transaction certificate that authenticates the transaction to be accepted.
The DDA signature process has the following characteristics [3]:
It is performed by the terminal and the card – as a requirement, the IC Card needs a cryptographic coprocessor.
The DDA signature authenticates the IC Card resident and generated data and data
received from the terminal.
It detects the fake/altered/duplicated IC Cards.
The DDA signature generation process has the following prerequisites [3]:
Access to the Certification Authority (CA) RSA public key.
Access to the Issuer Bank (IB) RSA public key certificate.
Access to the static data certificate.
The IC Card RSA key pair stored on the card – the RSA private key is securely stored and cannot leave the card; the RSA public key is signed and stored together with the static application data.
A random challenge generated by the terminal, to be signed with the IC Card RSA private key.
The security components and roles are depicted in Figure 4, and the following considerations apply to payment applications compliant with the EMV specifications:
CA RSA key pair – the CA RSA public key is stored in each terminal; the CA RSA private key is used to sign the IB RSA public key certificate.
IB RSA key pair – the IB RSA public key is distributed in the IB RSA public key certificate, signed with the CA RSA private key and stored on the IC Card; to access the IB RSA public key, the IB RSA public key certificate must be decrypted using the CA RSA public key stored in the terminal, and the IB RSA public key can then be extracted from the certificate;
the IB RSA private key is used to sign the IC Card RSA public key certificate, which contains the static application data and the IC Card RSA public key.
IC Card RSA key pair – the IC Card public key is stored together with the static application data in the IC Card RSA public key certificate; the certificate is stored on the card and is decrypted using the IB RSA public key; so, in order to access the IC Card public key, the CA RSA public key is used to decrypt the IB RSA public key certificate, the IB RSA public key is extracted from that certificate and used to decrypt the IC Card RSA public key certificate, and from this last certificate the IC Card RSA public key is extracted in order to decrypt the DDA; the IC Card RSA private key is used to encrypt the content of the DDA, and this key never leaves the IC Card.
Figure 4. Offline dynamic data authentication, Copyright [3]
As a result of the previous explanations, the following items are sent from the IC Card to the terminal [3]:
The IB RSA public key certificate.
The IC Card RSA public key certificate.
The DDA.
After receiving the above certificates, the terminal processing aims to [3]:
Decrypt the IB RSA public key certificate in order to validate the IB RSA public key against the CA.
Decrypt the IC Card public key certificate in order to validate the IC Card public key against the IB.
Decrypt the DDA in order to validate the offline dynamic data signature.
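The three terminal steps above can be mirrored with plain JDK RSA signatures. This is only a sketch of the verification order (CA, then Issuer Bank, then IC Card, then the dynamic data): real EMV certificates are ISO 9796-2 RSA signatures with message recovery and fixed data layouts, so the SHA256withRSA scheme and the raw encoded public keys used here are illustrative stand-ins.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

public class DdaChainSketch {
    static byte[] sign(PrivateKey k, byte[] data) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(k);
            s.update(data);
            return s.sign();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    static boolean verify(PublicKey k, byte[] data, byte[] sig) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(k);
            s.update(data);
            return s.verify(sig);
        } catch (Exception e) { return false; }
    }

    // Builds the CA -> IB -> IC Card chain and checks a signed challenge,
    // mirroring the three terminal verification steps.
    public static boolean demo() {
        try {
            KeyPairGenerator g = KeyPairGenerator.getInstance("RSA");
            g.initialize(2048);
            KeyPair ca = g.generateKeyPair();   // Certification Authority
            KeyPair ib = g.generateKeyPair();   // Issuer Bank
            KeyPair icc = g.generateKeyPair();  // IC Card
            // "IB public key certificate": CA signs the IB public key
            byte[] ibCert = sign(ca.getPrivate(), ib.getPublic().getEncoded());
            // "IC Card public key certificate": IB signs the card public key
            byte[] iccCert = sign(ib.getPrivate(), icc.getPublic().getEncoded());
            // DDA: the card signs the terminal's unpredictable number
            byte[] challenge = new byte[8];
            new SecureRandom().nextBytes(challenge);
            byte[] dda = sign(icc.getPrivate(), challenge);
            // Terminal side: walk the chain top-down, then check the DDA
            return verify(ca.getPublic(), ib.getPublic().getEncoded(), ibCert)
                && verify(ib.getPublic(), icc.getPublic().getEncoded(), iccCert)
                && verify(icc.getPublic(), challenge, dda);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // true
    }
}
```

The chain only validates if every link does: a tampered IB or IC Card certificate, or a replayed DDA over a different challenge, makes one of the three verify calls fail.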
The structure and content of the previous security elements used in the RSA scheme to authenticate the offline transaction are detailed in [3], according to Table 1.
Table 1. Security items involved in the DDA process
Required item | Length | Reference in [3]
CA Public Key Index | 1 | Table 8, Section 6
IB Public Key Certificate | var | Table 13, Section 6.3
IC Card Public Key Certificate | var | Table 14, Section 6.4
IB Public Key Remainder | var | Section 6.4
IB Public Key Exponent | var | Section 6.4
IC Card Public Key Remainder | var | Section 6.4
IC Card Public Key Exponent | var | Section 6.4
IC Card Private Key | var | Section 6.5, Section 6.6
Signed Dynamic Application Data | var | Table 17, Section 6.5
Signed Static Application Data | var | Table 7, Section 5.4
DDA is an improvement over SDA aimed at securing payment transactions. DDA uses dynamic data, such as an unpredictable number and other transaction-related data, to prevent the use of fake/altered/duplicated cards in offline transactions. DDA is specific to offline transactions; otherwise, payment transaction protection is ensured by real-time/online verification and validation. Offline transactions are available depending on the IC Card personalization and the terminal transaction qualifiers.
4. Bitcoin Payment Transaction Usage
Bitcoin is an alternative payment system with an electronic currency (BTC), but it is not quite an e-cash/e-coin payment system. The entire system is based on wallets, able to store the signed transactions, and on miner applications, able to produce bitcoins (BTC). Figure 5 presents the Android mobile application MyCelium, which handles BTC wallets. For each payment transaction, in order to provide anonymity, the application generates a new ECDSA (Elliptic Curve Digital Signature Algorithm) private and public key pair, together with a BTC wallet identification value in Base58Check encoding.
Figure 5. Mobile Application BTC Wallet Accounts
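The per-transaction key handling described above can be illustrated with the JDK's built-in EC support. Note the assumptions: the curve is the JDK default for 256-bit EC keys (NIST P-256), whereas Bitcoin itself uses secp256k1 (which requires an additional provider), and the transaction encoding is invented for the example.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class BtcSignSketch {
    // A fresh key pair per payment, as the wallet application does.
    public static KeyPair freshKeyPair() {
        try {
            KeyPairGenerator g = KeyPairGenerator.getInstance("EC");
            g.initialize(256);   // JDK default curve, not Bitcoin's secp256k1
            return g.generateKeyPair();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static byte[] signTx(PrivateKey k, byte[] tx) {
        try {
            Signature s = Signature.getInstance("SHA256withECDSA");
            s.initSign(k);
            s.update(tx);
            return s.sign();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static boolean verifyTx(PublicKey k, byte[] tx, byte[] sig) {
        try {
            Signature s = Signature.getInstance("SHA256withECDSA");
            s.initVerify(k);
            s.update(tx);
            return s.verify(sig);
        } catch (Exception e) { return false; }
    }

    public static void main(String[] args) {
        KeyPair kp = freshKeyPair();   // new keys for this payment only
        byte[] tx = "to:<address>;amount:0.004".getBytes();
        byte[] sig = signTx(kp.getPrivate(), tx);
        System.out.println(verifyTx(kp.getPublic(), tx, sig));   // true
    }
}
```

Because a new key pair is generated for every payment, signatures from different transactions cannot be linked through a shared public key, which is the anonymity property the wallet relies on.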
If the user of the BTC wallet needs to send a BTC amount within a payment transaction, or needs to pay for a product or a service, the "Send" option should be chosen, as in Figure 6:
Figure 6. Mobile Application BTC Wallet Send option
Each transaction input (source and destination BTC wallet, BTC amount, etc.) is processed with the RIPEMD-160/SHA-256/SHA-512 hash functions and signed with the ECDSA asymmetric key algorithm. Figure 7 shows the transaction details, which can be verified on a 3rd party BTC payment transaction platform such as [6], as in Figure 8:
Figure 7. Mobile Application BTC Wallet Transaction details option
Figure 8. BTC Transactions details in BlockChain.info web platform [6]
As can be seen in Figures 7 and 8, each BTC payment transaction can be uniquely identified through its HASH value or through the transaction value (in this case 0.004 BTC), available in the BTC block chain at the web address: https://blockchain.info/address/1PM4iKXj4uQUs51CsVVPPgRiXskxS4YgPT. From the Java Card perspective, there are open source implementations: [9], [10], [11].
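The HASH value that identifies a transaction in the block explorer is the double SHA-256 of the serialized transaction, conventionally displayed with the byte order reversed. A small sketch, with a placeholder payload (RIPEMD-160, used for addresses, is not in the default JDK providers and is omitted here):

```java
import java.security.MessageDigest;

public class TxIdSketch {
    // Double SHA-256 of the raw transaction bytes, hex-encoded with the
    // byte order reversed, as block explorers display transaction hashes.
    public static String txId(byte[] rawTx) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] h = sha.digest(sha.digest(rawTx));
            StringBuilder sb = new StringBuilder();
            for (int i = h.length - 1; i >= 0; i--) {
                sb.append(String.format("%02x", h[i]));
            }
            return sb.toString();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real id is computed over the serialized tx
        System.out.println(txId("example raw transaction".getBytes()));
    }
}
```

Any one-bit change in the raw transaction yields a completely different 64-character id, which is why the hash can serve as a unique, tamper-evident identifier.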
5. Conclusions
From the point of view of payment transaction security, both EMVCo- and Bitcoin-compliant applications have been implemented with the Java Card API on real hardware and on the Java Card Virtual Machine 2.x. In terms of security, the Java Card platform and technology has a serious advantage over other platforms in terms of design (e.g. it is possible, in the future, to harden Java Card VM implementations against logical attacks), community (Oracle/Sun Microsystems Technology Network developers and the JavaOne conference events) and enforcing institutions (GlobalPlatform). Besides the security advantages, a Java Card implementation is portable across various hardware architectures, such as: mobile devices (with (U)SIM – Universal Subscriber Identity Module, eSE – embedded Secure Element, or TEE – Trusted Execution Environment hardware), Java Card USB tokens, Java Card dual interface/contactless smart cards, etc.
References
[1] C. Boja, M. Doinea and P. Pocatilu, "Impact of the Security Requirements on Mobile
Applications Usability", Economy Informatics, vol. 13, no. 1, pp. 64 – 72, 2013
[2] C. Toma, C. Ciurea and I. Ivan, “Approaches on Internet of Things Solutions”, Journal of
Mobile, Embedded and Distributed Systems, vol. 5, no. 3, pp. 124 – 129, 2013
[3] EMV Integrated Circuit Card Specifications for Payment Systems, Book 2, Security and
Key Management, Version 4.3, November 2011, EMVCo, LLC
[4] EMV Integrated Circuit Card Specifications for Payment Systems, Book 3, Application
Specification, Version 4.3, November 2011, EMVCo, LLC
[5] EMV Contactless Specifications for Payment Systems, Book C-6, Kernel 6 Specification,
Version 2.5, February 2015, EMVCo, LLC
[6] https://blockchain.info
[7] http://www.cnas.ro/page/cardul-national-de-asigurari-de-sanatate-2.html
[8] http://www.emvco.com/specifications.aspx
[9] https://github.com/LedgerHQ/btchipJC
[10] https://github.com/Toporin/SatoChipApplet
[11] https://ledgerhq.github.io/btchip-doc/bitcoin-technical-1.4.2.html
[12] http://en.wikipedia.org/wiki/Smart_card
ACCURATE GEO-LOCATION READING IN ANDROID
Felician ALECU
Bucharest University of Economic Studies [email protected]
Răzvan DINA
Bucharest University of Economic Studies, Romania
Abstract. “112 – Show my Location” is an Android app intended to be used in all emergency
situations where the current location coordinates (taken by GPS/Network) are needed. For
minimum power consumption, the location is updated on demand only. The application is listed
on Google Play Store (eu112) and can be installed, tested and regularly used free of charge.
Keywords: 112, European emergency number, current location coordinates, Android.
JEL classification: O33
1. Introduction
On 20 January 2014, a plane of the Romanian Superior School of Aviation (transporting a five-person medical team) crashed in the Apuseni Mountains. Even though a 112 call was made at 16:16, the rescue teams reached the site only around 22:00 (after 6 hours), because all the methods used to locate the place failed.
Despite the fact that all the passengers were carrying mobile phones, the rescue services were severely delayed because they did not receive the proper coordinates of the crash site. One of the passengers used a smartphone to report the GPS coordinates, but the application he used only got the GSM tower position, which was sent to the emergency services instead of the real location. A few other attempts were made to find the current location, but the maps application reported the time as the position (19, 33, 20/1 and, four minutes later, 19, 37, 20/1), probably because GPS was not active on the device. Unfortunately, even though she survived the crash, the student Aura Ion died of hypothermia because of the long emergency response time, the rescue teams reaching the site 6 hours after the crash.
For such cases, when it is vital to report accurate position data to the emergency services, a simple and power-efficient application can make the difference between life and death.
2. App Description
The aim of the "112 – Show my Location" application is to offer the user all the details needed by the emergency services for a quick localization. 112 is the European emergency number, available free of charge, 24/7, anywhere in the European Union. Persons can dial 112 (from landline phones as well as mobiles) to reach emergency services such as the police, medical assistance and the fire brigade.
The application is listed on the Google Play Store (eu112), so anyone can use it free of charge,
as illustrated in Figure 1.
Figure 1. The Google Play Store listing
For minimum power consumption, the location is only updated on demand. The current
coordinates (taken by GPS/Network in Decimal Degrees) are displayed and also highlighted
on the map.
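The app reports coordinates in Decimal Degrees, while dispatchers sometimes work with degrees/minutes/seconds, so a small converter can bridge the two notations. The helper below is a hypothetical illustration, not part of the published app:

```java
import java.util.Locale;

public class DmsSketch {
    // Converts a Decimal Degrees value to degrees/minutes/seconds text,
    // with the hemisphere letter chosen by the sign and the axis.
    public static String toDms(double dd, boolean isLatitude) {
        String hemi = isLatitude ? (dd >= 0 ? "N" : "S") : (dd >= 0 ? "E" : "W");
        double abs = Math.abs(dd);
        int deg = (int) abs;
        double minutesFull = (abs - deg) * 60.0;
        int min = (int) minutesFull;
        double sec = (minutesFull - min) * 60.0;
        return String.format(Locale.UK, "%d\u00B0%d'%.2f\"%s", deg, min, sec, hemi);
    }

    public static void main(String[] args) {
        // Bucharest city centre, roughly
        System.out.println(toDms(44.4268, true));    // 44°25'36.48"N
        System.out.println(toDms(26.1025, false));   // 26°6'9.00"E
    }
}
```

The fixed Locale.UK keeps the decimal separator a dot regardless of the device locale, matching the way the coordinates are read out over the phone.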
3. The user interface
The 112 number should be used for emergency situations only. Using the number for any other reason is an abuse (if done intentionally) or a misuse (if done accidentally). Because any abuse is a criminal offence, the application first asks the user to enter a random password (Figure 2), trying to prevent pocket-dialing situations or cases in which children playing with the phone may inadvertently call the emergency number.
Figure 2. Asking for a password in order to prevent any accidental call to 112
The Update Location button can be used to manually update the current location. The application shows the phone number (as defined in the Android settings), the date, the latitude and longitude of the current position, and the time of the last location update (Figure 3). All these details are vital for rescue teams trying to locate a person who is in danger but carrying a smartphone.
Figure 3. Showing the current phone location
For minimum power consumption, the location can only be updated on demand, since the battery level should be preserved as long as possible in any critical situation. The user can choose to call 112 and report the details displayed by the application (for example, the phone can be switched to speaker mode so the user is able to read the position coordinates). A more convenient option is to send the location by SMS (Figure 4) to a number that can be entered manually or selected from the address book.
Figure 4. The selection of the phone number to send the current location coordinates to
4. Implementation details
In order to provide a contextual experience based on location awareness, the application uses the new Google Play services location APIs instead of the old Android framework location APIs (android.location). The application needs a Google API key, which can be generated under Google Developer Console -> APIs -> Credentials, as illustrated in Figure 5.
Figure 5. The Google API key used to connect the application to the Google Play services
The previously generated API key must be included in the app manifest file, as described in Listing 1. The fine and coarse location permissions are needed to allow the application to use all the available location providers in order to get a location as precise as possible.
Listing 1. Defining the Google API key
AndroidManifest.xml
…
<uses-permission android:name="android.permission.ACCESS_COARSE_LOCATION" />
<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
…
<meta-data
android:name="com.google.android.maps.v2.API_KEY"
android:value="AI**********************************" />
…
This key is used by the application to connect to the Google Play services, as presented in Listing 2.
Listing 2. Using the Google API key
MainActivity.java
…
protected synchronized void buildGoogleApiClient() {
mGoogleApiClient = new GoogleApiClient.Builder(this)
.addConnectionCallbacks(this)
.addOnConnectionFailedListener(this)
.addApi(LocationServices.API)
.build();
createLocationRequest();
}
protected void createLocationRequest() {
mLocationRequest = new LocationRequest();
// Sets the desired interval for active location updates.
mLocationRequest.setInterval(UPDATE_INTERVAL_IN_MILLISECONDS);
// Sets the fastest rate for active location updates.
mLocationRequest.setFastestInterval(FASTEST_UPDATE_INTERVAL_IN_MILLISECONDS);
// Sets the priority for the most precise location possible
mLocationRequest.setPriority(LocationRequest.PRIORITY_HIGH_ACCURACY);
}
When the Update Location button is pressed, the FusedLocationApi is used (Listing 3) to provide the best available location update, based on different location providers such as GPS or WiFi.
Listing 3. Getting the location updates
MainActivity.java
…
protected void startLocationUpdates() {
LocationServices.FusedLocationApi.requestLocationUpdates(
mGoogleApiClient, mLocationRequest, this);
}
The new location details are obtained in the onLocationChanged method, as exemplified in Listing 4.
Listing 4. The details of the new location
MainActivity.java
…
@Override
public void onLocationChanged(Location location) {
mCurrentLocation = location;
mLastUpdateTime = DateFormat.getTimeInstance(
DateFormat.MEDIUM, Locale.UK).format(new Date());
// update details
mDateText.setText(DateFormat.getDateInstance(
DateFormat.SHORT, Locale.UK).format(new Date()));
mLatitudeText.setText(String.valueOf(mCurrentLocation.getLatitude()));
mLongitudeText.setText(String.valueOf(mCurrentLocation.getLongitude()));
mLastUpdateTimeText.setText(mLastUpdateTime);
mMap.clear();
LatLng latLng = new LatLng(mCurrentLocation.getLatitude(),
mCurrentLocation.getLongitude());
mMap.animateCamera(CameraUpdateFactory.newLatLngZoom(latLng, 17)); //from 2 to 21
mMap.addMarker(new MarkerOptions()
.title("You are HERE!")
.snippet("Lat " + String.valueOf(mCurrentLocation.getLatitude()) +
", Long " + String.valueOf(mCurrentLocation.getLongitude()))
.position(latLng));
// location updated
Toast.makeText(this, "Location updated.", Toast.LENGTH_SHORT).show();
}
When the Call 112 button is pressed, the dialing pad appears with the 112 number already entered, so the user only has to press the green call button (Listing 5). Unfortunately, only system applications are allowed to directly call special numbers like 112.
Listing 5. Calling 112
MainActivity.java
public void Call112ButtonHandler(View view) {
Uri callUri = Uri.parse("tel://112");
Intent callIntent = new Intent(Intent.ACTION_CALL,callUri);
callIntent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK
| Intent.FLAG_ACTIVITY_NO_USER_ACTION);
startActivity(callIntent);
}
Regarding SMS, starting with Android 4.4 KitKat only the default SMS application can send and receive messages. Since there is no reason for “112 – Show my Location” to be defined as the default SMS app, the application uses the SmsManager API instead in order to send short text messages, as presented in Listing 6.
Listing 6. Sending position details by SMS
MainActivity.java
...
private void SendSMS(String phoneNumber, String smsMessage) {
try {
SmsManager smsManager = SmsManager.getDefault();
smsManager.sendTextMessage(phoneNumber, null, smsMessage, null, null);
Toast.makeText(getApplicationContext(), "SMS to " + phoneNumber + " sent.",
Toast.LENGTH_LONG).show();
} catch (Exception e) {
Toast.makeText(getApplicationContext(),
"SMS failed, please try again later.", Toast.LENGTH_LONG).show();
e.printStackTrace();
}
}
5. Conclusions
The application could be very useful in any emergency situation because it offers all the necessary details in a fast and reliable way, with minimum battery power consumption. Being based on the Google location API, the application uses Google Play services to determine the current position quickly and accurately, using the best available location taken from several different sources such as GPS or Wi-Fi. In real-life critical situations, such an application can easily save lives.
Acknowledgement
This paper was co-financed from the European Social Fund, through the Sectorial Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] P. Pocatilu, I. Ivan, A. Visoiu, F. Alecu, A. Zamfiroiu, B. Iancu, Programarea
aplicatiilor Android, ASE 2015, ISBN: 978-606-505-856-9.
[2] Making Your App Location-Aware,
https://developer.android.com/training/location/index.html
[3] Comunicat oficial al Serviciului de Telecomunicatii Speciale (STS) remis MEDIAFAX,
http://www.mediafax.ro/social/sts-medicul-zamfir-nu-a-trimis-coordonatele-geografice-
ci-ce-vedea-in-jur-si-ora-de-pe-mobil-12015644
ABOUT OPTIMIZING WEB APPLICATIONS
Marian Pompiliu CRISTESCU
“Lucian Blaga” University of Sibiu, [email protected]
Laurentiu Vasile CIOVICA
“ALMA MATER” University, Sibiu, [email protected]
Abstract: Software optimization is a broad area of software engineering and an important stage in software product development. Given n programs that solve the same problem, the optimal program is considered the one which gives the best value for an indicator called the performance criterion. This paper describes the techniques used for optimizing web applications, presenting pre- and post-optimization code fragments together with measurements of access time for an online intelligence-testing application called IQ Test. The IQ Test application evolved from a simple linear structure to a tree structure with arcs directed towards descendants and, after optimizations aimed at a high degree of flexibility, to a final version represented by a graph structure in which browsing follows many routes and navigation is possible in any direction.
Keywords: web application, software optimization, methods of optimizing, intelligence.
JEL classification: C88, C89
1. Introduction
The development of a web application is determined primarily by the architecture that underlies it. In most cases, client-server technology is used to implement distributed applications. The principal components of an application using this architecture are the server and the client.
In web applications, the server component is represented by web servers, which are products of specialized companies. In the application presented in this study the web server is Tomcat. The client component is represented by web browsers.
According to [3], the simplest web applications are those in which the servers send files with static content to the browser. The disadvantage of such applications is that any information update must be made in the files on the server, which inevitably leads to permanent updates, in other words to projects that can never be considered finished.
To solve these problems, technologies were developed [1], [5] in which, after receiving a request, the server no longer sends the content of a file, but builds a dynamic response, possibly after consulting local databases whose content changes over time. This type of application architecture is called „three tier” and involves three elements: the browser, the web server and the database server.
The optimization of a program involves improving its performance even though a satisfactory value of the performance criterion has already been obtained [2]. The solution is not unique, because each successive optimization further improves the performance. The local character of optimization should also be taken into consideration: it refers either to a single program that is modified, or to the comparison of a very small number of programs which represent alternative solutions to the same problem.
Program optimization means program improvement, and the references here concern a set of target programs. Optimization is performed after the application has been implemented and brought to a functional, error-free form; it is discussed in the context of an application that solves correctly and completely the problem for which it was built.
2. Methods for optimizing web applications
The concept of optimization is different from that of error correction. The goal of this operation is to improve the characteristics of the application, bringing it to an optimum level. According to [4], „The optimization of an application describes the modification process of the software product through the development of versions or solutions with a higher quality level”.
2.1 Minimizing the memory space
Since a web application is under discussion, the largest consumers of memory space are the multimedia resources, so the optimization effort is directed mainly towards a favorable ratio between compression and quality. In the Test IQ application the most voluminous multimedia resource is the video tutorial, which in uncompressed form has a size of 14242 KB. A considerable saving is achieved by using the TSCC codec (TechSmith Screen Capture Codec). This codec is optimized for screen captures, so the resulting file is small and well compressed, occupying 6788 KB; the space saving in this case is 7454 KB.
The next element for space minimization is the database. By storing only the strictly necessary elements and encoding them appropriately, the size of the database shrinks. In this case the application contains 3 tables corresponding to the 3 tests. Storing only the ID of the question, the image represented as a BLOB, the code for the interface type and the correct answer provides sufficient robustness while keeping the database at an acceptable size. This way, the values of 0.5 KB (table 1), 0.57 KB (table 2) and 0.16 KB (table 3) reach a cumulative value of 1.23 KB for the database.
Another important aspect in reducing the occupied memory is minimizing the input. In this case, requiring only the user's name and first name as input ensures a small input size.
2.2 Maximizing the performance of the source code
One of the advantages of distributed applications is the presence of applets, Java programs designed to run in the web browser. Two conflicting issues appear here: the dynamic facilities that an applet brings to a web application, in contrast with the lower execution speed of Java. To compensate for this shortcoming, the applet source code must be optimized so as to increase execution speed. This goal is achieved by several methods:
Operation substitution - when possible, slower operators (/, *, ^) should be replaced with faster ones, such as bit shifts: x >> 2 is used in place of x / 4, and x << 1 in place of x * 2. Compound assignment operations are also recommended: a += i is more efficient than a = a + i.
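A minimal runnable sketch of these substitutions (note that the shift/division equivalence is exact only for non-negative operands, a caveat worth keeping in mind):

```java
public class ShiftDemo {
    public static void main(String[] args) {
        int x = 20;
        // For non-negative ints, a right shift by k divides by 2^k
        // and a left shift by k multiplies by 2^k.
        System.out.println((x >> 2) == (x / 4)); // true
        System.out.println((x << 1) == (x * 2)); // true

        // Caveat: for negative values the two operations round differently;
        // >> rounds towards negative infinity, / rounds towards zero.
        System.out.println(-7 >> 1); // -4
        System.out.println(-7 / 2);  // -3

        // Compound assignment, as recommended above.
        int a = 0;
        for (int i = 1; i <= 4; i++) {
            a += i; // instead of a = a + i
        }
        System.out.println(a); // 10
    }
}
```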
Elimination of common subexpressions is a very good way to remove redundant calculation:
double x = d * (l / m) * sx;
double y = d * (l / m) * sy;
is replaced by:
double s = d * (l / m);
double x = s * sx;
double y = s * sy;
Invariant code usage in applet development. If an invariant is used to compute a result, it is more efficient to isolate the invariant. In the expression:
for (int i = 0; i < x.length; i++)
    x[i] *= Math.PI * Math.cos(y);
the same mathematical expression is redundantly recalculated in each iteration. The optimized approach computes the expression once and reuses the result:
double ex = Math.PI * Math.cos(y);
for (int i = 0; i < x.length; i++)
    x[i] *= ex;
The same technique is also valid for image loading:
for (int i = 0; i < n; i++)
    img[i] = getImage(getCodeBase(), "im" + i + ".gif");
becomes:
URL url = getCodeBase();
for (int i = 0; i < n; i++)
    img[i] = getImage(url, "im" + i + ".gif");
Correct definition of variables takes into consideration that local variables are the fastest to access, so in the construction of methods it is recommended to use this type of variable as much as possible. Another suggestion refers to access specifiers: every member that does not vary from instance to instance of the class, or that is constant over the whole execution, is declared static and final. This way no time is lost instantiating such variables each time a new object is created.
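A small illustration of these two recommendations; the class name, constant and method below are ours, for illustration only:

```java
public class AccessDemo {
    // Constant for the whole execution: declared static and final,
    // so it is created once and shared by all instances.
    static final double DEG_TO_RAD = Math.PI / 180.0;

    static double toRadians(double degrees) {
        // A local variable is the fastest kind of variable to access.
        double d = degrees;
        return d * DEG_TO_RAD;
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(toRadians(90.0) - Math.PI / 2.0) < 1e-12);
    }
}
```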
Reuse of the code greatly improves program efficiency. It is recommended to use classes extended from already existing classes and to call already existing libraries. In this way, the classes used in the interface can be personalized by extending standard classes and adding additional fields and methods:
class MyJPanel extends JPanel {
    private IqApplet ia;
    private int punctaj;

    public MyJPanel(IqApplet ia, int x) {
        super();
        this.ia = ia;
        punctaj = x;
    }
}
Another method is the reuse of objects. Instantiating a new object consumes time, so recycling is preferred when possible, especially for formatting elements of an applet such as objects of type Font. Instead of declaring a new Font object for the title, the answers and the questions, a single generic Font object is used and applied through the setFont() method.
Graphic optimization is an important aspect of an applet's functionality. If the speed in this sector is improved, the performance of the application improves decisively as well. The gained speed should not compromise image quality, because there are situations when a slower approach is preferred for better quality.
The classic case is the “Double Buffering” algorithm, whose purpose is to reduce the flickering of images on the screen. For this algorithm there is also an optimization: it uses the “clipping” property, which redraws not the whole surface but only the components that strictly need it. The optimized algorithm is presented here:
public void update(Graphics g) {
    Graphics offgc;                   // graphics object used to draw the image in the background
    Image offscreen = null;           // off-screen image which will be transferred
    Rectangle box = g.getClipRect();  // determine the area which must be redrawn
    offscreen = createImage(box.width, box.height); // image creation
    offgc = offscreen.getGraphics();
    // clean the exposed area
    offgc.setColor(getBackground());
    offgc.fillRect(0, 0, box.width, box.height);
    offgc.setColor(getForeground());
    offgc.translate(-box.x, -box.y);
    paint(offgc);
    // transfer the image onto the area
    g.drawImage(offscreen, box.x, box.y, this);
}
Other optimization strategies for graphic components involve reusing libraries which already exist. In the case of building geometric shapes, it is recommended to use the drawPolygon() method instead of a loop of drawLine() calls.
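A sketch of this recommendation, drawing a triangle with a single drawPolygon() call instead of three drawLine() calls; the off-screen BufferedImage is our own device to keep the fragment self-contained and runnable outside an applet:

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class PolygonDemo {
    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);
        int[] xs = {10, 90, 50};
        int[] ys = {90, 90, 10};
        // One call replaces a loop of drawLine() calls and also
        // closes the polygon automatically.
        g.drawPolygon(xs, ys, 3);
        g.dispose();
        // A point on the bottom edge of the triangle is now white.
        System.out.println(img.getRGB(50, 90) == Color.WHITE.getRGB());
    }
}
```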
The optimization of written text is highly important, because an applet also contains informative sections. It is recommended to use the specialized StringTokenizer class for splitting a text into units and StringBuffer for concatenations.
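A minimal sketch of this combination: StringTokenizer splits the text into units, and a single StringBuffer accumulates the concatenation (the sample text and separator are illustrative):

```java
import java.util.StringTokenizer;

public class TextDemo {
    public static void main(String[] args) {
        String text = "What is your IQ";
        StringTokenizer st = new StringTokenizer(text, " ");
        // One mutable buffer avoids creating a new String per concatenation.
        StringBuffer sb = new StringBuffer();
        while (st.hasMoreTokens()) {
            sb.append(st.nextToken()).append('-');
        }
        System.out.println(sb); // What-is-your-IQ-
    }
}
```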
Compiler optimization adds further value to the above methods. A JIT compiler will be used instead of the classic Java VM. The JIT (just-in-time) compiler is a program which converts the interpreted Java byte code into instructions that can be sent directly to the processor. The disadvantage of this method consists in the loss of the code portability provided by the byte code.
Another optimization practice is that, after the entire code has been developed and tested, it is recompiled with the optimization options activated through the javac -O console command.
Special attention in code optimization must be given to the servlet, so that the intermediation it realizes between the applet and the database is efficient. The following methods are used for optimization:
Storing static data in XML files and extracting it inside the servlet's init() method. The init() method is executed only once, at servlet instantiation, and is overridden, so this approach is preferred over dynamically generating the data at each client call to the servlet. Below, the init() method shows how the values of some data are obtained from the attached XML configuration:
...
private String connectionURL;
private String contentType;
private String user;
private String password;
public void init(ServletConfig config)throws ServletException{
super.init(config);
connectionURL = config.getInitParameter("connectionURL");
contentType = config.getInitParameter("contentType");
user = config.getInitParameter("user");
password = config.getInitParameter("password");
if(password == null)
password = "";
}
The use of the print() method instead of the println() method when sending data to objects such as PrintWriter. It is recommended to use print(), the more efficient variant, because internally println() takes the data and sends it to print(). The difference between the results of the two methods is visible only in the HTML source.
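The difference can be sketched with a PrintWriter over a StringWriter, a stand-in here for the servlet's response writer:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class PrintDemo {
    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        PrintWriter pw = new PrintWriter(out);
        pw.print("<b>row1</b>");   // writes the data only
        pw.println("<b>row2</b>"); // internally calls print(), then adds a line separator
        pw.flush();
        String html = out.toString();
        // The extra separator is visible only in the generated source,
        // not in the rendered HTML.
        System.out.println(html.startsWith("<b>row1</b><b>row2</b>")); // true
        System.out.println(html.endsWith(System.lineSeparator()));     // true
    }
}
```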
The periodical flush of the output stream ensures that the user sees the result before the entire page loads. Even though this process does not improve the application in general, the user is offered the sensation that the page is processed faster. The technique is especially useful when pages have parts with a lot of graphics or parts which require intense processing.
Increasing the size of the response buffer in which the servlets load content. When the buffer fills up, the servlet creates a socket connection towards the client and unloads the buffer. To reduce the number of sockets and the traffic, it is recommended to increase the size of the buffer through a response.setBufferSize(10240) call.
Restricting calls to ServletContext.log - calling the ServletContext.log method degrades performance. To increase efficiency, the calls to this method are reduced in favor of displaying on the console with System.out.println().
2.3 Maximization of the satisfaction level of the user
Because the purpose of the application is to test an ability, the designed interface is simple and oriented towards functionality. The user's first contact with the application is through the initial form. To ensure the user's satisfaction, the following elements are taken into consideration in the creation of the interface for Test IQ:
all the important elements of the page are placed on the surface accessible without scrolling. This lets the user understand the application more easily, without requiring supplementary actions;
the form is created in a way that does not confuse the user. The background color is not disturbing, the mandatory fields are marked with red and signaled with an informative note, and the information required from the user is minimal and follows a logical succession. Default values exist for each field except NAME and FIRST NAME, in order to avoid errors;
validations, together with the correct display of error messages, are realized in an intuitive manner to guide the user;
on the start page the user finds a description of each test. Upon accessing a test, the user can find out particular information, the time necessary to finish the test and other guiding information;
apart from the NAME and FIRST NAME fields, represented as textboxes which permit the user to introduce the desired characters, the rest of the fields require simple selection. Through this, the possible mistakes that can be made by the user are minimized, so fewer errors appear;
the application comes with a tutorial which familiarizes the user with it. The tutorial is available as a video or a slideshow, in order to accommodate users whose browsers lack media plug-ins;
after pressing the save button, the user is informed, through the appearance of an hourglass, that the application is loading and will be available shortly.
Testing the application is very important for the optimization process. From the perspective of user satisfaction maximization, the application has to be easy to use and easily navigable, and must have enough command panels to facilitate the optimal use of a test.
3. Conclusions
The optimization of web applications represents an accumulation of methods which act on different routes and aim at improving the application's performance, taking into consideration the needs of the user. The optimization process starts after a certain application is functional, and in each case requires precaution so as not to reduce the performance of one function through the optimization of another; such an outcome does not justify the optimization. For the Test IQ application, the optimization of the client, applet, server and servlet has been insisted upon. In each part, the interface, data transfer, resource management and result display were the focus.
The optimization of a web application means a rigorous design of both the source code and language and the application's structure. The duration of the optimization should be short and the complexity of the application reasonable, so as to offer functionality and sufficient robustness to permit further development and maintenance. Four stages of development have been registered on the evolution path of Test IQ, each with its own well-defined local purposes. At the end of each stage the application is tested rigorously, to ensure that there are no errors when it reaches the next stage.
References
[1] Bajaj A., Krishnan R., “CMU-WEB: A Conceptual Model For Designing Usable Web
Applications”, Journal of Database Management, Idea Group Publishing, Volume 10,
Issue 4, Page 33, The Heintz School Carnegie Mellon University, 1 October, 1999, ISSN
10638016;
[2] Boja Catalin, “Aspecte privind optimizarea în domeniul informaticii aplicate în
economie”, Economie teoretică şi aplicată, Bucureşti 2007, ISSN 1844-0029 (editia
online), pg 43-54;
[3] Cristescu M.P., Cristescu C.I., Cucu C., “Distributed applications. Practical rules”, in Proc. 17th International Economic Conference IECS 2010 „THE ECONOMIC WORLD'S DESTINY: CRISIS AND GLOBALIZATION?”, vol. I, ISBN 978-973-739-987-8, Sibiu, Romania, May 13-14, 2010, pp. 69-74;
[4] Ivan Ion, Boja Catalin, “Practica optimizării aplicaţiilor informatice”, Editura ASE, Bucureşti, 2007, ISBN 978-973-594-932-7, 483 pg;
[5] King A., “Website Optimization: Speed, Search Engine & Conversion Rate Secrets”, published by O'Reilly Media Inc., USA 2008, ISBN 978-0-596-51508-9, 349 pg.
THE RELATIONSHIP BETWEEN ENVIRONMENTAL AND
ENERGY INDICATORS. THE CASE STUDY OF EUROPE
Titus Felix FURTUNĂ
The Bucharest University of Economic Studies
Marian DÂRDALĂ
The Bucharest University of Economic Studies
Roman KANALA
Université de Genève
Abstract. The largest amount of CO2 emissions in the European Union originates from the
production of electricity and heat. Coal-based energy production in the EU has generated
more than 20% of total CO2 emissions in the EU. In this paper, we analyze the relationship
between environmental indicators and various sets of energy indicators. We propose to use
the generalized canonical analysis in order to study the relationship among many sets of
indicators. The programmatic support for the Canonical Analysis and Generalized Canonical Analysis software application is provided by the Java Apache Commons Math library. For graphical representation we used the specialized Java chart library JFreeChart. Factorial distributions of the countries on the canonical axes are displayed on maps prepared in ArcGIS by ESRI.
Keywords: Canonical analysis, Generalized Canonical analysis, Environmental Indicators,
Energy, GIS
JEL classification: C380, C8, Q4
1. Introduction
In our society, energy is essential to meet the daily needs of individuals, industry, transport,
agriculture and services. The energy issues are of main concern, because of their increasing
importance and impact on the environment. Air pollution contributes to an increase of the
greenhouse effect that causes global warming. Non-renewable energy resources produce
many pollutants during their operation. On the other hand, renewable energy is often considered “clean” energy. However, no energy path is entirely emission-free.
Based on data provided by Eurostat, we draw a relationship between the structure of energy production, especially electricity, and the emission of greenhouse gases and other pollutants. The sets of indicators are as follows:
Set1 - Emissions of greenhouse gases and air pollutants;
Set2 - Main indicators of energy;
Set3 - Final energy consumption by products;
Set4 - Final energy consumption by sectors;
Set5 - Electricity production by sources.
The general connection among these sets of variables can be studied in a first stage using generalized canonical analysis. Pairs of data sets connected to each other can then be studied in a second stage by applying canonical analysis.
The Canonical Analysis is a statistical method proposed in 1936 by Hotelling [1a]. It is used for significance hypothesis testing and to depict the relationship between two sets of variables. Generalized Canonical Analysis is an extension of Canonical Analysis to many sets of variables. Among these methods, let us mention the following: the Sumcor method [2] consists in maximizing the sum of the canonical correlations; the Carroll method [1][5] consists in maximizing the sum of the correlations between the common canonical variables and the canonical variables of the groups; the SSqCor method [3][4] consists in maximizing the sum of the squares of the correlations of pairs of canonical variables. Various constraints and optimality criteria may produce different results [8].
2. Generalized Canonical Analysis. Problem formulation
Let q observation tables $X_1, X_2, \dots, X_q$ describe the same n individuals. Let $m_i$ denote the number of columns of the matrix $X_i$, $W_i$ the subspace of $R^n$ generated by the columns of $X_i$, and $P_i$ the orthogonal projector onto the subspace $W_i$. We suppose that $n > \sum_{i=1}^{q} m_i$. According to the Carroll criterion [1][7], Generalized Canonical Analysis determines in the first step an auxiliary variable $Z^1$ and q canonical variables $z_i^1$ ($i = 1, \dots, q$), so that $\sum_{i=1}^{q} R^2(Z^1, z_i^1)$ is maximal under the constraint $(Z^1)^t Z^1 = 1$. In order for the sum of correlations to be maximal, the vectors $z_i^1$ are chosen as the orthogonal projections of $Z^1$ onto the subspaces $W_i$, that is $z_i^1 = P_i Z^1$. The above sum can therefore be rewritten as:

$$\sum_{i=1}^{q} R^2(Z^1, z_i^1) = \sum_{i=1}^{q} \frac{\mathrm{Cov}^2(Z^1, z_i^1)}{\mathrm{Var}(Z^1)\,\mathrm{Var}(z_i^1)} = \sum_{i=1}^{q} \frac{\left(\frac{1}{n}\,(Z^1)^t z_i^1\right)^2}{\mathrm{Var}(Z^1)\cdot\frac{1}{n}\,(z_i^1)^t z_i^1}.$$

Replacing $z_i^1$ with $P_i Z^1$ and taking into account that $P_i^t P_i = P_i^2 = P_i$, we obtain:

$$\sum_{i=1}^{q} R^2(Z^1, z_i^1) = \frac{1}{n}\,(Z^1)^t \left(\sum_{i=1}^{q} P_i\right) Z^1.$$

So, the optimization problem becomes:

$$\max_{Z^1}\; \frac{1}{n}\,(Z^1)^t \left(\sum_{i=1}^{q} P_i\right) Z^1 \quad \text{subject to} \quad (Z^1)^t Z^1 = 1.$$

The solution of this problem, the variable $Z^1$, is the eigenvector of the matrix $\frac{1}{n}\sum_{i=1}^{q} P_i$ corresponding to the highest eigenvalue. The canonical variables of the groups are determined by the relation $z_i^1 = P_i Z^1$.

In step k, the auxiliary variable $Z^k$ and the canonical variables $z_i^k$ ($i = 1, \dots, q$) are determined so that $\sum_{i=1}^{q} R^2(Z^k, z_i^k)$ is maximal, subject to the constraints:

1) $(Z^k)^t Z^k = 1$;
2) $(Z^k)^t Z^j = 0$, $j = 1, \dots, k-1$.
The $Z^k$ variable is the eigenvector of the matrix $\frac{1}{n}\sum_{i=1}^{q} P_i$ corresponding to the eigenvalue of order k (in order of magnitude). The canonical variables of the groups are $z_i^k = P_i Z^k$.
4. Relationship between sets of indicators
The set of emissions of greenhouse gases and air pollutants includes the following indicators:
air emissions produced in the industry and households sectors by electricity, gas, steam and air conditioning supply;
air pollution by: sulphur oxides, nitrogen oxides, ammonia, non-methane volatile organic compounds, particulates (< 10 µm) in the sector of energy production and distribution and energy use in industry;
greenhouse gas emissions (CO2 equivalent) in energy and the energy industry.
The set of main indicators in energy:
energy intensity of the economy - Gross inland consumption of energy divided by GDP
(kg of oil equivalent per 1 000 EUR);
implicit tax rate on energy - EUR per ton of oil equivalent;
combined heat and power generation - percent of gross electricity generation;
electricity generated from renewable sources - percent of gross electricity consumption;
market share of the largest generator in the electricity market - percent of the total
generation;
share of renewable energy in gross final energy consumption.
The set of final energy consumption by sectors includes indicators that represent the total sum of energy supplied to the final consumer's door, for all energy uses. The values are expressed as a percentage. The indicators of the final energy consumption by product set represent the total final energy consumption and the energy consumption of a selected number of products or product groups. These values are also expressed as a percentage. The last set of indicators reflects the structure of electric power sources. First we apply the generalized canonical analysis in order to discover the relationship among all the five data sets. The results are evaluated by applying a significance test based on the Wilks-Bartlett statistic [6]. Table 1 presents the significance test results for the first three canonical roots at a significance level of 0.05. There is no significant relationship among the five data sets at the global level.
Table 1 - Statistical significance of auxiliary canonical variables
Root Chi Square Degree of freedom Critical Chi Square
Z1 18.42809 35 49.80185
Z2 8.70627 34 48.60237
Z3 4.52463 33 47.39988
Canonical correlations for each group are presented in Table 2. These are the correlations between the auxiliary canonical variables and the canonical variables of the sets. We remove from the analysis the data sets for which the correlations between the auxiliary canonical variables and the canonical variables of the data sets have low values: Set2 and Set4.
For the graphical representation of the individuals on the auxiliary canonical axes, we propose to use a map of Europe. Countries with similarities regarding the relationship between the three sets of indicators are colored in similar shades (white, light grey, dark grey).
Figure 1. Map of European countries colored using the first auxiliary canonical variable
The countries for which the values on the first auxiliary canonical axis are smaller are colored in lighter shades. These are the countries in which the link between energy activities and environment indicators is weaker. As can be seen, the relationships are more intense in countries that generate energy mainly from fuels such as oil, coal or gas.
Table 2 - Auxiliary canonical variables - canonical variables correlations
Root Set1 Set2 Set3 Set4 Set5 Total
Z1 0.84015 0.7552 0.95213 0.81302 0.92712 4.28761
Z2 0.81263 0.77401 0.77447 0.55641 0.84853 3.76606
Z3 0.65885 0.70746 0.81826 0.34197 0.83167 3.35821
After removing Set2 and Set4, we apply the algorithm again to the three remaining sets. The significance test validates the first three auxiliary canonical roots.
Table 3 - Statistical significance for three sets analysis
Root Chi Square Degree of freedom Critical Chi
Square
Z1 102.7433 32 46.19426
Z2 71.78868 31 44.98534
Z3 51.89408 30 43.77297
Z4 38.57813 29 42.55697
Z5 27.81176 28 41.33714
The recalculated canonical correlations for the first three sets are presented in Table 4. Only the significant auxiliary canonical variables are taken into consideration.
Table 4 - Auxiliary canonical variables - canonical variables correlations with 3 sets
Root Set1 Set3 Set5 Total
Z1 0.92265 0.94266 0.95369 2.819
Z2 0.88179 0.8732 0.88205 2.63705
Z3 0.81013 0.79537 0.8268 2.43231
To determine the correlations among the significant auxiliary canonical variables and the variables of the groups, a detailed look at the relations among the groups is required. The correlation circle between the first two auxiliary canonical variables and the variables of the groups is drawn in Figure 2.
Canonical analysis is a good working tool to define the connection among the various phenomena represented in the data sets. In the case of the connection between the emission of pollutants and the activity in the energy sector, the results show that some correlation exists.
Figure 2 - Correlation between auxiliary canonical variables and the group variables
It can be observed that some values are low, which means that energy production and energy consumption behave in different ways when it comes to the emission of pollutants. For instance, the electric energy produced from oil is characterized by a strong correlation with many indicators concerning pollution. A more detailed image of the relationships between the sets can be obtained by computing the correlations among the variables of each group and the canonical variables of the groups. High correlations indicate variables whose significance contributes to the relationship among the sets. Figure 3 presents the correlation circles of two data sets. It can be observed that the concentrations of points represent the same variables, but in different spaces.
5. Conclusions
The canonical analysis is a good working tool to depict the relationships among various phenomena represented in the data sets. The case study results show that a significant relationship exists between pollutant emissions and the activity of the energy sector. This relationship is stronger in countries where the energy is mostly generated from fossil fuels.
Figure 3 - Correlation between canonical variables and the group variables in each space
Acknowledgment
This work was supported by the Swiss Enlargement Contribution in the framework of the
Romanian-Swiss Research Program (Grant IZERZ0_142217).
References
[1] J. D. Carroll, “Generalization of canonical correlation analysis to three or more sets of
variables”, Proceedings of the 76th Annual Convention of the American Psychological
Association, 1968, pp. 227-228
[1a] H. Hotelling, “Relations Between Two Sets of Variates”, Biometrika 28(3-4), 1936, pp. 321-377
[2] P. Horst, “Relations among m sets of measures”, Psychometrika 26(2), 1961, pp. 129-149
[3] J. R. Kettenring, “Canonical analysis”, in S. Kotz and N. L. Johnson (eds.), Encyclopedia
of Statistical Sciences, New York, Wiley, 1983, pp. 354-365
[4] J. R. Kettenring, “Canonical analysis of several sets of variables”, Biometrika 58(3), 1971,
pp. 433-451
[5] V. Nzobounsana, T. Dhorne, “Écart : une nouvelle méthode d’analyse canonique
généralisée (ACG)”, Revue de Statistique Appliquée 51(4), 2003, pp. 57-82
[6] M. S. Bartlett, “A Note on the Multiplying Factors for Various χ² Approximations”,
Journal of the Royal Statistical Society, Series B 16(2), 1954, pp. 296-298
[7] E. Diday, J. Pouget, J. Lemaire and F. Testu, Éléments d’analyse de données, Dunod,
Paris, 1985
[8] A. Gifi, Nonlinear Multivariate Analysis, Chichester, John Wiley and Sons, 1990 (first
edition 1981, Department of Data Theory, University of Leiden)
[9] Commons Math: The Apache Commons Mathematics Library,
https://commons.apache.org/proper/commons-math/
[10] JFreeChart - Java chart library, http://www.jfree.org/jfreechart/
[11] Eurostat, http://ec.europa.eu/eurostat/data/database
[12] International Energy Agency, http://www.iea.org/statistics/statisticssearch/report
IS THE INTEGRATED MANAGEMENT SYSTEM OF QUALITY,
ENVIRONMENT AND HEALTH AND SAFETY A SOLUTION FOR
SMEs' PERFORMANCE?
Dorin MAIER
The Bucharest University of Economic Studies
Adela Mariana VADASTREANU
National Institute of Research and Development for Isotopic and Molecular
Technologies and Technical University of Cluj-Napoca
Andreea MAIER
Technical University of Cluj Napoca
Abstract. The research presented in this paper is set in the context of an increasingly
demanding economic environment for SMEs. To stay on the market, organizations need to
reduce production costs and increase profit while still meeting quality, environmental, and
health and safety requirements. Various standards have been developed to certify that
organizations meet these requirements. Applying an integrated standards model has been
shown to be more effective than applying the standards independently. The problem that
arises is that adopting the integrated version of the standards requires changes at the
organizational level. In the case of SMEs every change affects the organization's activity, so
this paper examines the changes needed to adopt an integrated management system and
whether those changes influence the performance of SMEs. After analysing the results, we
conclude that implementing an integrated management system is a necessary step towards
improving SME performance, despite the disadvantages that can affect an SME's activity.
Keywords: ISO 9000, ISO 14000, OHSAS 18000, Business performance
JEL classification: M12, M21, P17
1. Introduction
The new challenges for business success in today's economic context are in continuous
development. Companies are struggling to reduce production costs, make more profit, and
still meet all the environmental, quality, and health and safety demands, or create a better
image to overcome competition. Implementing several standards simultaneously and
independently can be costly and demanding in human resources. Integrating the standards
has been shown to be a means of overcoming these difficulties. On the other hand,
integration of the standards requires changes in the organization, such as integrating the
existing management systems, focusing on products, collaborating with stakeholders and
creating a learning environment. Several factors can hinder the integration of management
systems, including the absence of knowledge among employees and management and the
absence of demand.
There are significant differences in the character of large and small companies. The lack of
resources, technical ability, time and capital usually leads SMEs to inaction [1]. They are
closely integrated into the fabric of the local community, have staff that usually come from
within a small radius of the company and often use traditional processes or services [2].
Furthermore, [1] assert that smaller firms tend to lack information management systems,
concentrating information-gathering in one or two key personnel rather than sharing scanning
activities among a range of top executives; larger firms, on the other hand, have the capital to
employ external consultants or may even have in-house experts.
2. Approaches regarding the integrated quality - environment - occupational health and
safety management systems
Researchers and practitioners all over the world study the concept of the integrated
management system. A series of definitions have been formulated to cover all the aspects of
an integrated management system. Among these definitions, for this paper we consider the
definition of [3] quite conclusive. According to it, an integrated management system is
conceptualized as a single set of interconnected processes that share a unique pool of human,
information, material, infrastructure and financial resources, in order to achieve a composite
of goals related to the satisfaction of a variety of stakeholders. A management system sets the
goals and objectives, outlines the strategies and tactics, and develops the plans, schedules,
and necessary controls to run an organization.
Since many management systems have been developed, companies have two choices: leave
these to function as specific systems, or integrate them. An integrated management system is
‘the organizational structure, resources and procedures used to plan, monitor and control
project quality, safety and environment’ [4]. Regarding the need that gave birth to this trend
worldwide, [5] considers that the appearance of integrated management systems is not of a
visionary nature, but rather a result of the real constraints of the current markets in which
organizations operate.
By analysing the definitions of an integrated management system we can draw some general
lines that cover all the aspects related to the integration of management systems. In the same
research where we found the definition of integrated management systems [3], the author
considers that the integration of management systems is really about two things: the standards
and the internal systems that these standards describe. If both are considered, the ultimate
goal is ‘one standard, one system’. However, this research examines to what extent small and
medium companies have achieved the goal of ‘many standards, one system’, because, since
management standards keep emerging, any effort towards an integrated standard that would
cover all the current ones is futile.
The literature contains many studies that approach the problem of integrating management
systems; various types of possible integration are presented, and several models of
integration have been developed. In this paper we do not study the theory of integration and,
due to space limitations, we cannot present all the models. For this study we have chosen the
model of integration proposed by the authors in [5].
The integration of management systems is possible because the ISO 9001 and ISO 14001
standards are compatible and the OHSAS 18001:1999 specification was modeled on ISO
14001; all these referentials follow a process-oriented approach based on the PDCA (Plan -
Do - Check - Act) concept. The common principle of these systems is continuous
improvement (see Figure 1).
Figure 1 - Model of integrated management systems (author proposal in [5])
The basis of the model of integrated management systems presented in Figure 1 is the
four-step management method PDCA (plan-do-check-act), which allows the control and
continuous improvement of processes and products. Each PDCA cycle starts with
establishing the objectives and processes needed to achieve the expected results (PLAN);
these objectives and processes are then implemented to deliver the results (DO); the results
are studied and compared with the expected results (CHECK); finally, corrective actions are
requested on the differences between the planned and actual results, and the root causes are
determined (ACT).
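The four steps above can be sketched as a simple feedback loop. The following Python sketch is purely illustrative: the numeric objective and the "close half of the gap" corrective action are invented for the example and do not come from the standards themselves:

```python
def pdca(objective, process_quality, iterations=4):
    """Illustrative PDCA loop: each cycle measures the gap to the
    objective and applies a corrective action that narrows it."""
    history = []
    for _ in range(iterations):
        # PLAN: the objective and the process are already established.
        # DO: run the process and obtain the actual result.
        actual = process_quality
        # CHECK: compare the actual result with the expected result.
        gap = objective - actual
        # ACT: corrective action on the difference (here: close half the gap).
        process_quality += 0.5 * gap
        history.append(gap)
    return history

gaps = pdca(objective=100.0, process_quality=60.0)
print(gaps)  # gaps shrink each cycle: [40.0, 20.0, 10.0, 5.0]
```

The shrinking gap across iterations is the "continuous improvement" the model refers to: each cycle's ACT step becomes the starting point of the next cycle's DO.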
3. Effect of integrated management systems over the performance of an organization
The performance of an organization can be considered its ability to achieve its objectives
through strong management and persistent rededication to those goals. Organizational
performance involves the recurring activities of establishing organizational goals, monitoring
progress toward them, and making adjustments to achieve them more effectively and
efficiently. Organizational performance is measured by competitive advantage and brand
differentiation, both the result of a sustained focus on delivering high-quality services,
increasing efficiency through management change techniques, and applying quality
management systems and environmental standards to decrease product cost and increase
product quality [6].
In order to analyse whether the integration of management systems, i.e. the adoption of a
model of integrated management system, is worthwhile, we compared the benefits of an
Integrated Management System with the changes the organization must implement. In Table
1 we have identified, based on the literature [7], [8], [9], the main advantages of
implementing an Integrated Management System.
Table 1 - Benefits that an SME can gain from the implementation of an Integrated Management
System
- improvement of internal efficiency and effectiveness
- homogeneity in management methodologies
- reduction of the fuzzy management boundaries between individual systems and broadening of the horizon beyond the functional level of any individual
- avoidance of duplication between the procedures of the systems
- elimination of overlapping effort
- reduction in external certification costs compared with single certification audits
- alignment of objectives, processes and resources in different areas
- reduced paperwork (particularly positive for small businesses)
- synergy effects
- elimination of redundancies
- a holistic approach to managing business risks
- improved internal and external communication
- reduced risks
- increased profit margins
- improved quality of management by down-sizing three functional departments to one
- increased operational efficiency by harmonizing organizational structures with similar elements and sharing information across traditional organizational boundaries
- streamlined paperwork and communication
- less redundancy and fewer conflicting elements
- time savings
- more transparency
- more feasibility
- better structured processes
- clearer responsibilities
- harmonization of management system documentation, responsibilities and relationships
- a structured balance of authority and power
- exposure, identification and rationalization of conflicting objectives
- focusing the organization on business goals
- formalization of informal systems
- harmonized and optimized practices
- identification and facilitation of staff training and development
- reduced coordination problems
- improved operational performance
- cross-functional teamwork
- staff motivation
- enhanced customer confidence
- simplified systems, resulting in less confusion, redundancy or conflict in documentation
- optimized resources in maintaining a single system
- simplification of requirements
- integrated audits
Another aspect analysed in the literature is the negative side of implementing an integrated
management system, i.e. the price an organization has to pay in order to have a functional
management system [10]. The principal disadvantages of and barriers to implementing an
integrated management system are summarized in Table 2.
Table 2 - Disadvantages and barriers to the implementation of integrated management systems
Disadvantages of integrated management systems
- transmission of non-conformances from one system to the other systems
- the maintenance of additional procedures, such as document changes, training or calibration, may overload the support staff
- in some cases the need for additional support, which may create conflict between the new group and the older one
- the lack of knowledge and competence in the organization
- the organizational separation of the systems
- the hope for a clear, in-depth focus of a single standard
- feeling secure with the existing management systems
- a one-sided management focus on one area
- the workers have to work differently
Implementation of and compliance with the Integrated Management System requires the
development of capabilities and support mechanisms in order to achieve the integrated
policy, objectives and targets. In fact, the organization should focus on aligning its internal
assets, i.e. people, strategy, resources and structure. The alignment of assets is an important
characteristic of successful small and medium enterprises [11].
4. Conclusions
The recognition of management systems was possible because of multiple causes, such as
legislative changes in quality, environment and health, rapid technological advancement, and
the on-going need to develop new skills for employees. What organizations require of
standards is that they enable the control of key management functions with maximum
effectiveness and minimum bureaucracy.
In the case of SMEs the importance of management systems is higher, due to their limited
human and financial resources, and in this context integrated management systems matter
even more. In the literature we can find several models for integrating different management
systems, the most widely used being the quality, environmental, and health and safety
management systems.
Research also identifies the advantages and disadvantages of the integrated management
system. The main advantages are coordinated decisions, coherence in the organization's
activity and, in the end, efficiency at lower cost. Among the disadvantages, the difficulty of
implementation and the reduction of flexibility are the main ones. However, by weighing all
the advantages and disadvantages, we can conclude that implementing integrated
management systems is a necessary step towards increasing business performance.
Acknowledgements
This paper benefited from financial support through the National Institute of Research and
Development for Isotopic and Molecular Technologies, Cluj-Napoca, Romania.
References
[1] D. Holt, S. Anthony and H. Viney, “Supporting Environmental Improvements in Small
and Medium-Sized Enterprises in the UK”, Greener Management International, Issue 30,
2000, pp. 29-49
[2] http://www.aippimm.ro/articol/imm/legislatie-imm/definitie-imm, accessed on 15.02.2015
[3] S. Karapetrovic, “Musings on integrated management”, Measuring Business Excellence,
Vol. 7, No. 1, 2003, pp. 4-13
[3] A. Griffith, “Integrated management systems: a single management system solution for
project control?”, Engineering, Construction and Architectural Management, Vol. 7, No.
3, 2000, pp. 232-240
[4] M. Dragomir, Cercetări şi contribuţii privind concepţia, dezvoltarea, implementarea şi
îmbunătăţirea continuă a sistemelor de management integrat al calităţii, mediului şi
sănătăţii şi securităţii muncii în întreprinderile industriale, PhD thesis, Universitatea
Tehnică, Cluj-Napoca, 2010, pp. 15-18
[5] M. Olaru, D. Maier, D. Nicoara and A. Maier, “Establishing the basis for development of
an organization by adopting the integrated management systems: comparative study of
various models and concepts of integration”, 2nd World Conference on Business,
Economics and Management (BEM), Antalya, Turkey, April 25-28, 2013
[6] O. Gavris, “Management of infrastructure rehabilitation works using multicriteria
analysis”, 6th International Conference on the Management of Technological Changes,
Alexandroupolis, Greece, September 3-5, 2009
[7] L. Anastasiu, “How the changing of technology can become a motivating factor in human
resources management”, 6th International Conference on the Management of
Technological Changes, Alexandroupolis, Greece, September 3-5, 2009
[8] S. Ciplea, C. Ciplea, L. Anastasiu and A. Popa, “Costs optimization methods with
applications in students training”, 6th International Seminar on the Quality Management
in Higher Education, Tulcea, Romania, July 8-9, 2010
[9] M. Olaru, Managementul calităţii, ediţia a 2-a revizuită şi adăugită, Editura Economică,
Bucureşti, 1999
[10] D. Darabonţ and Ş. Pece, Studiu privind elaborarea unui sistem de management al
securităţii şi sănătăţii în muncă, Institutul Naţional de Cercetare Dezvoltare pentru
Protecţia Muncii, Bucureşti, 2003
[11] S. Roncea and F. Sârbu, Ghid Pentru Proiectarea şi Implementarea unui Sistem Integrat
de Management Calitate - Mediu - Sănătate şi Securitate în Muncă, QM Consulting,
Bucureşti, 2004
A FRAMEWORK FOR DESIGNING AN ONTOLOGY-BASED E-
LEARNING SYSTEM IN HEALTHCARE HUMAN RESOURCE
MANAGEMENT
Lidia BAJENARU
National Institute for Research & Development in Informatics, 8-10, Mareşal Averescu
Avenue, Bucharest, Romania
Ion Alexandru MARINESCU
National Institute for Research & Development in Informatics, 8-10, Mareşal Averescu
Avenue, Bucharest, Romania
Ion SMEUREANU
Department of Economic Informatics and Cybernetics, Bucharest University of Economic
Studies, 6, Romana Square, Romania
Abstract. In this paper we present the components of a personalized, ontology-based
e-learning system, a tailored training system adapted to the needs of hospital management,
and the architecture that will form the basis of the proposed system's implementation.
We present the technical and conceptual aspects of our e-learning system for course
personalization. The system is composed of different knowledge components that represent
the system's information about the specific domain, namely Healthcare Human Resources
Management (HHRM), and the student model. The system builds personalized educational
content to meet the target group's preferences and need for domain knowledge. The
modelling of the educational domain specific to HHRM and the use of ontologies in the
personalized learning process are implemented in an intelligent Web learning platform. The
proposed ontological e-learning system aims to provide a comprehensive and systematic
knowledge base about the competences and knowledge of the target group, enabling training
according to the student's profile and goals. The gain from this type of training will be a more
competent evaluation and management of the health services provided by the hospital.
Keywords: E-learning, ontology, human resource management, personalization, architecture.
JEL classification: D83, M12, I19
1. Introduction
The goal of e-learning is to "break" the barriers of time and space through the automation of
learning [1]. The proposed e-learning system addresses the management team of a university
hospital, in order to verify and improve their knowledge in the HHRM field as a professional
requirement. This paper presents a proposed architecture for an ontology-based system, a
tailored training system adapted to the needs of hospital management. The ontology usage
model in the e-learning system is developed to structure the educational content in the
HHRM domain in Romania [2]. The proposed method for the development of this system is
based on building a personalized learning path for each student according to his profile and
preferences. The personalization and reuse of the system's educational materials are achieved
using Semantic Web technology and an ontological approach.
The Semantic Web underlies the new WWW architecture that allows searching and
navigating through cyberspace by content with formal semantics [3]. Ontology is part of the
Semantic Web structure and, according to Gruber [4], it is the explicit specification of a
conceptualization, which facilitates the exchange of knowledge in a domain.
Among the instruments designed to support the development of ontologies, we used Protégé
[5]. Ontology components are used in the Semantic Web as a form of knowledge
representation. Ontologies generally describe classes, attributes, individuals and relations.
They are used to model educational domains and to build, organize and update specific
learning resources (e.g. student profiles, learning paths, learning objects).
The proposed ontological e-learning system aims to provide a comprehensive and systematic
knowledge base about the competences and knowledge of the target group, enabling training
according to the student's profile and goals. The gain from this type of training will be a more
competent evaluation and management of the health services provided by the hospital.
2. General aspects on the modelling of the system processes
The results of the modelling processes of the proposed intelligent e-learning system are the
following [6]:
- a student model based on the student's knowledge, for learning personalisation;
- a conceptual model of the HHRM domain - an ontology for knowledge modelling.
The student model is used to adapt the interaction mode of the e-learning system according to
the user's needs [7]. The domain model is used to break down the body of knowledge into a
set of domain knowledge elements.
The Student Model is built based on the following features: profile identification, knowledge,
learning style, learning goals, student skill level assessment and feedback. The modelling
process involves two other main areas, learning process modelling and digital content
modelling, and it is based on the basic concepts of the Instructional Management Systems
(IMS) standard.
Building the student model is based on the static and dynamic models. The main source of
static information (that does not change during system-student interaction) is the student
profile. It contains the following information: personal identification data, cognitive profile,
preferences on educational content, education. The student performance and the information
regarding knowledge gained during the e-learning process are stored dynamically in the
student portfolio and serve to continuously update the student model [6]. Student’s profile
was implemented with the Protégé environment [6].
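The static/dynamic split described above can be sketched as two data structures feeding one model. The field names below are our own illustrative assumptions, not the system's actual Protégé schema:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class StudentProfile:
    """Static information: does not change during system-student interaction."""
    name: str
    cognitive_profile: str
    content_preferences: str
    education: str

@dataclass
class StudentPortfolio:
    """Dynamic information: updated throughout the e-learning process."""
    scores: dict = field(default_factory=dict)

@dataclass
class StudentModel:
    profile: StudentProfile
    portfolio: StudentPortfolio

    def record_result(self, concept: str, score: float) -> None:
        # Continuously update the model from assessment results.
        self.portfolio.scores[concept] = score

# Hypothetical student: the profile is frozen, the portfolio accumulates results.
model = StudentModel(
    StudentProfile("Ana", "visual", "case studies", "MD"),
    StudentPortfolio(),
)
model.record_result("staff_scheduling", 0.8)
print(model.portfolio.scores)  # {'staff_scheduling': 0.8}
```

Freezing the profile while leaving the portfolio mutable mirrors the paper's distinction between the static profile and the dynamically updated portfolio.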
Our knowledge domain is Healthcare Human Resources Management in a university hospital
in Romania. The identified domain concepts are organized into an ontology and were
implemented using Protégé, as shown in [6].
In the modelling process of the knowledge domain, the body of knowledge is decomposed
into a set of domain knowledge elements called "concepts". These basic pieces of knowledge
or information are represented by learning objects (LOs). LOs based on the IMS standard
help personalize the educational content, i.e. the learning material is offered according to the
student's needs. The learning path scenario of our system is based on the student's level of
knowledge and his requirements. It should allow students to access the content of a particular
field of knowledge - in our case, the specific desired job - necessary to complete their
specific training. The personalization of the learning units is achieved by selecting a Learning
Object (LO) for each specific concept of the learning path. The learning objects come both
from classical content providers (teachers, publications etc.) and from the automatic
collection of specific content from the Web. In the first case, the system links the student's
learning preferences to the learning object metadata contained in the system database.
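The matching of student preferences to LO metadata can be sketched as a simple scoring function. The metadata keys and sample objects below are hypothetical illustrations, not actual IMS metadata fields:

```python
def select_lo(concept, preferences, learning_objects):
    """Pick the Learning Object for a concept that best matches the
    student's preferences (most metadata values in common)."""
    candidates = [lo for lo in learning_objects if lo["concept"] == concept]
    if not candidates:
        return None
    # Score each candidate by how many preference values its metadata matches.
    def score(lo):
        return sum(1 for k, v in preferences.items() if lo.get(k) == v)
    return max(candidates, key=score)

# Hypothetical LO database and student preferences.
los = [
    {"id": "lo1", "concept": "recruitment", "format": "video", "level": "basic"},
    {"id": "lo2", "concept": "recruitment", "format": "text", "level": "advanced"},
]
prefs = {"format": "text", "level": "advanced"}
print(select_lo("recruitment", prefs, los)["id"])  # lo2
```

A real implementation would query the metadata repository rather than a list, but the principle is the same: one LO per concept, chosen by preference match.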
The graphical representation of the e-learning process concept hierarchy can be seen in
Figure 1.
Figure 1 - An ontological representation for system concepts
The knowledge base consists of information about students' skills, functions and
competences, and the general and specific HRM concepts within a university hospital.
3. Overview of the e-learning system
The modelling of the educational domain specific to human resource management in a
hospital and the use of ontologies in the personalized learning process will be implemented in
an intelligent Web learning platform.
The general conceptual architecture of the e-learning system is presented in Figure 2.
The system allows the student to access a set of HRM domain concepts (categories), where
each domain concept is associated with an explicit formal description. After the student
chooses the HRM domain target concepts, the system triggers the training process by
evaluating several alternatives that aim to build an adequate course (presentation) which
satisfies both the student's current knowledge state and his / her personalized learning
preferences [8].
The core of the intelligent e-learning system is an ontology (Domain description) which plays
the role of a systematic and comprehensive repository of knowledge on the skills of the target
group, containing basic concepts (such as competence, management, person, skills etc.)
allowing the application of available knowledge and its relationships to other concepts,
instances and properties.
The Student Model determines the student's current level of knowledge and the objectives
that he / she wishes to achieve through training. It guides the student's learning activity and is
built incrementally by the system, using both data offered directly by the student (collected
via electronic forms generated by the system) and the student-system interaction.
The Collect / Up-to-date Component creates an updated Student Model and collects data from
various sources for this model. The main source of static information (that does not change
during system-student interaction) is the student profile. The student performance and the
information regarding knowledge gained during the e-learning process are stored dynamically
in the student portfolio and serve to continuously update the Student Model.
Figure 2 - The general conceptual architecture for the proposed e-learning system
The student's current knowledge level (knowledge state) is evaluated by the system from
prior knowledge collected through pre-assessment tests and from the results obtained during
the learning process. The learning style defines how the student prefers to learn and can be
adapted according to the developed cognitive capacities.
Depending on their profile and responsibilities, the members of the management team will
have access to the e-learning platform in order to get a personalized learning program based
on a specific ontology and bibliographies compliant with their learning requirements [6].
After the results are delivered, the user profile is dynamically filled with additional
information, completing the Student Model. Using the mechanisms of the learning process, the
system will link the target concepts chosen by the student, the description of the HRM
domain (Ontology) and the student's current level of knowledge, options, profile and
preferences. Based on these specific domain concepts, the necessary learning units (course,
lesson, module etc.) will be personalized and launched allowing the student to access
educational content from a particular knowledge domain.
The student goes through the learning unit and obtains results from the evaluations of each
module. The system validates the intermediate results, granting or denying the student the
right to continue the sequence of activities, and updates his profile. The system analyses the
key concepts of the student profile (e.g. learning style) and provides dynamic information
(specific links, references etc.). The system compares the individual test results with the
required domain-specific knowledge, providing at the end of the training feedback with
recommendations for additional training.
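This final comparison step can be sketched as a gap analysis between test scores and a required knowledge level. The threshold and concept names below are assumptions for the example, not values from the system:

```python
# Assumed passing threshold per concept (illustrative, not from the system).
REQUIRED_LEVEL = 0.7

def training_feedback(results: dict) -> list:
    """Return the concepts whose scores fall below the required level,
    i.e. the recommendations for additional training."""
    return sorted(c for c, score in results.items() if score < REQUIRED_LEVEL)

# Hypothetical end-of-training scores for one student.
scores = {"recruitment": 0.9, "staff_scheduling": 0.55, "budgeting": 0.6}
print(training_feedback(scores))  # ['budgeting', 'staff_scheduling']
```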
4. The architecture of the proposed HHRM system
The architecture of the proposed Web-based intelligent educational system is a three-tier
client-server architecture. The client side handles only the user interface and the connection
to the server. The personalized e-learning processes combine the student model with the
domain model in order to deliver suitable course content to the student. The system adapts
the course content according to the student's test evaluations.
The logical architecture integrates: the Web server, which provides the user interface and
access to the database; the application programs, through which the main functions of the
system are performed by providing the capabilities necessary for the training platform; and
the database server, which hosts the system database (educational content, student profiles
and portfolios) and ensures the storage and retrieval of the data necessary to run a course.
The technical architecture of the system is composed of two major blocks: the functional
block, which provides the functionality, and the data block, which contains the system
repositories. The functional block contains the client (represented by the Web browser) and
the server (the Web server along with other specific components). The data block is
composed of the following repositories: users (students), metadata, and content. The system
components are described below, and a graphical representation may be seen in Figure 3:
A. The User Interface is a client-side component by which the user requests a URL and
selects the training option.
B. The Semantic Web Component runs on the server side and serves as support for two
other important components, D and F. It is custom-developed around a Semantic Web
engine. This component is able to process RDF-formatted documents and information
based on the system's proposed training model, performing optimal connections between
specific resources, tailored to the student.
C. The Front-end Web Server is the first server-side component of the proposed system that
interacts with the clients. It receives requests from the client through the HTTP protocol,
forwards them for internal processing, and returns the response to the client as a Web page
to be displayed in the browser.
D. The Security Component is a server-side component and is designed to ensure optimal
and safe use of the system. It is responsible for authentication of participants, based on
user name and password, and for authorization and access control to resources, depending
on the user's role and context.
Figure 3 - The System Technical Architecture Diagram
E. The Content Generation Component runs at the server level and is responsible for
processing and generating optimally personalised educational content for the participant,
using the system's student model. It works with components D and F to provide a high
degree of course personalization for the target group.
F. The Learning Unit Allocation Component also works at the server side. Together with the
Semantic Web component (B), it analyses the specific context of each member of the target
group and, using an internal algorithm based on the data model and other attributes of the
student collected from login onward, it determines the best personalized training path for
the student to follow in order to maximize the knowledge assimilation level.
G. The Storage Component identifies the database server that stores the persistent
information of the system. It retains the student-specific information, both profile
information and information relating to the student's activities and progress.
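The paper does not specify the allocation algorithm used by component F; the following is a minimal, hypothetical sketch of the prerequisite-driven idea behind personalized path selection. The unit names, the prerequisite graph and the student-model structure are illustrative assumptions, not the actual data model of the proposed system.

```python
# Illustrative sketch of one learning-unit allocation step (component F).
# Unit names, prerequisites and the student model are hypothetical examples.

def next_units(units, known):
    """Return the units not yet mastered whose prerequisites are all known."""
    return [name for name, prereqs in units.items()
            if name not in known and prereqs <= known]

# Hypothetical domain model: unit -> set of prerequisite units
units = {
    "hr_basics": set(),
    "workforce_planning": {"hr_basics"},
    "health_policy": {"hr_basics"},
    "hhrm_strategy": {"workforce_planning", "health_policy"},
}

# Hypothetical student model: concepts the test evaluations show as mastered
known = {"hr_basics", "health_policy"}

print(next_units(units, known))  # eligible next step(s) on the training path
```

In this sketch the training path simply advances to whichever units become reachable as the student model grows; the real component would also weigh the student's profile and preferences.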
5. Conclusions
The purpose of the project is to implement a new approach to training health system
managers based on modern e-learning technologies. In this paper we have presented the
overall e-learning system for training in the HHRM field, including its functionalities. These
include the building of a personalized learning path for each student according to his/her
profile and preferences. This is possible because the system contains a representation of
both the student's knowledge and learning preferences and the knowledge domain. The
proposed system is provided with an abstract representation of the specific domain (an
ontology) and with descriptions of the learning objects. The system contains a hierarchical
content structure and semantic relationships between concepts, which is important for
searching and sequencing learning resources in Web-based e-learning systems. The
proposed system architecture is able to support training activities in an interactive way,
based on the semantic Web, through its components.
PROJECT MANAGEMENT COMPETENCY ASSESSMENT FOR IT
PROFESSIONALS: AN ANALYTIC HIERARCHY PROCESS
APPROACH
Elena-Alexandra TOADER The Bucharest University of Economic Studies, Romania
Abstract. The paper proposes the use of the Analytic Hierarchy Process (AHP), applied to
the competency model defined by the authors in previous studies, in the case of two IT
projects with different characteristics in which employees operate. First, the PM competency
model was decomposed into a hierarchical structure. AHP, through pairwise comparisons by
five technical experts, was used to determine weight values for each competency category
and each competency element within the two projects. The prioritization of the competences
was made for the expertise performance level. Using the Wilcoxon signed-rank statistical
test on the AHP results, we can determine whether there is a significant difference between
the competency elements of the two projects at the expertise performance level. The AHP
results will be implemented in a competency assessment tool, and an activity scenario was
defined to help integrate the AHP method into it.
Keywords: Competency Register, IT Competency Assessment, Analytic Hierarchy Process.
JEL classification: C44, M51
1. Introduction
The competitive advantage of an IT organization is driven by its growing interest in
developing the project management competencies of the technical professionals working
within it [1]. Many software firms have implemented assessment tools in order to evaluate
the performance of their professionals, and it is important that such a competency
assessment tool be objective and transparent in order to obtain valid results.
Placing the right people in the right projects can lead to project success; therefore,
monitoring the performance of employees has become one of the most important goals in
software organizations. Due to the importance of this process of assigning the right people
to the right projects, multi-criteria decision-making techniques have become more popular.
Deciding the weights of each competency is not an easy task, since there are many aspects
that should be taken into consideration. We propose an evaluation method based on a
multi-criteria decision method (MCDM) named the Analytic Hierarchy Process (AHP). The
AHP is a decision support tool that can assess, construct, generate, prioritize or measure
competences demanded by various professions [2].
The AHP method has been applied in different contexts, such as engineering problems [3],
e-banking security [4], layout design [5] and the evaluation of technology investment
decisions [6].
2. Analytic Hierarchy Process (AHP)
The Analytic Hierarchy Process (AHP) is a multi-criteria decision method (MCDM)
introduced by Saaty [7]. The AHP method integrates qualitative information and
quantitative values into the same decision-making methodology by evaluating the elements
of the decision.
The AHP was developed to structure a decision process influenced by several independent
factors. The method relies on a hierarchical structure that captures the dependencies and
interactions between the elements. The hierarchy tree runs from the top (the main goal of
the decision-maker) down through criteria and sub-criteria, and can have as many
sub-criteria as needed [7].
The AHP prioritizes the elements by generating a weight for each evaluation criterion
through pairwise comparisons. AHP then assigns a score to each option; a higher score
indicates better performance of the option with respect to the criterion considered. Finally,
AHP combines the criteria weights with the option scores to determine a global score for
each option: the sum of its scores weighted with respect to all the criteria [7].
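As a toy illustration of this weighted aggregation (the criteria weights, option names and scores below are made-up numbers, not results from the paper):

```python
# Toy example of AHP global-score aggregation:
# global score = sum over criteria of (criterion weight * option score).
# All numbers and option names are illustrative.

criteria_weights = [0.615, 0.210, 0.175]      # one weight per criterion, sum ~ 1

option_scores = {                              # per option: one score per criterion
    "candidate_a": [0.7, 0.5, 0.3],
    "candidate_b": [0.4, 0.8, 0.9],
}

global_scores = {
    name: sum(w * s for w, s in zip(criteria_weights, scores))
    for name, scores in option_scores.items()
}

best = max(global_scores, key=global_scores.get)
print(global_scores, best)
```

Here candidate_a wins despite lower scores on the last two criteria, because the first criterion dominates the weight vector.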
The steps of the AHP method, as described in [8], are the following: define the problem and
create the hierarchy containing the elements and the evaluation criteria; construct the set of
pairwise comparisons of the elements and calculate the priorities; verify the results and the
judgements; synthesize the judgements to obtain the local weights and check the consistency
of the comparisons; select the best variant by aggregating the weights across the levels to
obtain the final weights of the alternatives.
The AHP method helps decision-makers select a solution from a set of alternatives and
provides the possibility to compare the alternatives. In the competency assessment domain,
the method has been used in several studies [9], [10], [11], in project manager selection
[12], [13], as well as in the personnel recruitment process [14].
3. Methodology
3.1 Research Method
The aim of the current research was to estimate the weights of each competency category
and each competency element within two IT projects with different characteristics. Applying
the method will help find the optimal solution for determining and ranking the competency
elements and will test the hypothesis that there is a significant difference between the
competency elements of the two IT projects. The weights are calculated for the expertise
performance level.
Based on the competency ontology defined by [16], we can define the competency elements
within each competency category as a set of knowledge and abilities that an IT professional
must possess in order to achieve a specific performance level. A performance level is
determined through the performance indicators defined in the ontology [16]: awareness,
familiarity, mastery, expertise. The competency model defined by [15] contains 15 PM
competencies grouped into three categories: methodical, personal-social and
strategic-organizational.
The methodical category contains the following competency elements: knowledge of applied
PM methods; technical analysis of information; the automation and optimization of work
processes; the evaluation, review and quality assurance of work; and the implementation of
the maintenance technique. The personal-social category contains: teamwork, creativity,
vigilance, efficiency, motivation, ethics and stress resistance. The strategic-organizational
category contains: permanent organization; health, security, safety and environment; and
respect of work methods and procedures.
In order to illustrate the implementation of the AHP method, a complete hierarchical
structure was first built based on the competency model defined by [15]. This structure
established the model used to evaluate and compare the competency categories and the
competency elements within the competency model.
The suggested methodology was tested on a small group of five IT professionals working in
Romanian software organizations on two IT projects with different characteristics. Even
though the number of IT professionals is small, and the number of projects as well, this test
shows the first results of using the AHP algorithm and whether there is a significant
difference between the competency elements of the two IT projects. The IT professionals
filled in a questionnaire covering each competency category and each competency item of
the competency model, for each IT project. Their responses have been synthesized using the
AHP method.
For each decision, the decision-maker (in our case, the IT professional) has to indicate the
preference or priority of each competency category or competency element in terms of how
it contributes to each criterion. The scale used is the one proposed by [7], with the
comparison values: 1 - equally important, 3 - weakly more important, 5 - strongly more
important, 7 - very strongly more important, 9 - absolutely more important.
After the preferences for the competency categories and competency elements are
expressed, a pairwise comparison matrix is constructed. The pairwise matrices must then be
synthesized: each element of the matrix is divided by its column total, and the weights are
obtained by calculating the row averages. The next step is verifying the consistency of the
pairwise comparison matrix through its consistency ratio (CR). The CR is acceptable if it
does not exceed 0.10; if the CR is greater than 0.10, the judgement matrix is inconsistent
and should be improved [7]. The analysis of the AHP results for the two projects was
conducted using SPSS ver. 19 and the Wilcoxon signed-rank test, which shows whether
there is a significant difference between the competency elements of the two projects.
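The synthesis and consistency check just described can be sketched in a few lines of plain Python. The judgement matrix below is the category-level matrix reported for the first project (Table 1), and RI = 0.58 is Saaty's random index for a 3x3 matrix. Note that the exact CR value depends on how the principal eigenvalue is estimated, so this sketch illustrates the method rather than reproducing the paper's reported CR of 0.099.

```python
# Sketch of AHP weight synthesis and consistency check (pure Python).
# Judgement matrix from Table 1 (category level, first project).
A = [[1.00, 3.50, 3.00],
     [0.29, 1.00, 1.40],
     [0.33, 0.71, 1.00]]
n = len(A)

# Synthesize: divide each element by its column total, then average the rows.
col_sums = [sum(row[j] for row in A) for j in range(n)]
weights = [sum(A[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]

# Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1),
# lambda_max estimated from A.w elementwise divided by w.
Aw = [sum(A[i][j] * weights[j] for j in range(n)) for i in range(n)]
lambda_max = sum(Aw[i] / weights[i] for i in range(n)) / n
CI = (lambda_max - n) / (n - 1)
RI = 0.58                      # Saaty's random index for n = 3
CR = CI / RI

print([round(w, 3) for w in weights], round(CR, 3))
```

Running this reproduces the category weights of roughly 0.615, 0.21 and 0.175 reported for the first project, with a CR comfortably below the 0.10 acceptance threshold.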
3.2 Analysis and Discussion
The AHP hierarchical structure for the first IT project analyzed was built using Web-HIPRE
[17], a free web-based applet (available at http://hipre.aalto.fi). Figure 1 shows the
hierarchical structure based on the competency model defined by [15].
Table 1 shows the pairwise comparisons between the competency categories for the first
project. The consistency ratio (CR) of the pairwise comparison matrix is 0.099, lower than
0.10, which means that the judgement matrix is valid.
Table 2 shows the standardized matrix and the weight of each competency category for the
first project. As we can observe, at the expert level of performance the methodical
competencies have a weight of 61.5%, the personal-social competencies 21% and the
strategic-organizational competencies 17.5%.
Table 1 - The pairwise comparison matrix

Competency category   C1     C2     C3
C1                    1      3.5    3
C2                    0.29   1      1.4
C3                    0.33   0.71   1
Sum                   1.62   5.21   5.4

Table 2 - Standardized matrix

Competency category   C1     C2     C3     Weight
C1                    0.62   0.67   0.56   61.5%
C2                    0.18   0.19   0.26   21%
C3                    0.20   0.14   0.19   17.5%
Figure 1 - Structuring AHP Model in Web-HIPRE (Source: http://hipre.aalto.fi)
Similarly, we can compute the weights for each competency element of the competency
model defined by [15]. In the methodical category, priority goes to knowledge of applied PM
methods (0.34), followed by automating and optimizing work processes (0.27). In the
personal-social category, priority goes to motivation (0.20), followed by creativity (0.17). In
the strategic-organizational category, priority goes to respect of work methods and
procedures (0.49), followed by health, security, safety and environment (0.30). The results
for the first project are shown in Table 3.
Table 3 - The competency element weights

Competency element                                     Weight
C11. Knowledge of applied PM methods                   0.34
C12. Technical analysis of information                 0.18
C13. Automating and optimizing work processes          0.27
C14. Evaluation, review and quality assurance of work  0.10
C15. Implementation of the maintenance technique       0.11
C21. Teamwork                                          0.15
C22. Creativity                                        0.17
C23. Vigilance                                         0.16
C24. Efficiency                                        0.12
C25. Motivation                                        0.20
C26. Ethics                                            0.12
C27. Stress resistance                                 0.08
C31. Permanent organization                            0.21
C32. Health, security, safety and environment          0.30
C33. Respect of work methods and procedures            0.49
The results of the AHP method for the two IT projects have been analyzed with the
Wilcoxon Signed Ranks test. The results are shown in Table 4.

Table 4 - Wilcoxon Signed Ranks results (SPSS computation)

Test Statistics (b)       v2 - v1
Z                         -0.755 (a)
Asymp. Sig. (2-tailed)    0.450
a. Based on negative ranks.  b. Wilcoxon Signed Ranks Test
The Wilcoxon signed-rank test shows that there is no significant difference between the
competency elements of the two projects at the expertise performance level (Z = -0.755,
p = 0.450).
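The paper runs the test in SPSS; purely as an illustration, the normal-approximation z statistic behind the Wilcoxon signed-rank test can also be computed directly. The two weight vectors below are hypothetical stand-ins, not the study's actual AHP results.

```python
import math

def wilcoxon_z(x, y):
    """Normal-approximation z statistic of the Wilcoxon signed-rank test."""
    # Paired differences, rounded to tame float noise; zero differences dropped.
    diffs = [d for d in (round(b - a, 6) for a, b in zip(x, y)) if d != 0]
    n = len(diffs)
    # Rank |d| ascending; tied values share the average of their rank positions.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j + 2) / 2.0   # ranks are 1-based
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (w_plus - mean) / sd

# Hypothetical competency-element weights for the two projects.
v1 = [0.34, 0.18, 0.27, 0.10, 0.11, 0.15]
v2 = [0.36, 0.17, 0.30, 0.08, 0.12, 0.12]
z = wilcoxon_z(v1, v2)
print(round(z, 3))  # |z| < 1.96 -> no significant difference at the 5% level
```

With these balanced stand-in vectors the positive and negative rank sums cancel, mirroring the paper's "no significant difference" conclusion.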
4. Integration of the proposed algorithm in the assessment tool for evaluating IT
professionals
The above algorithm will be included in an online assessment tool that will evaluate the
competencies of IT professionals within Romanian software organizations. First, the
employee registers a new account by filling out a form. After logging into the application, he
can start a self-assessment process by answering the questions related to the PM
competency model defined by [15]. The responses to the questions are then saved in the
database, and the evaluation algorithm computes the score for each AHP competency
element and for the corresponding AHP competency category. A final score and feedback
are given at the end of the assessment. The scenario is shown in the activity diagram in
Figure 2.
Figure 2 - Employee Assessment – activity scenario
5. Conclusion
The aim of the study was to estimate the weights of each competency category and each
competency element for two IT projects with different characteristics. Using an evaluation
method based on a multi-criteria decision method (MCDM), the Analytic Hierarchy Process
(AHP), we can find the optimal solution for determining and ranking the competency
elements. In order to illustrate the implementation of the AHP method, a complete
hierarchical structure was first built. Then, through pairwise comparisons by five technical
experts, the weight values were determined for each competency category and each
competency element within the PM competency model. The Wilcoxon signed-rank test
shows that there is no significant difference between the competency elements of the two
projects at the expertise performance level.
We also outlined the integration of the AHP method into the competency assessment tool
developed to help determine and improve the performance level of IT professionals. A
limitation of our study is that the assessment tool has not yet been implemented in a
Romanian IT organization, so no relevant results are available. Another limitation is that the
employer module has not been developed yet.
Future research should be directed towards developing open questions for evaluating the
assessment items, developing the employer module and implementing the assessment tool
in a Romanian software organization.
Acknowledgment
Toader Elena Alexandra was co-financed from the European Social Fund, through the
Sectorial Operational Programme Human Resources Development 2007-2013, project
number POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research,
doctoral and postdoctoral, in the economic, social and medical fields -EXCELIS",
coordinator The Bucharest University of Economic Studies.
References
[1] Fisher, E. 2010. “What Practitioners consider to be the Skills and Behaviours of an
Effective People Project Manager” International Journal of Project Management, JPMA-
01272, 9p.
[2] K. Hafeez and E.A. Essmail, "Evaluating organization core competences and associated
personal competencies using analytical hierarchy process", Management Research News,
Vol. 30, No. 8, 2007, pp. 530-547.
[3] Wang, L., and Raz, T. (1991). Analytic Hierarchy Process Based on Data Flow Problem.
Computers & IE, 20:355-365.
[4] Syamsuddin I., Hwang J. (2009). The Application of AHP Model to Guide Decision
Makers: A Case Study of E-Banking Security, Third Asia International Conference on
Modelling & Simulation, 25-29 May 2009, Bandung, Bali, Indonesia;
[5] Cambron, K.E. and Evans, G.W., (1991). Layout Design Using the Analytic Hierarchy
Process. Computers & IE, 20: 221-229.
[6] Boucher, T.O. and McStravic, E.L. (1991). Multi-attribute Evaluation within a Present
Value Framework and its Relation to the Analytic Hierarchy Process. The Engineering
Economist, 37: 55-71.
[7] Saaty, T.L. (1980). The Analytic Hierarchy Process. McGraw-Hill International, New
York, NY, U.S.A.
[8] Saaty, T. L. (1999) The seven pillars of the analytic hierarchy process. Proceedings of the
ISAHP Conference (AHPIC 1999), Kobe, pp. 20-33.
[9] João Varajão, Maria Manuela Cruz-Cunha. (2013) Using AHP and the IPMA
Competence Baseline in the project managers’ selection process, International Journal of
Production Research, vol 51, pp. 3342-3354
[10] Brožová, H. (2011). Weighting of Students´ Preferences of Teacher´s competencies.
Journal on Efficiency and Responsibility in Education and Science, 4(4), 170-177, ISSN
1803-1617.
[11] Jan Bartoška, Martin Flégl, Martina Jarkovská (2012) IPMA Standard Competence
Scope in Project Management Education, International Education Studies; Vol. 5, No. 6;
[12] Kelemenis, A., K. Ergazakis, and D. Askounis. 2011. “Support Managers’ Selection
Using an Extension of Fuzzy Topsis.” Expert Systems with Applications 38: 2774–2782.
[13] Zavadskas, E. K., P. Vainiunas, Z. Turskis, and J. Tamosaitiene. 2012. “Multiple Criteria
Decision Support System for Assessment of Project Managers in Construction.”
International Journal of Information Technology & Decision Making 11 (2): 501–520.
[14] Hsiao, W.-H., T.-S. Chang, M.-S. Huang, and Y.-C. Chen. 2011. “Selection Criteria of
Recruitment for Information Systems Employees: Using the Analytic Hierarchy Process
(AHP) Method.” African Journal of Business Management 5: 6201–6209.
[15] Bodea C. N., E-A. Toader, “Development of the PM competency model for IT
professionals, base for HR management in software organizations”, 12th International
Conference on Informatics in Economy (IE 2013), Education, Research and business
Technologies, Bucharest, April 2013
[16] Bodea C. N., E-A. Toader, “Ontology-based modeling of the professional competencies
- a comparative analysis”, 11th International Conference on Informatics in Economy (IE
2012), Education, Research and business Technologies, Bucharest, May 2012, pp. 452-
458
[17] J. Mustajoki, and R.P. Hämäläinen, “Web-HIPRE: Global decision support by value tree
and AHP analysis”, INFOR, vol. 38, no. 3, 2000, pp. 208-220
BUSINESS SUCCESS BY IMPROVING THE INNOVATION
MANAGEMENT
Adela Mariana VADASTREANU
National Institute of Research and Development for Isotopic and Molecular
Technologies and Technical University of Cluj-Napoca
Dorin MAIER
The Bucharest University of Economic Studies
Andreea MAIER
Technical University of Cluj Napoca
Abstract. The more and more demanding economic context raises awareness of the
importance of innovation for organizations. This paper highlights the importance of
innovation for an organization and also gives some practical solutions for improving the
overall success of the business through a better approach to innovation. The research in this
paper is presented in such a manner that it can be read and applied by everyone, regardless
of his/her knowledge of innovation management. After a short introduction, where we
briefly define innovation, we study the literature and establish the main determinants of
innovation. We propose an integrated model of innovation management and, in the end, a
solution for improving innovation management by using an innovation quadrant. By using
this solution, every organization, regardless of its specific activity, may identify a direction
for improving its innovation capacity. In the conclusions we offer some advice, based on our
studies, for organizations and also for employees, on how to be more innovative and thus
more successful.
Keywords: Innovation, innovation management, business success
JEL classification: O32, M21
1. Introduction
Innovation was and still is a subject of high interest in the economic literature [1], [2], [3].
But it is not only theoreticians who study it; practitioners and researchers also deal with
innovation, mainly because of its relevance for the success and survival of firms. Innovation
has been considered the elixir of life for companies, regardless of their size and profile [4],
[5], [6], [7], [8], [9]. Innovation is a dominant factor in maintaining global competitiveness
[10], [11].
At the same time, innovation is not easy. Innovation efforts over the years have produced a
wealth of failed innovation projects. Even large companies that were once pioneers and
creators of entire markets have failed to remain competitive when major changes, especially
technological ones, occurred [12], [13].
Organizations are so accustomed to what they do (their basic skills) that they get stuck, and
when the environment changes (e.g. changing customer requirements, changing regulations)
they are unable to adapt easily and quickly [14], [15].
Innovating today makes it easier to innovate tomorrow [16], [17]. The attention of many top
executives is largely focused on the urgent problems of daily management; innovation is
often pushed aside: although it is considered an important issue, it is not as high a priority
as other, more pressing matters. When a management team becomes aware of the need for
innovation, it identifies a number of challenges that it has to deal with.
2. The determinant factors of innovation
Innovation is regarded as a key element in achieving sustainable competitive advantages for
a company's success. Based on a literature review, we have identified the main determinants
of innovation [18], [19], [20]:
common vision, culture and values in terms of innovation;
cooperation strategies for innovation;
individual and organizational commitment to innovation;
appropriate structures and systems to suit the innovation process;
human and financial resources;
information management and exchange between partners;
customer focus and feedback;
collaboration and creativity in teamwork;
autonomy and accountability;
visionary leadership to promote strategic alliances;
continuous improvement to achieve excellence through innovation;
education and training.
Analyzing the determinants of innovation, we discover that the true DNA of innovation is
not found in 'freewheeling', chaotic organizations that adopt an "out of the box" culture
welcoming creative types who come up with brilliant ideas. This approach might work very
well, but it is not the approach favoured by innovation experts. On the contrary, successful
innovators create a very clear and concise agenda for achieving innovation and build a solid
structural framework with strong values that motivates all employees in the organization to
contribute, generating and executing a variety of ideas within an aligned strategy. In other
words, innovation excellence is achieved by closely managing the innovation process while
at the same time encouraging creativity [11].
Developing a framework for measuring innovation gives companies a valuable opportunity
to assess their degree of innovation and to discover gaps in their knowledge. The quality
expert W. Edwards Deming captured the first challenge in a quote: "If we do not measure,
we cannot improve". However, there is no unique "model" of innovation for today's
businesses [11].
3. An integrated model of innovation management
The success of an innovation management system lies in its implementation model. A
model is needed in every area in order to progress, because it offers a set of guiding
principles. Most organizations adopt a methodology for innovation without having a solid
model; this can only lead to accidental innovation.
In order to be effective, a model of innovation management should be useful for all kinds of
organizations, whether large or small, public or private, and should necessarily address a
wide range of problems. This framework must manage a rigorous innovation process,
requiring specific tools, and above all it must remain simple and accessible. The author
proposes an integrated model [11], illustrated in Figure 1, which meets these requirements
and provides a basis for the future development of an innovation management system.
Figure 1 - An integrated innovation management model [11]
The integrated model supports innovation process management by proposing a number of
steps that an organization must take in order to succeed in its drive for innovation, taking
into account all the important aspects of a business system.
In order to achieve high performance in the innovation field, we must first divide the
complex innovation domain existing in an organization into smaller parts. We divided
innovation into eight parts, interconnected in a specific way: vision and policy innovation,
strategic innovation, innovation in network development, human resources development for
innovation, process innovation, product innovation, marketing innovation and
administrative innovation.
The integrated model of innovation can easily be integrated into an organization's existing
management system. An efficient and sustainable innovation model will reduce the
frustrations, risks, cost overruns and failures associated with innovation activities.
4. Proposals to improve innovation management for organizations
Although the process of innovation is one of the most important factors behind the growth
and prosperity of today's global economy, it is also poorly understood. Over the last century,
industry leaders have learned to master the production process to such an extent that it no
longer functions as a significant competitive advantage. The new challenge is to master the
innovation process.
Innovation management is an important process that few companies have mastered well.
The main reason is that the innovation process is not fully understood or implemented by
organizations, owing to the lack of maturity of the innovation culture. Although 85% of
companies in Romania agree that innovation is of strategic importance, only 51% have
defined an innovation strategy, and only 16% are successful in managing the innovation
process, as measured by financial performance (The White Book of SMEs from Romania,
2012).
Although the importance of innovation is evident, the interesting question is how to achieve
excellence in managing innovation. As the literature review shows, there are no easy
answers to this question, because there are no shortcuts to excellent results. Successful
companies do not have a silver innovation bullet - they do not achieve results by doing one
or a few things better than others; they do everything better.
Based on the results of the interviews, companies can fall into one of four areas relating to
their degree of innovation. These areas provide the basis for drawing conclusions from the
interviews conducted and for proposing improved forms of innovation.
The proposed framework for measuring the degree of innovation can place the organization
in one of four quadrants (Figure 2).
Figure 2 - Innovation quadrants [11]
The set of requirements and indicators is configured to achieve a high level of innovation. If
there is a big difference between requirements and indicators, we propose solutions to
improve the innovation process.
The diagram positions each company in one quadrant according to the degree of innovation
achieved:
In zone 1 we find companies that do not innovate and do not consider innovation
important to their success, with minimal investment and reduced competition /
monopoly (small towns where new businesses are started). The proposed strategy for
zone 1: the best solution is to assign importance to the innovation process.
In zone 2 we find weakly innovative companies; the most frequent case is that of small
and medium-sized firms with non-innovative products and low investment (retail). The
proposed strategy for zone 2: innovate more in order to move into zones 3 and 4.
In zone 3, activities may involve a sector that creates new products or processes and
requires the search for interesting links and synthesis between adjacent sectors. Here,
innovation is a priority. The most feasible strategy for zone 3 is to increase innovation
within the same quadrant.
Zone 4 is the desirable one, the area indicating strongly innovative firms. Companies in
this quadrant consider innovation the most important factor for organizational
performance. The proposed strategy for zone 4: consolidate within the same quadrant,
keeping competitors out of it.
Innovation performance is clearly linked to the effectiveness with which firms carry out the
main innovation activities. We accept that, despite considerable progress in recent years, the
instruments used to measure innovation still provide an incomplete evaluation of firms'
innovative activities.
5. Conclusions
The more intense the competition, the more innovative companies are, driven by the
continuous pressure to come up with something new in order to remain on the market and
overcome their competitors.
Based on our research we have formulated some advice for companies that want to be more
innovative. The first aspect is related to management support: not only approving the new
innovative way of thinking, but also more active support such as motivating employees or
providing training and development programs. Another aspect is creating an environment in
which every employee can freely express his or her ideas. Employees should be rewarded
accordingly, and here we do not mean just a monetary reward but also an emotional one,
such as public recognition of their merit within the company. The communication networks
within the company must be very effective, so that if an employee has a new and innovative
idea regarding a product, service or process, it can be heard and properly capitalized on.
Resources must be allocated intelligently. The organizational culture should allow the
development of innovation and thus be one that is willing to take risks and to embrace
constructive failure in order to obtain extraordinary results.
So far we have addressed this advice to the company, but every employee can also improve
their innovative thinking by being more creative, flexible and communicative; they must
have confidence in their abilities; they must be brave enough to express and support
innovative ideas at work; they must be ready and willing to make responsible decisions at
work, in other words to be willing to take a moderate risk.
Acknowledgements
This paper benefited from financial support through the National Institute of Research and
Development for Isotopic and Molecular Technologies, Cluj-Napoca, Romania.
References
[1] Milbergs, E., V. (2007), “Innovation Metrics: Measurement to Insight. IBM Corporation”,
Center for Accelerating Innovation
[2] Brad, S. (2008b), „Vectors of Innovation to Support Quality Initiatives in the Framework
of ISO 9001:2000”, Int. Journal of Quality & Reliability Management, Vol. 25, Nr. 7, pp.
674-693
[3] Leavengood, S. (2011), „Identifying Best Quality Management Practices for Achieving
Quality and Innovation Performance in the Forest Products Industry”, Portland State
University
[4] Gavris, O. (2009), „Management of infrastructure rehabilitation works using multicriteria
analysis”, 6th International Conference on the Management of Technological Changes
Location: Alexandroupolis, GREECE Date: SEP 03-05, 2009
[5] Varis, M. (2010), „Types of innovation, sources of information and performance in
entrepreneurial SMEs”, European Journal of Innovation Management, Vol. 13, Nr. 2, pp.
128-154
[6] Anastasiu, L. (2009), „How the changing of technology can become a motivating factor in
human resources management”, 6th International Conference on the Management of
Technological Changes Location: Alexandroupolis, GREECE Date: SEP 03-05, 2009
[7] Ciplea, S, Ciplea, C., Anastasiu, L., Popa, A.,(2010), „Costs optimization methods with
applications in students trening”, 6th International Seminar on the Quality Management in
Higher Education Location: Tulcea, ROMANIA Date: JUL 08-09, 2010
[8] Xu, J., Houssin, R., Caillaud, E., Gardoni, M. (2010), „Macro process of knowledge
management for continuous innovation”, Journal of knowledge management, Vol. 14, pp.
573 – 591
[9] Lin, R. (2010), „Customer relationship management and innovation capability: an
empirical study”, Industrial Management & Data Systems, Vol. 1, pp. 111-133
[10] Maier, A. (2013), „Cercetări şi contribuţii la dezvoltatea modelelor de management al
inovării”,(teză de doctorat), Universitatea Tehnică, Cluj- Napoca, România
[11] Prahalad, C. (1994), „Competing for the Future”, Ed. H. B. Press, Massachusetts
[12] Vachhrajani, H. (2008), „A symbiosis of Quality and Innovation : Creating an integrated
model for SMEs, Quality- Striving for Excellence”, Vol. 5, Nr. 6
[13] Tushman, M., Nadler, D. (1986), „Organising for Innovation, California Management
Review”, Vol. 28, Nr. 4, pp. 74-92, Spring
[14] Anthony, S., Christensen, C. (2005), „Innovation Handbook: A Road Map to Disruptive
Growth, The Road to Disruption”, Harvard Business School Publishing
[15] Leavy, B. (2010), „Design thinking – a new mental model of value innovation”, Strategy
&Leadership, Vol. 38, pp. 5 – 14
[16] Zhao, F. (2000), „Managing Innovation and Quality of Collaborative R&D”, The Centre
for Management Quality Research, RMIT University
[17] Garcia, R., Calantone, R. (2002), „A Critical Look at Technological Innovation
Typology and Innovativeness Terminology: A Literature Review”, Journal of Product
Innovation Management, Vol. 19, Nr. 2, pp. 110-32
[18] Scotchmer, S. (2004), „Innovation and Incentives”, Cambridge, MA: MIT Press
[19] Prügl, R., Franke, N. (2005), „Factors impacting the success of toolkits for user
innovation and design”, Working paper, Vienna University of Economics
[20] Valencia, J. (2010), „Organizational culture as determinant of product innovation”,
European Journal of Innovation Management, Vol. 13, Nr. 4, pp. 466-480
RECOMMENDER SYSTEMS, A USEFUL TOOL FOR VIRTUAL
MARKETPLACE; CASE STUDY
Loredana MOCEAN
Babes - Bolyai University of Cluj – Napoca
Miranda VLAD
„Dimitrie Cantemir” Christian University Bucharest,
The Faculty of Economic Sciences Cluj-Napoca, miranda.vlad@cantemircluj.ro
Mihai AVORNICULUI
Babes - Bolyai University of Cluj – Napoca
Abstract. A consequence of the changes in the demographic and cultural environment is the
changing attitude and lifestyle of consumers. Consumers are more mature, more refined,
have more discernment, and are more cosmopolitan, more individualistic and more
concerned about health and environmental problems.
The aim of this article is to present a case study of recommender systems through their most
important aspects. We also present several types of algorithms used and implemented in
special cases, as well as problems that have to be considered over time.
Keywords: Recommender Systems, Collaborative Marketplace, Algorithms
JEL classifications: C63, C82
1. Introduction
Recommender systems have become an important research domain in recent years; they
compare a user profile to several reference characteristics and predict the rating or
preference that the user would give to an item not yet considered.
In order to make this possible we must answer the following questions:
What kind of products do we buy?
What kind of products are frequently bought by other users?
Which are purchased most frequently?
What is the frequency of purchased products from a certain category?
Recommender systems are applications that offer the user objects and notions from his circle
of interest (movies, music, news, images, persons). Recommendations are made based on the
system's knowledge about the user's profile (explicit or implicit).
The aim of a recommender system is to generate useful recommendations for a group of
users. A critical component is the generation of recommendations from reliable sources for
human decisions.
2. Classification
From the point of view of the algorithms used, recommender systems can be of two types, as
presented in [1]:
recommender systems based on the content of the article. These algorithms focus on the
content of the article and offer as recommendations articles similar to the current one,
initially based on special metrics such as tags;
recommender systems based on collaborative filtering. These systems use the entities
from the application (users, articles, user preferences, actions) and recommend articles
based on user preferences. User preferences can be studied based on their explicit actions
(voting) and implicit behavior.
Collaborative filtering methods can be further classified as follows:
methods based on users: the preferences of similar users are used to recommend further
articles;
methods based on articles: similar articles are found by quantifying several social
aspects, for example whether users who preferred a given article also voted positively
for other articles.
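As an illustration of the user-based variant, the similarity between two users can be computed over the articles they both rated. The sketch below uses cosine similarity restricted to common ratings; the ratings dictionary and user names are illustrative assumptions, not data from the paper.

```python
import math

# Explicit ratings per user (illustrative sample): user -> {article: vote}
ratings = {
    "alice": {"a1": 5, "a2": 3, "a3": 4},
    "bob":   {"a1": 4, "a2": 2, "a4": 5},
    "carol": {"a2": 5, "a3": 1},
}

def cosine(u, v):
    """Cosine similarity computed over the articles both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = math.sqrt(sum(u[i] ** 2 for i in common)) * \
          math.sqrt(sum(v[i] ** 2 for i in common))
    return num / den

# Alice and Bob rate their two common articles very similarly:
print(round(cosine(ratings["alice"], ratings["bob"]), 3))  # 0.997
```

The users most similar to the target user would then supply candidate articles, weighted by this similarity score.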
3. Characteristics
In the following, the important characteristics of recommender systems are described ([1]
and [2]):
recommendation transparency is a characteristic through which the system offers a
motivation for the recommended article;
exploration vs. exploitation.
Assume the following situation: there are two articles that can be recommended to a group
of users, and the user activity for an initial article can be recorded. The problem is the
decision making: should the two articles be recommended and, if yes, when exactly is the
best moment to maximize the effect of the recommendation on the user?
the problem of offering a user-friendly interface can be solved by guiding the user's
navigation; this applies when the content within the application is very large;
correct timing evaluation;
a recommender system considers the changes that the ageing of an article can bring to
the relevance of the information;
the scalability of the system is very important in the context of big data;
diversity is important, since the recommendation of very similar elements minimizes the
effect on the user [3].
In this domain, the following classic problems of recommender systems appear:
the cold start problem: appears when a user is new, there is no information about them
and their preferences cannot be deduced;
the first-rater problem: refers to new articles for which no references were made and no
preferences were shared;
the problem of manipulation, for example when a group of users gives negative comments
on a competing article;
confidentiality: appears when a user has to evaluate given recommendations; the users'
preferences should not be published.
There are a few important approaches regarding recommender systems.
a) Amazon.com virtual store
Amazon uses recommender algorithms that personalize each user's online shopping. The
algorithm named „Item-to-Item Collaborative Filtering” [4] offers real-time
recommendations, is scalable to big data and obtains good recommendations. The algorithm
makes a connection between the purchased item and the items noted by other similar users,
then combines the similar items into a list of recommendations. In order to define the best
match for a given item, the algorithm builds a table of similar items by finding the ones that
consumers tend to buy together. A product-product matrix can be built through the iteration
of all pairs and the calculation of a similarity metric for each pair. Despite this, many of the
product pairs do not have common clients, so this approach is inefficient in terms of
processing time and memory used. The following iterative algorithm offers a better approach
by calculating the similarity between a given product and all the products that have a
connection with it [1].
Products = {Ii | Ii is a product}, Clients = {Ci | Ci is a client}.
Auxiliary functions:
BuyersFor: Products → P(Clients);
BuyersFor(Ii) = {Ci1, ..., Cij, ..., Cin | Cij is a buyer of product Ii}, ∀ Ii ∈ Products and Cij ∈ Clients.
BuyedBy: Clients → P(Products);
BuyedBy(Ci) = {Ij1, ..., Ijk, ..., Ijm | Ijk was bought by client Ci}, ∀ Ci ∈ Clients and Ijk ∈ Products.
SimilarityCalc: Products × Products → [0, 1];
SimilarityCalc(Ii, Ik) is the entry of the similarity matrix M for the two products Ii and Ik.
Function CalculateSimilarity(Catalog) {
    Associations := ∅, Similarities := ∅
    For each product Ii in Products {
        For each client Cij in BuyersFor(Ii) {
            For each product Ik in BuyedBy(Cij) {
                Associations := Associations ∪ (Ii, Ik)
            }
        }
        For each product Ik associated with Ii {
            Similarities := Similarities ∪ (Ii, Ik, SimilarityCalc(Ii, Ik))
        }
    }
}
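The association step of the pseudocode above can be sketched in Python. The purchase records and product identifiers below are illustrative assumptions, and a simple co-purchase count stands in for SimilarityCalc.

```python
from collections import defaultdict
from itertools import combinations

# BuyedBy in the pseudocode: client -> set of purchased products (sample data)
purchases = {
    "C1": {"I1", "I2"},
    "C2": {"I1", "I3"},
    "C3": {"I2", "I3"},
    "C4": {"I1", "I2"},
}

def build_associations(purchases):
    """Associate every pair of products that share at least one buyer,
    counting co-purchases as a crude stand-in for SimilarityCalc."""
    copurchases = defaultdict(int)
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            copurchases[(a, b)] += 1
            copurchases[(b, a)] += 1
    return copurchases

def recommend(product, copurchases, k=2):
    """Top-k products most often bought together with the given one."""
    scores = {b: n for (a, b), n in copurchases.items() if a == product}
    return sorted(scores, key=scores.get, reverse=True)[:k]

copurchases = build_associations(purchases)
print(recommend("I1", copurchases))  # ['I2', 'I3'] (I2 is co-bought twice)
```

Iterating only over each client's basket, as above, avoids scoring the many product pairs that share no buyer, which is the efficiency gain the text describes.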
b) User influence in recommender systems
In the context of image recommendation, [5] proposes a method of ordering
recommendations based on an algorithm similar to PageRank. It is assumed that a user who
holds high-quality images is likely to appreciate other images of quality (being an authority
in this matter). The quality of an image is given by the weighted sum of the authorities of the
users who marked the image as favourite.
c) Bootstrapping methods for recommender systems
Recommender systems start without information about the users or the items. Olsson and
Rudström [6] propose a method that bootstraps a recommender system with a set of artificial
user profiles. These can be obtained by sampling a probabilistic model built on prior
knowledge.
4. Evaluation of recommender systems
Recommender systems have a variety of properties that may affect the user experience, such
as accuracy, robustness and scalability. The evaluation of recommender systems can be
made as follows ([7] and [1]):
the „Root Mean Square Error” (RMSE) method, a popular way of evaluating predicted
user preferences, with the formula RMSE(P) = √( Σi (Pest,i − Preal,i)² / n );
the average of the absolute difference between the real value of the preference and the
value estimated by the recommender system;
splitting the data into a training set and a testing set, so that the classifiers can be
evaluated. After the training set is presented to the system, it must calculate the
preferences for the instances in the testing set. These preferences are compared to the
real ones, yielding an error score for the algorithm.
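As a minimal worked example of the RMSE formula above (the preference values are illustrative assumptions, not data from the paper):

```python
import math

def rmse(estimated, real):
    """Root mean square error between estimated and real preference scores."""
    n = len(estimated)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, real)) / n)

estimated = [4.0, 3.5, 5.0, 2.0]  # preferences predicted by the system
real      = [3.5, 4.0, 4.5, 2.5]  # preferences actually given by the user
print(rmse(estimated, real))  # 0.5
```

Each prediction here is off by 0.5, so the squared errors average to 0.25 and the root brings the score back to the original rating scale.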
5. Case Study: Amazon.com online store
We implement a method also used in [8], applied here to recommendation systems. The
study considers 3 books: book A is called “Geschichten von der Polizei”, book B is called
“Geschichten vom Rennfahrer Mick” and book C is called “Die Wilde Autobande”.
The percentage of recommendations for each product at the initial moment t0 is: A 50%,
B 20%, C 30%. The fidelity percentage from one week to the next is:
for product A – 60%
for product B – 70%
for product C – 50%
The remaining preferences of the other users are given in the following table.
Table 1 - The remaining preferences
Product cancelled by the buyers    Reorientations of recommendations to other products (%)
                                    A     B     C
A                                   -     18    23
B                                   11    -     13
C                                   7     11    -
We intend to analyze the evolution of the virtual market recommendations for products A, B
and C over 4 weeks.
The analytic algorithm is:
1. We establish the data input.
2. We write the initial distribution.
Table 2 - Initial distribution
Initial distribution 0.50 0.20 0.30
3. We analyze the evolution of the virtual market recommendations over the 4 weeks.
The transition probability matrix is built from the fidelity coefficients and the
reorientations of buyers:
P = ( 0.59 0.18 0.23
      0.11 0.76 0.13
      0.07 0.11 0.82 )
4. We build the matrix of transition probabilities.
Table 3 - Matrix of transition probabilities
0.59 0.18 0.23
0.11 0.76 0.13
0.07 0.11 0.82
5. We compute the recommendation percentages after the first week. The market
recommendation percentages are: Table 4 - Market percentages after the first week
0.338 0.275 0.387
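The weekly update in steps 2-5 is a single vector-matrix multiplication. A minimal sketch in plain Python, using the values from the case study, reproduces the first-week percentages reported for the products; the function name is illustrative.

```python
def next_week(dist, P):
    """Multiply the current recommendation distribution by the transition matrix."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Transition matrix from the case study: row = current product, column = next product
P = [
    [0.59, 0.18, 0.23],  # product A: 59% stay, 18% move to B, 23% move to C
    [0.11, 0.76, 0.13],  # product B
    [0.07, 0.11, 0.82],  # product C
]
initial = [0.50, 0.20, 0.30]  # initial distribution at moment t0

week1 = next_week(initial, P)
print([round(x, 3) for x in week1])  # [0.338, 0.275, 0.387]
```

Iterating `next_week` once per week yields the distribution for the following weeks in the same way.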
6. We build the percentages on the marketplace after 1,2,3 and 4 weeks.
Table 5 - Percentages of marketplace
The percentage after 1st week 0.338 0.275 0.387
The percentage after 2nd week 0.31 0.291 0.399
The percentage after 3rd week 0.29 0.32 0.39
The percentage after 4th week 0.21 0.344 0.446
7. We summarize the evolution of the product market recommendations for the weeks
considered. Table 6 - Evolution of the product market recommendations
W0 W1 W2 W3 W4
A 0.5 0.338 0.31 0.29 0.21
B 0.2 0.275 0.291 0.32 0.344
C 0.3 0.387 0.399 0.344 0.446
8. We draw the evolution curve over the four weeks for each product.
Figure 1 - Curve evolution
State of the art
In the past decade, there has been a vast amount of research in the field of recommender
systems, mostly focusing on designing new algorithms for recommendations.
In paper [9], the authors introduce a model for recommendation systems based on a utility
matrix of preferences. They introduce the concept of the “long tail”, which explains the
advantage of on-line vendors over conventional, brick-and-mortar vendors. They then briefly
survey the sorts of applications in which recommendation systems have proved useful.
Prem Melville and Vikas Sindhwani in paper [10] say that “The design of such
recommendation engines depends on the domain and the particular characteristics of the data
available”.
To succeed in practice, a recommender system must employ a strong initial model, must be
solvable quickly, and should not consume too much memory.
In their paper [11], the authors describe their particular model, its initialization using a
predictive model, the solution and update algorithm, and its actual performance on a
commercial site.
The weakness of the popular EM algorithm is examined in paper [12], which builds on a
learning approach for Bayesian hierarchical linear models and proposes a better learning technique
called Modified EM. The authors showed that the new technique is theoretically more
computationally efficient than the standard EM algorithm, as in papers [13] and [14].
The doctoral thesis of Shengbo Guo [15] is about how Bayesian methods can be applied to
explicitly model and efficiently reason about uncertainty in order to make optimal recommendations.
The methodology proposed in paper [16] improves the performance of simple multi-rating
recommender systems for two main reasons: the creation of groups of user profiles prior to
the application of the collaborative filtering algorithm, and the fact that these profiles are the
result of a user modeling process that is based on the individual user's value system and
exploits Multiple Criteria Decision Analysis techniques.
Bedi et al. in their paper [17] tell us that “Fuzzy sets can handle and process uncertainty in
human decision-making and if used in user modeling can be of advantage as it will result in
recommendations closely meeting user preferences”. In their paper, a hybrid multi-agent
recommender system is designed and developed where users' preferences, needs and
satisfaction are modeled using interval type-2 (IT2) fuzzy sets.
Conclusions
Based on the studies and the implementation of the proposed application, the benefits and the
special utility of the recommender system could be evaluated, in the context of a large and
hard-to-explore search space.
References
[1] M. Roditis, A. Tabacariu and Ş.Trăuşan-Matu, “Sistem de recomandare de imagini pe
baza aspectelor sociale, semantice şi vizuale”, Revista Română de Interacţiune Om-
Calculator 4(1) 2011, 23-50
[2] J. Riedl, T. Beaupre and J. Sanders, “Research Challenges in Recommenders”, ACM
Recommender Systems 2009, available online at http://recsys.acm.org/2009/tutorial3.pdf,
2009
[3] G. Linden, “What is a Good Recommendation Algorithm?” In Communications of the
ACM. Available at http://cacm.acm.org/blogs/blog-cacm/22925-what-is-a-good recommendation
[4] G. Linden, B. Smith and J. York, “Amazon.com Recommendations: Item-to-Item
Collaborative Filtering”, IEEE Internet Computing, vol. 7, no. 1, pp. 76-80, Jan./Feb.
2003
[5] V. Hosu and S. Trausan-Matu, “Metodă de recomandare bazată pe rang, considerând
satisfacţia utilizatorilor” in D.M.Popovici, A.Marhan (eds.), Proceedings of the 4-th
International Conference of Interaction Human-Computer, Ed. RoCHI 2007, MATRIX
ROM, Bucureşti, 2007, pp.129-132
[6] T. Olsson and Å. Rudström, “Genesis, a method for bootstrapping recommender systems
using prior knowledge”, in T. Olsson, Bootstrapping and Decentralizing Recommender
Systems - IT Licentiate theses, 2003-006, Uppsala University, Department of Information
Technology, Uppsala, Sweden, 2003
[7] G. Shani and A. Gunawardana, “Recommender Systems Handbook 2011”, Part 2, pp.
257-297, 2011
[8] Camelia Ratiu-Suciu, “Modelarea & simularea proceselor economice”, Ed. Economica,
2003
[9] Chapter 9, “Recommendation Systems”, available online at http://infolab.stanford.edu/
~ullman/mmds/ch9.pdf
[10] Prem Melville and Vikas Sindhwani, Recommender Systems, IBM T.J. Watson
Research Center, Yorktown Heights, NY available online at
http://vikas.sindhwani.org/recommender.pdf
[12] G. Shani, D. Heckerman et al., “An MDP-Based Recommender System”, Journal of
Machine Learning Research 6 (2005) 1265-1295, available online at
http://jmlr.csail.mit.edu/papers/volume6/shani05a/shani05a.pdf
[13] G. Jawaheer, P. Weller and P. Kostkova, “Modeling User Preferences in Recommender
Systems: A Classification Framework for Explicit and Implicit User Feedback”
[14] Yi Zhang , J. Koren, “Efficient Bayesian Hierarchical User Modeling for
Recommendation Systems”, https://users.soe.ucsc.edu/~yiz/papers/c10-sigir07.pdf
[15] S. Guo, “Bayesian Recommender Systems: Models and Algorithms”, Doctoral thesis,
http://users.cecs.anu.edu.au/~sguo/thesis.pdf
[16] K. Lakiotaki, N. F. Matsatsinis and A. Tsoukiàs, “Multi-Criteria User Modeling in
Recommender Systems”,
http://www.lamsade.dauphine.fr/~tsoukias/papers/Lakiotakietal.pdf
[17] P. Bedi, P. Vashisth and P. Khurana, “Modeling user preferences in a hybrid
recommender system using type-2 fuzzy sets”, http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6622471&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6622471
[18] P. Adjiman, “Flexible Collaborative Filtering in JAVA with Mahout Taste”, available at
http://philippeadjiman.com/blog/2009/11/11/flexible-collaborativefiltering-in-java-with-
mahout-taste/, 2009
[19] A. Bosch, A. Zisserman and X. Munoz, “Representing shape with a spatial pyramid
kernel”, CIVR’07, available at http://eprints.pascalnetwork.org/archive/00003009/01/bosch07.pdf
[20] P. Resnick and H. Varian, “Recommender systems”, in Communications of the ACM,
Volume 40, Issue 3 (March 1997), pp. 56-58, 1997
[21] B. Sarwar, G. Karypis and J. Konstan, „Item-Based Collaborative Filtering
Recommendation Algorithms”. In WWW10, Hong Kong, May 2001
[22] P.-N. Tan, M. Steinbach, V. Kumar, „Introduction to Data Mining”, Addison Wesley,
2005
[23] J. Vermorel and M. Mohri “Multi-Armed Bandit Algorithms and Empirical Evaluation”
in European Conference on Machine Learning, available at
http://www.cs.nyu.edu/~mohri/ pub/bandit.pdf, 2004
METHODOLOGY FOR THE COHERENT ROMANIAN LINKED
OPEN GOVERNMENT DATA ECOSYSTEM
Codrin-Florentin NISIOIU
Bucharest University of Economic Studies
Abstract. Linked Open Government Data (LOGD) facilitates the integration of data and
allows links between disparate government data sets. The increasing use / reuse of LOGD
releases increases the need to improve the quality of data. The availability of LOGD allows
the creation of new services offered by the public / private sector. The reuse of LOGD in
e-government applications leads to a considerable reduction in costs. The article is divided
into 3 sections. The first section describes the LOGD ecosystem, the second section proposes
a methodology for a coherent Romanian Linked Open Government Data ecosystem and the
third section puts the LOGD ecosystem in the context of the metadata economy.
Keywords: Linked Open Government Data (LOGD), Open Data, Romanian LOGD
ecosystem
JEL classification: M15, O38
1. The LOGD ecosystem
Data are provided by the specific areas of public sector information - geographic information,
business specific information, legal information, weather data, social data, information on
transport, cultural heritage information and information about education and research.
The LOGD life cycle is based on a demand - supply data model. There is a "data provider"
that selects, shapes, publishes and performs the data management on the supply side. There
is also a "data consumer" that searches for, integrates, reuses and provides feedback on the
demand side in order to improve the data management.
The specific actors of the LOGD ecosystem are:
1. The data providers, represented by the public administrations that "open" their data and
provide it as "Linked Open Government Data" (LOGD).
2. The data consumers: citizens, entrepreneurs, companies and public administrations that
reuse LOGD through value-added applications and services.
3. The data brokers: third-party organizations, public or private, which manage data
catalogs and marketplaces, facilitating access to the available LOGD. They can provide
additional services such as advanced queries, data visualization and the ability to export in
various formats.
4. The regulatory entities, represented by the local / regional / national public administrations
and cross-border institutions - for example, the European Commission regulates LOGD
through policies, laws and directives.
The LOGD ecosystem is analyzed through the 9 areas of Business Model Canvas [1]:
1. The value proposition provides an overview of an organization's products and services
that have added value to the consumer of data.
2. The key-resources are represented by the necessary activities and resources to create
added value to the customer.
3. Key partners are those who have made a voluntary cooperative agreement between two or
more organizations to create added value for the customer.
4. Key activities are represented by the ability to execute action models that are needed to
create added value to the customer.
5. The cost structure is achieved through the representation in money of all means
involved in the business model.
6. The customer segments are represented by the organizations that create added value
from LOGD.
7. The income obtained through LOGD - the way used by an organization to obtain
revenue.
8. The channels are the way to keep in touch with the clients.
9. The customer relations are represented by the type of bond that is created between the
supplier and the client company.
The value proposition can be achieved through: the flexible integration of data, increased
data quality, new services and reduced costs.
Errors are progressively corrected through the "crowd-sourcing" and "self-service"
mechanisms.
The key resources are: the URI policies, the infrastructure of linked data and also the specific
skills and competencies. URI policies must ensure the permanence and uniformity of the Web
identifiers. The infrastructure of "linked data" is the web infrastructure needed to achieve the
"linked data". Skills and competencies are key resources needed to develop and maintain
"linked data".
The key partners may be governmental partners, entities from the business environment or
NGOs. The governmental partners are regulators, e-Government agencies and other entities
from the public administration. The business partners are ICT providers, data providers and
data brokers. The NGO partners include lobby groups, communities and engaged civic
action groups.
The LOGD specific key activities are: the development, the maintenance and the promotion.
The development specific activities consist of identification activities, modeling,
transformation, harmonization, publication and reuse of LOGD. The maintenance specific
activities consist of support activities for users, server maintenance etc. The promotion
specific activities consist of activities regarding the promotion of the LOGD use.
The LOGD cost consists of the development, the maintenance and the promotion cost. The
development costs include the costs needed for the identification, the modeling, the
processing, the harmonization, the publication and the reuse of LOGD. The maintenance
costs include the costs of updating the data and/or the specific infrastructure maintenance
costs. The promotion costs of making the data available as "linked data" are obtained by
comparison with the costs of promoting their availability through other means.
The customer segments using LOGD are: the government, the NGOs, the companies
specialized in working with LOGD and the academic environment. The companies
specialized in working with LOGD may include data brokers, LOGD service developers and
data-focused journalists. The pricing model includes: the income sources, the pricing model
used, the price structure and the type of LOGD license associated.
The specific LOGD revenue sources are: public funds (LOGD is partially financed by the
government), user fees (LOGD is partly funded from subscriptions) and advertising. The
price model may be: free; freemium - the data are free, but the features, functionality and
proprietary virtual goods are paid; premium - quality data associated with a certain image
and a certain prestige among users.
The price structure consists of a fee for LOGD subscriptions or a LOGD fee per
"application". LOGD licensing types should consider: the unknown licensing - the "linked
data" service does not indicate under what license the data can be used, modified and redistributed; the
attribution restrictions; the restrictions for commercial use; the "share-alike" restrictions -
for redistribution the same license or another public one must be used.
The LOGD distribution is achieved through: Web APIs (LOGD is accessible through Web
APIs), downloadable datasets that are part of a "Data Bank", proprietary applications and
Web applications.
The networking mechanisms are based on "branding" (LOGD has a brand strategy that refers
to the value of data), on advertising (the consumer is informed of specific services "linked
data" through advertising), on user support (there is a support centre in order to use LOGD)
and on user feedback mechanism (measuring user satisfaction and ensure general feedback).
2. Methodology for a coherent Romanian Linked Open Government Data Ecosystem
The methodology comprises the following steps: I) developing the "desk research"
preparatory work; II) collecting metrics for each case study; III) conducting interviews;
IV) applying the specific set of questions to the LOGD provider / consumer.
The minimum information gathered in the "desk research" is: the LOGD user/provider
mission; the provision and reuse of linked data; the LOGD price; the LOGD licensing; and the
market analysis. The LOGD user / provider mission is presented in a few words, correlated
with the public work it performs. The provision and reuse of linked data are summarized by
describing the linked data through their links, documents, the vocabularies used and the data
linking services. The LOGD price is shown through the applied price and the specific
documentation. The LOGD licensing is presented through the applied licensing conditions.
The market analysis shows the market players, the LOGD providers and the users. The
sources of information for step I are: public administration websites, annual reports and
studies on public sector data.
The metrics collected for each case study are: 1) use: the number of URIs / queries, the
number of government LOGD re-users and the number of commercial LOGD re-users;
2) income and income sources: public funds, income from annual LOGD subscriptions and
from data on demand per year, the evolution of the LOGD price per year, advertising;
3) costs: development costs, covering all activities of LOGD identification, modeling,
transformation, harmonization, publication and / or reuse; maintenance costs per year, i.e. the
costs of publishing the updated data and of the specific infrastructure; and promotion costs,
i.e. the costs of promoting the availability of the data as "linked" data; 4) benefits: the number
of "linked" data sets, the number of LOGD-based applications, the number of requests for
data corrections and the cost savings obtained by integrating information.
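The four metric groups above can be captured in a simple record structure. The sketch below is my own illustration in Python; the field names and the yearly_balance helper are assumptions, not part of the methodology:

```python
from dataclasses import dataclass

@dataclass
class CaseStudyMetrics:
    """One LOGD case study, grouped as in the methodology:
    1) use, 2) income, 3) costs, 4) benefits. Field names are illustrative."""
    # 1) use
    uri_queries: int = 0
    government_reusers: int = 0
    commercial_reusers: int = 0
    # 2) income sources (per year)
    public_funds: float = 0.0
    subscription_income: float = 0.0
    advertising_income: float = 0.0
    # 3) costs (per year)
    development_cost: float = 0.0
    maintenance_cost: float = 0.0
    promotion_cost: float = 0.0
    # 4) benefits
    linked_datasets: int = 0
    logd_based_apps: int = 0
    correction_requests: int = 0
    integration_savings: float = 0.0

    def yearly_balance(self) -> float:
        """Income minus costs for one year - a rough sustainability signal."""
        income = self.public_funds + self.subscription_income + self.advertising_income
        costs = self.development_cost + self.maintenance_cost + self.promotion_cost
        return income - costs

m = CaseStudyMetrics(public_funds=100.0, development_cost=40.0, maintenance_cost=25.0)
print(m.yearly_balance())  # 35.0
```

A record like this makes the per-case-study comparison in step II mechanical: each interviewed provider yields one instance, and the groups map one-to-one onto the questionnaire sections.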
Interviewing – there will be at least two interviews, one with the LOGD provider and one
with the LOGD beneficiary.
The set of questions specific to the LOGD provider / consumer is: 1) Organization – How
does the LOGD demand / use correlate with the public activities specific to the organization?
Is there a strategic document within the organization that motivates the investment in LOGD
provision / reuse? What are the future plans? 2) Proposals – Does the provision of LOGD
create new opportunities for flexible data integration? Has the data quality increased? Have
new services appeared? Have the costs been reduced? Who are the main facilitators /
inhibitors for LOGD to provide added value to the users? 3) Key resources – Does your
organization have a policy regarding URIs? Can you describe your organization's
infrastructure for "linked" data? What skills and competencies are required in order to provide
/ use LOGD? 4) Key partners – Which organizations are the key partners in the LOGD
provision / use? Key activities – What activities do you carry out in order to provide / use
LOGD? 5) Cost structure – What investments were made by your organization in order to provide /
use LOGD? What were the costs of publishing, maintaining and promoting LOGD? What are
the trends? 6) Customer segments – Who are the main users of the LOGD services? How
often is LOGD used? What is the trend? 7) Income – How is LOGD financed? What is the
pricing mechanism, or what are the other sources of income? What type of license is used for
reuse? 8) Channels – What are the channels most commonly used to access LOGD? Web
APIs? Website? Application? Data market? 9) Customer relations – Is there a branding
strategy for LOGD? Are there investments in advertising specific to LOGD?
3. Metadata Economy
Widely interoperable metadata represents a new IT phenomenon that, in combination with
new licensing strategies, creates new opportunities for product diversification. The media
industry has attracted attention by switching from traditional content to metadata and by
professionalizing its strategies in order to preserve and enhance the quality of structured data
by applying Semantic Web principles [2].
The Semantic Web approach is relevant when the distribution of goods increases and
allows a multiplicity of services and customer portfolios. The uniform application of the RDF
data model is the cornerstone of the Semantic Web and of Linked Open Data, allowing
syntactic and semantic interoperability. Saumure and Shiri [3] have noted the increase in
research on metadata-related issues and the decline of traditional topics (such as indexing,
artificial intelligence, etc.). They [3] have documented new areas of research such as
Web-based cataloging, classification and interoperability.
K. Haase [4] believes that as the volume of data increases, the economic value of metadata
increases too. Under the pressure to diversify, especially in knowledge-based business sectors
such as media, life sciences, banking, insurance or trade, businesses must constantly search
for new ways to create value-added products and services for existing customers or to attract
new consumers. Metadata-specific concepts such as metadata schemas, vocabularies,
ontologies, identifiers, queries, etc. have become central factors of production, both in
operating existing offerings efficiently and in opening new ways of diversifying products and
services. In the approach of Sjurts [5], diversification can also be seen in the light of
interoperable metadata. The resource-based approach investigates how valuable economic
resources are created and exploited commercially.
The market-based approach investigates which new consumers and market segments can be
safely penetrated. Both approaches are intertwined and affect each other. Recognizing and
understanding the specificity of interoperable metadata is crucial when developing a business
around metadata semantics, especially when appropriate licensing strategies are applied. Due
to the increasing interaction between the factors involved in the creation of goods, the value
chain approach has given way to the network approach [6]. The network approach takes into
account two factors: 1) an input can be used in various contexts for different purposes, and
2) an economic actor can be active at different levels of value creation simultaneously. Latif
[7] used this approach to describe the structural coupling of economic actors, their roles and
the assets involved in creating Linked Data. Value-added networks are characterized by a
high organizational complexity and require governance principles different from those of
"open source" projects. Demil & Lecocq [8] developed the concept of "bazaar governance",
in which interactions between economic actors are characterized by: decentralization, a
collaborative engagement model, resource sharing and hybrid business models composed of
strong and weak property rights.
4. Conclusions
I think that the principle used in "bazaar governance" can be adopted in the Linked Open
Government Data ecosystem in order to design and govern the open data infrastructure based
on the federalization principle, self-service and a collaborative way of creating value. My
future research will focus on: 1) proposing an architecture for the LOGD ecosystem based on
these principles; 2) using the proposed methodology to prove the sustainability of the LOGD
architecture.
Acknowledgment
This work was co-financed from the European Social Fund through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/142115 „Performance and excellence in doctoral and postdoctoral
research in Romanian economics science domain".
References
[1] Alexander Osterwalder and Yves Pigneur. Business Model Generation. 2009.
[2] Rachel Lovinger. Nimble: a razorfish report on publishing in the digital age. Technical
report, 2010.
[3] Kristie Saumure and Ali Shiri. "Knowledge organization trends in library and
information studies: a preliminary comparison of the pre- and post-web eras". Journal of
Information Science, vol. 34(5), pp. 651-666, 2008.
[4] Kenneth Haase. "Context for semantic metadata", in Proceedings of the 12th Annual
ACM International Conference on Multimedia, MULTIMEDIA '04, New York, USA,
2004, pp. 204-211.
[5] Insa Sjurts. "Cross-Media-Strategien in der deutschen Medienbranche. Eine ökonomische
Analyse zu Varianten und Erfolgsaussichten", in Björn Müller-Kalthoff, editor, Cross-
Media Management, pp. 3-18. Springer, 2002.
[6] Axel Zerdick, Arnold Picot, Klaus Schrape, Alexander Artope, Klaus Goldhammer,
Ulrich T. Lange, Eckart Vierkant, Esteban Lopez-Escobar, and Roger Silverstone.
E-conomics: Strategies for the Digital Marketplace. Springer, 1st edition, 2000.
[7] Atif Latif, Anwar Us Saeed, Patrick Hoefler, Alexander Stocker, and Claudia Wagner.
"The linked data value chain: a lightweight model for business engineers", in
Proceedings of I-Semantics 2009 - 5th International Conference on Semantic Systems,
Graz, Austria, 2009, pp. 568-577.
[8] Benoit Demil and Xavier Lecocq. "Neither market nor hierarchy nor network: the
emergence of bazaar governance". Organization Studies, vol. 27(10), pp. 1447-1466, 2006.
QUALITY METRICS FOR EVALUATING INTERACTIVITY
LEVEL OF CITIZEN ORIENTED WEB BASED SOFTWARE
SOLUTIONS
Emanuel Eduard HERȚELIU
Bucharest Academy of Economic Studies
Abstract. This paper presents a way to evaluate the quality of web based software solutions
in terms of the degree of interaction they provide to the user. The types of analyzed solutions
are divided into categories. The evaluation is done using quality metrics, which are presented
together with the process of collecting data and applying them. Data collection is done using
an automated tool; the modules of the tool are presented and the process of using it is
highlighted.
Keywords: Quality, Metrics, Citizen Oriented, Web Based Software Solutions
JEL classification: O38
1. Introduction
The Web environment is a good choice for hosting applications that provide services to
citizens, as it makes an application easily reachable from a wide range of devices and
software platforms. The share of persons in the EU who use the internet daily increased from
31% in 2006 to 65% in 2014 [1], which means that developing an online app is a must in
order to reach a large percentage of users. Applications designed and developed for citizens
are usually funded by governments and intended to help them by providing online services
that make their life easier. This is why such applications are usually free of charge, highly
interactive and accessible. The services provided by national and local governments through
online apps are:
information sharing services that bring citizens news about changes in payment and tax
schemes, laws, local and national administrative information, cultural, community and social
events that they could or need to attend, and touristic information;
multimedia guidance using audio and video features in order to inform citizens and help
them with their administrative tasks;
payment services helping tax payers save time by paying from home or from certain
places equipped with devices designed especially for administrative payments;
online survey services that gather information about users' satisfaction with the services
and the improvements that can be made based on their opinions;
enhancements that help disabled people use the services and stay informed;
archives containing forms and document templates available for download, so that people
can save time by filling them in prior to using them at service desks.
Steps towards good online services are being taken by governments all over the EU and
worldwide, but at the moment not all services are provided to citizens. There are local and
national administrations that provide one or a few of these services, but they need to be
centralized and standardized so that integrating new services does not involve unnecessary
effort and cost.
2. Interactivity approach
When categorizing the online services provided to citizens in terms of interactivity, a
distinction can be made between certain types of web based solutions [2]. Thus, using
interactivity as a criterion, the following categories of informatics applications are
distinguished:
static applications, in which content is displayed statically so the user can read it, with
little action needed other than scrolling, navigating through pages, following links, and
hiding or un-hiding containers filled with information;
dynamic applications, where certain content is fetched from the server and displayed
based on user interaction and requests via input controls;
archives, i.e. applications with downloadable content such as forms, pictures and
documents relevant to the user's needs as a citizen;
dynamic archives, i.e. apps where the user can not only download content but also
contribute by uploading personal documents;
media applications, where the user can access audio and video informative content
depending on his interests and can interact with the players and tools used to
display/stream the content;
mixed applications, which can include specific forms of all the categories presented;
user-entered data is processed on the server and calculations are made before sending a
response and updating the page with the results.
When developing the interactivity level of the services provided to citizens, an important step
is to include as many forms of interaction as possible, diversifying the actions the user can
perform and improving the quality of the results the user obtains when accessing the app [3].
The presented application types are linked to user actions in Table 1.
Table 1 – Application categories and user actions
Application Type | Specific User Actions | Services
static | scrolling, changing pages, following links, hiding/un-hiding content | schedule updates, news, informative content, touristic information, timetables
dynamic | filling in text boxes, checking checkboxes, switching between radio buttons, submitting forms | tax paying, property buying
archives | downloading files | downloading administrative forms, documents, papers and books
dynamic archives | downloading/uploading files | uploading content
media | streaming video and audio content, viewing pictures | watching news, informative and educational movies and pictures
The categories and user actions presented in Table 1 are intended to draw a line between
certain types of applications oriented towards helping people in day-to-day life. Combined
with the number of steps necessary to reach a result, this approach to interactivity is
measurable using metrics.
3. Interactivity evaluation tool
In order to evaluate the level of interactivity, a set of metrics was built. They are intended to
categorize apps and indicate the degree of interaction the user gets when accessing the app.
Since the analyzed apps are web based, they are evaluated by parsing the HTML code of each
page. The goal is to make it easy to measure the interactivity of a high number of
applications at a time. Thus, an open source web crawler was integrated, and an automation
tool was built and used to get and parse the source code based on the app's URL, as described
in Figure 1.
Figure 1 – Quality evaluation tool
The automated tool presented in Figure 1 consists of three main modules that are
interconnected and linked to databases:
the crawler module, an open source crawler integrated and configured to accept as input a
range of app URLs and crawl each of them; crawling begins on the homepage of the app
and then follows the links found on the page to proceed to the next pages of the same app;
checking that a page belongs to the same app is done by verifying that it shares the same
domain and port as the homepage; the output of the crawler module is the source code of
each page, which is sent to the database to be further processed;
the parser module, which reads the crawled source code from the database as input; the
crawled source code is rigorously parsed following the app's structure, and each page is
divided into its component HTML nodes, which are sent as output to be processed in the
next step;
the processing module, which has two input sources: the parsed HTML nodes and the
built metrics previously stored in the database; during processing, the module performs
counting, keyword matching and page length calculation in order to give meaning to the
metrics that quantify ease of loading; the processed data is stored in the database for each
app and page;
a fourth module, under construction, is intended to communicate with the database and
retrieve the results in order to present them via HTML to the final user, in formats that
include graphics, tables and explanatory text.
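The same-app check described for the crawler module (same domain and port as the homepage) can be sketched with the standard library; the helper name is my own, not taken from the tool:

```python
from urllib.parse import urlsplit

def belongs_to_app(homepage_url: str, candidate_url: str) -> bool:
    """A crawled link belongs to the same app when it shares the
    homepage's host and port (no explicit port parses as None)."""
    home = urlsplit(homepage_url)
    cand = urlsplit(candidate_url)
    return (home.hostname, home.port) == (cand.hostname, cand.port)

print(belongs_to_app("http://city.example.ro/", "http://city.example.ro/taxes/pay"))  # True
print(belongs_to_app("http://city.example.ro/", "http://other.example.ro/taxes"))     # False
print(belongs_to_app("http://city.example.ro:80/", "http://city.example.ro:8080/x"))  # False
```

Note that this strict comparison treats an implicit port (None) and an explicit default port (:80) as different, which is one defensible policy for keeping a crawl inside a single app.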
[Figure 1 depicts the pipeline: the input set of app URLs (URL 1 … URL n) feeds the crawler; the crawled source code of each page (SC 1 … SC n) is stored in the database; the parser turns the source code into HTML nodes; the processing module combines these nodes with the built metrics and outputs the quality level.]
The technologies used to build the automated tool are ASP.NET, with Visual Studio 2013 as
the working environment. The databases are built and managed using SQL Server 2012.
Entity Framework is used to facilitate easy integration between the database and the modules.
The presentation module is also built using ASP.NET, as a web app following the MVC 5
pattern.
4. Evaluating interactivity using metrics
According to the categories of applications presented, the metrics are calculated for a set of
web based applications. The metrics relate to the HTML code of each page and are applied by
evaluating the nodes via the automated tool presented. The metrics used are divided into
three categories:
binary metrics, used to show whether a specific node was found on the page, having the
value 1 if the node was found and 0 otherwise;
counting metrics, used to count the total number of occurrences of certain nodes inside
the page content;
ratio metrics, used to highlight to what extent the content of the page is represented by a
certain type of node; they are calculated by summing up all nodes of a certain type and
dividing the sum by the total number of nodes on the page.
Evaluating the HTML nodes is done by applying metrics from each category both at page
level and at whole application level [4]. The metrics are presented in Table 2 together with
their meaning in terms of interactivity.
Table 2 – Interactivity metrics
Metric | Symbol | Metric Category | Way of expressing | Means of interactivity
Total number of pages per application | PN | Counting | Summing up all pages crawled for an app | The complexity of the problem solved by the application
Total number of nodes per page/app | NN | Counting | Summing up all nodes parsed per page/app | The structure of the page/application
Form nodes present on page/app | FE | Binary | Searching for nodes of type form | Application is interactive
The weight of input nodes in the total number of app/page nodes | IR | Ratio | Dividing the total number of input nodes by the total number of nodes found on the page/app | Application's extent of interactivity
Input nodes of type file present on page/app | IFE | Counting | Summing up the input nodes of type file | Application allows file uploads
Total number of files available for download per page/app | DFN | Counting | Summing up all files available for download per page/app | Application allows file downloads
Secure protocol present | SE | Binary | Checking the protocol of the app's URL | Application allows secure transactions
The metrics presented in Table 2 are chosen based on their contribution to evaluating the
interactivity level of citizen oriented web based apps. They are meant to be used as variables
of quality indicators for evaluating the interactivity level. The presence of form HTML nodes
at app level, captured by the FE metric, is meaningful for the interactivity level because it
tells whether the citizen oriented web based app provides results calculated by submitting
user input values or not. The weight of input nodes, calculated by the IR metric for the apps
that provide form elements, is important because it reflects the effort and information
required from the user in the interaction process.
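As an illustration, the NN, FE, IR and IFE metrics from Table 2 can be computed from a page's HTML with the standard library parser. This is my own minimal sketch, not the paper's tool:

```python
from html.parser import HTMLParser
from collections import Counter

class NodeCounter(HTMLParser):
    """Counts HTML start tags, plus <input> elements of type 'file'."""
    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.file_inputs = 0
    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag == "input" and dict(attrs).get("type") == "file":
            self.file_inputs += 1

def page_metrics(html: str) -> dict:
    p = NodeCounter()
    p.feed(html)
    nn = sum(p.tags.values())                       # NN: total nodes on the page
    return {
        "NN": nn,
        "FE": 1 if p.tags["form"] else 0,           # binary: any <form> present?
        "IR": p.tags["input"] / nn if nn else 0.0,  # ratio: input nodes / all nodes
        "IFE": p.file_inputs,                       # counting: file-upload inputs
    }

html = '<html><body><form><input type="text"><input type="file"></form></body></html>'
print(page_metrics(html))  # {'NN': 5, 'FE': 1, 'IR': 0.4, 'IFE': 1}
```

App-level values then follow by feeding every crawled page of the app through the same counter and aggregating, as Table 2 prescribes.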
The metrics are applied to application sets. The metrics set MS is built as
MS = {M1, M2, ..., Mi, ..., MNTM}, where Mi represents the i-th metric out of the NTM = 7
metrics from Table 2. Each metric is calculated for a set of applications. The applications set
AS is built as AS = {A1, A2, ..., Ai, ..., ANTA}, where Ai represents the i-th application out of
the NTA applications. The results are evaluated individually for each app and then for the
entire set of applications, as presented in Table 3.
Table 3 – Metrics set and applications set
App \ Metric | M1 | M2 | ... | Mj | ... | MNTM
A1 | m11 | m12 | ... | m1j | ... | m1,NTM
A2 | m21 | m22 | ... | m2j | ... | m2,NTM
... | ... | ... | ... | ... | ... | ...
Ai | mi1 | mi2 | ... | mij | ... | mi,NTM
... | ... | ... | ... | ... | ... | ...
ANTA | mNTA,1 | mNTA,2 | ... | mNTA,j | ... | mNTA,NTM
Total | TM1 | TM2 | ... | TMj | ... | TMNTM
Table 3 presents the way the metrics are evaluated for a set of NTA applications. The
element mij represents the value obtained for the metric Mj when applied to the application
Ai. The last row of the table contains the totals for each metric: the total TMj corresponding
to the metric Mj is calculated as the average of the mij values over all NTA apps.
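The per-metric totals of Table 3, each TMj being the column average of mij over the NTA apps, can be sketched as follows (the sample values are hypothetical):

```python
def metric_totals(values: list[list[float]]) -> list[float]:
    """values[i][j] = m_ij, the j-th metric measured on the i-th app.
    Returns TM_j = average of column j over all NTA apps."""
    nta = len(values)
    ntm = len(values[0])
    return [sum(row[j] for row in values) / nta for j in range(ntm)]

# Three hypothetical apps measured with three metrics (e.g. PN, FE, IR):
m = [[10, 1, 0.20],
     [40, 0, 0.00],
     [25, 1, 0.10]]
print(metric_totals(m))  # first total is exactly 25.0; the others are ~0.667 and ~0.1
```

Averaging (rather than summing) keeps the totals comparable across metric categories, since binary and ratio metrics stay in [0, 1] while counting metrics keep their natural scale.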
5. Conclusions
Achieving a high interactivity level is key to developing good quality web based citizen
oriented software solutions. The built metrics need to be applied to a wide range of
applications in order to refine and validate the obtained results. Further development and
enhancement of the automated tool and of the metrics themselves are therefore needed so
that they measure quality with a high level of accuracy.
Acknowledgment
This work was co-financed from the European Social Fund through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/134197 „Performance and excellence in doctoral and postdoctoral
research in Romanian economics science domain".
References
[1] Source: Eurostat. Internet: http://ec.europa.eu/eurostat/documents/2995521/6343581/4-
16122014-BP-EN.pdf/b4f07b2a-5aee-4b91-b017-65bcb6d95daa, March 16, 2015
[2] Ion IVAN, Bogdan VINTILĂ, Dragoș PALAGHIȚĂ - Types Of Citizen Orientated
Informatics Applications, Open Education Journal, Russia, ISSN 1818-4243, No.6, 2009
[3] Maria HAIGH – Software quality, non-functional software requirements and IT-business
alignment, Software Quality Journal, Vol. 18, Issue 3, September 2010, pp.323-339,
ISSN 09639314
[4] Christof EBERT, Manfred BUNDSCHUH, Reiner DUMKE, Andreas
SCHMIETENDORF – Best Practices in Software Measurement, Publisher: Springer
Verlag Berlin Heidelberg, 2005, 300 pp, ISBN 978-3-540-26734-8
BRIDGE PKI
Eugen Ștefan Dorel COJOACĂ
Ministry for Information Society, Romania,
Doctoral School of University of Economic Studies, Bucharest
Mădălina Elena RAC-ALBU
Bucharest University of Medicine and Pharmacy “Carol Davila”, Romania,
Doctoral School of University of Economic Studies, Bucharest
Floarea NĂSTASE
Bucharest University of Economic Studies, Romania
Abstract. In this article we present one of the initiatives of the Ministry for Information
Society, namely the "Bridge PKI" project, which aims to provide a business infrastructure
enabling interoperability between the accredited vendors of electronic signatures in Romania,
together with secure access to this infrastructure. The project aims to increase administrative
efficiency by using new electronic communication technologies and to boost the use of
electronic documents. The implemented project was financed from European funds and has
taken a further step towards interconnection with similar existing systems in EU countries.
Keywords: interoperability, PKI, e-Business, e-Government, e-Services
JEL classification: M10, M14, O33
1. Introduction
The strategy of the MSI (Ministry for Information Society) on the computerization of public
services at central and local level must notably pursue several directions:
increasing the efficiency of the administrative apparatus by using new electronic
communication technologies;
orienting public services towards the needs of citizens and businesses;
providing free access to public information;
transparency in the exercise of administration;
fulfilling the standards of European administration, in order to interconnect systems
across EU Member States.
The users of these services must trust the government. For example, a citizen must have the
guarantee that the official documents received were not altered and were indeed sent by the
authority. At the same time, public authorities must be able to verify that the received
documents come from the people who claim to have sent them and that the documents are
original.
To ensure these main security objectives in managing electronic documents, IT systems use
infrastructures based on public key cryptography.
According to [1], “Public Key Infrastructure (PKI) refers to the technical mechanisms,
procedures and policies that collectively provide a framework for addressing the previously
illustrated fundamentals of security - authentication, confidentiality, integrity, non-
repudiation and access control.”
For effective cooperation between companies, citizens and government, as well as with
public authorities in other Member States of the European Union, PKI systems need to be
interoperable.
There are several models based on public key infrastructure [2] [3]:
1. The authority validation model – based on using the OCSP protocol (Online Certificate
Status Protocol) to query a server about the status of certificates.
The advantages of this solution are:
online consultation always provides up-to-date information;
validation information does not have to be disseminated to all workstations.
The disadvantages are:
users must use OCSP-compliant software tools;
the checking of the validation path is partially delegated to someone else.
2. The hierarchical model – certification authorities are organized hierarchically, each
issuing certificates to subordinate authorities or to its users.
The advantages of this solution are:
certification paths are easily determined because they are unidirectional; for each
certificate there is a single certification path to the root certification authority;
a hierarchical architecture provides a high degree of scalability, since adding a new
subordinate certification authority to handle a large number of users is simple to achieve.
The disadvantages are:
compromise of the private key of the root certification authority is catastrophic, requiring
the revocation of all existing certificates and the recreation of the hierarchy;
the trust relationships between organizations or companies are not necessarily
hierarchical, and therefore it is very difficult to find one authority that all of them trust.
3. The mesh model – all certification authorities (CAs) are considered trusted points.
Certification authorities issue certificates to one another; each pair of certificates describes a
bidirectional trust relationship.
The advantages of this solution are:
the compromise of a certification authority (CA) does not destroy the entire structure of
public keys;
a public key "mesh" can be easily constructed from a set of isolated certification
authorities (CAs).
The disadvantages are:
an endless loop of certificates can develop;
the contents of a certificate cannot be used for the recognition of access rights.
4. The Web / Internet Trust model – this solution is based on a list of trusted certificates,
the CTL (Certificate Trust List).
The advantages of this solution are:
the compromise of a certification authority (CA) does not destroy the whole structure of
public keys;
building the certification path is simpler than in the case of the "mesh".
The disadvantages are:
it does not process revocations and does not provide support for using multiple
certification policies;
it lacks mechanisms to protect data integrity.
5. The Bridge CA model – also based on mutual certification relations, but the trust model
used is the star type. The advantages of this solution are:
compared to the "mesh" model, certification path discovery becomes easier;
certification paths are shorter.
Using the electronic signature in e-Government services is very important because it allows
the use of electronic documents. There are many projects that use electronic signatures, and
their management is not an easy task; studies in this regard show ways to obtain optimal
solutions [4]. Thus, interoperability between public key infrastructures has different
meanings depending on the context in which it is used – essential in this project is the
interoperability between different domains.
Each entity involved in this project has its own PKI architecture, resulting in the need for the
PKI hierarchies to interact with each other. Thus, entities that adhere to this system will be
recognized in the topology. When a unit's enrollment in this system succeeds, secure
communication between the parties is established; there is no need for any auxiliary
authentication, because the unit is considered a trusted one in the system.
The solution of the project is based on a list of certificate authorities that ensures
interoperability between the trust domains of different public key infrastructures. The system
is simple and easy to manage, and to avoid the security disadvantages that may occur, lists of
trusted certificates (CTL – Certificate Trust List) are introduced. The system will distribute a
list of trusted root certification authorities, signed with its own certificate.
The system architecture provides the participating members with the following services:
distribution of the accredited CA certificates as a signed list, the trust list (TL);
cross-certification of the participating members who do not want to use the trust list (TL);
provision of a public key certificate for each member PKI and of the appropriate
certificate revocation list;
a PKI interface specification for each member that interacts with the BCA;
the opportunity for participants to test and validate their own interface against the
reference site.
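The trust-list mechanism above amounts to checking that a certificate chain terminates in a CA on the distributed, signed list. The sketch below is my own simplification, with hypothetical names and a flat chain of subjects; real CTL processing would also verify the list's signature, each certificate's signature, validity period and revocation status:

```python
def chain_trusted(chain: list[str], trust_list: set[str]) -> bool:
    """chain: certificate subjects from the end entity up to the root CA.
    Trusted when the chain terminates in a CA present on the trust list (TL)."""
    if not chain:
        return False
    root = chain[-1]
    return root in trust_list

# Hypothetical signed list of accredited root CAs distributed by the system:
tl = {"RO Accredited CA 1", "RO Accredited CA 2"}
print(chain_trusted(["citizen@example.ro", "RO Accredited CA 1"], tl))  # True
print(chain_trusted(["citizen@example.ro", "Unknown CA"], tl))          # False
```

This is why the CTL approach keeps relying parties simple: they need only the latest signed list, not a full cross-certification graph.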
2. Concept
In order to achieve the goals of the project, several possible solutions were considered (each
with its advantages and disadvantages), leading to the optimal implementation: a
combination of the popular Web/Internet Trust model and the Bridge CA model (Figure 1).
This model overcomes the disadvantages of other PKI models (simple, hierarchical or mesh)
and achieves mutual trust between participants, who retain their own trust structures.
The advantages of this model are:
the model is open;
compromise of a certification authority (CA) does not destroy the whole public key structure;
building a certification path is simpler than in the mesh model, and certification path
discovery becomes easier;
certification paths are shorter.
Bridge PKI uses a cross-certification process that involves the following steps: application,
submission of documentation, policy mapping, technical interoperability testing, approval of
the application, negotiation of an agreement, and cross-certification.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
215
Figure 1 - The PKI Bridge
3. Bridge PKI Architecture
The PKI Bridge is compatible with external Certification Authorities issuing X.509 v2 CRLs
and X.509 v3 certificates, and uses X.500 distinguished names. It also uses two critical
extensions: "basic constraints", which indicates whether the subject of the certificate is a
Certification Authority, and "key usage", which specifies whether the private key can be used
for signing certificates and CRLs. From the functional point of view, the system observes:
ITU X.509 v3 – digital certificate format;
X.509 v2 – format for issued CRLs;
RFC 5280 – for issuing, publishing and revoking digital certificates;
RSA, DSA, ECDSA – cryptographic signature algorithms;
SHA-1, MD5 – cryptographic hash algorithms;
DH – key exchange (RSA key lengths of 1024 to 4096 bits);
PKCS#7/PKCS#10 – for managing certificates;
PKCS#11 and PKCS#12 – to store private keys;
FIPS 140-2 level 1 – for cryptographic libraries;
CRL, OCSP (RFC 2560) – to validate certificates.
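As a rough illustration of the functional profile above, the following Python sketch checks whether a certificate's declared signature algorithm, hash algorithm and key length fall inside the accepted ranges. The metadata dictionary and its field names are hypothetical, not a real certificate API:

```python
# Accepted values, mirroring the list above (SHA-1 and MD5 are dated by
# today's standards but are what the paper lists).
ALLOWED_SIG = {"RSA", "DSA", "ECDSA"}
ALLOWED_HASH = {"SHA-1", "MD5"}

def conforms(cert_meta) -> bool:
    """Return True if the (hypothetical) certificate metadata fits the profile."""
    return (cert_meta["sig_alg"] in ALLOWED_SIG
            and cert_meta["hash_alg"] in ALLOWED_HASH
            # The 1024-4096 bit bound applies to RSA keys.
            and (cert_meta["sig_alg"] != "RSA"
                 or 1024 <= cert_meta["key_bits"] <= 4096))

assert conforms({"sig_alg": "RSA", "hash_alg": "SHA-1", "key_bits": 2048})
assert not conforms({"sig_alg": "RSA", "hash_alg": "SHA-1", "key_bits": 512})
```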
The implemented architecture is shown in Figure 2 and Figure 3.
Figure 2 - Bridge PKI Final Architecture specifying relevant functionalities
Figure 3 - Final Bridge PKI architecture – interoperability with: PCU (Single Point of Contact in Romania),
SEAP (Electronic Public Procurement System), IMM (Portal for IMMs in Romania), NTC (National Trade
Register Office) [5]
As can be seen in Figure 3, interoperability is achieved with some systems developed by the
Ministry for Information Society and also with the National Trade Register Office, an
institution that operates with a very high flow of documents from citizens.
4. Conclusions
The implementation of this project has provided a work environment that, through
interoperability, ensures the following features:
The capacity of a public key application to interact with another public key application;
The possibility of mixing components from different vendors to create an organization's
public key infrastructure;
Interaction between public key domains belonging to different organizations, to enable
secure transactions between them.
References
[1] What is PKI?, available: https://www.comodo.com/resources/small-business/digital-certificates1.php
[2] RFC 5217, M. Shimaoka, N. Hastings, R. Nielsen, Network Working Group
[3] C. Connolly, P. van Dijk, F. Vierboom, S. Wilson, PKI Interoperability Models, Galexia,
February 2005
[4] F. Pop, C. Dobre, D. Popescu, V. Ciobanu and V. Cristea, “Digital Certificate
Management for Document Workflow in E-Government Services”, Proc. of Electronic
Government, 9th IFIP WG 8.5 International Conference, EGOV 2010, Lausanne,
Switzerland, pp. 363-374, August/September 2010, Springer
[5] Portal e-Romania, available:
http://portaleromania.ro/wps/portal/Eromania/!ut/p/c5/04_SB8K8xLLM9MSSzP y8xBz9
CP0os_hAAwNfSydDRwP_UGMzA0_XUEtv4-
AQYwMDA_1wkA6zeAMcwNFA388jPz dVvyA7rxwA-
YBK0g!!/dl3/d3/L2dBISEvZ0FBIS9nQSEh/?WCM_GLOBAL_CONTEXT
=/wps/wcm/connect/portaluri/portaluri/articole/articoleportal&categ=BridgePKI&IdSite=
1d2c2d0042e44462a78faf6a9cee2aac
AN ENTERPRISE APPROACH TO DEVELOPING COMPLEX
APPLICATIONS
Alexandru-Mihai MARINESCU, Endava Romania
Anca ANDREESCU, Bucharest University of Economic Studies
Abstract. Nowadays, developing applications for various clients has become a matter of
delivery speed at the expense of quality. In many situations, the deliverable has to meet strict
deadlines but most often does not, due to poor planning. Many steps of the software
development cycle are elided, leading to software defects, a rigid structure that is unable
to adapt to future requirements and, probably most significant, poor performance of the
application. In the following paper we highlight the most important steps one can take
when developing a software solution in order to meet a high standard of quality. We
follow the early development of an application that delivers university schedules to students,
directly to their smartphones. The software solution comprises a server based on the
REST architectural style, developed on the .NET Framework, and three native mobile
applications, one for each popular operating system.
Keywords: .NET, Development, mobile, performance, quality, REST
JEL classification: C88, A23
1. Introduction
Building an enterprise grade application with no prior experience might be a daunting task.
Developing a plan and executing it will make this task a lot easier. Applying a set of rules
and following certain steps towards building a proper software solution requires an idea to act
as a nucleation point for the application.
In this article we will look at some of the steps that can go into such a plan and how they
were implemented in a software solution for delivering university schedules to students’
smartphones.
The “Orarum” project started from these simple questions: why do we still check our
schedule at the university’s notice board? Moreover, why do we write them down at the back
of a notebook when we could be using a device that almost everybody owns: a smartphone?
The need for a mobile application for querying university schedules is certainly not new but
through personal experience we managed to narrow it down to a few basic requirements that
should make up a schedule application.
2. Proofing the business idea
Identifying a good business idea is the first step towards developing a great application. We
won’t go into details on how to do that, since it is beyond the scope of this article, but we will
highlight what we believe to be the two most important aspects to consider.
2.1 Identify the uniqueness of the idea and study the competition
What makes the idea unique? Uniqueness doesn’t necessarily mean one has to invent
something, just to set oneself apart from the competition. She/he should choose a set of
characteristics she/he wants the application to offer and compare it to what other software
solutions have to offer, in the case that competition exists.
For “Orarum”, the conclusion to this step was simple: besides the basic read-only schedule
displayed on the university’s website, there were no management tools to help the student
keep track of his/her schedule, through the means of a mobile application.
2.2 Get consumer feedback
Identifying a need through personal experiences is, most of the time, not enough to
market-proof one’s idea. Reach out to the target demographic, not just friends and family, but people
who would actually want to use the software product. They're much less likely to be biased.
There are two easy and conclusive ways to do so: develop and distribute a questionnaire and
interview key users, presented as follows.
2.2.1 Develop and distribute a questionnaire
The survey should be easy to complete and not take much time. In order to validate the
“Orarum” idea, we created a six-question survey which took approximately 30 seconds to
complete. We used a popular web-based tool for creating and distributing it, thus
reaching a little over 100 students and providing vital information on which to develop the
application.
2.2.2 Interview key users of your future application
A questionnaire can’t possibly be short and cover every aspect of what one is looking for.
That’s why a complementary method is to interview a few key users from the target public.
Determining who they are will vary depending on the idea, of course. The interview for the
“Orarum” project targeted students of several universities from Bucharest. The gathered
information helped determine how to store the data in order to record all the aspects of the
students’ schedule.
Subsequent to this process, the initial requirements identified might change, or new ones
might get added. It is normal at this point for the business logic to adapt to the environment.
3. Setting up the project
Similar to sculpting a masterpiece, developing a software solution from scratch will require a
few tools to help out along the way and produce far better results. Besides the obvious
integrated development environment and database management tool, here is a list of auxiliary
software applications crucial for aiding the software development process.
3.1. Revision control tools
Revision control simply means versioning all the files used in the software solution, in order
to keep better track of changes. If working on the project is an individual task, a source code
management tool will simply provide a way to save one’s work to a central server,
keep track of it and ensure consistency. The benefits of such a tool are greatly enhanced when
there are multiple people working on the same solution.
Signing up for a repository is the first step towards having revision control. It can be free or
subscription based. Either way, choosing one should be based on the following criteria: cost,
efficiency, security and guaranteed up-time.
Depending on what hosting site you choose for the distributed version control system (DVCS),
you will need to download a client to interface with it. Generally, there are three directions
one can take: Git, SVN or Team Foundation Server.
For the “Orarum” project a Git repository was chosen, with a Git client for Microsoft Visual
Studio.
3.2 Issue tracking tools
An issue tracking system is a software solution for managing and maintaining lists of stories
and defects. It helps an organization, or an individual to stay on top of important issues,
collaborate better, deliver higher quality software and increase productivity.
Bugzilla, Apache Bloodhound, Team Foundation Server are just a few from a very long list.
They use different back-end databases for persistence, are written in various languages and
offer a plethora of features.
Atlassian JIRA is the issue tracking product that was used in the development of the
university schedule project. Signing up for Atlassian Bitbucket was free for up to five users
and it offered unlimited private repositories, code reviews, JIRA integration, dedicated
support, custom domains and a REST API.
3.3 Database design tools
Having a proper tool for developing the database will save you a significant amount of time
in the development of the solution. From our experience designing the database for the
“Orarum” project, we found that a visual tool adds a lot of benefits. The ability to collaborate,
a friendly UI, a smooth workflow, model management and SQL generation
were just some of the features we were looking for in a database design tool. Since it
supported Microsoft SQL Server and offered an impressive list of features, the option we
chose was Vertabelo, a website based tool.
4. Developing the application
4.1 Setting up coding conventions and identifying best practices
Performance is important to your application. A “performance culture” should permeate the
whole development process, right through from setting up coding conventions and identifying
best practices to acceptance testing [1],[2].
Every aspect of the development cycle should be covered. Depending on the language
chosen, many coding best practices can be easily identified. Probably the best thing one can
do is to look them up in books such as Effective C#, Effective Java or Clean Code.
Doing this setup prior to starting actual development will make an individual or a team
more productive. Here is a list of ten “rules” for developing performant enterprise
applications [3]:
1. Design first, then code
2. Have the right tools available to measure performance
3. Write code that is clear and easy to maintain and understand
4. Gain an understanding of underlying tasks that the framework performs
5. Set performance goals as early as possible
6. Only optimize when necessary
7. Avoid optimizing too early
8. Do not delay optimization too long
9. Assume that poor performance is caused by human error rather than the platform
10. Employ an iterative routine of measuring, investigating, refining/correcting.
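Rules 2 and 10 both presuppose a way to measure. A minimal measurement harness, sketched here in Python for brevity (the project itself targeted .NET, where a profiler or `Stopwatch` would play this role), might look like this:

```python
import time
from contextlib import contextmanager

timings = {}   # label -> elapsed seconds, so runs can be compared iteratively

@contextmanager
def measured(label):
    # Wrap any block of code and record how long it took (rules 2 and 10).
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with measured("sum of squares"):
    total = sum(i * i for i in range(100_000))

print(f"sum of squares: {timings['sum of squares'] * 1000:.2f} ms")
```

Measuring before and after each change keeps optimization honest: per rules 6-9, nothing is "optimized" unless the numbers say it needed to be.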
4.2 Developing the database
Every well-built application rests on a solid database. Having a strong and efficient design
will vastly improve the performance of the entire system. A database application should be
properly designed to take advantage of the database server capabilities. Applying a set of
principles and best practices will not only improve application performance but also have
impact on security, availability and reliability, code readability and maintainability, memory
and disk usage [4].
There is a huge choice of database management systems (DBMS), which includes packaged
and open source database suites. The main suppliers include Fujitsu, Hewlett-Packard,
Hitachi, IBM, Microsoft, NCR Teradata, Oracle, Progress, SAS Institute and Sybase [5].
The primary aspect one should consider when choosing the right DBMS for his/her solution
is how well it integrates with the framework. For example, the .NET Framework is
optimized for SQL Server. Obviously, this choice will be influenced by many other factors,
primarily, cost. If such is the case, one should opt for an open source database and pay close
attention when choosing the most optimized ORM.
The database structure will not be set in stone but it will be more difficult to refactor once
development has started. Therefore, the process of constructing the database should not be
rushed. Instead, opt for an iterative approach and always try to apply best practices and
standards. This is where the database design tool plays a big role.
4.3 Building the application’s structure
Now that we have a good database to work with, it is time to move on to developing the
application. The first step is to define your folder structure and abide by the conventions you
set earlier. In the short term, ignoring best practices will allow you to code rapidly, but in the
long term it will affect code maintainability [6].
As before, there is no one correct way to do so. It will depend on what type of project you
will develop, what framework you are using or what platform you are designing for. Bottom
line is that you should have a neat project solution that will be maintainable.
4.4 Developing prototypes
A proof of concept (POC) is a demonstration aimed at verifying that certain concepts or
theories can be achieved. A prototype is designed to determine feasibility, but does not
represent the final deliverable [7].
Developing several prototype applications, prior to working on the main one will clear any
questions regarding implementation and boost confidence that the product can be delivered.
A POC does not represent the final deliverable and it is usually not tested.
Prototypes still require specifications and will most likely not exclude the need for an
extensive system analysis. They shouldn’t replace model-driven designs.
In the context of an enterprise application this can mean anything from a new piece of
software that needs to integrate perfectly with the system to a new feature that needs to be
added. Probably the most significant proof of concept that was realized for the “Orarum”
project was the integration of the mobile applications with the Facebook API. It demonstrated
that a user can log-in, access his/her groups and friends and post messages to his/her account.
4.5 Implementing the user stories
Implementing the previously created user stories should be done iteratively. The software
should be delivered in a rapid and incremental manner, always remaining adaptable in
response to change. Even though you have a number of specifications in mind, the development
should begin by implementing just part of the software, which can be reviewed in order to
identify further requirements. Everything we set up will allow the development team to work
in a more agile fashion.
At the end of each iteration, the project can receive vital feedback. Any defects can be
tracked at an early stage, thus avoiding major problems further down the development cycle.
From the perspective of the development team, the adoption of iterative and incremental
development is empowering, enabling team members to actively and aggressively attack
project risks and challenges in whatever they judge to be the most appropriate manner.
Managing iterations by setting clear objectives and objectively measuring results ensures that
you are free to find the best way to deliver results [8].
4.6 Writing unit and integration tests
Each piece of software you write should be thoroughly tested. Developing both unit tests and
integration tests in each iteration of the software development process is mandatory for
producing quality applications [9].
A unit test is a single piece of automated code that exercises a function and checks a single
explicit assumption about it. An integration test, by comparison, takes many
functions/systems, connects them end to end, inputs data on one end and gets it out the other.
It still makes an explicit assumption but also makes many implicit assumptions as well.
A software test should be readable, maintainable and trustworthy. If any two of these
qualities are high, they will usually pull the third one up with them.
Here are four anti-patterns one should avoid when developing tests:
1. The opaque anti-pattern
The code inside the test should be easily readable and informative. The variable names
should be self-explanatory and the test name should be consistent. An example to follow
can be: NameOfFunctionUnderTest_ContextOfTest_DesiredResultOfTest().
One should follow the arrange-act-assert setup when writing the test, in order to make it
easily scannable.
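The naming convention and arrange-act-assert layout can be sketched as follows. The example is in Python rather than the project's C#, and `apply_discount` is a hypothetical function standing in for real production code:

```python
def apply_discount(price: float, percent: float) -> float:
    # Hypothetical function under test (not from the "Orarum" code base).
    return round(price * (1 - percent / 100), 2)

def ApplyDiscount_TenPercentOnOneHundred_ReturnsNinety():
    # Arrange: set up the inputs.
    price, percent = 100.0, 10.0
    # Act: call the function under test.
    result = apply_discount(price, percent)
    # Assert: one explicit assumption per test.
    assert result == 90.0

ApplyDiscount_TenPercentOnOneHundred_ReturnsNinety()
```

The test name alone tells a reader what is called, under what context, and what outcome is expected, so a failure report is informative before anyone opens the code.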
2. The wet anti-pattern
In production code, one would use the DRY principle (Don’t Repeat Yourself). The same
convention should be applied to testing. Pull similar segments of code into a single
function that each test can use. Group similar functions and create helper classes in your
testing solution. Basically, keep the same production sensibilities in test code.
3. The deep anti-pattern
This pattern applies more to unit testing but it can be extrapolated to integration testing as
well. In a nutshell, unit tests should make a single assumption. If a test fails, we should
know exactly why it failed and where.
A best practice we found when developing the “Orarum” application was to override
Equals() and GetHashCode() functions in all of the entity classes used across the solution.
In consequence, instead of asserting each property of the class, a test can now assert the
object as a whole. This also changes the behavior of a collection assert.
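The same value-equality idea can be sketched outside of C#: in Python, a frozen dataclass generates the equivalents of `Equals()` and `GetHashCode()`, so a whole entity (here a hypothetical `ScheduleEntry` from the "Orarum" domain) can be asserted in one step:

```python
from dataclasses import dataclass

# frozen=True makes the dataclass generate value-based __eq__ and __hash__,
# the Python analogue of overriding Equals() and GetHashCode() in C#.
@dataclass(frozen=True)
class ScheduleEntry:          # hypothetical entity, names are illustrative
    course: str
    room: str
    start_hour: int

expected = ScheduleEntry("Databases", "2013A", 9)
actual = ScheduleEntry("Databases", "2013A", 9)
assert expected == actual        # one assertion for the whole object
assert [expected] == [actual]    # collection asserts change behavior the same way
```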
4. The wide anti-pattern
This last anti-pattern applies only to unit testing since it specifies that a unit test should
not make implicit assumptions. When a test fails, we should know just from the output
what code to fix in production.
Creating production code that uses dependency injection will allow one to write unit tests
that can mock behavior and decouple from other components. This is called inversion of
control.
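A minimal sketch of constructor injection with a hand-rolled test double, again in Python rather than the project's C# (the `ScheduleService` and repository names are illustrative, not from the actual code base):

```python
class ScheduleService:
    def __init__(self, repository):
        # Constructor injection: the collaborator is supplied from outside,
        # so a unit test can substitute a stub and stay decoupled from the
        # real data store.
        self._repository = repository

    def courses_for(self, student_id):
        return sorted(self._repository.find_courses(student_id))

class StubRepository:
    # Hand-rolled test double standing in for the real persistence layer.
    def find_courses(self, student_id):
        return ["Java", "Databases"]

service = ScheduleService(StubRepository())
assert service.courses_for(42) == ["Databases", "Java"]
```

Because the service never constructs its own repository, a failing assertion here can only implicate the service's own logic, which is exactly what the wide anti-pattern warns about.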
5. Conclusions
Developing an enterprise grade application might be an overwhelming task, but having a plan
and taking one step at a time will produce a faster, cheaper and better-performing software
solution. Together we have looked at how to find and business-proof the idea, how to set up
the project and all the necessary tools and how to solidly develop an application in order to
produce superior results. Every aspect of your project should abide by certain conventions
and best practices. These rules will have to be found throughout the entire solution: database,
application directory structure and file names and both production code and testing code.
Prototypes might be very useful but they can also take up time, thus the right balance must be
found.
Every topic that we covered in this research paper could be an investigation in itself, and there
is certainly a lot of depth to each and every one of them. On top of this, there are many other
aspects that can comprise an application development cycle.
The “Orarum” application was a great way to exercise everything that we’ve learned until
then, as well as discover new concepts and methodologies. As stated above, there is a lot
more to discover so we encourage the reader to find an idea and implement a performance
oriented software solution by all means possible. After all, “The bitterness of poor quality
remains long after the sweetness of meeting the schedule has been forgotten.” – Anonymous
References
[1] B. Wagner, Effective C# (Covers C# 4.0): 50 Specific Ways to Improve Your C#, 2nd
Edition, Pearson Education Inc., 2010.
[2] B. Wagner, More Effective C#: 50 Specific Ways to Improve Your C#, Pearson Education
Inc., 2009.
[3] R. Page and P. Factor, “.NET Performance – The Crib Sheet”, May 2008, available:
https://www.simple-talk.com/dotnet/performance/net-performance-
cribsheet/#_Toc198269213 [March 14, 2015]
[4] An Oracle White Paper. Guide for Developing High-Performance Database Applications,
available: http://www.oracle.com/technetwork/database/performance/perf-guide-wp-final-
133229.pdf [March 14, 2015]
[5] A. Mohamed, “Choosing the right database management system”, available:
http://www.computerweekly.com/feature/Choosing-the-right-database-management-
system [March 14, 2015]
[6] R. C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship, Pearson
Education Inc., 2008.
[7] Business Analyst Learnings. Proof of Concept: Benefits & Risks of Prototyping in
Business Analysis, available: http://businessanalystlearnings.com/blog/2013/9/1/proof-of-
concept-benefits-risks-of-prototyping-in-business-analysis [March 14, 2015]
[8] IBM developer Works, "What is iterative development?,", 2012, available:
http://www.ibm.com/developerworks/rational/library/may05/bittner-spence/ [March 14,
2015]
[9] R. Osherove, The Art of Unit Testing: with examples in C#, Manning Publications, 2013.
A PRELIMINARY ANALYSIS OF BUSINESS INFORMATION
SYSTEMS MASTER PROGRAMME CURRICULUM BASED ON THE
GRADUATES SURVEY
Marin FOTACHE, Al.I.Cuza University of Iași
Valerică GREAVU-ȘERBAN
Alexandru Ioan Cuza University of Iasi, Romania
Florin DUMITRIU, Al.I.Cuza University of Iași
Abstract. The most recent economic crisis forced Western companies to cut their IT budgets.
In many cases this was achieved by outsourcing IT projects to countries like Romania. For
Romanian Information Systems graduates, outsourcing has steeply increased the job offer.
But it has also changed the required ratio of skills (competencies) between technical and
business issues. When the crisis erupted, a lot of ERP projects had been launched, but since
then only a few have been completed and recently very few have been initiated. New IS jobs
require more technical skills and call for curricula recalibration. This paper presents some
preliminary results of a survey conducted in 2013 and 2015 among the graduates of the IS
master programme at Al.I.Cuza University of Iasi. The survey's main objectives were to
identify the skills required on the IT market, to collect graduates' opinions about the
programme (including curricula, syllabi, internship, infrastructure, teaching staff, and
programme strengths and weaknesses) and to re-align the curricula to industry demand.
Keywords: information systems curricula, graduates survey, R, SharePoint
JEL classification: M15
1. Introduction
As an academic discipline and research topic, Information Systems (IS) has followed a
convoluted trajectory. Its never-ending identity crisis [1] [2], combined with a steep decline in
enrolments after 2000, especially in US programmes [3], raised questions about its future.
Amidst funeral moods, Romanian (and other Eastern-European) IS programmes have thrived
at both undergraduate and graduate levels [3]. Romania’s increasing attractiveness for IT
outsourcing (lower wages, technical skills, proficiency in foreign languages) created a big
appetite for IT professionals and consequently IS enrolment has constantly risen.
As technologies change quickly, so do the industry requirements. IS programmes must adapt
their curricula based on similar programmes curricula, including model curriculum
guidelines published by Association for Information Systems [3], listening to the industry
needs (involving industry representatives), and getting feedback from graduates.
This paper presents some results of a survey targeting Business Information Systems master
graduates at Al.I.Cuza University of Iasi (UAIC). Technical solution for analysis was
developed using Microsoft SharePoint Server platform and R/RStudio language/platform.
2. Information systems undergraduate and graduate programmes curricula
As technical and business topics can be mixed in various proportions, the diversification of IS
programmes manifests itself not only among universities from different countries, but also
within the same country [3]. At UAIC, both the undergraduate and graduate Information
Systems programmes curricula were developed following ACM/AIS recommendations [3][5][6].
Figure 1 shows the main courses proposed in the IS2010 undergraduate curriculum model [5]
and their relation to the career tracks of IS graduates.
Figure 1 - Structure of ACM/AIS IS2010 undergraduate curriculum [5]
At master level, MSIS2006 [6] nominated 24 possible career tracks (such as: Academia;
Knowledge Management; Computer Forensics; Managing the IS Function; Consulting; Data
Management and Warehousing; Mobile Computing; Database and Multi-tiered Systems;
Decision Making; Project Management; Security; Systems Analysis & Design; Enterprise
Resources Planning; Telecommunications) each with suggested courses. Despite the relative
obsolescence of the MSIS2006 (the next curriculum model for IS graduate programmes is
expected within a year or two), we claim that since 1997 (undergraduate) and 2007 (master),
IS programmes at UAIC have been properly aligned to ACM/AIS recommendations.
3. Graduates survey brief description
The graduates survey was designed in 2013 and made available to graduates from September
to December 2013 and from January to March 2015. Some of the sections of the questionnaire are:
Personal details (e.g. age, gender); Graduation path for both bachelor and master levels;
Career path (year of first employment, company profile, location and stakeholders, income
level); Free/open messages (for teachers, colleagues, future students, five positive and five
negative features of the programme); Level of satisfaction about teaching staff and activities,
different areas of interest (programming, modelling, databases, etc.) research opportunities
and administrative components of master degree, all using a Likert scale from 1 to 5.
The questionnaire was distributed through social networks (mainly on Facebook, where
graduates have pages organized by enrolment year). There were 84 answers, but only 74 were
kept, as 10 of them were flawed (seven respondents did not answer any questions, and
three seem to be still students rather than graduates).
The proportion of respondents' genders follows the proportion of students' genders for BIS
(Figure 2, left): 58.9% of respondents are female and only 41.1% male. This contradicts a
much-debated anxiety - the scarcity of women in computing.
Figure 2 - Respondents’ gender (left) and enrolment year (right)
Analysing the number of answers per admission year (Figure 2, right) one can notice some
discrepancies between 2009/2010 and 2011/2012. Based on the number of enrolled students
and the “fresh memory” factor, we expected the number of 2011/2012 respondents to be
bigger than for 2009/2010. The figures indicate the opposite. Somehow ironically, this
psychological factor of memory freshness might be part of the explanation.
4. Technical platforms
As students, BIS graduates had extensively used the (Microsoft) SharePoint portal, so we
chose it as the platform for the questionnaire instead of an equivalent free solution. SharePoint assures
a better control over the respondents (as former UAIC students). Free Internet surveying tools
allowed anonymous answers but malicious people might fill in multiple malformed answers
and alter the results. Even if the free tools surveys could be protected by the uniqueness of an
IP address, it is largely acknowledged that any person can access many devices connected to
the Internet, or can use proxy servers to access and respond multiple times to a survey.
A basic requirement for surveys is the anonymity of respondents [7] [8]. SharePoint is a
powerful tool that authenticates the user but also anonymizes the respondent without coding.
Another SharePoint strength is the ability to define question branching logic if necessary. Also,
SharePoint provides a graphical summary of answers in real time, determining key users to
react in promoting the survey on multiple media: e-mail, social media, and specific websites.
Almost any question format can be implemented in a SharePoint survey: Single line of text,
Multiple line of text, Choice, Rating Scale, Number, Date and Time, Yes/No. All answers are
saved into the server database. Results can be exported as RSS, spreadsheets or .csv files.
For data visualization and analysis we chose an increasingly popular open-source platform,
R/RStudio. R is the most dynamic data language [9], [4]. The main R packages used in this
analysis are: stringr, reshape, plyr, dplyr and ggplot2.
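The kind of group-and-summarise step performed with dplyr can be illustrated in plain Python; the records below are made up for illustration, not the survey's actual data:

```python
from collections import Counter

# Stand-in for a dplyr group_by/summarise: count respondents per enrolment
# year and turn the counts into percentages.
responses = [
    {"year": "2009/2010", "gender": "F"},
    {"year": "2009/2010", "gender": "M"},
    {"year": "2011/2012", "gender": "F"},
]
per_year = Counter(r["year"] for r in responses)
share = {y: round(100 * n / len(responses), 1) for y, n in per_year.items()}
print(share)   # {'2009/2010': 66.7, '2011/2012': 33.3}
```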
5. Preliminary results concerning graduates employability
As expected, most of the respondents work in the IT industry (90.54%). As Figure 3 (left)
shows, graduates' employability is quite satisfactory (98%), and 85% of graduates had been
hired before completing the master programme. The first and second years of study are the
most frequent moments of getting the first job. This data is consistent with a recent survey
targeting current second year BIS master students: out of 66 current second year students,
57 (86%) said they are already hired.
Figure 3 - Respondents’ first job moment (left) and job area (right)
This is good news for the programme: it not only reflects the increasing demand for IS/IT
professionals, but also suggests a proper alignment of the IS programme to industry needs.
Recent years have witnessed a growing number of applications for both undergraduate and
graduate IS programmes.
Graduates' job areas are very important in calibrating the curriculum. As seen in
Figure 3 (right), most of the jobs are technical. This is natural, since graduates get managerial
positions (project/team managers) only after a few years of experience and proof of their
competencies in working within and/or assembling a team. Jobs are concentrated in five
main areas - Quality assurance, Programming, Databases, and IS analysis and design -
which cover 85% of the total answers. Most graduates' professional careers are related to
software development.
Figure 3 confirms the hypothesis of a growing number of IT projects outsourced from Western
countries. Moreover, one can infer that the outsourced projects target the final stages of software
development (i.e. programming, testing and deployment), while the first stages (requirements
analysis, design) remain mostly at the headquarters.
Figure 3 also shows a specific trait of what employers demand from IS graduates. Western IS
programmes generally focus on business/managerial issues, whereas most Romanian IS graduates
must cope with technically oriented jobs, i.e. programming, databases, computer networks.
Before 2010, a larger proportion of graduates was hired in ERP/CRM projects as consultants
and implementers, performing mostly business-related activities. As the crisis hit most of the
ERP projects, the demand for business-related jobs shrank. Fortunately, the ERP decline was
offset by the growth of technically oriented outsourced projects.
Recent changes in IT jobs urge a recalibration of the proportion of business and technical
courses in IS curricula. They also suggest that the ERP syllabus should become more
process-oriented and provide a better background for further courses such as Business
Process Management.
6. Programme evaluation by the graduates
To evaluate the master programme, graduates were asked to assess, using a five-level scale
(very bad, bad, average/neutral, good, excellent), the following items: the programme (at a
general level), the courses (utility for their professional activity, teaching, link to practice,
research content), the professors (teaching, availability, attitude towards students), the
infrastructure (labs, classrooms, public spaces), the administrative staff's performance and
attitude towards students/graduates, the internship, etc.
Figures 4 and 5 show high scores for the teaching staff, the master programme and the courses
(94%, 69% and 59%, respectively). Adding half of the average-level assessments, the positive
percentages reach 96%, 84% and 77%. This suggests proper teaching skills, availability and a
proper attitude towards students, a very positive image of the programme and a positive
opinion about the courses.
Figure 4 - Evaluation of teaching staff, programme and courses
These findings are also reinforced by the mean values - 4.45, 3.90 and 3.73 - out of a
maximum of 5, and by the relatively low standard deviations (Figure 5).
Figure 5 - Evaluation of teaching staff, programme and courses as a heatmap
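The aggregation used above - the mean on the 1-5 scale and the positive share including half of the neutral answers - can be reproduced by a short sketch; the counts below are invented for illustration and do not reproduce the survey's actual distribution:

```python
from math import sqrt

# Illustrative counts for one item on the five-level scale
# (1 = very bad ... 5 = excellent); these numbers are made up.
counts = {1: 2, 2: 5, 3: 20, 4: 40, 5: 33}

n = sum(counts.values())
# Mean score on the 1-5 scale
mean = sum(level * k for level, k in counts.items()) / n
# Population standard deviation of the scores
std = sqrt(sum(k * (level - mean) ** 2 for level, k in counts.items()) / n)
# "Positive" share = good + excellent, plus half of the neutral assessments
positive_pct = 100 * (counts[4] + counts[5] + counts[3] / 2) / n

print(round(mean, 2), round(std, 2), round(positive_pct, 1))
```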
The overall programme assessment lies between the teaching staff and course assessments, and
the latter contributes negatively to the programme score. Therefore, we think it will be
necessary to conduct a more detailed survey in order to find out which courses have to be
updated, or which revisions of the IS curricula are required, so as to properly align the
programme with industry needs. The new study also has to validate the hypothesis of an
increasing need for technical courses and a diminishing need for business-oriented ones.
7. Limits, discussions and conclusions
This article analyses curriculum issues of the Business Information Systems master programme
from the perspective of alumni, based on their experience during the studies and on how the
programme influenced their careers. Privacy and malevolent actions were taken into
consideration when the survey was developed and disseminated. There are some limitations to
this study. The number of respondents is satisfactory, but one can ask whether the graduates
who filled in the questionnaire faithfully represent the entire population of BIS graduates. It is
possible that unlucky or ill-prepared graduates who have not succeeded in getting a rewarding
job avoided answering. Also, the distance between the survey and graduation time could affect
the quality of some answers. Our study revealed a proper alignment of the IS programme with
industry needs and a very satisfactory graduates' employability. We also found that most
graduates' professional careers were related to software development, especially its final
stages, and that technical skills were in higher demand than business-oriented ones. These
findings might explain why the courses had a negative contribution to the programme's
reputation compared with the teaching staff. Unlike Western IS programmes, Romanian ones
must cope with technical skills, i.e. programming, databases, computer networks, etc.
In the next stage of our study we will check the hypothesis of increasing the weight of
technical courses in the master programme curricula at the expense of business-oriented
subjects.
Acknowledgment
The R solution for master analysis was developed within the ASIGMA (Asigurarea Calității în
Învățământul Masteral Internaționalizat: Dezvoltarea cadrului național în vederea
compatibilizării cu Spațiul European al Învățământului Superior) project,
POSDRU/86/1.2/S/59367
References
[1] I. Benbasat and R.W. Zmud, “The Identity Crisis Within the IS Discipline: Defining and
Communicating the Discipline’s Core Properties,” MIS Quarterly, 27(2), pp.183-194,
2003
[2] R. Hirschheim and H.K. Klein, “Crisis in the IS Field? A Critical Reflection on the Status
of the Discipline,” Journal of the Association for Information Systems, 4(5), pp.237-293,
2003
[3] M. Fotache, “Information Systems / Business Informatics Programmes in Europe.
Uniformity or Diversification?,” in Proc. of the Second Symposium on Business
Informatics in Central and Eastern Europe - CEE Symposium 2011, Cluj-Napoca
[4] M. Fotache, “Using R for Data Analysis of Master Graduates Survey,” in Proc. of the
23rd International Business Information Management Conference (IBIMA), Valencia,
2014
[5] H. Topi et al., “IS 2010: Curriculum Guidelines for Undergraduate Degree Programs in
Information Systems,” Communications of the Association for Information Systems, Vol.
26, 2010
[6] J.T. Gorgone et al., “MSIS 2006: Model Curriculum and Guidelines for Graduate Degree
Programs in Information Systems,” Communications of the Association for Information
Systems, Vol.17, 2006 (also published in ACM CIGCSE Bulletin, 38(2), 2006).
[7] A. Barak, Psychological Aspects of Cyberspace: Theory, Research, Applications,
Cambridge University Press, 2008
[8] N. Schwarz and S. Sudman, Context Effects in Social and Psychological Research, New
York: Springer, 2011
[9] B. Muenchen. Job Trends in the Analytics Market: New, Improved, now Fortified with C,
Java, MATLAB, Python, Julia and Many More!, 2014, Internet:
http://r4stats.com/2014/02/25/job-trends-improved/ [March 5, 2014]
TEAMWORK CULTURE IN IASI IT COMPANIES
Doina FOTACHE ”Alexandru Ioan Cuza” University of Iași
Luminița HURBEAN West University of Timișoara [email protected]
Abstract. The liberty and the individuality of the modern society are supported by a complex
communication system based on sophisticated technological solutions which contain the
mechanism of functional integration in the present day organizations. The teamwork culture
plays an important role in an IT company success nowadays, in the larger context of
globalization and growing complexity and functionality expected from modern software
solutions. However, teamwork cannot be treated separately from software development
methodologies, and we note the rise of the Agile and, in particular, the SCRUM
methodologies. Therefore, besides the collaborative technological platforms adopted by IT
companies on a large scale, we cannot ignore the methodological, organizational and
motivational features of teamwork. Our study of teamwork culture focused on the IT
companies in Iaşi, a centre recognized for its contribution to IT education in Romania and
the EU. The paper includes a study based on the authors’ experience in giving specialty
lectures at BA and MA levels in Informatics and Business Information Systems, as well as
on the experience of practitioners invited as guest lecturers. We suggest solutions to
support bachelor graduates’ successful employment in IT companies. Our conclusions
encourage us to consider further research focused on the actual evolution of organizational
culture strategies correlated with the methodologies for developing software projects by
teams in the IT industry, irrespective of the company they belong to.
Keywords: teamwork, teamwork culture, agile methodologies, agile team, SCRUM
JEL classification: L86, O33
1. Introduction
In the last 15 years, IT companies have been striving to become agile, in order to improve the
flexibility and responsiveness of their software development teams, so that they can react
promptly and efficiently to the changing requirements and turbulence of today’s dynamic
and global environment. Many software development models, such as rapid prototyping, joint
application design (JAD), rapid application development (RAD), extreme programming (XP),
or SCRUM, have been proposed to improve the flexibility of IT project teams, for faster and
simpler software development processes. In agile software development, teamwork and
communication are fundamental values for the project success.
2. A brief review of the teamwork concept linked to software development
The topic of teamwork was extensively researched long before the rise of the IT industry, but
the IT field has been widely used as a research ground because it relies on teamwork. Classical
1960s research such as Tuckman's is revisited and reinterpreted in the effort to discover the
secret of creating a successful team [1]. A team is defined as a small number of people with
complementary skills who are committed to a common purpose and to established performance
goals, for which they are mutually accountable. Team performance is extremely important in
software development; therefore, many studies have been conducted in this area.
Teamwork is expected to create a work environment that encourages “listening and
responding constructively to views expressed by others, giving others the benefit of the
doubt, providing support, and recognizing the interests and achievements of others”[2]. Such
a work environment promotes individual performance, which enhances team performance,
and this one supports the organization’s performance.
So, basically, team performance is based on human interaction. In the traditional software
development methodologies, when a plan-driven product-line approach was used, the team was
led with the “Command and Control” method: the team leader had absolute authority and
applied an autocratic leadership style. As the need for organizational agility and responsiveness
grew, more flexible and modern management strategies were set up. Prediction, verifiability,
and control were replaced by uniqueness, ambiguity, complexity, and change. In contrast to
“Command and Control”, the “Engage and Create” method came into view, with the main
focus on getting people engaged and invested. The team leader engages team members by
“inviting them to offer their perspectives, participate in team decisions and requiring them to
adopt a strong sense of accountability” [3]. In IT teams, this change can be considered more of
a conversion, and it is regarded as one of the biggest challenges when introducing an agile
software development methodology, because organizational culture and people’s mindsets
cannot be changed easily [4]. In summary, the characteristics of agile versus traditional
methodologies are presented in Table 1.
Table 1- Agile versus Traditional Methodologies
Characteristic Traditional Methodologies Agile Methodologies
Approach Predictive Adaptive
Emphasis Process oriented People oriented
Management style Autocratic Decentralized
Team size Large Small
Culture Command and control Engage and create
Change approach Change sustainability Change adaptability
Planning Comprehensive Minimal
Documentation Heavy Low
Success measurement Compliance with plan Business value
Nowadays, the technological developments on one side, and the amplified globalization of
software development and the outsourcing practice on the other side, create software
engineering challenges due to the impact of temporal, geographical and cultural differences,
and require specific approaches to address these issues ([5], [6], [7]). Geographically
widespread businesses influence the organizational structure of IT companies and generate a
multicultural environment with globally interconnected teams. Globalization can no longer be
stopped and, at the same time, we speak of multiculturalism and ethnocentrism because
teamwork goes beyond organizational and even national limits [8]. The hypothesis of
cultural egalitarianism (made possible by information technology) is not to be trusted, due to
cultural assimilation tendencies (we mention here the Anglo-Saxon linguistic globalization
present in software companies which, for a long time, produced only English versions of
their software). Many traditional cultures, like the French, German, or Spanish ones, have
been reduced to silence in the long-term development plans of program producers for all users [9].
Characteristics of agile development versus global development and their antagonism are
presented in Table 2.
Table 2 - Characteristics of Agile versus Global Software Development
Characteristic Global Development Agile Development
Team structure Team specialization Self-organizing
Communication frequency Only when necessary Regular collaboration
Communication means Technology mediated Face-to-Face
Synchronization Often asynchronous Synchronous
Communication cardinality Tunneled Many-to-Many
Communication style Formal Informal
Management culture Command and control Responsive
Decision making Centralized Collaborative
(source: adapted from [5])
Why and when do organizations choose, or have to, adopt agile development methodologies?
Teams are social systems in permanent change, this modification being known as group
dynamics [10]. A team forms over time, while continuous changes occur in its members, in its
environment, and in the relations established for reaching goals and business value. In IT
projects, time, a critical resource, is often scarce. Besides their normal tasks, employees in this
field carry out obligatory activities for personal development, mentorship (graduate and
internship programmes provide them with human resources) and the promotion of their
companies’ solutions [11]. Thus, at the company level, the following goals for team
development are carefully monitored:
- clarification of common goals and values,
- improvement in assuming the role of each team member,
- stimulation of interaction and open communication in the team,
- formation of and identification with the team culture,
- fostering inter-team support,
- establishing positive interdependence,
- improving group productivity,
- clarification of peer work goals and relational problems,
- learning the positive solving of conflicts,
- fostering cooperation and eliminating competitive behaviour,
- awareness of interdependence.
Stimulating and impeding factors related to agile teamwork are presented in Table 3.
Table 3 - Stimulating and inhibiting factors in agile teamwork
Agile team = concentration of energies (synergy) | Stimulating factors | Inhibiting factors
Members of a successful team: have clearly defined roles, share resources, share common values, have a successful leader. | Team members' self-appreciation | Neglecting the personal and collective needs of team members
Agile team: replaces the organization hierarchy, has measurable goals, has unconventional ideas. | Establishing common values and goals (consensus) | Goals and values no longer coincide in time
Agile team: values diversity and complementarities, initiates rituals and manifestations which trigger change in the company culture. | Communicating to clarify values and establish goals | Autocratic way of establishing goals, orders and obligations
Agile team: strives to win, is knowledgeable of the competition, encourages a healthy competitive attitude. | Appropriate praise | The policy of the competing party
Agile team: is highly interdependent, eliminates barriers and solves problems. | Open reaction of all team members | Avoiding individual and collective praise
Agile team: controls individual egos. | Solving problems | Direct annoying criticism
Agile team: manages conflicts. | Praising and celebrating success | Negating/neglecting conflicts
Agile team: identifies causes of failure. | - | Looking for scapegoats
Agile team: engages team members. | - | Exaggeration of success
(source: adapted from [12])
3. A succinct analysis of IT companies in Iași
Despite the problems reported on the local IT market, 2013 and 2014 were positive years for
Iași companies and the results confirm the upward trend. The most profitable companies
were Amazon, Centric, and SCC Services. The financial results for 2013 indicate high
profits: 2.2 million lei for Amazon, 1.5 million lei for Centric, and 1.3 million lei for SCC.
These foreign investments are closely followed by a local company, Focality, which recorded
a 1.28 million lei profit in the same year. The positive tendency is sustained by the overall
figures:
- In 2012, total number of employees for the first 10 IT companies was 800 and the overall
profit was 11.69 million lei.
- In 2013, the first 10 IT companies employed almost 1000 persons, while profit reached
12.5 million lei.
- The statistics for 2014 are not published yet, but unofficial numbers indicate a positive
evolution of both indicators. However, in December 2014, tens of IT employees were laid
off - three IT companies diminished their operations and let go a large number of people,
with some rumours speaking of a hundred IT specialists fired. So the overall 2014 numbers
might not look as good as they could.
The analysis of the Iasi IT market in the last few years indicates a current total of 6,000
employees and over 900 million lei in revenues. In a classification based on the number of
employees, the first place is occupied by Continental Automotive with 1000 people (see the
entire top 10 in Table 4).
Table 4 - The Iasi IT companies top 10 based on number of employees
Position Company No. of employees
1. Continental Iaşi 1000
2. Unicredit Business Integrated Solutions 400
3. SCC Services 385
4. Mind Software 291
5. Endava 290
6. Amazon Development Center 172
7. Centric IT Solutions 110
8. Iaşi Security Software 83
9. Beenear 80
10. Ness România 75
In our research, we have also investigated the salaries and the experience, competencies and
skills requested for jobs in software development, by reading the open positions offered by IT
companies, documented online (on the e-jobs site) and at the Iasi Business Days event in 2014:
- Junior positions (0-1 year experience): 350-600 Euro, depending on primary
competencies, foreign languages, other skills;
- Middle positions (2-4 years work experience) with good IT skills and knowledge of
software development methodologies: 700-1300 Euro;
- Senior positions (5 or more years IT work experience) with solid IT skills and practical
knowledge of software development methodologies: 1200-3000 Euro.
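The salary bands quoted above can be restated as a simple lookup; the function below is our own sketch, with band boundaries taken directly from the job postings cited (the overlap between the middle and senior ranges is resolved here at five years of experience):

```python
def seniority_band(years_experience):
    """Map years of IT work experience to the (band, min EUR, max EUR)
    observed in the quoted Iasi job postings."""
    if years_experience < 2:       # junior: 0-1 year
        return ("junior", 350, 600)
    if years_experience < 5:       # middle: 2-4 years
        return ("middle", 700, 1300)
    return ("senior", 1200, 3000)  # senior: 5 or more years

print(seniority_band(3))  # -> ('middle', 700, 1300)
```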
4. Agile practices in IT companies from Iasi
The applied study was directed at the top 10 companies presented in Table 4 and sought to
answer the following questions:
- Do IT companies acknowledge and implement the teamwork culture and is
multiculturalism present in the IT companies?
- What software development methodologies do they apply?
- Do IT companies have internship or graduate programmes in software development field?
Afterwards, the educational offer was analyzed in order to establish how Universities prepare
their students to become eligible candidates for IT jobs, by indicating the existence or
absence of those courses that provide the expected competencies.
The empirical study used the interview as its investigation method, with results collected from
the case studies conducted by student teams as projects or final dissertations, as well as from
the direct study of the companies’ websites and of their specific job descriptions. The research
methodology includes the experimental method and probative reasoning.
The findings of the studied issues for the 10 major IT companies are presented in Table 5.
Table 5 - The findings of studied issues in the top 10 Iasi IT companies
Company | Teamwork culture & multiculturalism relevance | Software development methodologies used | Internship or graduate programmes in software development
Continental Automotive Iaşi | Yes | V-cycle, Waterfall model, SCRUM | Yes + scholarships
Unicredit Business Integrated Solutions | Yes | Waterfall model, SCRUM | Yes
SCC Services | Yes | SCRUM | Yes
Mind Software | Yes | SCRUM | Yes
Endava | Yes | SCRUM | Yes
Amazon Dev. Center | Yes | SCRUM, Spring | Yes
Centric IT Solutions | Yes | SCRUM | Yes
Iaşi Security Software | Yes | Agile | No
Beenear | Yes | SCRUM | No
Ness România | Yes | SCRUM | Yes
Without exception, all investigated companies confirm our assumption about the teamwork
culture and admit that multiculturalism is currently a characteristic of their teams. In terms of
software development methodologies, all of them are not only aware of the agile approach but
actually use it, and most of them (8 of 10) offer internships. SCRUM seems to be the specific
agile method they prefer.
Based on the theoretical findings and the data exposed in Table 5, we further analysed the
most representative study programs in the field of software development from three different
faculties in Iasi, tracking the academic courses that may offer students the basic knowledge
on teamwork and software development methodologies (see Table 6).
Table 6 - Teamwork and agile-related courses in three Iasi academic study programmes
Academic Entity | Groupware | Software development / engineering | Agile methodologies
Informatics Program - Faculty of Informatics (“Al. I. Cuza” University) | No | Yes | No
Economic Informatics Program - Faculty of Economics and Business Administration (“Al. I. Cuza” University) | Yes | Yes | No
Applied Informatics Program - Faculty of Computer Engineering and Automatic Control (Technical University) | No | Yes | No
According to this simple analysis, graduates are not prepared for the current approach in
software development. All study programs include the classical software engineering courses,
aimed at ensuring the significant technical skills expected from a graduate. Only one of the
three analyzed programs offers a “Groupware” course, which develops teamwork abilities,
and none has a course dedicated to agile methodologies. We discovered an optional course on
“Inter-human communication” at the Technical University, which is better than nothing.
However, the results are telling and confirm our perception that students need more than
technical skills and knowledge. The development of internship and graduate programmes in
IT companies strengthens this opinion; these programmes try to compensate for the lack of
specific courses with actual practical work.
5. Conclusions
Teamwork is an important ingredient of present-day work in IT companies, as agile practices
have been adopted by many organizations. Our study confirmed these facts. IT companies
now expect to hire skilled people who can follow agile methodologies. One of the most
important ingredients is communication, and the Agile Manifesto stipulates that "the most
efficient and effective method of conveying information to and within a development team is
face-to-face conversation" [13]. In present-day large, multicultural IT project teams, this is a
challenge. Therefore, we considered that the Universities’ specialized study programs should
be adapted to current trends and we analyzed them. The findings reveal that graduates are not
prepared for the new approaches in software development, although they have good technical
skills and knowledge. The practice of internships established in IT companies targets this
need, with training components like “shadowing existing projects” or “soft skills training
programs”, but we consider that Universities should also upgrade their curricula
appropriately.
Our study was limited to the representative study programs in one academic centre and to
local IT companies. We believe that our conclusions are relevant for the entire spectrum of
Romanian IT study programs, although exceptions may also exist. An extended analysis
would be of interest in order to formulate an objective conclusion. The analysis should also
be expanded to the syllabi of the courses and the practical stages. With regard to the IT
companies, all of them are obviously aware of the agile practices. Simply considering our
findings, one can observe that all the small companies use agile methodologies. Large
companies, with big and complex projects, also retain the traditional methods, and we agree
that agile methods can be inefficient in large organizations and for certain types of processes.
A more comprehensive study of IT companies should differentiate between firm size, team
size and project type, and also address organizational culture issues.
References
[1] B. Tuckman, “Developmental sequence in small groups,” Psychological Bulletin, no. 63,
1965, cited in N.B. Moe et al., “A teamwork model for understanding an agile team: A
case study of a Scrum project”, Information and Software Technology, no. 52, 2010, pp.
480–491.
[2] J.R. Katzenbach and D.K. Smith, “The discipline of teams,” Harvard Business Review 71
(2), pp.111–120, 1993.
[3] P. Plotczyk and S. Murphy, “Command and Control is OUT! Create and Engage is IN!”,
available: http://www.wsa-intl.com/278-this-month-s-articles/command--control-is-out-
engage-create-is-in/
[4] F. Fagerholm, et al, “Performance Alignment Work: How software developers experience
the continuous adaptation of team performance in Lean and Agile environments,”
Information and Software Technology, February 2015
[5] F. Dumitriu, D. Oprea and G. Mesnita, “Issues and strategy for agile global software
development adoption,” Recent researches in Applied Economics, 2011, pp. 37-42.
[6] A. Grama and V.D. Pavaloaia, "Outsourcing IT – The Alternative for a Successful
Romanian SME," Procedia Economics and Finance, Volume 15, 2014, pp. 1404–1412.
[7] O. Dospinescu and M. Perca, “Technological integration for increasing the contextual
level of information,” Analele Stiintifice ale Universitatii" Alexandru Ioan Cuza" din Iasi-
Stiinte Economice, Vol. 58, 2011, pp. 571-581
[8] A. Munteanu, D. Fotache and O. Dospinescu, “The New Information Technologies:
Blessing, Curse or Ethnocentrism Fertiliser?”, Proc. of The 9th IBIMA International
Business Information Management Conference, January 4-6, 2008 Marrakech, Morocco.
[9] D. Fotache, A. Munteanu and O. Dospinescu, “Cultural Antropologic Reflections in the
Digital Space", Proc. of the 14th International Business Information Management
Conference (14th IBIMA), Istanbul, Turkey, 2010.
[10] D. J. Devine, “A review and integration of classification systems relevant to teams in
organizations,” Group Dynamics: Theory, Research, and Practice, pp. 291-310, 2002.
[11] C.M. Carson, D.C. Mosley, S.L. Boyar, “Goal orientation and supervisory behaviours:
Impacting SMWT effectiveness”, Team Performance Management, pp. 152-162, 2004.
[12] M.L. Liu, N.T. Liu, C. Ding and C.P. Lin, ”Exploring team performance in high-tech
industries: Future trends of building up teamwork”, Technological Forecasting and
Social Change, Volume 91, 2015, pp. 295–310.
[13] Manifesto for Agile Software Development, available: http://www.agilemanifesto.org/
CRM- PHENOMENOLOGICAL AND CONCEPTUAL ANALYSIS IN
REFERENCE TO THE “STATE OF ART”
Mihaela IONESCU
Business Administration Doctoral School,
Bucharest University of Economic Studies
Abstract. The latest technological trends have brought into focus longstanding discussions
regarding the role played by Customer Relationship Management solutions in modern
organizations and furthermore, the role they play in relationships with the customers. The
article will summarize a thorough knowledge of how CRM solutions apply and allow
developing a new paradigm of education so that the individuals can develop their skills and
use all native and acquired knowledge in achieving any goal. In the modern business,
competition becomes more pronounced and knowledge as a source of success of the company
in the market becomes a necessary strategic choice. First, the author will make an
introduction on how information becomes a key resource and the central business function of
the company that takes an active part in the virtual market. Then, through benefits and scales
of interest, the author will explain the role of Customer Relationship Management which is
indispensable because it facilitates the acquisition of knowledge about the factors affecting
the company. In most of the situations, there is no business analysis or standardize work,
which would lead to predetermined patterns of education.
Keywords: business intelligence, consumer behavior, customer expectation, customer
relationship management, education, innovation
JEL classification: M150, O3
1. Introduction
In analyzing the state of the art, the article will take into account the improvements taking
place in the general process of designing and selecting advanced CRM solutions that meet the
current, emerging or future requirements set by a given organization. The process of
designing, developing and implementing the company’s objectives is the new perspective
substantially influencing the state of the art for the application of CRM and similar systems. [1]
During the last decades, the use of artificial intelligence technology to support companies’
projects has constantly increased. For instance, research conducted in the USA in the 1980s
highlighted a growing interest in alternative communication solutions in business. Due to the
implementation of improved intelligence technology, problem diagnostics improved by
30%-35%, facilitating an in-depth evaluation of the situations occurring inside the company
or between the company and its customers. [1]
In current competitive and dynamic environment, business intelligence consists in efficient
processing of a vast amount of information, in order to create a concrete representation of the
state of a company, as well as the emerging trends affecting its activity. For example, the
state of art consisting companies in the 1990s and the contemporary development of state of
art presents a substantial evolution. Due to the increasing of social media, blogs and other
media resources, state of art can`t be restricted anymore to the level of dealing exclusively
with structured information. Gradual and constant information sources vary and are directly
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
239
integrated into companies' environments, in the form of textual information shared through
the web and intranets. The new face of the state of the art is deeply marked by media
convergence, including audio and video streaming, as a way to achieve improved availability
supported by high-quality information. [2]
The concept of the state of the art highlights the fact that a business process consists of
several functions performed in a certain order, able to transform both the information and the
material handled by a particular company. Business processes rely on internal and external
levels, associated with the type of customers they are willing to communicate with. In both
cases, for the execution of the tasks assumed on both levels, business processes require
people, data, and technical and information resources. The state of the art is the matrix that
encompasses every activity performed to fulfill the assigned task. For example, in the case of
an industrial enterprise, the state of the art comprises every process, from customers' orders
to the effective shipment of their product, as well as its maintenance. [3]
The concept of the state of the art is a challenging one, inviting constant competition that
stimulates companies to create improved products. For instance, in the 1970s in the US there
were few, dominant semiconductor and semiconductor-equipment companies, some American
suppliers preferring to sell traditional products considered to globally satisfy customers'
needs. Once similar Asian, especially Japanese, products entered the American market, the
size of the local market for semiconductors expanded, with the American companies willing
to resist the "intruder". In addition, once their products became mature in terms of
performance compared with the rival ones, the US companies introduced some of their
products into the Asian market, changing the balance of the initial state-of-the-art
perspective. The continuous flux between the two distant markets is determined by the
preference of Japanese suppliers for foreign companies selling their state-of-the-art
products. [4]
In the past decades, research on sustainable innovations has expanded rapidly, increasing our
understanding of the ways in which new technologies and social practices enable societies to
become more sustainable. It is possible to notice a gradual evolution of this trend, since
coherent perspectives have been introduced that systematically analyze the ways in which
more sustainable technologies are adopted in society, such as transition management and
innovation systems research. The state of the art was a key feature in the development of
companies over the last decades, being currently considered a feature of "corporate
sustainability management, sustainable organizational development and sustainable
innovation in daily business". [5]
1.1 Benefits and the scale of interest in using CRM solutions
Since the beginning of the 2000s, when CRM became an essential feature of the business
world, as pointed out by several reports and studies conducted by large sales companies, the
importance of the state of the art has substantially increased. According to an IDC report,
between 2000 and 2003 CRM was associated with a market growth from $4 billion to $11
billion [6]. Taking into account this explosive evolution, the perception of CRM changed
radically: it is no longer merely the right software to deploy in order to obtain a satisfactory
business tool.
CRM was viewed as something "more" [6], a concept requiring "a new customer-centric
business model which must be supported by a set of applications integrating the front and
back office processes" [7]. Compared with the previous perspective, the state of the art
connected with the current image of CRM is based on a dual structure, covering both the
front and the back office rather than the back office alone, since the company has to be
permanently connected to
its customers, and not only on the occasion of a transaction such as the sale of a certain
product or service.
R. Buehrer and Ch. D. Mueller conducted in 2001 a survey among major European managers
which highlighted that the state of the art associated with CRM had recently undergone an
essential transformation. CRM is not just a business solution but a target oriented towards
customers and partners; as a consequence, the concept can be divided into different concrete
classes, mostly operational software (business operations management), analytical software
(business performance management) and collaborative software (business collaboration
management). According to the results obtained by the researchers, the last category of CRM
implementation, collaborative software, is the most common perspective (60%), followed by
CRM as operational software (58%) and the analytical image of CRM (53%). As the
mentioned proportions show (a common perspective of 30%), CRM is used as a complete
solution comprising operational, analytical and collaborative abilities. In addition, the results
of the survey draw attention to the practice of not simply purchasing complete CRM
packages, but selecting the CRM tool or tools that best fit the company's objectives, to be
customized by the existing IT departments [6].
When asked about the effective implementation of CRM tools, the interviewed managers
mentioned that it is a recurrent framework for business-to-business relationship management
(67%), as well as for business-to-consumer relationship management (65%). Only 16% of the
managers participating in the survey admitted that CRM solutions can support business-to-
employee relationship management, a perspective that does not use the maximal capacity of
CRM and keeps the business at a traditional implementation of it. Concretely, business-to-
business uses of CRM focus on the company's processes with partners, while business-to-
consumer CRM solutions are based on the satisfaction of the end consumer. In addition, a
business-to-employee use of CRM would create a reliable management structure to support
collaboration among the employees of a company [6].
Artificial intelligence has gradually been included in the management process, presupposing
a strategic use of information and data-acquisition methods to identify and analyze the
activity of the company. Through the new updates related to CRM and similar projects,
companies are able to use their resources more efficiently, in order to raise their global
productivity and to deliver satisfactory services and products to customers. In order to fulfill
this objective, companies are generally interested in:
- Anticipating the potential future failures of the company through predictive and
preventive maintenance, covering both its production and its communication with customers;
- Reducing or eliminating actual or potential fault diagnoses, in response to an
improved orientation towards customers' needs;
- Optimizing performance by identifying the levels or parts of the organization that can
be involved in improving its global production;
- Rapidly isolating any problem and effectively implementing solutions, in order to
avoid a potential crisis affecting the production of the company or its communication
with customers. [1]
The state of the art completes and directs the evolution of the business intelligence solutions
adopted by companies. For example, companies are currently marked by an increasing
integration of web processes, including application servers, workflow solutions, collaboration
portals, document management systems and data warehouse packages. In order to create a
unique and efficient product, representatives of the company are connected through workflow
solutions, which combine individual activities into integrated business processes.
In order to implement the new state-of-the-art perspective, companies rely on suitable
software with the capability to define and map the required activities, as well as to monitor
and report on their progress. A suggestive example is the collaboration portal, which reunites
internal and external sources and promotes collaborative work despite geographical
obstacles. Based on a heterogeneous, distributed and semi-structured information platform,
the new business intelligence solutions are able to deliver adequate results to the executive
team's demands. [8]
Generally, the current state-of-the-art business intelligence technologies are a class of tools
that has evolved over the last 30 years, able to store, manipulate and transform a larger
amount of information than the previous solutions. To achieve this, these solutions rely on
enhanced data and text mining, using statistical algorithms and genetic algorithms. The first
category includes techniques such as forecasting, classification or multivariate analysis, while
the genetic algorithms are tailored individually to the company's profile. All the information
produced by these algorithms is organized into groups and subgroups of similar concepts,
creating a taxonomy of the analyzed elements. The classification algorithms created in this
way are then used to index the remaining data. Concretely, the indexing and categorization
procedures are performed by web services and can be manipulated through any portal or
engine functioning on the standard SOAP protocol. [8]
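A hypothetical illustration of the grouping and indexing steps described above (the categories, keywords and documents below are invented for illustration and do not come from the cited tools):

```python
# Hypothetical sketch: group documents into a simple taxonomy by keyword
# overlap, then use the resulting groups to classify (index) new documents.
# All category names and sample documents are invented.

TAXONOMY = {
    "sales":     {"order", "invoice", "discount", "revenue"},
    "support":   {"complaint", "ticket", "refund", "warranty"},
    "marketing": {"campaign", "segment", "promotion", "brand"},
}

def classify(text: str) -> str:
    """Assign a document to the taxonomy group sharing the most keywords."""
    words = set(text.lower().split())
    scores = {group: len(words & keywords)
              for group, keywords in TAXONOMY.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"

docs = [
    "customer complaint about a refund",
    "new promotion for the spring campaign",
    "invoice and order confirmation",
]
index = {doc: classify(doc) for doc in docs}
```

Real business intelligence suites replace the fixed keyword sets with statistically learned classifiers, but the flow is the same: build a taxonomy of concept groups, then index the remaining data against it.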
An eloquent example of a company employing modern business intelligence solutions is IBM,
the well-known computer manufacturer. The business groups associated under the IBM brand
cooperate within a global framework, in order to support production and sales growth in
North America, Europe and Asia. To achieve its purposes, especially the expansion of its
global customer base, IBM has relied on a particular structure since the 2000s. IBM's
overseas divisions no longer operate independently on a country-by-country basis as before,
but within a unifying network meant to stimulate cooperation. IBM also modified its people
supply chain, moving away from the mini-IBMs previously created in each country, each with
its own administration and manufacturing, as a form of multinational business model.
The perspective changed radically under the impact of competition from Asian brands
producing high-quality goods and services in a similar pattern. In order to reduce costs and
deliver efficient services and goods, IBM created a globally integrated operation network,
reuniting workers and suppliers worldwide into competency centers, aiming for low costs in
some places while, in others, keeping highly skilled employees closer to customers.
Further, IBM experts specialized in certain business functions were organized into clusters
of business expertise, located in any country of the world but connected to each other and
to headquarters. Through high-speed intranets and the support of project managers, those
specialized clusters cancel the disadvantage of job relocation, providing efficient results to
the global infrastructure of the company. [9]
Business intelligence solutions are increasingly used by companies whose major objectives
are to create and maintain reliable customer relationship management, to monitor assets and
stocks, and to increase sales. Business intelligence solutions dedicated to efficient customer
relationship management, including CRM, focus on exploring the long-term value of
potential and current customers. The logic is simple and efficient for a company: the more it
knows its customers, the better it can anticipate their needs and offer products or services
able to increase its revenues through a satisfactory sales process. Besides the immediate
profit, companies are able to reinforce their position in front of shareholders by presenting a
concrete value that can be associated with further targeted marketing activities. [10]
As a business intelligence solution, CRM is packaged by IT vendors (17 sellers worldwide) into
three main categories: small businesses, with 50 or fewer users of the tool; medium
businesses, with 51 to 499 users of CRM; and large businesses, with 500 or more
users of CRM. Analyzing the profit rates of IT companies selling CRM solutions, the major
buyers of the tool are large companies (63%), followed by medium companies (56%) and
small businesses (16%). [6]
Structurally, a sophisticated CRM solution should support the global customer lifecycle,
based on four major phases: "knowledge, intention, contracting and settlement" [11]. A
reliable CRM solution will be able to provide the company acquiring it with administration
management, marketing management, customer service management, logistics management,
sales management, as well as customer interaction management. The majority of CRM
software providers rely on satisfying three basic functionalities (knowledge, intention and
contracting), while the settlement phase, reflected in contract management and online
tracking of sales, is increasingly demanded by companies. [11]
1.2 CRM capability to offer preset patterns of education to companies and educational
institutions
Through its expanding functionality, such as providing a valuable overview of a company's
performance, CRM can also be reckoned an entrepreneurial educating tool. Managers
"learn" the capacities of their companies, as well as the targeted customers, while employees
"learn" to use the maximal potential of their company and to create a reliable relationship
with customers. Generally, CRM is an educational tool in that it creates interaction between
the company and its customers and partners, and generates the ability to evaluate the
processes between the company and external parties. [6]
Concretely, a reliable CRM solution will be able to grant sufficient control and flexibility to
the company, in order to satisfy the needs and analyzed expectations of its customers. In
what follows, a few features of CRM are mentioned that have to be understood and correctly
managed by companies.
Currently, CRM solutions can be analyzed through "metrics", as an image of the marketing
impact created by the business performance. Managers and employees are interested in
retrieving information about acquired customers, the percentage of retained customers, the
value of cross-selling, customer migration to other products offered by the company or even
to products offered by the competition, and social changes in their customers' behavior.
Based on the existing data, as well as on the potential scenarios offered by CRM reports,
companies can implement a single marketing strategy or a suite of strategies, to assure the
profitability of the business. Concretely, based on the data generated by CRM, companies
adopt small marketing measures in order to reduce the impact of customers' perception
rigidity, since an exclusive focus on customer acquisition at the expense of customer
retention is likely to create an imbalance of loyalty, globally affecting the activity of the
company [12].
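As a minimal sketch of how such metrics can be derived (the snapshot data and function below are invented for illustration, not taken from any particular CRM product), acquisition and retention follow directly from two period snapshots of the customer base:

```python
# Hypothetical sketch of CRM-style metrics: customers acquired, lost and
# retained between two periods, derived from sets of customer identifiers.

def crm_metrics(previous: set, current: set) -> dict:
    """Compare two period snapshots of the customer base."""
    acquired = current - previous          # new customers this period
    lost = previous - current              # customers who migrated away
    retained = previous & current          # customers kept across periods
    retention_rate = len(retained) / len(previous) if previous else 0.0
    return {
        "acquired": len(acquired),
        "lost": len(lost),
        "retained": len(retained),
        "retention_rate": retention_rate,
    }

# Invented sample: four customers in Q1, five in Q2.
q1 = {"c1", "c2", "c3", "c4"}
q2 = {"c2", "c3", "c4", "c5", "c6"}
m = crm_metrics(q1, q2)
```

A report balancing acquisition against retention in this way is exactly what prevents the loyalty imbalance described above.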
2. Conclusions
Similarly to any other business intelligence solution, CRM is marked by interoperability, but
expanded to an unmatched level. Through CRM, managers, employees or partners can easily
and securely exchange data between applications such as databases, email or extracted
reports. The more databases are supported, the less redundant the results will be for a
company willing to extract them and compare them with previous performance. In order to
support an interactive and quick use of CRM, companies usually complement it with
Microsoft SQL Server (70%), Oracle (65%), Sybase (37%) or IBM DB2 (28%). In order to
provide reliable interoperability, the ERP activities associated with the back office of the
CRM are supported by Oracle or SAP R/3 standards. Despite its multiple perspectives, CRM
is still an undiscovered
opportunity for business, since managers prefer to focus on short-term statistical analysis
rather than expanding CRM to the limits of a global analysis. In this case, CRM is still a
subject that has to be "learnt" by managers, employees and partners alike, in order to take
advantage of a satisfactory experience [1].
CRM is an essential item on corporate agendas, for both small and large businesses, being
considered a new technology that will help companies target more efficiently their global
market segments, micro-segments or individual customers. CRM is perceived as a new way
of marketing thinking, in contrast with the traditional methods, with a more focused attitude
towards customers' needs. A concise definition of CRM is the method “to create,
develop and enhance relationships with carefully targeted customers in order to improve
customer value and corporate profitability and thereby maximize shareholder value” [12],
in order to improve the potential for profitable, long-term relationships.
Acknowledgment
Mihaela Ionescu: This work was supported by the project “Excellence academic routes in
doctoral and postdoctoral research - READ”, co-funded from the European Social Fund
through the Development of Human Resources Operational Programme 2007-2013, contract
no. POSDRU/159/1.5/S/137926.
References
[1] D. Blumberg, Managing High-Tech Services Using a CRM Strategy, CRC Press LLC,
London, 2003.
[2] M. Castellanos, U. Dayal and V. Markl, Enabling Real-Time Business Intelligence,
Springer, New York, 2011.
[3] J. Elzinga, Th. Gulledge, C-Y. Lee, Engineering Advancing the State of the Art, Kluwer
Academic Publishers, Amsterdam, 1999.
[4] United States General Accounting Office, Report International Trade- US Business
Access to Certain Foreign State-of-the-Art Technology (1991), GAO, Washington.
[5] F. Boons and F. Ludeke-Freund, “Business models for sustainable innovation: state-of-
the-art and steps towards a research agenda”, Journal of Cleaner Production, no 45, 2012.
[6] R. Buehrer and C. D. Mueller, “Approach To Overcome Existing Limitations For CRM-
Implementation”, ECIS 2002 Proceedings, June 6-8, 2002.
[7] Magic Software report The CRM Phenomenon – Whitepaper (2000), Magic Software
Enterprises Ltd, New York.
[8] C. Cunha and M. Manuela, Adaptive Technologies and Business Integration: Social,
Managerial and Organizational Dimensions, Idea Group Reference, London, 2007.
[9] V. Kale, Inverting the Paradox of Excellence. How Companies Use Variations for
Business Excellence and How Enterprise Variations Are Enabled by SAP, CRC Press,
New York, 2014.
[10] R. J. Baran, R. J. Galka and D. P. Strunk, Principles of Customer Relationship
Management, Thomson South-Western, New York, 2008.
[11] R. M. Morgan, J. Turner Parish and G. Deitz, Handbook on Research in Relationship
Marketing, Edward Elgar Publishing Limited, London, 2014.
[12] A. Payne and P. Frow, “A Strategic Framework for Customer Relationship
Management”, Journal of Marketing, vol. 69, no. 4, 2005.
CONTINUOUS INTEGRATION IN OPEN SOURCE SOFTWARE
PROJECTS
Mihai GEORGESCU
Bucharest University of Economic Studies
Cecilia CIOLOCA
Bucharest University of Economic Studies
Abstract. The paper describes the process of continuous integration applied to open source
projects. It presents the advantages of having continuous integration and also offers a
practical approach to introducing CI in a project using an open source tool. The concept of
continuous integration is explained using detailed diagrams, along with the adjacent
processes related to this concept.
Keywords: architecture, build, continuous integration, open source, project, source control
JEL classification: J86, C88, C89
1. Introduction
A software engineering team contains highly specialized people who handle different aspects
of the software product lifecycle. During the development and testing phases, the main
persons involved are the programmers and the testers. In this paper the authors analyze the
infrastructure architecture that is responsible for making the development and testing process
more fluid and effective in both open source and commercial software projects.
In [1] the authors investigate whether continuous integration has an impact on open source
software projects. According to the authors, the practice of continuous integration in open
source projects has started to grow in recent years.
Teams in open source software projects are geographically distributed among different
locations. All team members contribute to the same project and share all its resources.
Maintaining a fully working set of source code is a difficult task due to the large number of
people who introduce changes to the code. Another challenge in open source projects is the
availability, at any given moment, of a runnable version of the product for the testing team.
Each modification performed on the source code must be available to the testing team as
soon as possible, so that it can be tested and delivered. Usually, in commercial software
companies, the development and testing processes have scheduled deadlines, so the need for
fast propagation of changes onto testing environments is not as essential as in open source
environments.
In open source projects anyone can contribute to the source code, whether with a single
contribution or with many. Because of this, it is mandatory for the project owner to make
sure that any modification of the source code meets the following rules and objectives:
- It will not break the build, i.e. leave the source code in a state where it no longer compiles;
- All automated unit tests are performed successfully;
- Any source code change is rapidly available to someone for tests and feedback;
- Potential problems will be found as soon as possible.
A software project contains many interconnected modules, depending on its architecture. In
both open source and commercial software projects, the project infrastructure consists of the
following environments:
- Development environment represents the virtual or physical location where the current
version of the software product is available to all programmers who contribute to the
project. All source code changes are performed initially in this environment;
- Staging environment represents the virtual or physical location where the current
version of the software product is available for testing any modification. This
environment must be stable and updated at very short intervals with modifications from
the development environment.
Both the development and the staging environment can be either local or hosted by the project.
Figure 1 - Relationship between development and staging environments
The environments depicted in Figure 1 are mandatory for the efficient management of a
software project. These environments must be synchronized at different intervals, whose
length is relative to the needs of the software project. In an open source project, these
environments should be synchronized at shorter intervals, to allow changes to become rapidly
available to the other members of the project. The process of updating the environments with
the latest changes and running the automated tests at the moment the modifications are
actually committed is called continuous integration.
This paper aims at explaining the concept of continuous integration in software projects and
at presenting an open source tool that helps achieve it. The advantages introduced by this
process are also highlighted, in the context of both open source and commercial projects.
2. Explaining the Continuous Integration (CI) process
Before implementing the process of Continuous Integration (CI) in a software project, the
following concepts must be understood.
Source Code is the set of files containing the software application's behavior, written in a
programming language.
Source Code Repository is the virtual location where the source code is hosted. There are
well known software applications that manage the access to the source code, thus performing
the role of a source code repository. The main functionalities of a source code repository are:
- Commit or check in – is the process of pushing modifications to the source code
repository;
- Check out – is the process of marking the file as in use in order to inform other members
that modifications will be performed on it;
- Get version – retrieves a specific file version or the latest one from source code
repository;
- Branch – a complete set of source code copied from an existing one;
- Merge – the process of combining changes from one or several source code branches or files.
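As a toy illustration of these functionalities (a deliberately simplified in-memory model, not how real tools such as Git or Subversion store data, and omitting check-out locking and conflict handling):

```python
# Toy in-memory source code repository illustrating commit, get version,
# branch and merge. Deliberately simplified: a real tool also tracks
# history graphs, merge conflicts and file-level diffs.

class Repository:
    def __init__(self):
        # branch name -> list of versions; each version maps file -> content
        self.branches = {"main": [{}]}

    def commit(self, branch: str, files: dict) -> int:
        """Commit / check in: new version = previous version + changes."""
        new = {**self.branches[branch][-1], **files}
        self.branches[branch].append(new)
        return len(self.branches[branch]) - 1   # version number

    def get_version(self, branch: str, version: int = -1) -> dict:
        """Get version: a specific version, or the latest one by default."""
        return dict(self.branches[branch][version])

    def branch(self, source: str, name: str):
        """Branch: a complete copy of the code, from an existing line."""
        self.branches[name] = [self.get_version(source)]

    def merge(self, source: str, target: str) -> int:
        """Merge: combine the latest files of two code lines (source wins)."""
        return self.commit(target, self.get_version(source))

repo = Repository()
repo.commit("main", {"app.py": "v1"})
repo.branch("main", "feature")
repo.commit("feature", {"app.py": "v2", "util.py": "v1"})
repo.merge("feature", "main")
```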
Build is the process of compiling a set of source code and performing any related activities to
produce a working output, which forms the software product. It can be achieved using the
development IDE and can also be automated using different tools.
Unit Test is the process of writing methods that test existing software application
functionalities.
Automated Tests is the process of automating, by means of software tools, the calls to the
unit test methods.
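For example, using Python's standard unittest module, a unit test method and its automated, non-interactive invocation look like this (the add() function stands in for real application functionality):

```python
# Minimal example of unit tests and their automated execution, using
# Python's standard unittest module. add() is an invented placeholder
# for any real application functionality.
import unittest

def add(a, b):
    return a + b

class AddTests(unittest.TestCase):
    def test_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)

# A CI server runs the whole suite non-interactively, much like this:
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The boolean result.wasSuccessful() is exactly the kind of signal a CI server uses to decide whether a build may proceed.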
Project Community represents the totality of persons contributing to the project.
Understanding the above processes is mandatory for successfully implementing continuous
integration. The way CI is integrated into an open source software project is presented in
Figure 2.
Figure 2 - The process of Continuous Integration in open source software projects
In [2] the phases of the continuous integration process are described. In open source projects,
the following process takes place when a community member introduces a change:
- Before performing any modification, the developer must obtain the latest version of the
source code from the source code repository;
- The source code is modified locally with the requested functionality;
- A build is performed by the developer to initially test the functionality on the
development environment;
- If the initial tests are positive, the files are committed to the source code repository;
- A new build package is produced by the project owner, along with all necessary
post-build configurations, and the staging environment is updated;
- Members of the community can get the latest staging build and install it locally, or
perform tests on the staging environment;
- The process is resumed from the beginning when the next change appears.
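The steps above can be condensed into a sketch of what the CI system does on every commit; the build and test functions below are invented placeholders, not a real CI implementation:

```python
# Simplified simulation of the continuous integration cycle: every commit
# triggers a build, the automated tests and, on success, a staging update.
# build(), run_tests() and the marker strings are invented for illustration.

def build(source: str) -> bool:
    """Pretend compilation: fails if the source contains a syntax marker."""
    return "SYNTAX_ERROR" not in source

def run_tests(source: str) -> bool:
    """Pretend automated unit tests: fail on a known-bad marker."""
    return "FAILING_TEST" not in source

def on_commit(source: str, staging: dict) -> str:
    """The CI reaction to one commit to the source code repository."""
    if not build(source):
        return "build broken: community notified"
    if not run_tests(source):
        return "tests failed: community notified"
    staging["current"] = source        # staging environment updated
    return "staging updated"

staging = {"current": ""}
r1 = on_commit("feature A", staging)
r2 = on_commit("feature B SYNTAX_ERROR", staging)
```

Note that a broken commit leaves the staging environment untouched, which is precisely the guarantee the process above is designed to give the community.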
The book [3] introduces Jenkins as one of the most well-known open source continuous
integration tools and describes step by step the entities required to successfully apply it to a
project. It offers a complete guide to configuring the tool and customizing it using the
available plugins.
In [4] the authors identify other processes that can be automated using continuous integration,
including:
- Generating the project documentation automatically;
- Static code analysis;
- Spell checking and code beautifying.
The purpose of CI is to automate the above process so that every commit operation to the
source code repository triggers an automated update of both the development and the staging
environments. The continuous integration process is also responsible for running the
automated unit tests and any post-build operations, and for informing all project members
about the results of the entire operation.
Being able to automatically propagate user changes to all environments and perform all
build-related tasks brings great advantages to open source project management. These
advantages are presented in the next chapter of this paper.
3. Advantages of having continuous integration in open source projects
In [5] the main advantages of having a continuous integration process in open source projects
are:
- Automating the repetitive manual processes;
- Generating a deployable version of the software at any time;
- Reducing risk;
- Providing real-time feedback on the build status;
- Improving the quality of the software product.
CI automates existing manual processes and introduces additional steps to allow the
generation of a deliverable build package at any time. This is one of the biggest advantages
offered by CI, as the output of this process is a tangible software product that can be directly
accessed by the community members. Any time a user performs a change to a project file,
the CI system triggers all the activities required to produce a ready-to-deliver application
package. Without a reliable CI, all these operations must be performed manually. Given the
high number of changes in a project, the lack of CI creates a big overhead for the project
owner, who must prepare a ready-to-deliver package of the application at short intervals.
Reducing risk is achieved by quickly integrating changes, thus obtaining rapid feedback from
testers. CI runs unit tests automatically, so any defect can be quickly identified. Every time a
change is introduced into the software product, the developers assume that it will not
negatively affect any existing functionality. Most of the time this is not what actually
happens, and defects are introduced. CI helps overcome this problem by offering real-time
feedback. In [6] the author shares his experience, starting as a software tester using
traditional methodologies and transitioning to agile software testing, and concludes by
highlighting the importance of continuous integration in agile environments.
4. Open source tool for implementing the Continuous Integration process
The research in this paper identified an open source tool that facilitates continuous integration
in projects. In [7] the authors present several implementations of continuous integration, along
with the differences between them. Based on this research, the authors propose a descriptive
model for documenting and understanding the implementation of continuous integration.
According to the official Jenkins documentation, there are more than 100,000 active users
using Jenkins as the CI tool for their projects. Jenkins is the leading open source
continuous integration server and it also provides more than 1,000 plugins to support building
and testing virtually any project.
In order to set up a CI environment, the project must have the following major prerequisites:
- A source code repository containing the files necessary to build the project;
- A dedicated hardware machine to host Jenkins CI;
- Environments to which the software package obtained after the build is deployed.
After Jenkins CI is installed on a dedicated machine, it must be configured so that it has
access to the source code repository. The first step is to configure Jenkins CI to fit the needs
of your project. Among the settings are the email server, the security rights and the plugins
to be used.
The next step is to create build projects inside Jenkins CI for all components of the software
application. At this step it is important to build your solution with the corresponding plugin
provided by Jenkins: whether it is a Java, C#, C++ or other type of project, the correct build
plugin must be installed. Jenkins CI allows builds to be started at different intervals, or in
real time at any modification of the source code. Upon completion of these steps and the
running of the first build, the project should look similar to the one in Figure 3 in the Jenkins
CI main window.
Figure 3 - A build project created in Jenkins CI
The information provided in the main Jenkins CI window for each build project is:
- The list of defined build projects;
- The status of the build: success, failed or inconclusive;
- The last time the build failed or finished successfully;
- The duration of the build process;
- The percentage of successful builds out of the total number of builds.
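The dashboard statistics above can be computed in a few lines. The sketch below is only illustrative — the `BuildRecord` type, field names and sample history are hypothetical and do not reflect Jenkins' actual data model:

```python
from dataclasses import dataclass

@dataclass
class BuildRecord:
    number: int        # sequential build number (hypothetical model)
    success: bool      # build outcome
    duration_s: float  # build duration in seconds

def build_health(history):
    """Summarize a project's build history the way a CI dashboard would."""
    if not history:
        return {"status": "inconclusive", "success_rate": 0.0}
    last = history[-1]
    successes = sum(1 for b in history if b.success)
    return {
        # status of the most recent build
        "status": "success" if last.success else "failed",
        "last_duration_s": last.duration_s,
        # percentage of successful builds out of the total number of builds
        "success_rate": round(100.0 * successes / len(history), 1),
    }

history = [BuildRecord(1, True, 310.0), BuildRecord(2, False, 95.0),
           BuildRecord(3, True, 280.0), BuildRecord(4, True, 305.0)]
print(build_health(history))
```

With this sample history the project owner would see a passing build with a 75% success rate, which is exactly the kind of at-a-glance health indicator the main window provides.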
By analyzing this information, the project owner can evaluate the health of the project and
identify reasons for build failures very quickly. In [8] the authors present a framework based
on Jenkins CI dedicated to building student projects, executing automated tests, running
static code analysis and calculating code coverage. This paper confirms the extensibility of
the Jenkins CI environment.
To further extend the continuous integration process, as presented in [9], automated unit
tests can be generated using the proposed automatic test generation tool CTG, which
increased test coverage by an average of 60%.
An improvement for making Jenkins CI fully distributed is presented in [10]; it allows
communication and synchronization between multiple CI servers every time new artifacts
produced by the build are updated.
5. Conclusions
Continuous integration is beneficial for a software project no matter whether it is open
source or commercial, as it provides real-time feedback about build health and automates all
manual processes. Having this process in place offers the following benefits:
- Reduced risk of broken builds;
- Automation of manual processes;
- Real-time feedback about the health of the build;
- Automated unit tests with fast results;
- Automatic updates of all development and testing environments.
Due to the distributed nature of teams in open source projects, it is essential to have an
integrated build system. This allows rules and processes to be applied each time a team
member modifies the source code, which increases the chances of quickly finding potential
problems.
References
[1] A. Deshpande and D. Riehle, Continuous Integration in Open Source Software
Development, Open Source Development, Communities and Quality, Springer US, pp.
273-280, 2008, ISBN: 978-0-387-09683-4
[2] P. M. Duvall, S. Matyas and A. Glover, Continuous Integration, Improving Software
Quality and Reducing Risks, Addison-Wesley, 2008, ISBN: 978-0-321-33638-5
[3] A. Berg, Jenkins Continuous Integration Cookbook, Packt Publishing, 2012, 344 pages,
ISBN: 978-1849517409
[4] M. Ettl et al. "Continuous software integration and quality control during software
development," IVS 2012 General Meeting Proceedings, 2012.
[5] J. Holck and N. Jørgensen, "Continuous Integration and Quality Assurance: A Case Study
of Two Open Source Projects," Australasian Journal of Information Systems, vol. 11, no.
1, pp. 40-53, 2003, ISSN 1449-8618
[6] S. Stolberg, "Enabling Agile Testing through Continuous Integration," Agile Conference,
vol., no., pp.369 - 374, 24-28 Aug. 2009
[7] D. Ståhl and J. Bosch, "Modeling continuous integration practice differences in industry
software development," Journal of Systems and Software, Volume 87, January 2014, pp.
48-59
[8] S. Heckman, J. King and M. Winters, "Automating Software Engineering Best Practices
Using an Open Source Continuous Integration Framework," in Proc. of the 46th ACM
Technical Symposium on Computer Science Education, ACM, New York, USA, pp.677-
677
[9] J. Campos, A. Arcuri, G. Fraser and R. Abreu, "Continuous test generation: enhancing
continuous integration with automated test generation", in Proc. of the 29th ACM/IEEE
international conference on Automated software engineering (ASE '14), ACM, New
York, NY, USA, 55-66, 2014
[10] S. Dösinger, R. Mordinyi and S. Biffl, "Communicating continuous integration servers
for increasing effectiveness of automated testing," in Proc. of the 27th IEEE/ACM
International Conference on Automated Software Engineering ACM, New York, NY,
USA, pp. 374-377
SERVICE-ORIENTED MODELING AND ARCHITECTURE FOR AN
E-FINANCIAL ASSISTANT INTEGRATION WITHIN THE BANKING
SYSTEM
Mirela TURKEȘ
Dimitrie-Cantemir Christian University
Irina RAICU
Bucharest University of Economic Studies / CRI, Paris 1 Pantheon-Sorbonne University
[email protected]/ [email protected]
Alexandra RUSĂNEANU
Bucharest University of Economic Studies
Abstract. The financial system plays a vital role in supporting sustainable economic growth.
Due to their importance in the financial system and influence on national economies,
banks are highly regulated in most countries. Banks are struggling with heterogeneous
legacy systems that are difficult to change and integrate. Service Oriented Architecture
(SOA) has recently gained popularity as a new approach to integrating business applications
in the banking sector in order to attain the integration, flexibility and efficiency needed to
succeed in this complex business environment. The aim of this paper is to propose an
E-Financial Assistant (EFA) system, an information system that acts as an intermediary
between individuals and the specialized information systems exposed by banks and financial
institutions, and which can be easily integrated with other banking systems. The solution
helps in understanding how the adoption of SOA in e-banking can create an open and
collaborative environment with low costs and efficient service management.
Keywords: e-banking, e-financial, enterprise service bus, integration, service-oriented
architecture
JEL classification: F15, G2, P4
1. Introduction
The banking system has one of the most complex and sophisticated IT infrastructures. Each
bank has its own financial software which stores client data, transactional data, financial data
and data regarding its products and services. The data models used by each bank to define
and manage the information are different. What if the information regarding products and
services could be structured in a homogeneous manner across banks? This approach could
bring major benefits to the banking market from the client's point of view. If a customer
needs a credit, he visits various banks, requests offers from each one and analyzes them in
order to decide which one is most suitable for his needs. With the approach above, the
customer can instead access a collaborative environment where all the information regarding
credits from any bank can be found, and he can make a decision based only on the
information exposed by the collaborative application, without visiting any bank to request
offers. Today's technology can create a platform that integrates all the information regarding
products, services and financial data, and transforms it into valuable information for the
customer.
2. SOA integration with ESB
To thrive in today's business environment, organizations need a dynamic, interactive and
cost-effective network of employee, customer, partner and supplier relationships, and this
requires more agility in the processes and systems that make up the network. For many
businesses, it is IT complexity that makes this hard to achieve. IT complexity and costs often
arise from an organization's approach to application integration. The resulting connectivity
tangle can lead to unreliable connections, slow and inflexible application integration, costly
point-to-point connectivity and an inability to effectively leverage new connectivity methods
for people, processes and information. These challenges can be addressed with a SOA
approach that integrates applications and services through an Enterprise Service Bus (ESB),
ensuring that the business gets the right information to the right place at the right time; it
also addresses the business agility and cost optimization needed to succeed.
A service-oriented architecture (SOA) is a combination of consumers and services that
collaborate, supported by a managed set of capabilities, guided by principles and governed
by supporting standards [1]. The concept of service-oriented architecture includes a set of
desirable design characteristics for promoting interoperability, re-usability and
organizational agility, as well as a service-oriented business process-modeling paradigm.
SOA is commonly used to designate anything contributing to an enterprise platform based
on service-oriented principles. It is an example of business and computing evolution, an
architectural approach for business that includes and requires supporting technology
capabilities.
An ESB solution has two main functional areas: messaging and enrichment [2]. Messaging
is the reliable delivery of information whenever and wherever it is needed. Enrichment is the
enhancement of messages: matching and routing communication between services,
conversion between transport protocols, transformation between data formats, and
identification and distribution of business events. For business agility, software products
have to be delivered faster to the marketplace. While messaging supports the reliable
delivery of business information, ESB products eliminate the integration logic normally
developed for each application. Without this time-consuming overhead, the integration of
applications is simplified, allowing more flexibility and faster delivery of business services.
Over time, organizations otherwise continue to add complexity and costs for the
development and maintenance of point-to-point connections; an ESB solution eliminates
redundant and complex connectivity programming. Flexibility and costs are not the only
barriers to successful application integration: another is addressing emerging technologies
and standards while quickly accessing business information everywhere. An ESB enables
the creation of new, low-cost services by seamlessly bridging Web 2.0 technology to
existing legacy and back-end enterprise applications [3]. It also federates systems and
applications into a single logical ESB infrastructure that shares data, applications and
resources, extending the reach of business data from and to any device or system within or
outside of the business network.
3. Design and implementation of the E-Financial Assistant
E-Financial Assistant (EFA) is an information system that acts as an intermediary between
individuals and the specialized information systems exposed by banks and financial
institutions. The main functionality of EFA is to present clients with the best credit offers,
based on their personal and financial data and their preferences. EFA queries the banks for
their credit offers using the client's data but, prior to sending the client's personal data, it
validates them with the General Direction of Persons Evidence (GDPE) [4]. Banks may send
back offers matching the client's initial credit preferences or, if the client's preferences do
not match any of their offers,
the banks send offers with other credit preferences that are close to the initial ones. EFA
calculates the best deals and, if the client chooses a credit offer, EFA takes care of all the
formalities. First, it validates the client's financial data with both the Credit Office (CO) [5]
and the General Registry of Employees (GRE) [6]. Then it sends the results to the selected
bank and, based on this validation, the bank decides on the credit approval. In Figure 1, the
functional requirements of EFA are described by the use-case diagram.
Figure 1 - Functional requirements of EFA
3.1. Business modelling
The actors that interact with the system are the clients applying for a credit. The agent that
interacts with the system is the EFA Administrator, who may add a contract with another
bank that wants to participate in the application. The stakeholders are represented by GDPE,
GRE and CO. The main business use case is the application for a credit.
The software requirements include both functional and non-functional requirements. The
non-functional requirements are specific to any ESB:
- Location transparency: the client has to be decoupled from the bank providers' locations;
EFA provides a central platform for communicating with any necessary application without
coupling the message sender to the message receiver;
- Transport protocol conversion: EFA should seamlessly integrate applications with
different transport protocols, such as HTTP(S) to JMS, FTP to a file batch, and SMTP to
TCP;
- Message transformation: EFA should provide functionality to transform messages from
one format to another based on open standards such as XSLT and XPath;
- Message routing: EFA should determine the ultimate destination of an incoming
message;
- Security: EFA should provide authentication, authorization and encryption functionality,
both for securing incoming messages to prevent malicious use and for securing outgoing
messages to satisfy the security requirements of the service provider;
- Monitoring and management: a monitoring and management environment is necessary to
configure EFA to be high-performing and reliable, and to monitor the runtime execution of
the message flows.
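The message transformation requirement can be illustrated with a minimal sketch. Python's standard XML library stands in here for an XSLT transformation, and the element names (`request`, `creditInquiry`, etc.) are hypothetical, not part of the EFA specification:

```python
import xml.etree.ElementTree as ET

def transform_request(xml_in: str) -> str:
    """Map a hypothetical client request format onto a bank-facing format,
    the kind of format conversion an ESB performs between endpoints."""
    src = ET.fromstring(xml_in)
    out = ET.Element("creditInquiry")
    # pull fields out of the source layout and place them in the target layout
    ET.SubElement(out, "applicant").text = src.findtext("client/name")
    ET.SubElement(out, "amount").text = src.findtext("credit/value")
    return ET.tostring(out, encoding="unicode")

incoming = "<request><client><name>Ana Pop</name></client>" \
           "<credit><value>50000</value></credit></request>"
print(transform_request(incoming))
```

In a production ESB this mapping would typically be expressed declaratively in XSLT/XPath, as the requirement above states, rather than in imperative code.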
3.2. Architecture Design and Integration within the banking system
The credit application integration solution consists of several message exchanges: one
message exchange asks the banks for credit offers based on the client input data; another
involves the response messages from the banks with their credit offers. Because these are
request/reply exchanges, we combined them into one message flow design, as can be seen in
Figure 2. The confirmation exchange can then be placed in a separate message flow design
to improve the readability of the diagram. The decision about how the messages should be
sent to the two banks falls under the Message Channel patterns. The two patterns in the
message channel category that can solve this problem are Point-to-Point and
Publish-Subscribe. With the Point-to-Point pattern, two new message channels are
introduced, each with a fixed message producer and consumer application. The
Publish-Subscribe pattern seems better suited because it publishes a message to a topic and
the subscribed consumers can consume it; if the number of banks increases, new banks can
easily be added to the publish-subscribe channel. We therefore chose a publish-subscribe
channel to ask the banks for their credit offers. Next comes the problem of how the banks
know where to send the message back. Here the Return Address pattern comes into the
picture: it states that the request message should contain a Return Address that indicates
where to send the reply message. The Return Address is put in the header of the message
because it is not part of the data being transmitted.
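The Publish-Subscribe and Return Address patterns combined above can be sketched in a few lines. This is an illustrative stand-in, not the ServiceMix implementation: the bank names, rates and message fields are all hypothetical, and a simple in-process queue plays the role of the reply channel:

```python
from queue import Queue

class Topic:
    """Minimal publish-subscribe channel: every subscriber receives each message."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, message):
        for handler in self.subscribers:
            handler(message)

def make_bank(name, rate):
    """Each bank replies to the channel named in the header (Return Address)."""
    def on_request(message):
        reply_to = message["header"]["return_address"]
        reply_to.put({"bank": name, "rate": rate,
                      "amount": message["body"]["amount"]})
    return on_request

offers_channel = Topic()
offers_channel.subscribe(make_bank("Bank A", 7.5))
offers_channel.subscribe(make_bank("Bank B", 6.9))

replies = Queue()
# the reply channel travels in the header, not in the message body
offers_channel.publish({"header": {"return_address": replies},
                        "body": {"amount": 50000}})
offers = [replies.get() for _ in range(2)]
print(offers)
```

Adding a third bank is just one more `subscribe` call, which is exactly the scalability argument made for choosing Publish-Subscribe over Point-to-Point.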
Figure 2 - The message flow design diagram of the bank's credit offers part
To implement the response-message exchange with the banking institutions, it at first seems
that no specific integration functionality is needed; Figure 3 shows an overview of this
confirmation message flow. However, not all the messages arriving from the banks are
displayed to the client: they need to be aggregated into one best deal, or several if the
monthly instalments are equal. This functionality is achieved by adding the Aggregator
pattern. The second message flow diagram addresses the choice of a credit offer. A
content-based router is used to route the
message to the chosen bank application. The content-based router inspects the content of the
message and, based on the bank name, determines the correct target endpoint. Then the bank
application returns a confirmation message acknowledging receipt. The implementation of
the publish-subscribe functionality in ServiceMix involves configuring a number of JBI
components.
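The Aggregator and content-based router described above can be illustrated with a small sketch. The offers, instalment values and endpoint behaviour are hypothetical, standing in for the real bank applications:

```python
def aggregate_best_offers(offers):
    """Aggregator: keep only the offer(s) with the lowest monthly instalment,
    returning several offers when their instalments are equal."""
    best = min(o["instalment"] for o in offers)
    return [o for o in offers if o["instalment"] == best]

def route_choice(message, endpoints):
    """Content-based router: inspect the bank name in the message content
    and forward to the matching target endpoint."""
    endpoint = endpoints[message["bank"]]
    return endpoint(message)

offers = [{"bank": "Bank A", "instalment": 455.0},
          {"bank": "Bank B", "instalment": 440.0},
          {"bank": "Bank C", "instalment": 440.0}]
best = aggregate_best_offers(offers)   # Bank B and Bank C tie on instalment

# hypothetical endpoints that acknowledge receipt of the client's choice
endpoints = {name: (lambda m, n=name: f"confirmation from {n}")
             for name in ("Bank A", "Bank B", "Bank C")}
print(best, route_choice({"bank": best[0]["bank"]}, endpoints))
```

The tie case shows why the text says "one or more best deals": when two banks quote equal instalments, both survive aggregation and the client picks between them.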
Figure 3 -The message flow design diagram of the credit choice part
Figure 4 shows all the JBI components required for the first part of the integration solution.
Firstly, the JMS endpoints are configured using the servicemix-jms component. Next,
integrating Spring beans in a ServiceMix message flow implies marshaling and
unmarshaling the XML messages that flow through the JBI bus to and from the Java beans
defined as the input parameters and return values of the Spring bank beans. Two
components are applicable for implementing the Spring bean invocation: the Bean service
engine and the JSR-181 service engine. For our approach we used the JSR-181 service
engine, because this component is capable of (un)marshaling the XML messages
automatically. Finally, for routing the incoming messages to the bank beans and handling
the bank responses we used the Camel service engine.
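What the service engine automates — unmarshal the XML message, invoke the bean, marshal the result back to XML — can be pictured with a Python analogue. The class names, element names and instalment formula are hypothetical; this is not the actual JBI/JSR-181 machinery:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class CreditRequest:          # plays the role of the Java input bean
    client: str
    amount: int

def unmarshal(xml_text: str) -> CreditRequest:
    """XML message from the bus -> typed input object for the bean."""
    root = ET.fromstring(xml_text)
    return CreditRequest(root.findtext("client"), int(root.findtext("amount")))

class BankBean:               # plays the role of the Spring bank bean
    def offer(self, req: CreditRequest) -> float:
        return round(req.amount * 0.009, 2)   # illustrative monthly instalment

def invoke(bean, xml_text):
    """What the service engine does around each call: unmarshal the message,
    call the bean, marshal the return value back into an XML reply."""
    instalment = bean.offer(unmarshal(xml_text))
    return f"<offer><instalment>{instalment}</instalment></offer>"

print(invoke(BankBean(),
             "<request><client>Ana</client><amount>50000</amount></request>"))
```

The point of choosing the JSR-181 engine is precisely that the `unmarshal`/marshal steps shown here are generated automatically instead of being hand-written for every bean.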
Figure 4 - JBI components needed to implement the publish-subscribe functionality
5. Conclusions
A collaborative platform based on Message Broker technology is a real option for the
financial system, offering potential customers an advanced tool to learn about products and
services and to apply for credits. Sustainable economic growth aims to provide customer
satisfaction both through the consumption of banking products and the provision of banking
services, and by increasing the clients' quality of life. Therefore, an E-Financial Assistant
collaborative platform is a key element in supporting sustainable economic growth. The
effort to achieve a relationship based on mutual trust and collaboration between the bank
and its client represents a core motive of banks' activities. Through the platform, clients
have quick access to the most advantageous credit offers, thereby contributing to the
individual client's wellbeing. Another advantage of the platform is that banks can handle the
information about their clients in an efficient manner: databases which keep information
about the clients are created, and the data can be used to design new banking products or to
improve banking methods. On the other hand, the integration of the collaborative
application within existing banking systems is faster, less complex and lower in cost.
The customer can benefit from this collaborative application on many levels, both financial
and personal. The application aims to satisfy two kinds of needs. Effective needs: clients are
concerned with what banks can offer them in order to satisfy their various needs, such as the
need to access a credit for funding or for refinancing an existing credit, lower costs of access
to the loan, a reduced time for granting the loan, extended loan duration and reduced
documentation. These needs can be satisfied through the diversity of banking products and
services, correlated with the banks' ability to adapt their services quickly to the current
needs of their clients. The application also satisfies the virtual needs of its clients through
the use of modern technology, which provides a user-friendly interface accessible anytime
and anywhere by the customers. The collaborative application provides quick access to the
banks' offers, especially to advantageous crediting conditions.
In conclusion, this collaborative environment can generate information regarding the
competitiveness among banks in such a way that banks will change their products to become
more competitive; therefore, the customer will benefit the most.
References
[1] I. Raicu, "Proposal of a SOA model to serve as a practical framework for E-Justice
interoperability," Proc. of the 13th International Conference on Informatics in Economy,
Bucharest, 2014
[2] M.T. Schmidt, B. Hutchison, P. Lambros and R. Phippen, "The Enterprise Service Bus:
Making service-oriented architecture real," IBM Systems Journal, vol. 44, no. 4,
pp. 781-797, 2005
[3] N. Basias, M. Themistocleous and V. Morabito, "A Decision Making Framework for
SOA Adoption in e-Banking: A Case Study Approach," Journal of Economics, Business
and Management, Vol. 3, No. 1, 2015
[4] General Direction of Persons Evidence, available: http://dgepmb.ro/furnizari-de-date-cu-
caracter-personal-din-rnep/54
[5] Credit Office, available: http://www.birouldecredit.ro/index.htm
[6] General Registry of Employees, available:
https://reges.inspectiamuncii.ro/Cont/Autentificare?ReturnUrl=%2fPreluariRegistre
RISK ASSESSMENT FRAMEWORK FOR SUCCESSFUL
E-GOVERNMENT PROJECTS
Otniel DIDRAGA
West University of Timisoara
Abstract. E-government projects are intended to increase information and services
availability, interactivity and accessibility for citizens and businesses through the advanced
use of ICT. These projects are exposed to various risks. The complexity of e-government risks
requires a detailed analysis of risks. There are several risk categories, and risks are ranked
as high, medium or low. Applying risk management techniques reduces the impact of risks on
the success of the project. We propose a risk assessment framework for the implementation
of e-government projects. Successful projects depend on risk management strategies, and the
proposed framework can be applied in order to increase the effectiveness of services
delivered to users by the government.
Keywords: E-government, Risks, Risk Assessment, Successful projects
JEL classification: H7, H11, D81
1. Introduction
The concept of E-government [1] defines the generic use of new communication
technologies and computer applications by the central and local public administration to
improve administrative activities.
Technology provides governments with two significant opportunities: increased operational
efficiency, through cost reduction and productivity gains, and a better quality of the services
provided by the public administration [2].
E-government can provide opportunities to transform public administration into an
instrument of sustainable development [3].
E-government means using ICT to make public service better, cheaper and faster [4].
Through innovation and e-government, public administrations can be more efficient, provide
better services and respond to demands for transparency and accountability [3].
E-government services have several benefits: transparency, accountability [5],
responsiveness, visibility, efficiency, performance, and integration. These benefits can be
registered by the public administration organizations that make use of on-line services [6].
E-government projects have to be successfully designed and deployed in order to deliver the
expected benefits for quality services for citizens and businesses [7]. In order to be
successful, e-government projects must build trust within agencies, between agencies, across
governments, and with businesses, NGOs and citizens [5].
The trust in e-government services is directly enhanced by the citizens’ higher perception of
technological and organizational trustworthiness, the quality and usefulness of e-government
services, the Internet experience, and propensity to trust [8].
Projects implementing e-services are inherently complex [7], and while e-government still
includes the electronic interactions of government-to-government (G2G),
government-to-business (G2B) and government-to-consumer (G2C), we must consider a
multi-stakeholder approach to e-government projects [3].
Gatman considers that active stakeholders refer to any actor that has a role in identifying,
communicating, developing, enhancing or using e-government solutions [6].
Public managers need to be actively involved as stakeholders, not just as users of
information; e-government projects miss their targeted objectives, become redundant, waste
valuable resources and lack interoperability when stakeholders are not sufficiently
involved [6].
Successful involvement of stakeholders can create an environment that supports
communities in developing ICT capacities and resources according to their particular needs,
within an active and functional public administration [6].
Stoica makes a classification of e-government projects [9]:
a) from the point of view of the functionality/implementation degree (readiness):
1. implemented projects;
2. pilot projects;
3. abandoned pilot projects;
b) from the point of view of success in implementation:
1. successful projects;
2. failed projects;
3. new projects (unverified);
c) awarded projects:
1. internationally awarded projects;
2. considered best practices/models at national level.
Stoica gives examples of successful e-government projects [9], [10]: the Electronic System
for Public Acquisitions – e-Procurement (ESPP), a successful Romanian e-government
project considering its financial impact and number of users. The project was part of the
Government Programme 2009-2013 Strategy ('eRomania') [11]. Stoica considers that "it is
easier, with more impact and more successful, to implement individual high-level projects,
already verified, and considered examples of good practice at international level" [9].
The assessment of e-government projects should consider several dimensions: strategic,
technological, organizational, economic, operational and services [12].
2. Risk Management in E-government Projects
E-government projects involve people, processes and technology [13]. In particular, they deal
with multiple and complex challenges [2] and face several roadblocks that eventually obstruct
their potential to deliver the intended benefits to the citizens [13]. The roadblocks or
resistance items can be: technical issues; inadequate standards and frameworks; conflicting
initiatives; coordinating issues; security and privacy concerns; budgetary constraints; and
inadequate skills, awareness, and digital divide [13].
Evangelidis enumerates possible risks that may be experienced in e-government projects:
uncertain timescale prediction, increased costs and delays, misguided decisions, dependence
on technology, security risks, and unpredictable risks related to the nature of ICTs [7].
Ex-ante assessment (pre-implementation) of the value of e-government projects is usually
conducted via risk assessment or risk analysis [12].
Risk and risk management influence the success of generic IT projects [14]. The effects of
risk management in IT projects encompass: creating awareness, establishing trust and setting
priorities, clarifying expectations, creating acceptance and commitment, thus contributing to
a higher success probability of IT projects [14].
Risk assessment usually occurs in the design phase of the project. Nevertheless, after the e-
government project has been implemented, a second phase of evaluation is necessary. The
second assessment evaluates whether or not risks were successfully mitigated and what on-
going action is needed. Thus, the two components of assessing the strategic dimension are
objectives and risk management [12].
Considering the risk management process, the activities can be divided into two basic steps
that relate to each other and should be performed consecutively [15]: risk analysis
(assessment) and risk management (mitigation).
Risk management implies high levels of complexity and the outcome is not always as
planned; identifying the risks in e-government projects allows them to be managed [16].
Effective risk assessment can determine the success or failure of e-government projects [12].
3. Risk Assessment Framework for E-government Projects
The result of our research is the proposed Risk Assessment Framework for E-government
Projects. Our framework is based on a multidimensional approach to risk assessment in
e-government projects. The first dimension is the Project Phases, the second dimension is
the Risk Categories, and the third dimension is the Risk Assessment Activities (Table 1).
The Risk Assessment Activities are iterative processes, executed in all project phases and
for all risk categories.
Table 1. Risk Assessment Framework (own development)

Project Phases (table columns): Initiation and Planning; Analysis; Design; Implementation;
Maintenance.
Risk Categories (table rows): Project Management and Coordination; Project Team
Experience; Project Team Communication; Social and Cultural Context; Social and
Economic Context; Communication with Citizens; Support and Assistance of Citizens;
Technical Issues and Standards; Security and Privacy; Legal Factors; Productivity for
Citizens and Government; Budgetary Constraints; Political Influence.
Cell content: Risk Assessment and Risk Mitigation Plan, applied iteratively in each phase
for each risk category.
The risk categories included in the framework are: project management and coordination [2],
project team experience (skills) [13], team communication and coordination [13], social and
cultural context [5] [13], social and economic context [12], citizen communication [15],
support and assistance of citizens [15], technical issues and standards [6] [12] [13], security
and privacy [2] [13], legal factors [2] [15], productivity (accessibility) for citizens [15],
budgetary constraints [6] [12] [13], and political influence [2] [5] [6] [7].
Risk assessment must occur in each of the project phases presented in Table 1.
Risks are assessed by their probability of occurrence (unlikely, moderate and likely) and their
impact (high, medium, low) on the outcome of the project [14] [15].
After every iterative risk assessment activity, a Risk Mitigation Plan will result in order to
classify and control risks for achieving project success.
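The paper does not specify how probability and impact are combined into a ranking; the sketch below assumes a simple multiplicative score with hypothetical thresholds, and the sample risk register entries are illustrative only:

```python
# ordinal scales taken from the framework's probability and impact levels
PROBABILITY = {"unlikely": 1, "moderate": 2, "likely": 3}
IMPACT = {"low": 1, "medium": 2, "high": 3}

def rank_risk(probability: str, impact: str) -> str:
    """Combine probability and impact into a high/medium/low ranking
    (multiplicative score and cut-offs are assumptions, not from the paper)."""
    score = PROBABILITY[probability] * IMPACT[impact]
    if score >= 6:
        return "high"      # goes to the top of the risk mitigation plan
    if score >= 3:
        return "medium"
    return "low"

register = [("Budgetary constraints", "likely", "high"),
            ("Political influence", "moderate", "medium"),
            ("Project team experience", "unlikely", "low")]
plan = [(name, rank_risk(p, i)) for name, p, i in register]
print(plan)
```

A ranking of this kind gives the mitigation plan its ordering: high-ranked risks are addressed first in every iteration of the assessment activity.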
4. Conclusions
E-government projects are complex endeavors. They require strategic planning and the
application of risk management activities to minimize risk and deliver e-services
successfully to citizens, businesses and other public authorities.
The challenges imposed by the occurrence of risks must be overcome through risk
management. One important step in risk management is the risk assessment activity.
Performing risk assessment during the phases of a project is a continuous process that results
in a clear understanding of the strategy and challenges of the project.
E-government projects involve many different people, numerous processes and advanced technology.
Like any other IT project, e-government projects must include a risk management strategy
based on a standard framework, making the results transparent to the stakeholders.
Governments must apply control methods and risk mitigation plans to build citizens' trust in
using e-services.
Successful projects that are already verified and considered examples of good practice
depend on risk management strategies.
Our proposed framework can be applied in order to increase the effectiveness of services
delivered to users by the government. Assessing risks through the proposed framework
creates awareness, establishes trust and sets priorities, clarifies expectations, and creates
acceptance.
The limitation of our proposed framework is that it has not yet been tested in practice, but it
may find applicability in different areas of the public sector; further empirical research is
needed to validate the framework.
Acknowledgment
This work was cofinanced from the European Social Fund through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number POSDRU
159/1.5/S/142115 “Performance and excellence in doctoral and postdoctoral research in
Romanian economics science domain”.
References
[1] MCSI, "Strategia Națională privind Agenda Digitală pentru România 2020" – Februarie
2015, http://www.mcsi.ro/CMSPages/GetFile.aspx?nodeguid=0617c1d7-182f-44c0-a978-
4d8653e2c31d
[2] J.R. Gil-Garcia, and T.A. Pardo, “E-government Success Factors: Mapping Practical
Tools to Theoretical Foundations”, Government Information Quarterly 22 (2005), pp.
187–216.
[3] United Nations, “E-Government Survey 2014 – E-Government for the Future We Want”,
2014, ISBN: 978-92-1-123198-4, http://www.unpan.org/e-government
[4] European Commission, “Delivering the European Advantage? ‘How European
Governments can and should benefit from innovative public services’ - eGovernment
Benchmark – May 2014”, Luxembourg, Publications Office of the European Union,
2014, ISBN 978-92-79-38052-5.
[5] T. Almarabeh, and A. AbuAli, “A General Framework for E-Government: Definition
Maturity Challenges, Opportunities, and Success”, European Journal of Scientific
Research, Vol. 39, No.1 (2010), pp. 29-42.
[6] A. Gatman, “e-Government – Assisting Reformed Public Administration in Romania”,
Romanian Journal of Economics, vol. 32, No. 1(41), 2011, pp. 216-242.
[7] A. Evangelidis, “FRAMES – A Risk Assessment Framework for e-Services”, Electronic
Journal of e-Government, Vol. 2, No. 1, Jun 2004, pp. 21-30.
[8] S.E., Colesca, “Increasing E-Trust: A Solution to Minimize Risk in E-Government
Adoption”, Journal of Applied Quantitative Methods, vol. 4, no. 1, 2009, pp. 31-44,
ISSN: 1842-4562.
[9] O. Stoica, “E-Government Implementation in Romania. From National Success to
International Example”, 2009,
http://www.nispa.org/conf_paper_detail.php?cid=17&p=1549&pid=166
[10] O. Stoica, “Romanian E-Government between Success and Failure”, 2008,
http://www.nispa.org/conf_paper_detail.php?cid=16&p=1253&pid=166
[11] European Commission, “eGovernment in Romania, Edition 12.0 - eGovernment
Factsheets”, January 2015,
https://joinup.ec.europa.eu/sites/default/files/egov_in_romania_-_january_2015_-
_v.12.0_final.pdf
[12] J. Esteves, and R.C. Joseph, “A comprehensive framework for the assessment of
eGovernment projects”, Government Information Quarterly, vol. 25, 2008, pp. 118–132.
[13] A.M. Al-Khouri, N. Al-Mazrouie, and M. Bommireddy, “A Strategy Framework For the
Risk Assessment And Mitigation For Large E-Government Projects”, International
Journal of Managing Value and Supply Chains (IJMVSC), Vol. 1, No. 2, December 2010,
pp. 36-43.
[14] O. Didraga, “The Role and the Effects of Risk Management in IT Projects Success”.
Informatica Economica Journal, Vol. 17, no. 1, 2013, pp. 86-98.
[15] M. Podgoršek, “Risk assessment of e-service projects”, 4th Working seminar on
Performance auditing, April 20 – 21, 2004 in Moscow, Russia.
[16] S. Paquette, P.T. Jaeger, and S.C. Wilson, “Identifying the Security Risks Associated
With Governmental Use of Cloud Computing”, Government Information Quarterly 27
(2010), pp. 245–253.
AN ASSESSMENT OF THE MAIN BENEFITS AND
CHARACTERISTICS OF BUSINESS SOFTWARE FROM THE
PERSPECTIVE OF ROMANIAN SMEs
Victor LAVRIC
Bucharest University of Economic Studies, Romania
Abstract. This paper investigates the main benefits and characteristics of business software
from the perspective of Romanian SMEs. In order to do so, we analyze the data collected
from more than 1,400 Romanian enterprises. As we identify the most frequently used types of
software, the main benefits that entrepreneurs assign to them and the specific structural
implications of some software solutions, we are able to extract valuable insights
regarding the business software market in Romania and its potential to extend both in size
and depth.
Keywords: business software, cloud computing, innovation, SMEs
JEL classification: O31, O33, L25
1. Introduction
Business software solutions have become a key concept in today’s economic environment: a
consensus has been reached among economists, both practitioners and theoreticians, that the
class of software addressing the specific needs of various business processes [1] is an
essential element of current economic activity. As the economic environment becomes ever
more dynamic and unpredictable, companies are forced to rapidly adapt their products,
services and business processes, which poses a challenge from the perspective of business
software integration [2]. In this context, some research papers emphasize the
fact that “execution of Business Software Systems encounters many problems, leading to the
high scale of their failure, which then is reflected in considerable financial losses” [3].
Nonetheless, there is evidence that “as the business environment gets more challenging,
SMEs are now implementing ERP packages to be more competitive, efficient and customer-friendly” [4].
As the development of the IT&C market has shown many times, no matter how
innovative the approaches or technologies incorporated in a software product, failure to
address consumers’ subjective requirements can quickly lead to unfortunate consequences.
The theoretical literature goes further and defines this situation as a basic contradiction –
“the software industry has been engaged in ongoing effort to solve the basic contradiction
between the personalization of user’s needs and the versatility of the software products” [5].
In order to address this basic contradiction, our study focuses on investigating the main
benefits and characteristics of business software from the perspective of Romanian SMEs.
2. The most frequently used business software solutions in the SMEs sector
In order to analyze the use of business software solutions by Romanian SMEs, we rely on
data collected in March – April 2014 from 1,569 enterprises operating in economic sectors
such as services, construction, industry, transportation and tourism [6].
The size of the sample, as well as the complexity of the questionnaire used for the interviews,
gives us the means to engage in a relevant quantitative analysis with structural implications.
In Figure 1 we present the 16 most frequently used business software solutions, or, to be
more precise, the purposes for which they are utilized. Therefore, the entrepreneurs and
managers from the Romanian small and medium-sized enterprises reveal a high propensity to
use specific software solutions for email management (67.08%), invoicing (43.54%) and
accounting (41.18%). Smaller percentages are encountered in the case of wage (24.85%) and
inventory (23.15%) management software, being followed by software solutions addressing
issues like ecommerce (13.11%), goods management (11.34%), communication by SMS
(11.02%), HR management and timekeeping (9.25%), website management (7.93%), large
files transfer (6.95%), data storage (6.75%), archiving software (6.56%), communication by
newsletter (5.90%), personnel recruitment (5.70%) and legal information software (5.51%).
Although the elements that are not in the first 16 have very low percentages, they are
important for our analysis because, in a competitive environment, the differentiation in terms
of diversity and intensity of using business software solutions contributes to higher
performance and competitiveness consolidation. This is the case for software that targets the
following needs: managing internal portals, data collection, document management, fleet
management, videoconferencing, customer relationship management (CRM), enterprise
resource planning (ERP), employee performance management, sales force automation (SFA)
and call center management.
Figure 1 - The most frequently used business software solutions
By taking a closer look at the above mentioned elements, we can identify five main
categories of challenges that entrepreneurs and managers from Romanian SMEs are trying to
overcome with the help of specific business software solutions. Although some overlap can
be spotted among these categories, we find this classification very useful for our analysis, as
it facilitates the interpretation of structural differences, both from the perspective of size and
of performance. Therefore, the challenges that demand a more intense response in terms of
business software use by the small and medium-sized enterprises are the following:
1. Improving communication with clients, business partners and other relevant
stakeholders: email, ecommerce (online store), SMS communication, newsletter, internal
portal, videoconferencing and call center management.
2. Developing the infrastructure for data management: large files transfer, data storage,
archiving software, data collection, website management and document management.
3. Increasing the efficiency of resource allocation: customer relationship management,
enterprise resource planning, fleet management and sales force automation.
4. Providing a more rigorous management of human resources: wage management, HR
management and timekeeping, recruitment and employee performance management.
5. Optimization and informatization of the operational processes: invoicing, accounting,
inventory management, goods management, legal information.
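The five-way grouping above can be captured as a simple lookup table; a sketch (the
category labels are our shorthand, and the membership lists simply restate the enumeration
above, including the lower-frequency purposes listed earlier):

```python
# The five challenge categories identified above, mapped to the software
# purposes the paper assigns to each of them.
CHALLENGES = {
    "communication": [
        "email", "ecommerce (online store)", "SMS communication",
        "newsletter", "internal portal", "videoconferencing",
        "call center management",
    ],
    "data management infrastructure": [
        "large files transfer", "data storage", "archiving software",
        "data collection", "website management", "document management",
    ],
    "resource allocation efficiency": [
        "customer relationship management", "enterprise resource planning",
        "fleet management", "sales force automation",
    ],
    "human resources management": [
        "wage management", "HR management and timekeeping",
        "recruitment", "employee performance management",
    ],
    "operational processes": [
        "invoicing", "accounting", "inventory management",
        "goods management", "legal information",
    ],
}

def category_of(purpose: str) -> str:
    """Return the challenge category a software purpose belongs to."""
    for category, purposes in CHALLENGES.items():
        if purpose in purposes:
            return category
    raise KeyError(purpose)
```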
Table 1 - The structure of the most frequently used business software by size and performance evolution
(adoption rates of the 16 most frequently used solutions, by last year's performance and by company size)

Software solution | Much better | Better | Identical | Worse | Much worse | Micro | Small | Medium
Email | 55.88% | 75.19% | 61.83% | 65.21% | 70.00% | 62.37% | 75.29% | 78.26%
Invoicing | 82.35% | 61.11% | 37.44% | 34.79% | 46.25% | 36.02% | 63.32% | 76.81%
Accounting | 50.00% | 55.56% | 36.83% | 33.42% | 46.25% | 33.76% | 61.00% | 73.91%
Wage management | 76.47% | 41.11% | 20.73% | 13.97% | 26.25% | 17.97% | 42.08% | 68.12%
Inventory management | 23.53% | 33.70% | 19.39% | 19.18% | 31.25% | 18.13% | 35.52% | 52.17%
Ecommerce (online store) | 5.88% | 17.41% | 12.80% | 10.96% | 7.50% | 12.33% | 12.36% | 21.74%
Goods management | 26.47% | 12.59% | 10.00% | 10.14% | 13.75% | 9.19% | 15.44% | 27.54%
SMS communication | 2.94% | 10.74% | 11.71% | 7.40% | 18.75% | 8.86% | 16.22% | 23.19%
HR management and timekeeping | 5.88% | 15.56% | 7.80% | 6.03% | 13.75% | 6.20% | 15.44% | 34.78%
Website management | 2.94% | 10.00% | 7.93% | 6.30% | 6.25% | 6.04% | 10.81% | 26.09%
Large files transfer | 38.24% | 14.44% | 4.88% | 3.01% | 3.75% | 5.16% | 8.88% | 27.54%
Data storage | 55.88% | 12.96% | 3.78% | 3.29% | 7.50% | 5.16% | 9.65% | 20.29%
Archiving software | 14.71% | 9.26% | 6.10% | 3.56% | 8.75% | 4.43% | 10.81% | 24.64%
Newsletter | 0.00% | 5.19% | 6.71% | 5.48% | 1.25% | 5.56% | 5.02% | 11.59%
Recruitment | 64.71% | 12.22% | 2.93% | 1.64% | 2.50% | 3.79% | 9.27% | 23.19%
Legal information | 11.76% | 11.85% | 3.66% | 4.38% | 2.50% | 4.19% | 7.72% | 17.39%
Our structural analysis highlights the fact that the frequency of using the above mentioned
business software solutions is positively correlated with the SMEs’ size (Table 1). This
finding is an extension of the empirical evidence that the diversity and number of the utilized
business software items increases along with the size of the company: microenterprises (1-9
employees) use on average 2.39 items, small enterprises (10-49 employees) – 3.99 items,
while medium-sized enterprises (50-249 employees) use on average 6.07 business software
solutions. We can explain this phenomenon by underlining the following tendencies:
the complexity of the processes in a company, both strategically and operationally,
increases faster than the size of the enterprise, therefore the objective need for business
software solutions becomes more acute and diversified;
the management of the larger companies is more professional and experienced, thus
possessing the necessary skills for operating specific business software solutions, as well
as the insight needed for spotting the areas that demand such approaches;
the larger the enterprise, the more resources it can allocate for the acquisition of business
software, IT infrastructure and personnel with adequate skills and knowledge;
in most of the cases, larger companies engage in more complex projects, therefore there is
a greater need for professional business tools.
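The per-class averages quoted above can be reproduced from the adoption rates in Table 1:
treating the use of each solution as an indicator variable, the expected number of solutions
used is simply the sum of the adoption rates. A sketch in Python (restricted to the 16
solutions of Table 1; the variable and function names are ours):

```python
# Adoption rates (%) of the 16 most used solutions per size class,
# copied from Table 1 as (micro, small, medium).
ADOPTION = {
    "email": (62.37, 75.29, 78.26),
    "invoicing": (36.02, 63.32, 76.81),
    "accounting": (33.76, 61.00, 73.91),
    "wage management": (17.97, 42.08, 68.12),
    "inventory management": (18.13, 35.52, 52.17),
    "ecommerce": (12.33, 12.36, 21.74),
    "goods management": (9.19, 15.44, 27.54),
    "SMS communication": (8.86, 16.22, 23.19),
    "HR management and timekeeping": (6.20, 15.44, 34.78),
    "website management": (6.04, 10.81, 26.09),
    "large files transfer": (5.16, 8.88, 27.54),
    "data storage": (5.16, 9.65, 20.29),
    "archiving software": (4.43, 10.81, 24.64),
    "newsletter": (5.56, 5.02, 11.59),
    "recruitment": (3.79, 9.27, 23.19),
    "legal information": (4.19, 7.72, 17.39),
}

def expected_items(size_index: int) -> float:
    """Expected number of solutions used in a size class:
    the sum of that class's adoption rates, divided by 100."""
    return round(sum(rates[size_index] for rates in ADOPTION.values()) / 100, 2)

# expected_items(0) -> 2.39 (micro), expected_items(1) -> 3.99 (small),
# expected_items(2) -> 6.07 (medium), matching the averages in the text.
```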
A closer look at the magnitude of variation in conjunction with SME size reveals the
following hierarchy of the challenges that demand a more intense response in terms of
business software solutions: (1) increasing the efficiency of resource allocation, (2) providing
a more rigorous management of human resources, (3) developing the infrastructure for data
management, (4) optimization and informatization of the operational processes and (5)
improving communication with clients, business partners and other relevant stakeholders.
These findings are supported by the following ratios, computed for each software solution as
the share of enterprises with better and much better results that use the solution, relative to
the share of SMEs with worse and much worse outcomes using the same solution:
17.67 times higher on average for the business software focused on increasing the
efficiency of resource allocation: customer relationship management (16.59), enterprise
resource planning (37.69), fleet management (2.84) and sales force automation (13.58);
6.46 times higher on average for the solutions focusing on providing a more rigorous
management of human resources: wage management (2.92), HR management and
timekeeping (1.08), recruitment (18.56) and employee performance management (3.28);
3.68 times higher on average for the software that relates to data management: large files
transfer (7.79), data storage (6.38), archiving software (1.95), data collection (3.63),
website management (1.03) and document management (1.31);
1.85 times higher on average for the business software focused on optimization and
informatization of the operational processes: invoicing (1.77), accounting (1.32),
inventory management (1.13), goods management (1.64), legal information (3.43);
1.21 times higher on average for the solutions focusing on improving communication
with clients, business partners and other relevant stakeholders: email (0.97), ecommerce
(1.26), SMS (0.52), newsletter (0.77), internal portal (1.23), videoconferencing (2.56).
Continuing our structural analysis from the perspective of SMEs’ performance dynamics, an
interesting finding arises: the hierarchy above is preserved, as the ratio between the
percentages of companies with better and much better results and those of enterprises with
worse and much worse performance decreases as we move down the ranking. It might be that
larger organizations are more stable and have greater chances of succeeding; however, the
resilience of this hierarchy points out that better performance and higher employment are
positively correlated with all five challenges presented earlier, although with different
degrees of intensity.
3. The main benefits of the business software solutions
Our research also investigates the main benefits of business software items as seen by
Romanian entrepreneurs and managers from SMEs. The importance of the perceived benefits
lies in the fact that they translate into demanded features and characteristics of business
software solutions: as the development of the IT&C market has shown, failure to address
consumers’ subjective requirements can quickly lead to unfortunate consequences, no matter
how innovative the approaches or technologies incorporated in a product.
The perceived benefits are presented in Figure 2, underlining the importance of data security
(45.25%), easier collaboration between team members (28.74%), flexible access to corporate
data (28.30%), easier control of internal processes (21.54%), increased customer satisfaction
(19.44%), compliance with current legislation (19.38%), increased performance by
optimizing the operational processes (12.11%), increased productivity of specific
departments (10.33%), customized services and products for clients (7.20%), obtaining
detailed reports regarding various departments (6.88%), paying a monthly fee (5.23%),
removing redundant data entry activities in multiple applications (4.84%), intuitive menus in
Romanian language (4.59%), free and fast upgrade of the software (3.95%) and the feature of
Single Sign On (0.83%).
Figure 2 - The main benefits of the business software solutions
The structural analysis from the perspective of the SMEs’ size highlights the fact that there are
four main types of benefits (Table 2):
a) Business software benefits whose propensity is positively correlated with the size: easier
collaboration between team members, increased customer satisfaction, compliance with
current legislation, paying a monthly fee, removing redundant data entry activities and
software menus in Romanian language.
b) Business software benefits that have quasi-equal percentage for micro and small
enterprises, but a significantly higher value for medium-sized companies: easier control
of internal processes, optimizing the operational processes and single sign on.
c) Business software benefits that have quasi-equal percentage for medium and small
organizations, but a significantly lower value for micro-enterprises: data security, flexible
access to corporate data, increased productivity of specific departments and obtaining
detailed reports.
d) Business software benefits whose values have no clear relationship with the size of the
firm: customized services and products for clients, free and fast upgrade of the software.
Table 2 - The structure of the main benefits of the business software solutions (by company size)

Benefit | Micro | Small | Medium
Data security | 43.51% | 51.35% | 52.17%
Easier collaboration between team members | 25.22% | 37.84% | 56.52%
Flexible access to corporate data | 27.32% | 31.66% | 31.88%
Easier control of internal processes | 20.95% | 21.62% | 30.43%
Increased customer satisfaction | 17.32% | 25.48% | 33.33%
Compliance with current legislation | 18.21% | 23.17% | 24.64%
Increased performance by optimizing the operational processes | 11.85% | 11.58% | 17.39%
Increased productivity of specific departments | 9.11% | 14.67% | 14.49%
Customized services and products for clients | 6.29% | 11.58% | 5.80%
Obtaining detailed reports regarding various departments | 5.56% | 11.58% | 11.59%
Paying a monthly fee | 4.92% | 5.79% | 7.25%
Removing redundant data entry activities in multiple applications | 3.79% | 5.79% | 18.84%
Menus in Romanian language | 4.03% | 6.18% | 7.25%
Free and fast upgrade of the software | 3.79% | 4.63% | 2.90%
Single Sign On | 0.56% | 0.77% | 4.35%
4. Conclusions
As the findings of this paper suggest, the diversity and number of business software items
used increase along with the size of the company. In a competitive environment,
differentiation in terms of the diversity and intensity of business software use makes
acquiring new business tools very profitable, contributing to higher performance and to the
consolidation of competitiveness. By investigating more than 25 purposes for which SMEs use
business software, our analysis allowed us to cluster them in five categories – five challenges
that demand a more intense response in terms of business software use: (1) increasing the
efficiency of resource allocation, (2) providing a more rigorous management of human
resources, (3) developing the infrastructure for data management, (4) optimization and
informatization of the operational processes and (5) improving communication with clients,
business partners and other relevant stakeholders. We consider it very valuable for business
software providers to address the issue of product development from a complex perspective,
thus including in the process such elements as (a) the correlation between a specific domain
of operationalization and the size of the company (even the evolution of the organizational
performance) and (b) the specific benefits that a user expects to receive (i.e.
features/characteristics of the software). Although there is a lot of space for product
customization, a good assessment of the market’s needs and consumer expectations
contributes consistently to a more efficient allocation of resources and opportunity scouting.
Acknowledgement
This work was co-financed from the European Social Fund through the Sectoral Operational
Programme Human Resources Development 2007 - 2013, project number
POSDRU/159/1.5/S/142115 „Performance and excellence in doctoral and postdoctoral
research in Romanian economics science domain”.
References
[1] P. Schubert, "Business software as a facilitator for business process excellence:
experiences from case studies," Electronic Markets, vol. 17, no. 3, pp. 187-198, August
2007.
[2] D. Rombach, M. Kläs and C. Webel, "Measuring the Impact of Emergence in Business
Applications," Future Business Software, pp. 25-26, 2014.
[3] B. Czarnacka-Chrobot, "The Economic Importance of Business Software Systems
Development and Enhancement Projects Functional Assessment," International Journal
on Advances in Systems and Measurements, vol. 4, no. 1&2, p. 135, 2011.
[4] J. Esteves, "A benefits realisation road-map framework for ERP usage in small and
medium-sized enterprises," Journal of Enterprise Information Management, vol. 22, no.
1/2, pp. 25-35, 2009.
[5] H. Yang, X. Rui, Y. Liu and J. He, "Business Software Rapid Development Platform
based on SOA," International Journal of Database Theory and Application, vol. 6, no. 3,
p. 21, 2013.
[6] O. Nicolescu, A. Isaic-Maniu, I. Drăgan, C. Nicolescu, O. M. Bâra, M. L. Borcoş and V.
Lavric, White charter of Romanian SMEs in 2014. CNIPMMR. Bucharest: Sigma
Publishing House, 2014.
ECONOMIC VALUE EXCHANGES IN MULTI-PARTY
COLLABORATIONS: USING E3-VALUE TO ANALYSE THE M-PESA
ECOSYSTEM
Caroline KINUTHIA
Vienna University of Technology
Andrew KAHONGE
University of Nairobi
Abstract. The mobile market is increasingly facing cut-throat competition. There is a
multitude of stakeholders jostling to rake in profits. For a new actor keen to make a
successful debut in the market, there is a need to explore business opportunities in this novel
mobile money industry based on the distribution of value in the network. The goal is to
increase and sustain the profitability of the enterprise. In particular, the principle of
economic reciprocity is the glue that has held together M-PESA, a mobile-based financial
innovation that first premiered in Kenya. This paper models value exchanges within
the M-PESA ecosystem using e3-value in order to determine which actor is doing what and
with whom. The contribution is an analysis that addresses these questions: Who are the
business actors involved in the operations? What is in the offing for the actors? What are the
elements of offerings? What value-creating activities have produced or consumed these
offerings? What activities are performed by which actors? Finally, would a different business
model be better positioned to boost M-PESA’s economic value?
Keywords: e3-value, mobile ecosystem, m-pesa, multi-party collaborations, value network
JEL classification: F16, G23, O32
1. Introduction
The contentious question amongst stakeholders is how to effectively apportion profit in the
mobile financial market. In the past, the success of a company relied on its own ability to
single-handedly create products and provide efficient services. Innovation has brought about
new trends of conducting business. The distinct characteristic of the mobile payment
ecosystem is the amalgamation of multiple players in the value network. Every player is competing to
maximize revenue. As already implied, the gap can be bridged by determining and creating
efficient business models.
The objective of the paper is to analyze value creation in the context of mobile payments and
thereafter model the M-PESA system illustrating the economic value exchanges within multi-
party collaborations. We collected information by conducting semi-structured expert
interviews with different professionals in the mobile payment sphere. Other data collecting
activities included distributing user questionnaires and accessing recent literature from
Safaricom [6] [7], which is the Mobile Network Operator that provides the M-PESA’s
technology. This technology has subsequently been mirrored by other countries due to its
success in Kenya.
The answers we seek comprise:
I. Determining the actors in the network and what they are offering.
II. Establishing the gains that collaboration brings.
III. Identifying activities performed by each actor.
IV. Gauging the capacity of the model to deal with disagreements among different players
amicably.
The paper is organized as follows. In Section 2, we will look at the M-PESA system at a
glance. The next Section discusses the e3-value business model. In Section 4, we will present
the methodology. Thereafter we present our results: an illustration of the M-PESA
technology using the toolset and editor of [1] and [8]. Finally, we will draw some
conclusions.
2. M-PESA technology at a glance
2.1 How does it work?
M-PESA is a money-transfer service that is operated by Safaricom, the largest mobile
network operator in Kenya. The first step requires the customer to register an M-PESA
account at an M-PESA agency. The M-PESA agent acts as a middleman between the mobile
network operator and the customer. Registration is free. However, the customer should
produce the following: a Safaricom SIM card which has a Safaricom mobile number,
customer’s official names, date of birth and original identification documents of the customer
(National ID, military ID, alien ID or passport). Copies of identification documents are sent
to the Safaricom headquarters for record keeping. Once the customer is officially registered,
the system sends an acknowledgement SMS to both the agent and the customer
acknowledging successful registration. The SMS contains a four-digit key together with
further instructions on how to complete the activation process. This key may be customized
to digits of the customer’s choice and is subsequently used as a PIN.
Thereafter, the customer may proceed to carry out financial transactions.
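The registration flow described above can be sketched as a small state model. Everything
here is an illustration of the narrative only: the class, method and field names are
hypothetical, not Safaricom's actual system or API.

```python
# Illustrative sketch of the M-PESA registration flow described above.
# All names are hypothetical; this is not Safaricom's actual system.
import random

VALID_IDS = {"National ID", "Military ID", "Alien ID", "Passport"}

class Registration:
    def __init__(self, sim_number, full_name, date_of_birth, id_type):
        # Registration requires a Safaricom SIM number, official names,
        # date of birth and an accepted identification document.
        if id_type not in VALID_IDS:
            raise ValueError(f"unsupported identification: {id_type}")
        self.sim_number = sim_number
        self.full_name = full_name
        self.date_of_birth = date_of_birth
        self.id_type = id_type
        # On successful registration the system issues a four-digit key
        # by SMS; the customer may later customize it into a PIN.
        self.pin = f"{random.randrange(10000):04d}"
        self.activated = False

    def customize_pin(self, new_pin: str) -> None:
        """Replace the issued key with a four-digit PIN of the
        customer's choice and complete the activation."""
        if len(new_pin) != 4 or not new_pin.isdigit():
            raise ValueError("PIN must be four digits")
        self.pin = new_pin
        self.activated = True
```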
2.2 M-PESA Services
The M-PESA system is layered to serve multiple players. Initially, the basic services that
were offered by M-PESA included cash deposits to customer’s own account, loading of
airtime from customer’s mobile wallet, sending money to other M-PESA accounts,
withdrawing hard cash from the M-PESA agent and paying utility bills as mentioned by [5].
However, the quest for traditional financial companies, banks and start-ups to cash in and
leverage for a competitive edge has led to the emergence of new innovative money-transfer
services. Parents can now pay school fees for their children directly to the school via M-
PESA. Another benefit is that customers are able to receive loans at a lower interest rate than
what the banks charge. Remittances of bulk payments (Business to Consumer) may also be
carried out. Some organizations disburse payments through this process to both their
customers and employees. Many employees have comfortably embraced M-PESA as a safe
platform to receive their salary and insurance payments. In Kenya, the unbanked populace
finds M-PESA a ubiquitous and fair payment method [3]. The need to carry cash around is
also eliminated, resulting in a reduction of criminal attacks. Recently, M-PESA has partnered
with other international companies in a bid to improve service delivery. These partners,
which include Western Union, World Remit, Xpress Money, XendPay, SkyForex, Skrill and
Post Finance [1], operate as a conduit allowing customers to send and receive money directly by
phone from United States of America, United Kingdom, United Arab Emirates and Europe.
3. e3-value approach
3.1 e3-value graphical notation
The developers of the e3-value approach [1][8] modeled a value web made of actors that
execute activities by creating, exchanging and consuming things that have economic value
[9]. This modeling language is well suited to the M-PESA infrastructure. It is ontology-based
and incorporates concepts from requirements engineering and conceptual modeling [4]. The main
focus is to identify and analyse how value is created, exchanged and consumed in multi-actor
collaborations [2]. The ontology consists of a set of concepts that can easily be used to model
real scenarios. The e3-value ontology is classified under three sub-viewpoints:
I. The global actor viewpoint, which comprises the Actor, Value Object, Value Port,
Value Offering, Value Interface, Value Transfer, Value Transaction and Market Segment.
II. The detailed actor viewpoint, which comprises the Composite Actor.
III. The value activity viewpoint, which comprises the Value Activity.
Figure 1 - e3-value ontology example (source: [1])
The graphical notation of e3-value consists of the relations and concepts developed by [1].
The question which drives this approach is: will the idea be something of economic value?
Figure 2 - e3-value graphical notation (source: [1])
3.2 e3-value example
Figure 3 - A simplified example to illustrate e3value exchanges
Item [1] shows the start stimulus; the process begins from this point. In Figure 3 there are
two actors, graphically represented by rectangles: the couple and the wedding planner,
indicated as Item [2]. An actor is an economically independent entity that has the capacity to
increase its own value. Item [3] indicates a value interface; it groups one in-going and one
out-going value offering, and value ports are usually drawn within it. Item [4] shows a value
object, which is either a service or a product that brings economic value to the actor. Item [5]
is known as a value exchange and is used to connect two value ports with each other. Item
[6] shows value ports, which help to interconnect actors in order to facilitate the exchange of
value objects. Item [7] shows a value activity; value activities should be profitable to, at the
very least, one actor. Item [8] joins the value interface with the stop stimulus. Item [9] is the
stop stimulus, whose function is to show where the process terminates.
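The wedding example can also be written down as a minimal data model: two actors joined
by a value exchange that transfers one value object in each direction. This is an illustration
of the concepts only, not the e3-value toolset of [1] and [8]; the class names are ours.

```python
from dataclasses import dataclass

@dataclass
class Actor:
    """An economically independent entity (Item [2])."""
    name: str

@dataclass
class ValueObject:
    """A product or service that brings economic value (Item [4])."""
    name: str

@dataclass
class ValueExchange:
    """Connects two actors' value ports, transferring one value
    object in each direction (Items [5] and [6])."""
    giver: Actor
    receiver: Actor
    outgoing: ValueObject   # what the giver offers
    incoming: ValueObject   # what the giver gets back

# The simplified wedding example: payment flows one way,
# the planning service flows the other.
couple = Actor("couple")
planner = Actor("wedding planner")
exchange = ValueExchange(
    giver=couple,
    receiver=planner,
    outgoing=ValueObject("payment"),
    incoming=ValueObject("wedding planning service"),
)
```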
4. Methodology
Nineteen experts were interviewed in order to shed light on the operations at M-PESA’s
headquarters. 44 percent of the interviewees were technical personnel, while 40 percent held
managerial positions. A further 10 percent worked as regulators with the
Communications Commission of Kenya or the Central Bank of Kenya. The final 6 percent of
the respondents worked in mobile banking. In order to strengthen our research, we also
distributed user questionnaires to customers to understand their views in this area. We have
also consulted literature from various sources to build up our findings.
5. Results and conclusions

Our model clearly shows the business actors involved in the operations, detailing value
offerings and activities. Partnerships between actors in conducting activities illustrate the
interoperability nature of networks between the organizations and the need for transparent
processes.
Figure 4 - M-PESA's value web model using e3-value
Figure 4 indicates the input and output of each activity. The start stimulus is triggered by the
customer and subsequent activities are either carried out on a step-by-step basis or in parallel.
The sequence of activities can be followed with ease. Beyond the scope of this paper is an
assessment of the economic sustainability of our model. In conclusion, multi-actor collaboration in the mobile payment ecosystem has positive consequences. Our findings
indicate that businesses are specializing more than in the past as roles continue to be shared in
the value network. However, one negative finding is that there is still no outright way to deal
with conflict resolution especially when it comes to role collision. There is currently a huge
tussle between banks and the mobile network operators. The banks feel that these mobile
network operators are crossing the line into their jurisdiction. As a result, banks have been
pushed to be more innovative in service delivery and to subsequently push the mobile
network operators to partner with them so as to reduce loss of revenue that they had faced
initially.
Acknowledgment
We gratefully acknowledge the PhD School of Informatics, Vienna University of Technology
in conjunction with the city of Vienna for funding this research. We are also thankful to Prof.
Gerti Kappel and Prof. Christian Huemer who introduced to us the e3-value business
methodology. A special thank you to Dr. Rainer Schuster for sparing time from his busy
schedule to discuss e3-value semantics with us and for offering his insights in this topic. We
would also like to mention our fellow colleagues Ilian Berov and Kateryna Zaslavska for
their great input, our esteemed interviewee participants from Safaricom (Kenya),
Communications Commission of Kenya and Central Bank of Kenya. Last but not least, we
appreciate the input of Brian Omwenga, an adjunct faculty member at the University of Nairobi, an ambassador of the Africa Innovation Foundation's Innovation Prize for Africa, and the chairperson of the Software Engineering technical committee (TC94) setting national
software engineering standards at the Kenya Bureau of Standards.
References
[1] J. Gordijn, "The e3-value toolset," [Online]. Available: www.e3value.com. [Accessed 17
February 2015].
[2] R. Schuster and T. Motal, "From e3-value to REA: Modeling Multi-party E-business
Collaborations," in Proc. of the Seventh IEEE International Conference on E-Commerce
Technology, Munich, 2009.
[3] T. K. Omwansa, "M-PESA: Progress and Prospects," in Proc. of the Mobile World
Congress, GSMA special edition, Nairobi, 2009.
[4] K. Pousttchi and Y. Hufenbach, "Value Creation in the Mobile Market - A Reference Model for the Role(s) of the Future Mobile Network Operator," Business & Information
Systems Engineering, vol. 3, no. 5, pp. 299-311, 2011.
[5] I. Mas and D. Radcliffe, "Scaling Mobile Money," Journal of Payments Strategy and
Systems, vol. 5, no. 3, 2011.
[6] W. Jack and S. Tavneet, "The Economics of M-Pesa," Massachusetts, 2010.
[7] Safaricom. [Online]. Available: http://www.safaricom.co.ke/. [Accessed 15 February
2015].
[8] J. Gordijn and H. Akkermans, "Value based requirements engineering: Exploring
innovative e-commerce idea," Requirements Engineering Journal, vol. 8, no. 2, pp. 114-
134, 2003.
[9] C. Huemer, A. Schmidt, H. Werthner and M. Zapletal, "A UML Profile for the e3-value e-Business Modeling Ontology," in Proc. of the 20th International Conference on Advanced Information Systems Engineering.
ASPECTS OF INTER-ORGANIZATIONAL KNOWLEDGE
MANAGEMENT IN COLLABORATIVE NETWORKS
Marinela MIRCEA
The Bucharest University of Economic Studies, Bucharest, Romania [email protected]
Abstract. In the knowledge society, the interest in organizational networks and knowledge management is increasingly evident. Among the potential advantages offered by the two approaches, increased flexibility and the achievement of a competitive advantage on the market hold an important place. Apart from these advantages, both the organizational
networks and the inter-organizational knowledge management are confronted with problems
and challenges. Inter-organizational knowledge management is a complex process due to both
the nature of knowledge and to the collaborative environment. Two of the major challenges of
inter-organizational knowledge management are knowledge sharing between the members of
the organizational network and interoperability. The main purpose of the paper is the analysis of the knowledge management process within organizational networks, with an emphasis on the changes introduced by the collaborative environment. Throughout the paper the importance and
necessity of an inter-organizational knowledge management is emphasized, a comparative
analysis of certain aspects of intra and inter-organizational knowledge is performed and the
collaborative characteristics that influence the stages of the knowledge management process
are identified.
Keywords: collaborative networks, information technology, inter-organizational knowledge,
knowledge management.

JEL classification: D83, L14, O33
1. Introduction

The importance of knowledge and organizational networks in obtaining a competitive
advantage on the market is recognized by theoreticians and practitioners alike. The growing number of collaborative networks and the emphasis on knowledge within the knowledge- and innovation-based society support those who want to adopt new organizational forms that can lead to success on the global market.
Knowledge is known as one of the most important assets of management within organizations,
as knowledge allows organizations to use and develop resources, to increase the competitive
ability and to obtain a substantial competitive advantage [1]. Knowledge also represents an
important source that allows nations, organizations, and persons to obtain benefits, such as:
learning improvement, innovation, and decision making. Any organization, public or private,
needs a knowledge management process in order to obtain the best performances [2].
In global economy, the strong competition, the frequent changes on the market, the higher and
higher demands concerning quality, lead to the necessity of new organizational forms.
Organizational networks are acknowledged as organizational forms characterized by an
increased flexibility and that may lead to obtaining the competitive advantage on the market.
Apart from the potential advantages, organizational networks are also confronted with
problems and challenges that are particularly connected to the complexity of the collaborative
environment.
Inter-organizational knowledge can be defined as an explicit set of knowledge that is
formalized and created by organizations [3]. The interactions within the network allow
organizations to develop the collaborative and relational tacit knowledge and to generate inter-
organizational tacit knowledge that can be capitalized within the inter-organizational memory
[4]. Inter-organizational knowledge allows organizations to develop distinctive abilities,
which may lead to the increase of the innovation ability.
Inter-organizational knowledge management is a complex process due both to the nature of knowledge (intangible, dynamic, intrinsic) and to the collaborative environment. One
of the major challenges of inter-organizational knowledge management is its sharing among
the members of the organizational network. The knowledge sharing depends on a series of
factors, such as: the security of the communication channel, the organizational culture of the
participants and their roles, the nature of knowledge (tacit and explicit; formal and informal),
the organizational structure, and the support offered by the information and communications
technology (ICT).
Interoperability also represents another challenge within the inter-organizational knowledge
management. The existence of heterogeneous knowledge management systems within different partner organizations that are not able to communicate and integrate with one another limits the reuse of inter-organizational knowledge (formalized explicit knowledge) [1].
The main purpose of the paper is represented by the analysis of the knowledge management
process within organizational networks, with an emphasis on the changes produced by the
collaborative environment. Throughout the paper the importance and necessity of an inter-
organizational knowledge management is emphasized, a comparative analysis of certain
aspects of intra and inter-organizational knowledge is performed and the collaborative
characteristics that influence the stages of the knowledge management process are identified.
2. Inter-organizational knowledge in collaborative networks
The analysis of the knowledge management process involves the recognition of the types of
knowledge and of the ways in which collaboration influences the stages of their
management process. The explicit or coded knowledge is the knowledge that is transmitted
through a formal, systemic language and that is sent as data, scientific formulas, specifications
or manuals [5]. In such case, both the communications technologies and the knowledge
management systems of organizations influence its management. Ensuring the interoperability of knowledge management systems is also critical for success at the network level.
Tacit knowledge is personal and difficult to formalize. Tacit knowledge is deeply rooted in
action, procedures, routine, commitments, ideals, values and emotions [5]. Its transfer is
much more difficult to accomplish between organizations. The interactions within the
networks, the organizational culture, and the communications abilities are only a few of the
factors that influence the management of these types of knowledge. Moreover, according to
[6], the lack of certain common goals, the significant differences in the corporate culture, the
competition between partners, the lack of certain cooperation rules, the inadequate coordination
and the opportunistic behavior lead to difficulties and restrictions within the inter-
organizational use of knowledge.
In organizational networks trust represents a catalyst that enables strategic business interactions
and knowledge sharing between organizations [7], [8]. Trust is the positive psychological state of believing in the goodwill and integrity of a partner with regard to keeping promises made toward common results. Trust is the fundamental
condition necessary for eliminating uncertainty within the interaction processes [9].
Inter-organizational knowledge presents some differences compared to intra-organizational
knowledge, due to the complexity of the collaborative environment in particular. Table 1
displays some aspects regarding the intra and inter-organizational knowledge. The differences
are not only limited to the aspects in the table. Moreover, the structure of the network, the form
of organization (partnerships on long or short term), the size of the network, as well as other
aspects, influence the inter-organizational knowledge, leading to certain differences within the
stages of the management process.
Table 1 - Intra and inter-organizational knowledge aspects

Geographical area
  Intra-organizational: teams, national organization, multinational organization.
  Inter-organizational: the organizational network.

Barriers
  Intra-organizational: formal, hierarchic.
  Inter-organizational: structural, communication, cultural.

Creation
  Intra-organizational: knowledge creation is performed within the organization, often in specialized departments (research and development).
  Inter-organizational: knowledge creation takes place in an insecure environment; the partners' goals, behavior, and collaboration relationships may change during the process. Through interactions and the reuse of knowledge available at the network level, the possibility of creating new knowledge is higher.

Transfer
  Intra-organizational: accomplished through compliance with intellectual property rights; overlaps with knowledge exchange within the organization.
  Inter-organizational: influenced by the characteristics of the network, such as culture, type of alliance, and available ICT; shares common characteristics with many knowledge acquisition processes [10].

Share
  Intra-organizational: takes place in keeping with the roles, security and integrity of the knowledge management system within the organization.
  Inter-organizational: takes place when the partners manage to cultivate trust and build long-term partnerships.

Storage
  Intra-organizational: carried out in databases/knowledge bases, according to the organization's strategy.
  Inter-organizational: carried out according to the agreements within the network and the available technology; interoperability at the level of knowledge management systems is also necessary.

Use/Reuse
  Intra-organizational: leads to the development of new products/services/added value.
  Inter-organizational: leads to the development of further new knowledge, through the interaction of the involved partners, trust and common goals.
In long term organizational networks, knowledge represents the decisive base for the intelligent
and competent performance of partners, and inter-organizational knowledge management
becomes a new paradigm of strategic management [6].
3. Inter-organizational knowledge management
Inter-organizational knowledge management requires an adequate technology (platforms,
applications, instruments), support infrastructure (processes, network roles) and human capital
(digital abilities, collaboration competences and reflexive practical abilities). The adequate
technology depends on the purpose of the organizational network, on the digital abilities of its
members, as well as on the institutionalization degree of their processes [11]. Starting from the
stages of the knowledge management process identified by Schwartz [12] and from the support
ICT is able to offer in every stage of the management process, figure 1 presents the knowledge
management process in the context of organizational networks.
Figure 1 - Knowledge management in collaborative networks
3.1. Knowledge creation/generation

The knowledge generation process can be accomplished through knowledge acquisition or
through knowledge creation [10]. For an effective inter-organizational knowledge creation process, the partners involved in the network must align their knowledge bases during the knowledge sharing process [13]. Irrespective of the nature of the knowledge (tacit,
explicit), knowledge creation involves the necessity of a common goal. The common goal and
the connections are elementary for allowing cooperation within the organizational network.
The interactions that take place within the network are essentially connection elements within
the network [14]. Once the partnership is established, organizations must take into account the
knowledge assignation for the creation and sharing activities, as well as the behavioral aspects
regarding the partners’ orientation concerning the future value of the results obtained together
[13].
3.2. Knowledge representation/store
Knowledge representation is the process of rendering knowledge in a graphical form, using different information and communication codes (natural language, figures, drawings, photographs, flow diagrams). Until the moment of representation, explicit knowledge is informal knowledge. In that case, the only stored information is that necessary for identifying the
persons and places where knowledge is located [10]. Knowledge storage depends on the
technology available within the network. Knowledge can be stored in databases or knowledge
bases. Within the organizational network, different knowledge representation languages and different management systems may coexist, calling interoperability at the network level into question.
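One common way to make heterogeneous knowledge bases mergeable is to reduce each system's formalized knowledge to subject-predicate-object triples that every partner can export and query. The sketch below is a hypothetical illustration with invented data; it does not describe any particular system discussed in this paper:

```python
# Knowledge exported by two partner organizations into a shared,
# neutral triple form (subject, predicate, object).
partner_a = [
    ("SupplierX", "produces", "Gearbox-A"),
    ("Gearbox-A", "certifiedBy", "ISO-9001"),
]
partner_b = [
    ("Gearbox-A", "usedIn", "Model-T2"),
]

# Because both exports share the same schema, merging is trivial.
merged = partner_a + partner_b

def about(triples, subject):
    """Everything the network collectively knows about one subject."""
    return [(p, o) for s, p, o in triples if s == subject]
```

Here `about(merged, "Gearbox-A")` combines the certification recorded by one partner with the usage recorded by the other, which is exactly the kind of cross-organizational reuse the interoperability requirement aims to enable.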
3.3. Knowledge access/use/reuse
Knowledge access is performed according to the agreements within the organizational network
and to the available systems. Organizations allow the taking over of inter-organizational
knowledge and of knowledge that can be reused in order to support the five stages of the
knowledge management process. Thus, the reusable inter-organizational knowledge network
is developed in order to allow member organizations to reuse inter-organizational knowledge
that is stored in the knowledge reservoirs of organizations [3].
3.4. Knowledge dissemination/transfer
Knowledge transfer is an interactive process through which organizations accumulate and
develop new knowledge; it allows business partners to detect and understand business
problems and to develop viable solutions [9]. The knowledge transfer can be accomplished
through different mechanisms: formal or informal [15]; coded or personalized; individualized
or institutionalized [16]. Wagner claims that sharing knowledge within an inter-organizational
network allows the creation of more diverse knowledge than sharing within an organization
[17]. The transfer of knowledge shall be accomplished through a user interface, according to
the used mechanism and the agreements established at the level of network.
When the knowledge is tacit, communication is best carried out through socialization or
through the facilities offered by the multimedia communication technology. The effectiveness
and efficiency of the knowledge transfer processes is affected by the assimilation capacity of
the receiver [10]. In order to carry out a better knowledge transfer, both the receiver and the
sender must have a common fund of knowledge (the existence of a redundancy). If there is no
common fund of knowledge, the receiver will not be able to understand the transferred content, and the transfer process is useless. At the opposite extreme, if the receiver already possesses the transferred information, then it already represents knowledge for the receiver.
4. Conclusions

The society based on knowledge, the new organizational forms, the global market, the
development of the information and communications technology, are the pillars of the
continuous development and innovation. The knowledge associated with new organizational
forms and with appropriate means of management may lead to agility and to gaining the
competitive advantage on the market. However, their embrace does not guarantee
organizational success without an adequate analysis of the ICT support. The present paper
points out certain aspects regarding inter-organizational knowledge within organizational
networks. The paper represents a starting point towards a more detailed analysis of inter-
organizational knowledge management and of the appropriate ICT support.
Acknowledgment

This paper was co-financed from the European Social Fund, through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number POSDRU/
159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The
Bucharest University of Economic Studies.
References

[1] N. Leung, S.K. Lau and J. Fan, "Enhancing the Reusability of Inter-Organizational
Knowledge: an Ontology-Based Collaborative Knowledge Management Network,”
Electronic Journal of Knowledge Management, vol. 7, no. 2, pp. 233-244, 2009.
[2] K.A. Al-Busaidi, “A Framework of Critical Factors to Knowledge Workers’ Adoption of
Inter-organizational Knowledge Sharing Systems,” Journal of Organizational Knowledge
Management, vol. 2013, pp. 1-11, 2013.
[3] N.K.Y. Leung, S.K. Lau and J. Fan, “An Ontology-Based Knowledge Network to Reuse
Inter-Organizational Knowledge,” in 18th Australasian Conference on Information
Systems, Toowoomba, 2007, pp. 896-906.
[4] I. Zouaghi, “Tacit Knowledge Generation and Inter-Organizational Memory Development
in a Supply Chain Context,” Systemics, Cybernetics and Informatics, vol. 9, no. 5, pp. 77-
85, 2011.
[5] E. Shijaku. Knowledge creation and sharing in an organization: An empirical analysis of
the New Product Development process. Internet:
https://www.theseus.fi/bitstream/handle/10024/21190/Shijaku_Elio.pdf?sequence=1,
2010 [February 20, 2015].
[6] T. Blecker and R. Neumann, "Interorganizational Knowledge Management: Some Perspectives for Knowledge Oriented Strategic Management in Virtual Organizations,"
Interorganizational Knowledge Management, pp. 63-83, 2000.
[7] J.H. Cheng, “Inter-organizational relationships and information sharing in supply chains,”
International Journal of Information Management, vol. 31, no. 4, pp. 374–384, 2011.
[8] S. Hoejmose, S. Brammer and A. Millington, “Green supply chain management: the role
of trust and top management in B2B and B2C markets,” Industrial Marketing Management,
vol. 41, no. 4, pp. 609–620, 2012.
[9] Y.H. Chen, T.P. Lin and D. Yen, “How to facilitate inter-organizational knowledge sharing:
The impact of trust,” Information & Management, vol. 51, pp. 568–578, 2014.
[10] Y. Jarrar, M. Zairi and G. Schiuma. Defining Organisational Knowledge: A Best Practice
Perspective. Internet:
http://ecbpm.com/files/Knowledge%20Management/Defining%20Organisational%20Kno
wledge.pdf, 2010 [February 20, 2015].
[11] M. Velasco. Knowledge transfer model for collaborative networks. Internet:
http://www.olkc2013.com/sites/www.olkc2013.com/files/downloads/146.pdf, 2013,
[February 20, 2015].
[12] D.G. Schwartz, Encyclopedia of Knowledge Management. Hershey, London, Melbourne,
Singapore: Ed. Idea Group Reference, 2006, pp. 1-902.
[13] A. Capasso, G.B. Dagnino and A. Lanza, Strategic Capabilities and Knowledge Transfer
Within and Between Organizations: New Perspectives from Acquisitions, Networks,
Learning and Evolution, Ed. Edward Elgar Publishing, 2005, pp. 1-377.
[14] M. Brannback, "R&D collaboration: role of Ba in knowledge-creating networks,"
Knowledge Management Research & Practice, vol. 1, pp. 28–38, April 2003.
[15] A.A. Bolazeva. Inter-organizational knowledge transfer mechanisms in the focal
company: a case study. Internet:
http://dspace.ou.nl/bitstream/1820/3698/1/MWAABolazevamei2010.pdf, 2010 [February
20, 2015].
[16] W.F. Boh, "Mechanisms for Sharing Knowledge in Project-based Organizations,"
Information and Organization, vol. 17, no. 1, pp. 27-58, 2007.
[17] S.M. Wagner and C. Buko, “An Empirical Investigation of Knowledge-sharing in
Networks,” The Journal of Supply Chain Management, vol. 41, no. 4, pp. 17-31, November
2005.
4D(ATA) PARADIGM AND EGOVERNMENT
Bogdan GHILIC-MICU
Bucharest University of Economic Studies, Romania
Marian STOICA
Bucharest University of Economic Studies, Romania
Cristian USCATU
Bucharest University of Economic Studies, Romania
Abstract. The move towards global informational society supported by information and
communication technology leads to a radical change of most classical concepts regarding
society structure and the way it is directed and controlled. In this context, we define the 4D(ata)
paradigm in terms of Big Data, Social Data, Linked Data and Mobile Data, as an emergent
model for supporting national and global eGovernment projects. The proposed model is highly flexible, adaptable to all four directions of electronic government: G2C, G2B, G2G, G2E. In this paper we will approach the four concepts defining the 4D(ata) paradigm
independently (as much as possible), highlighting the implications on eGovernment.
Additionally we will identify the synergic effect of existence and functioning of the 4D in
contemporary government systems.
Keywords: big data, social data, linked data, mobile data, eGovernment, information and
communication technology (ICT).
JEL classification: M15, O32, O38
1. Introduction
An essential component of informational society is e-democracy, which concerns the activity
and interactions of citizens, public institutions and political organizations through information
and communications technology (ICT). The purpose of these activities is the development and
promotion of democratic values through citizens’ participation in the decision-making process
together with the public authorities, so that citizens can really see the results of their efforts.
ICT facilities, especially communication environments used for interaction by citizens,
organizations and public institutions become fundamental instruments in the process of
modernization of society and government structures.
While electronic government mainly refers to accessibility of government services, electronic
democracy refers to the citizen’s active role in enlarging his possibilities through ICT. Thus,
electronic democracy allows citizens to get involved in public institutions’ activity, taking part
in the decisional process, and it allows the government to react adequately to citizens’ needs
[1].
The impact of the information society on the personal life and individual development of citizens may be analyzed from several points of view. Such an analysis must firstly consider the model of this
new type of society with all its political, economic and social characteristics both on the level
of individual countries and on international level.
This new concept, implemented at country level, helps users and has multiple beneficial effects.
Informational society relies heavily on large-scale use of ICT at work, in relations with authorities and public institutions, and in everyday life (for shopping, instruction, and various
other activities – figure 1).
Figure 1 - Informational society and its relation with individuals
In recent years numerous analyses have been carried out regarding the public sector and its relations with society. Because of increased demand for public services, budget restrictions and high personnel costs, but also thanks to new ICT features, the concept of electronic government, or eGovernment, was born and propagated.
2. Electronic government
The literature provides many expressions and definitions related to electronic government
concept. Still, there are some common elements that lead to a generally accepted definition.
Thus, in a European approach, electronic government is defined as the use of ICT in public
administration, along with organizational changes and acquiring new competencies, in order to
improve public services and democratic processes [2].
The concept itself refers, in principle, to the interactions between state public institutions and
the citizens through electronic means. These interactions are not limited only to accessing
information regarding current procedures or regulations. Electronic government also covers more sensitive elements such as digital signatures, electronic payment, electronic voting, laws and regulations, public procurement and electronic auctions, affidavits, licenses, approvals etc.
Governments are increasingly aware of the importance of changing the online services in order
to make them available to as many citizens as possible (table 1). However, the current approach is flawed and progresses slowly, leaving room for many improvements regarding the speed of accessing information, its quality and transparency, and its promotion to the proper users. The slow increase of eGovernment popularity compared to other online services leads users to distrust
online public services, creating a vicious circle.
Table 1 - eGovernment strategy benefits

Citizens
  Examples: information, culture, health, education, transaction advantages, tax collecting.
  Benefits: wide range of means of communication, reduced transaction costs, expanded services, openness towards democratic participation.

Businesses
  Examples: support programs, advice and help, regulations, tax collecting.
  Benefits: speed, reduced transaction costs, facilitated transactions.

Suppliers
  Examples: e-procurement.
  Benefits: reduced transaction costs, better stock management, collaborative environments.

Other components of the public sector
  Examples: communication between departments and agencies, and between local and central governments.
  Benefits: increased efficiency, reduced transaction costs, more efficient use of knowledge bases, more flexible arrangements.
Electronic government is an instrument that contributes to the harmonization of relations
between citizens and public authorities, based on mutual respect and interested collaboration
between the state and citizens. Both theoreticians and practitioners identify and recognize (by
use in specific projects) the four pillars of electronic government: G2C (government to
citizen/consumer), G2B (government to business), G2G (government to government /
administration), G2E (government to public employees). All these forms are based on
electronic government principles, synthesized in six recommendations:
A. Transparent partnership – all activities must be transparent, publicly discussed, considering
the opinions and ideas of all parties involved;
B. Accessible information – respect the citizens’ fundamental right of access to public
information;
C. Social orientation – implement electronic government considering the citizens’ needs;
D. Legislative harmonization – harmonize the legal framework with international regulations
and standards;
E. Protection and security – respect the constitutional rights and liberties of citizens in the
process of creating, storing, processing and transmitting information, including protection
of personal data, through means and methods of ensuring information security;
F. Priority for political, economic and social dimension against the technological aspects.
Electronic government provides administrations with an online environment for providing
information, observing the most important concepts of electronic environment [3]. Also, the
4D(ata) approach highlights the benefits of applying Linked Data and Big Data concepts. The
accent is on accountability of governments regarding public data. Also, Social Data concept
evaluates the social dimension of electronic government.
3. Big Data dimension of 4D(ata) paradigm
A multitude of references in literature puts Big Data either in the category of abstract concepts
or specific instruments of information and communications technology. For
comprehensiveness, our approach will consider both aspects. Thus, we may define Big Data as the concept used to describe datasets of such large size and high complexity that they prevent standard applications from processing them. The size of these datasets exceeds the ability of typical database management systems (DBMS) to collect, process, manage and analyze them. Because they may
come from a wide range of domains (social, political, economic, scientific, cultural etc.) they
are important for competition, increasing productivity, innovation and increasing the number
of consumers.
The premises of Big Data development are found, firstly, in the spectacular development of
ICT in recent years, which allows today's world to be better connected and easier to find,
locate, hear and listen to. From the perspective of ICT instruments, Big Data may be defined
as the technology that processes large volumes of data, beyond the usual abilities of
traditional databases. Thus, Big Data has two facets, like a coin: the descriptive part in the
concept (the objective facet) and the applicative part in the technology (the subjective facet).
Although one of the sectors that benefit most from Big Data is the business sector, the public
sector is not outside its influence [4]. On the contrary, Big Data may help many governments
serve their citizens and overcome national problems and challenges such as unemployment
and fraud (for example the American Medicare Fraud Strike Force program –
http://www.stopmedicarefraud.gov).
On the European side, the first government to employ Big Data was Great Britain, through
the Horizon Scanning Centre. Through this project, the British government tried to find
answers to the problems of global warming, regional and international tensions, and security.
The project with the most significant impact on electronic government was launched in 2009
through the public portal www.data.gov.uk. Since its first year, it has provided users with
thousands of datasets to help them understand governing practices and policies. The platform
offers citizens the possibility to get involved in electronic government, giving them e-citizen
status [5].
4. Social Data dimension of the 4D(ata) paradigm
There are multiple possibilities to define Social Media. Most definitions highlight three
important elements: (1) user-generated content, (2) communities and (3) Web 2.0 [6].
GlobalWebIndex, for example, quoted by [7], provides statistics regarding the use of Social
Media (http://www.globalwebindex.net). The top four platforms are Facebook, Google+,
YouTube and Twitter. The next three are social networks from China: Sina Weibo, Tencent
Weibo and Qzone.
In the context of the proposed 4D(ata) paradigm, Social Data means the structured information
obtained from analyses and statistics that describes the behavior patterns of individuals
regarding certain public interest subjects. The information is drawn from the use of electronic
platforms, mainly social platforms and web services that favor human interaction. This
information is gathered over a specific time frame and used to extrapolate certain behavior
patterns or collective trends of change (in specific domains of interests: culture, sport,
education, entertainment, fashion etc.).
From the perspective of electronic government, Social Data must be perceived as data
voluntarily created and disseminated by citizens through social platforms. This type of data is
mainly subjective and must be considered appropriately in e-government strategies. The
Social Data dimension is used by the government to understand public opinion and to
anticipate the reaction to the adoption of governmental decisions.
5. Linked Data dimension of the 4D(ata) paradigm
The term Linked Data (interconnected data) was introduced by Sir Tim Berners-Lee, director
of the World Wide Web Consortium (W3C) and creator of the World Wide Web (in 1990),
in one of his works on the architecture of the web space [1], [8]. The term describes a way of
publishing and interconnecting data in a structured form, starting from the idea that data
become more valuable and more credible when connected to other data. This model seeks to
standardize data from heterogeneous sources, relying mainly on the RDF (Resource
Description Framework) model to publish structured data on the web.
As a dimension of the 4D(ata) paradigm in relation to electronic government, Linked Data
describes a way to publish structured data so that they interconnect and increase their
usability through semantic queries. These data may be built on current standards, like HTTP,
RDF or URI (Uniform Resource Identifier), with the goal of enriching the databases of
computing systems, leading to more relevant results.
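To make the triple model concrete, the following minimal sketch (plain Python, using hypothetical example.gov URIs that are not real endpoints) shows how a statement about a statistical dataset can be expressed as subject, predicate, object triples and serialized in the N-Triples syntax used on the web of data:

```python
# Illustrative only: all URIs below are hypothetical, not real endpoints.
BASE = "http://example.gov/data/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

triples = [
    (BASE + "vehicle-registrations-2014", RDFS_LABEL,
     '"Vehicle registrations, 2014"'),
    (BASE + "vehicle-registrations-2014", BASE + "count", '"214000"'),
    (BASE + "vehicle-registrations-2014", BASE + "publishedBy",
     BASE + "transport-agency"),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else "<%s>" % o  # literal vs. URI object
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)

print(to_ntriples(triples))
```

Because every resource is identified by a URI, a consumer can follow the object URIs (for example, the publishing agency above) into other datasets, which is precisely what makes the data "linked".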
There are numerous benefits for governments that use Linked Data standards to publish data.
Many governments started a decade ago to create a governmental linked data web space
inside an interconnected data cloud. These attempts involve a more responsible and secure
publication of data, and for consumers this translates into easier and more flexible access to
government data.
In recent years there have been more and more government projects involving the use of
Linked Data. Starting with the www.data.gov platform of the United States and continuing
with the European Union's LATC (Linked open data Around The Clock), PlanetData,
DaPaaS (Data and Platform as a Service) and Linked Open Data 2 (LOD2) projects, they all
provide thousands of datasets for every user.
Linked Data standards provide numerous opportunities for government actions regarding
statistics and geo-spatial information because the most useful datasets also contain statistical
information, whether it is the number of vehicles registered in a year or the real location of a
certain event. Both domains have the same large number of interested users, thus the use of
the Linked Data standard is beneficial.
One example of Linked Data for geo-spatial information is provided by the British government.
Through the INSPIRE directive, Great Britain ensures the exchange of spatial information
among European countries. Among other functionalities, the directive provides identifiers for
spatial objects and a resolution mechanism (built on standard web architecture).
6. Mobile Data dimension of the 4D(ata) paradigm
Like the other three dimensions of the 4D(ata) paradigm related to electronic government,
Mobile Data is a natural consequence of technological development. The mobile phone has
evolved from a simple voice device into a multimedia communications device, able to access
and transfer audio and video data, and functioning also as a global positioning device, an
electronic wallet etc. In
this context, government applications may take advantage of the functionalities of this widely
used device.
Correlated with the terms previously analyzed, Mobile Data means the possibility to provide
citizens with a collection of instruments for the strategic use of governmental services or
applications available on mobile devices, laptops and tablets over a wireless internet
infrastructure.
The relation between citizens and the government has changed since the appearance of the Open
Data concept and mobile applications. Citizens and government now interact through
applications for public transportation payment via mobile phone, emergency applications
etc., against the background of a trend to provide more transparent public data. In a
continuously changing society, adapting to more flexible economic and political conditions,
the changes do not stop here (see the concept of democracy 3.0).
For example, in the USA, government agencies have developed applications for airport
security (MyTSA – Transportation Security Administration) or applications that help small
enterprises comply with various regulations. Other applications provide instructions for
emergency situations, and maps to shelters and rescue centers. Still, the potential of mobile
data is exploited by governments only on a small scale.
7. Conclusions
Electronic government must be a goal in itself. Unfortunately, in Romania all the ideas and
efforts of the government have so far been only answers to requests from outside and were
oriented toward technology, neglecting the social, political and economic aspects. A more
complex approach, better coordinated and closer to Romanian realities, may yield better
results and lower costs. A few failures in the implementation of western policies in other
domains should have taught us already that in Romania it is hard to build on a large scale due
to political cycles, budget restrictions and mentalities. However, we must not overlook
projects that enjoy continued success (for example the governmental portal for public
procurement SEAP – www.e-licitatie.ro).
Bibliography
[1] W3C, available: http://www.w3.org/standards/semanticweb/data, 2013, [Feb. 20, 2015].
[2] Roșca I. Gh., Ghilic-Micu B., Stoica M. – eds., Informatica. Societatea Informațională. E-
Serviciile, Economica Publishing House, 2006.
[3] Programul de guvernare 2013-2016, available: http://data.gov.ro/about, [Feb. 20, 2015].
[4] European eGovernment Action Plan 2011-2015. Internet: http://ec.europa.eu/digital-
agenda/en/european-egovernment-action-plan-2011-2015, Digital agenda for Europe, A
Europa 2020 Initiative, [Feb. 20, 2015].
[5] Internet, available: http://data.gov.uk/faq, [Feb. 20, 2015].
[6] T. Ahlqvist, A. Bäck, M. Halonen and S. Heinonen, Social Media Roadmaps. Exploring
the futures triggered by social media, [On-line]. VTT Technical Research Centre of
Finland, Espoo, 2008, VTT Tiedotteita research notes 2454, ISBN 978-951-38-7247-,
available: http://www.vtt.fi/publications/index.jsp [Feb. 20, 2015].
[7] M. Hu and B. Liu, "Mining and Summarizing Customer Reviews", in Proceedings of the
ACM Conference on Knowledge Discovery and Data Mining (ACM SIGKDD 2004),
Seattle, Washington, 2004.
[8] Epimorphics Ltd., available: http://www.epimorphics.com/web/resources/what-is-linked-
data, 2012, [Feb. 20, 2015].
AUTOMATIC USER PROFILE MAPPING TO
MARKETING SEGMENTS IN A BIG DATA CONTEXT
Anett HOPPE
CheckSem Research Group
Laboratoire Electronique, Informatique Et Images (LE2I) UMR CNRS 6306
University of Burgundy, Dijon, France
Ana ROXIN
CheckSem Research Group
Laboratoire Electronique, Informatique Et Images (LE2I) UMR CNRS 6306
University of Burgundy, Dijon, France
Christophe NICOLLE
CheckSem Research Group
Laboratoire Electronique, Informatique Et Images (LE2I) UMR CNRS 6306
University of Burgundy, Dijon, France
Abstract. Within the discussion about the analysis methods for Big Data contexts, semantic
technologies often get discarded for reasons of efficiency. While machine learning and
statistics are known to have shortcomings when handling natural language, their advantages
in terms of performance outweigh potential concerns. We argue that, even when handling
vast amounts of data, the usage of semantic technologies can be profitable, and we
demonstrate this by developing an ontology-based system for automatically mapping user
profiles to pre-defined marketing segments.
Keywords: marketing segment, user profiling, semantic Web, Web mining.
JEL classification: M31, L86, D80
1. Introduction
Due to the rising mass and complexity of information and products available on the Web,
content customization becomes more and more crucial to enable efficient usage. As stated in
a paper
from 2000 [1], technology brings vendor and customer “closer than ever before”, as the former
are able to follow every user’s path on the Web and adapt offers and contents accordingly.
Mass customization, the provision of individualized content for a large number of customers,
heavily relies on automatic analysis of usage histories, web contents and their structures.
Profiling approaches have to connect content information with each user’s individual
navigation history. On a web scale, this qualifies as a "Big Data" problem [2]. A system
aiming for effective user profiling has to cope with these information masses and deduce
appropriate content to display in quasi-real time, reacting to momentary and long-term
information needs alike.
In the case of the MindMinings system presented here, we aim for an application mapping user
profiles (built from implicit information) to marketing segments. More specifically, we have
been working with experts from digital advertising to identify attributes and concepts that are
crucial for the prediction of a user's future consumer behavior. On the one hand, the
application of semantic technologies allows us to maintain a richer image of each user within
the system – by
extending the keyword-based representation to an ontology structure that relates real-world
concepts with their semantic relationships. On the other hand, the limitation to those entities
that have been identified as pertinent for consumer segmentation allows lightweight, rapid
processing.
The implicit information that builds the base for the profiling consists of the browsing
information that is available to each online publisher: user IDs, paired with the contents
requested and a user agent. Based on this information, the goal is to profile the user's
engagement with certain contents and to predict the likelihood that he or she will react to a
certain advertisement content, depending on time and the device used. All data modelling has
been included in a customized domain ontology (further described in Section 3), which is
integrated in a flexible system that extracts the raw user information from a structured file,
retrieves the web contents in question and relates them to the content-related concepts within
the ontology, allowing on-the-fly segmentation of user activity.
2. Presentation of the implemented approach
The important bottleneck that has been described in relation to semantic technology comes
into play when extracting information through text analysis [7]. Syntactic and semantic
analyses of natural language are costly and have been largely avoided when handling
immense amounts of data [3]. Decoupling the semantic analysis from the active user profiling
enables us to avoid this shortcoming of semantic technology. Therein, we benefit from the
practical setting of the industrial application: due to privacy concerns, every online publisher
has access only to those parts of the navigational history that happen on her websites or those
of collaborators. Even though this might involve a conglomerate of differently themed
websites, it is still a limited set of contents that can be continuously monitored and analyzed,
with the relevant semantic information kept in the system. In consequence, the task at
runtime is reduced to connecting the already available semantic page information according
to the user's individual behavior, and deducing the inherent patterns.
The semantic page information within the system is updated on a regular basis, according to
the lifecycle of the indexed pages. To maintain a manageable knowledge base, contents that
are vital have to be identified, while contents that are outdated or uninteresting for the user
base have to be discarded. At the moment of writing we consider a metric composed of
incoming/outgoing link connections, the age of the page and the reappearance of its core
concepts in novel articles as a good starting point for an automatic judgment. This allows us
to state whether a webpage is semantically expressive or not. Based on this metric, we
perform keyword extraction and semantic disambiguation of the extracted keywords only for
webpages considered semantically expressive. [4] presents a deeper view of the
MindMinings profiling system. The article at hand will mainly focus on the underlying
ontology and the application developed.
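As a rough illustration, the expressiveness metric described above can be sketched as a weighted score over link connectivity, page age and concept reappearance; the weights, normalization constants and threshold below are our own assumptions for the sketch, not values from the system:

```python
# Hypothetical sketch of the expressiveness judgment: weights,
# normalization constants and the threshold are illustrative assumptions.
def is_semantically_expressive(in_links, out_links, age_days,
                               concept_reappearances, threshold=0.5):
    connectivity = min((in_links + out_links) / 50.0, 1.0)  # link connections
    freshness = max(0.0, 1.0 - age_days / 365.0)            # age of the page
    recurrence = min(concept_reappearances / 10.0, 1.0)     # concepts reused in novel articles
    score = 0.4 * connectivity + 0.3 * freshness + 0.3 * recurrence
    return score >= threshold

# A well-linked, recent page whose concepts keep reappearing qualifies
# for keyword extraction and semantic disambiguation:
print(is_semantically_expressive(30, 25, 60, 8))   # True
# An isolated, old page with no reappearing concepts does not:
print(is_semantically_expressive(1, 0, 700, 0))    # False
```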
3. The ontology
3.1. Overall view
The customized domain ontology constitutes the heart of the system. It has been carefully
designed to capture all information that is relevant for the profiling process, while avoiding
unnecessary complexity. As none of the authors has a background in digital advertising, this
working step has been accomplished in dialogue with domain experts.
The goal was a data structure that has the facilities to accompany the profiling process while
being slim. Hence, we limited the scope to those elements that are of high relevance for the
application domain. As a result, we obtain several modules with a varying degree of adaptation
to digital advertising. The basic entities of the user profile consist of attributes that are quite
generically used across profiling applications. Others, as for example the chosen core
concepts for the qualification of the web contents, feature a set of terms that is highly specific
to consumerism and commerce.
In those cases, we head for a modular ontology design, grouping the concepts in question under
a super-concept. The distinction between specialized and generic components makes it
possible to replace the application-specific elements when transferring the data structure to an
alternative domain. Furthermore, it enables the integration of a more generic, general-purpose
ontology to facilitate comparative testing with alternative approaches. Figure 1 shows a
high-level overview of the developed ontology modules and how they connect.
Figure 1 - Schematic view on the top-level classes of the ontology and their relationships
3.2. Main concepts
3.2.1. User ID
The center of the profiling ontology is, of course, the user, identified by an identification
string. As an adaptation to the terminology of digital advertising, this string is called "BID",
an abbreviation for "browser identification". The BID concept constitutes the center piece
and connection point for all user-related information, be it the modelling of her past
navigations or high-level segment affiliations. It can be seen as the most generic constituent,
as all profiling processes have to include some way of distinguishing one user from the rest
of the group (be it by a string, an ID number or her name).
3.2.2. Webpage
Web resources are the main source of semantic information. The respective concept is thus the
second anchor point within the ontology. It is identified by a string that contains its URL and
may be connected to further information concerning its domain and owning publisher. This
information is stored because a certain base domain might impose a bias on the topics
covered within it. The term "politics" appearing in a low-level domain, for instance, gives a
strong hint about the topics covered in the child pages. Similarly, a certain partner might be
connected to a limited set of preferred user segments (user profiles mapped to marketing
segments).
3.2.3. Navigation history
The navigation history as depicted in the server logs contains the user ID, the page visited,
time stamps, basic information about the device used and, at times, whether that particular
browsing
activity represents a successful advertising conversion (meaning that the page was reached
by clicking on a displayed ad).
The entities related to this part of the user profile are "Hit" and "Session", respectively. A
"Hit" groups all information captured for one single user event – the visited web page, the
time stamp and the user agent. All these data are modelled using OWL datatype properties,
including a boolean variable that captures whether the user activity is a reaction to a
displayed advertisement.
The "Session" concept groups several clicks based on a time criterion. For the moment, the
baseline is to group all hits that have been performed less than thirty minutes apart.
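The thirty-minute baseline can be sketched as follows (field names and sample data are illustrative, not the system's actual identifiers):

```python
from datetime import datetime, timedelta

# Hits less than thirty minutes apart belong to the same session.
SESSION_GAP = timedelta(minutes=30)

def group_sessions(hits):
    """hits: list of (timestamp, url) tuples, assumed sorted by time."""
    sessions = []
    for ts, url in hits:
        # Compare with the timestamp of the last hit of the last session.
        if sessions and ts - sessions[-1][-1][0] < SESSION_GAP:
            sessions[-1].append((ts, url))
        else:
            sessions.append([(ts, url)])
    return sessions

hits = [
    (datetime(2015, 4, 30, 9, 0), "/news/sports"),
    (datetime(2015, 4, 30, 9, 10), "/news/kids"),
    (datetime(2015, 4, 30, 11, 0), "/shop/diapers"),  # gap > 30 min: new session
]
print(len(group_sessions(hits)))  # 2
```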
3.2.4. Web content information
Each web page is related to summary concepts according to the results of the semantic
analysis. This involves the key concepts that describe the document's content and, based on
them, an affiliation to topic categories that qualify the page on a higher level. To simplify the
work with the experts, these entities are, at present, named according to the terminology that
we encountered in the domain: "Keyword" and "Universe", respectively. Therein, the
instances of "Keyword" capture the found key concepts. The system handles semantically
disambiguated and qualified key concepts, not plain keywords. The term "Universe" refers to
a customized topic category. The ensemble of universes constitutes a hierarchy of first-level
topic concepts such as "Foods and Drinks" and "Kids". Subordinated are second-level
subdivisions such as "Baby food" and "Diapers" in the case of the category "Kids".
The categorization scheme is probably the module most influenced by the application to
digital advertising. For instance, the above division into such a fine-grained topic as
"Diapers" does not seem intuitively graspable. The same applies to first-level categories such
as "Luxury" or "Hazardous Games" that, from a general point of view in web classification,
do not seem of such importance. They are, however, important factors for customer
segmentation and therefore gain high value in the classification.
We make this highly specific categorization scheme interoperable by relating the core
concepts to their counterparts in existing semantic repositories such as DBpedia [8]. This
serves unique identification on the one hand; on the other hand, it enables us to take
advantage of already existing relational information. Furthermore, all categories are
sub-classes of one central concept, "Universe". The whole categorization module can thus be
replaced with an alternative scheme – be it one adapted to another application domain or a
general classification standard (such as the category set provided by the Open Directory
Project [6]).
3.2.5. Mapping a user profile to marketing segments
The actual qualification of the user happens in two distinct modules that aim to separate two
levels of abstraction – the generic user profile and the application-specific segmentation into
customer groups. The "Profile" class groups sub-classes that capture basic user properties
deducible from browsing habits. This includes a group of socio-demographic attributes, such
as age and gender, and behavioral attributes such as activity periods during the day or an
interest in luxury goods. Some of these criteria are quite generic to profiling applications –
socio-demographic information is of interest in numerous domains. The behavioral elements,
however, clearly mirror the commercial focus of the target application.
So do the user segments that have been exemplified. Indeed, this part of the user profile will
be highly variable, depending on current campaigns and clients. The sub-classes of
"Segment" contain all complex user models that are obtained by combining attributes from
the lower-level user profile. As a very simple example, one segment "sporty mom" could
capture all individuals recognized as being female, having children and showing an interest
in
sports-related publications, for targeting a certain brand of sportswear. Similarly, one could
intend to target frequently travelling business people with specific offers from the transport
domain.
The usage of membership degrees for the definition of a segment seems quite
straightforward, as it is based on notions that are intuitively interpretable: the certainty with
which it can be stated that a user belongs to a certain age group, or the portion of his
navigation that is related to a certain topic category.
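As an illustration of this intuition, a segment's membership degree could be combined from the degrees of its constituent profile attributes; the attribute names and the min operator below are assumptions made for the sketch, not the system's actual definition:

```python
# Illustrative sketch: a segment's membership degree is derived from the
# degrees of its required profile attributes (min combination assumed).
def segment_membership(profile, required_attributes):
    """Degree to which a profile matches a segment definition."""
    return min(profile.get(attr, 0.0) for attr in required_attributes)

profile = {
    "female": 0.9,           # certainty of the socio-demographic attribute
    "has_children": 0.8,
    "interest_sports": 0.7,  # portion of navigation on sports topics
}

sporty_mom = ["female", "has_children", "interest_sports"]
print(segment_membership(profile, sporty_mom))  # 0.7
```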
In contrast, the computation of membership degrees when relating keywords to categories, or
categories to profile attributes, poses a conceptual problem. The "buzzwords" related to
topics evolve over time – a fact that becomes especially clear when considering the
technology domain: new devices appear and replace the former key terms, and old terms lose
importance. In previous works, this issue has been solved by analyzing co-occurrence
patterns of terms (e.g. [5]). In our case, we aim to combine two sources of information: the
occurrence patterns in the dynamic corpus of web documents and the rather stable external
knowledge repository. The balancing of the two will have an important influence on the
relational structure of the ontology.
4. Application example
A few approaches described in the literature have been tested in actual industrial
environments [9]. We illustrate the current implementation of our system with the example
of a user profile automatically mapped to a pre-defined marketing segment, realized on the
basis of the ontology. A sample campaign described to us by the experts targeted individuals
that were (a) mothers, according to information known from Customer Relationship
Management (CRM), and (b) mainly interested in sports-related topics. Figure 2 shows a
demonstration interface that allows the combination of a list of sample URLs with CRM
information. The URLs are displayed on the left side, together with a classification of their
contents ("Justice and administrations" and "Sports" in this case). On the right-hand side, a
set of drop-down lists allows entering "simulated" CRM information. This process is
performed automatically in the system; the manual operation serves only to demonstrate the
inference engine.
Figure 2 - Demo interface: Combination of content-related features and CRM (male, over 65, no children)
Besides, the rule base within the ontology is limited to the single, above-described segment
with the short label "SportyMom". Thus, the entered information of a male person above the
age of 65 and without children does not match the description – hence, no segment
information can be deduced for the User ID in question.
The situation changes when we alter the CRM input so that it identifies the user as female
and a mother, as can be seen in Figure 3. When assuming a female user that has, indeed, a
child, the inferred triples are added to the database and a segment is deduced.
Figure 3 - Demo interface: Combination of content-related features and CRM (female, 35 to 49 years old, one
child)
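The rule applied in this demo can be sketched as a crisp check (the attribute names are illustrative, not the ontology's actual identifiers):

```python
# Crisp sketch of the demo rule: the "SportyMom" segment is deduced only
# when the CRM attributes and the content classification match its definition.
def matches_sporty_mom(gender, has_children, top_universe):
    return gender == "female" and has_children and top_universe == "Sports"

# CRM input of Figure 2 (male, over 65, no children): no segment deduced.
print(matches_sporty_mom("male", False, "Sports"))    # False
# CRM input of Figure 3 (female, one child): the segment is deduced.
print(matches_sporty_mom("female", True, "Sports"))   # True
```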
5. Conclusions
In sum, we presented how, using an ontology, one can capture the basic concepts of a user
profiling process. The ontology integrates expert knowledge about the nature of the concepts
involved in the profiling process, but also about the rules applied for mapping a certain
profile to a certain marketing segment. Moreover, critical design decisions have been taken
considering the specific application context (Big Data on a web scale):
• Volume: We focus strongly on the essential information – the deduced profile entities are
centered on the commercial factors that decide the placement of an advertisement, in
contrast to more generic approaches.
• Velocity: All time-consuming information extraction has been decoupled from the actual
profiling system.
• Variety: Our system allows coping with the wide variety of HTML formats of webpages.
• Veracity: The validation of information is a strength of the ontology. Each concept within
the ontology carries a unique identifier mapped to its counterpart in an external Linked
Data resource.
• Value: We rely on navigation histories of actual users, as they are already used for content
recommendation.
In the near future, we will have to engage in detailed performance testing to prove the
appropriateness of our system.
References
[1] J. Srivastava, R. Cooley, M. Deshpande and P.N. Tan, “Web usage mining: Discovery and
applications of usage patterns from web data”. ACM SIGKDD Explorations Newsletter
1(2), 2000, pp. 12-23.
[2] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A. H. Byers, “Big
data: The next frontier for innovation, competition, and productivity,” The McKinsey
Global Institute, Tech. Rep., May 2011.
[3] E. Cordo. Building Better Customer Data Profiles with Big Data Technologies, available:
http://data-informed.com/building-better-customer-data-profiles-with-big-data-
technologies/, December 10, 2012 [Feb. 25, 2015].
[4] A. Hoppe, A. Roxin and C. Nicolle, “Dynamic semantic user profiling from implicit web
navigation data”, in proc. of the 13th International Conference on Informatics in Economy,
2014, ISSN: 2247 - 1480
[5] M. Abulaish and L. Dey, "Biological ontology enhancement with fuzzy relations: A text-
mining framework," in Proceedings of the 2005 IEEE/WIC/ACM International
Conference on Web Intelligence, IEEE, 2005, pp. 379-385.
[6] C. Sherman, “Humans do it better: Inside the open directory project,” Online, vol. 24, no.
4, pp. 43–50, 2000.
[7] A. Halevy, P. Norvig, and F. Pereira, “The unreasonable effectiveness of data,” Intelligent
Systems, IEEE, vol. 24, no. 2, 2009, pp. 8–12.
[8] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, “Dbpedia: A
nucleus for a web of open data,” in The semantic web. Springer, 2007, pp. 722–735.
[9] S. Calegari and G. Pasi, "Personal ontologies: Generation of user profiles based on the
YAGO ontology," Information Processing & Management, vol. 49, no. 3, pp. 640-658,
2013, Personalization and Recommendation in Information Access, available:
http://www.sciencedirect.com/science/article/pii/S0306457312001070 [Feb. 25, 2015].
DECISION-MAKING PROCESS ASSISTANCE USING PRECISION
TREE MODULE OF PALISADE DECISION TOOLS SUITE
Georgeta SOAVA
University of Craiova, Faculty of Economics and Business Administration
Mircea Alexandru RADUTEANU
University of Craiova, Faculty of Economics and Business Administration
Catalina SITNIKOV
University of Craiova, Faculty of Economics and Business Administration
Abstract. In the information society, information becomes a "strategic resource" for any
business; systems integration has become particularly important because of the facility of
shared use of data and their movement within and outside the company. In this paper, we
wanted to present the importance and necessity of using information technologies in
substantiating and making decisions. Thus, we presented some general aspects of the
computerization of the decision process and reviewed some theoretical considerations on
interactive decision support systems and the stages of decision-making in assisted mode.
Based on these, we considered a particular case of a decision-making process under risk and
uncertainty at a reputable company in Romania, SC Guban SA. The problem studied
consisted in deciding whether or not to accept producing under another brand. To address
this problem we used the PrecisionTree module of the Palisade Decision Tools Suite, which
offers solutions and recommendations. This model provides decision-makers with the ability
to quickly process the available information and to address complex, time-consuming
problems in real time, creating a strategic and competitive advantage for the company.
Keywords: company, decision, decision-tree, information, technologies
JEL classification: M1, D7
1. Introduction
The emergence of a new technological paradigm, organized around new information
technology that is more powerful and more supple, even allows information to become a
product of the production process. In the new economic environment, the IT tool tends to
become a weapon; if you do not have it, your main competitors will use it against you. For
companies, the technological, behavioural and especially organizational implications are
extremely high: they are forced to expand and restructure their production circuits, or even
reduce them, so that information circulates more quickly, to manage products with maximum
efficiency by eliminating stocks where possible, and to "think", produce and react to market
signals in real time.
The wide proliferation of informatics and the current information revolution put their strong
imprint on decision processes. At the moment we notice a computerization of the decision-
making process at more and more companies, namely the use of computers in all stages of
this process, through specialized programs that can substantially increase the speed and
effectiveness of decision making. The cybernation of the decision-making process generates
multiple specific advantages in the enterprise: increasing the degree of substantiation and
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
293
rationality of decisions through the use of a larger volume of information processed with sophisticated procedures and models; it speeds up decisions, providing higher chances of acting within the optimal decision period; it facilitates and increases the accuracy of managers' assessments of the effects of decisions, especially economic ones; and it partially frees managers from routine tasks, in particular the collection, processing and verification of information, allowing them to focus on the major components of their work.
2. Decision support systems
The automation of data collection on the evolution of the various economic phenomena characteristic of the firm and its external environment, together with the use of complex computational models and their analysis, leads to an explosion of information available to the decision maker, information that could not be used effectively without decision support systems.
Decision support systems are interactive systems that, through decision models and specialized databases, provide information to assist managers in making decisions [1]. The main objective of such a system is to streamline the decision-making process by using information technologies. Over time, many definitions have been formulated for decision support systems (DSS). Notable in this regard are Sprague and Carlson, who define a DSS as "an interactive computer system that assists decision makers in the use of data and models in order to solve structured, semi-structured or unstructured problems" [2]. Kanter believes that "DSS is used in less structured problems, where art meets the science of management" [3]. Kroenke defines a DSS as "a set of tools, data, models and other resources that decision makers use to understand, evaluate and solve unstructured problems" [4]. O'Brien states that a DSS is a "system that provides interactive information to support decision-makers in the decision-making process" [5].
Reviewing the views presented above, the following conclusion emerges: a DSS is an information system that incorporates data and models used to support, not replace, the human factor in decision making, when the decisions to be taken are semi-structured or unstructured. In such a decision and information environment, the use of DSS and intelligent systems can help decision makers not to lose sight of the major influencing factors in problem analysis, to stay informed on the reasons that motivate decisions, to enrich their own experience and knowledge base through access to the knowledge and experience of others, and to transfer their knowledge in the form of knowledge bases or diagnoses. In this way, the use of decision support systems is an opportunity beneficial to decision-making activity.
A decision support system is a powerful tool and should be an integral component of managerial work: it extends the manager's ability to process viable information quickly and to treat complex, time-consuming problems; it reduces the time allotted to the decision process, improves the reliability of decision making, encourages exploration and learning, and creates a strategic and competitive advantage for the company.
The analysis of decisions starts, on the one hand, from the premise of accepting the human limits of information processing and, on the other hand, from the need to incorporate the judgments and intuitions, the results of imagination and the creativity of decision makers. The aim is not only to solve (in the sense of providing solutions for) different decision problems; the primary aim is to help and encourage the decision maker to think, to give structure to the problem in order to understand it better, to choose and "clarify" the goals set, and to identify (or imagine) more easily the alternatives of action to be evaluated.
3. Palisade Decision Support package – Precision Tree
At company level, decision situations are very different, and the decision-making processes involved are extremely heterogeneous in terms of structural and functional parameters. A deeper approach to the main components involved is therefore required: the decision maker, the set of decision variants, the set of decision criteria, the environment, the set of consequences and the objectives. Currently, decision analysis is supported by a number of methods and techniques using specific tools such as risk analysis, influence diagrams and decision trees.
DecisionTools Suite is software produced by Palisade Corporation that provides solutions for risk analysis, representing robust and innovative decision support. DecisionTools Suite contains multiple modules which are added to Microsoft Excel, ensuring flexibility and ease of use, and it addresses a wide range of clients; it has been widely adopted in almost all fields. Since decisions at company level can be complex and involve massive amounts of data, DecisionTools Suite is designed for computational speed and maximum capacity while maintaining ease of use.
In this section, we present the facilities offered by this software for interactive decision support under risk and uncertainty, using the PrecisionTree module.
In business in general, but especially in a company's activity, decisions often depend not only on their immediate consequences but also on the more remote consequences of future decision-making processes. It should be noted that under globalization, uncertainty manifests itself significantly. The evaluation of such cascading decisions can be performed using a decision tree.
PrecisionTree is part of the Palisade decision support package; it brings advanced concepts of modelling and decision analysis into the Microsoft Excel environment, based on the traditional decision tree model. It is used to structure decisions so as to make them more understandable to those who need to interpret them. A decision analysis provides a simple report of the preferred route for decision making and the risk profile of all possible outcomes. It can also present the results so as to make visible the trade-offs, conflicts of interest and important objectives.
The decision tree method involves the consideration of "operative risk", based on the uncertainty of the situation and the impossibility of forecasting it accurately, and involves the following steps: (a) definition of the decision moments, of the random moments and of their succession; (b) collection of information on the various action alternatives; (c) determination of the states of nature and of the chains of events; (d) estimation of the effects at the end of each chain of events, and of the criteria for evaluating effectiveness; (e) identification of a policy for selecting among the decision alternatives; (f) sensitivity analysis for the optimal solution; (g) final analysis and issuing of recommendations for decision making.
Determining the optimal solution actually means finding the best way, the most appropriate branch of the tree, working backward from the final nodes to the initial node.
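This backward reduction (often called "rollback") can be sketched in a few lines of Python. This is only an illustrative sketch of the general algorithm, not of the PrecisionTree implementation; all node names, values and probabilities below are hypothetical.

```python
# Rollback of a decision tree: chance nodes reduce to their expected
# value, decision nodes reduce to the best (highest-EV) branch.

def rollback(node):
    """Return the expected value of a (nested) decision-tree node."""
    kind = node["type"]
    if kind == "terminal":
        return node["value"]
    if kind == "chance":
        # Expected value: probability-weighted sum over the outcomes.
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        # The decision maker picks the branch with the highest expected value.
        return max(rollback(child) for child in node["options"])
    raise ValueError(f"unknown node type: {kind}")

tree = {
    "type": "decision",
    "options": [
        {"type": "chance", "branches": [
            (0.6, {"type": "terminal", "value": 100}),
            (0.4, {"type": "terminal", "value": -50}),
        ]},
        {"type": "terminal", "value": 30},
    ],
}
print(rollback(tree))  # 0.6*100 + 0.4*(-50) = 40.0, which beats the certain 30
```

The same recursion, applied from the final nodes back toward the root, is what reduces the whole tree to a single recommended branch.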
We will try to illustrate the importance of decision analysis through the decision tree, and the facilities offered by the PrecisionTree package from Palisade in decision making, starting from a situation occurring at the shoe manufacturer Guban.
4. Case study - using Precision Tree at Guban Company
The Guban Company has a long history; the modern models of each season are made only from leather of the finest quality and are handmade [6].
The company manufactures and sells footwear in Romania and on the European market, about
one million pairs per year. The margin realized by the company on a pair of shoes is 0.4 euro. A transnational company from Asia, with a strong network of department stores around the world, proposes that Guban manufacture shoes under the transnational company's brand; Guban would earn 0.1 euro per pair of shoes but would have a much bigger market outlet. In keeping with its tradition, Guban wants to preserve its brand. It is estimated that the emergence of this new brand would create a barrier on Guban's current market, while the new market is expected to be triple the one Guban would lose. If Guban refuses the proposal, it is possible that its main competitor (Leonardo) has received the same request and will accept it, with a probability of 60%. In that case, to avoid losing more customers, Guban may initiate an advertising campaign costing 300,000 euros; this would decrease the profit margin to 0.35 euro per pair. We suppose that, with a probability of 50%, the transnational company will not be able to start a 400,000-euro advertising campaign of its own if Guban reacts.
In this case, we have the following technical data:
1. If Guban accepts the Asian company's offer, it can lose: 10% of the market with a probability of 30%, 15% of the market with a probability of 50%, and 20% of the market with a probability of 20%;
2. If Guban refuses the proposal and the competitor accepts it, there are three alternatives:
a) Guban does nothing;
b) Guban starts an advertising campaign, and in this situation it can lose:
- if the competitor does nothing: 0% of the market with a probability of 10%, 5% of the market with a probability of 60%, and 10% of the market with a probability of 30%;
- if the competitor also begins an advertising campaign: 10% of the market with a probability of 30%, 15% of the market with a probability of 40%, and 20% of the market with a probability of 30%;
c) Guban decreases the product price, and in this case there are two versions:
- if the competitor does nothing, Guban loses: 5% of the market with a probability of 30%, 10% of the market with a probability of 50%, and 15% of the market with a probability of 20%;
- if the competitor begins an advertising campaign, Guban may lose: 10% of the market with a probability of 40%, 15% of the market with a probability of 40%, and 20% of the market with a probability of 20%.
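As an aside, the expected market-share loss on each of these branches follows directly from the figures above. The short Python sketch below is illustrative only (PrecisionTree performs this kind of computation inside Excel); the scenario labels are ours.

```python
# Expected market-share loss for each alternative, computed from the
# (loss %, probability) pairs listed in the technical data above.

def expected_loss(outcomes):
    """outcomes: list of (loss_percent, probability) pairs."""
    return sum(loss * p for loss, p in outcomes)

scenarios = {
    "accept offer":                  [(10, 0.3), (15, 0.5), (20, 0.2)],
    "refuse, campaign, rival idle":  [(0, 0.1), (5, 0.6), (10, 0.3)],
    "refuse, campaign, rival ads":   [(10, 0.3), (15, 0.4), (20, 0.3)],
    "refuse, cut price, rival idle": [(5, 0.3), (10, 0.5), (15, 0.2)],
    "refuse, cut price, rival ads":  [(10, 0.4), (15, 0.4), (20, 0.2)],
}

for name, outcomes in scenarios.items():
    print(f"{name}: {expected_loss(outcomes):.1f}% expected loss")
# e.g. accepting the offer carries an expected loss of 14.5% of the market
```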
Using PrecisionTree, we build the decision tree, evaluate the results on each branch of the tree, and calculate the expected value for each variant. Once the challenge coming from both the transnational company and the competition is launched, the Guban SA Company can no longer be sure of the margin received for a pair of footwear, so a certain amount of margin variability results. This variability also exposes the manufacturer to the danger of losses: in the constructed decision tree, the amount the company can expect to collect per pair of shoes under the circumstances can be calculated automatically, as can the average, or expected, value. The model is shown in Figure 1 (only a part of the decision tree, as the full tree is very large).
Figure 1 - Decision tree
(Source: Processing of own data using PrecisionTree 6 of PALISADE DECISION TOOLS SUITE) [7]
Once we have designed the model and defined its parameters, we can begin a decision analysis that yields statistics, graphics and management policy suggestions. It is important to note that values entered or edited in the decision model take effect in real time. According to the results, we cannot retain the cases in which Guban refuses and the competition accepts, with Guban then starting an advertising campaign or cutting prices: the mathematical expectations and results corresponding to these two decisions are very small. Calculating the optimal variant in a decision tree involves reducing the chance nodes toward the higher gain, computing the average expected values from right to left. It is necessary, however, to know the consequences of following the suggested branch, i.e. to know the risk profile.
A risk profile is a distribution function that describes the chance associated with each possible outcome of the decision model. The risk profile shows graphically the uncertainty of the chosen decision; to build it, we go through the following steps: (1) for each branch of the decision tree, the expected value of the gain is calculated by multiplying the probabilities on the branches by the corresponding sequential gains; (2) nodes are reduced by considering only the optimal branches, and these steps are repeated until the tree is completely reduced to a single chance node with a set of values and corresponding probabilities; (3) the final set of value-probability pairs defines a discrete probability distribution, which is used to construct the risk profile. The risk profile is represented as a discrete distribution in a probability chart (showing the probability that the result equals a certain amount). We observe that on the branch of acceptance of the offer for the Asian market, the value gained is higher than on the other branches, which is clearly shown graphically. A decision analysis complements the standard statistics on the model, depending on the chosen decision, and is provided in real time as values are entered or edited in the decision tree. These statistics (the expected model value, the minimum, the maximum and the deviation of possible outcomes) present a decision tree profile.
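The risk-profile construction described in steps (1)-(3) can be sketched as follows. The tree fragment and its values are hypothetical, chosen only to mirror a three-outcome chance node like those in the Guban model; this is not the PrecisionTree implementation.

```python
from collections import defaultdict

def ev(node):
    # Expected value, used only to pick the best branch at decision nodes.
    if node["type"] == "terminal":
        return node["value"]
    if node["type"] == "chance":
        return sum(p * ev(c) for p, c in node["branches"])
    return max(ev(c) for c in node["options"])

def risk_profile(node, prob=1.0, profile=None):
    """Discrete outcome distribution when the optimal policy is followed."""
    if profile is None:
        profile = defaultdict(float)
    if node["type"] == "terminal":
        profile[node["value"]] += prob
    elif node["type"] == "chance":
        for p, child in node["branches"]:
            risk_profile(child, prob * p, profile)
    else:  # decision node: follow only the optimal branch
        best = max(node["options"], key=ev)
        risk_profile(best, prob, profile)
    return profile

tree = {"type": "chance", "branches": [
    (0.3, {"type": "terminal", "value": 0.36}),
    (0.5, {"type": "terminal", "value": 0.34}),
    (0.2, {"type": "terminal", "value": 0.32}),
]}
print(dict(risk_profile(tree)))  # {0.36: 0.3, 0.34: 0.5, 0.32: 0.2}
```

The resulting value-probability pairs are exactly what the probability chart of the risk profile plots.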
The analysis with PrecisionTree also includes a statistical report that provides a statistical summary of the decision analysis, allowing us to make comparisons between the chosen alternatives. From all the data presented, it emerges that it is now more convenient for Guban to accept the offer for the Asian market: the average expected gain in the case of acceptance is 3,855 euros versus 3,562 euros for non-acceptance, and the minimum is 3.8 versus 2.8 euros. The maximum does not show a large variation: 3.9 versus 4 euros. The policy suggestion report lets us know which option is chosen at each node in the optimal solution.
Following the study, we note that the decision tree shows all possible decision options and random events in a tree structure. It is formed from left to right, presenting the events and
related decisions. All options, outcomes and expected values, together with their associated probabilities, are shown directly in the spreadsheet. There is thus little ambiguity about the possible outcomes and the decisions on the branches: it is enough to look at a node to see all the possible outcomes emerging from it. Also, because the data are processed in real time, we are able to test any options we want and finally choose the one that best satisfies our desires. To obtain the best results from the analysis, we can continue with a sensitivity analysis of each branch, or only of the branches that raise questions. To determine which variables matter most in our decisions, we use sensitivity analysis, which is useful for finding the limit values at which the optimal choice at a decision node changes.
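A one-way sensitivity analysis of this kind can be sketched as below. The two expected-value functions and all the numbers are hypothetical stand-ins, not values taken from the Guban model; the point is only to show how sweeping one input reveals the threshold at which the optimal choice flips.

```python
# One-way sensitivity analysis: vary a single input probability and
# record where the preferred alternative switches (illustrative values).

def ev_accept(p_high_loss):
    # Stylized expected margin per pair if the offer is accepted, as a
    # function of the probability of the worst market-loss outcome.
    return 0.40 * (1 - p_high_loss) + 0.25 * p_high_loss

def ev_refuse(p_high_loss):
    return 0.35  # a flat alternative, independent of this input

for p in [i / 10 for i in range(11)]:
    best = "accept" if ev_accept(p) > ev_refuse(p) else "refuse"
    print(f"p = {p:.1f}: accept EV = {ev_accept(p):.3f} -> {best}")
# with these made-up numbers the preferred alternative flips near p = 1/3
```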
If we want more information, we can change the chosen limits and rerun the sensitivity analysis. Alongside the graph we also have the data report from which it was produced, so we can check that each graph corresponds to the test data and to the expected value, and finally see what we can do to obtain the highest gain, or in other words the smallest loss. From these graphs we can evaluate the evolution of the expected value as a function of the value of the selected input. The decision analysis provided by PrecisionTree gives us direct reports, including statistical summaries and risk profiles, and provides policy suggestions for the manager to consider. Decision analysis can produce higher-quality results which, placed at the manager's disposal, enable him to foresee conflicts of interest and the most important objectives. All analysis results are reported directly in Excel for easy customization, printing and saving. There is no need to learn something new, because all PrecisionTree reports can be modified like any other Excel worksheet or chart.
The efficient use of the decision tree method depends on updating the information as the modelling process unfolds, which is quite complicated, since the decision variants cannot be fully estimated when the model is developed. In order to avoid major deviations, the tree needs to be reviewed and, as the assumptions materialize, the reasoning at the intermediate decision nodes needs to be reassessed.
The deficiencies of the decision tree method start from the lack of information on the dispersion and shape of the distribution of all possible outcomes and their probabilities. It is very important to carry out a full description of the probability distributions, because decision makers have different reactions and attitudes towards risk, and the shape of the probability distribution allows an outline of the risk associated with each alternative. To limit these shortcomings, we can use the other components of the Palisade DecisionTools package.
5. Conclusions
As a result, we can say that under the conditions of globalization, decision support systems become an indispensable condition for managers to analyze information in order to substantiate and adopt decisions in real time, which can provide a strategic and competitive advantage for the company. The Palisade software package constitutes basic support for managers in decision making under conditions of risk and uncertainty. The advantages that result from the use of a decision support system are not found in all decision situations or for all decision makers, but depend on the degree of matching between the decision maker, the decision context and the decision support system. We can say that the Palisade DecisionTools Suite is a specialized system for risk analysis that allows users to control and optimize decision making, so that decisions can ultimately face an uncertain environment with a plurality of different risks. In this way, we believe that for the managers of any company, such a powerful tool is mandatory for facing the challenges of an uncertain world.
References
[1] S. Berar, The decision system of the company. Informatics and Economic Prospects,
available: http://www.geocities.ws/sanda_berar/procesuldecizional.htm, 2011
[2] R. H. Sprague and E. D. Carlson, Building Effective Decision Support Systems. Englewood
Cliffs, NJ: Prentice-Hall, Inc., 1982, pp.22
[3] R.M. Kanter, B. Stein and T. D. Jick, The Challenge of Organizational Change: How
Companies Experience It and Leaders Guide It. New York: Free Press, 1992, pp.56
[4] D. Kroenke, Management Information System, Mitchell McGraw-Hill, 1992, pp. 38
[5] J.A. O'Brien, Management Information System, McGraw-Hill International Editions, 1999,
pp.79
[6] Compania Guban, available: http://guban.ro/sample-page/
[7] Software Precision Tree 6, Palisade DecisionTools Suite
RoaML: AN INNOVATIVE APPROACH ON MODELING WEB
SERVICES
Cătălin STRÎMBEI
Al.I.Cuza University of Iasi
Georgiana OLARU
Al.I.Cuza University of Iasi
Abstract. What we intend to examine in this paper is whether a new service-based modeling approach could be technologically feasible, desirable to architects and developers, and viable as a new MDA platform. We call this new approach RoaML, a "step-brother (or sister)" of the already established SoaML initiative.
Keywords: Web services, RESTful services, Service Oriented Architecture, ROA, SOA.
JEL classification: D83, L86
1. Introduction: web service-based architectures and UML modeling
A software architecture is an abstraction of the run-time elements of a software system. It is
defined by the configuration of its elements – components, connectors, and data – constrained
in their function and relationships in order to achieve a desired set of architectural properties
(e.g., reliability, scalability, extensibility, reusability). Currently, two architectural styles are
dominant: Service Oriented Architecture (SOA) and Resource Oriented Architecture (ROA).
The SOA and ROA architectural design patterns and the corresponding distributed
programming paradigms provide a conceptual methodology and development tools for creating
distributed architectures. Distributed architectures consist of components that clients as well as
other components can access through the network via an interface and the interaction
mechanisms the architecture defines; in the cases of ROA and SOA such distributed
components will be named respectively resources and services.
With the emergence of SOA, a new UML specification was needed in order to cover the needs of designing services: SoaML. In this article we consider whether another initiative, more appropriate to the specific needs of modelling ROA applications, is warranted.
1.1 SoaML framework
Service Oriented Architecture (SOA) is the paradigm for the development of software systems
based on the concept of service. A development method based on the SOA paradigm requires
some notations to present services, their interfaces and the way they are built, including the
case where they are built from other services, the architecture of a system in terms of services
and the way they are orchestrated [6]. The SoaML specification defines a UML profile with a
metamodel that extends UML to support the range of modelling requirements for SOA,
including the specification of systems of services, the specification of individual service
interfaces, and the specification of service implementations. The SoaML metamodel extends
the UML metamodel to support an explicit service modelling in distributed environments. This
extension aims to support different service modelling scenarios such as single service
description, service-oriented architecture modelling, or service contract definition. This is done
in such a way as to support the automatic generation of derived artefacts following the approach
of Model Driven Architecture [9].
The OMG SoaML specification also introduces the concept of services architecture to model how a group of participants interact through the services they provide and use in order to accomplish a result. According to SoaML, a service is an offer of value to a service consumer (a simple client
or another service) through a well-defined interface that could be available to a community
(which may be the general public). For SoaML, a service architecture is made by a group of
participants providing and consuming services at specific service points [1]. The goals of
SoaML are to support the activities of service modelling and design and to fit into an overall
model-driven development approach, supporting SOA from both a business and an IT
perspective.
1.2 IBM Rational Software Architect: REST Service Model
Concerning the modeling of SOA requirements, we have a well-defined standard (SoaML).
Regarding the modelling of ROA requirements, we don’t have any specific official guidelines.
One attempt to customize the modelling and design of RESTful Web Services comes from IBM, which has included in version 8.0.3 of Rational Software Architect a template for REST modelling [12]. This template proposes a set of elements for class and sequence diagrams such as Resource Class, Path Dependency, GET Operation, PUT Operation, POST Operation, HEAD Operation and DELETE Operation. These elements allow some basic resource modelling, but they do not offer further guidelines on how to model more complex architectures, and we think the template does not represent a comprehensive modelling approach for the ROA domain.
2. ROA vs. SOA
The traditional conceptual model of service-oriented architectures, or the service-oriented
paradigm, seems like one evolutionary stage of distributed computing. Just as the object-orientation paradigm "naturally" evolved from procedural programming and modular development, challenged by distributed computing models based on RPC, T. Erl argues that service orientation evolved from object orientation, challenged by new distributed computing and integration models such as BPM and EAI and, finally, by standardized web services such as the SOAP initiatives [3].
Our question is: could the REST-oriented conceptual model be the next evolutionary step of service-like architectures? Maybe... or maybe not. The main advantages of SOA over ROA are its more mature tool support and the type safety of XML requests. Conversely, the main advantages of ROA are its ease of implementation, the agility of its design, and a lighter approach from a business perspective. Thus, REST services differ from older SOA(P) services from (at least) two perspectives:
- the degree of sophistication: the REST "philosophy" aims to simplify the web service "protocol" as much as possible;
- the emphasis on the basic and self-defining REST principle of HATEOAS (Hypermedia as the Engine of Application State), something like "if there is no HATEOAS then there is no REST".
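To make the HATEOAS principle concrete, here is a minimal, hypothetical resource representation sketched as a Python dictionary (HAL-style field names; the order resource and its URLs are invented for illustration):

```python
# A HATEOAS-style representation: the resource state itself carries the
# links that tell the client what it can do next.

order = {
    "id": 42,
    "status": "pending",
    "_links": {
        "self":   {"href": "/orders/42"},         # identifies the resource
        "items":  {"href": "/orders/42/items"},   # structural relationship
        "cancel": {"href": "/orders/42/cancel"},  # action endpoint
    },
}

# The client discovers the available transitions from the representation,
# not from an out-of-band contract document:
print(sorted(order["_links"]))  # ['cancel', 'items', 'self']
```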
Taking into consideration the service design principles stated by T. Erl in [3] and [4], there are some subtle conceptual differences between SOA(P) and REST related to:
- service contracts, concerning standardization and design: RESTful requires no formal contract specification, although REST API standardization and versioning are promoted as best practice. In fact, RESTful proponents favour very fluid REST APIs, so that there is no "official" dependency on a formal service interface specification; REST API documentation is desirable but should not formally interfere with the REST system architecture. There are some specialized tools to generate REST API docs, like Swagger, but there is no WSDL-like document implied. A service contract (like a WSDL document) resembles the static typing (from programming languages) of the service-providing component. By contrast, the REST approach favours a very dynamic style: service operations can be specified in the service resource instance representation as dynamic links rather than through an extra meta-specification document;
- service coupling, concerning intra-service and consumer dependencies: any service-based architecture (or any kind of distributed component architecture) must have the attribute of interoperability. In this context, coupling refers to the way service (component) interdependencies are managed: there is an abstract level, which can be understood as the service structural relationships, and there is a runtime level, where those structural relationships become live (synchronous or asynchronous) connections. REST services favour a more dynamic approach in which link lists, formatted using conventional standards like HAL, JSON-LD, Collection+JSON or SIREN as discussed in [10], can be generated specifically for each resource instance, unlike the more rigid approach of SOAP, where relationships are mostly endpoints statically defined in WSDL definitions;
- service discoverability, concerning interpretability and communication: SOA(P) promoters developed a sophisticated standard in this regard, UDDI; but, as REST means "simplification", REST supporters considered that there is no need for a "middleman" like a "service registry": the REST service is a URL, and the URL must be formatted so as to be self-explanatory;
- service composability, concerning composition member design and complex compositions: the SOA(P) approach, in line with the service coupling principle, favours a contract-first pattern, as coined by R. Daigneau [7]. That means a static way of defining composite services through WSDL-specific documents and a declarative workflow spread in industry in the form of BPEL orchestrators. The REST approach, on the other hand, favours a more dynamic way of composing web services based on URL links (as de facto relationships), close to point-to-point composability [2]. There are some critics of this kind of service linking who claim that point-to-point should not be considered a kind of composability, taking into consideration that the business logic is encapsulated in the service implementation. Therefore, "traditional" SOA design principles are not entirely appropriate for RESTful web services; consequently, ROA architectures might need different service design solutions.
3. Zero-based approach of RoaML
In our opinion, a radically innovative approach assumes a "zero-based approach", meaning that it will not continue or refine an existing conceptual metamodel such as "UML for REST", "UML for ROA" or even "UML for SOA". Our goal is to preserve the simplicity declared by the REST framework "founders" and theorists, but the "great compromise" is how to maintain consistency at the same time. In the following, we propose some guiding principles for a business-oriented REST/ROA metamodel rather than a complete UML profile or framework for REST-oriented architectures.
3.1 Metamodels
Our proposal takes into consideration an approach based on three delimited profiles, as shown in
Figure 1:
- one focused on application domain modelling, in fact the business side of the system, which we named the DDD Metamodel to invoke the "Domain-Driven Design" principle of modelling software components for business [8];
- one more substantial and consistent, named the REST Metamodel, focused on modelling application services or components using RESTful principles. This metamodel also covers some elements dedicated to the architectural modelling of more complex application systems built from REST services, in a sub-profile named the ROA Metamodel;
- the last one, the REST Domain Metamodel, has an integrating role, so that the meta-modelling elements from the REST Metamodel (and the ROA sub-metamodel) can be tailored to business-specific needs.
Figure 1 - RoaML Metamodels and their relationships
3.2 From REST Metamodel to ROA Metamodel
The core REST metamodel makes a distinction between REST resources (marked with the WebResource stereotype) and REST services, as their producers (see Figure 2).
Figure 2 - Core REST Metamodel stereotypes and their relationships
This metamodel also assumes the existence of resource descriptions in which links play a fundamental role (following HATEOAS principles). These hyperlinks may identify the resource itself (WebLinkSelf), may represent structural relationships with other resources (WebLinkRelation), or may designate an action endpoint concerning the resource (beyond the standard actions based on the GET, POST, PUT and DELETE requests of the HTTP protocol). Another important distinction concerns resource archetypes [8], represented by a set of specialized WebResource stereotypes: Document, Collection of documents, Store and Controller. A critical aspect of any component-based model is the way individual components are assembled into complex systems. To address this issue in the context of REST-service-based architectures, we propose a meta-extension of the core REST metamodel in the form of a ROA (sub)metamodel (shown in Figure 3), based on the distinction between REST service and REST resource, which takes into consideration: (i) resource binding using relation-links; (ii) service binding using service-links guided by the link-relations of the core resource model.
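The three link stereotypes can be illustrated with a minimal, hypothetical resource description. All resource names, URIs and link relations below are invented for illustration; they are not part of the RoaML metamodel itself.

```python
# A hypothetical REST resource description illustrating the three link
# stereotypes: a self-identifying link (WebLinkSelf), structural links to
# related resources (WebLinkRelation), and a non-CRUD action endpoint.
order = {
    "id": "order-42",
    "links": [
        {"rel": "self", "href": "/orders/42"},                 # WebLinkSelf
        {"rel": "customer", "href": "/customers/7"},           # WebLinkRelation
        {"rel": "items", "href": "/orders/42/items"},          # WebLinkRelation
        {"rel": "cancel", "href": "/orders/42/cancellation"},  # action endpoint
    ],
}

def links_by_kind(resource):
    """Partition a resource's hyperlinks into the three stereotype kinds."""
    kinds = {"self": [], "relation": [], "action": []}
    action_rels = {"cancel"}  # invented action relations for this example
    for link in resource["links"]:
        if link["rel"] == "self":
            kinds["self"].append(link["href"])
        elif link["rel"] in action_rels:
            kinds["action"].append(link["href"])
        else:
            kinds["relation"].append(link["href"])
    return kinds
```

Under this sketch, the relation-links drive resource binding, while the action links act as the service-link targets guided by the link-relations of the core resource model.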
Figure 3 - An architectural metamodel for ROA based systems
3.3 REST Domain Metamodel
Finally, the REST Domain Metamodel (see Figure 4) is centered on the business WebEntity concept, which combines a fundamental concept of the business metamodel with the WebResource of the REST architectural model.
Figure 4 - An architectural metamodel for REST-service-based systems
The WebEntity description uses at least three other meta-elements:
- entityUID, for identity purposes, in the form of a self-link;
- web-entity-attributes;
- web-entity-relationships, derived from the web-link-relations of the core REST metamodel.
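As a minimal sketch of a WebEntity, the following uses invented names and URIs; the meta-elements (entityUID, attributes, relationships) come from the metamodel above, but this concrete Customer entity is hypothetical.

```python
# A hypothetical WebEntity for a business "Customer", combining a domain
# concept with its REST WebResource description. All names and URIs are
# invented for illustration.
customer_entity = {
    "entityUID": {"rel": "self", "href": "/customers/7"},  # identity as a self-link
    "attributes": {"name": "ACME SRL", "country": "RO"},   # web-entity-attributes
    "relationships": [                                     # from web-link-relations
        {"rel": "orders", "href": "/customers/7/orders"},
        {"rel": "invoices", "href": "/customers/7/invoices"},
    ],
}

def entity_uid(entity):
    """Return the identity URI of a WebEntity (its self-link)."""
    return entity["entityUID"]["href"]
```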
Starting from this framework, an MDA initiative could add a further level to the metamodeling approach: e.g., one could define a JEE metamodel when JEE is the platform of choice for implementing REST services and resources.
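As a rough illustration of what such a platform-level mapping might look like, the toy table below maps each RoaML stereotype to the kind of platform artifact that would implement it. The mapping targets are invented examples, not a normative JEE profile; a real MDA transformation would target concrete JEE/JAX-RS constructs.

```python
# A toy model-to-platform mapping: each RoaML stereotype is mapped to the
# kind of platform artifact that would implement it. The artifact
# descriptions are invented for illustration only.
STEREOTYPE_TO_ARTIFACT = {
    "WebResource": "resource class exposed at a URI",
    "Document": "entity representation (e.g. a JSON document)",
    "Collection": "listing resource with paging",
    "Store": "client-managed repository resource",
    "Controller": "procedural action endpoint",
}

def platform_artifact(stereotype):
    """Return the platform artifact a stereotype would map to."""
    return STEREOTYPE_TO_ARTIFACT.get(stereotype, "unmapped stereotype")
```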
4. RoaML Target Audience
The success or failure of any software development or technology initiative depends on a critical "quality": the popularity that can engage a large prospective audience.
Figure 5 - Measuring REST vs. SOA audience
We have studied the popularity of SOA versus REST in recent years and found the following:
- a series of statistics from ProgrammableWeb shows growth of REST APIs from 58% in 2006 to 73% in 2011, while SOAP APIs registered a decrease from 29% in 2006 to 17% in 2011 [14];
- another study, from indeed.com, shows that the trend of REST jobs is increasing (1% in 2014) while that of SOA jobs is decreasing (0.3% in 2014) [15];
- Google Trends also shows a growing interest in REST versus SOA based on the number of Google searches on the topic [13]; see Figure 5.
Judging by the success of REST, we can say that RoaML has an important potential audience. The key to the successful adoption of RoaML is its simplicity and flexibility, the same principles that recommend REST over SOAP.
5. Conclusions and future work
As we have argued in this paper, RoaML could be a suitable modelling language for applications based on resource-oriented architecture. At this stage we present only some guiding principles for a business-oriented REST/ROA metamodel, but we plan to improve the metamodel and to propose an MDA approach for implementing REST services and resources.
Acknowledgment
This work was supported by the European Social Fund through Sectoral Operational
Programme Human Resources Development 2007 – 2013, project number
POSDRU/159/1.5/S/134197, project title “Performance and Excellence in Doctoral and
Postdoctoral Research in Economic Sciences Domain in Romania”.
References
[1] C. Choppy and G. Reggio, "A Well-Founded Approach to Service Modeling with Casl4Soa", ACM, 2010
[2] R. T. Fielding, Architectural Styles and the Design of Network-based Software
Architectures, CHAPTER 5 Representational State Transfer (REST),
http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm, 2000
[3] T. Erl, SOA: principles of service design, Pearson Education, Inc., Boston, Massachusetts:
2008
[4] T. Erl [et.al.], Web service contract design and versioning for SOA, Pearson Education,
Inc., Boston, Massachusetts: 2009, pp.25-26
[5] M. Massé, REST API Design Rulebook, Gravenstein Highway North: O'Reilly Media, Inc., 2012, pp. 15-16
[6] P. Brown, Implementing SOA: Total Architecture in Practice, Addison Wesley
Professional, 2008
[7] R. Daigneau, Service design patterns: fundamental design solutions for SOAP/WSDL and
restful Web services, Westford, Massachusetts: Pearson Education, Inc., 2012, pp.85-93
[8] E. Evans, Domain Driven Design: Tackling Complexity in the Heart of Software, Pearson
Education, 2004.
[9] OMG, Service oriented architecture Modeling Language (SoaML) Specification Version
1.0.1, 2012
[10] K. Sookocheff, On choosing a hypermedia type for your API - HAL, JSON-LD,
Collection+JSON, SIREN, Oh My!, available: http://sookocheff.com/posts/2014-03-11-on-
choosing-a-hypermedia-format/
[11] Vinay Sahni, Best Practices for Designing a Pragmatic RESTful API, available:
http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api, 2014
[12] S. Katoch, Design and implement RESTful web services with Rational Software
Architect, available: http://www.ibm.com/developerworks/rational/library/design-
implement-restful-web-services/ , 2011
[13] ***, available:
http://www.google.com/trends/explore#q=%2Fm%2F03nsxd%2C%20%2Fm%2F0315s4
&date=1%2F2007%2098m&cmpt=q&tz
[14] ***, available: http://www.infoq.com/news/2011/06/Is-REST-Successful
[15] ***, available: http://www.indeed.com/trendgraph/jobgraph.png?q=Rest%2C+SOAP
INFORMATION SYSTEMS IMPLEMENTATION IN THE JIU VALLEY
SME’S
Eduard EDELHAUSER
University of Petroșani
Lucian LUPU DIMA
University of Petroșani
lupu_lucian@yahoo.com
Abstract. The study targeted 25 Jiu Valley SMEs; data were collected in 2014. The survey allowed us to identify the effect of the crisis on the SMEs and to establish the current level of implementation of ICT tools in Jiu Valley SMEs in 2014. The research was based on a quantitative questionnaire structured in 21 questions: seven questions focused on Jiu Valley SMEs during the crisis and the other 14 on ICT aspects of the SMEs. The Jiu Valley, a former mono-industrial area, faces a very difficult situation today. One solution to this social problem could be offered by SME managers and ICT experts. The study will serve as the foundation of a process of integrated information systems implementation in the SMEs of the Jiu Valley.
Keywords: SMEs, IT&C, Management Information Systems
JEL classification: C10, L60, M15
1. Theoretical framework
Information systems represent the best way of using information technologies to gather, process, store, use and disseminate information among employees and organizations. The field of information systems requires a multi-disciplinary approach in order to study the range of socio-technical phenomena which determine their development, use and effects in organizations and society, as it is defined by the UK Academy for Information Systems. We consider that the most relevant and best represented information systems in today's Romanian organizations are the Management Information Systems (MIS). Within a pyramid model of an organization, MIS are management-level systems used by middle managers, connecting operational management, which uses ERP applications, with top managers, who use BI applications, and ensuring the smooth running of the organization in the short to medium term. An important role of these systems is that MIS offer highly structured information and allow managers to evaluate an organization's performance by comparing current with previous outputs [3], [4].
Information systems include Enterprise Resource Planning (ERP), a tool that integrates the economic processes of an organization and is used to optimize its resources. ERPs are systems designed on a client-server architecture, developed to process transactions and to facilitate the integration of all processes across the whole organization, from the planning and development stage to the Front and Back Office applications used in relations with suppliers, customers and other partners [5]. At the international level, integrated information systems have been evolving for more than 45 years, and nowadays all the economic processes of an organization can be integrated. In Romania there are still some insular applications (developed in-house by small IT companies or by an organization itself, not integrated and usually very small). The only way of making such a business efficient is to use ERP techniques, which permit planning the resources of the enterprise and integrating the information of the business. The name ERP also derives from the concept of a single database in which all the functional systems are combined into an integrated one, which serves all departments through information distribution [5].
We are now, in 2014, witnessing the emergence of cloud computing, which employs networking technology to deliver applications as well as data storage independent of the configuration, location or nature of the hardware. This, along with high-speed smartphone and Wi-Fi networks, has led to new levels of mobility, in which managers may access the MIS remotely from laptops, tablet computers and smartphones. In this context, a modern management method could contain all of the ERP modules, such as a transactional database, a management portal or dashboard, a BI system, customizable reporting, resource planning and scheduling, product analysis, external access via technologies such as web services, document management, messaging, chat or wiki, and workflow management [2].
2. Research methodology and results
The results were obtained on the basis of the questionnaires. In order to analyze the statistical connections, we used correlation analysis for the intensity of the connections between the variables and regression analysis to estimate the value of a dependent variable (effect) from the values of other, independent variables (causes). We carried out a multiple regression analysis in order to identify the effect of the implementation of ERP, BI and e-Government applications on the decision-making processes of the organization.
2.1. Methodology
The instrument used for collecting data was a quantitative questionnaire. The questionnaire was structured in 21 questions: seven of them focused on Jiu Valley SMEs during the crisis, and the other 14 addressed ICT aspects of the SMEs, such as hardware, e-Commerce, ERP business software and BI management software.
2.2. Respondents
The study targeted 25 Jiu Valley SMEs; data were collected in 2014. The survey allowed us to identify the effect of the crisis on the SMEs and to establish the current level of implementation of ICT tools in Jiu Valley SMEs in 2014. In the category of Jiu Valley medium enterprises, our study focused on manufacturing, construction, public services and retail. Almost 4,000 employees of the 7,500 in the top 75 medium enterprises work in these four areas. In the category of Jiu Valley small enterprises, we focused on forestry, bakery, lohn manufacturing, production of furniture and electrical equipment, construction and retail. Almost 2,300 employees of the 3,300 in the top 150 small enterprises work in these areas [1].
Even though data were collected from only 25 organizations, they are representative of the 2014 Jiu Valley SMEs, because at this economic moment the Jiu Valley has only 225 SMEs that could need an ERP or a BI software instrument as an advanced management method. Over 225 Jiu Valley SMEs were analyzed, 75 medium and 150 small enterprises, and 25 distinct groups were identified without any significant deviations among the enterprises forming each group. In order to apply this method, 25 companies were chosen, one representative of each group, having a similar trend. This simplified version of the method is suitable for our problem because it yields very clear results.
2.3. Findings and discussions. Graphical results and statistical analysis
Sixty percent of the investigated organizations declare that they have a website. In our opinion, in 2014 every organization should have had a website, because Romania has more than 5 million mobile or Wi-Fi Internet connections and more than 3 million cable or dial-up Internet connections, covering half of Romania's inhabitants [6]. So there are many opportunities for developing a website, an online store or an Internet sales strategy - Q1. Does your organization have a website?
Figure 1. Q2. What components does your website offer to users?
In the area of website components, we can conclude that in 80% of the cases the website is used only as an online shop window and not as an interactive instrument between the organization, employees, managers and customers.
Figure 2. Q5. Has your organization interacted online with local or governmental authorities?
Through the e-Romania strategy, the authorities allocated over 500 billion euro in the period 2010-2013 for the connection of all public administration systems and for offering over 600 electronic services. So the companies have benefited from these e-Government implementations.
Figure 3 - Q6. The total number of computers (hardware endowment) in your organization
In Romania in 2012, 66.4% of the household population had a computer [6]. So for these 12 companies, having 454 employees, 98 of whom are involved in ICT activities, and 109 computers, we have a very good ICT endowment.
Figure 4 - Q9. Which are the software components implemented in your organization
We found a low degree of software usage in the 12 investigated companies from the manufacturing field. Except for Office software, with 100% usage, CAD/CAM software, specific to the manufacturing field, with 58% usage, and accounting information systems, with 66% usage, other software is used only occasionally. As we already knew from previous studies (Edelhauser, 2012), the usage of accounting and payment ERP components is very common in Romanian companies: 100% usage for ERP accounting and 83% for ERP payment in the 12 investigated companies. The CRM and SCM components have 17% usage, but according to Q14 - List proposals of applications which you consider necessary to implement in your organization, 4 companies, representing another 33%, wish to implement such a component. Two components are never used: the HR ERP component, probably because the Revisal software is mandatory in Romania, and the manufacturing ERP component, probably because the costs of implementing such software are very high.
The estimated total amount to be allocated to ICT endowment is 550,000 lei (5.5 billion old lei) for the 25 organizations investigated (actually only 18 companies proposed an allocation of funds) - Q12. Please specify what amount (in lei) you are willing to assign in the coming years for software implementation in your organization.
Sixty-four percent of the investigated organizations declare that they plan to implement a Cloud technology. Even though Romania does not yet have harmonized legislation regarding the Cloud, the organizations are well aware of Cloud technology and plan to implement it - Q13. Please indicate whether you plan to implement a Cloud technology in your organization in the coming years.
Based on Q7 (Number of computers interconnected in a LAN and to the Internet) and Q8 (How many people in your organization are involved in ICT activities), we performed a regression analysis.
Table 1 - Regression analysis and correlation between the number of computers and the number of employees with access to these computers for the 12 companies from the manufacturing field
We noticed that there is a strong link (with a correlation R = 0.979 > 0.63 for 11 degrees of freedom). The F-test also has a high value (226), and the Sig. corresponding to the F statistic is 0.00, which indicates a significant linear relationship between the two variables. Because F is high and the significance Sig. is low, it can be concluded that the results are not coincidental. The regression coefficient R = 0.979 shows a strong link between the variable Personal_acces_IT, reflecting the level of IT access, and the independent variable Calculatoare, reflecting the size of the organization. The model explains 98.5% of the total variation of the dependent variable (R2 = 0.985). The remaining 1.5% is influenced by other, residual factors not included in the model. So the usage of computers by the employees of the 12 selected organizations from the manufacturing field is excellent.
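The relationship between the R and R2 figures reported in the tables can be illustrated with an ordinary Pearson correlation. The short sketch below uses invented data points, not the survey data, purely to show how R and R2 are computed and related.

```python
# Illustrative computation of the Pearson correlation R and the coefficient
# of determination R^2 discussed above. The data points are invented; the
# paper's actual survey data are not reproduced here.
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

computers = [2, 4, 5, 8, 10, 15]          # independent variable ("Calculatoare")
staff_with_access = [2, 5, 5, 9, 11, 16]  # dependent variable ("Personal_acces_IT")

r = pearson_r(computers, staff_with_access)
r_squared = r ** 2  # share of variance explained by the linear model
```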
Table 2 - Regression analysis and correlation between the number of computers and the number of employees with access to these computers for the 7 companies from the services field
We noticed that there is a strong link (with a correlation R = 0.998 > 0.63 for 6 degrees of freedom). The F-test also has a high value (1525), and the Sig. corresponding to the F statistic is 0.00, which indicates a significant linear relationship between the two variables. Because F is high and the significance Sig. is low, it can be concluded that the results are not coincidental. The regression coefficient R = 0.998 shows a strong link between the variable Personal_acces_IT, reflecting the level of IT access, and the independent variable Calculatoare, reflecting the size of the organization. The model explains 99.7% of the total variation of the dependent variable (R2 = 0.997). The remaining 0.3% is influenced by other, residual factors not included in the model. So the usage of computers by the employees of the 7 selected organizations from the services field is excellent.
Table 3. Regression analysis and correlation between the total number of employees and the total number of
computers for the 7 companies from the services field
So there is a good link between the total number of employees and the total number of computers in the organizations, R = 0.791 > 0.63, for the 7 selected organizations from the services field.
3. Conclusions
The ICT level of implementation in Romanian and Jiu Valley SMEs was revealed by a survey of 25 Jiu Valley SMEs. We conclude that there is a low level of managerial culture and a low level of IT qualification among top and operational managers. Most of the SMEs have a satisfactory ICT endowment, but they use computers only as office tools and for computerized accounting, which leads to inefficient managerial decisions and places the SMEs at a disadvantage on the business market.
ICT in Jiu Valley SMEs has a low level of implementation, and some specific modules such as ERP, BI, CRM, e-Commerce, e-Business or Cloud technology are rarely used. Even though data were collected from only 25 organizations, they are representative of the 2014 Jiu Valley SMEs, because at this economic moment the Jiu Valley has only 225 SMEs that could need an ERP or a BI software instrument as an advanced management method.
References
[1] P. Barta, The Current Situation of the Romanian SMEs, 2012 Edition, Post Privatization Foundation, Bucharest, 2013, available: http://www.postprivatizare.ro [January 20, 2015].
[2] E. Edelhauser, A. Ionica and M. Leba, “Modern Management Using IT & C Technologies
in Romanian Organizations,” Transformations in Business & Economics, Vilnius
University, Lithuania, vol. 13, no. 2B (32B), 2014, pp. 742-759.
[3] E. Edelhauser and L. Lupu Dima, “Management Information Systems. Past and Present in
Romanian Organisations”, in Proc. The 11th International Conference on Informatics in
Economy IE 2012, Bucharest, 2012, pp. 459-463.
[4] E. Edelhauser, A. Ionică and C. Lupu, “Enterprise Resource Planning and Business
Intelligence, Advanced Management Methods for Romanian Companies”, in Proceedings
of the 1st Management Conference: Twenty Years After, How Management Theory Works,
Technical University of Cluj Napoca, Todesco Publishing House, 2010, pp. 63-72.
[5] D. Fotache and L. Hurbean, Enterprise Resource Planning, Bucharest, Economică
Publishing House, 2004.
[6] ***, (2013) Documentary Analysis Result in Romanian IT&C Sector, [On-line] POAT
2007-2013 project [January, 20, 2015].
META-INSTRUCTION IN E-EDUCATION
Gabriel ZAMFIR
The Bucharest University of Economic Studies, Bucharest, Romania
[email protected]
Abstract. This paper is a conceptual one, focused on assisted instruction in the e-society, developed as an appropriate approach for e-education. Assisted instruction is situated between the traditional education and blended learning styles, and it requires a different paradigm, according to the main tendencies of our environment: fast-changing technologies and slower systematic implementations in e-education. The new paradigm is proposed as a suitable orientation of educational research related to e-science and e-business. The study starts from the central concept of the e-society: qualifications. A conceptual paradigm of the e-society is structured in the introduction, and a functional paradigm is presented in a conceptual framework. The interrelationships between the building blocks of assisted instruction in e-education are designated in a theoretical framework. A paradigm of meta-instruction, based on assisted individual study and standard assisted instruction applications, related to Bloom's taxonomy and the concept-map approach, is proposed in an analytical framework. In this context, a different vision of the connection between ontology and epistemology, related to methodology and methods in e-education, is highlighted.
Keywords: assisted individual study, assisted instruction, e-articles, e-classroom, teacher-assisted learning
JEL classification: I23
1. Introduction
The central concept of the e-society is qualification, and this fact explains why e-education is valuable both as an end in itself and as an engine of the community. The development of conceptual approaches for describing qualifications is currently an important priority for any country. In [1, p. 22] it is specified that traditional models and methods of expressing qualifications structures are giving way to systems based on explicit reference points using learning outcomes and competencies, levels and level indicators, subject benchmarks and qualification descriptors. These devices provide more precision and accuracy and facilitate transparency and comparison. Without these common approaches, full recognition, real transparency and thus the creation of an effective European Higher Education Area will be more difficult to achieve. As a consequence, in Romania one can visit the National Higher Education Registry (http://www.rncis.ro/), which offers options for advanced search by fundamental domain, science branch, hierarchy domain and study program, and displays a related summary, including professional competences, transversal competences, and possible occupations for the owner of the diploma. At the same time, there is the National Qualifications Authority, where at present 851 occupational standards are published (http://www.anc.edu.ro/?page_id=42); the methodology assigned to these results is also available.
In 2005 the Romanian Agency for Quality Assurance in Higher Education (ARACIS) was established; it is an autonomous public institution of national interest whose main mission is the external evaluation of the quality of Romanian higher education, at the level of study programmes as well as from the institutional point of view. ARACIS uses an online platform for the registration and evaluation of experts included in the National Register of Evaluators (http://www.aracis.ro/nc/en/aracis/).
It is important to highlight that transversal competences are defined in the first annex of the National Education Law no. 1/2011. The transversal competences represent value-related and attitudinal acquisitions which cross over a domain or study program, and they are reflected in the following descriptors: autonomy and responsibility, social interaction, personal and professional development. A working definition of transversal competencies is presented in [2, pp. 4-5]. This report mentions that there are significant variations in the definition and interpretation of transversal competencies among participating countries and economies, so each country and economy would clarify and use its own definitions of transversal competencies under each domain in its studies. As a conclusion, each domain remains generic, as presented in Table 1.
Table 1 - UNESCO's Working Definition of Transversal Competencies

Domains | Examples of Key Characteristics
Critical and innovative thinking | Creativity, entrepreneurship, resourcefulness, application skills, reflective thinking, reasoned decision-making
Inter-personal skills | Presentation and communication skills, leadership, organizational skills, teamwork, collaboration, initiative, sociability, collegiality
Intra-personal skills | Self-discipline, enthusiasm, perseverance, self-motivation, compassion, integrity, commitment
Global citizenship | Awareness, tolerance, openness, respect for diversity, intercultural understanding, ability to resolve conflicts, civic/political participation, conflict resolution, respect for the environment
Optional domain (example): Physical and psychological health | Healthy lifestyle, healthy feeding, physical fitness, empathy, self-respect
Synthesizing, we conclude that the e-society is based on three pillars which depend on qualifications, as presented in Figure 1.
Figure 1 – Conceptual paradigm of the e-society
One important issue in this approach is to highlight, as presented in [3], that transversal competences are recognized as important skills for teachers, both in their organization of the teaching and learning process and their professional development, and in the process of teaching these competences to their students. The same author mentions that teachers are the result of their "cumulative autobiography", so they act on the basis of their own constructs, beliefs and understanding of human learning, professional growth and development. In our analysis, this fact is considered an unrevealed side of meta-instruction related to the individual.
2. Conceptual framework
Another important issue of this study is to observe that, over time, facts were developed first and labelled afterwards, while in recent times many labels are created in advance, with the corresponding facts arriving late or still in progress. Referring to e-science's input, resources come primarily from e-education features and secondarily from e-business structures, while e-science's output is focused on e-business applications first, with the implementations in e-education occurring later, reflecting an inherited behaviour.
Continuing the previous research, as shown in [4] and [5], the e-article can be defined as the basic result of a scientific research activity in e-science. At the same time, the production of e-articles keeps growing: some of them are used for learning, as knowledge objects more or less refined, for understanding and applying, and some of them become objects for analysis, synthesis and evaluation. The two situations could be reflected in a study trying to find the very first scientist in the virtual world, or in understanding scientometrics; these are two examples of the diversity of scientific research, given that the word "scientist" entered the English language in 1834 and there is an interesting history behind its definition, while bibliometrics became webometrics and then Scientometrics 2.0, based on social network data, and is going to include new ontologies. On the other hand, using a search engine or a meta-search engine with the keywords "timeline in education", we find out that the blackboard was invented in 1801 by James Pillans. In 1990, as an improvement, whiteboards began to erase the chalkboard from schools (after having been promoted in business). In 1960, the overhead projector allowed instructors to use reusable printed transparencies (first in army communities). In 1999, early versions of interactive boards were wired to desktop computers, while the latest models can connect with mobile devices.
A useful investigation could be found in [6], where e-society is understood from three versions,
i.e. E-version for electronic, D-version for digital and V-version for virtual. All three can be
used as prefix. The three versions are interrelated to each other and reflect progresses not only
in technologies and but also in applications for our activities. After a terminological analyse of
these versions, it has been found that some terms in the categories of these versions are
interchangeable, some are not. In general, E as a prefix can be placed before terminology of
subjects and objects alike, whereas the D and V suit technology alike. Such a judgement could
explain an adequate comprehension for numerous prefixes associated with learning, term
which reflects, by default, a typical human activity. In this context, [7] describes e-learning as
the way people use an electronic device (usually a computer) with learning technology to
develop new knowledge and skill individually or collaborative; mobile devices with learning
technologies constitute various forms of wireless environments can have many functions to
promote mobile learning (m-learning); the same author highlights that since the early 2000s
new forms of mobile technology containing additional sensor devices have been providing new
directions for technology-assisted learning, and this has led to context-aware ubiquitous
learning (u-learning). Based on the same evolution of technology-enhanced learning, [8] and
[9] complete the stages with a new one, called Smart Learning sustained by social technologies.
It has to be present that previous researches, such as [10], consider necessary a redefinition of
the term social technology, as soon as the concept of social technologies has several aspects
which destabilize the dominant status of technology. Another strategy for learning in e-
education is b-learning (blended learning) which is analysed in [11], considering that e-learning
has become widely used in every type of education (traditional and formal education,
continuous education and corporate training) because of its characteristics such as flexibility,
richness of materials, resource-sharing and cost-effectiveness. In this work, the author paid
more attention to the blended-learning (b-learning) systems, which consider systems
“combining face-to-face instruction with computer-mediated instruction”. B-learning has been
largely used in the context of higher education, and it includes a wide range of learning formats
such as self-study and instructor-led in both an asynchronous and synchronous mode. [12]
analyse e-learning as a concept, and associate e-Learning 1.0, 2.0 and 3.0 with the prevalent
technologies available in their corresponding Web versions (1.0, 2.0 and 3.0, respectively). In a survey of
predictions, they highlight that the e-Learning concept of “anytime, anywhere and anybody” will
be complemented by “anyhow”, i.e. it should be accessible on all types of devices (a-learning).
Based on the three paradigms of learning in the e-society presented in [5] (the lexical one,
the terminological one and the conceptual one), and according to the interrelationships between
the building blocks of an e-education system developed in [13], a conceptual framework for a
functioning e-society is shown in Figure 2.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
318
Figure 2 – Functional paradigm of the e-society
This approach highlights the three forms of the traditional education system (formal,
informal and non-formal education), reflected as web components from a learning perspective
in the e-society, and constitutes the building blocks for a theoretical framework.
3. Theoretical framework
A study in educational research [14] creates a paradox when it considers that what
knowledge is, and the ways of discovering it, are subjective, since it concludes that it is
important for English language teachers to understand the underlying ontological and
epistemological assumptions behind each piece of research that they read. The author continues
by mentioning that teachers need to be able to recognize how these assumptions relate to the
researcher’s chosen methodology and methods, and how these assumptions connect to the
findings presented in journal articles. A first remark is that English is the native
language of the information technology domain; a second is that, in this domain, it is first
determinant, and then relevant, to analyse the term knowledge distinctly, in the meta-language
or in the object-language. The working language of educational research has to relate the term
knowledge to data and information, since their interrelations could form
different ontologies for research. There are textbooks in which the authors distinguish between
data, information and knowledge. In [15], as the authors mention, the term “data” refers to
syntax, “information” refers to interpretation, and “knowledge” refers to the way
information is used.
In this context, based on the knowledge framework presented in [13], scientific research could
be defined as a paradigm between theory and practice, related to the cognitive infrastructure of
the researcher. Since scientific research supposes two languages, theory includes ontology
as the object language and epistemology as the meta-language, while practice contains methods
as the object language and methodology as the meta-language. Such an approach confers a
paradigm of assisted instruction in e-education, as shown in Figure 3.
Figure 3 – The interrelationships between the building blocks of assisted instruction in e-education
Each notion of this paradigm is developed as a construct in a meta-language for a meta-
discipline, in order to integrate an interdisciplinary approach.
4. Analytical framework
In order to design an assisted instruction environment in the e-classroom, based on the
analytical framework presented in [16] (consisting of the concepts map approach, the context
for developing explicit knowledge, and the pattern of a didactical development of the content),
Figure 4 presents a paradigm which integrates the traditional class, as course and seminar,
into a laboratory activity. This paradigm is based on the principles of an assisted instruction
system: individualized learning; personalized learning; interactivity founded on the triple
vision of the personal computer as tool, tutor and tutee; adequate information granularity of
the content; and assisted instruction developed as a meta-discipline, integrating
inter-disciplinary approaches because of the diversity of forms of knowing, and
trans-disciplinary approaches because of the diversity of forms of knowledge.
Continuing the analysis developed in [17], where, for a learning process, the content is developed
using a concepts map integrated as a pyramid of notions enabled for knowing, understanding
and applying, while the teaching process is designed on a concepts map aggregated in a network
of notions enabled for analysing, systematizing and evaluating the content of the subject, we find two
levels for this approach: a practical one, based on standard assisted instruction applications,
and a theoretical one, centred on assisted individual study (see Figure 4). In a pyramid, concepts
serve as basis or aggregate, while in a network the concepts could be considered,
functionally, as threshold concepts. In assisted instruction, the applications designed for learning
include a sequential set of tasks, gradually developed from simple to complex and focused on
knowing, understanding and applying notions used in theories, methodologies and models. It
is the role of assisted individual study, in a face-to-face interaction in the e-classroom as
teacher-assisted learning, to harmonise the conceptual infrastructure of the content within the zone
of proximal development of the learner, and to identify and resolve the troublesome knowledge
generated by threshold concepts or by concepts missing from the learner's cognitive infrastructure.
Figure 4 – The paradigm of meta-instruction related to Bloom’s taxonomy and concepts map
Assisted individual study could be developed based on e-articles, treated as a dedicated
environment for learning with all the e-learning functionalities understood as support,
since the e-article represents the basic result of a scientific research activity in e-science.
5. Conclusions
Learning, as a typical human activity, consists of knowing the past, understanding the present
and applying in the future. This concept could become a paradox if it is not analysed as a notion
of the meta-language when we research the methodology. A new level of scientific research
is involved when the meta-language becomes the object-language in e-education.
References
[1] *** - A Framework for Qualifications of the European Higher Education Area, Bologna
Working Group on Qualifications Frameworks, Published by: Ministry of Science,
Technology and Innovation, Copenhagen K, 2005, pg. 200; also available at:
http://www.vtu.dk, ISBN (internet): 87-91469-53-8
[2] *** - Transversal Competencies in Education Policy & Practice, Phase I, Regional
Synthesis Report, Published in 2015 by the United Nations Educational, Scientific and
Cultural Organization 7, place de Fontenoy, 75352 Paris 07 SP, France and UNESCO
Bangkok Office, © UNESCO 2015, pg. 80 ISBN: 978-92-9223-509-3 (Electronic version)
This publication is available in Open Access under the Attribution-ShareAlike 3.0 IGO
(CC-BY-SA 3.0 IGO) license (http://creativecommons.org/licenses/by-sa/3.0/igo/).
[3] R. Čepić, S. Tatalović Vorkapić, D. Lončarić, D. Anđić and S. Skočić Mihić, "Considering
Transversal Competences, Personality and Reputation in the Context of the Teachers’
Professional Development," International Education Studies; Vol. 8, No. 2; 2015
[4] G. Zamfir, "Quality-Quantity Paradigm in Assisted Instruction," Journal of Applied
Quantitative Methods, vol. 5, No. 4, Winter 2010, ISSN: 1842-4562, http://www.jaqm.ro
[5] G. Zamfir, "Learning Paradigms in e-Society," Informatica Economică, Volume 17, No.
3/2013, INFOREC Publishing House, DOI: 10.12948/issn14531305/17.3.2013.09
[6] Xiuhua Zhang, Hans Lundin, "Understanding E-Society by E, D and V," published in the
volume of the Proceedings of the IADIS International Conference on e-Society, Ávila,
Spain 16-19 July 2004, Edited by Pedro Isaías, Maggie McPherson, Piet Kommers, ISBN
(Book): 972-98947-5-2
[7] G.-Z. Liu and G.-J. Hwang, "A key step to understanding paradigm shifts in e-learning:
towards context-aware ubiquitous learning," British Journal of Educational Technology,
Volume 41 No 2 2010, doi:10.1111/j.1467-8535.2009.00976.x
[8] E. K. Adu and D. C. C. Poo, "Smart Learning: A New Paradigm of Learning in the Smart
Age," proc. of TLHE 2014, International Conference on Teaching & Learning in Higher
Education, National University of Singapore,
http://www.cdtl.nus.edu.sg/tlhe/tlhe2014/abstracts/aduek.pdf
[9] G.-J. Hwang, "Definition, framework and research issues of smart learning environments -
a context-aware ubiquitous learning perspective," Smart Learning Environments 2014, 1:4,
http://www.slejournal.com/content/1/1/4
[10] A. Skaržauskienė, R. Tamošiūnaitė and I. Žalėnienė, "Defining Social Technologies:
evaluation of social collaboration tools and technologies," The Electronic Journal
Information Systems Evaluation Volume 16 Issue 3 2013, pp. 232-241
[11] P. Peres, L. Lima and V. Lima, "B-Learning Quality: Dimensions, Criteria and
Pedagogical Approach," European Journal of Open, Distance and e-Learning, Vol. 17 / No.
1 – 2014, ISSN 1027-5207, DOI: 10.2478/eurodl-2014-0004
[12] N. Rubens, D. Kaplan and T. Okamoto, "E-Learning 3.0: anyone, anywhere, anytime, and
AI," proc. of International Workshop on Social and Personal Computing for Web-
Supported, SPeL 2011, http://activeintelligence.org/wp-content/papercite-
data/pdf/elearning-30-rubens-spel-2011--preprint.pdf
[13] G. Zamfir, "Assisted Learning Systems in e-Education," Informatica Economică, Vol. 18
No. 3/2014, INFOREC Publishing House, DOI: 10.12948/issn14531305/18.3.2014.08,
ISSN: 1453-1305, EISSN: 1842-8088, p. 91-102
[14] J. Scotland, "Exploring the Philosophical Underpinnings of Research: Relating Ontology
and Epistemology to the Methodology and Methods of the Scientific, Interpretive, and
Critical Research Paradigms," English Language Teaching; Vol. 5, No. 9; 2012
[15] W. van der Aalst, C. Stahl, Modeling Business Processes: A Petri Net-Oriented Approach,
(Cooperative Information Systems), Publisher: The MIT Press (May 27, 2011)
[16] G. Zamfir, "Concepts Map Approach in e-Classroom," Informatica Economică, Volume
16, no. 3, 2012
[17] G. Zamfir, "Theoretical and Factual Meaning in Assisted Instruction," Informatica
Economică, Vol. 15 No. 2/2011, p. 94-106
SEMANTIC WEB TECHNOLOGIES FOR IMPLEMENTING COST-EFFECTIVE AND INTEROPERABLE BUILDING INFORMATION MODELING
Tarcisio MENDES de FARIAS
CheckSem - LE2I UMR CNRS 6306 - University of Burgundy, Dijon, France
Ana-Maria ROXIN
CheckSem - LE2I UMR CNRS 6306 - University of Burgundy, Dijon, France
Christophe NICOLLE
CheckSem - LE2I UMR CNRS 6306 - University of Burgundy, Dijon, France
Abstract. In the field of AEC/FM, BIM has been recognized by industrial and political actors
as a powerful tool for resolving data interoperability problems. Coupled with cloud computing
and GIS, BIM would allow integrating different information exchange standards into one single
digital building model that can be edited in real time by several stakeholders or architects. In this
paper, we examine the benefits brought by using Semantic Web technologies in delivering such
a universal building model. We present how our approach is a step further in reaching the vision
of BIM, and how it can serve the construction process, operation and maintenance, along with
facilities’ lifecycle management.
Keywords: AEC, BIM, facility management, ontology, SWRL.
JEL classification: L74, H57, D80
1. Introduction
When considering a building, its lifecycle comprises two phases: building construction and
facility management. Facility management is generally performed by divisions of
municipalities (be they large or small) or by private contractors. In both cases, the challenge is the
same and addresses data interoperability and management. Indeed, the data produced
throughout the building’s lifecycle is handled and updated by several actors intervening in the
associated processes. This generates a considerable amount of heterogeneous data that has to
be handled by a generally limited number of people. BIM (Building Information Modelling)
[1] is one of the latest approaches proposed in the field of AEC/FM (Architecture, Engineering
and Construction / Facility Management) for organizing into one single model several layers
of information [2]. In the context of our approach, we define BIM as the process of generating,
storing, managing and exchanging building information in an interoperable and reusable
manner [3].
The first step in BIM standardization was conducted in 1999 by buildingSMART (formerly
International Alliance for Interoperability, IAI). It resulted in the development of a model for
representing all components of a physical building, namely the IFC (Industry Foundation
Classes) model [4]. Unlike previous formats such as DXF (Drawing eXchange Format) [5] or
DWG (DraWinG) [6], which were respectively graph- and vector-oriented, the IFC standard
(ISO 10303-21) [7] relies on object-oriented modelling.
In the context of BIM, a building is represented in the form of one or several IFC files. Those
files contain several different types of information ranging from operational and maintenance
costs to building’s intended use and level of security. The challenges related to the management
of such heterogeneous data are heightened by the considerable number and diversity of actors
manipulating those files. Not only is the manipulation of IFC files a tedious process, but
there is no standard language or protocol for querying those files in order to display the data
pertaining to a specific context or task. Moreover, given the wide variety of actors working with such
files (municipalities, stakeholders, architects, etc.), and considering their particular
requirements, there is an increasing need to display only the information pertaining to a given
business logic or context. In our vision, BIM stands as a cooperative system of unified business
views of the same building. Thus, we propose a novel approach based on Semantic Web
technologies. This article emphasizes the main advantages brought by this approach, notably
in terms of operations that can be performed and in terms of economies to be realized.
This paper begins with a brief overview of the advantages and challenges related to the
integration of Semantic Web technologies in the BIM vision. Section 3 lists the main limits that
exist today concerning BIM and, for each of them, presents how our approach can
overcome them. We conclude this article by arguing the benefits of using ontologies coupled
with logical rules in order to tackle the above-mentioned BIM-related issues.
2. BIM and knowledge engineering – advantages, challenges, economical model
By implementing a unique model of a building, BIM helps people from the AEC field in
applying a standard approach for projects dealing with buildings or infrastructures. The idea
behind the BIM approach is to define a novel method for collaboration among actors
intervening in such projects, thus allowing them to exchange data and information concerning
the project with the guarantee that the data will remain consistent and accurate.
The vision of BIM relies on a data model that would allow the following [2]:
- Integrating all relationships and interactions between building components (structural,
architectural, mechanical), equipment, piping supports, etc. into a fully coordinated model;
- Capturing modifications and specifications from different stakeholders, while guaranteeing
the data’s consistency and accuracy;
- “Improved collective understanding of the design intent” [8], delivering stakeholders a
clearer view of the project, while improving decision making.
Time and cost savings are among the most well-known advantages of adopting the BIM
approach. Those savings come with the BIM’s promise for improved knowledge and control,
for AEC service providers, over the building lifecycle. Faster project approvals, more
predictable outcomes, sustainable design and analysis services, along with improved
collaboration are only a few examples.
While the advantages of relying on such model-based design have been clearly identified, the
model itself needs additional extended specification. In February 2015, an EU
BIM Task Group was created. Comprising representatives of 15 European Commission
members, the Group’s goal is to define “how it will share best practice and converge on the
adoption of BIM into the European public estate” [9].
Moreover, inefficient interoperability has been identified as a cost-raising factor: studies have
shown that it raised new construction costs by $6.18 per square foot, whereas operation and
maintenance costs rose by $0.23 per square foot [10]. More information about cost
savings related to BIM adoption can be found in [8].
3. Semantic Web technologies for delivering a realistic implementation of BIM
In [11], the authors state that while the lack of interoperability among applications is indeed a
significant drawback of current systems, it does not stand as the main factor preventing BIM
adoption. They identify three “interrelated” obstacles to BIM adoption in the building
industry. For each barrier, we argue that Semantic Web technologies can be used as leverage.
3.1. Lack of business process integration
3.1.1. Current state
As previously presented, the BIM approach aims at integrating design data in a model-based
design process. While this offers interesting advantages in terms of information flow and
process connections, it does not consider the specification of clear relationships in the building
supply chain. The so-described BIM approach fails in clearly specifying the workflow and the
data interactions among stakeholders and other actors intervening throughout the building
lifecycle.
Semantic Web technologies rely on ontologies as a specification mechanism, which has been
proven more expressive than informal languages (e.g. UML). A typical example
is the modelling of relationships involving logical rules, such as “same-as”. Indeed, UML
cannot represent such a relationship, whereas it is handled by ontology languages such as OWL.
When considering the problem of managing heterogeneous services or processes, having such
relationships appears as critical since different systems and applications using different
languages need to be able to identify a same high-level command, along with its “meaning”
and effects [12].
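As a library-free illustration of why a "same-as" relationship matters for heterogeneous systems, the following Python sketch propagates facts across equivalent identifiers (all IRIs, predicates and facts here are hypothetical; a real system would delegate this to an OWL reasoner handling owl:sameAs):

```python
# Hypothetical facts: two applications name the same high-level command differently.
facts = {
    ("app1:PowerOff", "hasEffect", "DeviceShutdown"),
    ("app2:Shutdown", "sameAs", "app1:PowerOff"),
}

def same_as_closure(facts):
    """Propagate every non-sameAs fact to all sameAs-equivalent subjects."""
    # Build equivalence classes (sameAs is symmetric and transitive): union-find.
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)
    for s, p, o in facts:
        if p == "sameAs":
            union(s, o)
    # Group all identifiers belonging to the same equivalence class.
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    # Copy each fact to every alias of its subject.
    inferred = set(facts)
    for s, p, o in facts:
        if p != "sameAs":
            for alias in groups.get(find(s), {s}):
                inferred.add((alias, p, o))
    return inferred

closure = same_as_closure(facts)
# app2:Shutdown now carries the effect asserted for app1:PowerOff.
print(("app2:Shutdown", "hasEffect", "DeviceShutdown") in closure)  # True
```

Both systems can now recognize the same high-level command and its effects, regardless of which identifier they use locally.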
When considering system interoperability, it is generally achieved by first creating an
information model (defining critical concepts) and then deducing management models
from it. Unfortunately, most of the existing approaches rely on informal languages for
specifying the information and data models. Among those, we may cite the Distributed
Management Task Force’s (DMTF) CIM [13] and the TeleManagement Forum’s (TMF) Shared
Information and Data model (SID) [14]. While representing vendor-independent data, these
models fail to sufficiently specify and integrate contextual information for management
operations. Moreover, the CIM approach relies on a proprietary language [13] and the SID
approach is based on UML.
Knowledge engineering in the form of ontologies overcomes the drawbacks of the
above-mentioned initiatives. Indeed, ontologies have been proven as a formal mechanism for
specifying common knowledge for a domain of discourse [15]. The so formally-defined
models become tools for solving meaning interpretation problems. Hence, ontology
engineering appears as a solution for modelling and integrating vendor- and technology-
specific knowledge present in information and data models.
In the context of BIM, and in order to answer the problem of specifying transactional business
process, one could use an ontology for modelling every relationship in the building supply
chain and lifecycle, in terms of roles, risks and benefits [16]:
- Roles: the tasks defined for each participant, the information to be generated, the information
exchanges among partners, etc.;
- Risks: when the system fails in determining the origin of some design
information, specify how the associated risk can be assigned;
- Benefits: clearly identify the savings realized by using the considered model as a
decision-making tool.
Moreover, by using ontologies, each process could use only the necessary subset of all the
information created by every other process. For example, when an electrical engineer designs
the emergency lighting system, he/she does not need to know the colour of the walls or the
carpets present within the same room.
3.1.2. Our approach
In order to address the above-listed issues, we have conceived an OWL ontology for the IFC
standard. With such a representation of the building information, we are able to implement
shortcuts in the form of logical rules, which allow implementing an intelligent building
information system. Our approach allows a more intuitive extraction of building views and
mitigates the gap of semantic heterogeneity for building software interoperability. The
characteristics of this ontology are given in [3].
Still, several studies have identified clear needs for integrating construction documents
and data into the standard IFC format. Therefore, we have chosen to integrate the COBie
(Construction Operations Building information exchange) format into our ontology. The mapping
and description of IFC2x4 entities, defined types, select types, and property sets for COBie
have been fully documented in [17]. Starting from these mappings, we have translated them into
logical rules and applied those rules on top of the two ontologies created: one for the IFC
model, one for the COBie model. With this implementation, we are able to automatically
transform COBie data into IFC data. Thus, we have addressed the issue of reducing time waste
associated with the integration of IFC with COBie spreadsheets [18]. We further aim at
improving our knowledge base by introducing links to other models, notably for managing
buildings’ maintenance operations or electrical characteristics. Including such additional
information would allow our system to provide information to users regarding utility
consumption, distribution, use, or cost [19].
3.2. Lack of digital design information computation
3.2.1. Current state
Various formats exist for digital design data. As mentioned in the Introduction, in the context
of BIM, the IFC standard (ISO 10303-21) [1] was chosen for representing building digital
design data. Still, in order to allow a computer to manipulate and perform operations on such
data, one must implement mechanisms that allow computers to understand such data as we
humans do. In other words, the format used for handling such data must contain machine-
interpretable knowledge about the data held. Again, ontologies appear as a solution for this
problem, as they allow specifying such machine-understandable knowledge. Ontology-based
models are “intelligent” as they implement the mechanisms allowing the concepts within the
model to know how to interact with one another [2].
To illustrate this, we may take the example of a room. Such a concept does not exist in the
IFC standard, but can be easily defined by means of logical rules over ontology concepts, e.g.
a unique space contained by other building components (such as walls, floors, and ceilings).
Therefore, the whole building model is a knowledge base, while its abstract concepts such as
room or façade wall can be defined using non-graphic data and geometric information. These
concepts can be defined using logical constructs (such as rules and constraints) on top of the
knowledge base. It is therefore possible to query the knowledge base in order to obtain only
the view corresponding to a given concept (e.g. a meeting room). Moreover, when applying
such a formalism, if a stakeholder modifies the information pertaining to a given view, this
modification is propagated through the whole knowledge base and automatically updated in
the views of all the other stakeholders displaying it.
3.2.2. Our approach
Let us consider the example of a facility manager that needs to handle a building’s façade walls.
The IFC standard does not implement such a concept, but it contains all information for
implicitly describing it. In order to address this limit, our approach allows defining novel
concepts as used by AEC/FM actors by means of SWRL rules [3].
For example, let us consider the case of a facility manager that needs to plan the cleaning of all
windows of a given building. The concept of a windowed space is not present in the IFC
standard, so identifying such spaces would represent a lot of manual work for the facility
manager. However, this information can be easily exploited if we create the concept
BimSpaceWithWindow through the following SWRL rule:
IfcRelSpaceBoundary(?x) & IfcSpace(?y) & IfcWindow(?z) & RelBuildingElement(?x, ?z) &
RelSpace(?x, ?y) ⇒ BimSpaceWithWindow(?y)
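A minimal, library-free Python sketch of what this rule computes (the instance names are hypothetical and the predicates are flattened to plain strings; in our system, the rule itself is executed by a SWRL-capable reasoner over the ontology):

```python
# Hypothetical ABox: one space boundary relates space Office12 to window W1.
triples = {
    ("rel1", "rdf:type", "IfcRelSpaceBoundary"),
    ("Office12", "rdf:type", "IfcSpace"),
    ("W1", "rdf:type", "IfcWindow"),
    ("rel1", "RelBuildingElement", "W1"),
    ("rel1", "RelSpace", "Office12"),
}

def infer_spaces_with_window(triples):
    """Forward-chain the BimSpaceWithWindow rule: a space bounded by a window."""
    def of_type(t):
        return {s for s, p, o in triples if p == "rdf:type" and o == t}
    boundaries = of_type("IfcRelSpaceBoundary")
    spaces = of_type("IfcSpace")
    windows = of_type("IfcWindow")
    inferred = set()
    for x in boundaries:
        elems = {o for s, p, o in triples if s == x and p == "RelBuildingElement"}
        bound = {o for s, p, o in triples if s == x and p == "RelSpace"}
        # Rule body satisfied: the boundary links a window to a space.
        if elems & windows:
            for y in bound & spaces:
                inferred.add((y, "rdf:type", "BimSpaceWithWindow"))
    return inferred

print(infer_spaces_with_window(triples))
# {('Office12', 'rdf:type', 'BimSpaceWithWindow')}
```

The facility manager can then enumerate all windowed spaces with a single lookup instead of traversing the boundary relations by hand.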
We can easily extend this example to the case where SWRL rules are used to specify precise
business contexts and processes. A facility manager could therefore specify that window
cleaning should be performed after the cleaning of the façade walls.
3.3. Lack of design information sharing
3.3.1 Current state
The final goal of BIM is to help the relevant parties involved in building management (such as
facility managers) make use of the so-modelled data. While monolithic data models and
software applications fail in delivering the advertised functionalities, studies have identified
loosely coupled applications as very promising [16]. Still, such applications need to
implement innovative sharing mechanisms for exchanging design information. Indeed,
traditional approaches perform well only in contexts where the data to be exchanged is
well-defined, repetitious and transactional.
When considering BIM scenarios for design information exchange, these are mainly performed
on a query basis and need to integrate contextual information for both the query initiator and
the system answering the query. Additionally, such queries are performed over the above-
described knowledge bases, which are far more complex than traditional database systems. The
IFC format allows exchanging platform-independent information across AEC applications.
Still, not only is this standard going to evolve over time, but its underlying structure is very
complex and difficult to query.
3.3.2 Our approach
Query simplification
Having defined the ontology of the IFC model, we are able to define SWRL rules on top of it.
This allows simplifying the writing of SPARQL queries, notably by referencing concepts
created by means of such rules. When considering the SPARQL query that retrieves
all external walls of a building, this query can be highly simplified by using the SWRL-defined
concept of “external wall”. Not only is the query simpler, but it also gains in ease of
understanding. Table 1 illustrates this example in further detail.
Handling IFC standard evolution
Starting with the publication of the first version of the IFC standard [7], its specification has
been updated several times. Generally, there is no backward support between the different
versions, as illustrated by the IFC change log [7]. This is mainly due to the fact that most
modifications are made in the data model structure: modifying the attributes’ order for a given
IFC entity, replacing a deleted entity with another data structure, etc.
Our approach is also useful in the case when standard evolution adds new entities as subclasses
of existing IFC entities. By means of SWRL rules, existing triple store data is automatically
restructured. Data extracted from IFC files complying with previous versions of the IFC standard
can be automatically updated in order to comply with newer versions of the standard. These
mechanisms allow us to handle different IFC schemas, thus increasing the interoperability of
information exchange among stakeholders.
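The kind of restructuring rule described above can be sketched as follows (the predicate renamings shown are purely hypothetical examples, not actual IFC change-log entries):

```python
# Hypothetical rewrite rules: old-schema predicate -> new-schema predicate.
RENAMED = {
    "ifc2x3:RelatedObjects": "ifc4:RelatedObjects",
    "ifc2x3:Description": "ifc4:LongName",
}

def migrate(triples, renamed=RENAMED):
    """Rewrite triples extracted under an older schema into the newer schema."""
    return {(s, renamed.get(p, p), o) for s, p, o in triples}

# Data extracted from a file complying with the older schema version.
old = {
    ("wall1", "ifc2x3:Description", "load-bearing wall"),
    ("rel1", "ifc2x3:RelatedObjects", "wall1"),
}
print(migrate(old))
```

In the real system, each such rewrite is expressed as a rule over the triple store, so data from files complying with different schema versions can coexist and be queried uniformly.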
Table 1 - Content details
Initial query SELECT ?externalWall WHERE {
?externalWall a ifc:IfcWall.
?o a ifc:IfcDefinesByProperties;
ifc:RelObjects ?externalWall;
ifc:RelPropertyDefinition ?pSet.
?pSet a ifc:IfcPropertySet;
ifc:HasProperties ?p.
?p a ifc:IfcPropertySingleValue;
ifc:Name ?name.
?name ifc:dp_IfcIdentifier "IsExternal".
?p ifc:NominalValue ?val.
?val a ifc:IfcBoolean;
ifc:dp_IfcBoolean "true"^^xsd:boolean}.
SWRL rule ifc:HasProperties(?a, ?x) & ifc:NominalValue(?x, ?z) & ifc:Name(?x, ?y) &
ifc:RelPropertyDefinition(?b, ?a) & ifc:RelObjects(?b, ?c) & ifc:IfcWall(?c) &
ifc:dp_IfcBoolean(?z, “true”^^xsd:boolean) & ifc:dp_IfcIdentifier(?y,
"IsExternal"^^xsd:string) ⇒ BimExternalWall(?c)
Final query SELECT ?externalWall WHERE {
?externalWall a ifc:BimExternalWall.}
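The simplification shown in Table 1 can be mimicked in plain Python over a toy triple set (the data is hypothetical and the property-set indirection is flattened): the nested lookup plays the role of the initial query, while, once the rule's conclusion is materialized, the final query reduces to a single type lookup:

```python
# Hypothetical, flattened version of the IFC property-set structure.
triples = {
    ("wall1", "rdf:type", "IfcWall"),
    ("def1", "rdf:type", "IfcRelDefinesByProperties"),
    ("def1", "RelObjects", "wall1"),
    ("def1", "RelPropertyDefinition", "pset1"),
    ("pset1", "HasProperties", "p1"),
    ("p1", "Name", "IsExternal"),
    ("p1", "NominalValue", "true"),
}

def external_walls_initial(triples):
    """Equivalent of the initial query: follow the full property-set chain."""
    def objs(s, p):
        return {o for s2, p2, o in triples if s2 == s and p2 == p}
    walls = {s for s, p, o in triples if p == "rdf:type" and o == "IfcWall"}
    defs = {s for s, p, o in triples
            if p == "rdf:type" and o == "IfcRelDefinesByProperties"}
    result = set()
    for d in defs:
        for w in objs(d, "RelObjects") & walls:
            for pset in objs(d, "RelPropertyDefinition"):
                for prop in objs(pset, "HasProperties"):
                    if ("IsExternal" in objs(prop, "Name")
                            and "true" in objs(prop, "NominalValue")):
                        result.add(w)
    return result

# Materialize the rule's conclusion once, as the SWRL rule would.
materialized = triples | {(w, "rdf:type", "BimExternalWall")
                          for w in external_walls_initial(triples)}

def external_walls_final(triples):
    """Equivalent of the final query: a single type lookup."""
    return {s for s, p, o in triples if p == "rdf:type" and o == "BimExternalWall"}

print(external_walls_final(materialized))  # {'wall1'}
```

The cost of the complex traversal is paid once, at materialization time; every subsequent query benefits from the simpler form.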
4. Conclusion
In this paper, we have presented how Semantic Web technologies can help in achieving the
vision of a model-based BIM information system. Based on the ontology we developed for the
IFC format, we have shown how AEC actors can make use of it. Our approach addresses the
various objectives related to data interoperability in the building construction and maintenance
domain. It allows defining concepts that are missing from the IFC standard but can be
useful in the context of BIM. To do so, we use logical rules, thus separating the BIM data
structure model (e.g. IFC) from its semantics. Therefore, we can increase the data model's
expressivity without compromising the interoperability level delivered by the IFC standard.
Moreover, we have integrated into our knowledge base an automatic mapping to COBie files;
therefore, our system allows extracting data from COBie files and directly transforming it into IFC
files.
Acknowledgment
This work is part of a collaborative project with the French company ACTIVe3D, which has
financed this work.
References
[1] R. Volk, J. Stengel and F. Schultmann, "Building Information Modeling (BIM) for existing
buildings - Literature review and future needs," Automation in Construction, Volume 38,
pp 109-127 (2014)
[2] Autodesk Building Information Modeling, Realizing the Benefits of BIM, 2011 Autodesk,
Inc., available: http://images.autodesk.com/adsk/files/2011_realizing_bim_final.pdf
[3] T. M. de Farias, A. Roxin and C. Nicolle, "A Rule Based System for Semantical Enrichment
of Building Information Exchange," in Proc. of RuleML 2014 (T. Patkos, A. Wyner and
A. Giurca, eds.), Prague, Czech Republic, Aug 2014, Vol-1211, pp. 2.
[4] R. Vanlande, C. Nicolle, and C. Cruz, "IFC and building lifecycle management,"
Automation in Construction, vol. 18(1), pp. 70-78 (2008)
[5] Autodesk, Inc.: DXF Reference. San Rafael, USA: Autodesk, Inc (2011)
[6] Open Design Alliance: Open Design Specification for .dwg files (2013) [Online],
Available:
http://opendesign.com/files/guestdownloads/OpenDesign_Specification_for_.dwg_files.p
df
[7] International Alliance for Interoperability: IFC2x Versions (2013) [Online], Available:
http://www.buildingsmart-tech.org/specifications/ifc-overview
[8] SmartMarket Report: The Business Value of BIM (2009), pg 26. McGraw-Hill
Construction, Bedford, Massachusetts.
[9] EU BIM Task Group plans ‘convergence’ program, March 3rd 2015, [Online], Available:
http://www.construction-manager.co.uk/news/eu-bim-task-group-plans-convergence-
programme/
[10] G. S. Coleman and J. W. Jun, "Interoperability and the Construction Process, a White
Paper for Building Owners and Project Decision-Makers", available:
http://www.construction.org/clientuploads/resource_center/facilities_mamagement/Intero
perabilityandtheBuildingProcess.pdf
[11] R. Eadie, H. Odeyinka, M. Browne, C. McKeown and M. Yohanis, "Building
Information Modelling Adoption: An Analysis of the Barriers to Implementation," Journal
of Engineering and Architecture, March 2014, Vol. 2, No. 1, pp. 77-101, available:
http://aripd.org/journals/jea/Vol_2_No_1_March_2014/7.pdf
[12] Strassner, J. and Kephart, J., “Autonomic Networks and Systems: Theory and Practice”,
NOMS 2006 Tutorial, April 2006.
[13] DMTF, Common Information Model Standards (CIM), available:
http://www.dmtf.org/standards/standard_cim.php.
[14] SID – Shared Information Data model, available:
http://www.tmforum.org/InformationManagement/1684/home.html.
[15] Guarino N. & Giaretta P., “Ontologies and Knowledge Bases: Towards a
Terminological Clarification, in Towards Very Large Knowledge Bases: Knowledge
Building and Knowledge Sharing”, N. Mars (ed.), IOS Press, Amsterdam, pp. 25–32. 1995.
[16] Phillip G. Bernstein, Jon H. Pittman, Barriers to the Adoption of Building Information
Modeling in the Building Industry, AUTODESK Building solutions, white paper,
November 2004. Available :
http://academics.triton.edu/faculty/fheitzman/Barriers%20to%20the%20Adoption%20of
%20BIM%20in%20the%20Building%20Industry.pdf
[17] BuildingSMART Alliance. MVD COBie (IFC2x4), 2014, [Online], Available:
http://docs.buildingsmartalliance.org/MVD_COBIE/
[18] W. E. East, “Performance Specifications for Building Information Exchange,” Journal
of Building Information Modeling. Fall 2009, pp 18-20.
[19] D. Sapp, “Computerized Maintenance Management Systems (CMMS), Whole
Building Design Guide," National Institute of Building Sciences. January 27, 2011.
[Online], Available: www.wbdg.org/om/cmms.php.
PRELIMINARY RESULTS OF AN EMPIRICAL INVESTIGATION ON
BLENDED LEARNING IMPLEMENTATION IN A ROMANIAN HEI
Iuliana DOROBĂȚ Department of Economic Informatics and Cybernetics,
Bucharest University of Economic Studies, Romania [email protected]
Abstract. This paper presents preliminary results of an empirical investigation conducted at
the Bucharest University of Economic Studies in order to measure the success of a blended
learning project implementation. I present the investigation process and emphasize the impact
of the newly implemented online.ase.ro platform on student satisfaction.
Keywords: HEI (Higher Education Institution), blended learning, e-learning systems success,
measuring e-learning systems success, user satisfaction.
JEL classification: I21, I23, D83
1. Introduction
Blended learning stands for combined, integrative, hybrid learning [1] with the purpose of
attaining learning objectives by applying specific technologies. Blended learning
implementation projects imply implementing e-learning systems in order to achieve a
customized act of learning, adapted to the individual learning style of each student.
Compelled by these new dimensions of the learning process (the need to introduce modern
teaching techniques based on the use of ICT, Information and Communication Technology
[1], and the necessity of aligning with European standards in education [2]), a significant
number of Romanian HEIs are embarking on blended learning implementation projects [3]
meant to help them strategically and to improve their services in an increasingly competitive
environment. The Bucharest University of Economic Studies undertook an e-learning system
implementation project during 2014. The Moodle platform online.ase.ro has been available
online since the first semester of the current academic year. Therefore, I have had the
opportunity to conduct an empirical investigation regarding the success of this e-learning
system implementation from the students' perspective.
2. Research model
The investigation process started by analysing several approaches used for measuring the
success of e-learning systems identified in the academic literature: the DeLone and McLean
(D&M) model [1], [4], [5], the TAM (Technology Acceptance Model) [1], [6], models
focused on user satisfaction [1], [7], [8] and models focused on e-learning quality [1], [9].
As a result of this first step of my research, I proposed a model, entitled E-Learning System
Success (ELSS), based on several perspectives: overall system quality, user perceived
control, usefulness and user satisfaction, user attitude, social factors and benefits of using
the e-learning system.
Figure 1 - The proposed ELSS model [1].
As shown in Figure 1, the proposed model is centred on the user satisfaction dimension.
In the academic literature I identified studies focused on validating models for measuring
e-learning system success [1], [10], [11]. In my research I concentrated my efforts on
validating the proposed model, but also on quantifying the influence of the individual
learning style on student satisfaction.
3. Study method
The research method used in my study consists of delivering two questionnaires that include
questions regarding students' personal and academic data and a series of items meant to assess
the proposed ELSS model variables.
The first questionnaire was administered with the sole purpose of determining the individual
learning style of each student and contained 18 items, each offering two possible answers. This
questionnaire is based on Kolb's learning style inventory [12], [13].
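Kolb's model places a learner on two axes (concrete vs. abstract perceiving, reflective vs. active processing) and the resulting quadrant gives the style. The following is a simplified, hypothetical sketch of that quadrant logic only; the function name, score encoding and thresholds are illustrative and do not reproduce the actual 18-item inventory's scoring:

```python
# Simplified sketch of Kolb-style classification from two axis scores.
# A positive "perceiving" score means abstract conceptualization (AC)
# dominates concrete experience (CE); a positive "processing" score means
# active experimentation (AE) dominates reflective observation (RO).
# Scoring and thresholds are illustrative only.

def kolb_style(perceiving: int, processing: int) -> str:
    """Map two (hypothetical) axis scores to one of Kolb's four styles."""
    abstract = perceiving > 0   # AC - CE > 0
    active = processing > 0     # AE - RO > 0
    if abstract and active:
        return "Converger"
    if abstract and not active:
        return "Assimilator"
    if not abstract and active:
        return "Accommodator"
    return "Diverger"

print(kolb_style(-3, 5))  # concrete + active -> prints "Accommodator"
```

A concrete, active learner (the learning-by-doing profile) thus lands in the accommodator quadrant, which is the style most frequent in this sample.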
The second questionnaire contains questions regarding the variables of the model's
categories: usefulness and satisfaction (PU - Perceived usefulness, PEU - Perceived ease of
use, S - Satisfaction), perceived control (CSE - Computer self-efficacy, CA - Computer
anxiety), user attitude (IU - Intention to use, U - Use, L - Loyalty), quality (SQ - System
quality, IQ - Information quality, SQ - Service quality, ESQ - Educational system quality)
and social factors (T - Trust, SU - Social usefulness, SI - Service interaction). These variables
were measured on a 5-point Likert scale ranging from 1 "strongly disagree" to 5 "strongly agree".
Students were asked to state the frequency of the Moodle platform usage on a 5-point scale
from “never” to “a lot”.
4. Sample and data collection
As I stated before I applied the blended learning concept during the first semester of the current
academic year (2014/2015), by using the online.ase.ro Moodle platform. The subjects of my
study were students enrolled in the first academic year in the Economic Informatics and
Cybernetics programme from the Faculty of Economic Cybernetics, Statistics and Informatics
of the Bucharest University of Economic Studies in Romania.
The questionnaires were delivered and completed online. I validated 209 questionnaires,
which form the basis of my study.
5. Preliminary analysis and results
In Table 1 I present the profile of the respondents, which reveals that 54,55% of the
participants are women and 45,45% are men. Most respondents are between 17 and 20 years
of age (84,69%). Even though all respondents are in their first academic year, the majority
(47,37%) have already participated in at least three online courses. The analysis of the data
collected through the first questionnaire shows that most students are accommodators
(44,5%). The profile of an accommodator is described as a person who relies on intuition, is
attracted to new experiences and prefers a practical, experiential approach (commonly known
as learning-by-doing) [14].
Table 1 - The profile of the respondents

Category                            Frequency   Percentage
Gender
  Male                                   95       45,45
  Female                                114       54,55
  Total                                 209      100
Age
  Between 17-20 years                   177       84,69
  Between 20-23 years                    23       11
  Between 23-26 years                     2        0,96
  Above 26 years                          7        3,35
  Total                                 209      100
Individual learning style
  Accommodator                           93       44,5
  Diverger                               37       17,7
  Assimilator                            47       22,49
  Converger                              32       15,31
  Total                                 209      100
Number of online courses they have participated in
  1 online course                        55       26,32
  2 online courses                       36       17,22
  3 online courses                       19        9,09
  More than 3 courses                    99       47,37
  Total                                 209      100
In Figure 2, I present the gender and Kolb learning style distribution. The biggest difference
between male and female participants is less than 5% and occurs among divergers.
Figure 2 - Gender and learning styles distribution
Most students have participated in more than three online courses, regardless of their learning
style (see Figure 3).
Figure 3 - Student participation in online learning courses
The students also stated that they use the online.ase.ro platform often (see Figure 4);
therefore, their experience is relevant in the context of this research.
Figure 4 - online.ase.ro frequency of use.
I measured on a Likert scale the students' satisfaction regarding the quality of the online
learning system, the online learning process and the blended learning process.
The results show that 76,55% of the participants, regardless of their learning style, are
satisfied with the quality of the Moodle system (see Table 2).
Table 2 - Learning style * S1. Student satisfaction - system quality Crosstabulation

Learning style   Strongly disagree   Disagree   Neither agree nor disagree   Agree   Strongly agree   Total
Accommodator     1                   3          16                           52      21               93
Diverger         0                   0          6                            21      10               37
Assimilator      0                   4          10                           24      9                47
Converger        0                   2          7                            17      6                32
Total            1                   9          39                           114     46               209
Also, 73,68% of the participants stated that they are satisfied with the online learning process
(see Table 3).
Table 3 - Learning style * S2. Student satisfaction - online learning Crosstabulation

Learning style   Strongly disagree   Disagree   Neither agree nor disagree   Agree   Strongly agree   Total
Accommodator     1                   3          17                           53      19               93
Diverger         0                   2          7                            22      6                37
Assimilator      0                   6          10                           23      8                47
Converger        0                   1          8                            19      4                32
Total            1                   12         42                           117     37               209
The last measurement reveals that 78,95% of the participants stated that they are satisfied with
the blended learning process (see Table 4).
Table 4 - Learning style * S3. Student satisfaction - blended learning Crosstabulation
S3. Student satisfaction - blended learning
Total Disagree
Neither
agree nor
disagree Agree
Strongly
agree
Lea
rnin
g
style
Accommodator 2 19 58 14 93
Diverger 1 6 20 10 37
Assimilator 1 10 27 9 47
Converger 0 5 18 9 32
Total 4 40 123 42 209
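The satisfaction shares reported above can be reproduced from the crosstabulation totals by counting the "Agree" and "Strongly agree" responses. A minimal sketch in Python, using the column totals from Tables 2-4:

```python
# Reproduce the reported satisfaction shares from the crosstab totals:
# a respondent counts as "satisfied" if they chose Agree or Strongly agree.
totals = {
    "system quality":   {"agree": 114, "strongly_agree": 46},  # Table 2
    "online learning":  {"agree": 117, "strongly_agree": 37},  # Table 3
    "blended learning": {"agree": 123, "strongly_agree": 42},  # Table 4
}
N = 209  # validated questionnaires

for dimension, counts in totals.items():
    satisfied = counts["agree"] + counts["strongly_agree"]
    print(f"{dimension}: {100 * satisfied / N:.2f}%")
# prints: system quality: 76.56%, online learning: 73.68%,
#         blended learning: 78.95%
```

(The first figure rounds to 76.56%; the text's 76,55% appears to truncate rather than round 160/209 = 76,555...%.)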
6. Conclusions
In this paper I presented only a few preliminary results of the empirical investigation
conducted in order to verify the impact of the implementation of the Moodle platform at the
Bucharest University of Economic Studies on students' overall satisfaction. In the first part
of this paper I described the research model and the study method in order to establish the
context of the investigation. Then I extracted and presented some preliminary results.
Considering that the majority of the participants are satisfied with the quality of the Moodle
platform (76,55%) and with the quality of the blended learning process (78,95%), I can
conclude that the students enrolled in the first academic year are open to and embrace modern
teaching techniques.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] I. Dorobăț, "Models for measuring e-learning systems success: a literature review,"
Informatica Economică Journal, vol. 18, no. 3, pp. 77-90, 2014.
[2] A. Ion, D. Vespan, "Human sustainable development in the context of Europa 2020
strategy," Proc. of the 15th Eurasia Business and Economics Society (EBES) Conference,
Lisbon, Portugal, pg. 98, 2015.
[3] I. Dorobăț, A. Florea, V. Diaconița, "Applying blended learning in Romanian universities:
between desideratum and reality," Proc. of the 7th International Conference of Education,
Research and Innovation (ICERI), Seville, Spain, pp. 1819-1825, 2014.
[4] W.H. DeLone, E.R. McLean, "The DeLone and McLean model of information systems
success: A ten-year update," Journal of Management Information Systems, vol.19, pp. 9–
30, 2003.
[5] C.W. Holsapple, A. Lee-Post, "Defining, assessing, and promoting e-learning success: An
information systems perspective," Decision Sciences Journal of Innovative Education, vol.
4, pp. 67–85, 2006.
[6] F. D. Davis, "Perceived usefulness, perceived ease of use, and user acceptance of
information technology," MIS Quarterly, vol. 13, no.3, pp.319–340, 1989.
[7] P.C. Sun, R.J. Tsai, G. Finger, Y.Y. Chen, D. Yeh, "What drives a successful e-Learning?
An empirical investigation of the critical factors influencing learner satisfaction,"
Computers & Education, vol. 50, pp. 1183–1202, 2008.
[8] J.H. Wu, R.D. Tennyson, T.L. Hsia, "A study of student satisfaction in a blended e-learning
system environment," Computers & Education, vol. 55, pp. 155–164, 2010.
[9] J.K. Lee, W.K. Lee, "The relationship of e-Learner’s self-regulatory efficacy and perception
of e-Learning environmental quality," Computers in Human Behavior, vol. 24, pp. 32–47,
2008.
[10] R. Arteaga Sanchez, A. Duarte Hueros, "Motivational factors that influence the acceptance
of Moodle using TAM," Computers in Human Behavior 26 (2010), pg. 1632-1640.
[11] T. Escobar Rodriguez, P. Monge Lozano, "The acceptance of Moodle Technology by
business administration students," Computers & Education 58 (2012), pg. 1085-1093.
[12] A. Kolb, D. Kolb, "The Kolb Learning Style Inventory—Version 3.1, Technical
Specifications," HayGroup Experience Based Learning Systems, Inc., 2005.
[13] D. Terzi, "Exodul adulţilor în instruire," Revista Didactica Pro, vol. 3, pg. 54-65, 2001.
[14] S. Mcleod, "Kolb-Learning styles", 2013, available online on 27 February 2015 at
http://www.simplypsychology.org/learning-kolb.html.
BUSINESS PROCESS MANAGEMENT DRIVEN BY DATA
GOVERNANCE
Liviu CIOVICĂ Academy of Economic Studies
Răzvan Daniel ZOTA
Academy of Economic Studies
Ana-Maria CONSTANTINESCU Lucian Blaga University of Sibiu
Abstract. As organizations begin to exploit the value of data for strategy and operations, the
role of data governance in helping the business realize the potential value of data is
increasingly recognized. Data governance provides the capabilities that support the
administrative tasks and processes of data stewardship. It supports the creation of data
policies, manages workflows, and provides monitoring and measurement of policy
compliance and data use. Data governance workbenches and functionality in tools such as
master data management (MDM), data quality, and metadata management are now better
equipped to link to data policies (i.e., consistency, correctness, completeness, relevancy, etc.).
This gives business data stewards better operational control to validate and manage data
compliance with data policies. In this way, we may have a single solution to govern data
across the five areas of data governance (data quality, MDM, metadata management, security,
and information life-cycle management) and, more importantly, the ability to tie data
compliance to quantifiable business impact. A better and more consistent approach to
Business Process Management is achieved through proper governance and understanding of
data.
Keywords: business process, data governance, data quality, data stewardship, information
life-cycle.
JEL classification: M21
1. Introduction
The main objective of this paper is to provide an overview of the importance and relevance
of data governance as part of an information management initiative. As business data stewards
become accountable for data governance success, they need solutions to support their
activities. It is not enough to collect data policies and rules; business data stewards need
capabilities that align with the processes and management of data as a new business asset.
Data Governance provides an operating discipline for managing data and information as a key
asset of an enterprise. It includes the organization, processes and tools for establishing and
exercising decision rights regarding the valuation and management of data.
The elements of data governance are:
Decision making authority;
Compliance;
Policies and standards;
Data inventories;
Full lifecycle management;
Content management;
Records management;
Preservation and disposal;
Data quality;
Data classification;
Data security and access;
Data risk management;
Data valuation.
1.1 IT Governance and Data Governance
As defined by the IBM Data Governance Council, data governance represents the political
process of changing organizational behavior to enhance and protect data as a strategic
enterprise asset.
The Data Governance Institute defines DG as a system of decision rights and accountabilities
for information-related processes, executed according to agreed-upon models which describe
who can take what actions with what information, and when, under what circumstances, using
what methods.
Data Governance touches both business and IT by answering two questions:
Regarding business: How do we leverage data to improve business processes and
performance?
Regarding IT: How do we ensure optimal reuse, quality and operational efficiency?
A Data Governance program lifecycle typically comprises: developing a value statement,
preparing a roadmap, planning and funding, designing the program, deploying the program,
governing the data, and monitoring, measuring and reporting.
Typically, Data Governance has a three-part mission:
Proactively define/align rules;
Provide ongoing, boundary-spanning protection and services to data stakeholders;
React to and resolve issues arising from non-compliance with rules.
Typical universal goals of a Data Governance program are to enable better decision-making,
reduce operational friction, protect the needs of data stakeholders, train management and staff
to adopt common approaches to data issues, build standard, repeatable processes, reduce costs
and increase effectiveness through coordination of efforts, and ensure transparency of
processes.
IT Governance makes decisions about:
IT investments;
IT application portfolio;
IT project portfolio.
IT Governance aligns IT strategies and investments with enterprise goals and strategies;
Control Objectives for Information and related Technology (COBIT) provides standards for
IT governance. Only a small portion of the COBIT framework addresses managing
information. Some critical issues, such as Sarbanes-Oxley compliance, span the concerns of
corporate governance, IT governance, and data governance.
Data Governance, in contrast, is focused exclusively on the management of data assets and
sits at the heart of managing them.
2. Master Data Management and Business Process Management Correlation
Master data can be defined as the data that has been cleansed, rationalized, and integrated into
an enterprise-wide “system of record” for core business activities. [1]
Master Data Management (MDM) is the framework of processes and technologies aimed at
creating and maintaining an authoritative, reliable, sustainable, accurate, and secure data
environment that represents a “single version of truth”, an accepted system of record used both
intra- and inter-enterprise across a diverse set of application systems, lines of business, and
user communities. [2]
As products of data governance, data quality and MDM help drive business agility by
allowing developers to infuse business process management (BPM) with timely, trusted data
from master data management, leading to more intelligent business processes.
Timely use of information from MDM is a key factor: MDM data delivery needs to be
integrated with the business process. The data needs to be governed and managed by providing
validation, avoiding duplicates, and so on. These data stewardship steps need to be repeatable,
forming a business process. Therefore, not only does MDM deliver trusted data to BPM-based
solutions, but BPM is also a key enabler for managing MDM.
MDM comes with a process to manage data stewardship. Data stewardship processes are
modified only by the data stewards to address new issues with data quality – for example,
needing to check valid values for a key data field, validating against an external trusted source,
etc.
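A stewardship step of this kind can be sketched in a few lines. The field names, the allowed value set and the rules below are purely illustrative assumptions, not the API of any particular MDM product:

```python
# Hypothetical data-stewardship checks: validate a key field against an
# allowed value set (as if from an external trusted source) and detect
# duplicate records before they reach the master data store.

VALID_COUNTRY_CODES = {"RO", "FR", "DE", "US"}  # illustrative trusted list

def validate_record(record: dict) -> list:
    """Return a list of data-quality issues found in one record."""
    issues = []
    if not record.get("customer_id"):
        issues.append("completeness: missing customer_id")
    if record.get("country") not in VALID_COUNTRY_CODES:
        issues.append(f"correctness: invalid country {record.get('country')!r}")
    return issues

def find_duplicates(records: list) -> set:
    """Flag customer_ids that appear more than once (consistency check)."""
    seen, dupes = set(), set()
    for r in records:
        cid = r.get("customer_id")
        if cid in seen:
            dupes.add(cid)
        seen.add(cid)
    return dupes

batch = [
    {"customer_id": "C1", "country": "RO"},
    {"customer_id": "C1", "country": "XX"},   # duplicate id, bad country
    {"customer_id": "",   "country": "FR"},   # missing id
]
for rec in batch:
    print(rec["customer_id"], validate_record(rec))
print("duplicates:", find_duplicates(batch))
```

Making such checks repeatable and routing their exceptions to data stewards is exactly what turns stewardship into a business process that BPM can manage.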
MDM and BPM can address challenges in two areas. They can enable more accurate and
timely decision making to enhance business performance, by using BPM to optimize
processes with human tasks, automated tasks, and improved visibility, while MDM provides
trusted and timely data to those business processes. Enterprise process agility from BPM and
trusted, timely data from MDM may thus be combined.
Master Data can be a trusted asset to the organization’s business processes. BPM helps
implement and enforce policies and coordinates multi-step/multi-role workflow for data.
Although master data management is a valid and strongly recommended product of data
governance, with high potential to deliver valuable and critical data, it still presents some
issues in data management organizations, such as:
Discovery - cannot find the right information;
Integration - cannot manipulate and combine information;
Insight - cannot extract value and knowledge from information;
Dissemination - cannot consume information;
Management - cannot manage and control information volumes and growth.
It also presents some issues in gathering information from users, managers, and all the
personnel who need correct information to act upon:
52% of users don’t have confidence in their information;
59% of managers miss information they should have used;
42% of managers use wrong information at least once a week;
75% of CIOs believe they can strengthen their competitive advantage by better using and
managing enterprise data;
78% of CIOs want to improve the way they use and manage their data;
Only 15% of CIOs believe that their data is currently comprehensively well managed.
To drive business agility, data quality efforts should address these issues in a proper and timely manner:
Poor data quality costs real money;
Process efficiency is negatively impacted by poor data quality;
The full potential benefits of new systems cannot be realized because of poor data quality;
Decision making is negatively affected by poor data quality.
Information in all its forms (inputs, processed data, outputs) is a core component of any IT
system, and applications exist to process data supplied by users and other applications. Data
breathes life into applications, so it must be stored and managed by infrastructure (hardware
and software); it represents a key organizational asset with substantial value, and significant
responsibilities are imposed on organizations in managing it.
An MDM system provides mechanisms for the consistent use of master data across the
organization and a consistent understanding of, and trust in, master data entities. It is designed
to accommodate and manage change.
Organizations have multiple, often inconsistent, repositories of data due to:
Line of business divisions;
Different channels;
Cross-domain data;
Distribution of information;
Packaged systems;
Mergers and acquisitions.
An operational MDM system participates in the operational transactions and business
processes of the enterprise, interacting with other application systems and people. An
analytical MDM system is a source of authoritative information for downstream analytical
systems, and is sometimes a source of insight itself.
3. Data and information management
Data and information management is a business process consisting of the planning and
execution of policies, practices, and projects that acquire, control, protect, deliver, and
enhance the value of data and information assets. Its scope is to manage and use information
as a strategic asset by implementing processes, policies, infrastructure and solutions to
govern, protect, maintain and use information.
Making relevant and correct information available in all business processes and IT systems,
for the right people, in the right context, at the right time, with the appropriate security and
quality, is necessary to exploit correct and proper information in business decisions, processes
and relations.
In information management we distinguish two sets of goals: primary and secondary.
The primary goals are to:
understand the information needs of the enterprise and all its stakeholders;
capture, store, protect, and ensure the integrity of data assets;
continuously improve the quality of data and information, including accuracy, integrity,
integration, relevance and usefulness of data;
ensure privacy and confidentiality, and to prevent unauthorized inappropriate use of data
and information;
maximize the effective use and value of data and information assets.
The secondary goals are to:
control the cost of data management;
promote a wider and deeper understanding of the value of data assets;
manage information consistently across the enterprise;
align data management efforts and technology with business needs.
By managing data and information carefully, like any other asset, and by ensuring adequate
quality, security, integrity, protection, availability, understanding and effective use, with
responsibility for data management shared between business data owners and IT data
management professionals, Data Management emerges as a business function with a set of
related disciplines: planning for, controlling and delivering data and information assets
through the development, execution and supervision of plans, policies, programs, projects,
processes, practices and procedures that control, protect, deliver and enhance the value of
those assets. The scope of the data management function and of its implementation varies
widely with the size, means and experience of organizations, yet the role of data management
remains the same even though its implementation differs.
Data management is a responsibility shared between data management professionals within
IT and the business data owners, who represent the interests of data producers and information
consumers. Business data ownership is concerned with accountability for business
responsibilities in data management.
BPM enables more accurate and timely decisions that enhance business performance by
optimizing processes with human tasks and automated tasks and by improving visibility,
while MDM provides trusted and timely data to business processes; combining the two yields
enterprise process agility built on trusted, timely data from MDM.
Master data is ensured as a trusted asset to the organization's processes by enforcing
appropriate Data Governance policies that support process consumption, with the help of
BPM Express, by implementing and enforcing policies and coordinating multi-step/multi-role
workflows for data.
4. Conclusion
Improving data quality is one of those timeless efforts that can provide value on its own, can
be done as a first step towards something else such as master data management (MDM), or
can be done together with MDM.
Data Governance is necessary in order to meet several strategic business requirements, such as:
Compliance with regulations and contractual obligations;
Integrated customer management (360 degree view);
Company-wide reporting needs (Single Source of the Truth);
Business integration;
Global business process harmonization.
All of these issues ultimately trace back to a lack of data governance and poor-quality data.
Master data management technology can address many of them, but only when driven by an
MDM strategy that includes a vision supporting the overall business and incorporates a
metrics-based business case. Data governance and organizational issues must be put front and
center, and new processes designed to manage data through the entire information
management life cycle. Only then can you successfully implement the new technology you'll
introduce in a data quality or master data management initiative.
References
[1] A. Berson and L. Dubov, Master Data Management and Customer Data Integration for a
Global Enterprise, McGraw-Hill, 2007, pp. 8
[2] O. Boris, “One Size Does Not Fit All: Best Practices for Data Governance”, University of
St. Gallen, Institute of Information Management Tuck School of Business at Dartmouth
College, Minneapolis, September, 2011
[3] J. J. Korhonen, MDM and Data Governance. Helsinki University of Technology, available:
http://www.jannekorhonen.fi/MDM_and_Data_Governance.pdf
M-LEARNING AND LIFELONG LEARNING
Alina-Mihaela ION
The Bucharest Academy of Economic Studies, Romania
Dragoș VESPAN
The Bucharest Academy of Economic Studies, Romania
Abstract. The article highlights the impact of the use of mobile devices by participants in the
lifelong learning educational process. Seeing human development as sustainable development
involves adapting to the new hardware and software technologies brought by the progress of
information and communication technology. Due to technological development, access to
mobile devices has grown significantly. This way, the extension of the educational process
towards lifelong learning is simplified by the use of mobile devices; in practice, m-Learning
thus sustains lifelong learning.
Keywords: lifelong learning, computer assisted instruction, m-learning, mobile devices,
information and communication technology
JEL classification: I2
1. Introduction
The progress of information and communications technology has led to major changes in all
spheres where it is used and, consequently, in the educational domain. Using technology in
education supports both the traditional educational process and lifelong learning.
Both at the European level and worldwide, increasing attention is given to lifelong learning
as a result of the awareness that sustaining the lifelong learning process is directly related to
the living standards of citizens and their quality of life. Investments in human capital support
all other investments. In this way, the importance of education and its role in society is
emphasized.
2. Extension of traditional education
At the European level, m-Learning has been seen as an extension of e-Learning since 2002.
To this end, Ericsson implemented the project "From e-learning to m-learning" as part of the
Leonardo da Vinci II program, under the coordination of the European Union.
According to [1], [2], Leonardo da Vinci II was a European vocational training program, a
natural continuation of the Leonardo da Vinci I program [3]. It was conducted between
January 2000 and December 2006 and allowed 31 European countries to participate,
including Romania, which was in the process of accession at that time.
As [4] states, students in higher education are increasingly using mobile devices in their
personal activities and also in their social activities, including their education.
According to [4], one of the most popular European applications in terms of innovation is the
Federica platform implemented at the Federico II University of Naples, Italy. The platform
provided to students is oriented towards m-Learning distance instruction and can be accessed through a wide
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
342
variety of smartphones and tablets. The applications provided by Federica are: Federica
WebLearning, FedericaMobile and Federica iTunes U.
Through these applications, instructors benefit from the technology needed to create and
customize educational materials, while students have access to the information using a
computer or a mobile device, as shown in Figure 1.
Figure 1. Example of m-Learning courses for economics
Federico II University provides a wide variety of courses, corresponding to different
specializations such as Economics, Medicine, Pharmacy, Engineering, Philosophy, etc.
Generally, distance instruction using computers is considered independent instruction. By
contrast, in the case of instruction through mobile devices, the educational process should be
considered a complementary method to classical on-campus education, or even a complementary
instruction method for distance learning.
The small size of mobile devices allows them to be used at any time by those who want to get
informed in different areas. With a simple Internet connection, a charged battery and available
time, users have the opportunity to study whether they are in a classroom, on public transport
or at home, completing in this way their knowledge in their domain of interest. Such activities
encourage lifelong learning towards sustainable human development.
Mobile devices are mainly used for communicating via short text messages, searching for
information or displaying media content. According to [5], materials involving large amounts
of text are not well supported, because the small size of the screen makes them difficult to
read. In order not to complicate the reading process and not to tire the student, educational
materials for m-Learning must meet a certain standard and must be designed from the start for
small screens. Displaying content optimally on the screens of mobile devices and structuring it
accordingly will increase students' interest in m-Learning.
Communication in the educational process is encouraged by mobile technologies. Those who
use m-Learning-specific applications can communicate verbally or in writing, collaborate
through shared applications and interact through Social Media applications.
Communication facilities such as SMS, dialog and chat encourage collaborative activities
between those who study using these methods. In [5], based on the arguments presented by [6],
the idea that mobile devices represent a collaborative learning method is sustained.
3. M-Learning and sustainable human development
Mobile devices provide support especially for fast communication. Mobile devices have the
main role of providing support for achieving synchronous and asynchronous communication
between users, according to [7]. The development of social aspects of the users is such
encouraged.
In [8], the functionalities of a mobile phone are highlighted, based on a survey conducted
among students. The analysis of their responses resulted in a ranking of the functionalities of
mobile phones used in the educational process, as follows:
1. Communication
2. Accessing educational content
3. Accessing useful information
4. Social Networking
5. Content authoring
The functionalities mentioned above, which are specific to mobile phones, can be extended to
other mobile devices such as laptops (considered partially mobile devices) and tablets.
From an educational perspective, the utility of mobile devices follows the order of the
functionalities above from 2 to 5 (Figure 2); from a general perspective, the importance of
the functionalities follows exactly the reverse order, from 5 to 2.
Figure 2. The functionalities of mobile devices
At the same time, the functionalities of mobile devices can be grouped into:
- Basic functionalities, represented by communication (verbal communication, SMS, e-mail),
content authoring (composing texts, taking pictures, audio-video recording, audio recording)
and social networking (creating and publishing content on Facebook, Twitter, YouTube,
Instagram, Pinterest).
- Advanced functionalities, represented by the possibility of accessing useful information
(on websites such as electronic dictionaries or blogs specialized in a particular area) and
accessing the educational content available via PDF files, web pages, and interactive
applications.
Some users of the advanced functionalities of mobile devices are aware of the importance of
getting informed in a particular field, clarifying certain concepts or terms used casually in a
particular context. Unwittingly, they use these advanced functionalities for self-training. In
this case, the educational process is almost imperceptible to the user. Such self-training
activities can be carried out by users of all ages, regardless of their background. The activity
itself can be considered specific to lifelong learning and voluntary, without the user of the
device being specifically aware of the educational process.
Awareness generally occurs among users actively involved in the educational process. This
category of mobile device users is a particular case of all users conducting intentional
learning and organized learning, activities specific to lifelong learning. A distinction
between these two concepts is made in [9].
The definitions of these concepts can be adapted for the users of mobile devices in education.
For the users of mobile devices, following [9], we can say that intentional learning means
that, at a certain moment after completing his initial formation, the user becomes aware of the
need to continue his education in a certain domain and of the importance of attending further
training courses in order to increase his knowledge, competencies and skills in the domain of
activity he is interested in.
In this scenario, using the functionalities of mobile devices to search for educational
materials and to enroll in and graduate from specialty or training courses are operations the
user conducts intentionally and consciously.
Compared to intentional learning, from the perspective of lifelong learning and particularized
to the case where mobile devices are used in the educational process, organized learning is
learning that has been planned in a program. The participants do not explicitly express their
intention to follow this type of learning; rather, they generally choose it following
constraints at work.
Implicitly, the users of advanced functionalities, both those who are aware of the educational
process and those who are self-instructing, also use the basic functionalities provided by
mobile devices.
As his level of instruction rises, the user becomes capable of using high-level software
technologies to develop educational content. Thus, Figure 2 can be improved by adding the
sixth functionality, represented by the use of high-level technologies.
The use of mobile devices in education must still be seen as a complementary instruction
method. Just as an instructor chooses to present the content of the same course either in a
classical way, using a PDF file, or in an interactive way, using specific software
technologies, both methods being used to train, mobile technologies can also be viewed as an
alternative way of training. In order to learn, a student may choose to access educational
materials either on a computer or on a mobile device such as a smartphone. The objective of
the instruction and the role of the device remain the same.
Even if it should not matter how the user chooses to view the information in the completion
test of an organized course (PDF or interactive), the situation is different for mobile
devices. Mobile devices may be used for routine evaluation but, for the moment, a final
evaluation conducted exclusively on a mobile phone or tablet is not possible.
In order to get a correct and fair assessment, all those taking an exam at the same time should
have exactly the same type of device, or devices with identical performance. The problems that
prevent completing an educational process by giving final tests on mobile phones or tablets
can be divided into two categories: financial problems and methodological issues. From the
financial point of view, investments should be made in
laboratories in order to provide students with access to all the types of mobile devices that
they can use during the educational process, both for studying and for getting familiar with
the devices. Thus, in the event of taking a test on such a device, the user would have the
skills needed to use the device without any problems.
From the methodological point of view, the problems that may occur are caused by the novelty
of using mobile devices in an educational process that ends with awarding recognized diplomas.
Different methodologies should be developed for creating the educational content proposed in
the training process.
4. Conclusions and future directions
The programs proposed and sustained at the European level over the past 25 years are proof of
the interest shown in transforming Europe into a competitive and dynamic knowledge-based
economy whose development is intended to be durable and sustainable. Such a development can
only be achieved through investments in human capital that raise its educational level. Within
the European Union, one priority for the coming years is to create programs that allow the
sustainable development of EU citizens.
Including mobile devices in educational activities provides more support to students. Still,
this should be seen as an activity complementary to the process of lifelong learning, which is
traditionally achieved through presence on campus or through personal computers.
The advantages of accessing educational content on a tablet or mobile phone are similar to
those of accessing the content from a personal computer. Deploying complex activities that
involve the over-advanced functionalities of mobile devices is hampered by the small size of
the screen.
The basic functionalities of mobile devices are rapid communication and information
transmission, two activities that support collaboration between students or between ordinary
users of applications specific to the educational field. Using mobile devices in everyday
activities directly encourages voluntary or organized involvement in educational activities.
Because they can keep the user in constant contact with the educational or professional
activities he conducts, mobile devices encourage the lifelong learning process.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectorial Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907, "Excellence in scientific interdisciplinary research, doctoral
and postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The
Bucharest University of Economic Studies, Romania.
References
[1] P. Landers, "From e-Learning to m-Learning," 08 2002. [Online]. Available:
http://learning.ericsson.net/mlearning2/project_one/leo.html. [Accessed 03 2015].
[2] CE, „Obiectivele Europa 2020," Feb. 10, 2014. [Online]. Available:
http://ec.europa.eu/europe2020/europe-2020-in-a-nutshell/targets/index_ro.htm. [Accessed
Dec. 4, 2014].
[3] UE, Educaţie şi formare: cadru general - Programul de învăţare pe tot parcursul vieţii
2007-2013, 2009.
[4] eprof.ro, "Tehnologii Mobile ȋn Învățare," 2015. [Online]. Available:
http://www.eprof.ro/doc/Mobile_learning.pdf.
[5] A.-M. Ion and P. Pocatilu, "Using M-Learning in Education," in The Proceedings of the 11th
International Conference on Informatics in Economy, Bucharest, 2012.
[6] H. Uzunboylu, N. Cavus and E. Ercag, "Using mobile learning to increase environmental
awareness," Computers & Education, vol. 52, no. 2, pp. 381-389, February 2009.
[7] A.-M. Ion and D. Vespan, "Collaborative Learning and Knowledge Transfer in
Consciousness Society," Informatica Economica, vol. 15, no. 3, pp. 115-127, 2011.
[8] L. Nielsen, "Research-based proof that students use cell phones for LEARNING," Feb. 16,
2013. [Online]. Available:
http://theinnovativeeducator.blogspot.co.at/2013/02/finally-research-based-proof-
that.html. [Accessed 2015].
[9] A.-M. Ion, "Lifelong Learning and Human Sustainable Development in European Union,"
in Proceedings of the 9th International Conference On Economic Cybernetic Analysis:
Positive And Negative Effects Of European Union And Eurozone Enlargement Pone-2014,
Oct 31 - Nov 1st 2014, Bucharest, 2014.
THE PROBLEM OF DATA CONSISTENCY IN ANALYTICAL
SYSTEMS
Oleksandr SAMANTSOV
Kharkiv National University of Radioelectronics
Olena KACHKO
Kharkiv National University of Radioelectronics
Abstract. The authors describe how data inconsistencies between aggregated and raw data
appear in analytical systems and propose solutions for preventing them, based on models of
data consistency. The question of whether inconsistent data is acceptable at all is also
addressed.
Keywords: aggregating data, business intelligence, data analytics, data inconsistency
JEL classification: C88, L86
1. Introduction
Business Intelligence processes are tightly woven into all spheres of life, and this leads to
modern companies handling terabytes of user data. This complicates the structure of modern
analytical data providers, which are multilevel systems that often give access to data
aggregated at a certain level rather than to raw data [1]. One of the challenges when working
with aggregated data is the complexity and volume of the raw data, which lead to delays in
processing; such delays are critical for some business domains (medicine, sales) and,
moreover, violate one of the principles of any analytical and distributed system: the
principle of consistency. In this article we will look at methods of dealing with data
consistency.
2. Sources of latency in analytical databases
The business requirements for a zero-latency analytical environment introduce a set of service
level agreements that go beyond what is typical of a traditional data warehouse. These service
levels focus on three basic characteristics:
1. Continuous data integration, which enables near real-time capturing and loading from
different operational sources. This sort of data integration results in an increasing number of
late-arriving data (e.g. due to propagation delays). Besides technical challenges (mixed
workload caused by concurrent updates and analytical queries, scalability, performance,
minimized scheduled downtimes, etc.) there are other issues which directly affect the analytical
environment:
- Analysis results may change unexpectedly from the analyst’s perspective during the repetition
of an identical analytical query if the result set was affected by newly integrated data in the
meantime. This is a critical situation, because it confuses analysts who are accustomed to the
stable snapshot paradigm of data warehouses. It is very difficult for them to determine the
cause of such an unexpected change: the newly integrated data.
- Keeping aggregates current. Aggregates are intended to provide better performance for
analytical queries, providing results at a higher level, rather than all the detailed data. This is a
common situation in analytical environments using OLAP. In a traditional data warehouse all
the aggregates are updated at the end of every update window. However, in a continuous
loading environment this is not feasible. We need a model that is able to reflect multiple
versions of aggregates regarding the same dimension hierarchy levels [2].
2. Active data warehouses. An active data warehouse is event driven, reacts in a timeframe
appropriate to the business needs, and makes tactical decisions or causes operational actions
rather than waiting to produce periodic reports. It provides an integrated information repository
to drive both strategic and tactical decision support within an organization. Furthermore, rule-
driven (active) decision engines can use this information in order to make recommendations or
initiate operational actions in near real time responding to predefined data conditions in the
warehouse. [2].
3. Late-arriving data. Late-arriving datasets are defined as data, which is available for loading
and is logically related to data warehouse datasets already integrated during previous update
periods (e.g. weeks or even months ago). Late-arriving records are welcome because they make
the information more complete. However, those facts and dimension records are bothersome
because they are difficult to integrate. The newly integrated datasets change the counts and
totals for prior history. There are several reasons why we shouldn’t ignore late-arriving data:
- Analysis results may change retrospectively. Late-arriving data can possibly change analysis
results unexpectedly from the analyst’s perspective. This situation is similar to that encountered
in continuous load environments. It is very difficult to determine the cause of an unexpected
change. It can be one of the following two possibilities:
- Late-arriving data is integrated into the data warehouse and affects historical analysis
results by changing counts and totals for prior history, so the data needs to be
re-aggregated.
- Loading errors or data quality problems during data staging (e.g. duplicate rows after
restarting a loading process) affect detailed data and aggregates.
Also, we should take into account the high availability requirements of data warehouses.
Availability service levels for a zero-latency analytical environment are typically more
stringent than for strategic decision support implementations. The active mechanisms of the
proposed architecture expand the scope of a traditional DWH to include tactical decision
support queries that are critical for the operational aspects of an organization's business.
As a result, 24 × 7 × 52 (24 hours a day, 7
days a week, 52 weeks a year) availability becomes an important requirement, because without
access to the DWH, the business cannot operate in an optimal way. These availability
requirements apply to both planned (system upgrades, etc.) and unplanned (disaster, etc.)
downtimes [2].
3. Data consistency models
Analytical storage is characterized by a predominance of read operations over write
operations. Therefore, we will focus on client-oriented models of data consistency.
3.1. Potential consistency
This model is characterized by the fact that the degree of parallelism with which requests
arrive and the degree of data consistency may vary.
Systems of this kind can be seen as instances of distributed databases that are insensitive to
a relatively high degree of consistency violation. Their data usually does not change for a
long time, and all copies of it gradually become consistent. This form of consistency is
called potential (eventual) consistency.
A potentially consistent data warehouse has the following property: in the absence of further
changes, all copies of the data gradually become identical. Potential consistency, in fact,
requires only that a change is guaranteed to propagate to all copies, regardless of when that
happens. Write-write conflicts are often relatively easy to resolve if we assume that changes
can be made only by a small group of processes. Therefore, implementing potential consistency
is often quite cheap [3].
According to this data consistency model, we can propose the following solution for data
aggregation: depending on the update rate and volume, a special aggregator program runs on
the database and re-aggregates the available data.
This decision will cause data consistency violations, but it will not load the system with
unnecessary operations. This solution may be used if the analytical data does not require
real-time updates, or when the refresh rate and the volumes of data are small, for example
when aggregating demographic behavior or the user data of a web resource.
3.2. Monotonic-read consistency
The data warehouse provides monotonic-read consistency if it meets the following condition:
if a process reads the value of a data item X, any subsequent read operation by that process
always returns the same value or a more recent one.
In other words, monotonic-read consistency ensures that if the process sees a certain value of
X at time t, it will never see an older value of X afterwards.
Monotonic-read consistency is implemented as follows. When a client performs a read operation
on a server, that server checks whether all the write operations in the client's write set
have been performed locally. If not, it contacts the other servers to bring the data up to
date before performing the read operation.
Based on the definition of monotonic-read consistency, we can offer the following
implementation of a data aggregator. When a user accesses the aggregates, the program compares
the timestamp at which the aggregated data was built with the timestamp of the last change to
the source data. If the source data has changed, the program starts the aggregation process.
For this implementation, the time needed to aggregate the data is critical: the first access
to the aggregates after the raw data has changed can be quite lengthy. Therefore, this data
consistency model can be used only in analytical systems with a relatively small amount of
incoming data.
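A minimal sketch of this timestamp comparison, with illustrative names and a logical clock standing in for real timestamps: the first read after a raw-data change pays the aggregation cost, and every read returns a value at least as new as the previous one.

```python
raw_rows = []
raw_changed_at = 0          # logical clock of the last raw-data write
agg_built_at = -1           # logical clock at which aggregates were built
aggregates = {}

def insert_raw(key, value):
    global raw_changed_at
    raw_rows.append((key, value))
    raw_changed_at += 1

def read_aggregate(key):
    # Re-aggregate only if raw data changed after the last build,
    # so reads are monotonic: never older than an earlier read.
    global agg_built_at, aggregates
    if agg_built_at < raw_changed_at:
        totals = {}
        for k, v in raw_rows:
            totals[k] = totals.get(k, 0) + v
        aggregates = totals
        agg_built_at = raw_changed_at
    return aggregates.get(key, 0)

insert_raw("sales", 100)
first = read_aggregate("sales")    # triggers re-aggregation
insert_raw("sales", 50)
second = read_aggregate("sales")   # never older than `first`
```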
3.3 Monotonic-write consistency
In many situations, it is important that all copies of the data reflect the distributed write
operations in the correct order. This can be ensured with monotonic-write consistency. A store
has the property of monotonic-write consistency if the following condition holds: a write
operation by a process on a data item X is completed before any subsequent write operation on
X by the same process. Here, the completion of a write operation means that any copy on which
a subsequent operation is performed reflects the effect of the previous write operation by the
same process, no matter where that operation was initiated. In other words, a write operation
on a copy of data item X is performed only if that copy has been brought up to date with the
results of the previous write operations performed on the other replicas of X.
Monotonic-write consistency is implemented similarly to monotonic reads. Whenever a client
initiates a new write operation at a server, the server looks up the client's write set.
(Again, the size of this set may be too large for the existing performance requirements.) The
server ensures that the write operations in this set are performed first and in the correct
order. After performing the new operation, its identifier is added to the write set [3].
Thus, we can offer the following solution. When recording raw data in an analytical system, we
run the aggregator program. To improve performance, it makes sense to run the aggregator not
for each record, but for a certain chunk of records. At the same time, we should ensure that
the update of the raw data and its aggregation appear in the same transaction. This solution
is again best suited to systems with rarely updated data; however, by horizontally scaling the
number of processes that update the data (in accordance with the number of CPU cores), we can
achieve good results with frequently updated data as well.
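The chunked, transactional aggregation described above can be sketched as follows. The chunk size, names and the in-memory "transaction" are illustrative assumptions; in a real database, `commit_chunk` would be a single transaction so that raw data and aggregates can never be observed out of sync.

```python
CHUNK_SIZE = 3
buffer = []          # writes waiting for the next chunk commit
raw_rows = []
aggregates = {}

def commit_chunk():
    # Apply buffered raw rows and their aggregate deltas together, in
    # arrival order, standing in for one atomic database transaction.
    global buffer
    for key, value in buffer:
        raw_rows.append((key, value))
        aggregates[key] = aggregates.get(key, 0) + value
    buffer = []

def insert_raw(key, value):
    buffer.append((key, value))
    if len(buffer) >= CHUNK_SIZE:
        commit_chunk()

for v in (10, 20, 30):          # fills one chunk -> committed atomically
    insert_raw("orders", v)
insert_raw("orders", 40)        # still buffered, not yet visible

committed = aggregates["orders"]
```

Batching per chunk rather than per record is what keeps the per-write aggregation overhead acceptable.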
3.4 Read-your-writes consistency
There is another client-oriented model of data consistency, very similar to monotonic-write
consistency. A data warehouse has the property of read-your-writes consistency if it satisfies
the following condition: the effect of a write operation by a process on a data item X is
always visible to a subsequent read operation on X by the same process. In other words, a
write operation is always completed before a subsequent read operation by the same process.
Read-your-writes consistency also requires that the server handling a read operation has seen
all the write operations in the client's write set. The missing write operations can simply be
fetched from other servers before the read operation is performed, even if this threatens to
become a problem for response times. Alternatively, the client software can itself search for
a server on which the write operations in the client's write set have already been
performed [3].
Thus, to comply with this principle, we can propose the following solution: the aggregation
program knows when the raw data has been updated and, depending on this, rebuilds the
necessary aggregated data sets. When a user accesses the aggregated data, we need to verify
whether there is an ongoing aggregation process. If so, the user waits for its completion;
otherwise, he immediately receives the data. This method is the most convenient for data
processed in real time. However, its disadvantage is that it requires a large amount of
resources for the aggregator software, which are spent on tracking the current write
operations.
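A sketch of this wait-for-ongoing-aggregation behavior, using one background thread per write. The version counters and names are illustrative assumptions, and a real aggregator would track per-client write sets rather than joining every worker.

```python
import threading

raw_rows = []
aggregates = {}
raw_version = 0       # bumped on every raw write
agg_version = -1      # version of the raw data the aggregates reflect
lock = threading.Lock()
workers = []          # in-flight aggregation threads

def _aggregate():
    global agg_version
    with lock:
        target = raw_version
        totals = {}
        for k, v in raw_rows:
            totals[k] = totals.get(k, 0) + v
        if target > agg_version:      # skip rebuilds overtaken by newer ones
            aggregates.clear()
            aggregates.update(totals)
            agg_version = target

def insert_raw(key, value):
    # Each write triggers a background re-aggregation.
    global raw_version
    with lock:
        raw_rows.append((key, value))
        raw_version += 1
    t = threading.Thread(target=_aggregate)
    workers.append(t)
    t.start()

def read_aggregate(key):
    # Block until ongoing aggregation finishes, so the reader always
    # sees the effect of its own earlier writes.
    for t in workers:
        t.join()
    workers.clear()
    with lock:
        return aggregates.get(key, 0)

insert_raw("clicks", 7)
insert_raw("clicks", 3)
result = read_aggregate("clicks")     # reflects both writes
```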
4. Real-time data necessity
To decide whether we need to implement any of the suggested solutions, we should first decide
whether they are worth implementing at all: they are not, if the cost of supporting a solution
is much higher than the value of the real-time data it provides. The key concept behind "real
time" is that our artificial representation must be in sync
with the real world so that we can respond to events in an effective manner. In today's
technology, the data warehouse has become an artificial representation of our real business
world. In this regard, the primary purpose of the data warehouse is to maintain a unified and
consistent view of the business reality [4].
In data analytics, there are three main steps that should be followed after a business event
occurs.
The first step is storing the raw data in the data warehouse and aggregating it. The time
spent on this step can be called data latency.
The second step is analyzing the data and providing it to the appropriate data analyst. The
time spent on this step can be called analysis latency.
The third step is making a decision and implementing it. The time spent on this step can be
called decision latency.
Figure 1 – The evolution of the data value over time
In Figure 1, we display an event whose value decreases rapidly after it is first triggered.
This can apply, for example, to stocks or health care. However, this is not the only way in
which the value of data can change over time.
Figure 2 – Examples of changing the data cost over time
To determine whether there is a need to provide real-time data, you should build such a graph
and estimate how much the data value will change if real-time data is not available. If this
loss is much smaller than the cost of supporting real-time data, then it can simply be
ignored.
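A hypothetical back-of-the-envelope version of this comparison, assuming, purely for illustration, that the data's business value decays exponentially; the decay model and all numbers are invented, not taken from the paper.

```python
import math

def data_value(initial_value, hours_elapsed, half_life_hours):
    # Assumed exponential decay of the data's business value over time.
    return initial_value * math.exp(-math.log(2) * hours_elapsed / half_life_hours)

initial_value = 1000.0       # value of acting on the event immediately
batch_delay_hours = 6.0      # latency of the existing batch pipeline
half_life_hours = 24.0       # how fast the value decays (hypothetical)
real_time_cost = 50.0        # cost of supporting real-time delivery

# Value lost by the time the non-real-time pipeline delivers the data.
value_lost = initial_value - data_value(initial_value,
                                        batch_delay_hours,
                                        half_life_hours)
worth_it = value_lost > real_time_cost
```

With these numbers, the value lost to batch latency exceeds the cost of real-time support, so real-time delivery would be justified; with a flatter decay curve, the opposite conclusion would follow.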
5. Conclusions
This article proposed methods for solving the problem of inconsistency between aggregated and
raw data in analytical systems. At the same time, it should be noted that there is no single
way to solve this problem; the choice depends solely on the characteristics of the system. We
should also never forget that, in order to ensure that the data is consistent, we have to
sacrifice either data processing time or computing resources, making the storage and
processing of the data more expensive. Therefore, from our point of view, it makes sense to
neglect data consistency unless it is absolutely necessary.
References
[1] D. Moody and M. Kortink, From Enterprise Models to Dimensional Models: A
Methodology for Data Warehouse and Data Mart Design. Internet:
http://ssdi.di.fct.unl.pt/bddw/material_apoio/artigos/files/2000-Moody.pdf [Feb. 26, 2015]
[2] R. M. Bruckner and A. M. Tjoa, "Capturing Delays and Valid Times in Data Warehouses -
Towards Timely Consistent Analyses," Journal of Intelligent Information Systems, vol. 19,
no. 2, Kluwer, Netherlands, 2002, pp. 169-190.
[3] A. S. Tanenbaum and M. Van Steen, Distributed Systems: Principles and Paradigms. New
Jersey: Pearson Education, Inc., 2007, 686 p.
[4] R. Hackathorn, "The BI Watch: Real-Time to Real-Value." Internet:
https://www.researchgate.net/publication/228498840_The_BI_Watch_Real-
Time_to_Real-Value [Mar. 1, 2015]
BIG DATA ANALYSIS AS FUNDAMENT FOR PRICING DECISIONS
Anca APOSTU
Bucharest University of Economic Studies [email protected]
Abstract. Data-driven decision-making for business, industry, research, commerce and social
media represents nowadays a revolution thanks to the notion of “Big Data”. Big Data Analysis
provides an important source of knowledge while manipulation of such data requires suitable
storage and analysis capabilities. For effective large-scale analysis, locating, identifying,
understanding, and citing data has to happen in an automated manner supported by scalable
cloud platforms. In this article, the author's aim is to present the fundamental aspects of a
study for deriving price calculation elements from Big Data, and ways of using these elements
for pricing decisions in the retail industry. Learning how to capitalize on Big Data
opportunities today could make the difference for the changes to come tomorrow in this
industry.
Keywords: Big Data, Information economics, price calculation, pricing decisions, price
knowledge
JEL classification: C53, C55, C80
1. Introduction and review of literature
Customer price knowledge has been the object of considerable research in the past decades.
Paper [1] cites over sixteen previous studies, most of which focus on measuring customers’
short-term price knowledge of consumer packaged goods. In a typical study, customers are
interviewed either at the point-of-purchase or in their home and asked to recall the price of a
product, or alternatively, to recall the price they last paid for an item. In perhaps the most
frequently cited study, [2] asked supermarket shoppers to recall the price of an item shortly
after they placed it into their shopping carts. Surprisingly, fewer than 50% of consumers
accurately recalled the price. Thus, despite the immediate recency of the purchase decision,
there was no improvement in the accuracy of the responses.
In a recent paper [3], the authors combine survey data and a field experiment to investigate this
prediction. In their study, they survey 14 customers and collect price recall measures for
approximately two hundred products. They then conduct a field experiment in which they
randomly assign the same items to one of three conditions. In the control condition, items are
offered at the regular retail price. In the price cue condition, a shelf tag with the words “LOW
prices” is used on an item. In the discount condition, the price is offered at a 12% discount
from the regular price.
The authors show that both price cues and price discounts increase demand. But, consistent
with theoretical predictions, the authors find that price cues are more effective on products for
which customers have poor price knowledge [8]. In contrast, price discounts are more effective
when customers have better price knowledge. Together these results highlight the importance
that price knowledge serves in determining the effectiveness of price changes and price cues.
Paper [4] examines consumer price knowledge by comparing the actual market prices and
consumer price estimates in the Finnish grocery market. Although the individual price
estimates of consumers were found to differ significantly from the actual market prices, the
medians of consumer price estimates and market prices were very close to each other for most
of the products in their data. The study indicates that consumer price knowledge is not as poor
as previously suggested by the results of point-of-purchase studies. The authors suggest that at
least part of the weakness in consumer price knowledge can be explained by differences in
market price variation.
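The pattern reported by [4], individually noisy price estimates whose median nonetheless tracks the actual market price, can be illustrated with a short sketch. All prices and estimates below are invented for illustration only:

```python
import statistics

# Hypothetical price-recall data: each customer's estimate for one product,
# compared against the actual shelf price (illustrative numbers only).
actual_price = 2.49
estimates = [1.99, 2.10, 2.50, 2.45, 2.60, 3.20, 2.40, 2.55]

# Individual errors can be large...
individual_errors = [abs(e - actual_price) for e in estimates]

# ...while the median of all estimates sits close to the market price,
# which is the pattern paper [4] reports for the Finnish grocery data.
median_estimate = statistics.median(estimates)

print(f"median estimate: {median_estimate:.2f}")
print(f"max individual error: {max(individual_errors):.2f}")
```

The median here lands within a few cents of the actual price even though the worst individual estimate is off by more than 0.70.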
2. The importance of Price Knowledge
The responsibility of the pricing officer is to read and understand the price attention
approach, to define the products for which he wants to measure price attention, and to follow
a methodology based on measuring both price awareness and price sensitivity.
Modeling and analytics are now the hot topic in a vast array of formerly non-scientific
environments: Internet pioneers feverishly seek to uncover the mathematical roots behind
“viral” marketing; physicists are being employed by publishers to forecast how a favorable
New York Times book review sends Amazon sales skyrocketing; geology specialists are
helping some fast-moving consumer goods retailers re-create sales aftershock effects. In each
case, non-scientific environments are being transformed by modeling, forecasting, and
predictive analytics science – namely because in the absence of these powerful disciplines, it
would be nearly impossible for retailers to handle the complexity of data and variables
associated with these tasks.
Most retailers know that technology has played an increasingly important role in helping
them set prices. But until recently, these efforts were rooted in advances in computing
technology, rather than in newfound applications of scientific principles. Real science is a
powerful, pervasive force in retailing today, particularly so for addressing the complex
challenge of retail pricing. Done right, the application of scientific principles to the creation of
a true price optimization strategy can lead to significant sales, margin, and profit lift for
retailers. Our research aims to explain why retailers can no longer thrive without a science-
based pricing system, and defines the new challenges brought by the rise of Big Data.
3. Big Data – bringing new opportunities in retail systems
Big Data is bringing new opportunities to drive innovation, ramp up productivity, and create
groundbreaking new user experiences. Learning how to capitalize on Big Data opportunities
today could mean positioning for the changes to come tomorrow in the retail industry.
Big Data is a body of data that is so voluminous, variable and/or fast moving that it is not
feasible to process, store, access and analyze it using conventional technologies. Its volume
is skyrocketing, estimated to reach 7.9 zettabytes in 2015 (1 zettabyte = 1 trillion GB), at
an estimated growth rate of +40% per year, or roughly 1 petabyte every 15 seconds. These
changes are mostly due to cheaper computing power and storage and to recent developments:
high growth in multimedia (especially video); the explosion of personal digital devices
(a change in usage/interaction); the proliferation of sensors, meters, trackers etc.
(+30% per year); all connected to the Internet („digital exhaust”).
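The growth figures quoted above can be turned into a rough compound-growth projection. The base volume and growth rate are taken from the text; the output is an order-of-magnitude estimate, not a forecast:

```python
# Rough projection of global data volume using the figures quoted above:
# ~7.9 ZB in 2015, growing at roughly +40% per year (illustrative only).
base_year, base_volume_zb = 2015, 7.9
growth_rate = 0.40

def projected_volume(year):
    """Compound-growth estimate of data volume (in zettabytes) for a year."""
    return base_volume_zb * (1 + growth_rate) ** (year - base_year)

for year in (2015, 2017, 2020):
    print(year, round(projected_volume(year), 1))
```

At +40% per year, the quoted 7.9 ZB of 2015 would already exceed 42 ZB by 2020, which conveys why conventional storage and analysis technologies cannot keep pace.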
The current sources of data are presented in Table 1, in relationship with the 3-V signature
of Big Data – Volume, Variety and Velocity.
Table 1. Current sources of data for Big Data Analytics

                                                      Volume   Variety   Velocity
Structured Data
(Databases, Warehouses, ERP, CAD, Biz Apps)             *        *         *
Machine Data
(logs, sensors, meters, clickstreams, CDR, GPS)         ***      **        ***
Unstructured content
(External: Web, Social Media, Multimedia;
Internal: documents, e-mail, CMS/Collaboration)         **       ***       *
Big Data Analytics uses sentiment analysis, spatial analysis, semantic processing, natural
language processing, statistics, text mining, clustering, ensemble learning, pattern recognition,
network analysis/relationship mapping, signal processing, classification, machine learning,
distributed processing, data fusion and data visualization, and brings new opportunities such
as:
1. Insights, innovation, efficiency, agility:
a. Volume (new insight from new sources, complete (raw) data vs. sample sets);
b. Variety (insight & innovation with unused, under-used sources (value 90%));
c. Velocity (agility & competitiveness through real-time/right-time data);
d. Cross-referencing (Aggregation/data fusion).
2. Enormous potential value in all sectors:
a. U.S. retail: 60+% increase in net margin;
b. 0.5-1.0% productivity growth.
The great amount of data currently being produced by business, industry, research, commerce
and social media represents an important source of knowledge, and manipulation of such data
requires scalable storage and good analysis tools.
For example, paper [5] shows how “Walmart is using its big data collection to inform
customers of their choices as a mobile marketing strategy. It rationalizes that since its
consumers have made their mobile devices as indispensable, the retail company can make use
of this information to its advantage. The customers will be offered shopping tools to help them
from pre-store planning, decision making, and up to check out.”
“Walmart plans to implement its mobile strategy to improve its revenues. It is planning to offer
a mobile application to help customers create shopping lists by suggesting to them what they
should buy. Walmart is able to do this because it has a huge collection of shopping data it has
collected from its customers.”
4. Datafication – the new concept and way of using data for retail industry
“Using great volumes of information requires three profound ways in how we approach data.
The first is to collect and use a lot of data rather than settle for small amounts or samples, as
statisticians have done for well over a century. The second is to shed our preference for highly
curated and pristine data and instead accept messiness: in an increasing number of situations,
a bit of inaccuracy can be tolerated, because the benefits of using vastly more data of variable
quality outweigh the costs of using smaller amounts of very exact data. Third, in many
instances, we will need to give up our quest to discover the cause of things, in return for
accepting correlations” [6]. With Big data, instead of trying to understand precisely why an
event has taken place, “the researchers can instead collect and analyze massive quantities of
information about such events and everything that is associated with them, looking for patterns
that might help predict future occurrences. Big data helps answer what, not why, and often
that’s good enough”, the same authors report.
The paper under discussion brings to attention the concept of “datafication”, which is not the
same as digitization, which takes analog content and converts it into digital information. The
concept of datafication was first introduced by Shigeomi Koshimizu, a professor at the
Advanced Institute of Industrial Technology in Tokyo, in a study of people’s posture when
seated, as an attempt to adapt the technology as an anti-theft system for cars; it refers to
taking all aspects of life and turning them into data.
all aspects of life and turning them into data. For example, Google’s augmented-reality glasses
datafy the gaze. Twitter datafies stray thoughts. LinkedIn datafies professional networks.
“Once we datafy things, we can transform their purpose and turn the information into new
forms of value. For example, IBM was granted a U.S. patent in 2012 for “securing premises
using surface based computing technology” – a technical way of describing a touch-sensitive
floor covering, somewhat like a giant smartphone screen. Datafying the floor can open up all
kinds of possibilities. For example, retailers could track the flow of customers through their
stores. Once it becomes possible to turn activities of this kind into data that can be stored and
analyzed, we can learn more about the world – things we could never know before because we
could not measure them easily and cheaply.”
Two decades ago, America’s biggest retailer, Walmart, planned to increase its market size
by penetrating rural communities located in small towns, with the view of bringing its services
to all and sundry while maximizing its profitability [7]. Walmart successfully pulled this off
by analyzing the large sets of data available to it, gathered from its previous ventures,
to understand the consumption needs of these communities and to select prime
locations for setting up its stores.
Walmart’s move has been credited by most market analysts as the first use of big data in the
retail industry, earning it the tag of “merchandising pioneer in big data” because its
analysts successfully analyzed the millions of customer records available to them without the
use of today’s sophisticated software/hardware tools. Since then, smaller retail chains, both
offline and online, have utilized big data to understand customer consumption needs and to
build adequate marketing strategies [9].
“Walmart is also planning to create a mobile app which will help customers when they’re
already in the stores” [5]. The application can sense when the customer is inside a store and
automatically prompts him or her to opt for “Store Mode”, which allows scanning codes for
discounts and prices. An application which allows customers to use voice prompts to create a
shopping list based on their preferences is another area in which Walmart invests effort [5].
“According to a study conducted by the retail store, customers who make use of its mobile
applications shop at Walmart’s stores at least twice a month. They also spend 40% more
monthly as compared to customers who are non-app users. In January of this year, Walmart
teamed up with Straight Talk Wireless to create a $45 prepaid plan for its customers who have
smartphones. Clients who bought smartphones at any of Walmart’s stores are also offered a
$25 a month no-interest fixed monthly installments through the store’s credit card.” [10]
5. Conclusions – current challenges, future approaches
The research on price knowledge reveals that there is an opportunity for firms to influence
customers’ price perceptions. Moving forward, the use of Big Data has helped retailers to
optimize the pricing of merchandise, improve marketing campaigns, improve store operations,
refine product placement, analyze customer satisfaction, shape staffing policies and provide
decision support.
The fact remains that retail outlets will continue to face the challenges associated with
gathering useful data and analyzing them to predict the ever changing trends in the retail
industry.
References
[1] Monroe, Kent B. and Angela Y. Lee, 1999, “Remembering versus Knowing: Issues in
Buyers’ Processing of Price Information,” Journal of the Academy of Marketing Science,
Vol. 27, No. 2, pp. 207-225.
[2] Dickson, Peter R. and Sawyer, Alan G., July 1990, “The Price Knowledge and Search of
Supermarket Shoppers”, Journal of Marketing, Vol. 54, No. 3, pp. 42-53, published by:
American Marketing Association, [online] Available: http://www.jstor.org/stable/1251815
[3] Anderson, Eric T., Edward Ku Cho, Bari Harlam and Duncan I. Simester (2007), “Using
Price Cues,” mimeo, MIT, Cambridge MA.
[4] Aalto-Setälä, Ville, Raijas, Anu, 2003, “Actual market prices and consumer price
knowledge”, Journal of Product & Brand Management, ISSN: 1061-0421
[5] De Borja, Florence, June 7, 2013, “Walmart Uses Big Data For its Mobile Marketing
Strategy”, [online] Available: http://cloudtimes.org/2013/06/07/wal-mart-big-data-mobile-
marketing-strategy/
[6] Cukier, Kenneth, Mayer-Schoenberger, Viktor, “The Rise of Big Data. How it’s changing
the way we think about the world”, Foreign Affairs, May/June 2013, pp.28 - 40
[7] McKinsey Global Institute, “Retail Analytics Romance with Big Data”, [online] Available:
http://biginsights.co/retail-analytics-romance-with-big-data/
[8] Anderson, Eric T., Simester, Duncan I., 2008, “Price Cues and Customer Price
Knowledge”, January, [online] Available:
http://www.kellogg.northwestern.edu/faculty/anderson_e/htm/personalpage_files/Papers/Price%20Cues%20Book%20Chapter%20Jan%203.pdf
[9] Schuman, Evan, “Walmart's Latest Big Data Move”, June 10, 2013, [online]
Available: http://www.fierceretail.com/story/walmarts-latest-big-data-move/2013-06-10
[10] Renee Dudley, Lindsey Rupp, “At Walmart and Other Retailers, Price-Matching Has Its
Perils”, May 9, 2013, [online] Available: http://www.businessweek.com/articles/2013-05-
09/at-walmart-and-other-retailers-price-matching-has-its-perils
ON A HADOOP CLICHÉ: PHYSICAL AND LOGICAL MODELS
SEPARATION
Ionuț HRUBARU
Al.I.Cuza University of Iași
Marin FOTACHE
Al.I.Cuza University of Iași
Abstract. One of the biggest achievements in the database world was the separation of the
logical and physical models. The end user is not concerned at all with the underlying physical
structures: she/he only declares the needed information and does not specify how to get it.
This has been a fundamental rule for decades, fulfilled even by newer data systems (i.e.
NoSQL data stores). However, Hadoop blurs the distinction between the logical and physical
data layers; consequently it might be considered a step back by the database community. In
this paper we examine how and why the physical and logical models are tightly coupled in
Hadoop and the way Hive fills the logical-physical gap. We compare the Hive data model with
the “classical” relational approach. We also emphasize some architectural issues that
might determine the overall performance of the data system.
Keywords: Big Data, Data Independence, Data Models, Hadoop, Hive
JEL classification: L86, M15
1. A brief history of independence in the database world
The first business applications – developed in the early 1960s – stored data in flat/independent
files. Data storage, retrieval and processing were possible only through code. Each program
had to describe the data, and had to explicitly open, loop through, and close the data files.
There was no separation between data and programs [1].
The first Data Base Management Systems (DBMSs) provided, among other features, data
independence, i.e. they made application programmers independent of the details of data
representation and storage [1]. ANSI/SPARC generalized the notion of data model and the
three levels of abstraction in a database – external, conceptual/logical, and physical [1] [2].
Subsequently, the relational model proposed high-level languages for declaring and processing
data [3].
The three-level data architecture was meant to provide two types of independence, the physical
independence and the logical independence. Logical data independence makes database users
agnostic of any changes in the logical structure of data (e.g. tables and attributes). In other
words, logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs. Physical data independence is the
capacity to change the internal schema without having to change the conceptual schema. It
insulates users from changes in the physical storage of the data.
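Logical data independence can be sketched with a small example: an application queries a view (its external schema) while the conceptual schema underneath is restructured without touching the application query. The sketch uses SQLite through Python's standard sqlite3 module; all table, view and column names are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Conceptual schema v1: a single table.
cur.execute("CREATE TABLE customer_v1 (id INTEGER, name TEXT, city TEXT)")
cur.execute("INSERT INTO customer_v1 VALUES (1, 'Ana', 'Iasi')")

# External schema: the application only ever queries this view.
cur.execute("CREATE VIEW customer AS SELECT id, name, city FROM customer_v1")
app_query = "SELECT name, city FROM customer WHERE id = 1"
before = cur.execute(app_query).fetchone()

# Conceptual schema v2: the table is split in two; the view is redefined,
# so the application query stays untouched (logical data independence).
cur.execute("DROP VIEW customer")
cur.execute("CREATE TABLE customer_core (id INTEGER, name TEXT)")
cur.execute("CREATE TABLE customer_addr (id INTEGER, city TEXT)")
cur.execute("INSERT INTO customer_core VALUES (1, 'Ana')")
cur.execute("INSERT INTO customer_addr VALUES (1, 'Iasi')")
cur.execute("""CREATE VIEW customer AS
               SELECT c.id, c.name, a.city
               FROM customer_core c JOIN customer_addr a ON c.id = a.id""")
after = cur.execute(app_query).fetchone()

assert before == after == ('Ana', 'Iasi')
```

The same unchanged query returns the same result before and after the conceptual schema change, which is exactly the insulation the three-level architecture was designed to provide.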
2. Main consequences of logical and physical independence
We argue that data independence is a major feature, which might decide the NoSQL and
Hadoop systems further acceptance. We defend this idea by examining some important
consequences of logical and physical independence for software development and
organizational information systems:
Separation of database professionals from programmers. Flat/independent data file
architectures required an explicit data description in every program that used them [1]. An
average business application could be composed of hundreds or thousands of pieces of code. All the
information extracted from the data required programming [4]. First DBMSs moved the
description of data (the metadata) into a separate file called data dictionary or system catalog.
Further advancements hid the data declaration and processing from the physical
implementation. Today, database professionals are generally not (“classical”) programmers.
Specialization of programming languages for database management. Programming languages
for data declaration, control and manipulation were among the first Domain Specific
Languages. Stripped of many general programming features, they were endowed with simple
and comprehensible commands covering the basic needs of data definition, data processing
and data control. Their success is due mainly to their declarative nature – the data professional
must specify what they want, not how to get it [5].
Open path to standardization and universal acceptance. Logical and physical independence
paved the way towards standardization. In the computing industry, SQL is a singular example
of ubiquitous acceptance and conformity to the standard by almost all major database vendors [5].
More recently, providers of NoSQL and Hadoop systems have realized their products need
the power of high-level programming languages, so they develop frameworks and systems
whose data language resembles SQL [6].
Separation of concerns for database software producers. Once the data model and the data
language were standardized, for many years the database vendors focused on database
performance and creating powerful features addressing database users and professionals. This
separation explains much of the database advancements and maturity.
The empowerment of end-users (democratization of data). SQL was intended to be so clear and
simple that even non-IT professionals could access and deal with structured data by writing
and running SQL queries [7], directly or with the support of various tools (query designers,
report generators). SQL was the first, and is still the most important, tool for users to deal
with their own data at different levels of complexity.
Impedance mismatch. Logical and physical data independence also created or aggravated some
software development problems. Perhaps the most striking is the impedance mismatch.
Designers, on one side, and programmers, on the other side, generally use methodologies and
tools with strong Object-Oriented flavors, such as Rational Unified Process, UML, Java, etc.
The relational model of the database layer in most organizations has a completely different
philosophy for modeling the real world, so in the 1990s the demise of the relational data
model and its replacement with OO seemed natural and imminent. Yet the impedance mismatch
problem persists in software development, and various solutions have been proposed, such as
Object-Relational databases and Object-Relational mapping tools.
Programmers’ (in)competence in exploiting (basic) database features. A more debatable issue
we defend here is the “impedance mismatch” between programmers and databases. Put simply,
by mere observation of a large number of software development projects and professionals, we
cannot help noticing how poor database competency is for most Computer Science /
Software Engineering graduates. Despite thousand-page database textbooks, CS/SE graduates
seem to know almost nothing about real-world database modeling, stored procedures and what
database logic can do within a software application.
The fall of many data models. The failure of much anticipated data models - such as the
Object-Oriented Data Model and the XML Data Model - can be properly explained by the loss
of logical and physical independence those systems entailed. NoSQL and Hadoop technologies might
have the same fate [8] if data independence and high-level languages (for data definition and
manipulation) are not provided.
3. Big Data and the promised land of data management
The Big Data movement (and hype) is usually associated with new ways of processing data
grouped under two banners, NoSQL and Hadoop. Both groups of technologies have vocally
advertised their departure from the relational/SQL data model. But some scholars and
professionals argue for hybrid data persistence which combines the benefits of SQL with the
scalability of NoSQL and Hadoop systems [9].
Although Big Data has undoubtedly been one of the buzzwords of recent years [10], its
meaning is not always clear. Stonebraker [11] focuses the definition on three essential features:
big volume, big velocity, and big variety. Processing Big Data requires using distributed
architectures and algorithms. Most of them are based on the divide and conquer approach,
whereas others rely mainly on sampling. Sources of big data sets are everywhere: transactions,
logs, emails, social media, sensors, geospatial data, etc.
One key aspect of all NoSQL systems is that data modeling has radically changed; in
fact, NoSQL data stores claim to manage schema-less databases [9]. In SQL DBMSs a database
first has to be logically modeled (tables, relationships, constraints); in NoSQL and Hadoop,
data storage starts not with a logical model but with physical blocks.
As a reaction to NoSQL, some major RDBMS producers have started to include columnar
persistence (a NoSQL family) in their products, while simultaneously providing an SQL
interface. On the other side, many NoSQL/Hadoop systems and frameworks now provide
SQL-like query languages.
4. Hadoop or data processing without a logical model
Hadoop technologies promised a simpler and more efficient paradigm for a distributed file
system. Storing and processing huge amounts of data on cheap, commodity hardware was a
breakthrough that helped companies deal with ever increasing volumes of data. Some of
the use-cases where Hadoop works best include: log data analysis, data warehouse
modernization, fraud detection, risk modeling, sentiment (text) analysis, image classification,
etc. Although some experts predicted a dark future for it [12], Hadoop’s ecosystem has
literally exploded, adding features and technologies that address data velocity (streaming)
and variety (integration). Hadoop’s architecture and infrastructure have two main components:
the storage, which is a distributed file system (HDFS), and the processing components, which
in most cases rely on a map-reduce implementation. Recent Hadoop releases remove Map-Reduce
as the single processing platform by introducing a new resource engine (YARN) and a new data
processing framework (Apache Tez) [13].
Working with Hadoop natively involves knowledge of, and interaction with, the underlying
physical storage, as opposed to the relational world. The main characteristics of HDFS are
sequential writing, lack of support for random reads, and non-updatable file content (only
inserts are possible) [14]. That is because Hadoop was intended for batch processing of huge
files, which implies reading data, not updating it.
The second component of Hadoop is Map Reduce, which processes data stored in HDFS. Map
Reduce is a framework/algorithm designed specifically for parallel processing, which goes very
well with the HDFS philosophy of storing data across multiple data nodes (commodity
hardware) in a distributed environment. Map Reduce code (written in Java or another
programming language) runs on each data node in a cluster. It contains two main
functions/interfaces: a Map function processes the input in parallel and produces key-value
pairs, which become the input to a Reduce function that produces the final result.
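The Map/shuffle/Reduce pipeline described above can be sketched in a few lines. A real Hadoop job would implement Mapper and Reducer interfaces in Java, and the shuffle step here stands in for the framework's sort-and-group phase; word count is the customary example:

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit (word, 1) key-value pairs for each word in an input line."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Group the emitted pairs by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: aggregate all values for one key into the final result."""
    return key, sum(values)

lines = ["big data big value", "data value"]
emitted = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())
print(counts)
```

Note that only the two small functions are problem-specific; everything in between (distribution, shuffling, fault tolerance) is what the Hadoop framework supplies, and also what ties the programmer's code so tightly to the shape of the input data.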
Map-Reduce is tightly linked to the input data: the programmer knows what that data is all
about and must design and implement the logic of both the Map and Reduce phases. When the
input data changes, the whole Java program must be updated. We are not discussing only
changes in structure (new keys, new attributes, etc.) but also changes in volume, which can
affect the logic used by a client application to process the data. These kinds of decisions are
taken automatically by an optimizer in the case of an RDBMS, but for Map Reduce it is the
programmer’s responsibility to implement them.
5. Hive: back to independence?
Initially, Hadoop relied only on Map Reduce for data processing. But despite its advantages,
Map-Reduce is awkward for programmers and incomprehensible for non-programmers. Even
the easiest SQL queries translate into long pieces of (mainly Java) Map-Reduce code. Also,
the lack of a metadata repository creates huge problems in data processing and integration.
Processing data close to the physical storage proved not as productive as Hadoop
enthusiasts expected, and the Hadoop myth that for better performance you must know your
data and physical storage is in doubt.
Hive was developed by engineers at Facebook to address their need for ad-hoc queries and for
an information dictionary [15]. Hive brings structure to the data stored in HDFS by managing
the information about entities (metadata) in a metastore. As data in Hadoop is stored in HDFS,
data serialization and de-serialization use rules defined in programs called SERDEs. These
programs (basically Java classes) help Hive interpret the data from HDFS.
Apart from the metastore, the main Hive modules are: the driver (a counterpart of the RDBMS
optimizer, which communicates with Hadoop through Map Reduce jobs that get the data from
HDFS and do the processing); the thrift server (used for integration with other applications via
JDBC or ODBC); the CLI and Web Interface (client components that access the data); and
the Extensibility Interfaces (tools used to define custom functions and SERDEs).
The query lifecycle is pretty much the same as in a relational database server. It goes through
phases such as parsing, generating execution plan using a rule (or cost) based optimizer,
generating the steps (Map Reduce jobs) and executing them.
Hive works with well-known concepts from the relational database world such as tables,
columns, rows and data types. There are primitive data types (string, integer, timestamp, date,
etc.) and collection data types (struct, array, and map). Users can also create custom types, all
implemented in Java and inheriting the underlying behavior.
Besides tables, from a logical perspective there are also partitions. Tables in Hive can be
partitioned, just like in a classical RDBMS, using different columns as criteria to horizontally
split the data. This can significantly improve query performance, since the optimizer can
figure out which partitions must be read to satisfy the search criteria of a query [16]. Unlike
in relational databases, partitioning also has an impact on how the table is physically
stored.
Another storage-related concept is buckets, which are basically hash partitions within the
previously declared partitions; they are useful for sampling and can be a way to control
parallelism at write time. When a table is created, the user can specify the column on which
buckets are created and the number of buckets. One can then execute a query on only a
sample, which is far more effective than scanning the whole table.
Figure 1. Partitioned and bucketed table (left) and Map Reduce script in HiveQL (right)
In the example on the left side of figure 1, inside each partition (defined by the values of
column invoice_date) the column used for clustering (supplier) will be hashed into buckets.
This way one can segment large data sets and improve query performance. Records with the
same supplier will naturally be stored in the same bucket (through the hash), and each bucket
may contain multiple suppliers. Each bucket materializes as a separate file in Hive. Bucketing
is enforced only when reading data, not when inserting, and users must pay attention to ensure
that data is loaded correctly by specifying a number of reducers equal to the number of buckets.
Partitions and buckets impact the HDFS physical layout. So, although Hive offers logical
concepts, it becomes apparent that modeling is still tied to physical design.
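How partitions and buckets shape the physical layout can be sketched as follows. The column names (invoice_date, supplier) follow the figure 1 example, but the path layout and the use of Python's built-in hash() are simplifications: Hive's actual directory naming and hash function differ in detail.

```python
# Sketch of Hive's physical layout under partitioning and bucketing:
# a row goes to one directory per partition value and one file per hash
# bucket. hash() stands in for Hive's own hash function (illustration only).
NUM_BUCKETS = 4

def physical_path(row):
    """Map a row to its (simplified) partition directory and bucket file."""
    partition_dir = f"invoice_date={row['invoice_date']}"
    bucket = hash(row["supplier"]) % NUM_BUCKETS
    return f"/warehouse/invoices/{partition_dir}/bucket_{bucket:05d}"

r1 = {"invoice_date": "2015-04-30", "supplier": "ACME"}
r2 = {"invoice_date": "2015-04-30", "supplier": "ACME"}
r3 = {"invoice_date": "2015-05-01", "supplier": "ACME"}

# Rows with the same partition value and supplier land in the same bucket
# file, so a sampling query can scan one file instead of the whole table.
assert physical_path(r1) == physical_path(r2)
assert physical_path(r1) != physical_path(r3)
```

The point of the sketch is the paper's argument in miniature: the "logical" choices of partition and bucket columns directly determine file and directory placement on HDFS.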
Hive generally loads data from HDFS or from external sources by copying files and then trying
to apply a structure to them for reading purposes. Hive’s biggest strength is HiveQL (Hive
Query Language), an SQL-like language. With HiveQL it is easy to define table structures and
the rules for data serialization (writing) and de-serialization (reading). These rules refer to file
formats (TextFile, SequenceFile, RCFile, ORCFile), record and column formats, compression
algorithms, etc. Again, in Hive one must know the underlying data, unlike in a classical RDBMS
where the decoupling of physical and logical layers is a basic feature. Hive works with schema
on read and, unlike an RDBMS, does not enforce schema on write. Structure is applied when
reading data: if there is a mismatch between the table definition and the actual data, Hive will
return what it can read, showing NULL values for the rest, and throw an error only if reading
is not possible at all.
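Schema on read can be sketched as follows. The schema, delimiter and sample lines are invented; the point is only that structure is applied at read time, with mismatches surfacing as NULLs rather than as write-time rejections:

```python
# Schema-on-read sketch: the "table definition" below is applied only when
# a raw HDFS-style text line is read; missing or malformed fields come back
# as None (shown as NULL in Hive) instead of being rejected at write time.
schema = [("id", int), ("product", str), ("price", float)]

def read_row(raw_line, delimiter=","):
    """Apply the schema to one delimited text line at read time."""
    fields = raw_line.rstrip("\n").split(delimiter)
    row = {}
    for i, (name, caster) in enumerate(schema):
        try:
            row[name] = caster(fields[i])
        except (IndexError, ValueError):
            row[name] = None  # mismatch between data and schema -> NULL
    return row

print(read_row("1,soap,2.49"))       # well-formed line
print(read_row("2,towel"))           # missing price -> None
print(read_row("oops,bread,cheap"))  # bad id and price -> None
```

Nothing stops malformed lines from being written to storage in the first place, which is exactly the contrast with an RDBMS, where the schema is enforced on write.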
HiveQL is used for both DDL and DML operations. Since Hive is in fact another layer over
Map Reduce, all HiveQL commands are translated into Map Reduce jobs executed against
HDFS. This hides the Map Reduce implementation from the end user and allows users to specify
what they want, not how to process the data. But the translated queries can get quite complex
and error prone, which affects performance; some of the simplest queries can take quite a
large amount of time in Hive. HiveQL allows a mixture of declarative and procedural styles:
the user can still write Map Reduce scripts and inject them into the query itself. On the right
side of figure 1 is an example of Map Reduce in Hive.
Currently Hive is, and we argue it will remain, tightly coupled to its underlying layers, Map
Reduce and HDFS. The separation of logical and physical data layers is not enforced in a
system like Hadoop, targeted at processing large files in batches.
6. Discussions, conclusions and future research
Hadoop is not only about HDFS, Map Reduce and Hive. There are many tools and layers
developed to cover areas such as data integration (Sqoop), machine learning (Mahout),
statistics (RHadoop), ETL (Pig), column oriented databases (HBase), scheduling for
workflows (Oozie), making the Hadoop stack evolve to a level that was difficult to predict a
decade ago.
Some renowned scholars [8] do not bet on HDFS's future since, unlike a parallel DBMS, Map
Reduce sends the data across nodes to produce query results, not the other way around.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
362
However, this is not entirely true. Hadoop processes the data locally in the Map phase and
sends data through the network only in the Reduce phase, and only when necessary.
Whether HDFS is faster than an RDBMS or a parallel DBMS is an entirely different
discussion and, although out of the scope of this paper, we can state that it depends on the use
case. We consider that Hadoop will evolve even further, catching up with those use cases
where relational database systems work better and improving in directions beyond batch
processing: streaming and interactive querying.
Hive supposedly brought logical and physical data independence into Hadoop. But this holds
only on the surface, since the language exposes, and insists on, features for interacting with
and manipulating the physical layer directly. There are clauses for serialization and de-
serialization and for file formats, and this is possible because Hive is a schema-on-read
implementation that does not impose restrictions. We end up having a logical data model, but
one used only for projecting structure over a physical file and data block. Bringing processing
so close to the data entails serious problems when the data changes, since the Map Reduce
jobs will need to be updated, either because they no longer produce the expected result or
because performance suffers.
Future directions include testing and analyzing how modeling decisions, both logical and
physical, can affect performance and how, based on specific use cases, architectural decisions
can affect the throughput and latency of a business application.
References
[1] R. Elmasri and S.B. Navathe, Fundamentals of database systems, Boston: Addison-
Wesley, 2011, pp. 9-52.
[2] C.W. Bachman, Summary of current work ANSI/X3/SPARC/study group: database
systems, ACM SIGMOD Record, vol. 6, no. 3, pp.16-39, July 1974
[3] E.F. Codd, A Relational Model of Data for Large Shared Data Banks, Communications of
the ACM, vol. 13, no.6, pp. 377-387, June 1970
[4] A. Silberschatz, M. Stonebraker, and J.D. Ullman, Database systems: achievements and
opportunities, ACM SIGMOD Record, vol. 19, no. 4, pp. 6-22, December 1990
[5] M. Stonebraker, Future Trends in Database Systems, IEEE Transaction on Knowledge and
Data Engineering, vol. 1, no. 1, pp. 33-44, March 1989
[6] D. Abadi et al., The Beckman Report on Database Research, ACM SIGMOD Record, vol.
43, no. 3, pp.61-70, December 2014
[7] D.D. Chamberlin and R.F. Boyce, SEQUEL: A structured English query language, in
Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on Data description,
access and control (SIGFIDET '74), ACM, New York, USA, pp. 249-264
[8] M. Stonebraker, A valuable lesson, and whither Hadoop? Communications of the ACM,
vol. 58, no.1, pp.18-19, January 2015
[9] P. J. Sadalage and M. Fowler, NoSQL distilled: a brief guide to the emerging world of
polyglot persistence. Addison-Wesley, 2012, pp. 133-152
[10] H.U. Buhl, M. Röglinger, F. Moser, Big Data: A Fashionable Topic with(out) Sustainable
Relevance for Research and Practice? Business & Information Systems Engineering, vol.
2, 2013, pp.65-69
[11] M. Stonebraker What Does 'Big Data' Mean?, Communications of the ACM
(BLOG@CACM), September 21, 2012, http://cacm.acm.org/blogs/blog-cacm/155468-
what-does-big-data-mean/fulltext
[12] M. Stonebraker, Possible Hadoop Trajectories, Communications of the ACM
(BLOG@CACM), May 2, 2012, Available at http://cacm.acm.org/blogs/blog-
cacm/149074-possible-hadoop-trajectories/ [Jan 20, 2015].
[13] B. Lublinsky, K. Smith, A. Yakubovich, Professional Hadoop Solutions. Indianapolis:
John Wiley & Sons, 2013
[14] T. White, Hadoop The Definitive Guide. Sebastopol, CA: O’Reilly, 2011;
[15] A. Thusoo, J.S. Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Antony, H. Liu and R.
Murthy Hive, A Petabyte Scale Data Warehouse Using Hadoop, ICDE 2010,
http://infolab.stanford.edu/~ragho/hive-icde2010.pdf [Febr. 20, 2015].
[16] B. Clark, et al. "Data Modeling Considerations in Hadoop and Hive," SAS Institute Inc.,
2013, http://support.sas.com/resources/papers/data-modeling-hadoop.pdf [Jan. 15, 2015].
BIG DATA CHALLENGES FOR HUMAN RESOURCES
MANAGEMENT
Adela BÂRA
The Bucharest University of Economic Studies
[email protected]
Iuliana ȘIMONCA (BOTHA)
The Bucharest University of Economic Studies
Anda BELCIU
The Bucharest University of Economic Studies
Bogdan NEDELCU
The Bucharest University of Economic Studies
[email protected]
Abstract. The article shows how complex human resource recruitment has become and offers
an insight into the selection issues many multinational organizations encounter, especially
nowadays, when a software solution is no longer viable if it has no big data capabilities. The
paper's main objective is to develop a prototype system for assisting the selection of candidates
for an intelligent management of human resources. Furthermore, such a system can be a
starting point for the efficient organization of semi-structured and unstructured data on
recruitment activities.
Keywords: Big Data, Business Intelligence, NoSQL Databases, data mining
JEL classification: C81, D81, D83
1. Introduction
In the context of social network development and ICT challenges, human resource recruitment
and selection in multinational organizations are becoming more complex. At this level, the
flow of information, data and knowledge comes from multiple sources and various systems,
leading to a major effort in the process of extracting, integrating, organizing and analyzing
data for recruitment decision-making. Moreover, the selection process cannot be conducted
effectively merely by studying profiles, resumes and recruitment sites, which present
subjective, heterogeneous information. The paper aims to present intelligent methods for
making the best decisions in human resource selection using Big Data technologies,
optimization techniques and data mining. The solutions will allow the automatic acquisition of
information about applicants from recruitment sites, personal web pages, social networks, and
the websites of academic centers, and will enable decision-making using intelligent
optimization methods. The research motivation stems from the fact that, in the current global
economic crisis, making effective recruitment decisions is a key factor for companies.
Technologies for organizing and processing large volumes of heterogeneous, unstructured
data characterized by high velocity are growing exponentially. The amount of data managed
by recruitment companies and available over the Internet on social networks generates a Big
Data problem. We use intelligent methods for analyzing such data in order to obtain a
competitive advantage in recruitment and thus in business development.
2. Processing HR data from heterogeneous sources
Currently, information on supply and demand in the labor market is stored electronically as
CVs in the form of text databases. These semi-structured data typically come from portals and
recruitment sites. But there is also a huge amount of information on social networks, on the
collaborative platforms of universities and on specialized forums; this data is unstructured. In
order to use both the semi-structured and the unstructured data, it is necessary to apply
methods and techniques for parallel processing, extraction, cleansing, transformation and
integration into a NoSQL database. The difficulty in this case is to analyze and identify the
Big Data solutions and technologies that can be applied for organizing and processing.
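The extraction-cleansing-transformation step can be sketched as follows (hypothetical field names and data; a plain Python list stands in for the NoSQL document store):

```python
# Normalize heterogeneous, semi-structured CV records into a uniform
# document before insertion into the (simulated) NoSQL store.
# Field names are illustrative, not from a real schema.

def transform(raw):
    return {
        "name": raw.get("name", "").strip().title(),
        "skills": sorted({s.strip().lower()
                          for s in raw.get("skills", "").split(";")
                          if s.strip()}),
        "source": raw.get("source", "unknown"),
    }

store = []  # stand-in for a NoSQL collection

raw_records = [
    {"name": " ana pop ", "skills": "Java; hadoop;java", "source": "portal"},
    {"name": "DAN ION", "skills": "SQL"},
]
for r in raw_records:
    store.append(transform(r))

print(store[0]["skills"])  # ['hadoop', 'java'] -- cleaned, deduplicated
```

A real pipeline would run such a transformation in parallel over millions of records; the point here is only the cleansing/normalization contract applied before storage.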
For data analysis, data mining methods can determine patterns and profiles for an optimal
recruitment strategy. But traditional data mining techniques are inadequate for this volume of
data. In most cases, only a small part of all available documents will be relevant for a particular
candidate. Here the difficulty lies in identifying and implementing data mining and text mining
algorithms that compare and rank the documents in order of importance and relevance and
that determine the profiles of candidates for recruitment.
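A crude stand-in for such ranking can be sketched with keyword overlap (hypothetical data and function names; real systems would use the weighted text mining techniques discussed below):

```python
import re

# Hypothetical sketch: rank candidate documents (e.g. CVs) by how
# many of a query's keywords they contain. Data is invented.

def score(document, query_terms):
    words = set(re.findall(r"\w+", document.lower()))
    return sum(1 for term in query_terms if term in words)

def rank(documents, query):
    terms = [t.lower() for t in query.split()]
    return sorted(documents, key=lambda d: score(d, terms), reverse=True)

cvs = [
    "java developer with hadoop and hive experience",
    "accountant with payroll experience",
    "python developer, hadoop, spark, data mining",
]
ranked = rank(cvs, "hadoop data mining")
print(ranked[0])  # the Python/Hadoop/data-mining CV ranks first
```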
Due to the complexity of the technologies to be used and the rapid changes in the labor market,
another challenge is the creation of an architecture that enables the introduction of new data
sources, that is capable of integrating multiple heterogeneous sources, that includes a level of
complex analysis models for determining profiles, and that leads to a knowledge-based
management of human resources. From this point of view, the difficulty lies in choosing the
elements and building a platform that enables efficient parallel processing, extracts timely
information, supports interactive data analysis and satisfies the performance requirements
imposed by the Big Data Analytics paradigm.
The impressively fast-growing amount of data collected and stored on the Internet about the
availability of human resources has exceeded the human ability to understand it without the
help of powerful tools. Thus, instead of being based on relevant information, important
recruitment decisions are made intuitively or subjectively, or based on fixed criteria, without
taking into account the complexity of human nature and behavior. To obtain relevant
information, methods such as multivariate analysis, data mining, and statistical and
mathematical methods that can be applied to large data volumes should be used for data
processing. For these applications, the data must be well organized and indexed so as to
provide ease of use and easy retrieval of information. Recent studies oriented towards
organizing and processing data from recruitment portals [1], [2] refer to the importance of this
analysis for the selection process and to the impact that these techniques have on business
performance.
Regarding the determination of candidate profiles, there are studies published in [7] and [8]
concerning the application of data mining algorithms (decision trees, association rules,
clustering) for the selection of candidates and for determining training methods for recruited
staff. However, these studies do not account for data from social networks and collaborative
platforms, or from sources such as universities or forums. The processing of textual
information and the application of data mining techniques on data from these sources are
increasingly taken into consideration. Numerous text mining methods have been developed,
but they are usually oriented towards document selection (where the query is treated as a
provider of constraints) or document assessment (where the query is used to rank documents
in order of relevance) [3]. The goal is to extract the keywords of a query from the text
documents and to evaluate each document by how well it satisfies the query; in this way the
relevance of a document to the query is evaluated. Another method of classifying documents
is the vector-space model presented in [5] and [6]. It involves representing both documents
and queries as vectors and using an appropriate similarity measure to determine how well a
document vector matches the query vector. Automatic classification is an important point in text
mining, because when there is a large number of documents on-line, the ability to
automatically organize them into classes, in order to facilitate document retrieval and analysis,
is essential.
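A minimal illustration of the vector-space model from [5], [6] (a sketch: term weights here are raw counts rather than the TF-IDF weights a real system would use):

```python
import math
import re
from collections import Counter

# Represent documents and queries as term-frequency vectors and
# compare them with cosine similarity, per the vector-space model.

def vectorize(text):
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

query = vectorize("data mining recruitment")
doc1 = vectorize("data mining methods for recruitment and selection")
doc2 = vectorize("cloud storage pricing")
print(cosine(query, doc1) > cosine(query, doc2))  # True
```

Documents sharing terms with the query score higher, which gives the ranking-by-relevance behaviour described above.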
For software development, there are now business intelligence technologies that can be used.
Also, current developments in information technology have led to the emergence of new
concepts and new ways of organizing and processing systems in order to improve access to
organizations' data and applications. Cloud Computing is an architecture in which computing
power, databases, storage and software applications coexist in a complex and complete
network of servers that provides users with information as a service, accessible via the Internet,
including from mobile devices. Such a flexible architecture, which allows the connection of
several types of subsystems, can be used to create a recruitment platform. There are also Big
Data platforms available in cloud computing architectures that can be used and adapted for
building the proposed prototype.
3. Big Data Solutions
When the structure of data seems randomly designed (variety), when the speed of the data
flow is continuously increasing (velocity), when the amount of information is growing each
second (volume) and when there is additional information hidden in the data (value), only one
solution can be assigned to manage this chaos: big data. This syntagma has been promoted so
heavily by the big software companies that it seems no software solution is viable any longer
if it has no big data capabilities. The truth is that some domains, like telecommunications,
social networks, human resources, etc., are specifically predisposed to the four Vs (variety,
velocity, volume, value). Of course, not only the domain matters. It also depends on whether
the data is historical or not, whether it is supposed to be continuously analyzed, whether it is
involved in decision-making processes, whether it is strategic or secret, and whether it is
structured, semi-structured or unstructured.
The most obvious feature of big data is its volume. More and more people are using smart
devices connected to the Internet, and they are producing data each second. The data is visibly
growing from big to huge volumes. Science now has a solid ground of data for making all
sorts of assumptions based on the data received from patients, clients, athletes, etc. It is a
paradigm that involves our whole universe in gathering, processing and distributing data. It is
important to benefit from this flow of data by storing it properly using big data solutions.
As source [9] states, the two most used Big Data solutions are Cassandra and HBase. Cassandra
is the leader in achieving the highest throughput for the maximum number of nodes [10]. Of
course, there is a downside: the input/output operations take a lot of time. Cassandra was
originally released by Facebook. HBase is part of the Apache Hadoop project and has the
support of Google, being used on extremely large data sets (billions of rows and millions of
columns). Modern technology allows efficiently storing and querying big data sets, and the
emphasis is on using the whole data set, not just samples [11]. Big Data comes hand in hand
with analytics, because the final purpose of collecting the huge amount of data is to process
and analyze it in order to gain information and value. Analytics does not work directly on the
data: data has to be extracted from the database using a specific language and then passed to
analytical tools.
Up until Big Data, the best way to query data from databases was the SQL language, which is
specific to structured relational tables. When data began to be held in NoSQL databases, SQL
became only marginally used in queries. For example, joins are not available in NoSQL
queries. Moreover, it was recently stated (September 2014) that SQL is more important for
Big Data than was thought, with Oracle releasing Big Data SQL, which extends SQL to
Hadoop and NoSQL. This road is only at the beginning.
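Since joins are generally unavailable in NoSQL query languages, they must be performed in the application layer. A sketch (hypothetical data; plain Python dictionaries stand in for two NoSQL collections):

```python
# Two "collections" as lists of documents; the join that the NoSQL
# store does not provide is done by the application: build a lookup
# table on the join key, then merge. All data is invented.

candidates = [
    {"id": 1, "name": "Ana", "skill_id": 10},
    {"id": 2, "name": "Dan", "skill_id": 20},
]
skills = [
    {"skill_id": 10, "skill": "Hadoop"},
    {"skill_id": 20, "skill": "Cassandra"},
]

skill_by_id = {s["skill_id"]: s["skill"] for s in skills}
joined = [
    {**c, "skill": skill_by_id.get(c["skill_id"])} for c in candidates
]
print(joined[0]["skill"])  # Hadoop
```

This hash-join-in-the-client pattern is a common workaround; it also shows why denormalized document design is preferred in NoSQL stores.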
4. Proposed architecture
The proposed architecture for a HR recruitment system can be structured on three levels: data,
models, interfaces. For each of these levels, the following methods and techniques can be used:
• at the data level, the system uses technologies that collect and process data from
web sources, parallel processing algorithms and data organization in NoSQL
databases;
• the model level uses methods and algorithms for text mining and data mining to
build candidate profiles;
• the interface level uses business intelligence (BI) tools to deliver the online
platform.
Figure 1 - HR recruitment architecture
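The three levels can be sketched as a minimal pipeline (all function names and logic are illustrative, not from the actual prototype):

```python
# Minimal three-level pipeline sketch: the data level ingests raw
# records, the model level derives a toy profile, and the interface
# level exposes the top findings as a BI widget might.

def data_level(raw_sources):
    # collect + normalize records from heterogeneous web sources
    return [r.strip().lower() for src in raw_sources for r in src]

def model_level(records):
    # toy "profiling": count keyword occurrences across records
    profile = {}
    for r in records:
        for word in r.split():
            profile[word] = profile.get(word, 0) + 1
    return profile

def interface_level(profile, top=2):
    # expose the most frequent findings for the dashboard
    return sorted(profile.items(), key=lambda kv: kv[1], reverse=True)[:top]

sources = [["Java developer", "java and hadoop"], ["hadoop admin"]]
report = interface_level(model_level(data_level(sources)))
print(report)  # [('java', 2), ('hadoop', 2)]
```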
The impact of a HR recruitment system consists of: facilitating access to the relevant
information substantiating managers' recruitment decisions; minimizing the time needed for
the selection process through easy access to information and its synthesis; and increasing the
relevance of the information that reaches decision makers. The implementation of such a
system provides a competitive advantage in personnel selection, which brings added value to
the company, and will have a major impact in the following ways:
• from an economic point of view - an online platform developed on a Cloud Computing
architecture can lead to an easier organization of the human resources recruitment
activity. The prototype facilitates access to data and reduces the amount of information
that reaches decision makers, thus minimizing the time needed for recruitment
decisions through easy access to information and to template-based profiles. The
results of the platform's development can be applied directly in the economic
environment;
• in social terms - the main beneficiaries of the prototype are managers and candidates. By
using an online scalable platform, company managers can directly select the candidates
and increase the efficiency of the recruitment process, so that future employees will add
value to the company. Also, candidates will be able to publish details of their experience,
training, and social and cultural relations directly through the online platform, providing
links or documents without having to complete CV templates for each type of job on
offer;
• in terms of the environment - using a scalable architecture such as Cloud Computing,
companies will no longer invest in their own hardware, reducing acquisition costs,
energy consumption and data center cooling, thus minimizing the environmental impact.
5. Conclusions
The HR recruitment system can be developed on a flexible Cloud Computing architecture, so
that it can be re-configured for other users by including training and personnel management
services. Determining candidate profiles and the templates that characterize them can be
further improved by introducing new items of interest for the recruitment process. Cassandra
or HBase seem to be the most suitable solutions for this Big Data scenario, which requires the
analysis of a large volume of human resources data in order to obtain profiles.
Acknowledgment
This paper presents some results of the research project: Sistem inteligent pentru predicția,
analiza și monitorizarea indicatorilor de performanță a proceselor tehnologice și de afaceri
în domeniul energiilor regenerabile (SIPAMER), research project, PNII – Collaborative
Projects, PCCA 2013, code 0996, no. 49/2014 funded by NASR.
References
[1] C. Nermey, How HR analytics can transform the workplace,
http://www.citeworld.com/article/2137364/big-data-analytics/how-hr-analytics-can-
transform-the-workplace.html, 2014
[2] eQuest Headquarters, Big Data: HR’s Golden Opportunity Arrives,
http://www.equest.com/wp-
content/uploads/2013/05/equest_big_data_whitepaper_hrs_golden_opportunity.pdf , 2014
[3] C. Győrödi, R. Győrödi, G. Pecherle and G. M. Cornea, Full-Text Search Engine Using
MySQL, Journal of Computers, Communications & Control (IJCCC), Vol. 5, Issue 5,
December 2010, pp. 731-740;
[4] D. Taniar, Data Mining and Knowledge Discovery Technologies, IGI Publishing, ISBN
9781599049618 (2008);
[5] A. Kao and S. Poteet, Natural Language Processing and Text Mining, Springer-Verlag
London Limited 2007, ISBN 1-84628-175-X;
[6] A. Srivastava and M.Sahami, Text Mining: Classification, Clustering, and Applications.
Boca Raton, FL: CRC Press. ISBN 978-1-4200-5940-3;
[7] H. Jantan, A. Hamdan and Z. Ali Othman, Data Mining Classification Techniques for
Human Talent Forecasting, Knowledge-Oriented Applications in Data Mining, InTech
Open, 2011, ISBN 978-953-307-154-1;
[8] L. Sadath, Data Mining: A Tool for Knowledge Management in Human Resource,
International Journal of Innovative Technology and Exploring Engineering (IJITEE),
ISSN: 2278-3075, Volume-2, Issue-6, April 2013;
[9] O’Reilly Media, Big Data Now, O’Reilly, September 2011, ISBN: 978-1-449-31518-4.
[11] T. Rabl, M. Sadoghi and H.A. Jacobsen, Solving Big Data Challenges for Enterprise
Application Performance Management, 2012-08-27, VLDB, Vol. 5, ISSN 2150-8097
[12] S. Siddiqui and D. Gupta, Big Data Process Analytics: A Survey, International Journal of
Emerging Research in Management &Technology, Vol. 3, Nr. 7, July 2014, ISSN: 2278-
9359.
ARCHITECTURE OF SMART METERING SYSTEMS
Simona-Vasilica OPREA
The Bucharest University of Economic Studies
[email protected]
Ion LUNGU
The Bucharest University of Economic Studies
Abstract. Smart metering systems, along with renewable energy sources, are advanced
technologies that represent solutions to the problems of insufficient conventional primary
energy sources, gas emissions, dependency on energy sources located outside the European
Union and other problems related to energy efficiency. At the electricity consumers' level,
unlike in the period before smart metering implementation, consumers may play an active
role by managing programmable consumption, by using storage equipment and by supplying,
as prosumers, the national grid with electricity produced by small wind generators or solar
panels, based on the electricity price. At the grid operators' and suppliers' level, smart
metering systems allow better resource planning, reduce energy losses, estimate the energy
that should be acquired in order to cover losses, eliminate the costs of meter reading and
allow the integration of a higher volume of renewable energy sources. This paper mainly
describes the simplified architecture of smart metering systems, with three distinct levels: the
operational level, the middle level and the top level.
Keywords: architecture of smart metering systems, renewables integration, advanced
management infrastructure, meter data management, data warehouse
JEL classification: Q20, C55, M15
1. Introduction
European strategies regarding the integration of renewable energy sources and the
implementation of smart metering systems have been transposed into Romanian legislation as
national targets. For the implementation of smart metering systems, the Romanian National
Regulatory Authority approved the gradual implementation of these systems by grid
operators, so that by 2020 about 80% of energy consumers should benefit from smart
metering systems.
Smart metering systems are electronic devices that measure electricity consumption, allow
secured bidirectional data transmission from/to electricity consumers and supply more
information than a regular meter, by using telecommunications. Smart metering systems
contain subsystems for metering, subsystems for data transmission and subsystems for
managing the data provided by the meters.
Smart metering systems (SMS) measure electricity consumption hourly (or even every 15
minutes) using electronic metering devices, record the data and send it to a unified and
complex data management system. The heterogeneity of metering devices imposes data
integration components based on universal standards that allow loading and validating data in
a centralized manner.
According to [1] and [2], these components are part of the advanced metering infrastructure
(AMI), which measures, collects and analyzes energy consumption and facilitates
communication with metering devices. AMI includes hardware components,
telecommunication networks, interfaces, controllers for electricity consumers, and software
components for meter data
management – MDM. According to [3], the most important advanced architectures and
infrastructures for smart metering systems are: Roadmap E-Energy/Smart Grid, NIST IOP
Roadmap, IEC SMB SG 3, BMWi E-Energy Program, BDI initiative – Internet der Energie,
Microsoft SERA, CIGRE D2.24, European Union Mandate CEN/CENELEC M/441, etc.
Starting from the common characteristics of these architectures, in figure 1 we present a
simplified architecture of smart metering systems.
Figure 1 - Simplified architecture for SMS
Through integration with grid operators' and electricity suppliers' information systems, and
through their analysis facilities, smart metering systems could be included in the category of
management information systems (MIS). According to [4], MIS are necessary for tactical
decision support, for short-term control and planning. The MIS architecture is structured on
four distinct levels, according to [4] and [5]:
Data level – represents the data sources, the database management system and the
metadata dictionaries. In order to organize data for processing, a couple of stages
must be followed: collecting and validating data from several sources, cleansing and
transforming the data, and loading the data into databases or data warehouses;
Models level – represents the level of the computing models and data analyses
required by management. To obtain the models, mathematical and statistical
methods, data mining algorithms, predictive analyses, etc. can be used;
Interfaces level – contains applications and tools for dynamic data analyses,
presented in an interactive form with flexible analysis facilities;
Telecommunications level – contains the set of interconnection elements for the
previous levels and includes computer networks, communication devices and
support for distributed software.
[Figure 1 components: smart meters connected through concentrators to the Advanced
Metering Infrastructure and Meter Data Management, integrated with the informatics systems
of grid operators and suppliers.]
Starting from this architecture, we structure the SMS architecture on three levels: the
operational level, the middle level and the top level.
2. Operational level
The new measuring devices are smart meters located at the interface between electricity
consumers and the national grid. The operational level contains the data that is evaluated,
processed and integrated into the database. In the initial stage, the data is validated in order to
ensure the quality of the results. From the database, the data is used by the invoicing
application and for sending messages to consumers via the electricity meter or dedicated
portals. We present this process, with its stages, in figure 2.
Figure 2 - The process of data collection from SMS and integration into database
Current problems at this level stem from the incapacity of suppliers' and grid operators'
existing informatics systems to process the data provided by SMS. In the coming years, the
volume of this data will be huge and will require powerful platforms and technologies able to
perform the respective processing. This data is characterized by velocity, variety and veracity,
which impose the necessity of studying and developing Big Data techniques in order to obtain
correct and complex information in real time. Moving from monthly meter readings to
15-minute readings, about 2,900 monthly records will be generated for each single meter.
Taking into account that in Romania there are about 9 million electricity consumption places,
the monthly data volume will increase significantly, which leads to changes in data storage
and processing technologies.
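The figures above follow from simple arithmetic (the per-meter count assumes four readings per hour over a 30-day month):

```python
# Back-of-the-envelope data volume for 15-minute smart metering.

readings_per_hour = 4          # one reading every 15 minutes
hours_per_day = 24
days_per_month = 30

records_per_meter = readings_per_hour * hours_per_day * days_per_month
print(records_per_meter)       # 2880, i.e. roughly 2,900 per month

meters = 9_000_000             # consumption places in Romania
monthly_records = records_per_meter * meters
print(monthly_records)         # 25,920,000,000 records per month
```

Around 26 billion records per month, before adding sensor and appliance data, is what motivates the Big Data techniques discussed above.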
Smart meters have to fulfil minimal functional requirements and transmit at least the
following data: active and reactive energy consumption, consumed/generated active and
reactive power, active and reactive energy supplied to the national grid, etc. This data is
processed at the suppliers'/grid operators' level. From it, consumption profiles are
dynamically generated, and applications for optimising electricity consumption based on the
electricity price are used by consumers. The electricity consumer will play a more and more
active role, being able to optimise consumption. The next step will be the large-scale
installation of smart sensors for detecting the consumption of appliances and the connection
of some appliances to a control interface (smart refrigerators or ovens, heating systems and
air conditioning devices that can be programmed via the Internet, etc.). The concept of the
smart house will generate new challenges regarding the processing and organisation of the
data that comes from the new appliances. The variety and velocity of the data provided by
these appliances impose new technologies for parallel processing and for organising data into
NoSQL databases. Big Data technologies for the smart metering industry are presented in
[6], [7].
All data collected from sensors is processed based on a standard flow [8]. Data collected from
meters, sensors and smart appliances is transmitted via the telecommunication system to the
main server, where cleansing and processing by Apache Hadoop clusters or an in-memory
database take place. After data cleansing and validation, it is transmitted to the analytic
server, where models and algorithms are applied in order to obtain consumption profiles,
optimisation and
other computations regarding energy requirements, electricity prices, etc. The information is
analysed via advanced analytic tools.
3. Middle level
The middle level contains the database and the telecommunication system that ensures data
transmission from consumers to the suppliers'/grid operators' database and from the databases
back to electricity consumers. Data transmission can be done through different technologies:
general packet radio service (GPRS), optic fibre or wireless. Regarding the adopted
communication system, the most accepted solution in countries with high levels of theft
consists of a common infrastructure for several utilities (electricity, natural gas and thermal
energy) with an intermediary level known as middleware, plus concentrators and balancing
meters that rapidly locate theft (figure 3).
Figure 3 - The architecture of a common telecommunication system with intermediary level
In comparison with GPRS technology, whose costs are proportional to the transmitted data
volume, communication via power line carrier (PLC) has the advantage of lower costs. PLC
technology is currently used to monitor the operating state of the National Power System.
Taking into account that the equipment belongs to different grid operators, communication
errors should be avoided through open standard protocols able to facilitate communication
among meters, concentrators and the database.
4. Top level
At this level, the most important elements are the computation models, the estimation of
consumption profiles, the electricity price model, and the analyses and reports for strategic
management. Estimating prosumers' behaviour is a continually changing activity due to the
evolution of appliances, the opportunities for generation from own sources (micro-generation,
solar panels), and social and demographic changes. At the suppliers'/grid operators' level,
consumption estimation is very significant for planning, commercial and market activities.
Thus, a good estimation of consumption will lead to efficient decisions regarding the buying
or selling of electricity at advantageous prices and to setting attractive electricity prices for
consumers.
Applications for advanced analyses and reports for suppliers include interactive interfaces
using dashboards, prediction analyses, what-if scenarios, planning and reporting tools, etc.
These are built with business intelligence technologies. With the implementation of smart
metering systems and sensors, reinventing these applications represents a new challenge,
because business intelligence technologies have to be applied on top of NoSQL databases.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
373
The applications designed for electricity consumers include information such as 15-minute
consumption, the electricity price, alerts and consumption thresholds, comparisons between
their consumption and that of similar consumers located in the same area, and estimations and
predictions of electricity consumption. In Figure 4 we propose an architectural model for a
smart metering system that takes into account the characteristics of the previously presented
levels.
Figure 4 - Architectural model for SMS
The proposed architecture is flexible and can be adapted to new technological trends regarding
smart appliances, metering devices and sensors.
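The consumer-facing features described above (threshold alerts, peer comparison) can be sketched in a few lines; the function names and data layout are our own illustration, not part of the proposed architecture:

```python
# Illustrative sketch of two consumer-application features named in the text:
# flagging 15-minute readings above a consumer-defined threshold, and
# comparing a consumer's total against similar consumers in the same area.

def threshold_alerts(readings_kwh, threshold):
    """Return the indices of 15-minute intervals that exceed the threshold."""
    return [i for i, r in enumerate(readings_kwh) if r > threshold]

def peer_comparison(own_total, peer_totals):
    """Own consumption as a percentage of the peer-group average."""
    avg = sum(peer_totals) / len(peer_totals)
    return 100.0 * own_total / avg
```

For example, a reading series of `[0.2, 0.9, 0.3]` kWh with a 0.5 kWh threshold would raise an alert only for the second interval.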
5. Conclusions
Smart metering systems have been implemented due to their numerous advantages. They allow a
better integration of renewable energy sources by constantly informing consumers about their
electricity consumption options. They also encourage better usage of the national grid by
reducing load peaks and shifting load to off-peak periods. In these circumstances, the
electricity price is a key instrument because it stimulates customers to change their
behaviour.
Like a management information system, a smart metering system has a layered architecture. In
this paper we proposed a three-level architecture for smart metering systems, containing an
operational level, a middle level and a top level, each with the particular characteristics
described above.
At the operational level, the volume of data is tremendous and should be treated using Big
Data techniques. The variety and velocity of the data provided by consumers' meters and
appliances impose new technologies for parallel processing and for organising data in NoSQL
databases. Data collected from meters, sensors and smart appliances are transmitted via the
telecommunication system to the main server, where cleansing and processing by Apache Hadoop
clusters or in-memory databases take place.
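A minimal example of the kind of cleansing step that would run on the main server (a sketch under our own assumptions about the record layout; production pipelines would use Hadoop or an in-memory database, as noted above):

```python
# Hypothetical cleansing sketch for incoming meter records: drop duplicate
# transmissions of the same (meter, timestamp) reading and discard
# physically impossible (negative) measurements.

def cleanse(readings):
    """readings: list of (meter_id, timestamp, kwh) tuples."""
    seen = set()
    clean = []
    for meter_id, ts, kwh in readings:
        key = (meter_id, ts)
        if key in seen:   # duplicate transmission of the same reading
            continue
        if kwh < 0:       # impossible measurement, likely a transmission error
            continue
        seen.add(key)
        clean.append((meter_id, ts, kwh))
    return clean
```

The same logic would be expressed as a map/filter stage in a Hadoop job, applied in parallel across the cluster.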
At the middle level, data transmission can be done over different technologies: general packet
radio service (GPRS), optic fibre, wireless. The most widely accepted solution in countries
with a high level of theft consists of a common infrastructure for several utilities
(electricity, natural gas and
thermal energy) with an intermediary level made up of middleware, concentrators and balancing
meters that rapidly locate theft.
The top level includes interactive interfaces such as dashboards, prediction analyses, what-if
scenarios, and planning and reporting tools, obtained with business intelligence technologies.
The applications designed for electricity consumers include information such as 15-minute
consumption, the electricity price, alerts and consumption thresholds, comparisons with
similar consumers located in the same area, and estimations and predictions of electricity
consumption. The proposed architecture is flexible and can be adapted to the new, uncertain
challenges that come along with smart metering implementation.
Acknowledgment
This paper presents some results of the research project: Sistem inteligent pentru predicția,
analiza și monitorizarea indicatorilor de performanță a proceselor tehnologice și de afaceri
în domeniul energiilor regenerabile (SIPAMER), research project, PNII – Parteneriate în
domeniile prioritare, PCCA 2013, code 0996, no. 49/2014 funded by NASR.
References
[1] D.L. Petromanjanc, O. Momcilovic and I. Scepanovic, Suggested architecture of smart
metering system, Proceedings of the Romanian Academy, Series A, Volume 13, Number 3/2012, The
Publishing House of the Romanian Academy, 2012, pp. 278–285
[2] Electric Power Research Institute, Advanced Metering Infrastructure (AMI), Available:
http://www.ferc.gov/eventcalendar/Files/20070423091846EPRI%20%20Advanced%20M
etering.pdf, February, 2007
[3] Joint Working Group on Standards for Smart Grids - Final Report of the
CEN/CENELEC/ETSI Standards for Smart Grids, CEN/CENELEC/ETSI Joint Presidents
Group (JPG), Available: ftp://ftp.cen.eu/PUB/Publications/Brochures/SmartGrids.pdf,
2011
[4] I. Lungu and A. Bâra, Sisteme informatice executive, Editura ASE, Bucureşti, 2007, ISBN
978-973-594-975-4
[5] M. Velicanu, I. Lungu, I. Botha, A. Bâra, A. Velicanu and E. Rednic, Sisteme de baze de
date evoluate, Editura ASE, 2009, ISBN 978-606-505-217-8
[6] J. Bughin, M. Chui and J. Manyika, Clouds, big data, and smart assets: Ten tech-enabled
business trends to watch, McKinsey Quarterly, Available: http://www.itglobal-
services.de/files/100810_McK_Clouds_big_data_and%20smart%20assets.pdf, 2010
[7] S. Rusitschka, K. Eger and C. Gerdes, Smart grid data cloud: A model for utilizing cloud
computing in the smart grid domain, Smart Grid Communications, 2010,
ieeexplore.ieee.org
[8] M. Courtney, How utilities are profiting from Big Data analytics, Engineering and
Technology Magazine, vol 9, issue 1, http://eandt.theiet.org/magazine/2014/01/data-on-
demand.cfm, 2014
DATA WAREHOUSE PYRAMIDAL SCHEMA ARCHITECTURE -
SUPPORT FOR BUSINESS INTELLIGENCE SYSTEMS
Aida-Maria POPA
Academy of Economic Studies, Bucharest, Romania [email protected]
Abstract. This paper aims to present a new schema for data organization in data warehouses.
The concept of the pyramidal schema is designed to achieve a closer relation between the data
warehouse and Business Intelligence technology. Data orientation on modules and interest plans
has the objective of identifying which data is necessary for each module and which aspects are
pursued by the organization's management for business development.
This ongoing research has been conducted based on existing data warehouse schemas that are
successfully used in business intelligence system architectures. This paper focuses on how the
pyramidal schema can be an efficient solution for the decision-making process at the level of
the organization's management. The subject fits the current development trend of companies, in
which data warehouses are used more and more frequently, together with business intelligence
systems, for storing large volumes of data and for complex analysis.
Keywords: business intelligence, data warehouse, decision-making process, pyramidal schema
JEL classification: D83
1. Introduction
Currently, data warehouses are one of the points of interest of Business Intelligence systems.
Their role is to provide accurate and complete information to the organization's management,
facilitating the development of business performance through a properly founded
decision-making process.
In recent years, information has represented a significant competitive advantage for
organizations in the economic market. When data is more numerous, more detailed and more
accurate, companies can know their target customers better and can respond promptly to their
requirements. They can also extend or, on the contrary, narrow the target group by identifying
the segments that are profitable or unprofitable from an economic point of view.
As the current economic market tends to be more dynamic, business intelligence solutions must
be accurate enough to allow companies to develop effective plans for short-term as well as
medium- and long-term development.
2. Data warehouses
2.1 Some aspects regarding Data Warehouse
A data warehouse is a large collection of data (from hundreds of gigabytes to terabytes) used
as a compact, integrated and complete data source. It supports different types of information
systems (decision support, executive, business intelligence) by providing data stored from
relevant sources (operational data, external files, archives, etc.), with the aim of
supporting the decision-making process in a company.
The data warehouse is a technology that is increasingly present and important in business
operations. Although data warehouse technology has been appreciated for years, its value
continues to increase, being a competent solution in terms of the clients
and the business performance of a company. The widespread use of data warehouses in more and
more fields of activity has brought more efficient operations, improved market intelligence
and new knowledge about enterprise customers [1].
Data warehouses integrate data from many different information sources and transform them
into multidimensional representation for Business Intelligence Systems [2].
A data warehouse is a more complex and larger form of database, usually designed using the
relational model, which contains a large volume of historical data of a certain interest [3].
A fundamental principle of the integrity of the data warehouse is that it is a unique source
of data and information for the entire organization. Data is stored in a single, common form
of representation for all sources (databases, external files, archives, etc.), settling clear
conventions on field names, coding systems, the representation of measurement units for each
attribute and the representation of calendar data, and avoiding duplication of the same fact
from different sources (departments) [4].
2.2 Use of data warehouses for processing information about customers
Information about customers is gathered with every transaction and every activity they
perform. This data is analyzed in order to improve the quality of business processes. Here the
role of the data warehouse is highlighted, because it is a storage method that includes all
the information about customers: operational or transactional data, interaction data, data
about the client profile, and also demographic and behavioral data.
For instance, before having access to a CDB (customer database), the marketing group of
Meredith Publishing Group in the United States could only analyze about 1% of the whole
customer data stored in external databases. With all customer data stored in the company data
warehouse and directly accessible, the marketing group was capable of analyzing and
understanding almost all the available customer data in just a few minutes [5].
Figure 1 - Comparison between percentages of data about customers analyzed using databases and data
warehouses [5]
3. The pyramidal schema for data warehouses
By schema we understand a construction resulting from the design activity, in this case of the
data warehouse. A schema involves establishing the data storage structures and defining the
ways to access data (how information is stored on the technical support). It describes how the
phenomena of the surrounding reality are represented by entities and attributes (data types),
with all the connections (correlations) between them (constraints). Defining the schema is a
modeling activity because it serves to translate real-world entities into abstract terms [6].
Multidimensional modeling uses several types of schemas depending on the ways in which
objects can be arranged within the schema: star schema type, snowflake schema type, data
cubes [7].
The pyramidal schema is a structure with four levels of data organization: it has the data at
the base, while the top of the pyramid stores the results of previously performed queries.
3.1 Specific concepts used in the definition of pyramidal schema
Facts table. In [8] it is considered that the fact table is where the numerical measurements
of the business processes are stored, and the measurements or events are related by foreign
keys to each dimension table. The fact table has a large volume of data, sometimes containing
thousands or millions of rows. A usual query compresses or extracts a large volume of records
into a handful of rows using aggregation. The most important characteristic of the fact table
is its grain, which represents the level of detail at which measurements or results are
stored; it also determines the dimensionality of the data warehouse and can therefore have a
much greater impact on business performance.
Surrogate key. A surrogate key in a data warehouse is a unique identifier for an entity in the
modeled world or an object in the data warehouse. The surrogate key is not derived from
application or real data, which means that it has no business significance or logical meaning.
A surrogate key is defined as a column that uniquely identifies a single row in a table. These
keys have no natural relationship with the rest of the columns in a table and are used in
pyramidal schemas to reduce the number of tables in the model by removing some of the link
tables. Usually surrogate keys take numeric values. The surrogate key value is generated and
stored with the rest of the columns in a row of a table. It is also known as a dumb key,
because there is no significance associated with the value [9].
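The surrogate-key definition above can be illustrated with a small sketch (the class and field names are our own invention): a sequentially generated number with no business meaning, attached to each row as it is loaded.

```python
# Illustrative surrogate-key assignment: a generated numeric identifier with
# no business significance, handed out sequentially as rows arrive.
import itertools

class SurrogateKeyGenerator:
    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def next_key(self):
        return next(self._counter)

gen = SurrogateKeyGenerator()
# The "sk" column carries no logical meaning; it only identifies the row.
row = {"sk": gen.next_key(), "customer_name": "ACME"}
```

In a real warehouse load, the generator would typically be a database sequence or identity column rather than application code.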
Interest modules table. This table, located on the higher level of the pyramidal schema,
contains the main segments (modules) of the company, recording their ID and name, and is
directly related to the facts table. For instance, this table can store the names of
departments or other divisions that have access to the company's data or use its records for
analyses or reports.
Interest plans table. The records of this table have to answer one simple question: "What do I
want to know about the company?". For example, the human resources department wants to know
the evolution of the number of employees, how many of them are qualified for their job, what
the average age of the employees is, etc.
Repartition on interest plans table. This table is used as a link table between the data and
the plans of interest, so it can quickly be known which data is relevant for analyzing a given
plan of interest. To match a data record with the corresponding entry in the interest plans
table, an indexing table is used, in which the tables from the lower level of the pyramidal
schema are recorded.
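The lookup performed through the repartition and indexing tables can be sketched as follows (the table contents and IDs are hypothetical, chosen only to illustrate the mechanism):

```python
# Illustrative sketch of the middle level: the repartition table links
# interest plans to data tables (via the indexing table), so one can
# quickly find which tables are relevant for a given plan of interest.

indexing_table = {1: "employees", 2: "salaries", 3: "sales"}  # table_id -> name

repartition = [  # (interest_plan_id, table_id)
    (10, 1), (10, 2),   # plan 10 (staff evolution) needs employees + salaries
    (20, 3),            # plan 20 (sales volume) needs sales
]

def tables_for_plan(plan_id):
    """Return the names of the data tables relevant for one interest plan."""
    return [indexing_table[t] for p, t in repartition if p == plan_id]
```

Resolving a plan of interest thus reduces to a simple join between the two middle-level tables.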
3.2 Levels on pyramidal schema of a data warehouse
The lower level (data level) is composed of a set of data organized in tables. The number of
tables at this level is reduced by using surrogate keys, so that some of the link tables are
removed.
The middle level (the level of distribution on interest plans of tables and records) contains
the table of distribution on interest plans and the table of indexed tables.
The higher level (the level of the interest modules and plans and the annex tables) contains:
the interest modules table, which records the major segments of the company's organization
(financial module, human resources, production, etc.); the interest plans table, which records
what results are meant to be watched (sales volume, number of employees, salary expenses,
etc.); and the annex tables (measurement units, time measurement units, etc.).
The top level (the facts table level) is a table with a high volume of records and is the
source for analyses and reports. This table is used by the management of the
company in the decision-making process. Using these records along with an efficient business
intelligence system, business performance can be increased, considering that information is
the most important competitive advantage of any modern company.
Figure 2 represents the structure of the pyramidal schema for data warehouses, and Figure 3
shows a database model designed with the specific tables of the pyramidal schema, using
Microsoft Office Visio 2013.
Figure 2 - The structure by levels of pyramidal schema
Figure 3 - Representation of tables from pyramidal schema in a relational database
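The levels above can also be rendered as a minimal relational model; the table and column names below are our own reading of the schema (an illustration, not the exact model of Figure 3), shown with SQLite for concreteness:

```python
# A minimal, assumed relational rendering of the pyramidal-schema tables:
# data tables at the base, repartition/indexing in the middle, interest
# modules and plans at the higher level, and the facts table at the top.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interest_module (module_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE interest_plan   (plan_id   INTEGER PRIMARY KEY, question TEXT);
CREATE TABLE indexed_table   (table_id  INTEGER PRIMARY KEY, table_name TEXT);
CREATE TABLE repartition     (plan_id   INTEGER REFERENCES interest_plan,
                              table_id  INTEGER REFERENCES indexed_table);
CREATE TABLE facts           (fact_id   INTEGER PRIMARY KEY,
                              module_id INTEGER REFERENCES interest_module,
                              measure   REAL);
""")
conn.execute("INSERT INTO interest_module VALUES (1, 'Human resources')")
conn.execute("INSERT INTO facts VALUES (1, 1, 42.0)")
```

Queries for a module then join the facts table with the interest modules table, exactly the path management reports would take.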
3.3 The necessity and advantages of using pyramidal schema
The main novelty brought by the pyramidal schema to organizing data in a data warehouse is the
use of the interest modules and plans tables. This allows distributing the results on the
major segments (departments) and interest plans within a company. Other advantages, common to
all data warehouse schemas, are [4]:
– transferring and sharing the stored results within the organization between different
departments, offices or business partners;
– storing a large volume of data and also storing the results of queries;
– support for the decision-making process of the company's management;
– competitive advantage in the economic market and beyond it;
– strategic organization of data to increase business performance;
– complete and consistent stored data;
– reduced data access and analysis time;
– it can contain data and information on all segments of the company.
BI systems have earned more and more importance in decision-making processes, both in
developed companies and in companies still developing. Organizing data into modules and plans
of interest plays a particular role in the detailed analysis of a department or of a plan of
interest found in certain modules. A company with a complex organizational structure and
economic activity in several industries uses the BI system at a global level, but when these
segments must be analyzed individually, the pyramidal schema allows access to the necessary
data and to the previously obtained results related to the analyzed segment.
4. Conclusions
In the context of the ongoing development of the notion of "intelligent warehousing", a
reorganization of the data in a data warehouse into a schema oriented on modules and plans of
interest can also be brought into question. Using the pyramidal schema may lead us to a new
concept of data warehouse, the "intelligent data warehouse". Organizing the data warehouse
under the pyramidal schema brings users closer to the notion of Business Intelligence, because
the data required for the analysis of each segment pursued by a developing company is
highlighted more easily.
For the applicative part, a simulation with test data will be carried out for the railway
field. The sustainable development of the railway field now depends on the efficient use of
the stored data and information. After testing we will see whether the pyramidal schema is an
optimal solution for areas with a wide range of modules and plans of interest.
References
[1] J. Foley, The Top 10 Trends in Data Warehousing, March 10, 2014. Available:
http://www.forbes.com/sites/oracle/2014/03/10/the-top-10-trends-in-data-warehousing/
[2] G. Satyanarayana Reddy, M. Poorna Chander Rao, R. Srinivasu and S. Reddy Rikkula,
Data Warehousing, Data Mining, OLAP and OLTP Technologies Are Essential Elements
To Support Decision-Making Process in Industries, International Journal on Computer
Science and Engineering (IJCSE), vol. 2, No. 9, pp. 2865-2873, 2010, ISSN 2865-2873.
[3] B. Nedelcu, Business Intelligence Systems, Database Systems Journal, Vol. IV, Issue
4/2013, pg. 12-20, 2013, ISSN 2069-3230. Available:
http://www.dbjournal.ro/archive/14/14_2.pdf
[4] M. Velicanu and Gh. Matei, Tehnologia inteligenta afacerii, Editura Ase, Colectia
Informatica, Bucuresti, 2010, ISBN 978-606-505-311-3.
[5] A. Khan, Dr. N. Ehsan, E. Mirza and S. Zahoor Sarwar, Integration between Customer
Relationship Management (CRM) and Data Warehousing, Procedia Technology 1,
pp. 239–249, 2012, ISSN 2212-0173.
[6] M. Velicanu, Database systems Explanatory Dictionary, Editura Economica, Bucuresti,
2005, ISBN 973-709-114-0.
[7] I. Lungu and A. Bara, Sisteme informatice executive, Editura Ase, Bucuresti, 2007, ISBN
978-973-594-690-6.
[8] I.Y. Song, W. Rowen, C. Medsker and E. Ewen, An Analysis of Many-to-Many
Relationships Between Fact and Dimension Tables in Dimensional Modeling, Proceedings
of the International Workshop on Design and Management of Data Warehouse
(DMDW’2001), Interlaken, Switzerland, June 4, 2001.
[9] D.P. Pop, Natural versus Surrogate Keys. Performance and Usability, Database Systems
Journal, Vol. II, no. 2/2011, ISSN 2069-3230.
BUSINESS INTELLIGENCE FOR HEALTHCARE INDUSTRY
Mihaela IVAN
Bucharest University of Economic Studies
Manole VELICANU
Bucharest University of Economic Studies
Ionut TARANU Bucharest University of Economic Studies
Abstract. Global data production is expected to increase by an astonishing 4,300 per cent by
2020, from 2.52 zettabytes in 2010 to 73.5 zettabytes in 2020 [1]. Big data refers to the vast
amount of data now being generated and captured in a variety of formats and from a number of
disparate sources. Big data analytics is the intersection of two technical entities that have
come together. First, there is big data: massive amounts of detailed information. Second,
there is advanced analytics, which can include predictive analytics, data mining, statistics,
artificial intelligence, natural language processing, and so on. Put together, they give big
data analytics. This paper reviews the Real-Time Healthcare Analytics Solutions for
Preventative Medicine provided by SAP and different ideas proposed by possible customers for
new applications in the healthcare industry, in order to demonstrate that the healthcare
system can and should benefit from the new opportunities provided by ICT in general and big
data analytics in particular.
Keywords: Business Intelligence, healthcare analytics, use-cases, real-time processing.
JEL classification: I15, M15, M21
1. Introduction
The concepts used and presented in this paper are Big Data, which is a challenge nowadays,
In-Memory, which is a new Business Intelligence technology, and Analytics, which is a use
case [2] [3]. Nowadays, it is very important to present the role of Business Intelligence
technology in the healthcare sector.
In [4], Prof. dr. med. Karl Max Einhäupl considered that in a hospital like Charité "we see an
unending stream of data every day and it is unconditionally important that we collect this
data, filter it, control it and reuse it for patient care, or for teaching, or for driving
research. In the medical field, it is critical that we move away from the flood of paper that
is overwhelming doctors today; that we continually move toward electronic data capture." This
means that with the right information at the right time, everything is possible.
2. Healthcare Analytics
When discussing healthcare analytics, it is important to ask what the statistics regarding the
usage of analytics in healthcare look like and how this affects the end user's knowledge.
These numbers are represented in Figure 1 below: about 10% use analytics today and
approximately 75% need analytics [5]. The disadvantage for those who are not using analytics
is that they cannot make use of all their data, because the ability to manage all the data is
becoming difficult. On the other side, those who use analytics today are still missing new
insights, which means they are not able to imagine the full potential.
Figure 1 - Healthcare Analytics
The power of collective insights is realized by following three steps:
Engage: predict supply chain demand and supply;
Visualize: understand the customer's thoughts;
Predict: provide the proper offers and services to every customer, and also predict new market
trends and innovate new products [6].
Healthcare organizational data is used in diverse cases, such as surgical analytics, sharing
healthcare visualizations and having the clinicians share the processes. Profitability and
quality analysis for management can provide the critical insights needed to reach the
organization's goals and gain competitive advantages. Analytical applications are developed to
provide the base for the use of analytics in an enterprise [7].
We must consider that analytics is about people and their needs. The picture below (Figure 2)
shows why this evolution is so important and how people think about it.
Figure 2 - Analytics is for People
The focus is on empathy with end users such as executives, healthcare operations, clinicians,
purchasers and clinical researchers; this is, in fact, the user experience with these tools.
The user experience can be satisfying if the tools have a beautiful UI (user interface) and
are easy to adopt [8].
As presented in Figure 3, in the healthcare industry it is very important to help
organizations measure and improve treatment quality, address growing concerns, better manage
revenue and increase overall satisfaction.
Figure 3 - Information vs. data in healthcare industry
It is not necessary to collect more data; the companies' need is to extract more information,
considering that many companies are already confronted with processing enormous amounts of
data.
In our opinion, the current context of healthcare analytics is about redefining the possible,
while its future evolution can be described in terms such as efficiency, performance, data
quality, and real-time analytics for patients, doctors and medical researchers.
3. Healthcare Analytics Solutions
A Real-Time Healthcare Analytics Solution for Preventive Medicine is a solution developed by
SAP. It lets users see their analytics and use all the functionalities of SAP HANA, which
powers the application. This solution saves time and can be easily customized for any use case
[9] [10] [11].
Below are collected different healthcare use-cases proposed by different customers for new
applications in the healthcare industry.
Acceleration of the most used SAP Patient Management transactions
For transactions like the clinical workstation, the reasons are the following:
many user complaints related to performance;
the transaction is a key one, as it offers a view of all patients of a given ward together
with important data.
Many users work with it and use the refresh function, which creates additional system load.
This use case is currently being implemented. It could be further opened up to multiple
providers in order to give access to patient information of the complete Health Information
Network (this would need to be based on IHE specifications).
Clinical Research Support for cancer patients
This healthcare use-case has the following advantages:
helps medical researchers and physicians by bringing up-to-date clinical and medical
information into research processes;
ability to access all relevant data across organizational boundaries in real time;
analyses of clinical data based on structured as well as unstructured information;
creation of patient cohorts for clinical trials;
quick determination of patient/trial matching;
it could of course be applied to other patients and specialties;
the use case is reflected in the SAP solution in early adoption, "Medical Research Insights";
it could be useful for customers such as the NCT and DKFZ hospitals.
Patient Segmentation
This use-case helps Healthcare Payers to quickly analyze their patient population in order to
determine potential candidates for a disease management program (e.g. diabetes prevention).
Potential customers for this use-case could be Healthcare Payers and Health Insurances.
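The segmentation idea can be illustrated with a toy example; the eligibility criteria and patient fields below are invented for illustration and are not SAP's or any payer's actual rules:

```python
# Purely illustrative patient segmentation for a hypothetical
# diabetes-prevention program: select patients matching assumed criteria.

def diabetes_prevention_candidates(patients):
    """patients: list of dicts with assumed 'id', 'age' and 'bmi' fields."""
    return [p["id"] for p in patients if p["age"] >= 45 and p["bmi"] >= 30]

population = [
    {"id": "P1", "age": 52, "bmi": 31.5},
    {"id": "P2", "age": 30, "bmi": 24.0},
    {"id": "P3", "age": 61, "bmi": 33.2},
]
```

Running such a filter in-memory over the whole insured population is precisely the kind of query an in-memory platform would accelerate.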
Health Plan Analytics
This use-case supports Healthcare Payers in analyzing the effectiveness of their health
programs (e.g. ROI and performance analysis of disease management programs covering morbidity
clustering). As in the previous use-case, potential customers could also be Healthcare Payers
and Health Insurances.
Multi-Resource Planning
This use-case can help Healthcare Providers to quickly recalculate their outpatient schedules
or inpatient surgery plans based on different types of incidents, such as the unavailability
of doctors. Potential customers are Healthcare Providers.
Treatment outcome analysis
This use-case helps Healthcare Providers to analyze their patient treatment outcomes and costs
by considering diagnosis and DRG codes, length of stay, services performed, claims and
revenues. It could also be used to support the contract negotiation process with the payers,
by providing hospitals with information on the real costs for a specific patient group, and
could potentially be extended to a multi-provider or ACO (Accountable Care Organization)
scenario for the US. As in the previous situations, potential customers are Healthcare
Providers.
Evidence-based medicine
Evidence-based medicine (EBM) aims to apply the best available evidence gained from the
scientific method to clinical decision making. The use case is to suggest medical guidelines
based on past patient treatments. Potential customers are Healthcare Providers.
Drug Recall
This use-case provides fast and efficient recall procedures by quickly determining all the
patients who have been administered the drug to be recalled, including their location and
contact information. Potential customers are Healthcare Providers.
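The core of the recall query can be sketched in a few lines; the record layout, drug codes and contact values are assumptions made for the illustration:

```python
# Illustrative drug-recall lookup: given a recalled drug, find every patient
# who was administered it, together with their contact details.

administrations = [  # (patient_id, drug_code) — assumed sample data
    ("P1", "D-100"), ("P2", "D-200"), ("P3", "D-100"),
]
contacts = {"P1": "Bucharest", "P2": "Cluj-Napoca", "P3": "Iasi"}

def patients_to_notify(recalled_drug):
    """Map patient id -> contact for everyone given the recalled drug."""
    return {pid: contacts[pid] for pid, d in administrations if d == recalled_drug}
```

Speed matters here: the faster this lookup runs over the full administration history, the faster affected patients can be contacted.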
Track & Trace of Medical Products
This use-case offers monitoring of the logistics chain of medical and pharmaceutical products,
from the raw material to the point of consumption by the patient, including efficient
counterfeit prevention. Potential customers are Healthcare Providers.
Prevention of Fraud and Abuse
This use-case supports the analysis of incoming claims, in comparison with the claims history,
with the aim of detecting cases of fraud and abuse. It was originally submitted as a HANA
Olympics use case by Jim Brett (Partner Manager for E&Y), with Jim and Steve pushing on
partner development in the US; the GRC cross-industry use case "Instant Compliance" is under
evaluation, and Healthcare has been asked to address its requirements. Potential customers are
Healthcare Payers.
Real-time patient monitoring
This use-case helps monitor patients in real time and triggers alerts for necessary
interventions based on incoming data (e.g. blood pressure). It is an example of a set of use
cases such as:
MEWS (modified early warning score) in the ICU area;
elderly patients at home.
Potential customers are Healthcare Providers.
Determination of copayment rates
This use-case offers an insured patient the possibility to quickly find out which copayment he
would have to make for a given treatment. The service could be offered by a health insurance
through a portal or mobile device to its customers. It would create the required output based
on the insured person's health plan and on the already consumed services. This determination
is data intensive and could be accelerated through HANA. Potential customers are Health
Insurances for their insured persons, or Patients directly.
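As a toy illustration of the determination described above (the plan rules, rates and figures are invented; real plans are far more involved), a copayment could depend on the plan's rate and on how much of the deductible the services already consumed have left unmet:

```python
# Hypothetical copayment calculation: the patient pays the remaining
# deductible plus the plan's copay rate applied to the rest of the cost.

def copayment(treatment_cost, plan_rate, deductible_left):
    from_deductible = min(treatment_cost, deductible_left)
    return from_deductible + plan_rate * (treatment_cost - from_deductible)

# A 1000-unit treatment on a 10% copay plan, with 200 of the deductible
# still unmet: the patient owes 200 + 0.10 * 800.
owed = copayment(1000.0, 0.10, 200.0)
```

Because the answer depends on the full history of consumed services, precomputing or accelerating this lookup per insured person is where an in-memory platform would help.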
Prevention of Claims Rejection
This use-case helps medical controllers or physicians by informing them that a case might be
subject to a payer investigation (e.g. by the MDK in Germany) because of a mismatch between
claims, medical facts and other characteristics such as length of stay, age, etc. Potential
customers are Healthcare Providers.
Table 1 below presents a comparative analysis of the use-cases to be implemented in the
healthcare industry and highlights their key benefits.
Table 1 - Comparative analysis of Healthcare use-cases
Use cases | Potential customers | Key benefits
Acceleration of most used SAP Patient Management transactions | Healthcare industry | Acceleration of transaction processing
Clinical Research Support for cancer patients | NCT and DKFZ hospitals from Munich | Increased patient satisfaction
Patient Segmentation | Healthcare Payers and Health Insurances | Cost savings for hospitals
Health Plan Analytics | Healthcare Payers and Health Insurances | Real-time analysis
Multi-Resource Planning | Healthcare Providers | Time saving for planning
Treatment outcome analysis | Healthcare Providers | Better outcome management
Evidence-based medicine | Healthcare Providers | Better clinical decision making process
Drug Recall | Healthcare Providers | Efficient recall procedures
Track & Trace of Medical Products | Healthcare Providers | Efficient counterfeit prevention
Prevention of Fraud and Abuse | Healthcare Payers | Better fraud prevention
Real-time patient monitoring | Healthcare Providers | Real-time monitoring
Determination of copayment rates | Health Insurances for their insured persons or Patients directly | Efficient budget planning
Prevention of Claims Rejection | Healthcare Providers | Efficient claims management
The solution we propose for the healthcare industry is based on the use-cases presented above
and has the following objectives:
Real-time analysis of hospital patient management data;
Significant speed up of reporting processes;
Monitoring clinical quality of care and patient safety.
4. Conclusions
In conclusion, it is time for change in the healthcare sector. The use of analytics will enable
putting the right data at the fingertips of the people with the potential to generate lifesaving or
lifestyle-improving insights. Big data offers breakthrough possibilities for new research and
discoveries, better patient care, and greater efficiency in health and health care, as detailed in the July issue
of Health Affairs [12]. We believe that big data analytics can significantly help healthcare
research and ultimately improve the quality of life for patients from any domain.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectoral Operational
Program Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields -EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] Big Data Strategy - Improved understanding through enhanced data-analytics capability,
Available at http://www.finance.gov.au/agict/, June 2013
[2] M. L. Ivan, Characteristics of In-Memory Business Intelligence, Informatica Economică,
vol. 18, no. 3, 2014, pp. 17-25
[3] A. Bara, I. Botha, V. Diaconiţa, I. Lungu, A. Velicanu, M. Velicanu, A model for Business
Intelligence Systems’ Development, Informatica Economică, vol. 13, no. 4, 2009, pp. 99-
108
[4] J. Flintrop, E. A. Richter-Kuhlmann, H. Stüwe, Interview mit Prof. Dr. med. Karl Max
Einhäupl, Vorstandsvorsitzender der Charité, Available at:
http://www.aerzteblatt.de/archiv/62999/Interview-mit-Prof-Dr-med-Karl-Max-Einhaeupl-
Vorstandsvorsitzender-der-Charite-Universitaetsmedizin-Berlin-Wir-koennen-uns-keine-
Klinik-leisten-die-nicht-zu-den-besten-gehoert
[5] C. Gadalla, A Technical Guide to Leveraging Advanced Analytics Capabilities from SAP,
Available at: http://www.slideshare.net/SAPanalytics/bi2015-charles-
gadallatechguideleveraginganalytics
[6] SAP HANA Platform, Rethinking Information Processing for Genomic and Medical Data,
Available at: https://www.sap.com/bin/sapcom/en_us/downloadasset.2013-02-feb-11-
20.sap-hana-platform-rethinking-information-processing-for-genomic-and-medical-data-
pdf.html
[7] Business Intelligence and Analytics for Healthcare, Available at:
http://www.perficient.com/Industries/Healthcare/Business-Intelligence-Analytics
[8] Kim Gaddy, Making the most of analytics, Insights, Available at:
http://www.utilityanalytics.com/resources/insights/making-most-analytics
[9] Real-Time Healthcare Analytics Solution for Preventative Medicine, United Software
Associates Inc., Available at: http://marketplace.saphana.com/p/3323
[10] T. Knabke, S. Olbrich and S. Fahim, Impacts of In-memory Technology on Data
Warehouse Architectures – A Prototype Implementation in the Field of Aircraft
Maintenance and Service, in Advancing the Impact of Design Science: Moving from
Theory to Practice, Lecture Notes in Computer Science, Springer, Vol. 8463, 2014, pp.
383-387
[11] Big Data Analytics, September 14, 2011, Available at:
http://tdwi.org/research/2011/09/best-practices-report-q4-big-data-analytics.aspx
[12] Using Big Data To Transform Care, Health Affairs, July 2014, Vol. 33, Issue 7, Available
at: content.healthaffairs.org/content/33/7.toc
STREAMLINING BUSINESS PROCESSES IN ACADEMIA BY
BUILDING AND MANIPULATING A BUSINESS RULES
REPOSITORY
Alexandra Maria Ioana FLOREA
Academy of Economic Studies, Bucharest [email protected]
Ana-Ramona BOLOGA Academy of Economic Studies, Bucharest
Vlad DIACONIȚA
Academy of Economic Studies, Bucharest [email protected]
Razvan BOLOGA
Academy of Economic Studies, Bucharest [email protected]
Abstract. The trend in recent years to regard the university as an actor on the market, which
operates and is managed like a business, is evident. In this context, more and more
universities are interested in increasing the efficiency of their business processes and invest in
the development of advanced software solutions.
Given these issues, we present a research proposal which suggests an approach based on
business rules to streamline the coordination and execution of business processes within a
university. From a scientific perspective, the project aims to address research based on
business rules in academia, an area that has not been addressed so far. It aims to develop
and implement a technique for identifying and formalizing business rules in academia and to
build a business rules repository that is constantly queried/viewed and updated as changes
occur to regulations regarding business processes or to restrictions on structures, activities
and informational flows inside the university. The business processes model and the rules
repository will be developed for the specific case of the Bucharest University of Economic
Studies and in the next stage it will be generalized for other Romanian public institutions.
Keywords: business processes, business rules, business rules repository, university
management
JEL classification: I23, L86, O33
1. Introduction
Modern universities are complex organizations that present a significant number of challenges
in terms of the information systems they implement. Generally speaking, if we analyze
universities from an information standpoint, we can identify a set of subsystems with
independent activities which nevertheless develop a more or less intense information exchange
with other subsystems, generating inevitable interdependencies.
For these subsystems of activities, universities implement a set of software applications that
automate all or part of the business processes involved. The technical infrastructure of
universities is usually made up of a mosaic of applications developed using different
technologies and, in order to ensure trouble-free execution of business flows, the processes
must be integrated into the systems that contain them. This integration can be achieved
in various ways; the most important thing is the existence of an overall view of the functioning
of the university and of its interactions with its partners, the objectives pursued and the rules
governing them.
In this article we present an approach based on business rules to streamline the coordination
and execution of business processes from the Bucharest University of Economic Studies.
2. Business rules in academic process modeling
Business rules represent some of the most important documented knowledge within an
organization. They allow the separation of business logic from processes and operational
applications, allowing business knowledge to be specified in a way that is easy to understand
but can be executed routinely by an automated rules engine.
By researching the current literature, we found that there are no similar, business rules-based
modeling approaches in academia, to date. Current efforts in this field are directed toward the
development of an industrial standard for markup languages to specify business rules on all
levels of the models of the Model Driven Architecture (MDA) [1] such as SBVR (Semantics
of Business Vocabulary and Business Rules Specification) [2] or PRR (Production Rule
Representation) [2]. A number of studies on the definition of transformation models between
different rule specification languages, such as [3], have also been identified. This project
will analyse the current standards and determine which of them are applicable in the specific
case of a university.
Although current approaches existing in industry can be partially reused with the necessary
adaptations, the difficulty lies in formalizing the large amount of tacit or implicit knowledge
and in dealing with a high degree of information fragmentation. Also, the university-specific
business processes mentioned above show particular aspects that require increased attention in
their management.
The development of the business processes model of a university offers a collection of business
processes documented in a standardized fashion, highlights the relationships between business
processes and ensures quality in their execution by observing a set of rules. Existing business
models can be applied to a part of the academic processes, but there are business processes that
are specific to higher education institutions, such as admission, research grants management
and study cycle management.
3. University business process model
The starting point in the identification of business rules will be to build the business processes
model across multiple levels of detail, because only in this way we can capture all types of
behavior, imposed restrictions, interactions, existing interdependencies between processes and
how they call common business rules.
The literature on business process modeling is very rich, especially in the area dedicated to
companies; there are even widely used reference models, such as the SCOR model [4], the
Y-CIM model [5], the template created by SAP [6] or the ITIL model (Information Technology
Infrastructure Library) [7]. In recent years, implementations focused on processes at university
level have emerged, and reference models based on business processes have even been
proposed, such as the one developed by Svensson and Hvolby in 2012 [8]. Such models can be
used as a starting point in developing process-oriented solutions specific to the Romanian
university environment, while taking into account differences in the organization and
development of processes and, implicitly, in how business rules are applied.
An important problem of the business rules approach in academia is communication: the
interaction with all the software applications used within the university. The process model
will identify precisely where and when certain business rules apply; for maintaining the
repository, a management interface will be developed and a security policy will be defined to
restrict access and modification rights and to identify those responsible for keeping the
information up to date.
An example of how a higher education business process contains unique activities that have no
counterpart in existing business processes is detailed in Figure 1, which shows the main
activities that form the students' evaluation process.
Figure 1 - Students’ evaluation process
For this particular process we can identify a number of rules that are in place but aren’t
necessarily documented in a controlled manner.
The license commission’s president must be either a full professor or an associate
professor.
The members of the license commission must be at least lecturers who obtained their
PhDs.
Enrolment in the 2nd year can be done if at least 15 credits were obtained.
Enrolment in the 3rd year can be done if at least 60 credits were obtained, of which at
least 15 from the 2nd year of study.
Enrolment for the license exam must be done at least 8 months prior to the exam date.
The results of the exams must be communicated to the students the next day for oral exam
and in 4 working days for written exams.
Enrolment in the 3rd supplemental year can be done only with tax and for a maximum of
three consecutive times.
These are just a few examples of how an academic business process is governed by rules; as
such, the rules must be adequately documented, stored and managed through a repository.
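To illustrate how such rules could be documented in an executable form, two of the enrolment rules above are sketched below as declarative predicates over a hypothetical student record; the field names are assumptions:

```python
# Two of the enrolment rules, written as declarative, testable
# predicates over a (hypothetical) student record.
RULES = {
    # "Enrolment in the 2nd year ... if at least 15 credits were obtained"
    "enrol_year_2": lambda s: s["credits_year1"] >= 15,
    # "Enrolment in the 3rd year ... at least 60 credits ...
    #  of which at least 15 from the 2nd year"
    "enrol_year_3": lambda s: s["total_credits"] >= 60
                              and s["credits_year2"] >= 15,
}

def check(rule_name, student):
    """Evaluate a named rule against a student record."""
    return RULES[rule_name](student)

student = {"credits_year1": 20, "credits_year2": 18, "total_credits": 62}
print(check("enrol_year_2", student))  # True
print(check("enrol_year_3", student))  # True
```

Keeping the rule text next to its predicate, as in the comments above, is one simple way to connect the documented regulation to its executable form.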
4. Rules repository
The overall objective of the research is to build and handle a repository of business rules that
govern business processes in a university. This repository is intended to be an exhaustive
classification that contains and allows the management of all the business rules that operate in
a university, including both academic and economic processes. Also, this repository must be
designed so that it could be used in further research as a core element of a rules engine, through
which business rules will be implemented. To attain this objective, we will pursue a series of
specific objectives:
O1: Designing the conceptual model of business processes at a university.
In investigating the business processes we follow the indications of the Lean methodology and
in order to build the business processes model we use the notations provided by Business
Process Model and Notation version 2.0, the latest version standardized by the Object
Management Group (OMG), thus building models based on modern approaches to data
visualization. As an original scientific contribution, when modeling the business rules we seek
to define a technique for identifying and formalizing business rules in academia.
O2: Developing a conceptual model of business rules that govern the previously modeled
processes.
At the design stage, we keep in mind that the rules stored in the repository must be relevant,
atomic, precise, declarative, reliable, authentic, unique and consistent. The repository will be
designed so as to store not only the rules and their characteristics but also the information
captured in the rules model regarding the associations between rules and processes. It must
also provide the characteristics needed to store historical versions of the sets of rules.
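A minimal sketch of these repository structures, using an in-memory SQLite database; all table and column names are assumptions, not the project's actual design:

```python
import sqlite3

# Rules with their characteristics, rule-process associations, and
# support for historical versions of rule sets.
ddl = """
CREATE TABLE rule (
    rule_id    INTEGER PRIMARY KEY,
    name       TEXT UNIQUE NOT NULL,   -- rules must be unique
    statement  TEXT NOT NULL,          -- atomic, declarative formulation
    version    INTEGER NOT NULL DEFAULT 1,
    valid_from TEXT NOT NULL,          -- supports historical versions
    valid_to   TEXT                    -- NULL = current version
);
CREATE TABLE process (
    process_id INTEGER PRIMARY KEY,
    name       TEXT UNIQUE NOT NULL
);
CREATE TABLE rule_process (            -- associations from the rules model
    rule_id    INTEGER REFERENCES rule(rule_id),
    process_id INTEGER REFERENCES process(process_id),
    PRIMARY KEY (rule_id, process_id)
);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
conn.execute("INSERT INTO process(name) VALUES ('student_evaluation')")
conn.execute(
    "INSERT INTO rule(name, statement, valid_from) VALUES (?, ?, ?)",
    ("enrol_year_2", "At least 15 credits required", "2015-01-01"),
)
conn.execute("INSERT INTO rule_process VALUES (1, 1)")
print(conn.execute("SELECT COUNT(*) FROM rule_process").fetchone()[0])  # 1
```

Versioning a rule would then mean closing the current row (setting `valid_to`) and inserting a new row with an incremented `version`, so past rule sets remain queryable.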
O3: Creating a prototype that implements the rules repository and a management interface.
The advantages of implementing a centralized repository of business rules in a university are
numerous. First, maintenance is much improved: the pace of change of business rules differs
from that of business processes, and changes can be made in a unitary, centralized way.
Business rules are separated from the implementation of business processes. The analysis and
design of the business processes model that precedes the development of the rules repository
allows a precise identification of areas of responsibility and of possible redundancies in
information flows. Also, the flexibility of consulting and updating the identified set of rules,
as well as its visibility, should be much improved.
The risks identified regarding the proposed models and prototype development include, among
others: cumbersome analysis of the processes, due to their extremely numerous activities; an
incomplete business rules model, because many rules are informal and unspecified; and delays
in planned activities.
5. Conclusions
So far, there have been very few attempts by research groups or software vendors to provide
models and solutions that address all the activities in a university, describing the academic
business processes from one end to the other.
Moreover, while the scope of academic processes has been addressed in other investigations,
there are no similar approaches based on the modeling of business rules with application in
academia. So far, no business rules engine or business rules repository has been built for the
rules that govern academic work processes. Although a good knowledge and understanding of
these rules is essential for the smooth running of specific activities, a university is governed
by a vast number of such rules (hundreds), and
they are both formal and informal and, therefore, difficult to fully understand; so far there has
been no attempt to manage them with a rules repository.
In this context, we propose research with a high degree of originality and innovation,
addressing a topic that has not been previously studied: to develop and implement a technique
for identifying and formalizing business rules in academia and to build a business rules
repository that is constantly queried/viewed and updated as changes occur to regulations
regarding business processes or to restrictions on structures, activities and informational flows
inside the university.
Acknowledgement
This paper was co-financed from the European Social Fund, through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/156/1.2/G/137499 "Developing and modernizing the curricula in the Business
Informatics field (DEZIE)", beneficiary "Clubul Informaticii Economice - Cyberknowledge
Club".
References
[1] Object Management Group, MDA Guide version 1.0.1, June 2003,
http://staffwww.dcs.shef.ac.uk/people/A.Simons/remodel/papers/MDAGuide101Jun03.pd
f
[2] Object Management Group, Production Rule Representation version 1.0 (PRR), Request
for Proposal, December 2009, http://www.omg.org/spec/PRR/1.0/PDF/
[3] M.H. Linehan, “Semantics in model-driven business design”, IBM T.J. Watson Research
Center, 2006, http://ceur-ws.org/Vol-207/paper08.pdf
[4] SCOR Model, http://www.supplychainopz.com/2011/01/scor-model-for-supply-chain-
improvement.html
[5] A.W. Scheer, W. Jost and Ö. Güngöz, “A Reference Model for Industrial Enterprises”,
chapter 8 in “Reference Modeling for Business Systems Analysis”, IGI Global, 2007, p.
167-181
[6] B. F. Dongen, M.H. Jansen-Vullers and H.M.W.Verbik, Verification of the SAP reference
models using EPC reduction, state-space analysis, and invariants, Computers in industry,
vol 58, issue 6, august 2007, p. 579-601
[7] https://www.axelos.com/itil
[8] C. Svensson, H.H. Hvolby, “Establishing a business process reference model for
Universities”, Procedia Technology nr 5, 2012, p. 635-642
ENHANCING THE ETL PROCESS IN DATA WAREHOUSE SYSTEMS
Ruxandra PETRE
The Bucharest University of Economic Studies [email protected]
Abstract. In today's competitive world, the amount of data being collected is increasing
dramatically, and organizations use data warehousing solutions to analyze the data and
discover the relevant information contained in it. Therefore, the need for new and innovative
solutions to integrate data from various sources into data warehouses is very high.
This paper focuses on the importance of the ETL (Extract, Transform and Load) process in
data warehousing environments, highlighting my proposed ETL architecture for these
environments. The paper also presents a case study on enhancing the ETL experience by using
ODI (Oracle Data Integrator), a data integration platform provided by Oracle.
Keywords: ETL, Data Warehouse, Architecture, Oracle Data Integrator
JEL classification: C88, L86
1. Introduction
In recent years, organizations have been facing a dramatic increase in the volumes of data
collected and stored in their systems. To address the challenge posed by this, organizations
started to use data warehousing solutions that allow better analysis of the data and the discovery
of the relevant information contained in it. Therefore, the need for new and innovative
solutions to integrate data from various sources into data warehouses is very high.
Data needs to be loaded regularly to the data warehouse, in order to fulfill its purpose of
providing a consolidated and consistent data source, used for analysis and reporting. In order
to achieve this, data needs to be retrieved from the operational systems of the enterprise, as
well as from other external data sources, and loaded into the data warehouse.
Data warehousing solutions need to provide the means of integrating the data extracted from
the various source systems. The process used for integration and consolidation of the data of
the organization into the data warehouse, is the ETL (Extract, Transform and Load) process.
During ETL, the data is retrieved from the source systems, business and validation rules are
being applied to it, and then it is loaded to the data warehouse, delivering a unified view of the
enterprise data.
Consequently, an efficient, scalable and reliable ETL process is a key component for a
successful implementation of a data warehouse solution.
2. Overview of the ETL process
“The Extract-Transform-Load (ETL) system is the foundation of the data warehouse. A
properly designed ETL system extracts data from the source systems, enforces data quality and
consistency standards, conforms data so that separate sources can be used together, and finally
delivers data in a presentation-ready format so that application developers can build
applications and end users can make decisions.” [1]
An ETL process comprises the following three phases of data processing in a data warehouse:
Extract – covers the retrieval of the required data from the source systems and making
it available for performing calculations;
Transform – applies integrity and business rules to transform the data from the
sources into the format of the target data warehouse;
Load – ensures that the transformed and aggregated data is loaded to the data
warehouse.
A variation of the ETL process is the ELT (Extract, Load and Transform), which involves
loading the data to the data warehouse, directly from the source systems, and transforming it
there.
Through ELT, “the extract and load process can be isolated from the transformation process.
This has a number of benefits. Isolating the load process from the transformation process
removes an inherent dependency between these stages. In addition to including the data
necessary for the transformations, the extract and load process can include elements of data
that may be required in the future. Indeed, the load process could take the entire source and
load it into the warehouse.” [2]
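The ELT ordering can be sketched as follows: the source rows are loaded untouched into the warehouse, and the transformation runs afterwards, inside the warehouse, as SQL. The table names and the aggregation rule below are illustrative assumptions:

```python
import sqlite3

# ELT sketch: load first, transform later, inside the warehouse itself.
wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE raw_sales (product TEXT, amount REAL)")

# Load: the source data is copied as-is, with no upfront transformation
source_rows = [("widget", 10.0), ("widget", 15.0), ("gadget", 7.5)]
wh.executemany("INSERT INTO raw_sales VALUES (?, ?)", source_rows)

# Transform: runs in the warehouse, decoupled from extract-and-load
wh.execute("""
    CREATE TABLE sales_summary AS
    SELECT product, SUM(amount) AS total
    FROM raw_sales GROUP BY product
""")
print(wh.execute("SELECT * FROM sales_summary ORDER BY product").fetchall())
# [('gadget', 7.5), ('widget', 25.0)]
```

Because `raw_sales` keeps the untouched source rows, new transformations can later be derived from it without re-extracting, which is exactly the decoupling benefit the quote describes.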
During the execution of the ETL process, various steps are performed. The main such steps, in
my opinion, form the ETL cycle displayed in Figure 1 below:
Figure 1 - ETL process cycle
The ETL cycle consists of the following nine steps, divided between the three main components
of the process:
I. Extract:
1. Build dataset – covers the identification of the reference data that needs to be
extracted from the source systems;
2. Validate data – involves applying validation rules to clean and correct the data that
will be extracted;
3. Retrieve data – consists of copying the validated data to an intermediary layer, named
staging layer, where transformations will be performed upon it;
II. Transform:
1. Check integrity – involves applying integrity rules to the data loaded to the staging
area;
2. Perform calculations – applying business rules to calculate new measures on the
extracted data;
3. Aggregate data – implies performing aggregations upon the raw data;
III. Load:
1. Load data – covers the loading of the data from the staging layer to the data
warehouse;
2. Process data – implies performing further post-loading transformations upon the data;
3. Publish data – involves publishing the data to the target tables used for querying and
reports by the business users through Business Intelligence tools.
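The nine steps above can be sketched as three small functions chained into one cycle; the data shapes and rules below are illustrative assumptions, not the paper's implementation:

```python
def extract(source):
    """Extract: build dataset, validate data, retrieve to staging."""
    dataset = [r for r in source if r.get("id") is not None]  # build dataset
    validated = [r for r in dataset if r["qty"] >= 0]         # validate data
    return list(validated)                                    # retrieve to staging

def transform(staging, price=2.0):
    """Transform: check integrity, perform calculations, aggregate."""
    for r in staging:
        assert r["qty"] >= 0                                  # check integrity
        r["revenue"] = r["qty"] * price                       # perform calculations
    total = sum(r["revenue"] for r in staging)                # aggregate data
    return staging, total

def load(rows, total, warehouse):
    """Load: load data, process data, publish data."""
    warehouse["facts"] = rows                                 # load data
    warehouse["total_revenue"] = round(total, 2)              # process data
    warehouse["published"] = True                             # publish data
    return warehouse

source = [{"id": 1, "qty": 3}, {"id": 2, "qty": -1}, {"id": None, "qty": 5}]
wh = load(*transform(extract(source)), warehouse={})
print(wh["total_revenue"], len(wh["facts"]))  # 6.0 1
```

Of the three source rows, one fails the dataset check (missing id) and one fails validation (negative quantity), so a single fact reaches the warehouse.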
In recent years, software companies have developed many commercially available ETL tools.
The main such tools are:
Oracle – Warehouse Builder (OWB) and Data Integrator (ODI);
IBM – InfoSphere DataStage;
Microsoft – SQL Server Integration Services (SSIS);
SAP – Data Services.
Enterprises use ETL tools to help implement ETL processes, because such tools reduce
development effort and ensure reliable performance.
3. ETL architecture for data warehousing
Data integration through ETL processes into data warehouse systems provides the data used
for performing complex analysis in order to support decision making in the organization. My
proposal of architecture for the ETL process, displayed in Figure 2 below, distinguishes
between the three phases of the process:
Figure 2 - ETL process architecture
According to the architecture, the raw data is selected from various data sources, both internal
and external to the enterprise, validated and then copied to the staging layer. These compose
the extraction phase in the ETL process.
After the data is retrieved from the source systems to the staging area, integrity rules are applied
on it in order to clean the raw data. New measures are calculated through business rules that
are applied as part of the transformation phase. The detailed data extracted from the sources is
aggregated in the staging layer.
The transformed data is loaded to the data warehouse. Further post-loading transformations are
performed upon the data and afterwards it is published to the target tables used as source for
the reports and analysis performed by the business users through Business Intelligence tools.
4. Case study: Enhancing ETL with Oracle Data Integrator
Oracle Data Integrator is a widely used data integration platform that provides an efficient
tool for defining transformation and loading rules.
“Oracle Data Integrator provides a unified infrastructure to streamline data and application
integration projects. Oracle Data Integrator employs a powerful declarative design approach to
data integration, which separates the declarative rules from the implementation details”. [3]
The structure of the data warehouse, as well as of the source databases and the staging layer,
must be mapped in the data model section of ODI. A new ETL project can then be created, in
which the ETL mappings are defined.
The source code is generated based on the defined mappings using Knowledge Modules
selected for each ODI mapping interface.
“Knowledge Modules (KMs) are code templates. Each KM is dedicated to an individual task
in the overall data integration process”. [4]
There are six types of Knowledge Modules: Reverse-engineering KM (RKM), Check KM
(CKM), Loading KM (LKM), Integration KM (IKM), Journalizing KM (JKM) and Service
KM (SKM). [5]
The steps defined for one of the ODI Knowledge Modules used for the case study, LKM SQL
to Oracle, are displayed in Figure 3 below:
Figure 3 - Loading Knowledge Module
Among the most important components of an ODI project for the ETL process are the mapping
interfaces. An interface contains the extraction, transformation and loading rules for each table
in the data model. In the example presented in Figure 4 below, the target table FACT_SALES
is fed with data from the source tables SALES and CUSTOMERS, which are joined to provide
the data in the requested format.
Figure 4 - Mapping for Sales fact table
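The effect of such a mapping can be sketched outside ODI as a plain SQL join feeding the fact table; the column names below are assumptions, not the actual ODI model:

```python
import sqlite3

# SALES joined to CUSTOMERS to feed FACT_SALES (illustrative schema).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE sales (cust_id INTEGER, amount REAL);
    CREATE TABLE customers (cust_id INTEGER, region TEXT);
    INSERT INTO sales VALUES (1, 100.0), (1, 50.0), (2, 80.0);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
    -- the mapping: join the sources and aggregate into the fact table
    CREATE TABLE fact_sales AS
        SELECT c.region, SUM(s.amount) AS total_amount
        FROM sales s JOIN customers c ON s.cust_id = c.cust_id
        GROUP BY c.region;
""")
print(db.execute("SELECT * FROM fact_sales ORDER BY region").fetchall())
# [('EU', 150.0), ('US', 80.0)]
```

In ODI this join and aggregation are declared graphically in the interface, and the equivalent SQL is generated by the selected Knowledge Modules.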
Also, the ODI interface models the flow for loading the data. The flow for loading
FACT_SALES using LKM SQL to Oracle is shown in Figure 5.
Figure 5 - Flow for loading Sales fact table
In order to load data to the target tables from the sources, the mapping interface must be run in
ODI. Figure 6 shows the successful execution of MAP_SALES interface.
Figure 6 - Execution of the Sales loading interface
The most important benefit of using Oracle Data Integrator is the consistent and scalable means
of developing the ETL process. ODI provides an easy to use GUI that ensures fast learning of
the tool, and a means of having centralized ETL processes.
5. Conclusions
In recent years, due to increasing data volumes, enterprises have increasingly needed new and
innovative solutions to integrate data from various sources into data warehouses.
In this paper we focused on the importance of the ETL process in data warehousing
environments and the steps that are performed when such a process is executed. An architecture
of the ETL process is proposed, which highlights the tasks that must be performed for each
phase of the ETL.
I presented a case study on the data integration platform provided by Oracle: ODI (Oracle Data
Integrator). ODI provided the means for improving both the development and the maintenance
of the ETL process, by offering a graphical interface to build, manage and maintain data
integration tasks.
References [1] R. Kimball and J. Caserta, The data warehouse ETL toolkit: practical techniques for
extracting, cleaning, conforming, and delivering data, USA, Wiley Publishing, 2004
[2] R. J. Davenport, ETL vs ELT, White Paper, June 2008, Available:
http://www.dataacademy.com/files/ETL-vs-ELT-White-Paper.pdf [March 14, 2015]
[3] Oracle, Oracle Fusion Middleware Getting Started with Oracle Data Integrator, Release
11g (11.1.1), USA, September 2010
[4] Oracle, Fusion Middleware Knowledge Module Developer's Guide for Oracle Data
Integrator, Available:
http://docs.oracle.com/cd/E28280_01/integrate.1111/e12645/intro.htm [March 16, 2015]
[5] Oracle Data Integrator, Available: www.oracle.com [March 15, 2015]
SOFTWARE DEVELOPMENT METHODOLOGY FOR INNOVATIVE
PROJECTS - ISDF METHODOLOGY
Mihai Liviu DESPA
Bucharest University of Economic Studies [email protected]
Abstract. The paper tackles the issue of formalizing a software development methodology
dedicated to building innovative web applications. The concept of innovative web application
is defined and its specific requirements are highlighted. Innovation is depicted from the end-
user, project owner and project manager’s point of view. The concept of software development
methodology is defined. Current software development models are presented and briefly
analysed. Strengths and weaknesses are depicted and the need for a dedicated innovation
oriented software development methodology is emphasized. The requirements of a software
development methodology are identified by reviewing current scientific computer science
publications. Elements of a software development methodology are also identified by reviewing
scientific literature. The ISDF software development methodology is illustrated by presenting
a case study performed on the ALPHA application. The development life cycle is depicted as
being the basis of every software development methodology. The development life cycle
employed in the ALPHA application is submitted for analysis. Each stage of the software
development cycle is described and characteristics specific to the ISDF software
development methodology are highlighted. Artefacts generated by the ISDF software
development methodology in the ALPHA project are submitted for analysis. The ISDF software
development methodology is formalized by presenting its key components: roles, skills, team,
tools, techniques, routines, artefacts, processes, activities, standards, quality control,
restrictions and core principles. Conclusions are formulated and new related research topics
are submitted for debate.
Keywords: software development methodology, innovation, project management
JEL classification: L86
1. Introduction
The research efforts and results presented in the current paper apply exclusively to web
applications. Though they might apply to other categories of software applications or to other
fields altogether, they were validated only in the context of web applications. From the end-
user's point of view, a web application is considered to be innovative if it is easier to use, faster,
cheaper, more reliable or more secure than other applications that accomplish the same results,
or if it fulfils a need that has yet to be addressed in the online environment. In the context of the
end-user, innovation targets the fulfilment of a specific need.
From the project owner's point of view, a web application is considered innovative if it:
- includes at least one functionality that generates added value for the end-user, and that functionality is not found in other web applications that target the same market;
- includes a combination of functionalities that generates added value, and the combination is not found in the same configuration in any other web application that targets the same market; the functionalities can be found separately in other web applications, but not in the same configuration;
- provides access to a graphic interface that includes elements or element combinations which improve the user experience and are not found in other web applications that target the same market.
In the context of the project owner, innovation focuses on market characteristics and targets
novelty and added value. From the project manager's and the development team's point of
view, a web application is considered to be innovative if it includes functionality that they have
never implemented before. In the context of the project manager and the project team,
innovation focuses on the degree of novelty of the current application compared to previously
implemented applications.
This paper focuses on the perspective of the project manager and the project team regarding
innovative web applications. Research and the author's own experience in the field of software
development led to the conclusion that innovative web applications are characterized by
frequent changes of specifications, high dynamics of technology and standards, higher than
usual risks, a specific cost structure and custom testing scenarios. Thus, the research
hypothesis of the current paper is that building an innovative web application requires
a dedicated software development methodology.
A software development methodology is an effort to standardize the set of methods, procedures
and artefacts intrinsic to the software development life cycle [1]. The software development
methodology illustrated in the current paper is called the Innovative Software Development
Framework and will be referred to by the acronym ISDF. The methodology was developed
based on practices employed by the author in innovative IT projects he personally managed over
the last 5 years. The initial methodology was built empirically around the development life
cycle and was refined and formalized by integrating additional elements identified by
reviewing scientific papers. The resulting methodology was tested and validated in the
successful implementation of three innovative software development projects. The ISDF
methodology is depicted in the current paper by presenting a case study performed on one of
the above-mentioned projects. In order to comply with confidentiality contract clauses and to
protect the project owner's identity, data is anonymized and the project will be referred to by the
acronym ALPHA.
2. Literature Review
Current software development methodologies are divided into heavyweight and lightweight.
As part of the literature review process, heavyweight and lightweight methodologies were
analysed with an emphasis on epitomizing their overall structure, positive attributes, negative
attributes and the types of projects they are suitable for.
Heavyweight methodologies follow the waterfall model and rely on detailed planning,
exhaustive specifications and detailed application design. The waterfall model is predictable,
generates comprehensive software artefacts and diminishes the risk of overlooking major
architectural problems [3]. The waterfall model is typically described as unidirectional and top-down
[6], as every phase begins only after the previous phase has been completed [7]. The output of
one phase becomes the input for the next phase [7]. The central figure of the waterfall model is the
project plan [11]. Waterfall development entails high effort and costs for writing and approving
documents, difficulties in responding to change, unexpected quality problems and schedule
overruns due to testing being performed late in the project, and a lack of project owner feedback
[3]. Other issues specific to the waterfall model are that systems often do not reflect
current requirements and that lead time is often generated by the need to approve software artefacts.
The waterfall model also pushes high-risk and difficult elements to the end of the project,
aggravates complexity overload, encourages late integration and produces unreliable up-front
schedules and estimates [4]. Waterfall works best for projects with little change, little novelty,
and low complexity [4].
Lightweight methodologies follow the agile model and emphasize working software,
responding to change effectively, and user feedback. The agile model was built to be adaptive,
flexible and responsive, with an emphasis on collaboration and communication. The agile
model embraces conflict while encouraging exploration and creativity [5]. It relies
on iterative and incremental development [9] and focuses on people, not on technology or
techniques [8]. The central figure of the agile model is the project owner [11]. The downside
of the agile model is that it relies on inadequate architectural planning, over-focuses on
early results, generates weak documentation and achieves low levels of test coverage [2]. There is a
powerful negative correlation between the size of the organization and the successful
implementation of the agile model: the larger the organization, the harder it is to employ
agile methods [10]. The agile model also offers limited support for globally distributed
development teams, reduces the ability to outsource and narrows the perspective of generating
reusable artefacts [12]. The agile model works best for small teams, as in large teams the number
of communication lines that have to be maintained can reduce the effectiveness of practices
such as informal face-to-face communications and review meetings [12].
The need for formalizing a software development methodology dedicated to innovative
projects arises from the fact that traditional heavyweight methodologies are unable to
deliver fast development without compromising quality, whereas agile lightweight
methodologies are characterized by inadequate documentation, weak architecture and a lack of
risk management [2]. A software development methodology has to be described quantitatively
and qualitatively, has to lead to similar results if used repeatedly, has to be applicable with a
reasonable level of success and has to be relatively easy to explain and teach [13]. A software
development methodology should include people, roles, skills, teams, tools, techniques,
processes, activities, standards, quality measuring tools, and team values [12].
3. Developing the ALPHA Application
The core of every software methodology is its development life cycle. The development life
cycle formalized in the ISDF methodology and used in the ALPHA project consists of the
following stages: research, planning, design, prototype, development, testing, setup and
maintenance. Research, planning, development, testing and setup are common stages in most
software development methodologies. Building a prototype, design and maintenance are also
employed in other software development methodologies but are not regarded as distinct
development life cycle stages. Innovative software development projects, though, enforce
prototyping as a distinct stage because it plays an important role in reducing risk, refining
specifications and validating the innovative idea that initially led to the inception of the
project. As part of the research process, the development life cycle stages of the ALPHA project
were analysed as independent entities, highlighting people and roles.
Research stage in the ALPHA project methodology was dedicated to gathering and
exchanging information and it involved the project manager, the project owner and the project
team. The project owner’s role was to formulate requirements and communicate them to the
project manager. The project manager’s role was to evaluate requirements and assemble a team
with the necessary set of skills, professional values and experience required to implement the
project. Including the project manager, 8 people were involved in developing the ALPHA
application. Previous experience led to the conclusion that the ISDF methodology is effective
for teams that do not exceed 9 members. When selecting the team members, the project
manager took into account the fact that implementing innovative projects requires strong
associating, questioning, observing, experimenting, and networking skills [14]. The project
team’s role was to evaluate requirements from a technical perspective. In the ALPHA project,
the project manager together with the project team also had the role of converting requirements
into actual specifications. As part of the research process, the project owner analysed
applications that were similar or complementary to the ALPHA application.
Planning stage in the ALPHA project was dedicated to formalizing the main characteristics of
the web application and it involved the project owner, project manager and the project team.
The project owner had the role of providing feedback on software artefacts. The project
manager’s role was to plan activities, set standards and assign responsibilities to team members.
The project manager together with the team members had the role of defining the overall flow
of the application. The flow was broken down into smaller, easier-to-manage subassemblies.
For each subassembly a comprehensive set of functionalities was defined. Based on the
required functionality the technical team members designed the database structure. The project
manager together with the project team also chose the tools, technologies and processes that
were going to be employed in the ALPHA project.
Design stage in the ALPHA project was dedicated to creating the graphic component of the
application and it involved the project owner, the project manager and the project team. The
role of the project owner was to provide feedback on the layout. The project manager had the
role of ensuring that the graphic component is consistent with the functionality and the target
group of the web application. The only team member involved in the design stage was the
graphic designer. His role was to create a layout in accordance with specifications received
from the project manager.
Prototype stage in the ALPHA project was dedicated to building a functional proof of concept
and it involved the project owner, the project manager and the project team. The role of the
project owner was to provide feedback on the prototype. The role of the project manager was
to refine specifications in accordance with the project owner’s feedback. The role of the project
team was to build the prototype. Innovative web development projects are characterized by a
considerable degree of uncertainty. Building the prototype had the role of validating the idea
that led to the inception of the ALPHA project. The prototype also acted as a basis for
delivering consistent feedback and refining specifications.
Development stage in the ALPHA project was dedicated to actually building the functionality
part of the application and it involved the project manager and the project team. The role of the
project manager was to monitor progress, motivate team members and report to the project
owner. The role of the development team was to write code and debug.
Testing stage in the ALPHA project was dedicated to identifying programming, design and
architectural issues and it involved the project manager and the project team. The role of the
project manager was to ensure that the testing scenarios were exhaustive. The role of the project
team was to identify and fix security, functionality, design and architectural issues. Also,
the project team had to ensure that the web application was doing everything it was designed
to do and nothing that it was not designed to do.
Setup stage in the ALPHA project was dedicated to installing the web application on the live
environment and it involved the project team. The role of the project team was to configure
the live environment in terms of security, hardware and software resources.
Maintenance stage in the ALPHA project was dedicated to ensuring that the application is
running properly on the live environment and it involved the project team. The role of the
project team was to monitor the traffic and the firewall, mail, database and network protocol
error logs.
The next step in the research process was to analyse the succession, connections and interactions
of the software development life cycle stages, highlighting resources, activities and tools. Fig.
1 presents a schematic representation of the development life cycle used in the ALPHA project.
The development life cycle presented in Fig. 1 is also representative for the ISDF methodology.
Figure 1. Development life cycle for the ALPHA application.
Research for the ALPHA project started with a series of meetings between the project manager
and the project owner. The project owner presented his vision of the application and detailed
the initial set of requirements. The project manager then analysed similar web applications
already operating in the online environment. The project team performed a technical review of
the requirements. The Research stage ended with the project manager and the project team
drafting the specifications for the ALPHA application. In the Planning stage the project
manager and the project team defined the overall flow of the ALPHA application and broke it
down into manageable subassemblies. The overall flow and the subassemblies were built with
the help of use case diagrams (UCDs). Fig. 2 presents the UCD diagram for the Register – Login –
Logout process of the ALPHA application.
Figure 2. UCD diagram for the ALPHA project’s Register – Login – Logout process.
Building UCDs is an important process in understanding the structure of the application and it
is also one of the first deliverables that the project owner comes into contact with. The ISDF
methodology does not rely heavily on UCDs because building an innovative application is a
very dynamic process and the initial planning will change multiple times before the application is
completed. The role of the UCD diagrams in the ISDF methodology is to help the project team
gain a deeper understanding of the application and also to provide the project owner with a
preview of what the development team is going to implement. In the ALPHA project a
restriction was enforced of building a maximum of 10 UCDs and allocating a maximum of 2
hours to building each UCD. The Planning stage continued with building the database
structure. Fig. 3 presents a sample of the database structure built for the ALPHA application.
Figure 3. Sample of the ALPHA application database structure.
The role of the database structure in this stage of the ALPHA project was to help the project
team gain a deeper understanding of the application. The database structure built in the
Planning stage was not a mandatory requirement for the final application. The database
structure changed significantly on three separate occasions by the time the project was finished.
The Planning stage ended with the project manager and the project team deciding on what
tools, technologies and processes to employ in the development process of the ALPHA
application. In terms of code versioning tools, the project team decided to use Tortoise SVN.
For the overall planning, resource allocation, budgeting and activity planning, the project
manager decided to use Microsoft Project. In terms of bug tracking, task assignment and
progress monitoring, the project manager and the project team decided on using Pivotal Tracker.
In terms of technology, the project team opted for the LAMP stack with CentOS as the Linux
distribution. The HTTP server of choice was Apache, the DBMS was MySQL and the programming
language was PHP. In order to facilitate building on an MVC architecture, the Zend Framework
for PHP was chosen. The Planning stage also entailed defining standards and quality measuring
techniques. The ALPHA application was designed to be W3C, YSlow and PageSpeed
compliant. Data regarding quality was collected using the web application GTmetrix.
In the Prototype phase the project team built a mock-up of the application in order to validate
the assumptions made in the Research and Planning stages. The mock-up was built using
Prototyper. The prototype was built based on the UCDs developed in the Planning stage and
acted as a proof of concept. The prototype of the ALPHA application was presented to the
project owner for feedback, a process represented in Fig. 1 by transition 3. The prototype was
not in accordance with the project owner's vision of the final application, so the project team
completely rebuilt the prototype, a process represented in Fig. 1 by transition 6. After rebuilding
the prototype, the feedback received from the project owner required only minor adjustments
to the prototype, a process represented in Fig. 1 by transition 5. After the adjustments were
implemented, the prototype accurately reflected the project owner's vision of the final
application. The prototype had to be built fast and it did not require any programming skills.
In the ALPHA project the maximum time allocated for building a prototype was 3% of the
total estimated project time, and a total of 2 prototypes were built. The Research, Planning
and Prototype stages were executed in the spirit of the waterfall model and generated artefacts
that are valuable in the context of innovative projects. After the prototype was approved by the
project owner, the Development and Design stages started simultaneously.
The Design stage consisted of a series of layout iterations where the graphic designer created a
layout and made adjustments according to feedback received from the project owner, a process
represented in Fig. 1 by transitions 10, 14 and 15. Building the functionality for the ALPHA
application consisted of a series of iterations that were organized according to the timeboxing
technique. Each iteration was planned to last two weeks and ended with a functional version
of the application. Deadlines were non-negotiable. Each iteration was built by adding
functionality to the previous iteration. The ALPHA project was built in 8 iterations. An
iteration included the Development, Testing and Setup stages. Development was performed in
the spirit of the agile methodologies, with self-organizing teams and daily meetings to assess
progress and to identify issues. Developers worked in pairs, with only one of them coding
while the other was observing. Roles were exchanged daily. Pair programming reduces the
number of bugs and increases the likelihood of delivering innovative solutions. Functionality
was built following priorities set by the project owner. Functionality prioritisation was
performed using the MoSCoW model.
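As an illustration, the MoSCoW-driven ordering of an iteration backlog could be sketched as follows; the task names and priority labels below are hypothetical and are not taken from the ALPHA project:

```python
# Sketch of MoSCoW-based backlog ordering: Must-have items enter an
# iteration before Should-, Could- and Won't-have items.
MOSCOW_RANK = {"must": 0, "should": 1, "could": 2, "wont": 3}

def order_backlog(tasks):
    """Sort (name, priority) pairs so Must-have items come first."""
    return sorted(tasks, key=lambda task: MOSCOW_RANK[task[1]])

# Hypothetical backlog for one two-week iteration.
backlog = [
    ("export reports", "could"),
    ("user registration", "must"),
    ("profile pictures", "wont"),
    ("password recovery", "should"),
]
for name, priority in order_backlog(backlog):
    print(priority, name)
```

Because Python's sort is stable, tasks sharing a MoSCoW category keep their original relative order, which lets the project owner's finer-grained ordering survive the sort.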
Testing was performed using the testing scenarios defined in the Planning stage. The scenarios
needed adjustments as the requirements for the ALPHA application changed during actual
implementation. The testing scenarios covered all the instances of the ALPHA application.
Fig. 4 presents a sample of the testing schema used in the ALPHA application.
Figure 4. Sample of the testing schema used for the ALPHA application.
The testing schema was designed for two testers. Each tester was involved in the development
of the application starting from the Planning stage, when they contributed to building the
UCDs, and ending with the Setup stage when they tested the application on the live
environment.
The Setup stage entailed installing the applications on the live environment and adding proper
content. Data was imported into the application’s database in order to generate proper content.
The first version of the ALPHA application was installed on the live environment after the
first development iteration, which was 5 weeks into the project, including research, planning,
prototyping and design. After the first version of the application was installed on the live
environment, feedback was collected from the end-users and the project manager. The role of the
end-user was to provide feedback regarding usability, design and functionality. In the ALPHA
project, after the code from the first iteration was installed and tested on the live environment,
the application was tested by a sample batch of potential end-users. End-user testing was
performed after each iteration. The Maintenance stage started after the code from the last
iteration was set up on the live environment. In the ALPHA project the Maintenance stage
focused on adding new functionality and improving existing functionality. Also, an important
part of the maintenance process was fixing design, architecture and functionality issues that
were not identified in the Testing stage.
4. Formalizing the ISDF Methodology
The development of the ALPHA application was performed using the ISDF software
development methodology. By analysing the development of the ALPHA application, the
ISDF methodology was formalized and presented in a structured manner. Table 1 presents a
concise view on the ISDF software development methodology.
Table 1. ISDF software development methodology characteristics.

Roles: project owner; project manager; project team; end-user
Skills: associating; questioning; observing; experimenting; networking
Team: maximum 9 individuals; self-organizing; emphasis on informal and face-to-face communication
Tools: prototyping; code versioning; bug reporting; progress tracking; graphic design and workflow applications
Techniques: pair programming; timebox approach; MoSCoW prioritisation of tasks
Routines: daily 30-minute meetings; daily written reports; weekly one-hour meetings for planning or adjusting the current iteration
Artefacts: use case diagrams; wireframes; prototypes; test case scenarios; database schemas
Processes and activities: create artefacts; build prototypes; extend the prototype using iterative development; collect continuous feedback; develop testing scenarios before actual coding
Standards: W3C compliant; B grade by YSlow and PageSpeed standards; page size under 2 MB; fewer than 100 HTTP requests; average page load time under 5 seconds
Quality control: compliance; usability; reliability; repeatability; availability; security
Restrictions: no more than 30 minutes per daily meeting; no more than 10 UCDs; no more than 2 hours per UCD; no more than 3 prototypes; no more than 1% of the total estimated time allocated to building a prototype
Core principles: early delivery of working software; welcome change; explore multiple implementation scenarios; non-negotiable deadlines; writing code over writing documentation
Roles of core importance for the ISDF methodology are project owner, project manager,
project team and end-user. The role of the project owner is to provide accurate and detailed
application requirements to the project manager and to provide continuous feedback. The ISDF
methodology requires the project owner to be involved in every stage of the development life
cycle. The project owner must provide feedback on all aspects concerning the application; the most
important are feedback on the prototype, feedback on each development iteration
and feedback on the design. The role of the project manager is to compile specifications based on
requirements provided by the project owner, assemble the project team, design the overall flow
of the application, define the implementation timeframe, design testing schemas, track progress
and report to the project owner. The role of the project team is to plan the architecture of the
application, choose the technologies required to build the application, design the database
structure, design the graphical layout, implement functionality, test the application and setup
the application on the live environment. The role of the end-user is to provide feedback on the
functionality, design, security and usability of the application.
Skills required in developing innovative software, and for that matter required in ISDF teams,
are the ability to associate, observe, experiment, network and question. In the context of
innovation, the ability to associate means being able to make connections across areas of
knowledge. Transferring knowledge and ideas from other fields into software development is
an abundant source of innovation. Sharp observation skills are a key element of innovation, as
they facilitate gathering data and information that elude most people. When building a team, the
project manager should look for individuals with a vast network of connections. Being exposed
to people with different backgrounds and perspectives increases one's own knowledge. ISDF
requires people with experimenting skills who build prototypes and pioneer new concepts and
technologies. Questioning is essential for innovation as it is the catalyst for the associating,
observing, experimenting and networking skills [14].
Teams employed in innovative projects built using the ISDF methodology consist of a maximum of
9 individuals, including the project manager. ISDF teams rely heavily on face-to-face
communication. Empirical trials determined that teams larger than 9 individuals have issues
with effectively conducting the daily and weekly meetings. Project managers also find it hard
to properly go through more than 9 reports a day. ISDF teams are self-organized in terms of
assigning tasks and building functionality. The project manager acts as a mediator to balance
workload and solve conflicts.
Tools used in the ISDF methodology include prototyping, code versioning, bug reporting,
progress tracking, graphic design and workflow applications. There are countless tools that can
be used for the above mentioned tasks. Each team should choose tools that they are familiar
with, that suit their budget and that comply with their company culture. For instance, in the ALPHA
project Prototyper was used for building the prototype, code versioning was performed using
Tortoise SVN, bug reporting and progress tracking were performed using Pivotal Tracker,
graphic design was performed in CorelDRAW and workflows were drawn using Microsoft
Visio. ISDF is not a methodology that focuses on tools, but it definitely tries to exploit them as
much as possible. Using the same tools over and over will allow the project manager to reuse
artefacts from past projects.
Techniques used in the ISDF methodology concern programming, task prioritisation and time
management. ISDF relies on the pair programming technique to reduce the number of bugs,
increase solution diversity, build collaboration networks and stimulate learning. ISDF uses the
timebox approach for project planning in order to increase focus and avoid missing deadlines.
In the ISDF methodology prioritisation of tasks is accomplished using the MoSCoW technique
in order to ensure early delivery of the most valuable functionality.
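The timebox approach described above can be sketched as a simple schedule computation. The start date and the two-week iteration length below are illustrative assumptions rather than ALPHA project data:

```python
from datetime import date, timedelta

# Sketch of timebox planning: fixed-length iterations with non-negotiable
# end dates, in the spirit of the ISDF two-week iterations.
def iteration_deadlines(start, iterations, length_days=14):
    """Return the non-negotiable end date of each iteration."""
    return [start + timedelta(days=length_days * i) for i in range(1, iterations + 1)]

# Example: 8 two-week iterations, mirroring the ALPHA project's iteration count.
deadlines = iteration_deadlines(date(2015, 1, 5), 8)
print(deadlines[0])   # end of the first iteration
print(deadlines[-1])  # end of the final iteration
```

Because the dates are computed up front and never moved, scope rather than schedule absorbs any slippage, which is the essence of the timebox technique.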
Routines enforced by the ISDF methodology consist of daily 30-minute meetings, daily written
reports and weekly one-hour meetings for planning or adjusting the current iteration. Every
morning, team members meet together with the project manager and share progress on their
work. A special emphasis in these meetings is on identifying and eliminating factors that inhibit
progress on tasks. Daily written reports are sent by the team members to the project manager
at the end of each working day. Reports contain details on the tasks performed that particular
day and also allow the team members to transmit more sensitive information to the project
manager; information that they are not comfortable sharing with the rest of the team in the daily
meetings. Weekly meetings are for planning or evaluating the overall progress of the iteration.
Each iteration begins with a weekly meeting where tasks are assigned to team members. Task
assignment is a collaborative process, as ISDF teams are self-organized; the project manager
only intervenes to mitigate conflict or to help overcome deadlocks.
Artefacts generated by the ISDF methodology consist of use case diagrams, wireframes,
prototypes, test case scenarios and database schemas. In innovative software development
applications, artefacts are very important because they are required in the process of protecting
intellectual property rights, such as obtaining patents. Innovative software development projects
often result in applications that incorporate valuable new technologies or processes that are
subject to intellectual property laws. Artefacts are also valuable assets when new team
members join the project. In the ISDF methodology all artefacts, except database schemas, are
generated by the project manager. The database schema is generated by the project team.
Processes and activities critical to the ISDF methodology are represented by creating artefacts,
building a prototype, coding and extending the prototype using iterative development,
collecting continuous feedback and developing testing scenarios before actual coding. ISDF is a
methodology focused on coding, but creating software artefacts is a critical process in
implementing innovative applications, as it facilitates protecting intellectual property rights and
helps mitigate risks. Innovation is based on an idea. In order to test the feasibility of the idea,
building a prototype is required. A prototype can also help secure additional funding for an
innovative project. Coding and extending the prototype is performed using iterative
development. Building an application in multiple iterations allows for better tolerance to
changing requirements, as is the case in innovative projects. A critical process of the ISDF
methodology is collecting feedback from the project owner and from the end-users. Feedback
from the project owner is collected in every stage of the development life cycle. Feedback from
the end-users is collected after the first iteration's code is set up on the live environment. The
testing process begins after coding for the first iteration is finished. Testing scenarios are
written by the project manager and by the testers before the actual coding process begins.
Standards within the ISDF methodology cover coding best practices, page size, HTTP requests
and average page loading time. ISDF requires that all pages be W3C compliant unless best
practice guidelines are broken intentionally in order to boost performance. It also requires a
B grade from the YSlow and PageSpeed tools for all pages. ISDF enforces a page size under
2 MB and fewer than 100 HTTP requests to load a page. To optimize user experience, the
average page loading time should be below 5 seconds.
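The thresholds above can be expressed as a simple compliance check. The sketch below is illustrative only, not part of the ISDF specification; the PageMetrics structure and its field names are our own assumptions:

```python
# Illustrative sketch: checking measured page metrics against the ISDF
# performance thresholds (2 MB page size, 100 HTTP requests, 5 s load time).
from dataclasses import dataclass

ISDF_MAX_PAGE_SIZE_MB = 2.0
ISDF_MAX_HTTP_REQUESTS = 100
ISDF_MAX_AVG_LOAD_TIME_S = 5.0

@dataclass
class PageMetrics:
    url: str
    size_mb: float          # total page weight in megabytes
    http_requests: int      # requests needed to load the page
    avg_load_time_s: float  # average loading time in seconds

def isdf_violations(page: PageMetrics) -> list[str]:
    """Return the ISDF performance rules that the page breaks."""
    violations = []
    if page.size_mb > ISDF_MAX_PAGE_SIZE_MB:
        violations.append("page size over 2 MB")
    if page.http_requests > ISDF_MAX_HTTP_REQUESTS:
        violations.append("more than 100 HTTP requests")
    if page.avg_load_time_s > ISDF_MAX_AVG_LOAD_TIME_S:
        violations.append("average load time over 5 seconds")
    return violations

page = PageMetrics("https://example.com/", size_mb=2.4,
                   http_requests=87, avg_load_time_s=3.1)
print(isdf_violations(page))  # ['page size over 2 MB']
```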
Quality control in the ISDF methodology concerns compliance, usability, reliability,
repeatability, availability and security. Compliance is assessed by the degree to which
functionality, architecture, graphic design and user flows adhere to project owner specifications.
Usability is determined by the ease with which a user accesses and uses an application's
functionality. Reliability is determined by loading speed and response times. Reliability also
requires applications developed with the ISDF methodology to take into account users that have
access to low-speed Internet connections. Repeatability of a web application is determined by
the degree of predictability when seeking a specific result. Availability is determined by the
extent to which the application is accessible. Security is determined by the extent to which data
and personal information are protected [15].
Restrictions enforced by the ISDF methodology concern the time and resources allocated for
activities. Imposing restrictions ensures that the project does not stray from its original goals,
follows the planned timeframe and does not exceed the initial budget. In the ISDF methodology
the maximum length of an iteration is two weeks and the minimum length is one week. The
daily meetings must not exceed 30 minutes. No more than 10 UCDs are created per project,
and building a UCD should not take more than 2 hours. No more than 3 prototypes are built
per project, and building a prototype should not take more than 1% of the estimated project
timeframe.
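As an illustration only (the plan structure and key names are hypothetical, not defined by ISDF), the restrictions above can be encoded as a set of checks over a project plan:

```python
# Hypothetical sketch: the ISDF restrictions expressed as checks over a
# project plan dictionary. Key names are our own, not from the methodology.
def check_isdf_restrictions(plan: dict) -> list[str]:
    """Return the ISDF restrictions that the given plan violates."""
    problems = []
    if not 7 <= plan["iteration_days"] <= 14:  # one to two weeks
        problems.append("iteration length must be 1-2 weeks")
    if plan["daily_meeting_minutes"] > 30:
        problems.append("daily meetings must not exceed 30 minutes")
    if plan["use_case_diagrams"] > 10:
        problems.append("no more than 10 UCDs per project")
    if plan["prototypes"] > 3:
        problems.append("no more than 3 prototypes per project")
    return problems

plan = {"iteration_days": 10, "daily_meeting_minutes": 45,
        "use_case_diagrams": 8, "prototypes": 2}
print(check_isdf_restrictions(plan))  # ['daily meetings must not exceed 30 minutes']
```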
Core principles characterizing the ISDF methodology are the early delivery of working
software, welcoming change, exploring multiple implementation scenarios and actively
involving the project owner in all project stages. ISDF values writing code over writing
specifications. ISDF emphasizes design over documentation. Though planning is not overlooked,
development is always prioritized. The project owner decides the priority of tasks, and deadlines
are non-negotiable.
5. Conclusions
Research results presented in the current paper are confined to the web application development
field and were not tested on projects with a timespan larger than 14 months or on project teams
consisting of more than 10 individuals. Innovative software development projects require a
dedicated software development methodology that accounts for frequent change of
specifications, high dynamics of technology and standards, higher than usual risks, proprietary
cost structure and custom testing scenarios. The ISDF methodology was developed empirically
by trial and error in the process of implementing multiple innovative projects. The current
version of the ISDF methodology was refined by reviewing scientific literature and
incorporating valuable elements from the waterfall and agile development models. The
waterfall model provides support for generating software documentation which is valuable in
the case of innovative software development. The agile model provides a process capable of
coping with frequent change of requirements as this is frequently the case in innovative
software development projects. The roles enforced in the ISDF methodology are project owner,
project manager, project team and end-user. The ISDF methodology employs tools for
prototyping, code versioning, bug reporting, progress tracking, graphic design and workflow
applications. The routines proprietary to the ISDF methodology are daily 30 minute meetings,
daily written reports and weekly one hour meetings. The artefacts generated by the ISDF
methodology consist of use case diagrams, wireframes, prototypes, test case scenarios and
database schemas. In terms of software development techniques ISDF methodology relies on
pair programming, timebox approach and MoSCoW prioritisation of tasks. The following are
processes and activities proprietary to the ISDF methodology: creating artefacts, building
prototypes, extending prototypes using iterative development, collecting continuous feedback
and developing testing scenarios before actual coding. Standards of the ISDF methodology
enforce W3C compliance, YSlow and PageSpeed B grades, fewer than 100 HTTP requests to
load a page, page size under 2 MB and page loading time under 5 seconds. Quality control
regards compliance, usability, reliability, repeatability, availability and security. As a future
research topic, ISDF methodology can be scaled in order to accommodate software
development projects that require larger teams.
Acknowledgment
This paper was co-financed by the European Social Fund, through the Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields -EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] T. DeMarco, “The role of software development methodologies: past, present, and future”,
Proceedings of the 18th international conference on Software engineering, 25-30 Mar.
1996, Berlin, Germany, Publisher: IEEE, ISBN: 0-8186-7246-3, pp. 2-4
[2] M. R. J. Qureshi, “Agile software development methodology for medium and large
projects”, IET Software, vol.6, no.4, pp.358-363, doi: 10.1049/iet-sen.2011.0110
[3] K. Petersen, C. Wohlin and D. Baca, “The Waterfall Model in Large-Scale Development”,
Proceedings of the 10th International Conference on Product-Focused Software Process
Improvement, 15-17 Jun. 2009, Oulu, Finland, Publisher Springer Berlin Heidelberg, ISBN
978-3-642-02151-0, pp. 386-400
[4] S. H. VanderLeest and A. Buter, “Escape the waterfall: Agile for aerospace”, Proceedings
the 28th Digital Avionics Systems Conference, 23-29 Oct. 2009, Orlando, USA, Publisher:
IEEE, doi: 10.1109/DASC.2009.5347438, pp. 6.D.3-1- 6.D.3-16
[5] T. Dyba and T. Dingsoyr, “What Do We Know about Agile Software Development?”, IEEE
Software, vol.26, no.5, pp. 6-9, doi: 10.1109/MS.2009.145
[6] B. V. Thummadi, O. Shiv and K. Lyytinen, “Enacted Routines in Agile and Waterfall
Processes”, Proceedings of the 2011 Agile Conference, 7-13 Aug., Salt Lake City, USA,
Publisher: IEEE, 2011, doi: 10.1109/AGILE.2011.29 pp. 67-76
[7] P. Trivedi and A. Sharma, “A comparative study between iterative waterfall and
incremental software development life cycle model for optimizing the resources using
computer simulation”, Proceedings of the 2nd International Conference on Information
Management in the Knowledge Economy, 19-20 Dec. 2013, Chandigarh, India, Publisher:
IEEE, pp. 188-194
[8] D. Duka, “Adoption of agile methodology in software development”, Proceedings of the
36th International Convention on Information & Communication Technology Electronics
& Microelectronics, 20-24 May 2013, Opatija, Croatia, Publisher: IEEE, ISBN: 978-953-
233-076-2, pp. 426-430
[9] S. Zhong, C. Liping and C. Tian-en, “Agile planning and development methods”,
Proceedings of the 3rd International Conference on Computer Research and Development,
11-13 Mar. 2011, Shanghai, China, Publisher: IEEE, doi: 10.1109/ICCRD.2011.5764064,
pp. 488-491
[10] J. A. Livermore, “Factors that impact implementing an agile software development
methodology”, Proceedings of the 2007 IEEE SoutheastCon, 22-25 March 2007,
Richmond, USA, Publisher: IEEE, doi: 10.1109/SECON.2007.342860, pp.82-86
[11] T. J. Lehman and A. Sharma, “Software Development as a Service: Agile Experiences”,
Proceedings of the 2011 Annual SRII Global Conference, 29 Mar. - 2 Apr. 2011, San Jose,
USA, Publisher: IEEE, doi: 10.1109/SRII.2011.82, pp. 749-758
[12] A. Cockburn, “Selecting a project's methodology”, IEEE Software, vol.17, no.4, pp. 64-
71, doi: 10.1109/52.854070
[13] R. Klopper, S. Gruner and D. G. Kourie, “Assessment of a framework to compare software
development methodologies”, Proceedings of the 2007 Annual Research Conference of the
South African Institute of Computer Scientists and Information Technologists on IT
Research in Developing Countries, Sunshine Coast, 30 Sep. - 03 Oct. 2007, South Africa,
Publisher: IEEE, doi: 10.1145/1292491.1292498, pp. 56-65
[14] C. M. Christensen, J. Dyer and H. Gregersen, The Innovator's DNA: Mastering the Five
Skills of Disruptive Innovators, Publisher: Harvard Business Review Press, pp. 304, ASIN:
B0054KBLRC
[15] M. Despa, I. Ivan, C. Ciurea, A. Zamfiroiu, C. Sbora, E. Herteliu, “Software testing,
cybernetic process”, Proceedings of the 8th International Conference on Economic
Cybernetic Analysis: Development and Resources, 1-2 Nov. 2013, Bucharest, Romania
ISSN 2247-1820, ISSN-L 2247-1820.
AGILITY IN THE IT SERVICES SECTOR: A STUDY FOR ROMANIA
Eduard-Nicolae BUDACU
Economic Informatics Doctoral School,
Bucharest University of Economic Studies,
Constanta-Nicoleta BODEA
Economic Informatics and Cybernetics Department
Bucharest University of Economic Studies,
Centre for Industrial and Services Economics, Romanian Academy
Stelian STANCU
Economic Informatics and Cybernetics Department
Bucharest University of Economic Studies,
Centre for Industrial and Services Economics, Romanian Academy
Abstract. The paper presents a study for assessing the agility of the software development
teams working in Romanian IT services sector. The reasons for the adoption of agile practices
and tools are identified and explained. The study started in February 2015 and we can report
the first results. Many companies that have adopted agile practices have aligned their
information infrastructure accordingly, using new tools and giving support to their employees
to attend trainings, certifications and coaching. The companies intend to make further
investments in order to continuously improve the application of the agile methods.
Keywords: Agile, Romanian IT services sector, Software Development, Scrum
JEL classification: L86
1. Introduction
According to a Gartner study cited by Bloomberg [1], in 2014 Romania had over 64,000 IT
specialists, ranking first in the EU in terms of the number of technology-sector employees per
capita and sixth worldwide. An important characteristic of the IT sector in Romania is that IT
professionals are well educated and relatively cheap. According to KeysFin's statistics [2],
approximately 70% of Romanian IT companies were established in the last 10 years and over
50% of them are still in operation. The financial performance of the domain has steadily
increased, reaching an average annual turnover of 2.8 billion euros.
IT is a cross-cutting domain, combining several categories of activities. According to NACE
classification, IT domain includes the IT manufacturing and IT services. IT manufacturing is a
small sub-set of high technology manufacturing and so is not considered in detail in our study.
The IT services sector (NACE code K72 “Computer and related activities”) includes the
following activities [3]: hardware and software consultancy and supply, publishing of software
and other software consultancy and supply, data processing, database activities and
maintenance. Software development is the main part of the IT services sector. According to
KeysFin's statistics [2], over a third of IT firms (35%) are involved in software development,
three times more than in the "Other IT services" segment and two times more than in the
"Maintenance and Repair" segment. Almost half of the turnover of the IT sector (48%) comes
from software development, and its annual profit exceeds 260 million. In contrast, the "Maintenance
and Repair" segment produces a profit 11 times lower. Due to the difficulties in implementing
IT projects, the paradigm of a software development "chronic crisis" has been accepted for many
years. Agile approaches were introduced as a solution to this crisis. Many surveys are performed
every year in order to identify trends in agile adoption. This paper presents some of these
surveys, with the main results achieved during the last years. A similar survey was performed
for Romania and the paper presents the first results obtained based on the data collected so
far.
The paper is structured as follows: after the introductory part (section 1), section 2 presents the
main characteristics of the agile methods and of the agile adoption, as it is known at
international level through different surveys. Section 3 describes the design of our survey,
especially the data collection methods. Section 4 reveals the preliminary results and discussions
of our research. Conclusions are drawn in section 5.
2. Agile approaches adoption in IT services sector
2.1. What does it take to become Agile?
Agile methods are characterized by flexibility, reliance on tacit knowledge, and face-to-face
communication, in contrast to traditional methods that rely on explicit knowledge sharing
mechanisms, extensive documentation, and formal means of communication [4]. From the
Software Engineering perspective, the phases of the software development life cycle remain
the same: requirements gathering and analysis, design, implementation or coding, testing,
deployment and maintenance are all present in agile software development. Coding standards,
code review, pair programming and refactoring assure high quality software. Scrum is the most
popular of the agile methods [5]. Scrum is a team-based approach to delivering value to the
business. Team members work together to achieve a shared business goal. The Scrum
framework promotes effective interaction between team members so that the team delivers
value to the business [6]. When applied to software development, the objective is to deliver
value in the form of software programs or products.
Working software is delivered in a series of short time periods called sprints. Team members
gather for planning at the beginning of the sprint, then hold daily standup meetings to track
progress and identify impediments. The sprint ends with a review session to gather feedback
from the users and a retrospective meeting for the team to reflect on the process and how to
improve it.
The roles, responsibilities and mindset of an agile team member are different from those in
a traditional management approach. The project manager shifts from a command-and-control
behavior to one of support and servant leadership. The Scrum Master is responsible for helping
the rest of the team progress, keeping them productive and teaching the Scrum process [6]. The
customer/stakeholder works hand in hand with the development team. Through the voice of the
Product Owner, the product vision and business goals are presented. Team members become less
specialized. The term "generalizing specialist" proposed by Scott Ambler describes agile team
members as "craftspeople, multi-disciplinary developers, cross-functional developers, deep
generalists, polymaths, versatility, or even "renaissance developers"" [7].
Comprehensive documentation is replaced with short, simple descriptions of functionality
written from the user's perspective, called User Stories [8]. Instead of covering all the details,
a user story invites discussion and negotiation between team members. The objective is to
deliver working software while writing just enough documentation. Selecting the right tools to
support the teams will increase their agility. In order to ship software frequently, teams need
continuous integration, automated build and release management tools. When working with
large or distributed teams, a management and issue tracking tool assures better project
visibility. Source version control tools are a must when more than one developer edits the code.
2.2 Relevant studies on the adoption of agile approaches in IT services sector
The popularity of the agile approach has led different professional groups to study the
characteristics of the adoption process. Even so, there is only limited information about the
agile adoption process in Romania.
The VersionOne group performs an annual survey on the adoption of the agile approach. The
last annual "State of Agile" survey was conducted in 2013 and the results were published in
2014 [5]. According to this survey, between 2012 and 2013 the number of people recognizing
that agile development is beneficial to business increased by 11%. Scrum and Scrum variants
remain the most widely practiced methodology, and Kanban became more popular, rising by
7% in the professionals' preferences. There are new trends in software development, such as
the increasing use of agile practices in distributed teams, from 35% in 2011 to 76% in 2013.
The usage of agile tools is also increasing, from 67% in 2011 to 76% in 2013.
Results of a mini-survey on agile adoption conducted in 2014 were reported in [9].
According to this survey, 33% of the respondents consider the adoption of the agile approach
a success, 5% consider it a failure and 40% are neutral. During the agile adoption process,
the easiest thing to do is to use the existing tools in an agile manner, and the most difficult is
to change the business culture in order to gain acceptance for agile management practices.
3. The research method
The research focused on the following research questions:
1. How many Romanian companies from the IT services sector have already adopted agile
software development methods?
2. How many Romanian professionals have agile certification?
3. Which practices and tools are adopted and used by the agile teams?
In order to address these research questions, the following activities were conducted:
1. Identification of the companies from the IT services sector which adopted agile
software development methods, using IT company lists and web search on their web
sites
2. Identification of the number of certified professionals, based on public information
sources
3. Applying an online survey to gather data regarding practices and tools adopted and used
by agile teams
4. Conducting interviews with professionals involved in adopting agile methods within
companies.
5. Preliminary analysis of collected data
ANIS ("Asociaţia Patronală a Industriei de Software şi Servicii", the Romanian employers'
association of the software and services industry) develops and maintains the public list of
companies from the IT services sector included in the Romanian Software Index [10].
Companies are grouped in two categories: outsourcing companies and product companies.
There were 56 companies in that list at the time of the research. For each company, the index
includes the company name, description, contact details and information regarding the services
they offer. It is a common practice, especially for outsourcing companies, to present details
regarding the development process in order to emphasize the quality of their services. This is
why, for each company, we decided to perform a Google search, using the syntax:
site:[company_website] "[keyword]" (for example, site:www.domain.com "agile"). As
examples of keywords, we can mention: "agile", "scrum", "scrum master", "product owner".
The search returned a list of web pages within the company’s website where the exact match
of the keyword was found. The results were aggregated in a table containing the name of the
company, website, contact information and number of results indexed by the search engine. 29
companies out of 56 were identified as having at least one reference to the search terms.
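The query-construction step described above can be sketched as follows; the helper name is our own, and the website and keywords mirror the examples given in the text:

```python
# Sketch of the search step: for each company website, build one Google
# "site:" exact-match query per keyword.
keywords = ["agile", "scrum", "scrum master", "product owner"]

def site_queries(website: str, keywords: list[str]) -> list[str]:
    """Build one site-restricted exact-match query per keyword."""
    return [f'site:{website} "{kw}"' for kw in keywords]

# Hypothetical company domain, as in the paper's own example:
print(site_queries("www.domain.com", keywords)[0])  # site:www.domain.com "agile"
```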
The degree of interest in obtaining agile certifications was then analyzed. We consider the
number of certified professionals to be a reliable indicator of individuals' commitment to
applying agile methods over a relevant period of time. Agile certification and training
programmes are conducted by ScrumAlliance, APMG International, the Project Management
Institute and ISTQB. A web scraping tool was used to collect information regarding the
Romanian certified professionals listed in the ScrumAlliance directory [11]. The results are
presented in Table 1. There are 210 members in the directory and some of them hold multiple
certifications. The most popular certificate is "Certified ScrumMaster® (CSM)", held by 184
of the professional members.
Table 1. Scrum Alliance certificates by town
Town No of certified
professionals
Brasov, Romania 1
Bucharest, Romania 77
Cluj-Napoca, Romania 76
Iasi, Romania 13
Oradea, Romania 1
Sibiu, Romania 1
Timisoara, Romania 21
N/A 20
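As a quick cross-check, summing the per-town counts in Table 1 (including the 20 entries with no town listed) reproduces the 210 directory members reported in the text:

```python
# Cross-check: the per-town certificate counts from Table 1 should sum to
# the 210 ScrumAlliance directory members mentioned in the text.
certified_by_town = {
    "Brasov": 1, "Bucharest": 77, "Cluj-Napoca": 76, "Iasi": 13,
    "Oradea": 1, "Sibiu": 1, "Timisoara": 21, "N/A": 20,
}
total = sum(certified_by_town.values())
print(total)  # 210
```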
Agile practitioners gather to share their experience in professional meetings. There are 9 groups
with more than 2,500 members hosted on www.meetup.com that share the topic of Agile
Software Development. The biggest group is "The Bucharest Agile Software Meetup Group",
with more than 800 members. Between 40 and 50 members gather monthly in a meeting called
Agile Talks.
These three groups (IT companies, agile certified professionals and online groups of
practitioners) were targeted for an online survey that aimed to identify the methods and
practices used by the Romanian IT companies.
The survey consists of 19 questions that provide one or multiple choices to pick from. It targets
software development team members, technical managers, project managers, CTOs or any
company member who has an overview of the development process. The questions were
adapted from similar surveys applied by [5] and [9]. Between February 15th and February 22nd
the survey was distributed via LinkedIn messages to certified professionals, emails to
software companies and posts in the agile meetup groups. A preliminary analysis was
conducted on the 97 answers.
Four interviews were conducted with professionals who contributed to the implementation of
agile methods within the company they work for. The interviews were semi-structured, took
between 45 and 60 minutes and were held on Skype. The participants were asked to describe
the reasons for adopting agile, how development teams get organized, the phases they went
through while adopting agile, the practices and tools used, and future plans for development.
4. The main results
The majority of the respondents have reported that their company has less than 5 years of
experience with Agile practices (see table 2).
Table 2. How many years has your organization been applying agile?
Time interval No. of respondents
Less than 3 years 32
Between 3 and 5 years 32
Between 5 and 10 years 20
More than 10 years 4
Don't know 9
Scrum is by far the most popular agile method, followed by Kanban and Extreme
Programming (XP). Most respondents picked more than one method. It was confirmed during
the interviews that it is common to combine practices into a custom method that best fits the
company's needs. Figure 1 presents the most popular agile methods applied by Romanian
companies.
Figure 1. Agile methods
Scrum practices are the most popular. Team meetings are used by the majority of the
respondents: Planning (87%), Daily standup (87%), Retrospective (81%), Review (74%).
Software requirements are defined in a Product Backlog (84%) and detailed in the Sprint
Backlog (78%). User stories are used by 81% of the respondents to define requirements and
are enhanced with a Definition of Done (71%) and Acceptance Criteria (62%). Practices for
tracking progress like Burn down charts (57%) and Velocity tracking (54%) are less popular
but still widely used. There is a slightly lower usage of agile software development practices
compared with management practices. The interview participants confirmed that the focus was
on getting the teams organized around the agile methods in order to be more flexible to
requirement changes and to deliver more frequently.
The most common tools used by agile teams (table 3) are Bug trackers (73%), Agile project
management tool (66%), Automated build tool (66%), Taskboards (61%), Wikis (56%).
Table 3. Top 5 tools used for tracking activities
Tool No. of respondents Percentage
JIRA 68 70%
Excel 37 37%
Google Docs 27 28%
Microsoft Project 17 17%
Trello 17 17%
Common methods for sharing knowledge reported in the survey are internal presentations
(74%), informal discussions (67%), technical meetings (59%), and training programs (52%).
Wikis (52%), Mailing lists (42%), SharePoint sites (22%) are tools used by development teams
to organize and share information.
Figure 2 shows the percentage of companies in the categories "Beginner" (green), "Intermediate"
(blue) and "Advanced" (red) applying each of the 54 methods, practices and meetings,
generically named methods. For example, 20% of the "Advanced" companies use the ASD
(Adaptive Software Development) method, but no "Beginner" company uses this method.
Figure 2. Usage of agile, by company categories
In general, we see that the "Advanced" companies use more methods than the others. We can
observe the following usage patterns for the analyzed methods:
- methods that differ substantially between levels (for example: AT - Agile Testing, VCS - Version Control Systems)
- methods that do not differ much between levels (US - User Story)
- methods that are used more by "Beginner" companies than by "Advanced" ones (PB - Product Backlog)
- methods that are used by most companies regardless of level (PL - Planning meetings, RET - Retrospective meetings)
In order to analyze the correlation between the company category ("RATING") or its agile
experience ("No.years") and the number of agile methods, practices and meetings applied,
we grouped the companies according to these criteria. In figures 3 and 4, the number of
companies that fall into each group is noted in parentheses as "n =".
The figures show a positive correlation between
"RATING" or "No.years" and the number of agile methods, practices and meetings used.
Figure 3. Diversity of approaches according to the company category
Figure 4. Diversity of approaches according to the company agile experience (no. of years)
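A minimal sketch of the grouping step behind these figures, using hypothetical records rather than the actual survey data, could look like this:

```python
# Hypothetical sketch: average the number of agile methods applied per
# company, grouped by rating, mirroring the "n =" group notation used in
# figures 3 and 4. The records below are invented, not the survey data.
from statistics import mean
from collections import defaultdict

records = [  # (rating, number of methods/practices/meetings applied)
    ("Beginner", 5), ("Beginner", 7),
    ("Intermediate", 11), ("Intermediate", 13),
    ("Advanced", 18), ("Advanced", 22),
]

groups = defaultdict(list)
for rating, n_methods in records:
    groups[rating].append(n_methods)

averages = {rating: mean(ns) for rating, ns in groups.items()}
for rating, avg in averages.items():
    print(f"{rating} (n = {len(groups[rating])}): {avg}")
```

With such a grouping, a rising average from "Beginner" to "Advanced" is what a positive correlation between rating and method diversity would look like.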
5. Conclusions
Many software companies have adapted their practices in order to respond to the dynamic
business environment. This preliminary research has identified the common Agile methods
and practices used in the Romanian IT sector. Further research will focus on getting a better
understanding of how individuals interact in agile teams and how knowledge transfer could
be made more effective.
Acknowledgement
The research is part of the doctoral research programme "Knowledge Management in Agile
Software Development Teams". The research is also part of the annual research performed by
the Centre for Industrial and Services Economics, Romanian Academy.
References
[1] Bloomberg ‐ Business, Financial & Economic News, Stock Quotes,
http://www.bloomberg.com .
[2] KeysFin, ”Analiza din seria Companii la Raport: IT‐ul, sectorul care ne poate scoate din
criză!”, 8 mai 2014, http://www.keysfin.com/#!/Pages/IT_domeniul_momentului
[3] Classification of economic activities NACE rev.1.1, http://ec.europa.eu/environment/emas/pdf/general/nacecodes_en.pdf
[4] Z. Alzoabi (2012), Knowledge Management in Agile Methods Context: What Type of
Knowledge Is Used by Agilests?, In A. Rahman El Sheikh, & M. Alnoukari (Eds.) Business
Intelligence and Agile Methodologies for Knowledge-Based Organizations: Cross-
Disciplinary Applications (pp. 35-71)
[5] VersionOne, The 8th Annual State of Agile Survey, VersionOne, Inc., 2014, available at
www.stateofagile.com
[6] Scrum Alliance Core Scrum v2014.08.15, Internet: https://www.scrumalliance.org/why-scrum/core-scrum-values-roles
[7] S. Ambler, Generalizing Specialists: Improving Your IT Career Skills, available at:
http://www.agilemodeling.com/essays/generalizingSpecialists.htm
[8] M. Cohn, User Stories, Internet: http://www.mountaingoatsoftware.com/agile/user-stories
[9] Scott W. Ambler & Associates, 2014 Agile Adoption Survey, available at: www.ambysoft.com/surveys/
[10] ANIS (Asociaţia patronală a industriei de software şi servicii), Romanian Software Index http://www.softwareindex.ro/
[11] ScrumAlliance Members directory https://www.scrumalliance.org/community/member-directory
IMPROVEMENT OPPORTUNITIES BY USING REMOTE AUDIT IN
THE MARITIME TRANSPORT
Costel STANCA
Constanta Maritime University [email protected]
Viorela-Georgiana STȊNGĂ
Constanta Maritime University
Gabriel RAICU
Constanta Maritime University
Ramona TROMIADIS
Constanta Maritime University [email protected]
Abstract. The specifics of maritime transport involve a large number of audits and inspections
performed on board ships. At present, no important shipowner runs a shipping company
without an integrated management system covering quality, environment, occupational
health, etc. The international maritime conventions impose other internal/external audits
regarding safety management and ship security. The need to increase efficiency has led to a
high level of optimization of the time spent by ships in ports, and the time available for audits
has significantly decreased. To keep up with current IT&C developments, the latest edition of
ISO 19011, the standard establishing guidelines for auditing management systems, adopted in
2011, introduced the concept of remote audit. This paper presents the results of research
regarding the opportunities to use remote audit for maritime ships, increasing the possibilities
to harmonize the audit programme with the ships' port calls and decreasing the supplementary
costs of auditors' travel and accommodation.
Keywords: remote audit, ship, maritime transport
JEL classification: Auditing
1. Introduction
According to current practice, three types of audits and inspections are performed on board
ships without exception. The best-known audits are related to safety and ship security
management, based on the codes included in the International SOLAS Convention requirements
[1]. All these audits, performed by authorities' representatives or recognized organizations'
auditors, must be preceded by internal audits in the related fields.
Another audit type, often integrated with the previous one, results from maritime transport
companies' voluntary decision to apply management systems based on international standards
such as ISO 9001, ISO 14001, OHSAS 18001 or others.
The third type is given by Flag State Control, Port State Control, classification societies, vetting
and other similar audits and inspections.
Audits on board ships are normally performed while ships are under loading / unloading
operations in ports. It is obvious that the total number of audit days, including internal and
external audits, is high, while the trend is to decrease the time spent by ships in ports due to
the introduction of new port operation technologies and the optimization of trade routes.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
419
That is why shipping companies should consider the use of remote audit as an improvement
decision, especially for internal audits, but also, where applicable, in the selection of the
external auditing body.
2. Aspects on remote audit
As stated by Teeter et al. the remote audit is the process by which auditors couple information
and communication technology with data analytics to assess and report on the accuracy of
financial data and internal controls, gather electronic evidence and interact with the auditee,
independent of the physical location of the auditor. It is important to see the remote audit as an
aid for a traditional periodic audit, because it allows internal auditors to interact with different
departments and functions of the firm and third parties over long distances [2].
Taking into account the technological development of today's society and the rapidly growing
number of national and international compliance requirements, remote audit seems to be the
perfect solution for many auditors who cannot be physically present at a location [3].
Without leaving their office, they can audit a department, a supplier or even an entire
organization, regardless of location, through elements such as videoconferencing, internet
access, satellite phones and authentic records (an essential tool needed to guarantee
reliable accountability) [4].
Auditors tend to increase their use of technology in order to have a more automated audit
process, with the goal of making it more cost effective [5]. One example related to the use of
technology refers to videoconferencing that leads to a smaller number of displacements to the
audit location [6].
So, we need to highlight that remote audit eliminates the location constraint of an audit
while still allowing the connection between the audited entity (the so-called auditee) and
the auditor [7].
Remote audit reduces the location requirement for auditors, a great benefit of which is that
the audit tasks can be divided between on-site and remote audit team members. Remote audit is
considered a great option for companies that need to reduce the cost of the audit process, but
also in terms of time savings. We also need to emphasize its role regarding employees, because
it reduces the time they spend preparing for the audit.
According to Colin MacNee, a quality management consultant at IBM who wrote for the
International Register of Certificated Auditors (IRCA), not all audits should be completed
wholly remotely, even if remote techniques offer, in the planning stages of an audit, a good
way of maximizing auditors' resources. More than half of an audit may be performed as a remote
non-interactive audit (in which the internet is used to access electronic documentation and
interviews with staff are conducted by telephone or video), the rest being carried out on site
(where, for example, face-to-face interviews are performed) [8].
Remote audits are usually used for audits involving low-risk processes such as first-party
audits (conducted internally by a company) and second-party audits (conducted by a customer)
[9]. So we need to highlight that remote audits should be risk-based. Table 1 describes some
activities and their potential methods of execution, divided into on-site and remote audit,
with respect to the client interface [10].
Table 1. Activities and potential method of execution
Activity | On-site audit | Remote audit | Observations
Initial kick-off meeting | Experienced auditors meet with process managers | Meeting via video conferencing | Experienced auditors meet with process managers in person to get a "feeling" and understanding of the audit
Interviews | Auditor meets with specific parties in person | Meeting conducted by phone or video conferencing | Lack of visual communication removes bias and non-verbal feedback
Process mapping | Auditor reviews documentation, tours facility | Auditor evaluates flowcharts, verifies data flow in the Enterprise Resource Planning (ERP) system | Depending on the application, both are essential
Knowledge engineering | Offline documentation reviewed and updated | Online documentation reviewed and updated | Offline documentation would be digitized and kept in an Electronic Document Management System (EDMS)
Source: R. A. Teeter and M. A. Vasarhelyi, "Remote Audit: A Review of Audit-Enhancing
Information and Communication Technology Literature",
http://raw.rutgers.edu/remote%20audit.pdf [Feb. 20, 2015]
Some certification bodies introduced their own requirements regarding remote audits after the
concept was included in the ISO 19011 standard, but at present third-party audits use this
type of audit very rarely, mainly for multi-site audits and especially when information or
documents are not available on site, or when the on-site audit does not provide proper
conditions to perform the review of documents or other data.
3. Remote audit on board ships
For the purpose of researching the efficiency of remote audits, we selected three categories
of ships known to be among those having a reduced time for loading/unloading operations:
container ships, tankers and bulk carriers.
For example, a study performed on 647 ships (of which 127 carried petroleum products and 33
were liquid bulk ships) revealed an average port turnaround time of 35 hours for petroleum
product ships and 58 hours for the other liquid bulk carriers [11].
On the other hand, in order to ensure an effective internal audit, at least one day is
necessary to cover the requirements of the management system (if it is based on a single
standard; when more standards are applicable, the time should be increased).
The analysis we performed included the interview of 21 shipmasters (7 ships of each selected
category).
The results showed that in 38% of cases remote audit had been used in the last year for
internal or second-party audits, covering especially online documentation review. The average
audit time saved per ship per year was 33% (about one audit day).
The characteristics of maritime transport, which offer a short average time of ship
availability but a high level of similarity with other "sister ships", make this field one of
the most suitable for remote audit.
The fact that the ships owned by a maritime transport company are sometimes far from the
headquarters location is another reason to use remote audit for ships.
Solutions adopted for second-party audits include using a local specialized company to perform
the audit, but this option is not advisable for internal audits, as it involves a lower level
of assimilation of the company policies and fewer opportunities to identify improvements.
The critical issue for remote audit remains the auditor's ability to correlate the documents
and data reviewed online with the on-site audit actions and to perform effective
cross-checking, in order to ensure good coverage of the objective evidence related to the
management system requirements.
4. Conclusions
Even if remote audit is still under analysis by third-party auditing bodies, for internal and
second-party audits it is expected to be introduced on a large scale for maritime ships,
considerably reducing audit time and the related costs. Another important reason for covering
part of the audit through remote actions is to reduce the disturbance of onboard activities
for ships with a short turnaround time in ports. Current developments of IT&C tools will act
as a catalyst for such decisions.
References
[1] International Convention for the Safety of Life at Sea, 1974, as amended, International
Maritime Organization, 2014
[2] R.A. Teeter, “Essays on the enhanced audit”, Dissertation submitted to the Graduate
School-Newark Rutgers, The State University of New Jersey, May 2014
[3] A. Carlin and F. Gallegos, "IT audit: a critical business process", IEEE Computer 40 (7),
2007, pp. 87–89.
[4] G. Müller, R. Accorsi, S. Höhn and S. Sackmann, “Sichere Nutzungskontrolle für mehr
Transparenz in Finanzmärkten”, Informatik Spektrum 33 (1), 2010, pp. 3–13.
[5] M. G. Alles, A. Kogan and M. A. Vasarhelyi, “Putting Continuous Auditing Theory into
Practice: Lessons from Two Pilot Implementations”, Journal of Information Systems,
22(2), 2008, pp. 195–214.
[6] M. G. Alles, A. Kogan and M. A. Vasarhelyi, “Principles and problems of audit automation
as a precursor to continuous auditing”, Working paper, Rutgers Accounting Research
Center, Rutgers Business School, 2010.
[7] R. Accorsi, ”A secure log architecture to support remote auditing”, Mathematical and
Computer Modelling 57, 2013, pp. 1578–1591.
[8] C. MacNee, "What is remote auditing?", The International Register of Certificated
Auditors, http://www.irca.org/en-gb/resources/INform/archive/issue26/Features/remote-auditing/
[Feb. 20, 2015]
[9] D. Ade, "Remote Audit: Out of Sight but Not Out of Mind", in Quasar, July 2012, British
Association of Research Quality Assurance,
http://www.mastercontrol.com/audit-management/remote-audit-out-of-sight-not-out-of-mind.html
[Feb. 25, 2015]
[10] R. A. Teeter and M.A. Vasarhelyi, “Remote Audit: A Review of Audit-Enhancing
Information and Communication Technology Literature”,
http://raw.rutgers.edu/remote%20audit.pdf [Feb. 20, 2015]
[11] E. Kahveci, “Fast turnaround ships and their impact on crews”, Seafarers International
Research Centre, 1998,
http://www.sirc.cf.ac.uk/uploads/publications/Fast%20Turnaround%20Ships.pdf [Feb. 25,
2015]
AUDIT REPORTS VALIDATION BASED ON ONTOLOGIES
Ion IVAN
University of Economic Studies, Bucharest
Claudiu BRANDAS
West University of Timisoara, Faculty of Economics and Business Administration, Timisoara
Alin ZAMFIROIU
University of Economic Studies, Bucharest
National Institute for Research and Development in Informatics Bucharest
Abstract. Preparing audit reports is an algorithm-based activity characterized by generality,
determinism, reproducibility, accuracy and a well-established structure. Audit teams apply
metrics and qualitative analyses and draw up reports as structured text, built from a template
based on different standards. Finally, the report concludes with the acceptance or rejection
of the hypothesis formulated as the objective of the audit work. The audit report is subject
to a validation process comprising several steps: crossing databases; calculation of
indicators; mapping of quality levels; qualitative Euclidean distance calculation; drawing
conclusions; comparing the calculated levels; calculating the aggregate indicator; obtaining
the final solution. The auditors obtain the effective levels; through ontologies, the
calculated audit level is obtained. Because the audit report is a qualitative structure of
information and knowledge, it is very hard to analyze and interpret for different groups of
users (shareholders, managers or stakeholders). Developing an ontology for audit report
validation will be a useful instrument for both auditors and report users. In this paper we
propose an instrument for the validation of audit reports comprising: a set of keywords based
on which indicators are calculated; a set of indicators (for each keyword there is an
indicator); qualitative levels; and an interpreter that builds a table of indicators with the
actual and the calculated levels.
Keywords: Audit, Reports, Ontology, Validation, Governance
JEL classification: M42, C52
1. Structures for the audit reports
In [1] and [2], audit is defined as an activity through which corrections are made, involving
techniques, methods and models for the analysis and control of information products.
An audit process is an assurance service regarding organizational processes, risks, the
control environment and the accuracy of the financial statements, based on an auditing
methodology.
Auditing processes have well-defined task durations, and the tasks assigned to the team
members are characterized by inputs and outputs described in consistent, coherent and
effective procedures.
According to the ISACA Standards [3], the audit report should contain: scope, objectives,
period of coverage, the nature, timing and extent of the audit work performed, and the
findings, conclusions and recommendations.
From our point of view, an audit report is a qualitative structure of information and
knowledge resulting from an audit process. In most cases audit reports contain: auditor,
audited entity, scope (or limitations of scope), reference to auditing standards, period,
findings, conclusions, opinion and recommendations.
Consider a real-world entity E to be realized by a team using the available resources under
financial restrictions. The entity E is a product, an activity, a process or a technology.
A project is developed based on the objective of achieving the entity; the project contains:
the description of the entity to be realized; the deadlines of the implementation process;
resources; risks; controls; activities; inputs; outputs; the expenditure budget; and the
implementation team, with the tasks shared among its members.
The audit is a complex activity meant to establish the level of concordance between what is
written in the project draft and what happened in reality in obtaining the concrete form of
the entity E. In the case of computer applications, the programming specifications define a
virtual entity E; when the software is built, the audit assesses the extent to which the
implementation process and the concrete form of the entity E, the computer application, match
the description given in the specifications.
At the organization level there is a concern to conduct activities and produce products and
services at a defined quality level, so that the market demand/supply ratio is satisfied,
which means the sale of the finished goods and, respectively, the maximum use of the specific
service-providing lines.
The audit process is one of the most important processes of corporate governance. Good
corporate governance means coordination, collaboration and communication between all the
entities involved, such as shareholders, the board of directors, managers, stakeholders and
auditors (internal and external), whose objectives may be divergent, in order to achieve
[4], [5]:
transparency;
accountability;
risk minimization;
performance.
Also, for good corporate governance, the board of directors, the management and the auditors
must focus on the following objectives:
no work on stock;
the persons performing services respect the procedures and have no personal goals in
production;
the quality is not affected in any way, the procedures are followed unconditionally and the
materials are used exactly as specified; each person is trained to perform self-control of
their own work and is subject to the control of those who provide quality management;
keeping costs within limits that do not affect the organization's profit;
meeting the deadlines in managing contracts and using additional buffers that allow the
recovery of delays.
Assessing good governance through audit refers to the initial documentation containing
technical data, procedures and specific consumptions and, in parallel, to the information
describing the way the processes within the organization are carried out in all their
managerial aspects: the individuals' activities, consumptions, production recipes, the
operation of machinery, the use of raw materials, and the management of finished goods stocks.
The good governance audit finally materializes in a variety of reports. In order to improve
the transparency, monitoring and multidimensional analysis of corporate governance, these
reports can be represented using XBRL [6], [7].
2. Validation processes in auditing
Validation in auditing is the auditing of the audit itself: a process which shows that the
audit steps are well constructed, the audit objectives are well defined, the auditing tasks
are distributed among the team members in a balanced way, and the specific audit flow is
complete.
An auditing process is considered, defined by:
the specialists team structure S = {s_1, s_2, ..., s_ns};
the set of procedures based on which the audit activities are performed, P = {p_1, p_2, ..., p_np};
the set of activities performed by the team, A = {a_1, a_2, ..., a_na}.
A set of organizations ORG = {org_1, org_2, ..., org_norg} is considered, for which good
governance is analyzed using the validation processes defined above.
Given that the final audit report concludes whether the activity of each organization is
acceptable from the good governance point of view or not, Table 1 is built.
Table 1. Audit results for the organizations
Organization | Audit result | Long-term real results of the organization | Validation result
org_1 | rez_1 | rr_1 | δ_1
org_2 | rez_2 | rr_2 | δ_2
... | ... | ... | ...
org_norg | rez_norg | rr_norg | δ_norg
If rr_i = rez_i, then δ_i = 1; if rr_i ≠ rez_i, then δ_i = 0, where:
rr_i = accepted, if the long-term real results conclude that there is good governance;
unacceptable, if the real results conclude that there is bad governance;
rez_i = accepted, if the audit report concludes that there is good governance; unacceptable,
if the report concludes that there is bad governance.
The validation index IV is calculated by the formula:
IV = (Σ_{i=1}^{norg} δ_i) / norg
If IV > 0.92, the validation process of the good governance audit is very well built.
If 0.78 < IV ≤ 0.92, the validation process of the good governance audit is well built.
Otherwise, the validation process of the good governance audit is itself invalid.
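The validation scheme above, comparing each organization's audit verdict rez_i with its
long-term real result rr_i, averaging the δ_i values into the IV index and mapping IV onto
the three qualitative levels, can be sketched in Python; the organization verdicts used in
the example are invented for illustration:

```python
# Sketch of the IV validation index; the verdict lists below are hypothetical.

def validation_index(audit_results, real_results):
    """Compare the audit verdict rez_i with the long-term real result rr_i
    for each organization (delta_i = 1 on agreement, 0 otherwise) and
    return the validation index IV as the average of the delta values."""
    deltas = [1 if rez == rr else 0 for rez, rr in zip(audit_results, real_results)]
    return sum(deltas) / len(deltas)

def classify(iv):
    """Map IV onto the qualitative levels used in the paper."""
    if iv > 0.92:
        return "validation process very well built"
    if 0.78 < iv <= 0.92:
        return "validation process well built"
    return "validation process itself invalid"

# Hypothetical verdicts for five organizations.
rez = ["accepted", "accepted", "unacceptable", "accepted", "accepted"]
rr = ["accepted", "unacceptable", "unacceptable", "accepted", "accepted"]

iv = validation_index(rez, rr)
print(iv)            # 0.8
print(classify(iv))  # validation process well built
```

The thresholds 0.92 and 0.78 come directly from the text; only the sample verdicts are
assumed.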
3. Orientation of the audit towards ontologies
Auditing processes have a specific vocabulary describing the steps, activities, inputs,
outputs, resources, people, functions, products, processes and qualifications used to assess
how the audited organization has evolved in the context of corporate governance.
The auditors have at their disposal databases containing descriptions of the product
structures, the equipment, the fabrication recipes, the activities, the way in which the
workers developed their interactions over a specified time, and their reflection in the
carrying amounts and in all the financial flows. Developing an ontology-oriented approach to
the audit process comes down to:
comparing the planned levels available in big data with the actual levels;
framing within, or falling outside, the planned timeframes;
analyzing the qualitative aspects of activities, processes and products;
mapping the quantitative results onto qualitative levels;
building ontology-oriented audit metrics.
The complexity of the ontology-oriented approach to the audit process requires several steps,
of which the most important are:
building the specific corporate governance vocabulary;
defining the structure of the audit reports;
developing algorithms that search for and extract words from the databases;
counting the differences resulting from the analysis between the planned and the actual
levels;
calculating the simple indicators;
calculating the aggregate indicator;
mapping the quantitative onto the qualitative levels;
generating the conclusion of acceptance or rejection.
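The steps above can be illustrated with a minimal pipeline sketch; the vocabulary, the report
text, the planned counts and the acceptance threshold are all assumptions made for
illustration, not part of the paper's method:

```python
# Hypothetical sketch: keyword extraction, simple indicators, aggregation,
# quantitative-to-qualitative mapping and the accept/reject conclusion.

VOCABULARY = ["risk", "control", "budget", "deadline"]  # assumed vocabulary

def simple_indicators(report_text, planned_counts):
    """One indicator per keyword: the ratio between the actual number of
    occurrences in the report and the planned (expected) number."""
    words = report_text.lower().split()
    return {kw: words.count(kw) / planned_counts[kw] for kw in VOCABULARY}

def aggregate(indicators):
    """Aggregate indicator: the average of the simple indicators."""
    return sum(indicators.values()) / len(indicators)

def qualitative_level(agg):
    """Map the quantitative aggregate onto a qualitative conclusion
    (the 0.9 threshold is an assumption)."""
    return "accepted" if agg >= 0.9 else "rejected"

report = ("the risk register and each control were reviewed against "
          "the budget and every deadline deadline")
planned = {"risk": 1, "control": 1, "budget": 1, "deadline": 2}

ind = simple_indicators(report, planned)
print(qualitative_level(aggregate(ind)))  # accepted
```

A real implementation would extract the vocabulary and the planned levels from the
organization's databases, as the steps above describe.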
For the validation, analysis and interpretation of audit reports based on ontologies, we
define a class hierarchy of the audit report domain knowledge (Figure 1).
Figure 1. Class hierarchy of the audit report domain knowledge, developed in Protégé [8]
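As a rough illustration of what such a class hierarchy contains, the sketch below models a few
audit report concepts as Python classes; the class names are assumptions derived from the
report structure described in Section 1, not a reproduction of the Protégé hierarchy in
Figure 1:

```python
# Hypothetical sketch of an audit report class hierarchy; the concept names
# are illustrative assumptions, not the actual classes of Figure 1.

class AuditReportConcept:
    """Root of the audit report domain knowledge."""

class Auditor(AuditReportConcept): pass
class AuditedEntity(AuditReportConcept): pass
class Scope(AuditReportConcept): pass
class Finding(AuditReportConcept): pass
class Conclusion(AuditReportConcept): pass
class Recommendation(AuditReportConcept): pass
class Opinion(Conclusion): pass  # an opinion specializes a conclusion

def subclasses(cls):
    """List the direct subclasses of a concept, by class name."""
    return sorted(c.__name__ for c in cls.__subclasses__())

print(subclasses(AuditReportConcept))
# ['AuditedEntity', 'Auditor', 'Conclusion', 'Finding', 'Recommendation', 'Scope']
```

In an OWL editor such as Protégé, the same structure would be expressed as subclass axioms
rather than Python inheritance.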
The accuracy of ontology-oriented auditing in corporate governance requires a long process of
analysis, evaluation and validation, and only after that can the obtained solution be
considered acceptable for the defined problem.
If the problem is to be developed further, a refinement of the process should be considered,
which involves simplifying the indicators, the steps and the reporting structures, without
losing the quality of the results, so as to obtain a valid audit conclusion.
4. Process validation by ontologies in auditing
Note that the solving of a specific problem is placed at a level corresponding to each
generation, based on the knowledge available at a given time. In the case of object
orientation, the comprehensive approach consisted of object-oriented analysis, object-oriented
design, object-oriented programming and object-oriented databases, which means that at that
level of knowledge the entire development cycle of systems included techniques, methods and
technologies based on object orientation. Switching to another generation requires that all
the elements contributing to the realization of a system be based on the latest technologies.
In our case, the ontology-oriented approach to corporate governance auditing is complete only
if everything about the audit is based on ontologies, including the validation.
Process validation appears very specifically in many areas of computer science, applied from
data validation, diagram validation, source text validation and processing validation up to
the most complex components included in software engineering. To declare an entity valid means
to conclude that it meets all the conditions required for actual use in a specific context.
The auditing methods for the improvement of corporate governance are based on a clear set of
procedures designed so that they ultimately lead to a series of texts which form, by
concatenation, the final report. If the analysis of these texts, which constitute a linear
list working on the principle of seriality, is equivalent to a continuous traversal, the
conclusion is that corporate governance was carried out on acceptable terms; in the case of a
fragmented traversal, the conclusion is the contrary.
Building an ontology-oriented validation process for corporate governance auditing comes down
to identifying those elements that refer to a collection of organizations subject to
ontology-oriented corporate governance audits.
A small set of indicators is built to highlight the agreement between the result of the
ontology-oriented audit and the actual behavior of the organizations with respect to corporate
governance over a time interval following the audit.
The calculation of the indicators and their aggregation allows obtaining a picture of the
quality of the audit process.
If, by analyzing many organizations through ontology-oriented auditing processes and
aggregating the information, the obtained level is unsatisfactory, the audit technology is
rejected.
In this way, by focusing on ontologies, the homogenization of both the audit process and the
validation process is achieved.
5. Conclusions
Corporate governance systems have a high level of complexity that requires a new approach
when analyzed and subjected to evaluation. The auditing process is an important element for
improving corporate governance and assuring the transparency, accountability in order to
increase performance of organizations.
Organizations based on corporate governance are subject to regular, complete audit processes,
which also require a new technological approach so that all the elements of the virtual
environment reflect the dynamics of the organization's processes. The only way to achieve this
is the ontology-oriented approach, in which the accepted ontologies are defined as organized
sets in an abstract structure, put in correspondence with the elements of the real sets, and
the categories and concepts are related to assessments that map the quantitative values,
framed in intervals, onto qualitative levels. The existing hardware and software resources,
and the access to any of their components, support such an approach.
Developing and implementing an ontology-based solution for validating, analyzing and assessing
audit reports creates a basis for collaborative decision-making support for auditors and audit
report users, in order to increase the corporate governance performance of organizations.
Acknowledgement
This work was cofinanced from the European Social Fund through Sectoral Operational
Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/134197 „Performance and excellence in doctoral and postdoctoral
research in Romanian economics science domain”
References
[1] C. Amancei, "Metrici ale auditului informatic", Teza de doctorat, Bucuresti 2011, 176 pg.
[2] M. Popa, C. Toma and C. Amancei, "Characteristics of the Audit Processes for Distributed
Informatics Systems", Revista Informatica Economica, vol. 13, nr 3, 2009.
[3] ISACA, “IT Standards, Guidelines, and Tools and Techniques for Audit and Assurance and
Control Professionals”, 2010.
[4] C. Brandas, Study on the Support Systems for Corporate Governance, Informatica
Economică vol. 15, no. 4/2011.
[5] K.J. Hopt, "Comparative Corporate Governance: The State of the Art and International
Regulation", working paper no 170/2011, ECGI Working Paper Series of Law, 2011.
[6] C. Brandas, Improving the Decision-Making Process and Transparency of Corporate
Governance Using XBRL, World Academy of Science, Engineering and Technology Vol:6
2012-11-29.
[7] A. Bodaghi, A. Ahmadpour, The Improvement of Governance Decision Making Using
XBRL, 7th International Conference on Enterprise Systems, Accounting and Logistics (7th
ICESAL 2010) 28-29 June 2010, Rhodes Island, Greece.
[8] Protégé, “A free, open-source ontology editor and framework for building intelligent
systems”, Stanford University, 2015.
APPLICATION OF BUSINESS INTELLIGENCE IN PROJECT
MANAGEMENT
Mihaela I. MUNTEAN
West University of Timisoara, Romania [email protected]
Liviu Gabriel CABĂU
West University of Timisoara, Romania [email protected]
Abstract. The present paper is based on a preprint version published in the RePEc database1.
Having identified a constant interest in the approached subject, we took the initiative of
presenting the working paper at the IE 2015 conference. Applying Business Intelligence in
project management adds value to the process. In these terms, an approach to project
monitoring is proposed.
Keywords: business intelligence, project management, monitoring, key performance
indicators.
JEL classification: M00, O31, L20
1. Introduction
Current business intelligence (BI) approaches are subordinated to performance management
[1], [2], [3], the key performance indicators (KPIs) being an important contributor to the BI
value chain [4], [5]. Successful BI initiatives are possible with the support of technologies,
tools and systems that are capable to sustain the above mentioned value chain. Along the BI
value chain, data is transformed into relevant information and is stored into the data warehouse.
The multidimensional cube, deployed above the data warehouse, together with a set of data
mining techniques will transform the information into valuable knowledge. The KPIs,
technically attached/added to the cube, are further part of the performance management system
[4], [6], [7].
The data warehouse environment concept [8], [9] is equivalent to the introduced BI value chain.
Despite its dominant technological nuance, performance measuring is not neglected. BI
projects are deployed based on a suitable data warehouse schema with respect to the imposed
key performance indicators.
Unanimously, project management (PM) is considered "the process of achieving project
objectives (schedule, budget and performance) through a set of activities that start and end
at certain points in time and produce quantifiable and qualifiable deliverables" [10].
Methodological approaches to conducting projects have established guidelines for all project
life cycle phases: initiation, planning, execution, monitoring & control, and closing [11],
[12], [13]. Experts in project management have estimated that PM is 20% planning and 80%
monitoring & control. Monitoring covers schedules, budgets, quality, risks, and scope [14].
The actual state is compared to the baseline; the actual performance is compared with the
planned one. Therefore, a minimal set of key performance indicators (KPIs) will be introduced
to monitor the project's progress during its execution.
1 Muntean M., Cabău L.G., Business Intelligence Support for Project Management, http://mpra.ub.uni-
muenchen.de/51905/1/MPRA_paper_51905.pdf
2. Monitoring Project Execution
In general, the purpose of monitoring can be: to assess project results, to improve project
management, to promote learning, to understand different stakeholders' perspectives, and to
ensure accountability [15]. Based on the introduced project life cycle phases, monitoring is
done in parallel with other processes such as planning or execution.
Monitoring implies measuring the progress and performance of the project during its execution
and communicating the status. Deviations of the project's performance from the plan are
signaled when:
the team is not working on the correct activities of the project plan;
the team is not on-schedule with the project objectives;
the team is not on-budget with the project resources;
the quality of the work is not acceptable;
additional project control activities cannot be performed.
Based on the diagnosis, appropriate corrective actions will be taken.
2.1 A Minimal Set of Key Performance Indicators
KPIs are used to assess or measure certain aspects of the business operations (at the
operational level) and of the business strategies (at the strategic level) to which it may
otherwise be difficult to assign a quantitative value. Recent Aberdeen Group studies have
pointed out that "the creation, management and continual review of the KPIs can be difficult
because it implies references to large, complex data volumes and a rapidly changing business
dynamics". However, KPIs should be specific, measurable, achievable, result-oriented and
time-bound.
Monitoring the progress of a project can be done with the help of a minimal set of KPIs.
Three aspects are taken into consideration: schedule progress, budget, and scope.
Monitoring the schedule progress can be performed based on the following five KPIs (Table 1).
They refer to the status of the activities that have been scheduled (per week or per month),
to the progress of the activities (ahead, behind or on-schedule) and to the required course
corrections.
Table 1. Schedule progress KPIs
No. | KPI Name | Definition
1 | Activity normal average (ANA) | The daily average value that must be achieved by a specific activity in order for it to be successfully completed at the end of a time period.
2 | Activity normal value (ANV) | The normal value that a particular activity should have acquired in a time interval.
3 | Activity current average (ACA) | The average value of a specific activity at the present time.
4 | Activity average progress (AAvP) | The progress recorded by a particular activity (ahead, behind or on-schedule) compared to the baseline.
5 | Activity absolute progress (AAbP) | The percentage of a specific activity that has been completed.
Monitoring the budget implies three KPIs (Table 2). They refer to the amount of budget that
has been spent up to a given date, to the amount of remaining budget and to the revised
estimates to complete a programmed activity.
Table 2. Budget monitoring KPIs
No. | KPI Name | Definition
1 | Activity total cost (ATC) | The amount of budget that has been spent for a specific activity.
2 | Activity total budgeted (ATB) | The amount of budget that has been allocated for a specific activity.
3 | Activity remaining budgeted (ARB) | The amount of budget that has not been spent for a specific activity.
Monitoring the scope is possible with the next three KPIs (Table 3), which identify whether
activities are in scope or out of scope and flag any changes that require a scope addition.
Table 3. Scope monitoring KPIs
1. Project activities on scope (PAS): the number of project activities that are in scope.
2. Project activities out of scope (PAoS): the number of project activities that are out of scope.
3. Project activities number (PAN): the total number of project activities.
It is not recommended to track the considered KPIs more often than once per week or less
often than once per month.
2.2 Defining the proposed KPIs
The multidimensional model is developed following an agile approach grounded in four phases:
(1) conceptual schema design, (2) detailed fact table design, (3) detailed dimension table design,
and (4) refinement of the dimensional model.
The resulting measures are:
M1 = Quantity;
M2 = Unit_cost;
M3 = Unit_budgeted;   (1)
M4 = Activity_target_value;
M5 = Activity_current_value.
They have been aggregated according to the following dimensions:
D1 = Objectives → Objectives_Types;
D2 = Activities → Activities_Types;
D3 = Resources → Resources_Types;   (2)
D4 = Activities_Status;
D5 = Start_Day → Start_Month → Start_Year;
D6 = Finish_Day → Finish_Month → Finish_Year.
Above the DW, the OLAP cube will be deployed and the considered KPIs can be added to the
cube. They will be calculated with the following formulas:
ANA = M4 / (Finish_date − Start_date);
ANV = (Current_date − Start_date) × ANA;
ACA = M5 / (Current_date − Start_date);   (3)
AAvP = (ACA / ANA) × 100;
AAbP = (M5 / M4) × 100.
For the above five KPIs, recommended for monitoring the schedule progress, a maximal value
is desired.
The next group of KPIs is used to monitor the budget. According to their definitions in Table 2,
the following formulas can be used:
ATC = M1 × M2;
ATB = M1 × M3;   (4)
ARB = ATB − ATC.
While the first KPI is desired to have a minimal value, maximal values are appreciated for the
other two.
Concerning the third group of KPIs (Table 3), the following formulas are proposed:
PAS = COUNT(ID_activity) WHERE Activity_type_name = "on scope";   (5)
PAoS = COUNT(ID_activity) WHERE Activity_type_name = "out of scope";
PAN = PAS + PAoS.
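As an illustration, the three KPI groups above can be computed directly from the measures M1–M5 and the activity dates. The sketch below is a minimal Python rendering of formulas (3), (4) and (5); the function and argument names are our own, not part of the proposed model.

```python
from datetime import date

def schedule_kpis(target_value, current_value, start, finish, today):
    """Schedule-progress KPIs of formulas (3); dates are datetime.date."""
    ana = target_value / (finish - start).days      # ANA: required daily pace
    anv = (today - start).days * ana                # ANV: value expected by today
    aca = current_value / (today - start).days      # ACA: actual daily pace
    aavp = aca / ana * 100                          # AAvP: % of the normal pace
    aabp = current_value / target_value * 100       # AAbP: % of activity completed
    return ana, anv, aca, aavp, aabp

def budget_kpis(quantity, unit_cost, unit_budgeted):
    """Budget KPIs of formulas (4): ATC, ATB, ARB."""
    atc = quantity * unit_cost          # budget spent
    atb = quantity * unit_budgeted      # budget allocated
    return atc, atb, atb - atc

def scope_kpis(activity_types):
    """Scope KPIs of formulas (5): PAS, PAoS, PAN."""
    pas = sum(1 for t in activity_types if t == "on scope")
    paos = sum(1 for t in activity_types if t == "out of scope")
    return pas, paos, pas + paos

print(schedule_kpis(100, 40, date(2015, 1, 1), date(2015, 1, 11), date(2015, 1, 6)))
```

In this invented example a 10-day activity with a target of 100 units has delivered 40 units after 5 days, so AAvP is 80% and AAbP is 40%: the activity is behind schedule.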
While PAS is monitored to have a maximal value, PAoS is desired to be as small as possible.
3. Conclusions
Among the various applications of Business Intelligence (BI), project management approaches
can be substantially enriched by BI-based frameworks. Our attention was focused on monitoring
project execution. The current approach is retrieved from our preprint published in the RePEc
database and represents one of the few initiatives that correlate Business Intelligence with
project management. The introduced KPIs contribute to measuring the progress and
performance of the project during its execution and to communicating its status. Monitoring the
schedule progress, the project budget and the scope is possible. The proposal allows further
extensions by introducing additional KPIs, and is reusable in new contexts.
References
[1] D.K. Brohman: The BI Value Chain: Data Driven Decision Support In A Warehouse
Environment, The 33rd Hawaii International Conference on Systems Science, 2000
[2] D. Hatch and M. Lock: Business Intelligence (BI): Performance Management Axis. QI,
Aberdeen Group Research Studies, 2009
[3] M. Muntean, D. Târnăveanu and A. Paul: BI Approach for Business Performance,
Proceedings of the 5th WSEAS Conference on Economy and Management Transformation,
2010
[4] M. Muntean and L. Cabău: Business Intelligence Approach in a Business Performance
Context, http://mpra.ub.uni-muenchen.de/29914/, 2011
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
432
[5] I. A. Jamaludin and Z. Mansor: The Review of Business Intelligence (BI) Success
Determinants in Project Implementation, International Journal of Computer Applications,
vol 33/no. 8, 2011
[6] S. Negash and P. Gray: Business Intelligence, Proceedings of the Americas Conference on
Information Systems, 2003
[7] R. Shelton: Adding a KPI to an SQL Server Analysis Services Cube,
www.simple-talk.com, 2010
[8] M. Muntean: Business Intelligence Approaches, WSEAS Conference on Mathematics and
Computers in Business and Economics, Iaşi, 2012
[8] W. H. Inmon: Building the Data Warehouse,
http://inmoncif.com/inmoncifold/www/library/whiteprsttbuild.pdf, 2000
[9] ***: Overview on Project Management Methodology, http://www.chandleraz.gov
/default.aspx?pageid=511
[10] C. N. Bodea, E. Posdarie and A. R. Lupu: Managementul proiectelor - glosar, Editura
Economica, 2002
[11] C. Brândaş: Sisteme suport de decizie pentru managementul proiectelor, Editura Brumar,
Timisoara, 2007
[12] H. Kerzner: Project Management: A Systems Approach to Planning, Scheduling and
Controlling, John Wiley & Sons, Inc., 2009
[13] S. Berkun, Making Things Happen: Mastering Project Management (Theory in Practice),
O’Reilly Media, Inc., 2008
[14] S. Rengasamy: Project Monitoring & Evaluation,
http://www.slideshare.net/srengasamy/project-monitoring-evaluation-s-presentation,
2008
[15] J.B. Barlow et al.: Overview and Guidance on Agile Development in Large Organizations,
Communications of the Association for Information Systems, vol. 29, 2011
[16] M. Golfarelli, D. Maio, and S. Rizzi: The Dimensional Fact Model: a Conceptual Model
for Data Warehouses, International Journal of Cooperative Information Systems, vol. 7,
no. 2, 1998
[17] M. Nagy: A Framework for SemiAutomated Implementation of Multidimensional Data
Models, Database Systems Journal, vol. 3, no. 2, July 2012
[18] N. Rahman, D. Rutz, and S. Akher: Agile Development in Data Warehousing,
International Journal of Business Intelligence Research, vol. 2, no. 3, July-September 2011
[19] B. H. Wixom and H. J. Watson: An empirical investigation of the factors affecting data
warehousing success, MIS Quarterly, vol. 25, no. 1, March 2001
[20] N. Raden: Modeling the Data Warehouse, Archer Decision Sciences, Inc., 1996
[21] C. Phipps and K. Davis: Automating data warehouse conceptual schema design and
evaluation, DMDW'02, Canada, 2002
[22] S. Mahajan: Building a Data Warehouse Using Oracle OLAP Tools, Oracle Technical
Report, ACTA Journal, Sept. 1997
[23] J. Srivastava and P. Chen: Warehouse Creation - A Potential Roadblock to Data
Warehousing, IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 1,
January/February 1999
[24] E. Malinowski and E. Zimányi: Hierarchies in a multidimensional model: From
conceptual modeling to logical representation, Data & Knowledge Engineering, 2006,
http://code.ulb.ac.be/dbfiles/MalZim2006article.pdf
[25] M. Nagy: Design and Implementation of Data Warehouses for Business Intelligence
applied in Business, Doctoral Thesis, Cluj-Napoca, 2012
ANALYSIS OF THE MACROECONOMIC CAUSAL RELATIONSHIPS
WHICH CAN DETERMINE A COUNTRY TO ENTER A SOVEREIGN
DEBT CRISIS
Alexandra Maria CONSTANTIN
The Bucharest University of Economic Studies
Adina CRISTEA
The Bucharest University of Economic Studies
Abstract. Considering the complexity of the sovereign debt crisis phenomenon, this paper
analyses the macroeconomic causal relationships which can determine a country to enter a
sovereign debt crisis. The studied macroeconomic
causal relationships consist of significant relationships between the macroeconomic variables
of GDP growth rate, unemployment rate, inflation rate and balance of trade volume, which
can increase the probability of a country entering a sovereign debt crisis.
Keywords: sovereign debt crisis, causal relationships, SAS software, Granger test, Levenberg-
Marquardt method, Gauss-Newton method
JEL classification: C55, E03, G01
1. Introduction
The phenomenon of sovereign debt crisis (SDC) is induced by a series of micro- and
macroeconomic causal factors. The strategies implemented on a micro and macroeconomic
level can trigger or increase the probability of a sovereign debt crisis, if these strategies are not
able to simultaneously optimize the covering of budget deficit and the honoring of sovereign
obligations.
The danger that vulnerable economies in the Economic and Monetary Union enter SDC is
caused by a series of systemic factors. The excessive governmental expenditures occurring
before the recent financial crisis, together with the pro-cyclical behavior of national
authorities [1], were the main factors that increased the probability of entering SDC for
countries facing difficulties in honoring their sovereign obligations.
Moreover, most specialists consider that the recent financial crisis represented the catalyst of
SDC in Europe [1]. On the other hand, some specialists believe that the real catalyst is
represented by the behavior of real and nominal interest rates. The probability of a country
entering SDC is influenced by numerous causal factors, consisting of the evolution of specific
macroeconomic variables.
2. Macroeconomic causal relationships which can determine a country to enter SDC
A country's entry into a sovereign debt crisis (SDC) is governed by macroeconomic causal
relationships. Causal relationships are highly complex relationships which trigger a
phenomenon or a series of phenomena. On the one hand, the causality concept is defined as
„the elementary explanatory scheme, all other explanatory schemes (functional, structural, etc.)
being complex forms, composed in turn from several causal relationships and being therefore
reducible to causality” [2]. On the other hand, causality is not part of the scientific explanation
of phenomena, scientific explanations being therefore considered noncausal.
The studied macroeconomic causal relationships consist of significant relationships between
the macroeconomic variables, which can increase the probability of a country entering SDC.
The economic phenomena have nonlinear dependencies and evolutions, due to a chaotic
behavior, which is unpredictable and dynamic. For this reason, studies on the modeling of
economic processes must consider that economic theory only offers qualitative information
about a model's parameters (e.g., the sign, the range of values). However, economic theory
does not specify the form of the function that defines the studied parameters' behavior, and
this function is nonlinear most of the time.
3. Methodology
We further present our methodology, the considered data and variable sets, the software utilized
for data processing and the empirical results of our study. The study used as analysis methods
the Granger causality test [3], nonlinear regression models [4], parameter estimation methods
for the nonlinear regression model, and convergence criteria. The study was conducted on five
European areas, delimited as follows:
Central Europe: Austria, Czech Republic, Switzerland, Germany, Hungary, Poland,
Slovakia, Slovenia, Romania;
Southern Europe: Bulgaria, Cyprus, Croatia, Greece, Italy, Macedonia, Malta,
Montenegro, Portugal, Serbia, Spain;
Northern Europe: Denmark, Finland, Norway, Sweden, Iceland, Estonia, Latvia,
Lithuania;
Eastern Europe: Belarus, Republic of Moldova, Russia, Ukraine;
Western Europe: Belgium, the Netherlands, France, Ireland, Luxemburg, the United
Kingdom.
The variables' values for the five mentioned European areas were calculated as means of the
variables' values for each constituent country. The results presented and interpreted below were
obtained using the Granger causality test and were generated with the SAS 9.3.1 software
[5].
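The study itself processed the data with SAS. Purely to illustrate the underlying test, the sketch below computes the Granger F-statistic by hand, as the comparison of a restricted OLS regression (the target series on its own lags) against an unrestricted one (adding lags of the candidate cause). The series, lag order and coefficients are invented for the example and are not the study's data.

```python
import numpy as np

def granger_f(y, x, lag=1):
    """F-statistic of the Granger test: do lagged values of x improve the
    prediction of y beyond y's own lags? (restricted vs. unrestricted OLS)"""
    n = len(y)
    Y = y[lag:]
    ones = np.ones(n - lag)
    ylags = np.column_stack([y[lag - i - 1 : n - i - 1] for i in range(lag)])
    xlags = np.column_stack([x[lag - i - 1 : n - i - 1] for i in range(lag)])
    Xr = np.column_stack([ones, ylags])          # restricted: y lags only
    Xu = np.column_stack([ones, ylags, xlags])   # unrestricted: plus x lags
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    k = Xu.shape[1]
    return ((rss(Xr) - rss(Xu)) / lag) / (rss(Xu) / (n - lag - k))

# Synthetic example where x clearly "Granger-causes" y.
rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.normal()

print(granger_f(y, x, lag=1))   # very large F: reject "x does not cause y"
```

A large F-statistic relative to the F(lag, n − lag − k) distribution rejects the null hypothesis of no Granger causality, which is the decision rule applied per variable and per European area in the study.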
4. Empirical results and conclusions
We further present the empirical results of analyzing causality at the level of SDC, expressed
through the probability of entering SDC for each European area, depending on the interest rate,
GDP growth rate, unemployment rate and balance of trade volume. In presenting the obtained
results for each European area, the following notations were used:
PEurop_Area – the probability of a country from a studied European area to enter SDC;
UNRT – the unemployment rate at the level of the analyzed European area;
INFLRT – the inflation rate at the level of the analyzed European area;
GDPGRRT – the GDP growth rate at the level of the analyzed European area;
BT – the balance of trade volume at the level of the analyzed European area.
According to the Granger test, the probability of a European country entering SDC is induced
and significantly affected by the GDP growth rate, the inflation rate, the unemployment rate
and the balance of trade volume. These four macroeconomic variables have a significant
influence in all five studied European areas.
We further present the mathematical models which most accurately express the relationship
between the probability of a European country in Central, Southern, Northern, Eastern and
respectively Western Europe to enter SDC and the macroeconomic variables of GDP growth
rate, unemployment rate, inflation rate and balance of trade volume:
P_Centr_Europe = e^(0.00064·GDPGRRT) + 1 / (291380·UNRT·INFLRT·BT) − 0.77103   (1)
P_South_Europe = 1 / e^(0.01002·GDPGRRT) + 1.9·10^(−6)·UNRT·INFLRT·BT − 0.84791   (2)
P_North_Europe = 0.00736·GDPGRRT + 1 / e^(0.000493·UNRT·INFLRT·BT) + 0.2009   (3)
P_East_Europe = −0.0038·GDPGRRT + 0.0001 / e^(UNRT·INFLRT·BT) + 0.04869   (4)
P_West_Europe = 0.00029·e^(GDPGRRT) − 1 / (2.8·10^(−8)·UNRT·BT·INFLRT) + 0.3335   (5)
The models described by (1), (2), (3) and (4) were estimated through the Gauss-Newton
method, while relationship (5) was estimated through the Levenberg-Marquardt method.
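For illustration, the sketch below implements a bare-bones Gauss-Newton iteration (one of the two estimation methods named above) and recovers the parameters of a model shaped like (3) from synthetic data. The data, starting values and tolerances are invented for the example; the paper's actual estimations were performed on the regional macroeconomic series.

```python
import numpy as np

def gauss_newton(residual, theta0, iters=100, tol=1e-10):
    """Minimal Gauss-Newton: repeatedly solve the linearized least-squares
    problem J.step = -r, where r is the residual vector."""
    theta = np.asarray(theta0, dtype=float)
    eps = 1e-7
    for _ in range(iters):
        r = residual(theta)
        J = np.column_stack([
            (residual(theta + eps * np.eye(len(theta))[k]) - r) / eps
            for k in range(len(theta))
        ])  # forward-difference Jacobian of the residuals
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta

# Synthetic data shaped like model (3): P = a*g + exp(-b*w) + c, where g
# stands for GDPGRRT and w for the product UNRT*INFLRT*BT (all fictitious).
rng = np.random.default_rng(1)
g = rng.uniform(0, 5, 100)
w = rng.uniform(0, 500, 100)
true = np.array([0.00736, 0.000493, 0.2009])
P = true[0] * g + np.exp(-true[1] * w) + true[2] + 0.001 * rng.normal(size=100)

res = lambda th: P - (th[0] * g + np.exp(-th[1] * w) + th[2])
est = gauss_newton(res, theta0=[0.01, 0.001, 0.0])
print(est)   # lands close to the true parameter vector
```

Levenberg-Marquardt, used for relationship (5), adds a damping term to the same normal equations, which stabilizes the iteration when the Jacobian is ill-conditioned.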
The models reveal important implications for each of the five European areas:
Central Europe. The model described by (1) emphasizes that the probability of a Central
European country entering SDC is: a) in a relation of exponential dependency with the GDP
growth rate; b) in a relation of hyperbolic dependency with the unemployment rate, inflation
rate and balance of trade volume. The probability that a Central European country enters SDC
increases as the GDP growth rate stagnates, i.e. grows insignificantly.
Southern Europe. According to (2), the probability of a Southern European country entering
SDC: a) is inversely exponentially dependent on the GDP growth rate; b) is in a relation of null
linear dependency with the unemployment rate, inflation rate and balance of trade volume. On
a Southern European level, a GDP growth based on the realized crediting significantly
increases the probability of entering SDC.
Northern Europe. From (3) we observe that the probability of a Northern European country
entering SDC: a) is linearly dependent on the GDP growth rate; b) is inversely exponentially
dependent on the unemployment rate, inflation rate and balance of trade volume.
Eastern Europe. According to (4), the probability of a country in Eastern Europe entering
SDC: a) is linearly dependent on the GDP growth rate; b) is inversely exponentially dependent
on the unemployment rate, inflation rate and balance of trade volume. Due to low crediting,
Eastern Europe does not appear to be in danger of entering SDC.
Western Europe. Relation (5) emphasizes that the probability of a Western European country
entering SDC: a) is exponentially dependent on the GDP growth rate; b) is hyperbolically
dependent on the unemployment rate, inflation rate and balance of trade volume.
As sovereign debts do not play an important role in Eastern macroeconomic strategies, Eastern
Europe does not appear to be in danger of entering SDC. Similarly, Northern and Western
European countries are better protected from accumulating debts and consequently from
entering SDC. Central and Southern European countries should carefully monitor their
indebtedness and estimate its impact on the dependent macroeconomic variables. Encouraging
GDP growth based on productivity rather than crediting is essential in overcoming the risks of SDC.
References
[1] A. Johnston, B. Hancke and S. Pant, “Comparative institutional advantage in the European
sovereign debt crisis”, LSE ‘Europe in Question’ Discussion Paper Series no.66, European
Institute, 2013.
[2] C. Zamfir, Spre o paradigmă a gândirii sociologice, 2nd ed., Iaşi: Polirom, 2005.
[3] C.W.J. Granger, “Investigating causal relations by econometric models and cross-spectral
methods”, Econometrica Journal, vol. 37, no. 3, pp. 424-438, 1969.
[4] S. Stancu, Econometrie. Teorie şi aplicaţii utilizând EViews, Bucharest: ASE Publishing
House, 2011.
[5] SAS® 9.1.3 Intelligence Platform System Administration Guide. Available:
http://support.sas.com/documentation/configuration/bisag.pdf, accessed 13.01.2015
INVESTING OPTIMALLY IN ADVERTISING AND QUALITY TO
MITIGATE PRODUCT-HARM CRISIS
Francesco MORESINO
University of Applied Sciences Western Switzerland
Abstract. Product-harm crises are the nightmare of any firm, as they have a disastrous effect
on its sales and image. This paper proposes a new model to compute the optimal investment
in quality and advertising in order to reduce the probability of occurrence of a possible
product-harm crisis and mitigate its effects. This method uses stochastic control theory and
can be used for both tangible products and services.
Keywords: Optimal investment, product-harm crisis, stochastic optimal control.
JEL classification: C44, C61
1. Introduction
Many examples remind us that no company is immune from a product-harm crisis which can
generate losses of several billion dollars. Take, for example, the most famous cases in the car
industry (Toyota and Ford), the food and beverage industry (Perrier) or catering (Buffalo Grill).
As evidenced by the case of Toyota, firms known to invest heavily in quality can be affected
as well as less virtuous firms.
In the US, these crises have for many years been accompanied by costly class action
lawsuits which can be even more damaging than the decline in sales or the image degradation.
In Europe, the number of countries allowing class actions is increasing and the European
Commission is considering promulgating a related directive. To make things even worse,
consumers now act as “consum-actors” and do not hesitate to organize “boycotts”. The recent
case of Kitkat chocolate bars shows us that a powerful company may be forced to revise its
production and communication under pressure from consumers. In the current climate, no
company can ignore the impact of a product-harm crisis when making strategic decisions.
Unfortunately, few tools are used to quantify the effects of a decision on a possible crisis.
Currently, companies have a variety of tools to assess the impact of investments in normal
circumstances but not the effects on a possible crisis. Today there is a real lack, and almost
everything remains to be built.
Crisis management is an important area of management science and many articles and books
have been written on this subject, see for example Bernstein [1]. Most of these publications
offer precepts to follow when crises occur. Some also offer recommendations to prevent or
mitigate a future crisis, without quantifying the effect of these recommendations. For twenty
years, researchers have proposed studies to quantify the consequences of a crisis. Some have
used an experimental approach and have studied the effects that a crisis can have on consumer
expectations (Dawar and Pillutla [2]) or brand loyalty (Stockmyer [3]). Others have used
empirical approaches to quantify the effects of a crisis on sales (see e.g. Van Heerde et al. [4] or
Cleeren et al. [5]). However, all these studies analyze the crisis ex post and do not offer the
manager a tool to measure the impacts of today's decisions on any future crisis. Other
researchers have studied this problem and have proposed models allowing an ex ante analysis
of crises. Using optimal control theory, they built models that calculate the optimal decision to
make in an anticipatory manner while taking into account the effects of any future crisis. We
can cite, among others, the pioneering work of Rao [6] and more recently those of Raman and
Naik [7]. Unfortunately, all these works assume that the crisis follows a Wiener process. This
means that, for these models, the crisis is not sudden and violent but its outbreak is spread over
time and its effects are the result of a multitude of small underlying crises. These models are
obviously not realistic, but recently Rubel et al. [8] proposed a new model closer to reality.
Adapting the work of Boukas et al. [9] and Haurie and Moresino [10] in optimal control theory,
they developed a model where crises are described by a Poisson process. The model proposed
by Rubel et al. calculates the optimal investments to be made in advertising, while taking into
account the effects of a possible crisis. This approach opens new perspectives and looks very
promising.
This paper proposes an extension of the model developed by Rubel et al. Indeed, we provide a
new model to calculate the optimal investments in quality and advertising taking into account
the effects of a possible product-harm crisis. We apply the numerical method proposed by
Kushner and Dupuis [11]. This method relies on a discretization of time and space and allows
us to reformulate a stochastic control model as a Markov decision process. The solution of this
Markov decision process can be computed by solving a linear program.
This paper is organized as follows. In the second section, we present the model. In the third
section, we explain the numerical method used throughout this paper. The fourth section is
dedicated to a numerical case study and finally, in the last section, further research directions
are proposed.
2. The model
We propose to extend the model proposed by Rubel et al. by allowing investments in quality.
Let j = 0 denote the precrisis regime and j = 1 the postcrisis regime. Denote with S the sales
and Q the quality. The sales dynamics are given by
dS/dt = β_j √Q(t) · u_j · (M(t) − S(t)) − δ_j S(t) − ε_j S(t)(1 − Q(t)),

where M is the market size, u the investment in advertising, β the effectiveness, and δ and ε
decay rates. The quality dynamics are given by

dQ/dt = α_j √v_j · (1 − Q(t)) − μ_j Q(t),

where v denotes the investment in quality, α the effectiveness and μ the decay rate. The crisis
follows a Markovian process with generator

q_ij = ξ_i0 + ξ_i1 · Q,   i ≠ j.

As usual, we denote

q_i = Σ_{j≠i} q_ij ≡ q(Q, i).

When a crisis occurs, the sales fall and the damage rate is denoted by Φ. The profit function
is given by

π(S, Q, u, v) = m1·S − m2·S·Q − m3·u − m4·v,
where m1 is the unit margin, m2 the unit production price for quality, and m3 and m4
investment costs. The objective is to maximize the discounted expected profits

V(S, Q, i) = max_{u,v} E[ ∫_0^∞ e^(−ρt) π(S, Q, u, v) dt ]

with discount rate ρ. Applying standard dynamic programming analysis, we obtain the
following Hamilton-Jacobi-Bellman (HJB) equations that provide sufficient conditions for
optimality (see Fleming and Rishel [12]):

ρV(S, Q, 0) = max_{u,v} { π + ∂V(S, Q, 0)/∂x · f(S, Q, u, v)
                          + [V(S(1 − Φ), Q, 1) − V(S, Q, 0)] · q(Q, 0) },

ρV(S, Q, 1) = max_{u,v} { π + ∂V(S, Q, 1)/∂x · f(S, Q, u, v)
                          + [V(S, Q, 0) − V(S, Q, 1)] · q(Q, 1) }.
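Before discretizing the HJB equations, it can help to see the state dynamics themselves. The sketch below integrates the sales and quality equations with a simple Euler scheme under constant (non-optimal) controls, using the precrisis parameters of Table 1; the initial state, the constant controls and the normalization of Q to [0, 1] are our illustrative choices, not outputs of the model.

```python
import math

# Precrisis parameters taken from Table 1 (fictitious, car-industry inspired).
beta, delta, eps = 0.05, 0.1, 0.01
alpha, mu, M = 0.5, 0.1, 100.0

def step(S, Q, u, v, dt=0.01):
    """One Euler step of the sales and quality dynamics of Section 2."""
    dS = beta * math.sqrt(Q) * u * (M - S) - delta * S - eps * S * (1 - Q)
    dQ = alpha * math.sqrt(v) * (1 - Q) - mu * Q
    return S + dt * dS, Q + dt * dQ

# Simulate with constant controls just to observe the long-run behavior.
S, Q = 50.0, 0.10            # illustrative initial sales and quality
for _ in range(100_000):     # 1000 time units
    S, Q = step(S, Q, u=1.0, v=0.5)
print(round(S, 1), round(Q, 2))   # settles at a steady state
```

With these constants the pair (S, Q) converges to a fixed point, a deterministic analogue of the turnpike behavior reported in Section 4.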
3. Numerical method
The HJB system of equations can be solved by applying the Kushner and Dupuis method [11].
This method approximates the partial derivatives as follows:

∂V(x, i)/∂x_k → [V(x + e_k·h_k, i) − V(x, i)] / h_k   if f_k(x, w) ≥ 0,
∂V(x, i)/∂x_k → [V(x, i) − V(x − e_k·h_k, i)] / h_k   if f_k(x, w) < 0,

where x = (S, Q), w = (u, v), f = ẋ, e_k is the unit vector of the k-th axis and h_k the grid's
mesh. We denote with 𝒳 the grid so obtained. Let

ω = max_{x,i,w} [ q(Q, i) + Σ_k |f_k(x, w)| / h_k ].

Define the interpolation interval

Δ = 1 / (ρ + ω)

and the discount factor

r = ω / (ρ + ω).
The transition probabilities are defined as
Π(x, x ± e_k·h_k, i, i, w) = f_k^±(x, w) / (ω·h_k),
Π(x, x, i, j, w) = q(Q, i) / ω.

Let B(x, i) ≥ 0 with Σ_{x,i} B(x, i) = 1. Substituting this in the HJB equations and using
reflecting boundary conditions leads to the following linear program:

max_Z  Σ_i Σ_x Σ_w π(x, w) · Δ · Z(x, i, w)
s.t.   Σ_w Z(y, j, w) − r · Σ_i Σ_x Σ_w Π(x, y, i, j, w) · Z(x, i, w) = B(y, j),   y ∈ 𝒳, j ∈ I,
       Z(x, i, w) ≥ 0.

Then the optimal policy is given by

D(w|x, i) = Z(x, i, w) / Σ_ŵ Z(x, i, ŵ),

with D(w|x, i) = 1 if w is the optimal decision for state (x, i) and zero otherwise.
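To make the construction concrete, the sketch below assembles the Kushner-Dupuis ingredients (upwind transition probabilities, the rate ω, the interval Δ and the discount factor r) for a one-dimensional toy problem. The drift f, the jump intensity q and the grids are invented stand-ins, not the model of Section 2.

```python
# One-dimensional illustration of the Kushner-Dupuis discretization. The
# choice of omega as the maximal total outflow rate guarantees that the
# upwind transition probabilities below are valid (non-negative, sum <= 1).

def f(x, w):        # toy drift, playing the role of the sales dynamics
    return w * (1.0 - x) - 0.3 * x

def q(x):           # toy regime-switch intensity, playing the role of q(Q, i)
    return 0.5 + 0.05 * x

h = 0.1             # grid mesh h_k
rho = 0.06          # discount rate
xs = [k * h for k in range(11)]    # state grid on [0, 1]
ws = [0.0, 0.5, 1.0]               # control grid

# Uniformization rate: omega = max over states/controls of q + |f|/h.
omega = max(q(x) + abs(f(x, w)) / h for x in xs for w in ws)
Delta = 1.0 / (rho + omega)        # interpolation interval
r = omega / (rho + omega)          # discount factor of the resulting MDP

def transition_probs(x, w):
    """Upwind probabilities to x+h, x-h, the regime switch, and the self-loop."""
    up = max(f(x, w), 0.0) / (omega * h)      # f_k^+ / (omega h)
    down = max(-f(x, w), 0.0) / (omega * h)   # f_k^- / (omega h)
    jump = q(x) / omega
    return up, down, jump, 1.0 - up - down - jump

for x in xs:
    for w in ws:
        up, down, jump, stay = transition_probs(x, w)
        assert stay >= -1e-12      # omega makes every probability valid
print(round(omega, 3), round(Delta, 4), round(r, 4))
```

The linear-program formulation then runs over exactly these Π entries; solving it on the full (S, Q, u, v) grid of Table 2 is what produces the policies of Section 4.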
4. Numerical experiment
We run the model for the set of data given in Table 1 and the grid given in Table 2. This data
set is fictitious and is inspired by the car industry.

Table 1 - Data
Parameter  Precrisis regime  Postcrisis regime
α          0.5               0.5
µ          0.1               0.1
β          0.05              0.05
δ          0.1               0.3
ε          0.01              0.03
ξ0         0.5               2
ξ1         -0.005            0.05
M          100               100

Regime-independent data: ρ = 0.06, m1 = 100, m2 = 0.5, m3 = 20, m4 = 1, Φ = 0.

Table 2 - Grid
Variable  Minimum  Maximum  Mesh
S         0        100      4
Q         0        100      4
u         0        100      10
v         0        100      10
Figure 1 shows the optimal policy for the precrisis regime. Due to lack of space, not all results
are displayed in this paper. Figure 2 shows that the trajectories are attracted by the so-called
turnpike. This figure shows two trajectories in the precrisis regime, one starting from the point
S(0)=50 and Q(0)=10, the other starting from the point S(0)=90 and Q(0)=80. We see distinctly
that both converge to the turnpike S=76.2 and Q=47.3. Similarly the turnpike for the postcrisis
regime can be found and is S=60.1 and Q=69.5. As expected, sales decline in crisis times.
Interestingly, we see that in case of crisis it is optimal to invest more in quality in order to exit
this turmoil phase as soon as possible. Finally, Figure 3 shows the steady state probabilities for
both regimes.
Figure 1- Optimal investment in advertising (left) and quality (right) for the precrisis regime.
Figure 2 - Two optimal trajectories with different initial values (precrisis regime). Left: sales; right: quality.
Figure 3- Steady state probabilities. Left: precrisis regime; right: postcrisis regime.
5. Conclusions
This paper proposes an original method to compute the optimal investment in quality and
advertising in order to reduce the probability of occurrence of a possible product-harm crisis
and mitigate its effects. This method is an extension of the method proposed by Rubel et al.
and uses a stochastic control theory approach.
For our fictitious case inspired by the car industry, the model shows that in case of crisis it is
optimal to make a special effort on quality in order to exit this turmoil phase as soon as
possible.
Finally, we conclude by identifying two directions for further research. First, the model could
be enriched to take into account endogenous competition. For this purpose, we have to
compute a Nash equilibrium instead of an optimum. Second, we can investigate the properties
of the turnpike in order to obtain analytical results. Note that this latter direction could be quite
tricky if not impossible.
Acknowledgment
This research was supported by a grant from RCSO Economie & Management, HES-SO.
References
[1] J. Bernstein, Manager's Guide to Crisis Management, McGraw-Hill, 2011.
[2] N. Dawar and M. M. Pillutla, “Impact of product-harm crises on brand equity: The
moderating role of consumer expectations”, J. of Marketing Research, vol. 37, no.2, pp
215–226, 2000.
[3] J. Stockmyer, “Brands in crisis: Consumer help for deserving victims”, Advances in
Consumer Research, vol. 23, no. 1, pp. 429–435, 1996.
[4] H. J. Van Heerde, K. Helsen and M. G. Dekimpe. “The impact of a product-harm crisis on
marketing effectiveness”, Marketing Science, vol. 26, no. 2, pp. 230–245, 2007
[5] K. Cleeren, H. J. van Heerde and M. G. Dekimpe “Rising from the Ashes: How Brands and
Categories Can Overcome Product-Harm Crisis”, J. of Marketing, vol. 77, pp. 58–77, 2013.
[6] R. C. Rao, “Estimating continuous time advertising-sales models”, Marketing Science, vol.
5, no. 2, pp. 125-142, 1986
[7] K. Raman and P. A. Naik., “Long-term profit impact of integrated marketing
communications program”, Review of Marketing Science, vol. 2, no. 1, 2004.
[8] O. Rubel, P.A. Naik and S. Srinivasan, “Optimal Advertising When Envisioning a Product-
Harm Crisis”, Marketing Science, vol. 30, no. 6, pp. 1048-1065, 2011.
[9] E. K. Boukas, A. Haurie and P. Michel, “An optimal control problem with a random
stopping time” J. Optim. Theory Appl., vol. 64, no. 3, pp. 471–480, 1990.
[10] A. Haurie and F. Moresino, “A Stochastic Control Model of Economic Growth with
Environmental Disaster Prevention”, Automatica, vol. 42, no. 8, pp. 1417-1428, 2006.
[11] H. J. Kushner and P.G. Dupuis, Numerical methods for stochastic control problems in
continuous time, Springer, 1992.
[12] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Control, Springer, 1975.
COUPLING TECHNO-ECONOMIC ENERGY MODELS WITH A
SHARE OF CHOICE
Francesco MORESINO
University of Applied Sciences Western Switzerland
Emmanuel FRAGNIÈRE
University of Applied Sciences Western Switzerland
Roman KANALA
Université de Genève
Adriana REVEIU Bucharest University of Economic Studies
Ion SMEUREANU Bucharest University of Economic Studies
Abstract. Classical energy planning models assume that consumers are rational and this is
obviously not always the case. This paper proposes an original method to take into account the
consumer’s real behavior in an energy model. It couples a classical energy model with a Share
of Choice model.
Keywords: consumer behavior, energy and environmental planning model, share of choice
JEL classification: C44, Q48
1. Introduction
For decades, energy and environmental planning models such as MARKAL [1], TIMES [2] or
more recently OSeMOSYS [3] have helped policy makers to take their long-term decisions.
However, these classical models have a weakness: they suppose that all actors are perfectly
rational. To take this irrationality into account, a first attempt has been made using fictive
technologies [4]. In this paper we go a step further: we propose an original method which
enables us to take into account the consumer's real behavior in an energy model. This method
couples technical methods from operations research with behavioral approaches from social
sciences and is inspired by the method proposed in reference [5]. Roughly speaking, the main
steps of this method are the following. First, the consumer's real behavior is estimated with a survey.
Then the results of the survey are incorporated in a Share of Choice model which describes the
consumers' preferences. Finally, the Share of Choice model is coupled with a classical energy
model. The meta-model so obtained permits us to evaluate different possible energy policies.
2. The case study
The purpose of this case study is to show how a classical energy model can be coupled with a
Share of Choice model in order to take into account the consumers' real behavior. To illustrate
our method, we take a case study where we put a focus on the consumer’s behavior concerning
bulbs. More precisely, we want to study the consumer’s preference between fluorescent and
LED bulbs. For this case study, we suppose that the government can conduct two campaigns,
namely an information campaign and a subvention campaign. The final objective is to choose
the optimal policy. For the energy model, we choose OSeMOSYS data set UTOPIA. UTOPIA
is a relatively small though complete energy model and is implemented with the open source
OSeMOSYS code. In UTOPIA, nothing is modified except that we introduce a second bulb
and the possibility of an information campaign and a subvention campaign. Then, we add to
this energy model a Share of Choice model that describes the consumer’s behavior regarding
bulbs.
3. The survey

To evaluate the behavior of consumers concerning bulbs, we conducted a survey in Romania and interviewed 120 persons. As we need to evaluate the consumers' preference between only two bulbs and do not need to evaluate the separate effect of the different attributes, we employ a full-profile approach. More precisely, the survey relies on two steps, each divided into two questions.
The first step aims at evaluating the respondent's preference before a possible information campaign. For this purpose, two cards are presented to the respondent (see Figure 1). The first card describes the fluorescent bulb and the second one the LED bulb. Both cards contain indications that can be found on the packaging: the price, the lifetime, the energy efficiency and the power of the bulb. First, we ask the respondent which bulb he would be willing to buy. If the respondent chooses the LED bulb, we go to the second step. If the respondent chooses the fluorescent bulb, we then evaluate whether a possible subvention campaign could turn him into an LED bulb buyer. For this purpose we ask a second question: "What is the maximal amount you are ready to pay for the LED bulb?". This provides us with the respondent's Willingness To Pay (WTP) for the LED bulb.
The second step aims at evaluating the respondent's preference after the information campaign. For this purpose, two cards are presented to the respondent (see Figure 2). Compared to the previous cards, these cards contain an additional piece of information, namely the annual cost of utilization. The annual cost includes the depreciation cost and the electricity cost and is based on a standard use of 1000 hours per year. Then, as in the first step, we ask the respondent the same questions to learn his preference and his WTP for the LED bulb.
To sum up, the survey provides us with the following two kinds of information: the WTP when no additional information is given to the respondent and the WTP when he gets additional information. If the respondent chooses the LED bulb even when no subvention is given, the WTP is of course 35 lei, the actual price of the LED bulb.
Figure 1- First set of cards presented to the respondents
Figure 2 - Second set of cards presented to the respondents
4. The meta-model

In order to take into account the consumers' real preferences, it is necessary to translate the survey's results into data for the Share of Choice model. Throughout the paper, we use the following notation:
Respondent 𝑟 ∈ 𝑅,
Year 𝑦 ∈ 𝑌,
Subvention level per LED bulb 𝑠 ≥ 0,
Information campaign level 𝑖 ∈ {0,1}.
Here i = 0 means that no information campaign is conducted, whereas i = 1 means that an information
campaign is conducted. For each respondent r and both campaign levels 𝑖 ∈ {0,1}, the survey provides us with the WTP (denoted in the model by w(i,r)). To describe users' preferences, we use an ordinal utility function. As this utility function can be calibrated as desired, we make the following choice: the utility function of the fluorescent bulb is 0, whatever the level of the campaign. For modelling purposes, we also converted lei into dollars using the exchange rate 0.3. For instance, the price of the LED bulb is 10.5 dollars (35 lei). For each respondent, the utility function of the LED bulb is given by
𝑈(0, 𝑟) ∙ (1 − 𝑖) + 𝑈(1, 𝑟) ∙ 𝑖 + 𝑠 ,   (1)
where the part-worths are given by
𝑈(𝑖, 𝑟) = 1 if 𝑤(𝑖, 𝑟) = 10.5, and 𝑈(𝑖, 𝑟) = 𝑤(𝑖, 𝑟) − 10.5 otherwise.
Note that 10.5 in the function represents the price of the LED bulb in dollars. Given the
campaign level and the subvention level, this utility function is positive if the LED bulb is
preferred to the fluorescent bulb and negative if the fluorescent bulb is preferred.
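This calibration can be checked directly in code. The sketch below is our own illustration, not part of the original model: the function names and the sample WTP values are assumptions, and the part-worth is taken as the gap between the WTP and the 10.5-dollar price, so that the utility is positive exactly when the LED bulb is preferred.

```python
PRICE_LED = 10.5  # price of the LED bulb in dollars (35 lei at the 0.3 exchange rate)

def part_worth(wtp: float) -> float:
    """Part-worth U(i, r): positive only if the respondent would pay the full price."""
    if wtp >= PRICE_LED:
        return 1.0               # willing to pay the full price: LED preferred
    return wtp - PRICE_LED       # shortfall that a subvention s would have to bridge

def led_utility(w0: float, w1: float, i: int, s: float) -> float:
    """Utility of the LED bulb, Eq. (1); the fluorescent bulb's utility is calibrated to 0."""
    return part_worth(w0) * (1 - i) + part_worth(w1) * i + s

# Hypothetical respondent: WTP of 8 dollars without the campaign, 10 dollars with it.
print(led_utility(8.0, 10.0, i=0, s=0.0))   # -2.5: the fluorescent bulb is preferred
print(led_utility(8.0, 10.0, i=1, s=1.0))   # 0.5: campaign plus subvention flip the choice
```

A respondent is then counted as an LED buyer whenever this utility is non-negative.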
To describe the structure of the meta-model, we use the following notations. For data, we have
𝑑(𝑦)  forecasted annual demand for bulbs,
𝑐𝑖  cost of the campaign,
𝑐𝑠  cost of the subsidies.
The first data item exists in the original UTOPIA, whereas the other two are added to the
original model. For the decision variables, we use the following notation:
i information campaign configuration: 1 if campaign and 0 otherwise,
p(r) preference for respondent r: 1 if the respondent buys LED bulbs and 0 otherwise,
l share of LED bulbs,
𝑧2(𝑦) installed capacity of LED bulbs,
𝑧1(𝑦) installed capacity of fluorescent bulbs,
x variables describing the activities in the classical model (𝑑(𝑦), 𝑧1(𝑦), 𝑧2(𝑦), 𝑖 and 𝑠
belong to this vector).
Note that the first four variables do not belong to the original UTOPIA model. The energy model without Share of Choice is written as
min𝑥 𝑐 ∙ 𝑥
s.t.
𝐴 ∙ 𝑥 ≥ 𝑏 .
Roughly speaking, the model tries to minimize costs while respecting the constraints that all demands are satisfied. Then, in the meta-model, we introduce the Share of Choice as follows. For each respondent 𝑟 ∈ 𝑅, the following two inequalities must hold:
𝑈(0, 𝑟) ∙ (1 − 𝑖) + 𝑈(1, 𝑟) ∙ 𝑖 + 𝑠 ≥ (𝑝(𝑟) − 1) ∙ 𝑀 ,
𝑈(0, 𝑟) ∙ (1 − 𝑖) + 𝑈(1, 𝑟) ∙ 𝑖 + 𝑠 ≤ 𝑝(𝑟) ∙ 𝑀.
where i and p(r) are binary variables and M is a big number. In these two inequalities, we recognize the utility function described in Eq. (1). For each respondent r, they ensure that if the utility of the LED bulb is greater than or equal to the utility of the fluorescent bulb, then the respondent is counted as an LED bulb buyer (p(r) = 1). If the utility is smaller than zero, the respondent is counted as a fluorescent bulb buyer (p(r) = 0). The proportion of LED bulbs is then
𝑙 = (∑𝑟∈𝑅 𝑝(𝑟)) / card(𝑅) ,
where card(R) is the number of respondents. Finally, we must include in the meta-model the following constraints, which put capacity and demand in relation:
𝑧1(𝑦) = 𝑑(𝑦) ∙ (1 − 𝑙),
𝑧2(𝑦) = 𝑑(𝑦) ∙ 𝑙 .
These two equations ensure that the installed capacity of both bulbs matches the proportion computed with the Share of Choice. In OSeMOSYS, these two constraints also contain capacity factors and activity-to-capacity factors not presented here. Table 1 gives the correspondence between the notations used throughout this paper and the notations used for the modelling in OSeMOSYS.

Table 1 - Name of data and variables
Data Notation Notation in OSeMOSYS
Respondent 𝑟 ∈ 𝑅 R in RESPONDENT
Year 𝑦 ∈ 𝑌 Y in YEAR
Cost of campaign 𝑐𝑖 COST_CAMPAIGN
Cost of subventions 𝑐𝑠 COST_SUBVENTION
Part-worth U(i,r) U[c,r]
Big number M BIGM
Forecasted annual demand 𝑑(𝑦) SpecifiedAnnualDemand[,”RL”,y]
Variable Notation Notation in OSeMOSYS
Information campaign level 𝑖 ∈ {0,1} campaign
Subvention level 𝑠 ≥ 0 subvention
Preference 𝑝(𝑟) ∈ {0,1} preference
LED bulbs’ share l share
Fluorescent bulbs’ capacity 𝑧1(𝑦) TotalCapacityAnnual[.”RL1”,y]
LED bulbs’ capacity 𝑧2(𝑦) TotalCapacityAnnual[.”RL2”,y]
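For a case this small, the coupling can be verified without a MILP solver: for any fixed policy (i, s), each indicator p(r) follows from the sign of the utility in Eq. (1), and the share l is simply their average. The Python sketch below uses hypothetical survey data and our own function names; the big-M formulation above is what allows the energy model to optimize i, s and p(r) jointly instead.

```python
PRICE = 10.5
# Hypothetical survey: (WTP without campaign, WTP with campaign) for each respondent
wtp = [(10.5, 10.5), (9.0, 10.5), (7.0, 9.5), (5.0, 6.0)]

def utility(w0: float, w1: float, i: int, s: float) -> float:
    """LED utility of Eq. (1): part-worth 1 at full price, WTP minus price otherwise."""
    u = lambda w: 1.0 if w >= PRICE else w - PRICE
    return u(w0) * (1 - i) + u(w1) * i + s

def led_share(i: int, s: float) -> float:
    """Share l = sum_r p(r) / card(R), with p(r) = 1 iff the LED utility is >= 0."""
    p = [1 if utility(w0, w1, i, s) >= 0 else 0 for w0, w1 in wtp]
    return sum(p) / len(p)

print(led_share(i=0, s=0.0))   # 0.25: only the respondent already at full WTP buys LED
print(led_share(i=1, s=1.5))   # 0.75: campaign plus a 1.5-dollar subvention raise the share
```

The installed capacities then follow as z2(y) = d(y)·l and z1(y) = d(y)·(1 − l).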
5. Numerical experiment

Our goal is to show how it is possible to couple an energy model with a Share of Choice model. In order not to modify the original energy model, we decided to use the existing bulb from UTOPIA, even though its price characteristics do not exactly match those of the real bulbs in our survey. Indeed, the ratio between the price of electricity and the price of bulbs is lower in UTOPIA than in Romania. In UTOPIA, the existing bulb, namely RL1, corresponds to the fluorescent bulb. We introduced a new bulb, namely RL2, which corresponds to the LED bulb. Data for both bulbs can be found in Table 2. Except for the residual capacity, all data for RL1 are the same as in the original UTOPIA model.

Table 2 - Bulbs characteristics
Fluorescent bulb LED bulb
OSeMOSYS symbol RL1 RL2
InputActivityRatio from ELC 1 0.7143
OutputActivityRatio to RL 1 1
FixedCost 9.46 4.73
CapitalCost 100 200
ResidualCapacity 0 0
AvailabilityFactor 1 1
OperationalLife 10 20
VariableCost 0 0
CapacityToActivityUnit 1 1
CapacityFactor 1 1
Data for the residential light demand are given in Table 3. These data points are the same as in the original UTOPIA model.

Table 3 - Demand for residential light (PJ/year)
Year 0 10 20
SpecifiedAnnualDemand 5.6 8.4 12.6
The cost of the campaign is evaluated from observations based on study [4] and is set at 20 million dollars for the whole horizon.
Our model tries to minimize the global costs for society. The expenses of the subvention are paid by the government to individuals. From an accountant's point of view, this means that the cost for society is zero: what is paid by one side is received by the other side. In our model, the cost of the subvention campaign should be seen as an acceptance cost. Obviously, it should lie between zero and the expenses spent by the government; indeed, the acceptance cost cannot be larger than the cost itself. For our experiment, we took an acceptance cost equal to 50% of the total subvention.
For this set of data, it is optimal to run an information campaign and give no subvention. The total discounted cost is 27,860 million dollars. Besides these results, it is important to notice that the share of LED bulbs mimics the consumers' behavior estimated through the survey. Note that without a Share of Choice model taking into account the irrationality of consumers, this proportion would have been 100%, since it is economically rational to buy only LED bulbs. Figure 3 shows the proportion of fluorescent and LED bulbs for our scenario.
Figure 3- Bulbs penetration.
6. Conclusion

We have proposed an original method that enables us to take into account the consumer's real behavior in an energy model. This method couples technical methods from operations research with behavioral approaches from social sciences.
In a further development, we aim to externalize the Share of Choice model. This will have two advantages. First, it will reduce the number of binary variables, which in turn will drastically reduce the complexity of the model. Second, it will ease the modelling. Indeed, it will be possible to keep linearity properties for more complex models.
Acknowledgment

This research was supported by the Swiss Enlargement Contribution in the framework of the Romanian-Swiss Research Program (Swiss National Fund of Scientific Research Grant IZERZ0_142217). We would like to thank Andrew Collins, the designer of the light bulb cards presented in Figures 1 and 2.
References

[1] H. Abilock and L. G. Fishbone, “User’s Guide for MARKAL (BNL Version)”, BNL, 27075, 1979.
[2] R. Loulou, U. Remme, A. Kanudia, A. Lehtilä and G. Goldstein, “Documentation for the
TIMES Model”, IEA Energy Technology Systems Analysis Program (ETSAP), 2005.
[3] M. Howells, H. Rogner, N. Strachan, C. Heaps, H. Huntington, S. Kypreos, A. Hughes, S.
Silveira, J. DeCarolis, M. Bazillian and A. Roehrl, “OSeMOSYS: The Open Source Energy
Modeling System. An introduction to its ethos, structure and development”, Energy Policy,
vol. 39, pp. 5850-5870, 2011.
[4] E. Fragnière, R. Kanala, D. Lavigne, F. Moresino and G. Nguene, “Behavioral and
Technological Changes Regarding Lighting Consumptions: A MARKAL Case Study”,
Low Carbon Economy Journal, vol. 1, pp. 8-17, 2010.
[5] E. Fragnière, C. Heitz and F. Moresino, “The Concept of Shadow Price to Monetarize the
Intangible Value of Expertise”, in Proc. The IEEE/INFORMS International Conference on
Service Operations and Logistics, and Informatics, vol. 2, pp. 1736-1741, 2008.
A GENDER ANALYSIS OF THE MINIMUM WAGE EFFECTS UPON
EMPLOYMENT IN ROMANIA
Madalina Ecaterina POPESCU
The Bucharest University of Economic Studies, Romania
The National Scientific Research Institute for Labour and Social Protection, Romania
Larisa STANILA
The National Scientific Research Institute for Labour and Social Protection, Romania
Amalia CRISTESCU
The Bucharest University of Economic Studies, Romania
The National Scientific Research Institute for Labour and Social Protection, Romania
Abstract: Socio-economic approaches to the minimum wage are more often than not contradictory. Supporters of the minimum wage believe that it is a way to increase the wellbeing of individuals, while opponents believe that the only thing the minimum wage does is to reduce the employment rate and/or to increase “working under the table”. In this
context, our paper focuses on the role of the minimum wage on the labour market and its direct
effects upon employment in Romania. We also identify several particularities at gender level
by elaborating an analysis of the minimum wage effects upon male and female employment in
Romania. For that we used quarterly data for the period 2000 Q1 – 2014 Q3 and built two
employment equations differentiated by gender. Our results are consistent with the
international literature and suggest the presence of some gender differences in the way the
minimum wage effects are perceived by the employed population.
Keywords: minimum wage, gender analysis, employment, econometric approach
JEL classification: J21, C22
1. Introduction
In this paper we highlight the role of the minimum wage on the labour market and its direct
effects upon employment in Romania, with several particularities drawn at gender level. As we
know, capitalism is an efficient system of resource and production allocation, but if left
unmanaged it generates and exacerbates economic inequalities. Economic inequality between
people and especially chronic inequality has adverse effects on companies and even the
capitalist system as a whole. For most citizens, solving the inequality issue is related to incomes
which correspond to the salary level; moreover, the ones most affected by inequality are those
who depend on the minimum wage. Many of the economic theories argue that increasing the
minimum wage would reduce the number of jobs, as an increase in wage costs cannot be passed
on to customers, especially in the globalized economy.
However, socio-economic approaches on the minimum wage are more often than not
contradictory. Supporters of the minimum wage believe that this is a way to increase the
wellbeing of individuals, especially of those with low incomes. Opponents, however, believe
that the only thing that the minimum wage does is to reduce the employment rate and/or
increase “working under the table” (informal). Under these circumstances, establishing the
minimum wage has to reconcile social and economic considerations. Social considerations
refer to the standard of living and income inequality that lead to a certain level of pressure to
increase minimum wages (pressure coming mainly from trade unions). Economic
considerations related to productivity, competitiveness, job creation lead to the pressure (often
from employers) to maintain the minimum wage at a low level [1].
Under conditions of prolonged economic crisis, keeping the legal minimum wage at a high level may have the effect of lowering employment for the category of employees paid at the minimum wage level (generally young and unskilled workers). This indicates once again that the state should pay a minimum wage difference for these categories, especially if there is no decrease in the prices on which the minimum wage also depends. This difference, which can be subsidized from the unemployment fund as a way to combat unemployment, has to be linked to the criteria that led to the establishment of the minimum wage [2].
2. Literature review

In an analysis performed on the French economy, Aghion et al. [3] consider that it is necessary to rethink the role and the policy of the minimum wage. The authors argue that a minimum wage that is too high can damage employment, trust and social mobility. Thus, the only criteria that should be taken into account are the fight against poverty, equity and the economic impacts, particularly on competitiveness and employment.
In recent years the minimum wage has become an increasingly popular policy tool in many
emerging economies, especially since the major challenge of applying the minimum wage
regulations in these economies is given by the rather high level of “work under the table”.
Fialova and Myslikova have conducted an analysis on the impact of minimum wages on the
labour market in the Czech Republic using data from the development regions (NUTS 2) in the
period 2000-2009. The result of the analysis indicated that the minimum wage had a significant
impact on increasing regional unemployment and reduced employment opportunities for
workers with low wages, generally unskilled workers [4].
A similar study was also conducted by Majchrowska and Zółkiewski [5] on the economy of
Poland. Using an econometric model based on time series the authors analysed the impact of
the minimum wage on the employment rate in Poland. The results of the econometric study
indicated that the minimum wage had a negative impact on employment in the period 1999-
2010, and the stronger negative effects took place during the period in which the minimum
salary increased substantially (2005-2010), young people being the most affected segment. At
the same time, the analysis indicated that a unique regional minimum wage can be quite
harmful on the employment rate in poorer regions.
In an analysis conducted on the labour market in Hungary, Halpern et al. identified an important
side effect of increasing the minimum wage on employment and on employment opportunities
in small businesses [6].
Although most studies show a negative effect of the minimum wage on the employment rate,
there are studies that state the contrary. Thus, Card and Krueger [7] conducted a study in the USA showing that the link between the minimum wage and the loss of jobs is weak and unfounded. One should keep in mind that they have shown not only that the minimum wage has no effect on employment, but that it can even increase the number of employees. To show this, they compared employment in fast-food restaurants in two adjacent states, New Jersey and Pennsylvania, after the minimum wage increased in New Jersey. Rather than causing an automatic reduction in the number of employees, the increased minimum wage produced mixed responses, which did not exclude positive effects, including for the employer.
In Romania, we can see that the minimum wage is at a very low level, one that does not ensure subsistence; furthermore, local employers do not use the workforce adequately, and the workforce either emigrates or is not qualified enough for the demands of a globally competitive economy [8].
Thus, under current conditions, given the social premises and the need to stimulate domestic
consumption, government policies support raising the minimum wage. However, we have to
point out that any nominal increase in the minimum wage should be made in the context of a
development in labour productivity, because - in the event of a disconnection between these
two elements - we will be faced with negative effects on the medium and long term, including
inflation growth, a decrease in the competitiveness of SMEs, and also a decrease in exports.
There is also the risk of establishing, at the company level, the practice of redistributing a part
of the legitimate income of the highly skilled and productive employees to lower skilled
employees with a lower productivity, which will significantly affect the motivational system
of the company and will discourage performance.
3. Data and methodology
In this study, the analysis focuses on gender analysis of minimum wage impact on employment
in Romania. Using the employed population as a dependent variable, we considered the
following explanatory variables in our model: the national real minimum wage, the gross
domestic product, the average real gross earnings and the net investment index. We used the
consumer price index as chain index in order to obtain real values of the minimum wage and
average earnings and the deflator for the real domestic product.
We targeted the period Q1 2000 - Q3 2014, having therefore 59 observations that were obtained
from the Romanian National Institute of Statistics.
As the econometric method employed in this study, we used multifactorial linear regression, with the following general form¹:
𝑌𝑡 = 𝛼𝑀𝑊𝑡 + 𝛽𝑅𝑡 + 𝑒𝑡
where et is the white noise error and t takes values from 1 to 59, representing the period Q1
2000 – Q3 2014. MW stands for the minimum wage, while R includes the real average monthly gross earnings, the gross domestic product and the net investment index.
The dependent variable, the employed population, was divided so as to allow a gender analysis of the minimum wage effects. Since the macroeconomic effects occur in
general with a short delay, we allowed up to four lags for each variable of our analysis,
including the dependent one.
After considering the general form of the model, we then applied several transformations to
the data, in order to reduce their heterogeneity [9]. Thus, we first decided to use all variables
in natural logs. Then, after applying the Augmented Dickey-Fuller and Phillips-Perron tests in
order to check if the series are stationary, the results indicated the need to first difference the
variables (the tests were carried out in Eviews 7). After taking the logs and the first differences,
the general model turned to the following form:
𝑑𝑙𝑌𝑖𝑡 = 𝛼 + 𝛽1 × 𝑑𝑙𝑚𝑤𝑡 + 𝛽2 × 𝑑𝑙𝑒𝑎𝑟𝑛𝑡 + 𝛽3 × 𝑑𝑙𝑔𝑑𝑝𝑡 + 𝛽4 × 𝑑𝑙𝑖𝑛𝑣𝑡 + 𝑒𝑡
where:
Y - the employed population for each group i, males and females;
mw - the minimum wage;
earn - average gross monthly earnings;
gdp - gross domestic product;
inv - net investments index.
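The two transformations just described (natural logs, then first differences) and a least-squares fit can be sketched in a few lines. The series below are invented placeholders and the single-regressor fit is only illustrative; the actual estimation was carried out in Eviews 7 with all regressors and up to four lags.

```python
import math

def dlog(series):
    """First difference of natural logs: dlx_t = ln(x_t) - ln(x_{t-1})."""
    logs = [math.log(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

def ols_slope(y, x):
    """OLS slope of y_t = alpha + beta * x_t + e_t for a single regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical quarterly levels for the minimum wage and the employed population
mw   = [100, 105, 110, 120, 125, 130]
empl = [500, 505, 510, 520, 524, 529]
beta = ols_slope(dlog(empl), dlog(mw))  # elasticity-style coefficient in growth rates
print(round(beta, 3))
```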
After the estimation of models with statistically significant coefficients, where the selected
regressors explain a large proportion of the variation in the dependent variable, we had to check
1 The model is similar to the one used by Brown, Gilroy and Kohen in 1982.
that the residuals fulfil the following important conditions: they should not be correlated or heteroskedastic, and they should have a normal distribution [10]. If one or more of these conditions are not satisfied, the estimated coefficients will be biased and inconsistent and the equation should be re-specified.
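The normality condition, for instance, is the one behind the Jarque–Bera values reported in Table 1; the statistic is a simple function of the residuals' sample skewness and kurtosis. A minimal sketch with made-up residuals (the diagnostics in this paper were computed in Eviews 7):

```python
def jarque_bera(res):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with sample skewness S and kurtosis K."""
    n = len(res)
    mean = sum(res) / n
    m2 = sum((r - mean) ** 2 for r in res) / n   # second central moment
    m3 = sum((r - mean) ** 3 for r in res) / n   # third central moment
    m4 = sum((r - mean) ** 4 for r in res) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# Symmetric residuals: skewness is zero, so JB is driven by kurtosis alone.
print(round(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]), 3))   # 0.352
```

Under normality, JB is asymptotically chi-squared with 2 degrees of freedom, so small values do not reject normality.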
4. Econometric results
The male employment equation shows a positive influence of the minimum wage, suggesting
that the minimum wage stimulates the males to find a job and to get employed. In this case, if
we were to increase the minimum wage by 10%, the male employed population would increase
by 0.5%.
The net earnings also act as a stimulating factor for the male labour supply, only this time the influence comes not from the present value but from lag 4. Its influence is slightly greater than that of the minimum wage.
As expected, the net investments have a positive influence on the male employed population.
Although the impact is relatively small, its sign indicates a complementarity between labour
force and new technological equipment.
The impact of the real GDP is oscillating, being distributed negatively on lags 1 (-0.058) and 3 (-0.071) and positively on lag 4 (+0.021). The long-run impact is negative (-0.108): if the gross domestic product increased by 10%, the employed male population would decrease by 1.08%.
Table 1 - The main results of the regression models

Variable                                        Males                  Females
c                                               -0.00000683 (0.0035)   0.001080 (0.184)
dlmw(t)                                         0.052 (2.87)*          -
dlmw(t-1)                                       -                      0.242 (4.48)***
dlmw(t-3)                                       -                      -0.246 (-5.18)***
dlnre(t-3)                                      -                      -0.187 (-2.09)**
dlnre(t-4)                                      0.066 (2.49)**         -
dlni                                            0.03 (2.38)**          -
dlgdp(t-1)                                      -0.058 (-2.68)**       -
dlgdp(t-3)                                      -0.071 (-3.47)*        -
dlgdp(t-4)                                      0.021 (2.41)**         0.085 (3.84)***
dlf_empl(t)                                     0.653 (16.79)*         -
Estimation method                               Least Squares          Least Squares
R2-adj                                          0.948                  0.69
Normality test (Jarque-Bera)                    0.857                  0.724
Autocorrelation test (Breusch-Godfrey LM Test)  0.782                  0.51
Heteroskedasticity test (White)                 0.951                  0.33

t-statistics are given in parentheses; *, ** and *** denote the 1%, 5% and 10% significance levels, respectively; a dash indicates that the variable is not included in that equation.
A very interesting aspect is that the female employed population also positively influences the evolution of the male employed population. The coefficient is strong: a 10% increase in the female work force would result in a 6% increase in the male work force. Given these results, we may consider the two segments of the work force to be in a complementary relationship (and not a substitution one).
Regarding the female employment equation, several particularities can be drawn as well. Regarding the minimum wage effect upon female employment, two main aspects are worth mentioning. The first is that the three-quarter-delay shock is negative, which reveals a normal reaction of employers, who are worried to a certain extent about the repercussions of the annual wage increase. More precisely, a 10% increase in the real minimum wage leads to a 2.46% reduction in female employment with less than a year's delay, keeping all the other variables constant. Second, we also notice a smaller but positive effect of the minimum wage expected to occur with only one quarter's delay on the labour market, confirming that there is indeed a delayed adjustment of employers within the frame of their planned wage funds.
The influence of the Gross Domestic Product seems normal, although the impact is delayed by one year, since the growth of output stimulates employment through the need to create new jobs and the emergence of new economic activities. The estimated coefficient indicates that a 1% increase in GDP leads to a 0.085% increase in female employment, keeping all the other explanatory variables constant.
A negative impact upon female employment is, however, given by a variation of the real earnings, indicating that a 10% increase in the real net earnings will lead to a 1.87% decrease in female employment with a three-quarter delay, keeping the other variables constant. This result suggests that female employees are more sensitive to variations in real net earnings and tend, in general, to be more vulnerable in case of a net earnings increase.
5. Conclusions

In this paper we elaborated a gender analysis of the minimum wage effects upon employment in Romania. For that we used quarterly data for the period 2000 Q1 – 2014 Q3 and built two employment equations differentiated by gender. Our results suggest the presence of some gender differences in the way the minimum wage effects are perceived by the employed population.
More precisely, the male employment equation shows a positive influence of the minimum wage, suggesting that the minimum wage stimulates males to find a job and get employed, while the female employment equation shows a rather negative average impact of the minimum wage.
The influence of GDP seems normal, especially in the female case, where the growth of output stimulates employment through the need to create new jobs and the emergence of new economic activities, even if the effect turns out to be delayed by one year. The impact of real GDP is, however, rather oscillating for male employment.
In conclusion, our results are consistent with the international literature and bring added value
to the empirical research of the Central and East European countries.
References
[1] Rutkowski, J., „The minimum wage: curse or cure?”, Human Development Economics
Europe and Central Asia Region, The World Bank, 2003.
[2] Grimshaw, D., ”Minimum wage trends during the crisis: The problem of stronger ’minimum wage contours’ and weaker unions”, Paper for the Annual Progressive
Economy Forum, Theme: Inequality and the crisis, 2014,
http://www.progressiveeconomy.eu/sites/default/files/papers/Damian%20Grimshaw%20-
%20Minimum%20wages%20during%20the%20crisis.pdf
[3] Aghion,P., Cette, G. and Cohen, E., Changer de modele, Editeur : Odile Jacob, 2014
[4] Fialova, K., and Myslikova, M., ”Minimum Wage: Labour Market Consequences in the
Czech Republic”, IES Working Paper 6, Charles University in Prague, 2009.
[5] Majchrowska, A., Zółkiewski, Z.,”The impact of minimum wage on employment in
Poland”, Investigaciones Regionales, 24, 2012, pp. 211- 239.
[6] Halpern, L., Koren, M., Korösi, G., Vincze, J., ”Budgetary effects of the rise in the
minimum wage”, Ministry of Finance - Increases in the Minimum Wage in Hungary,
Working Paper No.16, 2004
[7] Card, D., Krueger, A., ”Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania”, American Economic Review, 84 (4), 1994, pp. 772–793.
[8] Roman, M.D., Popescu, M.E. „Forecast scenarios of youth employment in Romania”, Vol.
of the 13th IE Conference, 2014, pp. 554-559.
[9] Wooldridge, J.M., Introductory econometrics – A modern approach, Second Edition, 2002,
pp. 501-528.
[10] Spircu, L., Ciumara, R., Econometrie, ProUniversitaria Publishing House, 2007.
JUNCTIONS BETWEEN EVALUATION THEORY AND GAME THEORY
Marin MANOLESCU
University of Bucharest, Faculty of Psychology and Educational Sciences
Magdalena TALVAN
University of Bucharest, Faculty of Psychology and Educational Sciences
Abstract. The concept of school evaluation has seen multiple approaches over the years, a fact that demonstrates the complexity of this process, which is so important for the proper conduct of an educational endeavour. In the present article we study school evaluation through the mutual conditioning between teacher and student, examined from the perspective of game theory. We consider it pertinent to analyse the evaluative approach in terms of the influence exerted on it by the subjectively tinged relationship between student and teacher, contextualized by the particularities of each of the two educational actors. In the present paper, we propose a game-theoretic model in which the players are the teacher and the student. The purpose of this model is to demonstrate theoretically how these two educational actors reach an agreement regarding the decisions they will make concerning the evaluation of the pupil by the teacher.

Keywords: game theory, teacher, student, strategic, evaluation methods
JEL classification: C700, I210, I290
1. Introduction
Evaluation theory has experienced multiple conceptual developments over time, which have to be verified in the classroom, in the direct relation between the evaluator and the one being evaluated. From the perspective of the educational actors directly involved in the educational process, we can say that the approaches defining the concept of school evaluation refer to the relationship between the information resulting from interpretation and evaluation by the teacher regarding the evaluated student and the "ideal data and expectations regarding intentions" [3]. In this context, we note that "evaluation is a complex process that allows the issuance of value judgments regarding an activity and/or result, aiming to adopt final decisions intended to improve the work of teachers and students and their common results" [1].
The traditional methods used in evaluative practice, represented by oral examinations,
written tests, practical tests and the docimologic test, are "a set of practices inherited
from different educational traditions which have gradually mixed to become, over time,
sufficiently coherent to impose themselves, and which have resisted over time because of
their track record" [7]. The information obtained in evaluative practice by applying
traditional evaluation methods is successfully complemented by data that can be drawn from
the use of alternative methods: systematic observation of students' behavior and work;
self-evaluation; investigation; the project; the portfolio. We recall in this regard the
finding that "although almost all experimental research leads to the conclusion that
traditional tests (regardless of the form used: written, oral, practical) are subjective,
the patriarchal attitude towards the grade given by the professor continues, acquiring moral and
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
456
religious nuances that still paralyze the desire for fairness and justice of the students
or their parents" [6]. Whatever methods are used, teachers should take care to explain the
evaluation criteria to the students. "These requirements should not differ from those made
during training", bearing in mind the "pedagogical significance" of the evaluation
process [2].
We are talking about training students toward better self-knowledge in order to improve
their learning style. With well-chosen evaluation methods, and supported by the teacher,
the student will learn to relate to the evaluation criteria. In time, the student will
become able to assess his level of knowledge in a particular discipline in relation to his
aspirations for personal development. This will help the student to better target a career
that fits him, as well as to adapt flexibly to the requirements of a rapidly transforming
society.
2. Game theory and the relation between the educational actors
Game theory is a relatively new branch of mathematics, with multiple applications in the
economic area. Its emergence is considered to be marked by the publication in 1944 of The
Theory of Games and Economic Behaviour, authored by John von Neumann and Oskar
Morgenstern. According to them, a game is "any interaction between various agents,
governed by a set of specific rules that establish the possible moves of each participant
and the earnings for each combination of moves" [5]. We also note Solomon Marcus's
observation: "the range of these games is universal; the strategic point of view has
relevance in all areas where actors with different interests meet and where, in one way or
another, the actors depend on each other" [4]. With reference to the defining
characteristics of the game in the sense of von Neumann and Morgenstern, we can identify
the actors, the strategies and the gain functions within an evaluative educational model.
In this context, the actors, or players, involved in the game are the teacher and the
student. Each pursues certain objectives, which determine how he will act so as to obtain
maximum gain. Moreover, the teacher and the student make decisions that depend both on
their own strategic choices and on the behavior exhibited by the other educational actor.
3. A game theory model applied in school evaluation
We present a game theory model adapted to school evaluation, trying to explain through a
mathematical model how the two educational actors, the teacher and the student, react to
each other in order to achieve maximum gain. The model is first presented from a static
perspective, in which the teacher and the student choose their strategies at the same
time. The solution, or game equilibrium, can be refined from a dynamic perspective, in
which the decisions of the two educational actors are taken sequentially in time: first a
decision is taken by one player (either the teacher or the student), then a decision is
adopted by the other player involved in the game.
a. The proposed model approached from a static perspective
In practice, there may be students and teachers who do or do not prefer the same
evaluation methods. The static perspective of the proposed model emphasizes that it is
unlikely that the two educational actors will come into conflict over an evaluation
applied by the teacher at a given time. We will assume that the strategy sets of the two
parties are the same, namely the preference for one of two evaluation approaches: either
evaluation through traditional methods (denoted EMT) or evaluation through alternative
methods (denoted EMA).
The educational gains of the two actors are as follows.
From the student's perspective: we will assume a student who prefers evaluation through
alternative methods to evaluation through traditional methods. In this case, we will
assume that the student obtains ten units of satisfaction when he is evaluated through
alternative methods and eight units of satisfaction when he is evaluated through traditional
methods. For the student, a satisfaction unit can be interpreted as the grade he believes
he can achieve under that type of evaluation. If the student wishes to be examined through
alternative methods, but the teacher decides to examine him through traditional methods,
the student will be dissatisfied and his gain will be zero. Likewise, if the student
wishes to be evaluated through traditional methods, but the teacher decides to examine him
through alternative methods, the student's gain will be zero, because there are major
discrepancies between the wishes of the two players involved.
From the teacher's perspective: we will assume that the teacher prefers the traditional
version of evaluation (ten units of satisfaction for the traditional version and eight
units for the alternative one). From the teacher's perspective, the satisfaction units can
be interpreted as the maximum grade he is willing to give in relation to the proposed
evaluation method. In this case, we will assume that we are dealing with a demanding
teacher who considers that the maximum grade he can give under alternative evaluation
is 9, while under a traditional evaluation the same teacher may give the grade 10.
The solution of the proposed model from the static perspective can be determined by the
best-response algorithm. We denote by S the set of strategy profiles of the two players,
S = S_P × S_E, where S_P = {EMT, EMA} is the teacher's strategy set and S_E = {EMT, EMA}
is the student's strategy set. For each player we define the gain functions shown in
Table 1:

Table 1 - The gain functions of the two players involved in the proposed model

Teacher: U_P: S → R
  U_P(EMT, EMT) = 10, U_P(EMT, EMA) = 0, U_P(EMA, EMT) = 0, U_P(EMA, EMA) = 8
Student: U_E: S → R
  U_E(EMT, EMT) = 8, U_E(EMT, EMA) = 0, U_E(EMA, EMT) = 0, U_E(EMA, EMA) = 10
We note that the set of possible outcomes of the proposed model is {(EMT, EMT),
(EMT, EMA), (EMA, EMT), (EMA, EMA)}. We first determine the best response of the teacher,
taking account of the student's possible strategies; the answers found are then
highlighted in the gain matrix. Thus, if the student chooses the strategy EMT, the
teacher's best answer is to adopt the same strategy, EMT, because 10 > 0. On the other
hand, if the student chooses the strategy EMA, the teacher's best response is also EMA,
because 8 > 0. Next, we determine the best response of the student, taking into account
the teacher's possible strategies. Thus, if the teacher chooses EMT, the student's best
response is the same strategy, EMT, because 8 > 0; likewise, if the teacher chooses EMA,
the student's best response is also EMA, because 10 > 0. The gain matrix is described in
Figure 1.
We thus obtain two equilibria in pure strategies, that is, in strategies played with
certainty by each player: either (EMT, EMT), in which both educational actors agree on
evaluation through traditional methods, or (EMA, EMA), in which the two players agree on
evaluation through alternative methods.
                         Student
                    EMT         EMA
Teacher   EMT      10, 8        0, 0
          EMA       0, 0        8, 10

Figure 1 - The model's solution from the static perspective
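The best-response computation above can be checked with a short script (an illustrative sketch; the payoff matrices are those of Table 1, with rows indexed by the teacher's strategy and columns by the student's):

```python
import numpy as np

# Payoffs from Table 1: rows = teacher's strategy, columns = student's,
# in the order [EMT, EMA].
U_teacher = np.array([[10, 0], [0, 8]])
U_student = np.array([[8, 0], [0, 10]])
strategies = ["EMT", "EMA"]

def pure_nash(A, B):
    """Pure-strategy Nash equilibria: cells where the row strategy is a
    best response to the column strategy and vice versa."""
    equilibria = []
    for r in range(A.shape[0]):
        for c in range(A.shape[1]):
            if A[r, c] == A[:, c].max() and B[r, c] == B[r, :].max():
                equilibria.append((strategies[r], strategies[c]))
    return equilibria

print(pure_nash(U_teacher, U_student))  # [('EMT', 'EMT'), ('EMA', 'EMA')]
```

The script recovers exactly the two coordination equilibria found by the best-response algorithm.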
We find that we cannot say precisely which of the two equilibria will be chosen by the
players. Therefore, we calculate the probabilities with which the teacher and the student
choose one or the other of the evaluation methods. We denote by p1 and p2 the
probabilities with which the student, respectively the teacher, prefer evaluation through
traditional methods; obviously, the probabilities with which each player prefers
evaluation through alternative methods are (1 - p1) for the student and (1 - p2) for the
teacher. The student's expected gain if he adopts the strategy EMT against the teacher's
mix is U_E(EMT; (p2, 1 - p2)) = 8p2, and his expected gain if he adopts EMA is
U_E(EMA; (p2, 1 - p2)) = 10(1 - p2). Equating the two, 8p2 = 10(1 - p2), gives p2 = 5/9;
the analogous indifference condition for the teacher, 10p1 = 8(1 - p1), gives p1 = 4/9. We
thus obtain an equilibrium in mixed strategies, ((p1, 1 - p1), (p2, 1 - p2)), with
p1 = 4/9 and p2 = 5/9.
In conclusion, the solution of the game determined by the best-response algorithm
indicates three possible equilibria: two equilibria in pure strategies, (EMT, EMT) and
(EMA, EMA), and one equilibrium in mixed strategies.
The theoretical solution of this game suggests that the fear of failure (which may be
caused by the choice of divergent strategies) leads the two players to coordinate their
decisions and choose the same type of evaluation. On the other hand, we cannot know for
sure which one it will be (because there are two behavioral equilibria in pure
strategies). The equilibrium in mixed strategies indicates the players' preferences and
their intensity. At this equilibrium, the student believes that the teacher prefers
evaluation through traditional methods with a probability of 5/9 and evaluation through
alternative methods with a probability of 4/9, while the teacher believes that the student
prefers evaluation through traditional methods with a probability of 4/9 and evaluation
through alternative methods with a probability of 5/9. The intensity of the preferences
(which depends on the gains of each player) may in this case indicate that one solution is
more probable than the other. In the case of equal intensity, it is difficult to determine
which of the equilibria is more likely. This uncertainty can be removed if we use the
dynamic version of the game.
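The indifference conditions that define the mixed-strategy equilibrium can be checked numerically (a sketch using exact fractions; each player's mix must make the opponent indifferent between EMT and EMA):

```python
from fractions import Fraction

# Payoffs from Table 1: rows = teacher, columns = student, order [EMT, EMA].
A = [[10, 0], [0, 8]]    # teacher (row player)
B = [[8, 0], [0, 10]]    # student (column player)

def mixed_equilibrium(A, B):
    """Mixed-strategy equilibrium of a 2x2 game. Returns the EMT
    probabilities (teacher's, student's) as exact fractions."""
    # The teacher's mix (q, 1-q) makes the student indifferent:
    # B[0][0]*q + B[1][0]*(1-q) == B[0][1]*q + B[1][1]*(1-q)
    q = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # The student's mix (p, 1-p) makes the teacher indifferent:
    # A[0][0]*p + A[0][1]*(1-p) == A[1][0]*p + A[1][1]*(1-p)
    p = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return q, p

teacher_emt, student_emt = mixed_equilibrium(A, B)
print(teacher_emt, student_emt)  # 5/9 4/9
```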
b. The dynamic approach of the proposed model
If we turn the static game described above into a dynamic game, in which one player first
announces his preference for a particular type of examination and then the other decides
what is best for him (no matter who starts the game, the teacher or the student), the
equilibrium changes. The game will proceed as follows: first, the teacher announces what
type of evaluation he prefers; then the student decides on his preferred type of
evaluation and announces his preference to the teacher. The extensive form of the game,
when the teacher is the one who decides first, is described in Figure 2.
Figure 2 - Description of the model from the dynamic perspective: the case when the
teacher announces what type of evaluation he will adopt
For the dynamic game in which the teacher decides first, the solution is obtained by the
following reasoning. If the teacher announces an evaluation through traditional methods,
the student compares the gain from accepting such an evaluation (8 units, the grade he
expects to obtain under the evaluation through traditional methods) with the gain he would
obtain by rejecting it (zero units), and he decides to accept the teacher's proposal. If
the teacher announces an evaluation through alternative methods, the student's gain is
maximal if he accepts the proposal (a gain of 10 units, the grade he can obtain under the
evaluation through alternative methods). In either case, therefore, the student's decision
will be to accept any proposal of the teacher. Knowing this, the teacher sees that if he
proposes an evaluation through traditional methods, he has a gain of 10 units (he can give
the grade 10), while if he proposes an evaluation through alternative methods, he will
have a gain of only 8 units (he can give only the grade 8). Therefore he will propose the
evaluation through traditional methods, which brings him the higher gain. We obtain a
unique solution corresponding to the strategy combination (EMT, EMT).
For the dynamic game in which the student is the one who decides first (that is, the
teacher first asks the student what type of evaluation he prefers and only then takes the
decision about the type of evaluation to adopt), the game equilibrium is no longer the
same, and the reasoning is the following: if the student announces that he prefers an
evaluation through traditional methods, the teacher compares the gain he would have by
accepting the student's proposal (10 units, the maximum grade he can give under the
evaluation through traditional methods) with the gain he would obtain by rejecting it
(zero units, the discrepancy between the visions of the two players), and he decides to
accept the student's proposal. If the student announces that he prefers an evaluation
through alternative methods, the teacher's gain is 8 units if he accepts the proposal and
zero units if he refuses it. Thus, the teacher's decision will be to accept any proposal
of the student. Knowing this, the student sees that if he proposes an evaluation through
traditional methods he would have a gain of 8 units, while if he proposes an evaluation
through alternative methods he would have a gain of 10 units. Consequently, he will
propose the evaluation through alternative methods, which brings the higher gain, a
proposal that is accepted by the teacher. We obtain in this way a unique solution, given
by the strategy combination (EMA, EMA).
From the dynamic perspective we obtain a unique equilibrium when one of the two actors
announces a certain type of evaluation and the other takes this preference into account.
Therefore, it is recommended that at the beginning of the semester there should be
communication between the two regarding the evaluation methods (the dynamic perspective of
the proposed model).
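The backward-induction reasoning of this section can be sketched in a few lines (an illustrative script; the payoff tables are those of Table 1, and since they are symmetric the same tables serve for either move order):

```python
# Payoffs from Table 1, indexed as U[first move][second move]. Both
# tables are symmetric, so they can be reused when the move order is
# reversed.
UP = {"EMT": {"EMT": 10, "EMA": 0}, "EMA": {"EMT": 0, "EMA": 8}}  # teacher
UE = {"EMT": {"EMT": 8, "EMA": 0}, "EMA": {"EMT": 0, "EMA": 10}}  # student

def backward_induction(leader_payoff, follower_payoff, strategies):
    """Subgame-perfect outcome: the follower best-responds to each
    announcement, and the leader picks the announcement that yields
    him the highest resulting payoff."""
    best = None
    for move in strategies:
        reply = max(strategies, key=lambda s: follower_payoff[move][s])
        if best is None or leader_payoff[move][reply] > leader_payoff[best[0]][best[1]]:
            best = (move, reply)
    return best

print(backward_induction(UP, UE, ["EMT", "EMA"]))  # teacher first: ('EMT', 'EMT')
print(backward_induction(UE, UP, ["EMT", "EMA"]))  # student first: ('EMA', 'EMA')
```

The two runs reproduce the first-mover advantage discussed above: the teacher moving first yields (EMT, EMT), while the student moving first yields (EMA, EMA).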
4. Conclusions
The interrelations between teacher and student that occur during the school evaluative
approach can be described using game theory models. This paper proposed a game in which
the two educational actors adopt behavioral strategies in order to achieve maximum gain
with respect to their rational expectations. The strategy sets of the two players are the
same, but the preferences of the teacher and of the student regarding the evaluation
methods are different.
The solution of this model indicates that the transition from the static to the dynamic
game, on the one hand, eliminates the uncertainty that may arise when there are multiple
equilibria. On the other hand, we find that the player who takes the first decision has an
advantage over the one who chooses later, because the second player will have to "adapt"
to the desires and expectations of the first player.
The proposed model emphasizes the importance of communication between teacher and student
in adopting evaluation methods appropriate to the individual peculiarities of both the
teacher and the student.
References
[1] V. Dumitru, Evaluarea în procesul instructiv-educativ. București: Editura Didactică și
Pedagogică, 2005, pp. 14-15.
[2] I. Jinga, A. Petrescu, M. Gavotă and V. Ștefănescu, Evaluarea performanțelor școlare.
București: Editura Afeliu, 1996, p. 50.
[3] M. Manolescu, Activitatea evaluativă între cogniție și metacogniție. București: Meteor
Press, 2003, p. 150.
[4] S. Marcus, Paradigme universale III Jocul. București: Paralela 45, 2007, p. 6.
[5] M. Roman, Jocuri şi negocieri. Bucureşti: AISTEDA, 2000, p. 5.
[6] C. Strungă, Evaluarea școlară. Timișoara: Editura de Vest, 1999, p. 115.
[7] J. Vogler, Evaluarea în învăţământul preuniversitar. Iași: Polirom, 2000, p. 27.
ROMANIAN EDUCATION SYSTEM – EFFICIENCY STATE
Madalina Ioana STOICA
Bucharest University of Economic Studies
Crisan ALBU
Bucharest University of Economic Studies
Abstract. In recent years, the higher education system in Romania has undergone many
changes, and admission to the various universities has been modified from year to year.
Reputation and academic research have been indicators for prospective students when making
the difficult decision of choosing a university. Their evaluation has been carried out by
different institutions, and usually simple ratios have been used to construct rankings.
This study provides an alternative in which systems of multiple input and output variables
are used to evaluate efficiency. The nonparametric technique of data envelopment analysis
is employed to provide a comprehensive picture of the performance of universities.
Sensitivity to the choice of variables is assessed through different efficiency models.
The results show potential areas of improvement in institutional efficiency and could help
the government in deciding the allocation of budget funds.
Keywords: efficiency, nonparametric techniques, DEA, universities
JEL classification: C8, D7, I21
1. Introduction
The evaluation of the efficiency of higher education institutions is a major concern under
the current tight budget funding scheme. The prospects and attractiveness of a university
reside in its capacity to use the available resources efficiently in order to maximize its
output in terms of results. Because we are dealing with a framework of multiple variables,
we need to consider methods of evaluating the system that permit the inclusion of more
than one input and more than one output.
One major evaluation process was initiated by the Ministry of Education and Research in
2011, when a platform was implemented on which each university had to fill in data
regarding its activity in a standardized form. Based on the collected data, the
universities were then classified, using indicators of the simple ratio type or primary
data, into three categories: universities centered on education; universities of education
and scientific research or universities of education and artistic creation; and
universities of advanced research and education.
In 2013, an international assessment institution, the Association of European
Universities, classified the Romanian universities for the first time according to
European standards. Twelve universities were ranked in the first category, that of
universities of advanced research and education. All private universities were classified
into the last category, so the quality of research in those institutions was questioned by
the authorities, which considered lowering the number of funded places at the different
levels (bachelor, master and PhD). On the other hand, the high-profile research
institutions were allocated approximately 20% more places for research positions,
especially doctoral ones.
This paper aims to evaluate institutional efficiency by employing the nonparametric
technique of data envelopment analysis. The possibility of including more than one input
and one output leads to a comprehensive picture of the performance of the universities and
of the areas where changes need to be made in order to increase their efficiency.
The next section provides a short overview of the technique employed, followed by a
section with the data description. The models used to evaluate the universities, as well
as the results and the rankings created, are presented in the Efficiency models and
results section. Conclusions come at the end.
2. Methodology
The technique employed is data envelopment analysis, which has developed since the paper
of [1], which introduced the concept of efficiency in relation to the capacity of firms to
transform inputs into outputs. The author decomposes efficiency into two distinct parts:
technical efficiency and allocative efficiency. The former reflects the firm's capacity to
obtain the maximum amount of output given a fixed quantity of inputs. The latter
introduces the prices of the variables and assumes the optimal combination of inputs given
their prices and the technological process. In 1978 the problem was formulated as a linear
programming model by [2] under the restriction of constant returns to scale, an assumption
later relaxed to variable returns to scale by [3].
The technique assumes the construction of an efficiency frontier on which the most
efficient firms, or decision making units (DMUs), are represented with an efficiency score
of 1, while all the others are found below the frontier with an efficiency score of less
than 1.
In this paper we deal only with technical efficiency, since not all university inputs can
be associated with a price. The presentation and notation below follow [4]. Technical
efficiency can be expressed as a weighted sum of outputs divided by a weighted sum of
inputs. Therefore, for university i, technical efficiency can be expressed as follows:

TE_i = (Σ_p u_pi · q_pi) / (Σ_s v_si · x_si)    (1)

where, for the i-th university, the input and output variables are described by the column
vectors x_i and q_i respectively, with corresponding weight vectors v and u. In the model
above we assume a system with N inputs and M outputs. For the purpose of this analysis we
use the variable returns to scale model, as presented in [4]; the constant returns to
scale assumption could be adopted only if the universities operated at an optimal scale.
By solving a mathematical program, DEA finds the set of weights in the range (0, 1) that
is most favorable for each university [5]. The optimal weights are computed using the
variable returns to scale (VRS) output-oriented model, as presented in [5]:

max_{φ,λ} φ
subject to:  -φ·q_i + Qλ ≥ 0,  x_i - Xλ ≥ 0,  I1′λ = 1,  λ ≥ 0,    (2)

where φ is a scalar satisfying 1 ≤ φ < ∞, λ is an I×1 vector of constants, and φ - 1 is
the proportional increase in outputs that could be achieved by university i with the input
quantities held fixed. The value 1/φ is the technical efficiency score reported by the
software program DEAP and lies between 0 and 1.
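Model (2) can be solved with any linear programming routine. The sketch below uses scipy.optimize.linprog on a small made-up data set (an illustration of the envelopment form only, not the DEAP implementation; the function name and toy data are our own):

```python
import numpy as np
from scipy.optimize import linprog

def vrs_output_efficiency(X, Q, i):
    """Technical efficiency 1/phi of DMU i under the output-oriented
    VRS model (2). X: inputs (n_inputs x n_dmus), Q: outputs
    (n_outputs x n_dmus)."""
    n_in, n = X.shape
    n_out = Q.shape[0]
    # Decision variables: [phi, lambda_1, ..., lambda_n]; maximize phi,
    # i.e. minimize -phi.
    c = np.concatenate(([-1.0], np.zeros(n)))
    # Input constraints: X @ lam <= x_i.
    A_in = np.hstack([np.zeros((n_in, 1)), X])
    # Output constraints: phi * q_i - Q @ lam <= 0.
    A_out = np.hstack([Q[:, [i]], -Q])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([X[:, i], np.zeros(n_out)])
    # VRS convexity constraint: sum(lam) == 1.
    A_eq = np.hstack([[[0.0]], np.ones((1, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(1, None)] + [(0, None)] * n)
    return 1.0 / res.x[0]

# Three toy DMUs with one input and one output: the third produces the
# same output as the first but uses twice its input.
X = np.array([[1.0, 2.0, 2.0]])
Q = np.array([[1.0, 2.0, 1.0]])
print([round(vrs_output_efficiency(X, Q, i), 3) for i in range(3)])  # [1.0, 1.0, 0.5]
```

For the third DMU, doubling its output (phi = 2) would bring it onto the frontier spanned by the other two units, hence the score of 0.5.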
One feature of DEA is that it provides appropriate benchmarks for the DMUs, so that
inefficient observations can be compared with efficient ones (their peers).
The computations associated with DEA can be made in the software program SAS, but also
with a number of other specialized programs built for DEA calculations, such as OnFront,
iDEAs, Warwick DEA, FEAR or DEAP. In this analysis we used DEAP 2.1, built by Tim Coelli.
3. Data description
The data used for this study were collected from the survey assessing the universities in
Romania carried out by the Ministry of Education, Research, Youth and Sports in 2011 [6].
The system of variables is based on the generally accepted categories representing the
education process ([7], [8]): for input, human resources, material resources and financial
funds; for output, research and teaching. We used a system of 8 variables to account for
inputs and outputs; a short description of them is provided below:

Table 1 - Variable description

INPUT
  CDID    Cumulated number of full professors, assistant researchers, researchers and
          assistant professors
  TOTINM  Enrolled students (undergraduates and postgraduates)
  FOND    Total amount of grants (national + foreign)
  CARTI   Number of books in the university library
  DOT     Classroom equipment for teaching and research

OUTPUT
  PUB     Cumulated number of publications indexed ISI (Institute for Scientific
          Information) and IDB (international databases)
  PUBISI  Number of publications in ISI journals with a computed impact factor
  TOTABS  Cumulated number of graduated students
The original database consisted of 61 universities, but we found an outlier in the data
when plotting CDID against TOTABS and decided to exclude it from the analysis. Some
descriptive statistics are presented in the table below:

Table 2 - Descriptive statistics

Particularly high values are reported for the standard deviation of the variables FOND and
CARTI, which reflects the differences in university size.
4. Efficiency models and results
In order to estimate universities efficiency, we constructed 6 different efficiency models where
we analyzed the results sensitivity in relation to variable change. The models are presented
below: Table 3 - Efficiency models
Model
Variable
1
(2+1) 2
(1+1) 3
(3+2) 4
(3+2) 5
(3+1) 6
(3+1)
INPUT
CDID * * * * * *
FOND * *
CARTI *
DOT * * * *
TOTINM * *
OUTPUT
PUB * * * *
PUBISI *
TOTABS * * *
The aim of the first model was to test whether large universities are more efficient in
research activity than smaller ones. The model includes a human resources input,
represented by the academic staff, and a material resources input, represented by the
library books, in order to capture university size. The research output is measured by the
number of publications. The conclusion is that the size of a university does not influence
the number of publications in journals, and we obtained efficient universities of all
sizes: very large, large, medium and small. The input slacks have particularly high values
for the books variable, suggesting that universities could reduce their library
collections without affecting their research results, an indirect effect of the intense
and increasing use of the internet.
In order to study the universities' efficiency with respect to highly rated journals
(those with a computed impact factor), we used model 2. The results show that the big
universities generally concentrate their resources on higher quality journals. Most of the
high-profile universities in Romania, such as the Alexandru Ioan Cuza University, the
Carol Davila University of Medicine and Pharmacy and the University Politehnica of
Bucharest, are efficient in this model.
Model 3 introduces an input variable not used in the previous models, referring to human
capital: the total number of students enrolled, an indicator of a university's
attractiveness. We wanted to estimate an efficiency score reflecting a university's
capacity to use the available human and physical capital in order to obtain a high number
of graduates and consistent research work. Sixteen efficient universities were found, the
top being quite similar to that of the previous two models. Compared to those models, we
changed two input variables and added a new output. These results confirm the assumption
that a ranking does not change significantly when input variables are partially replaced
with similar variables having a similar economic meaning (CARTI <-> DOT), or even when a
new economic aspect, TOTINM, is introduced into the model.
In order to see the effect that orientation has on the rankings built, we ran model 3 in
an input orientation (model 4). The assumption that the number of academic staff or
enrolled students could be reduced is related to the efficiency of the selection process.
Although the first 20 positions are almost identical (with one exception), the top has
suffered some modifications. It seems that, in
general, technical universities are less efficient in the input orientation, meaning that
these universities concentrate not on minimizing their effort or resources, but on
obtaining maximum results. The same interpretation is valid for some medical institutions:
the Gr. T. Popa University of Medicine and Pharmacy and the University of Medicine and
Pharmacy of Târgu Mureş.
The next two models were built in order to see whether rankings very different from those
of the models presented so far would be obtained. They account for another economic
aspect: financial resources. Although we have included this new dimension, the results are
not very different from the previous rankings.
These two models aim to reflect the efficiency of research and of teaching, respectively,
taking into account one variable for each major category of inputs considered in the
analysis: the number of academic staff for human capital, the amount of national and
foreign grants for financial resources, and the available equipment for material
resources.
In order to study the trade-off between research and teaching, also discussed and analyzed
in [6], we replaced the publications variable with the number of graduates in model 6. The
trade-off refers to the limited amount of time professors have and to the choice that
needs to be made between allocating more time to teaching or to research. Even though the
change of variables was very important in this case, nine of the fourteen universities
efficient in model 5 remain efficient in model 6. The analysis reveals some examples of
trade-offs: the Ecological University of Bucharest concentrates more on obtaining a high
number of graduates than on research activity. An opposite example of trade-off is given
by the University of Agricultural Sciences and Veterinary Medicine of Cluj-Napoca,
efficient in model 5 but ranking only 47th in model 6. It seems that, in general, a
concentration on research activity characterizes most universities with a medical profile.
5. Conclusion
The paper provides an efficiency analysis of the Romanian universities, constructing a
sensitivity study with respect to the choice of variables and approaching different
hypotheses. Among these, we tested whether the size of a university influences its
publication work or its teaching output. Moreover, some models were built in order to
strengthen or confirm results of previous studies using other data and variable
specifications, as in the case of the trade-off between research and teaching.
The various models imply multiple variable specifications according to the purpose of the
analysis, and the technique used was data envelopment analysis. Among the advantages of
using this technique, we note the flexibility of choosing a series of input variables and
a set of output indicators without any restriction on the functional relation between
them. The technique provides a comprehensive picture of a heterogeneous system in which
not all indicators need to be expressed quantitatively.
Our contribution to the existing literature is the sensitivity analysis conducted using
six models. The results prove that slight modifications of the input or output variables
do not significantly affect the rankings built; however, replacing the variables with
others that account for different or additional economic aspects, or that differ to a high
degree from the original variables, may in some cases lead to inconsistent rankings.
In terms of journal publication choice, it seems that universities concentrate their
resources on higher quality journals, and most of the high-profile institutions
considerably improved their position in the ISI publications model.
The use of financial funds is more efficient in the case of small universities than in the
case of large institutions. Having access to limited funding, their management seems to
adjust its needs in order to maintain the output at satisfactory levels.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
466
Further analysis can be conducted taking into account the type of university (generalist or
specialized) and the field of study (ranking medical, technical, and economic universities
separately). Also, because the present study does not take into account the quality of
teaching, as no data were available on this matter, further investigations of this issue are
necessary.
Acknowledgement
This work was supported by the project “Excellence academic routes in doctoral and
postdoctoral research READ”, co-funded from the European Social Fund through the Development
of Human Resources Operational Programme 2007-2013, contract no. POSDRU/159/1.5/S/137926.
References
[1] M.J. Farrell, “The measurement of productive efficiency,” Journal of Royal Statistical
Society Series A, vol. 120, no. 3, pp. 253-281, 1957.
[2] A. Charnes, W.W. Cooper and E. Rhodes, “Measuring the efficiency of decision making
units,” European Journal of Operational Research, vol. 2, pp. 429-444, 1978.
[3] R. Banker, A. Charnes and W.W. Cooper, “Some models for estimating technical and scale
inefficiencies in data envelopment analysis,” Management Science, vol. 30, pp. 1078-1092,
1984.
[4] T.J. Coelli, D.S.P. Rao and G.E. Battese, An Introduction to Efficiency and Productivity
Analysis, Kluwer Academic Publishers, London, 1998.
[5] M.L. McMillan and D. Datta, “The relative efficiencies of Canadian universities: A DEA
perspective,” Canadian Public Policy - Analyse de politiques, vol. XXIV, no. 4, 1998.
[6] List of participating universities to the data collection process and the information
introduced in the process: 25 August 2011, 08:00:
http://chestionar.uefiscdi.ro/public5/index.php?page=punivlist [September, 2014].
[7] A. Bonaccorsi, C. Daraio and L. Simar, “Advanced indicators of productivity of
universities. An application of Robust Nonparametric Methods to Italian data,”
Scientometrics, vol. 66, no. 2, pp. 389-410, 2006.
[8] M. Stoica, “A Survey of Efficiency Studies with Application on Universities” in Proc. of
the IE 2014 International Conference, 2014.
GST FOR COMPANY’S REPUTATION ANALYSIS IN ONLINE
ENVIRONMENTS
Camelia DELCEA
Bucharest University of Economic Studies [email protected]
Abstract. In today’s economy, leading businesses have understood the role of “social” in
their everyday activity. Online social networks (OSN) and social media have merged into and
become an essential part of every firm’s concerns. Firms have become more and more involved
in online activities, trying to engage new customers every day. Besides this, the users of
OSN are better informed about a company’s products/services and seek information in both
online and offline communities. Even more, their online audiences’ and friends’ opinions
have become important in making a decision. Using a grey relational analysis, this paper
tries to shape the relation between firms’ activity in online environments, their customers’
network characteristics, their perceived image and the company’s reputation.
Keywords: grey knowledge, grey relational analysis, GST, OSN, reputation.
JEL classification: C02, G32.
1. Introduction
A short and easy to understand definition of OSN is that they “form online communities among
people with common interests, activities, backgrounds, and/or friendships. Most OSN are
web-based and allow users to upload profiles (text, images, and videos) and interact with
others in numerous ways” [1]. As structure, OSN are usually perceived as a set of nodes,
represented by the users, and a set of directed (e.g. the “following” activity on Facebook)
or undirected (e.g. friendship relationships) edges connecting the various pairs of nodes [2].
All these pairs of nodes, however, would have no specific value were it not for the
communication among the different users, the continuous process of exchanging information
and knowledge. Recent works in this field show that the whole value of these OSN comes from
users’ interaction.
Among these users, the “hubs” can easily be identified: users that have a very large number
of social links with the other members of the social network. This particular type of user
can be very useful for marketing campaigns, having a great potential for communicating and
interacting with others. Viral marketing campaigns also target the well-connected
users [3, 4].
In OSN more than in other types of communities and networks, grey systems theory (GST) finds
its applicability due to the nature of the relationships between its main actors, mostly due
to the fact that the amount of knowledge that can be extracted is limited. Even more, by
adding the human component, through the consumers’ demands and needs, strictly related to
preferences, self-awareness, self-conscience, free will, etc., the study of the knowledge
that can be extracted through OSN becomes more complicated.
Considering everyday activities, it can easily be seen that consumers’ opinion regarding a
firm’s reputation depends on a wide array of factors. With the rapid development of OSN and
their users’ engagement, these elements should be considered in order to extract the new
information that can be used in understanding the OSN’s complexity and their influence on
firm’s reputation. For this purpose, the present paper tries to shape the relation among
some variables that can be extracted from the online environment and their effects on a
firm’s reputation, using elements taken from grey systems theory, as will be described in
the following.
2. Grey relational analysis
Grey systems theory is one of the most recently developed artificial intelligence theories
and has been widely applied to a great range of domains, such as decision making [5], credit
risk assessment [6], bankruptcy forecasting [7], innovation competency evaluation [8],
forecasting [9], evaluating complex products’ quality [10], etc.
Among the methods offered by this theory, grey relational analysis (GRA) is one of the best
known and most widely used, as it can generate a statistical law without a lot of data and
does not require a fixed probability distribution or a linear correlation between the
variables. The values obtained through GRA lie between 0 and 1 and indicate a stronger
relationship among the considered values as the calculated grade approaches 1. The steps
involved in conducting such an analysis [7, 11] are presented below:
Calculating the behavior sequence of the system characteristic and the sequences of the
related factors:
$X_i = (x_i(1), x_i(2), \ldots, x_i(n)), \quad i = 0, 1, 2, \ldots, m$ (1)
Determining the initial values of each sequence:
$X_i' = X_i / x_i(1) = (x_i'(1), x_i'(2), \ldots, x_i'(n))$ (2)
Getting the difference sequences:
$\Delta_i(k) = |x_0'(k) - x_i'(k)|, \quad \Delta_i = (\Delta_i(1), \Delta_i(2), \ldots, \Delta_i(n))$ (3)
Establishing the largest difference and the smallest difference of the two poles:
$M = \max_i \max_k \Delta_i(k), \quad m = \min_i \min_k \Delta_i(k)$ (4)
Determining the grey relational coefficient:
$r_{0i}(k) = \dfrac{m + \xi M}{\Delta_i(k) + \xi M}, \quad \xi \in (0, 1), \quad k = 1, 2, \ldots, n$ (5)
Getting the grey relational grade:
$r_{0i} = \dfrac{1}{n} \sum_{k=1}^{n} r_{0i}(k)$ (6)
3. Case study
For conducting the grey relational analysis, a questionnaire was administered to online
social network users, with 258 persons answering all the questions. Based on the answers, a
confirmatory factor analysis was performed in order to assess the construct validity and
reliability of the questionnaire. After this, the selected factors were passed through the
grey relational analysis in order to determine the strength of the linkage among them.
a. Questionnaire and data
The case study was conducted on 258 respondents, aged between 18 and 41 years. Among them,
98.40% are members of online social networks, while the rest have never been part of this
kind of network, which reduced our sample to 254 valid respondents.
Among them, 90.94% have seen a commercial on social media in the last year or have seen and
participated in a discussion related to a company’s product/service.
Along with the questions regarding personal characteristics, the respondents were asked to
answer a series of questions, evaluated on a Likert scale taking values between 1 and 5,
which have been grouped into four constructs:
Firms’ OSN Activity: the respondents were asked to appreciate the firms’ active
involvement in online environments (X1), the online customers’ support centre
activity (X2) and the firms’ active advertising in online environments (X3);
Users’ Activity and Personal Network Characteristics: also measured through three
variables, namely the connection time of each user on OSN platforms (X4), the social
influence of each user – measured through the number of friends that are using a
firm’s product/service and the total number of friends in OSN (X5) and each user’s
active discussion participation in OSN (X6);
Perceived Image: the friends’ opinion regarding the usage of a certain product (X7),
the firms’ overall media image (X8) and the sense of achievement felt by consumers
regarding that company’s product/service (X9);
Companies’ Reputation: emotional appeal (X10), product quality (X11) and value for
money (X12).
As the purpose was to obtain the relational grade of the first three considered variables on
the fourth one, companies’ reputation, a model fit analysis was performed using SPSS AMOS 22
on these three variables (section 3.2) and then a grey relational analysis was applied
(section 3.3).
b. Model fit through a confirmatory factor analysis
Having the answers to the questionnaire above, a confirmatory factor analysis was conducted
in order to validate its main constructs.
The starting construct is pictured in Figure 1 (latent construct A), but because some of the
main confirmatory factor analysis indices, such as CMIN/DF of 2.446, NFI of 0.866, RFI of
0.799, IFI of 0.916, CFI of 0.914 and RMSEA of 0.076, were at the edge of their thresholds,
the construct pictured in Figure 1 as latent construct B (in which the X2 variable was
removed) was validated and used in the next section, where the grey analysis was performed.
In the following, the goodness-of-fit (GOF) indices for latent construct B, as obtained in
AMOS 22, are presented. The goodness of fit indicates how well the specified model
reproduces the covariance matrix among the indicator variables, establishing whether there
is similarity between the observed and estimated covariance matrices.
The improved model has a CMIN/DF of 1.552, below the threshold value of 2.000 (Table 1),
indicating a good model fit.
Table 1 - CMIN (AMOS 22 Output)
Model NPAR CMIN DF P CMIN/DF
Default model 19 26.377 17 .068 1.552
Saturated model 36 .000 0
Independence model 8 368.245 28 .000 13.152
The values of GFI and AGFI are above the limit of 0.900, at 0.977 and 0.950 respectively
(Table 2), while CFI exceeds 0.900 (being 0.972 – see Table 3), the imposed value for a
model of such complexity and sample size. As for the other three incremental fit
indices, namely NFI, RFI and IFI, the obtained values are above the threshold value 0.900 for
NFI and IFI and below this value for RFI.
Figure 1- The initial latent construct (A) and the final latent construct (B)
Table 2 - RMR and GFI (AMOS 22 Output)
Model RMR GFI AGFI PGFI
Default model .039 .977 .950 .461
Saturated model .000 1.000
Independence model .254 .707 .623 .550
Table 3 - Baseline comparisons (AMOS 22 Output)
Model NFI (Delta1) RFI (rho1) IFI (Delta2) TLI (rho2) CFI
Default model .928 .882 .973 .955 .972
Saturated model 1.000 1.000 1.000
Independence model .000 .000 .000 .000 .000
As Table 4 shows, the root mean square error of approximation (RMSEA) has a value below
0.100 for the default model, showing that little of the lack of fit is due to
misspecification of the tested model as opposed to sampling error.
The 90 percent confidence interval for the RMSEA lies between LO90 of 0.000 and HI90 of
0.080, the upper bound of 0.080 indicating a good model fit. Moreover, the model’s validity
and reliability are evaluated by two additional measures: the average variance extracted (AVE)
and construct reliability (CR). As these two measures are not computed by AMOS 22, they have
been determined using the equations presented in [12].
Table 4. RMSEA (AMOS 22 Output)
Model RMSEA LO 90 HI 90 PCLOSE
Default model .047 .000 .080 .524
Independence model .219 .200 .239 .000
After computing, the following values were obtained for Firms’ OSN Activity, Users’ Activity
and Personal Network Characteristics, and Perceived Image: AVE of 0.515, 0.503 and 0.518,
and CR of 0.683, 0.808 and 0.837, respectively. An AVE of 0.500 indicates an adequate
convergent validity, while a CR of 0.700 or above suggests a good reliability. Given the
obtained values, it can be concluded that the overall construct validity and reliability are
good and that the considered measures consistently represent the reality.
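The usual equations for these two measures, as in [12], can be sketched as follows; the loadings below are hypothetical standardized factor loadings, with each indicator’s error variance taken as one minus its squared loading.

```python
def ave_and_cr(loadings):
    """Average variance extracted (AVE) and construct reliability (CR)
    from the standardized factor loadings of one latent construct."""
    squared = [l * l for l in loadings]
    ave = sum(squared) / len(loadings)                 # mean of squared loadings
    s = sum(loadings)
    cr = s * s / (s * s + sum(1 - q for q in squared))  # error variance = 1 - λ²
    return ave, cr

# Hypothetical loadings for a three-indicator construct
ave, cr = ave_and_cr([0.7, 0.7, 0.7])
print(round(ave, 3), round(cr, 3))  # → 0.49 0.742
```

With all loadings at 0.7 the construct sits just below the 0.500 AVE guideline and above the 0.700 CR guideline mentioned above.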
c. Grey analysis
Having validated the considered constructs, a grey relational analysis was performed. The
obtained results are presented in Table 5. It can easily be observed that the values
obtained in this analysis are significant, as they are situated around 0.700.
Table 5. Grey relational analysis
GRA  Firms’ OSN Activity  Users’ Activity and Personal Network Characteristics  Perceived Image
Companies’ Reputation  .753  .691  .744
The Firms’ OSN Activity construct has the highest value, 0.753, showing that the firms’
active involvement and active advertising campaigns conducted in online environments have a
positive and significant influence on a company’s reputation.
Moreover, the customers’ perceived image, shaped by their interaction with other
friends/users in online communities, the online media image and the sense of achievement
felt by customers when purchasing a company’s products and services, is also important and
has a positive impact on a firm’s reputation.
A smaller value, 0.691, is obtained for the construct regarding Users’ Activity and Personal
Network Characteristics, which underlines the fact that personal network characteristics are
not as strictly related to the firms’ reputation as the other two constructs.
In this context, firms can consider intensifying their participation in online environments
through marketing campaigns designed to reach their target audience. Also, in their
advertising, firms can better instil in their customers the sense of achievement due to the
usage of their products, or can try to identify some of the most influential group members
for addressing their message.
4. Conclusions
OSN are increasingly becoming an everyday reality. In this context, companies have adapted
their strategies in order to reach their target audience.
This paper presents a method for selecting some of the most important areas on which a
company can focus in order to increase its reputation.
For this, a questionnaire was deployed, administered and validated for better extracting the
most appropriate constructs. Grey relational analysis was used because the information
flowing within the feedback loops in OSN is grey.
As further research, grey systems theory will be used for identifying the most important and
influential nodes among the most impressionable nodes within an OSN. Having this
information, each company can adapt or create a specific strategy targeting these persons in
order to increase and strengthen its competitive position on the market.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectorial
Operational Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 “Excellence in scientific interdisciplinary research, doctoral
and postdoctoral, in the economic, social and medical fields -EXCELIS”, coordinator
The Bucharest University of Economic Studies.
References
[1] F. Schneider, A. Feldmann, B. Krishnamurthy and W. Willinger, “Understanding online
social network usage from a network perspective,” in Proceedings of the ACM SIGCOMM
Conference on Internet Measurement, 2009, pp. 35-48.
[2] J. Heidemann, M. Klier and F. Probst, “Online social networks: A survey of a global
phenomenon,” Computer Networks, vol. 56, pp. 3866-3878, 2012.
[3] J. Yang, C. Yao, W. Ma and G. Chen, “A study of the spreading for viral marketing based
on a complex network model,” Physica A: Statistical Mechanics and its Applications, vol.
389, no. 4, pp. 859-870, 2010.
[4] A. Goyal, F. Bonchi and L. Lakshmanan, “Discovering leaders from community actions,”
in Proceedings of the 17th ACM conference on information and knowledge management,
2008, pp. 499-508.
[5] J. Zhu, “Self-adaptation evaluation method in real time dynamics decision-making system
based on grey close relationship,” Grey Systems: Theory and Application, vol. 3, no. 3, pp.
276 – 290, 2013.
[6] J. Jin, Z. Yu and C. Mi, “Commercial bank credit risk management based on grey
incidence analysis,” Grey Systems: Theory and Application, vol. 2, no. 3, pp. 385-394,
2012.
[7] C. Delcea, “Not black. Not even white. Definitively grey economic systems,” Journal of
Grey System, vol. 26, no. 1, pp. 11-25, 2014.
[8] Y. Zhu, R. Wang and K. Hipel, “Grey relational evaluation of innovation competency in an
aviation industry cluster,” Grey Systems: Theory and Application, vol. 2, no. 2, pp.272 –
283, 2012.
[9] Q. Zhang and R. Chen, “Application of metabolic GM (1,1) model in financial repression
approach to the financing difficulty of the small and medium-sized enterprises,” Grey
Systems: Theory and Application, vol. 4, no. 2, 2014.
[10] R. Xu, Z. Fang and J. Sun, “A grey STA-GERT quality evaluation model for complex
products based on manufacture-service dual-network,” Grey Systems: Theory and
Application, vol. 4, no. 2, 2014.
[11] S. Liu and Y. Lin, Grey Systems – Theory and Applications, Understanding Complex
Systems Series, Springer-Verlag, Berlin-Heidelberg, 2010.
[12] Y.E. Spanos and S. Lioukas, “An examination into the causal logic of rent generation:
contrasting Porter’s competitive strategy framework and the resource-based perspective,”
Strategic Management Journal, vol. 22, pp. 907-934, 2001.
DETERMINANTS OF EU MIGRATION. PANEL DATA ANALYSIS
Costin-Alexandru CIUPUREANU
The Bucharest University of Economic Studies
Elena-Maria PRADA
The Bucharest University of Economic Studies
Abstract. There are many studies on migration that show what types of factors influence
people to migrate or to return to their countries. In our analysis we tried to determine
what types of variables influence the number of emigrants of a country. We used a panel data
regression model for 25 EU countries over a 5-year period, from 2008 to 2012, to show what
drives people to migrate with regard to the macroeconomic factors or social conditions of a
country. The results show that social contributions, the education level of individuals and
the economic development of a country influence the number of emigrants.
Keywords: Panel Data, Pooled Regression, Migration.
JEL classification: C23, F22, J61.
1. Introduction and literature review
Regarding migration, there are many studies that show what types of factors influence people
to migrate or to return to their countries. The literature on migration is generous in
theories explaining the determinants of migration and in approaches to various
migration-related issues. The most acknowledged theories refer to: the neoclassical
theories, which emphasize the role of economic determinants of migration [1] [2], and the
new economics of labour migration (NELM), developed during the 80s, which shows that income
maximization is also influenced by the skills of the migrants and that „older workers are
less mobile than young workers” [3]. Other theories that explain migration and its
determinants introduce the idea of the status and prestige offered by finding another place
to live and work, and the role of a country’s legislation related to the emigrant’s social
network or cultural factors [4].
More recently, the level of social protection of a country has also been considered a factor
that influences migration. Borjas proposed the welfare magnet hypothesis, first mentioned in
a seminal paper. This hypothesis refers to how immigrants prefer countries that have
generous welfare stipulations, to secure themselves against labour market risks such as
unemployment. We can conclude that the generosity of a country’s social policies attracts
people to migrate, being a pull factor for migration [5].
Migration also affects social policies in developing countries, as Pillinger showed. The
implications of migration for the welfare systems of developing countries have been
neglected: the social welfare systems not only have to adapt to international migration in
countries of origin, transit and destination, but migration policy frameworks can also
obstruct important connections between migrants and their homelands and the development of
social welfare systems [6].
Most individuals choose to migrate when their income declines; to this reason other
motivations can be added, related more to the sociological aspects that trigger people to
migrate, such as food, health care, shelter, and social needs. All of these aspects are
underpinned by the most important one: having a job that provides a satisfactory amount of
money [7].
The migration of individuals living in developing countries has been amplified by the appeal
of the developed countries. Remittances have thus become a significant source of external
financing for developing countries, and their role is to decrease poverty, as Devesh Kapur
considers [8].
Because some work-related migrants might have entered the host country illegally or on a
temporary work contract, they usually do not have a high social status in the destination
country. Consequently, they often face worse working conditions than the nationals of the
host countries, even though they may have better skills that would facilitate employment in
their home countries. Work-related migrants are in general employed in sectors such as
construction, hotels and restaurants, and health care [9].
An analysis over a period of 28 years, from 1975 until 2002, on a very large number of
developing countries showed that remittances are a substitute for financial shortcomings;
remittances also improve the allocation of capital and therefore promote economic growth in
case the financial sector does not fulfil the financial needs of the population [10].
Also, over the period 1975-2004, an analysis of remittances and household consumption
instability for a large sample of developing countries, after controlling for the
endogeneity of remittances, revealed that remittance-recipient countries present low
instability of household consumption, remittances playing an insurance role especially for
countries that have a poorly developed financial system [11].
Also, regarding the nexus between migration and the development of a country, it has been
shown that there is a strong influence, unemployment being one of the main triggers for
people to migrate [12].
The crude migration rate and household expenditures are strongly correlated and have a
mutual connection, meaning that the money sent back to the migrant’s country of origin is
found as a component of household expenditures [13].
In our analysis we tried to show what types of variables influence the number of emigrants
of a country. To demonstrate this, we chose a series of variables which, according to the
most popular migration theories, influence people to migrate.
2. Data and method
The analysis will focus on the following variables: Emigrants, Wages, Social Contribution,
Gross Domestic Product and Education Early Leavers, in order to find whether there is any
relation between them, how strong it is and its direction.
We used data from the Eurostat database. The period of analysis covers 5 years, from 2008
until 2012, for 25 European Union countries (Austria, Cyprus, Czech Republic, Denmark,
Estonia, Finland, France, Germany (until 1990 former territory of the FRG), Hungary,
Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal,
Romania, Slovakia, Slovenia, Spain, Sweden, United Kingdom) which had available data for
every variable (for some countries the data set was not available). This period was chosen
in order to determine whether there are any influences over time, capturing also the effects
of the latest financial and economic crisis. The dataset was analysed using a simple
regression model for panel data.
Emigrants is the dependent variable of the panel data regression model and refers to those
leaving their native country with the intention to settle in another country.
The variable Wages refers to the wages and salaries data set. According to Eurostat this
variable is defined as ”the total remuneration, in cash or in kind, payable to all persons
counted on the payroll (including homeworkers), in return for work done during the
accounting period”. This variable does not include the social contributions.
Social Contributions refers to employers' social contributions and other labour costs paid by
the employer.
Gross Domestic Product, computed as euro per inhabitant, was employed as a robust indicator
of the economic development of a country.
We also employed in our analysis the Education Early Leavers variable. It refers to the
percentage of the population between 18-24 years old that have attained at most lower
secondary education and are not involved in any other form of education.
Panel data involve two dimensions: cross-sectional and time series. The panel data
regression model differs from an OLS regression as it provides information on both
dimensions: over individuals and over time. The general model of panel data can be
described as:
$y_{it} = \alpha_i + \sum_{k=1}^{K} \beta_k x_{kit} + \varepsilon_{it}$ (1)
where: i = 1,…,N, N is the number of cross-sectional dimension (or individuals);
t = 1,…,T, T is the number of time dimension (or period).
There are many types of panel data models, but the most commonly analysed are pooled
regression, fixed effects and random effects. We used for our analysis the pooled regression
model, considering the fact that the period of time is rather small and cannot show relevant
information regarding the variation in time.
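A minimal Python sketch of the pooled approach, using a single hypothetical regressor: all (i, t) observations are stacked and one ordinary least squares line is fitted, ignoring the panel structure.

```python
def pooled_ols(y_panel, x_panel):
    """Pooled OLS with one regressor: stack every (i, t) observation
    and fit y = a + b*x by ordinary least squares."""
    ys = [v for series in y_panel for v in series]
    xs = [v for series in x_panel for v in series]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x) on the stacked sample
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept, slope

# Two hypothetical countries observed over three years
a, b = pooled_ols([[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]])
print(a, b)  # → 0.0 1.0
```

The paper’s actual model has four regressors and is estimated in Stata; this one-regressor version only illustrates how pooling discards the individual dimension.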
The fixed effects model, also known as the within estimator, allows the individual specific
term α to be correlated with the regressors; because this term is removed, the model
excludes or omits time-invariant variables (such as gender, religion, etc.) [14]. In the
random effects model, α is assumed to be independent of the error term ε, and the two are
also mutually independent [15].
An advantage for random effects model is that it can include time-invariant variables,
otherwise the interpretation of it is the same as it is for the fixed effects model.
The pooled regression model is usually carried out on a time-series cross-sectional data
set; it is often used as a benchmark for the fixed and random effects models and it
obviously considers both dimensions of the panel data.
The common abbreviations used in the panel data analysis are: id for the individual or panel
identifier, in our case the country, and t for the time variable; the other two variables
can easily be deduced from their names. For all variables there are three types of variation
described in the regression model for fixed and random effects:
Overall variation – the variation over both dimensions.
Between variation – the variation over individuals, specifically over id.
Within variation – the variation over time (t).
It is obvious that the standard deviation of the within variation for id is zero, because
the cross-sectional variable does not vary over time; likewise, the between variation for
the time variable t is zero because it does not vary over individuals.
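The three types of variation can be sketched for a small hypothetical balanced panel as follows (population standard deviations, following the convention of Stata’s xtsum command):

```python
import statistics

def panel_variation(panel):
    """Overall, between and within standard deviations of a variable,
    given as one time series per individual (balanced panel)."""
    flat = [v for series in panel for v in series]
    means = [statistics.mean(series) for series in panel]
    overall = statistics.pstdev(flat)      # variation over both dimensions
    between = statistics.pstdev(means)     # variation over individual means
    # within: deviations of each observation from its individual mean
    within = statistics.pstdev([v - m for series, m in zip(panel, means)
                                for v in series])
    return overall, between, within

# A time-invariant variable (e.g. the id itself) has zero within variation
overall, between, within = panel_variation([[1, 1], [2, 2]])
print(within)  # → 0.0
```

Running it on the id-like variable above confirms the remark in the text: the within standard deviation is zero because the value never changes over time for a given individual.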
3. Results and discussion
Migration has been studied through various analyses, including panel data regression models,
most of them highlighting the macroeconomic determinants and the push and pull factors of
migration. Mayda shows that international migration is positively correlated with per-worker
GDP levels of the origin countries, using data from various sources: the immigration data
was gathered from International Migration Statistics and the
macroeconomic variables from the World Bank’s Global Development Network Growth Database.
Referring to the influence of GDP on the immigration rate, we can conclude that any person
who migrates into a country contributes to its GDP growth, immigrants becoming an important
source of revenue growth [17].
Our analysis was carried out using Stata 12.0, and we modelled Emigrants as a function of
Wages, Social Contribution, Gross Domestic Product and Education Early Leavers to show the
magnitude of the influence of these variables, the direction it goes in and what it means.
The second step was to analyse the results of the panel data regression model. The method of
the regression model for panel data is the same as for simple linear regression, so by
validating the hypotheses of the OLS method we estimated the coefficients of the pooled
regression model shown in Table 1.
Table 1. Results of panel regression
Dependent variable: Emigrants
Pooled Regression
Wages 3445.45
Social Contribution 10884.93**
GDP -3.208**
Education Early Leavers 4574.48*
Intercept 23871.22
R Squared 0.1413
Adjusted R Squared 0.1126
F test 4.93*
(*) significant at 1%, (**) significant at 5%
The pooled regression model is significant at the 5% level, with an R Squared of 0.1413,
which shows that there are many other factors that influence the number of emigrants.
The results of the panel data regression model show that the level of emigrants is directly
proportionally influenced by the level of social contributions, so as the social
contributions increase, the number of emigrants tends to increase as well.
The same tendency as for the social contributions is followed by the Education Early Leavers
variable. People tend to migrate if they do not continue their studies, confirming the
theory of low-skilled workers.
Also, the Gross Domestic Product, as an indicator of economic development, has an inversely
proportional influence on the level of emigrants. This can be explained by the fact that, if
a country is more developed, people prefer to remain in their country of origin rather than
migrate.
The model confirms the neoclassical theory. It should also be noted that the model is
perfectible: variables can be improved by taking logarithms and by adding other variables
that better explain the level of emigrants.
4. Conclusions
In our analysis we tried to determine what influences people to migrate. To demonstrate
this, we chose a series of variables which, according to the most popular migration
theories, influence people to migrate.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
477
The panel data regression results show that the number of emigrants is directly influenced
by social contributions: as the level of social contributions increases, the number of
emigrants tends to increase as well.
The variable Education Early Leavers follows the same tendency as social contributions:
people tend to migrate if they do not continue their studies, which is consistent with the
theory of low-skilled workers.
Gross Domestic Product, as an indicator of economic development, has an inversely
proportional influence on the number of emigrants: the more developed a country is, the more
its people prefer to remain in their country of origin rather than migrate.
Acknowledgment
This paper was co-financed from the European Social Fund through the Sectoral
Operational Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907, "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator: The
Bucharest University of Economic Studies.
References
[1] W.A. Lewis, "Economic Development with Unlimited Supplies of Labour", The
Manchester School, vol. 22, pp. 139-191, 1954.
[2] M.P. Todaro, "Migration and economic development: a review of theory, evidence,
methodology and research priorities", Occasional Paper 18, Nairobi: Institute for
Development Studies, University of Nairobi, 1976.
[3] O. Stark and D. Bloom, "The New Economics of Labour Migration", American
Economic Review, no. 75, pp. 173-178, 1985.
[4] M.J. Piore, Birds of Passage: Migrant Labor and Industrial Societies. New York:
Cambridge University Press, 1979.
[5] G.J. Borjas, "Self-Selection and the Earnings of Immigrants", American Economic
Review, vol. 77, no. 4, pp. 531-553, 1987.
[6] J. Pillinger, "The Migration-Social Policy Nexus: Current and Future Research", United
Nations Research Institute for Social Development, International Workshop 22-23,
Stockholm, Sweden, 2007.
[7] Z. Goschin and M. Roman, "Determinants of the remitting behaviour of Romanian
emigrants in an economic crisis context", Eastern Journal of European Studies, 2(3),
pp. 87-103, 2012.
[8] D. Kapur, "Development and Migration - Migration and Development: What comes
first?", Social Science Research Council Conference, New York City and Geneva, 2004.
[9] International Labour Organization (ILO), A rights-based approach, 2010. Available:
http://www.ilo.org/wcmsp5/groups/public/---ed_protect/---protrav/---migrant/documents/publication/wcms_208594.pdf
[10] M. Ruiz-Arranz and P. Giuliano, "Remittances, Financial Development, and Growth",
International Monetary Fund Working Paper 05/234, 2005.
[11] J.L. Combes and C. Ebeke, "Remittances and Household Consumption Instability in
Developing Countries", CERDI, Etudes et Documents, 2010. Available:
http://publi.cerdi.org/ed/2010/2010.15.pdf
[12] E. Prada, "Economic development and migration in European Union", The International
Conference Present Issues of Global Economy, 10th Edition, 1 (XIII), pp. 259-264, 2013.
[13] R. Bălă and E. Prada, "Migration and private consumption in Europe: a panel data
analysis", Procedia Economics and Finance, Proceedings of the 7th International
Conference on Applied Statistics, Bucharest, Romania, 2013.
[14] W.H. Greene, Econometric Analysis, 5th ed. New Jersey: Prentice Hall, pp. 283-339,
2002.
[15] G.S. Maddala, Introduction to Econometrics, 3rd ed., Wiley, 2001.
[16] A.M. Mayda, "International Migration: A Panel Data Analysis of the Determinants of
Bilateral Flows", Journal of Population Economics, vol. 23, no. 4, pp. 1249-1274, 2009.
EUROPEAN COUNTRIES AND THE SUSTAINABILITY
CHALLENGE: FOCUS ON TRANSPORTATION
Georgiana MARIN
The Bucharest University of Economic Studies
[email protected]
Alexandra MATEIU
The Bucharest University of Economic Studies
Abstract. This study presents a cluster analysis of the transportation sector based on several
variables such as energy consumption level, modal split of freight transport, harmonised
index of consumer prices, greenhouse gas emissions and nitrogen oxide emissions. These
variables were chosen for their relevance to the transport industry and because each country
has sustainability targets imposed by the European Union. With freight and transport demand
increasing, the European Union is looking for better ways to facilitate transportation
channels that are more eco-friendly and more efficient in terms of energy consumption.
Whether we consider rail, road or air transport, sustainability is a key component that needs
to be included in each country's infrastructure development strategy.
Three country clusters are analysed using the mentioned variables and the key principles of
the sustainability concept, and several recommendations are highlighted where the ratio
between cost structures, air pollution and sustainability targets is not well balanced. The
main idea of this comparative study is to assess which countries are similar to Romania in
terms of the sustainability transport indicators imposed by the European Union, and to
examine what initiatives those countries have taken in order to reach that particular level.
Finally, the authors propose several measures that Romania could adopt from other countries
in order to achieve sustainable development in transport.
Keywords: energy consumption, freight transport, gas emissions, transportation variables
JEL classification: C38, L92, Q56
1. Introduction
In 1992, the United Nations Conference on Environment and Development in Rio de Janeiro
had already proposed that ‘‘Indicators of sustainable development need to be developed to
provide solid bases for decision-making at all levels and to contribute to a self-regulating
sustainability of integrated environment and development systems’’ [8].
“Sustainable performance is closely related to the concept of continuity management” [4].
Sustainable economic development can only be secured if environmental and social factors
are also included in the overall long-term infrastructure strategy. “Transports are situated at
the junction of economical and environmental interests. On one hand, transports are an
indispensable activity in an economy characterized by specialization of production and labor.
On the other hand, transports erode stocks of natural resources (mainly energy and raw
material stocks, but also the environment in the broad sense)” [5]. Transport infrastructures,
transport policies and mobility behavior can be assessed by considering energy consumption
only together with atmospheric pollutant emissions [1], [8], [7].
The main greenhouse gas emitted is CO2, generated by the burning of petroleum-based
products such as gasoline and motor oil in internal combustion engines. During the fuel
combustion process, emissions of nitrogen oxides (NOx), methane (CH4) and
hydrofluorocarbons (HFC) are also released into the environment, causing pollution [3], [9].
For 2020, the European Union has made a unilateral commitment to reduce the overall
greenhouse gas emissions of its 28 Member States by 20% compared to 1990 levels, which is
one of the headline targets of the EU 2020 strategy [3].
Herman Daly names three requirements for sustainable development: (1) the rates of use of
renewable resources should not exceed the rates of their regeneration; (2) the rates of use of
non-renewable resources should not exceed the rates of development of their substitutes; and
(3) the rates of pollution emission should not exceed the assimilative capacity of the
environment [6].
The main objective of a sustainable transport strategy is to embed elements from the
economic, social and environmental areas so as not to compromise the future of the next
generations. The European transport systems sustain competitiveness on the market by
implementing integrated transport channels across Europe, in correlation with efficient price
levels and through the modal split of freight [2]. Taking into account the rate of economic
growth and the increasing mobility needs of the countries, the main challenge is to balance
the benefits of high-performing technologies against their negative effects on the surrounding
environment. Furthermore, the environmental taxes that penalize the transport system for oil
consumption and for emissions of greenhouse gases and nitrogen oxides are putting pressure
on the European countries to find new technologies and to develop alternative renewable
resources. The intent of this analysis is to make a statistical assessment of the main
sustainability transport indicators within the European countries and to highlight the
dependencies between the selected variables, with the purpose of recommending measures
for reducing energy consumption levels and greenhouse gas emissions through the modal
split of freight transport and through finding renewable energy alternatives.
2. Research methodology
In order to analyze the transport sector across the European Union, the authors have chosen
12 variables, as follows:
a) Energy consumption of transport relative to GDP;
b) Modal split of freight transport, roads; modal split of freight transport, railways;
c) Volume of freight transport relative to GDP;
d) Energy consumption of transport, rail; energy consumption of transport, road;
e) HICP - annual average indices for transport prices;
f) Greenhouse gas emissions from transport;
g) Emissions of nitrogen oxides (NOx) from transport, non-road; emissions of nitrogen
oxides (NOx) from transport, road; emissions of particulate matter from transport, non-road;
emissions of particulate matter from transport, road.
The 12 variables were selected based on their relevance for sustainable development in the
transportation sector and on the correlations that can be drawn by comparing them. For
example, a high demand for freight transportation generates more energy consumption.
Depending on which type of transport is used, railway or road, the annual average indices for
transport prices may fluctuate. Gas emissions are also an indicator of the relationship
between the economic and environmental sectors of that particular
country, in order to measure consumption levels and, at the same time, forecast a production
capacity that is in line with the pollution reduction targets.
The source for the above-mentioned data was the EUROSTAT portal, the Sustainable
development - Sustainable transport database. Data were gathered for 2012, in order to keep
as many cases as possible while using the most recent data available. For greenhouse gas
emissions in transport, data for the following countries were only available for 2011: Iceland,
Switzerland, Norway and Turkey. Some countries had missing data and were excluded from
the analysis: Greece, Croatia, Cyprus, Malta, Iceland, Switzerland and Turkey.
The hierarchical clustering algorithm was applied as follows:
- The method used to determine the distance between items was the Squared Euclidean
Distance, as it is the most appropriate for this data set;
- The authors chose Ward's hierarchical clustering method because it does not require
specifying the number of clusters in advance;
- As the variables are expressed in different units of measure, the authors chose to
standardize them using the Z-scores method.
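The three steps above can be reproduced with SciPy. This is a sketch under the assumption of a small illustrative data matrix; the real analysis uses the 12 Eurostat indicators for each country.

```python
# Sketch of the clustering procedure: z-score standardization and Ward
# linkage (which internally works with squared Euclidean distances).
# The country labels and indicator values are illustrative placeholders.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore

countries = ["A", "B", "C", "D", "E", "F"]
# Rows: countries; columns: transport indicators in different units.
data = np.array([
    [76.4, 17.4, 118.3, 5213.6],
    [64.8, 29.8, 126.5, 4007.1],
    [82.8, 13.8, 910.2, 37720.9],
    [70.0, 22.0, 150.0, 6000.0],
    [66.0, 28.0, 130.0, 4200.0],
    [80.0, 15.0, 800.0, 30000.0],
])
z = zscore(data, axis=0, ddof=1)  # unit-free standardized variables
# Ward's method does not require the number of clusters in advance; the
# dendrogram can be cut afterwards, here into (at most) three groups.
link = linkage(z, method="ward", metric="euclidean")
labels = fcluster(link, t=3, criterion="maxclust")
print(dict(zip(countries, labels)))
```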
3. Research results
After performing the cluster analysis, three clusters were obtained according to the
dendrogram (Figure 1):
- The first cluster: Belgium, Czech Republic, Denmark, Ireland, Luxembourg,
Netherlands, Austria, Portugal, Slovakia, Finland, Sweden, Norway;
- The second cluster: Bulgaria, Estonia, Latvia, Lithuania, Hungary, Poland, Romania,
Slovenia;
- The third cluster: Spain, France, Italy, United Kingdom, Germany.
Figure 1. Dendrogram resulting from the cluster analysis
The table containing descriptive statistics for each cluster is presented below:
Table 1. Mean by cluster
Variable (unit)                              Cluster 1 mean   Cluster 2 mean   Cluster 3 mean
                                             (12 countries)   (8 countries)    (5 countries)
Modal split of freight transport, roads
(% in total inland freight tonne-km)         76.416667        64.775000        82.820000
Modal split of freight transport, railways
(% in total inland freight tonne-km)         17.4250          29.8000          13.8400
Energy consumption of transport, rail
(1 000 tonnes of oil equivalent)             118.3250         126.5250         910.2200
Energy consumption of transport, road
(1 000 tonnes of oil equivalent)             5213.6333        4007.0750        37720.9200
Greenhouse gas emissions from transport
(1 000 tonnes of CO2 equivalent)             16478.58         12067.50         117918.60
Emissions of nitrogen oxides (NOx) from
transport, non-road (tonnes)                 47441.08         12065.25         403482.60
Emissions of nitrogen oxides (NOx) from
transport, road (tonnes)                     57740.17         64377.25         410631.80
Emissions of particulate matter from
transport, non-road (tonnes)                 2331.0833        508.6250         29781.0000
Emissions of particulate matter from
transport, road (tonnes)                     3475.3333        4303.2500        20957.0000
Comparing the data for Romania's cluster with the other clusters, the following assessment
can be made based on the selected transport variables. In terms of the modal split of freight,
cluster No. 2, which contains Romania, stands at 78% of the cluster No. 3 average for road
transport, but at 215% of the cluster No. 3 average for rail freight.
Looking at the energy consumption of transport, cluster No. 2 recorded the lowest oil
consumption for railway transport, at about 13% of the average consumption level of cluster
No. 3. This appears to statistically influence the emissions of greenhouse gases and nitrogen
oxides from non-road transport. Comparing the emissions of cluster No. 2 with those of
cluster No. 3, cluster No. 2 stands at an average of 2.99% for nitrogen oxides (NOx) and at
10.23% for the greenhouse gas emissions.
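The percentages quoted in this paragraph follow directly from the cluster means in Table 1; as a quick arithmetic check (values copied from the table):

```python
# Cluster-to-cluster ratios recomputed from the Table 1 means.
road_c2, road_c3 = 64.775, 82.820            # road modal split (%)
rail_c2, rail_c3 = 29.800, 13.840            # rail modal split (%)
oil_rail_c2, oil_rail_c3 = 126.525, 910.220  # rail energy use (1 000 toe)
nox_c2, nox_c3 = 12065.25, 403482.60         # non-road NOx (tonnes)
ghg_c2, ghg_c3 = 12067.50, 117918.60         # GHG emissions (1 000 t CO2 eq)

print(round(100 * road_c2 / road_c3))           # 78  (road modal split)
print(round(100 * rail_c2 / rail_c3))           # 215 (rail modal split)
print(round(100 * oil_rail_c2 / oil_rail_c3))   # 14  (text rounds this down to 13%)
print(round(100 * nox_c2 / nox_c3, 2))          # 2.99  (non-road NOx)
print(round(100 * ghg_c2 / ghg_c3, 2))          # 10.23 (greenhouse gases)
```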
The cluster analysis indicates a linear relationship between the emissions of nitrogen oxides
(NOx) and the emissions of particulate matter from road transport. Hence, in the clusters
where the quantity of nitrogen oxide (NOx) emissions from road transport was small,
particulate matter emissions were also small, a rule that applies to all three clusters. There is
also a statistical correlation between particulate matter emissions from non-road transport
and nitrogen oxide (NOx) emissions from non-road transport.
In conclusion, the countries in the clusters with a high share of road freight in the modal split
experience a high volume of energy consumption, greenhouse gas emissions, nitrogen oxide
emissions and particulate matter emissions, both road and non-road.
Figure 2 shows the annual average indices for transport prices, which can be correlated with
the variables already presented in the cluster analysis.
In the countries from cluster No. 2, the Harmonised Index of Consumer Prices (HICP)
increased substantially, because these countries had a significantly higher share of rail
freight in the modal split than of road freight. The countries that favoured road freight over
rail freight had a lower HICP index, which suggests that, in order to decrease or stabilize the
HICP index, a progressive increase in the modal split of road freight is recommended for the
countries in cluster No. 2.
Figure 2. HICP - Annual average indices for transport prices (2005 = 100)
For Romania in particular, a set of mixed measures needs to be implemented in order to have
a sustainable transportation strategy. A few examples of such measures are: increasing
energy efficiency; developing and investing in renewable sources; modernizing the road and
railway infrastructure networks, with emphasis on rail, in order to increase the average rail
speed and help fluidize traffic; reducing reliance on industries that move large quantities of
goods, as they consume more energy; redirecting the flow of goods and consumers to other,
more eco-friendly transport modes; and developing trans-European transport routes in order
to stay competitive on the market and create synergies with other countries.
4. Conclusions
All forms of transport generate some degree of pollution, causing emissions of greenhouse
gases and particulate matter. A careful selection of the freight transport modal split,
favouring rail transport, can solve some greenhouse gas pollution problems; however, for a
considerable and sustainable reduction, alternative fuels must be found.
Road transport relies predominantly on oil consumption, and a switch to renewable energy
sources is limited by the existing infrastructure. The most viable option for reducing the
level of pollutants and oil consumption lies in the continuous development of more efficient
engines.
In recent years, European Union countries have taken measures to shift from gasoline to
diesel, and currently countries like Germany and Austria have started laying the foundation
for the use of compressed methane gas, for example in engines and power stations with
natural gas technology. However, even if these fuels reduce the level of pollution and the
price of transportation, they are not renewable sources.
Switching road transport to renewable sources such as electricity is, at the moment, limited
to personal cars.
Railway transportation, depending on the engine technology used, can run on diesel or
electricity. Electric traction carries a high cost, because it requires a railway electrification
system with expensive infrastructure. In the countries where this infrastructure is already
present, the challenge is to change the source of energy production, for example to solar
panels, wind energy or other eco-friendly sources.
Educating the end consumer is also important for limiting the pollution generated by the
transport of goods; one measure that could be promoted is reducing the consumption of
out-of-season food, which would limit transport from distant countries.
To conclude, all countries wishing to achieve sustainable transport growth need to invest, as
a medium- and long-term strategy, in better transportation infrastructure, in renewable
eco-friendly resources, in more efficient logistics channels, as well as in developing new
technologies that can reduce greenhouse gas emissions.
References
[1] B.V. Mathiesen, H. Lund and P. Norgaard, "Integrated transport and renewable energy
systems", Utilities Policy, vol. 16, pp. 107-116, 2008.
[2] C. Gallez, L. Hivert and A. Polacchini, "Environment energy budget of trips (EEBT): a
new approach to assess the environmental impacts of urban mobility", International
Journal of Vehicle Design, vol. 20, no. 1-4, pp. 326-334, 1998.
[3] European Commission, Road transport: Reducing CO2 emissions from vehicles. Internet:
http://ec.europa.eu/clima/policies/transport/vehicles/index_en.htm
[4] G. Marin, T.B. Floricel and S.B. Keppler, "Managing sustainable performance in rail
freight transport projects using business continuity management", in Proc. 13th
International Conference on Informatics in Economy - Education, Research & Business
Technologies, Bucharest, Romania, 2014, pp. 475-481.
[5] G. Marin and G.J. Weber, "Considerations on strategic policies and evaluation criteria
for sustainable transport", in Proc. 13th International Conference on Informatics in
Economy - Education, Research & Business Technologies, Bucharest, Romania, 2014,
pp. 622-629.
[6] H. Daly, Steady State Economics. Washington, DC: Island Press, 1991.
[7] M.J. Saunders, T. Kuhnimhof, B. Chlond and A.N.R. da Silva, "Incorporating transport
energy into urban planning", Transportation Research Part A, vol. 42, pp. 874-882,
2008.
[8] R. Joumard and J.P. Nicolas, "Transport project assessment methodology within the
framework of sustainable development", Ecological Indicators, pp. 136-142, 2010.
[9] United States Environmental Protection Agency (EPA), Sources of Greenhouse Gas
Emissions. Internet: http://www.epa.gov/climatechange/ghgemissions/sources/transportation.html
THE EVALUATION AND STRENGTHENING OF THE FREIGHT
TRANSPORT SYSTEM, AS A SOLUTION FOR SUSTAINABLE
DEVELOPMENT IN ROMANIA
Georgiana MARIN
The Bucharest University of Economic Studies
[email protected]
Alexandra MATEIU
The Bucharest University of Economic Studies
Abstract. This paper highlights the sustainability and the advantages of railway transport in
comparison to road transport through a comprehensive analysis of the main indicators in the
goods transportation sector. The study emphasizes the importance of being not only
cost-effective, but also environmentally friendly, when choosing the optimum transportation
channel. The Ordinary Least Squares method is applied in order to demonstrate that railway
transport is more eco-friendly than other means of transport in terms of energy consumption.
The right balance should be achieved between different transportation modes, taking into
account the infrastructure needs of each mode and the energy consumption forecast for the
goods that need to be delivered. Even if new infrastructures bring advantages and new
opportunities from both an economic and a social point of view, their long-term
sustainability is an important factor that should also be taken into consideration, especially
their effects on the surrounding environment, such as air pollution, noise and space
consumption. Efficiency is an important component of sustainability; hence any
transportation system should target maximum operational productivity with minimal usage
of resources. Finally, the paper concludes by proposing a set of scenarios that can decrease
the energy consumption level in Romania by changing the volume of transported goods.
Keywords: hypothesis testing, road and railway freight transport, sustainability
JEL classification: C12, L92, Q56
1. Introduction
“During the last 20-plus years, transport issues undergo a development towards a higher level
of effectiveness and sustainability. In this context transports need to face the conflict of
consuming resources on one hand and being a necessity for any economy, independent from
the country” [1]. “Globalization and EU enlargement to the East have created new challenges
for European transport. First, the fast growth of freight transport contributes to the economy,
but also causes congestion, noise, and pollution. EU Commission states that, without
adequate measures, the situation will continue to worsen and increasingly undermine
Europe's competitiveness and the environment” [2]. In this regard, “the EU strategy on
sustainable development has been permanently adapted to the growing exigencies of the
society […] [4]. “The Commission therefore recommends modernizing logistics to boost the
efficacy of individual modes of transport and their combinations. In particular it recommends
a better distribution of traffic towards more environmentally friendly, safer and more energy
efficient modes of transport” [2].
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
486
Sustainability is a concept that can be difficult to operationalize because it involves goals that
are often in conflict with one another, such as environmental conservation, social
responsibility and economic viability [3], [5], [6].
The transportation system is an important part of this concept, and it needs to be
accompanied by clear metrics in order to be successfully implemented and have a positive
impact on the surrounding environment. Being sustainable in the context of transportation
has many aspects that need to be taken into consideration, especially when one considers
cost reduction together with the wise use of invested resources.
The most widely used definition of sustainable development, from the Brundtland
Commission, is the basis of most definitions for sustainability in various disciplines:
‘‘Development that meets the needs of the present without compromising the ability of future
generations to meet their own needs’’ [7]. This definition makes clear that being sustainable
means making progress without negatively impacting current and future resources. The same
sustainability principle applies in areas beyond the transportation sector, such as tourism or
agriculture.
The fundamental concepts of sustainability are often illustrated through the three spheres of
sustainability, which refer to the integrated nature of environmental, social, and economic
sustainability [3].
The following components can be mentioned for each of the three spheres:
a) Environmental area [3]:
- Reduction of the consumption of non-renewable and renewable resources for transportation;
- Better utilization of the land area;
- Optimization of transportation channels in order to minimize the impact on ecological
systems;
- Reduction of waste, air and noise pollution.
b) Social area [3]:
- Transportation channels comply with safety and human-health policies and regulations;
- The management of transportation embeds different governmental and community
procedures;
- The transportation system needs to fulfill the primary needs of its customers and, at the
same time, incorporate social requests and changes.
c) Economic area [3]:
- Transportation is cost-effective and affordable for the customers;
- The transportation system is efficient and has the capacity to meet demand;
- Transportation channels help the economy of that particular country and create new
networks and opportunities.
These components address the limited capacity of resources one has on Earth and at the same
time highlight the importance of an effective transportation system that is affordable for
everyone and that preserves the current availability of resources.
Looking at the Romanian transportation sector, in order to cope with EU regulations, the
pressure of globalization and the need for geographical expansion, the system needs to
combine different modes of goods transportation. Suppliers need to focus on new
technologies and integrated solutions in order to reduce costs and, at the same time, keep the
standard level of quality.
Due to the strong competition between road and railway transportation and the existing EU
regulations, this paper proposes a comparison between the Romanian railway and road
transportation system by analyzing various factors such as: energy consumption level, cost
effectiveness, eco-friendliness, volume of transported goods and infrastructure needs. The
aim of this analysis is to determine which of these means is more sustainable and how
Romania can react in order to make the best use of its means of transport while remaining
competitive on the global market and meeting its obligations towards the EU.
2. Research methodology
For the purpose of this research, data for the 1990-2012 period were gathered; the data
sources are the Romanian Statistical Yearbooks published by the National Institute of
Statistics of Romania. Three time series were employed: energy consumption in the
transportation sector (thousand tonnes of oil equivalent), the amount of goods transported by
railway (thousand tonnes), and the amount of goods transported by road (thousand tonnes).
In order to demonstrate that railway transport is far more eco-friendly than other means of
transport in terms of energy consumption, the multiple Ordinary Least Squares method was
applied.
Several steps were taken:
a) The variables were tested for stationarity. The Augmented Dickey-Fuller test with the
Schwarz Info Criterion and 2 lags was employed, in order not to lose too many
observations;
The hypotheses of the test were:
H0: the data set has a unit root (is not stationary);
H1: the data set is stationary.
The significance level set for the test is 0.05. The test was performed using the E-views
package. If the computed p-value is lower than 0.05, the null hypothesis is rejected, so the
series is stationary;
b) The Ordinary Least Squares method was employed, considering equation (1):
Energy_consumption_i = a + b * Road_transport_i + c * Railway_transport_i + u_i    (1)
where: i = 1990 to 2012; a, b, c are the equation coefficients; u_i is the error series.
c) The significance level set for the Ordinary Least Squares model was 0.05;
d) The Ordinary Least Squares hypotheses were tested as shown in Table 1. Each test was
performed using the E-views package.
Table 1. Tests of the Ordinary Least Squares hypotheses
Null hypothesis                        Test                   Validation rule
Errors are not autocorrelated          Breusch-Godfrey test   p-value > 0.05
The model is homoscedastic             White test             p-value > 0.05
Errors follow a normal distribution    Jarque-Bera test       p-value > 0.05
3. Research results
As one can observe from Table 2, all considered series are stationary. Thus, the first
hypothesis of the Ordinary Least Squares method was validated.
Table 2. Augmented Dickey-Fuller test for selected variables
The other Ordinary Least Squares hypotheses are presented below, in Table 3 and Table 4.
As one can observe, all hypotheses were validated, meaning that the coefficients can be
interpreted (see Table 3). Furthermore, as one can observe from Table 4, the p-value of each
regression coefficient was below the significance level, meaning that all coefficients are
statistically significant.
Table 3. Tests of the Ordinary Least Squares hypotheses
Test                   p-value   Result
Breusch-Godfrey test   0.27      Errors are not autocorrelated
White test             0.93      The model is homoscedastic
Jarque-Bera test       0.48      Errors follow a normal distribution
Table 4. Regression coefficients and p-values
Coefficient         Coefficient value   p-value
Road_transport      0.004838            0.0115
Railway_transport   -0.064364           0.0019
a (intercept)       7383.002            0.0000
The R-squared was 60%, meaning that the quantities of goods transported annually by
railway and by road together explain 60% of the energy consumption in the transportation
sector. Furthermore, the R-squared was lower than the Durbin-Watson statistic (1.23),
indicating that the regression is not spurious.
The coefficients are interpreted as follows:
a) An increase of one tonne in the amount of goods transported by road increases the total
energy consumption in the transportation sector by 0.004838 tonnes of oil equivalent;
b) An increase of one tonne in the amount of goods transported by railway decreases the
total energy consumption in the transportation sector by 0.064364 tonnes of oil
equivalent.
Thus, considering these results, one can conclude that railway transportation of goods is
more eco-friendly and cheaper than road transportation. Furthermore, even if road
transportation of goods increases due to infrastructure necessities, an equal increase in the
amount of goods transported by railway leads to a considerable decrease of
the energy consumption. As 60% of the energy consumption in the transportation area is
explained by the amount of goods transported by railway and road, a
well-balanced mix between the two transportation modes should be implemented in order to
achieve an acceptable energy consumption level in this area.
Variable             ADF equation          P-value (level)   P-value (first difference)
Railway_transport    Trend and intercept   0.1341            0.0152
                     Intercept             0.0000            0.0026
                     None                  0.0002            0.0000
Road_transport       Trend and intercept   0.0000            0.0000
                     Intercept             0.0000            0.0000
                     None                  0.0000            0.0000
Energy_consumption   Trend and intercept   0.0055            0.0003
                     Intercept             0.5605            0.0000
                     None                  0.8566            0.0001
In order to prove so, the authors have proposed a scenario analysis, taking into account the
report concerning energy efficiency drawn up pursuant to Directive 2012/27/EU, in which
Romania committed to reduce its primary energy consumption by 19% until 2020, as a
reaction to the 20% target set by the European Commission. For the analysis, the authors
have assumed that the 19% energy consumption target has been set for the freight transport
sector in Romania, in order to analyse what changes in transport volume this target will
require. The sources of the data were the EUROSTAT database and the Romanian Statistical
Yearbooks published by the National Institute of Statistics of Romania.
If the primary energy consumption in transport for 2012 was 5345.1 thousand tonnes of oil
equivalent (TOE) and needs to decrease by 19% until 2020, the value in 2020 will be
5345.1 * 0.81, which gives a total of 4329.6 thousand TOE (y), 1015.5 thousand TOE less
than in 2012.
The following scenarios are further analysed:
a) If road transport remains in 2020 at its 2012 level (188415 thousand tonnes), how
will this influence the railway transport in 2020?
Step 1. Calculate the railway transport level in 2020:
Energy_consumption_i = a + b * Road_transport_i + c * Railway_transport_i + u_i   (2)
4329.6 = 7383.002 + 0.004838 * 188415 - 0.064364 * railway transport 2020
4329.6 - 7383.002 - 911.551 = -0.064364 * railway transport 2020
railway transport 2020 = 61602.03 thousand tonnes
Step 2. Railway transport 2020 - Railway transport 2012 = 61602.03 - 55755 =
5847.03 thousand tonnes
Step 3. (Railway transport 2020 - Railway transport 2012) / Railway transport 2012 =
0.1048
Railway transport should be 61602.03 thousand tonnes in 2020, thus it should increase
by about 10% compared to the 2012 level.
b) If road transport increases by 25% in 2020 in comparison with 2012, how much will
the railway transport be?
Railway transport should be 65142.65 thousand tonnes in 2020, thus it should increase by
16% compared to the 2012 level.
c) If road transport decreases by 25% in 2020 in comparison with 2012, how much will
the railway transport be?
Railway transport should be 58061.43 thousand tonnes in 2020, thus it should increase by
4.1% compared to the 2012 level.
d) If rail transport remains at its 2012 level (55755 thousand tonnes), how will this influence
the road transport?
Road transport should be 110626.87 thousand tonnes in 2020, thus it should decrease by
41% compared to the 2012 level.
e) If rail transport increases by 25% in 2020 in comparison with 2012, how much will the
road transport be?
Road transport should be 296065.84 thousand tonnes in 2020, thus it should increase by
57% compared to the 2012 level.
f) If rail transport decreases by 25% in 2020 in comparison with 2012, how much will the
road transport be?
Road transport should be -74812.089 thousand tonnes in 2020, thus it should decrease by
139% compared to the 2012 level. This negative result for 2020 shows that this scenario
is not plausible.
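All six scenario figures can be reproduced by inverting the estimated equation (2) for whichever mode is unknown. A minimal sketch (function names are ours; the coefficients, 2012 volumes and rounded 2020 energy target are the paper's):

```python
# Coefficients of the estimated equation (2) and the 2012 volumes, from the paper.
A, B, C = 7383.002, 0.004838, -0.064364   # intercept, road, rail coefficients
ROAD_2012 = 188_415.0                     # thousand tonnes
RAIL_2012 = 55_755.0                      # thousand tonnes
ENERGY_2020 = 4_329.6                     # thousand TOE, 5345.1 * 0.81 (paper's rounding)

def rail_needed(road_2020):
    """Railway volume consistent with the 2020 energy target, given road volume."""
    return (ENERGY_2020 - A - B * road_2020) / C

def road_needed(rail_2020):
    """Road volume consistent with the 2020 energy target, given rail volume."""
    return (ENERGY_2020 - A - C * rail_2020) / B

# Scenario a): road frozen at the 2012 level -> rail must rise by about 10%
rail_a = rail_needed(ROAD_2012)
growth_a = (rail_a - RAIL_2012) / RAIL_2012   # ~0.1048
```

The same two helpers cover scenarios b)-f) by scaling the known mode's 2012 volume by 1.25 or 0.75; scenario f) returns a negative road volume, which is the arithmetic signal that the scenario is infeasible.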
4. Conclusion
The conducted research has demonstrated that railway transportation of goods is more eco-friendly
and cheaper than road transportation. Moreover, it has been found that even if road
freight transportation increases due to infrastructure necessities, an equal increase in the
amount of goods transported by railway leads to a considerable decrease of
the energy consumption. As 60% of the energy consumption in the transportation area is
explained by the amount of goods transported by railway and road,
the research highlighted that a well-balanced mix, in the form of intermodal transportation
between the two transportation modes, should be implemented in Romania in order to meet
the EU energy consumption target by 2020. The intermodal transport system stimulates the
creation of new technologies and the optimization of existing ones, reducing energy
costs while preserving the environment. The implementation of this mix needs
to be based on a realistic assessment of the existing freight market infrastructure and on
coordination between government policies and national and international strategies. Based on the
proposed transport scenarios, further research will be conducted at a later stage in order to show
how Romania can effectively improve its freight transportation sector while remaining sustainable
on the global market.
References
[1] G. Marin and G. J. Weber, “Considerations on strategic policies and evaluation criteria
for sustainable transport,” in Proc. The 13th International Conference on Informatics in
Economy Education, Research & Business Technologies, Bucharest, Romania, 2014, pp.
622-629.
[2] G. Marin and M. Olaru, “Strategic decisions on modal transport in Romania in connection
with the principles of sustainable development,” in Proc. The International Conference
Emerging Markets Queries in Finance and Business, Bucharest, Romania, 2014.
[3] J. Zheng, N. W. Garrick, C. Atkinson-Palombo, C. McCahill and W. Marshall,
"Guidelines on developing performance metrics for evaluating transportation
sustainability," Research in Transportation Business & Management, vol. 7,
pp. 164-168, 2013.
[4] M. Olaru, V. Dinu, G. Stoleriu, D. Sandru and V. Dincă, "Responsible Commercial
Activity of SMEs and Specific Values of Sustainable Development in Terms of the
European Excellence," Amfiteatru Economic, vol. 12, no. 27, pp. 10-27, 2010.
[5] M. Hart, Guide to Sustainable Community Indicators, 2nd ed. West Hartford, CT:
Sustainable Measures, 2006.
[6] T. Litman. (2014, June). Well measured: Developing indicators for comprehensive and
sustainable transport planning [Online]. Available: http://www.vtpi.org/wellmeas.pdf.
[7] World Commission on Environment and Development (WCED), Our Common Future.
Oxford University Press, England, 1987.
INNOVATION – CONTENT, NATIONAL INNOVATION STRATEGIES
AND MODELLING INNOVATION USING THE MICROECONOMIC
APPROACH
Stelian STANCU
The Bucharest University of Economic Studies
Centre for Industrial and Services Economics, Romanian Academy
[email protected]; [email protected]
Constanţa-Nicoleta BODEA
The Bucharest University of Economic Studies
Centre for Industrial and Services Economics, Romanian Academy
Oana Mădălina POPESCU
The Bucharest University of Economic Studies
Orlando Marian VOICA
The Bucharest University of Economic Studies
Laura Elly NAGHI
The Bucharest University of Economic Studies
Abstract. As innovation, a determinant factor of economic and social growth, enjoys
increasing recognition, the nature of innovation, its role and its determining factors are
receiving increasing attention. The paper highlights the distinction between invention and
innovation, emphasizing that innovation concerns the marketable application in practice of an
invention, namely the integration of the invention into economic and social practice. The
research, development and innovation strategy of Romania for the period 2014-2020
was defined in the context of the European strategy Europe 2020 and the European Cohesion
Policy. The scheme presented reveals the structure and functionality of the research,
development and innovation system in Romania. The microeconomic approach to the
economic and mathematical model of innovation services highlights the relationship between
innovation and competition, emphasizing the equilibrium rate of innovation
for each firm in an industry at the same level of competition and, respectively, the intensity of
R&D at the firm level in an industry with "uneven" competition.
Key words: innovation, invention, competition, R&D intensity, “even industry”, “uneven
industry”.
JEL classification: D2, O31.
1. Model of service innovation - microeconomic approach
The reality of recent years shows that the competition and innovation indicators are in a
direct relationship, namely:
Competition (C) ↑→ Innovation (I) ↑
1.1. The relationship between innovation and competition
Let n be the rate of innovation (the R&D intensity) at firm level (the hazard rate) and c a
measure of the competition.
Therefore, the relationship between innovation and competition can be written as follows:
n = e^g(c)   (1)
where g(·) represents an unknown function.
Hypothesis 1. It is assumed that patents have a Poisson distribution with the hazard
(innovation) rate n given by (1).
1.2. Analysis of the economic and mathematical model
Hypotheses of the model
- a representative consumer is considered (an identical lot of consumers);
- each consumer offers one unit of labour inelastically;
- r represents the inter-temporal discount rate;
- u(y_t) = ln y_t represents the utility function at the level of the representative consumer;
- the good y is produced at each moment t using a continuum of inputs, with the
production function given by:
ln y_t = ∫_0^1 ln x_jt dj   (2)
where x_j represents the sum of the two intermediate goods produced by the duopolists in
sector j, given by the substitution function x_j = x_Aj + x_Bj;
- the representative consumer picks x_Aj and x_Bj from the optimum problem at the level
of sector j:
max over x_Aj, x_Bj of (x_Aj + x_Bj), subject to p_Aj x_Aj + p_Bj x_Bj = 1   (3)
- each firm produces using labour as its single input; constant returns to scale (CRS) are
assumed; w represents the (given) wage rate;
- c_A and c_B represent the unit production costs of the two firms in the industry; they are
presumed independent of the quantity produced;
- let k_i be the technological level of firm i in the duopoly (industry) j.
Comments:
1. A unit of labour currently employed by firm i generates an output flow:
A_i = γ^k_i, with i = A, B   (4)
where γ > 1 is a parameter that measures the size of the innovation.
2. γ^(-k_i) represents the units of labour needed if firm i produces one unit of output.
- let φ(n) = n²/2 be the cost of R&D in units of labour;
- at this cost φ(n), the investing firm (the leader) moves one technological step
ahead, with a Poisson hazard rate n (the innovation rate or R&D intensity);
- we assume that the follower moves one step ahead with rate h, without
making any R&D expenses (by copying the leader's technology);
- n²/2 is the R&D cost of the follower in order to move one step ahead, with a Poisson
hazard rate n + h.
Let:
n_0 be the R&D intensity at the level of each firm such that the two firms are at the same
level of competition;
n_{-1} be the R&D intensity at the follower firm level, in an "uneven" competitive industry;
n_1 be the R&D intensity at the leader firm level, in an "uneven" competitive industry.
Observation: if n_1 = 0, then the leader (innovative) firm cannot obtain an advantage from
innovation.
The degree of competition on a product's market is inversely proportional to the degree to
which two firms at the same industry level are capable of reaching an understanding.
Moreover, two firms in an "uneven" industry will not get along. The follower will have zero
profit, while the leader's profit will equal the difference between its revenues, which have
been normalized to one unit, and its cost, meaning:
π_{-1} = 0
π_1 = 1 - γ^(-1)
Comments:
1. Each firm at industry level obtains a profit:
- equal to zero, if the firms are incapable of reaching an understanding (the firms are in a
Bertrand competition, meaning that they have identical products and identical unit costs);
- π_1/2, if there is a maximum understanding.
2. If we parametrize competition on the product's market with Δ = 1 - ε,
where:
Δ represents the competition parameter (it also represents the profit gain of the innovative
firm in an industry at the same level, normalized by the leader's profit);
ε represents the fraction of the leader's profit which the follower firm can obtain through an
understanding with the leader;
then:
π_0 = επ_1, if 0 ≤ ε ≤ 1/2.
The equilibrium research intensities¹ n_0 and n_{-1} and, consequently, the aggregate
innovation rate (the Schumpeterian and "escape-competition" effects)
Let n_0 and n_{-1} be the innovation rates at equilibrium, determined by the necessary
conditions for the stationary symmetric Markov equilibrium, in which each firm tries to
maximize its expected discounted profit (with discount rate r = 0).
Proposition 1. The equilibrium innovation rate, for each firm, in an industry at the same
level of competition, is given by:
n_0 = √(h² + 2Δπ_1) - h   (5)
which means that Δ↑ → n_0↑;
and respectively²:
n_{-1} = √(h² + n_0² + 2π_1) - h - n_0   (6)
which means that Δ↑ → n_{-1}↓.
Let:
μ_1 be the probability that the industry's equilibrium state is "uneven" and
μ_0 be the probability that the industry's equilibrium state is "even".
Over any given unit of time (interval), the probability that an industry at equilibrium (a firm
or a sector) changes from the "uneven" state to the "even" state is μ_1(n_{-1} + h), while the
probability that it changes from the "even" state to the "uneven" state is 2μ_0·n_0.
At equilibrium:
μ_1(n_{-1} + h) = 2μ_0·n_0   (7)
and, knowing that μ_1 + μ_0 = 1, it can be concluded that the aggregate flow of innovation,
I, is given by:
¹ Which vary according to the competition measure (the degree of competition).
² The innovation rate of a "lazy" (follower) firm.
I = μ_1(n_{-1} + h) + 2μ_0·n_0 = 4n_0(n_{-1} + h) / (2n_0 + n_{-1} + h)   (8)
The I-Δ relationship (the aggregate flow of innovation versus the degree (intensity) of
competition)
From Proposition 1 it can be concluded that the equilibrium innovation rate, n_0, grows with
the intensity of competition between the industry's firms.
As a consequence, n_0 can be used as a proxy measure of the competition on a product's
market, where:
n_0 ∈ [x_min; x_max], with x_min = √(h² + π_1) - h and x_max = √(h² + 2π_1) - h,
where x = x_min corresponds to a maximum agreement (π_0 = π_1/2) and x = x_max
corresponds to a maximum competition (π_0 = 0).
Comments:
a. When competition on the product's market is not strong, it is hard for each of the "even"
firms to innovate. The global rate will be highest when the market (sector) is in the
"uneven" state.
- if the industry stays most of the time in the "even" state (n_0 decreases in relation to π_0),
the escape-competition effect can thus be defined.
b. When the degree of competition is very high, in an "uneven" state there are few incentives
for the "lagging" firm to innovate. Consequently, the industry will be slow to leave the
"uneven" state.
- if the industry stays most of the time in the "uneven" state, the Schumpeter effect is
defined. [1]
2. Empirical analysis
The analysis of concentration on the electricity supply market shows a downward trend in
2009-2011, in both the HHI and the cumulative market share of the top 5 competitors.
However, a low level of concentration on this market can be noted. [2]
Figure 1. Evolution of the concentration degree on the electricity supply market towards end-consumers
Source: Data processed by the Competition Council
In 2009-2011, on the segment of supply to final consumers, fluctuations were registered in
the degree of concentration measured by the HHI, but the indicator reflects a low degree of
market concentration. At the same time, the concentration indicators CR5 and CR10 have
declined slightly, indicating an erosion of the key market participants' shares. [3]
With the previous notations, running the model presented yielded the following results for
the first two companies in the industry, in two specific situations: an even and,
respectively, an uneven industry:
Table 1. Results obtained in the case of an even industry and, respectively, an uneven industry

Even industry
Firm 1:            x_A = 1.732051   n_0 = 0.767764   π_0 = 0.222222
Firm 2:            x_B = 2.157669   n_0 = 0.767764   π_0 = 0.222222

Uneven industry
Firm 1 (leader):   x_A = 1.732051   n_1 = 0             π_1 = 0.666667
Firm 2 (follower): x_B = 2.157669   n_{-1} = 0.437529   π_{-1} = 0
Table 2. The probability of the state changing and the placement of the competition degree

Equilibrium analysis
Probability that, at equilibrium, an industry changes from uneven to even, μ_1(n_{-1} + h):  0.447989
Probability that, at equilibrium, an industry changes from even to uneven, 2μ_0·n_0:         0.447989
μ_0 (the probability that the equilibrium state of the industry is even):                    0.291749
μ_1 (the probability that the equilibrium state of the industry is uneven):                  0.708251
I (the aggregate innovation flow):                                                           0.895978

Distribution of firms
Maximum understanding: x = x_min = 0.644459 (π_0 = 0.333333)
Maximum competition:   x = x_max = 0.97605  (π_0 = 0)
Measure of the competition degree: n_0 belongs to [x_min = 0.644459; x_max = 0.97605]; x~ = 0.676106
From the data analysis in table 2, one can see the probability, at equilibrium, of an industry
changing from uneven to even and vice versa, as well as whether the competition degree n_0
falls between the minimum and maximum thresholds, corresponding to the minimum and
maximum degrees of competition respectively.
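The values in tables 1 and 2 can be reproduced from equations (5)-(8). The paper does not state the imitation rate h explicitly; h = 0.195, π_1 = 2/3 (γ = 3) and Δ = 2/3 (ε = 1/3) are backed out here from the reported n_0 and π_1, so treat them as an assumption:

```python
import math

# Assumed parameters, inferred from the reported results (not stated in the text):
h = 0.195            # follower's costless catch-up (imitation) rate
pi1 = 2.0 / 3.0      # leader's profit, pi_1 = 1 - 1/gamma with gamma = 3
delta = 2.0 / 3.0    # competition parameter, Delta = 1 - eps with eps = 1/3

# Equations (5) and (6): equilibrium R&D intensities
n0 = math.sqrt(h**2 + 2 * delta * pi1) - h           # even ("neck-and-neck") industry
n_1 = math.sqrt(h**2 + n0**2 + 2 * pi1) - h - n0     # follower in an uneven industry

# Flow-balance condition (7), combined with mu0 + mu1 = 1
mu1 = 2 * n0 / (2 * n0 + n_1 + h)    # probability the industry is uneven
mu0 = 1 - mu1                        # probability the industry is even

# Equation (8): aggregate innovation flow
agg_flow = 4 * n0 * (n_1 + h) / (2 * n0 + n_1 + h)

# Range of n0 as a proxy for product-market competition
x_lo = math.sqrt(h**2 + pi1) - h      # maximum agreement (pi0 = pi1/2)
x_hi = math.sqrt(h**2 + 2 * pi1) - h  # maximum competition (pi0 = 0)
```

With these inputs the script matches all reported figures (n_0 ≈ 0.7678, n_{-1} ≈ 0.4375, μ_1 ≈ 0.7083, I ≈ 0.8960, and the bounds 0.6445 and 0.9761).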
3. Conclusions:
1. At "even" industry level, we have stronger innovation regardless of the degree of
competition on the product's market [4];
2. The inverted-U (∩-shaped) curve is more abrupt for the "even" industries;
3. The relationship between innovation and market competition, estimated with a flexible
nonlinear estimator, has the shape of a ∩.
References
[1] OECD, La mesure des activités scientifiques et technologiques - Méthode type pour les
enquêtes sur la recherche et le développement expérimental (Manuel de Frascati), 5th
ed. 1993 and 6th ed. 2002.
[2] J. Calvert and B. R. Martin, "Changing conceptions of basic research," Workshop on
Policy Relevance and Measurement of Basic Research, Oslo, 29-30 October 2001.
[3] M. Dodgson, D. M. Gann and A. J. Salter, "The intensification of innovation,"
International Journal of Innovation Management, vol. 6, no. 1, p. 53, March 2002.
[4] P. I. Otiman and A. Pisoschi, "Unele consideraţii asupra legii privind cercetarea ştiinţifică
şi dezvoltarea tehnologică," Revista de Politica Ştiinţei şi Scientometrie, vol. I, no. 3,
p. 128, 2003.
RISK AWARENESS AS COMPETITIVE FACTOR FOR PUBLIC
ADMINISTRATION - A GERMAN CASE STUDY
Markus BODEMANN
The Bucharest University of Economic Studies
[email protected]
Marieta OLARU
The Bucharest University of Economic Studies
Ionela Carmen PIRNEA
“Constantin Brancoveanu” University, Rm.Valcea
Abstract. Public administration is under steady reconstruction: challenged by increasing
duties, a public demand for more intensive and varied services, and decreased funding,
strategies of activity and prevention are currently essential tasks. The sustainable execution
of mandatory and voluntary tasks is of prime importance. The adoption of New
Public Management elements aims to bridge the distinctions between private and public
management by applying more private-sector elements: competition, output and outcome
orientation, and responsibility for public managers. In Germany, the public administrations
have to declare, in their annual status report, the individual chances and threats they have
detected for a sustainable allocation of services to the citizens. This approach analyses and
compares the risk-related elements in the status reports of the state of North Rhine-Westphalia
and classifies them systematically to show chances for awareness and active influence.
Furthermore, the risks and benefits of a decision are presented in correlation. With greater
knowledge of risks and benefits, of their origin and of their potential kind of harm to the
citizenry, the prevention of a decrease of public value, as an indicator, will be a competitive
advantage of the public administration, in both the internal and the external context.
Keywords: new public management, internal and external risks, perception, forecast
JEL classification: H11, H41
1. Introduction
Remembering the fundamental approach of Woodrow Wilson at the end of the 19th century
[1], public managers have to track the legal and formal requirements but should also be aware
of changes in the internal and external environment. It is obvious, as demonstrated by the
current desolate financial situation of a great number of public administrations in Germany,
that a deviation from the traditional kind of managing is necessary. In some cases the
sustainable execution of mandatory tasks is endangered; it is elusive that the constitutional
tasks are reduced by financial bottlenecks. Searches for the sources of this imbalance reveal a
wide variety of possible reasons: an increasing number and complexity of tasks, triggered by
legislation and political programs at state, federal and European level, but, on the other side,
also the direct accountability of public managers and local political programs, as Bodemann
points out in 2011 [2], [3].
1.1. Introduction of tools by the New Public Management
After a century of public management based on the models developed by Woodrow Wilson,
Max Weber (bureaucracy) and Frederick Taylor (scientific management) [4], the introduction
of the models and tools of the New Public Management (NPM) opened public management
to private-sector management tools. The result is a convergence in applied management
methods, while considering the basic differences between the two sectors; financing,
competition and the provision of public goods are only a few to name. Simon (1998) explains
that modern scientists make important use of both private and public organization; for them
the boundaries are fluid [5]. The effects of this almost 40-year reform are now obvious:
transparency and comparability are only two effects the reform established in Germany. For
a better estimation of the current financial status and to bridge the gap between the private
and public sector, the German federal states are introducing, step by step, the double-entry
accounting system, replacing the former cash accounting. That former system was accepted
in terms of content, accuracy and extent, but it lacked, for example, depreciation. Therefore,
reinvestments for the maintenance of buildings or other assets or economic goods were
financed within the continuous budgeting. Using accrual accounting, the debt level becomes
obvious, as does the development of finances and the gap between revenues and expenses in
the local public budget. Other tools, like lean management and the delegation of
accountability in horizontal and vertical extension, are used to create more responsibility and
a step towards a specialized entrepreneurship inside the legal framework.
1.2. Closing the gap between public and private sector handling
Besides the increased public interest in public management and the execution of political
programs, other branches likewise have an increased interest in the current or expected
condition of the local public administration. The financial service industry waited for the
application of accrual accounting in the public sector: on the one hand, to get a better
impression of the sustainable development of the institution of public administration; on the
other hand, the rules and regulations of Basel II and III demand clear measures to check
debtors, their creditworthiness, their liquidity and their expected short- and middle-term
development of finances [6]. The former public sector is now no longer privileged, especially
after the publication of the factual bankruptcy of public administrations, first signalled by the
municipality of Leukerbad in Switzerland in the 1990s; this bankruptcy was the result of a
debt level of 350 million francs, a debt of 200,000 francs for each inhabitant [7].
1.3. Necessity for change for public managers and politicians
As shown later, the situation in Germany is similar; the tendency of decreasing own equity,
as one consequence of the discrepancy between revenues and expenses, is only one indicator
to be noticed, and it cannot be judged without an appropriate forecast system based on
financial indicators, historical and reference-class models, and expertise. Further, the social
and political impact and the combined implications have to be identified and estimated. Only
by bridging past and future can a more entrepreneurial approach be expected [8]. This
includes an awareness of the sources and reasons for developments in the past but, most
importantly, assumptions and forecasts of the influence factors threatening the sustainable
execution of mandatory tasks and the regaining of financial space to act and steer. Olaru et
al. (2014) come to a similar environmental focus for private-sector organizations [9].
2. Research methodology for indicating, pooling and processing positive and negative
factors
The focus of this research was the State of North Rhine-Westphalia, located in the middle
west of Germany. This state combines the complete spectrum of population density, from sparse in
the East to the very compressed area of Rhine and Ruhr. Furthermore, the whole spectrum of
production, industry and research is settled there, from farming to the chemical industry and
IT technology. Correspondingly, the duties of the public administrations are similarly varied.
The population surveyed comprises 271 municipalities with city rights and 125 municipalities
without city rights, in sum 396 politically independent public administrations, plus 30
county administrations [10]. For the research, 48 municipalities (12%) were selected from all
over the state to obtain a wide spectrum of identified desired and undesired developments,
based on each individual public administration. To integrate the view and status of the county
administrations, 10 counties were evaluated (33%). The objective was to summarize the
presented factors, create categories and interpret the number and choice of risk factors.
3. Risk awareness as argument for future decisions in management and politics
Deviating from the three columns of Wilson, Weber and Taylor, the identification of
uncertain developments is one of the most important and also most effective tools for being
prepared. The results of not being aware and prepared are an acceleration of the debt level, a
decrease of own equity and a decreased reputation, and therefore worse credit conditions,
which lead to the same consequences. The suggested management models, performance
measurement, decentralized accountability and competition within boundaries with the
private market will be sacrificed to the effort to provide the basic mandatory tasks to society.
Remarkable is the difference in presenting adverse and desired effects on further
developments of the public administration as guarantor of individual and collective welfare.
Already 20 years ago, Mussurari (1996) stated that local governments believed they could
find the reason and the jurisdiction for their existence within themselves [11].
Under the regulation of "Kameralistik" (cash accounting), input orientation was established.
Analyzing the given adverse effects, this tendency to concentrate on expected revenues to be
spent in the future is already settled. There are only a few mentions which imply an output
or, more sophisticated, an outcome orientation of the services and products provided.
4. Results of the survey in Germany
Similar to the private sector, municipalities in Germany, especially in the State of North
Rhine-Westphalia, have to prepare and publish a compulsory status report. Beyond the
standard numbers and reports, chances and risks have to be declared, together with their
effect on the sustainable provision of mandatory and voluntary tasks [12]. For the political
and administrative discussion this is an enormous step towards more transparency, but also
towards an increase in accountability and the necessity to estimate the consequences of
decisions, as Denhardt and Denhardt (2009) explain the tasks [13]. Analyzing the given
status reports, it becomes evident that the municipalities have a wide area of interpretation of
the demanded chances and risks. While some describe their estimation of future development
very clearly, others use the status report as a kind of political statement. In those reports,
neither concrete suggestions nor a forecast of future events or uncertain influences are given.
So the quality and quantity vary.
Using a correlation of size and budget volume shows the same spread. While a more detailed
presentation of risks and chances would be expected in the case of rapidly decreasing own
equity of the public administration, to prove serious concern about future developments, the
results are sobering. One explanation for the inexperience in using these tools as justification
for future decisions, and for the awareness of uncertain factors for operations and finances, is
the short time of application of double-entry accounting. Accompanied by other parts of the
New Public Management toolbox, the increasing convergence of governance rules for
transparency and accountability between the public and private sector is directed at more
efficiency and effectiveness; Forrer et al. (2010) add, from the legal side, equity as a basis
for operations [14]. Remarkable in the results is the disproportion between factors that can
be influenced and factors that cannot. Comparing the initial reasons for desired and undesired
developments and results, the survey summarizes the following risk and chance factors.
While endogenous factors contain all factors which can be influenced, at least partially, by
the public administration, exogenous factors lie outside the sphere of influence.
The following range of factors resulted from the analysis of the status reports:
Table 1. Types of positive and negative influence factors

1. Exogenous factors
1.1. Negative factors
1.1.1. Social development: integration of disabled people, youth welfare, demographic
development, taxes / revenues, education, social benefits
1.1.2. General: lawsuits, connectivity principle
1.1.3. Finances: transfers, Basel II / III, interest payments, financial operations, taxes
1.2. Positive factors: geographic status, reconstruction, economic stimulus
2. Endogenous factors
2.1. Negative factors
2.1.1. Infrastructure: public safety and order, own equity, kindergarten
2.1.2. Provision of services of general interest: waste disposal, energy, water, street
maintenance
2.1.3. Operations: personnel costs
2.1.4. Finances: guarantees, holdings / shares
2.2. Positive factors
2.2.1. Infrastructure: commercial areas, reconstruction, energetic measures, city development
2.2.2. Finances: holdings / shares, interest payments, taxes
2.2.3. Operations: organizational improvement, internal control systems, public works,
co-operations
Source: according to the research conducted by the authors
The spectrum of positive influence factors contains 3 exogenous and 14 endogenous factors,
while in the case of negative factors 12 come from exogenous and 10 from endogenous
sources. On average, each public administration presented 3.4 influence factors, of which 2.7
were negative and 0.7 positive; the maximum number of factors given by a single
administration was 11.
The following graph shows the widespread uncertain factors reported by the public
administrations. Most of the examples have a legal basis and lie outside any influence. These
mandatory tasks, combined with the neglect of the connectivity principle, dominate
operations; the space for voluntary tasks is very limited.
Figure 1. Exogenous and endogenous negative factors and their shares
Although the uncertain factors for sustainable planning are partly described in a very
detailed way, only a few reports give concrete examples of the consequences in monetary units.
Figure 2. Exogenous and endogenous positive factors and their shares
In the case of endogenous uncertain factors, which can be steered and calculated in
monetary units, guarantees are often described with the initial volume and its decrease over
the runtime. For concrete reconstructions, only a few reports give examples of executed or planned
measures.
5. Conclusions
The results of the research undertaken show a wide spectrum of uncertain factors; these
factors can have a desired or undesired influence on the sustainable provision of mandatory
operations and on the space created for voluntary tasks. The negative factors outnumber the
positive ones, but among the given factors only a few are combined with monetary values
that would give a better impression of possible outputs and outcomes. Because a model
defining risk, uncertainty and certainty is missing, the given expectations present differing
pictures of risk understanding. It should also be noted that a clear priority is given to
measures with positive effects, for example the provision of commercial areas and city
development; the range of negative factors is comparatively diffuse.
This leads to the conclusion that public administrations in Germany have a clear
understanding of influence factors but lack instruments for calculating the outcomes should
adverse factors materialize. For strategic use, however, the identification of these factors will
have an impact on political targets and the sustainable execution of mandatory tasks.
For better comparison, general categories and monetary volumes would give a more
sophisticated and useful impression of the current status and the expected range of developments.
References
[1] W. Wilson, "The Study of Administration," Political Science Quarterly, vol. 2, no. 2, pp.
197-222, June 1887.
[2] M. Bodemann, "Risk awareness als Schlüssel zu einem erfolgreichen Risikomanagement"
in: A. Niedostadek, R. Riedl, and J. Stember (eds.), "Risiken im öffentlichen Bereich,"
Berlin: Lit Verlag, 2011, pp. 87 – 109.
[3] M. Bodemann, M. Olaru and I.C. Pirnea, ”Risk awareness as key success factor for more
efficient management for local public administrations”, Proceedings of The First
International Management Conference in Danube Delta ”Challenges, performances and
tendencies in the organization management”, ASE Publishing House, 2014.
[4] H.G. Rainey, "Understanding and managing public organization", San Francisco, John
Wiley & Sons, 2009.
[5] H.A. Simon, "Why public administration," Journal of Public Administration, Research
and Theory, pp. 1 -11, January 1998.
[6] B. Loevenich, “Der Umgang mit operationellen Risiken in Kreditinstituten – Vorbild für
öffentliche Verwaltungen?,” in: A. Niedostadek, R. Riedl, and J. Stember (eds.),
"Risiken im öffentlichen Bereich," Berlin: Lit Verlag, 2011, p. 263.
[7] D. Rehfeld, “Rating von Kommunen," in: E. Meurer and G. Stephan, „Rechnungswesen
und Controlling,“ Stuttgart: Haufe, 2005.
[8] C. Pollitt and G. Bouckaert, “Public Management Reform,” Oxford: Oxford University
Press, 2004, pp. 538 – 542.
[9] M. Olaru, D. Maier, A. Maier and M. Bodemann, “Integrated management systems, Key
factor for the sustainable development of an organization,” in: Proceedings of the 13th
International Conference on Informatics in Economy IE 2014 Bucharest, 2014.
[10] “Verwaltungsgliederung in Deutschland am 31.12.2013“. Internet:
https://www.destatis.de/DE/ZahlenFakten/LaenderRegionen/Regionales/Gemeindeverzei
chnis/Administrativ/Archiv/Verwaltungsgliederung/31122013_Jahr.html
[11] R. Mussari, "Autonomy, responsibility and New Public Management," St. Gallen:
Conference on New Public Management in International Perspective., 1996.
[12] B. Loevenich, “Der Umgang mit operationellen Risiken in Kreditinstituten – Vorbild für
öffentliche Verwaltungen?,” in: A. Niedostadek, R. Riedl, and J. Stember (eds.), "Risiken
im öffentlichen Bereich," Berlin: Lit Verlag, 2011, p. 257.
[13] R.B. Denhardt and J.V. Denhardt, "Public administration: An action orientation,"
Belmont: Thomson Wadsworth, 2009.
[14] J. Forrer, J.E. Kee, K.E. Newcomer and E. Boyer, "Public – private partnerships and the
public accountability question", Public Administration Review, pp. 475-484, 2010.
SOVEREIGN RISK DEPENDENCE PATTERN IN
EMERGING EUROPE
Gabriel GAIDUCHEVICI
Romanian Academy [email protected]
Abstract. In this study we implement a flexible model to assess both the strength and the
pattern of sovereign risk dependence between several European countries with emerging
economies. We employ a market risk approach to analyzing sovereign risk and provide a
synoptic interpretation of the results facilitating a better understanding of the
interconnectedness of sovereign risk. We conclude that sovereign risk is not a country-specific
type of risk but rather a reflection of both internal factors and macroeconomic forces
external to the country.
Keywords: copula, GARCH, HAC, sovereign risk
JEL classification: C22, C46, H63
1. Introduction
The purpose of this study is to implement a flexible model to assess both the strength and the
pattern of sovereign risk dependence between several European countries. We aim at giving a
synoptic interpretation to sovereign risk, in the sense that we would like to see whether the
risks associated with these countries move in tandem or can be analyzed independently.
We pursue a novel implementation by using an extensive data set of credit default swap
(CDS) contracts on the debt issued by 11 European emerging countries. This market risk
approach to analyzing sovereign risk has at least two advantages when compared to more
traditional macroeconomic approaches: i) as CDS spreads reflect the market perspective on
the credit risk of a particular country we believe that these indicators are tightly linked to the
general health of the country’s finances. In addition, using market data improves accuracy
because it reacts promptly to changes in the macroeconomic environment; and ii) CDS
contracts are more liquid than the corresponding sovereign bond market allowing for a
cleaner extraction of the credit risk premium from the CDS spread.
Given the rapid expansion of sovereign debt markets, understanding the nature of sovereign
risk is of increasing importance because it directly affects the cost and flow of capital across
countries. Furthermore, understanding sovereign risk and its interconnectedness improves the
ability of market participants to assess the implications of their investments, especially with
respect to diversification of debt portfolios. Existing literature focuses more on the incentives
and ability of sovereign debtors to repay their debt and, despite its importance, there is
relatively little research on the sources of commonality and dependence in sovereign risk. Of
particular relevance are the studies of [1] who map agency ratings to default losses and then
decompose sovereign credit spreads into credit risk and premium components for a panel of
24 countries. [2] use an affine sovereign credit model to demonstrate how common factors
could induce significant correlation among credit spreads. Our approach is different in that
we seek to describe the interconnectedness of sovereign risk rather than predict individual
country default. We concur that describing an economy by an index might overlook some
information but at the same time we argue that it adds simplicity and produces more realistic
results because it reduces the model dependency on extensive sets of assumptions. Our
implementation is a combination between the breadth provided by high dimensionality and
the generality of using aggregated indicators.
The paper proceeds as follows: section 2 presents the modeling framework, section 3
describes the implementation and section 4 concludes.
2. Modeling framework
From a methodological perspective, our goal is to disentangle the idiosyncratic components
from the common factors driving the sovereign risk. We employ a copula-GARCH approach
and proceed by dividing the study in two phases. First, we filter the univariate series to
extract all temporal dependence. The resulting cross sectional panel of standardized residuals
reflects only the pure joint dependence. Second, we fit a multidimensional hierarchical
Archimedean copula (HAC) to describe the pattern of association. One advantage of the
copula-GARCH approach is the possibility to specify and estimate the model in stages. The
marginal distributions are specified by an ARMA-GARCH model for each univariate time
series and then a copula (in this case a HAC) is estimated on the probability integral
transforms of the standardized residuals. The result is a valid multi-dimensional joint
distribution that is easier to estimate and interpret.
In the context of this analysis we are interested in modeling the cross sectional dependence
between time series data and therefore we employ an adapted version of Sklar's theorem
introduced by [3]. The multivariate distribution 𝐹 of a vector 𝑌𝑡, conditional on the
information set available at time 𝑡 − 1 given by ℱ𝑡−1 = {𝑌𝑙: 𝑙 ≤ 𝑡 − 1}, is decomposed into
its conditional margins 𝐹𝑖 and the corresponding conditional copula in the following way:
𝐹(𝑦|ℱ𝑡−1 ) = 𝐶{𝐹1(𝑦1|ℱ𝑡−1), … , 𝐹𝑝(𝑦𝑝|ℱ𝑡−1)|ℱ𝑡−1 }
with 𝑌𝑖𝑡|ℱ𝑡−1 ∼ 𝐹𝑖(⋅ |ℱ𝑡−1), 𝑖 = 1, … , 𝑝 (1)
Fitting a copula on the unconditional probability integral transform will result in an
unconditional copula model for the dependence. In a time series context however, it is
necessary to condition on the available past information which first requires the specification
of the margins and then the copula that joins the series cross-sectionally. If we define the
probability integral transform 𝑈𝑖𝑡 = 𝐹𝑖(𝑌𝑖𝑡|ℱ𝑡−1) then the conditional copula of 𝑌𝑡|ℱ𝑡−1 is
given by 𝑈𝑡|ℱ𝑡−1 ∼ 𝐶(⋅ |ℱ𝑡−1). It is important to note that both the margins and the copula
have to be conditional on the same data set. We use the standard ARMA-GARCH approach
to model the univariate distributions by specifying the following general model for each
univariate series:
𝑌𝑖𝑡 = 𝜇𝑖(𝑌𝑡−1) + 𝜎𝑖(𝑌𝑡−1)𝜖𝑖𝑡 (2)
where 𝜖𝑖𝑡 ∼ 𝐹𝑖(0,1), ∀ 𝑡. The conditional copula is fitted on the conditional distribution of the
probability integral transform of the standardized residuals constructed as:
𝜖̂𝑖𝑡 = (𝑌𝑖𝑡 − 𝜇𝑖(𝑌𝑡−1)) / 𝜎𝑖(𝑌𝑡−1), 𝑖 = 1, 2, … , 𝑝 (3)
The parametric form of 𝐹𝑖 has to be able to accommodate thicker than normal tails and
possibly an asymmetric shape. For this analysis, we tested both the normal distribution and
the Student-t for its ability to control the thickness of the tails via the degrees of freedom
parameter.
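The filtering step in (2)-(3) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the parameter values and the simulated series are hypothetical, and a constant mean replaces the ARMA term for brevity.

```python
import math
import random

def garch_filter(returns, mu, omega, alpha, beta):
    """Filter a return series with a constant-mean GARCH(1,1) model and
    return the standardized residuals eps_t = (y_t - mu) / sigma_t."""
    n = len(returns)
    mean_r = sum(returns) / n
    # Initialize the conditional variance at the sample variance.
    var = sum((r - mean_r) ** 2 for r in returns) / n
    std_resid = []
    for y in returns:
        sigma = math.sqrt(var)
        e = y - mu
        std_resid.append(e / sigma)
        # GARCH(1,1) recursion for the next period's conditional variance.
        var = omega + alpha * e ** 2 + beta * var
    return std_resid

# Hypothetical daily log-returns (the real input would be the CDS series).
random.seed(42)
sample = [random.gauss(0.0, 0.03) for _ in range(500)]
z = garch_filter(sample, mu=0.0, omega=1e-5, alpha=0.1, beta=0.85)
print(len(z))  # 500: one standardized residual per observation
```

The resulting standardized residuals are what the copula stage is fitted on, after the probability integral transform.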
Archimedean copulas are related to the Laplace transforms of univariate distribution
functions. According to [4] if we denote by 𝕃 the class of Laplace transforms that consist of
strictly decreasing differentiable functions, then the function 𝐶: [0,1]𝑑 → [0,1] defined as:
𝐶(𝑢1, … , 𝑢𝑑; 𝜃) = 𝜙{𝜙⁻¹(𝑢1) + ⋯ + 𝜙⁻¹(𝑢𝑑)}, 𝑢1, … , 𝑢𝑑 ∈ [0,1] (4)
is a d-dimensional exchangeable Archimedean copula where 𝜙 ∈ 𝕃 is called the generator
function and 𝜃 is the copula parameter. Archimedean copulas provide an elegant solution to
accommodate tail dependence in non-elliptical distributions. However, fitting a fully nested
structure to a large data set is unfeasible. This disadvantage comes from the fact that the
multivariate dependence structure typically depends on a single parameter of the generator
function. Furthermore, the resulting distribution is exchangeable which means the
dependence is symmetric with respect to the permutation of the variables. HACs alleviate
these shortcomings by providing an efficient way to recursively define the dependence
structure for large dimensional data sets. Using the same notation as in (4), a fully nested
HAC connecting 𝑑 − 1 nesting levels is defined recursively by the following relation:
𝐶(𝑢1, … , 𝑢𝑑) = 𝜙𝑑−1{𝜙𝑑−1⁻¹ ∘ 𝜙𝑑−2{⋯ [𝜙2⁻¹ ∘ 𝜙1{𝜙1⁻¹(𝑢1) + 𝜙1⁻¹(𝑢2)} + 𝜙2⁻¹(𝑢3)] + ⋯ + 𝜙𝑑−2⁻¹(𝑢𝑑−1)} + 𝜙𝑑−1⁻¹(𝑢𝑑)}
= 𝜙𝑑−1{𝜙𝑑−1⁻¹ ∘ 𝐶(𝜙1, … , 𝜙𝑑−2)(𝑢1, … , 𝑢𝑑−1) + 𝜙𝑑−1⁻¹(𝑢𝑑)}
= 𝐶𝑑−1{𝐶𝑑−2(𝑢1, … , 𝑢𝑑−1), 𝑢𝑑} (5)
According to [5], such a structure is determined recursively, starting at the lowest level with a
copula 𝜙1 forming a variable 𝑧1 = 𝜙1{𝜙1⁻¹(𝑢1) + 𝜙1⁻¹(𝑢2)}. At the second level another
copula is used to capture the dependence between 𝑧1 and 𝑢3, and so on. The generators 𝜙𝑖 may
come from the same family and differ only in parameter or may come from different
generator families. [5] propose an efficient method to determine the optimal structure. The
estimation procedure relies on a recursive multi-stage maximum likelihood method which
determines the parameters at each level and the structure simultaneously (the structure itself
is in fact a parameter to estimate).
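The building blocks of such a nested structure can be illustrated with the Gumbel generator used later in the paper. The sketch below is illustrative only (the θ values are hypothetical, not estimates); it shows how a bivariate Gumbel copula is nested recursively, with the stronger dependence (larger θ) at the inner node.

```python
import math

def gumbel(u, v, theta):
    """Bivariate Gumbel copula C(u, v; theta) with theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def nested_hac(u1, u2, u3, theta_outer, theta_inner):
    """Fully nested 3-dimensional HAC with Gumbel generators: the inner
    (more dependent) pair uses theta_inner >= theta_outer >= 1."""
    return gumbel(gumbel(u1, u2, theta_inner), u3, theta_outer)

# theta = 1 reduces Gumbel to the independence (product) copula.
print(round(gumbel(0.5, 0.5, 1.0), 6))      # 0.25
# A nested evaluation; any copula value is bounded above by min(u1, u2, u3).
print(0.0 < nested_hac(0.9, 0.8, 0.7, 1.2, 2.5) <= 0.7)  # True
```

The requirement that parameters increase toward the bottom of the tree is exactly the condition `theta_inner >= theta_outer` in the sketch.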
3. Model implementation and results
We apply the methodological framework described above to an extensive data set of CDS
contracts on the sovereign debt of 11 European emerging economies. Briefly, a CDS is a
contract whereby the seller provides insurance to the buyer against the losses resulted from a
default by the reference entity. The CDS spread is quoted in basis points for different
maturities across the credit curve. Our data set consists of the daily, 5 year maturity, CDS
spreads of 11 countries spanning a period of roughly 5 years. Figure 1 depicts the time
evolution of the CDS spreads for each country in the data set. Countries are identified by
their ISO country code. The levels of our CDS data are indicative of autoregressive processes
and, similar to other market data series, they are unlikely to follow a random walk (they are
also bounded below). We model in log-differences to avoid treating the series as near-unit-root
processes. All data were retrieved from the Bloomberg Database.
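The log-difference transform mentioned above is straightforward; as a minimal sketch (the spread values are hypothetical):

```python
import math

def log_returns(spreads):
    """Daily log-differences y_t = log(S_t / S_{t-1}) of a spread series."""
    return [math.log(s1 / s0) for s0, s1 in zip(spreads, spreads[1:])]

cds = [250.0, 255.0, 251.0, 260.0]   # illustrative spreads in basis points
y = log_returns(cds)
print(len(y))  # 3: one fewer observation than the level series
```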
Using the methodology of [6] the CDS spread can be expressed as:
𝑆𝑖𝑡 = 100² · 𝑃𝑖𝑡 · ℳ𝑖𝑡 · 𝐿𝑖𝑡 (6)
where 𝑃𝑖𝑡 is the probability of default, ℳ𝑖𝑡 is the market price of risk and 𝐿𝑖𝑡 is the loss given
default (LGD). In our analysis we work with the log-returns of CDS spreads to mitigate their
autoregressive persistence under the following transformation:
𝑦𝑖𝑡 = 𝛥 log 𝑆𝑖𝑡 = 𝛥 log 𝑃𝑖𝑡 + 𝛥 log ℳ𝑖𝑡 + 𝛥 log 𝐿𝑖𝑡 (7)
If we follow business cycle theory, the market price of risk is constant or evolves
slowly. If we further assume, in accordance with common practice, a constant LGD, then
the second and third terms in (7) vanish, leaving the changes in CDS spreads to be directly
attributed to the changes in the empirical probability of default. Table 1 presents the summary
statistics of the CDS spreads both in level form and in log-returns. Of particular
importance for our analysis is the high kurtosis of the log-return series, as this imposes the
need for a distribution that captures the thickness of the tails (and possibly skewness).
Figure 1. CDS spreads for countries in data set (ISO country codes)
Table 1. Summary statistics of CDS spreads both in level and log-return forms
     Levels (in basis points)           Log returns
     1st Qu  Median  Mean  3rd Qu      Mean     Std     Skew      Kurt
ROU     189     252   278     344   -0.0011  0.0286  -0.3687    7.6017
HUN     238     291   322     381   -0.0007  0.0292   0.1818    8.8396
POL      82     126   137     166   -0.0009  0.0365   0.2100    7.0302
CZE      59      82    91     109   -0.0008  0.0335  -0.1096   12.7427
SVN      74      85   111     114   -0.0007  0.0359  -0.0003    9.5602
BGR     126     215   229     287   -0.0007  0.0307   0.0317    7.8669
HRV     252     288   318     344   -0.0004  0.0271   0.3448   10.2133
SVK      81     136   193     302    0.0000  0.0394   1.8607   34.2530
EST      64      93   137     132   -0.0013  0.0312  -0.9213   16.5526
LVA     121     245   292     351   -0.0014  0.0291  -0.7707   25.9898
LTU     128     227   243     291   -0.0012  0.0280  -0.3372   23.3966
Daily CDS spreads have more autocorrelation (i.e. risk persistence) than is found in other
market data. Therefore our conditional mean-variance models need more structure than the
models commonly used for daily stock returns. Applying an ARMA-GARCH process to each
return series removes the temporal correlation but preserves the cross-sectional dependence.
Like most asset returns, our series are not stationary (at least in the variance) and we
therefore followed the standard mean-variance model building approach to make our series
temporally independent. The order of the mean equation was determined by comparing the
BIC of ARMA models of orders up to 5. An AR(1) process was optimal for most of the
series, a decision also confirmed by the partial autocorrelation function of the squared
log-returns. Then, we used the squared residuals from the mean equation to test for
conditional heteroskedasticity. Applying the Ljung-Box test to the first 12 squared
log-returns revealed p-values very close to zero, which gives a strong indication for rejecting
the null and a motivation to introduce a conditional variance equation. To handle the
fat-tail characteristics we used the Student-t distribution. Taking all of the above into
consideration, we implemented the following form of mean-variance model:
𝑦𝑖𝑡 = 𝜇𝑖 + 𝜃𝑖1𝑦𝑖,𝑡−1 + 𝑒𝑖𝑡
𝑒𝑖𝑡 = 𝜎𝑖𝑡𝜖𝑖𝑡, 𝜖𝑖𝑡 ~ Student-t(𝜈)
𝜎²𝑖𝑡 = 𝜔𝑖 + 𝛼𝑒²𝑖,𝑡−1 + 𝛽𝜎²𝑖,𝑡−1 (8)
and the results are presented in Table 2. For ROU and BGR the degrees-of-freedom parameter
came out not significant, so we used the normal distribution instead. Similarly, for POL, CZE
and SVN the AR(1) parameters were not significant, therefore we used only the variance
equation - GARCH(1,1). The HAC is fitted on the standardized residuals obtained from
(8). Taking into account the dependence characteristics of each pair of standardized residuals
(judging by the scatterplots), we decided to use the Gumbel generator at each node. HACs
have at least two interesting characteristics: first, the structure is recursive, which entails that
the marginal distribution at each node in the tree is also a HAC; for ease of interpretation,
however, we opted for a fully nested HAC (i.e. a binary copula at each node); second, if
the same copula, with a single parameter, is used at each level, then the parameters should
increase with the levels. This provides an intuitive interpretation of the copula tree, as the
dependence at the bottom is stronger than at the top. In addition, we chose to present the
results in Figure 2 using the equivalent Kendall's 𝜏, as it is easier to interpret than the
respective copula parameter (𝜏 represents the rank correlation and is bounded by [0,1]). Copula estimation was performed in R using the HAC package by [7].
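For the Gumbel family there is a known closed form linking the copula parameter to Kendall's 𝜏, namely 𝜏 = 1 − 1/θ, which is how a fitted θ can be reported on the 𝜏 scale; a minimal sketch of the conversion:

```python
def gumbel_tau(theta):
    """Kendall's tau implied by a Gumbel copula parameter theta >= 1."""
    return 1.0 - 1.0 / theta

def gumbel_theta(tau):
    """Inverse map: Gumbel parameter implied by Kendall's tau in [0, 1)."""
    return 1.0 / (1.0 - tau)

print(gumbel_tau(2.0))    # 0.5
print(gumbel_theta(0.5))  # 2.0
```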
Table 2. Parameter estimates of the ARMA-GARCH processes
𝜇 𝜃1 𝜔 𝛼 𝛽 𝜈
ROU -0.0013 0.1441 0.0001 0.1540 0.7331
(0.0006) (0.0309) (0.0000) (0.0286) (0.0420)
HUN -0.0011 0.1275 0.0001 0.2334 0.7594 5.1850
(0.0005) (0.0252) (0.0000) (0.0726) (0.0661) (0.3167)
POL -0.0018 0.0001 0.2576 0.7140 4.4670
(0.0006) (0.0000) (0.0520) (0.0317) (0.3593)
CZE 0.0090 0.0000 0.3225 0.4894 3.5680
(0.0043) (0.0000) (0.0747) (0.0189) (0.0654)
SVN -0.0010 0.0002 0.4144 0.5556 3.2352
(0.0004) (0.0001) (0.1848) (0.0464) (0.1513)
BGR -0.0007 0.0874 0.0001 0.1311 0.8140
(0.0003) (0.0304) (0.0000) (0.0224) (0.0291)
HRV -0.0004 0.1161 0.0001 0.2981 0.6160 3.2850
(0.0003) (0.0250) (0.0001) (0.1566) (0.0481) (0.1338)
SVK -0.0002 -0.0450 0.0004 0.3181 0.6593 2.1472
(0.0004) (0.0224) (0.0001) (0.1157) (0.0551) (0.0522)
EST -0.0001 -0.0466 0.0000 0.1874 0.7340 3.1830
(0.0001) (0.0218) (0.0000) (0.0911) (0.0388) (0.0398)
LVA -0.0012 0.0000 0.2586 0.6530 3.5460
(0.0003) (0.0000) (0.0697) (0.0196) (0.0546)
LTU -0.0001 0.0000 0.1795 0.6878 3.4780
(0.0002) (0.0000) (0.0732) (0.0247) (0.0466)
Figure 2. Pattern of sovereign risk dependence
4. Conclusions
In this study we determined the degree of interconnectedness of sovereign risk and the pattern
of association in the credit risk of these countries. Several important conclusions derive from
this study: i) copula parameters are significant across the tree, meaning that sovereign
risk is not a country-specific type of risk but rather a reflection of both internal factors and
macroeconomic forces external to the country; ii) judging by how the countries are grouped, it
is evident that countries that have adopted the euro (SVN, SVK, EST) have a different risk profile and
a weaker connection to the others; iii) the newest members of the EU - ROU, BGR and HRV -
are grouped together, indicating similar and interconnected sovereign risk; ROU and BGR
present the strongest risk dependence, which is in line with the general perception of seeing
and referencing these countries together; iv) the countries that joined the Eurozone most recently -
LVA and LTU - are grouped together with CZE, the strongest economy in the group.
Acknowledgment
This work was financially supported through the project "Routes of academic excellence in
doctoral and post-doctoral research - REACH" co-financed through the European Social
Fund, by Sectoral Operational Programme Human Resources Development 2007-2013,
contract no POSDRU/159/1.5/S/137926.
References
[1] E. Remolona, M. Scatigna and E. Wu, “The Dynamic Pricing of Sovereign Risk in
Emerging Markets.” The Journal of Fixed Income, vol. 17, no. 4, pp. 57-71, 2008
[2] J. Pan and K. Singleton, “Default and Recovery Implicit in the Term Structure of
Sovereign CDS Spreads”, Journal of Finance, vol. 63, no. 5, pp. 2345-2384, 2008
[3] A. Patton, “Modelling asymmetric exchange rate dependence”, International Economic
Review, vol. 47, no. 2, pp. 527-556, 2006
[4] H. Joe, Multivariate Models and Dependence Concepts. London: Chapman & Hall, 1997.
[5] O. Okhrin, Y. Okhrin and W. Schmid, “On the Structure and Estimation of Hierarchical
Archimedean Copulas”, Journal of Econometrics, vol. 173, no. 2, pp. 189-204, 2013.
[6] P. Carr and L. Wu, “A simple robust link between American puts and credit protection”,
Review of Financial Studies, vol. 24, no. 2, pp. 473-505, 2011
[7] O. Okhrin and A. Ristig, “Hierarchical Archimedean Copulae: The HAC Package”,
Journal of Statistical Software, vol. 58, no. 4, 2014
MEASURE YOUR GENDER GAP: WAGE INEQUALITIES USING
BLINDER OAXACA DECOMPOSITION
Radu-Ioan VIJA
The Bucharest University of Economic Studies, Economic Cybernetics and Statistics
Doctoral School
Ionela-Catalina ZAMFIR
The Bucharest University of Economic Studies, Economic Cybernetics and Statistics
Doctoral School
Abstract. Nowadays, we can observe many forms of discrimination, from everyday racial
discrimination to wage discrimination based on age or gender. This article explains and
demonstrates the wage inequalities between men and women by decomposing and analyzing
wage data using the Blinder-Oaxaca statistical technique for linear regression models. The
analysis in this article emphasizes the importance of wage inequalities in private
companies as well as in public institutions by identifying the main factors/statistical variables
which play an important role in non-discriminatory inequality and especially in
discriminatory inequalities.
Keywords: Blinder-Oaxaca decomposition, cluster, linear regression, wage gap, data mining.
JEL Classification: C13, J71
1. Introduction and literature review
In the majority of cases, the Blinder-Oaxaca method is used to study wage gaps by sex and
race. From a statistical point of view, the Blinder-Oaxaca decomposition represents a system of
linear regression equations. The discrimination problem is widespread these days, especially
as there are more and more organizations, rights and laws dealing with the fight against
discrimination. Its measurement and combat is a common topic, and modeling became
possible with the proposal of new models. Since the original Oaxaca and Blinder (1973)
decomposition technique, many studies have been made (mostly in education or social areas)
to test the original model, its application areas and its results (such as: Dodoo, 1991; Farkas
and Vicknair, 1996; DeLeire, 2001; Sayer, 2004; Yun, 2006; Stearns et al., 2007; Berends
and Penaloza, 2008; Becker, 1971; Duncan, 1969; Ashenfelter, 1987; Altonji, 1999;
Althauser, 1972). This study is structured as follows: section 2 presents the
methodology approach, section 3 is the case study, presenting the dataset used and the results
of the models applied, and section 4 shows the conclusions and further research.
2. Methodology
2.1. The Blinder-Oaxaca statistical model
We consider the following conditional regression equation model:
Y_ℓ = α_ℓ + β_ℓ X_ℓ + ε_ℓ, ℓ ∈ {A, B} (1)
with E(ε_ℓ) = 0 for ℓ ∈ {A, B}. We are interested in explaining the difference ∆:
∆ = Ȳ_A − Ȳ_B (2)
Considering all the co-variables and, possibly, a discriminative effect which will be
decomposed and analyzed later in this article, Blinder (1973) and Oaxaca (1973) proposed
the following decomposition of the difference ∆:
∆ = (Ȳ_A − Ȳ_B*) + (Ȳ_B* − Ȳ_B) (3)
where Ȳ_B* = α_A + β_A X̄_B corresponds to the counterfactual model (the outcome predicted
for population B using population A's coefficients). Thus:
∆ = β_A(X̄_A − X̄_B) + (α_A − α_B) + (β_A − β_B)X̄_B (4)
where δ_1 = β_A(X̄_A − X̄_B) represents the difference explicable by the populations' own
characteristics and δ_2 = (α_A − α_B) + (β_A − β_B)X̄_B represents the effect of the
non-explicable coefficients. The two-part decomposition (4) can also be seen as a particular
case of a more general decomposition (in matrix form):
∆ = (X̄_A − X̄_B)′[Dβ_A + (I − D)β_B] + [(I − D)X̄_A + DX̄_B]′(β_A − β_B) (5)
where β is now the vector including the intercept together with the slope coefficients, X is the
set of co-variables completed by a first column of ones, and D is a diagonal weighting matrix.
The model is estimated by ordinary least squares (OLS): the parameters α and β are estimated
by their OLS estimators, conditioned on the two statistical populations (the data for
populations A and B).
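The twofold decomposition (3)-(4) reduces to simple arithmetic once the group means and OLS coefficients are known. The sketch below uses hypothetical one-regressor numbers, not the paper's estimates:

```python
def oaxaca_twofold(alpha_a, beta_a, xbar_a, alpha_b, beta_b, xbar_b):
    """Twofold Blinder-Oaxaca decomposition of the mean gap Ybar_A - Ybar_B,
    taking group A's coefficients as the non-discriminatory reference."""
    # delta_1: part explained by differences in characteristics.
    explained = sum(b * (xa - xb) for b, xa, xb in zip(beta_a, xbar_a, xbar_b))
    # delta_2: part attributed to differences in coefficients (unexplained).
    unexplained = (alpha_a - alpha_b) + sum(
        (ba - bb) * xb for ba, bb, xb in zip(beta_a, beta_b, xbar_b))
    return explained, unexplained

# Hypothetical one-regressor example (e.g. years of experience -> log wage).
explained, unexplained = oaxaca_twofold(
    alpha_a=1.0, beta_a=[0.05], xbar_a=[12.0],
    alpha_b=0.9, beta_b=[0.04], xbar_b=[10.0])
gap = (1.0 + 0.05 * 12.0) - (0.9 + 0.04 * 10.0)
print(round(explained + unexplained, 10) == round(gap, 10))  # True
```

By construction the two components sum to the total mean gap, which is the identity (3)-(4) expresses.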
2.2. Cluster and discriminant analyses
Cluster analysis is part of the methods and techniques for unsupervised pattern recognition.
These techniques try to satisfy the general criterion of classification [Ruxanda, 2009]:
variability within classes must be as small as possible and variability between classes must
be as high as possible. Partitioning algorithms provide superior results to hierarchical
methods because they run until a STOP condition is fulfilled; the most famous among them
is the K-Means algorithm. Discriminant analysis, on the other hand, is part of the methods
and techniques of supervised pattern recognition. With this method, new observations whose
class membership is unknown can be classified into classes using discriminant scores. There
are several types of classifiers, but the Fisher linear classifier is the most widely used. The
general relation for Fisher's linear classification function is:
D(x) = β_0 + β_1 x_1 + β_2 x_2 + ⋯ + β_n x_n (6)
where n is the number of variables (characteristics) in the model and β is the eigenvector of
the matrix Σ_w⁻¹ Σ_b, where Σ_b is the variability between classes and Σ_w is the variability
within classes.
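The K-Means procedure described above (assign points to the nearest centroid, update the centroids, stop when they are stable) can be sketched for a single variable as follows; the wage values are purely illustrative, not the paper's data:

```python
def kmeans_1d(data, centroids, max_iter=100):
    """Minimal one-dimensional K-Means: assign each point to its nearest
    centroid, update the centroids, and stop once they are stable."""
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for x in data:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:   # STOP condition: centroids stable
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated wage groups (illustrative values only).
wages = [20000.0, 21000.0, 22000.0, 80000.0, 82000.0, 85000.0]
centroids, clusters = kmeans_1d(wages, centroids=[20000.0, 85000.0])
print([round(c) for c in centroids])  # [21000, 82333]
```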
3. Results
3.1. Database used
The original database consists of 628 employees, a mix of confidential employee data from
companies working in the French energy field, all ages combined and from all departments.
For the cluster and discriminant analyses, two operations were performed on the original
dataset: eliminating observations with missing values and observations considered to be
outliers³, and standardizing the dataset, in order to classify observations into 4 classes and
estimate linear classification functions. After these operations, 528 employees remained,
representing 84% of the original database. The variables used in our study are: Birth Year
(DateNaissance), Sex (Sexe), Year of starting job in the company (DateEntreeEntreprise),
Nationality (Nationalite), Number of university years (NiveauEtude), Department
(BusinessUnit), Number of days off for holidays (NbJoursAutresAbsences2013), Number of
days off for sickness (NbJoursMaladie2013), Maternity days off (NbJoursCongeParental),
Place of work, Type of job contract
³ An outlier is a value outside the statistical interval [mean − 3·Stdev; mean + 3·Stdev], which
contains approximately 99.7% of observations for normally distributed data.
(Encadrement), Number of people managed in the company, Wage in 2011, 2012 and 2013
(SBR2011, SBR2012, SBR2013).
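The outlier rule from footnote 3 (drop observations outside mean ± 3 standard deviations) can be sketched as:

```python
import math

def three_sigma_filter(values):
    """Keep only the observations inside [mean - 3*std, mean + 3*std]."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    lo, hi = mean - 3 * std, mean + 3 * std
    return [v for v in values if lo <= v <= hi]

# 20 ordinary observations plus one extreme value (illustrative data only).
data = [30.0] * 20 + [500.0]
print(len(three_sigma_filter(data)))  # 20: the extreme value is dropped
```

In practice this step would be applied per variable before standardizing the dataset.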
3.2. Graphical analysis
a). Wage discrimination related to age and sex
Figure 1. Wage discrimination related to sex and age, source: SAS output
The graph shows clear wage discrimination related to age and sex. It is notable that as they
age, men are paid better, while women tend to earn the same wage. For women, age brings
small increases in salary, while for men, age and experience bring large increases in
wage. If we think of higher wages as associated with leadership positions, is it possible that
women do not reach leadership positions as they age? Is it possible that women's wages are
influenced by the fact that women become mothers and take maternity leave?
b). Wage discrimination related to business field and sex
Figure 2. Wage discrimination related to business field and sex, source: SAS output
The graph above shows that the salary level was low for women in the "Direction generale",
"Finance" and "Supply Chain" fields, while men's salary level was higher in the "Direction
generale" and "Finance" fields. Men earned more than women in the "Juridique / Fiscal"
area. These ideas prove once again that there is discrimination between women and men
working in different departments of the same company.
3.3. Cluster and discriminant analyses
Taking into account 7 standardized variables, the cluster analysis shows how observations are
grouped into 4 big clusters. The method used to classify all 528⁴ individuals is a partitioning
algorithm⁵: the K-Means algorithm.
⁴ The individuals remaining after the elimination of outliers and observations with missing values.
⁵ Of the two classification categories (hierarchical methods and partitioning algorithms), this algorithm
provides the best results, due to the fact that it "runs" until the class centroids are stable.
Class:               1    2    3    4   Total
F                    -   12  224   33    269
H                   25   47   48  139    259
Total               25   59  272  172    528

Class:               1    2    3    4   Total
Autre                -    -    -    5      5
Direction generale  20   10   55   95    180
Finance              5   26   20   12     63
Informatique         -    3    -   33     36
Juridique / Fiscal   -    -  147    -    147
Marketing            -    -    5   19     24
Supply Chain         -   20   45    8     73
Total               25   59  272  172    528
Figure 3. Classes structure, source: Excel computation
The figure above shows the classification results. There are 4 classes, each with a certain share of women and men, as follows. Class 1: there are 25 men and no women, 20 of them working in "Direction generale" and 5 in "Finance". They were born between 1954 and 1974, joined the company between 1982 and 2008, and have an average wage of 82,239€ in 2011, 87,482€ in 2012 and 92,072€ in 2013. This class may be named the "top managers" class. It is important to notice that there is no woman in the top managers class. Class 2: there are 47 men and 12 women; 26 persons work in "Finance" and 20 in "Supply Chain". They were born between 1953 and 1962, joined the company between 1971 and 1990, and have an average wage of 43,356€ in 2011, 44,525€ in 2012 and 45,918€ in 2013. This class may be named the "middle managers" class. Taking into account that only 20% of these individuals are women, we can observe gender discrimination when it comes to managerial positions. Class 3: there are 224 women and 48 men; 147 individuals work in "Juridique / Fiscal", 55 in "Direction generale" and 45 in "Supply Chain". They were born between 1961 and 1985, joined the company between 1988 and 2011, and have an average wage of 19,843€ in 2011, 20,943€ in 2012 and 22,696€ in 2013. This class may be named the "young and inexperienced workers" class. Class 4: there are 139 men and 33 women; 95 individuals work in "Direction generale" and 33 in "Informatique". They were born between 1956 and 1983, joined the company between 1989 and 2011, and have an average wage of 28,670€ in 2011, 32,189€ in 2012 and 33,123€ in 2013. This class may be named the "normal workers" class.
Looking carefully at all 4 classes above, we might say that the company "prefers" experienced managers (top and middle level), and that middle managers joined the company much earlier than top managers. From this point of view, what are the reasons for which the company changed top managers more often (including in 2008) than middle managers?
Figure 4. Linear discriminant functions, source: SAS output
The figure above presents the linear discriminant coefficients (Fisher linear classifier) for all 4 classes identified above. With these coefficients, it is possible to write the estimator functions, such as (the discriminant function for class 1):
functions, such as (the discriminant function for class 1): 𝐷1(∙) = −54.27 + 2.76 ∗ Sexe − 3.73 ∗ DateNaissance − 3.19 ∗ DateEntreeEnterprise − 5.50 ∗ SBR2011
+ 24.21 ∗ SBR2012 + 8.22 ∗ SBR2013 + 4.91 ∗ BusinessUnit
Using these functions, and taking into account the sex codification (1 = H, man, and 0 = F, woman) and the BusinessUnit codification (1 for Juridique/Fiscal, 2 for Supply Chain, 3 for Marketing, 4 for Finance, 5 for Direction generale, 6 for Informatique, 7 for Autre and 8 for RH), it is possible to calculate 4 discriminant scores (on standardized data). The highest score "gives" the class for a new individual whose class affiliation is unknown. Using cross-validation, the correct classification rate is pc = (25+59+272+154)/528 = 96.6%, and the general error of the model is 100% − 96.6% = 3.4%. According to these indicators, the model provides accurate results for identifying the class of a new employee ("new" meaning employed in 2011 at the latest, so that wage information for 2011, 2012 and 2013 is available).
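Classification by the highest discriminant score can be sketched as follows. Only the class-1 intercept and weights come from the paper's D1(·) function above; the coefficients for classes 2-4 are hypothetical placeholders added purely for illustration.

```python
# Feature order follows the D1(.) function above (standardized data).
FEATURES = ["Sexe", "DateNaissance", "DateEntreeEnterprise",
            "SBR2011", "SBR2012", "SBR2013", "BusinessUnit"]

# (intercept, weights): class 1 from the paper; classes 2-4 are hypothetical.
COEFS = {
    1: (-54.27, [2.76, -3.73, -3.19, -5.50, 24.21, 8.22, 4.91]),
    2: (-20.00, [1.00, -1.00, -1.00, 2.00, 3.00, 1.00, 0.50]),
    3: (-5.00,  [0.50, 1.00, 1.00, -1.00, -1.00, -0.50, 0.20]),
    4: (-8.00,  [0.80, 0.50, 0.50, 0.00, 0.50, 0.30, 0.40]),
}

def classify(x):
    """Assign a new individual to the class with the highest
    linear discriminant score D_c(x) = b0 + w . x."""
    scores = {c: b0 + sum(w_i * x_i for w_i, x_i in zip(w, x))
              for c, (b0, w) in COEFS.items()}
    return max(scores, key=scores.get)
```

For example, an individual with a large standardized SBR2012 value is pulled strongly toward class 1, whose SBR2012 weight (24.21) dominates the other terms.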
3.4. Blinder-Oaxaca model results
After running the algorithm in R, the obtained results are explained in this article from a graphical point of view. In the first phase we proceed with a global data analysis, in order to get an overall picture of the employees' distribution in the company.
Figure 5. Wage distribution men-women by hiring year
Identifying the main discriminative variables in the graph below, we find that the main discriminative variables in our company are the following:
Figure 6. Impact of discriminative variables in the company
We can observe that the hierarchical level and the type of job contract are the main discriminative variables in our case, with an impact of 6.9 thousand euros per year.
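The paper performs the decomposition with the R `oaxaca` package [13]. Purely as an illustrative sketch of the underlying computation (not the package's implementation), a twofold Blinder-Oaxaca decomposition of the mean wage gap can be written as:

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares (X must include an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def oaxaca_twofold(X_a, y_a, X_b, y_b):
    """Twofold decomposition of the mean outcome gap between groups A and B,
    using group A's coefficients as the reference:
    gap = (explained by endowments) + (unexplained, 'discriminative') part."""
    b_a, b_b = ols(X_a, y_a), ols(X_b, y_b)
    xbar_a, xbar_b = X_a.mean(axis=0), X_b.mean(axis=0)
    gap = y_a.mean() - y_b.mean()
    explained = (xbar_a - xbar_b) @ b_a   # endowment (characteristics) differences
    unexplained = xbar_b @ (b_a - b_b)    # coefficient (returns) differences
    return gap, explained, unexplained
```

With an intercept column in X, the identity gap = explained + unexplained holds exactly for OLS fits, which is what makes the decomposition well defined.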
4. Conclusions and further research
Finally, using a private database with a large number of employees, we concluded that there is wage discrimination related both to the employees' business field and to their gender. With cluster and discriminant analyses we identified four main groups of employees, found strong gender discrimination at the management level, and estimated discriminant functions by which new employees can be assigned to one of the identified groups. Using the Oaxaca-Blinder method in R Studio, we identified (using a set of 8 variables) that the main gender gap amount is represented by the starting-year-of-job variable, by decomposing it into two main parts, non-discriminative and discriminative. As further research, we plan to study the phenomenon of discrimination with other methods, tracking it over time, identifying the causes that lead to wage discrimination and proposing new solutions to reduce this widespread phenomenon.
References
[1]. R. P. Althauser and M. Wigler, "Standardization and Component Analysis". Sociological
Methods and Research, Vol. 1, No. 1, pp. 97–135, 1972.
[2]. J.G. Altonji and R. M. Blank., "Race and Gender in the Labor Market", In Handbook of
Labor Economics, Vol. 3C, eds. Orley Ashenfelter and David Card, pp. 3143–3259.
Amsterdam: Elsevier, 1999
[3]. O. Ashenfelter and R. Oaxaca, "The Economics of Discrimination: Economists Enter the
Courtroom", American Economic Review 77 (2), pp. 321–325, 1987
[4]. G. S. Becker, “The Economics of Discrimination”. 2nd ed., Chicago: University of
Chicago Press, 1971
[5]. M. Berends, S. R. Lucas and R. V. Penaloza, “How Changes in Families and Schools Are
Related to Trends in Black-White Test Scores.” Sociology of Education, Vol. 81, pp.
313-344, 2008
[6]. A. S., Blinder, "Wage Discrimination: Reduced Form and Structural Estimates," Journal
of Human Resources, Vol. 8, pp. 436-455, 1973
[7]. T. DeLeire, “Changes in Wage Discrimination against People with Disabilities: 1984-
93.” Journal of Human Resources, Vol. 36, pp. 144–158, 2001
[8]. F. Nil-Amoo Dodoo, “Earnings differences among Blacks in America.” Social Science
Research, Vol. 20, pp. 93-108, 1991
[9]. O.D. Duncan, “Inheritance of Poverty or Inheritance of Race. In On Understanding
Poverty: Perspectives from the Social Sciences”, ed. Daniel P. Moynihan, 1969, pp. 85–
110. New York: Basic Books
[10]. G. Farkas, and V. Keven, “Appropriate Tests of Racial Wage Discrimination Require
Controls for Cognitive Skill: Comment on Cancio, Evans and Maume.” American
Sociological Review, Vol. 61, No. 4, pp. 557-560, 1996
[11]. J.K. Galbraith and L. Jiaqing, Inequality and Industrial Change: a Global View, Chapter
16, James K. Galbraith and Maureen Berner, eds., Cambridge University Press, 2001.
[12]. J. Gardeazabal and A. Ugidos, "More on Identification in Detailed Wage
Decompositions", Review of Economics and Statistics, Vol. 86, No. 4, pp.1034-1036,
2002
[13]. M. Hlavac, "oaxaca: Blinder-Oaxaca Decomposition in R", 2014, R package version
0.1. Available at: http://CRAN.R-project.org/package=oaxaca
[14]. B. Jann, "The Blinder-Oaxaca decomposition for linear regression models", The Stata
Journal, Vol. 8, No. 4, pp. 453-479, 2008
[15]. C. Kim, "Detailed Wage Decompositions: Revisiting the Identification Problem",
Sociological Methodology, Vol. 43, pp. 346-363, 2012
[16]. R. L. Oaxaca, "Male-Female Wage Differentials in Urban Labor Markets," International
Economic Review, Vol. 14, pp. 693-709, 1973
[17]. G. Ruxanda, Analiza multidimensională a datelor, Academia de Studii Economice,
Școala Doctorală, București, 2009
[18]. L. C. Sayer, “Are Parents Investing Less in Children? Trends in Mothers and Fathers
Time with Children.” American Journal of Sociology, Vol. 110, pp. 1-43, 2004
[19]. E. Stearns, S. Moller, J. Blau and S. Potochnick, “Staying Back and Dropping Out: The
Relationship Between Grade Retention and School Dropout.” Sociology of Education
Vol. 80, pp. 210-240, 2007
[20]. M. S. Yun, "Decomposing Differences in the First Moment", Economics Letters, Vol.
82 (2), pp. 273–278, 2004
[21]. M. S. Yun, "Earnings Inequality in USA, 1969–1999: Comparing Inequality Using
Earnings Equations", Review of Income and Wealth, Vol. 52 (1), pp. 127–144, 2006
[22]. M. S. Yun, "Revisiting Inter-Industry Wage Differentials and the Gender Wage Gap:
An Identification Problem", IZA Discussion Paper No. 2427, 2006, Available at SSRN:
http://ssrn.com/abstract=947083
THE CONSUMPTION CHANNEL OF NON-KEYNESIAN EFFECTS.
SOME EMPIRICAL EVIDENCES FOR ROMANIA
Ana ANDREI
Bucharest Academy of Economic Studies
Angela GALUPA
Bucharest Academy of Economic Studies
Sorina GRAMATOVICI
Bucharest Academy of Economic Studies
Abstract. The results of economic research in recent decades highlight the possibility of expansionary effects of fiscal consolidations. We analyse the consumption channel of non-Keynesian effects and use the models proposed by Afonso (2001), Giavazzi and Pagano (1996), and Rzońca and Ciżkowicz (2005) for empirical verification of these effects on Romanian economy data. In order to apply the Rzońca and Ciżkowicz model, we also computed the fiscal impulse using both the primary deficit and the Blanchard measure.
Key words: consumption channel, fiscal impulse, fiscal consolidation, Keynesian and non-
Keynesian effects, investments channel
JEL classification: E12, E21, E62, C54
1. Introduction
Recent empirical studies in the area of public finance have focused on exploring the short-run expansionary influence of fiscal consolidation policies, mainly on private consumption, investment and output. These effects have been called by researchers "non-Keynesian effects of fiscal policy".
The economic literature in this area contains works of notable names. Among the most
relevant works one can refer to Giavazzi and Pagano [9]. They used the Denmark’s economy
data for the period 1983-1986 and Ireland’s economy data for the period 1987-1989 in order
to point out the evidence of expansionary fiscal contractions. They also analysed if the non-
Keynesian effects are sufficiently large and persistent.
Papers published later by Giavazzi et al. [10], [11] used both consumption and savings
channels to prove these effects.
An interesting result belongs to Bertola and Drazen [4]. They proved that if the ratio of public consumption to GDP is critically high, then a small increase of the numerator can induce large changes in private consumption in the opposite direction. Blanchard [5] and Sutherland [15] proved that reducing taxes can lower permanent income and consumption, as a consequence of the unsustainability of the current policy or of the consumers' high debt-to-income ratio. Canale et al. [8] conclude that the reaction of the central bank can be very important in determining the overall result of a fiscal consolidation. Consequently, a synchronous and opposite monetary policy intervention could generate non-Keynesian effects. It is therefore necessary to correlate the fiscal plans with the central bank's projections.
Ardagna [3] pointed out that alternative monetary policies have relatively little effect on the size of short-run fiscal multipliers and therefore could not influence an expansionary fiscal contraction.
The contributions of Alesina and Perotti [2] and Blanchard and Perotti [6] to the non-Keynesian literature suggest that the indirect effects of the reduction or expansion of public spending on permanent income could be considered a kind of crowding-out effect of expansionary fiscal policies. Perotti [13] finds that, in European economies with high debt-to-GDP levels or rates of debt accumulation, a negative shock to government purchases could stimulate consumption and output.
Campbell and Mankiw [7] pointed, as a cause of non-Keynesian effects, to the liquidity constraints of households that consume only their disposable income, together with the perception of fiscal policies as extensive and persistent.
Recent works find evidence of non-Keynesian effects in Greece, as one can see, for example, in Tagkalakis [17] and Szabó [16]. They concluded that non-Keynesian effects are stronger in the case of public spending cuts and recommend expenditure-based consolidation. Bhattacharya and Mukherjee [12] completed an empirical study for OECD countries, finding that private and government consumption are complements in the household utility function, so non-Keynesian effects can occur.
The aim of our paper is to examine the consumption channel of non-Keynesian effects of discretionary fiscal policies using statistical data for Romania. In order to complete our study, we extend some of the results of Afonso [1], Giavazzi and Pagano [9] and Rzońca and Ciżkowicz [14] regarding consumption models.
2. Consumption Channel of Non-Keynesian Effects
In order to assess the existence of non-Keynesian effects in the EU-15, Afonso [1] used the following consumption function:

c_t = a_0 + a_1 y_t + (β_1 + β_2 d_t) Δg_t + (γ_1 + γ_2 d_t) Δτ_t     (1)

where c_t is the annual growth rate of real private consumption, y_t is the growth rate of real GDP, τ_t and g_t are, respectively, real taxes and real government expenditures as a percentage of GDP, and d_t is a dummy variable, with d_t = 0 for the periods that are not marked by significant fiscal adjustments and d_t = 1 in the opposite case.
The theoretical Keynesian effects are Δg_t > 0 ⇒ c_t increasing and Δτ_t > 0 ⇒ c_t decreasing, so that β_1 > 0, γ_1 < 0. In the case of fiscal consolidation, the effects above can be reversed, so that β_2 < 0, γ_2 > 0.
We extended the Giavazzi-Pagano [9] consumption model, applied to OECD countries (1972-1996), in order to capture the influence of the EU economic space. The resulting consumption function, used to estimate the effects of fiscal policies, is defined as follows:

ΔC_t = μ_0 + μ_1 ΔC_{t−1} + μ_2 ΔY_t + μ_3 ΔY_{t−1} + μ_4 ΔY_t^{EU28}
       + (1 − d_t)(α_1 ΔT_t + α_2 ΔT_{t−1} + α_3 ΔTR_t + α_4 ΔTR_{t−1} + α_5 ΔG_t + α_6 ΔG_{t−1})
       + d_t(β_1 ΔT_t + β_2 ΔT_{t−1} + β_3 ΔTR_t + β_4 ΔTR_{t−1} + β_5 ΔG_t + β_6 ΔG_{t−1})     (2)

where C_t is real consumption, Y_t is real GDP, Y_t^{EU28} is the income of the EU-28 countries, which influences the business cycle for all members, T_t are real taxes, TR_t are real transfers, G_t is real government spending, and d_t is a dummy variable marking significant fiscal events such as cuts or expansions.
The interest of the authors is focused on the circumstances in which a fiscal consolidation can have the effect of increasing private consumption, relative to the fall of private consumption expected in the absence of the fiscal policy. Usually, these effects occur when the debt-to-GDP ratio grows fast and when the fiscal correction is large and persistent.
Non-Keynesian effects are expected if β_1 > 0, β_3 < 0, β_5 < 0. The pure Keynesian effects take place in "normal" periods (periods that are not affected by strong and persistent spending cuts and tax increases) and correspond to α_1 < 0, α_3 > 0, α_5 > 0.
The last model we consider is a version of the consumption function proposed by [14] to analyse the consumption channel of non-Keynesian effects for new EU members:

c_t = α c_{t−1} + Σ_{i=1}^{2} β_i y_{t+1−i} + Σ_{k=1}^{2} γ_k fi_{t+1−k} + Σ_{m=1}^{2} δ_m r_{t+1−m}     (3)

where fi_t represents the fiscal impulse resulting from changes in the primary deficit, r_t is the real deposit interest rate, c_t is the private consumption growth rate and y_t is the real GDP growth rate.
If the non-Keynesian effects are transmitted through the consumption channel then, for at least one k, k = 1 or 2, we have γ_k < 0.
In our empirical study, we consider both the Blanchard measure and the primary deficit measure of the fiscal impulse.
The primary balance measure is:

FI_t = (g_t − t_t) − (g_{t−1} − t_{t−1})     (4)

and the BFI measure is:

FI_t^B = (g_t(u_{t−1}) − t_t) − (g_{t−1} − t_{t−1})     (5)

where g_t represents total current expenditures plus gross capital accumulation less interest payments as a share of GDP, t_t represents total revenues as a share of GDP, and u_{t−1} is the unemployment rate of the previous year. We assume that g_t(u_{t−1}) can be approximated by the linear regression function:

g_t(u_{t−1}) = a_0 + a_1 u_{t−1} + ε_t     (6)

where ε_t is the error term.
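The two fiscal impulse measures translate directly into code. A minimal sketch, assuming g, t and u are yearly series of GDP shares (the function names are ours):

```python
def primary_fi(g, t):
    """Eq. (4): FI_t = (g_t - t_t) - (g_{t-1} - t_{t-1}),
    the year-on-year change in the primary deficit."""
    return [(g[i] - t[i]) - (g[i - 1] - t[i - 1]) for i in range(1, len(g))]

def blanchard_fi(g, t, u, a0, a1):
    """Eq. (5): Blanchard measure, replacing g_t by its value predicted at
    last year's unemployment rate, g_t(u_{t-1}) = a0 + a1 * u_{t-1} (eq. (6))."""
    return [((a0 + a1 * u[i - 1]) - t[i]) - (g[i - 1] - t[i - 1])
            for i in range(1, len(g))]
```

Both measures return one value per year starting with the second observation, since each requires the previous year's balance.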
3. Empirical verifications for Romania
Our empirical work is focused on verifying the effects of fiscal policies based on three models, extended as above: Afonso [1], Giavazzi and Pagano [9] and Rzońca and Ciżkowicz [14].
In order to apply Afonso's model to the statistical data, we used the INSSE database for final consumption and nominal GDP, and EUROSTAT databases for government spending and tax rates.
The estimated Afonso model for Romania is:
𝑐𝑡 = 0.67435𝑦𝑡 + 0.4602∆𝑔𝑡 − 0.264679∆𝜏𝑡 − 0.096327𝑑𝑡∆𝑔𝑡 + 0.19261𝑑𝑡∆𝜏𝑡 (7)
The regression without intercept has satisfactory estimation indicators.
It can be seen that the coefficient of y_t, which can be thought of as the marginal propensity to consume, shows that much of the increase in the consumption rate is due to the income growth rate. It can also be seen that the effects of government spending and tax rates are of Keynesian type. During periods of fiscal consolidation, marked by significant adjustments of government spending and taxes, the signs of the coefficients mark non-Keynesian effects: β_2 = −0.096327 < 0 and γ_2 = 0.19261 > 0.
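The estimated equation (7) can be evaluated directly to see how the fiscal-adjustment dummy weakens the marginal effects; a small sketch (the function name is ours):

```python
def dc_afonso(y, dg, dtau, d):
    """Estimated consumption equation (7) for Romania: consumption growth
    as a function of GDP growth y, changes in the spending and tax rates
    (dg, dtau), and the fiscal-adjustment dummy d (0 or 1)."""
    return (0.67435 * y + 0.4602 * dg - 0.264679 * dtau
            - 0.096327 * d * dg + 0.19261 * d * dtau)

# marginal effect of a unit spending increase, outside vs inside consolidations
normal = dc_afonso(0.0, 1.0, 0.0, 0)         # 0.4602
consolidation = dc_afonso(0.0, 1.0, 0.0, 1)  # 0.4602 - 0.096327
```

The comparison shows the non-Keynesian correction at work: the same spending increase raises consumption growth by less during a consolidation period.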
However, the Keynesian effects are dominant, so that the overall result is not significantly influenced by the periods of strong fiscal events. The observed data and the computed data of Afonso's consumption model applied to Romania can be seen in the figure below.
Figure 1. The consumption growth rate: computed Afonso consumption function data vs observed data
In order to apply the Rzońca-Ciżkowicz model to the statistical data, we first computed the fiscal impulse, using both the primary deficit and Blanchard's fiscal impulse.
Before computing the BFI, we estimated the government spending rate as a function of the unemployment rate (International Labour Bureau and INSSE data), using the following linear dependence:

g_t = 0.021834 + 0.144825 u_{t−1} + ε_t     (8)

One can see that much of the government spending rate is explained by the unemployment rate, which reflects an important dependence of the government spending rate on the previous year's unemployment rate.
Figure 2. Fiscal impulse for Romania using Blanchard’s measure, Figure 3. Fiscal impulse for Romania using
primary deficit measure
Using the real interest rate data, computed based on interest rate on deposits and inflation rate
(National Bank of Romania data), the computed fiscal impulse data using both primary
deficit and Blanchard’s fiscal impulse and final consumption and real GDP growth rates, we
estimated Rzońca, Ciżkowicz’s consumption functions.
Figure 4. Rzońca, Ciżkowicz's consumption rate with the primary deficit measure of fiscal impulse, computed and observed
The Rzońca-Ciżkowicz consumption function with the primary deficit fiscal impulse is:

c_t = 0.0726 c_{t−1} + 0.49446 y_t + 0.054 y_{t−1} − 0.165129 fi_t − 0.461329 fi_{t−1} + 2.36002 r_t + 2.630826 r_{t−1}
Figure 5. Rzońca, Ciżkowicz's consumption rate using the BFI measure
The Rzońca-Ciżkowicz consumption function with the BFI measure is:

c_t = 0.27898 c_{t−1} + 0.241026 y_t + 0.02017 y_{t−1} − 0.43226 fi_t − 0.299856 fi_{t−1} + 2.62908 r_t + 3.078844 r_{t−1}
Analyzing the results of the estimates in the two versions, we observe a certain consistency between the two estimations. Both show a positive dependence on the previous consumption rate, the current real income rate and the previous real income rate, which are expected results reflecting normal consumer behaviour. The signs of the fi_t coefficients are negative in both equations, reflecting a non-Keynesian effect of different intensity, more intense in the second case, meaning a more significant impact on the consumption growth rate.
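The dynamics implied by the structure of eq. (3) can be simulated recursively. The coefficients in the example below are illustrative only, not the estimates reported above:

```python
def simulate_consumption(y, fi, r, coef, c0=0.0):
    """Recursive path of the consumption growth rate with the structure
    of eq. (3): c_t depends on c_{t-1}, current and lagged GDP growth,
    fiscal impulse and real interest rate.
    coef = (alpha, beta1, beta2, gamma1, gamma2, delta1, delta2)."""
    a, b1, b2, g1, g2, d1, d2 = coef
    c = [c0]
    for t in range(1, len(y)):
        c.append(a * c[t - 1] + b1 * y[t] + b2 * y[t - 1]
                 + g1 * fi[t] + g2 * fi[t - 1] + d1 * r[t] + d2 * r[t - 1])
    return c

# illustrative run: steady GDP growth, no fiscal impulse, no interest effect
path = simulate_consumption([0.02] * 4, [0.0] * 4, [0.0] * 4,
                            coef=(0.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0))
```

With a negative gamma coefficient, feeding a positive fiscal impulse into the same recursion lowers the simulated consumption path, which is precisely the non-Keynesian channel discussed above.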
The last model applied in this paper is the Giavazzi-Pagano model adapted for EU countries. The input data for this model have the same sources as before; for the EU-28 real GDP we used EUROSTAT data.
The estimated function is the following:
ΔC_t = 669.432 + 0.748623 ΔC_{t−1} + 1.13721 ΔY_t + 2.315365 ΔY_{t−1} + 0.00057 ΔY_t^{EU28}
       + (1 − d_t)(−1.16487 ΔT_t − 2.15459 ΔT_{t−1} + 0.05244 ΔTR_t + 1.7336 ΔTR_{t−1} + 3.185654 ΔG_t + 0.487164 ΔG_{t−1})
       + d_t(0.290211 ΔT_t − 0.40413 ΔT_{t−1} + 8.5636 ΔTR_t + 5.213 ΔTR_{t−1} − 0.709507 ΔG_t − 0.783898 ΔG_{t−1})
Figure 6. Final consumption absolute growth -computed and observed (extended Giavazzi, Pagano model
applied for Romania)
From the estimated function, we see net Keynesian effects both in the first five coefficients and in the coefficients of the terms multiplied by (1 − d_t) (terms reflecting the periods with moderate discretionary fiscal policies). The terms multiplied by d_t, which correspond to significant fiscal events, reflect non-Keynesian effects of taxes and government spending, with a relatively strong impact on real final consumption.
Analyzing the three empirical applications on Romanian economy data, we can conclude that non-Keynesian effects exist in periods of strongly contractionary discretionary fiscal policies. We also conclude that the overall effects of fiscal policy are of Keynesian type, as a consequence of the fact that the periods of strong consolidation in Romania were not very long.
References
[1] Afonso Antonio, “Non-Keynesian Effects of Fiscal Policy in the EU-15”, Department of
Economics, Instituto Superior de Economia e Gestão, Universidade Técnica de Lisboa,
research project for the author’s Ph.D. thesis, 2001.
[2] Alesina Alberto and Perotti Roberto, “Fiscal Adjustments in OECD Countries:
Composition and Macroeconomic Effects,” IMF Staff Papers, 1997.
[3] Ardagna Silvia, “Fiscal Stabilizations: When Do They Work and Why”, European
Economic Review, vol. 48(5), 2004.
[4] Bertola Giuseppe, and Drazen Alan, “Trigger Points and Budget Cuts: Explaining the
Effects of Fiscal Austerity”, American Economic Review 83, 1993.
[5] Blanchard Olivier, “Suggestions for a New Set of Fiscal Indicators”, OECD Working
Paper No.79, 1993.
[6] Blanchard Olivier and Perotti Roberto, “An empirical characterization of the dynamic
effects of changes in government spending and taxes on output”, NBER Working Paper
no. 7269, and Quarterly Journal of Economics, vol. 115, 1999 and 2002.
[7] Campbell John Y. and Mankiw Gregory N., “Permanent Income, Current Income and
Consumption”, Journal of Business & Economic Statistics, 1990.
[8] Canale Rosaria Rita, Foresti Pasquale, Marani Ugo and Napolitano Oreste, “On
Keynesian Effects of (Apparent) Non-Keynesian Fiscal Policies”, Facoltà di Economia,
Università di Napoli “Federico II”, MPRA, 2007.
[9] Giavazzi Francesco, and Pagano Marco, “Non-Keynesian Effects of Fiscal Policy
Changes: International Evidence and the Swedish Experience,” Swedish Economic Policy
Review, 1996.
[10] Giavazzi Francesco, Jappelli Tullio, and Pagano Marco, “Searching for non-Keynesian
effects of fiscal policy changes”, CSEF Centre for Studies in Economics and Finance, WP
n.16, 1999.
[11] Giavazzi Francesco, Jappelli Tullio, Pagano Marco and Benedetti Marina, “Searching
for non-monotonic effects of fiscal policy: new evidence”, NBER working paper 11593,
2005.
[12] Mukherjee Sanchita and Bhattacharya Rina, “Private Sector Consumption and
Government Debt in Advanced Economies: An Empirical Study”, IMF Working Paper,
2013.
[13] Perotti Roberto, “Fiscal Policy in Good Times and Bad”, Quarterly Journal of
Economics, vol. 114, 1999.
[14] Rzońca Andrezej and Ciżkowicz Piotr, “Non-Keynesian Effects of Fiscal Contraction in
New Member States”, WP 519, 2005.
[15] Sutherland Alan, “Fiscal Crises and Aggregate Demand: Can High Public Debt Reverse
the Effects of Fiscal Policy?”, Journal of Public Economics 65, 1997.
[16] Szabó Zsolt. (2013). The Effect of Sovereign Debt on Economic Growth and Economic
Development. [Online] Available: http://www.asz.hu/public-finance-quarterly-
articles/2013/the-effect-of-sovereign-debt-on-economic-growth-and-economic-
development-1/a-szabo-2013-3.pdf.
[17] Tagkalakis Athanasios O. (2013). Discretionary fiscal policy and economic activity of
Greece, WP 166. [Online] Available:
http://www.bankofgreece.gr/BogEkdoseis/Paper2013169.pdf.
FEEDBACK ANALYSIS AND PARAMETRIC CONTROL ON
PROCESS OF DISPOSABLE INCOME ALLOCATION – A DYNAMIC
MODEL ON PORTUGAL’S NATIONAL ECONOMY
Bianca Ioana POPESCU
The Bucharest University of Economic Studies
Emil SCARLAT
The Bucharest University of Economic Studies
Nora CHIRIȚĂ
The Bucharest University of Economic Studies
Abstract. The present paper emerged from the attempt to build a dynamic model capable of
parametric control, associated with the adjustment processes of national economies – the
case of disposable income allocation – having its basis on the fact that the economy is a
complex adaptive system which is able to regulate itself by feedback processes. The model of
the mechanism for formation and allocation of disposable income aims to describe the
process by which income is formed and is allocated throughout the economy to the main
economic agents in order to resume production processes. The data used for initializing the
state variables and some of the resulting variables of other regulating feedback processes of
the national economy have been introduced as given data from the official stats of Portugal
between 2005 and 2012, years before the economic crisis and the years of economic
recovery. The results of the simulations obtained by modifying parameters have revealed that the
effects of excess or inadequate inventory have a stronger impact on market prices and
disposable income, as they are part of a reinforcing feedback loop that is stronger within the
mechanism than the balancing feedback loop.
Keywords: disposable income, dynamic model, feedback processes, parametric control,
simulation.
JEL classification: E17, C32
1. Introduction
It is fair to say from the beginning of this paper that the novelty of our current attempt lies in the approach of the national economy as a cybernetic system. From this point of view we leave behind the classical approach to the structure of a national economy and take into consideration the structure of the feedback processes performing at national level between the redefined subsystems of the national economy, in order to assure the auto-regulation of the system. [1] We will revise and give a brief description of the redesigned structure of the cybernetic system of the national economy, we will emphasize the functionalities of the four fundamental feedback processes that form between the subsystems of the cybernetic system of the real economy, and we are going to test and assess the model associated with the formation and allocation of disposable income.
Classical economics presents relatively static models when dealing with the interaction between the aggregated accounts of the national economy (i.e. price, demand, supply, income, consumption), and these models have been unable to simplify the explanation of the dynamics involved. Additionally, the effects of excess or inadequate inventory are often not discussed. In reality, the market price is affected by the inventory of goods held by the production system rather than by the rate at which the production system supplies goods and services. [2]
2. The Redesign of the National Economy
The cybernetics of the economy is to be understood along three main themes: the theory behind economic systems and thus models, the theory of economic information, and the theory of controlling economic systems. In order to lay the basis of the system of a national economy, we have to define what our system in focus is like, what its components are, and the way they interact and function.
This is why we will deal with the cybernetics of the national economy system in relation to the principal processes that the economic system incorporates, such as the processes that determine the material flows (production processes), the exchange of goods, services and labor, the price formation of goods and services, and the distribution of income formed by trading these products on the associated markets. These processes are the ones that determine what we understand by the real economy, while the processes that determine financial flows, such as the formation of money supply and demand, the formation of investments and currency exchange, determine what we call the monetary economy.
2.1 The Redesign of the Structure of the National Economy
The case for similarities between economics and cybernetics has already been made, but our purpose is to build those cybernetic models that can capture the relations that are important for our understanding of specific dynamical problems and for their simulation. The first step in such an attempt is to define the subsystems of the national economy in relation to the regulating processes that we aim to model and simulate.
The study and analysis of the cybernetic system of the national economy has revealed four fundamental feedback processes at the macroeconomic level, forming between the subsystems; Figure 1 presents the flows by which they interact. The first is the Feedback Process of Equilibrium Adjustment on the Market for Goods and Services (formed between subsystems S1, S2, S4, S5).
The Feedback Process of Scarce Resources Allocation (S2, S4, S3) - a mechanism of allocation of scarce resources which describes the process by which the economy allocates labor, energy and capital to the production system;
The Feedback Process of Disposable Income Allocation (S1, S2, S4, S5) - which describes the process by which the realized income in the economy is allocated to the principal economic agents for the resumption of economic processes;
The Feedback Process of Assuring Profitability (S1, S2, S4) - it describes the way in which the cybernetic system of the economy assures the allocation of resources to those economic agents which make use of them with the highest profitability, sustaining in this way the economic process. In the economy, this process is assured by bankruptcy. At the individual level bankruptcy is a failure, but at the macroeconomic level it is the economy's way of keeping itself clean of underperforming agents.
2.2. The Subsystem of the Formation and Distribution of Disposable Income
The subsystem S5, of the formation and distribution of disposable income, is the one that makes the link between the monetary economy and the real economy by distributing financial resources. We do not pay special attention to the way in which total income is formed, because in the modeling of disposable income formation we have used the added-value method; as we argued before, the importance of this particular subsystem is given by the process of income distribution.
In order to describe the process of disposable income distribution, we first structure the total national income, Y, into income formed by wages and income formed by property ownership, Q. We consider that the income formed by wages is Y − Q.
From the size of the income derived from wages and property, we can determine the disposable income derived from wages, Y_w, and, respectively, the disposable income derived from property, Y_q. [3] Thus, the disposable income derived from wages is given, at time t, by the relation:
$Y_w(t) = W(t)L(t) - t_w^* W(t)L(t) + H_w^*(t)$ (1)
where $t_w^*$ represents the rate of taxes on wage income and $H_w^*$ is the level of transfers from the government to employees.
Similarly, the disposable income derived from property ownership is given by the relation:
$Y_q(t) = Q(t) - t_q^* Q(t) + H_q^*(t)$ (2)
where $t_q^*$ is the rate of taxes on property and $H_q^*$ represents the size of transfers from the government to property owners (subsidies).
The disposable income derived from wages, $Y_w$, and the disposable income derived from property, $Y_q$, determine the level of total disposable income, $Y_d$, which in turn determines the level of consumption, $C$, and the level of saving, $Z$. Consumption and saving are inversely related: an increase in one generates a decrease in the other.
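Relations (1) and (2) can be sketched in a few lines of Python; the numeric values below are purely illustrative and are not taken from the Portuguese data used later in the paper:

```python
# Sketch of relations (1)-(2): disposable income from wages and from property.
# W: wage rate, L: employment, tw/tq: tax rates, Hw/Hq: government transfers,
# Q: property income. All numbers below are hypothetical.

def disposable_wage_income(W, L, tw, Hw):
    """Y_w(t) = W*L - tw*W*L + Hw  (relation 1)."""
    return W * L * (1 - tw) + Hw

def disposable_property_income(Q, tq, Hq):
    """Y_q(t) = Q - tq*Q + Hq  (relation 2)."""
    return Q * (1 - tq) + Hq

Yw = disposable_wage_income(W=20.0, L=4.5, tw=0.16, Hw=10.0)   # 85.6
Yq = disposable_property_income(Q=40.0, tq=0.10, Hq=5.0)       # 41.0
Yd = Yw + Yq   # total disposable income, split between C and Z
```

The split of `Yd` between consumption and saving is determined further on by the consumption function, equation (7).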
3. A Dynamic Model for the Feedback Process of the Formation and Allocation of
Disposable Income
The model was built using the STELLA Modeling and Simulation Software 9.0, and the data used for initializing the state variables and the input variables were taken from the official statistics of Portugal as reported by EUROSTAT between 2005 and 2012. For variables measured in volumes, such as total output, intermediate consumption or government consumption, we have used chain-linked volumes (reference year 2005), in order to feed the model with data from which price effects have been removed and to let the simulation itself capture price effects and delays.
3.1. Model Hypothesis and Functioning
The Process of Disposable Income Formation and Allocation permits the resumption of production cycles at the level of the national economy in such a way that general profitability is increased, under the condition of assuring equilibrium on the market for goods and services and on the inputs market.
The process comprises, besides subsystem S3, all the other subsystems of the real-economy cybernetic system. The general structure of the feedback process is formed by two main feedback loops: one corresponds to the process of disposable income formation, while the other corresponds to the process of disposable income allocation throughout the economic system. The analytic structure of the feedback mechanism associated with the process of disposable income allocation is represented in Figure 1.
Figure 1 - The Functioning Diagram of the Model [5], p. 411
Depending on aggregate demand we calculate the excess of wanted inventory over real inventory, $(\bar{J} - J)$, which influences the price level. The real inventory of intermediate products, $X$, forms the capital stock, $K$, existing in the economy in the current period. The size of the capital stock determines the future disposable income, $Y_d$, being able to raise or lower the level of disposable income in direct relation to how the prices of capital goods rise or fall.
The transmission effect associated with the two feedback loops is:
$Y_d\uparrow \Rightarrow C\uparrow \Rightarrow D\uparrow \Rightarrow \{\bar{J}\uparrow \Rightarrow (\bar{J}-J)\uparrow \Rightarrow P\uparrow \Rightarrow Y_d\uparrow;\; J\downarrow \Rightarrow K\downarrow \Rightarrow Y_d\downarrow\}$
It can thus be observed that the feedback loop associated with the formation of disposable income is positive, hence reinforcing, while the feedback loop associated with the distribution of disposable income is negative. The two feedback loops acting together can cause oscillations marked by falls in disposable income.
3.2 Model Implementation and Dynamic Equations
The elements of system dynamics diagrams are feedback loops, accumulations of flows into state variables, and delays. [4] As shown in Figure 2, the state variables of the model are the aggregate demand, D, Real Stock, Price, Capital Stock, Disposable Income and C, final consumption. The state variables describe the system's state at each point in time and give a general overview of its behavior over time. [5]
The dynamic equations, as built from the stock-flow diagram in STELLA 9.0, are as follows:
D(t) = D(t − dt) + Aggregate Demand Formation * dt (3)
Aggregate Demand Formation = C + Public Consumption + Intermediate Consumption (4)
Disposable Income(t) = Disposable Income(t − dt) + (Disposable Income Formation − Consumption Formation) * dt (5)
Disposable Income Formation = Net Taxes + Price * DELAY(Net Output, 1) + Capital Stock * (Price − DELAY(Price, 1)) (6)
Consumption Formation = C0 + cc * Delay VD + f * Price (7)
Delay VD = DELAY(Disposable Income, 2) (8)
Price(t) = Price(t − dt) + (Price Increase − Price Decrease) * dt (9)
Price Increase = Price Modification (10)
Price Decrease = IF Excess of Wanted Stock < 0 THEN (−Price Modification) ELSE 0 (11)
Real Stock(t) = Real Stock(t − dt) + Stock Formation * dt (12)
Stock Formation = −(DELAY(Output, 2) − DELAY(D, 1)) (13)
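The dynamic equations above can be reproduced outside STELLA with a simple Euler integration. The sketch below is an assumption-laden illustration: the parameter values, initial conditions and exogenous inputs (Net Taxes, Net Output, Output, Price Modification, the wanted stock) are invented, not the Portuguese data, and STELLA's DELAY(x, n) is mimicked by reading a variable's stored history n steps back:

```python
# Euler-integration sketch of equations (3)-(13); all numbers are hypothetical.

def delay(history, n, default):
    """Value of a series n steps ago, or a default before enough history exists."""
    return history[-1 - n] if len(history) > n else default

def simulate(steps=8, dt=1.0):
    # hypothetical exogenous inputs and parameters
    net_taxes, net_output, output = 5.0, 80.0, 100.0   # Output constant here, so DELAY(Output, 2) = output
    public_cons, interm_cons = 15.0, 30.0
    c0, cc, f, price_mod = 10.0, 0.6, -0.2, 0.05
    capital_stock, wanted_stock = 50.0, 20.0

    # state variable histories, seeded with hypothetical initial values
    D, Yd, price, real_stock, cons = [100.0], [90.0], [1.0], [25.0], [70.0]
    for _ in range(steps):
        # (4): aggregate demand formation
        adf = cons[-1] + public_cons + interm_cons
        # (7)-(8): consumption formation on disposable income delayed two periods
        cf = c0 + cc * delay(Yd, 2, Yd[0]) + f * price[-1]
        # (6): disposable income formation with one-period delays
        ydf = (net_taxes + price[-1] * net_output
               + capital_stock * (price[-1] - delay(price, 1, price[0])))
        # (10)-(11): asymmetric price adjustment on the excess of wanted stock
        excess = wanted_stock - real_stock[-1]
        p_dec = -price_mod if excess < 0 else 0.0
        # (13): stock formation from (delayed) output and demand
        sf = -(output - delay(D, 1, D[0]))
        # (3), (5), (9), (12): Euler updates of the state variables
        D.append(D[-1] + adf * dt)
        Yd.append(Yd[-1] + (ydf - cf) * dt)
        price.append(price[-1] + (price_mod - p_dec) * dt)
        real_stock.append(real_stock[-1] + sf * dt)
        cons.append(cf)
    return D, Yd, price, real_stock
```

Calling `simulate(steps=8, dt=1.0)` mirrors a yearly run; `simulate(steps=32, dt=0.25)` would correspond to the quarterly step used in the simulations discussed in the conclusions.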
Figure 2 - Stock-Flow Diagram Generated in STELLA 9.0 for the Model
4. Conclusions
We have run several simulations of the model with dt steps of 1 (yearly) and 0.25 (quarterly). The data gathered from the simulations reveal that the parameter Alfa has a great influence on price formation and thus on the formation and distribution of disposable income. Table 1 shows that slight changes in the behaviour of the productive sector in
dependence on demand will introduce shocks in the formation of disposable income, while consumption formation will be less influenced.
Table 1 - Results on parametric control - simulations with different thresholds for Alfa, dt step =1.
Moreover, the allocation of disposable income to consumption depends on the level of prices. If saving increases, total wealth increases as well, which determines an increase in consumption that in turn determines a decrease in savings. The simulations reveal that, over the studied period, the positive feedback loop was stronger until 2009, generating price increases and growth of disposable income.
Acknowledgement
This work was co-financed from the European Social Fund through the Sectoral Operational Programme Human Resources Development 2007-2013, project number POSDRU/159/1.5/S/134197 "Performance and excellence in doctoral and postdoctoral research in Romanian economics science domain".
References
[1] Dagum Camilo, "A New Model of Personal Income Distribution: Specification and Estimation", in Modeling Income Distributions and Lorenz Curves, Springer, 2008, pp. 3-23.
[2] Krueger Dirk. (2015, February 10). Quantitative Macroeconomics: An Introduction.
[Online] Available: http://www.e-booksdirectory.com/details.php?ebook=2831
[3] Law Averill M. and Kelton W. David, Simulation Modeling and Analysis, Third Edition, McGraw-Hill, 2000.
[4] Richmond Barry, An Introduction to Systems Thinking with STELLA, isee systems, 2004.
[5] Scarlat Emil and Chiriță Nora, Cibernetica Sistemelor Economice. Bucharest: Ed. ASE,
2003, pp 400-415.
INEQUALITY OF INCOME DISTRIBUTION IN ROMANIA.
METHODS OF MEASUREMENT AND CAUSES
Malina Ionela BURLACU
The Bucharest Academy of Economic Studies
Abstract. Economic inequality is a long-standing subject, amplified over time with the development of civilizations. Against the background of the economic crisis, the perception of income inequality has grown, due to the increasing number of unemployed but also to lower incomes among the employed population. This feeling was boosted by the lack of policies to improve the situation. Education is another issue at the national level that should be given major importance in improving the income distribution; any directive in this regard should be treated with maximum attention by both the authorities and the population. This aspect deserves all the more interest considering that, throughout history, it has proven more difficult to boost the economy of a country whose population has an ever lower standard of living.
Key words: education, Gini coefficient, income, living standards
JEL Classification: O15, D63, I24
1. Introduction
Income inequality and poverty lie at the basis of any society; the fact that they are felt more strongly in some states and less in others derives from each state's social and economic policies. The idea of inequality dates back to ancient times, when there were three classes (wealthy people whose status was transmitted from generation to generation, the poor who were deprived of certain things since childhood, and an intermediate class). In modern society three types of inequality are identified: material inequality, social inequality and political inequality. [1] Each type of inequality is reflected in the welfare and living standards of the population.
Romania is, among the Member States of the European Union, one with a very high level of income inequality. Various studies in this direction have revealed that an unequal distribution of income significantly amplifies a crisis, thus hampering the recovery process; it is also a factor that prevents the economy from reaching its full potential.
2. Methods of measurement and causes
To characterize and highlight the degree of income inequality in the present study, we used the Gini coefficient. It can take values between 0 and 1 (or between 0 and 100%), reflecting variation from zero concentration to maximum concentration. The closer the coefficient is to 0, the closer the income distribution is to perfect equality; in the limit, all members of the population to which the indicator refers would record the same income. Conversely, the closer the indicator value is to 1, the higher the degree of income inequality (wealth concentrated around a single person).
A major benefit of this indicator is that it highlights the percentage of income that would need to be redistributed for income inequality in a country to be as small as possible.
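For concreteness, the Gini coefficient can be computed for a discrete income vector with the standard mean-difference formula; this is a sketch with made-up incomes, not the Eurostat methodology in full (which uses equivalised incomes and survey weights):

```python
# Discrete Gini coefficient via the rank-weighted formula
# G = (2 * sum_i i*x_(i)) / (n * sum x) - (n + 1) / n, with 1-based ranks
# over the sorted incomes x_(1) <= ... <= x_(n).

def gini(incomes):
    """Gini coefficient in [0, 1]: 0 = perfect equality, (n-1)/n = one person holds all."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

assert gini([1, 1, 1, 1]) == 0.0   # perfect equality
print(round(gini([0, 0, 0, 10]), 2))  # 0.75: maximal concentration for n = 4
```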
Another major advantage is that it depends neither on the size of the country nor on its social or political situation.
Figure 1. Gini coefficient of equalised disposable income
Thus, it can be seen that certain countries (Hungary, Belgium, the Czech Republic, Slovenia, Slovakia, Finland, Sweden) show a quite small Gini coefficient (23-28%), which means that reducing inequality would require redistributing 23-28% of income. In contrast, countries like Bulgaria, the UK, Portugal and Latvia would require a redistribution rate of 33-38% to ensure income equality. At the time of its EU accession, Romania presented a very high value of the Gini coefficient (38); it subsequently declined and today stands at around 34.
The cause is a huge gap between living conditions: part of the population benefits from a high standard of living, while on the opposite side there are people who do not even have the opportunity to find a job to make a living and sustain a family. [2]
Romania records a low standard of living compared to other Member States of the European Union, which reflects an unequal distribution of income.
There are large income differences between entrepreneurs and business people, or people with higher education working in multinationals, on one hand, and, on the other, between urban and rural areas, or Bucharest versus the provinces. The effects of the economic crisis in recent years have amplified the feeling of a missing middle class (civil servants, teachers, health system employees, employees in the Romanian justice system). Lowering wages in these sectors has increased the number of low-income people and thus amplified income inequality and unfair distribution.
In light of these aspects, special attention should be paid to finding and applying policies able to reduce inequality and maintain an equitable income distribution.
For a correct measurement of income inequality, two methodological issues must be settled: the definition and calculation of income, and the indicators that highlight inequality.
In Romania two surveys are carried out by the National Statistics Institute (Family Budget
Survey and Integrated Household Survey) to identify disposable income (total income
derived by the members of the household, regardless of source). In order to reflect differences
in household size and composition, this total is divided by the number of "equivalent adults", using a standard scale which assigns a weight of 1 to the first adult in the household, a weight of 0.5 to each additional household member aged 14 or over, and a weight of 0.3 to household members aged under 14. The resulting figure is called "disposable income per equivalent adult". [5]
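The equivalence scale just described can be sketched as follows (the household composition used in the example is hypothetical, and at least one adult per household is assumed):

```python
# Equivalence scale with weights 1 / 0.5 / 0.3, as described above.

def equivalised_income(total_income, ages):
    """Disposable income per equivalent adult for one household.

    The first adult counts 1, each further member aged 14+ counts 0.5,
    each member under 14 counts 0.3. Assumes at least one member aged 14+.
    """
    adults = [a for a in ages if a >= 14]
    children = [a for a in ages if a < 14]
    scale = 1.0 + 0.5 * (len(adults) - 1) + 0.3 * len(children)
    return total_income / scale

# two adults + two children under 14: scale = 1 + 0.5 + 2*0.3 = 2.1
print(round(equivalised_income(4200.0, [40, 38, 10, 7]), 2))  # 2000.0
```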
The disposable income of a household is currently the most effective way to estimate its standard of living. The calculation of this indicator includes:
- cash income (earnings, property income, money income from agriculture);
- income in kind (food and the value of food from own agricultural sources, or services within the household).
Income in kind is a very sensitive topic and worth taking into consideration in the analysis of income inequality; including it yields a much clearer picture of inequality. Consumption of own food resources is of key importance in analyzing the living standard of the population, especially of those who record low cash incomes. Failing to take this aspect into account can lead to an overvaluation of inequality in Romania. [4]
The income inequality index estimates how much higher the disposable income per adult-equivalent (including or excluding the value of consumption from own resources) of the individuals in quintile 5 (the richest) is than that of the individuals in quintile 1 (the poorest), in the distribution of the population by disposable income per adult-equivalent. [3]
Figure 2. Income inequality index
The value of this indicator has declined in recent years, which indicates a narrowing of the gap between the richest persons and those at the opposite extreme, the poorest.
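The quintile-based index described above amounts to a top-fifth over bottom-fifth income ratio (the S80/S20 ratio), which can be sketched as follows; the incomes are illustrative and survey weights are ignored:

```python
# Quintile share ratio: total income of the richest fifth over the poorest fifth.

def quintile_ratio(incomes):
    """S80/S20 ratio. Assumes len(incomes) is divisible by 5 for simplicity."""
    xs = sorted(incomes)
    n = len(xs) // 5
    bottom = sum(xs[:n])   # poorest quintile
    top = sum(xs[-n:])     # richest quintile
    return top / bottom

# ratio of top to bottom fifth: (9 + 10) / (1 + 2) ≈ 6.33
print(quintile_ratio([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A falling value of this ratio over time corresponds to the narrowing gap noted above.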
2.1. Causes of income inequality
2.1.1. Lack of investment in education
Since ancient times, as in the present, education has had and still has an important role in the development of society and thus of the population. In the current context of modern society, the education system is "key" to the development of a country. The differences between social classes and the inequalities between rich and poor are reflected in the growth of a country. The results of numerous studies in this regard show that the determining factor of this situation is
the lack of investment in education. In terms of expenditure on education as a percentage of GDP, Romania has one of the lowest levels. In addition, there is a lack of policies to encourage education among disadvantaged families, and a lack of educational conditions in rural areas. In recent years, especially since 2008, the educational system has suffered through the abolition of numerous educational institutions in rural areas. Thus, due to the lack of financial opportunities, many families are forced to abandon their children's education, which significantly affects the position of Romania as an EU Member State.
2.1.2. Inequality of opportunities in educational system
Equal opportunity in education is a concept that has undergone extensive analysis over time by many experts in this area. The importance of analyzing this issue derives from the fact that education is one of the key elements underlying welfare. Specialists have identified three problems that the educational system in Romania is currently facing: inequality in terms of access to education; inequality within the education system (availability of resources, providing equal learning environments for all participants); and unequal chances of school success (a high school dropout rate).
Figure 3. Mean income by education level (total disposable income of a household), in EUR
In Figure 3 I have tried to emphasize the difference between the average yearly income per household, depending on education level. As a basis of comparison I have chosen Romania versus the average of the Member States of the European Union. For the primary and secondary education levels, the income per household registered in Romania is 11 times lower than the EU average. Significant differences are also recorded for the household income of people with higher education: the incomes of households with higher-educated persons are approximately 18 times lower than the EU average.
2.1.3. Wage inequality
In Romania there are significant differences in the compensation of employees. Wages differ greatly depending on the level of training and the employee's area; in this regard there are significant differences between urban employees and employees from rural areas. At present, Romania attempts to encourage economic agents to invest in rural areas to facilitate job creation. It would thus be possible to create a
connection between children's access to education (the number of families with financial possibilities would increase) and a high standard of living.
The pay gap increases with the imperfections in the functioning of markets. Among the main reasons that lead to these differences are the field of activity, the geographic region, and legal factors (the system of laws under which a particular type of activity operates).
In terms of economic opportunities, the offer in Romania is very low: low incomes, increasingly fewer jobs and limited opportunities for finding new employment, and high taxation on labor that encourages "moonlighting". Together, all these create or amplify the feeling of an unequal distribution of income.
3. Conclusions
Income distribution in Romania is characterized by growing inequality. In recent years, the economic crisis has had a negative impact on household incomes by increasing the number of unemployed and significantly decreasing wages, thereby increasing the number of low-income people at the bottom of the distribution.
Policies are required that promote employment growth, but also the proper remuneration of those who fell below the middle class (the education system, the health system, public administration). Also, developing the rural economy by increasing the number of people holding a job, as well as deterring the "moonlighting" system, are further directions that would contribute to fair income growth and distribution.
In addition, improving the education system and professional training are two major directions of interest that can play a significant role in reducing income inequality.
References
[1] Eurostat, Statistics on income distribution [Online]. Available:
http://ec.europa.eu/eurostat/statistics=explained/index.php/Income_distribution_statistics.
ro
[2] INSSE, www.insse.ro. [Online]. Available:
http://www.insse.ro/cms/files/Web_IDD_BD_ro/O1/O1_6-
Indicele%20inegalitatii%20veniturilor.doc.
[3] Molnar Maria, “Romanian household’s inequality”, Romanian Statistical Review, 2010.
[4] Molnar Maria, ”Household’s income distribution in Romania. Inequality”, Working
Paper, Institute of National Economy, 2009.
[5] Tara Sergiu, Social inequality and poverty in Romania nowadays [Online]. Available:
http://store.ectap.ro/articole/834_ro.pdf.
WAR GAMES AND A THIRD PARTY INTERVENTION IN CONFLICT
Mihai Daniel ROMAN
The Bucharest University of Economic Studies [email protected]
Abstract. In recent decades researchers have used game theory to explain and better understand conflict situations. War games were developed in order to analyze combatants' behavior in various situations, especially for two-party conflicts. In this paper we analyze another class of war games that involves three parties: two combatant parties and a third party that influences one or the other combatant side following its own benefit. From our case studies we conclude that it is more expensive for the third party to be a peacekeeper than a peace breaker.
Keywords: game theory model, third party intervention, war, peacekeeper, peace breaker
JEL classification: H56, C72
1. Introduction
In recent decades researchers have used game theory to explain and better understand conflict situations. War games were developed in order to analyze combatants' behavior in various situations; many of them involve two parties in conflict. Military decisions are established depending on the possible strategies and the payoff functions, under complete or incomplete information.
Another class of war games involves three players: the two combatant parties and a third party that influences one or the other combatant side following its own benefit. This third party can help a player by supplying military equipment or by direct intervention in the conflict.
In this paper we propose a game theory model to analyze the influence of a third party in a conflict, as a peacekeeper or as a peace breaker.
The first part of the paper contains a brief literature review, followed by the model description. Finally we present a case study of third party intervention in two different conflicts, which shows that it is easier to be a peace breaker than a peacekeeper from an economic perspective.
2. Literature review
In order to understand the role of a third party in a conflict, it is necessary to better understand armed conflicts in general. The main question concerns the reasons a third party has to intervene. For example, Regan [14] assumes that third parties take action in order to limit hostilities; he therefore casts the third party as a "conflict manager". Siqueira [16] similarly assumes that the short-term goal of the third party is to reduce the existing level of conflict.
The intervention of the third party can be perceived as liberal or idealistic. This vision is
formed on the belief that aversion to human tragedy is the main reason why third parties get
involved in the conflict.
Intuitively, the idealistic perspective seems to offer an incomplete description of the third party. During the Cold War, for example, the Soviet Union intervened militarily on behalf of the left-wing government ruling Afghanistan, not to promote peace in the region, but to protect its own national security from anti-Soviet troops. In an empirical investigation that
contradicts his main assumptions, Regan [13] discovered that generally the intervention of a third party tends to increase the duration of conflicts. Under an idealistic perspective, this result indicates that a third party would better achieve its purpose if it were to ignore the conflict. Obviously, a broader explanation is necessary to better understand the general nature of the third party effect.
Other studies, such as Morgenthau [10] or Bull [2], reach the conclusion that third parties choose to intervene when national interests are at stake. Regan [12], [14] describes this vision as a "realism paradigm" and identifies it as a dominant philosophy in international policy. Complementary to realism is the vision that ethical problems and domestic policies play a crucial role in the third party's decision to intervene, a perspective supported by Blechman [1] and Carment and James [3].
In this case the success of an "ally" in a territorial conflict can be beneficial to a third party in countless ways. Potential future benefits for a third party include better access to natural resources and to commerce, better national security, ethical fulfillment and geo-strategic advantages (Moseley [11]).
Taking into consideration assumptions that incorporate both the costs and the benefits of intervention, we will establish the trade-offs a third party faces when it decides whether to get involved in a conflict or not. An interesting and predominant type of third party intervention, considering Siqueira's model [16], is military subvention. As the subventions grow, the probability that the ally wins or maintains territorial possession grows as well. Furthermore, the cost of supporting an ally is influenced by the degree of the military subvention. In Siqueira's model of third party intervention, the third party is treated exogenously and therefore does not act as an economic agent when choosing the level of intervention; it acts strictly as a peacemaker, regardless of the stakes of a specific conflict.
Roman [15] describes conflict situations with financial influences for the two parties involved in the conflict. Depending on the economic dimension of the conflict, the involved parties decide to attack or to resist in battle.
Chang et al. [4] described a territorial dispute through a three-step game that allows the decision of a third party to intervene to be modeled, as well as the nature and potential effects of a third party intervention to be understood in a broader manner.
3. The Model
In our paper we consider the model analyzed by Chang et al. [4], which assumes the third party is an "egoistical" agent that pursues its own interests by maximizing a weighted sum of the strategic values associated with a disputed territory, which can be in the "wrong" hands of a country with no allies. Furthermore, they demonstrated how "intervention technology" in the form of military assistance (Siqueira, [17]) interacts with the canonic "conflict technologies" of two rival parties, affecting the results of the game through sequential moves. This three-step game allows the examination of the role of a third party in supporting its ally, from the deterrence perspective.
The game they analyze runs as follows. The third party moves first to support its ally, taking into consideration the impact of its actions on the followers: the subsequent sub-games played between the two rival parties (1 and 2) over a disputed territory. For the second and third steps of the three-step game, two alternative scenarios are examined. In the first scenario, party 1, the territorial defender, moves at the second step to decide the allocation of the military goods necessary for defense, while party 2, the challenger, moves at the third and final step of the game. The second scenario inverts the order of moves between parties 1 and 2 in the last
two steps of the general game. In both scenarios, the third party supports its ally, party
number 1.
Before we can model the endogeneity of a third party's intervention in a conflict between two rival parties, it is necessary to discuss the term "intervention technology". This term reflects the degree to which a third party can affect the capability of an ally and, by doing so, the general result of the conflict. Suppose that party 3 supports its ally using transfers of military subventions (M), which serve to improve the military efficiency of party 1 by reducing its arming cost. We can thus define a cost-reduction function s = s(M), where s'(M) = ds/dM < 0 and s''(M) = d²s/dM² > 0. In other words, an increase in M lowers the marginal cost of arming for party 1, but the cost-reduction effect itself exhibits an increasing marginal cost. We now examine how the third party's intervention technology interacts with the conflict technologies of the respective parties in conflict.
Like in the conflict literature, a canonic "contest success function" is used to capture the
conflict technology.
In other words, the probabilities that party 1 and party 2, respectively, win the armed conflict are:
$p_1 = \frac{G_1}{G_1+\gamma G_2}$ and $p_2 = \frac{\gamma G_2}{G_1+\gamma G_2}$, (1)
where $G_1$ (> 0) is the quantity of military goods that party 1 allocates for defending the territory, $G_2$ (≥ 0) is the quantity of military goods that party 2 allocates for attacking the territory, and $\gamma$ represents the efficiency of one unit of military goods of party 2 relative to that of party 1.
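The contest success functions in (1) are straightforward to compute; the values of G1, G2 and γ below are illustrative:

```python
# Contest success functions from (1): win probabilities for defender and challenger.

def success_probabilities(G1, G2, gamma):
    """Return (p1, p2) for party 1 (defender) and party 2 (challenger)."""
    denom = G1 + gamma * G2
    return G1 / denom, gamma * G2 / denom

# with gamma = 1.5, 40 units of party 2's goods match 60 of party 1's:
p1, p2 = success_probabilities(G1=60.0, G2=40.0, gamma=1.5)  # p1 = p2 = 0.5
```

Note that p1 and p2 always sum to 1, which is the additive property discussed next.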
The aforementioned success probabilities take the form of a simple addition of the conflict technologies. This property suggests that third parties play no role in two-party conflicts, which is easy to check when the two parties use the optimal quantities of weapons in a simultaneous-move game. Interestingly, the multiple-step game and the sequential moves of a third party intervention do play an important role in the equilibrium between the two parties in conflict, regardless of the additive form of the conflict technologies in (1).
In order to endogenously characterize the third party's choice regarding the level of
intervention we use a three-step game. The third party makes the first move, optimally
choosing a subvention level that maximizes its own objectives. In the second and third step of
the game, the parties 1 and 2 make sequential moves in order to determine the optimal way to
allocate military goods for the conflict, the first to make a move being the owner of the land.
We take into consideration two generic scenarios. In the first scenario, the first party occupies
the territory and therefore assumes the role of Stackelberg leader during the second step
of the game. The second party, the challenger, moves on to the third and last step of the
game. In the second case, the second party, the owner of the territory, moves on to the second
step while the first party, the challenger, moves to the third and final part of the game.
During the second and third steps, we follow Grossman and Kim [8][9] and others after them in using a Stackelberg framework, in which the defender leads the allocation of military goods. Gershenson [5][6] defends this structure by assuming that the institutional environment of the defender is relatively rigid; the defensive allocation of goods is therefore a prior commitment of the defender. The advantage of this assumption is that it allows us to analyze a deterrence strategy by the defender. Chang et al. [4] develop a model for characterizing the possible results of a territorial dispute between two rival parties in a Stackelberg game.
Knowing that the third party provides military subvention transfers (M) to the first party, we assume, for analytical simplicity, that the opportunity cost function is s = 1/(1+M)^θ,
where θ measures the degree of effectiveness with which each subvention dollar reduces party 1's unitary arming cost, and 0 < θ < 1.
Since the third party employs M in the first step, the winning functions for parties 1 and 2 in
the following steps of the game are:

Y1 = [G1/(G1 + γG2)]·V1 − [1/(1+M)^θ]·G1,   Y2 = [γG2/(G1 + γG2)]·V2 − G2,   (2)
where: M (≥ 0) is the level of military subventions transferred from the third party to party 1;
θ is the efficiency with which a one-dollar subvention reduces the unit arming cost of
party 1; Vi is the total value that party i (i = 1, 2) attaches to keeping the territory in the
following period, when it can exploit part of the territory for intrinsic and economic reasons.
Note that the specification in (2) implies that the intervention of the third party is
tactically "indirect", in that the third party's military support does not directly affect the
contest success function of party 1. The parameter γ (> 0) reflects the asymmetry of the
conflict. The military goods Gi can be thought of as a "composite good" that includes a
quantity of weapons, trained soldiers and strategic information.
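The payoff structure in (2) can be sketched in a few lines of code. The snippet below is a minimal illustration (the parameter values are invented, not taken from the case studies): it evaluates both parties' payoffs and shows that a positive subvention M lowers party 1's effective arming cost and raises its payoff.

```python
# Sketch of the winning functions in Eq. (2); all parameter values here
# are illustrative assumptions, not the case-study figures.

def payoffs(G1, G2, V1, V2, gamma, M, theta):
    """Return (Y1, Y2): contest share times prize, minus arming cost."""
    share1 = G1 / (G1 + gamma * G2)           # party 1's winning probability
    share2 = gamma * G2 / (G1 + gamma * G2)   # party 2's winning probability
    cost1 = G1 / (1.0 + M) ** theta           # unit cost reduced by the subvention M
    return share1 * V1 - cost1, share2 * V2 - G2

print(payoffs(1.0, 1.0, 10.0, 10.0, 1.0, 0.0, 0.5))  # → (4.0, 4.0)
print(payoffs(1.0, 1.0, 10.0, 10.0, 1.0, 3.0, 0.5))  # higher Y1: cheaper arming
```

With M = 0 both parties face the full unit cost; raising M leaves party 2's payoff unchanged but increases party 1's, which is exactly the "indirect" channel described above.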
4. Case study
Case I
In the first part of our case study we analyze the Second World War conflict between
Germany and Poland. On September 1st, 1939, Germany attacked Poland without a formal
declaration of war. The main motive of this attack was Hitler's intention to extend German
territory to the east. After invading Austria and Czechoslovakia without any resistance,
Germany expected to win without a battle with Great Britain and France. Using the model
described in the previous section, we determine the subvention and arming levels Poland
would have needed in order to resist the German attack.
We consider Poland as party 1 (player one), Germany as party 2 (player two) and Great
Britain as party 3 (the third party to the conflict). We also consider only the airplane
endowments of Germany and Poland as war arms. We denote by V1 and V2 the value of the
war airplanes of Poland and Germany, and by γ the ratio between V2 and V1. We also
suppose that θ (the subvention efficiency) lies between 0.2 and 0.3. S1 − S2 is the strategic
value for party 3 if Poland keeps its territory, Mc and M* are the critical and the optimal
levels of military subventions for Poland, G1 and G2 are the optimal war-airplane
endowments of Poland and Germany, and pi is the attack/riposte probability.
Results for the first model, where party 1 (Poland) keeps its territory, are reported in
Table 1.

Table 1. Mil. U.S. dollars

V1 = 17.4, V2 = 69.2, γ = 1.349, S1 − S2 = 51.8; G1* = 93.45, G2* = 0, p1* = 1, p2* = 0
Stability condition (θ(S1−S2)/(2γ) · V1/V2)^(θ/(1−θ)) > 2γV2/V1 > 1: verified for all cases

θ      Mc     M*
.216   .593   —
.217   .564   48.8
.218   .536   50.2
.219   .510   51.7
.220   .485   53.1
Comparing V1 with G1*, we observe that Poland's initial endowment was five times lower
than the optimal level necessary to prevent the conflict. Great Britain's war-subvention effort
would have had to increase dramatically, which was not economically possible at the time.
Even if the strategic value of the conflict for the third party decreases, the optimal deterrent
arming level remains just as high, and Poland could not support the war effort by itself.
The main conclusion of this case study is that the initial war endowment and the subventions
granted to Poland could not have prevented the German attack in 1939, and Great Britain's
initial decision was correct, since the British economy could not at that time support the war
effort of another European country.
Case II
The second case analyzed in our paper is the German attack on Yugoslavia in 1941. The
German army was well armed and trained, and its attack was brutal. The Yugoslavian army
tried to resist, but the Italian attack secured victory in less than two weeks.
In our example the first party is Germany (the attacker), the second party is Yugoslavia and
the third party is Italy. We keep the same variables as in the first case study, using the
war-airplane endowments in the first scenario and army troops in the second. Germany
(party 1) attacks Yugoslavia (party 2), attempting to occupy Yugoslavian territory.
Results for the first scenario are described in Table 2.
Table 2. War airplanes, Mil. U.S. dollars

V1 = 14.3, V2 = 26.9, γ = 1.87, S1 − S2 = 12.5; p1* = 1, p2* = 0
Stability condition (θ(S1−S2)/(2γ) · V1/V2)^(θ/(1−θ)) > 2γV2/V1 > 1: verified for all cases

θ      Mc     M*     G1*     G2*
.247   9.1    .324   17.84   .315
.248   9.2    .322   17.85   .309
.249   9.3    .319   17.86   .304
.250   8.9    .317   17.87   .298
.251   8.5    .315   17.88   .293
In this case we observe that Italy helped Germany win the war against Yugoslavia. The
actual Italian subventions to Germany were around 39 mil. US $, a value greater than the
critical subvention level resulting from the model. Mc is also positive, which marks Italy as a
peace breaker. Moreover, the probability that Germany wins the war is 1, which indicates
that Yugoslavia had no chance to resist Germany and Italy.
For the second scenario we use the numbers of soldiers as V1 and V2 (V1 the German
soldiers attacking Yugoslavia, V2 the number of Yugoslavian soldiers).
The simulation results are described in Table 3.
Table 3. Soldiers, thousands

V1 = 337, V2 = 700, γ = 2.07, S1 − S2 = 362; p1* = 1, p2* = 0
Stability condition (θ(S1−S2)/(2γ) · V1/V2)^(θ/(1−θ)) > 2γV2/V1 > 1: verified for all cases

θ      Mc     M*     G1*    G2*
.247   21.1   23.2   343    775
.248   21.2   23.1   344    768
.249   21.3   23.0   346    761
.250   21.2   22.9   347    754
.251   20.5   22.8   349    747
This scenario indicates that Italian support for the German army was necessary, since the
Yugoslavian army was larger than the German troops involved on the Yugoslavian front. In
this case too, Italy helped Germany and acted as a peace breaker. The theoretical optimal level of
German soldiers was close to the real one, but Yugoslavia did not have the level of troops
necessary to resist Germany and Italy.
The model thus matches the historical course of events: a brutal attack by Germany, carried
out with the help of Italy.
5. Conclusions
War conflicts have been intensively studied in many papers. One of the most interesting lines
of analysis concerns the role of a third party in various conflicts. Third parties have different
interests, ranging from preventing one party's attack to helping the attacker win the war.
Our analysis shows that a deterrence strategy is sometimes costly for the third party and that,
without great economic power, it cannot sustain the war subventions (see the US and EU help
for Ukraine).
The optimal intervention of a third party can also make it a peacemaker or a peace breaker,
depending on the nature of the conflict and on the third party's relationship with one or
another of the parties involved.
In any case, our study shows that it is more difficult for a third party to be a peacekeeper than
a peace breaker.
References
[1] B.M. Blechman, "The intervention dilemma", Washington Quarterly 18, pp. 63–73, 1995.
[2] H. Bull, "Intervention in World Politics", Clarendon Press, 1984.
[3] D. Carment and J. James, "Internal constraints and interstate ethnic conflict: toward a
crisis-based assessment of irredentism", Journal of Conflict Resolution 39, pp. 82–109, 1995.
[4] Y.M. Chang, J. Potter and S. Sanders, "War and peace: Third-party intervention in
conflict", European Journal of Political Economy, pp. 954–974, 2007.
[5] D. Gershenson, "Sanctions and civil conflict", Economica 69, pp. 185–206, 2002.
[6] D. Gershenson and H.I. Grossman, "Civil conflict: ended or never ending?", Journal of
Conflict Resolution 44, pp. 807–821, 2000.
[7] E. Glaeser, "The political economy of hatred", Quarterly Journal of Economics 120, pp.
45–86, 2005.
[8] H.I. Grossman and M. Kim, "Swords or plowshares? A theory of the security of claims to
property", Journal of Political Economy 103, pp. 1275–1288, 1995.
[9] H.I. Grossman and J. Mendoza, "Scarcity and appropriative competition", European
Journal of Political Economy 19, pp. 747–758, 2003.
[10] H.J. Morgenthau, "To intervene or not to intervene", Foreign Affairs 45, pp. 425–436,
1967.
[11] A. Moseley, "Political realism", The Internet Encyclopedia of Philosophy, 2006,
http://www.iep.utm.edu/polphil/
[12] P. Regan, "Conditions for successful third party intervention in intrastate conflicts",
Journal of Conflict Resolution 40, pp. 336–359, 1996.
[13] P. Regan, "Choosing to intervene: outside intervention in internal conflicts", Journal
of Politics 60, pp. 754–759, 1998.
[14] P. Regan, "Third-party interventions and the duration of intrastate conflicts", Journal
of Conflict Resolution 46, pp. 55–73, 2002.
[15] M. Roman, "A game theoretic approach of war with financial influences", International
Conference New Challenges in the Field of Military Sciences, Budapest, September 2010.
[16] K. Siqueira, "Conflict and third-party intervention", Defence and Peace Economics
14, pp. 389–400, 2003.
MACROECONOMIC FACTORS OF SMEs PERFORMANCE IN
ROMANIA IN THE PERIOD 2005-2013. A TIME SERIES APPROACH
Marușa BECA
The Bucharest University of Economic Studies
Ileana Nișulescu ASHRAFZADEH
The Bucharest University of Economic Studies
Abstract. In this research article, we study the relationship between macroeconomic factors
and the SMEs’ performance for the Romanian economy through the Autoregressive
Distributed Lags Model (ADL). A time series analysis was performed that uses quarterly data
for the period January 2005 – December 2013 in order to determine the effect of the
monetary and tax policy adopted by the Romanian government before, during and after the
2008 economic crisis on the SMEs’ productivity. The deflated value-added per number of
micro enterprises, small firms and medium enterprises is the endogenous variable in three
linear models with six exogenous macroeconomic variables (the interest rate, the CPI, the
unemployment rate, the FDI rate, the tax rate and the government expenditure ratio to GDP)
plus the lags of the dependent variable. The main finding is that the government expenditure
rate variance has the most significant negative impact on the variance of the micro firms
value added among the other explanatory variables, but a positive effect on the small and
medium enterprises value added variance.
Keywords: Autoregressive Distributed Lags Model, firm performance, monetary policy,
Romania, tax policy.
JEL classification: E42, E52, E62
1. Introduction
The SMEs are the backbone of Romania's economy because they create 54% of the national
wealth’s added value and 65% of jobs in Romania. The economic crisis has hit hard in the
Romanian SME sector, the main problems it faces are the late collection of receivables, the
lack of collateral to obtain the loans for investments and the bureaucracy in accessing
European funds. ”The Small and Medium Enterprise (SME) category consists of the firms
that employ less than 250 persons and that have an annual net turnover of maximum € 50
million and/or holds total assets as far as € 43 million.” [1] There are three categories of
SMEs: micro-enterprises, small firms and medium ones in conformity with their turnover or
total assets and number of employees.
In order to perform our analysis, we employed as dependent variables the quarterly deflated
value-added per number of micro companies, small firms and medium enterprises from the
period 2005-2013. The goal of this paper is to establish the relationship between
macroeconomic factors such as the interest rate, the CPI, the tax rate, the unemployment rate,
the FDI rate and the government expenditure rate on one side and the deflated SMEs’ value-
added per number of enterprises on the other side through a time series approach. An
Autoregressive Distributed Lags Model (ADL) is used in our investigation in order to achieve
our goal.
The study is structured as follows: the second part reviews the main literature on economic
growth and firms' performance at the macroeconomic level; the third part presents the
methodology, the data and the research design; the fourth part presents the econometric model
equations and results; and the last part presents the main conclusions and recommendations.
2. Literature review
Usman (2011) shows that there is a long-run relationship between public expenditure and
growth by using cointegration and VEC results [2]. Nijkamp (2004) finds that the
composition of public expenditure matters for growth. Governments should be aware of the
fact that outlays for education, transport infrastructure, and general government can promote
economic growth. [3]. Fölster’s results point to the fact that both government expenditure and
taxation are found to be negatively associated with economic growth. [4]
Devarajan (1996) shows that an increase in the share of current expenditure has positive and
statistically significant growth effects. By contrast, the relationship between the capital
component of public expenditure and per-capita growth is negative. Thus, seemingly
productive expenditures, when used in excess, could become unproductive [5].
Kneller finds that productive government expenditure enhances growth, while non-productive
expenditure does not. When financed by some combination of non-distortionary taxation and
non-productive expenditure, an increase in productive expenditures significantly enhances the
growth, and an increase in distortionary taxation significantly reduces growth. [6].
Using a VAR model, Mutașcu finds that a positive shock in the rate of dynamic taxation level
generates a rise in the level of economic growth in the long term in the case of Romania [7].
Inward investment has had a significant impact on the economic performance of host
economies, although beneficial effects on domestic companies appear to have been slow to
develop [8].
3. Methodology and data
3.1. Data collection and variable definitions
In order to undertake our research, we have used the deflated quarterly time series of the
Romanian SMEs’ value-added from January 2005 to December 2013. The database was
created from information posted on the official sites of the World Bank, the NIS (National
Institute of Statistics), the NBR (National Bank of Romania) and the MPF (Ministry of
Public Finance) of Romania. The macroeconomic variables were selected according to a
review of the recent literature in the field, taking into consideration the specific features of
Romania’s economy and the global economic environment. The econometric models and the
data analysis were performed in the Gretl software.
The fiscal policy was assessed through the government expenditure rate and the tax rate, and
the monetary policy was quantified by the interest rate and the CPI.
The dependent variable selected for the analysis is the deflated SMEs’ value-added ratio (the
gross value added at factor cost is the sum of wages and other labor-related cost factors,
profits, operating subsidies, fixed capital amortization, net of taxes linked to production)
divided by the number of SMEs (micro firms, small companies or medium enterprises).
(VA_micro, VA_small, VA_medium).
The independent variables employed in our analysis are the following:
the interest rate of monetary policy (Int_rate) [9]
the tax rate represents the percentage of the tax revenues in the GDP that estimates the
fiscal burden borne by the Romanian enterprises (Tax_Rate) [10]
the CPI (Consumer Price Index), with December 2013 as the reference period, which
measures price dynamics in Romania’s economy [11]
the foreign direct investment rate as a percentage of GDP, which represents the
injection of foreign capital flows into the Romanian economy (FDI) [9]
the unemployment rate as a percentage of the active population (Unempl_rate) [11]
the government expenditure rate as a percentage of GDP (Gov_Rate) [11]
3.2. Research design
In this paper we performed a time-series analysis founded on quarterly data from the period
January 2005 – December 2013, i.e. 36 instances, in order to determine the relationship
between the macroeconomic factors and the Romanian SMEs’ performance. For our research,
we have analyzed the descriptive statistics of the data; we have computed the correlation
matrix in order to establish the degree of correlation between the exogenous variables and the
endogenous one; we have done some exploratory analysis; and, before building the model,
we have performed tests such as the ADF test and the EG cointegration test on the variables.
After designing the econometric model we performed the following tests: unit root tests
(Augmented Dickey-Fuller), cointegration tests (Engle-Granger) and autocorrelation tests on
the selected variables, in order to build the Autoregressive Distributed Lags Model (ADL),
which captures the short-term effects of the tax and monetary policy.
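As a rough illustration of the unit-root step, the sketch below runs a simplified Dickey-Fuller regression (no lag augmentation, so not the full ADF test used in the paper, which was run in Gretl with proper critical values) on a simulated random walk and on its first difference; the data are synthetic, not the SME series.

```python
# Simplified Dickey-Fuller check: regress diff(y) on a constant and the
# lagged level, and look at the t-statistic on the lagged level. Strongly
# negative values speak for stationarity. Illustrative data only.
import numpy as np

def df_tstat(y):
    """t-statistic on rho in: diff(y)_t = alpha + rho * y_{t-1} + e_t."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(len(ylag)), ylag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)       # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)        # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=200))       # random walk: has a unit root
print(df_tstat(walk))                        # small in magnitude: unit root usually not rejected
print(df_tstat(np.diff(walk)))               # strongly negative: first difference is stationary
```

This mirrors the workflow above: the level series fails the stationarity check, while its first difference passes, which is why the models below are estimated on differenced data.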
3.3. Summary statistics
We calculated in the Gretl software the descriptive statistics for the nine variables that we
decided to employ in our analysis, the three dependent variables and six independent
variables. We observe that the value-added of the micro firms (0.02) is ten times smaller than
the one of the small enterprises (0.2) and 63 times smaller than the medium value-added rate
(1.26). Table 1. Summary Statistics, using the observations 2005:1 – 2013:4
3.4. Exploratory data analysis
We also performed an exploratory analysis of the raw data examining histograms, plots and
correlograms of the data in order to determine whether the variables have a trend, a seasonal
pattern or outliers. We decided to take the first difference of the variables in order to
transform the data into stationary time series. After taking the first difference, we noticed that
the unemployment rate is still not stationary and that the CPI has a trend, so we discarded
them from the econometric model. The selected variables, after differencing, have the same
order of integration, I(0).
From the correlation matrix we notice that the first differences of the dependent variables
VA_micro, VA_small, VA_medium are negatively correlated with the interest rate,
unemployment rate and government expenditures and positively correlated with the first
difference of the CPI, FDI and the tax rate variables. After performing the Engle-Granger
cointegration test, we concluded that the variables are not cointegrated, so a VECM cannot
be employed.
4. Econometric models and the main empirical results
In order to explain the SMEs’ value-added rate to certain macroeconomic variables, we
performed an Autoregressive Distributed Lags Model (ADL) on the database that we had
created with the NIS, NBR, and MF data. Coefficients were estimated with ordinary least
squares, and standard errors were calculated using standard asymptotic approximations. The
exploratory analysis, the tests, and the econometric model were performed in the statistical
software Gretl.
After performing the Engle-Granger test for cointegration, we concluded that the independent
variables are not cointegrated with the dependent variable. We determined the lag length for
each of the variables used in the model; we did not employ the lags of the dependent variable
because of collinearity issues. We selected the lags of the exogenous variables by minimizing
the Akaike Information Criterion (AIC).
We have designed an Autoregressive Distributed Lags Model (ADL) in order to analyze the
short-term effects of macroeconomic variables on SMEs’ value-added in Romania. We took
the first difference for all the variables. Based on the requirement to obtain the minimum
AIC, the final form of the econometric model for the micro firms value added, with all the
coefficients being statistically significant, is the following:
Δ(VA_micro_t) = −0.001·Δ(Int_Rate_{t−2}) − 0.002·Δ(Tax_Rate_{t−1}) + 0.002·Δ(Tax_Rate_{t−4}) −
0.08·Δ(Gov_Rate_t) − 0.16·Δ(Gov_Rate_{t−1}) − 0.05·Δ(Gov_Rate_{t−4})   (1)
where:
Δ = the first-difference operator
VA_micro_t = the deflated value-added per number of micro firms in Romania
VA_micro_{t−i} = its i-quarter lag
Int_Rate_t = the interest rate in the current quarter
Tax_Rate_t = the tax rate as a percentage of GDP
Gov_Rate_t = the government expenditure rate as a percentage of GDP
FDI_t = the foreign direct investment rate as a percentage of GDP
t = the current period (quarter)
We observe that the government expenditure rate variance has the most significant negative
impact on the variance of the micro firms value added. This model explains 79% of the
variation in the dependent variable VA_micro. All coefficients are statistically significant at a
p-value of 1%, 5% or 10%.
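The kind of estimation behind this model can be sketched as an OLS regression of the differenced dependent variable on current and lagged differenced regressors. The series and the "true" coefficients below are synthetic stand-ins chosen to mirror the sign pattern of the government-expenditure terms (they are not the paper's data or estimates), and the fit uses plain least squares rather than Gretl:

```python
# Toy distributed-lag regression by OLS on synthetic first-differenced data.
# The assumed short-run effects (-0.08 current, -0.16 one-quarter lag) are
# invented for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 40
gov = rng.normal(size=n)                  # stand-in for a differenced regressor
y = -0.08 * gov[1:] - 0.16 * gov[:-1] + rng.normal(scale=0.001, size=n - 1)

# Regressor matrix: constant, current value, one-period lag
X = np.column_stack([np.ones(n - 1), gov[1:], gov[:-1]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 3))                  # recovers roughly (0, -0.08, -0.16)
```

Because the noise is small relative to the regressors, OLS recovers the assumed current and lagged effects almost exactly, which is the mechanism the ADL specification exploits on the real series.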
The final form of the econometric model for the small firms value added, with all the
coefficients being statistically significant, is the following:
Δ(VA_small_t) = −0.024·Δ(Int_Rate_{t−3}) − 0.035·Δ(Tax_Rate_{t−1}) − 0.023·Δ(Tax_Rate_{t−2}) +
0.017·Δ(Tax_Rate_{t−4}) + 1.54·Δ(Gov_Rate_{t−2}) + 1.59·Δ(Gov_Rate_{t−3}) + 0.78·Δ(Gov_Rate_{t−4}) +
0.013·Δ(FDI_t) + 0.011·Δ(FDI_{t−1}) − 0.019·Δ(FDI_{t−3})   (2)
We observe that the government expenditure rate variance has the most significant positive
impact on the variance of the small firms value added. This model explains 82.58% of the
variation in the dependent variable VA_small. All coefficients are statistically significant at a
p-value of 1% or 5%.
The final form of the econometric model for the medium firms value added, with all the
coefficients being statistically significant, is the following:
Δ(VA_medium_t) = −0.14·Δ(Int_Rate_{t−3}) − 0.22·Δ(Tax_Rate_{t−1}) − 0.16·Δ(Tax_Rate_{t−2}) +
0.11·Δ(Tax_Rate_{t−4}) + 8.56·Δ(Gov_Rate_{t−2}) + 8.60·Δ(Gov_Rate_{t−3}) + 4.40·Δ(Gov_Rate_{t−4}) +
0.08·Δ(FDI_t) + 0.07·Δ(FDI_{t−1}) − 0.13·Δ(FDI_{t−3})   (3)
We observe that the government expenditure rate variance has the most significant positive
impact on the variance of the medium firms value added. This model explains 83.60% of the
variation in the dependent variable VA_medium. All coefficients are statistically significant
at a p-value of 1% or 5%.
We performed several tests to verify the assumptions of the linear model, all of which were
met: the White and Breusch-Pagan tests for heteroskedasticity, the normality-of-residuals
test, the collinearity test and the Breusch-Godfrey test for autocorrelation.
5. Conclusions
Our analysis suggests that the first differences of the dependent variables VA_micro,
VA_small, VA_medium are negatively correlated with the interest rate, unemployment rate
and government expenditures and positively correlated with the first difference of the CPI,
FDI and the tax rate variables. Our analysis shows that the average value-added of the micro
firms is ten times smaller than the one of the small enterprises and 63 times smaller than the
medium value-added rate.
Our results suggest that the government expenditure rate variance has the most significant
negative impact on the variance of the micro firms value added among the other explanatory
variables but a positive effect on the small and medium enterprises value added variance.
For further research, given that the literature finds that education expenditure and/or research
and development expenditure as a percentage of government expenditure have a significant
impact on a country's economic growth, these variables could be employed in a quarterly
time series analysis for the same period in order to measure their impact on firms'
performance.
Acknowledgment
This paper was co-financed from the European Social Fund, through the Sectorial
Operational Programme Human Resources Development 2007-2013, project number
POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and
postdoctoral, in the economic, social and medical fields -EXCELIS", coordinator The
Bucharest University of Economic Studies.
References
[1] AIPPIMM, “Agentia pentru Implementarea Proiectelor și Programelor pentru IMM-uri,”
AIPPIMM web site, 2013. [Online]. Available:
http://www.aippimm.ro/articol/imm/legislatie-imm/definitie-imm. [Accessed: 16-Mar-
2015].
[2] Usman A., Mobolaji H.I., Kilishi A.A., Yaru M.A. and T. A. Yakubu, “Public
expenditure and economic growth in Nigeria,” Asian Economic and Financial Review,
vol. 1, no. 3, pp. 104–113, 2011.
[3] Nijkamp Peter and Poot Jacques, “Meta-analysis of the effect of fiscal policies on long-
run growth,” European Journal of Political Economy, vol. 20, no. 1, pp. 91–124, Mar.
2004.
[4] Fölster Stefan and Henrekson Magnus, “Growth effects of government expenditure and
taxation in rich countries,” European Economic Review, vol. 45, pp. 1501–1520, 2001.
[5] Devarajan Shantayanan, Swaroop Vinaya and Zou Heng-fu, “The composition of public
expenditure and economic growth,” Journal of Monetary Economics, vol. 37, pp. 313–
344, 1996.
[6] Kneller Richard, Bleaney Michael F. and Gemmell Norman, “Fiscal policy and growth:
evidence from OECD countries,”Journal of Public Economics, vol. 74, pp. 171–190,
1999.
[7] Mutaşcu Mihai Ioan and Dănulețiu Dan Constantin, “Taxes and Economic Growth in
Romania,” Annales Universitatis Apulensis Series Oeconomica, vol. 13, no. 1, pp. 94–
105, 2011.
[8] Dawn H. and Pain N., “The Determinants and Impact of Foreign Direct Investment in the
Transition Economies: A Panel Data Analysis,” 1998.
[9] National Bank of Romania, “Raport Analiza Statistica,” National Bank of Romania web
site, 2015. [Online]. Available: http://bnr.ro/Raport-statistic-606.aspx. [Accessed: 10-
Mar-2015].
[10] Ministry of Public Finance, “Executia Bugetului General Consolidat,” Ministry of
Public Finance web site, 2015. [Online]. Available:
http://discutii.mfinante.ro/static/10/Mfp/trezorerie/Rap_trimI2011_anexa.pdf. [Accessed:
12-Apr-2015].
[11] “INSSE - Baze de date statistice - TEMPO-Online serii de timp.” [Online]. Available:
http://statistici.insse.ro/shop/. [Accessed: 10-Mar-2015].
EFFICIENCY OF THE EUROPEAN STRUCTURAL FUNDS INVESTED
IN EDUCATIONAL INFRASTRUCTURE
Monica ROMAN
The Bucharest University of Economic Studies
Abstract. The aim of the paper is to analyse the regional differences between Romanian
counties with respect to the efficiency of European structural funds (ESF) devoted to
financing educational infrastructure. For this purpose, we employ a non-parametric method
widely used for evaluating the efficiency of public policies, namely Data Envelopment
Analysis (DEA). The regional dimension was included in the model by considering 31
Romanian counties as decision making units (DMUs). The results confirm the deep
disparities between Romanian counties concerning the efficiency of using ESF. Some of the
counties with a moderate absorption rate of structural funds are among the efficient DMUs:
Braila, Iasi or Cluj.
Keywords: educational infrastructure, European structural funds, DEA, regional efficiency,
Romania
JEL classification: H83, R58, R25, C54
1. Introduction
Upon becoming a Member State of the European Union, Romania has benefited, starting
with the programming period 2007-2013, from structural and investment funds designed to
address its economic challenges and disparities, as well as to capitalize on the opportunities
available in the country. For Romania, the European Union funds are financial instruments
set up to help reduce regional disparities and foster growth through investments in domains
such as employment, social inclusion, rural and urban development, and research and
innovation. During the programming period 2007-2013, Romania benefited from a budget of
27.5 billion euros, of which 19.2 billion euros for structural and cohesion funds and 8.3
billion for the Common Agricultural Policy.
The aim of the paper is to analyse the regional disparities between Romanian counties with
respect to the efficiency of structural funds devoted to financing educational infrastructure.
One of the most pressing needs of Romanian social development is improving the quality of
the educational infrastructure and reducing the disparities between Romanian regions. The
Regional Operational Programme, through the Key Area of Intervention 3.4, "Rehabilitation,
modernisation, development and equipping of pre-university, university education and
continuous vocational training infrastructure", was the programme that addressed the
educational infrastructure development needs.
The study therefore fills a gap in the literature by evaluating the efficiency of ESF devoted to
educational infrastructure; it provides valuable information for decision makers and also
opens room for further research on this challenging topic. We employ a non-parametric
method widely used for evaluating the efficiency of public policies, namely Data
Envelopment Analysis (DEA). Efficiency was computed with an output-oriented model.
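To fix ideas before the formal model: in the special case of a single input and a single output under constant returns to scale, the DEA efficiency score reduces to each unit's output/input ratio divided by the best ratio in the sample. The toy figures below are invented for illustration, not the paper's county data:

```python
# Toy one-input, one-output DEA under constant returns to scale:
# score = (output/input) / max(output/input). Hypothetical figures only.
import numpy as np

funds = np.array([10.0, 20.0, 15.0])   # input: ESF spent per county (invented)
grad = np.array([8.0, 12.0, 15.0])     # output: graduation-rate gain (invented)

ratio = grad / funds                   # productivity of each DMU
score = ratio / ratio.max()            # 1.0 marks the efficient frontier
print(score)                           # the third DMU defines the frontier
```

With several inputs and outputs, as in this study, the score is instead obtained by solving one linear program per DMU, but the intuition is the same: units on the frontier get a score of 1 and the others are measured against them.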
The contribution of the paper is threefold: the paper approaches the efficiency of using
structural funds at regional level, being one of the first attempts to apply DEA methodology
in this respect; the objects of the study are NUTS3 regions of Romania, a country that still
needs to progress toward cohesion with the other EU countries; and the study uses very
recent data from the programming period 2007-2013.
This paper is structured as follows. Section 2 briefly reviews the literature on the application
of DEA to public policy evaluation and on the impact of structural funds on economic
growth. Section 3 discusses the method applied and the variables employed, followed by the
presentation of the results in Section 4. Finally, Section 5 concludes.
2. Literature review
The DEA approach involves the application of linear programming techniques to trace the
efficiency frontier. DEA was launched by Charnes et al. [4] under the assumption that
production exhibits constant returns to scale, while Banker et al. extended it to the case of
variable returns to scale.
It has been successfully applied to measuring both for-profit and non-profit organizations,
for instance the effectiveness of regional development policies in northern Greece by
Karkazis and Thanassoulis [11]. Coelli, Rao and Battese [6] introduce the reader to this
literature and
describe several applications.
Governmental efficiency in general and public policies efficiency became research subjects
of an increased number of papers. Zhu [18] provides a series of Data Envelopment Analysis
(DEA) models for efficiency assessment and for decision making purposes. Rhodes and
Southwick [16] use DEA to analyze and compare private and public universities in the USA.
There are several applications of DEA method for Romania; Roman and Suciu [17] provide
an efficiency analysis of research activities using input oriented DEA models and Nitoi [14]
assesses the efficiency of the Romanian banking system using an input oriented, variable
return to scale, DEA model. DEA has also been used to assess different aspects of the
medical field like the efficiency of national health systems [1].
Considering the European Cohesion Policy and using a panel of NUTS3 regions, Becker et
al. [3] find positive effects of Objective 1 funds on economic growth, but no employment
effects. Puigcerver-Peñalver [15] finds that structural funds have positively influenced the
growth process at regional level although their impact has been much stronger during the first
Programming period than during the second.
Mohl and Hagen [13] evaluate the growth effects of European structural funds payments at
the regional level. Using a new panel dataset of 124 NUTS regions for the period 1995-2005,
they find empirical evidence that the effectiveness of structural funds in promoting growth
depends strongly on which Objective is analysed: payments under Objectives 2 and 3 have a
negative effect on GDP.
3. Method and variables
The variable of interest in our model is the value of the European structural funds involved in projects that support educational infrastructure at NUTS3 aggregation level. Out of the total number of projects contracted for this purpose in Romania, the projects finalized by April 2014 were selected, resulting in 131 projects with a total value of 723 million lei. The projects devoted to financing higher education and research infrastructure were few in number and were therefore excluded from the analysis. Two other inputs were also considered in the efficiency evaluation: the professor-to-student ratio, which accounts for human resources, and the number of classrooms, which accounts for fixed capital. The output variables refer to education performance (the average graduation rate at the National Test and the average
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
548
graduation rate at the National Baccalaureate) and the variation of the drop-out rate, as a measure of education accessibility.
Data were provided by the National Institute of Statistics, the Ministry of Regional Development and Public Administration, and the Ministry of National Education, and refer to 2013 and 2014.
The method employed in this study, Data Envelopment Analysis, is a non-parametric method which uses linear programming techniques to identify an efficiency frontier on which only the efficient Decision Making Units (DMUs) are placed. First presented in 1978, the original DEA model is known in the literature as the CCR model, after its authors Charnes, Cooper and Rhodes [4]. By using linear programming and nonparametric frontier estimation techniques, the efficiency of a DMU can be measured by comparing it against the identified efficiency frontier. A DEA model is either input or output oriented. An output-oriented DEA model maximizes the outputs obtained by the DMUs while keeping the inputs constant, whereas input-oriented models focus on minimizing the inputs used to produce a given amount of outputs.
The analytical description of the linear programming problem to be solved, under the variable-returns-to-scale hypothesis, is sketched below for an output-oriented specification, which is employed in the present study. Suppose there are k inputs and m outputs for n DMUs. For the i-th DMU, x_i is the column vector of inputs and y_i is the column vector of outputs. We can also define X as the (k×n) input matrix and Y as the (m×n) output matrix. The DEA model is then specified by the following mathematical programming problem, for a given i-th DMU:

max_{φ,λ} φ
subject to:
  −φ·y_i + Yλ ≥ 0
  x_i − Xλ ≥ 0
  N1′λ = 1
  λ ≥ 0                (1)
In problem (1), φ is a scalar and φ ≥ 1. The measure 1/φ is the technical efficiency (TE) score and varies between 0 and 1. If it is less than 1, the public intervention is inside the frontier (i.e. it is inefficient), while a value equal to 1 implies that the intervention is on the frontier (i.e. it is efficient).
The vector λ is a (n×1) vector of constants that measures the weights used to compute the
location of an inefficient DMU if it were to become efficient. The restriction N1'λ = 1
imposes convexity of the frontier, accounting for variable returns to scale. Dropping this restriction would amount to assuming that returns to scale are constant. Notice that problem (1) has to be solved for each of the n DMUs in order to obtain the n efficiency scores.
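As an illustration, problem (1) can be solved with any linear programming routine. The sketch below (not the authors' implementation; it assumes NumPy and SciPy are available) solves the output-oriented VRS model once per DMU and returns the TE scores 1/φ:

```python
import numpy as np
from scipy.optimize import linprog

def dea_output_vrs(X, Y):
    """Output-oriented, variable-returns-to-scale DEA, problem (1).
    X: (k, n) input matrix, Y: (m, n) output matrix, n = number of DMUs.
    Returns the technical-efficiency scores 1/phi, each in (0, 1]."""
    k, n = X.shape
    m = Y.shape[0]
    scores = np.empty(n)
    for i in range(n):
        # Decision vector z = [phi, lambda_1, ..., lambda_n]; maximize phi.
        c = np.zeros(n + 1)
        c[0] = -1.0                       # linprog minimizes, so use -phi
        # Output constraints: phi*y_i - Y @ lambda <= 0
        A_out = np.hstack([Y[:, [i]], -Y])
        # Input constraints: X @ lambda <= x_i
        A_in = np.hstack([np.zeros((k, 1)), X])
        A_ub = np.vstack([A_out, A_in])
        b_ub = np.concatenate([np.zeros(m), X[:, i]])
        # Convexity restriction N1'lambda = 1 (variable returns to scale)
        A_eq = np.concatenate([[0.0], np.ones(n)]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[i] = 1.0 / res.x[0]
    return scores
```

Dropping the convexity row would give the constant-returns-to-scale (CRS) variant used later for the scale-efficiency comparison.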
4. Results
We briefly summarize the descriptive statistics of the data used in the study. The two indicators accounting for the performance of the undergraduate education system, namely the graduation rate at the Baccalaureate and the graduation rate at the National Evaluation, show moderate homogeneity. The first has its minimum value in Ilfov (29.26%), which is an outlier of the series, while the maximum graduation rate was registered in Cluj (71.64%). The sample mean is 58.64%, in line with the national average of 59.25%. The graduation rate for the National Evaluation ranges from a minimum of 61.04% in Olt to 88.11% in Cluj, with an average of 75%. The modest performance of the undergraduate education system raised vivid debates in the Romanian media and also among
decision makers and researchers who tried to identify the possible causes of the situation, ranging from poor education conditions in some schools, the disinterest of underpaid teachers and the lack of parental involvement, to shifts in youth behaviour and a lack of student interest in learning and preparing for a career.
The variation of the drop-out rate shows moderate homogeneity, with a coefficient of variation of 25%; the mean and median are very close to each other, pointing to the symmetry of the series. On average, the counties in the sample faced a slow decrease in the drop-out rate but, at the same time, there are important regional differences. The largest decrease in the drop-out rate (of 70%) appears in Hunedoara, while the largest increase (of 38%) is in Ilfov.
Input variables are homogeneous, and Ilfov is again in the most disadvantageous situation, with minimum values of 2.3 classrooms per 100 pupils and 4.2 professors per 100 pupils. The best ranked are Sălaj and Vâlcea, respectively.
The values of ESF are by far the most heterogeneous of the selected variables (the coefficient of variation is 68%). Maramureş attracted the lowest amount, while Dâmboviţa attracted the highest.
Figure 1. DEA results for CRS and VRS models
The average efficiency score under the assumption of constant returns to scale (CRS) is 0.885, while in the case of variable returns to scale (VRS) the average efficiency is slightly higher, 0.928. In both cases, the score distributions are homogeneous. Figure 1 presents the DEA results for both the CRS and VRS cases. In practice, constant returns to scale are less likely, and therefore only the results of the VRS model are detailed in the following table.
The results in Table 1 deserve further discussion. In the first quartile (Q1) there are seven counties that are the most inefficient. These counties have modest education performance but manage to attract large amounts of funding for improving their educational infrastructure. Counties such as Hunedoara, Arad, Dâmboviţa or Harghita are among the top recipients of such financial resources, but the efficiency of their use is relatively low. In the second group, with efficiency scores ranging between the first (Q1) and second quartile (Q2), there are eight counties, while five counties have efficiency scores between the second and third quartile (Q3), approaching the efficiency frontier. Among these are counties such as Braşov, Vrancea, Sibiu, Bihor or Buzău.
Table 1. The sample counties distributed by technical efficiency scores
One third of the counties in the sample are on the efficiency frontier, having TE scores equal to 1. Among these we find counties that attracted financial resources above average and managed to report good educational performance: Brăila, Galaţi, Iaşi and Cluj. These counties could serve as good-practice cases.
The efficiency frontier also includes counties with the lowest values of attracted funds and low levels of the output indicators, such as Vâlcea, Tulcea and Ilfov. Such counties seem to show a lack of interest in successfully accessing ESF for improving educational infrastructure.
Scale efficiency was also considered in the analysis, computed as the ratio between the efficiency scores of the CRS and VRS models. Not surprisingly, the findings from both models reflect decreasing returns to scale for the great majority of the DMUs, with a returns-to-scale coefficient lower than 1. This implies that an increase in inputs will generate a smaller increase in outputs. Nine counties that were efficient in both models present constant returns to scale: Brăila, Constanţa, Galaţi, Hunedoara, Iaşi, Ilfov, Maramureş, Suceava and Tulcea.
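The scale-efficiency computation just described is simply a per-DMU ratio; a minimal sketch (with illustrative values, not the paper's data) is:

```python
# Scale efficiency: the ratio of a DMU's CRS score to its VRS score.
# A value of 1 marks constant returns to scale; a value below 1 marks
# a departure from the optimal scale.
def scale_efficiency(te_crs, te_vrs):
    return [crs / vrs for crs, vrs in zip(te_crs, te_vrs)]

# e.g. a county with CRS score 0.80 and VRS score 0.92, and one
# efficient in both models:
se = scale_efficiency([0.80, 1.00], [0.92, 1.00])  # approx. [0.87, 1.0]
```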
5. Conclusions
In the present study, the efficiency of a sample of Romanian counties was computed using an output-oriented DEA model, with a focus on the value of European structural funds. The results confirm the deep disparities between Romanian counties in the efficiency of using ESF. Some of the counties with a high accession rate of structural funds are among the efficient DMUs: Brăila, Galaţi, Iaşi and Cluj. On the other hand, we also found on the efficiency frontier counties with a low accession rate, such as Constanţa, Ilfov, Maramureş, Suceava, Tulcea and Vâlcea. The conclusions confirm the efficient use of European structural funds in a number of counties that attracted important amounts of money but, at the same time, there are counties that are far from the efficiency frontier, leaving room for improving the efficiency of their use.
References
[1] Asandului Laura, Roman Monica and Fatulescu Puiu, “The Efficiency of Healthcare
Systems in Europe: a Data Envelopment Analysis Approach”, Procedia Economics and
Finance, Volume 10, pp. 261-268, 2014.
[2] Aristovnik Aleksander and Obadić Alka. “Measuring relative efficiency of secondary
education in selected EU and OECD countries: the case of Slovenia and Croatia”
Technological and Economic Development of Economy 20.3, pp. 419-433, 2014.
[3] Becker Sascha O., Egger Peter H. and Von Ehrlich Maximilian. “Going NUTS: The effect
of EU Structural Funds on regional performance” Journal of Public Economics, 94.9, pp.
578-590, 2010.
[4] Charnes A., Cooper W.W. and Rhodes E., “Measuring the efficiency of decision making
units”, European Journal of Operational Research, Volume 2, pp. 429-444, 1978.
[5] Charnes A., Cooper W.W., Lewin A.Y. and Seiford, L.M., Data Envelopment Analysis:
Theory, Methodology and Applications, Kluwer Academic Publishers, 1994.
[6] Coelli Timothy, Rao Prasada, O’Donnell Cristopher J. and Battese George E., An introduction to efficiency and productivity analysis, Kluwer Academic Publishers, Boston, Dordrecht, London, 1998.
[7] Coelli Tim, “A guide to DEAP Version 2.1: a data envelopment analysis (computer)
program”, CEPA Working Paper 96/08, Department of Econometrics, University of New
England, Armidale, Australia, 1996.
[8] de la Fuente Angel and Vives Xavier, “Infrastructure and education as instruments of regional policy: evidence from Spain”, Economic Policy 10.20, pp. 11-51, 1995.
[9] Dobrescu Emilian and Pociovalisteanu Diana M., “Regional Development And Socio-
Economic Diversity In Romania”, Annals-Economy Series 6, pp. 55-59, 2014.
[10] European Commission/EACEA/Eurydice, 2013. Funding of Education in Europe 2000-
2012: The Impact of the Economic Crisis. Eurydice Report. Luxembourg: Publications
Office of the European Union.
[11] Karkazis John and Thanassoulis Emmanuel, “Assessing the Effectiveness of Regional
Development Policies in Northern Greece”, Socio-Economical Planning Science Vol 32,
no 2. pp. 123-137, 1998.
[12] Mohl Philipp and Hagen Tobias, “Econometric evaluation of EU Cohesion Policy: a
survey”, ZEW Discussion Papers, No. 09-052, 2009.
[13] Mohl Philipp and Hagen Tobias, “Do EU structural funds promote regional growth?
New evidence from various panel data approaches”, Regional Science and Urban
Economics, 40.5, pp. 353-365, 2010.
[14] Nițoi Mihai, “Efficiency in the Romanian Banking System: An Application of Data
Envelopment Analysis”, Romanian Journal of Economics, Institute of National Economy.
vol. 29(2(38)), pp. 162-176, December 2009.
[15] Puigcerver-Peñalver Mari-Carmen, “The impact of structural funds policy on European
regions growth. A theoretical and empirical approach” The European Journal of
Comparative Economics vol.4, no. 2, pp. 179-208, 2007.
[16] Rhodes E. and Southwick L., “Determinants of efficiency in Public and Private
Universities”, Department of Economics, University of South Carolina, 1986.
[17] Roman Monica and Suciu Christina, “Analiza eficienţei activităţii de cercetare
dezvoltare inovare prin metoda DEA [The Efficency Analysis Of R&D Activities By
Using Dea]”, MPRA Paper 44000, University Library of Munich, Germany, 2012.
[18] Zhu Joe, Quantitative Models for Performance Evaluation and Benchmarking: Data Envelopment Analysis with Spreadsheets and DEA Excel Solver, Kluwer Academic, 2002.
RESOURCES ALLOCATION MODEL IN A CLUSTERED CLOUD
CONFIGURATION
Mioara BANCESCU
The Bucharest University of Economic Studies
Abstract. The expansion of cloud computing in recent years has created a real need for optimizing the elements implied by a cloud computing configuration. Resource allocation is one of the most important challenges in cloud computing architecture, especially when taking into account the limits on endowing the servers in the cloud with processing capacity, storage capacity and network communication capacity. The challenge in modeling resource allocation in a cloud configuration also comes from the need for ever faster response times for the client applications configured on the servers in the cloud. This paper proposes a cloud computing resource allocation model.
Keywords: Cloud configuration, general equilibrium, modeling, resource allocation.
JEL classification: C61, C68
1. Introduction
Starting from the numerous definitions of cloud computing, Brodkin [3] argues that ‘cloud computing is a way of calculating the scalable and elastic IT capacities, provided as a service to more customers using Internet technologies’. The expansion of cloud computing in recent years has created a tremendous need for optimizing the elements implied by cloud computing architecture. Resource allocation in the cloud is an emerging research topic. As Han and Xiaolin emphasize in their work [5], as the cloud expands, a scalable resource-sharing platform becomes cheaper and more accessible, while the problem of managing resource allocations in a cloud computing environment becomes a challenge due to the variety of both resources and administrative parties.
The variety of cloud resource allocation emerges from the variety of both hardware and
software resources, multiplied by the variety of users. Other contributors are the complex
applications that evolved from monolithic systems to complex multi-tiered systems.
In this paper we model a cloud system formed by two clusters managing three types of resources: processing capacity, storage capacity and network communication capacity. The first section of the paper focuses on related research in the area of cloud computing and resource allocation in cloud configurations. The next section contains the model formulation, with its equations, the notions and abbreviations used, and the justification of the model equations. The last two sections are dedicated to possible model results and final remarks.
2. Related work
As the cloud computing era arose, the community focused on the cloud resource allocation problem, which must meet requirements such as cost justifiability, utility, cost reduction, efficiency, availability, continuity and scalability. Accordingly, related work has focused on managing the resource allocation challenges arising from diverse scalability and heterogeneity.
Han and Xiaolin [5] focused on cost-effective resource management strategy design in cloud computing, driven by booming virtualization, the heterogeneous nature of users and financial costs. They also studied the minimization problem of resource
rental cost associated with hosting while meeting the projected service demand. In addition, the need for flexible resource allocation was emphasized for the resource trading problem in a cloud setting, where multiple tenants communicate in a peer-to-peer fashion.
Recently, numerous studies have proposed cloud computing resource allocation models. A threshold-based dynamic resource allocation scheme for the cloud was proposed by Lin et al. [7]. This model dynamically allocates virtual machines among the cloud applications based on their load changes and uses the threshold method to optimize the resource reallocation decision. Later, the combinatorial double auction resource allocation model proposed by Sarmini et al. [9] proved to be economically efficient through the double-sided competition it allows and bidding on an unrestricted number of items.
In their paper [4], Espadas et al. raise the problem faced by the majority of Software-as-a-Service (SaaS) platforms and their applications, where over- and under-utilization of resources occurs because of the number of virtual machine instances deployed. The authors propose a tenant-based model addressing tenant-based isolation, tenant-based load balancing and tenant-based virtual machine instance allocation.
A paper by Ardagna et al. [1] focuses on a resource allocation scheduler for multi-tier autonomic environments, aiming to maximize the profits associated with multiple classes of Service Level Agreements.
In other related work, Iqbal et al. [6] propose a methodology and a system for the automatic detection and resolution of bottlenecks in a multi-tier Web application hosted on a cloud, in order to satisfy maximum response time requirements. In addition, the authors propose a method to identify and retract over-provisioned resources in multi-tier cloud applications.
3. Model Formulation
The components considered for modeling in this paper are the following: two clusters arranged in a cloud configuration, with different numbers and types of servers in their structure; three resources - processing capacity, storage capacity and network communication capacity; two clients - the applications installed on the servers of each cluster in the cloud; and a service - the response time of each cluster in the cloud to client application requests.
There is a supply of, and a demand for, the response time service in the cloud: on one side, it is offered based on the servers' endowments with the three resources, and on the other side it is requested by each client application installed on the servers in the cloud.
The concept of profit, Π^A and Π^B, is introduced in the model in order to assess the value of the service gained from the clients owning the applications, P_S^A and P_S^B, minus the cost of operating the active servers at cluster level, CU_S^A and CU_S^B. As the values of the response time service and the costs of operating the servers are considered per unit, the offered quantities of the service, Q_S^A and Q_S^B, are taken into account in order to obtain the total values and the total operating costs.
In order to balance the supply and the demand of the response time service in the cloud,
beside the offered quantities of the service introduced aboveA
SQ,
B
SQ, we introduce as well
the requested quantities of the service A
SQ,
B
SQ. If there is a perfect match between the
quantities offered and the quantities demanded, not having any supply excess or shortage, the
model will return value 1 as a result of executing the model on entry data. If, based on the
entry data, there is a supply in excess, the model will return a value smaller than 1 as a result
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
554
of executing the model. And if based on the entry data there is a shortage of supply in the
cloud, the model will return a value greater than 1 as a result of executing the model.
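This return-value convention can be sketched with a tiny helper (a hypothetical illustration, not part of the model itself): the ratio of the requested to the offered service quantity equals 1 at a perfect match, falls below 1 under excess supply and rises above 1 under a supply shortage.

```python
def balance_indicator(offered_qty, requested_qty):
    """Supply-demand balance for the response time service.
    == 1: perfect match; < 1: supply in excess; > 1: supply shortage."""
    if offered_qty <= 0:
        raise ValueError("offered quantity must be positive")
    return requested_qty / offered_qty
```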
Under the same concept of balancing demand and supply, but at the resource level, we introduce: the values of the resources in the cloud, PRes_P, PRes_S and PRes_C; the endowments with resources, Res_P^i, Res_S^i and Res_C^i; and the consumptions of resources, (∂Π^i/∂PRes_P)·Q_S^i, (∂Π^i/∂PRes_S)·Q_S^i and (∂Π^i/∂PRes_C)·Q_S^i, where i ∈ {A, B}.¹
The model proposed in the paper contains other equilibrium conditions, at client application level, for which we use the notions of cloud resource values PRes_P, PRes_S and PRes_C, and the endowments with resources in the cloud, Res_P^i, Res_S^i and Res_C^i, in order to introduce the notion of income for each cluster, V^A and V^B. For the demand side of these equilibrium conditions we use the requested quantities of the service, Q̄_S^A and Q̄_S^B, multiplied by the unitary values of the service, P_S^A and P_S^B.
In summary, we are using the following notations in the model:
Π^A - the profit of cluster A
Π^B - the profit of cluster B
CU_S^A - the unitary cost of operating the servers in cluster A
CU_S^B - the unitary cost of operating the servers in cluster B
Q_S^A - the offered quantity of the service (cluster A response time offered)
Q_S^B - the offered quantity of the service (cluster B response time offered)
P_S^A - the value of the service (cluster A response time)
P_S^B - the value of the service (cluster B response time)
PRes_P - the value of the processing capacity resource in the cloud
PRes_S - the value of the storage capacity resource in the cloud
PRes_C - the value of the communication capacity resource in the cloud
(∂Π^i/∂PRes_P)·Q_S^i - where i ∈ {A, B}; the consumption of the processing capacity resource to obtain cluster A, respectively cluster B, response time
(∂Π^i/∂PRes_S)·Q_S^i - where i ∈ {A, B}; the consumption of the storage capacity resource to obtain cluster A, respectively cluster B, response time
¹ To express the resources consumed to obtain the response time in each cluster of the cloud we use the derivatives of the profit functions, as formalized in Hotelling's lemma.
(∂Π^i/∂PRes_C)·Q_S^i - where i ∈ {A, B}; the consumption of the communication capacity resource to obtain cluster A, respectively cluster B, response time
Res_P^i - the endowment with the processing capacity resource of cluster i, where i ∈ {A, B}
Res_S^i - the endowment with the storage capacity resource of cluster i, where i ∈ {A, B}
Res_C^i - the endowment with the communication capacity resource of cluster i, where i ∈ {A, B}
Q̄_S^A - the requested quantity of the service (cluster A response time expected)
Q̄_S^B - the requested quantity of the service (cluster B response time expected)
V^A - the income of cluster A from the cloud
V^B - the income of cluster B from the cloud
Based on previous work on the general equilibrium modeling technique [2], we propose in this paper a model for resource allocation in a cloud configuration with the following equations:

(1) Q_S^A · (P_S^A − CU_S^A) = 0
(2) Q_S^B · (P_S^B − CU_S^B) = 0
(3) P_S^A · [Q_S^A − Q̄_S^A] = 0
(4) P_S^B · [Q_S^B − Q̄_S^B] = 0
(5) PRes_P · [Res_P^A + Res_P^B − (∂Π^A/∂PRes_P)·Q_S^A − (∂Π^B/∂PRes_P)·Q_S^B] = 0
(6) PRes_S · [Res_S^A + Res_S^B − (∂Π^A/∂PRes_S)·Q_S^A − (∂Π^B/∂PRes_S)·Q_S^B] = 0
(7) PRes_C · [Res_C^A + Res_C^B − (∂Π^A/∂PRes_C)·Q_S^A − (∂Π^B/∂PRes_C)·Q_S^B] = 0
(8) V^A − [PRes_P·Res_P^A + PRes_S·Res_S^A + PRes_C·Res_C^A] = 0, with P_S^A·Q̄_S^A = V^A
(9) V^B − [PRes_P·Res_P^B + PRes_S·Res_S^B + PRes_C·Res_C^B] = 0, with P_S^B·Q̄_S^B = V^B
In either of the two clusters of the cloud where the client applications get a positive response time, the profit should be zero; if it is negative, then the response time of the cluster should be zero, meaning that the cluster becomes inactive in the cloud. Equation (1) represents the null-profit condition for cluster A and equation (2) the null-profit condition for cluster B.
For any service or resource having a strictly positive value, the excess supply is zero or, if there is any excess supply, the value of the respective service or resource is zero. Equations (3) and (4) represent the equilibrium conditions for the response time service of cluster A, respectively cluster B. Equations (5) to (7) represent the equilibrium condition for each resource in the cloud.
For any client application which acquires the response time service, the difference between the value of the cluster's endowment with resources and the value of service
consumption is zero. Equations (8) and (9) represent the equilibrium conditions for the client applications.
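The complementarity conditions above can be checked numerically for a single cluster. The following sketch (with hypothetical numbers, outside the GAMS/MPSGE implementation) evaluates the residuals of the null-profit conditions (1)-(2) and the service-market conditions (3)-(4):

```python
def cluster_residuals(P_S, CU_S, Q_offered, Q_requested):
    """Residuals of the complementarity conditions for one cluster.
    Both vanish at equilibrium."""
    zero_profit = Q_offered * (P_S - CU_S)             # eqs. (1)/(2)
    service_balance = P_S * (Q_offered - Q_requested)  # eqs. (3)/(4)
    return zero_profit, service_balance

# At equilibrium the service price equals the unit operating cost and
# the offered quantity matches the requested one:
zp, sb = cluster_residuals(P_S=2.0, CU_S=2.0, Q_offered=5.0, Q_requested=5.0)
# zp == 0.0 and sb == 0.0
```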
4. Optimization method and results
Advanced quantitative modeling assumes the use of software packages for generating final results based on entry data given by the user. Based on our research, the model proposed in this paper can be executed using the GAMS software (General Algebraic Modeling System) and, more specifically, the solver integrated in GAMS that is dedicated to general equilibrium, MPSGE (Mathematical Programming System for General Equilibrium)².
The results that can be obtained are the value and the equilibrium quantity of the service (cloud response time) and the values and equilibrium quantities of the three resources in the cloud: processing capacity, storage capacity and network communication capacity. The results are interpreted by comparison with the target value 1, an ideal value towards which the results should be driven so that the imbalances present at the moment of the cloud analysis are eliminated.
5. Conclusions
In this paper, the problem of resource allocation in a cloud computing configuration is considered. The null-profit conditions for each cluster of the cloud, together with the equilibrium conditions for the response time service of each cluster, the equilibrium condition for each resource in the cloud (processing capacity, storage capacity and network communication capacity) and the equilibrium conditions for the client applications, form a model that can be used to assess whether the values and quantities of the service and of the resources are close to or far from the ideal optimum values they should have. In future work, the proposed model will be enhanced with case studies, aiming to interpret the results obtained from executing the model on the entry data of a cloud configuration with two or more clusters.
References
[1] Ardagna Danilo, Trubian Marco and Zhang Li, “SLA based resource allocation policies in
autonomic environments”, Journal of Parallel and Distributed Computing 67(3), pp. 259-
270, 2007.
[2] Băncescu Mioara, “Analysis usage on energy resources with general equilibrium
techniques”, Supliment Revista de Economie Teoretică şi Aplicată, pp. 140-150, ISSN
1844-0029, 2010.
[3] Brodkin Jon. (2008). “Seven Cloud-Computing Security Risks” - Network World, July 02
edition. [Online] Available: http://www.networkworld.com/news/2008/070208-
cloud.html
[4] Espadas Javier, Molina Arturo, Jimenez Guillermo, Molina Martin, Ramirez Raul and
Concha David, “A tenant-based resource allocation model for scaling Software-as-a-
Service applications over cloud computing infrastructures”, Future Generation Computer
Systems 29, pp. 273–286, 2013.
[5] Han Zao and Xiaolin Li, “Resource Management in Utility and Cloud Computing”,
Springer New York Heidelberg Dordrecht, London, pp. 1-37, 2013.
² Reference: Rutherford [8].
[6] Iqbal Waheed, Dailey Matthew N., Carrera David and Janecek Paul, “Adaptive resource
provisioning for read intensive multi-tier applications in the cloud”, Future Generation
Computer Systems 27, pp. 871–879, 2011.
[7] Lin Weiwei, Wang James Z., Liang Chen and Qi Deyu, “A Threshold-based Dynamic
Resource Allocation Scheme for Cloud Computing”, Procedia Engineering Volume 23,
pp. 695 – 703, 2011.
[8] Rutherford Thomas, “Applied general equilibrium modeling with MPSGE as a GAMS
subsystem: An overview of the modeling framework and syntax”, Computational
Economics 14, pp. 1-46, 1999.
[9] Sarmini Parnia, Teimouri Youness and Mukhtar Muriati, “A combinatorial double auction resource allocation model in cloud computing”, Information Sciences, in press, available online 13 February 2014.
[10] Verhoef Chretien, Bhulai Sandjai and Mei Rob, “Optimal resource allocation in
synchronized multi-tier Internet services”, Performance Evaluation Volume 68, pp. 1072–
1084, 2011.
UPON DECISION-MAKING IN ALTERNATIVE DESIGN PROBLEMS
Dimitri GOLENKO-GINZBURG
Ben-Gurion University of the Negev and Ariel University [email protected]
Abstract. One of the main problems in alternative network planning boils down to determining the optimal variant for carrying out the simulated program under consideration. In this paper we formulate the optimal variant choice criteria for the case of homogenous alternative networks, which have been described in our publications [1-3].
Keywords: full and joint variants, homogenous alternative stochastic network, multi-variant
optimization, optimal decision-making variant, optimality indicator.
JEL Classification: H83, R58, R25, C54
1. Introduction
While examining homogenous alternative networks, the problem focuses on determining the full variant of a design program which is optimal from the viewpoint of a certain accepted criterion. The difference between stochastic and deterministic alternative models reveals itself in the future utilization of the results of such "multi-variant" optimization. In deterministic alternative networks the optimal variant has to be executed regardless of any future conditions and circumstances; furthermore, it may be recommended for adoption as a kind of master plan while controlling the process of designing a complicated system. For stochastic networks, where each of the competing variants has a non-zero implementation probability, control problems become more complicated, since we face additional indeterminacy as to the ways of reaching the program's ultimate targets. Taking into account the information regarding the quality of the stochastic variants, acquired by means of the optimality criterion, the design decision-maker should direct his efforts toward measures which ensure the most beneficial conditions for executing the determined optimal variant and those close to it.
2. The general approach
The most common situation in practice deals with the case when the quality of variants should be assessed by several parameters (partial criteria) of the simulated process. With the latter in mind, consider the following two main formalizations of the regarded problem.
I. Let us be given n different criteria (parameters) I_1, I_2, ..., I_n to assess m alternative variants B_1, B_2, ..., B_m of a plan to carry out a particular set of activities. Note that each criterion I_i may be calculated in the alternative network on the basis of the activity estimates comprising the j-th full variant of the simulated program. To calculate those values one may apply the alpha-algorithm outlined in [2].
From the set of possible variants a single one should be chosen satisfying the following requirements:
- the chosen variant should meet to the greatest extent all accepted criteria, i.e., it should provide the extreme value of a metric defined in a certain way in the criteria space I_1, I_2, ..., I_n;
- the numerical variation of the metric corresponding to the chosen variant should be minimal when the criteria are applied consecutively, in any combination and in an arbitrary order. The latter requirement reflects the need to ensure proper flexibility of the plan, i.e., the least sensitivity of the relative variant's quality to possible amendments of the adopted optimality criteria.
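As an illustration of one possible such metric (a sketch assuming a Euclidean distance and NumPy; it is not the alpha-algorithm of [2]), the closeness of each variant to the per-criterion optimum can be computed as:

```python
import numpy as np

def distance_to_ideal(I, maximize):
    """I: (m, n) array, I[j, i] = value of criterion i for variant j.
    maximize: length-n booleans, True where larger values are better.
    Returns each variant's Euclidean distance to the 'ideal' point
    assembled from the best value of every criterion taken apart."""
    I = np.asarray(I, dtype=float)
    ideal = np.where(maximize, I.max(axis=0), I.min(axis=0))
    return np.linalg.norm(I - ideal, axis=1)
```

Variants with smaller distances lie closer to their "ideal" point and would be preferred under this metric.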
To determine the regarded metric, consider an n-dimensional criteria space and, on each axis (assume the i-th axis represents the i-th criterion), indicate the corresponding criterion's value for the j-th variant, j = 1, ..., m, of the design program.
In the thus-defined space, determine for each variant a point such that each of its coordinates corresponds to the optimal value of the respective criterion taken apart. It can be well recognized that this point is nothing but the "ideal" target that the system should seek to achieve while executing a particular full variant. Obviously, variants corresponding to such points exist, in principle, only for plans characterized by functionally dependent criteria. We therefore introduce the concept of a quasi-optimal plan, for which the metric value delivers an extremum in the space I_1, I_2, ..., I_n. In the case of a group of criteria which simultaneously maximize or minimize a certain quality objective, a quasi-optimal plan would correspond to either the minimal or the maximal distance from the coordinate origin. Besides the point corresponding to the optimal criteria values (the "ideal"), each variant is also characterized by sub-optimal values, which make the regarded variant optimal only in the case when planning and control are carried out by a single, pre-determined criterion I_i, i = 1, ..., n. It can be well recognized that the distance between each pair of sub-optimal points characterizes the variance of the optimized criteria in the transition from delivering the optimal value to one of them to delivering the optimum to another. Further on, those distances reflect the closeness of the considered variant to its "ideal". Indeed, referring to the geometrical interpretation of the considered problem (figure 1), we may see how shrinking the distances between sub-optimal points brings the latter closer to the "ideal". Besides, the lower the regarded distances are, the more probable a diminished variety of the optimized parameters becomes, if the analyzed dependencies are based on smooth and convex functions. In other words, the lower the pair-wise distances between adjacent sub-optimal points are, the more flexible the considered full
program variant becomes (plan ). This flexibility reveals itself in the fact that plan
does not change its parameters significantly when amending criteria in the process of
on-line control for a design program to create a complicated system.
ni i
j mj ,...,2,1
nIII ,...,, 21
n
kI
nk ,...,2,1
nIII ,...,, 21
jB jB
iI
Figure 1. Graphical representation of the optimal variant choice
3. Formalization
Let us introduce the following terms:
aij — the optimal value of the i-th criterion for the j-th full variant, i = 1, 2, …, n;
j = 1, 2, …, m;
Aj = (a1j, a2j, …, anj) — the point corresponding to the j-th variant's "ideal";
ρj(0, Aj) — the metric of the j-th variant's "ideal", determined as the distance from
the coordinate origin to point Aj and complying with the main axioms of a metric space;
aij(α) — the sub-optimal value of the i-th criterion for the j-th variant on condition that
the α-th criterion assumes the value aαj, α = 1, 2, …, i−1, i+1, …, n;
rij — the distance from the "ideal" Aj to the middle of the i-th segment, connecting a pair of
consecutive sub-optimal points of (a1j(i), a2j(i), …, anj(i))-type.
It can be well-recognized that there are altogether n − 1 points being the middles of segments
connecting consecutive sub-optimal criteria values when n > 2, and there is one such point for n = 2.
Keeping to the introduced designations, the coordinates of those points may be determined
as
( (a1j(1) + a1j(2))/2 , (a2j(1) + a2j(2))/2 , … , (anj(1) + anj(2))/2 ),
( (a1j(2) + a1j(3))/2 , (a2j(2) + a2j(3))/2 , … , (anj(2) + anj(3))/2 ),
.......................................................................
( (a1j(n−1) + a1j(n))/2 , (a2j(n−1) + a2j(n))/2 , … , (anj(n−1) + anj(n))/2 ). (1)
Substitute the relations for the segment middles to determine the distances of the latter from
the "ideal":
r1j = √( ∑k=1..n ( akj − (akj(1) + akj(2))/2 )² ),
r2j = √( ∑k=1..n ( akj − (akj(2) + akj(3))/2 )² ),
.......................................................................
r(n−1)j = √( ∑k=1..n ( akj − (akj(n−1) + akj(n))/2 )² ). (2)
Using the abbreviated form, we might also express the distance from Aj to the middle of the i-th
segment as:
rij = √( ∑k=1..n ( akj − (akj(i) + akj(i+1))/2 )² ), (3)
subject to the assumption that akj(i) = akj when i = k.
The structure of the objective Fj for the quantitative assessment of the j-th variant's quality in
compliance with both above-mentioned requirements might then be represented in the
following way:
Fj = ρj(0, Aj) / rj , j = 1, 2, …, m, (4)
where
rj = ∑i=1..n−1 rij (5)
and
ρj(0, Aj) = √( ∑i=1..n aij² ). (6)
The quasi-optimal variant to be recommended for the regarded complicated system design
program, and the one whose implementation should be stimulated by creating the most
suitable conditions, would be the one delivering the extreme value to the considered
objective, i.e.,
F = extremum Fj , j = 1, 2, …, m.
It can be well-recognized that applying the above outlined method not only enables
determining all possible ways of reaching the program's final target but facilitates also
choosing from a set of particular design activities the variant which is the least sensitive to
environmental conditions changes. Under these circumstances prediction by means of the
alternative network model becomes an active function of the entire process to plan and
control design of a new complicated system under conditions of stochastic indeterminacy.
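The selection of the quasi-optimal variant then reduces to a few distance computations per variant. The sketch below is illustrative only: it assumes a Euclidean metric, midpoints taken over consecutive pairs of sub-optimal points, and an objective formed as the ratio ρj(0, Aj) / rj — the combination form is an assumption of this sketch, since the source presentation leaves it implicit. The variant data are made-up numbers.

```python
def variant_quality(ideal, subopt):
    """ideal: the point A_j = (a_1j, ..., a_nj); subopt: list of sub-optimal
    points, subopt[i] = the criteria values when planning by criterion i alone."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    # middles of the segments joining consecutive sub-optimal points, cf. (1)
    mids = [[(a + b) / 2 for a, b in zip(p, q)]
            for p, q in zip(subopt, subopt[1:])]
    r = sum(dist(ideal, m) for m in mids)      # flexibility term, cf. (3), (5)
    rho = dist(ideal, [0.0] * len(ideal))      # distance of the "ideal" from the origin, cf. (6)
    return rho / r                             # objective F_j (ratio form assumed), cf. (4)

# choose the quasi-optimal variant: the one delivering the extreme objective value
variants = {"B1": ((3.0, 4.0), [(3.0, 2.0), (1.0, 4.0)]),
            "B2": ((3.0, 4.0), [(3.0, 3.5), (2.5, 4.0)])}
best = max(variants, key=lambda b: variant_quality(*variants[b]))
```

Here B2's sub-optimal points lie closer to the "ideal", so its flexibility term is smaller and its objective larger.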
References
[1] Golenco (Ginzburg), Lishitz, D.I., Livshitz, S.E. and Kesler, S.Sh., Statistical Modeling in
R&D Projecting, Leningrad: Leningrad University Press (in Russian), 1976.
[2] Golenko-Ginzburg, D., Stochastic Network Models in R&D Projecting, Voronezh:
Nauchnaya Kniga (in Russian), 2010.
[3] Golenko-Ginzburg, D., Burkov, V. and Ben-Yair, A., Planning and Controlling
Multilevel Man-Machine Organization Systems under Random Disturbances, Ariel
University Center of Samaria, Ariel: Elinir Digital Print, 2011.
ARTIFICIAL NEURAL NETWORK APPROACH FOR DEVELOPING
TELEMEDICINE SOLUTIONS: FEED-FORWARD BACK
PROPAGATION NETWORK
Mihaela GHEORGHE
Bucharest University of Economic Studies
Abstract. Artificial neural networks have the ability to learn patterns corresponding to
different medical symptoms and, based upon them, represent an important classification
tool which can be used in the early detection of diseases to distinguish between infected
and non-infected patients. This paper presents a neural network approach for medical
diagnosis, more specifically diabetes diagnosis as a case study, based on a feed-forward
back propagation network.
Keywords: back propagation, telemedicine, artificial neural network, algorithm
JEL classification: C45, I1
1. Introduction
Artificial neural networks represent a powerful instrument which can be used by different
physicians or healthcare professionals in the process of analyzing complex medical data in
order to extract useful information for classification and prognosis. These instruments are
valuable above all for their ability to identify patterns or trends within different data sets,
which can be further used in prognosis. Thus, neural networks are becoming an important
classifier for different tasks within the telemedicine field, especially medical diagnosis.
Medical information systems in hospitals and other medical institutions have grown larger
in recent years, and due to this continuous trend it has become difficult to extract useful
information for decision support systems designed to assist healthcare providers in making
decisions. Because traditional methods based on manual analysis of data are ineffective,
artificial neural network methods are required and are becoming essential for diagnosing
different diseases.
In this study, a three-layered MLP (Multilayer Perceptron) feed-forward neural network
architecture is used and trained with the back-propagation algorithm.
2. Artificial neural network: theoretical framework
An artificial neural network (ANN) represents a mathematical model that simulates a biological
neural network in terms of structure and functionalities. It consists of simple processing units,
named neurons, and all the weighted connections between them. The strength of a connection
(weight) between two neurons i and j will be referred to within this paper as wij. Therefore, as
described in [1], a neural network is a sorted triple (N,V,w) with two sets N, V and a function
w, where:
- N represents a set of neurons;
- V is defined as a set {(i,j)|i,j ∈ N}, where each pair of (i,j) represents a connection
between neuron i and neuron j;
- w represents the weight function and is defined as w: V -> R and w(i,j) is the weight of
connection (i,j), also shortened as wij.
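The triple (N, V, w) maps directly onto elementary data structures. A minimal illustration (the concrete neuron labels and weight values here are invented for the example):

```python
# N: the set of neurons; V: the set of connections; w: the weight function
N = {1, 2, 3}
V = {(1, 3), (2, 3)}                 # neurons 1 and 2 feed neuron 3
w = {(1, 3): 0.4, (2, 3): -0.2}      # w(i, j), shortened w_ij

def weight(i, j):
    """The weight function w: V -> R."""
    return w[(i, j)]
```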
Within a neural network, data is transferred between neurons via connections with specific
connection weights. The data processing workflow of a neuron is represented in
figure 1.
Figure 1. Neural network data processing workflow
This process implies the following [1]:
- A propagation function, which receives the outputs o1,o2,…,on of other neurons
i1,i2,…,in (all of them connected to neuron j) and has the main purpose of transforming
them into the network input (netj) based on the connection weights wij.
If I = {i1,i2,…,in} represents a set of neurons, then netj is calculated by the
propagation function fprops (1) as follows:
netj = fprops(o1,o2,…,on,𝑤𝑖1𝑗, 𝑤𝑖2𝑗,… 𝑤𝑖𝑛𝑗) (1)
The propagation function is represented by the weighted sum as described in (2) and it
is defined as the multiplication of the output of each neuron i by 𝑤𝑖𝑗 and the summation
of the results:
netj = ∑ (𝑜𝑖 ∗ 𝑤𝑖𝑗)𝑖∈𝐼 (2)
- An activation function, which transforms the net input received from the propagation
function, together with the old activation status of the neuron, into a new one. Let j be a
specific neuron and aj the activation state assigned to it. The threshold value 𝜃j represents
the position of the maximum gradient of the activation function (3).
aj(t) = fact(netj(t), aj(t-1), 𝜃j) (3)
- An output function, which transforms the activation into the output sent to other
neurons. If j represents a neuron, the output function (4) calculates the output value based
on its activation state aj.
fout (aj) = oj (4)
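Chaining the three functions above gives the data-processing workflow of figure 1. A minimal sketch (the sigmoid activation and the identity output function are common choices assumed here, not prescribed by the definitions):

```python
import math

def neuron_output(outputs, weights, theta):
    """One neuron j receiving the outputs o_1..o_n over weights w_1j..w_nj."""
    # propagation function, eq. (2): weighted sum of the incoming outputs
    net = sum(o * w for o, w in zip(outputs, weights))
    # activation function: a sigmoid shifted by the threshold theta_j, cf. eq. (3)
    a = 1.0 / (1.0 + math.exp(-(net - theta)))
    # output function, eq. (4): here simply the identity, f_out(a_j) = o_j
    return a

o = neuron_output([1.0, 1.0], [0.5, 0.5], 0.0)
```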
The learning strategy associated with an ANN is an algorithm used to train the neural
network so that it produces a desired output for a given input. The way individual
artificial neurons are connected is called the topology, which represents, basically, the
architecture or graph of an artificial neural network. These topologies are divided into two
basic classes, illustrated in figure 2.
Figure 2. Neural network topologies: feed-forward (FNN) and recurrent (RNN)
Neurons are grouped in layers and, as shown in figure 2, there are usually up to three layers:
input, hidden and output. The differences between topologies [2] are related to the way in which
information is sent between neurons. In feed-forward networks, the information flows from
input to output in only one direction, without any back-loops. The simplest feed-forward ANN
consists of a single perceptron, which is capable of learning only linearly separable problems.
Recurrent ANNs are similar to the previously described ones but have no restrictions
concerning back-loops.
In this study, a three-layered multilayer perceptron with feed-forward architecture is used and
trained with the back-propagation algorithm.
3. Case study: ANN model for diagnosing diabetes
3.1. Dataset description
The dataset used for this study was obtained from the UCI Machine Learning Repository [3]
and contains 768 instances representing a patients' diabetes database. There are 8 attributes
and a class variable, all described in table 1 as follows:
Table 1. Diabetes database attributes
Attribute name Unit
Number of times pregnant Numeric
Plasma glucose concentration mg/dl
Diastolic blood pressure mmHg
Triceps skin fold thickness mm
2-Hour serum insulin 𝜇U/ml
Body mass index kg/m² (weight in kg / (height in m)²)
Diabetes pedigree function Numeric
Age Numeric (years)
Class variable Numeric (0 or 1)
In what concerns the class distribution, there are two classes: the value 1 is interpreted as
"tested positive" for diabetes, and the value 0 as "tested negative".
3.2. ANN model for diabetes dataset
For diagnosing diabetes, the neural model used for training and testing consists of 3 layers, as
illustrated in figure 3: one input layer, one hidden layer and one output layer. There are 8 input
neurons, 5 hidden neurons determined based on (5), and 2 output neurons for the diagnosis with
their corresponding values: "tested positive" and "tested negative".
Figure 3. ANN model for diabetes
nhidden = (nrattributes + nrclasses) / 2 (5)
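For the diabetes dataset, rule (5) yields the 5 hidden neurons of figure 3; the arithmetic made explicit:

```python
nr_attributes = 8   # the 8 input attributes of table 1
nr_classes = 2      # "tested positive" / "tested negative"

# equation (5): hidden neurons = (attributes + classes) / 2
nr_hidden = (nr_attributes + nr_classes) // 2
```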
3.3. Back propagation algorithm
This algorithm adjusts the network weights by propagating the error from the output layer to
the input one. Within the training phase of the neural network, it minimizes the error by
estimating the weights. The back propagation algorithm consists of the following steps [4]:
- Step 1: preprocessing, based on applying a normalization [5] process to the raw input
in order to make the data suitable for training, calculated by (6):
value = (𝑥𝑖 − 𝑥𝑚𝑖𝑛) / (𝑥𝑚𝑎𝑥 − 𝑥𝑚𝑖𝑛) ∗ scale + translation, where scale = 1 and translation = 0 (6)
- Step 2: initialization of weights and biases with small real values.
- Step 3: initialization of weights gradient (7) and of the total error (8)
∆𝑤𝑖𝑗=0, ∀𝑖, 𝑗 ∈ 𝑁 (7)
E = 0 (8)
- Step 4: calculating the actual outputs of the hidden layer (9), the real output of the
network (10) and the error function per epoch (11).
𝑦𝑗 = 𝑓(∑i=1..n 𝑥𝑖 ∗ 𝑤𝑖𝑗 − 𝜃𝑗), (9)
where n is the number of inputs of neuron j from the hidden layer and f is
the activation function, represented by the sigmoid function.
𝑦𝑘 = 𝑓(∑j=1..m 𝑦𝑗 ∗ 𝑤𝑗𝑘 − 𝜃𝑘), (10)
where m is the number of inputs of neuron k from the output layer.
E = E + 𝑒𝑘² / 2 (11)
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
567
- Step 5: determining the error gradient (12) based on the error function (13), adapting the
gradient weights for the hidden and output layers (14) and determining the gradient weights
between the input and hidden layers (15).
𝛿𝑘 = 𝑓′ ∗ 𝑒𝑘 = 𝑦𝑘(1 − 𝑦𝑘) ∗ 𝑒𝑘, (12)
where f′ is the derivative of the activation function (unipolar sigmoid function),
calculated based on (16).
𝑒𝑘 = 𝑦𝑑,𝑘 − 𝑦𝑘 (13)
∆𝑤𝑗𝑘 = ∆𝑤𝑗𝑘 + 𝑦𝑗 ∗ 𝛿𝑘 (14)
∆𝑤𝑖𝑗 = ∆𝑤𝑖𝑗 + 𝑥𝑖 ∗ 𝛿𝑗 (15)
f′(x) = e^(−x) / (1 + e^(−x))² = f(x) ∗ (1 − f(x)) (16)
- Step 6: starting a new iteration: if the current epoch number has not yet reached the
established one, all weights are adapted based on the weight gradients calculated at
step 5, as illustrated by (17), and the algorithm continues with step 4; otherwise the
training stops.
𝑤𝑖𝑗 = 𝑤𝑖𝑗 + 𝜂 ∗ ∆𝑤𝑖𝑗, where 𝜂 represents the learning rate (17)
In order to improve [6] the results and accelerate the learning process in neural networks, the
normal algorithm described previously is changed by introducing the momentum term (α) that
will be applied to the weights during their update as described in (18).
∆𝑤𝑖𝑗(𝑡) = ∆𝑤𝑖𝑗(𝑡) + α ∗ ∆𝑤𝑖𝑗(𝑡 − 1) (18)
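Steps 1–6, including the momentum term (18), can be condensed into a short sketch. This is an illustrative implementation under simplifying assumptions (biases θ omitted, a single output neuron, online updates, a toy training set), not the exact experimental code used for the diabetes dataset:

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_in, n_hid, epochs=500, eta=0.1, alpha=0.3):
    random.seed(1)
    # step 2: initialize weights with small random values
    w_ih = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_in)]
    w_ho = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
    m_ih = [[0.0] * n_hid for _ in range(n_in)]   # previous updates (momentum)
    m_ho = [0.0] * n_hid

    def forward(x):
        y = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(n_in)))
             for j in range(n_hid)]               # hidden outputs, cf. (9)
        return y, sigmoid(sum(y[j] * w_ho[j] for j in range(n_hid)))  # cf. (10)

    def total_error():
        return sum((t - forward(x)[1]) ** 2 / 2 for x, t in samples)  # cf. (11)

    e0 = total_error()
    for _ in range(epochs):
        for x, target in samples:
            y, out = forward(x)
            delta_o = out * (1 - out) * (target - out)                # cf. (12)-(13)
            delta_h = [y[j] * (1 - y[j]) * w_ho[j] * delta_o
                       for j in range(n_hid)]
            for j in range(n_hid):
                # weight updates with momentum, cf. (14), (17), (18)
                step = eta * y[j] * delta_o + alpha * m_ho[j]
                w_ho[j] += step
                m_ho[j] = step
                for i in range(n_in):
                    s = eta * x[i] * delta_h[j] + alpha * m_ih[i][j]
                    w_ih[i][j] += s
                    m_ih[i][j] = s
    return e0, total_error()

# logical OR as a toy training set
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
before, after = train(data, n_in=2, n_hid=2)
```

After training, the total error should have dropped well below its value at the random initialization.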
3.4. Experiment results and algorithm performance
The results obtained by applying the steps described in 3.3 to the dataset presented in 3.1, for
different values of the learning rate and momentum term, are described in table 2 for 500
iterations (epochs) and in table 3 for 1000 iterations.
Table 2. The results for 500 iterations

Learning rate  Momentum  Root square error
0.95           0.05      0.3826
0.85           0.15      0.3813
0.75           0.25      0.3803
0.65           0.35      0.3796
0.55           0.45      0.3794
0.45           0.55      0.3818
0.35           0.65      0.3831
0.25           0.75      0.3906
0.15           0.85      0.3785
0.05           0.95      0.3938

Table 3. The results for 1000 iterations

Learning rate  Momentum  Root square error
0.95           0.05      0.3769
0.85           0.15      0.3758
0.75           0.25      0.3749
0.65           0.35      0.374
0.55           0.45      0.3888
0.45           0.55      0.3758
0.35           0.65      0.3767
0.25           0.75      0.3837
0.15           0.85      0.3689
0.05           0.95      0.3973

Based on the results presented in the above tables, the best parameters for the current case study are
obtained with a number of 1000 epochs, a learning rate equal to 0.15 and 0.95 as a momentum term.
The performance of this classifier can be determined by the computation of total classification accuracy
and the root mean square error. The graphical representation for the root mean square error based on
the number of epochs taken into consideration is shown in figure 4.
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
568
Figure 4. Graph between Error and Epoch
The total classification accuracy [7] is defined as the ratio between the number of correct decisions and
the total number of cases; it can be calculated based on (19), starting from the confusion matrix
generated by training on the diabetes dataset, illustrated in figure 5.
Figure 5. Confusion matrix for diabetes diagnosis
Accuracy = (TP + TN) / n, (19)
where TP is the number of true positives, TN the number of true negatives and n the total
number of instances.
Accuracy = (432 + 188) / 768 = 0.807291 (80.7291 %)
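The computation in (19) is direct; as a check, with the values taken from the confusion matrix in figure 5:

```python
def accuracy(tp, tn, n):
    # total classification accuracy, eq. (19)
    return (tp + tn) / n

acc = accuracy(432, 188, 768)   # 620 correct decisions out of 768
```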
The high value obtained for this indicator demonstrates that the back propagation algorithm is a good
approach that can be used in medicine in the process of medical diagnosis.
4. Conclusions
The experimental results of the proposed approach described in this study show that the back
propagation algorithm and, moreover, artificial neural networks can be used within the telemedicine
field for medical diagnosis. The described model, trained with the neural algorithm and adjusted with a
momentum term for optimization purposes, achieves high accuracy and can therefore be used to
perform the pattern classification through which a diagnosis is built.
Acknowledgment
"This work was financially supported through the project 'Routes of academic excellence in doctoral
and post-doctoral research - READ' co-financed through the European Social Fund, by Sectoral
Operational Programme Human Resources Development 2007-2013, contract no.
POSDRU/159/1.5/S/137926."
References
[1] D. Kriesel, “A brief introduction to Neural Networks”, 2011, [On-line], Available:
http://www.dkriesel.com/_media/science/neuronalenetze-en-zeta2-2col-dkrieselcom.pdf
[2] A. Krenker, J. Bester and A. Kos, “Introduction to the Artificial Neural Networks, Artificial
Neural Networks - Methodological Advances and Biomedical Applications”, 2011, ISBN:
953-307-243-2, [On-line], Available at: http://www.intechopen.com/books/artificial-
neural-networksmethodological-advances-and-biomedical-applications/introduction-to-
the-artificial-neural-networks
[3] Internet: https://archive.ics.uci.edu/ml/datasets/Diabetes, [Feb. 21, 2015]
[4] The Back propagation Algorithm. Internet:
http://page.mi.fu-berlin.de/rojas/neural/chapter/K7.pdf, [Feb. 21, 2015]
[5] Statistical Normalization and Back Propagation for Classification. Internet:
http://www.researchgate.net/profile/Santhakumaran_A/publication/260024206_Statistica
l_Normalization_and_Back_Propagation_for_Clasification/links/00b4952f1e2762d4
91000000.pdf, [Feb. 21, 2015]
[6] Improved Back propagation learning in neural networks with windowed momentum.
Internet: http://synapse.cs.byu.edu/papers/IstookIJNS.pdf, [Feb.21, 2015]
[7] Classification, Basic Concept, Decision Trees and Model evaluation. Internet: http://www-
users.cs.umn.edu/~kumar/dmbook/ch4.pdf, [Feb. 21, 2015]
NEURAL NETWORK-BASED APPROACH IN FORECASTING
FINANCIAL DATA
Cătălina-Lucia COCIANU
Bucharest University of Economic Studies
[email protected]
Hakob GRIGORYAN
Bucharest University of Economic Studies
Abstract. Considering the fact that markets are generally influenced by different external
factors, stock market prediction is one of the most difficult tasks of time series analysis. The
research reported in this paper aims to investigate the potential of artificial neural networks
(ANN) in solving the forecast task in the most general case, when the time series are non-
stationary. We used a feed-forward neural architecture: the nonlinear autoregressive network
with exogenous inputs. The network training function used to update the weight and bias
parameters corresponds to the gradient descent with adaptive learning rate variant of the
backpropagation algorithm. The results obtained using this technique were compared with
those resulting from several ARIMA models. We used the mean squared error (MSE) measure
to evaluate the performance of the two models. The comparative analysis leads to the
conclusion that the proposed model can be successfully applied to forecast financial data.
Keywords: neural network, nonlinear autoregressive network, exogenous inputs, time series,
ARIMA model
JEL classification: C45, C46, C63, C02
1. Introduction
Predicting stock price index and its movement has been considered one of the most challenging
applications of time series prediction. According to the efficient market theory proposed in [1],
the stock price follows a random walk and it is practically impossible to make a particular long-
term global forecasting model based on historical data. The ARIMA and ANN techniques have
been successfully used for modeling and forecasting financial time series. Compared with ANN
models, which are complex forecasting systems, ARIMA models are considered to be much
easier techniques for training and forecasting.
Artificial neural networks have been widely used for time series forecasting and they have
shown good performance in predicting stock market data. Chen et al., [2], introduced a neural
network model for time series forecasting based on flexible multi-layer feed-forward
architecture. F. Giordano et al., [3], used a new neural network-based method for prediction of
non-linear time series. Lin et al.,[4], applied artificial neural network to predict Taiwan stock
index option price. Z. Liao et al., [5], applied stochastic time effective neural network to
develop a forecasting model of global stock index. Mohamed et al., [6], used neural networks
to forecast the stock exchange movements in Kuwait Stock Exchange. Cai et al., [7], used
neural networks for predicting large scale conditional volatility and covariance of financial
time series.
In the second section of the paper, we present a nonlinear autoregressive network with
exogenous inputs aiming to forecast the closing price of a particular stock. The ANN-based
strategy applied for data forecasting is analyzed against the ARIMA model, and a comparative
analysis of these models are described in the final part of the paper.
2. The ANN-based technique for data forecasting
Let Yt be the stock closing value at the moment of time t. For each t, we denote by Xt =
(Xt(1), Xt(2), … , Xt(𝑛))𝑇 the vector whose entries are the values of the indicators significantly
correlated to Yt, that is the correlation coefficient between Xt(𝑖) and Yt is greater than a certain
threshold value, for i = 1,2, … , 𝑛 . The neural model used in our research is a dynamic network.
The direct method was used to build the model of prediction of the stock closing value, which
is described as follows.
Ŷ(t+p) = ƒANN(𝑌𝑡(𝑑), 𝑋𝑡(𝑑)) (1)
𝑌𝑡(𝑑) = {Yt, Yt−1, Yt−2, … , Yt−d} (2)
𝑋𝑡(𝑑) = {Xt, Xt−1, Xt−2, … , Xt−d} (3)
where Ŷ(t+p) is the forecasted value of the stock price for the prediction period p and d is the
delay expressing the number of pairs (Xk, Yk), 𝑘 = 𝑡, 𝑡 − 1, … , 𝑡 − 𝑑 used as input of the neural
model. In our model, we consider 𝑝 = 1.
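The direct method of (1)–(3) amounts to a sliding window over the paired series. A sketch of how the training samples could be assembled (function and variable names are illustrative, not taken from the source):

```python
def make_samples(Y, X, d, p=1):
    """Build (input, target) pairs: the inputs are the delayed values
    Y_t..Y_{t-d} and X_t..X_{t-d}; the target is Y_{t+p}, as in (1)-(3)."""
    samples = []
    for t in range(d, len(Y) - p):
        y_lags = [Y[t - k] for k in range(d + 1)]
        x_lags = [v for k in range(d + 1) for v in X[t - k]]
        samples.append((y_lags + x_lags, Y[t + p]))
    return samples

# toy series: closing prices Y and one exogenous indicator vector per week
Y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[10.0], [20.0], [30.0], [40.0], [50.0]]
samples = make_samples(Y, X, d=2)
```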
The considered delay has a significant influence on the training set and the prediction process.
We use the correlogram to choose the appropriate window size for our neural networks: we
need to eliminate the lags where the Partial Autocorrelation Function (PACF) is statistically
irrelevant [8].
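The lag-selection rule can be made concrete with a sample PACF computed via the Durbin–Levinson recursion; lags whose PACF falls inside the approximate 95% band ±1.96/√T are treated as statistically irrelevant. This is a generic sketch on a synthetic AR(1) series, not the exact procedure applied to the SNP data:

```python
import random

def acf(x, lag):
    n, m = len(x), sum(x) / len(x)
    c0 = sum((v - m) ** 2 for v in x)
    return sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n)) / c0

def pacf(x, nlags):
    """Sample PACF via the Durbin-Levinson recursion."""
    rho = [acf(x, k) for k in range(nlags + 1)]
    phi = {(1, 1): rho[1]}
    out = [rho[1]]
    for k in range(2, nlags + 1):
        num = rho[k] - sum(phi[(k - 1, j)] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[(k - 1, j)] * rho[j] for j in range(1, k))
        phi[(k, k)] = num / den
        for j in range(1, k):
            phi[(k, j)] = phi[(k - 1, j)] - phi[(k, k)] * phi[(k - 1, k - j)]
        out.append(phi[(k, k)])
    return out

def window_size(x, nlags=10):
    band = 1.96 / len(x) ** 0.5          # approximate 95% significance band
    sig = [k + 1 for k, v in enumerate(pacf(x, nlags)) if abs(v) > band]
    return max(sig) if sig else 1

# toy AR(1) series: its PACF should be large at lag 1 and drop afterwards
random.seed(0)
x = [0.0]
for _ in range(299):
    x.append(0.8 * x[-1] + random.gauss(0.0, 1.0))
```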
The nonlinear autoregressive network with exogenous inputs (NARX) is a recurrent dynamic
network, with feedback connections encompassing multiple layers of the network. The output
of the NARX network can be considered an estimate of the output of a certain nonlinear
dynamic system. Since the actual output is available during the training of the network, a series-
parallel architecture is created [9], where the estimated target is replaced by the actual output.
The advantages of using NARX model for forecasting purposes are twofold. On the one hand,
the inputs used in the training phase are more accurate and, on the other hand, since the
resulting network has a feed-forward architecture, FX → FH → FY, a static backpropagation
type of learning can be used. An example of this series-parallel network is depicted in figure
1, where d=2, n=10 and the number of neurons in the hidden layer is 24.
Figure 1. Example of a series-parallel network
After the training step, the series-parallel architecture is converted into a parallel configuration,
in order to perform the multi-step-ahead prediction task. The corresponding neural network
architecture is presented in figure 2.
Figure 2. Example of a parallelized network
We use the standard performance function, defined by the mean sum of squares of the network
errors. The data division process is cancelled to avoid the early stopping. The network training
function used to update the weight and bias parameters corresponds to gradient descent with
adaptive learning rate variant of the backpropagation algorithm. The number of neurons in the
hidden layer is set according to the following equation [10]:
[2√(𝑚 + 2)𝑁]
where m stands for the number of the neurons of the output layer and N is the dimension of
input data.
3. Experimental results
We tested the proposed model on a dataset of 300 samples. The samples are historical weekly
observations of a set of variables S, recorded between 3/1/2009 and 11/30/2014. The set S
contains the opening, closing, highest and lowest prices of the SNP stock from the Bucharest
Stock Exchange, and seven indicators obtained from technical and fundamental analysis of the
stock market.
The correlogram shows that for all variables the PACF drops immediately after the 2nd
lag. This means that the window size for all variables could be set to 2.
In our tests, we used 200 samples for training purposes and 100 as-yet-unseen samples for
data forecasting.
The neural network parameters are determined based on the following process.
REPEAT
1. Initialize the parameters of the NN.
2. Train the NN using the set of training samples in 6000 epochs.
UNTIL the overall forecasting error computed on the already trained data in terms of
MSE measure is less than a certain threshold value.
In our tests, the threshold value is set to 10−3.
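The REPEAT/UNTIL restart procedure can be sketched generically. Here `init_and_train` is a hypothetical placeholder standing in for steps 1–2 (initialization plus 6000 training epochs); its returned MSE is simulated, since the actual network code is not given in the paper:

```python
import random

def init_and_train(seed):
    """Stand-in for steps 1-2: returns the training-set MSE reached by a
    network initialized with the given seed (simulated for illustration)."""
    random.seed(seed)
    return random.uniform(0.0, 0.002)   # pretend MSE after 6000 epochs

threshold = 1e-3
seed = 0
while True:
    mse_train = init_and_train(seed)
    if mse_train < threshold:
        break                           # UNTIL: error below the threshold
    seed += 1                           # otherwise re-initialize and retrain
```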
If we denote by T = (T(1), T(2), …, T(nr)) the vector of target values and by
P = (P(1), P(2), …, P(nr)) the vector whose entries correspond to the predicted values, the MSE
error measure is defined by
MSE(T, P) = (1/nr) ∑i=1..nr (T(i) − P(i))² (4)
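Equation (4) in code, as used below to compare the NARX and ARIMA forecasts (a direct transcription):

```python
def mse(T, P):
    """Mean squared error, eq. (4): the average squared deviation of the
    predictions P from the targets T."""
    nr = len(T)
    return sum((t - p) ** 2 for t, p in zip(T, P)) / nr

err = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])   # (0 + 0 + 4) / 3
```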
The results obtained using the above-mentioned technique are reported in the following. The
overall forecasting error computed on the already trained data is 0.00035. The
regression coefficient computed on the already trained data and the data fitting are presented
in Figure 3. The network predictions versus actual data for the already trained samples are
illustrated in Figure 4. The overall forecasting error computed on the new data is 0.0012.
The network predictions versus actual data for the new samples are illustrated in
Figure 5.
Figure 3. The regression coefficient and data
fitting in case of already trained samples
Figure 4. Predictions versus actual data in case of already
trained samples
Figure 5. The network predictions versus actual data
in case of new samples
Figure 6. The error histogram in case of new
samples
The error histogram in case of new data set is depicted in Figure 6.
We developed a comparative analysis of the neural network-based approaches against the well-
known ARIMA forecasting method. Since the dataset is a non-stationary time series, we
excluded ARIMA models with the value of the differencing parameter equal to 0. Also, in case
of ARIMA models with differencing parameter values larger or equal to 2, the effect of over-
differencing occurred.
The parameters of ARIMA model related to AR(p) and MA(q) processes were tuned based on
the following criteria: relatively small values of BIC (Bayesian Information Criterion),
relatively high values of adjusted R2 (coefficient of determination) and relatively small standard
error of regression (SER). The results of our tests are summarized in Table 1. According to
these results, the best model from the point of view of the above mentioned criteria is
ARIMA(1,1,1) model. We concluded that the best fitted models are ARIMA(1,1,0) and
ARIMA(1,1,1).
Table 1. ARIMA models
ARIMA model BIC Adjusted R2 SER
(1,1,0) -5.292201 0.987351 0.015247
(1,1,1) -5.547453 0.990408 0.013278
(0,1,1) -2.283686 0.754100 0.068656
(0,1,0) -1.017242 0.108715 0.130709
The overall forecasting error computed on the new data is 0.0077 when using the
ARIMA(1,1,0) model, and 0.0096 when using the ARIMA(1,1,1) model. The results of the
forecasting are illustrated in Figure 7.
Figure 7. Predicted values of ARIMA(1,1,0) and ARIMA(1,1,1) models versus
actual data
4. Conclusions
The research reported in this paper focused on a comparative analysis of the NARX neural
network against standard ARIMA models. The study was developed on a dataset consisting of
300 historical weekly observations of a set of variables, between 3/1/2009 and 11/30/2014. The
proposed neural approach produced better results in terms of the MSE measure. The obtained
results are encouraging and entail future work toward extending the study to alternative neural
models.
References
[1] E.F. Fama, Efficient capital markets: A review of theory and empirical work, The Journal
of Finance, 25 (2) (1970), pp. 383–417
[2] Y. Chen, B. Yang, J. Dong, and A. Abraham, Time-series forecasting using flexible neural
tree model, Information Sciences, vol. 174, no. 3-4, pp. 219–235, 2005.
[3] F. Giordano, M. La Rocca, and C. Perna, Forecasting nonlinear time series with neural
network sieve bootstrap, Computational Statistics and Data Analysis, vol. 51, no. 8, pp.
3871–3884, 2007.
[4] C.T. Lin, H.Y. Yeh, Empirical of the Taiwan stock index option price forecasting model –
Applied artificial neural network, Applied Economics, 41 (15) (2009), pp. 1965–1972
[5] Z. Liao, J. Wang, Forecasting model of global stock index by stochastic time effective
neural network, Expert Systems with Applications, 37 (1) (2009), pp. 834–841
[6] M.M. Mohamed, Forecasting stock exchange movements using neural networks: empirical
evidence from Kuwait, Expert Systems with Applications, vol. 27, no. 9, pp. 6302–6309,
2010.
[7] X. Cai, G. Lai, X. Lin, Forecasting large scale conditional volatility and covariance using
neural network on GPU, The Journal of Supercomputing, 63 (2013), pp. 490–507
[8] D.N. Gujarati, Basic econometrics, McGraw-Hill, New York (2003)
[9] Narendra, Kumpati S., Kannan Parthasarathy, Learning Automata Approach to
Hierarchical Multiobjective Analysis, IEEE Transactions on Systems, Man and
Cybernetics, Vol. 20, No. 1, January/February 1991, pp. 263–272.
[10] Fagner A. de Oliveira, Cristiane N. Nobre, Luis E. Zarate, Applying Artificial Neural
Networks to prediction of stock price and improvement of the directional prediction index
– Case study of PETR4, Petrobras, Brazil, Expert Systems with Applications,
40, (2013), 7596–7606.
SEMANTIC HMC FOR BUSINESS INTELLIGENCE USING CROSS-REFERENCING
Rafael PEIXOTO
Checksem - Laboratoire Le2i, UMR CNRS 6306, Dijon, France
Thomas HASSAN
Checksem - Laboratoire Le2i, UMR CNRS 6306, Dijon, France
[email protected]
Christophe CRUZ
Checksem - Laboratoire Le2i, UMR CNRS 6306, Dijon, France
[email protected]
Aurélie BERTAUX
Checksem - Laboratoire Le2i, UMR CNRS 6306, Dijon, France
[email protected]
Nuno SILVA
GECAD, ISEP-IPP, Porto, Portugal
Abstract. Keeping abreast of current market trends requires centralizing a large amount of
information. Given the increasing amount of news available on the web, selecting only valuable
information for each consumer is essential to reduce information overload. However, information
available on the web can contain uncertain and imprecise data, leading to veracity issues. We aim to
measure Big Data veracity by cross-referencing several information sources. In this work we present a
new approach to cross-referencing several huge web information sources, using a Semantic
Hierarchical Multi-label Classification process called Semantic HMC to extract the knowledge
available in those sources.
Keywords: Ontology, Hierarchical Multi-label Classification, similarity measure.
JEL classification: L86 Information and Internet Services
1. Introduction
The decision-making process in the economic field requires the centralization and intake of a large
amount of information. The aim is to keep abreast of current market trends: contractors, businessmen
and salespersons need to be continuously aware of market conditions. This means being up to date on
ongoing information and on projects under development. With the help of economic monitoring,
prospects can be easily identified so as to establish new contracts. Our tool, First Pro’fil [1]–[3]
(http://www.firsteco.fr/), is specialized in the production and distribution of press reviews about
French regional economic actors. News overload is a particular case of information overload, a
well-known problem studied in the Information Retrieval and Recommender Systems research fields.
News recommender systems already exist, such as [4], Athena [5], GroupLens [6] and News Dude
[7]. Some of these systems use domain knowledge to improve the recommendation task [4], [5]. To
achieve this goal, a content-based recommender system is being developed [3], [8]. A recommender
system is necessary to rank the items, and a content-based approach is required to analyze the content
of each article in order to structure and preserve its information. The results of this analysis enable
linking the domain knowledge to the articles to improve the recommendation task [4], [5].
However, the amount of news available on the web keeps growing, requiring new forms of processing
to enable enhanced decision making, insight discovery and process optimization. The term Big Data
is mainly used to describe such huge datasets. Various types of data compose Big Data, including
unstructured data, which represents 90% of its content [10]. An increasing number of V’s has been used
to characterize Big Data [9], [10]: Volume, Velocity, Variety, Veracity and Value. Volume refers to
the large amount of data that is generated and stored: transaction-based data accumulated over the
years, text data constantly streaming in from social media, increasing amounts of sensor data being
collected, etc. Big Data is not only a huge volume of data; it must also be processed quickly. Velocity
refers both to (i) how fast data is being produced and (ii) how fast the data must be processed and
analyzed to meet demand. Variety means that various types of data compose Big Data, including
semi-structured and unstructured data such as audio, video, webpages and text, as well as traditional
structured data. Veracity concerns the truthfulness of the data. Traditional data warehouses always
assumed that data is certain, clean, precise and complete, but in a Big Data context user-generated
data in particular can be uncertain, erroneous, imprecise and incomplete. The Value characteristic
measures how valuable the information is to a Big Data consumer. Value is the desired outcome of
Big Data analytics and Big Data’s “raison d’être”: data without value is useless. We aim to measure
the veracity of a Big Data source by using similar data in several web sources linked by
cross-referencing. Cross-referencing means linking several textual information sources that share
similar meanings. When focusing on cross-referencing web information sources, one must focus on
extracting knowledge from these sources.
To extract knowledge from Big Data sources we propose a Semantic HMC [11], [12] process that is
capable of hierarchically multi-label classifying a large Variety and Volume of unstructured data
items. Hierarchical Multi-Label Classification (HMC) is the combination of multi-label classification
and hierarchical classification [13]. In HMC, items can be assigned to different hierarchical paths and
may simultaneously belong to different class labels at the same hierarchical level. The Semantic HMC
process is unsupervised: no previously labelled examples or enrichment rules relating the data items
to the labels are needed. The label hierarchy and the enrichment rules are automatically learned from
the data through scalable machine learning techniques.
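The hierarchical multi-label idea can be illustrated with a minimal sketch: once an item is assigned a label, it also belongs to every subsuming label up the hierarchy. The toy hierarchy below is an illustrative assumption, not the one learned by the Semantic HMC:

```python
# Hedged sketch of hierarchical multi-label classification (HMC): an item
# assigned a label also belongs to every ancestor of that label.
# The hierarchy and labels are toy, illustrative assumptions.

hierarchy = {            # child -> parent (None marks the root)
    "corporate finance": "finance",
    "finance": "economy",
    "retail": "economy",
    "economy": None,
}

def with_ancestors(labels, hierarchy):
    """Expand a set of predicted labels with all subsuming (ancestor) labels."""
    expanded = set()
    for label in labels:
        while label is not None:
            expanded.add(label)
            label = hierarchy.get(label)
    return expanded

# An item classified into two hierarchical paths at once (multi-label).
print(sorted(with_ancestors({"corporate finance", "retail"}, hierarchy)))
```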
This paper claims that by cross-referencing high-quality information (i.e. events) in items from
several sources, we can measure its veracity. The main contribution of this paper is thus the
cross-referencing of huge web information sources using the Semantic HMC to extract the knowledge
available in these sources. The cross-references are then used to measure data veracity and improve
the recommendation of economic news.
The next section focuses on related work: Semantic HMC, semantic measures and cross-referencing
method proposals. Section 3 proposes how to cross-reference huge web information sources using the
Semantic HMC for veracity measurement. The last section concludes this paper.
2. Related work
In order to compare cross-referencing methods for web information sources, we need to evaluate the
semantic measure between concepts. Semantic measure is originally a philosophical notion: a point of
view that differs from one person to another regarding the strength of the semantic links between two
concepts. Computerizing this notion in order to compare different textual information is a complex
task that requires high-level language processing. The evaluation of the semantic measure between
two concepts depends firstly on the kind of semantic links, and secondly on the kind of knowledge
resources.
2.1 Semantic Measure Type
In order to compare two concepts, and in particular two textual information sources in the case of
documentary research, one must evaluate the semantic measure between these sources. Semantic
measure is a generic term covering several concepts [14]:
- Semantic relatedness is the most general semantic link between two concepts. Two concepts do not
have to share a common meaning to be considered semantically related or close, as they can be
linked by a functional relationship or a frequent-association relationship, as with meronym or
antonym concepts (e.g. Pilot “is related to” Airplane).
- Semantic similarity is a specific case of semantic relatedness. Two concepts are considered similar
if they share common meanings and characteristics, as with synonym, hyponym and hypernym
concepts (e.g. Old “is similar to” Ancient).
- Semantic distance is the inverse of semantic relatedness: it indicates how unrelated two concepts
are to one another.
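A minimal sketch of a path length-based measure (in the spirit of [15]) makes these notions concrete: semantic distance is the number of taxonomy edges between two concepts, and similarity decreases as that distance grows. The toy taxonomy is an illustrative assumption:

```python
# Hedged sketch of path length-based semantic similarity over a toy taxonomy.
from collections import deque

taxonomy = {  # undirected is-a edges as adjacency lists (illustrative)
    "entity": ["vehicle", "person"],
    "vehicle": ["entity", "airplane", "car"],
    "person": ["entity", "pilot"],
    "airplane": ["vehicle"],
    "car": ["vehicle"],
    "pilot": ["person"],
}

def path_length(a, b):
    """Semantic distance: shortest number of edges between two concepts (BFS)."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in taxonomy[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def similarity(a, b):
    """Similarity decreases with taxonomic distance."""
    return 1.0 / (1.0 + path_length(a, b))

print(similarity("airplane", "car"))    # siblings: distance 2
print(similarity("airplane", "pilot"))  # related but distant: distance 4
```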
2.2 Cross-Referencing Methods Proposals
We have identified three kinds of approaches to cross-referencing information: semantic similarity,
paraphrase identification and event extraction techniques.
In order to improve cross-referencing methods for web information sources, semantic measures and
precise semantic similarity definitions have been proposed in the literature. These measures can
generally be grouped into five categories: path length-based measures [15], information content-based
measures [16], [17], [18], feature-based measures [19], [20], distributional-based measures [21], [22]
and hybrid measures [18], [23], [24].
Paraphrase identification is the ability to identify phrases, sentences or longer texts that convey the
same, or almost the same, information [25]. Paraphrase identification techniques can be classified into
three categories: recognition, generation and extraction.
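A minimal recognition-style sketch, assuming simple surface string similarity (token-set Jaccard with an arbitrary threshold), illustrates the idea:

```python
# Hedged sketch of recognition-style paraphrase identification based on
# surface string similarity. The threshold is an illustrative assumption.

def jaccard(a, b):
    """Token-set overlap between two texts, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def is_paraphrase(a, b, threshold=0.5):
    """Flag two texts as near-paraphrases when their token sets overlap strongly."""
    return jaccard(a, b) >= threshold

s1 = "the company opened a new factory in dijon"
s2 = "a new factory in dijon opened"
print(is_paraphrase(s1, s2))
```

Real systems replace the token overlap with the semantic measures discussed in Section 2.1 so that rephrasings with little lexical overlap are still detected.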
An active research field in cross-referencing methods for web information sources is event extraction,
a common application of text mining that derives high-quality information from text by identifying
events. Event extraction techniques depend on paraphrase identification methods to identify events
expressed in different ways. As Hogenboom et al. [26] cite, one can distinguish between three main
categories of event extraction: data-driven event extraction, knowledge-driven event extraction, and
hybrid event extraction.
3. Cross-referencing
This paper discusses a large variety of natural language processing approaches from diverse fields.
When focusing on cross-referencing web information sources, one must focus on extracting
knowledge from these sources. While there is a trade-off between the size, coverage, structure and
growth of the knowledge resources, extracting knowledge from huge web information sources is the
main challenge at hand.
We believe that improving the web’s structure will be the most efficient approach to measuring
veracity in web information sources. We suggest benefiting from the large amount of data offered by
the web, and consider data-driven approaches the most suitable event extraction techniques. These
techniques aim at converting data into knowledge, relying on quantitative methods such as clustering
and statistics. Therefore, paraphrase extraction approaches based on the distributional hypothesis, and
recognition approaches based on surface string similarity, are a good choice, as both depend directly
on semantic measures and can benefit from the presence of large context, as with distributional-based
measures. Since no approach has yet proved to be the most efficient and reliable, one must choose the
most suitable combination of approaches for the context of the issue. This choice depends first on the
knowledge resource used, then on the event extraction technique, and finally on the best match
between the paraphrase identification technique and the similarity measure.
The proposed Semantic HMC process [11] solves this issue by automatically learning a concept
hierarchy and enrichment rules from Big Data through scalable machine learning techniques. To
represent the knowledge in the Semantic HMC process, an ontology-described knowledge base is
used. Ontologies [27] are the most widely accepted way to represent semantics in the Semantic Web
[28] and a good solution for intelligent computer systems that operate close to the human concept
level, bridging the gap between human requirements and computational requirements [29]. The
Semantic HMC initially enriches the ontology from the huge Volume and Variety of initial data; once
this learning phase is finished, the classification system learns incrementally from new incoming
items to provide high-Velocity learning. The result of the Semantic HMC process is a rich ontology
with the items classified according to the learned concept hierarchy.
To infer the most specific concepts for each data item and all subsuming concepts, rule-based
reasoning is used, exhaustively applying a set of rules to a set of triples to infer conclusions. This
rule-based reasoning approach allows the parallelization and distribution of work across large clusters
of inexpensive machines using Big Data technologies such as MapReduce [30]. Web-scale reasoners
[31] currently use rule-based reasoning to reach high scalability through parallelization and
distribution, thus addressing the Velocity and Volume dimensions of Big Data.
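The rule-based reasoning idea can be sketched as a simple forward-chaining fixpoint over triples; only an RDFS-style subclass rule is shown, and the triples are illustrative assumptions:

```python
# Hedged sketch of rule-based reasoning over triples: one rule is applied
# exhaustively until no new triples are inferred (forward-chaining fixpoint).

def infer(triples):
    """Exhaustively apply: (x, type, A) and (A, subClassOf, B) => (x, type, B)."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = {
            (x, "type", b)
            for (x, p1, a) in triples if p1 == "type"
            for (a2, p2, b) in triples if p2 == "subClassOf" and a2 == a
        }
        if not new <= triples:
            triples |= new
            changed = True
    return triples

store = {
    ("item42", "type", "CorporateFinance"),
    ("CorporateFinance", "subClassOf", "Finance"),
    ("Finance", "subClassOf", "Economy"),
}
inferred = infer(store)
print(("item42", "type", "Economy") in inferred)
```

At web scale, the same rule application is expressed as MapReduce jobs so the fixpoint computation can be distributed across a cluster.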
The Semantic HMC process consists of 5 individually scalable steps matching the requirements of
Big Data processing:
- Indexation creates an index of parsed items and identifies relevant terms.
- Vectorization creates a term co-occurrence frequency matrix of all indexed items and a TF-IDF
vector for each item.
- Hierarchization creates a hierarchy of relevant concepts based on term frequency.
- Resolution creates classification rules to enrich the ontology based on term frequency.
- Realization first populates the ontology with items and then determines the corresponding
hierarchy concept and all subsuming concepts. This is the Hierarchical Multi-label Classification
(HMC) proper.
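The Indexation and Vectorization steps can be sketched with the standard tf × log(N/df) weighting; the tiny corpus is an illustrative assumption:

```python
# Hedged sketch of the Indexation and Vectorization steps: index term
# document frequencies, then build a TF-IDF vector per item.
import math
from collections import Counter

items = [
    "factory opening creates jobs in dijon",
    "new factory in dijon announced",
    "regional jobs report published",
]

def tokenize(text):
    return text.split()

# Indexation: document frequency of each term across all items.
df = Counter(term for item in items for term in set(tokenize(item)))

def tfidf(item):
    """TF-IDF vector of one item as a {term: weight} dict (tf * log(N/df))."""
    tf = Counter(tokenize(item))
    n = len(items)
    return {t: tf[t] * math.log(n / df[t]) for t in tf}

vec = tfidf(items[0])
# "creates" appears in only one item, "dijon" in two, so "creates" weighs more.
print(vec["creates"] > vec["dijon"])
```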
Once the items are classified, after the realization step, a set of classified items is available for
post-processing. Similarity measures can then easily be computed between items classified with the
same labels. Note that while paraphrase and event extraction techniques cannot be applied directly to
the whole set of items available in one day at once, they can be applied to smaller subsets (i.e. items
classified with the same labels), whose sources are then cross-referenced. Using these cross-referenced
items we can measure their veracity. Two uses of cross-referencing for measuring item veracity are
identified:
- Cross-referencing with information sources that are particularly trustworthy. For example, if an
item is cross-referenced with trusted sources, that is a veracity indicator.
- Cross-referencing with a significant set of items from several sources. For example, if an item is
cross-referenced across several sources, we can state that it has a higher veracity than an event that
appears in a restricted number of items from only one source.
The veracity of each source can then easily be measured by exploiting the veracity of its items.
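A minimal sketch of the two indicators above, assuming illustrative source names and weights (the paper does not fix a formula):

```python
# Hedged sketch of an item veracity score: more distinct cross-referencing
# sources raise veracity, and trusted sources add a bonus.
# Source names, weights and the scoring formula are illustrative assumptions.

TRUSTED = {"agency-A", "official-register"}

def veracity(cross_refs):
    """cross_refs: list of source names carrying a cross-referenced item."""
    sources = set(cross_refs)
    trusted_bonus = 0.5 if sources & TRUSTED else 0.0
    # More distinct sources -> higher veracity (asymptotically approaching 1).
    coverage = 1.0 - 1.0 / (1.0 + len(sources))
    return min(1.0, coverage + trusted_bonus)

print(veracity(["blog-X"]))                         # single untrusted source
print(veracity(["blog-X", "agency-A", "paper-Y"]))  # several, one trusted
```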
4. Conclusions
In this paper we presented how to cross-reference large web information sources using a Semantic
HMC process to extract the knowledge available in these sources. This cross-referencing principle
allows First Eco Pro’fil to analyze the veracity of economic news. The paper also discussed a large
variety of natural language processing approaches from diverse fields that are required for
cross-referencing. In further work we aim to measure data veracity as described and use it in the value
extraction process. Our current work consists in implementing the proposed methodology using
programming models for processing and generating large datasets, such as MapReduce.
Acknowledgment
This project is funded by the company Actualis SARL, the French agency ANRT, the ”Conseil
Régional de Bourgogne” and through the COMPETE Program under the project AAL4ALL
(QREN13852).
5. References
[1] C. Cruz and C. Nicolle, “Ontology Enrichment and Automatic Population From XML Data,”
Learning, pp. 17–20, 2008.
[2] D. Werner and C. Cruz, “Precision difference management using a common sub-vector to extend
the extended VSM method,” Procedia Comput. Sci., vol. 18, pp. 1179–1188, 2013.
[3] D. Werner, N. Silva, and C. Cruz, “Using DL-Reasoner for Hierarchical Multilabel Classification
applied to Economical e-News,” in Science and Information Conference, 2014, p. 8.
[4] D. C. De Roure, S. E. Middleton, and N. R. Shadbolt, “Ontological user profiling in recommender
systems,” ACM Transactions on Information Systems, vol. 22, no. 1. pp. 54–88, 2004.
[5] W. IJntema, F. Goossen, F. Frasincar, and F. Hogenboom, “Ontology-based news
recommendation,” in Proceedings of the 1st International Workshop on Data Semantics - DataSem
’10, 2010, p. 1.
[6] P. Resnick, N. Iacovou, and M. Suchak, “GroupLens: an open architecture for collaborative filtering
of netnews,” Proc. …, pp. 175–186, 1994.
[7] D. Billsus, D. Billsus, M. J. Pazzani, and M. J. Pazzani, “A personal news agent that talks, learns
and explains,” in Proceedings of the third annual conference on Autonomous Agents, 1999, pp.
268–275.
[8] D. Werner, C. Cruz, and C. Nicolle, “Ontology-based Recommender System of Economic
Articles.,” in WEBIST, 2012, pp. 725–728.
[9] M. Chen, S. Mao, and Y. Liu, “Big Data: A Survey,” Mob. Networks Appl., vol. 19, no. 2, pp. 171–
209, Jan. 2014.
[10] P. Hitzler and K. Janowicz, “Linked data, big data, and the 4th paradigm,” Semant. Web, vol. 4,
pp. 233–235, 2013.
[11] T. Hassan, R. Peixoto, C. Cruz, A. Bertaux, and N. Silva, “Semantic HMC for big data analysis,”
in Big Data (Big Data), 2014 IEEE International Conference on, 2014, pp. 26–28.
[12] T. Hassan, R. Peixoto, C. Cruz, N. Silva, and A. Bertaux, “Extraction de la Valeur des données du
Big Data par classification multi-label hiérarchique sémantique,” in EGC 2015 - 15ème conférence
internationale sur l’extraction et la gestion des connaissances, 2015.
[13] W. Bi and J. Kwok, “Multi-label classification on tree-and DAG-structured hierarchies,” Yeast,
pp. 1–8, 2011.
[14] A. Budanitsky and G. Hirst, “Evaluating WordNet-based Measures of Lexical Semantic
Relatedness,” Comput. Linguist., vol. 32, no. August 2005, pp. 13–47, 2006.
[15] C. Leacock and M. Chodorow, “Combining Local Context and WordNet Similarity for Word Sense
Identification,” An Electron. Lex. Database, pp. 265–283, 1998.
[16] D. Lin, “An Information-Theoretic Definition of Similarity,” in Proceedings of ICML, 1998, pp.
296–304.
[17] P. Resnik, “Semantic Similarity in a Taxonomy: An Information-Based Measure and its
Application to Problems of Ambiguity in Natural Language,” J. Artif. Intell. Res., vol. 11, pp. 95–
130, 1999.
[18] T. Pedersen and J. Michelizzi, “WordNet::Similarity - Measuring the Relatedness of Concepts,”
Demonstration Papers at HLT-NAACL 2004, pp. 38–41, 2004.
[19] A. Tversky, “Features of similarity.,” Psychological Review, vol. 84. pp. 327–352, 1977.
[20] E. G. M. Petrakis, G. Varelas, A. Hliaoutakis, and P. Raftopoulou, “X-Similarity: Computing
Semantic Similarity between concepts from different ontologies,” J. Digit. Inf. Manag., vol. 4, pp.
233–237, 2006.
[21] R. L. Cilibrasi and P. M. B. Vitanyi, “The Google Similarity Distance,” IEEE Trans. Knowl. Data
Eng., vol. 19, no. 3, pp. 370–383, 2007.
[22] D. Hindle, “Noun Classification from predicate-argument structures,” in Proceedings of the 28th
annual meeting on Association for Computational Linguistics, 1990, pp. 268–275.
[23] R. Knappe, H. Bulskov, and T. Andreasen, “On Similarity Measures for Concept-based Querying,”
in Proceedings of the 10th International Fuzzy Systems Association World Congress (IFSA’03),
2003, pp. 400–403.
[24] Z. Zhou, Y. Wang, and J. Gu, “New model of semantic similarity measuring in wordnet,” in
Proceedings of 2008 3rd International Conference on Intelligent System and Knowledge
Engineering, 2008, pp. 256–261.
[25] I. Androutsopoulos and P. Malakasiotis, “A survey of paraphrasing and textual entailment
methods,” J. Artif. Intell. Res., vol. 38, pp. 135–187, 2010.
[26] A. Syed, K. Gillela, and C. Venugopal, “The Future Revolution on Big Data,” Future, vol. 2, no.
6, pp. 2446–2451, 2013.
[27] T. R. Gruber, “A Translation Approach to Portable Ontology Specifications by A Translation
Approach to Portable Ontology Specifications,” Knowl. Creat. Diffus. Util., vol. 5, no. April, pp.
199–220, 1993.
[28] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Sci. Am., vol. 284, no. 5, pp. 34–
43, 2001.
[29] L. Obrst, “Ontologies for semantically interoperable systems,” in Proceedings of the twelfth
international conference on Information and knowledge management - CIKM ’03, 2003, pp. 366–
369.
[30] J. Dean and S. Ghemawat, “MapReduce : Simplified Data Processing on Large Clusters,”
Commun. ACM, vol. 51, no. 1, pp. 1–13, 2008.
[31] J. Urbani, “Three Laws Learned from Web-scale Reasoning,” in 2013 AAAI Fall Symposium
Series, 2013.
MULTI-DOMAIN RETRIEVAL OF GEOSPATIAL DATA SOURCES
IMPLEMENTING A SEMANTIC CATALOGUE
Julio Romeo VIZCARRA
Instituto Politécnico Nacional, CIC, Mexico
Christophe CRUZ
[email protected]
Laboratoire Le2i, UMR CNRS 6306, Dijon, France
Abstract. Nowadays, the expertise of a user plays an important role in search and retrieval in
information systems, which usually combine general and specialized knowledge in the construction
of queries. In addition, most query systems are currently restricted to specific domains. Tackling these
issues, we propose a methodology that implements a semantic catalogue in order to provide a smart
query system for retrieving data sources on the web by extending the user’s expertise. We propose the
combination of a query expansion method with the use of similarity measures and controlled
vocabularies. This allows the system to recommend data sources that fit the user’s information needs.
To reach this goal, we exploit standards such as OWL from the W3C and the CSW GeoCatalogue
from the OGC.
Keywords: Semantic catalogue, smart queries, knowledge engineering, multi-domain retrieval,
similarity across ontologies.
JEL classification: L86 Information and Internet Services
1. Introduction
Nowadays, modern society is in a general crisis of knowledge. This term was introduced by Gross
[1]; it refers to the necessity of understanding an increasing number of concepts produced by science
and technological applications. In this way, science and the scientific vocabulary have increasingly
merged with wider society through applied science in daily life. As a consequence, the borders
between scientific (specialized) knowledge and general knowledge of the real world outside science
have become blurred [2][3]. Both kinds of knowledge are frequently used to refer to common objects
or situations. On the other hand, some knowledge is produced within a certain domain but consumed
by others, and this shared knowledge commonly cannot be easily accessed and known [4]. In this
context, users can face a lack of background, expertise or non-knowledge (the opposite of knowledge)
within specific fields.
In order to get closer to a solution to the knowledge issues described above, users of information
systems need central tools able to handle general and specialized knowledge, non-knowledge and
expertise in different domains. Moreover, another issue has to be taken into account: information
heterogeneities, which are thematic, semantic, spatial, temporal, etc. The conceptualization of a
domain can differ widely from that of another domain, by defining distinct concepts, objects, places
or circumstances with the same vocabulary, or by defining identical concepts, objects, places or
events with different vocabularies. These heterogeneities are critical factors in information integration
and retrieval [5].
Currently, there is a vast amount of spatial information available on the web through services. This
information allows scientists to perform complex analyses. Goodwin [6] used the term smart queries
to describe analyses that combine heterogeneous data sources in order to solve complex problems
[7][8]. Our field of interest is the use of heterogeneous data sources to perform spatio-temporal smart
queries using Semantic Web tools. In previous work [9] we presented our research on spatio-temporal
operators using local data repositories. The next logical step in the evolution of our work is to
integrate it into the SDI (Spatial Data Infrastructure). The term SDI was first introduced by the U.S. National Research
Council in 1993. It refers to a set of technologies, policies and agreements designed to allow the
sharing of spatial information and resources between institutions [10]. The Spatial Data Infrastructure
has a service-oriented architecture in which functionalities such as storage and data search are carried
out through web services. The typical workflow involves: 1) the discovery of a data source, 2) the
download of relevant geospatial data, 3) the use of appropriate analytical methods and 4) the
visualization of the results on a suitable map.
Today, OGC services can be stored in a catalogue and include metadata, which is described in
different ways. These descriptions suffer from heterogeneity problems, which make the integration
and retrieval process complex, time-consuming and ambiguous. It is important to capture the right
meaning of the concepts in such descriptions; on the other hand, traditional queries suffer from the
same problems with their concepts.
As an example, we present two smart queries posed by a user, which can involve general knowledge,
specialized knowledge and non-knowledge:
Query 1: What is the population of crows in the southwest of France? In this query, the concept crow
can be described in two ways:
- In general knowledge [11, 12], a crow may be a raven, a black bird, a superstition bird, a butterfly
called "common raven", etc.
- In specialized knowledge [13], the crow (Corvus corax) may be related semantically to “birds
robin to mallard size”, “birds medium size”, and “other similar birds in the same category: small
corvids, Corvus frugilegus corone (Rook, Carrion crow), Pica pica (Magpie), Garrulus glandarius
(Jay), Corvus monedula (Jackdaw), large corvids, etc.”
Query 2: Now consider a query from a specialist in geology: What are the locations with colluvium in
the USA during the past 20 years?
- Using general knowledge, the concept colluvium is unknown to most people [14].
- It is therefore necessary to describe colluvium with specialized knowledge [15]: colluvium is
sediment that has moved downhill to the bottom of a slope without the help of running water in
streams, through gravity in the form of soil creep and downhill movement.
After consulting the concept colluvium in the specialized domain, we are able to understand and infer
its meaning in general knowledge by linking this concept to semantically related concepts in the
general domain, such as sediment, deposit, alluvial sedimentation, sedimentary clay, etc.
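This linking of specialized and general concepts can be sketched as a simple query expansion over a toy concept mapping (an illustrative assumption, not the proposed knowledge base):

```python
# Hedged sketch of query expansion across general and specialized knowledge:
# a specialized concept in the query is expanded with semantically related
# general concepts. The mapping below is a toy, illustrative assumption.

related = {
    "colluvium": ["sediment", "deposit", "alluvial sedimentation",
                  "sedimentary clay"],
    "crow": ["raven", "corvus corax", "corvid"],
}

def expand_query(terms):
    """Return the original query terms plus related concepts for retrieval."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(related.get(term.lower(), []))
    return expanded

print(expand_query(["colluvium", "USA"]))
```

In the full methodology, the related concepts would come from similarity measures computed over the ontologies in the knowledge base rather than from a fixed dictionary.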
Our work aims to tackle the issues described above. We provide the user with the capability of
navigating through large amounts of information with an expert approach. This is obtained by
including, in the search, specialized knowledge from fields in which the user might not have the best
expertise. Moreover, the methodology processes the user’s queries semantically and returns similar
results in a list ordered by relevance. The main domains considered in the retrieval of data sources are
thematic, spatial and temporal, each of which can be described by both specialized and general
knowledge.
The next section focuses on related work. Section 3 describes our proposition, based on three analysis
stages. The last section concludes this paper.
2. Related work
This section describes some of the most relevant previous works, including projects and publications.
Moreover, it presents more technically oriented projects that can be found on the web, which provide
similar solutions to the issue tackled here.
2.1 Academic purposes
In [16], the authors construct a bilingual dictionary from a corpus using similarity among concepts
under polysemy. The contribution of Sra Suvrit [17] is an enhancement of information retrieval: in
their experiments, they showed that learned dictionaries enable fast and accurate Nearest Neighbor
(NN) retrieval. The dictionaries work on covariance-matrix datasets without using semantic
features. Pedersen Ted [18] processes the semantics of concepts by constructing a dictionary melded
from various sources, facing the overlap among dictionaries. In this approach, they implemented
Cross-Level Semantic Similarity (CLSS), a novel variation on the problem of semantic similarity.
The work of Shahriar, Md Sumon [19] proposed a smart query answering architecture oriented to
marine sensor data with a data mining approach. They implemented many processes, but no similarity
features were provided to process the information. Shvaiko [20] implemented an extension of
GeoNetwork [26] with a new interface. They included semantic capabilities by using a faceted
ontology, but the approach is limited to semantic matching operations using S-Match [21] between
the query and the ontology. A second work, with Farazi [22], exploited this work to provide an
enhancement that extended the capability of the queries by giving similar answers. It computes
similarity using a nearest-neighbor approach without considering a similarity measure that computes
the information from an ontology.
2.2 Services on the web
The project “Aonaware Web Services” [23] offers the possibility of consulting dictionaries for human
understanding; concepts are introduced with an extensive definition. The Ontology Lookup Service
project [24] provides a web service where it is possible to query multiple ontologies instead of only
dictionaries. This service supports human inspection and automatic processing in the document
retrieval domain. The work of Falcons [25] provides a consultation service through queries: the
system looks up concepts in its ontologies lexically and shows an excerpt of those ontologies as
results. The system does not use specialized domains, and it is impossible to choose relevant
ontologies for the search.
2.3 Discussion
Based on the analysis of related work, experimental results show that information retrieval is
improved by using many dictionaries. Word alignment techniques can be applied to the shared
vocabulary of dictionaries to handle the overlap in a dictionary melded from several dictionaries. The
common issue in the related work is conceptual ambiguity, which can be tackled with semantic
processing.
The common factor regarding smart query processing, semantic retrieval and catalogues is the
need for a similarity measure to compute the implicit information from the ontology. Cross-level
semantic similarity is a feasible solution for processing the knowledge shared among ontologies.
An important contribution is the extension of the user's expertise into several areas.
Considering the limitations and main features of the presented works, the proposed methodology
handles knowledge from different domains in order to improve retrieval through mechanisms of
collaboration among ontologies. Similarity measures are included with the purpose of expanding
the users' expertise on general and specialized knowledge about the thematic, spatial and
temporal domains.
3. Proposition
This section describes our contribution, structured in three main stages: the "knowledge
analysis", the "data source analysis" and the "query analysis".
In the "knowledge analysis" stage, a knowledge base is built, composed of a set of ontologies
from the thematic, spatial and temporal domains. The thematic domains comprise general
ontologies described by a common vocabulary (common domains) and specialized ontologies
that use specialized concepts (specific domains, or domains not commonly used). When the
ontologies are loaded in the knowledge base, a necessary semantic pre-processing is executed
in two steps. The first step calculates similar concepts within each ontology of the same domain.
The second step computes the similarity of the concepts among ontologies.
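As a concrete illustration, the two pre-processing steps can be sketched as follows. The token-overlap (Jaccard) measure, the toy concept labels and the threshold are illustrative assumptions, not the measure actually used by the system:

```python
from itertools import combinations

def label_similarity(a: str, b: str) -> float:
    """Illustrative measure: Jaccard overlap of label tokens.
    A real system would use an ontology-aware similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta or tb else 0.0

def intra_ontology_pairs(concepts, threshold=0.3):
    """Step 1: similar concept pairs inside one ontology."""
    return [(a, b, label_similarity(a, b))
            for a, b in combinations(concepts, 2)
            if label_similarity(a, b) >= threshold]

def inter_ontology_pairs(onto_a, onto_b, threshold=0.3):
    """Step 2: similar concept pairs across two ontologies of the same domain."""
    return [(a, b, label_similarity(a, b))
            for a in onto_a for b in onto_b
            if label_similarity(a, b) >= threshold]

hydrology = ["water body", "river", "water course"]   # toy ontology A
geography = ["river channel", "water body area"]      # toy ontology B
print(intra_ontology_pairs(hydrology))
print(inter_ontology_pairs(hydrology, geography))
```

The pairs that survive the threshold are what the query analysis stage later reuses as "similar concepts".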
In the “data source analysis” stage, the geospatial data sources are stored. The concepts that
compose the metadata description are included in the ontology mapping (concept-data source)
in order to link concepts in the ontologies to data sources.
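The concept-data source mapping can be sketched as an inverted index; the source identifiers and metadata concepts below are hypothetical examples, not part of the described system:

```python
from collections import defaultdict

# Inverted index: metadata concept -> geospatial data sources.
concept_to_sources = defaultdict(set)

def register(source_id: str, metadata_concepts):
    """Index a data source under every concept found in its metadata."""
    for concept in metadata_concepts:
        concept_to_sources[concept].add(source_id)

# Hypothetical sources described by hypothetical metadata concepts.
register("wms:rivers_2014", ["river", "hydrography"])
register("shp:land_use", ["land use", "hydrography"])

# A concept matched during query analysis leads back to its data sources.
print(sorted(concept_to_sources["hydrography"]))  # ['shp:land_use', 'wms:rivers_2014']
```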
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
585
In the last stage, called "query analysis", queries are introduced and transformed into smart
queries. In practice, the queries are extended and linked to ontology vocabularies. Users are able
to retrieve semantically related geospatial data sources using both specialized knowledge and
knowledge of the thematic, spatial and temporal domains. In this stage, the semantic
pre-processing from the conceptualization stage is used in order to define the query and get the
related similar concepts. The semantic pre-processing is also used in the synthesis stage for
obtaining the related geospatial data sources using the ontology population mapping.
Figure 1 presents the overall proposed solution. Some parts, such as the pre-processing of the
queries, are widely used in many projects. Our approach is unique in the sense that we expand
the concepts of each ontology across all the ontologies using distance-based similarity measures.
The result of this expansion is then also exploited for the expansion of the queries, which makes
the results much broader and closer to the needs of the user. In order to limit the size of the
expansion, the principle of intersection allows the system to select, after the expansion process,
only the vocabularies that are common to a set of ontologies.
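The intersection-bounded expansion can be sketched as follows; the similar-concept table and the ontology vocabularies are illustrative assumptions:

```python
def expand_query(terms, similar, shared_vocabulary):
    """Expand query terms with similar concepts, keeping only expansions
    that belong to the vocabulary common to the ontology set."""
    expanded = set(terms)
    for term in terms:
        expanded.update(similar.get(term, []))
    return {t for t in expanded if t in shared_vocabulary or t in terms}

# Illustrative similar-concept table (from the pre-processing step)
# and toy ontology vocabularies.
similar = {"river": ["water course", "stream", "fluvial channel"]}
vocabularies = [{"river", "water course", "stream"},
                {"water course", "stream", "lake"}]
shared = set.intersection(*vocabularies)  # the intersection principle

print(sorted(expand_query({"river"}, similar, shared)))
# "fluvial channel" is dropped: it is not in the shared vocabulary
```

Filtering through the intersection is what keeps the expansion bounded while still broadening the query.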
Figure 1. Query analysis stages.
5. Conclusions
We have studied the handling of multi-domain knowledge as a way of extending the user's
expertise in query processing. This feature can be especially important in the retrieval of data
sources in a catalogue. The inclusion of a mechanism of collaborative domains can extend the
information in the queries and disambiguate the concepts. The similarity measures provide a
semantic approach in the query analysis and in the construction of the catalogue. We have
described the work in progress that is currently under development. A significant part of the
components of the system has already been developed, although they still need to be integrated.
References
[1] M. Gross, "The Unknown in Process: Dynamic Connections of Ignorance, Non-Knowledge
and Related Concepts", Sage Publications, Current Sociology, vol. 55, no. 5, pp. 742-759, 2007.
[2] M. Gross, H. Hoffmann-Riem, "Ecological Restoration as a Real-World Experiment:
Designing Robust Implementation Strategies in an Urban Environment", Public Understanding
of Science, vol. 14, no. 3, pp. 269-284, 2005.
[3] W. Krohn, J. Weyer, "Society as a Laboratory: The Social Risks of Experimental Research",
Science and Public Policy, vol. 21, no. 3, pp. 173-183, 1994.
[4] U. Beck, "World Risk Society", Oxford: Polity Press, 1999.
[5] V. Kashyap et al., "Semantic heterogeneity in global information", Cooperative Information
Systems: Current Trends and Directions, 1997.
[6] J. Goodwin, "What have ontologies ever done for us - potential applications at a national
mapping agency", in OWL: Experiences and Directions (OWLED), 2005.
[7] Md. S. Shahriar et al., "Smart query answering for marine sensor data", Sensors, Molecular
Diversity Preservation International, vol. 11, no. 3, pp. 2885-2897, 2011.
[8] J. Han, Y. Huang, N. Cercone, Y. Fu, "Intelligent Query Answering by Knowledge
Discovery Techniques", IEEE Trans. Knowl. Data Eng., vol. 8, pp. 373-390, 1996.
[9] B. Harbelot, H. Arenas, C. Cruz, "The spatio-temporal semantics from a perdurantism
perspective", in Proceedings of the Fifth International Conference on Advanced Geographic
Information Systems, Applications, and Services (GEOProcessing), February-March 2013.
[10] ESRI, "GIS Best Practices: Spatial Data Infrastructure (SDI)", 2010.
[11] J. Noragh, "Power of Raven, Wisdom of Serpent", Floris Books, ISBN 0-940262-66-5, 1995.
[12] A. Borang et al., "Butterflies of Dihang Dibang Biosphere Reserve of Arunachal Pradesh,
Eastern Himalayas, India", Bulletin of Arunachal Forest Research, vol. 24, pp. 41-53, 2008.
[13] K. Kintigh, "Fauna Completeness Ontology", tDAR ID: 376370,
doi:10.6067/XCV8HT2NMV, 2012.
[14] M. Smithson, "Ignorance and Science: Dilemmas, Perspectives, and Prospects", Knowledge:
Creation, Diffusion, Utilization, vol. 15, no. 2, pp. 133-156, 1993.
[15] W. E. Dietrich et al., "Hollows, colluvium, and landslides in soil-mantled landscapes",
Binghamton Symposia in Geomorphology, International Series, Allen and Unwin, Hillslope
Processes, pp. 361-388, 1986.
[16] L. Xiaodong et al., "Topic models + word alignment = a flexible framework for extracting
bilingual dictionary from comparable corpus", in Proceedings of the Seventeenth Conference on
Computational Natural Language Learning, Sofia, Bulgaria, pp. 212-221, 2013.
[17] S. Suvrit et al., "Generalized dictionary learning for symmetric positive definite matrices
with application to nearest neighbor retrieval", Machine Learning and Knowledge Discovery in
Databases, Springer, pp. 318-332, 2011.
[18] T. Pedersen, "Duluth: Measuring Cross-Level Semantic Similarity with First and
Second-Order Dictionary Overlaps", SemEval 2014, p. 247, 2014.
[19] Md. S. Shahriar et al., "Smart query answering for marine sensor data", Sensors, Molecular
Diversity Preservation International, vol. 11, no. 3, pp. 2885-2897, 2011.
[20] P. Shvaiko et al., "A semantic geo-catalogue implementation for a regional SDI", University
of Trento, 2010.
[21] F. Giunchiglia, P. Shvaiko, M. Yatskevich, "S-Match: an algorithm and an implementation
of semantic matching", in Proc. of ESWS, 2004.
[22] F. Farazi et al., "A semantic geo-catalogue for a local administration", Artificial Intelligence
Review, Springer, vol. 40, no. 2, pp. 193-212, 2013.
[23] Aonaware Web Services, http://services.aonaware.com/DictService/, consulted in
November 2014.
[24] The Ontology Lookup Service, http://www.ebi.ac.uk/ontology-lookup/, consulted in
November 2014.
[25] Falcons, http://ws.nju.edu.cn/falcons/objectsearch/index.jsp, consulted in November 2014.
[26] J. Ticheler et al., "GeoNetwork opensource: Internationally Standardized Distributed
Spatial Information Management", OSGeo Journal, vol. 2, no. 1, 2007.
EMOTIONAL ROBO-INTELLIGENCE CREATION PROCESS
Dumitru TODOROI
AESM
Abstract. The materialization of the notions of information, knowledge, and conscience, their
functions, and their adaptability features, from the perspective of the intelligent systems
creation process, helps to investigate and develop Computer Based Information Emotion
Systems (CBIES) for the Information, Knowledge-based, and Consciousness Societies. The
CBIES' second-level elements for these societies are evaluated based on the corresponding
CBIES' first-level elements (component parts and phases of activity). Human emotions are
examined in order to create Emotional ROBO-intelligences (EQ) as the continuation of
Creative ROBO-intelligence (IQ). The fruitful cooperation of both IQ and EQ intelligence
will evolve into the Consciousness Society, which, according to multiple surveys in the field,
will be created in the years 2019-2035. Human temperament is examined with the purpose of
creating artificial emotional, sensual and creative intelligences. Adaptable tools for defining
new robotic elements are used to define the higher-level elements of emotional creative
ROBO-intelligences. The formulation, formalization and adaptable algorithmization of the
higher-level elements of the temperament ROBO-intelligences represent the evolutionary
development of the creation process of the ROBO-intelligences of the Consciousness Society.
The proposed research results represent a logical continuation of the research results [1-4].
Keywords: emotion, intelligence, temperament, conscience society, ROBO-intelligence
JEL classification: C45
Introduction
Human society is on the threshold of the Consciousness Society, which, according to estimates
by scientists in the field, will be created during the years 2019-2035, and is currently supported
by the Third Industrial Revolution. Dramaturgical consciousness goes alongside the distributed
energy and communication systems of the Third Industrial Revolution, just as psychological
consciousness came with the Second Industrial Revolution and ideological consciousness
participated in the First Industrial Revolution.
The empathic human civilization has a multitude of features which, in the Consciousness
Society, will be specific to the robotic civilization as well. Books [3] and [7], on the basis of
our multiple references, support us in demonstrating the truth of the statement: "In the
Consciousness Society the Artificial Intelligence (ROBO-intelligence) will be equal to human
structured intelligence and this Society will be empathic".
For decades robots have diligently been tasked to perform a range of duties largely scoped
within industrial manufacturing. More recently, we have seen the emergence of a new
landscape of more social, personal, expressive, nurturing, and emotional robotic platforms.
Increasingly, robots play a critical new role as extensions of ourselves, enabling our creativity,
creating new objects, serving as companions, expressing emotions, empowering communities,
and challenging our civil rights.
To initiate the discussion on the Emotional ROBO-intelligence creation process, let us state
the problem: a ROBO-intelligence entity with emotions (an Emotional ROBO-intelligence,
EQ) has to activate
in the following situation: the entity is in its best disposition and is asked to clean the dusty
room after a school lecture on car driving.
The questions asked are: What is an emotional ROBO-intelligence? What are the
ROBO-intelligence's component parts? What competences does a ROBO-intelligence entity
need to initiate, process, and finish the task? What ROBO-intelligence competences have to be
activated to analyze the emotional situation for doing this task? What are the measures of the
ROBO-intelligence competences? How is the evaluation process of doing the task programmed
using emotional measures? What are the emotional evaluation steps of the ROBO-intelligence
in the process of doing the task? What competences are needed to inform the asker that the
task was finished?
Some of these questions can be answered by the Creative ROBO-intelligences [7], which
possess the creative features: Inspiration, Imagery, Imagination, Intuition, Insight,
Improvisation, and Incubation. These IQ features evolve in correspondence with the six steps
to the top of Creative ROBO-intelligence: acquire Knowledge, develop Curiosity, become
Interested, Passion, Dedication, and Professionalism.
The creative features and their evaluation steps produce the second-level IQ items of the
Creative ROBO-intelligences. Each IQ item is defined by its special Consciousness Society
Intelligent Information System (Table 1).
Table 1. Consciousness Society Intelligent Information System (CSIIS).

CSIIS component part \ CSIIS phase | Consciousness Capture of IQ item | Consciousness Storage of IQ item | Consciousness Processing of IQ item | Consciousness Distribution of IQ item
IQ item's Hard-ware | IQ item's capturing organs and tools | IQ item's brain fixation | IQ item's neuronal brain connection | IQ item's neuronal subconscious and unconscious connection
IQ item's Soft-ware | Drivers for the IQ item's capturing organs and tools | Fixation of the IQ item in ROBO-memory | IQ item's processing in the two brain hemispheres | IQ item's drivers of the organs of neuronal subconscious and unconscious connection
People & IQ interaction | People & ROBO-intelligence interaction initialization | People & ROBO-intelligence interaction documentation | People & ROBO-intelligence and society consciousness formation | People & ROBO-intelligence evolution and its implementation in society
IQ item's IKC (data, information, knowledge, consciousness) | IQ item's IKC formulation | IQ item's IKC formalization | IQ item's IKC operational functionalities | IQ item's IKC distribution in the subconscious and unconscious
IQ item's Brain-ware (methods, models, algorithms, procedures) | IQ item's Brain-ware capture | IQ item's Brain-ware storage in ROBO-memory | IQ item's Brain-ware interpretation technologies | IQ item's Brain-ware conscience, subconscious and unconscious creativity
IQ item's Group-ware (consciousness communications) | Individual and group IQ item's Group-ware capture | IQ item's Group-ware storage of group interactions | Coordination of individual and group IQ item's Group-ware processing | IQ item's Group-ware transmission and its crystallization in the subconscious and unconscious
Each cell of Table 1 contains the functional evolution of a ROBO-intelligence item. This
evolution is supported by the adaptable ROBO-intelligence creative tools, using the formula
for Creative (Artificial) Intelligence: IQ = IKC * TS [5].
1. Tests in the Emotional ROBO-intelligence creation process
The previous questions underline the problems which have to be solved to obtain an
Emotional ROBO-intelligence that can understand the task announced by the human, carry
out this task in the given emotional situation, transform its emotion, and report to the boss
that the task is finished.
Many tests that promise to measure emotional intelligence have appeared in recent years and
are used in the process of creating the Emotional ROBO-intelligence. Some of these tests seem
promising, but many have not been empirically evaluated. We have reviewed many of these
tests and, with the goal of creating the Emotional ROBO-intelligence, selected those for which
there is a substantial body of research (at least five published journal articles or book chapters
that provide empirical data based on the test).
The Consortium for Research on Emotional Intelligence in Organizations (CREIO) does not
sell or distribute any measures or assessments. For information on obtaining specific
identification and measurement tasks (such as the qualifications or certifications needed to
administer specific measures, to store or process emotions, and to organize emotion
relationship and distribution), please refer to the contact information provided with the
description of each assessment.
Maps of emotional feelings and the Computer Based Information Emotion System (CBIES)
represent the first steps we consider necessary for creating the Emotional ROBO-intelligence.
2. Maps of emotional feelings
The emotional existence plane is defined as follows: the level of human existence which
registers changes in emotional states.
The way of expressing the emotions can vary very much: love/hate, happiness/sadness,
calmness/anxiety, trust/fury, courage/fear, etc.
Emotions coordinate our behavior and psychological condition during the main survival
events and pleasant interactions. Although we are aware of our current emotional condition,
such as happiness or fury, the mechanisms that cause and feed these sensations are still
undiscovered. Here, a personal topographic self-report instrument is used, which unveils the
fact that different emotional conditions are associated with distinct, culturally universal
topographic bodily sensations; these feelings could underlie conscious emotional experiences.
Watching the sensation topography caused by emotions, we obtain a unique instrument for
researching the emotions, which could serve as a biomarker of emotional disorders.
Somatosensory feedback has been proposed as a trigger of conscious emotional experiences
because emotions are often felt in the body. Figure 1 presents the map of bodily sensations
associated with different emotions, obtained using a unique topographical self-report method.
Figure 1. Bodily sensations map associated with different emotions, obtained using a unique
topographical self-report method
The figure shows the bodily topography of basic (upper) and non-basic (lower) emotions
associated with words. The body maps show regions whose activation increased (warm colors)
or decreased (cool colors) when feeling each emotion.
Researchers at Aalto University in Finland have compiled maps of emotional feelings
associated with culturally universal bodily sensations, which could be at the core of the
emotional experience. These emotional feelings are: Fury, Fear, Disgust, Happiness, Sadness,
Anxiety, Amazement, Neutrality, Love, Depression, Pride, Shame, Envy, and Hatred. The
researchers found that the most common emotions trigger strong bodily sensations, and the
bodily maps of these sensations were topographically different for different emotions.
If the goal is to create an entity, a robot with emotions (an Emotional ROBO-intelligence,
EQ), specialists must carefully study images from the entire world of at least the human facial
expressions associated with different feelings. For the machine to have an entertaining and
pleasant presence to humans, such an EQ has to behave politely, express emotions, "read"
human emotions and react adequately. It is interesting, at least, to create an EQ Head & Heart
with mobile parts for expressing emotions such as happiness, sadness or melancholy.
2.1. Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) is an ability-based
test designed to measure the four branches of the EI model of Mayer and Salovey [8-10].
MSCEIT was developed from an intelligence-testing tradition formed by the emerging
scientific understanding of emotions and their function and from the first published ability
measure specifically intended to assess emotional intelligence, namely Multifactor
Emotional Intelligence Scale (MEIS).
MSCEIT consists of 141 items and takes 30-45 minutes to complete. MSCEIT provides 15
main scores: Total EI score, two Area scores, four Branch scores, and eight Task scores. In
addition to these 15 scores, there are three Supplemental scores [8].
The Mayer-Salovey-Caruso Emotional Intelligence Test scales were discussed and analyzed
along the four branches of emotional intelligence:
- Perceiving Emotions: The ability to perceive emotions in oneself and others as well as
in objects, art, stories, music, and other stimuli;
- Facilitating Thought: The ability to generate, use, and feel emotion as necessary to
communicate feelings or employ them in other cognitive processes;
- Understanding Emotions: The ability to understand emotional information, to understand
how emotions combine and progress through relationship transitions, and to appreciate
such emotional meanings;
- Managing Emotions: The ability to be open to feelings, and to modulate them in oneself
and others so as to promote personal understanding and growth.
Our goal is to investigate emotional ROBO-intelligences which possess the known classical
emotion elements (Happiness, Fear, Surprise, Disgust, Sadness, and Anger) from the point of
view of their introduction into robot entities as intellectual, emotional, moral, temperamental,
and sensual compartments.
The emotional feelings presented by Aalto University's researchers can be defined and
expressed with the help of the classical emotion elements, using adaptable tools.
2.2. Emotional ROBO – intelligence evolution
Combining the performance of physical, intellectual and spiritual work (lower-level elements
of ROBO-intelligences) with the emotional developments of ROBO-intelligences
(self-awareness [11], managing emotions, motivation [12], empathy [11] and handling
relationships, themselves lower-level elements of ROBO-intelligences) leads to the
higher-level items of ROBO-intelligences (Table 2). Their definitions are:
Self-awareness is the capacity for introspection and the ability to recognize oneself as an
individual separate from the environment and other individuals;
Managing emotions is the ability to control emotional mental states;
Motivation is the psychological feature that arouses an organism to action toward a desired
goal; the reason for the action; that which gives purpose and direction to behavior;
Empathy is the identification with and understanding of another's situation, feelings, and motives;
Handling relationships is the ability to have relationships in and with society.
Table 2. Emotional ROBO-intelligences Evolution

Work \ Evolution | Self-awareness | Managing emotions | Motivation | Empathy | Handling relationships
Physical work | Self-awareness physical work | Physical work managed by emotions | Motivated physical work | Empathic physical work | Relational physical work
Intellectual work | Self-awareness intellectual work | Intellectual work managed by emotions | Motivated intellectual work | Empathic intellectual work | Relational intellectual work
Spiritual work | Self-awareness spiritual work | Spiritual work managed by emotions | Motivated spiritual work | Empathic spiritual work | Relational spiritual work
The high-level elements of ROBO-intelligences presented in the cells of Table 2 are
functionally defined by the adaptable tools of the evaluated Integrated Systems [13].
3. Computer Based Information Emotion System functions
The Emotional ROBO-intelligence is a Computer Based Information Emotion System
(CBIES), whose component parts are: emotion measure, identification and perception;
emotion storage and processing; and emotion relationship-distribution. Each of the CBIES
component parts is supported by corresponding hard-ware, soft-ware, knowledge-ware,
social (people's)-ware, conscience-ware and group-ware.
A special CBIES needs to be created for each of the Emotional ROBO-intelligence's
characteristics: emotion identification; emotion competence; emotion quantity, quality, and
storage; emotion education, evolution, and processing; and emotion distribution.
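This component/layer structure can be made explicit with a minimal sketch, assuming a simple data model that the paper does not prescribe; the class and field names are our own:

```python
from dataclasses import dataclass, field

# The six "-ware" layers named in the text.
WARES = ("hard-ware", "soft-ware", "knowledge-ware",
         "social (people's)-ware", "conscience-ware", "group-ware")

@dataclass
class CBIESComponent:
    """One CBIES component part with its supporting layers."""
    name: str
    wares: dict = field(default_factory=lambda: {w: None for w in WARES})

components = [
    CBIESComponent("emotion measure, identification and perception"),
    CBIESComponent("emotion storage and processing"),
    CBIESComponent("emotion relationship-distribution"),
]

for part in components:
    assert tuple(part.wares) == WARES  # every part carries all six layers
print(f"{len(components)} component parts x {len(WARES)} layers")
```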
3.1. The CBIES's emotion entity measure refers to such functions as emotional internal
states, impulses, preferences, resources, and intuitions.
The emotional intelligence measure refers to Emotional Self-Awareness: the skill of
perceiving and understanding one's own emotions.
The emotion entity value for internal states, impulses, preferences, resources, and intuitions
is the capacity to identify and understand the impact one's own feelings have on thoughts,
decisions, behavior and performance at work.
The quantity and quality of the emotional internal states, impulses, preferences, resources,
and intuitions functionally represent the information needed to calculate the overall EQ
power, as well as the powers for the following 5 composite scales and 15 subscales.
3.2. The CBIES's emotion identification functionally represents the emotion entity's ID,
competences, measure, value, quality and quantity.
Our investigation refers to the classical emotion elements: Happiness, Fear, Surprise,
Disgust, Sadness, and Anger.
The axis of emotional intelligence evolution steps is represented by:
(1) Self-awareness: recognizing internal feelings,
(2) Managing emotions: finding ways to handle emotions that are appropriate to the situation,
(3) Motivation: using self-control to channel emotions toward a goal,
(4) Empathy: understanding the emotional perspective of other people,
(5) Handling relationships: using personal information and information about others to handle
social relationships and to develop interpersonal skills.
3.3. The CBIES's emotion entity competences, based on the Emotional & Social Competence
Inventory identified by Dr. Daniel Goleman in Working with Emotional Intelligence [Wolff,
2006] and on the Bar-On model of emotional-social intelligence [Bar-On, 2006], functionally
measure the overall EQ power as well as the powers for the following composite scales:
Self-Awareness, Self-Management, Social Awareness, and Relationship Management.
3.3.1. Self-Awareness concerns knowing one's internal states, preferences, resources, and
intuitions. The Self-Awareness cluster contains three competencies:
Emotional Awareness: Recognizing one's emotions and their effects
Accurate Self-Assessment: Knowing one's strengths and limits
Self-Confidence: A strong sense of one's self-worth and capabilities
3.3.2. Self-Management refers to managing one's internal states, impulses, and resources. The
Self-Management cluster contains six competencies:
Emotional Self-Control: Keeping disruptive emotions and impulses in check
Transparency: Maintaining integrity, acting congruently with one’s values
Adaptability: Flexibility in handling change
Achievement: Striving to improve or meeting a standard of excellence
Initiative: Readiness to act on opportunities
Optimism: Persistence in pursuing goals despite obstacles and setbacks
3.3.3. Social Awareness refers to how people handle relationships and awareness of others’
feelings, needs, and concerns. The Social Awareness cluster contains three competencies:
Empathy: Sensing others' feelings and perspectives, and taking an active interest in their
concerns
Organizational Awareness: Reading a group's emotional currents and power relationships
Service Orientation: Anticipating, recognizing, and meeting customers' needs
3.3.4. Relationship Management concerns the skill or adeptness at inducing desirable
responses in others. The Relationship Management cluster contains six competencies:
Developing Others: Sensing others' development needs and bolstering their abilities;
Inspirational Leadership: Inspiring and guiding individuals and groups;
Change Catalyst: Initiating or managing change;
Influence: Wielding effective tactics for persuasion;
Conflict Management: Negotiating and resolving disagreements;
Teamwork & Collaboration: Working with others toward shared goals. Creating group
synergy in pursuing collective goals.
3.4. The CBIES's emotion storage and processing refer to the process of accumulating the
processing results of the emotion educational evolution steps (home 7 years, high school,
second school, ...) along Piirto's 7i intelligence characteristics axis and Piirto's 6-step
emotion intelligence evolution axis.
The axis of the Emotional ROBO-intelligence's hierarchical evaluation steps is functionally
represented by Piirto's 6 creativity top elements: (a) acquire Knowledge, (b) develop
Curiosity, (c) become Interested, (d) Passion, (e) Dedication, and (f) Professionalism.
3.4.1. The CBIES's emotional intelligence processing functionally measures:
(1) Emotional Self-Control: the skill of effectively controlling strong emotions experienced,
(2) Emotional Self-Management: the skill of effectively managing one’s own emotions,
(3) Emotional Expression: the skill of effectively expressing one’s own emotions, and
(4) Emotional Reasoning: the skill of utilizing emotional information in decision-making.
One side of emotion processing constitutes emotional intrapersonal processing [Bar-On,
2006]. Emotional management and regulation processing define emotional stress
management.
3.4.2. CBIES’s Emotion intrapersonal processing (self-awareness and self-expression) is a
composition of the next 5 functions:
(1) Self-Regard: To accurately perceive, understand and accept oneself;
(2) Emotional Self-Awareness: To be aware of and understand one’s emotions;
(3) Assertiveness: To effectively and constructively express one’s emotions and oneself;
(4) Independence: To be self-reliant and free of emotional dependency on others;
(5) Self-Actualization: To strive to achieve personal goals and actualize one’s potential.
3.5. The CBIES's emotional intelligence relationship-distribution is defined by:
- emotional awareness of perceiving and understanding others (the skill of perceiving
and understanding others' emotions), and
- emotional management of influencing the moods of others (the skill of influencing the
moods and emotions of others).
The CBIES's emotion relationship-distribution functionally accumulates emotional
interpersonal processing, stress management, adaptability, and emotional general mood.
3.5.1. Emotional interpersonal processing (social awareness and interpersonal relationship)
is a composition of the next 3 functions:
Empathy: To be aware of and understand how others feel
Social Responsibility: To identify with one’s social group and cooperate with others
Interpersonal Relationship: To establish mutually satisfying relationships and relate well with
others
3.5.2. Emotional stress management (emotional management and regulation) is a
composition of the next 2 functions:
Stress Tolerance: To effectively and constructively manage emotions
Impulse Control: To effectively and constructively control emotions
3.5.3. Emotional adaptability (change management) is a composition of the next 3 functions:
Reality-Testing: To objectively validate one’s feelings and thinking with external reality
Flexibility: To adapt and adjust one’s feelings and thinking to new situations
Problem-Solving: To effectively solve problems of a personal and interpersonal nature
3.5.4. Emotional general mood (self-motivation) is a composition of the next 2 functions:
Optimism: To be positive and look at the brighter side of life
Happiness: To feel content with oneself, others and life in general
3.5.5. Morality’s axe of Emotional ROBO-intelligence is represented by the next elements:
(a) Accept differences in others, (b) Respond promptly to others, (c) Leave some "free" time,
(d) Care about others as if they were you, (e) Treat everyone similarly, (f) Never engage in
violent acts, (g) Have an inner sense of thankfulness, and (h) Have a sense of commitment.
4. Temperament Emotion ROBO-intelligences
Table 3 presents the high-level robot elements, which are functionally defined based on the
lower-level elements on one side and on the creativity of intelligence on the other side,
through the functions of the choleric, melancholic, phlegmatic, and sanguine temperaments.
Table 3. Temperament ROBO-intelligences with emotion features

Character \ Emotion | Happiness | Fear | Amazement | Disgust | Sadness | Anger
Choleric | Happy choleric: calm, doesn't smile very much even when he is very happy | Scared choleric: has no fear, likes to risk | Amazed choleric: calm, doesn't seem to be amazed | Disgusted choleric: wants to avoid | Sad choleric: doesn't appreciate tears and emotions, not easy to discourage | Angered choleric: rarely feels angered
Sanguine | Happy sanguine: tries to share the happiness with others | Frightened sanguine: he is not fearful | Amazed sanguine: very emotional and demonstrative | Disgusted sanguine: rapidly avoids with disgust | Sad sanguine: wants to share sad thoughts and emotions with somebody | Furious sanguine: easily irritated
Phlegmatic | Happy phlegmatic: a born pessimist, which keeps him connected to reality | Frightened phlegmatic: is able to be calm in the middle of the storm | Amazed phlegmatic: master of himself, not overwhelmed by emotions | Disgusted phlegmatic: doesn't offend, doesn't call attention to himself and performs what he is expected to do without any rewards | Sad phlegmatic: resists the challenges, listens to what the others have to say, consoles the suffering ones | Furious phlegmatic: refuses to be impressed by the bright choleric's decisions, and doesn't take the melancholic's laborious plans seriously
Melancholic | Happy melancholic: has to learn to be optimistic | Melancholic with fear: is sober, sensible and correct | Amazed melancholic: lives the moment emotionally, intensely and durably | Disgusted melancholic: weak reactivity, feeling of inferiority | Sad melancholic: introverted, silent and thinks profoundly with a pessimistic note | Angered melancholic: without tempering the anger, the melancholic suffers very much
4.1. Phlegmatic and Melancholic ROBO-Intelligences
Examples of the Phlegmatic and Melancholic types, which possess such classical first-level emotion elements as Happiness, Fear, Surprise, Disgust, Sadness, and Anger, are presented in Table 4.
Table 4. Phlegmatic and Melancholic ROBO-Intelligences with emotions.
| Characters combined with Emotions | Happiness | Fear | Surprise | Disgust | Sadness | Anger |
|---|---|---|---|---|---|---|
| Phlegmatic | Phlegmatic happiness | Phlegmatic fear | Phlegmatic surprise | Phlegmatic disgust | Phlegmatic sadness | Phlegmatic anger |
| Melancholic | Melancholic happiness | Melancholic fear | Melancholic surprise | Melancholic disgust | Melancholic sadness | Melancholic anger |
It is supposed that, at this stage of its development, the Phlegmatic ROBO-intelligence possesses such first-level emotion elements as Happiness, Fear, Surprise, Disgust, Sadness, and Anger. Moreover, the Phlegmatic ROBO-intelligence possesses the first-level elements that are features of the Phlegmatic temperament. In this situation, the higher-level elements of the Phlegmatic ROBO-intelligence (the cell elements in Table 4) can be developed using these first-level Phlegmatic ROBO-intelligence elements.
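The composition of a higher-level element (a cell of Table 4) from two first-level elements can be sketched as follows. This is an illustrative model only: the intensity field and the multiplicative modulation rule are assumptions, not taken from the paper.

```python
# Hypothetical sketch: a higher-level element (a cell of Table 4) built
# from two first-level elements, a temperament and an emotion.

from dataclasses import dataclass

@dataclass
class FirstLevelElement:
    name: str            # e.g. "Phlegmatic" or "Fear"
    intensity: float     # assumed normalized activation in [0, 1]

@dataclass
class HigherLevelElement:
    temperament: FirstLevelElement
    emotion: FirstLevelElement

    @property
    def name(self) -> str:
        return f"{self.temperament.name} {self.emotion.name.lower()}"

    @property
    def intensity(self) -> float:
        # Assumption: the temperament modulates (damps or amplifies)
        # the raw emotion intensity.
        return self.temperament.intensity * self.emotion.intensity

cell = HigherLevelElement(FirstLevelElement("Phlegmatic", 0.3),
                          FirstLevelElement("Fear", 0.9))
print(cell.name)   # Phlegmatic fear
```

A low temperament intensity here mirrors the text's description of the phlegmatic as barely expressing emotion.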
4.2. Temperament states
Depending on his temperament, a man experiences different states (Table 5); this part of the work needs special attention.
Table 5. Temperament ROBO – intelligences with emotions sub-types.
| Emotions / Temperaments | Happiness | Fear | Disgust | Anger | Sadness | Surprise |
|---|---|---|---|---|---|---|
| Choleric | Fearless | Cheerfully | Intolerant | Vindictive | Angry, stunned | Astonished |
| Sanguine | Pleased | Anxious | Irritable | Cranky | Depressed | Impressed |
| Phlegmatic | Controlled | Inert, impenetrable | Patient | Calm, peace support | Compassion (he relieves the suffering) | Balanced |
| Melancholic | Unbalanced | Closed, hidden | Impatiently | Control, avoid situations/places | Depression | Intemperate |
4.3. Adapter's Pragmatic Examples of Temperament EQ Components
Each cell element in the tables below needs to be defined by its ADAPTER. Table 6 presents pragmatic examples of the elements of the ROBO emotional-intelligence character that can stimulate and generate data according to the character's emotions.
Table 6. ROBO emotional intelligences nature

| Emotions combined with characters | Happiness | Fear | Surprise | Disgust | Sadness | Anger |
|---|---|---|---|---|---|---|
| Choleric | Choleric happy (a favorite football team won) | Choleric fearful (fears he will not be able to arrive on time) | Choleric wondering (a colleague has done better than him) | Choleric disgust (was awakened from sleep) | Choleric sad (got sick when he had to attend a business meeting) | Choleric enraged (no member of the organization showed up at an assembly) |
| Sanguine | Sanguine happy (won a trip to the U.S.) | Sanguine fearful (fears he will not be able to perform the proposed work) | Sanguine wondering (a friend won the trip of his dreams) | Sanguine disgust (someone broke the rules in front of him) | Sanguine sad (not accepted for a trip to France) | Sanguine anger (did not receive the torch command on his name day) |
| Phlegmatic | Phlegmatic happy (his project was named the best) | Phlegmatic fear (will not be acceptable to those of a particular organization) | Phlegmatic wondering (managed to amaze a whole room with his ideas) | Phlegmatic disgust (before a meeting someone dirtied his clothes with coffee) | Phlegmatic sad (failed by a few minutes to submit in time the project he worked on) | Phlegmatic angered (someone stole his business ideas) |
| Melancholic | Melancholic happy (has been accepted for a scientific research study) | Melancholic fear (will not be able to fit in the time limit) | Melancholic wondering (received a higher salary than expected) | Melancholic disgust (he was mocked by someone) | Melancholic sad (was fired) | Melancholic angered (ideas on which he worked much were deemed unsuccessful) |
If we combine all three steps of ROBO-intelligence evaluation (Intelligence, Creativity, and Emotions), we can create all four types of ROBO-intelligences with Creativity, Temperament, and Emotions.
Each temperament robot will have its own features, which at the same time shows us that everyone's personality is unique and consists of a combination of features.
Artificial creative and emotional intelligence can become a great help to people who work with consumers daily, because such robots can determine a human personality and suggest how to interact with it.
Conclusion
The complexity of physical, intellectual and spiritual work determines how hard it is for an individual to achieve the superior qualities from the pyramid of "types of work" [5]. These qualities should be experienced consciously by people in order to be introduced into AI. Against the background of society's degradation, there are also people who have shown brilliant and unimaginable skills.
The specialization of the European space in services and technologies demonstrates the transition of society from physical to intellectual and spiritual work. The implementation of these qualities in ROBO-intelligence would be the next step in the formation of the Consciousness Society.
Emotions, creativity, personality: all of these are important features of the human being, but lately they have been used incorrectly, leading to conflicts that could be avoided. To avoid them, we can attach all those features to a robot and make it help us with everyday activities.
The way our emotions work is a science in the true sense. Emotional Intelligence involves bringing emotions to their true place in our everyday lives; its use is relevant to everyone and applicable in all environments and professions.
In computing, things evolve exponentially. In just a few generations, robots have left the realm of scientific fantasy, curiosity and play and entered the field of the developed psyche, creativity, intelligence, emotion and the human spirit, to replace or augment their human counterparts.
We are already at the point where we have to consider the next step of evolution in robotics. According to robotics engineers, it seems that at some point in the near future this next step could be a great expansion of robotics. We are convinced of this by the predictions, or rather the plans for the future, made by specialists:
(1) The human mind could be downloaded and inserted into a robot's memory in less than 10 years, enabling people to live forever; these plans are supported by the Russian entrepreneur Dmitry Itskov, who said he has hired 100 scientists to carry out the project, called "Avatar";
(2) Around 2050, marriage with robots will be legalized in Massachusetts, believes David Levy.
It seems ironic that these robots become sentient only by injecting them with humanity. The fact is that robots are nothing without this potential, which the human brain, itself nothing but a computer, emulates through various tricks.
True, we have a long way to go to reach our full potential, but "the world is a true treasure in changing things", and we conclude that the next few years will bring many changes, real changes for humanity.
In the Conscious Society (2019 - 2035) Artificial Intelligence (IA) will be equal to the value of structured Natural Intelligence (IN):
IA = IN structured.
In the advanced period of that Society the formula will evolve into IA > IN structured; from here Artificial Conscience starts to activate, which is the next step in the development of Human Society: it will already be a ROBO-human Society.
Using Adaptable Tools [13], the ROBO-intelligences, represented by their pragmatics as was done in this research, can be completed with their usage context, their syntax (representation forms), their semantics (represented by the algorithms of their creation) and examples of their higher-level elements. That work has to be done in the near future.
Adaptable tools are based on the use of the ADAPTER. The ADAPTER is a mechanism of adaptable human-machine interaction on the axes language, processor, data, actions, definition and appeal, from the point of view of its use in adaptable programming. Adaptable programming technology is a new direction in designing future programming systems, and it is practiced via the extension and reduction of the ROBO-intelligence creation process.
The Adapter is a metalinguistic instrument in direct contact with the functioning environment, consisting of a) a language and a translator, b) data and actions (operations, instructions and conduct) and c) activities for using and defining the new elements.
The Adapter represents each new element via other elements by defining 1) the pragmatics of the new element; 2) its syntax (shape); 3) its semantics (content: its functional expression given via lower-level elements); 4) the context of the new element's usage; and 5) one or more examples presenting the new element.
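The five-part Adapter description above can be sketched as a simple record type. The field names mirror the enumeration in the text; the example values are paraphrased from Table 7's sanguine row, and this structure is an illustration, not the authors' implementation.

```python
# Sketch of the five-part Adapter description of a new element
# (hypothetical data structure; field names follow the enumeration above).

from dataclasses import dataclass, field

@dataclass
class Adapter:
    pragmatics: str                 # 1) what the element is for
    syntax: str                     # 2) its shape / representation form
    semantics: str                  # 3) its content via lower-level elements
    context: str                    # 4) where the element is used
    examples: list = field(default_factory=list)  # 5) one or more examples

sanguine_fear = Adapter(
    pragmatics="Sanguine ROBO-intelligence 'is worried' like a sanguine.",
    syntax="two syntactic parts: fear and sanguinity",
    semantics="composed from the semantic functions of intelligence and fear",
    context="social interactions of the sanguine temperament",
    examples=["fears he will not be able to perform the proposed work"],
)
print(sanguine_fear.pragmatics)
```

Each cell of Table 7 would then correspond to one such record.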
The Adapter for each element of level 2 of the emotional ROBO-intelligences has to be presented by its cell element, which is processed by presenting (Table 7) its Adapter pragmatics, syntax, semantics, context and examples of usage.
Table 7. Adapter serving emotional ROBO-intelligences

Choleric:
- Pragmatics: The choleric seeks to be in control of situations, to be on top, to be the best.
- Syntax: In syntax the choleric uses imperative, commanding language, wording things as orders rather than requests.
- Semantics: In semantics cholerics are firm and forceful in their approach to problems. They believe in 'tough love' and try to 'help' others by challenging them to prove themselves, as they themselves would.
- Context: They are more likely to tell someone they are trying to 'help' that they are pathetic, expecting the person to say 'no, I'm not pathetic, I'll show you!', as indeed a choleric would in response to such a thing.
- Examples: Their confidence and demanding natures make them natural leaders, though this doesn't mean that they would necessarily enjoy leadership positions; they are just more likely to take charge if necessary rather than fumbling around worrying.

Sanguine:
- Pragmatics: Sanguine people are boisterous, bubbly, chatty, openly emotional, social extroverts.
- Syntax: Sanguine fear consists of two syntactic parts: fear and sanguinity.
- Semantics: The sanguine fear of the ROBO-intelligence is semantically defined with the contribution of the semantic functions of intelligence, fear and sanguine fear.
- Context: The Sanguine ROBO-intelligence "is worried" like a sanguine.
- Examples: Sanguines find social interactions with faces both familiar and unfamiliar invigorating. This is how they recharge, and time alone, while sometimes desirable, can bore them quickly.

Phlegmatic:
- Pragmatics: Phlegmatics do not act as if they are better than others. They are eager to please, and quick to give in to others rather than asserting their own desires as if they were the most important.
- Syntax: Conflict terrifies them. They do not start it (except perhaps in extreme circumstances) or provoke it, and they try to defuse it when it comes up. When forced into an argument, they get very upset and distressed, seeking escape rather than victory.
- Semantics: In semantics the phlegmatic tries to word things in a way that is not offensive to others. They will be more supportive than critical.
- Context: The Phlegmatic ROBO-intelligence barely expresses emotion at all. While the sanguine might whoop and cheer and jump for joy at the slightest provocation, phlegmatics are unlikely to express more than a smile or a frown. Their emotions happen mainly internally.
- Examples: The phlegmatic members of a pack might have been the obedient followers who would get much of the actual work done at the command of their superiors. They may not stand out, but without them nothing would work.

Melancholic:
- Pragmatics: The defining feature of a melancholic attitude is perfectionism. They are idealists who wish for things to be a certain way, and they get distressed when they are not.
- Syntax: They are very wary of making friends. Unlike sanguines, it can take them a very long time to consider someone they are familiar with a 'friend', but once they have reached this point, they will likely stick with that person loyally.
- Semantics: They are moved deeply by beauty, and by distress. They are very easily hurt because of their perfectionistic tendencies. Often their moods are like delicate glass sculptures: built up slowly, deliberately and carefully, but easily broken, and hard to repair once shattered.
- Context: They respond to things that they dislike with misery and with tears rather than with rage.
- Examples: The melancholic members of a pack may have been the analysts, the information gatherers. They scouted for potential danger, or for food, and reported back to the pack leader. The more accurate their findings were, the better; this led to a trend towards perfectionism, as the 'analysts' closer to perfection survived better than those that made sloppy mistakes.
References
[1] Mihalcea, R., Rosca, I-Gh., Todoroi, D., "Sisteme informatice in Societatea Conştiinţei," in Analele ASEM, Editia a VIII-a, Editura ASEM, 2010, pp. 341-360.
[2] Mihalcea, R., Rosca, I-Gh., Todoroi, D., "Discovering and managing Creativity in Conscience Society," in Analele ASEM, Editia a IX-a, Editura ASEM, Chişinău, 2011, pp. 225-239.
[3] Todoroi, D., Creativity in Conscience Society, LAMBERT Academic Publishing, Saarbrucken, Germany, 2012. ISBN 978-3-8484-2335-4.
[4] Todoroi, D., Rosca, I-Gh., "Intelligence development creativity features leveling in Conscience Society," in Analele ASEM, Editia a XI-a, Editura ASEM, Chişinău, 2013, pp. 278-290. ISSN 1857-1433.
[5] Todoroi, D., Crearea societăţii conştiinţei, Materialele Teleconferinţei Internaţionale a tinerilor cercetători "Crearea Societăţii Conştiinţei," Ed. a 3-a, 11-12 aprilie 2014, Chişinău, 129 pagini / coord.: Dumitru Todoroi: ASEM (Chisinau, Republic of Moldova), ARA (CalTech, Los Angeles, USA), UAIC (Iashi, România), ISU (Chicago, USA), UB (Bacău, România), UC (Cluj, România), ASE (Bucharest, România). ISBN 978-9975-75-612-6.
[6] Rifkin, J., The Empathic Civilization: The Race to Global Consciousness in a World in Crisis, Penguin Books Ltd., New York, 2009. ISBN 978-1-58542-765-9.
[7] Todoroi, D., "Creativity's Kernel Development for Conscience Society," Informatica Economică, Bucharest, vol. 16, no. 1/2012, pp. 70-86.
[8] Mayer, J. D., Salovey, P., Caruso, D. R., Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), Multi-Health Systems, Inc., Toronto, Ontario, 2002.
[9] Brackett, M. A., Mayer, J. D., "Convergent, discriminant, and incremental validity of competing measures of emotional intelligence," Personality and Social Psychology Bulletin, 29, 2003, pp. 1147-1158.
[10] Brackett, M. A., Rivers, S. E., Shiffman, S., Lerner, N., Salovey, P., "Relating emotional abilities to social functioning: A comparison of self-report and performance measures of emotional intelligence," Journal of Personality and Social Psychology, 91, 2006, pp. 780-795.
[11] http://dexonline.ro
[12] http://www.thefreedictionary.com/dedication
[13] Todoroi, D., Micuşa, D., Sisteme adaptabile, Editura Alma Mater, Bacău, România, 2014, 158 pagini. ISBN 978-606-527-347-4.
MODELING THE RELATIONSHIPS NETWORKS INSIDE GROUPS
AS GRAPHS
Diana RIZESCU (AVRAM)1
Bucharest University of Economic Studies
[email protected]
Vasile AVRAM
Bucharest University of Economic Studies
[email protected]
Abstract. The aim of this paper is to establish a common conceptual framework for representing relationships between elements (agents, parts, individuals) inside complex systems such as social and economic systems, no matter whether they are based on agents or on humans. The representation draws analogies with the human brain and psyche, in a permanent effort to use models inspired by the brain and psyche to model social and economic systems. The goal is to find a proper graph representation of the network of relationships inside "social groups" (no matter whether it is a small group, an enterprise or an entire social system) and to make it possible to measure their internal and external complexity. The graph framework developed here allows representing the attractors and their basins as neighboring relationships that are proved using two Euler formulas: one specific to graphs and another, proposed here, for the sphere, to check completeness.
Keywords: agents, complex systems, emergence, graphs, relationship graph
JEL classification: A10, C30, C67
1. Introduction
We consider that an objective reality is one in which, about every item, we can say whether it is true or not. A model of the world can be achieved by using these elements considered true to mentally represent that external world and thus to define a formal model for it. Relations between the elements of the observed world result from some cause (they have a causal nature), that is, from the sensory data received from it and from some form of coding these signals into the formal system. The formal system thus constructed is handled by the mind through what we call inference. The simplest way to represent the components of a system is the input-output (I/O) diagram, which describes a transition modeled mathematically as p: I → O, where each block is an Agent that realizes changes on the inputs to obtain the outputs (Figure 1).
Figure 1. The Representation of Agents as Transaction
We consider that the Agent has two kinds of behavior observable from outside [2]:
- internal, represented by the operations intrinsic to the Agent (learning, reasoning, decision);
- external, represented by the communication and reactions/interactions with other agents and their environment.
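The Agent as a p: I → O transition with internal and external behavior can be sketched minimally as follows. The class and method names are illustrative assumptions, not the authors' implementation; the point is only that the output of one agent can feed the input of another, forming the relational network discussed below.

```python
# Minimal sketch of an Agent as the transition p: I -> O described above,
# with an internal behavior (a trivial 'decision') and an external
# behavior (reacting to a message from another agent). Names are illustrative.

class Agent:
    def __init__(self, name, p):
        self.name = name
        self.p = p                    # the transformation p: I -> O

    def internal(self, inputs):
        # internal behavior: learning / reasoning / decision
        return self.p(inputs)

    def react(self, message, sender):
        # external behavior: reaction to a stimulus from another agent
        return f"{self.name} reacts to {sender.name}: {self.internal(message)}"

a = Agent("A", lambda i: i * 2)
b = Agent("B", lambda i: i + 1)
# the output of one agent becomes the input of another (a relational system)
print(b.react(a.internal(3), a))   # B reacts to A: 7
```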
When we build an agent-based system, the agents behave as the basic blocks of a software system that uses at least two technologies: artificial intelligence (AI) and object-oriented (OO) distributed processing. The resulting software system has a functionality that maps inputs into outputs via pre-established processing and is reactive, in that it focuses on interactions between components as reactions/responses to stimuli from the external world and not, as in traditional applications, on the execution flow (which also dictates the predefined/pre-established reactions). In this "agent society" the agent has the properties of atomicity (it is indivisible), consistency and isolation (it is closed and not affected by environment changes) and durability (its effect is permanent, without any "roll-back"). In this context we can see a system with agents as a network of such interactions, and the relationship system is nothing but a special case of this kind of transition. A way to understand natural and artificial systems, and also to model them, is represented by the structure of the relationships between their composing elements (or parts). In a relational system the outputs of one agent can be inputs for one or many other agents. The transformation p() is a functional component having the property that it exists independently of the material parts that make it possible (of course, the death/damaging of a material part can result in the observer's incapacity of perception, as scientifically demonstrated recently for the soul). Complex systems do not require a central processor. The results of recent studies in neurobiology and cognitive psychology exploit the idea that the emergence of consciousness and self-consciousness depends on the rate of production of "cell meta-assemblies" as the neural result of self-reflection. A mental state is understood as a global system status that is caused by nonlinear local complex interactions of its parts but cannot be reduced to those parts [1]. Topological measurements of these kinds of complex networks have shown a striking similarity to many other types of networks belonging to completely different parts of objective reality (such as the Internet, electrical circuits, food chains etc.).

1 "This paper was co-financed from the European Social Fund, through the Sectorial Operational Programme Human Resources Development 2007-2013, project number POSDRU/159/1.5/S/138907 "Excellence in scientific interdisciplinary research, doctoral and postdoctoral, in the economic, social and medical fields - EXCELIS", coordinator The Bucharest University of Economic Studies".
2. From Local to Global Organization
To pass from a local level of coordination to a global level of organization, we must keep in mind that all interactions between agents inside the complex system will tend to a coherent, stable status until the agents adapt (mutually) to each other. This process generally accelerates due to positive feedback. The reason is that, after two or more agents attain a mutual-fit status, they define a stable assembly to which the remaining agents can adapt by trying to fit into the existing assembly. As the assembly grows (becomes larger), it provides "niches" into which other agents can fit. The more agents join the assembly, the bigger it becomes and the bigger the "niches" offered to joining agents become. Thus the assembly will grow exponentially until it covers the entire global system. This growth is typically more rapid when the agents are identical (for example, molecules of the same substance) or similar (for example, individuals of the same species), because the solution found by one agent will be adopted by the others, so that in the future, once a good local arrangement is found, only minimal "trial and error" will be necessary. This is typical of natural self-organization processes such as crystallization, magnetization or the emergence of coherent laser light. When the agents are all different, each one in its turn must explore to find its unique "niche" in a medium that continuously evolves, resulting in a less explosive development. In the case of identical agents the emerging global structure is typically uniform or regular, because the optimal arrangement for one agent is also optimal for the other agents. As a result, self-organization produces a perfectly ordered pattern. When the agents are different, as in an ecosystem or a market, the resulting structure is more complex and unpredictable.
3. Global Dynamics
If we now consider the system as a whole, rather than individual agents, we can see that the system undergoes variation. Self-organization therefore means that the system has found an attractor, for example a part of the state space which it can enter but cannot leave. In this respect, the attractor is a preferred region of the global dynamics: states surrounding the attractor (the attractor basin) are unstable and will eventually be lost and replaced by states inside the attractor. In nonlinear systems we must keep in mind that the equation of evolution can provide solutions, each of them being nothing but a pattern of behavior. The attractor for such a system represents an equilibrium position that is unique and describes a time-invariant (time-independent) situation. The equilibrium state becomes a universal attractor point. Stability is essentially determined by the response of the system to perturbations.
If X = {X1, X2, …, Xn} is the set of system inputs, then we consider them as evolving in time, Xi(t), and the perturbation is highlighted against their stable state at moment t, to which a correction is added, Xi(t) = Xi,s + xi(t); the system's equation can then be defined as:
dxi/dt = Fi({Xi,s + xi}, λ) − Fi({Xi,s}, λ),
by which the time dynamics of the system (dxi/dt) is defined by the laws of the system's rate of change (Fi) applied to the snapshot of the system's stable state (Xi,s), against which the perturbation (xi) is highlighted as a correction (incrementing or decrementing) of the input, under the condition of considering the changing of the parameters (λ) by the world external to the system (the control parameters) [3].
Self-organization means searching for new attractors that appear when the system is far away from the equilibrium state. When several attractors exist in the same space, each of them has its own attraction basin, containing the set of states starting from which the system goes to that specific attractor.
The coexistence of multiple attractors is common to systems and shows an adapted behavior able to achieve regulating tasks. For systems based on informational message exchanges, we can define the attractor as the communication of information and interaction of some kind with another agent, with the goal of solving a certain type of problems (in the agent's area of specialization).
In human systems, connectivity is reflected by the fact that a decision or an action of an individual (group, organization, institution, human system) will affect all other related individuals in the system. The effect does not have a uniform impact but varies directly with the state of each related individual at a given time.
Connectivity applies to the interrelationship of individuals inside a system and also to kinship human systems. Self-organization in this context is taken to mean a group of individuals gathering together to perform a specific task.
4. Relationship Structure
To be able to determine the regularities of such systems, we must be able to define a graph G = (V, L) for the analyzed system. We denote by V = {vi}, i = 1, 2, …, N, the set of vertices/nodes and by L = {(vi, vj)} the set of links/edges; for the oriented graph connecting the ordered vertex pairs we write Λ = (V, L). The edge {i, j} is the line starting in vertex i and ending in vertex j. Directed edges are called arcs. Two vertices i and j are called adjacent if they are connected by at least one edge. Each node i is characterized by its degree ki, defined as the number of attached edges. Similarly, we define the input degree ki^I as the number of input edges (links) and the output degree ki^O as the number of exiting edges, with ki = ki^I + ki^O. The sum of the degrees of all nodes is an even number. Depending on the existence/inexistence of directionality of the graph edges, the graph is, respectively, directed or undirected. A graph G can be used to represent the structure of a system by considering the system's elements (parts) as nodes and its interactions as edges. If N is the number of vertices (nodes) and L is the number of links of the graph, then the mean degree (Gm) is Gm = 2L/N, because each edge is attached to two nodes.
5. The Network Structure
Conventionally, a network structure is modeled as a graph G which consists of a set of vertices (nodes) V and a set of edges (and/or arcs) M that we define as unordered pairs of distinct vertices. A path (way) in G from node v0 to node ve is defined as an alternating sequence of nodes and edges (v0, m1, v1, …, me, ve), where mi = {vi−1, vi} are the edges realizing the connection to the next nodes, with the condition that no node can be traversed two or more times. Such a path is not necessarily unique. The length of the path is given by the number of its nodes. The degree of a node x is given by the number of edges containing x, that is, the number of its neighbors: deg(x) = |{m ∈ M | x ∈ m}| = |{y ∈ V | {x, y} ∈ M}| = |σ{x}|, where |A| denotes the cardinality (the number of elements) of the set A. If an edge exists between two nodes vi and vj, then the nodes are called adjacent; adjacency is quantified by the term aij = 1 and non-adjacency by the term aij = 0 in the adjacency matrix AN,N of the graph G, denoted A(G).
Figure 2. Nodes Neighboring
The degree of node i is computed as the sum of all entries in row i of the matrix, and the total adjacency of a graph as the sum over all elements of the matrix:
a_i = Σ_{j=1..N} a_ij;  A(G) = Σ_{i=1..N} Σ_{j=1..N} a_ij = Σ_{i=1..N} a_i.
Equivalently, we can define deg(x) as the number of edges incident with the node x, and we can consider the degrees of inputs and outputs. The adjacency of an undirected graph (each edge is seen from both related nodes) is greater than that of a directed one (because there we consider only the link in the direction indicated by the arrow). The mean degree of a node, denoted <a_i>, and the connectivity (Conn) are determined by the formulas:
<a_i> = A(G)/N;  Conn = A(G)/N^2 = 2M/N^2.
The distance d(x, y) is the length of the shortest path in G connecting the node x with node y. If a
connection path between x and y don’ exist we set d(x, y) = . Thus the graph G is connected if and
only if d(x, y) is finite, V x, y ϵ V (x≠y). d(vo, ve) represents the less number of nodes that must be
traversed to attain ve from vo. This number is an integral if we don’t have weighted graphs. The mean
length of a path (l) in a graph with N nodes is given by formula 𝑙 =1
𝑁(𝑁−1)∑ 𝑑(𝑣𝑖, 𝑣𝑗)∀𝑖,𝑗 . In a random
graph the mean length l rise lower the rise of the graph magnitude. The neighbors of a node can be
grouped on categories depending on the number of arcs needed by the path between them, namely
primary neighbors (category zone 1, z1), secondary neighbors (category zone 2, z2), tertiary neighbors
(category zone 3, z3), etc. The set of the neighbors of some node x is σ{x}= {yϵV|{x,y}ϵM}. If we fix
some node x then we can define its neighbors by grouping them depending on the distance between
them as primary, secondary, tertiary, … neighbors that will be included in the corresponding (suitable)
neighboring zones z1, z2, …, zn (Figure 3). The concentric circles with dotted line have the role to
demarcate the neighboring zones (the levels). This representation is suitable to highlight node
adjacency and their degree. To check and verify the neighboring completeness of a specific level we
propose using the Euler’s equation for spheres: if a surface of a sphere is cut into F facets with E edges
and V nodes then we have the equality: V-E+F=2. This equation can be proof by realizing the subgraph
of the interest neighboring order (as for example in Figure 2) and by adding to this fictitious edges
between the nodes of the same level, to form the facets, as illustrated in Figure 3 by the red lines. To
level 2 is defined a dashed fictitious broken line only for understanding reasons in the flat
representation.
In this way we define the facets as if they were obtained by cutting a sphere in which our graph can be
inscribed. For each neighboring zone we count the vertices (nodes), edges and facets and check
the formula. For the example given in Figure 3 we have:
- Primary neighbors: (z1): V=1+6=7; E=6+6=12; F=7, hence V-E+F=7-12+7=2
- Secondary (z2): V=1+6+8=15; E=6+6+16=28; F=6+8+1=15, hence V-E+F=15-28+15=2
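The zone-completeness check above reduces to verifying V − E + F = 2 for each level's counts; a minimal sketch:

```python
def euler_check(v, e, f):
    """Euler's formula for a sphere-like subdivision: V - E + F must equal 2."""
    return v - e + f == 2

# counts from the example in Figure 3
print(euler_check(v=1 + 6, e=6 + 6, f=7))            # primary zone z1
print(euler_check(v=1 + 6 + 8, e=6 + 6 + 16, f=15))  # secondary zone z2
```

A failed check (the function returning False) signals that some node or fictitious edge of the level was missed in the count.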
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
605
Figure 3. The Node Neighbors, Edges and Facets
We can use this representation to analyze attractors, corroborated with measurements of
clustering. The approach is easily adapted to weighted graphs: we define deg(x) as the sum of the
weights of all edges incident to x, and the path length as the sum of the weights of the
edges in the path. According to Amaral et al. (in [4]), the distribution P(k) of the node degrees k = deg(x)
allows the identification of at least three structurally defined types of networks: single-scale,
scale-free and broad-scale networks. Euler's theorem on graphs relates the number of
nodes (vertices) V, edges E, independent cycles C and components K of a graph by C = E − V + K. In
modeling social networks, the high degree of clustering is of great interest, arising because the
friends of a member tend to be friends of each other. Clustering is low for random graphs. The
clustering coefficient of node i, denoted Ci, is defined as the ratio between the number of edges Ei
among the primary neighbors of node i and the maximum number of edges of the complete subgraph
formed on those primary neighbors, Ei(max) = ai(ai−1)/2, where ai is the number of primary
neighbors; hence Ci = 2Ei / (ai(ai−1)). Averaging Ci over the entire network gives
C = (1/N) Σ(i=1..N) Ci = (1/N) Σ(i=1..N) 2Ei / (ki(ki−1)). Similarly, we can consider the
secondary neighbors and determine the corresponding clustering coefficient, and so on for the other neighboring levels.
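The clustering coefficient Ci and its network average can be sketched directly from the definition (a Python illustration, adjacency lists assumed; ai is the number of primary neighbors of node i and Ei the number of edges among them):

```python
def clustering_coefficient(adj, i):
    """Ci = 2*Ei / (ai*(ai - 1)): ai primary neighbors of node i,
    Ei edges among those neighbors (0 by convention when ai < 2)."""
    neigh = set(adj[i])
    a = len(neigh)
    if a < 2:
        return 0.0
    # each edge between two neighbors is seen from both endpoints
    e = sum(1 for u in neigh for v in adj[u] if v in neigh) // 2
    return 2 * e / (a * (a - 1))

def mean_clustering(adj):
    """<C> = (1/N) * sum of Ci over all N nodes of the network."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)
```

A triangle yields Ci = 1 for every node, while a path graph yields 0, matching the intuition that clustering measures how interconnected a node's neighbors are.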
6. Conclusions
Figure 4. The Social System Modeled as Network and the Relationships Inside as Probability Matrix (source
[5])
Any kind of social group or social/economic organization can be represented as a multigraph (one
agent/element in the group can simultaneously have many roles in the same network, even in
relation to the same other agent/element, the agents being denoted previously by nodes or vertices); by
associating with the nodes and edges of the obtained graph the probabilities corresponding to each
property/characteristic, we obtain probability matrices (Figure 4). The multigraph is
decomposed into simple graphs, each with an associated probability matrix. Each simple graph can
be checked using the two Euler formulas introduced previously. By considering the time dimension
and the multitude of relationships we obtain a three-dimensional array composed of the probability
matrices, in which each panel represents a relationship/association (suggested by label 1) for which we
can compute measurements such as mutual entropy, normalized mutual entropy and marginal entropy, or the
proposed equivalents based on informational energy when we use excerpts. Once the
probability matrices are obtained, they can be used as inputs for models that measure internal and external
complexity (as shown in [5] and [6]), for example. This model will be extended and adapted to
model the relationships in the general framework used to integrate the various brain and psyche
models and to make them cooperate.
References
[1] K. Mainzer, (2007) Thinking in Complexity - The Computational Dynamics of Matter, Mind, and
Mankind, ISBN 978-3-540-72228-1, Springer.
[2] H. Lin, (2007) Architectural Design of Multi-Agent Systems: Technologies and Techniques, IGI
Global.
[3] J. Moffat, (2003) Complexity Theory and Network Centric Warfare, CCRP Publication Series.
[4] L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley, (2003), “Classes of small-world
networks”, Applied Physical Sciences, Proceedings of the National Academy of Sciences USA,
doi: 10.1073/pnas.200327197, pp. 11149–11152.
[5] D. Rizescu and V. Avram, (2013). Using Onicescu's Informational Energy to Approximate Social
Entropy. Procedia - Social and Behavioral Sciences, Volume 114, 21 February 2014, ISSN 1877-
0428, pp. 377-381.
[6] V. Avram and D. Rizescu, (2014) Measuring External Complexity of Complex Adaptive Systems
Using Onicescu’s Informational Energy, Mediterranean Journal of Social Sciences, Vol. 5 No.22,
August 2014, Rome-Italy, ISSN 2039-2117, DOI:10.5901/mjss.2014.v5n22p407, pp. 408-417.
A SEMANTIC MOBILE WEB APPLICATION FOR RADIATION
SAFETY IN CONTAMINATED AREAS
Liviu-Adrian COTFAS
University of Franche-Comté, Montbéliard, France
Bucharest University of Economic Studies, Bucharest, Romania
[email protected] / [email protected]
Antonin SEGAULT University of Franche-Comté, Montbéliard, France
Federico TAJARIOL University of Franche-Comté, Montbéliard, France
Ioan ROXIN University of Franche-Comté, Montbéliard, France
Abstract. After a nuclear disaster, people living in contaminated areas encounter numerous
questions concerning the risk they face and how to reduce it. In this paper, we present a mobile
web application designed to facilitate knowledge sharing amongst the affected population. The
system allows querying and browsing a base of documents gathered both by experts and
through crowdsourcing. The information needs are modeled as a set of use-cases, starting from
existing reports on long-term radiation safety. A semantic search engine is used to retrieve
the resources annotated with a thesaurus of concepts relevant to long-term radiation safety.
The application is part of a larger crisis monitoring and management system, which also
includes social media aspect-based emotion and sentiment analysis.
Keywords: crisis communication, post nuclear accident, resilience, semantic web
JEL classification: L86, H12, H84
1. Introduction
Crisis communication can be defined as information sharing to protect the stakeholders from
the negative consequences of a crisis ([1] quoting [2]). Among the various types of crises, either
natural, or man-made, the post accidental phase of a nuclear disaster, known as the PAN phase,
is considered to have very specific characteristics. This phase begins after the end of both the
emergency and transition phases, when radioactive substances are no longer leaking into the
environment, and the contamination of the territories can be globally assessed [3]. As shown
by existing research, the uncertainties of the population living in the contaminated areas focus
on practical questions such as "Is it safe to stay?", "What is safe to eat?", "What is the health
risk?", "How to reduce it?". The information required to answer these questions is both highly
technical and controversial. Moreover, the nuclear risk, man-made, invisible, long-lasting,
leads to particularly high levels of fear [4]. An excessively reassuring crisis communication,
such as the one carried out during the emergency phase of the Fukushima Daiichi disaster by
the Japanese government and TEPCO (the plant's operator), was shown to lead to a loss of trust
[5].
A possible solution for communicating useful knowledge to people in crisis situations consists
in the development of mobile guide applications. Numerous such applications have already
been issued by both governmental agencies and non-governmental organizations. Some of the
most popular applications are the ones developed by the Federal Emergency Management
Agency [6] and the American Red Cross [7].
In this paper, we propose a crowdsourcing system to help the population facing a PAN situation
to access, understand and share the knowledge they need to assess and reduce their
exposure to radiation. Compared to existing approaches, semantic web technologies have been
used both to store the knowledge and to facilitate its discovery through semantic search. While the
crowdsourcing approach constantly supplies the system with the latest information, the
semantic approach provides the necessary structure and information discovery capabilities. The
mobile web application used for accessing the available information is shown in Figure 1.
Figure 1. Mobile web application for the post accidental phase of a nuclear disaster
The paper is organized as follows. The second section focuses on the structure of the semantic
guide, while also highlighting the information needs of the people living in contaminated areas.
In the third section, the semantic search approach is described, while the fourth section presents
the technical approach as well as an overview of the crisis monitoring and management
system to which the presented application belongs. The last section summarizes the paper and
introduces some of the future research directions.
2. Semantic Guide
The main information needs of people living in the contaminated areas were established using
several existing reports on long-term radiation safety [3], [8]. Starting from these
information needs, several scenarios have been constructed, each of them containing a set
of activities such as readings, maps and tools, as shown in Figure 2. The readings are composed
of informative texts and pictures, written either by experts or gathered through crowdsourcing.
Similarly, maps provide contextual information, from both official and crowdsourced sources.
The tools consist of small interactive programs which assist users when calculating internal
and external exposure doses or when sharing assistance and knowledge. The scenarios were
structured with the help of members of the CEPN [9]. During their work in the contaminated
areas of Belarus, through the Ethos [10] and Core [11] rehabilitation projects, they noticed that,
while people's questions often relate to high-level scenarios like "What is the health impact ?"
or "How may I protect myself ?", these questions cannot be answered without addressing the
lower-level scenarios: "Is my environment contaminated?", "How am I exposed?". Therefore,
a meta-scenario modelling the dependencies between the scenarios, shown in Figure 3, is
proposed.
Figure 2. One of the scenarios: "How to measure the contamination of the environment?"
Figure 3. The meta-scenario, articulating the set of nine scenarios
3. Semantic Search
Besides the pre-set browsing paths defined in these scenarios, the system relies on a semantic
search engine allowing users to perform queries on the documents base. Each document
(corresponding to an activity - readings, maps, tools) is annotated with the radiation safety
concepts it addresses. These concepts are defined in a thesaurus of the domain specific
knowledge, stored as a SKOS vocabulary, as shown in Figure 4. The plain text user queries are
automatically annotated using the same set of concepts. The semantic similarity of all
documents is calculated using distances in the graph of concepts [12], [13], [14] and the closest
documents are displayed, as shown in Figure 5. This order can be adjusted through user
feedback, as shown in Figure 8.
Figure 4. The semantic knowledge base
The semantic search engine also relies on a semantic representation of the meta-scenario,
integrated in the domain thesaurus. Each document can thus also be annotated with the scenario
it is related to. When a user performs a query, if the system can identify a relevant scenario, it
can provide information on the other scenarios that may be useful prerequisites, as shown in
Figure 6. The search engine thus acts as a two-level recommendation system.
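As an illustration of searching by distances in the graph of concepts, the sketch below ranks documents by their average concept distance to the query concepts. The thesaurus fragment, concept names and document identifiers are invented for illustration only; they are not the actual SKOS vocabulary used by the system.

```python
from collections import deque

def concept_distance(broader, a, b):
    """Shortest-path distance between two concepts in an (undirected view
    of the) broader/narrower hierarchy; None if they are unconnected."""
    adj = {}
    for child, parent in broader:
        adj.setdefault(child, set()).add(parent)
        adj.setdefault(parent, set()).add(child)
    if a == b:
        return 0
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == b:
                    return dist[v]
                queue.append(v)
    return None

def rank_documents(doc_concepts, query_concepts, broader):
    """Score each document by the average concept distance to the query
    (lower is closer) and return documents sorted from closest to farthest."""
    scores = {}
    for doc, concepts in doc_concepts.items():
        ds = [concept_distance(broader, q, c)
              for q in query_concepts for c in concepts]
        ds = [d for d in ds if d is not None]
        scores[doc] = sum(ds) / len(ds) if ds else float('inf')
    return sorted(scores, key=scores.get)

# hypothetical thesaurus fragment, as (narrower, broader) concept pairs
broader = [("cesium-137", "contamination"),
           ("food-contamination", "contamination"),
           ("dose-measurement", "exposure"),
           ("contamination", "radiation-safety"),
           ("exposure", "radiation-safety")]
docs = {"doc-measuring": ["dose-measurement"],
        "doc-food": ["food-contamination", "cesium-137"]}
print(rank_documents(docs, ["food-contamination"], broader))
```

In production systems, graph distance is usually only one ingredient of the semantic similarity measure (see [12]), but it already captures the intuition that sibling concepts are closer than concepts from distant branches.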
Figure 5. Semantic search results
Figure 6. Guide page showing the prerequisites
Figure 7. Recommended readings
Figure 8. Guide page with user ratings
The semantic search engine is also used to provide new paths within the document base. For
each document, using the semantic annotations, the system can propose a list of related
documents, as shown in Figure 7, allowing a more serendipitous use.
4. Crisis monitoring and management system
Compared to many existing disaster support applications in which the participants can only use
specific mobile devices, the proposed mobile implementation uses the latest web technologies
to offer portability across different platforms as well as a rich user experience. The application
can either be used directly from the browser, or can be installed using a thin native wrapper
that provides the required translation from JavaScript method calls to native methods. It works
on all devices that comply with the HTML5 standard specifications including both smartphones
and feature phones. Among the libraries used to develop the mobile application, Bootstrap and
AngularJS play a central role.
Figure 9. Crisis management and monitoring platform
As shown in Figure 9, the proposed application is part of a larger semantic web based crisis
monitoring and management platform that also includes modules for social media aspect-based
emotion and sentiment analysis. The data is stored as triples using ontologies, and information
such as the points of interest displayed on maps can be accessed by 3rd parties through a public
SPARQL endpoint.
5. Conclusion
In this paper, we presented a mobile web application for knowledge sharing amongst people
living in a post-nuclear-accident situation. The application includes a semantic search engine
allowing users to browse expert and crowdsourced documents that have been annotated with
domain specific concepts. The development of a prototype for the platform is still a work in
progress. Tests will be conducted with potential users to validate the interface design and
to fine-tune the search algorithm. Afterwards, a larger test will be carried out to evaluate
the impact of the system on users' knowledge and attitudes toward radiation safety. While the
system has been designed for assisting people in post-nuclear accident situations, the proposed
approach can be adapted to other types of crises.
Acknowledgment
This study was produced as part of the SCOPANUM research project, supported by grants
from CSFRS (http://csfrs.fr/), and a doctoral grant from Pays de Montbéliard Agglomération
(http://www.agglo-montbeliard.fr/).
References
[1] Barbara Reynolds and Matthew W. Seeger, "Crisis and emergency risk communication
as an integrative model,” J. Health Commun., vol. 10, no. 1, pp. 43–55, 2005.
[2] W.T. Coombs, Ongoing Crisis Communication: Planning, Managing, and Responding.
SAGE Publications, 1999.
[3] CODIRPA, “Rapport du groupe de travail « Culture pratique de radioprotection en
situation post-accidentelle »,” ASN, 2011.
[4] Paul Slovic, “Perception of risk from radiation,” Radiat. Prot. Dosimetry, vol. 68, no. 3–
4, pp. 165–180, 1996.
[5] Jessica Li, Arun Vishwanath, and H Raghav Rao, “Retweeting the Fukushima nuclear
radiation disaster,” Commun. ACM, vol. 57, no. 1, pp. 78–85, 2014.
[6] Federal Emergency Management Agency, “Mobile App | FEMA.gov.” [Online].
Available: http://www.fema.gov/mobile-app. [Accessed: 11-Mar-2015].
[7] American Red Cross, “Red Cross Mobile Apps.” [Online]. Available:
http://www.redcross.org/prepare/mobile-apps. [Accessed: 11-Mar-2015].
[8] SAGE Project, Guidance on Practical Radiation Protection for People Living in Long-
Term Contaminated Territories. 2005.
[9] Centre d’étude sur l’Evaluation de la Protection dans le domaine Nucléaire, “CEPN.”
[Online]. Available: http://www.cepn.asso.fr/en/. [Accessed: 11-Mar-2015].
[10] Centre d’étude sur l’Evaluation de la Protection dans le domaine Nucléaire, “ETHOS -
La réhabilitation des conditions de vie dans les territoires contaminés par l’accident de
Tchernobyl en Biélorussie.” [Online]. Available: http://ethos.cepn.asso.fr/. [Accessed:
11-Mar-2015].
[11] United Nations Development Programme, “UNDP Support Project for the ‘Cooperation
for Rehabilitation’ (CORE) Programme in areas affected by Chernobyl.” [Online].
Available: http://un.by/en/undp/db/00011742.html. [Accessed: 11-Mar-2015].
[12] A. Hliaoutakis, G. Varelas, E. Voutsakis, E. G. Petrakis, and E. Milios, “Information
retrieval by semantic similarity,” Int. J. Semantic Web Inf. Syst. IJSWIS, vol. 2, no. 3, pp.
55–73, 2006.
[13] C. Delcea, R-M. Paun, and I-A. Bradea, "Company’s Image Evaluation in Online Social
Networks," Journal of Internet Social Networking and Virtual Communities, 2014.
[14] G. Orzan, C. Delcea, E. Ioanas and Mihai Cristian Orzan, "Buyers’ Decisions in Online
Social Networks Environment," Journal of Eastern Europe Research in Business &
Economics
PREDICTING EFFICIENCY OF JAPANESE BANKING SYSTEM
USING ARTIFICIAL NEURAL NETWORKS (ANN): DATA
ENVELOPMENT ANALYSIS (DEA) APPROACH
Ionut-Cristian IVAN
Institute for Doctoral Studies,
Bucharest University of Economic Studies
Abstract. Recent directions of research in the efficiency field are mainly connected
with data-mining methodology, thus resulting in hybrid models of efficiency score estimation. The
present paper uses data from 99 Japanese banks while trying to build a learning machine that
could predict, with a given error threshold, the efficiency of a certain DMU (decision making
unit). The initial set of observations is divided into two sub-sets – a training set and a testing
set.
Keywords: Data Envelopment Analysis, Neural Networks, efficiency
JEL classification: C14, C45
This work was cofinanced from the European Social Fund through Sectoral Operational
Programme Human Resources Development 2007-2013, project number POSDRU
159/1.5/S/134197 ”Performance and excellence in doctoral and postdoctoral research in
Romanian economics science domain”.
1. Introduction
The Japanese banking system is a unique system, still recovering from the late '80s economic
bubble that made stock and real-estate prices drop dramatically, and also a system that
experienced economic disequilibrium during the recent financial crisis. The incongruity of the Japanese
banking system comes from the following sources: the existence of special financial
institutions (keiretsu) that act as intermediaries between the financial environment and certain
firms (always the same firm or group of subsidiaries), the restrictions preventing foreign banks
from operating across Japan, the degree of granularity, and the existence of an obvious classification of banks
according to their branches' spread across the country (inter-regional and intra-regional).
The main idea of this article revolves around the development of a learning machine that could
predict, with a certain degree of error, the estimation of the efficiency score associated with a
bank.
Recently there haven’t been studies that dealt with the idea of implementing data mining
techniques in order to obtain efficiency estimates, but, nonetheless, the following articles must
be considered: “Technical and scale efficiency of Japanese commercial banks: A non-
parametric approach” (Fukuyama, 1993) and “Efficiency in Japanese banking: An empirical
analysis” (Drake and Hall, 2003). The above mentioned articles use data envelopment analysis
(DEA), as a non-parametric tool, to calculate the efficiency scores, based on linear
programming models [1] [2].
In contrast to these articles, this paper uses newer data, extracted from the banks' 2012
reported income statements, and also introduces supervised learning in order
to help predict the efficiency score of a newly analyzed bank [3].
The article is structured as follows. The second section gives a short introduction to the
research literature. The third section presents in a simple and structured way the main notions that were
used to develop the learning machine. The fourth section is dedicated to the application of the
aforementioned methodology, with a short introduction to the DEA results. The last section concludes
the research.
2. Literature Review
The foundations of efficiency techniques were laid by Farrell in his 1957 work,
"The measurement of productive efficiency", in which he introduced the idea of
calculating efficiency measures relative to the convex hull covering the set of observations
[4]. In 1978, Charnes, Cooper and Rhodes introduced the term DEA, defining a model with an
input orientation and constant economies of scale. The model was more of a theoretical one
that could not be applied to the real economy, mainly because it assumed that all firms operate
at optimal scale. The next step in the development of related non-parametric techniques
was made in 1983, when Banker, Charnes and Cooper included in the linear
programming model the assumption of variable economies of scale. This new model can
differentiate between technical efficiency and scale efficiency, and can also specify in which
part of the economies of scale a firm is found [5].
3. Methodology
Prior to the development of the artificial learning machine, the research started with the
application of non-parametric techniques in order to obtain DEA estimates over the observation
set. The simplest linear programming model that can be used to obtain the efficiency scores for
an input orientated approach with constant returns to scale is stated in Coelli's "An
introduction to efficiency and productivity analysis" (2005) and has the following form (Eq. 1):

min(θ,λ) θ
subject to: −qi + Qλ ≥ 0,
θxi − Xλ ≥ 0, (Eq. 1)
λ ≥ 0,

where θ is a scalar, Q is the matrix of outputs, X is the matrix of inputs and λ is a vector of constants; in
this case, θ is the efficiency score attached to a certain decision making unit [6].
Supplementary constraints are added to ensure the convexity of the envelopment over
the production feasible set, relative to which the efficiency scores are calculated using the
concept of distance. Practically, the introduction of the convexity constraint on the production set
assures the presence of variable returns to scale.
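A minimal sketch of solving Eq. 1 with an off-the-shelf LP solver (here scipy's linprog; the data at the end is a made-up toy set, not the Japanese bank sample). The optional convexity constraint Σλ = 1 switches the model to variable returns to scale.

```python
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Q, i, vrs=False):
    """Input-oriented DEA efficiency of unit i (Eq. 1):
    min theta  s.t.  Q @ lam >= q_i,  theta * x_i >= X @ lam,  lam >= 0,
    plus sum(lam) == 1 under variable returns to scale."""
    m, n = X.shape          # m inputs x n units
    s, _ = Q.shape          # s outputs x n units
    c = np.zeros(1 + n)     # decision variables: [theta, lam_1..lam_n]
    c[0] = 1.0              # minimize theta
    # output constraints: -Q lam <= -q_i
    A_out = np.hstack([np.zeros((s, 1)), -Q])
    b_out = -Q[:, i]
    # input constraints: -theta x_i + X lam <= 0
    A_in = np.hstack([-X[:, [i]], X])
    b_in = np.zeros(m)
    A_eq = b_eq = None
    if vrs:                 # convexity constraint: sum(lam) == 1
        A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
        b_eq = np.array([1.0])
    res = linprog(c, A_ub=np.vstack([A_out, A_in]),
                  b_ub=np.concatenate([b_out, b_in]),
                  A_eq=A_eq, b_eq=b_eq,
                  bounds=[(None, None)] + [(0, None)] * n,
                  method="highs")
    return res.x[0]

# tiny illustrative data: 1 input, 1 output, 3 units
X = np.array([[2.0, 4.0, 5.0]])   # inputs
Q = np.array([[1.0, 2.0, 2.0]])   # outputs
scores = [round(dea_efficiency(X, Q, i), 3) for i in range(3)]
print(scores)
```

In this toy case the output/input ratios are 0.5, 0.5 and 0.4, so the first two units are efficient (θ = 1) and the third is not.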
In the development of the neural network that will be used for predictions, I use the
signal back propagation algorithm [7] in a form of a feed-forward network with bias.
For a simpler representation of the signal back propagation algorithm, I will consider
the following notations:
L1 - the first layer of the neural network (input layer) – associated with index i and p -
number of neurons;
L2 - the second layer of the neural network (hidden layer) – associated with index j and
r - number of neurons;
L3 – the third layer of the neural network (output layer) – associated with index k and
s - number of neurons;
xi, xj, xk – input signal that enters the L1, L2, L3 layers;
yi, yj, yk - the output signal of the L1, L2, L3 layers;
wij - synaptic weights between input layer and hidden layer;
Proceedings of the IE 2015 International Conference
www.conferenceie.ase.ro
615
wjk - synaptic weights between the hidden layer and output layer;
fj - activation function of the neurons in the hidden layer;
fk - activation function of the neurons in the output layer;
ϑ – training rate, ϑϵ(0,1);
Step 1. Initialize the weights wij, wjk with random values; choose the training rate and the
maximum tolerable error (as a condition to stop the algorithm);
Step 2. Consider a pair (xn, yn) as a network training pair, where n ∈ {1, 2, ..., N};
Step 3. Determine the activation values of the hidden layer neurons, xj = Σ(i=1..p) wij·xi,
and the output values of the hidden layer, yj = fj(xj);
Step 4. Determine the activation values of the neurons in the output layer, xk = Σ(j=1..r) wjk·yj,
and the output values, yk = fk(xk);
Step 5. Depending on the desired output value ŷk and the actual obtained value yk, the
error term is computed for the output layer's neurons: δk = fk'(xk)(ŷk − yk);
Step 6. The error term is calculated for the hidden layer's neurons: δj = fj'(xj) Σ(k=1..s) δk·wjk;
Step 7. The L3 layer's weights are updated: wjk = wjk + ϑ·δk·yj;
Step 8. The L2 layer's weights are updated: wij = wij + ϑ·δj·xi;
STOP. The algorithm stops when the error falls below the maximum tolerable error set at Step 1.
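The steps above can be sketched as follows: a minimal Python implementation with sigmoid activations and explicit bias terms (the paper's network is "with bias"). The AND example at the end is illustrative only, not the banking data.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, p, r, s, rate=1.0, max_error=0.01, max_epochs=10000, seed=0):
    """Signal back-propagation (Steps 1-8) for a p-r-s feed-forward network
    with sigmoid activations; `samples` is a list of (input, target) pairs."""
    rng = random.Random(seed)
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(r)] for _ in range(p)]
    w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(s)] for _ in range(r)]
    b_j, b_k = [0.0] * r, [0.0] * s
    for _ in range(max_epochs):
        total = 0.0
        for x, t in samples:
            # Steps 3-4: forward pass through hidden and output layers
            y_j = [sigmoid(b_j[j] + sum(w_ij[i][j] * x[i] for i in range(p)))
                   for j in range(r)]
            y_k = [sigmoid(b_k[k] + sum(w_jk[j][k] * y_j[j] for j in range(r)))
                   for k in range(s)]
            # Steps 5-6: error terms; for the sigmoid, f'(x) = y * (1 - y)
            d_k = [y_k[k] * (1 - y_k[k]) * (t[k] - y_k[k]) for k in range(s)]
            d_j = [y_j[j] * (1 - y_j[j]) *
                   sum(d_k[k] * w_jk[j][k] for k in range(s)) for j in range(r)]
            # Steps 7-8: update weights (and biases) of layers L3 and L2
            for j in range(r):
                for k in range(s):
                    w_jk[j][k] += rate * d_k[k] * y_j[j]
            for i in range(p):
                for j in range(r):
                    w_ij[i][j] += rate * d_j[j] * x[i]
            b_k = [b_k[k] + rate * d_k[k] for k in range(s)]
            b_j = [b_j[j] + rate * d_j[j] for j in range(r)]
            total += sum((t[k] - y_k[k]) ** 2 for k in range(s))
        if total < max_error:  # STOP condition set at Step 1
            break
    return w_ij, b_j, w_jk, b_k

def forward(x, w_ij, b_j, w_jk, b_k):
    p, r, s = len(w_ij), len(b_j), len(b_k)
    y_j = [sigmoid(b_j[j] + sum(w_ij[i][j] * x[i] for i in range(p)))
           for j in range(r)]
    return [sigmoid(b_k[k] + sum(w_jk[j][k] * y_j[j] for j in range(r)))
            for k in range(s)]

# toy example: learn logical AND with a 2-2-1 network
data = [((0, 0), (0,)), ((0, 1), (0,)), ((1, 0), (0,)), ((1, 1), (1,))]
net = train(data, p=2, r=2, s=1)
predictions = [round(forward(x, *net)[0]) for x, _ in data]
```

The paper's 6-6-1 network is obtained by setting p = r = 6 and s = 1 and feeding in the DEA inputs and outputs.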
4. Empirical Results
While computing the DEA efficiency scores, I used the Fethi and Pasiouras approach,
which considers revenues from the income statements as output variables and expenses as input
variables [8]. Thus, I selected net income, interest received and revenues from fees and
commissions as outputs, and expenses with fees and commissions, expenses with provisions
and interest paid as inputs. I also work under variable returns to scale with an input
orientation when computing the distances from the observations to the convex hull covering the data
cloud. I chose the input orientation because banks can change their inputs more easily than
their outputs.
Before computing the efficiency scores, I first checked whether there are any outlier
observations that could lead to erroneous efficiency scores.
I considered the inclusion of the analyzed observations in the smallest possible hypersphere.
The problem was solved by reducing the dimensionality of the representation to a two- or three-
dimensional space and by using the Lagrange method for the minimization of the circle/sphere
that contains our observations (certain aggregation methods can also be used to move
the analyzed system into a bi-dimensional space). Once the minimum hypersphere is obtained and
its volume V calculated, we compute the n volumes Vi obtained by removing one
observation at a time, where n equals the number of studied observations. Considering the i-th
observation, if the ratio Vi/V tends to 0, then the i-th observation is an outlier. A similar test
is performed by Bogetoft and Otto [9], but with an approximation of the volume as the
determinant of a compound matrix formed from the input and output matrices. The dimensionality
reduction can be performed by applying the methodology of auto-associative neural
networks and/or nonlinear principal component analysis; I suggest the generalized principal
component analysis model. Applying the Oja algorithm [10], which updates the synaptic
weights wn according to the input values xn, the outputs yn and a learning rate
η, following the formula (Eq. 2):
Δw = wn+1 − wn = η·yn·(xn − yn·wn) (Eq. 2)
on a neural network with error back propagation will lead to a stable vector of synaptic
weights corresponding to the first principal component. The algorithm can also be generalized
to compute more principal components by inserting a number of layers on which Oja's rule can
be applied. The resulting output variances converge to the eigenvalue associated with the
computed principal component as the number of iterations tends to infinity (as long as the
activation function is differentiable in the inputs and in the synaptic weights).
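Oja's rule (Eq. 2) can be sketched on toy two-dimensional data; under the stated convergence conditions the weight vector aligns with the first principal component. The data below is synthetic and for illustration only, stretched along the direction (1, 1).

```python
import numpy as np

def oja_first_component(data, eta=0.01, epochs=200, seed=0):
    """Oja's rule (Eq. 2): w <- w + eta * y * (x - y * w), with y = w . x.
    For centered data, w converges to the first principal component."""
    rng = np.random.default_rng(seed)
    x = data - data.mean(axis=0)         # center the observations
    w = rng.normal(size=x.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for xn in x:
            y = w @ xn
            w += eta * y * (xn - y * w)  # Eq. 2
    return w / np.linalg.norm(w)

# synthetic observations stretched along the direction (1, 1)
rng = np.random.default_rng(1)
t = rng.normal(size=300)
data = np.column_stack([t + 0.05 * rng.normal(size=300),
                        t + 0.05 * rng.normal(size=300)])
w = oja_first_component(data)
print(np.round(np.abs(w), 2))  # expected to lie close to (0.71, 0.71)
```

The resulting w can be compared against the leading eigenvector of the sample covariance matrix to confirm the convergence.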
After applying the above methods, I found three outliers, which were removed from further
analysis. These outliers were actually the keiretsu banks, which formed a separate cluster.
After applying the DEA methodology, I obtained an average efficiency score of
0.754, meaning that banks could decrease their inputs by 24.6% and still obtain the same output.
Compared with the articles mentioned in the Introduction, the overall average score brings this
research closer to Drake and Hall (2003), who obtained 0.72, than to Fukuyama (1993) [1] [2].
The learning machine has the form of a feed-forward neural network with bias, as can be seen in
Figure 1, in a 6-6-1 layer formation with six neurons on the input and hidden layers and a
single neuron on the output layer. The neural network has its inputs derived from the initial
DEA inputs and outputs, and the output neuron gives the efficiency estimate.
Figure 1. Architecture of the neural network
As can be seen, the training phase ended with a 0.044 error term. Prior to the training phase,
I defined a partition following a Bernoulli probability distribution, which determined the
observations to be used in the training process. Thus, 78% of the total number of observations
were used as training observations and the remainder as testing observations.
The neuralnet package in R also allows the easy computation of a neural network's output,
given the matrix of associated weights, through the "compute" function. I applied this function
on the testing set and obtained a 0.07 average error.
Further, I split the training set into four classes, according to the obtained efficiency
estimates, in order to check the accuracy of the defined neural network when predicting new
DMUs.
Figure 2. Overall statistics for prediction accuracy
It can be observed that the system predicts the correct cluster with an accuracy of 93%, while
the probability associated with H0: accuracy = 0, tends to 0. Also, Cohen’s Kappa shows a high
level of agreement between classes.
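Both the overall accuracy and Cohen's Kappa reported above can be derived from the confusion matrix of actual versus predicted classes; a minimal sketch (the 2x2 matrix at the end is illustrative, not the actual four-class results):

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows: actual class, columns: predicted class)."""
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    p_o = np.trace(cm) / n                            # observed agreement
    p_e = (cm.sum(axis=0) @ cm.sum(axis=1)) / n ** 2  # chance agreement
    return p_o, (p_o - p_e) / (1 - p_e)

acc, kappa = accuracy_and_kappa([[10, 1], [0, 9]])
```

Kappa corrects the raw accuracy for the agreement expected by chance from the class marginals, which is why it is the stronger indicator of agreement between classes.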
5. Conclusions
The article proposed the development of a learning machine that can correctly predict the
efficiency score of a new DMU. It was observed that the neural network had good results in the
training and testing phases, with small associated errors.
Future research should address the convergence of the neural network estimates.
References
[1] Fukuyama, H., “Technical and scale efficiency of Japanese commercial banks: A non-
parametric approach”, Applied Economics 25, 1101–1112, 1993
[2] Drake L., Hall M., “Efficiency in Japanese banking: An empirical analysis, Journal of
Banking Finance”, 891-917, 2003
[3] http://ediunet.jp/
[4] Farrell M.J. "The Measurement of Productive Efficiency", Journal of the Royal Statistical
Society, 253-290, 1957
[5] Charnes, A., Cooper, W.W., “Polyhedral cone––ratio DEA models with an illustrative
application to large commercial banks”, Journal of Econometrics 46, 73–91, 1990
[6] Coelli T., Prasada D.S., “An introduction to efficiency and productivity analysis”, Springer,
2005
[7] Rumelhart D., Hinton G., Williams R., "Learning representations by back-propagating
errors", Nature 323 (6088), 1986
[8] Fethi M., Pasiouras F. “Assessing bank efficiency and performance with operational
research and artificial intelligence techniques: A survey”, European Journal of
Operational Research, 189-198, 2009
[9] Bogetoft P., Otto L., “Benchmarking with DEA, SFA and R”, Springer, New York, 2010
[10] Haykin S., “Neural networks – a comprehensive foundation”, Prentice Hall, 1999
SEMANTIC RELATIONS BETWEEN AUTHORSHIP, DOMAINS
AND CULTURAL ORIENTATION WITHIN TEXT DOCUMENT
CLASSIFICATION
Mădălina ZURINI
Bucharest University of Economic Studies
Abstract. The present paper addresses the problem of author term document classification
considering an additional level of analysis, the semantic distribution of terms and senses
extracted within the initial training set of documents. In order to achieve a high performance
classification of documents upon authors and domains, an author-domain oriented model is
discussed regarding the advantages brought to the general accuracy of the text classifier.
Defining the model leads to analyzing the relations between the assignment of documents to
authors, the documents' orientations to domains and cultural approaches. A comparison between
term oriented classification and term distribution within each analyzed domain is conducted
in order to evaluate the best method for integrating it in a wider approach of authorship and
cultural orientation of scientific articles. The result of the present paper consists in a modeling
of database tables, relations and attributes used in the processes of text classification and
author assignment. Various instructions for extracting the information needed for the text
supervised classification are defined for creating a procedure that will be further used in the
authorship and cultural orientation application. The main results of different researches
conducted by internationals authors are briefly presented for a possible extension of the
present original model. Wordnet lexical ontology is introduced in the model for generating a
term-senses superior level used for describing the distribution of senses varying by the general
domain oriented scientific papers.
Keywords: authorship, text document classification, generative models for documents, cultural orientation features, semantic analysis

JEL classification: C89, B16
1. Introduction

Information retrieval, machine learning and statistical natural language processing deal, among other tasks, with content extraction from text documents. The extracted content is further used in
problems such as text classification into specific domains, document clustering for obtaining sets of documents with similar content, authorship assignment and so on. Depending on the document representation (terms, terms and senses, or text features), the results of text processing can be used to organize, classify or search a collection of documents.
Combining the author level, the document (text) level and the domain level, using generative models for documents, a range of important questions can be answered, such as: the subjects an author writes about, the authors likely to have written a document, and domain intersection, referring to the similar sets of characteristics of documents assigned to different domains.
Adding the level of stylometry to current approaches in document analysis increases the accuracy of assignment to authors, domains and topics. The stylometry level also captures the diversity of an author's vocabulary, leading to a model for assigning text documents to a set of authors and, further, to cultural orientation analysis.
Authorship attribution, the science of inferring characteristics of the author from the
characteristics of documents written by that author, as presented in [3], is a problem with a
long history and a wide range of applications.
In this paper, a general document – author – domain approach is presented as a starting point for an extension towards cultural orientation. In addition, semantic analysis is introduced into the model using the WordNet lexical ontology. WordNet is used for documents written in English, with the possibility of identifying the contextual senses of words that have multiple senses. It has been shown that transforming the term level into a sense level increases the accuracy of information extraction from text documents, leading to higher performance in subsequent tasks that use the documents' representation, such as supervised and unsupervised classification. The paper presents the classification process in its general form, afterwards exemplifying the domains and codomains of the classification function with problems such as topics, subjects, authors, domains and cultures.
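As an illustration of identifying the contextual sense of a multi-sense word, the sketch below uses a Lesk-style gloss-overlap heuristic over a toy two-sense inventory. The sense identifiers and glosses are hypothetical stand-ins for WordNet synsets, not the paper's actual disambiguation procedure.

```python
# Simplified Lesk-style sense disambiguation over a toy sense inventory.
# The gloss dictionary is a hypothetical stand-in for WordNet synsets.
TOY_SENSES = {
    "bank": {
        "bank.n.01": "financial institution that accepts deposits and lends money",
        "bank.n.02": "sloping land beside a body of water such as a river",
    },
}

def disambiguate(word, context_words, senses=TOY_SENSES):
    """Pick the sense whose gloss overlaps most with the context words."""
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(set(gloss.split()) & set(context_words))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense

context = "the river bank was covered in water".split()
print(disambiguate("bank", context))  # bank.n.02 (river/water overlap)
```

A real sense level would replace the toy glosses with WordNet synset glosses, but the overlap principle stays the same.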
In Section 2, generative models for documents are discussed, combining authors and topics into the author-topic model. This model describes the relations between a set of known authors and a set of given topics, and narrows the wide representation space of topics by relating each author to the topics discussed in the documents previously assigned to him.
Section 3 reveals the structure of a database for modelling the process of document representation, authorship assignment and cultural orientation. Using the generative models described in Section 2, the main attributes, relations, cardinalities and main information extraction functions are obtained. The main clustering and classification algorithms are also inserted into the database model, highlighting where the parameter results are to be stored. Section 4 focuses on the conclusions drawn from the present research, highlighting future work relating cultural orientation models to research papers written in English by authors from different cultures.
2. Generative models for text documents

Much of the prior work on multi-label document classification uses data sets in which there are relatively few labels and many training instances for each label [8]. In [1], the generative
models based on the multivariate Bernoulli and multinomial distributions are presented as widely used methods for document representation. More recent research uses spherical k-means algorithms, which have desirable properties for document clustering, as a special case of generative models. Generative models of text typically associate a multinomial distribution with every class label or topic [7], [2].
The statistical analysis of style, stylometry, as described in [4], is based on the assumption that every author's style has certain features that are inaccessible to conscious manipulation. These features are therefore considered to provide the most reliable basis for the identification of an
author. The style of an author may vary over time because of the differences in topics or genre
and personal development. In general, stylometry should identify features which are invariant
to these effects but are expressive enough to discriminate an author from other writers.
Two different approaches to extracting stylometry from text documents are: models that use distinctive stylistic features characterizing a specific author, and models that focus on extracting the general semantic content of a document rather than the stylistic details of how it was written.
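A minimal sketch of the first kind of model is given below, computing two classic stylistic features: average word length and the type-token ratio as a proxy for vocabulary diversity. The feature choice is illustrative only, not the feature set used in [4].

```python
def stylometric_features(text):
    """Compute simple style features that are hard to manipulate consciously:
    average word length and type-token ratio (vocabulary diversity)."""
    words = text.lower().split()
    avg_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    return {"avg_word_len": round(avg_len, 2),
            "type_token_ratio": round(type_token_ratio, 2)}

print(stylometric_features("the model the author the topic"))
# {'avg_word_len': 4.17, 'type_token_ratio': 0.67}
```

Feature vectors like this one can then be fed to any standard classifier for author assignment.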
Probabilistic generative models reduce the process of writing a text document to a simple series of probabilistic steps. The first step in building these models involves
reducing the text document to a bag of words together with their counts within the document. The bag of words can be formed from the n most frequent words in the document. A suitable value of n is chosen by maximizing the percentage of information retained by the words while minimizing the number of word features used for describing and modelling the managed objects.
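The bag-of-words reduction described above can be sketched as follows; the tokenization (lowercased whitespace splitting) is a simplifying assumption.

```python
from collections import Counter

def bag_of_words(text, n):
    """Reduce a document to its n most frequent words with their counts."""
    words = text.lower().split()
    return dict(Counter(words).most_common(n))

doc = "topic model topic author model topic word"
print(bag_of_words(doc, 2))  # {'topic': 3, 'model': 2}
```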
Figure 1 presents the generative model integrating the author, domain and bag-of-words levels.
For the domain and words levels, a probability is calculated in order to use it for further analysis
when new text documents are used as input for the classification process. The probabilities are
calculated using an initial set of documents used for the model’s training.
Figure 1. Graphical model for author-domain-bag of words levels
For each author Ai in the database, a direct relation exists with the topic level. This relation is translated into a set of probabilities of author Ai writing a scientific article about a certain topic Tt. The word-topic level refers to the relation between each topic and the set of words obtained from the training set of documents. The author-topic-semantic level connects each author to each topic and its words through a probability linking these levels.
Table 1 contains the description of each variable used in Figure 1.
Table 1. Description of variables used

Variable    Description
pAT_ij      Probability of a document written by author i to be of topic j
pWAT_ijk    Probability of a word k to be found in an article of topic j written by author i
pWAT_jk     Probability of a word k to be found in an article of topic j regardless of the author that wrote the article
Given a new document D, its abstract is divided into words. For each word, the probability is computed in order to decide which topic the document D is most suitably assigned to:
T_D = max_j ∏_{k=1}^{Card W} pWAT_ijk,

where T_D is the most suitable topic to assign to the new document D and Card W is the number of words into which the abstract of document D was divided.
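The maximization above can be sketched as follows, computed in log space to avoid numerical underflow when multiplying many small probabilities; the per-topic word probabilities below are hypothetical illustrative values, not trained parameters.

```python
import math

def assign_topic(words, p_wat):
    """Return the topic j maximizing prod_k pWAT[j][word_k], in log space.

    p_wat maps topic -> {word: probability}; unseen words receive a small
    smoothing value, mirroring the +1 smoothing used in the text.
    """
    eps = 1e-6
    best_topic, best_score = None, float("-inf")
    for topic, word_probs in p_wat.items():
        score = sum(math.log(word_probs.get(w, eps)) for w in words)
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic

# Hypothetical per-topic word probabilities for two topics.
p_wat = {
    "databases": {"table": 0.4, "query": 0.3, "index": 0.2},
    "nlp": {"word": 0.4, "topic": 0.3, "corpus": 0.2},
}
print(assign_topic(["table", "query"], p_wat))  # databases
```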
3. Database model for authors, domains and semantic layers

Starting from the general description of the generative model used for representing and classifying scientific articles presented in Section 2, a database structure is derived.
The main points of the analysis are transformed into layers of representation: authors, topics and words. Between each pair of layers a many-to-many relation exists. For example, an author can write articles associated with multiple topics, while a topic can be covered by various authors. When dealing with articles written by more than one author, the information obtained from the article, the topic, and the words and senses are processed for each author in the list of authors.
Figure 2 presents the database used for processing, representing and classifying scientific articles. The model is structured in an open way with respect to adding new authors, topics, documents and words. The tables AuthorTopic and WordTopicAuthor transform the initial many-to-many relations among tables into one-to-many relations suitable for extracting the needed information.
Figure 2. Proposed database for modelling author, topic and word layers
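A minimal sketch of the schema suggested by Figure 2, expressed as SQLite DDL executed from Python. The column names follow the SQL statements in Table 2; the remaining columns and types are assumptions for illustration.

```python
import sqlite3

# Minimal sketch of the proposed schema; column names follow the SQL
# statements in Table 2 and are partly assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Author (idAuthor INTEGER PRIMARY KEY, name TEXT,
                     noDocuments INTEGER);
CREATE TABLE Topic  (idTopic INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Word   (idWord INTEGER PRIMARY KEY, word TEXT);
-- Junction tables resolving the many-to-many relations.
CREATE TABLE AuthorTopic (idAuthor INTEGER REFERENCES Author,
                          idTopic INTEGER REFERENCES Topic,
                          noDocuments INTEGER,
                          PRIMARY KEY (idAuthor, idTopic));
CREATE TABLE WordTopicAuthor (idWord INTEGER REFERENCES Word,
                              idTopic INTEGER REFERENCES Topic,
                              idAuthor INTEGER REFERENCES Author,
                              noAppearance INTEGER,
                              PRIMARY KEY (idWord, idTopic, idAuthor));
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

The composite primary keys on the junction tables enforce one counter row per author-topic and per word-topic-author combination.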
Combining the proposed database model with the generative model is done by calculating the probabilities associated with topics, words and authors. For each formula, a SQL statement is presented in Table 2.
Table 2. SQL statements for probabilities from the generative model

Variable: pAT_ij

    ((SELECT noDocuments FROM AuthorTopic
      WHERE idAuthor = i AND idTopic = j) + 1)
    / ((SELECT noDocuments FROM Author WHERE idAuthor = i)
     + (SELECT COUNT(idTopic) FROM Topic))

Variable: pWAT_ijk

    ((SELECT noAppearance FROM WordTopicAuthor
      WHERE idWord = k AND idTopic = j AND idAuthor = i) + 1)
    / ((SELECT SUM(noAppearance) FROM WordTopicAuthor
      WHERE idWord = k AND idAuthor = i)
     + (SELECT COUNT(idTopic) FROM Topic))

Variable: pWAT_jk (pWAT_ijk averaged over all authors)

    SELECT AVG(((SELECT noAppearance FROM WordTopicAuthor
      WHERE idWord = k AND idTopic = j AND idAuthor = a.idAuthor) + 1)
    / ((SELECT SUM(noAppearance) FROM WordTopicAuthor
      WHERE idWord = k AND idAuthor = a.idAuthor)
     + (SELECT COUNT(idTopic) FROM Topic)))
    FROM Author a
The presented probabilities are computed and used for assigning to topics new documents written by the various authors available in the initial database and training set of articles.
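The smoothed probability pAT_ij from Table 2 can be sketched in plain code as follows, mirroring the formula (n_ij + 1) / (n_i + number of topics); the counts below are illustrative, not taken from a real database.

```python
def p_at(author_topic_docs, author_docs, n_topics, i, j):
    """Laplace-smoothed probability that a document by author i has topic j,
    mirroring pAT_ij from Table 2: (n_ij + 1) / (n_i + |Topics|)."""
    n_ij = author_topic_docs.get((i, j), 0)  # documents by author i on topic j
    n_i = author_docs[i]                     # all documents by author i
    return (n_ij + 1) / (n_i + n_topics)

# Illustrative counts: author 1 wrote 8 documents, 3 of them on topic 2,
# and the database holds 4 topics.
author_topic_docs = {(1, 2): 3, (1, 3): 5}
author_docs = {1: 8}
print(p_at(author_topic_docs, author_docs, 4, 1, 2))  # (3+1)/(8+4) ≈ 0.333
```

The +1 smoothing guarantees a nonzero probability even for author-topic pairs never seen in the training set.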
4. Conclusions

The proposed model for extracting and storing the semantic relations between authors, domains and cultural orientation can be used in solving natural language processing problems such as clustering, classification and information extraction within a given set of documents, authors and domains. Using this model, the accuracy of processing is increased by adding the levels of semantic analysis and author-domain relations. Future work focuses on using this model in an authorship assignment application and in the extraction of authors' cultural orientation. This extraction is further used for creating a modelling solution that describes the main semantic interactions between authors from different cultures and the main topics they write about. It is also used in describing the connections between the research conducted by the analyzed set of authors, extrapolating to the whole community.
Furthermore, the present model will be considered as an input base for text document processing tasks such as plagiarism analysis, both corpus-based and intrinsic, and near-duplicate detection, as presented in [6], which exemplifies plagiarism analysis on pages from the World Wide Web.
Acknowledgment

"This work was financially supported through the project 'Routes of academic excellence in doctoral and post-doctoral research - READ', co-financed through the European Social Fund, by the Sectoral Operational Programme Human Resources Development 2007-2013, contract no. POSDRU/159/1.5/S/137926."
References

[1] S. Zhong and J. Ghosh, "A comparative study of generative models for document clustering", in Proceedings of the Workshop on Clustering High Dimensional Data and Its Applications, SIAM Data Mining Conference, 2003
[2] M. Rosen-Zvi, "The author-topic model for authors and documents", in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2004, pp. 487-494
[3] P. Juola, "Authorship attribution", Foundations and Trends in Information Retrieval, Vol. 1, no. 3, pp. 233-234, 2006
[4] J. Diederich, J. Kindermann, E. Leopold and G. Paass, "Authorship attribution with support vector machines", Applied Intelligence, Vol. 19, no. 1-2, pp. 109-123, 2003
[5] D. Lewis, G. Agam, S. Argamon, O. Frieder, D. Grossman and J. Heard, "Building a test collection for complex document information processing", in SIGIR '06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, 2006, pp. 665-666
[6] B. Stein, M. Koppel and E. Stamatatos, "Plagiarism analysis, authorship identification, and near-duplicate detection PAN'07", ACM SIGIR Forum, Vol. 41, no. 2, 2007, pp. 68-71
[7] J. Eisenstein, A. Ahmed and E.P. Xing, "Sparse additive generative models for text", in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 1041-1048
[8] T. Rubin, A. Chambers, P. Smyth and M. Steyvers, "Statistical topic models for multi-label document classification", Machine Learning, Vol. 88, no. 1-2, 2012, pp. 157-208