Business Process Modelling is an Essential Part of a Requirements Analysis. Contribution of EFMI...

10
34 IMIA Yearbook of Medical Informatics 2012 © 2012 IMIA and Schattauer GmbH Summary Objectives: To perform a requirements analysis of the barriers to con- ducting research linking of primary care, genetic and cancer data. Methods: We extended our initial data-centric approach to include socio-cultural and business requirements. We created reference models of core data requirements common to most studies using unified mod- elling language (UML), dataflow diagrams (DFD) and business proc- ess modelling notation (BPMN). We conducted a stakeholder analysis and constructed DFD and UML diagrams for use cases based on simu- lated research studies. We used research output as a sensitivity analysis. Results: Differences between the reference model and use cases iden- tified study specific data requirements. The stakeholder analysis identified: tensions, changes in specification, some indifference from data providers and enthusiastic informaticians urging inclusion of socio-cultural context. We identified requirements to collect informa- tion at three levels: micro- data items, which need to be semantically interoperable, meso- the medical record and data extraction, and macro- the health system and socio-cultural issues. BPMN clarified complex business requirements among data providers and vendors; and additional geographical requirements for patients to be repre- sented in both linked datasets. High quality research output was the norm for most repositories. Conclusions: Reference models provide high-level schemata of the core data requirements. However, business requirements’ modelling identifies stakeholder issues and identifies what needs to be ad- dressed to enable participation. Keywords General practice; medical records systems, computerized; research design; registry; records as topic Yearb Med Inform 2012:34-43 Business Process Modelling is an Essential Part of a Requirements Analysis Contribution of EFMI Primary Care Working Group S. de Lusignan 1,3 , P. Krause 2 , G. Michalakidis 2 , M. Tristan Vicente 3 , S. Thompson 1 , M. McGilchrist 4 , F. Sullivan 4 , P. van Royen 5 , L. Agreus 6 , T. Desombre 1 , A. Taweel 7 , B. Delaney 8 1 Department of Health Care Management and Policy, University of Surrey, Guildford, Surrey, UK 2 Department of Computing, University of Surrey, Guildford, Surrey, UK 3 Primary Care Informatics, Division of Population Health Sciences and Education, St. George’s – University of London, London, UK 4 Division of Clinical & Population Sciences and Education, The Mackenzie Building, Dundee, Scotland 5 Dept of Primary and interdisciplinary care, University of Antwerp, Antwerpen (Wilrijk), Belgium 6 Dept of Neurobiology, Care Sciences and Society, Karolinska Institutet, Huddinge, Stockholm, Sweden 7 Department of Informatics and Public Health, King’s College London, London, UK 8 Department of Primary Care and Public Health Sciences, London, UK Introduction Requirements analyses are part of soft- ware engineering with the purpose of reducing the chances of failure or the need for frequent expensive changes further along the development path- way. A requirements analysis is the process of determining what a system must do, acknowledging the at times conflicting needs of the various stakeholders [1, 2]. Conventionally conducting a requirements analysis was an early step in the waterfall approach to software development [3, 4, 5]. However, "agile" approaches which in- clude user and stakeholder involvement throughout development [6] and recog- nise the need for flexibility and change [7] have been used more and also com- bined with the waterfall approach. Medical research is a complex proc- ess and although there are many com- puterised data repositories, they are not commonly linked in research studies [8]. These data repositories include: genetic databases, disease registries, and, routinely collected primary care data. Genetic data has been collected into biobanks since the completion of the mapping of the human genome [9, 10], though biological specimens have been used in forensic science for longer [11, 12]. Cancer registries are also well established [13, 14] and more recently have become international and used to study of a wide range of conditions [15], and the effectiveness of treat- ments [16]. Primary care data are also widely used in research. These data are rich, collected longitudinally and in countries with registration systems pro- vide a population denominator [17, 18]. Research networks have been devel- oped within primary care, at regional and national level and have improved the capability and capacity for research in primary care; including recruitment into trials [19, 20]. There are many models, sometimes they are part of National or regional sentinel networks [21], or of single vendor linked data collection schemes [15]. There have been a number of at- tempts to link these different types of data for research [22, 23, 24, 25], but the successes have largely been lim- ited to the use of specific methods in bioinformatics [26, 27]. Objective We carried out this requirements analy- sis to identify how to maximise the use For personal or educational use only. No other uses without permission. All rights reserved. Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

Transcript of Business Process Modelling is an Essential Part of a Requirements Analysis. Contribution of EFMI...

34

IMIA Yearbook of Medical Informatics 2012

© 2012 IMIA and Schattauer GmbH

SummaryObjectives: To perform a requirements analysis of the barriers to con-ducting research linking of primary care, genetic and cancer data.Methods: We extended our initial data-centric approach to includesocio-cultural and business requirements. We created reference modelsof core data requirements common to most studies using unified mod-elling language (UML), dataflow diagrams (DFD) and business proc-ess modelling notation (BPMN). We conducted a stakeholder analysisand constructed DFD and UML diagrams for use cases based on simu-lated research studies. We used research output as a sensitivity analysis.Results: Differences between the reference model and use cases iden-tified study specific data requirements. The stakeholder analysisidentified: tensions, changes in specification, some indifference fromdata providers and enthusiastic informaticians urging inclusion ofsocio-cultural context. We identified requirements to collect informa-tion at three levels: micro- data items, which need to be semanticallyinteroperable, meso- the medical record and data extraction, andmacro- the health system and socio-cultural issues. BPMN clarifiedcomplex business requirements among data providers and vendors;and additional geographical requirements for patients to be repre-sented in both linked datasets. High quality research output was thenorm for most repositories.Conclusions: Reference models provide high-level schemata of thecore data requirements. However, business requirements’ modellingidentifies stakeholder issues and identifies what needs to be ad-dressed to enable participation.

KeywordsGeneral practice; medical records systems, computerized; researchdesign; registry; records as topic

Yearb Med Inform 2012:34-43

Business Process Modelling is an Essential Partof a Requirements AnalysisContribution of EFMI Primary Care Working Group

S. de Lusignan1,3, P. Krause2, G. Michalakidis2, M. Tristan Vicente3, S. Thompson1,M. McGilchrist4, F. Sullivan4, P. van Royen5, L. Agreus6, T. Desombre1, A. Taweel7, B. Delaney8

1 Department of Health Care Management and Policy, University of Surrey, Guildford, Surrey, UK2 Department of Computing, University of Surrey, Guildford, Surrey, UK3 Primary Care Informatics, Division of Population Health Sciences and Education, St. George’s –University of London, London, UK

4 Division of Clinical & Population Sciences and Education, The Mackenzie Building, Dundee, Scotland5 Dept of Primary and interdisciplinary care, University of Antwerp, Antwerpen (Wilrijk), Belgium6 Dept of Neurobiology, Care Sciences and Society, Karolinska Institutet, Huddinge, Stockholm, Sweden7 Department of Informatics and Public Health, King’s College London, London, UK8 Department of Primary Care and Public Health Sciences, London, UK

IntroductionRequirements analyses are part of soft-ware engineering with the purpose ofreducing the chances of failure or theneed for frequent expensive changesfurther along the development path-way. A requirements analysis is theprocess of determining what a systemmust do, acknowledging the at timesconflicting needs of the variousstakeholders [1, 2]. Conventionallyconducting a requirements analysis wasan early step in the waterfall approachto software development [3, 4, 5].However, "agile" approaches which in-clude user and stakeholder involvementthroughout development [6] and recog-nise the need for flexibility and change[7] have been used more and also com-bined with the waterfall approach.

Medical research is a complex proc-ess and although there are many com-puterised data repositories, they are notcommonly linked in research studies[8]. These data repositories include:genetic databases, disease registries,and, routinely collected primary caredata. Genetic data has been collectedinto biobanks since the completion ofthe mapping of the human genome [9,10], though biological specimens havebeen used in forensic science for longer

[11, 12]. Cancer registries are also wellestablished [13, 14] and more recentlyhave become international and used tostudy of a wide range of conditions[15], and the effectiveness of treat-ments [16]. Primary care data are alsowidely used in research. These data arerich, collected longitudinally and incountries with registration systems pro-vide a population denominator [17, 18].

Research networks have been devel-oped within primary care, at regionaland national level and have improvedthe capability and capacity for researchin primary care; including recruitmentinto trials [19, 20]. There are manymodels, sometimes they are part ofNational or regional sentinel networks[21], or of single vendor linked datacollection schemes [15].

There have been a number of at-tempts to link these different types ofdata for research [22, 23, 24, 25], butthe successes have largely been lim-ited to the use of specif ic methods inbioinformatics [26, 27].

ObjectiveWe carried out this requirements analy-sis to identify how to maximise the use

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

IMIA Yearbook of Medical Informatics 2012

35

Business Process Modelling is an Essential Part of a Requirements Analysis

of IT in conducting complex researchprojects linking aggregated primarycare data to a disease registry or a ge-netic data repository.

Methods1 OverviewWe adopted an agile approach to al-low for any alterations in the require-ments as other parts of the project de-veloped [28]. We simultaneouslyconducted a literature review and heldexpert consensus workshops [29, 30].We developed a reference model andmodels for data flow and of the busi-ness process involved. We conducteda stakeholder analysis that includedinteracting with potential data provid-ers and electronic health record (EHR)system vendors.

2 TRANSFoRm ResearchProgrammeThis investigation was part of theTRANSFoRm (Translational Researchand Patient Safety In Europe) researchprogramme [31]. TRANSFoRm is de-signed to reduce the barriers to con-ducting research using routine healthdata and promote interoperability be-tween different health computer sys-tems so that international data can becollected about the quality of care andfor research [32]. One element ofTRANSFoRm was a requirementsanalysis to assess the feasibility of con-ducting research studies set out in theform or use cases.

3 Creating a Reference Model for theResearch Process and StudiesWe separately modelled the use casedef ined research studies, using theUnif ied Modelling Language (UML)[33]. We excluded consent and dataprivacy issues as they were being dealt

with elsewhere within the project [34].We developed a schema of the proc-esses associated with a linked data re-search project as a reference model(Figure 1) [35] based on our experi-ence of extracting, aggregating andprocessing routinely collected data [36,37]. Additionally we recognised thatlinked research required the same in-dividuals to be represented in bothlinked databases; as cases present inonly one database could not be usedfor linked research. We called this linka "geographical requirement" (Figure2) and included it within our businessrequirements.

We created UML and data flow dia-grams (DFD) for the reference modelat a high level of abstraction, so wecould contrast them with individualstudy use cases [38, 39, 40].

4 Stakeholder AnalysisWe identified the following stakeholdergroups and summarised our findings ina mind map and stakeholder analysisgrid (Figure 3, Table 2):

(1) End users of the linked datasets; spe-cifically those looking to run the usecase defined linked-data studies.

(2) Direct and indirect providers of thesedata. Direct suppliers include da-tabase owners or other direct dataproviders to a study. We spoke indepth to ten. Indirect data providersinclude those recording or aggregat-ing data earlier in the process.

(3) The TRANSFoRm study team andtheir funders, and

(4) External experts, principally fromthe International Medical Infor-matics Association (IMIA) and Eu-ropean Federation for MedicalInformatics (EFMI) primary careinformatics working groups.

5 Use Case TestingWe analysed the TRANSFoRm use casesto define the data requirements and theintegrity of the use cases, constructing aDFD and UML models for each. Theuse cases were in two clinical domains,type-2 diabetes and acid-reflux relatedoesophageal disease. The type-2 diabe-

Fig. 1 Schema of the process involved in conducting a linked data research study

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

36

IMIA Yearbook of Medical Informatics 2012

de Lusignan et al.

(Table 2). There were challenges ingetting direct data providers to engagewith the process. We made 316 con-tacts with data providers we thoughtmight participate and only 56 (17.7%)supplied the comprehensive informa-tion def ined by the requirementsanalysis. The proportion of valid re-sponses from primary care data pro-viders was 26.4% (29/110); from can-cer registries 16.9% (15/89); and fromgenetic data bases 10.3% (12/117).Many of the data repositories were re-gional not national and already sus-tained by a viable programme of re-search; most were producing highgrade research output. Their dataprocessing systems had grown or-ganically as EHR systems had be-come more sophisticated. They weregenerally proprietary, not usingstandard metadata, and functioningas "black boxes" between data entryand export for research. However, allcould output data in a range of stand-ard formats.

We identif ied and contacted 17EHR vendors identif ied through ourinitial searches and a further nineflagged by stakeholders. EHR ven-dors were reluctant to engage unlessthere were some chance of benef itsrealisation including a priori publica-tion of criteria for inclusion in futureTRANSFoRm studies. One primarycare EHR system reported it has anopen application programme interface(API). The lack of open APIs makesthis approach to interaction with EHRsmore challenging as it might requireseparate licenced APIs to be developedwith each vendor.

TRANSFoRm investigators haddifferent perspectives reflecting theirvarying backgrounds; they werelargely clinical or information sys-tems researchers. These differenceswere nuanced, but the former gener-ally described the project as a com-plex clinical process, while the lat-ter looked to rationalise things intomachine processable activities. TheTRANSFoRm senior investigatorswere trying to co-ordinate the "big

Fig. 2 BPMN schema of the key business decisions which inform whether a data repository is likely to participate in a TRANSFoRm research study

tes cohort study explored the risk of com-plications and response to oral medica-tion, a cohort study linking primary careand genetic data. The second domain,gastro-oesophageal reflux disease(GORD), included two use cases link-ing primary care and cancer registrydata. One was a randomised control-led trial of on-demand compared withcontinuous use of proton pump inhibi-tors (PPIs) a class of anti-indigestionmedicine. The other was a nested case-control study to explore the associationof GORD symptoms and the use of PPIswith developing Barrett’s disease andcancer of the oesophagus.

6 Defining the Scope of theInformation Needed fromCandidate DatabasesWe def ined the scope of the informa-tion we would need to collect from can-didate data providers; though askingstakeholders to def ine what they con-sidered to be essential requirements.

7 Business Process ModellingLate on in our process we introducedBusiness Process Modelling Notation(BPMN) to define business requirements.

The stakeholder analysis informed usthese were important and they were ne-glected in our initial programme design.

8 Sensitivity AnalysisWe collected data about peer review pub-lications produced by data repositoriesbeing considered for inclusion in theTRANSFoRm research studies. We didthis because that track record may bepredictive of future success. Finally, weconducted a review of what elements ofour requirements analysis contributed toour final specifications.

9 Ethical ConsiderationsThis investigation did not involve anydirect contacts with patients or accessto medical records. No specif ic ethicalconsent was deemed necessary.

Results1 Stakeholder AnalysisWe identif ied key stakeholders andused a mind map to capture key con-cepts (Figure 3) and summarised theirviews in a stakeholder analysis grid

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

IMIA Yearbook of Medical Informatics 2012

37

Business Process Modelling is an Essential Part of a Requirements Analysis

Fig. 3 Mind map of stakeholders and issues

Table 1 Summary stakeholder analysis

1

2

3

4

Project Stakeholder

End users / researchers

Direct Data providers(Data repositories andnetworks)

Indirect Data providers(Citizens, Patients, GPs)

Study team & funders

External experts

Specific Information Needs

Will data deliver study?1. Unique linked patient records;2. Case definition (Coding);3. Outcome variables;4. Inclusion & exclusion criteria5. Study specific - controls,randomisation, patient link

1. Return on investment of time -studies, funding2. Strategic interest

1. About use of their data2. Maintenance of confidentiality/privacy

1. Progress towards achieving milestonesin project2. Requirements of next steps in project3. Meeting own institutions objectives

Academic interest in how and whetherthis can be done

Best Source of InformationNeeded

1. Own study protocol2. Relevant use case3. Track record of provider

Who will get a funded study?Track record of researcherNetwork / repository owner

1. TRANSFoRm study protocol2. Work package outputs

Simulations & studies - proof ofconcept

Planned Method of Delivery

Require a third party to link data

Engage once benefits

Largely via network / repository

1. Connecting data via TRANSFoRmengine2. Provenance engine

Informal input / advice re: projects

Timing Considerations

Want to avoid any delays inresearch process

Have standard processableoutputs readily available

Challenging to changearrangement

Critical nature of dependenciesfrom previous work packages

Variable

picture" and focused on project mile-stones and repor ts to funders. Therewere also inevitable changes inproject scope. Important changes forthe requirements analysis were the in-clusion of cancer registries and adop-tion of contextual provenance withthe requirement to describe "Howthat data came to be ." This change

in scope of provenance extended be-yond what is def ined within the usecases as it included influences priorto recording.

The Informatics working g roupmembers also broadened the scopeof the requirements. They arguedthat human factors and the widercontext of the health system, legal

and socio-cultural factors all influencethe process and should be includedin the requirements analysis [23, 24].Their view was that contextual prov-enance whilst important could onlyapply to data that were recorded andthat a broader contextual overviewmight better explain why data werepresent or not.

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

38

IMIA Yearbook of Medical Informatics 2012

de Lusignan et al.

2 Use Case AnalysisWe compared the use cases developedfor the TRANFoRm project with thereference model schema we constructedfor a generic research study to identifyspecif ic data requirements.

The diabetes use case varied littlefrom the reference model schema otherthan we needed to add a prescribingdatabase. The UML model reflected thatthis was a cohort study where a subsetof the population is identif ied (Figure4(a)). The GORD use cases were morecomplex, also reflecting the study type(Figures 4(b) and 4(c)).

4(a) In this use case genetic andprimary care data about type-2 dia-betes (T2D) are linked; <<uses>> des-ignates that the information is requiredat least once. 4(b) Case-control studyto exploring the associate of GORD,use of PPIs, and oesophageal adeno-carcinoma. 4(c) GORD RCT compar-ing on-demand and regular PPI

The diabetes study has an almost iden-tical DFD to that in the reference modelschema (Figure 5(a)). The GORD case-control study DFD (Figure 5(b)) onlydiffered from the diabetes study DFD asa result of the addition of randomisation.The data flows in the RCT were muchmore complex and reflected a differentapproach to collecting data (Figure 5(c)).In this use case, the GP or the researchermined their EHR data to identify andconsent patients to participate at the be-ginning of the study, whereas in the otherstudies there is no GP-patient interaction.

3. Scope of the InformationRequired to Link CandidateDatabases:We identif ied three levels of granular-ity of information required to assess ifa data source could be used to partici-pate in a research project (Table 2):(1) Micro-level requirements were a de-

tailed description of the type data source,the data itself, the potential for linkageand achieving semantic interoperabilitybetween data sources [42];

(2) Meso-level requirements includedreporting methods of data ex-traction [43, 44], details of theEHR systems used and their ar-chitecture [45], audit trails thatmight provide information aboutdata provenance, and the size ofthe database;

(3) Macro-level issues related to thenature of the health system andsocio-cultural issues.

Study specif ic requirements were de-f ined by differences between the ref-erence model and use cases.

4 Scope of the BusinessRequirementsWe identif ied and integrated into ourrequirements analysis a number of busi-ness requirements that went beyond thedata requirements for conducting astudy (Figure 6). Most of what weframed as business requirements f itwithin the scope of a research network.The business process modelling wascritical in developing an understandingof the lack of engagement, and howconcentration on the data model ignoredthe need for a business case.

Fig. 4 UML use case model giving an overview research process for: (a) Diabetes (cohort study), (b) GORD (case control) and (c) GORD (randomisedcontrolled trial) use cases

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

IMIA Yearbook of Medical Informatics 2012

39

Business Process Modelling is an Essential Part of a Requirements Analysis

5 Sensitivity AnalysisWe used publication of peer-reviewedstudies as an index of a functional busi-ness process. Eight of the ten dataproviding stakeholders who providedfull information could provide evi-dence of work produced from theirdata and indexed in the bibliographicdatabase Medline. Four data providersestimated between 30 and 100 peer-reviewed publications had been pub-lished in the last five years. Of the re-

Fig. 5 DFD for the TRANSFoRm use cases

maining four, two estimated 11 to 20and two estimated 2 to 5 publications.

6 Contributions to the RequirementsAnalysisFinally, we report how the differentinputs contributed to the f inal require-ments analysis (Table 2). With the ex-ception of socio-cultural issues, all therequirements were identif ied by morethan one source.

Discussion1 Principal FindingsThe requirements for l inking com-plex heterogeneous databases are ex-tensive, and failure to foresee thebusiness requirements resulted instake-holder disengagement. In ad-dition to the anticipated core data re-quirements we have identif ied geo-graphical and socio-culturalrequirements. The use case, datadriven model, helped def ine whethera database was suitable to conduct apar ticular research study. However,we also required information aboutthe geographical overlap betweendata repositories and readiness to par-ticipate. We used a reference modelto identify generic as well as studyspecif ic requirements. Local lan-guage and the regional nature ofsome data repositories are additionalchallenges to conducting interna-tional research. Many of the success-ful data collection systems were pro-prietary and saw no reason to moveto more open methods and systems;they believed this entailed risk andexpense but with no obvious benef it.Only after we included BMPN in ourapproach did we manage to ration-alise and frame these requirements.

2 Implications of the FindingsThe data requirements from the usecase analysis identif ied most dataneeded to conduct research, but maymiss important business issues whichmay constrain involvement. Whilstgeneric reference models for re-search studies make sense of theprocesses and data flows; the busi-ness requirements of the data reposi-tory owners and EHR vendors alsoneed to be modelled and met. Re-searchers who design data projectsthat involve linking databases needto make sure that they include in-centives for participation otherwisethey risk low levels of engagement.

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

40

IMIA Yearbook of Medical Informatics 2012

de Lusignan et al.

One strategy to overcome this mightbe creating a research network infra-structure to register researchers anddata providers wishing to conductlinked data studies that might pro-vide a forum to facilitate and brokersolutions.

3 Comparison with the LiteratureLinking heterogeneous data is challeng-ing [46]. Use cases and UML have usedin research reporting adverse events inclinical trials [47] and medical imageanalysis pilots [48]. The OntoRAT toolkit

is a working example of the use ofontologically driven systems [49]. Suchontologically driven systems might bestbe developed using object-orientated ap-proaches as has been used in specialistcancer management [50] and for check-ing healthcare domain models [51].

Fig. 6 Detailed BPMN diagram to demonstrate the steps in ascertaining if a data source meets the business requirements

Table 2 Scope of data collection defined by the requirements analysis

Method &control of data

extraction

Record system& information

model

Data extractedDemographics ID

Coded DataFree-text

Data quality

Taxonomy of errorsassociated with data

extraction

PCROMPrimary Care

Research Model

OpenEHRArchetypes

www.openehr.org

Use case/protocol andEPR architecture

ISO 11179Metadata

http//metadata-standards.org/11179

Meso-level

Macro-level

Organisationallevel

Socio-culturalfactors

Primary HealthCare System

Socio-culturaldimensio to what isdisease & presented

to primary care

Wider socialinfluences

On data availability/ consent & access

to data

Fragmentationof care

Separate providers

CoverageUnivesal/limited

Management ofAmbulatory care

sensitive conditions

Studyspecific

Use-Casecompatibility

Diabetes studyPrimary care &genetic data

Barretsoesophagus &medicatoin

Micro-level

Dara source /Data type

DataInteroperability

level

Genetic data

Primary Care /Cancer Registrydata

All types of data

CDISC Clinical DataExchange Consortium- www.cdisc.org

BRIDG BiomedicalResearch IntegratedDomain Group -www.bridg.org

HL7 Health LevelSeven - www.hl7.org

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

IMIA Yearbook of Medical Informatics 2012

41

Business Process Modelling is an Essential Part of a Requirements Analysis

Although widely used in softwareengineering, the modelling tools wepropose to be used have been littleused in the health domain; despitecalls that they should be made moreaccessible to health researchers [52].DFD have been used little in healthcare though they have been proposedas a component of object orientatedhealth information management system[53]. UML has been used the most. Ithas been proposed as a tool for avoid-ing integration problems [54] and formodelling research studies [55], butthus far with little evidence of ben-ef it [56]. BPMN is now consideredan appropriate tool for modellingdigital pathology and care pathwaysbut has not been proposed for mod-elling health research [57, 58].

4 Limitations of the MethodWe could have just used UML. TheUML alternatives are either to de-velop a domain model or use UMLstate-machine or f inite state machinediagrams [26-8]. Business architec-ture models (BAM) have been applied

to the life sciences to help generateUML models [59]; and other tailoredpackages have been used to generateUML diagrams for bioinformaticsresearch [60] and to generate genotype-phenotype maps [61].

Our requirements analysis pro-duced a non-executable set of mod-els and methods for sorting candi-date databases into those potentiallyuseable in linked data research. Wedid not def ine these steps in suff i-cient detail to be able to set them outusing business process executablelanguage (BPEL) specif ications [62],BPEL has been used within health-care [63], and might help reducebarriers to potential researchers anddata providers.

5 Call for Further ResearchThe user requirements developed needto be tested prospectively. Where therequired activities can be published withclearly def ined interfaces it will bepossible to produce an executable modelsuch as BPEL.

ConclusionsWe def ined generic user requirementsfor research studies l inking healthdata based on a reference model,which can be extended for specif icstudies; using UML use case dia-g rams and DFD. Additionally wehave identif ied geographical require-ments and the importance of socio-cultural context. However, the keylearning from this exercise is the im-portance of modelling the businessprocess and including this within a re-quirements analysis. Failure to includeand recognise the importance of mod-elling the business process as part ofrequirements analysis risks lack of en-gagement by intended participants andincreases the risk of project failure.

AcknowledgementsIMIA and EFMI for supporting theirprimary care informatics workinggroups. Elena Crecan for her contribu-tion to the research. TRANSFoRm ispart-f inanced by the European Com-mission - DG INFSO (FP7 2477).Antonis Ntasioudis for assistance withthe diagrams and modelling.

Data sharing agreementThe limited quantitative data collectedas part of this study have been placedon a database at University of Dundeewith the Work Package lead (FS); ini-tially for further analysis within theTRANFoRm project. The generic mod-els developed as part of this study arefreely available for use from this pa-per or from the clinical informaticsw e b s i t e : h t t p : / / w w w. c l i n i n f . e u /refmodel/ - users are requested to citethis paper.

References1. Cadle J, Paul D, Turner P. Business analysis

techniques, 72 Essential Tools for Success.Swindon: BCS; 2010.

2. Alexander I, Stevens R. Writing betterrequirements. London: Addison-Wesley; 2002.

3. Krause P, de Lusignan S. Procuring interoperabilityat the expense of usability: a case study of UK

Table 3 Contribution of use cases and stakeholder analysis to final requirements

Data level/ type

Use cases

Stakeholderanalysis:

End users

DataProviders

Study Team

Externalexperts

Patientleveldata

x

xx

xx

xx

x

Metadata&Extraction

x

xx

x

xx

Healthserviceorganisa-tional

x

xx

Diabetes& GORDstudies

xx xx

xx

Sensitivityanalysis

Track recordpeer reviewpublications

xx

x

x

xx

Codingsystem &Interoper-ability

xx

xx

x

Recordsystem &informationmodel

x

x

x

xx

Socio-cultural& legal

xx

Non-usecase

xx

x

Micro Meso Macro Study Specific

Key: x=contribution to requirements; xx=major contribution to requirements

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

42

IMIA Yearbook of Medical Informatics 2012

de Lusignan et al.

National Programme for IT assurance process. StudHealth Technol Inform 2010;155:143-9.

4. Hood C, Wiedemann P, Fichtinger S, Pautz U.The interface between requirements developmentand all other systems engineering processes.Berlin; Springer-Verlag; 2010.

5. Reddy M, Pratt W, Dourish P, Shabot M.Sociotechnical requirements analysis for clinicalsystems. Methods Inf Med 2003;42(4):437-44.

6. Montabert C, McCrickard D, Winchester W, Perez-Quinones M. An integrative approach torequirements analysis: How task models supportrequirements reuse in a user-centric designframework. Interact Comput Aug 2009;21(4):304-15. DOI: 10.1016/j.intcom.2009.06.003

7. Surendra N. Using an ethnographic process toconduct requirements analysis for agile systemsdevelopment. Inf Technol Manag 2008;9(1):55-68. DOI 10.1007/s10799-007-0026-6

8. Matteson S, Paulauskis J, Foisy S, Hall S, DuvalM. Opening the gate for genomics data into clinicalresearch: a use case in managing patients’ DNAsamples from the bench to drug development.Pharmacogenomics 2010;11(11):1603-12.

9. International Human Genome SequencingConsortium.Initial sequencing and analysis of thehuman genome. Nature 2001;409(6822):860–921

10. International Human Genome SequencingConsortium. Finishing the euchromatic sequenceof the human genome. Nature 2004;431(7011):931-45.

11. Hirtzlin I, Dubreuil C, Préaubert N, Duchier J,Jansen B, Simon J, et al; EUROGENBANKConsortium. An empirical survey on biobankingof human genetic material and data in six EUcountries. Eur J Hum Genet 2003;11(6):475-88.

12. McEwen J. Forensic DNA data banking by statecrime laboratories. Am J Hum Genet 1995;56(6):1487-92.

13. Schneider B. Progress in cancer control throughcancer registries. CA Cancer J Clin 1958;8(6):207-10

14. Navarro C, Martos C, Ardanaz E, Galceran J,Izarzugaza I, Peris-Bonet R, et al.; SpanishCancer Registries Working Group. Population-based cancer registries in Spain and their rolein cancer control. Ann Oncol 2010;21 Suppl3:iii3-13.

15. Gatta G, Capocaccia R, Trama A, Martínez-GarcíaC; RARECARE Working Group. The burden ofrare cancers in Europe. Adv Exp Med Biol2010;686:285-303.

16. Francisci S, Capocaccia R, Grande E,Santaquilani M, Simonetti A, Allemani C, etal.; EUROCARE Working Group. The cure ofcancer: a European perspective. Eur J Cancer2009 Apr; 45(6):1067-79.

17. de Lusignan S, Chan T. The development ofprimary care information technology in the UnitedKingdom. J Ambul Care Manag 2008 Jul-Sep;31(3):201-10.

18. de Lusignan S, van Weel C. The use of routinelycollected computer data for research in primarycare: opportunities and challenges. Fam Pract2006;23(2):253-63.

19. Clement S, Pickering A, Rowlands G, Thiru K,

Candy B, de Lusignan S. Towards a conceptualframework for evaluating primary care researchnetworks. Br J Gen Pract 2000;50(457):651-2.

20. Peterson KA, Fontaine P, Speedie S. The ElectronicPrimary Care Research Network (ePCRN): a newera in practice-based research. J Am Board FamMed 2006;19(1):93-7.

21. Boffin N, Bossuyt N, Vanthomme K, Van CasterenV. Readiness of the Belgian network of sentinelgeneral practitioners to deliver electronic healthrecord data for surveillance purposes: results ofsurvey study. BMC Fam Pract 2010;11(1):50.

22. Friedman C, Hripcsak G, Johnson SB, CiminoJJ, Clayton PD. A generalized relational schemafor an integrated clinical patient database.Proceedings 14th Annual Symposium onComputer Applications in Medical Care. LosAlamitos, CA: IEEE Computer Society Press;1990. p. 335-9.

23. Prokosch HU, Ganslandt T. Perspectives formedical informatics. Reusing the electronicmedical record for clinical research. Methods InfMed 2009;48:38-44.

24. Burgun A, Bodenreider O. Accessing andintegrating data and knowledge for biomedicalresearch. Yearb Med Inform 2008:91-101.

25. Ohmann C, Kuchinke W. Future developments ofmedical informatics from the viewpoint ofnetworked clinical research. Interoperability andintegration. Methods Inf Med 2009;48(1):45-54.

26. Mirhaji P, Zhu M, Vagnoni M, Bernstam EV,Zhang J, Smith JW. Ontology driven integrationplatform for clinical and translational research.BMC Bioinformatics 2009;10 Suppl 2:S2.

27. Wang X, Liu L, Fackenthal J, Cummings S,Olopade OI, Hope K, et al. Translational integrityand continuity: personalized biomedical dataintegration. J Biomed Inform 2009;42(1):100-12.

28. Huo M, Verner J, Zhu L, Ali Barbar M. Softwarequality and agile methods. Proceedings of 28thAnnual International Computer Software andApplications Conference (COMPSAC’04)2004;Vol 1: 520-5. http://doi.ieeecomputersociety.org/10.1109/CMPSAC.2004.1342889

29. de Lusignan S, Peace C, Shaw N, Liaw S-T,Michalakeidis G, Vincente M, et al. What are thebarriers to conducting International research usingroutinely collected primary care data? Stud HealthTechnol Inform 2011;165:135-40. DOI: 10.3233/978-1-60750-735-2-135

30. de Lusignan S, Liaw S-T, Krause P, Curcin V,Vincente M, Michalakidis G, et al. Key conceptsto assess the readiness of data for Internationalresearch: Data quality, lineage and provenance,extraction and processing errors, traceability, andcuration. Yearb Med Inform 2011;6(1):112-20.

31. TRANSFoRm (Translational Research andPatients safety In Europe). URL: http://www.transformproject.eu

32. European Union, Europe’s Information Society.Communication from the Commission to theCouncil, the European Parliament, the EuropeanEconomic and Social Committee and theCommittee of the Regions - e-Health - makinghealthcare better for European citizens: an actionplan for a European e-Health Area{SEC(2004)539}; 2004. URL: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?

uri=CELEX:52004DC0356:EN:NOT33. Kumarapeli P, de Lusignan S, Ellis T, Jones B.

Using Unified Modelling Language (UML) as aprocess-modelling technique for clinical-researchprocess improvement. Med Inform Internet Med2007 Mar;32(1):51-64.

34. de Lusignan S, Chan T, Theadom A, Dhoul N.The roles of policy and professionalism in theprotection of processed clinical data: a literaturereview. Int J Med Inform 2007 Apr;76(4):261-8.

35. Blobel B. Ontologies, knowledge representation,artificial intelligence – hype or prerequisites forInternational pHealth interoperability. Stud HealthTechnol Inform 2011;165:11-20. DOI: 10.3233/978-1-60750-735-2-11

36. van Vlymen J, de Lusignan S, Hague N, Chan T,Dzregah B. Ensuring the quality of aggregatedgeneral practice data: lessons from the PrimaryCare Data Quality Programme (PCDQ). StudHealth Technol Inform 2005;116:1010-15.

37. van Vlymen J, de Lusignan S. A system of metadatato control the process of query, aggregating,cleaning and analysing large datasets of primarycare data. Inform Prim Care 2005;13(4):281-91.

38. Podeswa H. UML™ for the IT Business Analyst:A Practical Guide to Object-Oriented RequirementsGathering. Boston, MA: Thompson Coursetechnology PTR; 2005.

39. Fakhroutdinov K. UML, Use Case Diagrams.URL: http://www.uml-diagrams.org/use-case-diagrams.html

40. Ambler S. UML2 Use Case Diagrams. URL:http:/ /www.agilemodeling.com/ar tifacts/useCaseDiagram.htm

41. Object Management Group (OMG) businessmanagement initiative. Business processmodelling notation (BPMN), version 2. URL:http://www.bpmn.org/

42. Dolin RH, Alschuler L. Approaching semanticinteroperability in Health Level Seven. J Am MedInform Assoc 2011 Jan 1;18(1):99-103.

43. Michalakidis G, Kumarapeli P, Ring A, vanVlymen J, Krause P, de Lusignan S. A system forsolution-orientated reporting of errors associatedwith the extraction of routinely collected clinicaldata for research. Stu Health Technol Inform2010;160:724-8.

44. Nadkarni, P.M., Brandt, C. Data Extraction and AdHoc Query of an Entity-Attribute-Value Database. JAm Med Inform Assoc 1998; 5:511-7

45. Santos MR, Bax MP, Kalra D. Building a logicalEHR architecture based on ISO 13606 standardand semantic web technologies. Stud HealthTechnol Inform 2010;160(Pt 1):161-5.

46. Castano S, De Antonellis V. Global viewing ofheterogeneous data sources. IEEE Transactionson Knowledge and Data Engineering2001;13(2):277-97. DOI: http://dx.doi.org/10.1109/69.917566

47. London JW, Smalley KJ, Conner K, Smith JB.The automation of clinical trial serious adverseevent reporting workflow. Clin Trials 2009;6(5):446-54.

48. Estrella F, Hauer T, McClatchey R, Odeh M,Rogulin D, Solomonides T. Experiences ofengineering Grid-based medical software. Int JMed Inform 2007;76(8):621-32.

49. Al-Hroub Y, Kossmann M, Odeh M. Developing

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2

IMIA Yearbook of Medical Informatics 2012

43

Business Process Modelling is an Essential Part of a Requirements Analysis

an Ontology-driven Requirements Analysis Tool(OntoRAT): A Use-case-driven Approach.Proceedings of second international conference onthe applications of digital information and webtechnologies (ICADIWT 2009) 2009:130-8.

50. Weber R, Knaup P, Knietitg R, Haux R, MerzweilerA, Mludek V, et al. Object-oriented businessprocess analysis of the cooperative soft tissuesarcoma trial of the german society for paediatriconcology and haematology (GPOH). Stud HealthTechnol Inform 2001;84(Pt 1):58-62.

51. Baksi D. Model checking of healthcare domainmodels. Comput Methods Programs Biomed2009;96(3):217-25.

52. Jun GT, Ward J, Morris Z, Clarkson J. Health careprocess modelling: which method when? Int JQual Health Care 2009;21(3):214-24.

53. Krol M, Reich DL. Object-oriented analysis anddesign of a health care management informationsystem. J Med Syst 1999;23(2):145-58.

54. Benson T. Prevention of errors and user alienationin healthcare IT integration programmes. InformPrim Care 2007;15(1):1-7.

55. Kumarapeli P, De Lusignan S, Ellis T, Jones B.Using Unified Modelling Language (UML) as aprocess-modelling technique for clinical-research

process improvement. Med Inform Internet Med2007;32(1):51-64.

56. Vasilakis C, Lecnzarowicz D, Lee C. Applicationof Unified Modelling Language (UML) to theModelling of Health Care Systems: AnIntroduction and Literature Survey. In Ed. Tan J.Developments in Health Information Systems andTechnologies: Models and methods. Vancouver,BC: IGI-global; 2010,275-87. DOI: 10.4018/978-1-61692-002-9.ch019.

57. Scheuerlein H, Rauchfuss F, Dittmar Y, Molle R,Lehmann T, Pienkos N, et al. New methods forclinical pathways-Business Process ModelingNotation (BPMN) and Tangible Business ProcessModeling (t.BPM). Langenbecks Arch Surg 2012Feb 24. [Epub ahead of print]

58. Rojo MG, Daniel C, Schrader T. Standardizationefforts of digital pathology in Europe. Anal CellPathol (Amst) 2012;35(1):19-23.

59. Becnel Boyd L, Hunicke-Smith SP, StaffordGA, Freund ET, Ehlman M, Chandran U, et al.The caBIG(R) Life Science BusinessArchitecture Model. Bioinformatics 2011 Mar29. [Epub ahead of print]

60. Yan Q. Bioinformatics for transporter pharmaco-genomics and systems biology: data integration

and modeling with UML. Methods Mol Biol2010;637:23-45.

61. Baker EJ, Jay JJ, Philip VM, Zhang Y, Li Z,Kirova R, et al. Ontological DiscoveryEnvironment: a system for integrating gene-phenotype associations Genomics 2009;94(6):377-87.

62. Zheng Y, Zhou J, Krause P. An Automatic TestCase Generation Framework for Web Services.Journal of Software 2007;2(3):64-77. http://citeseerx.ist .psu.edu/viewdoc/download?doi=10.1.1.94.6186&rep=rep1&type=pdf

63. Tan W, Missier P, Foster I, Madduri R, Goble C.A Comparison of Using Taverna and BPEL inBuilding Scientific Workflows: the case of caGrid.Concurr Comput 2010;22(9):1098-1117.

Correspondence to:Simon de LusignanDepartment of Health Care Management and PolicyUniversity of SurreyGuildford, Surrey, UKTel +44 (0) 1483 683 089E-mail: [email protected]

For personal or educational use only. No other uses without permission. All rights reserved.Downloaded from imia.schattauer.de on 2012-09-07 | ID: 1000510252 | IP: 134.36.204.2