Imaging Services on the Grid as a Product Line: Requirements and Architecture

7
Imaging Services on the Grid as a Product Line: Requirements and Architecture Mathieu Acher, Ph. Collet, Philippe Lahire, Johan Montagnat To cite this version: Mathieu Acher, Ph. Collet, Philippe Lahire, Johan Montagnat. Imaging Services on the Grid as a Product Line: Requirements and Architecture. Workshop Service-Oriented Architectures and Software Product Lines - Putting Both Together (SOAPL’08), at SPLC 2008, Sep 2008, Limerick, Ireland. pp.6. <hal-00419992> HAL Id: hal-00419992 https://hal.archives-ouvertes.fr/hal-00419992 Submitted on 18 May 2010 HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destin´ ee au d´ epˆ ot et ` a la diffusion de documents scientifiques de niveau recherche, publi´ es ou non, ´ emanant des ´ etablissements d’enseignement et de recherche fran¸cais ou ´ etrangers, des laboratoires publics ou priv´ es.

Transcript of Imaging Services on the Grid as a Product Line: Requirements and Architecture

Imaging Services on the Grid as a Product Line:

Requirements and Architecture

Mathieu Acher, Ph. Collet, Philippe Lahire, Johan Montagnat

To cite this version:

Mathieu Acher, Ph. Collet, Philippe Lahire, Johan Montagnat. Imaging Services on the Gridas a Product Line: Requirements and Architecture. Workshop Service-Oriented Architecturesand Software Product Lines - Putting Both Together (SOAPL’08), at SPLC 2008, Sep 2008,Limerick, Ireland. pp.6. <hal-00419992>

HAL Id: hal-00419992

https://hal.archives-ouvertes.fr/hal-00419992

Submitted on 18 May 2010

HAL is a multi-disciplinary open accessarchive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come fromteaching and research institutions in France orabroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, estdestinee au depot et a la diffusion de documentsscientifiques de niveau recherche, publies ou non,emanant des etablissements d’enseignement et derecherche francais ou etrangers, des laboratoirespublics ou prives.

Imaging Services on the Grid as a Product Line :Requirements and Architecture

Mathieu Acher∗, Philippe Collet∗, Philippe Lahire∗, Johan Montagnat∗∗Universite de Nice Sophia Antipolis, laboratoire I3S - CNRS UMR 6070

Polytech Nice - Sophia, 930 Route des Colles, BP 145, F-06903 Sophia Antipolis Cedex, [email protected], [email protected], [email protected], [email protected]

Abstract�SOA is now the reference architecture formedical imaging processing on the grid. Imaging servicesmust be composed in work�ows to implement the pro-cessing chains, but the need to handle end-to-end qualitiesof service hampered both the provision of services andtheir composition. This paper analyses the variability offunctional and non functional aspects of this domain andproposes a �rst architecture in which services are orga-nized within a product line architecture and metamodelshelp in structuring necessary information.

I. INTRODUCTION

Service Oriented Architectures (SOA) emerged in dis-tributed computing and IT management as a conceptualarchitecture to reduce the complexity of software sys-tems. This paradigm promotes an organization in whichbasically i) reusable self-contained services are providedthrough standardized interfaces and easily exchange thestructured information � handling execution platformheterogeneity �, and ii) services are composed andcoordinated in work�ows, possibly in a dynamic way� so that business processes can be easily created andmaintained on the long term �.

SOA is also becoming a reference architecture forgrid computing in different scienti�c domains [1]. Ourwork stands in the medical imaging area, in which gridshelp in building patient-speci�c models and in reducingcomputing time for meeting time constraints of clinicalpractice. Grid computing makes also possible to dealwith many problems related to large medical data setsmanipulation, usually heavily fragmented, on very widedistributed infrastructures. Besides image analysis toolpipelines are undergoing homogenization, strongly mo-tivated by the need for mutualizing software developmentand easily comparing results. The SOA paradigm is thusespecially adapted to this domain, with imaging servicesinherently decoupled and abstracted from technical plat-forms, and work�ows to design composed algorithmsthrough process chains.

But to fully exploit this paradigm, the communityfaces two major challenges that we identify as beingrelated to an important variability. First, code main-tainers have to provide basic imaging services fromheterogeneous code, with detailed information so thatthey can be easily composed to construct new algorithmsand to master their deployment on the grid. Second,maintainers have to manage the numerous non functionalproperties that have to be exploited during deploymentor run times, in order to ensure a quality of service(QoS) adapted to the user. These QoS properties exposedifferent forms of variability as they may be related toa service itself (reliability, availability, cost, expectedexecution time...), to its provision on the grid (parallelismgrain, data handling protocol, adaptability to resources...)or to some user needs (emergency of a computation,expected output quality...).

Variability is now an essential concept in softwareengineering. Its realization can be seen as a means to de-scribe the whole generality of a software artefact throughthe speci�cation of commonalities and differences. Thebest support for this concept is currently the SoftwareProduct Line (SPL) paradigm [2], which applies to soft-ware the general industrial notion of engineering familyof similar entities. In our context, we aim at tacklingthe variability issues in imaging services and work�owson the grid by including them within a product linearchitecture. This service product line would describemajor variations in functional and non functional medicalimaging service speci�cations, as well as in processpipelines described through work�ows. In this paper, wefocus on shorter term goals:

• We present the results of our domain analysis onimaging grid services, detailing segmentation ser-vices as an illustration. We therefore propose appro-priate metamodels to represent functional variabilityand QoS mechanisms (Section II).

Figure 1. A medical imaging wor�ow

• We propose a �rst architecture that enables ser-vice providers and work�ow experts to capturethe commonalities and the differences of legacyservices, to build the right service according to thesecommonalities and differences ef�ciently and to usethe line to select appropriate services according tofunctional and non functional criteria (Section III).

II. VARIABILITY IN MEDICAL IMAGING SERVICES

In a medical imaging work�ow, medical experts com-pose different kinds of processing on images, eachalgorithm being provided by a service.

A. Functional VariabilityFigure 1 shows a work�ow corresponding to a brain

Magnetic Resonance Imaging (MRI) segmentation. The�rst stage consists in the registration of images sothat they are spatially aligned in the same coordinatesystem. Then, the brain region is isolated and an in-tensity correction is applied. This preprocessing stageis followed by image segmentation, which delineatessome structures in the brain. For each task of thework�ow, numerous services are available on the grid.For instance several segmentation algorithms exist. Theyare as many candidates to realize the last task in thework�ow. However, in our particular context, medicalimages are MRI, in a speci�c format � the studiedbody part is the brain, and following the preprocessingstages, noise can be considered as weak. This contextis indeed a combination of varied features: the medicalimage could have been a CT-scan or an ultrasoundand it could represent completely different anatomical

Figure 2. Metamodel handling functional variability in medicalimaging (FODA representation)

regions. In the same time, the functional capabilitiesof services also change according to the context. Forinstance some algorithms are better adapted to processMRI than CT-scans; indeed, they are developed to reachthis goal. It is then necessary to introduce speci�cinformation on images and manipulated body structures,and this form of variability can be directly expressedas alternative choices. Following our approach based onmetamodels, one metamodel must be capable to provideto medical imaging experts the means to explicit theseservice features through the concepts of software productline. Model Driven Engineering (MDE) techniques areintended to be used to capture the description of servicevariability and product-line capabilities.

As a result, we propose a �rst metamodel to handlefunctional variability in medical imaging. Figure 2 showsa partial representation of this metamodel as a featurediagram [3]. Several variability issues are then handled :the knowledge of information associated to the image iseither mandatory (e.g. the image format), optional (e.g.image noise) or with variants (e.g body structure). Themetamodel makes possible to characterize both the targetdata set and the capabilities of an algorithm to processit.

B. QoS VariabilityThe behavior of image processing algorithms also

changes according to quality criteria. Typically, onewould like to specify a service performance, its capacity

Figure 3. Metamodel handling variability of QoS mechanisms

to handle all kinds of data (robustness), to produce agood result (accuracy) or to ensure data con�dentiality.Some high-level QoS dimensions have been identi�edas relevant for grid work�ows (time, cost, �delity, reli-ability, security) [4]. They are also relevant in medicalimaging [5]. Our approach consists in modelling QoSattributes of medical imaging so that grid schedulers,work�ow engines or developers can select services onthe grid according to QoS constraints. The compositionprocess is then similar to a QoS-driven composition[6]. To illustrate our motivations and approach, we takethe case of image segmentation. It consists in locatingand extracting perceptual units of the image, so thatthe image is more meaningful and easier to analyze.In the medical area, the goal of segmentation is toselect anatomical or arti�cial objects, which correspondto the real anatomy of the patient, and which need to bemeasured or visualised by the clinical user. The processof segmentation is a crucial (and often preliminary) stepfor medical imaging analysis and diagnosis. We �rst notethat segmentation has no general solution and the need toevaluate segmentation algorithms has often been pointedout [7], both to compare hundreds of algorithm variantsdeveloped by researchers and to provide to �nal userssome means to choose the appropriate algorithm for theirproblems.

An analysis of evaluation mechanisms for segmenta-tion algorithms also shows an important variability. Forinstance, the computation of quality measures cannotbe realized before the processing. This is the case ofgoodness methods, which compute speci�c properties ofsegmented objects in the image (such as intra-regionuniformity, inter-region-contrast, entropy, shape, edge

quality, etc). These methods are subjective � it is alwayspossible to build an algorithm capable to outperform allothers according to another chosen evaluation criteria �, and consequently they may be biased but they have theadvantage of being automatic [8].

In contrast, the discrepancy methods rely on a refer-ence image, which is supposed to be the ground truth,so that it can be compared according to various criteriato the results of segmentation algorithms. Evaluationmethods then expose several variability degrees: someneed explicit knowledge, can be computed statically(with an a priori knowledge) or dynamically, are subjectto biases, etc. We also consider evaluation methodsas being characterized by some QoS properties, alsovariable: necessary execution time, computation cost orevaluation reliability, etc. Besides, during evaluation theapplication domain must also be speci�ed. According to[9], it is determined by three entities: the aim of thesegmentation, the studied body region and the imageprotocol (including the acquisition method). The behav-ior of a segmentation algorithm is then dependent ofthis context, as well as the different qualities provided.For instance, a particular segmentation method may havehigh performance in determining the volume of a tumorin the brain on an MRI image, but may have a lowperformance in segmenting a cancerous mass from amammography scan of a breast. It is clear that a QoSmetric makes only sense in a very precise context. Figure3 summarizes our proposed QoS metamodel and itsvariability description. A QoS property is then describedthrough a metric, a dimension, and optionally a com-putation method. All analysed variability points can beexpressed through this metamodel, and especially the in-terdependencies between dimensions and QoS propertiesthemselves (informally represented by the dotted lines onthe �gure, expressed through constraints1 in our case).

Back to segmentation, several QoS dimensions mustbe taken into account. The main considered qualityis accuracy, which refers to the degree to which thesegmentation results agree with the true segmentation.The reproductibility (or precision) is the capability for aprocess to be repeatable (i.e. to reproduce the same resultfor the same input) and it corresponds to the measureof a random �uctuation. The predictability of an imageanalysis is also important for experts, which have mentalmodels of how an algorithm works and can then correcta malfunction. The reliability of an image processingchain is also evaluated in terms of robustness, i.e. the

1using Object Constraint Language (OCL)

Figure 4. One QoS property model

performance realized in the presence of disruptive factorsor its capacity to handle a wide range of images as input.Finally more classic QoS dimensions, such as time orcost, are expressed through theoretical complexity of thealgorithms (in space and time). It is clearly important tobe able to re�ne some high level QoS concerns in sucha way that medical or software experts have the meansto specify their requirements. But a major issue is thento handle mutually dependent constraints between QoSdimensions, which is a classic open problem in computerscience: a restriction on memory usage may easily implya performance drop. With segmentation, such interdepen-dencies are also found, as it is dif�cult to evaluate thespeci�city and the sensitivity of an algorithm througha unique metric without trade-off [10]. As a result, wecan instantiate the QoS metamodel to characterize eachQoS property of a service. In Figure 4, a property modelwith the following characteristics is instantiated: the QoSdimension is accuracy, it is dynamically measurable, theevaluation is made on the output with a small margin oferror and it is possible to compare it with its output.The variability of QoS mechanisms have impacts onthe operations to select services, control or adapt thework�ow. Back to the work�ow of Figure 1, selectinga service according to QoS constraints implies that theservice can evaluate a priori the QoS dimensions. Herethe presented QoS characteristics are not compatiblewith this selection operation as it only supports dynamiccomputation and requires the knowledge of the output.But it can be used to perform appropriate monitoring atruntime.

III. ARCHITECTURE OF THE SERVICE PRODUCT LINE

Considering the mentioned requirements, our aim isto tackle the determined variability issues by includingservices within a product line architecture.

A. Principles

Figure 5. The Software Product Line Framework

Figure 5 sets the principles of the proposed architec-ture. It relies on i) a Service Product Line Framework(SPLF) describing the business domain, ii) a servicerepository where one may �nd legacy services of thebusiness domain and iii) metamodels that capture knowl-edge of the SPLF. The SPLF describes possible typesof services and work�ows for the domain of medicalimaging. It considers services variability, including:• a set of common properties (a structured list of

assumptions that are satis�ed for all members ofthe domain). For example, any service for imagesegmentation requires a medical image as input;

• a set of possible differences. For example, theformat of medical image may vary depending onthe service that is chosen (DICOM, Nifti, etc).

The choice to insert or not one type of service (forinstance in a preprocessing step) is another variationpoint, but at the work�ow level.

Introducing variability within the description of theused services allows the developers (e.g. medical imag-ing computation expert) to describe the structure and thebehaviour of the services, as well as to propose variants,de�ne optional parts, etc. Then, the grid work�ow ex-perts are able to transparently deploy services and toef�ciently execute complete applications composed of

several of them. During the process of building and exe-cuting the work�ow, they choose the most appropriatedservices for each task of the work�ow, using the serviceproduct line. Thus, end users (e.g. practitioners) are ableto specify the data on which the application will be runand to execute it on a grid, specifying their requirementsand QoS needs.

B. Services in the repository

The SPLF uses services in a repository. They aremaking a collection of algorithms for image processing.We provide to medical imaging experts means to or-ganize the information associated to services and makeexplicit their functional (see Figure 2) and non functional(see Figure 3) characteristics. The concept of service isalso modelized through a metamodel, as proposed forexample in [11]; it aims at integrating these properties.Through metamodels, services are sharing a commonframework in order to describe their interfaces, espe-cially preconditions (i.e. nature of manipulated images)and postconditions (i.e. QoS provided). QoS propertiesof services must take into account grid infrastructurecharacteristics. In particular, the grid latency may leadto performance drop on the execution of a service [12].QoS grid concerns are referring to many QoS dimensionsand are then using variable mechanisms. Typically, onewant to express a dynamic measure on the responsetime of grid nodes, as well as a statistical evaluationabout fault tolerance of resources. We thus propose ametamodel representing this information (see Figure 3).Finally, legacy services of the repository are identi�edas candidates, and are potentially reusable in differentcontexts.

C. Building the SPL

There are hundreds of segmentation algorithms avail-able on the grid that are able to be used for the seg-mentation task of the work�ow on Figure 1. Intuitivelythese services may be handled through an actual service(a service interface) of type Segmentation, which canbe included in one SPL description. Services are thenincluded in the SPL based on additional informationassociated. Consequently, a line of services can be seenas a service that is able to provide access to multipleservices, members of the line. Moreover, a service canbe described by multiple interfaces, but has one capa-bility (i.e. segmentation). In order to describe the wholegenerality of an entity, we specify common elements anddifferences. The result is the construction of a generic

interface, which describes indirectly multiple interfaces[13].

As the metamodel would describe all possible soft-ware product lines of the business domain (medicalimaging), it will be able to represent all functionaland non-functional commonalities and variations of theservices belonging to the repository. Each service ofthis repository, which belongs to the software productline, will be conform to a given model and by extension(transitivity) to the metamodel. Thus, the software archi-tect can infer a software product line considering someservices of the repository.

D. Using the SPLReasoning and veri�cation on the SPLF is supported

by model engineering techniques. As the variabilityof grid services for medical imaging is captured inmetamodels, this makes possible to reason on servicesand to achieve the necessary operations determined inSection II, such as selection, adaptation and monitoring.The SPLF relies on these metamodels to represent onesoftware product line, which corresponds to one of itsinstances. Thanks to the SPLF, the repository and themetamodels, it is possible to derive services from a givensoftware product line. The main focus during productderivation is on satisfying complex dependencies, i.e.dependencies that affect the binding of a large number ofvariation points, such as quality attributes. A key aspectin resolving these dependencies is having an overviewon these complex dependencies and how they mutuallyrelate. As we have already noticed in Section II, analgorithm in medical imaging tries to perform a trade-offbetween sensibility and sensitivity. Another example ofa complex dependency is a restriction on memory usageof a software system. Relation to other dependencies ishow this restriction interacts with a requirement on theperformance. These examples show the need for the �rst-class representation of dependencies, including complexdependencies, in variability models and the need forappropriate means to model the relations between thesedependencies in the SPL. Thus, the SPL provides tosoftware architect a generic interface describing theset of functional and non functional characteristics andmeans to express constraints in order to choose the mostadapted service for each work�ow task.

IV. CONCLUSION

The goal of this paper was to propose an approach forimproving the reuse of services dedicated to the domainof medical imaging. We addressed variability of grid

services for medical imaging by using an approach basedon software product lines. Functional and non functionalvariability of imaging services have been analysed usingsegmentation as a running example. We focused on QoSattributes and provided two supporting metamodels, onefor expressing functional variability in image processing,the other to describe QoS mechanisms. We also de-scribed the architecture of a service product line relyingon these metamodels.

This architecture is undergoing implementation withinthe Kermeta workbench and its support for Feature Dia-gram2. Beside implementation, we want to validate ourapproach on a large service and data set. The proposedarchitecture is intended to be used in a project calledNeuroLOG [14]. This project is targeting the neurologydomain and adopts a user-centric perspective to meet theneuroscientists expectations. It also aims at fostering theadoption of HealthGrids in a pre-clinical community.

Other open issues deal with the modeling of linesof services as well as with the speci�cation of theQoS properties. As we mentioned in this paper, weare interested in two kinds of variability : i) variabilityof the service itself (for example the reliability of onesegmentation algorithm) and ii) the variability of the QoSprocessing mechanism (for example the computation ofthe reliability measure may be static or dynamic). Thesetwo kinds of variability interact one with the other andone crucial issue is to compose them in order to obtaina full-�edged architecture.

As it has been explained in Section II, the propertiesthat are addressed may be considered at various levelsof details. For instance one may reason at the levelof the time used for handling one service or at thelevel of the time needed to send its result (the �rst oneincludes the second one). Another important aspect isthen to be able to reason even if the scale or the type ofseveral properties is not the same and if the informationcontained need to be aggregated. QoS dimensions mayindeed have multiple views, depending on the domainconsidered. An expert of the grid would not have thesame preoccupations as an expert of business domainconsidering, for instance, the reliability dimension. Theimpact of both cases must be studied together.

ACKNOWLEDGMENT

The authors thank Diane Lingrand and Alban Gaig-nard for their valuable comments regarding this work.

2http://www.kermeta.org/mdk/ProductDerivation/

REFERENCES

[1] I. Foster, C. Kesselman, J. Nick, and S. Tuecke, �The Physiol-ogy of the Grid: An Open Grid Services Architecture for Dis-tributed Systems Integration,� Open Grid Service InfrastructureWG, GGF, Tech. Rep., June 2002.

[2] J. Bosch, G. Florijn, D. Greefhorst, J. Kuusela, J. H. Obbink,and K. Pohl, �Variability issues in software product lines,�in PFE, ser. Lecture Notes in Computer Science, F. van derLinden, Ed., vol. 2290. Springer, 2001, pp. 13�21.

[3] K. Czarnecki, S. Helsen, and U. Eisenecker, �FormalizingCardinality-based Feature Models and their Specialization,� inSoftware Process Improvement and Practice, 2005, pp. 7�29.

[4] J. Yu and R. Buyya, �A Taxonomy of Work�ow ManagementSystems for Grid Computing,� Journal of Grid Computing(JGC), vol. 3, no. 3-4, pp. 171 � 200, Sept. 2005.

[5] P. Jannin, J. Fitzpatrick, D. Hawkes, X. Pennec, R. Shahidi,and M. Vannier, �Validation of Medical Image Processing inImage-guided Therapy,� IEEE Transactions on Medical Imag-ing (TMI), vol. 21, no. 12, pp. 1445�1449, Dec. 2002.

[6] L. Zeng, B. Benatallah, M. Dumas, J. Kalagnanam, and Q. Z.Sheng, �Quality driven web services composition,� in WWW'03: Proceedings of the 12th international conference on WorldWide Web. New York, NY, USA: ACM, 2003, pp. 411�421.

[7] Y. J. Zhang, �A review of recent evaluation methods for imagesegmentation,� in Signal Processing and its Applications, SixthInternational, Symposium on. 2001, vol. 1, Kuala Lumpur,2001, pp. 148�151.

[8] J. S. Cardoso and L. Corte-Real, �Toward a generic evaluationof image segmentation,� Image Processing, IEEE Transactionson, vol. 14, no. 11, pp. 1773�1782, 2005.

[9] J. K. Udupa, V. R. Leblanc, Y. Zhuge, C. Imielinska,H. Schmidt, L. M. Currie, B. E. Hirsch, and J. Woodburn,�A framework for evaluating image segmentation algorithms,�Computerized Medical Imaging and Graphics, vol. 30, no. 2,pp. 75�87, March 2006.

[10] A. Popovic, M. de la Fuente, M. Engelhardt, and K. Rader-macher, �Statistical validation metric for accuracy assessmentin medical image segmentation,� International Journal of Com-puter Assisted Radiology and Surgery, vol. 2, no. 3-4, pp. 169�181, December 2007.

[11] A. D'Ambrogio, �A model-driven wsdl extension for describingthe qos ofweb services,� in ICWS '06: Proceedings of the IEEEInternational Conference on Web Services. Washington, DC,USA: IEEE Computer Society, 2006, pp. 789�796.

[12] D. Lingrand, J. Montagnat, and T. Glatard, �Estimating theexecution context for re�ning submission strategies on produc-tion grids,� in Assessing Models of Networks and DistributedComputing Platforms (ASSESS) (CCgrid'08). Lyon: IEEE,May 2008.

[13] C. Feier, D. Roman, A. Polleres, J. Domingue, M. Stollberg,and D. Fensel, �Towards intelligent web services: Web servicemodeling ontology (wsmo),� in Proceedings of the InternationalConference on Intelligent Computing (ICIC), Hefei, China, 82005.

[14] J. Montagnat, A. Gaignard, D. Lingrand, J. Rojas Balderrama,P. Collet, and P. Lahire, �NeuroLOG: a community-drivenmiddleware design,� in HealthGrid. Chicago: IOS Press, June2008, pp. 49�58.