Integrating Document and Workflow Management Tools using XML and Web Technologies: A Case Study


Lerina Aversano, Gerardo Canfora, Andrea De Lucia, Pierpaolo Gallucci (aversano / canfora / delucia / gallucci)@unisannio.it

RCOST – Research Centre On Software Technology Department of Engineering, University of Sannio

Palazzo Bosco Lucarelli, Piazza Roma - 82100 Benevento, Italy

Abstract

A critical point for developing successful information systems for distributed organisations is the need for integrating heterogeneous technologies and tools. This paper reports a case study of integrating two key enabling technologies, namely workflow and document management. Integration is achieved by combining several approaches, including software engineering and hypertexts. In this way, we raise the integration problem from the level of a purely technical issue to a level of conceptual modelling: integration is not focused solely on the information/software systems but involves, and is driven by, the related business processes and the documents they deal with.

1. Introduction

Convergence between telecommunications and computing and the explosion of the Internet have paved the way for new ways of conceiving, designing, and running businesses and enterprises. More and more companies are moving towards distributed or even virtual organisation models, where independent institutions, departments, and groups of specialised individuals converge in a temporary network with the aim of exploiting a competitive advantage or solving a specific problem. Information and communication technology (ICT) is a primary enabler of virtual organisations, as the people and institutions in a network rely substantially more on computer-mediated channels than on physical presence to interact and cooperate in order to achieve their objectives. In particular, two primary enabling technologies are workflow management and document management systems.

One of the main advantages of workflow management systems is that they move the focus from the automation of single process activities, through traditional information systems, to the overall management and improvement of the business processes, through the integration of different software technologies [11, 22, 27]. In addition, the latest generation of workflow management systems leverages the Web as an enabling infrastructure, thus allowing a higher level of coordination and control among the geographically distributed teams and individuals that take part in a business process [1]. Document management systems complement workflow management as they focus on the management of the documents developed and exchanged by the subjects taking part in a business process. Typical features of modern document management systems include the management of content independently of the document layout and presentation, support for the collaborative production of documents, and advanced information retrieval techniques [6].

A critical point for developing successful information infrastructures and services for distributed and virtual organisations is the need for integrating different, and sometimes heterogeneous, technologies and tools. Integration involves creating expressive models of the business processes and the involved documents and devising open architectural models of the supporting infrastructure and services.

In this paper we illustrate a case study of achieving integration of workflow and document management. We combine software engineering approaches, namely object-oriented modelling through the Unified Modelling Language (UML) [7], with hypertexts, and particularly the eXtensible Markup Language (XML) [21]. The main role of UML is modelling the business processes and involved documents at an abstract level, whereas XML Document Type Definitions (DTDs) are used to concretely model the document content. Finally, visual approaches are used to implement the processes and to develop the end-user interfaces of the integrated application.
Several other authors have pointed out the problem of integrating workflow and document management to create information systems for distributed collaborative organisations. For example, Sumiya and Saito [24] discuss a multimedia document management environment that supports the cooperative work of groups and individuals during aircraft maintenance. Joeris [12] proposes an integrated approach for workflow and document management in engineering applications. Kappel et al. [16] advocate the use of active object-oriented databases to integrate workflow and hypermedia document management. Our approach differs from other proposals as we use a unique combination of software engineering and hypertexts to raise the integration issue from the pure technology level to a level of conceptual modelling.

Proceedings of the Sixth European Conference on Software Maintenance and Reengineering (CSMR'02) 1534-5351/02 $17.00 © 2002 IEEE

The work presented in this paper has been developed and experimentally validated within a technology-transfer research project, named LINK, aiming at introducing innovative technologies within the Public Administration (PA), particularly in the local PA in the Sannio area. At the same time, the project aims at transferring an adequate background of methodological and technological knowledge to the local Small and Medium software Enterprises (SMEs), in order to enable them to provide answers to the emerging needs of the PA.

The paper is organised as follows. Section 2 gives background information on the LINK project and the reference technology model adopted in the Sannio area. Section 3 illustrates the architecture and the main features of GIANO, an XML-based document management system developed within the LINK project. Workflow modelling and automation are addressed in section 4, which also discusses the technology selection process adopted in the LINK project. The integration of workflow and document management, both at the content level and at the level of the user interface, is discussed in section 5. Finally, lessons learned and concluding remarks are given in section 6.

2. The context

LINK is a project that aims to develop know-how, in the form of models, methodologies, and technologies, and to transfer it to local SMEs, so that it can be turned quickly into an offer of modern software products and services for the PA. Figure 1 shows the technology transfer model underlying the LINK activities and project. Research institutions have a twofold role:

• stimulating the local ICT market of peripheral PA by promoting and enacting pilot Business Process Reengineering (BPR) projects driven by technology and service innovation, thus helping PA departments to understand, structure, and qualify their needs;

• promoting and helping a network of local SMEs with an adequate background of methodological and technological knowledge to provide answers to the emerging needs for PA process innovation.

The main feedback for the research institutions from the project partners consists of:

• stimuli for identifying and addressing new research problems;

• assessment of the technology transfer methodology.

The bold arrows in Figure 1 refer to the final goal of the technology transfer experience.

Figure 1: LINK technology transfer model in the Sannio area

We have identified workflow and document management as the key areas where the demand for advanced services is particularly relevant. In particular, document management aims to enable managers, employees, and citizens to retrieve the documents produced, or views of them, in a flexible way, while introducing a Workflow Management System (WfMS) into the organisation eases the cooperative production of administrative documents through the integration, coordination, and communication of both human and automatic tasks of an administrative process. Legacy system migration and integration into the workflow platform has also been identified as an important activity that allows an organisation to take greater advantage of introducing workflow technologies. This last issue is not addressed in this paper; the interested reader can refer to [2, 5].

Six SMEs and two peripheral PA departments have joined the LINK technology transfer consortium in the Sannio area. In particular, four SMEs actively joined both the pilot experimental project and the training program, and one of the two PA departments, namely the Province of Benevento, agreed to participate in the pilot project with the role of customer.

Several activities have been conducted together with the LINK partners. The first activity was the implementation of a document management system that allows storing the documents in XML format and retrieving document views according to the needs of different users. The main aspects of this system are presented in section 3.

The second activity was concerned with workflow analysis and automation. We have devised guidelines for the analysis and modelling of the processes of an organisation [4]. Their application to the PA partner resulted in the identification and modelling of a key process to be automated in the pilot project. In particular, in our analysis we identified the tender management process as one of the most critical processes of the technical office and decided to analyse and automate it. A second aspect of this activity was the evaluation of workflow management technologies to select the WfMS for the pilot project [4]. Our evaluation was based on the specifications of the Workflow Management Coalition reference model [27]. This activity is briefly discussed in section 4. Details can be found in [4].

The final activity was the implementation of the workflow prototype using the selected workflow platform and the integration of the previously developed document management system into the prototype. This issue is presented in section 5.

3. GIANO: an XML-based document management system

The first subsystem developed within the LINK project, by two university researchers, is a document management system that provides storage and retrieval facilities for administrative documents. The goals of the document retrieval sub-system are:

• providing a context-sensitive querying mechanism to achieve high levels of recall and precision (see footnote 1); for example, one may want to retrieve only documents having a given keyword in a specific document part and neglect documents including the same keyword in different parts;

• providing different users with different views of the retrieved documents; for example, different administrative employees need to access different parts of administrative documents, and different views have to be provided to the citizens using the information retrieval service.
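As a minimal sketch of the two metrics named in the first goal (recall and precision, as defined in footnote 1), the function below computes both for a single query. The document identifiers are hypothetical and serve only as an illustration; they are not data from the LINK project.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    retrieved: identifiers of the documents returned by the query
    relevant:  identifiers of all documents relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical identifiers, for illustration only.
recall, precision = recall_and_precision(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d5"],
)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

A query that matches a keyword anywhere in a document tends to retrieve more irrelevant documents, which lowers precision; restricting the match to a specific document part is one way to counter that.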

Most information retrieval systems satisfy neither of the requirements above: usually, the input is specified using a query language, while the output documents are selected according to different models (for example, probabilistic or vector space models), but without taking into account the document structure [6]. Recent research on information retrieval makes it possible to overcome this problem by specifying the parts of a document in which a given keyword has to be searched; this is particularly true for systems that use some kind of mark-up language, for example XML, to encode the document [21]. However, these systems consider the structure of the documents only for retrieval purposes and do not manipulate the document structure to present different views to different users; indeed, these systems just return the list of documents retrieved.

1 Two metrics that are widely used in information retrieval to measure the performance of a query are recall and precision. Recall is the ratio between the number of documents retrieved that are also relevant for the query and the total number of relevant documents in the document space; precision is the ratio between the number of relevant documents retrieved and the total number of documents retrieved by the query.

We have developed a document management system, called GIANO, that satisfies both requirements above. It combines database and information retrieval technologies with hypertext systems and mark-up languages. The document space consists of XML files, whereas a relational database is used to store metadata on the documents. Different Document Type Definitions (DTDs) are used to define the structure of different document types [3]. Figure 2 shows an example of a DTD for a typical document involved in public administration processes. For each DTD, the GIANO database contains a table whose attributes correspond to metadata and to particular tags of the DTD that identify items for which searching the database is more efficient than searching the document space; examples of such attributes include title, author, and creation date. An additional table in the GIANO database associates each document with the set of its keywords (where available).
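The database organisation described above (one metadata table per DTD plus a keyword table) could look roughly like the following sketch. It uses SQLite from the Python standard library purely for illustration; the paper does not name the DBMS schema, so every table name, column name, and row here is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per DTD (hypothetical columns: metadata plus selected DTD tags).
CREATE TABLE determina_approvazione_progetto (
    doc_id        INTEGER PRIMARY KEY,
    xml_path      TEXT NOT NULL,   -- location of the XML file in the document space
    title         TEXT,
    author        TEXT,
    creation_date TEXT             -- ISO date, so string comparison orders correctly
);
-- Keyword table: associates each document with its keywords.
CREATE TABLE doc_keyword (
    doc_id  INTEGER NOT NULL,
    keyword TEXT NOT NULL
);
""")

# Invented sample rows.
conn.executemany(
    "INSERT INTO determina_approvazione_progetto VALUES (?, ?, ?, ?, ?)",
    [
        (1, "docs/det_001.xml", "Approvazione progetto A", "Rossi", "2001-05-10"),
        (2, "docs/det_002.xml", "Approvazione progetto B", "Bianchi", "2000-11-02"),
    ],
)
conn.executemany(
    "INSERT INTO doc_keyword VALUES (?, ?)",
    [(1, "strada"), (2, "strada"), (2, "ponte")],
)

# Metadata-level query: documents created in 2001 or later with keyword 'strada'.
rows = conn.execute(
    """
    SELECT d.xml_path
    FROM determina_approvazione_progetto AS d
    JOIN doc_keyword AS k ON k.doc_id = d.doc_id
    WHERE d.creation_date >= ? AND k.keyword = ?
    ORDER BY d.doc_id
    """,
    ("2001-01-01", "strada"),
).fetchall()
print(rows)
```

The query returns file paths rather than document content, which matches the idea that the database only narrows down the set of XML files to be searched afterwards.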

Figure 2: DTD for the document “Determina approvazione Progetto”

<!ELEMENT determina_approvazione_progetto (intestazione, oggetto, corpo)>
<!ELEMENT intestazione (#PCDATA | titolo | int_ente | int_ufficio | numero | data)*>
<!ELEMENT numero (#PCDATA)>
<!ELEMENT int_ente (#PCDATA)>
<!ELEMENT int_ufficio (#PCDATA)>
<!ELEMENT titolo (#PCDATA)>
<!ELEMENT data (#PCDATA)>
<!ELEMENT oggetto (#PCDATA)>
<!ELEMENT corpo (determinatore, punto_rif_progetto, punto_predetermina+, azione, punto_postdetermina+)>
<!ELEMENT determinatore (#PCDATA)>
<!ELEMENT punto_rif_progetto (#PCDATA | rif_lavoro | progettista | importo_spesa | quadro_spesa)*>
<!ELEMENT rif_lavoro (#PCDATA)>
<!ELEMENT progettista (#PCDATA)>
<!ELEMENT importo_spesa (#PCDATA)>
<!ELEMENT quadro_spesa (#PCDATA)>
<!ELEMENT punto_predetermina (punto_testo | punto_rif_documento)>
<!ELEMENT punto_testo (#PCDATA)>
<!ELEMENT punto_rif_documento (rif_documento+, spec_punto)>
<!ELEMENT rif_documento (rif_tipo, rif_numero, rif_data)>
<!ELEMENT rif_tipo (#PCDATA)>
<!ELEMENT rif_numero (#PCDATA)>
<!ELEMENT rif_data (#PCDATA)>
<!ELEMENT spec_punto (#PCDATA)>
<!ELEMENT azione (#PCDATA)>
<!ELEMENT punto_postdetermina (#PCDATA)>

Document retrieval in GIANO is achieved through two refinement levels, as depicted in Figure 3. At the first level a class of documents (i.e. a DTD) is selected and a database query on the discriminative tags of the DTD is made to select a subset of documents in the document space. This defines a new document space where the second refinement is applied: the information retrieval subsystem searches the XML documents in this space according to a context-sensitive boolean query expressed on the DTD tags. The final set of relevant documents is retrieved together with a measure of their relevance with respect to the query. The retrieved documents are presented to the user in hypertext format. The user can select a document to browse and a visualisation style (i.e. an XSL, eXtensible Stylesheet Language, style sheet), thus achieving a customised document view.
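The two refinement levels can be pictured with the following sketch. It is not the GIANO implementation: the first level is simulated with an in-memory metadata filter instead of a database query, the second level uses a plain predicate over the parsed XML instead of an information retrieval engine, and no relevance measure is computed. The sample documents are abridged fragments that use tag names from Figure 2 but are not valid against the full DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document space: identifier -> (metadata, abridged XML text).
DOCS = {
    "det_001": ({"int_ufficio": "tecnico"},
                "<determina_approvazione_progetto>"
                "<oggetto>Approvazione progetto ponte</oggetto>"
                "<corpo><azione>approva il progetto</azione></corpo>"
                "</determina_approvazione_progetto>"),
    "det_002": ({"int_ufficio": "tecnico"},
                "<determina_approvazione_progetto>"
                "<oggetto>Approvazione progetto strada</oggetto>"
                "<corpo><azione>approva la variante del ponte</azione></corpo>"
                "</determina_approvazione_progetto>"),
    "det_003": ({"int_ufficio": "ragioneria"},
                "<determina_approvazione_progetto>"
                "<oggetto>Spesa ponte</oggetto>"
                "</determina_approvazione_progetto>"),
}


def first_level(docs, **metadata):
    """Metadata-level refinement: keep documents whose metadata match exactly."""
    return {doc_id: xml for doc_id, (meta, xml) in docs.items()
            if all(meta.get(key) == value for key, value in metadata.items())}


def tag_contains(root, tag, keyword):
    """True if the text of any <tag> element contains the keyword."""
    return any(keyword.lower() in "".join(element.itertext()).lower()
               for element in root.iter(tag))


def second_level(candidates, query):
    """Context-sensitive refinement: query is a predicate over the XML root."""
    return sorted(doc_id for doc_id, xml in candidates.items()
                  if query(ET.fromstring(xml)))


# Keyword 'ponte' must occur in <oggetto>, not merely somewhere in the document.
result = second_level(
    first_level(DOCS, int_ufficio="tecnico"),
    lambda root: tag_contains(root, "oggetto", "ponte"),
)
print(result)
```

Document det_002 mentions the keyword only inside `azione`, so a context-insensitive search would return it while the tag-restricted query does not.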

Figure 3: GIANO document levels

Figure 4: GIANO architecture

Figure 4 shows the architecture of GIANO. GIANO is a client-server system consisting of different components communicating through the TCP/IP protocol. The architecture is composed of the GIANO client, the GIANO server, the repository, and external components, such as the Web browser and Web server, the relational Data Base Management System (DBMS), and the Information Retrieval System (IRS). The client component provides the facilities for user accounting, DTD selection, and query composition and refinement. The GIANO client communicates with the GIANO server (Document Manager component) to load the list of available DTDs. The GIANO client also communicates with the Web server to load the selected DTD. The metadata and the DTD tags are used to compose the two levels of queries (see Figures 5 and 6). The metadata-level query is sent to the Document Manager to recover from the DB the list of documents that satisfy the query. The context-sensitive query is sent to the Information Retrieval Manager to access the Information Retrieval System and retrieve the final set of documents. We have integrated into GIANO an open source XML-based IRS, namely ISEARCH version 1.14, developed by the Centre for Networked Information Discovery and Retrieval (CNIDR; see footnote 2). The Hypertext Generator is the component that produces the HTML index of the retrieved documents, including the links to the corresponding XML files, together with information about the XSL style sheet used to visualise and browse them. The documents are visualised using a browser (see Figure 7 for an example).
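To make the role of the Hypertext Generator more concrete, the sketch below builds a small HTML index from a list of retrieved XML files and their relevance scores. The function name, file paths, scores, and page layout are all assumptions made for illustration; the paper does not describe the actual HTML produced by GIANO.

```python
from html import escape


def html_index(results, stylesheet):
    """Build an HTML index of retrieved documents.

    results:    list of (xml_path, relevance) pairs
    stylesheet: name of the XSL style sheet chosen for visualisation
    """
    items = []
    # Most relevant documents first.
    for path, relevance in sorted(results, key=lambda r: r[1], reverse=True):
        items.append(
            f'<li><a href="{escape(path, quote=True)}">{escape(path)}</a> '
            f"(relevance {relevance:.2f})</li>"
        )
    return (
        "<html><body>\n"
        f"<p>Style sheet: {escape(stylesheet)}</p>\n"
        "<ol>\n" + "\n".join(items) + "\n</ol>\n"
        "</body></html>"
    )


# Hypothetical retrieval result.
page = html_index([("docs/a.xml", 0.40), ("docs/b.xml", 0.90)], "citizen_view.xsl")
print(page)
```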

Figure 5: Metadata level of queries

Figure 6: Context sensitive level of queries

2 www.cnidr.org

The latest version of the GIANO client exploits the visual query language presented in [25], named InfoCrystal, which uses a simple visual metaphor to help users deal with some of the complexities inherent in information retrieval. InfoCrystal uses space proximity, shape, colour/texture and orientation coding to visualise all the binary relationships between the search criteria. Figure 8 shows the GIANO user interface for a visual query with three basic search criteria, corresponding to the vertices of the triangle. Each icon within the triangle depicts a complex criterion which combines the basic criteria. The icon is associated with the number of retrieved documents for the corresponding criterion. We have augmented the visual query language with browsing facilities: the user can access the list of documents retrieved for the criterion associated with an icon by pointing and clicking with the mouse.
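The counts attached to the icons can be illustrated with a short sketch. It assumes that each icon stands for the documents satisfying exactly one non-empty subset of the basic criteria, so three criteria yield seven icons; this reading of InfoCrystal, the criteria, and the sample documents are assumptions, not details taken from GIANO.

```python
from itertools import combinations


def infocrystal_counts(criteria, docs):
    """Count documents per combination of satisfied criteria.

    criteria: mapping name -> predicate(document)
    docs:     iterable of documents
    Returns a dict keyed by sorted tuples of criterion names; a document is
    counted under the exact set of criteria it satisfies.
    """
    names = sorted(criteria)
    counts = {combo: 0
              for size in range(1, len(names) + 1)
              for combo in combinations(names, size)}
    for doc in docs:
        satisfied = tuple(name for name in names if criteria[name](doc))
        if satisfied:  # documents matching no criterion are not shown
            counts[satisfied] += 1
    return counts


# Hypothetical documents, each reduced to its set of keywords.
docs = [
    {"appalto", "strada"},
    {"appalto"},
    {"appalto", "ponte", "strada"},
    {"verde"},
]
criteria = {
    "appalto": lambda d: "appalto" in d,
    "ponte": lambda d: "ponte" in d,
    "strada": lambda d: "strada" in d,
}
counts = infocrystal_counts(criteria, docs)
for combo, n in counts.items():
    print(" AND ".join(combo), "->", n)
```

Clicking an icon, as in the browsing facility described above, would then amount to listing the documents counted under the corresponding key.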

Figure 7: Visualisation of a retrieved document

Figure 8: GIANO Visual query language

4. Workflow modelling and automation

Reengineering the business processes of an organisation and automating the related workflows requires a preliminary analysis of the existing organisation and processes [13] to develop knowledge about the organisation and to point out the roles and activities of human and automatic resources within the processes. This is achieved through observations, questionnaires, and interviews conducted with key users and process owners of the organisation being reengineered. This information is organised in a structured document and used as the starting point for abstracting an effective representation of the reverse engineered process. Reference [4] presents the details of the guidelines used for reverse engineering the organisation processes.

In our joint experience with analysts of the local SMEs and the personnel of the Province of Benevento we have reverse engineered the tender management process of the technical office. The process reverse engineering approach was iterative. A restricted team reverse engineered the process and produced a first draft model. In particular, only two experienced analysts from the two SMEs with a significant market segment in the PA were selected, together with two university researchers with experience in process modelling and a manager of the PA department. The production of the final version required a number of revisions conducted with the help of other employees of the PA department and analysts of the SMEs.

We used UML activity diagrams to model the flow of the process activities, including decisions and synchronisations; use cases to model organisational aspects, i.e., which actors (roles) participate in which use case (activity or group of activities); and interaction (sequence and collaboration) diagrams to depict dynamic aspects within a use case [7]. Although most WfMSs provide a graphical process definition language, we decided to adopt a higher level language to abstract from the details of the specific language and to make the process reverse engineering activity independent of the selected workflow platform. Moreover, the technology transfer nature of the project suggested selecting a standard modelling language rather than defining a new one. Using a standard software engineering language for workflow modelling enabled the use of an independent team for the analysis of workflow technologies.

An evaluation of the workflow technologies on the market was performed to select the most appropriate platform for the automation of the administrative processes of the PA department.
This activity was conducted by a team of two university researchers, four analysts/programmers (one for each participating SME), and two employees of the PA department. This team also implemented the workflow prototype for the analysed process. In particular, we used a customised and simplified version of the DESMET method [15] for the evaluation of software engineering technologies. The method consists of two steps [4]: selection of the candidates and experimental assessment. In the first step we analysed the available documentation and ran product demos to identify a subset of the available WfMSs to pass to the experimental assessment step. Experimentation consisted of running the WfMSs in controlled trials to comparatively assess their quality. In both steps we exploited a quality model consisting of a set of characteristics, derived from the analysis of the Workflow Management Coalition standard [27], and related evaluation scales. The final score of each WfMS was obtained as a weighted average of the characteristic measures.

The selection step involved four different systems for which documentation and demo versions were available. This step used several characteristics belonging to different categories, such as product identification, workflow design facilities, organisation modelling and association with computing resources, workflow engine functionality and user interface, the capability to handle different objects and to dynamically change the process during enactment, and interoperability with other tools. The most important features concern Web enablement of the workflow. Current Web-based WfMSs provide user interaction through a Web browser: the user retrieves a list of the tasks for which he/she is currently responsible. In this way process execution does not require additional desktop applications with limited platform availability. The two best products were short-listed for the experimental assessment step. The experimental activities were specified in a trial specification table, defining the trials to perform and the corresponding evaluated characteristics.

The scores led to the selection of Ultimus Workflow (see footnote 3), a Web-based client/server workflow suite running on Microsoft Windows NT. The Ultimus Workflow Server controls the execution of the workflow processes. It exploits Microsoft Transaction Server, BackOffice Server, Microsoft Internet Information Server, and enterprise databases, such as SQL Server and Oracle. Ultimus Workflow uses DHTML, ActiveX and Java to realise the client interface and allows user interface customisation at run-time. It provides an Integrated Development Environment (Ultimus Designer) to graphically design new processes and decompose them into sub-processes. We translated the UML process model of the analysed process into a graphical model based on the Ultimus graphical primitives for process definition. Figure 9 shows the main workflow model of the analysed process produced with Ultimus Designer.
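The weighted-average scoring used to rank the candidate WfMSs can be sketched as follows. The characteristic names, weights, and measures are invented for illustration; the actual quality model and scores are those reported in [4], not the values shown here.

```python
def weighted_score(measures, weights):
    """Weighted average of the characteristic measures of one product."""
    total_weight = sum(weights[name] for name in measures)
    weighted_sum = sum(measures[name] * weights[name] for name in measures)
    return weighted_sum / total_weight


# Hypothetical weights (importance of each characteristic).
weights = {"web_enabling": 3, "process_design": 2, "interoperability": 1}

# Hypothetical measures on a 1-5 evaluation scale for two short-listed products.
candidates = {
    "wfms_a": {"web_enabling": 5, "process_design": 3, "interoperability": 4},
    "wfms_b": {"web_enabling": 3, "process_design": 5, "interoperability": 5},
}

scores = {name: weighted_score(m, weights) for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

In this invented example the product with the stronger Web support wins even though it scores lower on two other characteristics, which shows how the weights encode the priorities of the evaluation.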

Figure 9: Ultimus process model

The different components of the Ultimus Workflow suite communicate through an internal open database that enables interoperability with external applications.

3 http://www.ultimus1.com

Ultimus FloStation is a Microsoft NT based service that allows the interaction of external applications with the internal database of a process instance. It maintains a task list and provides tools to manage task execution. Multiple FloStations can be installed for scalability. External applications can be included in a process model using automatic steps, called flobots. In particular, Ultimus is integrated with, and provides flobots for, widely used office automation environments, ODBC databases, e-mail servers, and file servers. In addition, it is possible to develop custom flobots.

5. Workflow and document tool integration

Workflow and document management are the two technologies used to implement the prototype for the automation of the administrative processes of the PA partner in the LINK project. The integration of these technologies is not trivial: the selected workflow management system does not provide facilities to automatically produce the administrative documents in the format used by the GIANO document management system. Indeed, the first version of the workflow prototype for the analysed administrative process, developed during the experimental assessment of the short-listed workflow management systems, used only the predefined flobots for Microsoft Office to produce documents in Word format. Another difference is that the workflow prototype is Web-based, while the GIANO user interface was not and therefore could not be integrated into the Ultimus client.

The integration of GIANO and the workflow prototype has been achieved at two different levels: document level and user interface level. At the first level we have integrated the analysis of the administrative processes with the analysis of the involved documents. The reverse engineered documents are first modelled with UML diagrams and then refined to derive the XML representation used by GIANO.
At the second level we used a methodology defined in the literature, namely MORPH [17, 18, 19], to reengineer the GIANO user interface into a Web-based user interface.

5.1 Document level integration

The first integration step consists of integrating document analysis and modelling into the workflow analysis and modelling process. The information collected during the business process reverse engineering activity needs to focus on the document structure and life cycle, in addition to the workflow of the process activities. We use UML to build static and dynamic models of each document type involved in the process. In this way, we have a consistent notation to model all the aspects of a business process. This activity was conducted by the same team involved in the workflow analysis and modelling task.

Building a model of a document proceeds iteratively through the analysis of a selected sample of documents. The first step is the document class definition. The goals of this step are:

• to partition the documents into classes on the basis of the information content and the usage;

• to identify the information content of each document class;

• to identify the relationships existing between the classes of documents.

The result is a particular UML class diagram that we call document-relationships model (see Figure 10 for an example), in which nodes represent classes of documents and edges depict the mutual relationships.
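The document-relationships model can be pictured as a small graph structure. The following is a minimal sketch, not the notation or tooling used in the project: the class names and the two attributes are taken from the document class diagram, while `DocumentModel`, its methods, and the specific edges and multiplicities are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentClass:
    """A node of the document-relationships model."""
    code: str
    name: str
    attributes: tuple = ("ID protocollo", "Anno emissione")


@dataclass
class DocumentModel:
    """Document classes (nodes) and their mutual relationships (edges)."""
    classes: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_class(self, code, name):
        self.classes[code] = DocumentClass(code, name)

    def relate(self, source, target, source_mult, target_mult):
        # Both ends must already be known document classes.
        for code in (source, target):
            if code not in self.classes:
                raise KeyError(f"unknown document class: {code}")
        self.relationships.append((source, source_mult, target, target_mult))

    def related_to(self, code):
        """Return the sorted codes of all classes directly related to `code`."""
        found = set()
        for source, _, target, _ in self.relationships:
            if source == code:
                found.add(target)
            elif target == code:
                found.add(source)
        return sorted(found)


model = DocumentModel()
model.add_class("DLAP", "Delibera Approvazione Progetto")
model.add_class("DTAP", "Determina Approvazione Progetto")
model.add_class("BG", "Bando gara")
# Hypothetical edges: endpoints and multiplicities are illustrative only.
model.relate("DLAP", "DTAP", "1", "0..1")
model.relate("DTAP", "BG", "1", "0..1")
```

Keeping the multiplicities on the edges makes it possible to check later whether a related document is mandatory (1) or optional (0..1) for a given document instance.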

Figure 10: document class diagram

The second step consists of describing the life cycle for each object document. We distinguish two phases in the temporal existence of a document [23]:

• active phase, in which the document may undergo changes in form and content;

• passive phase, in which the document is stable and does not need to be modified.
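The two phases can be illustrated with a small state machine. This is a sketch under stated assumptions: the state names, the action names, and the `DocumentLifeCycle` class are hypothetical and do not come from the case study. The only property taken from the text is that a document can change while it is active and is stable once it becomes passive.

```python
class DocumentLifeCycle:
    """Life cycle of one document instance as a small state machine."""

    def __init__(self, transitions, initial, final):
        self.transitions = transitions  # {(state, action): next_state}
        self.state = initial
        self.final = final  # reaching this state ends the active phase

    @property
    def phase(self):
        return "passive" if self.state == self.final else "active"

    def apply(self, action):
        """Perform an action that modifies the document and change state."""
        if self.phase == "passive":
            raise ValueError("document is stable and cannot be modified")
        key = (self.state, action)
        if key not in self.transitions:
            raise ValueError(
                f"action {action!r} not allowed in state {self.state!r}"
            )
        self.state = self.transitions[key]
        return self.state


# Hypothetical states and actions, for illustration only.
TRANSITIONS = {
    ("draft", "complete"): "under_review",
    ("under_review", "reject"): "draft",
    ("under_review", "approve"): "archived",
}

doc = DocumentLifeCycle(TRANSITIONS, initial="draft", final="archived")
doc.apply("complete")
doc.apply("approve")
```

After the last call the document is in its passive phase, and any further call to `apply` raises an error.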

Modelling the life cycle concerns only the active phase of the object documents. The intermediate phases of the process refer to the sequence of actions that modify the contents and the values of the attributes in a document instance. The document life cycle is formalised through state diagrams, which show the dynamic behaviour of the objects.

UML class diagrams are used to semi-automatically generate the document schema (static part): a DTD, which includes a tag for each attribute in the class diagram, and a metadata table in the DB. The DTD is generated automatically from the UML class diagram (see Figure 11). Currently, the software engineer selects the tags that correspond to metadata fields; future work will be devoted to automating this selection.

UML state diagrams are needed in addition to the DTD and the metadata table schema to produce the form-based user interfaces for the workflow activities corresponding to document creation and evolution (see Figure 11). Indeed, an interactive workflow activity is required for each edge in the UML state diagram. The software engineer needs to graphically design the HTML form, using the facilities of the WfMS process definition tool (see Figure 12), and specify the correspondence between HTML form elements and DTD tags. This correspondence is used to generate the scripting functions that produce the XML document from the values contained in the HTML forms and load the XML documents into the HTML form during document evolution. It is worth noting that a task might produce or modify only a part of a document, in which case only a subset of the DTD tags is required.
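The derivation of a DTD with one tag per class attribute can be sketched as follows. This is an illustrative approximation, not the generator used in the project: the function names, the naming rule for tags, and the attribute "Oggetto" are assumptions, and every attribute is simply mapped to a #PCDATA element.

```python
import re


def tag_name(label):
    """Turn an attribute or class label into a valid XML element name."""
    name = re.sub(r"[^A-Za-z0-9_]+", "_", label.strip()).strip("_")
    # XML names cannot start with a digit.
    return name if not name[:1].isdigit() else "_" + name


def generate_dtd(class_name, attributes):
    """Emit one element for the class and one #PCDATA element per attribute."""
    root = tag_name(class_name)
    children = [tag_name(a) for a in attributes]
    content = f"({', '.join(children)})" if children else "EMPTY"
    lines = [f"<!ELEMENT {root} {content}>"]
    lines += [f"<!ELEMENT {child} (#PCDATA)>" for child in children]
    return "\n".join(lines)


def metadata_columns(attributes, selected):
    """Columns of the metadata table: the tags chosen by the engineer."""
    return [tag_name(a) for a in attributes if a in selected]


# "Oggetto" is a hypothetical attribute added to show a non-metadata tag.
attributes = ["ID protocollo", "Anno emissione", "Oggetto"]
dtd = generate_dtd("Bando gara", attributes)
columns = metadata_columns(
    attributes, selected={"ID protocollo", "Anno emissione"}
)
```

The `selected` argument stands for the manual step in which the software engineer picks the tags that become metadata fields.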

Figure 11: DTD, metadata, and form production
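The correspondence between HTML form elements and DTD tags drives two operations: writing form values into the XML document and loading them back into the form. The sketch below illustrates the idea with Python's standard `xml.dom.minidom` module as a stand-in for the DOM implementation used by the scripts; the field names, the `FIELD_TO_TAG` table, and both functions are hypothetical.

```python
from xml.dom.minidom import Document, parseString

# Hypothetical correspondence between HTML form field names and DTD tags.
FIELD_TO_TAG = {"txtProtocol": "ID_protocollo", "txtYear": "Anno_emissione"}


def form_to_xml(root_tag, form_values, existing_xml=None):
    """Write the form values into a new or an existing XML document."""
    if existing_xml:
        dom = parseString(existing_xml)
        root = dom.documentElement
    else:
        dom = Document()
        root = dom.createElement(root_tag)
        dom.appendChild(root)
    for field_name, tag in FIELD_TO_TAG.items():
        if field_name not in form_values:
            continue  # this task only handles a subset of the tags
        nodes = root.getElementsByTagName(tag)
        node = nodes[0] if nodes else root.appendChild(dom.createElement(tag))
        while node.firstChild:  # drop the old value, if any
            node.removeChild(node.firstChild)
        node.appendChild(dom.createTextNode(str(form_values[field_name])))
    return root.toxml()


def xml_to_form(xml_text):
    """Read the tag values back into a dict keyed by form field name."""
    root = parseString(xml_text).documentElement
    values = {}
    for field_name, tag in FIELD_TO_TAG.items():
        nodes = root.getElementsByTagName(tag)
        if nodes and nodes[0].firstChild:
            values[field_name] = nodes[0].firstChild.data
    return values


# A first task fills in the protocol, a later task adds the year.
xml_text = form_to_xml("Bando_gara", {"txtProtocol": "1234"})
xml_text = form_to_xml("Bando_gara", {"txtYear": "2001"}, existing_xml=xml_text)
```

The two calls at the end mimic two workflow tasks that each touch only part of the document, so the second call updates the existing XML instead of creating a new one.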

Figure 12: Building the user interface

We have used the Microsoft DOM interface (see footnote 4) to deal with XML documents. In particular, DOM primitives have been accessed through VBScript scripts executed by an Ultimus flobot step. An additional script in the flobot step stores

4 http://msdn.microsoft.com/workshop/Author/dom/domoverview.asp

[Figure 10 content: document classes Bando gara (BG), Determina Approvazione Spesa (DTAS), Determina aggiudicazione Gara (DTAG), Determina Approvazione Progetto (DTAP), Delibera Approvazione Progetto (DLAP), and Delibera Nomina Commissione (DLNC), each with the attributes ID protocollo and Anno emissione; associations carry multiplicities 1 and 0..1.]

[Figure 11 content: UML class diagram and state diagram; DTD and database metadata elements; form-based user interfaces for interactive workflow tasks.]


and indexes newly produced or modified documents into the GIANO repository.

5.2 GIANO user interface reengineering

The second issue concerns the web-based reengineering of the GIANO user interface, which was conducted by two university researchers. The advantages of this reengineering activity are twofold:

• integrating GIANO into the Ultimus workflow client, thus providing the user with the same interface to produce and retrieve the documents;

• enabling external users (the customers of the administration) to access public parts of the administrative documents through the web.

To map the graphical objects of the old user interface onto objects of the new interface we used the guidelines of the MORPH methodology [17, 18, 19]. MORPH was originally developed for reengineering text-based user interfaces of legacy systems to graphical user interfaces; however, it has also been used to reengineer graphical user interfaces from one platform to another. The MORPH method entails three steps: detection, representation, and transformation. In the first step a static analysis is conducted to identify and extract the user interface implementation patterns from the source code. The representation step aims to build a hierarchical abstract model in which the identified user interface coding patterns are the leaves, and higher level conceptual interaction tasks and attributes are abstracted from the lower level patterns. This abstract model is stored in the MORPH knowledge base. The final step defines a set of transformation rules used to turn the abstract model into a concrete implementation with a particular GUI technology.

The authors of the MORPH method have built tools to automate the user interface reengineering task. In our case, the reengineering of the GIANO user interface was conducted manually; nevertheless, the MORPH guidelines were useful to build a mapping between the interaction objects of the old GIANO user interface and the HTML objects provided by the Ultimus client designer. The transformation of the old user interface into the new HTML interface was driven by the need to avoid retraining the existing users of GIANO. Therefore, in addition to establishing a one-to-one correspondence between the old and the new interaction objects, the mapping we built maintained a correspondence between the old screens and the new HTML pages and forms.

In the old version of GIANO, the communication between the client and the server was based on TCP/IP sockets. Therefore, to avoid changing the GIANO server, we needed to reengineer the client-side Java communication layer. In particular, we implemented a DLL that is loaded by the Ultimus workflow engine.
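The one-to-one mapping between old interaction objects and HTML objects, with one HTML form per old screen, can be sketched as a table of transformation rules. This is only an illustration of the idea: the widget kinds, the HTML templates, the `screen_to_form` function, and the example screen are hypothetical and do not reproduce the mapping actually built for GIANO or the MORPH tools.

```python
from html import escape

# Hypothetical rules from legacy interaction objects to HTML form controls.
WIDGET_TO_HTML = {
    "text_field": '<input type="text" name="{name}">',
    "check_box": '<input type="checkbox" name="{name}">',
    "list_box": '<select name="{name}"></select>',
    "push_button": '<button type="submit" name="{name}">{label}</button>',
}


def screen_to_form(screen_name, widgets):
    """Translate one legacy screen into one HTML form, object by object."""
    lines = [f'<form id="{escape(screen_name)}">']
    for kind, name, label in widgets:
        if kind not in WIDGET_TO_HTML:
            raise ValueError(f"no transformation rule for {kind!r}")
        control = WIDGET_TO_HTML[kind].format(
            name=escape(name), label=escape(label)
        )
        if kind == "push_button":
            lines.append(f"  {control}")  # the button carries its own label
        else:
            lines.append(f"  <label>{escape(label)} {control}</label>")
    lines.append("</form>")
    return "\n".join(lines)


# Hypothetical legacy screen: (widget kind, field name, visible label).
search_screen = [
    ("text_field", "protocol", "Protocol ID"),
    ("list_box", "doc_class", "Document class"),
    ("push_button", "search", "Search"),
]
page = screen_to_form("search", search_screen)
```

Because every widget of a screen is translated in place and in order, the resulting form keeps the layout sequence of the old screen, which is the property that limits retraining.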

6. Concluding remarks

The work reported in this paper has addressed the problem of creating an information infrastructure and services for distributed and virtual organisations, and particularly the integration of two key enabling technologies, namely workflow and document management. We have discussed a case study where integration has been achieved by combining several approaches, including software engineering and hypertexts. In this way, we have raised the integration problem from the level of a purely technical issue to a level of conceptual modelling: integration is not focused solely on the information/software systems but involves, and is driven by, the related business processes. This allows a tighter integration of technologies and, primarily, better support for the organisation’s core business processes.

An important driver for our research is the role of the documents during the process advancement. Indeed, in the case of administrative processes, the documents have a central role and can be considered the main artefact produced by the process. Moreover, they are produced through the contributions of different actors, each of whom has a partial view of the documents being produced or searched. Therefore, in this case the integration of workflow and document management is particularly appropriate.

This integration has been necessary because most workflow management systems follow an activity based approach and offer limited support for document production and management. Therefore, to manage a data centred process, such as the tender process in a public administration, repository based document management systems are preferred. However, these systems do not provide support for workflow and cooperation. The problem of defining integrated approaches that offer valid support for both process coordination and document management is a relevant issue [14]. Some approaches in the area of software development processes have also been proposed [9, 10, 26]. In particular, they integrate software process and configuration management activities. However, they are specifically oriented to software organisations and are not adequate to support more general business processes.

Our approach aims at providing an integrated environment for process and document management in public organisations. The activity of analysis and modelling of the documents has been integrated with the user interface design. The document reverse engineering activity has also included the identification of the actors involved in the production of the different parts of the document. This information has been used to define a mapping between the elements of a DTD and the graphical objects of the user interface. Although the document level integration would have been sufficient for integrating GIANO in the workflow prototype, we pursued the migration of the GIANO legacy user interface into a new web-based interface to provide


all the process actors with the same graphical user interface during both the production and the retrieval of the documents. In addition, the migration of the user interface to the web is key to allowing citizens, and not only process actors, to access public parts of the administrative documents stored in the GIANO repository.

The case study presented in this paper was carried out in the context of a technology transfer project involving local Small and Medium Enterprises (SMEs), with the aim of providing them with know-how, in the form of models, methodologies, and technologies, that can be transformed in a short time into an offering of modern software products and services for the Public Administration (PA). The SMEs were selected and classified with respect to their orientation to net-centric technologies and their orientation to the PA market [4].

A consideration that we can derive from our experience concerns the degree of participation of the personnel of the SME partners of the project. We noticed that SMEs with a higher technological orientation tend to be more active than SMEs with a significant market segment in the PA. According to the literature in the technology transfer field, the former can be classified as innovators or early adopters with respect to technology innovation, while the latter have to be classified as late majority adopters [20]. Indeed, our findings confirm that innovators and early adopters trust the evidence built jointly with researchers, whereas early and late majority adopters will wait for business evidence [20]. This suggests that territorial and technology innovation is driven more often by a company’s attitude to be an innovator or early adopter than by a real market push. This is particularly true in our case, because the SMEs oriented to the PA market have their market segment in the peripheral PA. Indeed, while central PA departments are more active in experimenting with new technologies and structuring their information systems needs (often acting as innovators in technological innovation processes [8]), the peripheral PA is characterized by a tendency to conform to innovations dictated by the central PA, rather than to promote them. This also makes the SMEs that supply information systems to the peripheral PA late majority adopters.

Acknowledgments

We would like to thank Aniello Cimitile for his contribution to the LINK project. Special thanks go to Lorenzo Toscano and Antonella Orlacchio for their work on the GIANO system and to Antonio Sanginario for his work on the workflow prototype. We would also like to thank Eustema SpA for the support with the Ultimus Workflow suite and the partners of the project LINK, in particular the Province of Benevento and the SMEs Alphasoft srl, Snap srl, Peoples Network srl, and Techcon srl.

References

[1] C.K. Ames, S.C. Burleigh, and S.J. Mitchell, “WWWorkflow: World Wide Web based workflow”, Proceedings of the 13th International Conference on System Sciences, vol. 4, 1997, pp. 397-404.

[2] L. Aversano, A. Cimitile, G. Canfora, and A. De Lucia, “Migrating Legacy Systems to the Web”, Proceedings of European Conference on Software Maintenance and Reengineering, Lisbon, Portugal, IEEE CS Press, 2001, pp.148-157.

[3] L. Aversano, A. Cimitile, G. Canfora, A. De Lucia, “La Gestione dei Flussi Documentali nella Pubblica Amministrazione: un Caso di Potenziamento dei Rapporti di Fornitura tra Piccole e Medie Imprese e Pubblica Amministrazione”, Atti della Conferenza Nazionale LINK, Roma 16-17 Gennaio 2001.

[4] L. Aversano, G. Canfora, A. De Lucia, and P. Gallucci, “Business Process Reengineering and Workflow Automation: A Technology Transfer Experience”, The Journal of Systems and Software, 2002, to appear.

[5] L. Aversano, A. Cimitile, A. De Lucia, and P. Gallucci, “Web-centric Business Process Reengineering”, Proceedings of 3rd International Workshop on Net-Centric Computing, Toronto, Canada, 2001, pp. 7-11.

[6] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.

[7] G. Booch, J. Rumbaugh and I. Jacobson, The Unified Modelling Language User Guide, Addison-Wesley, 1999.

[8] G. Cantone, “Measure-driven processes and architecture for the empirical evaluation of software technology”, Journal of Software Maintenance: Research and Practice, vol. 12, no. 1, 2000, pp. 47-78.

[9] R. Conradi et al., “Integrated Product and Process Management in EPOS”, Journal of Integrated CAE (special issue on Integrated Product & Process Modeling), 1995.

[10] J. Estublier, S. Dami, and M. Amiour, “High Level Process Modeling for SCM Systems”, Software Configuration Management - ICSE’97 SCM-7 Workshop, LNCS 1235, Springer, Berlin, 1997, pp. 81-97.

[11] D. Georgakopoulos, H. Hornick, and A. Sheth, “An Overview of Workflow Management: from Process Modelling to Workflow Automation Infrastructure”, Distributed and Parallel Databases, vol. 3, 1995.

[12] G. Joeris, “Cooperative and Integrated Workflow and Document Management for Engineering Applications”, Proceedings of the 8th International Workshop on Database and Expert Systems Applications, 1997, pp. 68-73.

[13] I. Jacobson, M. Ericsson, and A. Jacobson, The Object Advantage: Business Process Reengineering with Object Technology, ACM Press, Addison-Wesley, 1995.

[14] G. Joeris, “Cooperative and Integrated Workflow and Document Management for Engineering Applications”, Proceedings of the 8th International Workshop on Database and Expert Systems Applications, 1997, pp. 68-73.

[15] B. Kitchenham, “DESMET: a Method for Evaluating Software Engineering Method and Tools”, Technical Report 96-09, Dep of Computer Science, University of Keele: Staffordshire, UK, 1996.

[16] G. Kappel, S. Rausch-Schott, S. Reich, and W. Retschitzegger, “Hypermedia Document and Workflow Management Based on Active Object-Oriented Databases”, Proceedings of International Conference on System Science, 1997, pp. 377-386.

[17] M. Moore and S. Rugaber, “Using Knowledge Representation to Understand Interactive Systems”, Proceedings of 5th International Workshop on Program Comprehension, Dearborn, MI, IEEE CS Press, 1997, pp. 60-67.

[18] M. Moore, User Interface Reengineering, PhD Dissertation, College of Computing, Georgia Institute of Technology, Atlanta, GA, 1998.

[19] M. Moore and L. Moshkina, “Migrating legacy user interfaces to the Internet: shifting dialogue initiative”, Proceedings of 7th Working Conference on Reverse Engineering, Brisbane, Australia, IEEE CS Press, 2000, pp. 52-58.

[20] S.L. Pfleeger and W. Menezes, “Marketing technology to software practitioners”, IEEE Software, vol. 17, no. 1, 2000, pp. 27-33.

[21] P. Prescod and C.F. Goldfarb, The XML Handbook, Prentice Hall, 2000.

[22] H. Stark and L. Lachal, Ovum Evaluates: Workflow, Ovum ltd., Sept. 1995.

[23] A. Salminen, K. Kauppinen, and M. Lehtovaara, “Standardization of Digital Legislative Documents: a Case Study”, Proceedings of the 29th International Conference on System Sciences, 1996, pp. 72-81.

[24] S. Sumiya and T. Saito, “Development of a Multimedia Document Management System for Cooperative Work Environment”, Proceedings of International Conference on Computer Software and Applications, IEEE CS Press, 1992, pp. 346-355.

[25] A. Spoerri “Infocrystal: A visual Tool for Information retrieval & Management”, Proceedings of International Conference on Information and Knowledge Management (CIKM), Washington, D.C., 1993.

[26] B. Westfechtel, “Integrated Product and Process Management for Engineering Design Applications”, Integrated Computer-Aided Engineering, vol. 3, no. 1, John Wiley & Sons, New York, 1996, pp. 20-35.

[27] Workflow Management Coalition, “Workflow Management Coalition: Reference Model”, 1994, http://www.aiim.org/wfmc/standards/docs/tc003v11.pdf.
