Leveraging Temporal and Spatial Separations with the 24-Hour Knowledge Factory Paradigm


Amar Gupta
Thomas R. Brown Chair in Management and Technology
Eller College of Management, University of Arizona
McClelland Hall, Tucson, AZ 85721-0108
Ph: (520) 307-0547; Email: [email protected]

Rajdeep Bondade
Doctoral Student, Department of Electrical and Computer Engineering, University of Arizona
1230 E. Speedway Blvd, Tucson, AZ 85721
Ph: (520) 360-1974; E-mail: [email protected]

Igor Crk
Doctoral Student, Department of Computer Science, University of Arizona
Gould-Simpson Building, 1040 E. 4th Street, Tucson, AZ 85721
Ph: (520) 820-2890; E-mail: [email protected]

Abstract— The 24-hour knowledge factory paradigm facilitates collaborative effort among geographically distributed offshore teams. The concept of handoff, along with the vertical segregation of tasks, enables software development teams to work on a project on a continuous basis. This is made possible through efficient knowledge representation, as the process of handoff essentially entails transferring all completed work from one site to the next. Data management is one of the key parameters that define the success of this business model; it is achieved by leveraging specific tools, models, and concepts.

Index Terms—data management, global software development, knowledge representation, offshoring


1. INTRODUCTION

The concept of the 24-hour knowledge factory can be described in terms of a set of three or more geographically and temporally dispersed software development teams, all working on the same phase of a project during the daytime hours of their respective time zones. Unlike the traditional global software development model, wherein a complete project is iteratively segregated into small modules and each module is assigned to a development team, the 24-hour knowledge factory is based upon a vertical partition of tasks. In the former case, collaboration between teams is minimal and occurs only when two different modules must be interfaced. In the 24-hour knowledge factory, however, at the end of the work day a particular team transfers all existing work accomplished to the succeeding team, in another part of the world. This new development team continues the same phase of the project where the initial team left off, in a cyclic manner. By judiciously dispersing development teams around the world, it is possible for the same phase of the project to progress on a non-stop basis, thereby increasing the efficiency of the project, since each team perceives that work is accomplished overnight, while developers at that location are asleep. Earlier, the time difference between geographically distributed teams was perceived to be an obstacle to productivity, as it added significant communication overhead, leading to time delays, increased costs, and an overall drop in efficiency. This conception has now changed, and the time difference is being adopted as a major strategic advantage. The advent of Internet-based technologies such as online chat, email, and message boards, along with an array of collaborative tools, has led to growing leveraging of the time difference, enabling companies to better utilize global talent (Sambamurthy, 2003).

In the 24-hour knowledge factory, collaboration between teams is accomplished by the process of handoff. The goal of handoff is to communicate precisely the tasks accomplished during a particular work period, so that the work can be continued efficiently. The entire handoff procedure should ideally consume only a few minutes, which is possible only through efficient data management.

The concept of the 24-hour knowledge factory can be applied not just to the software development environment, but also to a myriad of other knowledge-based applications. The essence of this paradigm lies in its efficient data representation and transfer, and it can therefore be applied to any endeavor governed by similar considerations. As an example, a company involved in the design of new integrated circuits (IC chips) can employ talented engineers from around the world and still have them work in their respective countries and at appropriate hours, without having to ask them to relocate to another country or to work during odd hours. The 24-hour knowledge factory paradigm provides such firms with the potential to design and develop new chips in much shorter periods of time, owing to the non-stop decentralized work paradigm.

The 24-hour knowledge factory research group at the University of Arizona has developed new models for knowledge representation, which are discussed in this paper. Applications of these models for decision support systems by offsite teams are also considered.

2. OUTSOURCING

“As companies disaggregate intellectual activities internally and outsource more externally, they approach true virtual organization with knowledge centers interacting largely through mutual interest and electronic rather than authority systems” (Quinn, 1999). The concept of the 24-hour knowledge factory is an extension of the conventional outsourcing model and provides businesses with the opportunity to reduce costs and development times. As one example, the 24-hour knowledge factory provides the potential to obtain feedback and test results from counterpart offshore sites on an overnight basis, as seen from the perspective of the original site. In the conventional model, a particular development phase must first be completed; only then can a module be tested and feedback provided.

As handoff occurs, the completed work, along with the rationale for the decisions made during the particular shift, is conveyed to the subsequent team. If this transfer of ideas progresses with ease, the time spent on completing a project is drastically reduced.

3. PRACTICAL APPLICATIONS OF THE 24-HOUR KNOWLEDGE FACTORY

The use of geographically widespread development teams is becoming more common. Companies such as IBM, Motorola, Fujitsu, and Siemens have employed different levels of collaboration for their projects. A controlled experiment was conducted at IBM to compare the performance of a team that was entirely collocated with that of a distributed team. The distributed team consisted of two sites, one in the US and the second in India. The two teams were identically structured, with seven core developers of similar experience, and were managed by the same development manager. The teams were organized on a task-oriented basis and were observed closely for a period of 52 weeks. At the beginning of this exercise, it was perceived that quality would be higher in the case of the collocated team, and that overall costs would be lower for the distributed team. In reality, the distributed team outperformed the collocated team in terms of documentation and the knowledge repository. Since all developers in the collocated team were present at one site, the members of that team interacted informally with each other; as such, their computer-based knowledge was constricted in comparison to that of the distributed team. In the latter scenario, knowledge was dispersed and stored through formalized mechanisms, partially due to a high correlation between the tasks being carried out by different sites. It was also observed that the turnaround time to resolve tasks was significantly shorter in the distributed scenario. The most significant obstacle in the latter scenario was the loss of informal communication between developers; this was partially resolved by initiating a face-to-face meeting at the beginning of the endeavor (Seshasai, 2007).

In the literature, geographic proximity is considered to be important in ensuring high productivity of the workforce and optimization of many types of processes. The above example contradicts traditional thinking. It highlights that, with appropriate technology and work procedures, distributed teams can surmount traditional barriers and outperform collocated teams.

4. ENABLING TECHNOLOGIES FOR DATA REPRESENTATION

Knowledge transfer and reuse attain greater importance in the case of outsourcing (Myopolous, 1998). Different technologies must be employed to ensure that the various offshore sites can efficiently share knowledge resources. Technologies such as SharePoint, developed by Microsoft, and Quickr, by IBM, allow developers to share information without having to save any information locally. The Wiki concept incorporates capabilities for version control, search engines, and content indices. Distributed authoring and versioning systems, such as WebDAV, also facilitate collaboration between offshore teams; this approach allows any change made to an artifact in the project, such as program code, diagrams, and documents, to be reflected at all other offshore sites as well. Extreme programming is another relevant concept; it emphasizes collective ownership of code by all members of the development team (Crk, 2008).

Apart from data representation, secure data transfer is also required. Security technologies, such as digital signatures, encryption, and authentication, are relevant for safeguarding the frequent transmission of information across continents. The techniques mentioned above serve as the foundation for the design of secure data representation models and software for the 24-hour knowledge factory.


4.1 Knowfact

Knowfact is an integrated decision support system, developed at MIT under the supervision of the first author of this paper. It was developed to demonstrate the ability of a system to obtain knowledge and to distribute it accordingly. The system encompasses details of all technical and business decisions made at different times, based on inputs from geographically distributed stakeholders.

The Knowfact support system is shown in Figure 1. It consists of two important modules, the Decision Rationale Module (DRM) and the Decision History Module (DHM). The DRM is responsible for defining important attributes of a system; these attributes are then employed by the utility interview to represent the current progress of the project. This component enables developers to represent information on all key aspects in a well-organized manner. The DRM knowledge representation module extracts vital information about the objectives, and this information is used by stakeholders to ascertain the progress of the project. The module serves as a decision support system, as it considers multiple attributes, such as utility and cost-benefit, with respect to a decision to be made. Based on the analysis of these attributes, the DRM presents an evaluation of possible alternatives, thereby enabling an intelligent decision to be made by the stakeholders. Decision-making studies have shown that stakeholders usually face a trade-off between high-quality information and accessible information; under time pressure, decisions are frequently made based on lower-quality information that is more accessible (Todd, 1991; Ahituv, 1998). Knowfact, as well as the broader 24-hour knowledge factory paradigm, emphasizes the importance of an automated repository of knowledge; this is a module that gathers ‘smart’ information in order to facilitate quality decisions being made quickly and readily.


The DHM compiles a history of all parameters that led to a particular decision and represents it in a graphical format. These parameters are then evaluated based on their importance and are used as feedback to determine the values of the attributes defined in the DRM. Developers then use the utility interview to determine the relative importance of the attributes for the next stage of the project. The DHM also incorporates a knowledge repository that is used to store data about the current state of the activity being performed, along with a history of previous states. While most of the system functionality is automated, Knowfact also prompts the user to justify the rationale behind certain major decisions. This aids future developers of the system in understanding the logic behind a particular decision.

Figure 1- KNOWFACT process flow
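To make the DRM's multi-attribute evaluation concrete, the following minimal Python sketch scores competing alternatives by a weighted sum of attribute values. The attribute names, weights, and scores are hypothetical illustrations; the actual Knowfact data model and utility-interview machinery are not reproduced here.

    # Minimal sketch of a DRM-style multi-attribute evaluation.
    # Attribute names, weights, and scores are hypothetical illustrations.

    def evaluate_alternatives(alternatives, weights):
        """Rank alternatives by a weighted sum of attribute scores (highest first)."""
        ranked = []
        for name, scores in alternatives.items():
            total = sum(weights[attr] * scores[attr] for attr in weights)
            ranked.append((total, name))
        return sorted(ranked, reverse=True)

    weights = {"utility": 0.5, "cost_benefit": 0.3, "schedule_risk": 0.2}
    alternatives = {
        "refactor_module": {"utility": 0.8, "cost_benefit": 0.6, "schedule_risk": 0.4},
        "patch_in_place": {"utility": 0.5, "cost_benefit": 0.9, "schedule_risk": 0.7},
    }
    for score, name in evaluate_alternatives(alternatives, weights):
        print(f"{name}: weighted score {score:.2f}")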

4.2 Composite Persona

In the traditional software development cycle, a particular problem is iteratively broken down into tractable segments, and each segment is assigned to a developer. It is the responsibility of the developer to carry the artifact through the processes of coding, testing, and maintenance. This sometimes leads to the development of disjoint solutions, which are understood only by the individuals who created the concerned parts of the code. In many such cases, the problem can be traced back to complex parameters that are highly correlated with other parameters. The problem can be mitigated by establishing a high degree of communication between developers; this can be a formidable challenge, especially in offshoring scenarios, due to geographical, temporal, and cultural distances. Studies have shown that developers spend 70% of their time on collaborative tasks (Vessey, 1995). Therefore, a reduction in this time would lead to a significant increase in productivity.

Scarbrough (1998) observes that the knowledge-based view of the firm focuses on fostering specialization of employees' knowledge and on creating internal networks of these human knowledge sources, while business process reengineering focuses on external relationships for rapidly growing performance, complemented by generalization of the knowledge source. In order to facilitate the creation of such knowledge networks between development teams, the notion of the composite persona has been proposed. A composite persona (CP) is a highly structured micro-team, which possesses the dual properties of both an individual and a team of engineers. To an external observer, the CP appears to be a single entity; internally, however, it consists of many developers. CPs are geographically distributed, with each offshore site having the same number of composite personae and each developer belonging to one or more CPs. A problem is divided into tractable units, with each of the smaller modules handled by a CP rather than by a single developer. It is the responsibility of this micro-team to serve as the owner of the module and the associated classes, as well as to generate the necessary solution. The CP is always active, but with a different driver for each shift, based on the appropriate time zone. A vertical division of tasks is employed in this model, with each team representing a CP working in a sequential manner, around the clock. When a driver of a CP has completed the day's work, the work-in-progress, augmented by details of the rationale for all associated decisions, is handed off to the subsequent driver. The new driver quickly learns the details of the work completed so far, so that work on the incomplete tasks is resumed within a few minutes.

Communication is critical in defining the efficacy of the composite persona model and can be modeled in three forms. The first type of communication is handoff, and its success is determined by how well knowledge can be represented before it is transferred to the subsequent driver. The second form of communication is lateral, occurring in real time between members belonging to the same active CP. The third form of communication consists of lateral communication combined with handoff. Such communication may arise from interaction between two CP drivers at one development site; when that site's workday ends and the data are handed off to another site, the same conversation can be resumed from the received data.
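The handoff itself can be pictured as a structured record that bundles the completed work, the open tasks, and the rationale for the decisions taken during the shift. The Python sketch below is illustrative only; the field names are assumptions rather than the schema of any of the tools described later.

    # Illustrative handoff record for a composite persona (CP).
    # Field names are hypothetical; they mirror the elements described in the text:
    # work completed, rationale for decisions, and open items for the next driver.

    from dataclasses import dataclass, field
    from datetime import datetime, timezone
    from typing import List

    @dataclass
    class HandoffRecord:
        cp_id: str                      # which composite persona this belongs to
        outgoing_driver: str            # developer ending the shift
        incoming_driver: str            # developer starting the next shift
        completed_work: List[str]       # artifacts or tasks finished this shift
        open_tasks: List[str]           # work-in-progress to be resumed
        decision_rationale: List[str]   # why key decisions were made
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat())

    record = HandoffRecord(
        cp_id="CP-billing",
        outgoing_driver="dev_us",
        incoming_driver="dev_india",
        completed_work=["unit tests for invoice parser"],
        open_tasks=["handle malformed currency codes"],
        decision_rationale=["chose streaming parser to bound memory use"],
    )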

4.3 CPro

Knowledge management is defined very broadly, encompassing processes and practices concerned with the creation, acquisition, capture, sharing, and use of knowledge, skills, and expertise (Quintas, 1996). In the 24-hour knowledge factory model, the concept of the composite persona is employed for the sharing of skills and knowledge. A software process developed around this notion is CPro. CPro incorporates the concept of the composite persona and integrates various automated algorithms that can be used to generate work schedules for efficient project completion, to reduce software defects, and to support effective knowledge representation and transfer between the developers of a CP.

The knowledge management life cycle model categorizes knowledge into six different phases: create, organize, formalize, distribute, apply, and evolve (Birkinshaw, 2002). CPro follows a similar approach by breaking down a particular project into subtasks: planning, design, design review, coding, code review, and testing. It then generates an automated schedule for the expected progress of the project, on behalf of all the developers. Since a single developer cannot determine such a schedule alone, owing to the vertical sequence of tasks, and since it is not possible to predict random obstacles, CPro incorporates an algorithm known as the schedule caster to generate the expected deliverables. CPro acquires estimates for all subtasks from the developers and then performs a Monte Carlo simulation to determine the most effective schedule for the different CPs.

CPro asks each of the developers to rate all the tasks on a scale of Very Complex, Complex, Moderate, Simple, and Very Simple, with each of the qualitative measures associated with an offset exponential distribution. Using this information, along with a combination of previous estimates and historical data on actual productivity, the schedule caster employs a continuous distribution to simulate several possible work schedules, as shown in Figure 2.

Figure 2- Simulated work schedules
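The following Python sketch illustrates the flavor of such a Monte Carlo schedule caster: each qualitative rating maps to an offset exponential distribution, and repeated sampling yields a spread of possible total efforts. The offsets and mean values are invented for illustration and do not reproduce CPro's calibration.

    # Minimal sketch of a schedule-caster-style Monte Carlo simulation.
    # The parameters attached to each qualitative rating are hypothetical.

    import random

    # Offset exponential: a fixed minimum effort plus an exponential tail (hours).
    RATING_PARAMS = {
        "very simple":  (1.0, 1.0),
        "simple":       (2.0, 2.0),
        "moderate":     (4.0, 4.0),
        "complex":      (8.0, 6.0),
        "very complex": (16.0, 10.0),
    }

    def sample_effort(rating):
        offset, mean_tail = RATING_PARAMS[rating]
        return offset + random.expovariate(1.0 / mean_tail)

    def simulate_schedule(task_ratings, trials=10_000):
        """Return the mean and 90th-percentile total effort over many simulated runs."""
        totals = sorted(sum(sample_effort(r) for r in task_ratings) for _ in range(trials))
        mean = sum(totals) / trials
        p90 = totals[int(0.9 * trials)]
        return mean, p90

    mean, p90 = simulate_schedule(["moderate", "complex", "simple", "very complex"])
    print(f"expected effort ~{mean:.1f} h, 90th percentile ~{p90:.1f} h")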

The CPro process begins with a learning cycle: each time a developer makes an estimate and then completes the task, the data are recorded in CPro, which then modifies the probability distribution for that developer. Two mechanisms were analyzed in detail. The first mechanism used Bayesian probability with joint distributions, and the second involved case-based reasoning, wherein cases were generated based on prior estimates and actual performance. After the learning process, CPro can generate distribution curves that are more likely to lead to high-productivity schedules.
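A minimal sketch of the case-based flavor of this learning cycle is shown below: it stores (estimate, actual) pairs per developer and scales future estimates by the observed overrun factor. This is a deliberate simplification; CPro's Bayesian and case-based mechanisms are not reproduced here.

    # Simplified calibration loop: record estimate-versus-actual cases per developer
    # and use the average ratio to correct future estimates.

    from collections import defaultdict

    class EstimateCalibrator:
        def __init__(self):
            self.cases = defaultdict(list)   # developer -> [(estimate, actual), ...]

        def record(self, developer, estimate, actual):
            self.cases[developer].append((estimate, actual))

        def corrected_estimate(self, developer, estimate):
            history = self.cases[developer]
            if not history:
                return estimate                      # no history yet: trust the estimate
            ratios = [actual / est for est, actual in history if est > 0]
            correction = sum(ratios) / len(ratios)   # average overrun/underrun factor
            return estimate * correction

    cal = EstimateCalibrator()
    cal.record("dev_us", estimate=4.0, actual=6.0)
    cal.record("dev_us", estimate=2.0, actual=2.5)
    print(cal.corrected_estimate("dev_us", 8.0))     # scaled by the observed overrun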

Once the schedule caster has generated a schedule based on developer inputs and a task has been assigned to a CP, the CP is wholly responsible for that task; it has to be completed fully or dropped. Between CP drivers, handoff occurs at the phase boundary. The concept of the vertically distributed sequence of tasks is preserved, as each subsequent task requires complete knowledge of prior tasks.

CPro has been formalized to assign work using two different algorithms: the random work assignment algorithm and the selected work assignment algorithm. In the random work assignment algorithm, multiple iterations are performed for each task by randomly choosing a developer and applying that developer's estimates to generate a simulated effort. The final schedule is then based on the best result obtained from each developer's distribution curve. This model, however, does not take into consideration any preferences on the developers' part. The selected work assignment model is more sophisticated than the random work assignment model. When the optimal work schedule is being generated for the CP, factors such as shift timings and the complexity involved in the task are taken into consideration.

The selected work assignment model employs a greedy policy for scheduling work. All open tasks are arranged in increasing order of complexity. Once a developer has selected a task, CPro generates a simulation of the actual effort required to complete it. If the task cannot be completed by the end of the shift, and the developer is unwilling to work overtime, then that particular task is disregarded, making it available for another developer. If the developer does intend to complete the task, then CPro provides the capability of locking that particular task, preventing any other developer from being assigned the same task.
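The sketch below illustrates one possible reading of this greedy, selected-work-assignment pass in Python: tasks are visited in increasing order of complexity, a simulated effort is compared against the time remaining in the shift, and accepted tasks are locked. The data layout and the effort simulator are assumptions made for illustration, not CPro's actual implementation.

    # Sketch of a greedy, selected-work-assignment pass as described above.
    # Effort simulation and lock bookkeeping are simplified illustrations.

    def assign_tasks(open_tasks, developer, hours_left, simulate_effort, willing_overtime=False):
        """Walk tasks in increasing complexity; lock the ones that fit into the shift."""
        locked = []
        for task in sorted(open_tasks, key=lambda t: t["complexity"]):
            if task.get("locked"):
                continue                              # already claimed by another developer
            effort = simulate_effort(developer, task)
            if effort <= hours_left or willing_overtime:
                task["locked"] = developer            # lock so no one else is assigned it
                locked.append(task["name"])
                hours_left -= effort
            # otherwise the task is skipped and stays available to the next developer
        return locked

    open_tasks = [
        {"name": "parse config", "complexity": 2},
        {"name": "merge review", "complexity": 1},
        {"name": "refactor I/O", "complexity": 5},
    ]
    print(assign_tasks(open_tasks, "dev_poland", hours_left=6,
                       simulate_effort=lambda d, t: 1.5 * t["complexity"]))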

4.3.1 Defect Reduction Strategies in CPro

Knowledge transfer between members of a CP determines the long-term success of the CP. Defect reduction strategies are employed to ensure that the knowledge being transferred is represented well and is free from errors.

Artifact review is one technique for detecting defects. In the 24-hour knowledge factory paradigm, each individual can serve as a peer reviewer for work done by colleagues in other countries, which helps to identify defects.

Test-driven development is a technique that involves the development of software solutions through rigorous testing of code artifacts. Before any artifact is developed, a minimal set of test cases is generated; as the artifact is coded, tested, and debugged, the discovery of any defect leads to the formulation of another test case. The collection of test cases can be used to represent the CP's understanding of the problem and of its solution.
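A tiny illustration of this test-driven style is given below, using Python's unittest module: the first test is written before the artifact exists, and a defect discovered later becomes a new test case. The function and the scenario are hypothetical, not taken from the projects described in this paper.

    # Test-driven development in miniature: the test precedes the artifact, and a
    # later defect becomes a new test case. The parsing scenario is hypothetical.

    import unittest

    def parse_amount(text):
        """Parse a currency string such as '1,250.50' into a float."""
        return float(text.replace(",", ""))

    class ParseAmountTests(unittest.TestCase):
        def test_plain_number(self):               # written before parse_amount existed
            self.assertEqual(parse_amount("42"), 42.0)

        def test_thousands_separator(self):        # added when a defect was found in review
            self.assertEqual(parse_amount("1,250.50"), 1250.50)

    if __name__ == "__main__":
        unittest.main()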

Effective knowledge transfer is further realized in the 24-hour knowledge factory paradigm through a heuristic assignment process. During the course of the work day, many tasks are available to the current driver of the CP, such as coding, testing, and reviewing an existing piece of code. A set of heuristics has been integrated into CPro; these heuristics are used to recommend a course of action, so that the maximum amount of knowledge is transferred.

4.3.2 MultiMind

Suchan and Hayzak (2001) highlighted that a semantically rich database can be useful in creating a shared language and shared mental models. MultiMind is a tool that embodies this objective; it is based on the CPro software process and builds on other known concepts such as Lifestream. MultiMind uses the 24-hour knowledge factory concept to support collaborative effort among geographically dispersed sites that rely on the CP methodology. The high-level representation of MultiMind is shown in Figure 3.


Figure 3- MultiMind Architecture

MultiMind builds upon known business services techniques and IT technologies, such as the Scrum knowledge transfer technique and extreme programming concepts, in order to perform handoff. Scrum stand-up meetings are automated, as work summaries of the project are generated automatically by the embedded project management system. When a developer signs out, the system automatically captures a snapshot of all relevant data. When the subsequent CP driver logs in, Scrum reports are generated based on the current state of the project and previous reports. In the design of this prototype, the objectives were to automate what can be automated, store everything that can be stored, and minimize the cognitive overhead of cooperation (Gupta, 2007). MultiMind supports the following functionality:

• It defines atomic editing operations;
• It stores every edit (down to the atomic level) and every form of communication (such as e-mail and chat), and it keeps a record of every document browsed and every search made;
• It automatically creates a summary by monitoring the activity of each worker and prepares a hierarchical report of all the changes made; and
• It creates an informational summary by compressing the edit log.

A decision support system is an interactive, computer-based information system that utilizes decision rules and models, coupled with a comprehensive database, to support all phases of the decision-making process, mainly for semi-structured (or unstructured) decisions, under the full control of the decision makers (Yang, 1995). A novel decision support capability is incorporated into MultiMind using a tool called Lifestream, along with ideas from Activity Theory. MultiMind incorporates a monotonically increasing database, which tracks all accessed objects and knowledge events (Gupta, 2007). Knowledge events are generated when a developer accesses some form of electronic memory, such as the project knowledge base, email, or the World Wide Web. Every event that occurs during the course of a project is associated with a timestamp and carries information about the task performed by the previous user. The knowledge events formally defined in MultiMind are listed below; a minimal sketch of such an event log follows the list:

1. LoginEvent and LogoutEvent: generated when an agent begins or completes a shift;
2. MessageViewEvent and MessageSearchEvent: generated when an agent reads or searches through a message;
3. ExternalWebAccessEvent and ExternalWebSearchEvent: generated when an agent accesses a webpage or issues a web search; and
4. DocumentSearchEvent: generated when an agent searches memory for a particular document.
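The sketch below shows one way such an append-only event log could look in Python, using the event names listed above. The class layout and fields are assumptions for illustration; MultiMind's internal schema is not reproduced here.

    # Sketch of a monotonically increasing knowledge-event log using the event
    # names defined above. The layout is an illustrative assumption.

    import time
    from dataclasses import dataclass, field

    @dataclass
    class KnowledgeEvent:
        kind: str            # e.g. "LoginEvent", "MessageViewEvent", "DocumentSearchEvent"
        agent: str
        detail: str = ""
        timestamp: float = field(default_factory=time.time)

    class EventLog:
        def __init__(self):
            self._events = []          # append-only: events are never modified or removed

        def record(self, kind, agent, detail=""):
            self._events.append(KnowledgeEvent(kind, agent, detail))

        def between(self, start, end):
            return [e for e in self._events if start <= e.timestamp <= end]

    log = EventLog()
    log.record("LoginEvent", "dev_india")
    log.record("ExternalWebSearchEvent", "dev_india", "offset exponential distribution")
    log.record("LogoutEvent", "dev_india")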

If any confusion arises regarding a decision taken by an earlier CP driver, the recorded sequence of events that led to that decision can be utilized to determine the future course of action. Information on documents and artifacts accessed, web searches conducted, messages exchanged, and other sequences of events can be quickly analyzed to understand the rationale behind a previous decision. This process is known as code justification; it is achieved by mining Lifestream for relevant project artifacts and knowledge events that occurred within the desired frame of time. Currently, research is being conducted to incorporate intelligent filters into the semantic analysis process, in order to remove redundant information and retain only the information relevant to the specific decision. The Lifestream process is shown in Figure 4.

Figure 4- Lifestream Process
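Building on the event-log sketch above, code justification can be pictured as a query that gathers the knowledge events recorded in a window around the decision of interest and groups them by kind. The window size and grouping below are illustrative choices, not MultiMind's actual mining procedure.

    # Sketch of a code-justification query over an event log such as the one above:
    # collect events within a time window around a decision and group them by kind.

    from collections import defaultdict

    def justify_decision(event_log, decision_time, window_seconds=3600):
        """Return events within +/- window_seconds of the decision, grouped by kind."""
        start, end = decision_time - window_seconds, decision_time + window_seconds
        grouped = defaultdict(list)
        for event in event_log.between(start, end):
            grouped[event.kind].append(event)
        return grouped

    # Usage, reusing the EventLog sketch shown earlier:
    # rationale = justify_decision(log, decision_time=time.time())
    # for kind, events in rationale.items():
    #     print(kind, len(events))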

MultiMind also includes an embedded discussion tool for all project artifact interfaces. This tool uses concepts from Speech Act theory to classify communication into different categories. New message threads can take the form of an ‘inquiry’ or an ‘inform’. When a developer responds to a message, the response can take the form of a ‘response’ or of another ‘inquiry’ posing a counter-question. MultiMind recognizes the speech acts shown in Figure 5.

Figure 5- Hierarchy of Speech Acts
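The sketch below illustrates the discussion-tool idea: each message carries a speech act, and replies are checked against a small table of acceptable follow-on acts. The table is an assumption built only from the categories named in the text; the full hierarchy of Figure 5 is not reproduced.

    # Illustrative speech-act tagging for discussion threads. The reply table is
    # an assumption based on the categories named in the text.

    VALID_REPLIES = {
        "inquiry": {"response", "inquiry"},   # an inquiry may be answered or countered
        "inform": {"inquiry"},                # assumed: an inform may prompt a follow-up question
        "response": {"inquiry"},              # assumed: a response may prompt a follow-up question
    }

    class DiscussionThread:
        def __init__(self, act, text, author):
            assert act in ("inquiry", "inform"), "new threads are inquiries or informs"
            self.messages = [(act, text, author)]

        def reply(self, act, text, author):
            last_act = self.messages[-1][0]
            if act not in VALID_REPLIES[last_act]:
                raise ValueError(f"a '{act}' cannot follow a '{last_act}'")
            self.messages.append((act, text, author))

    t = DiscussionThread("inquiry", "Why was the streaming parser chosen?", "dev_poland")
    t.reply("response", "It bounds memory use for large invoices.", "dev_us")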


4.4 Implicit and Explicit Communication in the 24-hour knowledge factory

In his book The Mythical Man-Month, Fred Brooks describes the problems associated with growing teams of developers (Brooks, 1995). The amount of communication overhead that is introduced can lead to negative consequences. In order to mitigate this potential problem, several design decisions were made with respect to implicit and explicit communication. First, in order to maintain around-the-clock productivity, a distributed CP must contain functional expertise at each of the distributed sites. Second, although communication channels exist between sites, the ones usable in real time are constrained to those within the co-located subsets of the CPs.

Due to the distributed nature of expertise within a CP, the communication overhead between sites could be potentially crippling at stages of development that involve heavy communication. This is only true, however, if explicit communication is considered the sole means of transmitting and sharing knowledge. Studies have shown that expert teams rely more on implicit communication strategies and on a shared mental model of the task at hand (Gupta, 2007). These models can incorporate knowledge about the various system components and the relationships between them, and they are developed through interaction with other members of the team. Thus, both explicit and implicit communication are important for the 24-hour knowledge factory. Research has also shown that knowledge is created by the interaction between explicit and tacit knowledge (Nonaka, 1994); the boundary between the two is not clear-cut, with one form of knowledge being created by the other (Spender, 1996).

In order to facilitate explicit communication for knowledge transfer, Jazz offers a collaborative environment. Jazz was developed by IBM and is an extension of the Eclipse IDE. It provides communication tools within the IDE itself and utilizes context provided by the IDE to create meaningful collaboration artifacts. It focuses on project planning and on facilitating high-level communication between developers, thereby reducing the dependence on special-purpose communication tools and informal meetings. Implicit communication, on the other hand, is facilitated by the use of tools that increase the understanding of project artifacts, as well as their interconnections and dependencies. A number of commercial and non-commercial code visualization tools and code browsers are now available; they implement software visualization, which forms the basis of implicit communication. Tools such as CodeCrawler, X-Ray, Seesoft, Tarantula, and Moose try to enhance the developer's understanding of the code by illustrating the logical structure, execution-related line-level statistics, and the locations of errors within the executed code. These tools are largely overlooked due to their lack of integration with existing integrated development environments (IDEs).

4.5 EchoEdit

Three surveys conducted in 1977, 1983, and 1984 highlighted the lack of documentation as one of the biggest problems that code maintenance personnel have to deal with (Dekleva, 1992). In order to ensure that code artifacts maximize knowledge transfer, developers must ensure that they are well documented. Techniques such as literate programming, a concept introduced by Donald Knuth that combines a documentation language with a programming language (Knuth, 1984), along with tools such as TeX, CWEB, and METAFONT, can be used for efficient documentation. EchoEdit is a tool that combines documentation with voice recognition capabilities (O’Toole, 2008). This tool allows programmers to generate audio comments that are subsequently converted into text and embedded into the code. Comments can also be played back to the developer. Audio comments in EchoEdit are usually reserved for high-level issues, while low-level syntactic and semantic issues are addressed in textual format, thereby providing dual-mode knowledge representation.

Currently, EchoEdit is developed as a plug-in for the Eclipse environment. When another programmer enters this environment, voice annotations can be played back by selecting an icon displayed in the margin, similar to the approach taken in the PREP environment. EchoEdit supports three user actions: the recording of an audio comment, the playback of the audio file, and the conversion of the audio file into a text comment. An example of a comment, after conversion from audio format, is shown in Figure 6.

Figure 6- Example of EchoEdit

The block diagram for EchoEdit is shown in Figure 7 (O’Toole, 2008). When an audio comment is to be recorded, a recording event is generated and sent to the action handler, which initializes the audio recorder and starts listening for the record event. Two threads are created based upon this event: one to perform the recording and the other to convert the voice to text. During recording, all data are stored in memory; once conversion is completed, the text, along with the audio file, is inserted back into the original file being worked on.
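The following Python sketch mimics that two-threaded flow with stubbed-in audio capture and transcription: one thread produces audio chunks, the other consumes and "transcribes" them, and the resulting text is returned as a comment to be inserted into the source file. The real plug-in is an Eclipse component; nothing here reproduces its actual APIs.

    # Two-threaded record-and-transcribe flow, with audio capture and speech
    # recognition replaced by stand-ins for illustration.

    import threading, queue

    def capture_audio(chunks):
        for i in range(3):                      # stand-in for microphone capture
            chunks.put(f"<audio-chunk-{i}>")
        chunks.put(None)                        # sentinel: recording finished

    def transcribe(chunks, results):
        text = []
        while (chunk := chunks.get()) is not None:
            text.append(f"word{len(text)}")     # stand-in for speech-to-text
        results.append(" ".join(text))

    def record_comment():
        chunks, results = queue.Queue(), []
        recorder = threading.Thread(target=capture_audio, args=(chunks,))
        converter = threading.Thread(target=transcribe, args=(chunks, results))
        recorder.start(); converter.start()
        recorder.join(); converter.join()
        return "# (voice) " + results[0]        # comment inserted into the source file

    print(record_comment())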


EchoEdit is one of the plug-ins developed for the 24-hour knowledge factory. With the aid of both text and audio comments for knowledge representation, handoff can be accomplished within a period of a few minutes.

Figure 7- Block diagram of EchoEdit

5. FURTHER RESEARCH RELATED TO THE 24-HOUR KNOWLEDGE FACTORY

There are several areas that require further research in order to fully exploit the potential of the 24-hour knowledge factory. In order to surmount spatial, temporal, and other kinds of boundaries, Gupta has advocated a four-faceted knowledge-based approach that places emphasis on (Gupta, 2001):

• Knowledge Acquisition, or the tapping of traditional systems to provide accurate and comprehensive material for the new knowledge-based systems;
• Knowledge Discovery, or the automated mining of numerical, textual, pictorial, audio, video, and other types of information, to capture underlying knowledge, either on a one-time basis or on a continuous basis;
• Knowledge Management, to deal with the different types of heterogeneities that invariably exist when inputs have to cross over borders of different types (national, organizational, departmental, and others); and
• Knowledge Dissemination, to extract, customize, and direct knowledge to appropriate departments and users, based on their individual needs.

So far, most of the research has focused on collaboration on a limited scale, occurring sporadically. In the case of the 24-hour knowledge factory, collaboration occurs continually; as such, the research group at the University of Arizona is currently focusing on the development of IS tools that can better address the requirements of knowledge manipulation and enhancement in a highly decentralized environment. By building upon research conducted in the areas of knowledge acquisition, knowledge discovery, knowledge management, and knowledge dissemination, the group is striving to reduce the overhead placed on distributed developers.

The 24-hour knowledge factory research group is also trying to develop new algorithms for the efficient decomposition of a project into sub-tasks, so that each sub-task can be performed individually by a professional without having to know the specific details of the other sub-tasks. Berger describes multiple evolutions of technology in which the splitting of tasks served as the catalyst for new work models and new technologies (Berger, 2006). The IBM System 360 made use of this concept by disaggregating the mainframe hardware from the software logic, thereby allowing separate workers to add value to each component separately and without knowledge of each other's work. This concept is being adapted for multiple disciplines, so that a “component-based approach” can be utilized to construct basic artifacts for these respective disciplines. Research is also being conducted on different techniques to combine the separately completed tasks into a finished, marketable product or service.


6. CONCLUSION

Efficient knowledge management and transfer is a critical aspect of the 24-hour knowledge factory model. The composite persona model, manifested in the CPro process, utilizes multiple approaches to facilitate efficient operation, on a round-the-clock basis, in a globally distributed work environment. MultiMind and EchoEdit have been developed as concept demonstration tools to validate some of the key design principles, especially those related to the process of efficient handoff. Such tools provide a comprehensive knowledge repository of the rationale behind all decisions, offer an intelligent explanation of high-level and low-level details of code artifacts, and formulate the most efficient schedule for attaining maximum performance in all phases of the project. All these facets help to ensure that collaboration between multiple teams occurs in an optimal manner. Overall, the 24-hour knowledge factory paradigm provides the potential for projects to progress in a non-stop manner with minimal communication overhead, thereby increasing productivity. Whereas traditional literature considers geographic proximity to be of vital importance in ensuring high productivity of the workforce and optimization of many types of processes, some of the cited examples contradict this thinking and highlight that, in some cases, distributed teams can outperform collocated teams.

ACKNOWLEDGEMENTS

We would like to acknowledge the following current and former members of the 24-Hour Knowledge Factory Research Group at the University of Arizona for their research ideas: Nathan Denny, Kate O’Toole, Shivaram Mani, Ravi Seshu, Manish Swaminathan, and Jamie Samdal.


REFERENCES

Ahituv, N., Igbaria, M. & Sella, A. (1998) “The effects of time pressure and completeness of information on decision making”, Journal of Management Information Systems, Vol. 15, No. 2, pgs 153-172

Battin, R., Crocker, D., Kreidler, J. & Subramanian, K. (2001) “Leveraging Resources in Global Software Development”, IEEE Software, Vol. 18, Issue 2

Beck, K. (2000) “Extreme Programming Explained: Embrace Change”, Addison-Wesley

Birkinshaw, J. & Sheehan, T. (2002) “Managing the knowledge life cycle”, MIT Sloan Management Review, Vol. 44, No. 1, pgs 75-83

Brooks Jr., F. P. (1995) “The Mythical Man-Month”, Addison-Wesley

Carmel, E. & Agarwal, R. (2002) “The Maturation of Offshore Sourcing of Information Technology Work”, MIS Quarterly Executive, Vol. 1, Issue 2

Crk, I., Sorensen, D. & Mitra, A. (2008) “Leveraging Knowledge Reuse and System Agility in the Outsourcing Era”, Journal of Information Technology Research, Vol. 1, Issue 2, pgs 1-20

Dekleva, S. (1992) “Delphi study of software maintenance problems”, Conference on Software Maintenance Proceedings, pgs 10-17

Denny, N., Crk, I. & Seshu, R. (2008a) “Agile Software Processes for the 24-Hour Knowledge Factory Environment”, Journal of Information Technology Research, Vol. 1, Issue 1, pgs 57-71

Denny, N., Mani, S., Seshu, R., Swaminathan, M. & Samdal, J. (2008b) “Hybrid Offshoring: Composite Personae and Evolving Collaboration Technologies”, Journal of Information Technology Review, Vol. 21, Issue 1, pgs 89-104

Freeman, E. & Gelernter, D. (1996) “Lifestreams: A Storage Model for Personal Data”, ACM SIGMOD Record

Gao, J. Z., Chen, C., Toyoshimo, Y. & Leung, D. K. (1999) “Engineering on the Internet for Global Software Production”, IEEE Computer, Vol. 32, No. 5, pgs 38-47

Gupta, A. (2001) “A Four-Faceted Knowledge-Based Approach to Surmounting National and Other Borders”, Journal of Knowledge Management, Vol. 5, No. 4

Gupta, A. & Seshasai, S. (2007) “24-Hour Knowledge Factory: Using Internet Technology to Leverage Spatial and Temporal Separations”, ACM Transactions on Internet Technology, Vol. 7, No. 3, Article 14

Hill, C., Yates, R., Jones, C. & Kogan, S. (2006) “Beyond predictable workflows: Enhancing productivity in artful business processes”, IBM Systems Journal, Vol. 45, No. 4, pgs 663-682


Humphrey, W. S. (1995) “Introducing the Personal Software Process”, Annals of Software Engineering, Vol. 1, No. 1, pgs 311-325

Knuth, D. (1984) “Literate Programming”, The Computer Journal, Vol. 27, Issue 2, pgs 97-111

Lacity, M., Willcocks, D. & Feeny, D. (1995) “IT Sourcing: Maximize Flexibility and Control”, Harvard Business Review, Vol. 73, No. 3, pgs 84-93

Myopolous, J. (1998) “Information Modeling in the Time of Revolution”, Information Systems, Vol. 23, No. 3, pgs 127-155

Nonaka, I. (1994) “A dynamic theory of organizational knowledge creation”, Organization Science, Vol. 5, No. 1, pgs 14-39

O’Toole, K., Subramanian, S. & Denny, N. (2008) “Voice-Based Approach for Surmounting Spatial and Temporal Separations”, Journal of Information Technology Research, Vol. 1, Issue 2, pgs 54-60

Quinn, J. B. (1999) “Strategic Outsourcing: Leveraging Knowledge Capabilities”, Sloan Management Review, Vol. 40, No. 4, pgs 9-21

Quintas, J. B., Anderson, P. & Finkelstein, S. (1996) “Managing professional intellect: making the most of the best”, Harvard Business Review, Vol. 74, pgs 71-80

Sambamurthy, V., Bharadwhaj, A. & Grover, V. (2003) “Shaping Agility through Digital Options: Reconceptualizing the Role of Information Technology in Contemporary Firms”, MIS Quarterly, Vol. 27, No. 2, pgs 237-263

Scarbrough, H. (1998) “BPR and the knowledge-based view of the firm”, Knowledge and Process Management, Vol. 5, No. 3, pgs 192-200

Searle, J. R. (1975) “A Taxonomy of Illocutionary Acts”, in Language, Mind and Knowledge, Minnesota Studies in the Philosophy of Science, pgs 344-369

Seshasai, S. & Gupta, A. (2007) “Harbinger of the 24-Hour Knowledge Factory”, available at SSRN: http://ssrn.com/abstract=1020423

Schwaber, K. & Beedle, M. (2002) “Agile Software Development with Scrum”, Series in Agile Software Development, Upper Saddle River, NJ: Prentice Hall

Spender, J. C. (1996) “Organizational knowledge learning and memory: three concepts in search of a theory”, Journal of Organizational Change Management, Vol. 9, No. 1, pgs 63-78

Suchan, J. & Hayzak, G. (2001) “The communication characteristics of virtual teams: a case study”, IEEE Transactions on Professional Communication, Vol. 44, No. 3


Todd, P. & Benbasat, I. (1991) “An experimental investigation of the impact of computer-based decision aids on decision making strategies”, Information Systems Research, Vol. 2, No. 2, pgs 87-115

Vessey, I. & Sravanapudi, A. P. (1995) “Case Tools as Collaborative Support Technologies”, Communications of the ACM, Vol. 38, Issue 1, pgs 83-95

Whitehead, E. J. (1997) “World Wide Web Distributed Authoring and Versioning (WebDAV): An Introduction”, StandardView, Vol. 5, No. 1, pgs 3-8

Yang, H. L. (1995) “Information/knowledge acquisition methods for decision support systems and expert systems”, Information Processing and Management, Vol. 31, No. 1, pgs 47-58

BIOGRAPHY OF AUTHORS

Amar Gupta has held the Thomas R. Brown Chair in Management and Technology at the University of Arizona since 2004, where he is a professor of entrepreneurship, management information systems, and computer science. Earlier, he was with the MIT Sloan School of Management (1979-2004). He served as the founding co-director of the Productivity From Information Technology (PROFIT) initiative.

Rajdeep Bondade is a doctoral student in Electrical and Computer Engineering at the University of Arizona. He received his Bachelor of Engineering in Electronics and Communications from the PES Institute of Technology, Bangalore, India. He joined the 24-Hour Knowledge Factory research group in January 2008.


Igor Crk is currently pursuing a doctoral degree in computer science at the University of Arizona, where he also earned a master's degree in computer science. His current research interests include context-driven energy management in operating systems and the 24-Hour Knowledge Factory model for distributed agile software development.

LIST OF FIGURES

Figure 1- KNOWFACT process flow

Figure 2- Simulated work schedules

Figure 3- MultiMind Architecture

Figure 4- Lifestream Process

Figure 5- Hierarchy of Speech Acts

Figure 6- Example of EchoEdit

Figure 7- Block diagram of EchoEdit
