Leveraging temporal and spatial separations with the 24-hour knowledge factory paradigm
Electronic copy available at: http://ssrn.com/abstract=1122506
Amar Gupta
Thomas R. Brown Chair in Management and Technology
University of Arizona, Eller College of Management
McClelland Hall, Tucson, AZ 85721-0108
Ph: (520) 307-0547
Email: [email protected]

Rajdeep Bondade
Doctoral Student, University of Arizona
Department of Electrical and Computer Engineering
1230 E. Speedway Blvd, Tucson, AZ 85721
Ph: (520) 360-1974
E-mail: [email protected]

Igor Crk
Doctoral Student, University of Arizona
Department of Computer Science
Gould-Simpson Building, 1040 E. 4th Street, Tucson, AZ 85721
Ph: (520) 820-2890
E-mail: [email protected]
Abstract— The 24-hour knowledge factory paradigm facilitates collaborative effort between geographically distributed offshore teams. The concept of handoff, along with the vertical segregation of tasks, enables software development teams to work on a project on a continuous basis. This notion is made possible through efficient knowledge representation, as the process of handoff essentially entails transferring all completed work from one site to the next. Data management is one of the key parameters that defines the success of this business model; it is achieved by leveraging specific tools, models, and concepts.
Index Terms—data management, global software development, knowledge representation, offshoring
1. INTRODUCTION
The concept of the 24-hour knowledge factory can be described in terms of a set of three or
more geographically and temporally dispersed software development teams, all working on the
same phase of a project, each during the daytime hours of its own time zone. Unlike
the traditional global software development model, wherein a complete project is iteratively
segregated into small modules and assigned to a development team, the 24-hour knowledge
factory is based upon a vertical partition of tasks. In the former case, collaboration between
teams is minimal and occurs only in conditions that deal with interfacing two different modules.
In the 24-hour knowledge factory however, at the end of the work day, a particular team transfers
all existing work accomplished to the succeeding team, in another part of the world. This new
development team continues the same phase of the project, where the initial team left off, in a
cyclic manner. By judiciously dispersing development teams around the world, it is possible for
the same phase of the project to progress on a non-stop basis, thereby increasing the efficiency of
the project since each team perceives that work is accomplished overnight, when developers at
that location are asleep. Earlier, the time difference between geographically distributed teams
was perceived as an obstacle to productivity: it added significant communication overhead,
leading to time delays, increased costs, and an overall drop in efficiency. This perception has
now changed, and the time difference is being adopted as a major strategic advantage. The
advent of Internet-based technologies, such as online chat, email, and message boards, along
with an array of collaborative tools, has enabled companies to leverage the time difference and
to better utilize global talent (Sambamurthy 2003).
In the 24-hour knowledge factory, collaboration between teams is accomplished through the
process of handoff. The goal of handoff is to communicate precisely the tasks accomplished
during a particular work period, so that the work can be continued efficiently. The entire handoff
procedure should ideally consume only a few minutes, which is possible only through efficient
data management.
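The content of a handoff can be sketched as a structured record; the field names below are hypothetical, chosen only to illustrate the kind of state (completed work, open tasks, and decision rationale) that must travel between sites:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class HandoffPackage:
    """Work-in-progress state passed from one site to the next (illustrative fields)."""
    site: str                 # site completing its shift
    next_site: str            # site receiving the work
    completed_tasks: list     # tasks finished during this shift
    open_tasks: list          # tasks to be resumed by the next team
    decision_rationale: dict  # decision -> recorded justification
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

pkg = HandoffPackage(
    site="Tucson",
    next_site="Pune",
    completed_tasks=["implement parser"],
    open_tasks=["unit tests for parser"],
    decision_rationale={"parser": "recursive descent chosen for readability"},
)
```

Because the rationale travels with the work itself, the receiving team can resume the open tasks without waiting for a synchronous conversation.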
The concept of the 24-hour knowledge factory can be applied not just to the software
development environment, but also to a myriad of other knowledge-based applications. The
essence of this paradigm lies in its efficient data representation and transfer, and hence can be
applied to any endeavor that is dictated by similar terms. As an example, a company involved in
the design of new integrated circuits (IC chips) can employ talented engineers from around the
world, and still have them work in their respective countries and at appropriate hours, without
having to ask them to relocate to another country or to work during odd hours. The 24-hour
knowledge paradigm provides such firms with the potential to design and develop new chips in
much shorter periods of time, due to the non-stop decentralized work paradigm.
The 24-hour knowledge factory research group at the University of Arizona has developed new
models for knowledge representation, which are discussed in this paper. Applications of these
models for decision support systems by offsite teams are also considered in this paper.
2. OUTSOURCING
“As companies disaggregate intellectual activities internally and outsource more externally,
they approach true virtual organization with knowledge centers interacting largely through
mutual interest and electronic rather than authority systems” (Quinn 1999). The concept of the
24-hour knowledge factory is an extension of the conventional outsourcing model and provides
businesses with the opportunity to reduce costs and development times. As one example, the 24-
hour knowledge factory provides the potential to obtain feedback and to test results by
counterpart offshore sites, on an overnight basis as seen from the perspective of the original site.
In the conventional model, a particular development phase is completed; only then can a module
be tested and feedback provided.
As handoff occurs, completed work along with the rationale for the decisions made during the
particular shift is conveyed to the subsequent team. If this transfer of ideas progresses with ease,
then the time spent on completing a project drastically reduces.
3. PRACTICAL APPLICATIONS OF THE 24-HOUR KNOWLEDGE FACTORY
The use of development teams that are geographically widespread is becoming more common.
Companies such as IBM, Motorola, Fujitsu, and Siemens have employed different levels of
collaboration for their projects. A controlled experiment was conducted at IBM to evaluate the
performance between a team that was entirely collocated, and another which was distributed.
The distributed team consisted of two sites, one in the US and the second in India. These two
teams were identically structured, with seven core developers of similar experience and managed
by the same development manager. The teams were organized on a task-oriented basis and were
observed closely for a period of 52 weeks. At the beginning of this exercise, it was perceived that
the quality would be higher in the case of the collocated team, and that the overall costs would be
lower for the distributed team. In reality, the distributed team outperformed the collocated team
in terms of the documentation and knowledge repository. Since all developers in the collocated
team were present at one site, they interacted informally with each other; as a result, the
computer-based knowledge base was limited in comparison to that of the distributed team. In
the latter scenario, knowledge was dispersed and stored through formalized mechanisms,
partly owing to the high correlation between the tasks being carried out by different
sites. It was also observed that the turnaround time to resolve tasks was significantly shorter in
the distributed team scenario. The most significant obstacle in the latter scenario was the loss of
informal communication between developers; this was partially resolved by initiating a face-to-
face meeting at the beginning of the endeavor (Seshasai 2007).
In the literature, geographic proximity is considered to be important in ensuring high
productivity of the workforce, and optimization of many types of processes. The above example
contradicts traditional thinking. It highlights that with appropriate technology and work
procedures, distributed teams can surmount traditional barriers and outperform collocated teams.
4. ENABLING TECHNOLOGIES FOR DATA REPRESENTATION
Knowledge transfer and reuse attain greater importance in the case of outsourcing (Mylopoulos,
1998). Different technologies must be employed to successfully ensure that various offshore sites
can efficiently share knowledge resources. Technologies such as SharePoint, developed by
Microsoft, and Quickr by IBM allow developers to share information without having to save any
information locally. The Wiki concept incorporates capabilities for version control, search
engines, and content indices. Distributed authoring and versioning systems, such as WebDAV,
also facilitate collaboration between offshore teams; this tool allows any change made to an
artifact in the project, such as program codes, diagrams, and documents, to be reflected at all
other offshore sites as well. Extreme programming is another relevant concept; it emphasizes
collective ownership of code by all members belonging to the development team (Crk, 2008).
Apart from data representation, secure data transfer is also required. Security technologies, such
as digital signatures, encryptions and authentication, are relevant for safeguarding the frequent
transmissions of information across continents.
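As one illustrative, standard-library-only sketch of how such a transmission might be authenticated, an HMAC over the serialized handoff payload lets the receiving site verify integrity under a shared key; a production system would more likely use public-key digital signatures and encrypted transport, and the key below is a placeholder:

```python
import hashlib
import hmac
import json

SHARED_KEY = b"example-key-shared-between-sites"  # placeholder; real sites would use managed keys

def sign_payload(payload: dict) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON serialization."""
    data = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(SHARED_KEY, data, hashlib.sha256).hexdigest()

def verify_payload(payload: dict, tag: str) -> bool:
    """Constant-time comparison guards against timing attacks."""
    return hmac.compare_digest(sign_payload(payload), tag)

handoff = {"site": "Tucson", "open_tasks": ["unit tests for parser"]}
tag = sign_payload(handoff)
```

Any tampering with the payload in transit causes verification at the receiving site to fail.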
The techniques mentioned above serve as the foundations to the design of secure data
representation models and software for the 24-hour knowledge factory.
4.1 Knowfact
Knowfact is an integrated decision support system, developed at MIT under the supervision of
the first author of this paper. It was developed to demonstrate the ability of a system to obtain
knowledge and to distribute it accordingly. This system encompasses details of all technical and
business decisions made at different times, based on inputs from geographically distributed
stakeholders.
The Knowfact support system is shown in Figure 1. It consists of two important modules, the
Decision Rationale Module (DRM) and the Decision History Module (DHM). The DRM is
responsible for defining important attributes of a system; these attributes are then employed by
the utility interview to represent the current progress of the project. This component enables
developers to represent information on all key aspects in a well-organized manner. The DRM
knowledge representation module extracts vital information about the objectives and this
information is used by stakeholders to ascertain the progress of the project. The module serves as
a decision support system, as it considers multiple attributes, such as utility and cost-benefit,
with respect to a decision to be made. Based on the analysis of these attributes, the DRM
presents an evaluation of possible alternatives, thereby enabling an intelligent decision to be
made by the stakeholders. Decision-making studies have shown that stakeholders usually face a
trade-off between quality information and accessible information; under time pressure, decisions
are frequently made based on lower quality information that is more accessible (Todd, 1991;
Ahituv, 1998). Knowfact, as well as the broader 24-hour knowledge factory paradigm,
emphasizes the importance of an automated repository of knowledge; this is a module that
gathers ‘smart’ information so that quality decisions can be made quickly and
readily.
The DHM compiles a history of all parameters that led to a particular decision, and represents
it in a graphical format. These parameters are then evaluated based on their importance, and are
used as feedback to determine the values of the attributes defined in the DRM. Developers then
use the utility interview to determine the relative importance of the attributes for the next stage of
the project. The DHM also incorporates a knowledge repository that is used to store data about
the current state of the activity being performed, along with a history of previous states. While
most of the system functionality is automated, Knowfact also prompts the user to justify the
rationale behind certain major decisions. This aids future developers of the system to understand
the logic behind a particular decision.
Figure 1- KNOWFACT process flow
4.2 Composite Persona
In the traditional software development cycle, a particular problem is iteratively broken down
into tractable segments, and each segment is assigned to a developer. It is the responsibility of
the developer to carry the artifact through the processes of coding, testing, and maintenance. This
sometimes leads to the development of disjoint solutions, which are understood only by the
individuals who created the concerned parts of the code. In many such cases, the problem can be
traced back to complex parameters that are highly correlated with other parameters. This
problem can be mitigated by establishing a high degree of communication between developers;
this can be a formidable challenge, especially in offshoring scenarios, due to geographical,
temporal, and cultural distances. Studies have shown that developers spend 70% of their time on
collaborative tasks (Vessey, 1995). Therefore, a reduction in this time would lead to a significant
increase in productivity.
Scarbrough (1998) observes that the knowledge-based view of a firm focuses on fostering the
specialization of employees’ knowledge and on creating internal networks of these human
knowledge sources, while business process reengineering focuses on external relationships that
rapidly improve performance, complemented by a generalization of the knowledge source. In order
to facilitate the creation of such knowledge networks between development teams, the notion of
the composite persona has been proposed. A composite persona (CP) is a highly structured
micro-team, which possesses dual properties of both the individual and a team of engineers. To
an external observer, the CP seems like a single entity; but internally, it consists of many
developers. CPs are geographically distributed, with each offshore site having the same number
of composite personae, and with each developer belonging to one or more CPs. A problem is divided
into tractable units, with each of the smaller modules handled by a CP, rather than by a single
developer. It is the responsibility of this micro-team to serve as the owner of the module and the
associated classes, as well as to generate the necessary solution. The CP is always active, but
with a different driver for each shift, based on the appropriate time zone. A vertical division of
tasks is employed in this model, with each team representing a CP working in a sequential
manner, around the clock. When a driver of a CP has completed the day’s work, then the work-
in-progress, augmented by details of the rationale of all associated decisions, is handed off to the
subsequent driver. The new driver quickly learns details of the work completed so far, so that
work on the incomplete tasks is resumed within a few minutes’ time.
Communication is critical in defining the efficacy of the composite persona model, and can be
modeled into three forms. The first type of communication is handoff, and its success is
determined by how well knowledge can be represented before it is transferred to the subsequent
driver. The second form of communication is lateral, occurring in real time between members
belonging to the same active CP. The third form of communication consists of lateral
communication with handoff. Such a form of communication may arise due to interaction
between two CP drivers at one development site. When this site signs off and the data are handed to
another site, the same conversation can be resumed from the received data.
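The dual nature of a CP, a single apparent owner backed by a rotating set of drivers, can be sketched as follows; the member names, sites, and shift boundaries are invented for illustration:

```python
class CompositePersona:
    """A micro-team that appears externally as the single owner of a module.

    Members are (developer, site, shift) entries; at most one member is the
    active driver at any given hour (shift boundaries here are illustrative).
    """

    def __init__(self, module, members):
        self.module = module
        self.members = members  # list of (name, site, (start_hour, end_hour)) in UTC

    def driver_at(self, utc_hour):
        """Return the (name, site) of the driver whose shift covers the hour."""
        for name, site, (start, end) in self.members:
            if start <= utc_hour < end:
                return name, site
        raise ValueError("no driver covers this hour")

cp = CompositePersona("billing module", [
    ("Ann", "Tucson", (16, 24)),  # roughly 9am-5pm local time, expressed in UTC
    ("Raj", "Pune",   (3, 11)),
    ("Ewa", "Krakow", (11, 16)),
])
```

External code addresses only the CP that owns the module; which individual answers depends on the hour of the day.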
4.3 CPro
Knowledge management is defined very broadly, encompassing processes and practices
concerned with the creation, acquisition, capture, sharing and use of knowledge, skills and
expertise (Quintas, 1996). In the 24-hour knowledge factory model, the concept of the composite
persona is employed for sharing of skills and knowledge. A software process that is developed
around this notion is CPro. CPro incorporates the concept of the composite persona and
integrates various automated algorithms that can be used to generate work schedules for efficient
project completion, for reducing software defects and for effective knowledge representation and
transfer between the developers of the CP.
The knowledge management life cycle model categorizes knowledge into six different phases:
create, organize, formalize, distribute, apply and evolve (Birkinshaw, 2002). CPro follows a
similar approach by breaking down a particular project into subtasks: planning, design, design
review, coding, code review and testing. It then generates an automated schedule for the
expected progress of the project, on behalf of all the developers. Since a single developer cannot
determine such a schedule alone due to the vertical sequence of tasks, and since it is not possible
to predict random obstacles, CPro incorporates an algorithm known as the schedule caster to
generate the expected deliverables. CPro acquires estimates for all subtasks from developers, and
then performs a Monte Carlo simulation to determine the most effective schedule for the
different CPs.
CPro asks each of the developers to rate all the tasks based on a scale of Very Complex,
Complex, Moderate, Simple, and Very Simple, with each of the qualitative measures associated
with an offset exponential distribution. Using this information, along with a combination of
previous estimates and the historical data on actual productivity, the schedule caster employs a
continuous distribution to simulate several possible work schedules, as shown in Figure 2.
Figure 2- Simulated work schedules
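A minimal sketch of such a schedule caster follows; the offset and mean parameters attached to each complexity rating are invented, since the paper does not specify them:

```python
import random

# Hypothetical mapping: complexity rating -> (offset_hours, mean_extra_hours)
RATING_PARAMS = {
    "Very Simple": (1, 1), "Simple": (2, 2), "Moderate": (4, 4),
    "Complex": (8, 8), "Very Complex": (16, 16),
}

def sample_duration(rating, rng):
    """Offset exponential: a fixed minimum time plus an exponential tail."""
    offset, mean_extra = RATING_PARAMS[rating]
    return offset + rng.expovariate(1.0 / mean_extra)

def simulate_schedule(task_ratings, trials=10_000, rng=None):
    """Monte Carlo estimate of total completion time for a vertical task sequence."""
    rng = rng or random.Random(0)
    totals = sorted(sum(sample_duration(r, rng) for r in task_ratings)
                    for _ in range(trials))
    return {"median": totals[trials // 2], "p90": totals[int(trials * 0.9)]}

est = simulate_schedule(["Simple", "Moderate", "Complex"])
```

The percentiles of the simulated totals give the planner a range of plausible schedules rather than a single point estimate.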
The CPro process begins with a learning cycle, wherein each time a developer makes an
estimate and then completes the task, the data are recorded into CPro, which then modifies the
probability distribution for that developer. Two mechanisms were analyzed in detail. The first
mechanism used Bayesian probability with joint distributions and the second one involved case-
based reasoning, wherein cases were generated based on prior estimates and actual performance.
After the learning process, CPro can generate high-probability distribution curves that are likely
to lead to high productivity.
Once the schedule caster has generated a schedule based on developer inputs, and a task has
been assigned to a CP, the CP is wholly responsible for that task. It has to be completed fully or
dropped. Between CP drivers, handoff occurs at the phase boundary. The concept of the
vertically distributed sequence of tasks is preserved, as each subsequent task requires complete
knowledge of prior tasks.
CPro has been formalized to assign work using two different algorithms: the random work
assignment and the selected work assignment algorithm. In the random work assignment
algorithm, for each task, multiple iterations are performed by randomly choosing a developer,
and applying these estimates to generate a simulated effort. The final schedule is then based on
the optimal result obtained from each developer’s distribution curve. This model,
however, does not take into consideration any preferences, on the developers’ part. The selected
work assignment model is more sophisticated than the random work assignment model. When
the optimal work schedule is being generated for the CP, factors such as shift
timings and the complexity involved in the task are taken into consideration.
The selected work assignment model employs a greedy policy for the scheduling of work. All open
tasks are arranged according to increasing order of complexity. Once a developer has selected a
task, CPro generates a simulation of the actual effort required to complete the task. If the task
cannot be completed by the end of the shift, and the developer is unwilling to work overtime,
then that particular task is disregarded, making it available for another developer. If the
developer does intend to complete the task, then CPro provides the capability of locking that
particular task, preventing any other developer from being assigned the same task.
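The greedy policy described above can be sketched as a single pass over the open tasks; the task names, simulated estimates, and shift length are hypothetical:

```python
def select_tasks(open_tasks, hours_left, estimate, willing_overtime=False):
    """Greedy pass over open tasks, least complex first (illustrative policy).

    open_tasks: list of (task, complexity_rank); estimate: task -> simulated hours.
    Returns the tasks locked by this developer for the current shift.
    """
    locked = []
    for task, _rank in sorted(open_tasks, key=lambda t: t[1]):
        effort = estimate(task)
        if effort > hours_left and not willing_overtime:
            continue            # task stays open, available to another developer
        locked.append(task)     # lock the task so no one else is assigned it
        hours_left = max(0.0, hours_left - effort)
    return locked

simulated_effort = {"fix typo": 0.5, "refactor parser": 6.0, "add cache": 3.0}
picked = select_tasks(
    [("refactor parser", 3), ("fix typo", 1), ("add cache", 2)],
    hours_left=4.0,
    estimate=simulated_effort.get,
)
# "fix typo" and "add cache" fit in the shift; "refactor parser" is left open
```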
4.3.1 Defect Reduction Strategies in CPro
Knowledge transfer between members of a CP determines the long-term success of the CP.
Defect reduction strategies are employed to ensure that the knowledge being transferred is
represented well and is free from errors.
Artifact review is one technique for detecting defects. In the 24-hour knowledge factory
paradigm, each individual can serve as a peer reviewer for work done by colleagues in other
countries, helping to identify defects early.
Test-driven development is a technique that involves the development of software solutions
through rigorous testing of code artifacts. Before any artifact is developed, a minimal set of test
cases is generated; as the artifact is coded, tested, and debugged, the discovery of any defect
leads to the formulation of another test case. The collection of test cases can be used to represent
the CP’s understanding of the problem and of the respective solutions.
Effective knowledge transfer is further realized in the 24-hour knowledge factory paradigm
through a heuristic assignment process. During the course of the work day, many tasks are
available for the current driver of the CP, such as coding, testing, and reviewing of an existing
piece of code. A set of heuristics has been integrated into CPro; these heuristics are used to
recommend a course of action, so that the maximum amount of knowledge is transferred.
4.3.2 MultiMind
Suchan and Hayzak (2001) highlighted that a semantically rich database can be useful in
creating a shared language and mental models. MultiMind is a tool that embodies
this objective; it is based on the CPro software process and builds on other known concepts such
as Lifestream. MultiMind uses the 24-hour knowledge factory concept to support collaborative
effort among geographically dispersed sites that rely on the CP methodology. The high level
representation of MultiMind is shown in Figure 3.
Figure 3- MultiMind Architecture
MultiMind builds upon known business services techniques and IT technologies, such as
the Scrum knowledge transfer technique and extreme programming concepts, in order to perform
handoff. Scrum stand-up meetings are automated, as work summaries of the project are
developed automatically by the embedded project management system. When a developer signs
out, then the system automatically captures a snapshot of all relevant data. When the subsequent
CP driver logs in, Scrum reports are generated based on the current state of the project and
previous reports. In the design of this prototype, the objectives were to: automate what can be
automated; store everything that can be stored; and minimize the cognitive overhead for
cooperation (Gupta, 2007). MultiMind supports the following functionality:
• It defines atomic editing operations;
• It stores every edit (down to the atomic level), every form of communication (such as e-
mail, and chat), and it keeps record of every document browsed and every search made;
• It automatically creates a summary by monitoring the activity of each worker and
prepares a hierarchical report of all the changes made; and
• It creates an informational summary by compressing the edit log.
A decision support system is an interactive, computer-based information system that utilizes
decision rules and models, coupled with a comprehensive database to support all phases of the
decision-making process, mainly for semi-structured (or unstructured) decisions, under the full
control of the decision makers (Yang, 1995). A novel decision support system capability is
incorporated into MultiMind using a tool called Lifestream, along with ideas from Activity
Theory. MultiMind incorporates a monotonically increasing database, which tracks all accessed
objects and knowledge events (Gupta, 2007). Knowledge events are generated when a developer
accesses some form of electronic memory such as the project-knowledge base, email, or the
World Wide Web. Every event that occurs during the course of a project is associated with a
timestamp and carries information about the task performed. The event types that are formally
defined in MultiMind are as follows:
1. LoginEvent and LogoutEvent: generated when an agent begins and completes a shift, respectively;
2. MessageViewEvent and MessageSearchEvent: generated when an agent reads or searches
through a message;
3. ExternalWebAccessEvent and ExternalWebSearchEvent: generated when an agent accesses
a webpage or issues a web search; and
4. DocumentSearchEvent: generated when the agent issues a search in memory for a
particular document.
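These event types can be modeled as timestamped records appended to a monotonically increasing log; the class and field names below are illustrative, not MultiMind's actual schema:

```python
import itertools
import time
from dataclasses import dataclass, field

_seq = itertools.count()

@dataclass(frozen=True)
class KnowledgeEvent:
    kind: str      # e.g. "LoginEvent", "MessageViewEvent", "ExternalWebSearchEvent"
    agent: str
    detail: str = ""
    seq: int = field(default_factory=lambda: next(_seq))  # monotonic ordering
    ts: float = field(default_factory=time.time)          # wall-clock timestamp

log = []  # monotonically growing event log
log.append(KnowledgeEvent("LoginEvent", "raj"))
log.append(KnowledgeEvent("ExternalWebSearchEvent", "raj", "offset exponential distribution"))
log.append(KnowledgeEvent("LogoutEvent", "raj"))

# Code justification: replay the events in a window to recover a decision's context.
web_events = [e for e in log if e.kind.startswith("ExternalWeb")]
```

Because events are append-only and ordered, replaying a slice of the log reconstructs the sequence of actions that preceded a decision.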
If any confusion arises regarding a decision taken by an earlier CP driver, then the recorded
sequence of events that led to that decision can be utilized to determine the future course of
action. Information on documents and artifacts accessed, web searches conducted, messages
exchanged, and other sequences of events can be quickly analyzed to understand the rationale of a
previous decision. This process is known as code justification, and is achieved by data mining
Lifestream for relevant project artifacts and knowledge events that have occurred within the
desired frame of time. Currently, research is being conducted to incorporate intelligent filters in
the semantic analysis process, in order to remove redundant information, and to retain only
information relevant to the specific decision. The Lifestream process is shown in Figure 4.
Figure 4- Lifestream Process
MultiMind also includes an embedded discussion tool for all project artifact interfaces. This
tool uses concepts from the Speech Acts theory to classify communication into different
categories. New message threads that are created can be of the form ‘inquiry’ or ‘inform’. When
a developer responds to a message, the reply can take the form of a ‘response’ or another ‘inquiry’ in
order to pose a counter-question. MultiMind recognizes the speech acts as shown in Figure 5.
Figure 5- Hierarchy of Speech Acts
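The speech-act categories can be sketched as an enumeration together with the thread rules described above; the class names are hypothetical:

```python
from enum import Enum

class SpeechAct(Enum):
    INQUIRY = "inquiry"
    INFORM = "inform"
    RESPONSE = "response"

class Thread:
    """A discussion thread whose messages are classified by speech act."""

    def __init__(self, act, text):
        if act not in (SpeechAct.INQUIRY, SpeechAct.INFORM):
            raise ValueError("a new thread must be an 'inquiry' or an 'inform'")
        self.messages = [(act, text)]

    def reply(self, act, text):
        if act not in (SpeechAct.RESPONSE, SpeechAct.INQUIRY):
            raise ValueError("a reply must be a 'response' or a counter-'inquiry'")
        self.messages.append((act, text))

thread = Thread(SpeechAct.INQUIRY, "Why was the module split this way?")
thread.reply(SpeechAct.RESPONSE, "To keep the parser and the cache independently testable.")
```

Classifying messages at creation time lets a later reader filter a thread for, say, only the open inquiries.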
4.4 Implicit and Explicit Communication in the 24-Hour Knowledge Factory
In his book, The Mythical Man Month, Fred Brooks describes the problems associated with
growing teams of developers (Brooks, 1995). The amount of communication overhead that is
introduced can lead to negative consequences. In order to mitigate this potential problem, several
design decisions were made with respect to implicit and explicit communication. First, in order
to maintain around-the-clock productivity, a distributed CP must contain functional expertise at
each of the distributed sites. Second, although communication channels exist between sites, the
real-time usable ones are constrained to those within the co-located subsets of the CPs.
Due to the distributed nature of expertise within a CP, the communication overhead between
sites could be potentially crippling at some of the stages of development that involve heavy
communication. This is only true, however, if we are considering explicit communication as the
sole means for transmitting and sharing knowledge. Studies have shown that expert
teams rely more on implicit communication strategies and on the shared mental model of the task
at hand (Gupta, 2007). These models can incorporate knowledge about the various system
components and the relationships between them. These mental models are developed by
interaction with other members of the team. Thus, both explicit communication and implicit
communication are important for the 24-hour knowledge factory. Research has also shown that
knowledge is created by the interaction between explicit and tacit knowledge (Nonaka, 1994);
the boundary between the two is not clear-cut, with one form of knowledge being created by the
other (Spender, 1996).
In order to facilitate explicit communication for knowledge transfer, Jazz offers a collaborative
environment. Jazz was developed by IBM as an extension of the Eclipse IDE. It provides
communication tools within the IDE itself and utilizes context provided by the IDE to create
meaningful collaboration artifacts. It focuses on project planning and facilitating high-level
communication between developers, thereby reducing the dependence on special purpose
communication tools and informal meetings. Implicit communication, on the other hand, is
facilitated by the use of tools to increase the understanding of project artifacts, as well as their
interconnections and dependencies. A number of commercial and non-commercial code
visualization tools and code browsers are now available; they try to implement software
visualization, which forms the basis of implicit communication. Tools such as CodeCrawler, X-
Ray, Seesoft, Tarantula, and Moose try to enhance the developer’s understanding of the code, by
illustrating the logical structure, execution-related line level statistics, and locations of errors
within the executed code. These tools are largely overlooked due to the lack of integration with
existing integrated development environments (IDEs).
4.5 EchoEdit
Three surveys, conducted in 1977, 1983, and 1984, highlighted the lack of documentation as one
of the biggest problems faced by code maintenance personnel (Dekleva, 1992). In
order to ensure that code artifacts maximize knowledge transfer, developers must ensure that
they are documented well. Techniques such as literate programming, a methodology
introduced by Donald Knuth that combines a documentation language with a programming
language (Knuth, 1984), along with tools such as TeX, CWEB, and METAFONT, can be used for
efficient documentation. EchoEdit is a tool that combines documentation with voice recognition
capabilities (O’Toole, 2008). This tool allows programmers to generate audio comments that are
subsequently converted into text and embedded into the code. Comments can also be played
back to the developer. Audio comments in EchoEdit are usually reserved for high level issues,
while low level syntactical and semantic issues are addressed in textual format, thereby providing
knowledge representation capabilities in a dual manner.
EchoEdit is currently implemented as a plug-in for the Eclipse environment. When another
programmer opens a file in this environment, voice annotations can be played back by selecting
an icon displayed in the margin, similar to that in the PREP environment. EchoEdit supports
three user actions: recording an audio comment, playing back the audio file, and converting
the audio file into a text comment. An example of a comment, after conversion from audio
format, is shown in Figure 6.
Figure 6- Example of EchoEdit
The block diagram of EchoEdit is shown in Figure 7 (O'Toole, 2008). When an audio
comment is to be recorded, a recording event is generated and sent to the action handler, which
initializes the audio recorder and listens for the record event. Two threads are created in
response to this event: one performs the recording, while the other converts the recorded voice
to text. During recording, all data are buffered in memory; once conversion is complete, the
text, along with the audio file, is inserted back into the original file being edited.
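The two-thread flow described above can be sketched in simplified form. The real EchoEdit is an Eclipse plug-in working with live microphone input and a speech recognizer; the Python sketch below is only an illustration of the event flow, with the audio frames and the transcription function supplied as stand-ins rather than drawn from any actual EchoEdit API:

```python
import threading
import queue

def handle_record_event(audio_frames, transcribe, source_lines, line_no):
    """Simulate EchoEdit's record event: one thread buffers audio
    in memory while a second converts it to text; the resulting
    comment is then inserted into the file being edited."""
    buffer = queue.Queue()   # thread-safe in-memory audio buffer
    done = object()          # sentinel marking end of recording

    def record():
        for frame in audio_frames:   # stand-in for microphone capture
            buffer.put(frame)
        buffer.put(done)

    def convert(result):
        frames = []
        while (frame := buffer.get()) is not done:
            frames.append(frame)
        result.append(transcribe(frames))  # speech-to-text stand-in

    result = []
    recorder = threading.Thread(target=record)
    converter = threading.Thread(target=convert, args=(result,))
    recorder.start(); converter.start()
    recorder.join(); converter.join()

    # Insert the transcribed text back into the "file" as a comment.
    comment = "# " + result[0]
    return source_lines[:line_no] + [comment] + source_lines[line_no:]

code = ["def total(xs):", "    return sum(xs)"]
annotated = handle_record_event(
    ["frame-1", "frame-2"],
    lambda frames: "voice note ({} frames captured)".format(len(frames)),
    code,
    0,
)
```

Separating capture from conversion in this way is what allows the developer to keep working while transcription proceeds in the background.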
EchoEdit is one of the plug-ins developed for the 24-hour knowledge factory. With the aid of
both text and audio comments for knowledge representation, a handoff can be accomplished
within a period of a few minutes.
Figure 7- Block diagram of EchoEdit
5. FURTHER RESEARCH RELATED TO THE 24-HOUR KNOWLEDGE FACTORY
There are several areas that require further research in order to fully exploit the potential of the
24-hour knowledge factory. In order to surmount spatial, temporal, and other kinds of
boundaries, Gupta has advocated a four-faceted knowledge-based approach that places emphasis
on (Gupta, 2001):
• Knowledge Acquisition, or the tapping of traditional systems to provide accurate and
comprehensive material for the new knowledge-based systems;
• Knowledge Discovery, or automated mining of numerical, textual, pictorial, audio,
video, and other types of information, to capture underlying knowledge, either on a
one-time basis or on a continuous basis;
• Knowledge Management to deal with different types of heterogeneities that
invariably exist when inputs have to cross over borders of different types (national,
organizational, departmental, and others); and
• Knowledge Dissemination to extract, customize, and direct knowledge to appropriate
departments and users, based on their individual needs.
So far, most research has focused on collaboration on a limited scale, occurring only
sporadically. In the case of the 24-hour knowledge factory, collaboration occurs continually; as
such, the research group at the University of Arizona is currently focusing on the development of
IS tools that can better address the requirements of knowledge manipulation and enhancement in
a highly decentralized environment. By building upon research conducted in the areas of
knowledge acquisition, knowledge discovery, knowledge management, and knowledge
dissemination, the group is striving to reduce the overhead placed on distributed developers.
The 24-hour knowledge factory research group is also trying to develop new algorithms for
efficient decomposition of a project into sub-tasks, so that each sub-task can be performed
individually by a professional without requiring knowledge of the specific details of the other
sub-tasks. Berger describes multiple evolutions of technology in which the splitting of tasks
served as the catalyst for new work models and new technologies (Berger, 2006). The IBM
System/360 made use of this concept by disaggregating the mainframe hardware from the
software logic, thereby allowing separate workers to add value to each component independently
and without knowledge of each other's work. This concept is being adapted for multiple
disciplines, so that a "component-based approach" can be utilized to construct basic artifacts for
these respective disciplines. Research is also being conducted on techniques to combine the
separately completed tasks into a finished, marketable product or service.
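One plausible framing of such decomposition, offered here only as an illustration and not as the research group's actual algorithm, treats the project as a dependency graph and groups tasks into clusters that share no dependencies, so each cluster can be assigned to a worker without knowledge of the others. The task names below are hypothetical:

```python
from collections import defaultdict

def independent_subtasks(tasks, depends_on):
    """Group tasks into clusters that share no dependencies.

    `depends_on` maps a task to the tasks it relies on; clusters
    are the connected components of the (undirected) dependency
    graph, so each cluster can be developed in isolation.
    """
    adj = defaultdict(set)
    for t in tasks:
        adj[t]  # ensure tasks with no dependencies still appear
        for d in depends_on.get(t, ()):
            adj[t].add(d)
            adj[d].add(t)

    seen, clusters = set(), []
    for t in tasks:
        if t in seen:
            continue
        # Depth-first traversal of one connected component.
        stack, component = [t], set()
        while stack:
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(adj[u] - component)
        seen |= component
        clusters.append(component)
    return clusters

# "ui" needs "api", which needs "db"; "docs" stands alone.
clusters = independent_subtasks(
    ["ui", "api", "db", "docs"],
    {"ui": ["api"], "api": ["db"]},
)
```

Tasks inside one cluster must still be coordinated (for example, through handoffs), but separate clusters, like the System/360's hardware and software, can progress without mutual awareness.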
6. CONCLUSION
Efficient knowledge management and transfer is a critical aspect of the 24-hour knowledge
factory model. The composite persona model, manifested in the CPro process, utilizes multiple
approaches to facilitate efficient operation, on a round-the-clock basis, in a globally
distributed work environment. MultiMind and EchoEdit have been developed as concept
demonstration tools to validate some of the key design principles, especially those related to the
process of efficient handoff. Such tools provide a comprehensive knowledge repository of the
rationale behind decisions, offer an intelligent explanation of high-level and low-level details
of code artifacts, and formulate the most efficient schedule for attaining maximum performance
across all phases of the project. All of these facets help to ensure that collaboration between
multiple teams occurs in an optimal manner. Overall, the 24-hour knowledge factory paradigm
offers the potential for projects to progress in a non-stop manner with minimal communication
overhead, thereby increasing productivity. Whereas traditional literature considers geographic
proximity to be of vital importance in ensuring high workforce productivity and the optimization
of many types of processes, some of the cited examples contradict this thinking and highlight
that, in some cases, distributed teams can outperform collocated teams.
ACKNOWLEDGEMENTS
We would like to acknowledge the following current and former members of the 24-Hour
Knowledge Factory Research Group at the University of Arizona for their research ideas: Nathan
Denny, Kate O'Toole, Shivaram Mani, Ravi Seshu, Manish Swaminathan, and Jamie Samdal.
REFERENCES
Ahituv, N., Igbaria, M. & Sella, A. (1998) "The effects of time pressure and completeness of information on decision making", Journal of Management Information Systems, Vol. 15, No. 2, pp. 153-172
Battin, R., Crocker, D., Kreidler, J. & Subramanian, K. (2001) "Leveraging Resources in Global Software Development", IEEE Software, Vol. 18, Issue 2
Beck, K. (2000) "Extreme Programming Explained: Embrace Change", Addison-Wesley
Birkinshaw, J. & Sheehan, T. (2002) "Managing the knowledge life cycle", MIT Sloan Management Review, Vol. 44, No. 1, pp. 75-83
Brooks Jr., F. P. (1995) "The Mythical Man-Month", Addison-Wesley
Carmel, E. & Agarwal, R. (2002) "The Maturation of Offshore Sourcing of Information Technology Work", MIS Quarterly Executive, Vol. 1, Issue 2
Crk, I., Sorensen, D. & Mitra, A. (2008) "Leveraging Knowledge Reuse and System Agility in the Outsourcing Era", Journal of Information Technology Research, Vol. 1, Issue 2, pp. 1-20
Dekleva, S. (1992) "Delphi study of software maintenance problems", Conference on Software Maintenance Proceedings, pp. 10-17
Denny, N., Crk, I. & Seshu, R. (2008a) "Agile Software Processes for the 24-Hour Knowledge Factory Environment", Journal of Information Technology Research, Vol. 1, Issue 1, pp. 57-71
Denny, N., Mani, S., Seshu, R., Swaminathan, M. & Samdal, J. (2008b) "Hybrid Offshoring: Composite Personae and Evolving Collaboration Technologies", Journal of Information Technology Review, Vol. 21, Issue 1, pp. 89-104
Freeman, E. & Gelernter, D. (1996) "Lifestreams: A Storage Model for Personal Data", ACM SIGMOD Record
Gao, J. Z., Chen, C., Toyoshima, Y. & Leung, D. K. (1999) "Engineering on the Internet for Global Software Production", IEEE Computer, Vol. 32, No. 5, pp. 38-47
Gupta, A. (2001) "A Four-Faceted Knowledge-Based Approach to Surmounting National and Other Borders", Journal of Knowledge Management, Vol. 5, No. 4
Gupta, A. & Seshasai, S. (2007) "24-Hour Knowledge Factory: Using Internet Technology to Leverage Spatial and Temporal Separations", ACM Transactions on Internet Technology, Vol. 7, No. 3, Article 14
Hill, C., Yates, R., Jones, C. & Kogan, S. (2006) "Beyond predictable workflows: Enhancing productivity in artful business processes", IBM Systems Journal, Vol. 45, No. 4, pp. 663-682
Humphrey, W. S. (1995) "Introducing the Personal Software Process", Annals of Software Engineering, Vol. 1, No. 1, pp. 311-325
Knuth, D. (1984) "Literate Programming", The Computer Journal, Vol. 27, Issue 2, pp. 97-111
Lacity, M., Willcocks, L. & Feeny, D. (1995) "IT Sourcing: Maximize Flexibility and Control", Harvard Business Review, Vol. 73, No. 3, pp. 84-93
Mylopoulos, J. (1998) "Information Modeling in the Time of Revolution", Information Systems, Vol. 23, No. 3, pp. 127-155
Nonaka, I. (1994) "A dynamic theory of organizational knowledge creation", Organization Science, Vol. 5, No. 1, pp. 14-39
O'Toole, K., Subramanian, S. & Denny, N. (2008) "Voice-Based Approach for Surmounting Spatial and Temporal Separations", Journal of Information Technology Research, Vol. 1, Issue 2, pp. 54-60
Quinn, J. B. (1999) "Strategic Outsourcing: Leveraging Knowledge Capabilities", Sloan Management Review, Vol. 40, No. 4, pp. 9-21
Quinn, J. B., Anderson, P. & Finkelstein, S. (1996) "Managing professional intellect: making the most of the best", Harvard Business Review, Vol. 74, pp. 71-80
Sambamurthy, V., Bharadwaj, A. & Grover, V. (2003) "Shaping Agility through Digital Options: Reconceptualizing the Role of Information Technology in Contemporary Firms", MIS Quarterly, Vol. 27, No. 2, pp. 237-263
Scarbrough, H. (1998) "BPR and the knowledge-based view of the firm", Knowledge and Process Management, Vol. 5, No. 3, pp. 192-200
Schwaber, K. & Beedle, M. (2002) "Agile Software Development with Scrum", Series in Agile Software Development, Upper Saddle River, NJ: Prentice Hall
Searle, J. R. (1975) "A Taxonomy of Illocutionary Acts", in Language, Mind and Knowledge, Minnesota Studies in the Philosophy of Science, pp. 344-369
Seshasai, S. & Gupta, A. (2007) "Harbinger of the 24-Hour Knowledge Factory", Available at SSRN: http://ssrn.com/abstract=1020423
Spender, J. C. (1996) "Organizational knowledge, learning and memory: three concepts in search of a theory", Journal of Organizational Change Management, Vol. 9, No. 1, pp. 63-78
Suchan, J. & Hayzak, G. (2001) "The communication characteristics of virtual teams: a case study", IEEE Transactions on Professional Communication, Vol. 44, No. 3
Todd, P. & Benbasat, I. (1991) "An experimental investigation of the impact of computer-based decision aids on decision making strategies", Information Systems Research, Vol. 2, No. 2, pp. 87-115
Vessey, I. & Sravanapudi, A. P. (1995) "CASE Tools as Collaborative Support Technologies", Communications of the ACM, Vol. 38, Issue 1, pp. 83-95
Whitehead, E. J. (1997) "World Wide Web Distributed Authoring and Versioning (WebDAV): An Introduction", StandardView, Vol. 5, No. 1, pp. 3-8
Yang, H. L. (1995) "Information/knowledge acquisition methods for decision support systems and expert systems", Information Processing and Management, Vol. 31, No. 1, pp. 47-58
BIOGRAPHY OF AUTHORS
Amar Gupta has held the Thomas R. Brown Chair in Management and Technology at the
University of Arizona since 2004, where he is a professor of entrepreneurship, management
information systems, and computer science. Earlier, he was with the MIT Sloan School of
Management (1979-2004), where he served as the founding co-director of the Productivity From
Information Technology (PROFIT) initiative.
Rajdeep Bondade is a doctoral student in electrical and computer engineering at the University
of Arizona. He received his Bachelor of Engineering in Electronics and Communications from
the PES Institute of Technology, Bangalore, India, and joined the 24-Hour Knowledge Factory
Research Group in January 2008.
Igor Crk is currently pursuing a doctoral degree in computer science at the University of
Arizona, from which he also holds a master's degree in computer science. His current research
interests include context-driven energy management in operating systems and the 24-Hour
Knowledge Factory model for distributed agile software development.
LIST OF FIGURES
Figure 1- KNOWFACT process flow
Figure 2- Simulated work schedules
Figure 3- MultiMind Architecture
Figure 4- Lifestream Process
Figure 5- Hierarchy of Speech Acts
Figure 6- Example of EchoEdit
Figure 7- Block diagram of EchoEdit