Genetic Algorithm Based System for Patient Scheduling in Highly Constrained Situations
Vili Podgorelec, Peter Kokol
University of Maribor Faculty of Electrical Engineering and Computer Science
Smetanova 17, 2000 Maribor Slovenia
Abstract
In medicine and health care there are many situations in which patients have to be scheduled
on different devices and/or to different physicians or therapists. Whether it concerns preventive
examinations, laboratory tests or convalescent therapies, we are always looking for an
optimal schedule that finishes all the scheduled activities as soon as
possible, with the least patient waiting time and maximum device utilization. Since patient
scheduling is a highly complex problem, it is practically impossible to produce a high-quality
schedule by hand or even with exact methods. Therefore we developed a powerful automated
scheduling method for highly constrained situations based on genetic algorithms and
machine learning. In this paper we present the method, together with the whole process of
schedule generation, the important parameters that direct the evolution, and the way the algorithm
is guaranteed to produce only feasible solutions that do not break any of the required
constraints. We applied the described method to the problem of scheduling patients with
different therapy needs to a limited number of therapeutic devices, but the algorithm can be
easily modified for use in similar situations. The results are quite encouraging, and since
all the solutions are feasible, the method can be easily incorporated into an interactive user
interface, which can be of major importance when the scheduling of patients, and of human
resources in general, is considered.
Citation Reference: V. Podgorelec, P. Kokol, Genetic Algorithm Based System for Patient Scheduling in Highly Constrained Situations, Journal of Medical Systems, Plenum Press, vol. 21, num. 6, pp. 417-427, December 1997.
1. Introduction
The problem of constructing an automated scheduling system is known to be very complex,
especially in situations where human resources are involved. It is also well
known that scheduling by hand takes a lot of effort and associated administrative work, and in
many cases it is even impossible. The application of computers to scheduling problems therefore
has a long and varied history. The first generation of computer timetabling programs, presented
in the early 1960s, produced school timetables with the aim of
fitting classes and teachers to periods. Since then numerous different
approaches to timetabling have been introduced, including simulated annealing, constraint
logic programming, linear programming and graph coloring heuristics. But it soon became
clear that exact methods are impractical for more complex problems because of their
inefficiency. Therefore many non-exact or soft methods have been used lately, which
do not guarantee optimal solutions but obtain reasonably good solutions in a relatively short
time. What kind of solution is good enough and how long we are prepared to wait for it
depends on the given problem, but all the research has one objective in common: how to find
as good a solution as possible in as short a time as possible.
Regardless of the method used, some basic rules have to be fulfilled in order to
construct an effective, high-quality automated scheduling system. First of all, we have to
guarantee that all obtained solutions are feasible, that is, that they fulfill all specific
constraints of the given problem. Moreover, the system has to be efficient enough to find an
adequate solution within a limited time, so that it remains useful in practice. Besides these
basic requirements there are further non-obligatory properties that improve the quality of a
scheduling system when they are fulfilled; in the case of patient scheduling they are
even indispensable. One such property is generality, or independence of the problem, which
allows the system to be used for different kinds of scheduling problems and not only for one
very specific situation. A second non-obligatory property is the possibility of user interaction
while the solution is being developed; when scheduling of patients is considered, this option
definitely becomes necessary. Also very useful is the ability of the system to continue the
search for a solution when one (or more) of the activities already scheduled is canceled
and removed from the schedule, or when the execution of scheduled activities starts almost
simultaneously with the scheduling.
Although some recent scheduling methods are able to provide an adequate solution in a
reasonable time, very few of them offer even one of the non-obligatory properties mentioned
above. Hybrid genetic algorithms have shown some promising
results lately [2-4], and yet they have a number of disadvantages. They have been too
problem dependent, which is exactly the opposite of the commonly known properties of
genetic algorithms, and they could easily converge to a suboptimal solution. Moreover, instead of
finding the schedule itself they searched for a set of rules for producing a good
schedule, which prevents the user from interacting with the system.
To overcome the above weaknesses we developed a scheduling algorithm based on
genetic algorithms [1,6,7,9] and machine learning [5,8]. With the introduction of actors,
resources and activities the method became problem independent. Because of the diversity
kept in the population and the “best survive” principle we avoid premature convergence to
a suboptimal solution. In addition, machine learning counteracts the possible negative
consequences of badly chosen parameter values. With an adequate internal representation of
the individuals we guarantee that all intermediate solutions are feasible, which allows the user
to interact with the system. It is also very important that the user can directly influence the
direction of the evolution by appropriately weighting the parameters that determine the quality
of the evolved solution. Only when a method fulfills all of the above can it be considered
a possible solution to the patient scheduling problem.
We applied the described algorithm to a problem of scheduling patients with different
physical therapy needs to a limited number of therapeutic devices and a limited number of
therapists. In this case, a solution is a schedule, which we can consider to be an assignment
of patients to time intervals on the specific therapeutic devices that are operated by physical
therapists, trained for specific devices. The obtained results turned out to be very promising,
and because of the algorithm’s effectiveness and low consumption of computing resources it can,
in our opinion, also be used for very complex scheduling problems.
The basic information on genetic algorithms and machine learning, together with the choices
made to meet the requirements of our situation, is given in Section 2. Section 3 describes the
generation of schedules with the use of our system. Section 4 presents the results obtained by
scheduling a test problem, after which the paper ends with some conclusions in Section 5.
2. Genetic algorithms and machine learning
Genetic algorithms are adaptive heuristic search methods that can be used to solve many
kinds of complex search and optimization problems. They are based on the evolutionary
ideas of natural selection and the genetic processes of biological organisms. Just as natural
populations evolve according to the principles of natural selection and “survival of the
fittest”, first laid down by Charles Darwin, genetic algorithms simulate this process
to evolve solutions to real-world problems, provided the problems have been suitably encoded.
They are often capable of finding optimal or near-optimal solutions even in very complex search
spaces, or at least they offer significant benefits over other search and optimization techniques.
The variety and complexity of learning systems make it difficult to formulate a universally
accepted definition of learning. However, the common denominator of most learning
systems is their capability for making structural changes to themselves over time with the
intent of getting more efficient in performing given tasks. One of the most important means
for understanding the strengths and limitations of a particular learning system is a clear
definition of knowledge structures, possible structural changes and the legal operators for
selecting and making those changes. There are several different approaches to changing
knowledge structures. The simplest one is the changing of parameters that influence the
system's behavior; we used this approach in our scheduling system.
Let's take a look at a simple learning system model (Figure 1). Such a system is performance
oriented. We have some tasks that have to be performed (in our case schedule generation),
and learning consists of both knowledge acquisition and refinement. The system is separated into
two subsystems:
• a task component whose performance-oriented behavior is to be improved; in our case
it is a genetic algorithm that generates schedules, and
• a learning component charged with making appropriate structural changes; in our case
we try to find optimal values of the parameters that affect the execution of the genetic algorithm.
Figure 1. A performance-oriented learning system.
Since the execution of the genetic algorithm depends on only a few parameters (three in our
case), it is very important not only to set appropriate initial values but also to decide how
and when those values are modified in the course of evolution. We therefore represent knowledge as
a set of simple rules indicating whether a parameter's value should be increased or
decreased and when this action has to be taken (Figure 2). Knowledge is acquired and
improved on the basis of solved problems, as we observe how the solution evolves depending
on the parameter values. As the solution evolves we randomly choose rules and reward or
punish them (increase or decrease the probability that a rule is executed) according to
their effect. By executing different problems we can decide which rules are useful in an
actual situation.
IF (generationNum>150) AND (generationNum<200) AND (overallWaitingTime>100)
THEN IncreaseMutationProb BY 0,001%
WITH PROBABILITY 62,17% ;
Figure 2. An example of a rule. Numbers in italic indicate the values that are updated through machine learning.
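A rule of this kind, together with its reward and punishment, could be sketched in Python as follows. This is only an illustration under our own assumptions: the class name Rule, the method names, the fixed reinforcement step of 0.05 and the representation of percentages as fractions are hypothetical and are not taken from the original system.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Rule:
    condition: Callable[[Dict[str, float]], bool]  # the IF part, tested on the GA state
    parameter: str       # name of the GA parameter the rule changes
    delta: float         # signed change applied when the rule fires
    probability: float   # chance that the rule fires when its condition holds

    def maybe_apply(self, state: Dict[str, float], params: Dict[str, float],
                    rng: random.Random) -> bool:
        """Fire the rule with its current probability; report whether it fired."""
        if self.condition(state) and rng.random() < self.probability:
            params[self.parameter] += self.delta
            return True
        return False

    def reinforce(self, improved: bool, step: float = 0.05) -> None:
        """Reward or punish the rule (the step size is an assumed value)."""
        change = step if improved else -step
        self.probability = min(1.0, max(0.0, self.probability + change))


# The rule of Figure 2, with percentages written as fractions.
figure2_rule = Rule(
    condition=lambda s: 150 < s["generationNum"] < 200 and s["overallWaitingTime"] > 100,
    parameter="mutationProb",
    delta=0.00001,
    probability=0.6217,
)
```

After a generation in which the rule fired, calling reinforce(True) would raise its probability if the solution improved, and reinforce(False) would lower it otherwise.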
3. Schedule generation
Every scheduling problem can be described with three categories: activities, actors, and
resources. Suppose we have a number of actors, each of them having one or more activities to
perform. Each activity needs some number of resources in order to be performed. Scheduling
is then the process of assigning all activities to particular time slots and to particular
resources.
Considering the scheduling of patients, patients are the actors, therapies are the activities and
therapeutic devices and therapists are two types of resources that are necessary for a therapy
to be performed.
As we try to construct a schedule, there are some fundamental constraints that must not be
broken if the solutions are to be feasible (a solution is feasible if the problem can be
executed according to the given schedule). These are:
• no actor (patient) can perform more than one activity at a time,
• every resource (therapeutic device or physical therapist) can be used by only one actor
at a time (it can perform only one activity simultaneously),
• each activity (therapy) has to be performed in only one continuous time interval,
• every activity can be performed only with a specific resource, and
• some activities have to be performed in an exact time order.
A schedule that satisfies these constraints is a feasible schedule. But just because a schedule
is feasible, it does not mean it is good enough to be used. Many other criteria exist which
influence the quality of a schedule. Any of these criteria can be included in the evaluation
function, which helps us select the fittest individuals for further evolution, as we will see
when the selection operator is discussed.
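The five fundamental constraints can be checked on a finished schedule with a short routine such as the following Python sketch. The names Slot, overlaps and is_feasible, the required mapping and the encoding of resources as strings are our own illustrative assumptions; the described system avoids such a check by construction, so the sketch only makes the constraints concrete.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, Iterable, Sequence, Tuple


@dataclass(frozen=True)
class Slot:
    activity: int
    actor: int
    resources: Tuple[str, ...]   # e.g. ("device 2", "therapist 2")
    start: int                   # one continuous interval [start, end)
    end: int


def overlaps(a: Slot, b: Slot) -> bool:
    return a.start < b.end and b.start < a.end


def is_feasible(slots: Sequence[Slot],
                required: Dict[int, Tuple[str, ...]],
                precedence: Iterable[Tuple[int, int]] = ()) -> bool:
    by_id = {s.activity: s for s in slots}
    # Every activity occupies a proper interval on exactly its prescribed resources.
    for s in slots:
        if s.end <= s.start or set(s.resources) != set(required[s.activity]):
            return False
    # No actor and no resource is used by two overlapping activities.
    for a, b in combinations(slots, 2):
        shares = a.actor == b.actor or set(a.resources) & set(b.resources)
        if shares and overlaps(a, b):
            return False
    # Ordered pairs (first, second): first must end before second starts.
    return all(by_id[first].end <= by_id[second].start for first, second in precedence)
```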
3.1. Internal representation of the individuals
The internal representation of the individuals within a population has to be defined in such a
way that it represents a feasible schedule and simultaneously leaves enough room to define
the genetic operators. It has to guarantee that all the solutions will still be feasible after the
crossover and mutation process, and we should be able to select randomly the crossover point
and the effect of the mutation.
In our case, we represent the individuals with a multi-dimensional model. The first dimension is
always time, whereas the other dimensions represent the needed types of resources. In the
case of patient scheduling the model is three-dimensional, with the last two dimensions
representing therapeutic devices and therapists. Objects in the model are the therapies that have
to be scheduled. Their positions within the model show the resources needed to
perform them and the time order in which they are performed.
And how do we guarantee that all the obtained solutions are feasible? As we construct new
solutions, only the time order is defined, not absolute time intervals, and the
genetic algorithm operates on this structure. Then, in the second phase, the exact time intervals
are defined for all the activities, as well as possible according to the given time order (see the
example in Figure 3).
Figure 3. An example of internal representation of the individuals with only one resource type. First the time order is defined (1st phase), then absolute time intervals are calculated (2nd phase). Activities with the same color belong to the same actor - therefore they cannot intersect.
The internal representation of individuals is actually a multi-dimensional array whose elements
are the sequence numbers (identifiers) of activities. Indexes in the first dimension represent the
time order, and the indexes in the other dimensions the allocation of resources.
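The second phase can be illustrated with the following Python sketch for a single resource type. It assumes, for illustration only, that an individual is a list of rows (one row per position in the time order, one column per resource, None for an empty cell) and that every activity starts as early as its resource and its actor are both free. The function name decode and the arguments actor_of and duration are hypothetical, and prescribed orderings between activities are not handled here.

```python
from typing import Dict, List, Optional, Tuple

Individual = List[List[Optional[int]]]  # rows = time order, columns = resources


def decode(order: Individual,
           actor_of: Dict[int, int],
           duration: Dict[int, int]) -> Dict[int, Tuple[int, int]]:
    """Turn a time order (1st phase) into absolute intervals (2nd phase)."""
    resource_free: Dict[int, int] = {}   # time at which each resource becomes free
    actor_free: Dict[int, int] = {}      # time at which each actor becomes free
    intervals: Dict[int, Tuple[int, int]] = {}
    for row in order:
        for resource, activity in enumerate(row):
            if activity is None:
                continue
            actor = actor_of[activity]
            # Earliest start that keeps both the resource and the actor conflict-free.
            start = max(resource_free.get(resource, 0), actor_free.get(actor, 0))
            end = start + duration[activity]
            intervals[activity] = (start, end)
            resource_free[resource] = end
            actor_free[actor] = end
    return intervals
```

Because each start time is computed from the moments at which the resource and the actor become free, any time order decodes to a schedule in which no actor and no resource is used twice at the same time.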
3.2. Seeding of the initial population
One of the parameters that influence the evolution in the genetic algorithm is the size of the
population. The size can change during the execution of the algorithm, or it can remain
constant. We used the latter approach, so the number of individuals remains
constant all the time. The actual size is determined from the knowledge gathered in the learning
subsystem.
Before we can start the evolution, we have to seed the initial population of individuals.
They are usually generated randomly, but it is very important to create the needed diversity. In
our case, individuals are created by filling the table with the therapies in a random time
order.
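Seeding in a random time order might look like the following Python sketch. It assumes a single resource type, a hypothetical mapping required_resource from each activity to the index of its prescribed resource, and the row-based table layout used for illustration (rows are positions in the time order, columns are resources).

```python
import random
from typing import Dict, List, Optional

Individual = List[List[Optional[int]]]  # rows = time order, columns = resources


def seed_individual(required_resource: Dict[int, int], n_resources: int,
                    rng: random.Random) -> Individual:
    """Place every activity in the column of its resource, in random time order."""
    columns: List[List[int]] = [[] for _ in range(n_resources)]
    for activity, resource in required_resource.items():
        columns[resource].append(activity)
    for column in columns:
        rng.shuffle(column)
    depth = max((len(c) for c in columns), default=0)
    # Pad shorter columns with None so that the table is rectangular.
    return [[c[t] if t < len(c) else None for c in columns] for t in range(depth)]


def seed_population(required_resource: Dict[int, int], n_resources: int,
                    size: int, rng: random.Random) -> List[Individual]:
    return [seed_individual(required_resource, n_resources, rng) for _ in range(size)]
```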
3.3. Selection
For the selection scheme we used a modified exponential ranking selection method. After the
evaluation of all individuals, they are sorted according to their fitness score. Then we
replace existing individuals from the worst to the best by creating new ones with crossover
from two selected individuals that still exist from the old population. When all the
individuals are replaced, the new population is generated (there is still mutation to be
applied).
Actually we never replace all the individuals in the population. In this manner we guarantee
that in every new population there will be a solution at least as good as the best one in the
previous population. The number of the preserved individuals is the second parameter that
influences the evolution. Both its initial value and its change during the execution are
controlled by the learning subsystem.
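The following Python sketch shows plain exponential ranking selection combined with the preservation of the best individuals. The modification used in our method is not reproduced here; the base c = 0.9, the function names and the crossover callback are assumptions made for illustration. Individuals are ranked by their penalty (lower is better), and the individual at rank i is chosen as a parent with a weight of c to the power i.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def ranking_weights(n: int, c: float = 0.9) -> List[float]:
    """Selection weight of each rank (rank 0 = best); 0 < c < 1."""
    return [c ** rank for rank in range(n)]


def next_generation(population: Sequence[T],
                    penalty: Callable[[T], float],
                    crossover: Callable[[T, T, random.Random], T],
                    n_preserved: int,
                    rng: random.Random,
                    c: float = 0.9) -> List[T]:
    ranked = sorted(population, key=penalty)          # best (lowest penalty) first
    weights = ranking_weights(len(ranked), c)
    children: List[T] = []
    while len(children) < len(ranked) - n_preserved:
        # Both parents are drawn from the old population; they may coincide.
        parent_a, parent_b = rng.choices(ranked, weights=weights, k=2)
        children.append(crossover(parent_a, parent_b, rng))
    # The best n_preserved individuals survive unchanged.
    return ranked[:n_preserved] + children
```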
For effective selection we have to define an adequate evaluation function that determines
the fitness score of each individual. For this purpose we implemented a method of negative
points, given to the individuals by the evaluation function on the basis of the values of the
parameters that, for the given problem, determine the quality of the obtained solution. The fewer
negative points an individual has, the better its fitness score and the greater its chance of being
selected for the crossover.
For each specific scheduling problem there are some parameters that determine the quality
of the obtained solutions. Regardless of the problem given, there are some general criteria of
the quality:
• overall duration of all activities,
• time of the idleness of resources,
• overall waiting time of all actors (time when an actor has to wait between two of its
activities),
• average waiting time of the actors, and
• maximum waiting time of an actor.
For all these parameters, lower values are better. Therefore we can
use their values as the negative points: the evaluation function is then simply the sum of the
parameters’ values. But not all the parameters are equally important, so we introduce
weights that increase or decrease their importance. The user can select these weights and in this
way direct the search. For example, if the highest weight is put on the parameter of
average waiting time, the algorithm will prioritize solutions in which the actors on
average do not wait long.
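A weighted sum of negative points over the five general criteria can be sketched in Python as follows. The tuple layout of a scheduled activity, the criterion names and the definition of idle and waiting time as the gaps between consecutive activities of a resource or an actor are our assumptions for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Scheduled = Tuple[int, int, int, int]   # (actor, resource, start, end)


def total_gap(spans: List[Tuple[int, int]]) -> int:
    """Sum of the gaps between consecutive (start, end) spans."""
    spans = sorted(spans)
    return sum(max(0, nxt[0] - cur[1]) for cur, nxt in zip(spans, spans[1:]))


def penalty(slots: Sequence[Scheduled], weights: Dict[str, float]) -> float:
    by_actor, by_resource = defaultdict(list), defaultdict(list)
    for actor, resource, start, end in slots:
        by_actor[actor].append((start, end))
        by_resource[resource].append((start, end))
    waits = [total_gap(spans) for spans in by_actor.values()]
    values = {
        "duration": max(s[3] for s in slots) - min(s[2] for s in slots),
        "idle": sum(total_gap(spans) for spans in by_resource.values()),
        "total_wait": sum(waits),
        "avg_wait": sum(waits) / len(waits),
        "max_wait": max(waits),
    }
    # Criteria without an explicit weight count with weight 1.
    return sum(weights.get(name, 1.0) * value for name, value in values.items())
```

Raising the weight of avg_wait, for instance, makes schedules with long average waiting times accumulate more negative points, so they are less likely to be selected.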
3.4. Crossover
When we select two individuals from the current population, we use crossover to construct a
new solution, which is placed into a growing new population (Figure 4). Both selected
individuals are divided into several parts by cutting the multi-dimensional model of the internal
representation of the individuals along the timeline (first dimension) on all of the resource
types. The new individual is then constructed by randomly choosing parts from both parents and
putting them together.
Figure 4. An example of two-dimensional model. Shaded fields from both parents construct new individual.
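A simplified crossover of this kind is sketched below in Python. It is not the exact operator of our system: it uses a single cut along the timeline instead of several parts, and it adds an assumed repair step (the tail of each resource column is filled with the activities still missing, in the order in which they appear in the second parent) so that every activity occurs exactly once in the child. The row-based table layout and all names are illustrative.

```python
import random
from typing import List, Optional

Individual = List[List[Optional[int]]]  # rows = time order, columns = resources


def columns(individual: Individual) -> List[List[int]]:
    """Activities of each resource in time order, without empty cells."""
    n_resources = len(individual[0]) if individual else 0
    return [[row[r] for row in individual if row[r] is not None]
            for r in range(n_resources)]


def rows(cols: List[List[int]]) -> Individual:
    """Rebuild the rectangular table, padding short columns with None."""
    depth = max((len(c) for c in cols), default=0)
    return [[c[t] if t < len(c) else None for c in cols] for t in range(depth)]


def crossover(parent_a: Individual, parent_b: Individual, rng) -> Individual:
    cut = rng.randint(0, len(parent_a))          # cut position on the timeline
    child = []
    for col_a, col_b in zip(columns(parent_a), columns(parent_b)):
        head = col_a[:cut]                       # part before the cut, from parent A
        taken = set(head)
        # Remaining activities follow in the order they have in parent B.
        child.append(head + [act for act in col_b if act not in taken])
    return rows(child)
```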
3.5. Mutation
After the new population is filled with individuals, we still have to apply the mutation
operator (Figure 5). For all the individuals, except for some number of preserved ones, the
mutation operator exchanges, with some probability, two randomly chosen activities.
Figure 5. An example of mutation. Shaded fields are exchanged.
Mutation probability is the third and last parameter that influences the evolution, and it is
again determined from the knowledge gathered by the learning subsystem. Running different
tests showed that the appropriate mutation probability for the described method is
considerably higher than usual. This is mainly a consequence of the fact that offspring inherit
more information from their parents by crossover than usual. Preserving some number of
individuals unchanged also plays an important role in this situation.
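The exchange of two activities can be sketched in Python as follows. We assume here that the two exchanged activities lie in the same resource column, so that every activity stays on its prescribed resource; this restriction, the row-based table layout and the function names are assumptions of the sketch.

```python
import random
from typing import List, Optional

Individual = List[List[Optional[int]]]  # rows = time order, columns = resources


def mutate(individual: Individual, probability: float,
           rng: random.Random) -> Individual:
    """With the given probability, swap two activities of one resource."""
    if rng.random() >= probability:
        return individual
    n_resources = len(individual[0]) if individual else 0
    # Only resources holding at least two activities can be mutated.
    candidates = [r for r in range(n_resources)
                  if sum(row[r] is not None for row in individual) >= 2]
    if not candidates:
        return individual
    r = rng.choice(candidates)
    times = [t for t, row in enumerate(individual) if row[r] is not None]
    t1, t2 = rng.sample(times, 2)
    mutated = [list(row) for row in individual]   # leave the original untouched
    mutated[t1][r], mutated[t2][r] = mutated[t2][r], mutated[t1][r]
    return mutated


def mutate_population(population: List[Individual], n_preserved: int,
                      probability: float, rng: random.Random) -> List[Individual]:
    """Mutate everyone except the first n_preserved (preserved) individuals."""
    return population[:n_preserved] + [mutate(ind, probability, rng)
                                       for ind in population[n_preserved:]]
```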
3.6. The process of schedule generation
For every given problem, we first generate some random test problems and execute them; in
this manner the initial knowledge needed for an adequate setting of the parameters is acquired.
The actual problem can also be executed several times to refine the gathered knowledge. The
evolution process starts with the seeding of the initial population by generating a number of
schedules randomly, taking care only that all individuals are feasible. Each individual is
then evaluated for its fitness score. According to the fitness score, better individuals have
a greater chance of being selected to produce new ones by crossover. The crossover phase is
repeated until new individuals fill the complete population (with the exception of some
number of the best individuals, which are preserved for the next generation). As the population
size is constant, every time a new individual is created, one old individual (with the lowest
fitness score) is eliminated. When all the new individuals are generated, the mutation
operator is applied with some probability. Next the parameters are modified on the basis of the
gathered knowledge. All the phases, from the evaluation of each solution’s fitness score to the
creation of a new population, are repeated until an acceptable solution is evolved.
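The whole process can be summarized by the following generic Python loop. It is a sketch under our own assumptions: the problem-specific steps (seed, penalty, crossover, mutate) and the learning step (adjust) are passed in as functions, the parameter names in params are hypothetical, plain exponential ranking with base rank_base stands in for the modified selection scheme, and an acceptable solution is one whose penalty does not exceed target.

```python
import random


def evolve(seed, penalty, crossover, mutate, adjust, params,
           max_generations, target, rng):
    """Run the evolution and return the best individual found."""
    # Seed and evaluate the initial population (best individual first).
    population = sorted((seed(rng) for _ in range(params["population_size"])),
                        key=penalty)
    for generation in range(max_generations):
        if penalty(population[0]) <= target:      # acceptable solution evolved
            break
        keep = params["n_preserved"]
        weights = [params["rank_base"] ** i for i in range(len(population))]
        children = []
        while len(children) < len(population) - keep:
            a, b = rng.choices(population, weights=weights, k=2)
            child = crossover(a, b, rng)
            # Only new individuals are mutated; preserved ones stay unchanged.
            children.append(mutate(child, params["mutation_prob"], rng))
        population = sorted(population[:keep] + children, key=penalty)
        # Learning subsystem: modify the parameters for the next generation.
        adjust(params, generation, penalty(population[0]))
    return population[0]
```

As long as n_preserved is at least 1, the best penalty can never get worse from one generation to the next.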
4. Application of the algorithm and the results
We applied the described algorithm to a problem of scheduling patients with different
physical therapy needs to a limited number of therapeutic devices and a limited number of
therapists. In this case, a solution is a schedule which we can consider to be an assignment of
patients to time intervals on the specific therapeutic devices that are operated by therapists,
trained for a specific device (or more of them). Patients are now the actors, therapies are the
activities and devices and therapists are two types of resources, that are necessary for a
therapy to be performed.
In this case, we have to ensure that the following constraints are fulfilled in order to obtain a
feasible solution:
• every patient can perform only one therapy at a time,
• every therapeutic device can be used by only one patient at a time,
• all therapies have to be performed in only one continuous time interval and can not be
broken in several parts,
• each therapy can be performed only on a specific prescribed therapeutic device, that
can be operated only by trained physical therapists, and
• some therapies have to be performed in an exact time order.
The criteria for the quality of the obtained solutions are the following:
• overall duration of all therapies,
• time when devices are idle,
• overall waiting time of patients,
• maximum waiting time of a single patient,
• average waiting time of patients who have more than one therapy, and
• average waiting time only for those patients who are actually waiting.
From these parameters we construct an evaluation function that determines the fitness score
of each individual solution. This application of the described algorithm is intended for small
specialized physical therapy studios. The patients in these studios are mostly children.
Therefore it is more important that all the children finish their therapies as
soon as possible, without waiting long between them, than that the idle time of the devices is
very low. By properly weighting the parameters in the evaluation function we were able to get an
almost optimal solution.
Another very important feature in this kind of application is the handling of so-called late
cancellations. This situation occurs when one (or more) of the activities is canceled after
others have already begun. For example, a patient may develop problems after a therapy and
therefore cancel some of the following ones. In this case we try to reschedule the remaining
activities to reduce the overall duration. This is a great problem for almost all other
scheduling methods, but our algorithm performed quite well because of the diversity kept in the
population.
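One simple way to prepare an individual for rescheduling after a cancellation is sketched below in Python. The function name cancel and the row-based table layout (rows are positions in the time order, columns are resources) are assumptions of the sketch. It only removes the canceled activities from the time order and leaves the order of the remaining ones untouched; keeping activities that have already begun at their fixed times is not shown.

```python
from typing import List, Optional, Set

Individual = List[List[Optional[int]]]  # rows = time order, columns = resources


def cancel(individual: Individual, cancelled: Set[int]) -> Individual:
    """Remove canceled activities while keeping the order of the others."""
    cleaned = [[None if activity in cancelled else activity for activity in row]
               for row in individual]
    # Rows that have become completely empty carry no information any more.
    return [row for row in cleaned if any(a is not None for a in row)]
```

Applying the same operation to every individual of the current population would let the evolution continue from the existing, still diverse population instead of starting again from a random one.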
As an example, let’s take a look at the results obtained with the described algorithm for a
situation with 5 different therapeutic devices and 22 patients, with more than 2
prescribed therapies per patient on average. Based on the knowledge gathered by the learning
subsystem we set the parameters: the population size was 67 individuals, the mutation probability
varied from 0.003 up to 2.7 %, and in each population between 5 and 6 individuals were preserved
unchanged for the next generation. There were 5 physical therapists available, each of them
for one therapeutic device. One special constraint was that all the patients have to perform
their therapies on device number 5 before the therapies on device number 1. We were
looking for solutions with the lowest possible overall duration and small waiting times
between therapies. One possible solution, evolved after only 165 generations, is shown in
Figure 6.
Figure 6. Results obtained with the described algorithm. Upper part shows the schedule for all
therapeutic devices (number in the box means patient number); lower part shows schedules for each of the patients.
The duration of all therapies in the obtained solution is 335 minutes, which is also the shortest
possible duration (look at device number 2). Devices 1, 3 and 4 are idle for 10, 5 and 30
minutes, but that does not affect the overall duration. The maximum total waiting time of a
single patient is 15 minutes, for patient number 21, who has 4 different therapies. The average
waiting time for all patients with more than one therapy is 5 minutes. Patients 1, 5 and 17, who
have to perform therapies on both devices 1 and 5, are scheduled correctly (the therapy on device
5 is always scheduled before the therapy on device 1 for the same patient).
5. Conclusion
In this paper we have described a new method for patient scheduling in highly constrained
situations. With the use of a genetic algorithm and additional machine learning we have
provided an effective method that finds high-quality solutions. The introduction of actors,
resources and activities allows all kinds of scheduling problems to be described and solved.
With the modification of the genetic algorithm, premature convergence to suboptimal
solutions is avoided and successful rearrangement of existing schedules is achieved. An
adequate internal representation of individuals, together with the chosen genetic operators,
guarantees that all solutions are feasible and in this manner makes possible the very important
user interaction with the system. The ease of controlling the direction of evolution by
adjusting the importance of the parameters that affect the quality of the final solution makes the
method even more appropriate for incorporation into an interactive scheduling tool.
Because of its effectiveness and low consumption of computing resources it could also be used
for very complex problems.
References
[1] Thomas Baeck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, Inc., 1996.
[2] E. K. Burke, D. G. Elliman, R. F. Weare. A Genetic Algorithm for University Timetabling. AISB Workshop on Evolutionary Computing, Leeds, 1994.
[3] E. K. Burke, D. G. Elliman, R. F. Weare. Automated Scheduling of University Exams. Proceedings of IEE Colloquium on Resource Scheduling for Large Scale Planning Systems, Digest No. 1993/144.
[4] E. K. Burke, D. G. Elliman, R. F. Weare. A Hybrid Genetic Algorithm for Highly Constrained Timetabling Problems. Proceedings of the 6th International Conference on Genetic Algorithms (ICGA’95, Pittsburgh, USA), pp. 605-610, Morgan Kaufmann, San Francisco, CA, USA.
[5] Kenneth De Jong. Learning with Genetic Algorithms: An Overview. In Bill P. Buckles, Frederick E. Petry (eds): Genetic Algorithms. IEEE Computer Society Press, Los Alamitos, CA, USA, 1994.
[6] Stephanie Forrest. Genetic Algorithms. ACM Computing Surveys, pp. 77-80, Vol. 28, No. 1, March 1996.
[7] David E. Goldberg. Genetic and Evolutionary Algorithms Come of Age. Communications of the ACM, pp. 113-119, Vol. 37, No. 3, March 1994.
[8] David E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Reading MA, 1989.
[9] John H. Holland. Adaptation in natural and artificial systems. MIT Press, Cambridge MA, 1975.