Deciding when to intervene: a Markov decision process approach

International Journal of Medical Informatics 60 (2000) 237–253


Paolo Magni a,*, Silvana Quaglini a, Monia Marchetti b, Giovanni Barosi b

a Dipartimento di Informatica e Sistemistica, Università degli Studi di Pavia, via Ferrata 1, I-27100 Pavia, Italy

b Laboratorio di Informatica Medica, IRCCS Policlinico S. Matteo, P.le Golgi 2, Pavia, Italy

Received 25 June 1999; received in revised form 17 April 2000; accepted 14 June 2000

Abstract

The aim of this paper is to point out the difference between static and dynamic approaches to choosing the optimal time for intervention. The paper demonstrates that classical approaches, such as decision trees and influence diagrams, hardly cope with dynamic problems: they cannot simulate all the real-world strategies and consequently can only calculate suboptimal solutions. A dynamic formalism based on Markov decision processes (MDPs) is then proposed and applied to a medical problem: prophylactic surgery in mild hereditary spherocytosis. The paper compares the proposed approach with a static approach on the same medical problem. The policy provided by the dynamic approach achieved a significant gain over the static policy by delaying the intervention time in some categories of patients. The calculations are carried out with DT-Planner, a graphical decision aid specifically built for dealing with dynamic decision processes. © 2000 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Decisions and time; Therapy planning; Decision theory; Markov decision processes; Decision models; Hereditary spherocytosis

www.elsevier.com/locate/ijmedinf

1. Introduction

Physicians are forced to decide about medical interventions when precipitating events threaten a patient’s life or when a patient’s well-being is severely affected. In these cases time constrains the decision. However, there are many circumstances in which time is a precious resource which clinicians rely upon.

When dealing with chronic disorders, physicians often adopt a ‘watchful waiting’ strategy, i.e. they postpone the decision up to a critical point where sufficient information has been gained from the evolving clinical scenario.

When dealing with prophylactic interventions, physicians aim to optimize the intervention time, so as to maximize the net benefit versus the risks posed by intervention.

* Corresponding author. Tel.: +39-0382-505511; fax: +39-0382-525638.

E-mail addresses: [email protected] (P. Magni), [email protected] (S. Quaglini), [email protected] (M. Marchetti), [email protected] (G. Barosi).

1386-5056/00/$ - see front matter © 2000 Elsevier Science Ireland Ltd. All rights reserved.

PII: S1386-5056(00)00099-X


Fig. 1. A static approach to determine an approximate solution to the optimal intervention time problem: the decision model is solved in some decision time points.

When dealing with monitoring, for example in sequencing drug administration, physicians periodically have to check and revise the therapeutic protocol.

When such problems, involving uncertainty, complexity and dynamic change, are faced using the decision theory framework, it is crucial that the decision models correctly treat the time resources available in the real world. In the most commonly used formalisms of medical decision making, i.e. decision trees [1] and influence diagrams [2,3], the major obstacle is the need to clearly establish both the decision-time, i.e. the time when alternative actions are evaluated (by forecasting their outcomes), and the intervention-time, i.e. the time when an action is performed. For this reason, a decision problem is easily managed when it does not directly involve uncertainty in the intervention time, i.e. in static decision problems. On the contrary, decision models hardly cope with problems that require the decision time to be considered explicitly, i.e. dynamic problems. As a matter of fact, modelling recurrent decisions by using embedded decision nodes produces an explosion of the decision tree, which becomes computationally intractable. For these reasons, in the field of decision analysis, decision time is conventionally managed through strategies that only roughly approximate the rich complexity of decision making in the real world and hence lead to sub-optimal treatment suggestions.

For example, a common strategy is to solve the static problem at several time instants (in general for different age classes of patients), and then to consider the set of solutions obtained as an approximated dynamic strategy (Fig. 1). With advancing age, outcome probabilities and the patient’s expected natural life-span change, usually causing a decline of expected utility. For example, when prophylactic mastectomy and prophylactic oophorectomy among women who have a genetically determined increased risk for breast and ovary cancer are compared with no prophylaxis, gains in life expectancy decline with the patient’s age at the time of prophylactic surgery [4].

However, in this way the decision model solves the problem at any age as if it were the only possible decision time, without taking into account that there are other decision time points and hence that the decision might be reconsidered later.

Another strategy used for managing uncertainty about intervention time is to include the option of postponing the intervention (but at fixed times) explicitly among the possible strategies of the decision model. The simplest decision model has three options: no intervention, immediate intervention and intervention some time later (Fig. 2). In this case, a reasonable delay has to be assessed a priori according to the clinical problem and supplied as an input to the decision analysis.

In the previous example, pregnant women who are at high risk of breast and ovarian cancer might nevertheless consider prophylactic surgery only after having completed child bearing and lactation. However, the comparison between the expected utilities of a 30-year-old cohort immediately undergoing the intervention and a cohort in which it is delayed by 10 years is only a rough approximation of other, possibly better, strategies, such as reconsidering the decision every year for the following 10 years.
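The limitation of the static comparison can be illustrated with a small numerical sketch. The model and all numbers below are hypothetical and are not taken from the study: each cycle spent waiting is assumed to yield a utility `U_WAIT`, each cycle after the intervention a utility `U_AFTER`, and intervening at time t is assumed to carry a one-off utility cost `COST[t]`. The static rule compares "intervene now" with "never intervene" at a single time point, whereas the dynamic rule compares "intervene now" with "wait one cycle and reconsider".

```python
def static_choice(t, horizon, u_wait, u_after, cost):
    """Static rule: compare 'intervene now' with 'never intervene' at time t only."""
    now = -cost[t] + u_after * (horizon - t)
    never = u_wait * (horizon - t)
    return "intervene" if now > never else "no intervention"


def dynamic_policy(horizon, u_wait, u_after, cost):
    """Dynamic rule: at each time point compare 'intervene now' with
    'wait one cycle and reconsider', working backwards from the horizon."""
    value = [0.0] * (horizon + 1)      # nothing is earned after the horizon
    choice = [None] * horizon
    for t in range(horizon - 1, -1, -1):
        now = -cost[t] + u_after * (horizon - t)
        wait = u_wait + value[t + 1]
        choice[t] = "intervene" if now > wait else "wait"
        value[t] = max(now, wait)
    return value, choice


# Hypothetical numbers (assumptions for illustration only): the one-off cost
# of intervening is higher at the first time point than later on.
HORIZON = 5
U_WAIT, U_AFTER = 0.9, 1.0
COST = [0.30, 0.05, 0.05, 0.05, 0.05]

static = [static_choice(t, HORIZON, U_WAIT, U_AFTER, COST) for t in range(HORIZON)]
value, dynamic = dynamic_policy(HORIZON, U_WAIT, U_AFTER, COST)
print(static[0], "|", dynamic[0], "|", round(value[0], 2))
```

With these assumed numbers the static rule recommends intervening at the first time point, because intervening beats never intervening, while the dynamic rule recommends waiting one cycle, because the option of intervening later at a lower cost is taken into account.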

Formulating plans under uncertainty and coping with dynamic decision problems is a major task of both artificial intelligence and control theory. Instruments have been developed to cope with dynamic problems, for example dynamic influence diagrams [5] and Markov decision processes (MDPs) [6]. Dynamic influence diagrams are a structural and semantic time extension of influence diagrams, whereas MDPs are the decision-theoretic extension of discrete Markov processes. The traditional formulation of MDPs through the transition matrix requires setting a great number of parameters (the probabilities of the transition matrix), which often do not have an immediate counterpart in the medical domain. In this sense, the black-box nature of the transition matrix kept MDPs from becoming popular in medical decision making. However, in the early nineties Tze-Yun proposed a powerful instrument, called influence views (IVs), to overcome this obstacle and provide a ‘view’ on a generic transition of MDPs [7]. IVs were later formalized by Magni et al. and are now supported by a software tool, called DT-Planner [8].

In this paper we show that MDP-IVs are a suitable instrument to cope with dynamic medical decision problems in general and, in particular, with the problem of assessing the intervention time. The case study is the optimal time for prophylactic splenectomy and cholecystectomy in patients affected by mild hereditary spherocytosis. In order to show the differences between the static solution and the dynamic one, we have solved this decision problem following both strategies.

2. Markov decision processes and influence views

2.1. MDPs

MDPs are formalisms based on decision theory and discrete-time Markov process theory [6]. Fig. 3 provides a schematic view of the main concepts of MDPs.

First of all, the time axis is discretized and labelled with N time points, called decision instants; each of the resulting intervals is called a ‘Markov cycle’. The cycle length is chosen according to the problem at hand and represents a plausible timing for patient monitoring.

Fig. 2. A static approach to determine an approximate solution to the optimal intervention time problem: different delays are compared adding branches to the decision tree.

Fig. 3. The main elements of a Markov decision process.

Thus, it is assumed that, at each time point, the decision maker may observe the current state of the Markov process being controlled. At the same time, the decision maker takes one action, from a finite set of possible actions, relying on the current state itself. Therefore the stochastic process is led to evolve along a driven trajectory, since the transition matrix (P) of the Markov process at each time point depends on the last action.

In order to compare all possible sequences of actions and to determine the optimal one, in accordance with decision theory, each strategy is finally scored by a utility function, computed as the sum of utility contributions, each of which depends on the process state and on the action performed at that step of the Markov chain. By maximizing the expected utility over an observed time period, called the time horizon, an optimal dynamic policy is obtained. This establishes the optimal action on the basis of the state assumed by the system and the decision time.
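The maximization described above can be sketched as finite-horizon dynamic programming (backward induction). This is a minimal illustration and not the DT-Planner implementation: the function name, the data layout (`P[t][a][s][s2]` for time- and action-dependent transition probabilities, `U[a][s]` for the per-cycle utility) and the two-state toy values are assumptions made for the example.

```python
def backward_induction(P, U, horizon):
    """Finite-horizon dynamic programming.

    P[t][a][s][s2]: probability of moving from state s to s2 during cycle t
                    under action a.
    U[a][s]:        utility earned in one cycle when action a is taken in state s.
    Returns (V, policy): V[t][s] is the optimal expected utility from time t on,
    policy[t][s] is the action that attains it.
    """
    n_actions = len(U)
    n_states = len(U[0])
    V = [[0.0] * n_states for _ in range(horizon + 1)]  # no utility after the horizon
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_a, best_q = 0, float("-inf")
            for a in range(n_actions):
                # immediate utility plus expected utility of the next state
                q = U[a][s] + sum(P[t][a][s][s2] * V[t + 1][s2]
                                  for s2 in range(n_states))
                if q > best_q:
                    best_a, best_q = a, q
            policy[t][s] = best_a
            V[t][s] = best_q
    return V, policy


# Hypothetical toy: states 0 = alive, 1 = dead; actions 0 = wait, 1 = intervene.
wait = [[0.90, 0.10], [0.0, 1.0]]
intervene = [[0.95, 0.05], [0.0, 1.0]]
P = [[wait, intervene] for _ in range(3)]   # same matrices at every time point
U = [[1.0, 0.0], [0.8, 0.0]]                # utility per cycle, by action and state
V, policy = backward_induction(P, U, horizon=3)
print(round(V[0][0], 3), policy[0])
```

The returned policy is indexed by both time point and state, which mirrors the statement that the optimal action depends on the state assumed by the system and on the decision time.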

2.2. IVs

IVs are directed acyclic graphs providing a representation of a single transition of an MDP and, as defined in [8], can include the following nodes:

2.2.1. State nodes

Each state node represents a variable obtained from the factorization¹ of the MDP state space. Since the IV specifies an MDP transition from one generic time epoch to the next, each state node appears twice in the IV, in order to express the same variable in the two different time epochs. They are called initial and final state nodes.

¹ Factorizing the state of an MDP means choosing the variables whose Cartesian product gives the whole state space.

2.2.2. Event nodes

Each event node represents an event variable placed between initial and final state nodes. Event variables help the user in the specification of the state transition and in the structuring of the state space. In other words, event variables allow the stochastic model of the generic transition to be broken down into more local, structured and simple stochastic models. This allows domain knowledge to be introduced into the MDPs and hence simplifies probability acquisition. We can distinguish two types of event nodes on the basis of their role: context nodes and transition nodes.

• Context nodes. These represent context variables. In the IV network, a context variable is a node that has no state nodes in its ancestral set (i.e. context nodes are not located on any path between initial and final state nodes). The use of context variables is a way to parameterize the prognostic model according to problem features that are not supposed to change during the decision making process. Typically the characteristics of different patients or populations of patients (e.g. sex) are represented by context nodes.

• Transition nodes. These represent transition variables. They are event nodes located on a path between initial and final state nodes. They are not observable, by definition, and are introduced to specify the probability distribution underlying a generic time transition in an easier way. Typically they represent causal relationships that may be elicited from the domain experts.

2.2.3. Numerical node

This kind of node, not yet defined in [8], has been adopted to introduce into the network some of the numeric parameters of the model. These nodes have only numerical nodes in their ancestral set. A typical example is the node age, when the patient’s age is used as a parameter in the definition of some age-variant probabilities.

2.2.4. Utility node

The utility node expresses the utility (cost) function of the MDP on a single transition. It can depend on the state nodes, event nodes and numerical nodes.

It is interesting to note that the numerical node and the utility node are ‘anomalous’ nodes in the probabilistic network, because they do not represent stochastic variables. A simple example of IVs is reported in Fig. 4.

The relations among problem variables are expressed graphically by arcs between nodes and quantified by conditional probabilities, which can be extracted from the literature or from large databases [9]. It is worth noting that arcs starting from numerical nodes are not probabilistic but functional.

Decision nodes could be used to represent the available options, as usually happens in influence diagrams. However, in order to build a more general framework, we preferred to require the specification of a separate network for each of the possible actions, rather than using a single network embedding the decision nodes. As a matter of fact, the model of each decision may involve a different set of variables, so that the IVs can differ both in their qualitative structure and in their parameter quantification.

Fig. 4. Example of influence view: Death and Interv are two state nodes; Age is a numerical node; NatDeath, Disease and DisDeath are event nodes. In particular, NatDeath is a context node while both Disease and DisDeath are transition nodes.


2.3. Solving MDP-IVs

The solution of MDPs described through IVs is obtained in two steps. In the first step, for each action, the transition matrix P of the MDP is computed by propagating the probability distributions on the IV network. At this point the MDP can be solved by using classical algorithms such as dynamic programming [10] or value iteration [11]. A detailed description of the algorithms implemented in DT-Planner can be found in [8].

Although during the computation of the optimal policy IVs are transformed into classical MDPs and are then solved by means of the same well-known algorithms, IVs offer some advantages. They allow us to specify probabilistic relations among a small number of random variables, instead of specifying the joint probability distribution. In other words, it is possible to describe the model by specifying ‘local’ knowledge about conditional dependencies between a few related events, rather than giving ‘global’ transition probabilities between all the possible states of the MDP. Furthermore, they ‘open a window’ that allows the domain knowledge underlying the transition among Markov chain states to be shown.
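The first step, deriving one row of the transition matrix from local conditional probabilities, can be sketched by brute-force marginalization over the event nodes. The network below is only loosely inspired by the node names of Fig. 4; its structure, the rule defining the final state, and all probability values are assumptions made for illustration, and the enumeration is not the propagation algorithm used by DT-Planner.

```python
from itertools import product

# Hypothetical local probabilities for one Markov cycle (illustrative values only).
P_NAT_DEATH = 0.01   # P(NatDeath = yes)
P_DISEASE = 0.20     # P(Disease = yes | alive at the beginning)
P_DIS_DEATH = 0.05   # P(DisDeath = yes | Disease = yes)


def transition_row_alive():
    """Sum over all event-node configurations to obtain
    P(final Death | initial Death = no)."""
    row = {"alive": 0.0, "dead": 0.0}
    for nat, dis, dis_death in product([True, False], repeat=3):
        p = P_NAT_DEATH if nat else 1 - P_NAT_DEATH
        p *= P_DISEASE if dis else 1 - P_DISEASE
        # assumed: death from the disease is possible only if the disease occurs
        p_dd_yes = P_DIS_DEATH if dis else 0.0
        p *= p_dd_yes if dis_death else 1 - p_dd_yes
        # assumed deterministic definition of the final state node
        final = "dead" if (nat or dis_death) else "alive"
        row[final] += p
    return row


# Death is absorbing, so the second row is fixed.
P = {"alive": transition_row_alive(), "dead": {"alive": 0.0, "dead": 1.0}}
print(P["alive"])
```

Only three local numbers are specified here, and the global transition probabilities follow from them, which is the advantage of the ‘local’ specification described above.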

3. A medical problem: prophylactic surgery in mild hereditary spherocytosis

Hereditary spherocytosis (HS) is the most common constitutional erythrocyte membrane disorder, characterized by a chronic destruction of red blood cells. The severity of the disease is variable and in 30% of cases it is mild, defined by a hemoglobin level over 11 g/dl, a reticulocyte count of 3–6% and a bilirubin level of 1–2 mg/dl [12]. Even though they are not anemic, patients with mild HS show a sustained erythropoiesis [13] predisposing them to episodes of Parvovirus-induced aplasia and hemolytic crisis, leading to an increased risk of gallstone formation.

In anemic HS patients splenectomy is always advocated to eliminate the main site of red blood cell destruction. The utility of splenectomy is, on the contrary, uncertain in patients with a mild form of the disease. The benefit of preventing adverse disease consequences therefore has to be balanced accurately against the risks of surgery, which include mortality, morbidity and post-splenectomy infections. Nevertheless, in the last 10 years new surgical and vaccinal opportunities have changed the trade-offs involved in the choice. Laparoscopic cholecystectomy has the potential of either curing or preventing gallstones with lower surgical risk and patient discomfort, while polyvalent anti-pneumococcal and anti-Haemophilus influenzae type b vaccinations along with lifelong penicillin prophylaxis have the potential of preventing post-splenectomy septic episodes.

The decision problem consists in assessing a therapeutic plan that specifies, in accordance with the patient’s conditions, when the prophylactic splenectomy and/or the prophylactic cholecystectomy are useful, in order to maximize the life (or the quality of life) of the patient.

4. The decision model

In order to cope with the HS therapeutic problem we have built a dynamic decision model based on the MDP-IVs approach. We fix the Markov cycle at 1 year. According to the IV framework, the probabilistic model of patient evolution can be described by four IVs, one for each possible choice (i.e. no prophylactic surgery, prophylactic cholecystectomy, prophylactic splenectomy, prophylactic splenectomy and cholecystectomy), and by a state space factored into two state variables (i.e. state of gallstones and state of spleen). The state of gallstones variable describes the pathologic state (and the presence) of the gallbladder whereas the state of spleen variable represents the presence of the spleen (Table 1).

Table 1. State nodes used to describe the patient’s state

State variable | Node name | Levels
State of gallstones | GalClass | No gallstones (gallbladder present); gallstones asymptomatic; gallstones occasional colics; gallstones recurrent colics; no gallstones (gallbladder removed); no gallstones (death)
State of spleen | Spleen | Present; removed (1 year); removed (2 years); removed (3 years); removed (4 years); removed (>4 years)

Fig. 5. Gallstones history model.

Their Cartesian product determines the patient’s state, which in some conditions limits the set of possible choices.
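As a rough sketch of this factored state space, the two state variables can be enumerated and combined. The level names follow Table 1; the rule restricting the available actions is an assumption introduced for illustration (an organ that has already been removed cannot be removed again, and no surgery is offered in the death state), since the text does not spell out the exact restrictions.

```python
from itertools import product

GAL_CLASS = [
    "no gallstones (gallbladder present)",
    "gallstones asymptomatic",
    "gallstones occasional colics",
    "gallstones recurrent colics",
    "no gallstones (gallbladder removed)",
    "no gallstones (death)",
]
SPLEEN = [
    "present",
    "removed (1 year)",
    "removed (2 years)",
    "removed (3 years)",
    "removed (4 years)",
    "removed (>4 years)",
]

# The patient's state is a pair (state of gallstones, state of spleen).
STATES = list(product(GAL_CLASS, SPLEEN))


def allowed_actions(state):
    """Assumed restriction of the action set by the current state."""
    gal, spleen = state
    if gal == "no gallstones (death)":
        return ["no surgery"]
    has_gallbladder = gal != "no gallstones (gallbladder removed)"
    has_spleen = spleen == "present"
    actions = ["no surgery"]
    if has_gallbladder:
        actions.append("cholecystectomy")
    if has_spleen:
        actions.append("splenectomy")
    if has_gallbladder and has_spleen:
        actions.append("splenectomy and cholecystectomy")
    return actions


print(len(STATES))
print(allowed_actions(("gallstones asymptomatic", "removed (2 years)")))
```

Six levels for each variable give 36 combined states, which is still small enough to be solved exactly once the transition matrices are available.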

In gallstone history we distinguished patients without gallstones, patients with asymptomatic gallstones, i.e. gallstones found through echography but without clinical manifestation, patients with gallstones and occasional biliary colics, i.e. fewer than three episodes in the last year, and patients with gallstones and recurrent biliary colics, i.e. more than three episodes in the last year [14].

As shown in Fig. 5, after each step of the Markov chain (1 year), a patient can:

• be unchanged: without gallstones, without colics, with occasional colics, without gallbladder;

• develop asymptomatic gallstones;

• develop colics if gallstones were asymptomatic.

In order to model gallstone history exhaustively, we consider, for this state node, two other levels: no gallstones (gallbladder removed) and no gallstones (death). Of course, in both cases gallstones cannot develop during the Markov cycle.

In order to model these assumptions by IV networks, we introduce the following yes/no event nodes: NoGal1, AsymGal1, SymGal1, NoGallbl1, Death1 and NoGal2, AsymGal2, SymGal2, NoGallbl2, Death2. They represent a re-classification of the GalClass node levels useful to express the model of conditional independence underlying the Markov transition. In particular, NoGal refers to the level without gallstones, AsymGal to the level asymptomatic gallstones, SymGal to the levels occasional and recurrent colics, NoGallbl to the level no gallstones (gallbladder removed) and finally Death to the level no gallstones (death). Suffixes 1 and 2 refer respectively to the re-classification of the initial and final state nodes. Note that arcs between GalClass and NoGal1, AsymGal1, SymGal1, NoGallbl1, Death1 express deterministic relations instead of probabilistic ones.

Table 2. Event nodes used in the influence view to describe the transition among the patient’s states

Event variable | Node name | Levels
Without gallstones at the beginning | NoGal1 | Yes, No
Without gallstones in the end | NoGal2 | Yes, No
Asymptomatic gallstones at the beginning | AsymGal1 | Yes, No
Asymptomatic gallstones in the end | AsymGal2 | Yes, No
Symptomatic gallstones at the beginning | SymGal1 | Yes, No
Symptomatic gallstones in the end | SymGal2 | Yes, No
Without gallbladder at the beginning | NoGallbl1 | Yes, No
Without gallbladder in the end | NoGallbl2 | Yes, No
Death at the beginning | Death1 | Yes, No
Death in the end | Death2 | Yes, No
Complications | Complic | No, Cholecystitis, Pancreatitis
Death due to complications | ComplDeath | Yes, No
Episodes of sepsis | Sepsis | Yes, No
Death due to sepsis | SepDeath | Yes, No
Anti-infective prophylaxis | Pre-Splen | No, Vaccination, Vaccination and Penicillin
Presence of spleen | SpleenPres | Yes, No
Hemolytic crisis | HemolCris | Yes, No
Patient’s age | Age | Value
Patient’s sex | Sex | Male, Female
Death due to other causes | NatDeath | Yes, No
Death due to prophylactic surgery | SurgDeath | Yes, No
Classes of age (years) | AgeClass1 | 0, 1, 2,…,101
Classes of age (years) | AgeClass2 | 0, 5, 15,…, 75
Classes of age (years) | AgeClass3 | ≤40, >40
Classes of age (years) | AgeClass4 | ≤30, 30–70, >70
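The deterministic re-classification of GalClass into yes/no indicator nodes can be sketched as a simple lookup. The level strings follow Table 1 and the grouping follows the description above; the function name and the dictionary representation are assumptions made for the example.

```python
GAL_CLASS_LEVELS = [
    "no gallstones (gallbladder present)",
    "gallstones asymptomatic",
    "gallstones occasional colics",
    "gallstones recurrent colics",
    "no gallstones (gallbladder removed)",
    "no gallstones (death)",
]


def reclassify(gal_class):
    """Map a GalClass level to the indicator nodes NoGal, AsymGal, SymGal,
    NoGallbl and Death (True = yes, False = no). The relation is deterministic:
    exactly one indicator is True for every level."""
    if gal_class not in GAL_CLASS_LEVELS:
        raise ValueError(f"unknown GalClass level: {gal_class}")
    return {
        "NoGal": gal_class == "no gallstones (gallbladder present)",
        "AsymGal": gal_class == "gallstones asymptomatic",
        # occasional and recurrent colics are merged into one indicator
        "SymGal": gal_class in ("gallstones occasional colics",
                                "gallstones recurrent colics"),
        "NoGallbl": gal_class == "no gallstones (gallbladder removed)",
        "Death": gal_class == "no gallstones (death)",
    }


for level in GAL_CLASS_LEVELS:
    print(level, "->", [name for name, on in reclassify(level).items() if on])
```

Because the mapping is a function of the state node alone, the corresponding conditional probability tables contain only zeros and ones.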

Acute cholecystitis and biliary pancreatitis, which are possible acute biliary complications, were modelled through the node Complic. In both cases surgery is required so that, when this event occurs, the patient might either die or live without the gallbladder.

Additionally, we considered baseline natural mortality, which depends on sex and age. For this reason we introduced two event variables: Age and Sex. The node Age is a numerical node and it gives values to the nodes AgeClass1, AgeClass2, AgeClass3 and AgeClass4, which represent different classifications (see Table 2).
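The functional (non-probabilistic) arcs from the numerical node Age to the age-class nodes can be sketched as plain functions. The sketch covers only AgeClass3 and AgeClass4 and assumes cut points at 40 years and at 30 and 70 years respectively, with boundary ages falling in the lower class; both the cut points and the boundary convention are a reading of Table 2 and should be treated as assumptions.

```python
def age_class3(age):
    """Two-level age classification (assumed cut point: 40 years)."""
    return "<=40" if age <= 40 else ">40"


def age_class4(age):
    """Three-level age classification (assumed cut points: 30 and 70 years)."""
    if age <= 30:
        return "<=30"
    if age <= 70:
        return "30-70"
    return ">70"


for age in (6, 30, 50, 70, 71):
    print(age, age_class3(age), age_class4(age))
```

Each age-class node is therefore fully determined once Age is instantiated, so these nodes need no probability elicitation.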

The presence of the spleen involves an increased probability of gallstone formation, while its absence causes a high risk of infections (sepsis), mitigated by drug prophylaxis and/or vaccination (Pre-Spleen).

Fig. 6. Influence view related to the decision no surgery.

Fig. 7. Influence view related to the decision cholecystectomy.

Sepsis incidence depends on the length of time since the spleen was removed [15,16]. In accordance with the literature we set a constant risk of sepsis after the fourth year following splenectomy. We assume that sepsis prophylaxis abolishes the sepsis risk.

The IV obtained for the first action (No surgery) is shown in Fig. 6.

The structure of the other IVs can be easily derived by simplifying the IV related to the No surgery action. Nevertheless, it is necessary to introduce another node that represents the risk of death caused by surgical treatment (SurgDeath). Obviously, prophylactic surgery adds some penalties in the QALYs (quality adjusted life years) model.

Fig. 8. Influence view related to the decision splenectomy.

Fig. 9. Influence view related to the decision splenectomy and cholecystectomy.

Thus, the IV for the action Cholecystectomy, shown in Fig. 7, is derived from the first one (Fig. 6) by simplifying the part of the probabilistic model which describes gallstone formation. In fact, at the end of the Markov cycle the GalClass state variable can only assume the values no gallstones (gallbladder removed) or no gallstones (death).

On the other hand, the IV relating to the action Splenectomy, shown in Fig. 8, is obtained from the one represented in Fig. 6 by simplifying the probabilistic model related to the state node Spleen. In fact, at the end of the Markov cycle after the Splenectomy, the state node Spleen assumes the value removed (1 year). For this reason nodes such as HemolCris or SpleenPres disappear.

Fig. 10. Static model: an approximate solution to the problem of determining the optimal intervention time for male patients with a pre-splenectomy treatment prescribing both vaccination and penicillin. Blank, no surgery; C, cholecystectomy; S, splenectomy; SC, splenectomy and cholecystectomy.
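The evolution of the Spleen state node over successive Markov cycles can be sketched as a deterministic update. The level names follow Table 1, and the value removed (1 year) at the end of the cycle in which splenectomy is performed follows the text; the assumption added here is that each further cycle advances the counter by one year until the last level, removed (>4 years), which then stays unchanged. The sketch ignores death, which is tracked by the GalClass node.

```python
SPLEEN_LEVELS = [
    "present",
    "removed (1 year)",
    "removed (2 years)",
    "removed (3 years)",
    "removed (4 years)",
    "removed (>4 years)",
]


def next_spleen_state(state, splenectomy=False):
    """Return the Spleen level at the end of one Markov cycle (1 year)."""
    if state == "present":
        # the spleen is removed only if splenectomy is the chosen action
        return "removed (1 year)" if splenectomy else "present"
    i = SPLEEN_LEVELS.index(state)
    # assumed: the counter advances one level per cycle and stops at the last one
    return SPLEEN_LEVELS[min(i + 1, len(SPLEEN_LEVELS) - 1)]


state = next_spleen_state("present", splenectomy=True)
history = [state]
for _ in range(5):
    state = next_spleen_state(state)
    history.append(state)
print(history)
```

Keeping the years since removal in the state is what allows a sepsis risk that depends on the time elapsed since splenectomy to be expressed within a Markov model.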

The IV related to the action Splenectomy and cholecystectomy (Fig. 9) is the most simplified because it includes both of the simplifications described above.

The conditional probability tables of this graphical model are derived in the following way. Nodes Age, Sex and Pre-Spleen are instantiated in accordance with the characteristics of the patient or of the class of examined patients. The tables of nodes such as NoGal1, AsymGal1, SymGal1, NoGallbl1 and Death1 and those of the nodes introduced to discretize the numerical node Age (AgeClass1, AgeClass2, AgeClass3 and AgeClass4) derive from the node definitions. The table of node NatDeath is derived from Italian mortality tables [17] related to the general population. The other probability tables are derived from previous studies in a similar way as in [18].

5. The resulting policies

The utility function, used to compare the different strategies, is based on the QALYs model as reported in [18]. The time horizon, on which the optimal therapy has been derived and on which the QALYs score has been maximized, is the patient’s whole life.

In particular a male patient, over six², following complete pre-splenectomy prophylaxis (i.e. antipneumococcal vaccination and penicillin), was considered. Since in this patient the risk of sepsis is null, in the state node Spleen the levels removed (1 year), removed (2 years),…, may be summarized into the level Absent.

² It is known that for children under 6 a splenectomy is not recommended [19].

Fig. 11. Differences in terms of quality adjusted life expected days from suggested prophylactic surgery and no prophylactic surgery in 6, 30, 50, 70 year old males. In the case of recurrent colics the difference is made with respect to cholecystectomy.

This medical problem was tackled, using the same model, following both the classical static approach and the proposed dynamic one.

5.1. The static approach

Following the strategy shown in Fig. 1, the MDP-IV model was solved by fixing the decision time for different sex and age classes. The solutions for male patients are summarized in Fig. 10 and the gain of the suggested prophylactic strategies as compared to no prophylactic surgery, in 6, 30, 50, 70 year old males, is reported in Fig. 11.

If the spleen is still present in patients without gallstones at the decision time, splenectomy alone proves worthwhile for prophylactic purposes until the age of 35: 30 and 6-year-old patients would gain, respectively, 49 and 123 quality adjusted life days (QALDs). On the contrary, if the spleen has already been removed no cholecystectomy is required until gallstones appear.

In patients with gallstones, splenectomy and cholecystectomy provide a gain in life expectancy whose magnitude declines with increasing age, because of decreasing natural life expectancy. Patients with asymptomatic gallstones (with spleen) and under 45 have the highest benefit from splenectomy combined with cholecystectomy. The gain with respect to no prophylactic surgery is 400 QALDs in 6-year-old males, and 169 QALDs in 30-year-old males. Cholecystectomy is, instead, never worth performing prophylactically. If the spleen was already removed, cholecystectomy alone is suggested until 67 years, with a gain of 259 QALDs in 6-year-old males, 127 QALDs in 30-year-old males and 40 QALDs in 50-year-old males.

In patients with occasional biliary colics (with spleen), the gain in life expectancy from splenectomy combined with cholecystectomy is higher than that of cholecystectomy alone until the threshold age of 53. With respect to no prophylactic surgery, it provided a gain of 540 QALDs in 6-year-old males, 322 QALDs in 30-year-old males and 71 QALDs in 50-year-old males. After the threshold age, cholecystectomy alone results in a higher gain until the age of 70. At 70 cholecystectomy alone and no surgery are comparable strategies. On the other hand, for patients without spleen, cholecystectomy alone is always required, with a gain of 315 QALDs at 6, 218 QALDs at 30, 137 QALDs at 50 and 63 QALDs at 70.

For patients who are candidates for cholecystectomy because of recurrent biliary colics, and who are under 53, the best strategy is to combine cholecystectomy with splenectomy. The gain with respect to cholecystectomy alone is 387 QALDs in 6-year-old males. At the ages of 30 and 50 patients could gain respectively 243 and 43 QALDs.

5.2. The dynamic approach

In Fig. 12 the optimal intervention time for 6-year-old HS patients is reported. Given the possibility of postponing the decision, the best treatment for a 6-year-old patient without gallstones is to wait before prophylactic splenectomy. Every year a clinical check-up is scheduled, but if gallstones do not develop, no surgery is suggested until the age of 15, when splenectomy is suggested. On the other hand, if gallstones appear (before 15 years of age), both cholecystectomy and splenectomy are suggested. Most (88%) of the 6-year-old children, however, do not develop gallstones until they are 15, and, therefore, splenectomy is performed at that age. Furthermore, they should undergo cholecystectomy as soon as asymptomatic gallstones are discovered by echography if they are under 55. Otherwise, cholecystectomy should be delayed until the first colic.


Fig. 12. The optimal intervention time for 6-year-old male patients with a pre-splenectomy treatment prescribing both vaccination and penicillin. Blank, no surgery; C, cholecystectomy; S, splenectomy; SC, splenectomy and cholecystectomy.

Fig. 13. Static solution (white) and dynamic solution (grey): table of treatment for male patients with a pre-splenectomy treatment prescribing both vaccination and penicillin. Blank, no surgery; C, cholecystectomy; S, splenectomy; SC, splenectomy and cholecystectomy.

It is interesting to remark that the table of therapy proposed in this section for 6-year-old patients actually gives the optimal therapy for every male patient. In fact, for example, the decision suggested by the table after 10 years of follow-up (time point ten) coincides with the decision for a 15-year-old subject at his first medical encounter. This fact is caused mainly by the structure of the HS problem and by the fixed time horizon of 100 years adopted in this problem. For this reason the optimal treatment forecasted for a 6-year-old patient can also be read as the optimal therapeutic protocol for a generic male patient, given perfect compliance with vaccination and prophylaxis before splenectomy. Given these considerations it is possible to easily compare the optimal strategy derived from the dynamic perspective with the approximated one derived from the static perspective.


5.3. Dynamic model 6s. static model

Starting from Fig. 10, which reports thesolution of the static model at different ages,and from Fig. 12, which shows the optimalsolution derived using a dynamic decisionmodel, we can build a table (Fig. 13) tocompare the two models.

They differ in a lot of cases. Particularlyinteresting is the therapy of children under 15.Considering a 6-year-old male without gall-stones and with spleen, the static model sug-gests splenectomy (i.e. splenectomy for6-year-old children), whereas the dynamicmodel suggests postponing splenectomy untilthey are 15, if gallstones are not found before.Clearly, the difference between these two rec-ommendations is substantial. In fact, theadoption of a therapeutic protocol recom-mending splenectomy in the youngest patientshas obvious relevant social and psychologicalimplications.

Moreover, it is interesting to note that the most efficient protocol, obtained by adopting dynamic models, corresponds to a higher life expectancy. With the QALD measure, the score (life expectancy) of each policy clearly depends on the set of weights used to express quality of life in each health state. A sensitivity analysis on these parameters verified that the difference in total life expectancy obtained by a 6-year-old child following the two protocols varies from some tens to several hundreds of days, depending on the utility coefficients adopted. In particular, with the quality weights reported in [18], life expectancy is 24814 QALDs for the dynamic decision and 24802 QALDs for the static decision. Therefore, the difference is 12 QALDs. Note that a healthy 6-year-old child has a life expectancy of 25018 QALDs, that is, only 204 QALDs longer than children of the same age affected by mild HS. Even if it looks very small, the difference between static and dynamic models represents about 6% of the gap between the life expectancy of healthy and HS children. This low gain depends on the mildly increased risk of death and on the slightly lowered quality of life in post-surgery states (1 without gallstones, with or without spleen; 0.99 without gallbladder and without spleen).
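The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the three life expectancies quoted in the text; the variable names are ours and purely illustrative.

```python
# Quality-adjusted life expectancy at age 6, in QALDs, as quoted in the text
qald_dynamic = 24814   # policy from the dynamic (MDP) model
qald_static = 24802    # policy from the static model
qald_healthy = 25018   # healthy 6-year-old child

gain = qald_dynamic - qald_static    # benefit of the dynamic policy
gap = qald_healthy - qald_dynamic    # residual loss attributable to mild HS
share = gain / gap                   # fraction of that gap the gain represents

print(gain, gap, f"{share:.1%}")     # prints: 12 204 5.9%
```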

6. Discussion

In medical decision making, time can be seen as a sequence of health-related events, with specific probabilities of occurrence, and health interventions, which may modify such probabilities. Since health-related events have interdependent and time-dependent probabilities of occurrence, planning health interventions is a very difficult task. In this paper we tackled the problem of planning a postponed decision to perform a health intervention. We compared static strategies, in which the decision is taken at a prefixed time, with more sophisticated strategies. To model postponed decisions realistically, we used a framework based on MDPs [8], which allows us to reconsider, at any decision time, the choice between ‘immediate intervention’ and ‘delay the decision’. The influence view formalism also allowed us to represent the problem clearly and efficiently.
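The choice between ‘immediate intervention’ and ‘delay the decision’ can be illustrated with a toy finite-horizon MDP solved by backward induction. The sketch below is not the HS model of this paper and does not reproduce DT-Planner: the yearly risks, the age threshold in `op_risk`, the unit quality weights and all function names are hypothetical, chosen only to show how a static comparison (operate now vs. never operate) and a dynamic policy (operate now vs. re-decide next year) can disagree.

```python
# Illustrative yearly parameters; all values are hypothetical, not from the paper.
FIRST_AGE, LAST_AGE = 6, 79
M_WAIT = 0.003   # assumed yearly death risk while surgery is deferred
M_POST = 0.001   # assumed yearly death risk after surgery


def op_risk(age):
    """Assumed surgery-related death risk, higher in young children."""
    return 0.03 if age < 15 else 0.005


def solve():
    """Backward induction over age; returns policies and values per age."""
    post = never = best = 0.0            # values beyond LAST_AGE
    dynamic, static, value, op_value = {}, {}, {}, {}
    for age in range(LAST_AGE, FIRST_AGE - 1, -1):
        post = 1.0 + (1 - M_POST) * post       # expected life-years once operated
        operate = (1 - op_risk(age)) * post    # intervene immediately
        wait = 1.0 + (1 - M_WAIT) * best       # delay, then re-decide next year
        never = 1.0 + (1 - M_WAIT) * never     # static alternative: never operate
        dynamic[age] = "operate" if operate > wait else "wait"
        static[age] = "operate" if operate > never else "no surgery"
        best = max(operate, wait)
        value[age], op_value[age] = best, operate
    return dynamic, static, value, op_value


dynamic, static, value, op_value = solve()
first_op = min(age for age, act in dynamic.items() if act == "operate")
print("static advice at 6:", static[6])
print("dynamic advice at 6:", dynamic[6], "| first operation at age", first_op)
print("gain from waiting (life-years):", round(value[6] - op_value[6], 2))
```

With these assumed numbers the static comparison recommends surgery at age 6, whereas the dynamic policy defers it to age 15, because the option to operate later at lower risk is worth more than operating at once. The qualitative pattern, not the magnitudes, is the point of the example.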

We applied this model to the decision about prophylactic surgery in patients with mild HS. The static models applied at different ages, which cannot postpone the decision, found splenectomy worth carrying out in patients without gallstones aged 6 to 35 and in patients with gallstones aged 6–44. In the latter group, cholecystectomy is also worth doing. With the dynamic model the solution changes: in patients seen for the first time at the age of 6, the decision to perform splenectomy can be postponed up to the age of 15.


Nine years of additional waiting are modest in terms of utility (QALYs), because of the small difference between the utility of the two strategies, intervention or waiting, but can be very important from social and psychological points of view. However, the two approaches may differ substantially in problems dealing with riskier interventions or more severe diseases. Moreover, if post-splenectomy quality of life is age-dependent, the delay of the decision can become relevant.

In this paper we demonstrated that MDP-IVs can support the planning of treatment for patients with chronic diseases, in order to capture the best time for interventions. Moreover, we pointed out the importance of the dynamic component of a decision problem, since if it is neglected a suboptimal strategy may be chosen.

The MDP-IV framework may be adopted to cope with other decision problems in which time plays a crucial role. In particular, the following medical problems are candidates for such an approach:
• genetic and biochemical screening for hereditary hemochromatosis (sequence and timing of the screening tests) [20];
• aortic valve replacement for calcific aortic valve stenosis (timing of intervention and patient selection) [21];
• prophylactic mastectomy and oophorectomy in carriers of BRCA1 or BRCA2 mutations (sequence and timing);
• screening and prophylactic intervention for carotid artery stenosis (timing of follow-up and intervention);
• prophylactic surgery for small abdominal aortic aneurysms (timing and patient selection);
• screening and prophylactic surgery of cerebral aneurysms (patient selection and timing);
• graft versus host disease prophylaxis after bone marrow transplantation [22];
• optimal duration of anticoagulant therapy following venous thromboembolism.

Some of these decision problems have already been solved in the literature by using a static approach. We found that by adopting a dynamic model of the decision problem using MDP-IVs the results were more finely tuned and clinically relevant. In particular, we illustrate below two applications and compare the suggestions of the dynamic model with those of published static models.

In the field of genetically determined diseases, hereditary hemochromatosis may be screened with a biological test in adults and with a molecular assay in individuals of any age. Biological screening includes three types of tests, and may require confirmation by either a molecular assay or liver biopsy. Both biological and molecular screenings are cost-effective [23,24]; however, the analyses were performed for the ideal scenario of 30-year-old males and a prefixed sequence of tests, all of which had to be performed in the same screening session. DT-Planner allowed us to show the optimal sequence of tests in different age cohorts: the most cost-beneficial strategy overall was to perform genetic screening of infants and, about 20 years later, biochemical screening.

The dynamic approach may also be used to model valve replacement for aortic valve stenosis. A prosthetic valve may be implanted with open-heart surgery, but complications related to prosthetic valves may also be fatal, so the timing of valve replacement needs to be finely tuned. Moreover, patients who are older or have a coronaropathy face a higher risk both from open-heart surgery and from leaving the stenosis untreated. The trade-off is difficult, and modelling was so hard that no static model has ever been built for this problem. The dynamic model implemented with DT-Planner provided qualitative recommendations which completed and finely tuned the recent guidelines of the American Heart Association [21]. As a matter of fact, the model calculated that patients under 60 with a moderate aortic stenosis, who are usually managed conservatively, were better off having surgical replacement as long as they were free of coronaropathy and heart failure. The results of the model also allowed clinicians to ask for further studies investigating critical issues, such as the indication for surgery in patients with overt heart failure.

For patients with a moderately severe aortic stenosis and without either coronaropathy or heart failure, the model prescribes aortic valve replacement if they are over 60, since the burden of the disease would be higher for older patients. The life expectancy of patients treated as recommended by the model was calculated to be far higher than under three alternative policies: never operating, replacing the valve once symptoms occur, and replacing the valve once the stenosis gets worse. The model allowed a saving of 3–5 years of life with respect to these three strategies.

In addition to the decision problems mentioned, in the near future we plan to implement other decision models, even if certain models require much effort to track down the literature and elicit time-variant probabilities. Moreover, future efforts will be devoted to the complete definition of the MDP-IV structure for performing cost-utility analysis, and to implementing the necessary facilities in the DT-Planner tool.

Acknowledgements

We would like to thank the anonymous reviewers for their helpful suggestions.

References

[1] R. Howard, Decision Analysis: Introductory Lectures on Choices Under Uncertainty, Addison–Wesley, Reading, MA, 1968.

[2] R.M. Oliver, J.Q. Smith, Influence Diagrams, Belief Nets and Decision Analysis, John Wiley & Sons, New York, 1990.

[3] D.K. Owens, R.D. Shachter, R.F. Nease, Representation and analysis of medical decision problems with influence diagrams, Med. Decis. Mak. 17 (3) (1997) 241–262.

[4] D.K. Schrag, K.M. Kuntz, J.E. Garber, J.C. Weeks, Decision analysis: effects of prophylactic mastectomy and oophorectomy on life expectancy among women with BRCA1 or BRCA2 mutations, New Engl. J. Med. 336 (1997) 1465–1471.

[5] J. Tatman, R. Shachter, Dynamic programming and influence diagrams, IEEE Trans. Syst. Man Cybern. 20 (1990) 365–379.

[6] T. Dean, M. Wellman, Planning and Control, Morgan Kaufmann, San Mateo, CA, 1991.

[7] T.Y. Leong, An Integrated Approach to Dynamic Decision Making under Uncertainty, PhD thesis, MIT, 1994.

[8] P. Magni, R. Bellazzi, DT-Planner: an environment for managing dynamic decision problems, Comp. Methods Prog. Biomed. 54 (1997) 183–200.

[9] C. Cao, T.-Y. Leong, A. Leong, F.C. Seow, Dynamic decision analysis in medicine: a data-driven approach, Int. J. Med. Informat. 51 (1998) 13–28.

[10] D. Bertsekas, Dynamic Programming, Prentice Hall, Englewood Cliffs, 1987.

[11] H.C. Tijms, Stochastic Modelling and Analysis: A Computational Approach, John Wiley & Sons, New York, 1986.

[12] S.W. Eber, R. Armbrust, W. Schoter, Variable clinical severity of hereditary spherocytosis: relation to erythrocytic spectrin concentration, osmotic fragility and autohemolysis, J. Ped. 117 (1990) 409–416.

[13] R. Guarnone, E. Centenara, M. Zappa, A. Zanella, G. Barosi, Erythropoietin production and erythropoiesis in compensated and anaemic states of hereditary spherocytosis, Br. J. Haematol. 92 (1996) 150–154.

[14] J. Lund, Surgical indications in cholelithiasis: prophylactic cholecystectomy elucidated on the basis of long-term follow-up on 526 non-operated cases, Ann. Surg. 151 (1960) 153–161.

[15] W.D. Erickson, E.O. Burgert, H.B. Lynn, The hazard of infection following splenectomy in children, Am. J. Child. 116 (1968) 1–12.

[16] R.J. Holdsworth, A.D. Irving, A. Cuschieri, Postsplenectomy sepsis and its mortality rate: actual versus perceived risks, Br. J. Surg. 78 (1991) 1031–1038.

[17] ISTAT, Annuario Statistico Italiano, Roma, 1995.

[18] M. Marchetti, S. Quaglini, G. Barosi, Prophylactic splenectomy and cholecystectomy in mild hereditary spherocytosis: analyzing the decision in different clinical scenarios, J. Inter. Med. 244 (1998) 217–226.

[19] R.D. Croom III, C.W. McMillan, G.F. Sheldon, E.P. Orringer, Hereditary spherocytosis: recent experience and current concepts of pathophysiology, Ann. Surg. 203 (1986) 34–39.

[20] M. Marchetti, P. Magni, S. Quaglini, G. Barosi, A Markov decision process: screening hereditary hemochromatosis, Proceedings of the 21st Annual Meeting of the Society for Medical Decision Making (Nevada (USA), 3–6 October 1999), p. 532.

[21] A. Minetti, Strategie per la sostituzione valvolare aortica nei pazienti con stenosi aortica calcifica. Un’analisi decisionale, Masters thesis, Università degli Studi di Pavia, 1999 (in Italian).

[22] P. Magni, R. Bellazzi, F. Locatelli, Using uncertainty management techniques in medical therapy planning: a decision-theoretic approach. Applications of Uncertainty Formalisms. In: A. Hunter, S. Parsons (Eds.), Lecture Notes in Artificial Intelligence 1455 (Subseries of Lecture Notes in Computer Science), Springer, 1998, chapter 3, pp. 38–57.

[23] G.J. Buffone, J.R. Beek, Cost-effectiveness analysis for evaluation of screening programs: hereditary hemochromatosis, Clin. Chem. 40 (1994) 1631–1636.

[24] P.C. Adams, L.S. Valberg, Screening blood donors for hereditary hemochromatosis: decision analysis model comparing genotyping to phenotyping, Am. J. Gastroenterol. 94 (1999) 1593–1600.
