Optimal control under probability constraint - Hal-Inria

33
HAL Id: inria-00585861 https://hal.inria.fr/inria-00585861 Submitted on 14 Apr 2011 HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés. Optimal control under probability constraint Pierre Carpentier, Jean-Philippe Chancelier, Guy Cohen To cite this version: Pierre Carpentier, Jean-Philippe Chancelier, Guy Cohen. Optimal control under probability con- straint. SADCO Kick off, Mar 2011, Paris, France. inria-00585861

Transcript of Optimal control under probability constraint - Hal-Inria

HAL Id: inria-00585861https://hal.inria.fr/inria-00585861

Submitted on 14 Apr 2011

HAL is a multi-disciplinary open accessarchive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come fromteaching and research institutions in France orabroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, estdestinée au dépôt et à la diffusion de documentsscientifiques de niveau recherche, publiés ou non,émanant des établissements d’enseignement et derecherche français ou étrangers, des laboratoirespublics ou privés.

Optimal control under probability constraintPierre Carpentier, Jean-Philippe Chancelier, Guy Cohen

To cite this version:Pierre Carpentier, Jean-Philippe Chancelier, Guy Cohen. Optimal control under probability con-straint. SADCO Kick off, Mar 2011, Paris, France. �inria-00585861�

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Optimal control under probability constraint

P. Carpentier — ENSTA ParisTechJ.-P. Chancelier — Ecole des Ponts ParisTech

G. Cohen — Ecole des Ponts ParisTech�

This work has been supported byThales-Alenia-Space (Cannes) and CNES (Toulouse)

SADCO Workshop

March 3, 2011

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 1 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Presentation outline

1 Problem formulation

2 Modeling improvement

3 Stochastic Arrow-Hurwicz algorithm

4 Numerical results

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 2 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

1 Problem formulationSatellite model and deterministic optimization problemEngine failureStochastic formulation

2 Modeling improvement

3 Stochastic Arrow-Hurwicz algorithm

4 Numerical results

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 3 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

Satellite model

dr

dt= v ,

dv

dt= −µ r

‖r‖3+

F

mκ , (1a)

dm

dt= − T

g0Ispδ . (1b)

(1a) : 6-dimensional state vector (position r and velocity v).

(1b) : 1-dimensional state vector (mass m including fuel).

κ involves the direction cosines of the thrust and the on-off switchδ of the engine (3 controls), and µ,F ,T , g0, Isp are constants.

The deterministic control problem is to drive the satellite from theinitial condition at ti to a known final position rf and velocity vfat tf (given) while minimizing fuel consumption m(ti)−m(tf).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 4 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

Deterministic optimization problem

Using equinoctial coordinates for the position and velocity; state vector x ∈ R7,

and cartesian coordinates for the thrust of the engine; control vector u ∈ R3,

the deterministic optimization problem is written as follows:

minu(·)

K(x(tf)

)(2a)

subject to:

x(ti) = xi ,•x (t) = f

(x(t), u(t)

), (2b)

‖u(t)‖ ≤ 1 ∀t ∈ [ti, tf ] , (2c)

C(x(tf)

)= 0 . (2d)

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 5 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

Engine failure

Sometimes, the engine may fail to work when needed: thesatellite drift away from the deterministic optimal trajectory.After the engine control is recovered, it is not always possibleto drive the satellite to the final target at tf .

By anticipating such possible failures and by modifying thetrajectory followed before any such failure occurs, one mayincrease the possibility of eventually reaching the target.But such a deviation from the deterministic optimal trajectoryresults in a deterioration of the economic performance.

The problem is thus to balance the increased probability ofeventually reaching the target despite possible failures againstthe expected economic performance, that is, to quantifythe price of safety one is ready to pay for.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 6 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

Stochastic formulation (1)

A failure is modeled using two random variables:

tp : random initial time of the failure,td : random duration of the failure.

For every realization (tξp, tξd):

1 u(·) denotes the control used prior to any failure

; u is defined over [ti, tf ] but implemented over [ti, tξp]

and corresponds to an open-loop control,2 the control is 0 in [tξp, t

ξp + tξd],

3 v ξ(·) denotes the control used after the end of the failure

; v ξ is defined over [tξp + tξd, tf ] (if nonempty)and corresponds to a closed-loop strategy v.

The satellite dynamics in the stochastic formulation writes:

xξ(ti) = xi ,•x ξ(t) = f ξ

(xξ(t), u(t), v ξ(t)

).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 7 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Satellite model and deterministic optimization problemEngine failureStochastic formulation

Stochastic formulation (2)

The problem is to minimize the expected cost (fuel consumption)

w.r.t. the open-loop control u and the closed-loop strategy v,

the probability to hit the target at time tf being at least p.

minu(·)

E(

minvξ(·)

K(xξ(tf)

))(3a)

subject to:

xξ(ti) = xi ,•x ξ(t) = f ξ

(xξ(t), u(t), v ξ(t)

), (3b)

‖u(t)‖ ≤ 1 ∀t ∈ [ti, tf ] , ‖v ξ(t)‖ ≤ 1 ∀t ∈ [tξp + tξd, tf ] , (3c)

P(

C(xξ(tf)

)= 0)≥ p . (3d)

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 8 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

1 Problem formulation

2 Modeling improvementProbability as an expectationCost refinementFinal formulation

3 Stochastic Arrow-Hurwicz algorithm

4 Numerical results

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 9 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

Probability as an expectation

Let I(y) =

{1 if y = 0,

0 otherwise.Then

P(

C(xξ(tf)

)= 0)

= E(I(∥∥C

(xξ(tf)

)∥∥)) .Thus, Problem (3) can (shortly) be written:

minu(·)

E(

minvξ(·)

K(xξ(tf)

))(4a)

s.t. E(I(∥∥C

(xξ(tf)

)∥∥)) ≥ p . (4b)

Formulation (4) opens the possibility to make use of a stochasticArrow-Hurwicz algorithm in order to obtain the solution u](·).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 10 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

Missing the target should not contribute to the cost

Whenever a failure occurred, setting v ξ ≡ 0 over [tξp + tξd, tf ]

generally makes the satellite misses the target and does notcontribute to the probability constraint,

but it forces fuel consumption to its minimum andcontributes to minimize the cost function.

This yields an artificially good expected cost; however, one is notinterested in what happens whenever the whole mission has failed.

Therefore, a better formulation is to register only scenarios whenthe target is effectively hit; instead of the original expected costin (4), we thus prefer to deal with a conditional expected cost:

E(

K(xξ(tf)

) ∣∣∣ C(xξ(tf)

)= 0)

=E(

K(xξ(tf)

)× I(∥∥C

(xξ(tf)

)∥∥))E(I(∥∥C

(xξ(tf)

)∥∥)) .

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 11 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

Conditional expected cost

The problem is reformulated as

minu(·),v(·)

E(

K(xξ(tf)

)× I(∥∥C

(xξ(tf)

)∥∥))E(I(∥∥C

(xξ(tf)

)∥∥)) (5a)

s.t. E(I(∥∥C

(xξ(tf)

)∥∥)) ≥ p . (5b)

Such a formulation is however not well-suited for the stochasticArrow-Hurwicz algorithm:

a ratio of expectations is not an expectation!

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 12 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

An useful lemma

Using compact notation, Problem (5) is:

minu

J(u)

Θ(u)s.t. Θ(u) ≥ p , (6)

in which J and Θ assume positive values.

1 If u] is a solution of (6) and if Θ(u]) = p, then u] is also a solution of

minu

J(u) s.t. Θ(u) ≥ p . (7)

2 Conversely, if u] is a solution of (7), and if an optimal Kuhn-Tuckermultiplier β] satisfies the condition

β] ≥ J(u])

Θ(u]),

then u] is also a solution of (6).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 13 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

Back to a cost in expectation

Finally, instead of (5) we aim at solving

minu(·),v(·)

E(

K(xξ(tf)

)× I(∥∥C

(xξ(tf)

)∥∥)) (8a)

s.t. E(I(∥∥C

(xξ(tf)

)∥∥)) ≥ p . (8b)

The cost function again corresponds to a standard expectation!

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 14 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Probability as an expectationCost refinementFinal formulation

Final formulation and associated Lagrangian

minu(·)

(E(

minvξ(·)

K(xξ(tf)

)× I(∥∥C

(xξ(tf)

)∥∥)))s.t. µ p − E

(I(∥∥C

(xξ(tf)

)∥∥)) ≤ 0 .

Assuming there exists a saddle point for the associated Lagrangian,we have finally to solve

maxµ≥0

minu(·)

{µ p + E

(minvξ(·)

(K(xξ(tf)

)−µ)×I(∥∥C

(xξ(tf)

)∥∥))} .

Remark: it is convenient to put apart the no-failure event, namely{

tp ≥ tf}

in the problem formulation. For the sake of simplicity, we do not present this

last problem improvement. We will denote by πf the probability of this event.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 15 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

1 Problem formulation

2 Modeling improvement

3 Stochastic Arrow-Hurwicz algorithmAlgorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

4 Numerical results

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 16 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

Algorithm overview (1)

In order to solve

maxµ≥0

minu(·)

{µ p + E

(minvξ(·)

(K(xξ(tf)

)− µ

)× I(∥∥C

(xξ(tf)

)∥∥)︸ ︷︷ ︸W (u, µ, ξ)

)}.

that is,maxµ≥0

minu(·)

{µ p + E

(W (u, µ, ξ)

)},

we use a stochastic Arrow-Hurwicz algorithm (see [2] and [3]).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 17 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

Algorithm overview (2)

Arrow-Hurwicz algorithm

At iteration k ,

1 draw a failure ξk = (tξk

p , tξk

d ) according to its probability law,

2 update u(·):

uk+1 = ΠB

(uk − εk ∇uW (uk , µk , ξk)

),

3 update µ:

µk+1 = max(

0, µk + ρk(p + ∇µW (uk+1, µk , ξk)

)).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 18 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

Downstream closed-loop problem

Consider the inner optimization problem:

W (u, ξ, µ) = minvξ(·)

{(K(xξ(tf)

)− µ

)× I(∥∥C

(xξ(tf)

)∥∥)} ,If reaching the target is impossible (C

(xξ(tf)

)6= 0 ∀v ξ),

the best thing to do is to stop immediately: W ≡ 0.

If reaching the target is possible but too expensive (thatis K

(xξ(tf)

)≥ µ), the best thing to do is again to stop

immediately: W ≡ 0.

Therefore, it must be checked that one can reach the targetwhile maintaining fuel consumption below a threshold, andthe corresponding optimal control v ξ∗ (·) must be computed.

At every iteration k, we must evaluate function W as well asits derivatives w.r.t. u(.) and µ. But W is not differentiable!

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 19 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

First difficulty: I is not a smooth function

I(y) =

{1 if y = 0,

0 otherwise, Ir (y) =

(

1− y2

r2

)2if y ∈ [−r , r ],

0 otherwise.

0.2

0.4

0.6

0.8

1.0

0.2

0.4

0.6

0.8

1.0

There are specific rules to drive rk to 0 as the iteration numberk goes to infinity in order to obtain the best asymptotic MeanQuadratic Error of the gradient estimates (see [1]).

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 20 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Algorithm overviewDownstream closed-loop problemSmoothing function ISolving the resulting approximated problem

Second difficulty: solving the approximated problem

The approximated closed-loop problem to solve at each iteration is:

Wr (uk , ξk , µk) = minvξ(·)

{(K(xξ(tf)

)− µk

)× Ir

(∥∥C(xξ(tf)

)∥∥)} .In this setting, we have to check if the target is reached up to r .

In practice, the solution of the approximated problem is derivedfrom the resolution of two standard optimal control problems.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 21 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

1 Problem formulation

2 Modeling improvement

3 Stochastic Arrow-Hurwicz algorithm

4 Numerical resultsBrief descriptionResults for different values of pThe price of safety

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 22 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Mission description

Interplanetary mission (Earth-Mars trajectory):

duration of the mission: 450 days,tp: exponential distribution s.t. P

(tp ≥ tf

)= πf ≈ 0.58,

td: exponential distribution s.t. P(2 ≤ td ≤ 7

)≈ 0.80.

Comp. normale wComp. tangentielle sComp. radiale q

0 1 2 3 4 5 6 7 8 9−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

Using normalized units:

ti = 0.69 and tf = 8.73.

The deterministic optimal controlhas a “bang–off–bang” shape. Alongthe deterministic optimal path, theprobability to recover a failure is:pdet ≈ 0.94.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 23 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Parameters tuning

Gradient step length:

εk =a

b + k, ρk =

c

d + k,

usual for a stochastic gradient algorithm.

Smoothing parameter:

rk =α

β + k13

,

MQE reduced by a factor 2000 in about 100.000 iterations.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 24 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Masse finale / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.675

0.676

0.677

0.678

0.679

0.680

0.681

Multiplicateur probabilite / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

Comp. normale wComp. tangentielle sComp. radiale q

0 1 2 3 4 5 6 7 8 9−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

Figure: Probability level p < πf

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 25 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Masse finale / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.674

0.675

0.676

0.677

0.678

0.679

0.680

0.681

Multiplicateur probabilite / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.314

0.316

0.318

0.320

0.322

0.324

0.326

Comp. normale wComp. tangentielle sComp. radiale q

0 1 2 3 4 5 6 7 8 9−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

Figure: Probability level p = 0.750

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 26 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Masse finale / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.673

0.674

0.675

0.676

0.677

0.678

0.679

0.680

0.681

Multiplicateur probabilite / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.318

0.320

0.322

0.324

0.326

0.328

0.330

0.332

Comp. normale wComp. tangentielle sComp. radiale q

0 1 2 3 4 5 6 7 8 9−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

Figure: Probability level p = 0.960

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 27 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Masse finale / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.674

0.675

0.676

0.677

0.678

0.679

0.680

0.681

Multiplicateur probabilite / iterations0 10000 20000 30000 40000 50000 60000 70000 80000 90000 1000000.320

0.325

0.330

0.335

0.340

0.345

0.350

0.355

0.360

0.365

Comp. normale wComp. tangentielle sComp. radiale q

0 1 2 3 4 5 6 7 8 9−0.4

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

Figure: Probability level p = 0.990

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 28 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Fuel consumption versus probability level

Consommation sans panne / Probabilite0.85 0.90 0.95 1.00

0.3204

0.3206

0.3208

0.3210

0.3212

0.3214

0.3216

0.3218

0.3220

0.3222

Figure: Fuel consumption versus probability level pP. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 29 / 31

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Conclusion

Main conclusion

We are able to deal with probability constraints in the optimalcontrol framework.

Future works

From the theoretical point of view:

existence of a saddle point for the constrained problem,smoothing process (results available only for inequalityconstraints).

From the numerical point of view:

efficient solver for the downstream problem,computer parallelization.

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 30 / 31

Laetitia Andrieu, Guy Cohen and Felisa Vazquez-Abad.

Stochastic Programming with Probability.

arXiv, math.OC 0708.0281, 2007.

Jean-Christophe Culioli and Guy Cohen.

Decomposition/Coordination Algorithms in Stochastic Optimization.

SIAM Journal on Control and Optimization, 28, 1372-1403, 1990.

Jean-Christophe Culioli and Guy Cohen.

Optimisation stochastique sous contraintes en esperance.

CRAS, t.320, Serie I, pp. 75-758, 1995.

Thierry Dargent.

Transfert d’orbite optimal par application du principe du minimum.

Logiciel T 3D, version 2, 19 mars 2004.

Andras Prekopa.

Stochastic Programming.

Kluwer, Dordrecht, 1995.

Problem formulationModeling improvement

Stochastic Arrow-Hurwicz algorithmNumerical results

Brief descriptionResults for different values of pThe price of safety

Thank you for your attention. Any question?

P. Carpentier and SOWG Optimal control under probability constraint March 3, 2011 31 / 31