Adaptive traffic control system based on Bayesian probability interpretation


Adaptive Traffic Control System based on Bayesian Probability Interpretation

Mohamed A. Khamis, Walid Gomaa, Ahmed El-Mahdy, and Amin Shoukry

Department of Computer Science and Engineering

Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt

{mohamed.khamis, walid.gomaa, ahmed.elmahdy, amin.shoukry}@ejust.edu.eg

29-Mar-15 JEC-ECC 2012, Alexandria, Egypt, March 6 - 8, 2012

+ Outline

Multi Agent Traffic Control

Traffic Lights and Vehicles

Traffic Model and Simulation

Simulation: Extend the Green Light District (GLD) traffic simulator (M. Wiering et al. 2004)

Congestion: Vehicle Spawning Probability Distributions

Acceleration: Implement ‘Intelligent Driver Model’ (M. Treiber et al. 2002)

Lane Changing: Implement ‘Minimizing Overall Braking decelerations Induced by Lane changes’ (M. Treiber et al. 2002)

Main Contribution: Traffic control based on Bayesian probability interpretation that is adaptive to the high dynamics and non-stationarity of the road network


+ Introduction: Traffic Problems

Accidents in Egypt are 34x worse than in the developed world.

In Egypt, traffic problems were responsible for more than 25,000 accidents in 2010, with more than 6,000 deaths per year.

+ Multi-Agent Traffic Simulation Model

GPS positions / Wi-Fi observations / cellular triangulation

Road users driving with smartphone sensors

Vehicle Agent

Controller Agent


+ Reinforcement Learning Traffic Control

Environment: Traffic Nodes, Vehicles, Road Conditions

Goal: Optimize signal splitting duration to minimize the total waiting time of all vehicles standing at each traffic light before exiting the city

State: traffic node, vehicle direction, position and destination

Action: Set the traffic light at each traffic intersection to green or red

Reward: equals 1 if a vehicle stays at the same place, otherwise equals 0 (the vehicle can advance)
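The state and reward above can be sketched in a few lines of Python. This is only an illustration: the class name `VehicleState`, its field names, and the function `waiting_signal` are hypothetical and do not come from the GLD simulator.

```python
from typing import NamedTuple


class VehicleState(NamedTuple):
    """One vehicle's state as listed above (field names are illustrative)."""
    node: str         # traffic node (intersection) the vehicle is queued at
    direction: str    # direction / lane the vehicle is travelling in
    position: int     # place in the queue, 0 = at the stop line
    destination: str  # where the vehicle is heading


def waiting_signal(before: VehicleState, after: VehicleState) -> int:
    """Return 1 if the vehicle stayed at the same place, otherwise 0."""
    same_place = (before.node, before.direction, before.position) == (
        after.node, after.direction, after.position)
    return 1 if same_place else 0
```

Summing this signal over time steps gives a vehicle's waiting time, which is the quantity the stated goal asks to minimize.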


+ Reinforcement Learning Traffic Control

The state transition probability:

The Q-value: expected trip waiting time from [tl, p] given that the decision at tl is L

γ is the future discount factor (0 < γ < 1), which ensures that Q-values are bounded.

The V-function:

The probability that the light is red or green:
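As a hedged sketch, the Q-value and V-function in a model-based setup of this kind are commonly written in the Bellman form below. The notation is an assumption and not necessarily the authors' exact form: s = [tl, p] denotes a vehicle's state, L ∈ {red, green} the light decision, R(s, L, s′) the 0/1 waiting signal, and Pr(·) the two estimated probabilities named on this slide.

```latex
\begin{align}
Q(s, L) &= \sum_{s'} \Pr(s' \mid s, L)\,\bigl[ R(s, L, s') + \gamma\, V(s') \bigr] \\
V(s)    &= \sum_{L \in \{\mathrm{red},\,\mathrm{green}\}} \Pr(L \mid s)\, Q(s, L)
\end{align}
```

With R bounded by 1 and 0 < γ < 1, every Q-value is bounded by 1 / (1 − γ), which is the boundedness property mentioned for γ.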


+ Reinforcement Learning Traffic Control

The traffic light controller at node n sums up the individual advantages of all vehicles standing at the traffic lights i that are set to green by the controller decision (while all other traffic lights are set to red).

The gain is computed as follows:
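The gain of a controller decision can be sketched as below. This is a minimal illustration under a stated assumption: a vehicle's advantage is taken to be Q(state, red) minus Q(state, green), i.e. the expected waiting time saved by a green light. That definition, and the names `gain`, `best_decision`, `queues` and `q`, are assumptions made for this example.

```python
def gain(decision, queues, q):
    """Sum the advantages of all vehicles waiting at the lights set to green.

    decision: iterable of traffic-light ids set to green (all others are red)
    queues:   dict mapping light id -> list of vehicle states waiting there
    q:        dict mapping (state, "red" | "green") -> estimated Q-value
    """
    total = 0.0
    for light in decision:
        for state in queues.get(light, []):
            # Assumed advantage: waiting time saved if this light is green.
            total += q[(state, "red")] - q[(state, "green")]
    return total


def best_decision(decisions, queues, q):
    """Pick the decision (set of green lights) with the largest gain."""
    return max(decisions, key=lambda d: gain(d, queues, q))
```

A controller at a node would call `best_decision` with the safe (non-conflicting) light combinations allowed at that intersection.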


+ Congestion Modeling: Vehicle spawning probability distributions

The change in road conditions is modeled by varying the vehicle spawning inter-arrival probability distributions.

We categorize the vehicle spawning distributions as follows:

(1) fixed (no inter-arrival probability distribution), or probabilistic (not fixed),

(2) static (the same vehicle spawning distribution for all time intervals), or dynamic (not static).

We implement the following continuous probability distributions: Uniform, Triangular, Exponential, Erlang, Weibull, and Gaussian.
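The categories above can be illustrated with Python's standard `random` module. This is a sketch only: the parameter values, the floor applied to the Gaussian sample, and the names `make_sampler` and `spawn_times` are assumptions made for this example, not settings from the simulator. Erlang is drawn as a gamma distribution with an integer shape.

```python
import random


def make_sampler(name, rng):
    """Return a zero-argument function drawing one inter-arrival time."""
    samplers = {
        "fixed": lambda: 3.0,                                 # no distribution
        "uniform": lambda: rng.uniform(1.0, 5.0),             # low, high
        "triangular": lambda: rng.triangular(1.0, 5.0, 2.0),  # low, high, mode
        "exponential": lambda: rng.expovariate(1.0 / 3.0),    # rate = 1 / mean
        "erlang": lambda: rng.gammavariate(2, 1.5),           # integer shape, scale
        "weibull": lambda: rng.weibullvariate(3.0, 1.5),      # scale, shape
        "gaussian": lambda: max(0.1, rng.gauss(3.0, 1.0)),    # floored to stay positive
    }
    return samplers[name]


def spawn_times(horizon, schedule, rng):
    """Generate vehicle spawn times up to `horizon`.

    schedule: list of (start_time, distribution_name), sorted by start_time,
    whose first entry starts at 0. One entry gives a static setup; several
    entries give a dynamic one, where the distribution changes over time.
    """
    t, times = 0.0, []
    while True:
        active = [name for start, name in schedule if start <= t][-1]
        t += make_sampler(active, rng)()
        if t >= horizon:
            return times
        times.append(t)
```

For example, `spawn_times(200.0, [(0.0, "uniform"), (100.0, "exponential")], random.Random(7))` switches the inter-arrival distribution halfway through the run, which is one way to imitate a change in road conditions.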


+ Adaptive Traffic Control based on Bayesian Probability Interpretation

In a non-stationary environment, agents need to learn the whole history, even for environment dynamics that were previously experienced.

The policy that was computed is no longer valid when the dynamics change.

Wiering et al. (2000) learn the transition probability functions Pr(s′|s, a) and Pr(a|s) by counting the frequency of observed experiences.
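Frequency counting of this kind can be sketched as follows. The class name `CountModel` and its methods are hypothetical, and returning 0.0 for unseen situations is a choice made for this example rather than something stated in the source.

```python
from collections import defaultdict


class CountModel:
    """Estimate Pr(s'|s, a) and Pr(a|s) from counts of observed experiences."""

    def __init__(self):
        self.c_sas = defaultdict(int)  # count of (s, a, s')
        self.c_sa = defaultdict(int)   # count of (s, a)
        self.c_s = defaultdict(int)    # count of s

    def observe(self, s, a, s_next):
        """Record one observed experience (s, a, s')."""
        self.c_sas[(s, a, s_next)] += 1
        self.c_sa[(s, a)] += 1
        self.c_s[s] += 1

    def p_next(self, s, a, s_next):
        """Relative frequency of s' among all experiences starting with (s, a)."""
        n = self.c_sa.get((s, a), 0)
        return self.c_sas.get((s, a, s_next), 0) / n if n else 0.0

    def p_action(self, s, a):
        """Relative frequency of action a among all experiences in state s."""
        n = self.c_s.get(s, 0)
        return self.c_sa.get((s, a), 0) / n if n else 0.0
```

Because every past observation carries the same weight, estimates of this kind react slowly when the traffic dynamics change.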


+ Adaptive Traffic Control based on Bayesian Probability Interpretation

In a congested traffic situation, each next state s′ has a bias probability Pr(s′|s, a) for a given action a and state s, where the probabilities sum to 1.

To make the transition probability non-stationary (adaptive), we estimate the weights of the next states/actions using Bayesian learning, depending on the current road conditions.

Learning the whole history, even for environment dynamics that have been previously experienced, gives our controller the ability to deal with congestion situations that occur for a limited time (which is typical of rush hours and accidents).


The basic Bayes' rule, where P(A | B) is the posterior probability of A and B is the set of observed experiences x_i, is given by:

P(A | B) = P(B | A) P(A) / P(B)

P_t could be:

(1) the posterior probability when fixing s with different a or s′,

(2) the posterior probability when fixing (s, a) with different s′.

In case (1):

x_i = 1 when the same situation (s, a, s′) occurs.

x_i = 0 when the same state s occurs with a different action a or next state s′.

t equals the number of times the same state s occurs.

In case (2):

x_i = 1 when the same situation (s, a, s′) occurs.

x_i = 0 when the same situation (s, a) occurs with a different next state s′.

t equals the number of times the same situation (s, a) occurs.
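The two cases differ only in which past experiences count as trials. The sketch below builds the binary observations x_i for either case and turns them into a posterior estimate with a Beta-Bernoulli update. The Beta prior is a swapped-in, standard choice used for illustration: the source does not state the exact update rule. The function names and the prior parameters `alpha` and `beta` are assumptions.

```python
def observations(history, s, a, s_next, case):
    """Build the binary sequence x_1..x_t for the situation (s, a, s').

    history: list of experienced (state, action, next_state) triples
    case 1:  every visit to state s is a trial
    case 2:  only visits to s that also took action a are trials
    """
    xs = []
    for h_s, h_a, h_next in history:
        if h_s != s:
            continue
        if case == 2 and h_a != a:
            continue
        xs.append(1 if (h_a, h_next) == (a, s_next) else 0)
    return xs


def posterior_mean(xs, alpha=1.0, beta=1.0):
    """Posterior mean of a Bernoulli parameter under an assumed Beta(alpha, beta) prior."""
    return (alpha + sum(xs)) / (alpha + beta + len(xs))
```

Here `len(xs)` plays the role of t. In case (1) the estimate is a joint weight over actions and next states for a fixed s, while in case (2) it is a weight over next states for a fixed (s, a).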


+ Performance Evaluation


+ Conclusions and Future Work

Adaptive traffic control based on Bayesian probability interpretation.

Non-stationarity in road traffic conditions is modeled by varying the spawning probability distributions.

The approach leads to a smaller trip waiting time, increasing road users' satisfaction.

Future work includes physical deployment in a real zone in Alexandria city and real-time validation of the traffic control.


+ Acknowledgements

This work is partially funded by an IBM PhD Fellowship and Faculty Award.

Special thanks are due to Dr Hisham El-Shishiny, manager of the IBM Center for Advanced Studies in Cairo, for fruitful discussions.


Thank you!

http://scf.ejust.edu.eg
