Adaptive traffic control system based on Bayesian probability interpretation
+
Adaptive Traffic Control System based on
Bayesian Probability Interpretation
Mohamed A. Khamis, Walid Gomaa, Ahmed El-Mahdy, and Amin Shoukry
Department of Computer Science and Engineering
Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt
{mohamed.khamis, walid.gomaa, ahmed.elmahdy, amin.shoukry}@ejust.edu.eg
29-Mar-15 JEC-ECC 2012, Alexandria, Egypt, March 6 - 8, 2012
+ Outline
Multi Agent Traffic Control
Traffic Lights and Vehicles
Traffic Model and Simulation
Simulation: Extend the Green Light District (GLD) traffic simulator (M. Wiering et al. 2004)
Congestion: Vehicle Spawning Probability Distributions
Acceleration: Implement ‘Intelligent Driver Model’ (M. Treiber et al. 2002)
Lane Changing: Implement ‘Minimizing Overall Braking decelerations Induced by Lane changes’ (M. Treiber et al. 2002)
Main Contribution: Traffic control based on Bayesian probability interpretation that is adaptive to the high dynamics and non-stationarity of the road network
+ Introduction: Traffic Problems
Accidents in Egypt are 34x worse than in the developed world
In Egypt, traffic problems were responsible for more than 25,000
accidents in 2010, with more than 6,000 deaths per year.
+ Multi-Agent Traffic Simulation Model
GPS positions / Wi-Fi observations / cellular triangulation
Road users driving with smartphone sensors
Vehicle Agent
Controller Agent
+ Reinforcement Learning Traffic Control
Environment: traffic nodes, vehicles, road conditions
Goal: Optimize the signal splitting duration to minimize the total waiting
time of all vehicles standing at each traffic light before exiting the city
State: traffic node, vehicle direction, position, and destination
Action: Set the traffic light at each traffic intersection to green or red
Reward: equals 1 if a vehicle stays at the same place, otherwise
equals 0 (vehicle can advance)
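As a minimal sketch of the reward signal described above, assuming a hypothetical representation of a vehicle's place as a (traffic light id, position) pair. The names `Place`, `reward`, and `total_waiting` are illustrative and are not taken from the GLD simulator:

```python
from typing import List, NamedTuple


class Place(NamedTuple):
    """Hypothetical vehicle place: traffic light id and position in its lane."""
    tl: int
    p: int


def reward(before: Place, after: Place) -> int:
    """Return 1 if the vehicle stayed at the same place (it waited), else 0."""
    return 1 if before == after else 0


def total_waiting(trajectory: List[Place]) -> int:
    """Sum the per-step rewards along one vehicle's sequence of places."""
    return sum(reward(a, b) for a, b in zip(trajectory, trajectory[1:]))
```

Because the reward counts waiting steps, the controller aims to minimize its accumulated value rather than maximize it.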
+ Reinforcement Learning Traffic Control
The state transition probability:
The Q-value: the expected trip waiting time from [tl, p] given that the decision at tl is L
γ is the future discount factor (0 < γ < 1), which ensures that the Q-values are bounded.
The V-function:
The probability that the light is red or green:
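For orientation, the quantities named on this slide are commonly written as follows in the vehicle-based model of Wiering et al. This is a sketch under assumptions: C(·) denotes a count of observed experiences, R is the reward defined on the previous slide, and primed symbols denote the vehicle's next place. The notation on the original slide may differ.

```latex
\begin{align}
\Pr([tl',p'] \mid [tl,p], L) &= \frac{C([tl,p], L, [tl',p'])}{C([tl,p], L)} \\
Q([tl,p], L) &= \sum_{[tl',p']} \Pr([tl',p'] \mid [tl,p], L)\,
                \bigl( R([tl,p],[tl',p']) + \gamma\, V([tl',p']) \bigr) \\
V([tl,p]) &= \sum_{L} \Pr(L \mid [tl,p])\, Q([tl,p], L) \\
\Pr(L \mid [tl,p]) &= \frac{C([tl,p], L)}{C([tl,p])}
\end{align}
```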
+ Reinforcement Learning Traffic Control
The traffic light controller at node n sums up the
individual advantages of all vehicles standing at
the traffic lights i that are set to green by the
controller decision (while all other traffic lights
are set to red).
The gain is computed as follows:
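The gain computation can be sketched as below. The sketch assumes that a vehicle's individual advantage is Q(place, red) − Q(place, green), as in the vehicle-based controllers of Wiering et al., and that the controller picks the decision with the largest summed advantage. The data structures and the names `gain` and `best_decision` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Place = Tuple[int, int]                 # hypothetical (traffic light id, position)
QFunc = Callable[[Place, str], float]   # Q(place, "red" or "green")
Decision = Sequence[int]                # ids of the lights a decision sets to green


def gain(decision: Decision, queues: Dict[int, List[Place]], q: QFunc) -> float:
    """Sum Q(red) - Q(green) over all vehicles waiting at the green lights."""
    return sum(
        q(place, "red") - q(place, "green")
        for tl in decision
        for place in queues.get(tl, [])
    )


def best_decision(decisions: Iterable[Decision],
                  queues: Dict[int, List[Place]], q: QFunc) -> Decision:
    """Return the decision with the largest cumulative gain (first one on ties)."""
    return max(decisions, key=lambda d: gain(d, queues, q))
```

Here `queues` maps each traffic light at the node to the places of the vehicles waiting at it, so a light with an empty queue contributes nothing to the gain.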
+ Congestion Modeling: Vehicle spawning probability distributions
The change in road conditions is modeled by varying the probability
distribution of the vehicle spawning inter-arrival times.
We categorize the vehicle spawning distributions as follows:
(1) fixed (no inter-arrival probability distribution) or probabilistic
(not fixed),
(2) static (the same vehicle spawning distribution for all time
intervals) or dynamic (not static).
We implement the following continuous probability distributions:
Uniform, Triangular, Exponential, Erlang, Weibull, and Gaussian.
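A minimal sketch of sampling inter-arrival times from the six distributions, using only Python's standard `random` module. All parameter values are assumptions chosen for illustration, not values from the talk. The Erlang distribution is obtained as a gamma distribution with an integer shape, and Gaussian samples are clamped at zero because an inter-arrival time cannot be negative.

```python
import random
from typing import Callable, Dict, Iterator


def make_samplers(rng: random.Random) -> Dict[str, Callable[[], float]]:
    """Build one inter-arrival sampler per distribution (illustrative parameters)."""
    return {
        "uniform": lambda: rng.uniform(1.0, 5.0),             # low, high
        "triangular": lambda: rng.triangular(1.0, 5.0, 2.0),  # low, high, mode
        "exponential": lambda: rng.expovariate(1.0 / 3.0),    # rate, so mean is 3
        "erlang": lambda: rng.gammavariate(2.0, 1.5),         # integer shape 2, scale 1.5
        "weibull": lambda: rng.weibullvariate(3.0, 1.5),      # scale, shape
        "gaussian": lambda: max(0.0, rng.gauss(3.0, 1.0)),    # mean, std dev, clamped
    }


def spawn_times(sampler: Callable[[], float], horizon: float) -> Iterator[float]:
    """Yield vehicle spawn times up to `horizon` by accumulating inter-arrival samples."""
    t = sampler()
    while t <= horizon:
        yield t
        t += sampler()
```

A static scenario would use one sampler for the whole run, while a dynamic scenario could switch to a different sampler at the start of each time interval.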
+ Adaptive Traffic Control based on
Bayesian Probability Interpretation
In a non-stationary environment, agents need to
learn the whole history, even for environment
dynamics previously experienced.
The policy that was computed is no longer valid
when the dynamics change.
Wiering et al. (2000) learn the transition probability
functions Pr(s′|s, a) and Pr(a|s) by counting the
frequency of observed experiences.
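The frequency-counting estimate can be sketched as follows. The class and method names are hypothetical; states and actions are assumed to be any hashable values, and an unseen state or state-action pair yields a probability of 0.0 by convention in this sketch.

```python
from collections import Counter
from typing import Hashable


class FrequencyModel:
    """Estimate Pr(s'|s, a) and Pr(a|s) as ratios of observed counts."""

    def __init__(self) -> None:
        self.c_s: Counter = Counter()    # C(s)
        self.c_sa: Counter = Counter()   # C(s, a)
        self.c_sas: Counter = Counter()  # C(s, a, s')

    def observe(self, s: Hashable, a: Hashable, s_next: Hashable) -> None:
        """Record one experience (s, a, s')."""
        self.c_s[s] += 1
        self.c_sa[(s, a)] += 1
        self.c_sas[(s, a, s_next)] += 1

    def p_next(self, s_next: Hashable, s: Hashable, a: Hashable) -> float:
        """Pr(s'|s, a) = C(s, a, s') / C(s, a)."""
        n = self.c_sa[(s, a)]
        return self.c_sas[(s, a, s_next)] / n if n else 0.0

    def p_action(self, a: Hashable, s: Hashable) -> float:
        """Pr(a|s) = C(s, a) / C(s)."""
        n = self.c_s[s]
        return self.c_sa[(s, a)] / n if n else 0.0
```

Because every past experience carries the same weight in these ratios, the estimates react slowly when the road conditions change.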
+ Adaptive Traffic Control based on
Bayesian Probability Interpretation
In a congested traffic situation, each next state s′ has a biased
probability Pr(s′|s, a) for a given action a and state s, where
the probabilities over all next states s′ sum to 1.
To make the transition probability non-stationary
(adaptive), we estimate the weights of the next states/actions
using Bayesian learning, depending on the current road
conditions.
Learning the whole history, even for environment dynamics
that have been previously experienced, gives our controller
the ability to deal with congestion situations that occur for a
limited time (which is typical of rush hours and accidents).
+
The basic Bayes' rule, where A is the event whose
posterior probability is sought and B is the set of observed
experiences xi, is given by:
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pt could be:
(1) the posterior probability when fixing s with different a or s′,
(2) the posterior probability when fixing (s, a) with different s′.
+
In case (1):
xi = 1 when the same situation (s, a, s′) occurs.
xi = 0 when the same state s occurs with a different action a or next state s′.
t equals the number of times the same state s occurs.
In case (2):
xi = 1 when the same situation (s, a, s′) occurs.
xi = 0 when the same situation (s, a) occurs with a different next state s′.
t equals the number of times the same situation (s, a) occurs.
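One standard way to turn such binary observations xi into a probability estimate is a Beta-Bernoulli update, sketched below. This is a swapped-in textbook technique used for illustration and is not necessarily the update rule of the talk. The prior parameters `alpha` and `beta` and all function names are assumptions.

```python
from dataclasses import dataclass
from typing import Hashable, Iterable, Tuple

Experience = Tuple[Hashable, Hashable, Hashable]  # (s, a, s')


@dataclass
class BetaBernoulli:
    """Beta posterior over the success probability of binary observations."""
    alpha: float = 1.0  # assumed prior pseudo-count of x = 1
    beta: float = 1.0   # assumed prior pseudo-count of x = 0

    def update(self, x: int) -> None:
        """Fold one observation x (0 or 1) into the posterior."""
        if x not in (0, 1):
            raise ValueError("x must be 0 or 1")
        self.alpha += x
        self.beta += 1 - x

    @property
    def mean(self) -> float:
        """Posterior mean of the probability."""
        return self.alpha / (self.alpha + self.beta)


def case1_estimate(experiences: Iterable[Experience], s, a, s_next) -> float:
    """Case (1): fix s; x = 1 only when both a and s' match."""
    est = BetaBernoulli()
    for si, ai, ni in experiences:
        if si == s:
            est.update(1 if (ai, ni) == (a, s_next) else 0)
    return est.mean


def case2_estimate(experiences: Iterable[Experience], s, a, s_next) -> float:
    """Case (2): fix (s, a); x = 1 only when s' matches."""
    est = BetaBernoulli()
    for si, ai, ni in experiences:
        if (si, ai) == (s, a):
            est.update(1 if ni == s_next else 0)
    return est.mean
```

In both cases the number of updates equals t, the number of times the fixed part of the situation has occurred.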
+ Conclusions and Future Work
Adaptive traffic control based on Bayesian probability
interpretation.
Non-stationarity in road traffic conditions is modeled by
varying the vehicle spawning probability distributions.
This leads to shorter trip waiting times, increasing road users'
satisfaction.
Future work includes physical deployment in a real zone
in Alexandria city
and real-time validation of the traffic control.
+ Acknowledgements
This work is partially funded by an IBM PhD
Fellowship and Faculty Award.
Special thanks are due to Dr. Hisham El-Shishiny,
manager of the IBM Center for Advanced Studies in
Cairo, for fruitful discussions.