July  2007, 3(3): 429-444. doi: 10.3934/jimo.2007.3.429

Learning algorithms for finite horizon constrained Markov decision processes

1. 

IE and OR Interdisciplinary Programme, IIT Bombay, Mumbai, 400 076, India, India

Received  April 2006 Revised  November 2006 Published  July 2007

We propose a heuristic and two stochastic approximation based learning algorithms for finite horizon, finite state-action constrained Markov decision models. We include models and numerical examples arising from risk management in fund allocation, retailer-depot product availability in a supply chain and admission control in a simple queue, that have to satisfy performance based constraints.
Citation: A. Mittal, N. Hemachandra. Learning algorithms for finite horizon constrained Markov decision processes. Journal of Industrial & Management Optimization, 2007, 3 (3) : 429-444. doi: 10.3934/jimo.2007.3.429
[1]

Lakmi Niwanthi Wadippuli, Ivan Gudoshnikov, Oleg Makarenkov. Global asymptotic stability of nonconvex sweeping processes. Discrete & Continuous Dynamical Systems - B, 2020, 25 (3) : 1129-1139. doi: 10.3934/dcdsb.2019212

[2]

Juliang Zhang, Jian Chen. Information sharing in a make-to-stock supply chain. Journal of Industrial & Management Optimization, 2014, 10 (4) : 1169-1189. doi: 10.3934/jimo.2014.10.1169

[3]

Chih-Chiang Fang. Bayesian decision making in determining optimal leased term and preventive maintenance scheme for leased facilities. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020127

[4]

Yves Dumont, Frederic Chiroleu. Vector control for the Chikungunya disease. Mathematical Biosciences & Engineering, 2010, 7 (2) : 313-345. doi: 10.3934/mbe.2010.7.313

[5]

J. Frédéric Bonnans, Justina Gianatti, Francisco J. Silva. On the convergence of the Sakawa-Shindo algorithm in stochastic control. Mathematical Control & Related Fields, 2016, 6 (3) : 391-406. doi: 10.3934/mcrf.2016008

[6]

Diana Keller. Optimal control of a linear stochastic Schrödinger equation. Conference Publications, 2013, 2013 (special) : 437-446. doi: 10.3934/proc.2013.2013.437

[7]

Alberto Bressan, Ke Han, Franco Rampazzo. On the control of non holonomic systems by active constraints. Discrete & Continuous Dynamical Systems - A, 2013, 33 (8) : 3329-3353. doi: 10.3934/dcds.2013.33.3329

[8]

Paula A. González-Parra, Sunmi Lee, Leticia Velázquez, Carlos Castillo-Chavez. A note on the use of optimal control on a discrete time model of influenza dynamics. Mathematical Biosciences & Engineering, 2011, 8 (1) : 183-197. doi: 10.3934/mbe.2011.8.183

[9]

Guirong Jiang, Qishao Lu. The dynamics of a Prey-Predator model with impulsive state feedback control. Discrete & Continuous Dynamical Systems - B, 2006, 6 (6) : 1301-1320. doi: 10.3934/dcdsb.2006.6.1301

[10]

A. K. Misra, Anupama Sharma, Jia Li. A mathematical model for control of vector borne diseases through media campaigns. Discrete & Continuous Dynamical Systems - B, 2013, 18 (7) : 1909-1927. doi: 10.3934/dcdsb.2013.18.1909

[11]

Luke Finlay, Vladimir Gaitsgory, Ivan Lebedev. Linear programming solutions of periodic optimization problems: approximation of the optimal control. Journal of Industrial & Management Optimization, 2007, 3 (2) : 399-413. doi: 10.3934/jimo.2007.3.399

[12]

Shanjian Tang, Fu Zhang. Path-dependent optimal stochastic control and viscosity solution of associated Bellman equations. Discrete & Continuous Dynamical Systems - A, 2015, 35 (11) : 5521-5553. doi: 10.3934/dcds.2015.35.5521

[13]

Peter Benner, Jens Saak, M. Monir Uddin. Balancing based model reduction for structured index-2 unstable descriptor systems with application to flow control. Numerical Algebra, Control & Optimization, 2016, 6 (1) : 1-20. doi: 10.3934/naco.2016.6.1

2019 Impact Factor: 1.366

Metrics

  • PDF downloads (62)
  • HTML views (0)
  • Cited by (1)

Other articles
by authors

[Back to Top]