June 2022, 12(2): 475-493. doi: 10.3934/mcrf.2021031

Maximum principle for discrete-time stochastic optimal control problem and stochastic game

1. School of Mathematics, Shandong University, Jinan 250100, Shandong Province, China

2. School of Mathematics and Quantitative Economics, Shandong University of Finance and Economics, Jinan 250014, Shandong Province, China

* Corresponding author: Feng Zhang

Received October 2020; Revised February 2021; Early access June 2021; Published June 2022

This paper first studies a discrete-time stochastic optimal control problem with convex control domains, for which a necessary condition in the form of Pontryagin's maximum principle and a sufficient condition for optimality are derived. The results are then extended to two kinds of discrete-time stochastic games. Two illustrative examples are studied, for which explicit optimal strategies are given. The paper establishes a rigorous version of the discrete-time stochastic maximum principle in a clear and concise way and paves the way for further related topics.
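For orientation, a necessary condition of this type can be sketched in a generic scalar setting. This is not the paper's own formulation, which is not reproduced on this page. All symbols below ($b$, $\sigma$, $l$, $h$, $w_k$, $P_k$, $p_k$, $q_k$, $H$, $U$) are assumed notation. The sketch further assumes a minimization problem, independent zero-mean noises $w_k$ with finite second moments, a filtration $\mathcal{F}_k$ generated by $w_0, \dots, w_{k-1}$, controls $u_k$ that are $\mathcal{F}_k$-measurable with values in a convex set $U$, and enough smoothness and integrability for the derivatives and expectations to exist.

```latex
% Generic sketch under the assumptions stated above; not taken from the paper.
\begin{aligned}
&\text{State:} && x_{k+1} = b(k, x_k, u_k) + \sigma(k, x_k, u_k)\, w_k, \quad k = 0, \dots, N-1, \\
&\text{Cost:} && J(u) = \mathbb{E}\Big[ \sum_{k=0}^{N-1} l(k, x_k, u_k) + h(x_N) \Big], \\
&\text{Hamiltonian:} && H(k, x, u, p, q) = l(k, x, u) + p\, b(k, x, u) + q\, \sigma(k, x, u), \\
&\text{Adjoint:} && P_N = h_x(x_N^*), \quad
   p_k = \mathbb{E}\big[ P_{k+1} \mid \mathcal{F}_k \big], \quad
   q_k = \mathbb{E}\big[ P_{k+1} w_k \mid \mathcal{F}_k \big], \\
& && P_k = H_x(k, x_k^*, u_k^*, p_k, q_k), \quad k = N-1, \dots, 0, \\
&\text{Necessary condition:} && H_u(k, x_k^*, u_k^*, p_k, q_k)\, (v - u_k^*) \ge 0
   \quad \text{for all } v \in U, \ \text{a.s.}, \ k = 0, \dots, N-1.
\end{aligned}
```

Under these assumptions the first variation of the cost along a perturbation $\delta u$ equals $\sum_{k=0}^{N-1} \mathbb{E}[H_u(k)\, \delta u_k]$, which is where the variational inequality comes from. For a maximization problem the inequality is reversed.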

Citation: Zhen Wu, Feng Zhang. Maximum principle for discrete-time stochastic optimal control problem and stochastic game. Mathematical Control and Related Fields, 2022, 12(2): 475-493. doi: 10.3934/mcrf.2021031


Figure 1.  The sequences $ \{\alpha_{1,k}\} $ and $ \{\beta_{1,k}\} $
Figure 2.  The sequences $ \alpha_{2,k} $ and $\beta_{2,k} $
Figure 3.  The sequences $ \Psi_{k} $ and $ \Phi_{k} $
Figure 4.  The sequences $ \{\psi_{k}\} $ and $ \{\phi_{k}\} $