October 2021, 8(4): 403-443. doi: 10.3934/jdg.2021023

Linear-quadratic zero-sum mean-field type games: Optimality conditions and policy optimization

Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08540, USA

Received July 2020; Revised July 2021; Published October 2021; Early access August 2021

Fund Project: A preliminary version of this work was submitted to the 59th Conference on Decision and Control

In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the means of the state and the actions is investigated. The optimality conditions of the game are analysed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed for both model-based and sample-based frameworks. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte Carlo simulations. Numerical experiments are conducted to show the convergence of the utility function as well as the two players' controls.
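The difference between the model-based and sample-based frameworks can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it runs simultaneous gradient descent-ascent on a hypothetical scalar saddle function `utility`, once with the exact gradient (the model-based analogue) and once with a Monte Carlo estimate built from random perturbations on a small circle (the sample-based analogue). The function, the two-point estimator, the step size, the sample count and all names are assumptions chosen for illustration only.

```python
import numpy as np

def utility(theta):
    # Hypothetical saddle function: convex in k1 (minimizer), concave in k2 (maximizer).
    k1, k2 = theta[..., 0], theta[..., 1]
    return 0.5 * k1**2 - 0.5 * k2**2 + 0.5 * k1 * k2 + k1 - k2

def exact_gradient(theta):
    # "Model-based" analogue: the gradient is available in closed form.
    k1, k2 = theta
    return np.array([k1 + 0.5 * k2 + 1.0, 0.5 * k1 - k2 - 1.0])

def sampled_gradient(theta, rng, num_samples=500, radius=0.1):
    # "Sample-based" analogue: two-point zeroth-order estimate that only
    # queries utility values at randomly perturbed parameters.
    dim = theta.size
    directions = rng.normal(size=(num_samples, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    v = radius * directions  # points on a circle of the given radius
    diff = utility(theta + v) - utility(theta - v)
    return (dim / (2.0 * radius**2)) * (diff[:, None] * v).mean(axis=0)

def descent_ascent(gradient_fn, steps=200, step_size=0.1):
    # Player 1 descends in k1 while player 2 ascends in k2.
    theta = np.zeros(2)
    for _ in range(steps):
        g = gradient_fn(theta)
        theta = theta + step_size * np.array([-g[0], g[1]])
    return theta

rng = np.random.default_rng(0)
theta_model = descent_ascent(exact_gradient)
theta_sample = descent_ascent(lambda th: sampled_gradient(th, rng))
```

For this toy function the saddle point can be found by hand by setting both partial derivatives to zero, which gives $ (k_1, k_2) = (-0.4, -1.2) $; both runs should approach it.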

Citation: René Carmona, Kenza Hamidouche, Mathieu Laurière, Zongjun Tan. Linear-quadratic zero-sum mean-field type games: Optimality conditions and policy optimization. Journal of Dynamics & Games, 2021, 8 (4) : 403-443. doi: 10.3934/jdg.2021023

Figure 1.  Model-based policy optimization: Convergence of each part of the utility. (a) $ C_y $ as a function of $ (K_1,K_2) $. (b) $ C_z $ as a function of $ (L_1,L_2) $
Figure 2.  Model-based policy optimization: Convergence of the control parameters in (a) and of the relative error on the utility in (b)
Figure 3.  Sample-based policy optimization: Convergence of each part of the utility. (a) $ C_y $ as a function of $ (K_1,K_2) $. (b) $ C_z $ as a function of $ (L_1,L_2) $
Figure 4.  Sample-based policy optimization: Convergence of the control parameters in (a) and of the relative error on the utility in (b)
Table 1.  Simulation parameters

Model parameters

| $ A $ | $ \overline{A} $ | $ B_1=\overline{B}_1 $ | $ B_2=\overline{B}_2 $ | $ Q $ | $ \overline{Q} $ | $ R_1=\overline{R}_1 $ | $ R_2=\overline{R}_2 $ | $ \gamma $ |
|---|---|---|---|---|---|---|---|---|
| 0.4 | 0.4 | 0.4 | 0.3 | 0.4 | 0.4 | 0.4 | 0.4 | 0.9 |

Initial distribution and noise processes

| $ \epsilon_0^0 $ | $ \epsilon^1_0 $ | $ \epsilon^0_t $ | $ \epsilon^1_t $ |
|---|---|---|---|
| $ \mathcal{U}([-1, 1]) $ | $ \mathcal{U}([-1, 1]) $ | $ \mathcal{N}(0, 0.01) $ | $ \mathcal{N}(0, 0.01) $ |

AG and DGA methods parameters

| $ \mathcal{N}^{max}_1 $ | $ \mathcal{N}^{max}_2 $ | $ T $ | $ \eta_1 $ | $ \eta_2 $ | $ K_1^0 $ | $ L_1^0 $ | $ K_2^0 $ | $ L_2^0 $ |
|---|---|---|---|---|---|---|---|---|
| 10 | 200 | 2000 | 0.1 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 |

Gradient estimation algorithm parameters

| $ \mathcal{T} $ | $ M $ | $ \tau $ |
|---|---|---|
| 50 | 10000 | 0.1 |
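
To make the parameters of Table 1 concrete, the following sketch simulates a scalar population and accumulates a discounted utility. Only the numerical values come from Table 1. The form of the dynamics and of the running cost, the linear feedback structure (gains $ K_i $ acting on the deviation from the mean, gains $ L_i $ acting on the mean), the reading of $ \mathcal{N}(0, 0.01) $ as a variance of 0.01, the use of $ \mathcal{T} = 50 $ as a truncation horizon, the finite population size, and every function and variable name are assumptions made for illustration; this page does not state them.

```python
import numpy as np

# Numerical values copied from Table 1; everything else below is assumed.
A, A_BAR = 0.4, 0.4
B1, B1_BAR = 0.4, 0.4      # Table 1 sets B_1 equal to its barred counterpart
B2, B2_BAR = 0.3, 0.3
Q, Q_BAR = 0.4, 0.4
R1, R1_BAR = 0.4, 0.4      # Table 1 sets R_1 equal to its barred counterpart
R2, R2_BAR = 0.4, 0.4
GAMMA = 0.9
NOISE_STD = np.sqrt(0.01)  # assumes N(0, 0.01) denotes a variance of 0.01

def discounted_utility(K1, L1, K2, L2, horizon=50, num_agents=1000, seed=0):
    """Truncated discounted utility under assumed linear feedback controls."""
    rng = np.random.default_rng(seed)
    # Assumed initial state: one common draw plus one idiosyncratic draw per agent.
    x = rng.uniform(-1.0, 1.0) + rng.uniform(-1.0, 1.0, size=num_agents)
    total = 0.0
    for t in range(horizon):
        z = x.mean()                        # empirical proxy for the mean state
        y = x - z                           # deviation of each agent from the mean
        u1_dev, u2_dev = -K1 * y, -K2 * y   # control deviations from their means
        u1_bar, u2_bar = -L1 * z, -L2 * z   # mean controls
        # Assumed running cost: player 1's effort is penalized, player 2's is rewarded.
        cost_y = Q * y**2 + R1 * u1_dev**2 - R2 * u2_dev**2
        cost_z = ((Q + Q_BAR) * z**2 + (R1 + R1_BAR) * u1_bar**2
                  - (R2 + R2_BAR) * u2_bar**2)
        total += GAMMA**t * (cost_y.mean() + cost_z)
        common = rng.normal(0.0, NOISE_STD)                  # shared by all agents
        idio = rng.normal(0.0, NOISE_STD, size=num_agents)   # one draw per agent
        x = (A * x + A_BAR * z
             + B1 * (u1_dev + u1_bar) + B1_BAR * u1_bar
             + B2 * (u2_dev + u2_bar) + B2_BAR * u2_bar
             + common + idio)
    return total

# Utility at the initial gains listed in Table 1 (all equal to 0.0).
initial_value = discounted_utility(0.0, 0.0, 0.0, 0.0)
```

With all gains at zero only the state penalties contribute, so the returned value is a positive number; fixing `seed` makes repeated calls reproducible, which is convenient when comparing different gain choices on the same noise path.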