Foundations of Data Science, June 2022, 4(2): 271-298. doi: 10.3934/fods.2022007

Mean-field and kinetic descriptions of neural differential equations

1. Institute of Geometry and Applied Mathematics, RWTH Aachen University, Templergraben 55, 52074 Aachen, Germany
2. NRW.Bank, Kavalleriestraße 22, 40213 Düsseldorf, Germany
3. Department of Mathematics "G. Castelnuovo", Sapienza University of Rome, P.le Aldo Moro 5, 00185 Roma, Italy

*Corresponding author: Giuseppe Visconti

Received: November 2021. Revised: February 2022. Early access: March 2022. Published: June 2022.

Nowadays, neural networks are widely used in many applications as artificial intelligence models for learning tasks. Since neural networks typically process very large amounts of data, it is convenient to formulate them within mean-field and kinetic theory. In this work we focus on a particular class of neural networks, namely residual neural networks, assuming that each layer is characterized by the same number of neurons $ N $, which is fixed by the dimension of the data. This assumption allows us to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation which describes the evolution of the distribution of the input data. We analyze steady states and the sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insight into the choice of the network parameters. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which the concept of network training is replaced by the task of fitting distributions. The analysis is validated by numerical simulations on artificial data. In particular, results on classification and regression problems are presented.
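
As a rough sketch of the setting described above (our notation, assuming one-dimensional data for readability; the paper's precise formulation may differ): the residual update for $ M $ input data points,
$ x_i^{k+1} = x_i^k + \Delta t \, \sigma\left(w^k x_i^k + b^k\right), \qquad i = 1, \dots, M, $
is the explicit Euler discretization of the neural differential equation
$ \dot{x}_i(t) = \sigma\left(w(t) \, x_i(t) + b(t)\right). $
In the limit of infinitely many input data, $ M \to \infty $, the empirical measure of the $ x_i $ formally converges to a density $ f(t,x) $ satisfying a Vlasov-type transport equation
$ \partial_t f(t,x) + \partial_x \left[ \sigma\left(w(t) \, x + b(t)\right) f(t,x) \right] = 0. $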

Citation: Michael Herty, Torsten Trimborn, Giuseppe Visconti. Mean-field and kinetic descriptions of neural differential equations. Foundations of Data Science, 2022, 4 (2) : 271-298. doi: 10.3934/fods.2022007

Figure 1.  Left: Moments of our PDE model with $ \sigma(x) = x, w = -1, b = 0 $. Right: Moments of our PDE model with $ \sigma(x) = x, w = -1, b = -\frac{m_1(t)}{m_0(0)} $
Figure 2.  Left: The energy and variance plotted against the desired values with $ \sigma(x) = x, w = -1, b = 0 $. Right: The energy and variance plotted against the desired values with $ \sigma(x) = x, w = -1, b = -\frac{m_1(t)}{m_0(0)} $
Figure 3.  We consider $ 50 $ vehicles with measured lengths between $ 2 $ and $ 8 $, obtained as uniformly distributed random realizations. Left: Histogram of the measured lengths of the vehicles. Right: Trajectories of the neuron activation energies of the $ 50 $ measurements
Figure 4.  Solution of the mean field neural network model at different time steps. The initial value is a uniform distribution on $ [2, 8] $ and the weight and bias are chosen as $ w = 1, \ b = -5 $
Figure 5.  Left: Regression problem with $ 5\cdot10^3 $ measurements at fixed positions around $ y = x $. Measurement errors are distributed according to a standard Gaussian. Center: Numerical slopes computed from the previous measurements. Right: Numerical intercepts computed from the previous measurements
Figure 6.  Evolution at time $ t = 0 $ (left plot), $ t = 1 $ (center plot), $ t = 2 $ (right plot) of the mean field neural network model (30) for the regression problem with weights $ w_{xx} = 1 $, $ w_{xy} = w_{yx} = 0 $, $ w_{yy} = -1 $, and biases $ b_x = -1 $, $ b_y = 0 $
Figure 7.  Evolution at time $ t = 0 $ (left plot), $ t = 1 $ (center plot), $ t = 5 $ (right plot) of the one dimensional mean field neural network model for the regression problem with weight $ w = 1 $ and bias $ b = -1 $
Figure 8.  Results of the mean field neural network model with updated weights and biases in the case of a novel target
Figure 9.  Solution of the Fokker-Planck neural network model at different times. Here, we have chosen the identity as activation function with weight $ w = -1 $, bias $ b = 0 $ and diffusion function $ K(x) = 1 $
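
The following remarks are a sketch in our own notation, not quoted from the paper, and are meant only to make the parameter choices in the captions above plausible. For the linear activation $ \sigma(x) = x $ of Figures 1 and 2, the moments $ m_k(t) = \int x^k f(t,x) \, \mathrm{d}x $ of the transport equation satisfy the closed system
$ \dot{m}_0(t) = 0, \qquad \dot{m}_1(t) = w \, m_1(t) + b \, m_0(t), \qquad \dot{m}_2(t) = 2\left(w \, m_2(t) + b \, m_1(t)\right), $
so that with $ w = -1 $, $ b = 0 $ the mean decays as $ m_1(t) = m_1(0) e^{-t} $, while the time-dependent bias $ b = -m_1(t)/m_0(0) $ yields $ \dot{m}_1 = -2 m_1 $ (using $ m_0(t) \equiv m_0(0) $), i.e. a faster relaxation of the mean. For Figure 9, adding independent noise of strength $ \bar{\sigma} $ to the microscopic dynamics leads at the mean-field level to a Fokker-Planck equation of the schematic form
$ \partial_t f(t,x) + \partial_x \left[ \sigma\left(w \, x + b\right) f(t,x) \right] = \frac{\bar{\sigma}^2}{2} \, \partial_{xx}\left[ K(x)^2 f(t,x) \right], $
where the precise scaling of the diffusion term is an assumption on our part; with $ \sigma(x) = x $, $ w = -1 $, $ b = 0 $ and $ K(x) = 1 $ this is an Ornstein-Uhlenbeck-type dynamics whose steady state is a centered Gaussian.
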
Table 1.  Example of a data set for a classification problem
Measurement:  3    3.5   5.5    7      4.5   8      $ \dots $
Classifier:   car  car   truck  truck  car   truck  $ \dots $
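
In the classification example of Table 1 (cf. Figures 3 and 4), the measurements are vehicle lengths and the label distinguishes cars from trucks. Assuming the identity activation, the choice $ w = 1 $, $ b = -5 $ in Figure 4 gives the drift $ x - 5 $: measurements below the threshold $ 5 $ are driven towards smaller values and those above towards larger values, so that the initially uniform distribution on $ [2, 8] $ separates into two well-distinguished parts corresponding to the two classes. This is our reading of the experiment, not a statement taken verbatim from the paper.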