# American Institute of Mathematical Sciences

June  2022, 4(2): 271-298. doi: 10.3934/fods.2022007

## Mean-field and kinetic descriptions of neural differential equations

1. Institute of Geometry and Applied Mathematics, RWTH Aachen University, Templergraben 55, 52074 Aachen, Germany
2. NRW.Bank, Kavalleriestraße 22, 40213 Düsseldorf, Germany
3. Department of Mathematics "G. Castelnuovo", Sapienza University of Rome, P.le Aldo Moro 5, 00185 Roma, Italy

*Corresponding author: Giuseppe Visconti

Received: November 2021. Revised: February 2022. Early access: March 2022. Published: June 2022.

Nowadays, neural networks are widely used in many applications as artificial intelligence models for learning tasks. Since neural networks typically process very large amounts of data, it is convenient to formulate them within mean-field and kinetic theory. In this work we focus on a particular class of neural networks, namely residual neural networks, assuming that each layer is characterized by the same number of neurons $N$, which is fixed by the dimension of the data. This assumption allows us to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation which describes the evolution of the distribution of the input data. We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insight into the choice of the network parameters. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which the concept of network training is replaced by the task of fitting distributions. The analysis is validated by numerical simulations on artificial data. In particular, results on classification and regression problems are presented.
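The residual-network/ODE analogy described above can be sketched in a few lines. This is an illustrative toy, not the paper's code: the layer count, step size $h$, and the tanh activation are choices made here for concreteness.

```python
import numpy as np

def residual_forward(x0, weights, biases, h=0.1, sigma=np.tanh):
    """Propagate x0 through residual layers; each update
    x <- x + h * sigma(W x + b) is one explicit Euler step of
    the ODE  dx/dt = sigma(W(t) x + b(t))."""
    x = np.asarray(x0, dtype=float)
    for W, b in zip(weights, biases):
        x = x + h * sigma(W @ x + b)  # residual block = Euler step
    return x

# One-dimensional toy with w = -1, b = 0: the state is driven toward 0,
# mirroring the contractive linear setting discussed in the abstract.
layers = 50
W = [np.array([[-1.0]])] * layers
b = [np.zeros(1)] * layers
out = residual_forward(np.array([5.0]), W, b)
```

With $w = -1$ the map is contractive, so the state decays monotonically toward zero while remaining positive, which is the qualitative behavior the moment analysis of the paper studies in the linear case.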

Citation: Michael Herty, Torsten Trimborn, Giuseppe Visconti. Mean-field and kinetic descriptions of neural differential equations. Foundations of Data Science, 2022, 4 (2) : 271-298. doi: 10.3934/fods.2022007
##### Figures:
Left: Moments of our PDE model with $\sigma(x) = x, w = -1, b = 0$. Right: Moments of our PDE model with $\sigma(x) = x, w = -1, b = -\frac{m_1(t)}{m_0(0)}$
Left: The energy and variance plotted against the desired values with $\sigma(x) = x, w = -1, b = 0$. Right: The energy and variance plotted against the desired values with $\sigma(x) = x, w = -1, b = -\frac{m_1(t)}{m_0(0)}$
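A plausible reading of the moment plots above, assuming the one-dimensional Vlasov-type equation takes the standard transport form $\partial_t f + \partial_x\big(\sigma(wx+b)\,f\big) = 0$ (the precise form is not restated on this page):

```latex
% Moments m_k(t) = \int x^k f(t,x)\,dx under the linear activation \sigma(x) = x:
\frac{d}{dt} m_0(t) = 0,
\qquad
\frac{d}{dt} m_1(t) = w\, m_1(t) + b\, m_0(t).
% With w = -1, b = 0 the mean decays as m_1(t) = m_1(0)\,e^{-t};
% the adaptive bias b = -m_1(t)/m_0(0) would double the decay rate.
```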
We consider $50$ vehicles with measured lengths between $2$ and $8$, obtained as uniformly distributed random realizations. Left: Histogram of the measured vehicle lengths. Right: Trajectories of the neuron activation energies of the $50$ measurements
Solution of the mean-field neural network model at different time steps. The initial value is a uniform distribution on $[2, 8]$, and the weight and bias are chosen as $w = 1, \ b = -5$
Left: Regression problem with $5\cdot10^3$ measurements at fixed positions around $y = x$. Measurement errors are distributed according to a standard Gaussian. Center: Numerical slopes computed out of the previous measurements. Right: Numerical intercepts computed out of the previous measurements
Evolution at time $t = 0$ (left plot), $t = 1$ (center plot), $t = 2$ (right plot) of the mean-field neural network model (30) for the regression problem with weights $w_{xx} = 1$, $w_{xy} = w_{yx} = 0$, $w_{yy} = -1$, and biases $b_x = -1$, $b_y = 0$
Evolution at time $t = 0$ (left plot), $t = 1$ (center plot), $t = 5$ (right plot) of the one-dimensional mean-field neural network model for the regression problem with weight $w = 1$ and bias $b = -1$
Results of the mean-field neural network model with updated weights and biases in the case of a novel target
Solution of the Fokker-Planck neural network model at different times. Here, we have chosen the identity activation function with weight $w = -1$, bias $b = 0$, and diffusion function $K(x) = 1$
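For orientation, a standard Fokker-Planck counterpart of the transport dynamics consistent with this caption would read as follows; the exact scaling of the diffusion term in the paper's model is an assumption here:

```latex
\partial_t f(t,x) + \partial_x\big(\sigma(w x + b)\, f(t,x)\big)
  = \partial_{xx}\!\Big( \tfrac{K(x)^2}{2}\, f(t,x) \Big),
% with K(x) = 1 this adds constant diffusion to the drift \sigma(wx+b).
```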
Example of a data set for a classification problem
| Measurement | 3 | 3.5 | 5.5 | 7 | 4.5 | 8 | $\dots$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Classifier | car | car | truck | truck | car | truck | $\dots$ |
