# American Institute of Mathematical Sciences

August 2020, 19(8): 4085-4095. doi: 10.3934/cpaa.2020181

## Function approximation by deep networks

H. N. Mhaskar ¹ and T. Poggio ²

1. Institute of Mathematical Sciences, Claremont Graduate University, Claremont, CA 91711
2. Center for Brains, Minds, and Machines, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139

*Corresponding author

Received August 2019; Revised November 2019; Published May 2020.

Fund Project: The research of the first author is supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via 2018-18032000002. The research of the second author is supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

We show that deep networks are better than shallow networks at approximating functions that can be expressed as a composition of functions described by a directed acyclic graph, because a deep network can be designed to share this compositional structure, while a shallow network cannot exploit such knowledge. Thus, the blessing of compositionality mitigates the curse of dimensionality. On the other hand, a theorem we call "good propagation of errors" allows one to lift theorems about shallow networks to theorems about deep networks, with an appropriate choice of norms, smoothness classes, etc. We illustrate this in three contexts, in which each channel of the deep network computes a spherical polynomial, a non-smooth ReLU network, or another zonal function network closely related to the ReLU network.
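The compositional structure described above can be made concrete with a minimal sketch. The constituent functions `h1`, `h2`, `h3` and the binary-tree DAG below are illustrative assumptions, not taken from the paper: each node depends on only two variables, so a deep network mirroring the DAG only ever needs to approximate low-dimensional functions, whereas a shallow network must treat the composite as a generic function of all four inputs.

```python
import math

# Hypothetical constituent functions, each depending on only 2 variables.
def h1(x1, x2):
    return math.tanh(x1 + x2)

def h2(x3, x4):
    return math.tanh(x3 * x4)

def h3(u, v):
    return math.tanh(u - v)

def g_function(x1, x2, x3, x4):
    """A compositional function structured by a binary-tree DAG:
    the output node h3 receives the outputs of h1 and h2 as inputs."""
    return h3(h1(x1, x2), h2(x3, x4))
```

Although `g_function` is nominally a function of four variables, each constituent is two-dimensional; this is the structural knowledge a matching deep network can exploit and a shallow network cannot.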

Citation: H. N. Mhaskar, T. Poggio. Function approximation by deep networks. Communications on Pure and Applied Analysis, 2020, 19 (8) : 4085-4095. doi: 10.3934/cpaa.2020181
**Figure 1.** This figure from [13] shows an example of a $\mathcal{G}$–function ($f^*$ given in (3.1)). The vertices $V\cup \mathbf{S}$ of the DAG $\mathcal{G}$ are denoted by red dots. The black dots represent the inputs; the inputs to the various nodes are indicated by the in–edges of the red nodes. The blue dot indicates the output value of the $\mathcal{G}$–function, $f^*$ in this example.
**Figure 2.** On the left, with ${\mathbf{x}}_0 = (1, 1, 1)/\sqrt{3}$, the graph of $f({\mathbf{x}}) = [({\mathbf{x}}\cdot{\mathbf{x}}_0-0.1)_+]^8 + [(-{\mathbf{x}}\cdot{\mathbf{x}}_0-0.1)_+]^8$. On the right, the graph of $\mathcal{D}_{\phi_\gamma}(f)$. Courtesy: D. Batenkov.
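The function plotted on the left of Figure 2 can be evaluated directly; the sketch below implements the formula from the caption, with $(t)_+ = \max(t, 0)$ the ReLU truncation. (The helper name `relu` is ours, not the paper's.)

```python
import math

def relu(t):
    """The truncated power (t)_+ = max(t, 0)."""
    return max(t, 0.0)

def f(x):
    """f(x) = [(x.x0 - 0.1)_+]^8 + [(-x.x0 - 0.1)_+]^8
    with x0 = (1, 1, 1)/sqrt(3), as in the figure caption."""
    x0 = (1.0 / math.sqrt(3.0),) * 3
    dot = sum(xi * x0i for xi, x0i in zip(x, x0))
    return relu(dot - 0.1) ** 8 + relu(-dot - 0.1) ** 8
```

At $\mathbf{x} = \pm\mathbf{x}_0$ we have $\mathbf{x}\cdot\mathbf{x}_0 = \pm 1$, so $f(\pm\mathbf{x}_0) = 0.9^8$, while $f$ vanishes on the band $|\mathbf{x}\cdot\mathbf{x}_0| \le 0.1$.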
