Nonlocal models have recently had a major impact in nonlinear continuum mechanics and are used to describe physical systems and processes that cannot be accurately described by classical, calculus-based "local" approaches. In part, this is due to their multiscale nature, which enables aggregation of micro-level behavior to obtain a macro-level description of singular or irregular phenomena such as those arising in peridynamics, crack propagation, anomalous diffusion and transport. At the core of these models are nonlocal differential operators, including nonlocal analogs of the gradient and the Hessian. This paper initiates the use of such nonlocal operators in the context of optimization and learning. We define nonlocal analogs of (stochastic) gradient descent and Newton's method on Euclidean spaces and analyze their convergence properties. Our results indicate that as the nonlocal interactions become less pronounced, the optima corresponding to nonlocal optimization converge to the "usual" optima. At the same time, we argue that nonlocal learning is possible in situations where standard calculus fails. As a stylized numerical example, we consider the problem of parameter estimation on a non-smooth translation manifold and show that our nonlocal gradient descent recovers the unknown translation parameter from a non-differentiable objective function.
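To make the idea of nonlocal gradient descent concrete, the following is a minimal one-dimensional sketch rather than the paper's actual construction. It assumes a nonlocal gradient of the form $G_\delta f(x) = \int_{-\delta}^{\delta} \frac{f(x+h) - f(x)}{h}\,\rho(h)\,dh$ with a uniform density $\rho = 1/(2\delta)$, approximated by midpoint quadrature. The function names, the horizon `delta`, the step size, and the objective $|x - 2|$ are all illustrative assumptions.

```python
import numpy as np


def nonlocal_gradient_1d(f, x, delta=0.5, n_nodes=400):
    """Approximate a 1-D nonlocal gradient of f at x with horizon delta.

    Assumed form (an illustration, not taken from the abstract):
        G_delta f(x) = integral over [-delta, delta] of
                       (f(x + h) - f(x)) / h * rho(h) dh,
    with rho the uniform density 1 / (2 * delta).
    """
    if n_nodes % 2 != 0:
        raise ValueError("n_nodes must be even so that h = 0 is not a node")
    # Midpoint nodes on [-delta, delta]; symmetric about 0 and never 0.
    width = 2.0 * delta / n_nodes
    h = -delta + (np.arange(n_nodes) + 0.5) * width
    quotients = (f(x + h) - f(x)) / h
    # Uniform density times cell width gives equal weights 1 / n_nodes.
    return float(np.mean(quotients))


def nonlocal_gradient_descent(f, x0, step=0.1, delta=0.5, n_iters=200):
    """Fixed-step descent that uses the nonlocal gradient in place of f'."""
    x = float(x0)
    for _ in range(n_iters):
        x -= step * nonlocal_gradient_1d(f, x, delta)
    return x


def objective(x):
    # Hypothetical test objective: non-differentiable at its minimizer x = 2.
    return np.abs(x - 2.0)


x_hat = nonlocal_gradient_descent(objective, x0=0.0)
print(x_hat)  # should be close to 2
```

Under these assumptions the nonlocal gradient of $|x - 2|$ equals the classical derivative $\pm 1$ whenever $|x - 2| \ge \delta$, and it shrinks continuously to zero as $x$ approaches the kink. This is why a fixed step size settles at the minimizer instead of oscillating around it, as the classical sign-valued derivative would.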