
ISALT: Inference-based schemes adaptive to large time-stepping for locally Lipschitz ergodic systems

  • * Corresponding author: Fei Lu


XL is supported by NSF DMS CAREER-1847770. FL is supported by NSF DMS 1913243 and NSF DMS 1821211. FY is supported by AMS-Simons travel grants.

  • Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. However, locally Lipschitz SDEs often require special treatment, such as implicit schemes with small time-steps, to accurately simulate the ergodic measures. We introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in time by several orders of magnitude. The key is the statistical learning of an approximation to the infinite-dimensional discrete-time flow map. We explore the use of numerical schemes (such as the Euler-Maruyama, the hybrid RK4, and an implicit scheme) to derive informed basis functions, leading to a parameter inference problem. We introduce a scalable algorithm to estimate the parameters by least squares, and we prove the convergence of the estimators as data size increases.

    We test ISALT on three non-globally Lipschitz SDEs: the 1D double-well potential, a 2D multiscale gradient system, and the 3D stochastic Lorenz equation with a degenerate noise. Numerical results show that ISALT can tolerate time-steps that are magnitudes larger than those tolerated by plain numerical schemes. It reaches optimal accuracy in reproducing the invariant measure when the time-step is medium-large.

    Mathematics Subject Classification: Primary: 65C30, 60H35; Secondary: 37M25, 62M20.


  • Figure 1.  Schematic plot of inferring an explicit scheme with a large time-step

    Figure 2.  Large-time statistics for 1D double-well potential. (a) TVD between the empirical invariant densities (PDF) of the inferred schemes and the reference PDF from data. (b) and (c): PDFs and ACFs comparison between the IS-RK4 with $ c_0 $ excluded and the reference data

    Figure 3.  1D double-well potential: Convergence of estimators in IS-RK4 with $ c_0 $ excluded. (a) The relative error of the estimator $ \widehat{c_1^{{\delta}, N,M}} $ with $ {\delta} = 80\times \Delta t $ converges at an order about $ (MN)^{-1/2} $, matching Theorem 3.5. (b) Left column: The coefficients depend on the time-step $ {\delta} = {\mathrm{Gap}}\times \Delta t $, with $ c_1 $ being almost 1 and $ c_2 $ being close to linear in $ {\delta} $ until $ {\delta}>0.08 $. The error bars, which are too narrow to be seen, are the standard deviations of the single-trajectory estimators from the $ M $-trajectory estimator. Right column: The residual decays at an order $ O({\delta}^{1/2}) $, matching Theorem 3.6

    Figure 4.  Large-time statistics for the 2D gradient system. (a) TVD between the $ x_1 $ marginal invariant densities (PDF) of the inferred schemes and the reference PDF from data. (b) and (c): PDFs and ACFs comparison between IS-SSBE with $ c_0 $ excluded and the reference data

    Figure 5.  2D gradient system: Convergence of estimators in IS-SSBE with $ c_0 $ excluded. (a) The relative error of the estimator $ \widehat{c_1^{{\delta}, N,M}} $ with $ {\delta} = 120 \Delta t $ converges at an order about $ (MN)^{-1/2} $, matching Theorem 3.5. (b) Left column: The estimators of $ c_1, c_2 $ are almost linear in $ {\delta} $. Right column: The residual changes little as $ {\delta} $ decreases, because IS-SSBE is not a parametrization of an explicit scheme (thus, Theorem 3.6 does not apply)

    Figure 6.  2D gradient system: Convergence of estimators in IS-RK4 with $ c_0 $ excluded. (a) The relative error of the estimator $ \widehat{c_1^{{\delta}, N,M}} $ with $ {\delta} = 120 \Delta t $ converges at an order about $ (MN)^{-1/2} $, matching Theorem 3.5. (b) Left column: The estimators of $ c_1, c_2 $ are constant for all $ {\delta} $. Right column: The residual decays at an order $ O({\delta}^{1/2}) $, matching Theorem 3.6

    Figure 7.  Large-time statistics of $ x_1 $ for the stochastic Lorenz system. (a) TVD between the $ x_1 $ marginal invariant densities (PDF) of the inferred schemes and the reference PDF from data. (b) and (c): PDFs and ACFs comparison between IS-RK4 with $ c_0 $ included and the reference data

    Figure 8.  ACF and PDF of $ x_3 $ in the stochastic Lorenz system. Similar to the other examples, IS-RK4 (with $ c_0 $ included) reproduces the PDF and the ACF best when the time-step is medium-large, while plain RK4 and IS-EM blow up even when $ {\mathrm{Gap}} = 20 $

    Figure 9.  The 3D stochastic Lorenz system: Convergence of estimators in IS-RK4 with $ c_0 $ included. (a) The relative error of the estimator $ \widehat{c_1^{{\delta}, N,M}} $ with $ {\delta} = 240 \Delta t = 0.12 $ converges at an order about $ (MN)^{-1/2} $, matching Theorem 3.5. (b) Left column: The estimators of $ c_0,c_1, c_2 $ vary little until $ {\delta}>0.12 $. The vertical dashed line is the optimal time gap. Right column: The residuals decay at orders slightly higher than $ O({\delta}^{1/2}) $
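The captions above compare schemes through the total variation distance (TVD) between empirical invariant densities and through autocorrelation functions (ACF). The following is a minimal sketch of how such diagnostics might be computed from sample paths. The function names `tvd` and `acf`, the histogram binning, and the factor 1/2 in the TVD are assumptions made for illustration; the paper's exact estimators are not given on this page.

```python
import numpy as np

def tvd(samples_a, samples_b, bins=50):
    """Total variation distance between two histogram estimates on a shared grid.

    Uses the convention TVD = 0.5 * sum |p - q| (an assumption), so the value lies in [0, 1].
    """
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(samples_a, bins=edges)
    q, _ = np.histogram(samples_b, bins=edges)
    p = p / p.sum()  # normalize counts to probabilities
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()

def acf(x, max_lag):
    """Empirical autocorrelation of a 1D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    return np.array([np.dot(x[: n - k], x[k:]) / (n * var) for k in range(max_lag + 1)])

# Illustrative usage with synthetic samples (not data from the paper).
rng = np.random.default_rng(0)
reference = rng.standard_normal(10_000)
simulated = rng.standard_normal(10_000) + 0.1
print("TVD:", tvd(reference, simulated))
print("ACF (lags 0-3):", acf(reference, 3))
```

A histogram-based TVD depends on the number of bins, so the same grid should be used for every scheme being compared.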

    Table 1.  Notations

    Notation | Description
    $ {{\bf X}}_t $ and $ {{\bf B}}_t $ | true state process and original stochastic force
    $ f({{\bf X}}_t) $, $ \sigma\in \mathbb{R}^{d\times m} $ | local-Lipschitz drift and diffusion matrix
    $ dt $ | time-step generating data
    $ {\delta}= {\mathrm{Gap}} \times dt $ | time-step for inferred scheme, $ {\mathrm{Gap}}\in \{ 1, 2, 4, 10, 20, 40,\ldots\} $
    $ t_i = i{\delta} $ | discrete time instants of data
    $ \{{{\bf X}}_{t_0:t_N}^{(m)}, {{\bf B}}_{t_0:t_N}^{(m)}\}_{m=1}^M $ | Data: $ M $ independent paths of $ {{\bf X}} $ and $ {{\bf B}} $ at discrete-times
    $ \mathcal{F}\left({{\bf X}}_{t_i},\, {{\bf B}}_{[t_{i}, \, t_{i+1})}\right) $ | true flow map representing $ ({{\bf X}}_{t_{i+1}}-{{\bf X}}_{t_i})/{\delta} $
    $ {F}^{\delta}({{\bf X}}_{t_n},\Delta {{\bf B}}_{t_n}) $ | approximate flow map using only $ {{\bf X}}_{t_n} $, $ \Delta {{\bf B}}_{t_n} = {{\bf B}}_{t_{n+1}}-{{\bf B}}_{t_{n}} $
    $ \widetilde F^{\delta}\left(c^{\delta}, {{\bf X}}_{t_n},\Delta {{\bf B}}_{t_n} \right) $ | parametric approximate flow map
    $ c^{\delta}=(c_0^{\delta},\dots,c_p^{\delta}) $ | parameters to be estimated for the inferred scheme
    $ \eta_n $ and $ \sigma_{\eta}^{\delta} $ | iid $ N(0, I_d) $ and covariance, representing regression residual
    EM and IS-EM | Euler-Maruyama and inferred scheme (IS) parametrizing it
    HRK4 and IS-RK4 | hybrid RK4 and inferred scheme parametrizing RK4
    SSBE and IS-SSBE | split-step stochastic backward Euler and IS parametrizing it
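The notation in Table 1 suggests how the parameter inference can be posed as a regression problem. The display below is a sketch of one plausible least-squares formulation written in that notation. It is an assumed form: the paper's own equations for the parametric flow map and the estimator are not reproduced on this page, so the exact objective, index ranges, and residual model may differ.

```latex
% Assumed regression model: the scaled increment equals the parametric flow map plus a Gaussian residual.
\begin{equation}
\frac{{\bf X}_{t_{n+1}} - {\bf X}_{t_n}}{\delta}
  = \widetilde F^{\delta}\!\left(c^{\delta}, {\bf X}_{t_n}, \Delta {\bf B}_{t_n}\right)
  + \sigma_{\eta}^{\delta}\, \eta_n ,
  \qquad \eta_n \sim N(0, I_d) \ \text{iid} .
\end{equation}

% Assumed least-squares estimator over M paths and N time-steps.
\begin{equation}
\widehat{c}^{\,\delta,N,M}
  = \operatorname*{arg\,min}_{c \in \mathbb{R}^{p+1}}
    \sum_{m=1}^{M} \sum_{n=0}^{N-1}
    \left\|
      \frac{{\bf X}^{(m)}_{t_{n+1}} - {\bf X}^{(m)}_{t_n}}{\delta}
      - \widetilde F^{\delta}\!\left(c, {\bf X}^{(m)}_{t_n}, \Delta {\bf B}^{(m)}_{t_n}\right)
    \right\|^{2} .
\end{equation}
```

When $ \widetilde F^{\delta} $ is linear in $ c $, this minimization reduces to an ordinary linear least-squares problem whose cost grows linearly in $ MN $.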

    Algorithm 1.  Inference-based schemes adaptive to large time-stepping (ISALT): detailed algorithm

    Input: Full model; a high-fidelity solver preserving the invariant measure.
    Output: Estimated parametric scheme
    1: Generate data: solve the system with the high-fidelity solver, which has a small time-step $ dt $; downsample to get time series with $ {\delta}= \mathrm{Gap}\times dt $. Denote the data, consisting of $ M $ independent trajectories on $ [0,N{\delta}] $, by $ \{{{\bf X}}_{t_0:t_N}^{(m)}, {{\bf B}}_{t_0:t_N}^{(m)}\}_{m=1}^M $ with $ t_i= i{\delta} $.
    2: Pick a parametric form approximating the flow map (2.1) as in (2.5)–(2.6).
    3: Estimate parameters $ c_{0:p}^{\delta} $ and $ \sigma_\eta $ as in (2.7).
    4: Model selection: run the inferred scheme for cross-validation, and test the consistency of the estimators.
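The four steps above can be sketched in a few lines for a scalar SDE. This is a minimal illustration under stated assumptions, not the authors' implementation: the drift $ f(x) = x - x^3 $ with $ \sigma = 1 $, the use of fine-step Euler-Maruyama as a stand-in for the high-fidelity solver, the two-term IS-EM-style basis $ c_1 f({\bf X}_{t_n}) + c_2 \sigma \Delta {\bf B}_{t_n}/{\delta} $, and all function names are hypothetical choices made here.

```python
import numpy as np

def drift(x):
    # Assumed double-well drift f(x) = x - x^3 (illustrative choice).
    return x - x**3

def generate_data(M, N, gap, dt, sigma, rng):
    """Step 1: simulate M fine-step paths and keep every gap-th point (delta = gap * dt)."""
    X = np.empty((M, N + 1))
    B = np.empty((M, N + 1))
    x = np.zeros(M)
    b = np.zeros(M)
    X[:, 0], B[:, 0] = x, b
    for n in range(1, N + 1):
        for _ in range(gap):
            dB = np.sqrt(dt) * rng.standard_normal(M)
            x = x + drift(x) * dt + sigma * dB  # fine-step Euler-Maruyama stand-in
            b = b + dB
        X[:, n], B[:, n] = x, b
    return X, B

def fit_is_em(X, B, delta, sigma):
    """Steps 2-3: least squares for (X_{n+1}-X_n)/delta ~ c1*f(X_n) + c2*sigma*dB_n/delta."""
    y = ((X[:, 1:] - X[:, :-1]) / delta).ravel()
    dB = (B[:, 1:] - B[:, :-1]).ravel()
    A = np.column_stack([drift(X[:, :-1]).ravel(), sigma * dB / delta])
    c, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma_eta = np.sqrt(np.mean((y - A @ c) ** 2))  # residual standard deviation
    return c, sigma_eta

def step_inferred(x, c, sigma_eta, delta, sigma, rng):
    """Step 4 helper: advance the inferred scheme by one large time-step delta."""
    dB = np.sqrt(delta) * rng.standard_normal(x.shape)
    eta = rng.standard_normal(x.shape)
    return x + delta * (c[0] * drift(x) + c[1] * sigma * dB / delta + sigma_eta * eta)

rng = np.random.default_rng(0)
dt, gap, sigma = 1e-3, 20, 1.0
delta = gap * dt
X, B = generate_data(M=100, N=300, gap=gap, dt=dt, sigma=sigma, rng=rng)
c, sigma_eta = fit_is_em(X, B, delta, sigma)
print("estimated (c1, c2):", c, "residual std:", sigma_eta)

# Run the inferred scheme forward as a basic sanity check before cross-validation.
x_sim = np.zeros(100)
for _ in range(500):
    x_sim = step_inferred(x_sim, c, sigma_eta, delta, sigma, rng)
```

In a fuller study, the samples produced by the inferred scheme would be compared with held-out reference data, and the fit would be repeated across several values of $ \mathrm{Gap} $ to check that the estimators behave consistently.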

    Table 2.  Time gap of blow-up for each scheme: plain versus inferred

    Scheme | 1D double-well | 2D gradient system | 3D Lorenz system
    Plain RK4 | $ {\mathrm{Gap}}=20 $ | $ {\mathrm{Gap}}=20 $ | $ {\mathrm{Gap}}=10 $
    IS-RK4 | $ {\mathrm{Gap}}>200 $ | $ {\mathrm{Gap}}>200 $ | $ {\mathrm{Gap}}>400 $
    Plain SSBE | $ {\mathrm{Gap}}=40 $ | $ {\mathrm{Gap}}=40 $ | $ {\mathrm{Gap}}=20 $
    IS-SSBE | $ {\mathrm{Gap}}>200 $ | $ {\mathrm{Gap}}>200 $ | $ {\mathrm{Gap}}>400 $