Discrete and Continuous Dynamical Systems - S
April 2022, Volume 15, Issue 4
Issue on stochastic computing in data science
There has been growing interest over the last decade in developing and applying dictionary learning (DL) to process massive datasets. Many of these efforts, however, focus on employing DL to compress data and extract a set of important features, while treating the restoration of the original data from this set as a secondary goal. On the other hand, although several methods can process streaming data by updating the dictionary incrementally as new snapshots arrive, most of those algorithms are designed for the setting in which the snapshots are randomly drawn from a probability distribution. In this paper, we present a new DL approach to compress and denoise massive datasets in real time, in which the data are streamed in a preset order (examples include videos and temporal experimental data), so that at any time we can observe only a biased sample of the whole dataset. Our approach builds up the dictionary incrementally in a relatively simple manner: if the new snapshot is adequately explained by the current dictionary, we perform sparse coding to find its sparse representation; otherwise, we add the new snapshot to the dictionary, applying a Gram-Schmidt process to maintain orthogonality. To compress and denoise noisy datasets, we denoise each snapshot directly before sparse coding, which deviates from the traditional dictionary learning approach of achieving denoising via sparse coding. Compared to full-batch matrix decomposition methods, where the whole dataset is kept in memory, and to other mini-batch approaches, where unbiased sampling is often assumed, our approach has minimal requirements for data sampling and storage: i) each snapshot is seen only once and then discarded, and ii) the snapshots are drawn in a preset order, so they can be highly biased.
Through experiments on climate simulations and scanning transmission electron microscopy (STEM) data, we demonstrate that the proposed approach performs competitively with those methods in data reconstruction and denoising.
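The incremental update described in this abstract can be sketched as follows. This is a minimal illustration under assumed details, not the authors' implementation: the relative-residual threshold `tol`, the function names, and the toy streaming loop are all hypothetical, and the denoising step is omitted.

```python
import numpy as np

def process_snapshot(D, x, tol=0.1):
    """One step of the incremental scheme: sparse-code the snapshot x
    against the orthonormal dictionary D (columns); if the residual is
    too large relative to x, append the normalized residual as a new
    atom (a single Gram-Schmidt step keeps D orthonormal)."""
    if D.shape[1] == 0:
        coeffs = np.zeros(0)
        residual = x
    else:
        coeffs = D.T @ x            # projection = coding for orthonormal D
        residual = x - D @ coeffs
    if np.linalg.norm(residual) > tol * np.linalg.norm(x):
        atom = residual / np.linalg.norm(residual)
        D = np.column_stack([D, atom])
        coeffs = np.append(coeffs, atom @ x)
    return D, coeffs

# stream snapshots through in a fixed order, seeing each one only once
rng = np.random.default_rng(0)
D = np.zeros((8, 0))
for _ in range(20):
    snapshot = rng.standard_normal(8)
    D, code = process_snapshot(D, snapshot)
```

With an orthonormal dictionary, the coding step reduces to a projection (optionally thresholded for sparsity), which is what makes the single Gram-Schmidt step sufficient to maintain orthogonality.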
In this paper, we study the linear transport model by adopting a deep learning method, in particular a deep neural network (DNN) approach. While interest in using DNNs to study partial differential equations is rising, here we adapt them to study kinetic models, in particular the linear transport model. Moreover, we present a theoretical analysis of the convergence of the neural network and its approximate solution toward the analytic solution. We demonstrate the accuracy and effectiveness of the proposed DNN method in numerical experiments.
This paper is concerned with fully discrete finite element approximations of a stochastic nonlinear Schrödinger (sNLS) equation with linear multiplicative noise of Stratonovich type. The goal of studying the sNLS equation is to understand the role played by the noise in possibly delaying or preventing the collapse and/or blow-up of the solution. In the paper we first carry out a detailed analysis of the properties of the solution, which lays a theoretical foundation and provides guidance for the numerical analysis. We then present a family of three-parameter fully discrete finite element methods which differ mainly in their time discretizations and contain many well-known schemes (such as the explicit and implicit Euler schemes and the Crank-Nicolson scheme) with different combinations of time discretization strategies.
The feasibility problem is at the core of the modeling of many problems in various disciplines of mathematics and the physical sciences, and quasi-convex functions are widely applied in fields such as economics, finance, and management science. In this paper, we consider the stochastic quasi-convex feasibility problem (SQFP), which is to find a common point of infinitely many sublevel sets of quasi-convex functions. Inspired by the idea of a stochastic index scheme, we propose a stochastic quasi-subgradient method to solve the SQFP, in which the quasi-subgradients of a random (and finite) index set of component quasi-convex functions at the current iterate are used to construct the descent direction at each iteration. Moreover, we introduce a notion of Hölder-type error bound property relative to the random control sequence for the SQFP, and use it to establish the global convergence theorem and convergence rate theory of the stochastic quasi-subgradient method. It is revealed in this paper that the stochastic quasi-subgradient method enjoys both the advantage of low computational cost and fast convergence.
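The iteration described above can be sketched on a hypothetical instance. Here the component quasi-convex functions are f_i(x) = |a_i·x − b_i| (convex, hence quasi-convex), so the feasibility problem asks for a common point of the sublevel sets {x : f_i(x) ≤ 0}; the random index set, step length, and problem sizes are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# components f_i(x) = |a_i . x - b_i|; a common point of the sublevel sets
# {x : f_i(x) <= 0} is a solution of the consistent linear system A x = b
m, d = 50, 5
A = rng.standard_normal((m, d))
b = A @ rng.standard_normal(d)

x = np.zeros(d)
for k in range(2000):
    idx = rng.choice(m, size=5, replace=False)   # random finite index set
    for i in idx:
        r = A[i] @ x - b[i]
        if abs(r) > 1e-12:                       # only violated components move x
            # normalized quasi-subgradient of f_i at x
            g = np.sign(r) * A[i] / np.linalg.norm(A[i])
            # step length f_i(x)/||a_i|| turns the update into a projection
            # onto the sublevel set (a Kaczmarz-type step)
            x = x - (abs(r) / np.linalg.norm(A[i])) * g
```

For these particular components the quasi-subgradient step with this step length coincides with a randomized projection method, which is why the iterates converge quickly to a feasible point.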
In this paper, we develop a drift homotopy implicit particle filter method. Our approach adopts the concept of drift homotopy in the resampling procedure of the particle filter for solving the nonlinear filtering problem, and we introduce an implicit particle filter method to improve the efficiency of the drift homotopy resampling procedure. Numerical experiments are carried out to demonstrate the effectiveness and efficiency of our drift homotopy implicit particle filter.
Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. However, locally Lipschitz SDEs often require special treatments, such as implicit schemes with small time-steps, to accurately simulate the ergodic measures. We introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in computation time by several orders of magnitude. The key is the statistical learning of an approximation to the infinite-dimensional discrete-time flow map. We explore the use of numerical schemes (such as the Euler-Maruyama, the hybrid RK4, and an implicit scheme) to derive informed basis functions, leading to a parameter inference problem. We introduce a scalable algorithm to estimate the parameters by least squares, and we prove the convergence of the estimators as the data size increases.
We test the ISALT on three non-globally Lipschitz SDEs: the 1D double-well potential, a 2D multiscale gradient system, and the 3D stochastic Lorenz equation with a degenerate noise. Numerical results show that ISALT can tolerate time-step magnitudes larger than plain numerical schemes. It reaches optimal accuracy in reproducing the invariant measure when the time-step is medium-large.
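The inference step can be illustrated on the 1D double-well example. The sketch below is an assumed minimal setup, not the paper's algorithm: it simulates the SDE dX = (X − X³)dt + σ dW with a fine Euler-Maruyama step, observes the trajectory only at a large time-step, and fits the coefficients of an Euler-Maruyama-informed basis by least squares; σ, the step sizes, and the basis choice are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# fine-scale Euler-Maruyama trajectory of dX = (X - X^3) dt + sigma dW
sigma, dt_fine, gap, n = 0.5, 1e-3, 100, 200_000
dW = np.sqrt(dt_fine) * rng.standard_normal(n - 1)
x = np.empty(n)
x[0] = 0.1
for k in range(n - 1):
    x[k + 1] = x[k] + (x[k] - x[k] ** 3) * dt_fine + sigma * dW[k]

# observe the data only at the large time-step dt = gap * dt_fine
coarse = x[::gap]
X, Y = coarse[:-1], coarse[1:]

# informed basis from the Euler-Maruyama scheme:
# x_{n+1} - x_n ~ dt * (c0 * x_n + c1 * x_n^3) + noise
dt = gap * dt_fine
Phi = np.column_stack([dt * X, dt * X ** 3])
c, *_ = np.linalg.lstsq(Phi, Y - X, rcond=None)
```

The fitted coefficients generally differ from the plain Euler-Maruyama values (1, −1), which is the point of the inference: the learned one-step map compensates for the large time-step.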
In this work, by incorporating stochastic approximation methods, we propose a new explicit multistep scheme for solving forward-backward stochastic differential equations. Compared with schemes constructed using the derivative approximation method, the new scheme covers the approximation of the stochastic part and is more accurate and easier to implement. Several numerical tests are presented to show the stability and effectiveness of the proposed scheme.
Topological data analysis encompasses a broad set of techniques that investigate the shape of data. One of the predominant tools in topological data analysis is persistent homology, which is used to create topological summaries of data called persistence diagrams. Persistent homology offers a novel method for signal analysis. Herein, we aid interpretation of the sublevel set persistence diagrams of signals by 1) showing the effect of frequency and instantaneous amplitude on the persistence diagrams for a family of deterministic signals, and 2) providing a general equation for the probability density of persistence diagrams of random signals via a pushforward measure. We also provide a topologically-motivated, efficiently computable statistical descriptor analogous to the power spectral density for signals based on a generalized Bayesian framework for persistence diagrams. This Bayesian descriptor is shown to be competitive with power spectral densities and continuous wavelet transforms at distinguishing signals with different dynamics in a classification problem with autoregressive signals.
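Sublevel set persistence of a 1-D signal, the object this abstract analyzes, can be computed with a short union-find sweep. The sketch below is a standard textbook construction included for illustration; the function name and the zero-persistence filtering are choices made here, not details from the paper.

```python
import numpy as np

def sublevel_persistence(signal):
    """0-dimensional sublevel-set persistence of a 1-D signal via a
    union-find sweep: add samples in increasing order of value; local
    minima give births, and a merge of two components kills the younger
    one (elder rule), producing a (birth, death) pair."""
    n = len(signal)
    parent = [-1] * n                       # -1: sample not yet added
    birth = {}
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in sorted(range(n), key=lambda j: signal[j]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):            # merge with added neighbors
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < signal[i]:    # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    pairs.append((birth[find(0)], np.inf))      # the oldest class never dies
    return pairs

diagram = sublevel_persistence([0.0, 2.0, 1.0, 3.0])
```

For the example signal, the local minimum at value 1 is born there and dies when its component merges at value 2, while the global minimum's class persists forever.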
In this paper, we propose a class of numerical schemes for stochastic Poisson systems with multiple invariant Hamiltonians. The method is based on the average vector field discrete gradient and an orthogonal projection technique. The proposed schemes preserve all the invariant Hamiltonians of the stochastic Poisson systems simultaneously, while also allowing high convergence orders. We also prove that our numerical schemes preserve the Casimir functions of the systems under certain conditions. Numerical experiments verify the theoretical results and illustrate the effectiveness of our schemes.
Many real-world problems require estimating parameters of interest in a Bayesian framework from data that are collected sequentially in time. Conventional methods to sample the posterior distributions, such as Markov chain Monte Carlo methods, cannot efficiently deal with such problems because they do not take advantage of the sequential structure. To this end, the ensemble Kalman inversion (EnKI), which updates the particles whenever a new collection of data arrives, has become a popular tool to solve this type of problem. In this work, we present a method to improve the performance of EnKI, which removes particles that significantly deviate from the posterior distribution via a resampling procedure. Specifically, we adopt an idea developed for the sequential Monte Carlo sampler and simplify it to compute an approximate weight function. Finally, we use the computed weights to identify and remove those particles seriously deviating from the target distribution. With numerical examples, we demonstrate that, without requiring any additional evaluations of the forward model, the proposed method can improve the performance of standard EnKI in a certain class of problems.
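The weight-and-resample step described above can be sketched as follows. This is a hedged illustration, not the paper's weight function: the Gaussian data-misfit weight, the noise scale `gamma`, and the toy identity forward model are all assumptions made here.

```python
import numpy as np

def resample_ensemble(ensemble, forward, y, gamma, rng):
    """Remove ensemble members that deviate strongly from the data:
    form approximate self-normalized weights from the Gaussian data
    misfit, then resample the ensemble according to those weights."""
    misfits = np.array([np.sum((y - forward(u)) ** 2) / gamma
                        for u in ensemble])
    logw = -0.5 * misfits
    w = np.exp(logw - logw.max())       # subtract max for stability
    w /= w.sum()
    idx = rng.choice(len(ensemble), size=len(ensemble), p=w)
    return ensemble[idx]

# toy example: identity forward model, data generated near u = 2
rng = np.random.default_rng(3)
ensemble = rng.normal(0.0, 3.0, size=200)   # prior ensemble, far from the data
y, gamma = 2.0, 0.25
new_ensemble = resample_ensemble(ensemble, lambda u: u, y, gamma, rng)
```

Note that the weights reuse forward-model evaluations already available from the EnKI update, consistent with the abstract's claim of no additional forward evaluations.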
In this paper, we propose a new class of operator factorization methods to discretize the integral fractional Laplacian
We propose a data-driven learning framework for the analytic continuation problem in numerical quantum many-body physics. Designing an accurate and efficient framework for the analytic continuation of imaginary-time computational data is a grand challenge that has hindered meaningful links with experimental data. The standard Maximum Entropy (MaxEnt)-based method is limited by the quality of the computational data and the availability of prior information. Moreover, MaxEnt cannot solve the inversion problem when the noise level in the data is high. Here we introduce a novel learning model for the analytic continuation problem based on an Adams-Bashforth residual neural network (AB-ResNet). The advantage of this deep learning network is that it is model independent and therefore does not require prior information about the quantity of interest, the spectral function. More importantly, the ResNet-based model achieves higher accuracy than MaxEnt on data with higher noise levels. Finally, numerical examples show that the developed AB-ResNet is able to recover the spectral function with accuracy comparable to MaxEnt when the noise level is relatively small.
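The multistep residual connection behind an AB-ResNet can be illustrated with a minimal forward pass. This sketch only shows the two-step Adams-Bashforth skip-connection pattern; the tanh modules, widths, and step size are hypothetical and not the authors' architecture.

```python
import numpy as np

def ab2_resnet_forward(x0, layers, h=0.1):
    """Forward pass of a two-step Adams-Bashforth residual network:
    x_{n+1} = x_n + h * (3/2 F_n(x_n) - 1/2 F_{n-1}(x_{n-1})),
    where F_n(x) = tanh(W_n x) is the n-th residual module."""
    F = lambda W, v: np.tanh(W @ v)
    x_prev, x = x0, x0 + h * F(layers[0], x0)   # Euler start-up step
    for n in range(1, len(layers)):
        x_next = x + h * (1.5 * F(layers[n], x)
                          - 0.5 * F(layers[n - 1], x_prev))
        x_prev, x = x, x_next
    return x

rng = np.random.default_rng(4)
layers = [rng.standard_normal((6, 6)) * 0.3 for _ in range(8)]
out = ab2_resnet_forward(rng.standard_normal(6), layers)
```

Compared with a plain ResNet (explicit Euler), the two-step weighting 3/2 and −1/2 mirrors the classical second-order Adams-Bashforth integrator.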
In this paper, we develop a sparse grid stochastic collocation method to improve the computational efficiency in handling the steady Stokes-Darcy model with random hydraulic conductivity. To represent the random hydraulic conductivity, the truncated Karhunen-Loève expansion is used. For the discrete form in probability space, we adopt the stochastic collocation method and then use the Smolyak sparse grid method to improve the efficiency. For the uncoupled deterministic subproblems at collocation nodes, we apply the general coupled finite element method. Numerical experiment results are presented to illustrate the features of this method, such as the sample size, convergence, and randomness transmission through the interface.
While detailed chemical kinetic models have been successful in representing rates of chemical reactions in continuum-scale computational fluid dynamics (CFD) simulations, applying the models in simulations at engineering device conditions is computationally prohibitive. To reduce the cost, data-driven methods, e.g., autoencoders, have been used to construct reduced chemical kinetic models for CFD simulations. Despite their success, data-driven methods rely heavily on the training data set and can be unreliable when used in out-of-distribution (OOD) regions (i.e., when extrapolating outside of the training set). In this paper, we present an enhanced autoencoder model for combustion chemical kinetics with uncertainty quantification to enable the detection of model usage in OOD regions, thereby creating an OOD-aware autoencoder model that contributes to more robust CFD simulations of reacting flows. We first demonstrate the effectiveness of the method for OOD detection on two well-known datasets, MNIST and Fashion-MNIST, in comparison with the deep ensemble method, and then present the OOD-aware autoencoder for a reduced chemistry model in syngas combustion.
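The basic OOD-detection idea, flagging inputs the autoencoder reconstructs poorly, can be sketched with a linear autoencoder (a principal subspace) standing in for the trained model. Everything here is an assumption for illustration: the synthetic data, the PCA stand-in, and the quantile threshold are not the paper's model or calibration.

```python
import numpy as np

rng = np.random.default_rng(5)

# in-distribution training data lies near a 2-D subspace of R^10
basis = rng.standard_normal((10, 2))
train = (rng.standard_normal((500, 2)) @ basis.T
         + 0.01 * rng.standard_normal((500, 10)))

# linear autoencoder: encode/decode through the top-2 principal subspace
center = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - center, full_matrices=False)
encode = lambda X: (X - center) @ Vt[:2].T
decode = lambda Z: Z @ Vt[:2] + center

def recon_error(X):
    return np.linalg.norm(X - decode(encode(X)), axis=1)

# flag inputs whose reconstruction error exceeds a training-set quantile
threshold = np.quantile(recon_error(train), 0.99)
in_dist = (rng.standard_normal((100, 2)) @ basis.T
           + 0.01 * rng.standard_normal((100, 10)))
ood = rng.standard_normal((100, 10)) * 2.0      # off-subspace samples
```

In-distribution test points reconstruct well and stay under the threshold, while off-subspace points produce large errors, which is the signal an OOD-aware model uses to warn the CFD solver.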
We propose the novel augmented Gaussian random field (AGRF), a universal framework incorporating data on an observable and its derivatives of any order. Rigorous theory is established. We prove that under certain conditions, the observable and its derivatives of any order are governed by a single Gaussian random field, which is the aforementioned AGRF. As a corollary, the statement "the derivative of a Gaussian process remains a Gaussian process" is validated, since the derivative is represented by a part of the AGRF. Moreover, a computational method corresponding to the universal AGRF framework is constructed. Both noiseless and noisy scenarios are considered. Formulas for the posterior distributions are deduced in closed form. A significant advantage of our computational method is that the universal AGRF framework provides a natural way to incorporate arbitrary-order derivatives and deal with missing data. We use four numerical examples to demonstrate the effectiveness of the computational method: a composite function, a damped harmonic oscillator, the Korteweg-de Vries equation, and Burgers' equation.
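Conditioning a single Gaussian random field jointly on values and first derivatives can be sketched in one dimension. This is a generic illustration with a squared-exponential kernel and its derivative cross-covariances, not the paper's AGRF construction; the lengthscale, test function, and noiseless setting are assumptions.

```python
import numpy as np

def agrf_posterior_mean(xv, yv, xd, yd, xs, ell=1.0, jitter=1e-8):
    """Posterior mean at test points xs of a Gaussian random field
    conditioned jointly on function values (xv, yv) and derivative
    observations (xd, yd); squared-exponential kernel, noiseless data."""
    diff = lambda a, b: a[:, None] - b[None, :]
    k   = lambda a, b: np.exp(-diff(a, b) ** 2 / (2 * ell ** 2))
    kfd = lambda a, b: diff(a, b) / ell ** 2 * k(a, b)      # cov(f, f')
    kdd = lambda a, b: (1 / ell ** 2
                        - diff(a, b) ** 2 / ell ** 4) * k(a, b)
    # joint covariance of the observation vector z = [f(xv), f'(xd)]
    K = np.block([[k(xv, xv),     kfd(xv, xd)],
                  [-kfd(xd, xv),  kdd(xd, xd)]])
    Ks = np.hstack([k(xs, xv), kfd(xs, xd)])
    z = np.concatenate([yv, yd])
    return Ks @ np.linalg.solve(K + jitter * np.eye(len(z)), z)

# condition on values of sin at {0, 1, 2} and derivatives (cos) at {0.5, 1.5}
xv = np.array([0.0, 1.0, 2.0])
xd = np.array([0.5, 1.5])
mean_at = agrf_posterior_mean(xv, np.sin(xv), xd, np.cos(xd),
                              np.array([0.75]))
```

The cross-covariances come from differentiating the kernel, which is what makes the observable and its derivatives samples of one underlying field; missing value or derivative data simply drop out of the corresponding blocks.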
Building upon the hypoelliptic analysis of the effective Mori-Zwanzig (EMZ) equation for observables of stochastic dynamical systems, we show that the obtained semigroup estimates for the EMZ equation can be used to derive prior estimates of the observable statistics for systems in equilibrium and nonequilibrium states. In addition, we introduce both first-principle and data-driven methods to approximate the EMZ memory kernel and prove the convergence of the data-driven parametrization schemes using the regularity estimate of the memory kernel. The analysis results are validated numerically via Monte Carlo simulation of the Langevin dynamics for a Fermi-Pasta-Ulam chain model. With the same example, we also show the effectiveness of the proposed memory kernel approximation methods.