doi: 10.3934/dcdss.2021157
Online First

Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients

Ruilin Li, Xin Wang, Hongyuan Zha and Molei Tao

1. Georgia Institute of Technology, USA
2. Google Research, USA
3. School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, China

*Corresponding author

© 2021 The Author(s). Published by AIMS, LLC. This is an Open Access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Received: April 2021. Revised: September 2021. Early access: December 2021.

Many Markov Chain Monte Carlo (MCMC) methods leverage gradient information of the potential function of the target distribution to explore the sample space efficiently. However, computing gradients can be computationally expensive in large-scale applications, such as those in contemporary machine learning. Stochastic Gradient (SG-)MCMC methods approximate the true gradients by stochastic ones, commonly computed from uniformly subsampled data points, and achieve improved computational efficiency, albeit at the price of introducing sampling error. We propose a non-uniform subsampling scheme to improve the sampling accuracy. The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform-SG-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method, and hence the inaccuracy due to the SG approximation is reduced. EWSG differs from classical variance reduction (VR) techniques in that it targets the entire distribution rather than just the variance; nevertheless, its reduced local variance is also proved. EWSG can also be viewed as an extension of the importance sampling idea, successful for stochastic-gradient-based optimization, to sampling tasks. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm. Numerical experiments are provided, not only to demonstrate EWSG's effectiveness but also to guide hyperparameter choices and to validate our non-asymptotic global error bound, despite the approximations made in the implementation. Notably, while statistical accuracy is improved, the convergence speed can be comparable to that of the uniform version, which renders EWSG a practical alternative to VR (though EWSG and VR can also be combined).
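As a rough illustration of the mechanism described above (and not the paper's exact algorithm), the following Python sketch couples a Metropolis-Hastings chain on the data index to an SGLD-style update: indices are proposed uniformly and accepted according to a state-dependent weight, and the currently selected datum supplies the stochastic gradient. The function names, the toy per-datum gradient, and in particular the exponential weight in `log_weight` are illustrative assumptions; in EWSG the weights are instead designed so that the non-uniform SG-MCMC mimics the batch-gradient dynamics.

```python
# Hypothetical sketch: non-uniform subsampling for a Langevin-type sampler,
# with the data index evolved by a Metropolis-Hastings (MH) chain.
# The weight function below is a placeholder, NOT the paper's EWSG weights.
import numpy as np

def grad_i(x, data_i):
    """Per-datum gradient of the potential; a toy Gaussian-likelihood example."""
    return x - data_i

def non_uniform_sgld(data, n_steps=1000, h=1e-3, index_steps=1, alpha=1.0, seed=None):
    rng = np.random.default_rng(seed)
    N, d = data.shape
    x = np.zeros(d)
    i = rng.integers(N)                        # current data index

    def log_weight(j, x):
        # Illustrative non-uniform weight: favor indices with large gradient norm.
        return alpha * np.linalg.norm(grad_i(x, data[j]))

    samples = []
    for _ in range(n_steps):
        # Inner MH chain on the data index, coupled to the sampler state x.
        for _ in range(index_steps):
            j = rng.integers(N)                # uniform proposal
            if np.log(rng.random()) < log_weight(j, x) - log_weight(i, x):
                i = j
        # SGLD-style update using the non-uniformly chosen stochastic gradient.
        g = N * grad_i(x, data[i])             # scaled single-datum gradient
        x = x - h * g + np.sqrt(2 * h) * rng.standard_normal(d)
        samples.append(x.copy())
    return np.array(samples)

if __name__ == "__main__":
    data = np.random.default_rng(0).normal(size=(500, 2))
    print(non_uniform_sgld(data).mean(axis=0))
```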

Citation: Ruilin Li, Xin Wang, Hongyuan Zha, Molei Tao. Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients. Discrete and Continuous Dynamical Systems - S, doi: 10.3934/dcdss.2021157
References:
[1] S. Ahn, A. Korattikara and M. Welling, Bayesian posterior sampling via stochastic gradient Fisher scoring, In 29th International Conference on Machine Learning, ICML 2012, (2012), 1591–1598.
[2] F. Bach, Stochastic gradient methods for machine learning, Technical report, INRIA - École Normale Supérieure, 2013. http://lear.inrialpes.fr/people/harchaoui/projects/gargantua/slides/bach_gargantua_nov2013.pdf.
[3] J. Baker, P. Fearnhead, E. Fox and C. Nemeth, Control variates for stochastic gradient MCMC, Stat. Comput., 29 (2019), 599-615. doi: 10.1007/s11222-018-9826-2.
[4] R. Bardenet, A. Doucet and C. Holmes, On Markov chain Monte Carlo methods for tall data, J. Mach. Learn. Res., 18 (2017), 43 pp.
[5] V. S. Borkar and S. K. Mitter, A strong approximation theorem for stochastic recursive algorithms, J. Optim. Theory Appl., 100 (1999), 499-513. doi: 10.1023/A:1022630321574.
[6] N. Bou-Rabee, A. Eberle and R. Zimmer, Coupling and convergence for Hamiltonian Monte Carlo, Ann. Appl. Probab., 30 (2018), 1209-1250. doi: 10.1214/19-AAP1528.
[7] N. Bou-Rabee and H. Owhadi, Long-run accuracy of variational integrators in the stochastic context, SIAM J. Numer. Anal., 48 (2010), 278-297. doi: 10.1137/090758842.
[8] N. Bou-Rabee and J. M. Sanz-Serna, Geometric integrators and the Hamiltonian Monte Carlo method, Acta Numer., 27 (2018), 113-206. doi: 10.1017/S0962492917000101.
[9] S. Brooks, A. Gelman, G. Jones and X.-L. Meng, Handbook of Markov Chain Monte Carlo, CRC Press, 2011. doi: 10.1201/b10905.
[10] N. Chatterji, N. Flammarion, Y. Ma, P. Bartlett and M. Jordan, On the theory of variance reduction for stochastic gradient Monte Carlo, ICML, (2018).
[11] C. Chen, N. Ding and L. Carin, On the convergence of stochastic gradient MCMC algorithms with high-order integrators, Advances in Neural Information Processing Systems, (2015), 2278-2286.
[12] T. Chen, E. B. Fox and C. Guestrin, Stochastic gradient Hamiltonian Monte Carlo, International Conference on Machine Learning, (2014), 1683-1691.
[13] X. Cheng, N. S. Chatterji, Y. Abbasi-Yadkori, P. L. Bartlett and M. I. Jordan, Sharp convergence rates for Langevin dynamics in the nonconvex setting, preprint, arXiv: 1805.01648, 2018.
[14] X. Cheng, N. S. Chatterji, P. L. Bartlett and M. I. Jordan, Underdamped Langevin MCMC: A non-asymptotic analysis, Proceedings of the 31st Conference On Learning Theory, PMLR, (2018).
[15] D. Csiba and P. Richtárik, Importance sampling for minibatches, J. Mach. Learn. Res., 19 (2018), 21 pp.
[16] A. S. Dalalyan and A. Karagulyan, User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient, Stochastic Process. Appl., 129 (2019), 5278-5311. doi: 10.1016/j.spa.2019.02.016.
[17] A. Defazio, F. Bach and S. Lacoste-Julien, SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, In Advances in Neural Information Processing Systems, (2014), 1646–1654.
[18] A. Defazio and L. Bottou, On the ineffectiveness of variance reduced optimization for deep learning, In Advances in Neural Information Processing Systems, (2019), 1755–1765.
[19] K. A. Dubey, S. J. Reddi, S. A. Williamson, B. Poczos, A. J. Smola and E. P. Xing, Variance reduction in stochastic gradient Langevin dynamics, In Advances in Neural Information Processing Systems, (2016), 1154–1162.
[20] T. Fu and Z. Zhang, CPSG-MCMC: Clustering-based preprocessing method for stochastic gradient MCMC, In Artificial Intelligence and Statistics, (2017), 841–850.
[21] K. Jarrett, K. Kavukcuoglu, M. A. Ranzato and Y. LeCun, What is the best multi-stage architecture for object recognition? In 2009 IEEE 12th International Conference on Computer Vision, (2009), 2146–2153. doi: 10.1109/ICCV.2009.5459469.
[22] R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems, (2013), 315-323.
[23] A. Korattikara, Y. Chen and M. Welling, Austerity in MCMC land: Cutting the Metropolis-Hastings budget, In International Conference on Machine Learning, (2014), 181–189.
[24] R. Kubo, The fluctuation-dissipation theorem, Reports on Progress in Physics, 29 (1966). doi: 10.1088/0034-4885/29/1/306.
[25] C. Li, C. Chen, D. Carlson and L. Carin, Preconditioned stochastic gradient Langevin dynamics for deep neural networks, In Thirtieth AAAI Conference on Artificial Intelligence, 2016.
[26] Q. Li, C. Tai and E. Weinan, Stochastic modified equations and adaptive stochastic gradient algorithms, J. Mach. Learn. Res., 20 (2019), 47 pp.
[27] R. Li, H. Zha and M. Tao, Mean-square analysis with an application to optimal dimension dependence of Langevin Monte Carlo, preprint, arXiv: 2109.03839, 2021.
[28] M. Lichman et al., UCI Machine Learning Repository, 2013.
[29] Y.-A. Ma, T. Chen and E. B. Fox, A complete recipe for stochastic gradient MCMC, In Advances in Neural Information Processing Systems, (2015), 2917–2925.
[30] D. Maclaurin and R. P. Adams, Firefly Monte Carlo: Exact MCMC with subsets of data, In Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
[31] S. Mandt, M. D. Hoffman and D. M. Blei, Stochastic gradient descent as approximate Bayesian inference, J. Mach. Learn. Res., 18 (2017), 134, 35 pp.
[32] J. C. Mattingly, A. M. Stuart and M. V. Tretyakov, Convergence of numerical time-averaging and stationary measures via Poisson equations, SIAM J. Numer. Anal., 48 (2010), 552-577. doi: 10.1137/090770527.
[33] D. Needell, R. Ward and N. Srebro, Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, Math. Program., 155 (2016), 549-573. doi: 10.1007/s10107-015-0864-7.
[34] S. Patterson and Y. W. Teh, Stochastic gradient Riemannian Langevin dynamics on the probability simplex, Advances in Neural Information Processing Systems, (2013), 3102-3110.
[35] G. A. Pavliotis, Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations, Texts in Applied Mathematics, 60, Springer, New York, 2014. doi: 10.1007/978-1-4939-1323-7.
[36] G. O. Roberts and R. L. Tweedie, Exponential convergence of Langevin distributions and their discrete approximations, Bernoulli, 2 (1996), 341-363. doi: 10.2307/3318418.
[37] M. Schmidt, R. Babanezhad, M. Ahmed, A. Defazio, A. Clifton and A. Sarkar, Non-uniform stochastic average gradient method for training conditional random fields, Artificial Intelligence and Statistics, (2015), 819-828.
[38] M. Schmidt, N. L. Roux and F. Bach, Minimizing finite sums with the stochastic average gradient, Math. Program., 162 (2017), 83-112. doi: 10.1007/s10107-016-1030-6.
[39] M. Tao and T. Ohsawa, Variational optimization on Lie groups, with examples of leading (generalized) eigenvalue problems, AISTATS, (2020).
[40] Y. W. Teh, A. H. Thiery and S. J. Vollmer, Consistency and fluctuations for stochastic gradient Langevin dynamics, J. Mach. Learn. Res., 17 (2016), 7, 33 pp.
[41] S. J. Vollmer, K. C. Zygalakis and Y. W. Teh, Exploration of the (non-)asymptotic bias and variance of stochastic gradient Langevin dynamics, J. Mach. Learn. Res., 17 (2016), 159, 45 pp.
[42] M. Welling and Y. W. Teh, Bayesian learning via stochastic gradient Langevin dynamics, International Conference on Machine Learning, (2011), 681-688.
[43] A. G. Wilson, The case for Bayesian deep learning, preprint, arXiv: 2001.10995, 2020.
[44] R. Zhang, A. F. Cooper and C. De Sa, Asymptotically optimal exact minibatch Metropolis-Hastings, NeurIPS, (2020).
[45] R. Zhang and C. De Sa, Poisson-minibatching for Gibbs sampling with convergence rate guarantees, NeurIPS, (2019).
[46] P. Zhao and T. Zhang, Stochastic optimization with importance sampling for regularized loss minimization, International Conference on Machine Learning, (2015), 1-9.
[47] R. Zhu, Gradient-based sampling: An adaptive importance sampling for least-squares, Advances in Neural Information Processing Systems, (2016), 406-414.


Figure 1.  Performance quantification (Gaussian target)
Figure 2.  BLR learning curve
Figure 3.  BNN learning curve. Shade: one standard deviation.
Figure 4.  KL divergence
Figure 5.  Posterior prediction of mean (left) and standard deviation (right) of log likelihood on test data set generated by SGHMC, EWSG and EWSG-VR on two Bayesian logistic regression tasks. Statistics are computed based on 1000 independent simulations. Minibatch size $ b = 1 $ for all methods except FG. $ M = 1 $ for EWSG and EWSG-VR
Figure 6.  (a) Histogram of data used in each iteration for FlyMC algorithm. (b) Autocorrelation plot of FlyMC, EWSG and MH. (c) Samples of EWSG. (d) Samples of FlyMC
Table 1.  Accuracy, log likelihood and wall time of various algorithms on test data after one data pass (mean $ \pm $ std)
Method | SGLD | pSGLD | SGHMC | EWSG | FlyMC
Accuracy (%) | 75.283 $ \pm $ 0.016 | 75.126 $ \pm $ 0.020 | 75.268 $ \pm $ 0.017 | 75.306 $ \pm $ 0.016 | 75.199 $ \pm $ 0.080
Log Likelihood | -0.525 $ \pm $ 0.000 | -0.526 $ \pm $ 0.000 | -0.525 $ \pm $ 0.000 | -0.523 $ \pm $ 0.000 | -0.523 $ \pm $ 0.000
Wall Time (s) | 3.085 $ \pm $ 0.283 | 4.312 $ \pm $ 0.359 | 3.145 $ \pm $ 0.307 | 3.755 $ \pm $ 0.387 | 291.295 $ \pm $ 56.368
Table 2.  Test error (mean $ \pm $ standard deviation) after 200 epochs
Method | Test Error (%), MLP | Test Error (%), CNN
SGLD | 1.976 $ \pm $ 0.055 | 0.848 $ \pm $ 0.060
pSGLD | 1.821 $ \pm $ 0.061 | 0.860 $ \pm $ 0.052
SGHMC | 1.833 $ \pm $ 0.073 | 0.778 $ \pm $ 0.040
CP-SGHMC | 1.835 $ \pm $ 0.047 | 0.772 $ \pm $ 0.055
EWSG | 1.793 $ \pm $ 0.100 | 0.753 $ \pm $ 0.035
Table 3.  Test errors of EWSG and SGHMC after 200 epochs; each cell lists EWSG / SGHMC. $ b $ is the minibatch size for EWSG, and the minibatch size of SGHMC is set to $ b\times(M+1) $ so that both algorithms use the same number of data points per parameter update. The step size is set to $ h = \frac{10}{b(M+1)} $ as suggested in [12], which differs from that used to produce Table 2. The smaller test error in each cell is highlighted in boldface
$ b $ | $ M+1=2 $ | $ M+1=5 $ | $ M+1=10 $
$ 100 $ | 1.86% / 1.94% | 1.83% / 1.92% | 1.80% / 1.97%
$ 200 $ | 1.90% / 1.87% | 1.87% / 1.97% | 1.80% / 2.07%
$ 500 $ | 1.79% / 1.97% | 2.01% / 2.17% | 2.36% / 2.37%
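
For concreteness, a small sketch (assuming nothing beyond the formulas stated in the Table 3 caption) of the compute-matched settings used above: SGHMC's minibatch size is $ b\times(M+1) $ and the step size follows $ h = \frac{10}{b(M+1)} $.

```python
# Illustrative helper: compute-matched settings from the Table 3 caption.
# SGHMC minibatch = b*(M+1) equalizes data touched per update; h = 10/(b*(M+1)).
for b in (100, 200, 500):
    for m_plus_1 in (2, 5, 10):
        sghmc_batch = b * m_plus_1
        h = 10.0 / (b * m_plus_1)
        print(f"b={b:3d}, M+1={m_plus_1:2d} -> SGHMC minibatch={sghmc_batch:4d}, step size h={h:.5f}")
```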