
Convergence analysis of online learning algorithm with two-stage step size

Weilin Nie and Cheng Wang
Published/Copyright: August 30, 2021

Abstract

Online learning is a classical algorithm for optimization problems. Owing to its low computational cost, it is widely used throughout machine learning and statistical learning, and its convergence performance depends heavily on the step size. In this paper, a two-stage step size is proposed for the unregularized online learning algorithm based on reproducing kernels. Theoretically, we prove that such an algorithm can achieve a nearly minimax convergence rate, up to a logarithmic term, without any capacity condition.
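To make the setting concrete, the sketch below implements unregularized online gradient descent for least-squares regression in a reproducing kernel Hilbert space, with a two-stage step size: a constant step for an initial stage followed by a polynomially decaying step. The Gaussian kernel, the least-squares loss, and the schedule constants (T1, eta1, theta) are illustrative assumptions for this sketch, not the exact schedule analyzed in the paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) reproducing kernel (an illustrative choice)."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2.0 * sigma ** 2))

def online_kernel_regression(stream, T1, eta1, theta=0.5, sigma=1.0):
    """Unregularized online gradient descent in an RKHS for least squares,
    with a two-stage step size: constant eta1 for the first T1 samples,
    then a polynomially decaying step eta1 / (t - T1)**theta afterwards.

    `stream` yields (x_t, y_t); the iterate f_t is stored as kernel
    coefficients over the inputs seen so far.
    """
    xs, coefs = [], []
    for t, (x, y) in enumerate(stream, start=1):
        # Current prediction f_t(x_t) = sum_i c_i K(x_i, x_t).
        pred = sum(c * gaussian_kernel(xi, x, sigma) for xi, c in zip(xs, coefs))
        # Two-stage step size (illustrative schedule, not the paper's exact one).
        eta = eta1 if t <= T1 else eta1 / (t - T1) ** theta
        # Gradient step: f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .).
        xs.append(x)
        coefs.append(-eta * (pred - y))
    return xs, coefs

# Example usage on a synthetic regression stream.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [(x, np.sin(3 * x[0]) + 0.1 * rng.standard_normal())
            for x in rng.uniform(-1, 1, size=(200, 1))]
    xs, coefs = online_kernel_regression(iter(data), T1=50, eta1=0.5)
    f = lambda x: sum(c * gaussian_kernel(xi, x) for xi, c in zip(xs, coefs))
    print("f(0) =", f(np.array([0.0])))
```

The intuition behind a two-stage schedule is that the constant first stage moves the iterate quickly toward the regression function, while the decaying second stage suppresses the sampling noise in the later updates.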

Mathematics Subject Classification 2010: 68T05; 68Q32; 62J02; 41A46

Corresponding author: Cheng Wang, School of Mathematics and Statistics, Huizhou University, Huizhou 516007, P. R. China, E-mail: wangch@hzu.edu.cn

Funding source: Special Research Project on COVID-19 Prevention and Control in Colleges and Universities in Guangdong Province

Award Identifier / Grant number: 2020KZDZX1195

Funding source: Science and Technology Plan Project in Huizhou

Award Identifier / Grant number: 2020SD0402030

Funding source: Indigenous Innovation’s Capability Development Program of Huizhou University

Award Identifier / Grant number: HZU202003

Award Identifier / Grant number: HZU202020

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This work was supported in part by the Science and Technology Plan Project in Huizhou (No. 2020SD0402030), the Indigenous Innovation’s Capability Development Program of Huizhou University (project numbers HZU202003 and HZU202020), and the Special Research Project on COVID-19 Prevention and Control in Colleges and Universities in Guangdong Province (No. 2020KZDZX1195).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Received: 2020-07-10
Accepted: 2021-07-20
Published Online: 2021-08-30
Published in Print: 2023-02-23

© 2021 Walter de Gruyter GmbH, Berlin/Boston
