An Improved Adaptive Optimization Algorithm Based on Nesterov Acceleration
Author: QIAN Zhen, LI Dequan
Abstract:

Objective Traditional optimization algorithms suffer from low training efficiency when training deep learning models, as the number of model parameters keeps growing and networks keep getting deeper. To address this issue, a Nadabelief optimization algorithm based on Nesterov acceleration was proposed to improve the efficiency of model training. Methods Firstly, the Adabelief algorithm was adopted in place of the Adam algorithm to alleviate the generalization problem. Secondly, starting from the classical momentum (first-order moment) term, the Nesterov momentum acceleration mechanism was incorporated into the Adabelief algorithm: during gradient updates, not only the gradient at the current step is considered, but the historically accumulated gradient is also used to correct the magnitude of the update, further improving the efficiency of the algorithm. Finally, a regret bound for the algorithm was derived through theoretical analysis, ensuring its convergence. Results To verify the performance of the algorithm, Logistic regression experiments were conducted in the convex setting, and image classification and language modeling experiments were carried out in the non-convex setting. Comparisons with algorithms such as Adam and Adabelief demonstrated the superiority of the Nadabelief algorithm, and tests under different initial learning rates confirmed its robustness. Conclusion The experiments show that the proposed algorithm retains the generalization ability of the original Adabelief algorithm while achieving better convergence accuracy, further improving the efficiency of training deep learning models.
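The abstract describes Nadabelief as AdaBelief with a Nesterov (look-ahead) correction applied to the first-moment momentum term, so each step uses both the current gradient and the historically accumulated gradient. Below is a minimal sketch of one such update step, assuming a NAdam-style Nesterov mixing on top of the standard AdaBelief moments; the function name, state layout, and hyperparameter defaults are illustrative and not the paper's exact algorithm.

```python
import torch

def nadabelief_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadabelief-style update (sketch, not the paper's exact pseudocode)."""
    state['step'] += 1
    t = state['step']
    m, s = state['m'], state['s']

    # First moment: exponential moving average of gradients (classical momentum term).
    m.mul_(beta1).add_(grad, alpha=1 - beta1)

    # AdaBelief second moment: EMA of the squared deviation of the gradient
    # from its running mean (the "belief" in the observed gradient).
    diff = grad - m
    s.mul_(beta2).addcmul_(diff, diff, value=1 - beta2).add_(eps)

    # Bias corrections for the zero-initialized moment estimates.
    bc1 = 1 - beta1 ** t
    bc2 = 1 - beta2 ** t

    # Nesterov look-ahead: blend the accumulated momentum with the current
    # gradient, so the step "peeks ahead" along the momentum direction.
    m_nesterov = beta1 * m + (1 - beta1) * grad
    m_hat = m_nesterov / bc1
    s_hat = s / bc2

    # Parameter update: adaptive step scaled by the belief term.
    param.addcdiv_(m_hat, s_hat.sqrt().add_(eps), value=-lr)
    return param

# Usage sketch: the state dict holds the step counter and both moment estimates.
# w = torch.randn(10)
# state = {'step': 0, 'm': torch.zeros_like(w), 's': torch.zeros_like(w)}
# nadabelief_step(w, torch.randn(10), state)
```

For context, the regret bound mentioned in the abstract refers to the standard online-learning regret R(T) = Σ_{t=1}^{T} f_t(θ_t) − min_θ Σ_{t=1}^{T} f_t(θ); showing that R(T) grows sublinearly in T is the usual way such adaptive methods are shown to converge.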

Cite this article

QIAN Zhen, LI Dequan. An Improved Adaptive Optimization Algorithm Based on Nesterov Acceleration[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2025, 42(3): 44-51.

History
  • Online publication date: 2025-05-14