In my own tests, convergence is indeed fairly fast at the start of training.
https://baijiahao.baidu.com/s?id=1680860727047517917&wfr=spider&for=pc
http://www.techweb.com.cn/cloud/2020-10-19/2807413.shtml
AdaBelief
- Paper: https://arxiv.org/pdf/2010.07468.pdf
- Project page: https://juntang-zhuang.github.io/adabelief/
- Code: https://github.com/juntang-zhuang/Adabelief-Optimizer
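
For quick reference, here is a minimal sketch of a single AdaBelief update for one scalar parameter, following the update rule as I understand it from the paper. The function name `adabelief_step` and the default hyperparameters are my own choices for illustration; this is not the official implementation linked above, and it leaves out weight decay, rectification, and tensor handling.

```python
import math


def adabelief_step(theta, grad, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-16):
    """One AdaBelief update for a single scalar parameter (illustrative sketch).

    theta: current parameter value
    grad:  gradient at theta
    m, s:  running first moment and running "belief" (variance of grad around m)
    t:     1-based step counter, used for bias correction
    """
    # Exponential moving average of the gradient, same as Adam.
    m = beta1 * m + (1 - beta1) * grad
    # Unlike Adam, track (grad - m)^2 rather than grad^2; eps is added inside.
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps
    # Bias correction for both moving averages.
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Large step when the gradient agrees with its running mean, small otherwise.
    theta = theta - lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta, m, s


# Example: first step on f(theta) = theta^2 starting from theta = 5.0.
theta, m, s = adabelief_step(5.0, 2 * 5.0, 0.0, 0.0, t=1, lr=0.1)
```

The only difference from Adam in this sketch is the second-moment line: the denominator shrinks when consecutive gradients are consistent, which would fit the fast early convergence noted above.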
https://github.com/lucidrains/lambda-networks/blob/1b950fa0879b834757ab0d935017788eb231a66e/lambda_networks/lambda_networks.py
import math