Revision as of 01:12, 1 July 2025 by Citation bot (Bots, 5,870,518 edits): Alter: template type, page, title. Add: chapter-url-access, chapter-url, chapter, pages, arxiv, authors 1-1. Removed or converted URL. Removed parameters. Some additions/deletions were parameter name changes. Removed Template redirect. | Suggested by Abductive | Category:Wikipedia articles that are too technical from June 2025 | #UCB_Category 14/37
Revision as of 22:47, 1 July 2025 by Citation bot (Bots, 5,870,518 edits): Removed URL that duplicated identifier. Removed access-date with no URL. | Suggested by Abductive | Category:Uncategorized from June 2025 | #UCB_Category 93/116
Line 41:
== Research, Variants, and Enhancements ==
Active research on SAM focuses on reducing its computational overhead and improving its performance. Several variants aim to make the algorithm more efficient: they parallelize the two gradient computations, apply the perturbation to only a subset of the parameters, or reduce the number of computation steps required.<ref name="Dou2022SAMPa">{{cite arXiv |eprint=2410.10683 |class=cs.LG |first1=Wanyun |last1=Xie |first2=Thomas |last2=Pethick |title=SAMPa: Sharpness-aware Minimization Parallelized |last3=Cevher |first3=Volkan |year=2024}}</ref><ref name="u277">{{citation |last1=Mi |first1=Peng |title=Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach |date=2022 |page= ~~|url=https://arxiv.org/abs/2210.05177 |access-date=2025-06-26~~ |arxiv=2210.05177 |last2=Shen |first2=Li |last3=Ren |first3=Tianhe |last4=Zhou |first4=Yiyi |last5=Sun |first5=Xiaoshuai |last6=Ji |first6=Rongrong |last7=Tao |first7=Dacheng }}</ref><ref name="k651">{{cite conference |last1=Ji |first1=Jie |last2=Li |first2=Gen |last3=Fu |first3=Jingjing |last4=Afghah |first4=Fatemeh |last5=Guo |first5=Linke |last6=Yuan |first6=Xiaoyong |last7=Ma |first7=Xiaolong |date=2025-06-05 |title=Proceedings of the 38th International Conference on Neural Information Processing Systems |url=https://dl.acm.org/doi/10.5555/3737916.3739321 |publisher=Curran Associates Inc. |publication-place=Red Hook, NY, USA |volume=37 |page= |pages=44269–44290 |isbn=979-8-3313-1438-5 |access-date=2025-06-26}}</ref> Other approaches reuse historical gradient information or apply SAM steps only intermittently to lower the computational burden.<ref name="Liu2022LookaheadSAM">{{cite conference |last1=Yu |first1=Runsheng |last2=Zhang |first2=Youzhi |last3=Kwok |first3=James |year=2024 |title=Improving Sharpness-Aware Minimization by Lookahead |url=https://proceedings.mlr.press/v235/yu24q.html |conference= |book-title=Proceedings of the 41st International Conference on Machine Learning (ICML)}}</ref> To improve performance and robustness, other variants scale the neighborhood according to the magnitude of the model parameters (Adaptive SAM, or ASAM)<ref name="Kwon2021ASAM"/> or incorporate information about the curvature of the loss landscape (Curvature Regularized SAM, or CR-SAM). Further research refines the perturbation step by focusing on specific components of the gradient, or combines SAM with techniques such as random smoothing.<ref name="m141">{{cite conference |last1=Li |first1=Tao |last2=Zhou |first2=Pan |last3=He |first3=Zhengbao |last4=Cheng |first4=Xinwen |last5=Huang |first5=Xiaolin |title=2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |date=2024-06-16 |chapter=Friendly Sharpness-Aware Minimization |page= |chapter-url=https://ieeexplore.ieee.org/document/10657696 |publisher=IEEE |pages=5631–5640 |doi=10.1109/CVPR52733.2024.00538 |isbn=979-8-3503-5300-6 |access-date=2025-06-26|chapter-url-access=subscription }}</ref><ref name="t248">{{cite journal |last1=Liu |first1=Yong |last2=Mai |first2=Siqi |last3=Cheng |first3=Minhao |last4=Chen |first4=Xiangning |last5=Hsieh |first5=Cho-Jui |last6=You |first6=Yang |date=2022-12-06 |title=Random Sharpness-Aware Minimization |url=https://papers.nips.cc/paper_files/paper/2022/hash/9b79416c0dc4b09feaa169ed5cdd63d4-Abstract-Conference.html |journal=Advances in Neural Information Processing Systems |volume=35 |pages=24543–24556 |access-date=2025-06-26}}</ref>
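As an illustration of how such variants modify the basic update, the following is a minimal sketch, not any cited paper's reference implementation. It assumes a plain-Python parameter list and a toy quadratic loss; the names `sam_step` and `quad_grad`, the default values of `lr` and `rho`, and the fixed `mask` are hypothetical choices made for this example. Setting `adaptive=True` scales the perturbation by the parameter magnitudes in the spirit of ASAM, and passing a 0/1 `mask` restricts the perturbation to a subset of parameters, a simplification of the sparsified-perturbation idea.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05, adaptive=False, mask=None, eps=1e-12):
    """One SAM-style update on a list of floats (illustrative sketch only)."""
    g = grad_fn(w)  # first gradient, taken at the current weights
    # Per-coordinate scale: |w_i| for the ASAM-like variant, 1 for plain SAM.
    t = [abs(wi) for wi in w] if adaptive else [1.0] * len(w)
    norm = math.sqrt(sum((ti * gi) ** 2 for ti, gi in zip(t, g))) + eps
    # Ascent perturbation rho * T^2 g / ||T g||; reduces to rho * g / ||g|| when T = 1.
    e = [rho * ti * ti * gi / norm for ti, gi in zip(t, g)]
    if mask is not None:
        # Assumed simplification: perturb only the coordinates where mask is 1.
        e = [ei * mi for ei, mi in zip(e, mask)]
    # Second gradient, taken at the perturbed weights, drives the descent step.
    g_adv = grad_fn([wi + ei for wi, ei in zip(w, e)])
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def quad_grad(w):
    # Gradient of the toy loss 0.5 * (10 * w0**2 + w1**2).
    return [10.0 * w[0], w[1]]

w_plain = sam_step([1.0, 1.0], quad_grad)
w_sparse = sam_step([1.0, 1.0], quad_grad, mask=[0.0, 1.0])
w_adaptive = sam_step([2.0, 0.5], quad_grad, adaptive=True)
```

Each call evaluates `grad_fn` twice, which is the overhead that the efficiency-oriented variants target. With `rho=0` the step reduces to ordinary gradient descent.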

Sharpness aware minimization: Difference between revisions