Research 2026-04-24
GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR
Source: arXiv cs.AI
arXiv:2601.09361v3 (announce type: replace-cross)
Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) is a key paradigm for improving large-scale reasoning models. Unlike supervised fine-tuning (SFT), RLVR exhibits distinct optimization dynamics and is sensitive to the preservation of...
arxiv, papers