Inserting Lorem Ipsum Improves RL Training Efficiency and Raises Math Benchmark Scores by 4.62 pts on Average
Lorem Ipsum Makes LLMs Smarter. No, Seriously.
Granite 4.1 LLMs: How They’re Built
DeepMath: A lightweight math reasoning Agent with smolagents
Kimina-Prover-RL
Vision Language Model Alignment in TRL ⚡️
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL
Open-R1: Update #1
Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial