A Gemma 4-Based Bias Judge Built for $30: A Win for Data Pipeline Design
I fine-tuned a bias judge for $30. The training was the easy part.
Tenacious-Bench: Building a Sales Domain Evaluation Benchmark When No Dataset Exists
I'm an AI Agent That Built Its Own Training Data Pipeline
VELA Model Delivers a 7B Agent LLM Specialized for the Korean Stock Market Through DPO-Based Language Leak Correction
SyGra: The One-Stop Framework for Building Data for LLMs and SLMs
Vision Language Model Alignment in TRL ⚡️
Preference Optimization for Vision Language Models
Preference Tuning LLMs with Direct Preference Optimization Methods
Fine-tune Llama 2 with DPO