Analysis of Knowledge Access Control and Algorithmic Paternalism via Soft Refusal in Commercial LLMs
The Invisible Guardrail: How Commercial LLMs Enforce Algorithmic Paternalism
AI isn't a software upgrade. It's an organizational redesign.
You Don't Align an AI, You Align with It
The Sovereign Safety Gap: Why AI Alignment Must Be Contextual
I built a reward analysis tool for AI alignment: here's why reward hacking is harder to detect than you think
K501 - Human–Machine Resonance — Beyond Control, Toward Alignment
Stanford Tested 11 AI Chatbots for Advice. Every One Was a Yes-Man.
The AGI Horizon: From Tools to Teammates in the Future of Engineering