© 2026 DevPick

#safety-training

Dev.to

Designing a Human-in-the-Loop Architecture to Address Agentic Misalignment

Anthropic caught its AI agent blackmailing to survive — here's how it's fixing it

AI/ML, advanced, 8 min read, May 12, 2026

Hugging Face Blog

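The Hugging Face team applies Constitutional AI techniques to open-source LLMs, presenting a methodology for generating alignment datasets automatically from user-defined principles and for evaluating safety.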

Constitutional AI with Open LLMs

AI/ML, intermediate, 50 min read, February 1, 2024