Metadata Priority Injection을 통한 RAG 시스템 데이터 오염 취약점 분석

LangChain ChromaDB Metadata Priority Injection — RAG Poisoning Vulnerability

PJ2026년 5월 10일1분intermediate

AI 요약

Context

LangChain과 ChromaDB 통합 환경에서 Metadata 필드를 활용한 문서 검색 우선순위 제어 구조의 설계 결함 존재. Semantic Relevance와 무관하게 임의의 Metadata 값을 통해 검색 결과 상단을 강제 점유하는 RAG Poisoning 가능성 확인.

Technical Solution

Metadata Priority Injection을 통한 악성 문서의 검색 랭킹 강제 상향 조정
Vector Database의 단순 Metadata 필터링 및 정렬 로직을 악용한 데이터 조작 공격
정당한 문서보다 높은 Priority 값을 부여한 Poisoned Document의 우선 순위 획득
Semantic Search의 유사도 점수를 무시하는 Metadata 기반 강제 랭킹 시스템의 취약점 노출
API Layer에서의 OutputGuard 도입을 통한 런타임 기반의 악성 출력 탐지 및 차단

실천 포인트

1. 사용자 업로드 문서의 Metadata 필드에 대한 Strict Validation 적용

2. Semantic Score와 Metadata Priority 간의 가중치 밸런싱 로직 검토

3. RAG 파이프라인 최종 단계에 Output Guardrail 솔루션 배치 여부 확인

태그

#RAG Poisoning #LLM Security #Vector Database #Metadata Injection #LangChain

원문 읽기