Beyond Keywords: Mastering HyDE for Smarter Retrieval 🧠
Dev.to
AI/ML

A HyDE-Based Hypothetical Document Embedding Strategy for Solving Asymmetric Retrieval

Rushank Savant · May 10, 2026 · 7 min read · intermediate

Context

Users phrase queries in everyday language, while documents are written in specialist terminology. The resulting distance between query vectors and document vectors is the Asymmetric Retrieval problem. A RAG system built on plain Vector Search tends to miss relevant documents when the query and the document do not share keywords.
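To make the mismatch concrete, here is a toy sketch. It assumes a bag-of-words embedding as a stand-in for a real embedding model (a real model is less brittle, but the same tendency applies), and the two documents and the query are invented for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical corpus: the relevant document uses legal vocabulary.
docs = {
    "eviction": "The lessor may terminate the tenancy upon material breach by the lessee.",
    "parking": "My car can be parked in the visitor lot if the landlord agrees.",
}

query = "can my landlord kick me out"
scores = {name: cosine(embed(query), embed(text)) for name, text in docs.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

The informal query shares no words with the eviction clause, so the irrelevant parking note ranks first.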

Technical Solution

  • Add a step that uses Few-shot Prompting to convert the user query into a hypothetical answer (Hypothetical Document) written in the style of a professional document
  • Instead of searching with the query itself, run the similarity search in the Vector Store using the vector of the LLM-generated hypothetical document
  • Turn the imbalance between question (Short/Informal) and answer (Long/Professional) into a Symmetric comparison to improve retrieval precision
  • Optimize domain-specific retrieval by designing prompts that reflect the Linguistic DNA of the specialist field (Legal, for example)
  • Build a two-stage pipeline that combines the generative ability of the LLM with the similarity computation of the Embedding model
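The steps above can be sketched as a two-stage function. This is a minimal illustration under stated assumptions, not the author's implementation: `fake_llm` is a hypothetical stand-in that returns a canned passage where a real LLM call would go, `embed` is a toy bag-of-words embedding rather than a real model, and the prompt text and documents are invented.

```python
import math
import re
from collections import Counter

# Few-shot prompt: one example showing the target (legal) style.
FEW_SHOT_PROMPT = """Rewrite the question as a short passage in the style of a legal document.

Question: can my boss fire me for no reason
Passage: The employer may terminate the employment contract without cause, subject to statutory notice.

Question: {question}
Passage:"""

def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM call; returns a canned passage."""
    return "The lessor may terminate the tenancy and evict the lessee upon material breach."

def embed(text):
    """Toy bag-of-words embedding (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def hyde_search(question, docs, llm=fake_llm):
    # Stage 1: generate a hypothetical document in the corpus's style.
    hypothetical = llm(FEW_SHOT_PROMPT.format(question=question))
    # Stage 2: embed the hypothetical document, not the question, and rank.
    vec = embed(hypothetical)
    return max(docs, key=lambda name: cosine(vec, embed(docs[name])))

docs = {
    "eviction": "The lessor may terminate the tenancy upon material breach by the lessee.",
    "parking": "My car can be parked in the visitor lot if the landlord agrees.",
}
result = hyde_search("can my landlord kick me out", docs)
print(result)
```

Because the hypothetical passage is written in the same register as the corpus, the comparison becomes document-to-document, and the eviction clause ranks first for the informal question.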

- Check whether the domain has a large terminology gap between users and documents
- Review the extra 1 to 2 seconds of Latency and the added token cost from the additional LLM call per query
- Rule HyDE out when the main purpose is looking up numerical data or exact facts, because of the Hallucination risk
- Use Few-shot examples to control precisely the style of the hypothetical document the LLM generates
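Two of these checks can be enforced in code: skipping HyDE for fact-lookup queries and measuring the latency the extra call adds. The sketch below is one possible shape. The regex heuristic in `looks_like_fact_lookup` is a crude assumption for illustration, and `generate_hypothetical` is a hypothetical callable standing in for the LLM step.

```python
import re
import time

def looks_like_fact_lookup(query):
    """Crude heuristic (an assumption): digits or quantity/date words
    suggest an exact lookup where a hallucinated passage could mislead."""
    pattern = r"\d|\b(how much|how many|what date|when)\b"
    return bool(re.search(pattern, query.lower()))

def timed(fn, *args):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def choose_search_text(query, generate_hypothetical):
    """Return (text to embed, seconds spent generating it)."""
    if looks_like_fact_lookup(query):
        return query, 0.0  # bypass HyDE, search with the raw query
    return timed(generate_hypothetical, query)

# Hypothetical generator standing in for the LLM call.
stub = lambda q: "The lessor may terminate the tenancy upon material breach."

text_a, cost_a = choose_search_text("what was revenue in 2023", stub)
text_b, cost_b = choose_search_text("can my landlord kick me out", stub)
print(text_a, cost_a)
print(text_b, cost_b)
```

Logging the elapsed time per query gives a direct measurement to compare against the latency budget before committing to HyDE.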
