Identifying the Reversal Curse in auto-regressive LLMs and confirming a GPT-4 accuracy gap of 79% vs. 33%
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Understanding Decoder-Only Transformers Part 1: Masked Self-Attention
Chapter 12: Inference - Generating New Text
Understanding Transformers Part 17: Generating the Output Word