An Analysis of the LLM Text Generation Mechanism Through the Autoregressive Loop and Causal Masking
How Transformer Decoders Generate Text — From Causal Masking to Decoding
Chapter 6: Embeddings, the Forward Pass, and the Loss Function
Chapter 5: Linear Transformation and Softmax