Contextual Inference via Transformer-Based High-Dimensional Embeddings and Attention Mechanisms
How ChatGPT/Gemini/MS Copilot Understands Your Question: A Step-by-Step Journey from Input to Response
How Large Language Models Work — From Transformers to Conversational AI
Part 8 — Token-by-Token: Why AI Generates Text One Word at a Time (And Why It Costs 4x More)
My Self-Evolving AI Engine Generates Startup Ideas — Then Kills Most of Them
SubQ Model: Can Subquadratic Architectures Make Long-Context AI More Efficient?
DeepSeek-V4-Flash Benchmarks, FlashRT CUDA Runtime, & V100 LLM Performance
What Deep Learning Really Means — From Neural Networks to Modern AI
1,000x Claim, No Independent Proof: Subquadratic Architecture
How AI Works Under the Hood: LLMs Explained with Code
I Trained My Own LLM from Scratch in 2025: What That Viral HN Tutorial Doesn't Tell You About the Real Cost
Train Your Own LLM from Scratch
Part 2: Vector Embeddings in the Simplest Terms
Understanding Transformers – Part 17: Generating the Output Word
What is OpenAI's Parameter Golf Challenge, and why I spent a month on it
Understanding Text Similarity with Embeddings and Cosine Similarity
LLM Study Diary #1: Transformer
I Rebuilt Karpathy's NanoChat in JAX. Here's What XLA Gets Right and What It Gets Dead Wrong.
Show HN: TRiP – a complete transformer engine in C, built from scratch single-handedly
Understanding Transformers – Part 16: Preparing for Output Prediction with Residual Connections