Computing Self-Attention Values via Softmax-Based Weighting
Understanding Transformers Part 7: From Similarity Scores to Self-Attention
Exploring the Future of NLP: Trends, Techniques, and Tools in 2026
Mixture of Experts (MoEs) in Transformers
Tokenization in Transformers v5: Simpler, Clearer, and More Modular
Transformers v5: Simple model definitions powering the AI ecosystem
Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers
Transformers backend integration in SGLang
The Transformers Library: standardizing model definitions
Timm ❤️ Transformers: Use any timm model with transformers
Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo
Introducing SynthID Text
Fixing Gradient Accumulation
Tool Use, Unified
Memory-efficient Diffusion Transformers with Quanto and Diffusers
Announcing New Hugging Face and KerasHub integration
Our Transformers Code Agent beats the GAIA benchmark 🏅
Hugging Face on AMD Instinct MI300 GPU
Unlocking Longer Generation with Key-Value Cache Quantization
Total noob’s intro to Hugging Face Transformers
Patch Time Series Transformer in Hugging Face