Hugging Face achieves up to 2x throughput gains in padding-free sequence training by combining DataCollatorWithFlattening with Flash Attention 2
Improving Hugging Face Training Efficiency Through Packing with Flash Attention 2
Fine-Tune ViT for Image Classification with 🤗 Transformers
Boosting Wav2Vec2 with n-grams in 🤗 Transformers
Hugging Face on PyTorch / XLA TPUs
Hyperparameter Search with Transformers and Ray Tune