© 2026 DevPick

#bloom


Hugging Face Blog

The Hugging Face team ported the model trained with Megatron-DeepSpeed to Transformers, then used Pipeline Parallelism, Accelerate, and CUDA kernel optimizations to cut BLOOM inference latency by 5x and raise throughput by 50x.

Optimization story: Bloom inference

Backend, advanced, 58 min read, October 12, 2022

Hugging Face Blog

Hugging Face used DeepSpeed and Accelerate to bring token generation for the 176B-parameter BLOOM model down to 0.69 msec per token (a throughput figure).

Incredibly Fast BLOOM Inference with DeepSpeed and Accelerate

Backend, advanced, 24 min read, September 16, 2022