#gptq


Hugging Face Blog

Compares the two quantization schemes natively supported in Hugging Face Transformers, bitsandbytes and auto-gptq, and lays out the trade-off between inference speed and fine-tuning performance.

Overview of natively supported quantization schemes in 🤗 Transformers

AI/ML · intermediate · 21 min read · September 12, 2023

Hugging Face Blog

Hugging Face integrated AutoGPTQ into Transformers, letting LLMs be quantized to 2- to 8-bit precision for roughly 4x memory savings.

Making LLMs lighter with AutoGPTQ and transformers

AI/ML · intermediate · 26 min read · August 23, 2023

Hugging Face Blog

Runs the Vicuna-13B model on a single AMD GPU with ROCm, using GPTQ 4-bit quantization to sharply reduce its memory requirement from 28 GB.

Run a Chatgpt-like Chatbot on a Single GPU with ROCm

AI/ML · intermediate · 23 min read · May 15, 2023