© 2026 DevPick

#sglang

Dev.to

Cutting prefill cost by up to 80% and optimizing TTFT by adopting prefix caching

Prefix caching at scale: when it saves you 80% of prefill cost, and the eviction policies that quietly turn it into 5%

AI/ML, advanced, 26 min read, June 7, 2026

Dev.to

Achieving 6.4x LLM throughput by adopting Nemotron-Labs Diffusion

Diffusion Language Models: How NVIDIA Nemotron-Labs Diffusion Shatters the Autoregressive Speed Ceiling

AI/ML, advanced, 64 min read, May 23, 2026

Dev.to

Breaking through the limits of speculative decoding: parallel token generation implemented with DFlash

Speculative Decoding's Ceiling Just Moved With DFlash

AI/ML, advanced, 21 min read, April 7, 2026

Hugging Face Blog

SGLang integrates Hugging Face transformers as a backend, so models without native support can run immediately with high-performance inference

Transformers backend integration in SGLang

Backend, intermediate, 10 min read, June 23, 2025

Hugging Face Blog

The Open R1 project adopted SGLang on 512 H100 GPUs, doubling generation speed and producing 800k DeepSeek R1 reasoning traces

Open R1: Update #2

AI/ML, intermediate, 29 min read, February 10, 2025