ํ”ผ๋“œ๋กœ ๋Œ์•„๊ฐ€๊ธฐ
Public AI on Hugging Face Inference Providers ๐Ÿ”ฅ
Hugging Face BlogHugging Face Blog
Backend

Hugging Face๊ฐ€ Public AI๋ฅผ Inference Provider๋กœ ํ†ตํ•ฉํ•ด vLLM ๋ฐฑ์—”๋“œ์™€ ๊ธ€๋กœ๋ฒŒ ๋กœ๋“œ ๋ฐธ๋Ÿฐ์‹ฑ์„ ํ†ตํ•œ ๋ถ„์‚ฐ ์ถ”๋ก  ์ธํ”„๋ผ ์ œ๊ณต

Public AI on Hugging Face Inference Providers ๐Ÿ”ฅ

2025๋…„ 9์›” 17์ผ7๋ถ„intermediate

Context

Hugging Face Hub์˜ ๋ชจ๋ธ ํŽ˜์ด์ง€์—์„œ ๋‹ค์–‘ํ•œ ์ถ”๋ก  ์ œ๊ณต์ž์— ์ ‘๊ทผํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๊ฐ๊ฐ์˜ API ํ‚ค ๊ด€๋ฆฌ์™€ ๋ณ„๋„์˜ ํ†ตํ•ฉ์ด ํ•„์š”ํ–ˆ๋‹ค. ๊ณต๊ฐœ AI ๋ชจ๋ธ๋“ค(Swiss AI Initiative, AI Singapore ๋“ฑ)์„ ํ™œ์šฉํ•˜๋ ค๋Š” ์‚ฌ์šฉ์ž๋“ค์ด ํ”Œ๋žซํผ ๊ฐ„ ์ ‘๊ทผ์„ฑ ๋ฌธ์ œ๋ฅผ ๊ฒช๊ณ  ์žˆ์—ˆ๋‹ค.

Technical Solution

  • Inference Provider ํ†ตํ•ฉ ๊ตฌ์กฐ ๋„์ž…: Public AI๋ฅผ Hugging Face Hub์˜ ๊ณต์‹ Inference Provider๋กœ ๋“ฑ๋กํ•˜์—ฌ ๋ชจ๋ธ ํŽ˜์ด์ง€์— ์ง์ ‘ ํ‘œ์‹œ
  • vLLM ๊ธฐ๋ฐ˜ ๋ฐฑ์—”๋“œ ๋ฐฐํฌ: OpenAI ํ˜ธํ™˜ API๋ฅผ ๋…ธ์ถœํ•˜๋Š” vLLM ์„œ๋ฒ„๋ฅผ ์—ฌ๋Ÿฌ ๊ตญ๊ฐ€์˜ ํŒŒํŠธ๋„ˆ ํด๋Ÿฌ์Šคํ„ฐ์— ๋ถ„์‚ฐ ๋ฐฐํฌ
  • ๊ธ€๋กœ๋ฒŒ ๋กœ๋“œ ๋ฐธ๋Ÿฐ์‹ฑ ๋ ˆ์ด์–ด ๊ตฌํ˜„: ์š”์ฒญ ๊ฒฝ๋กœ๋ฅผ ์ž๋™์œผ๋กœ ์ตœ์ ํ™”ํ•˜์—ฌ ์–ด๋А ๊ตญ๊ฐ€์˜ ์ปดํ“จํŒ… ๋ฆฌ์†Œ์Šค๊ฐ€ ์ฒ˜๋ฆฌํ•˜๋“  ํˆฌ๋ช…ํ•˜๊ฒŒ ๋ผ์šฐํŒ…
  • ๋“€์–ผ ์ธ์ฆ ๋ชจ๋“œ ์ง€์›: ์‚ฌ์šฉ์ž API ํ‚ค ์ง์ ‘ ์‚ฌ์šฉ(Custom key) ๋˜๋Š” Hugging Face๋ฅผ ํ†ตํ•œ ๋ผ์šฐํŒ…(Routed by HF) ๋‘ ๊ฐ€์ง€ ํ˜ธ์ถœ ๋ฐฉ์‹ ์ œ๊ณต
  • Python/JavaScript SDK ๋„ค์ดํ‹ฐ๋ธŒ ํ†ตํ•ฉ: huggingface_hub(>=0.34.6)๊ณผ @huggingface/inference ํŒจํ‚ค์ง€์—์„œ provider="publicai" ํŒŒ๋ผ๋ฏธํ„ฐ๋กœ ์ฆ‰์‹œ ์‚ฌ์šฉ ๊ฐ€๋Šฅ
  • Provider ์šฐ์„ ์ˆœ์œ„ ๊ด€๋ฆฌ: ์‚ฌ์šฉ์ž ๊ณ„์ • ์„ค์ •์—์„œ ์—ฌ๋Ÿฌ Inference Provider๋ฅผ ์„ ํ˜ธ๋„ ์ˆœ์„œ๋Œ€๋กœ ์„ค์ • ๊ฐ€๋Šฅ

Impact

Public AI Inference Utility๋ฅผ ํ†ตํ•œ ์‚ฌ์šฉ๋Ÿ‰์ด ํ˜„์žฌ ๋ฌด๋ฃŒ์ด๋ฉฐ, Hugging Face PRO ์‚ฌ์šฉ์ž๋Š” ๋งค์›” $2 ์ƒ๋‹น์˜ Inference ํฌ๋ ˆ๋”ง์„ ์ œ๊ณต๋ฐ›๋Š”๋‹ค.

Key Takeaway

๋ถ„์‚ฐ ์ถ”๋ก  ์ธํ”„๋ผ์—์„œ OpenAI ํ˜ธํ™˜ API์™€ ๊ธ€๋กœ๋ฒŒ ๋กœ๋“œ ๋ฐธ๋Ÿฐ์‹ฑ์„ ํ‘œ์ค€์œผ๋กœ ์ œ๊ณตํ•˜๋ฉด, ์‚ฌ์šฉ์ž๊ฐ€ ๋ณ„๋„ ์ฝ”๋“œ ์ˆ˜์ • ์—†์ด ์—ฌ๋Ÿฌ ์ œ๊ณต์ž๋ฅผ ๋™์ผํ•œ ์ธํ„ฐํŽ˜์ด์Šค๋กœ ์ „ํ™˜ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด๋Š” ๋ฉ€ํ‹ฐ ํด๋ผ์šฐ๋“œ/๋ฉ€ํ‹ฐ ๋ฆฌ์ „ ์•„ํ‚คํ…์ฒ˜ ์„ค๊ณ„ ์‹œ API ์ผ๊ด€์„ฑ๊ณผ ์ œ๊ณต์ž ๊ฐ„ ์œ ์—ฐํ•œ ์ „ํ™˜์„ ํ•ต์‹ฌ ์›์น™์œผ๋กœ ์‚ผ์•„์•ผ ํ•จ์„ ์‹œ์‚ฌํ•œ๋‹ค.


๋Œ€๊ทœ๋ชจ LLM ์ถ”๋ก  ์„œ๋น„์Šค๋ฅผ ์ œ๊ณตํ•˜๋Š” ์กฐ์ง์—์„œ OpenAI ํ˜ธํ™˜ API ํ‘œ์ค€ํ™”์™€ ํ•จ๊ป˜ ํด๋ผ์ด์–ธํŠธ SDK์— provider ์„ ํƒ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ถ”๊ฐ€ํ•˜๋ฉด, ์ถ”๋ก  ์ œ๊ณต์ž ๋ณ€๊ฒฝ ์‹œ ๋น„์ฆˆ๋‹ˆ์Šค ๋กœ์ง ์ˆ˜์ • ์—†์ด ๋‹จ์ผ ์ค„์˜ ์ฝ”๋“œ ๋ณ€๊ฒฝ๋งŒ์œผ๋กœ ๋Œ€์ฒด ๊ฐ€๋Šฅํ•œ ์œ ์—ฐ์„ฑ์„ ํ™•๋ณดํ•  ์ˆ˜ ์žˆ๋‹ค.

์›๋ฌธ ์ฝ๊ธฐ