Deploying 🤗 ViT on Vertex AI
Hugging Face Blog
Backend

Deploying a Vision Transformer model on Google Cloud's Vertex AI platform with far less code than a Kubernetes deployment

August 19, 2022 | 10 min | intermediate

Context

When deploying a Vision Transformer model to production, both local TensorFlow Serving and deployment on a Kubernetes cluster forced a trade-off between scalability and management complexity. Both approaches also required substantial infrastructure management and configuration.

Technical Solution

  • Vision Transformer B/16 ๋ชจ๋ธ์„ SavedModel ํ˜•์‹์œผ๋กœ ์ง๋ ฌํ™”ํ•˜๋˜, base64 ์ธ์ฝ”๋”ฉ๋œ ์ด๋ฏธ์ง€ ์ž…๋ ฅ์„ ๋ฐ›์•„ 224x224 ๋ฆฌ์‚ฌ์ด์ฆˆ ๋ฐ [-1, 1] ๋ฒ”์œ„ ์ •๊ทœํ™”๋ฅผ ๋‚ด์žฅ: ์„œ๋น™-ํ•™์Šต ๊ฐ„ ์ฐจ์ด ์ตœ์†Œํ™”
  • Google Cloud Storage(GCS) ๋ฒ„ํ‚ท์— ๋ชจ๋ธ ์•„ํ‹ฐํŒฉํŠธ ์ €์žฅ: ์ค‘์•™ํ™”๋œ ๋ชจ๋ธ ์ €์žฅ์†Œ ๊ตฌํ˜„
  • Vertex AI Model Registry์— SavedModel ์—…๋กœ๋“œ: ๋ชจ๋ธ ๋ฒ„์ „ ๊ด€๋ฆฌ ๋ฐ ๊ณ ๊ฐ€์šฉ์„ฑ ๋ณด์žฅ
  • Vertex AI Endpoint ์ƒ์„ฑ ๋ฐ ๋ฐฐํฌ: ์ž๋™ ํŠธ๋ž˜ํ”ฝ ๊ธฐ๋ฐ˜ ์˜คํ† ์Šค์ผ€์ผ๋ง, ๋ฒ„์ „ ๊ฐ„ ํŠธ๋ž˜ํ”ฝ ๋ถ„์‚ฐ, ๋ชจ๋‹ˆํ„ฐ๋ง ๋ฐ ๋กœ๊น… ์ง€์›
  • google-cloud-aiplatform Python SDK ํ™œ์šฉํ•˜์—ฌ 4๋‹จ๊ณ„ ๋ฐฐํฌ ์›Œํฌํ”Œ๋กœ์šฐ ๊ตฌํ˜„: ModelServiceClient, EndpointServiceClient, PredictionServiceClient๋กœ ๋ชจ๋ธ ์—…๋กœ๋“œ, ์—”๋“œํฌ์ธํŠธ ์ƒ์„ฑ, ๋ฐฐํฌ, ์˜ˆ์ธก ์š”์ฒญ ์ฒ˜๋ฆฌ
  • n1-standard-8 ๋จธ์‹  ํƒ€์ž…(8 vCPU, 32GB RAM) + NVIDIA_TESLA_T4 GPU ์‚ฌ์šฉ

Impact

The Vertex AI deployment reaches the same level of scalability as the Kubernetes-based deployment with "significantly less code".

Key Takeaway

Vertex AI removes the complexity of infrastructure management through declarative, configuration-based deployment, while providing in one place the features that ML operations depend on: authentication, autoscaling, model versioning, traffic splitting, and monitoring. The same workflow applies not only to Vision Transformer but also to newer models such as SegFormer.
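
To make "declarative, configuration-based" concrete, here is a hedged sketch of the kind of settings a deployment request carries, built as plain dictionaries so that it runs without the SDK or credentials. The field names are intended to mirror the Vertex AI DeployedModel, DedicatedResources, and MachineSpec resources and should be checked against the current API reference. The helper names, the replica counts, the display name, and the placeholder resource name are hypothetical, and the SDK may expect the accelerator type as its enum rather than a string.

```python
def build_deployed_model_config(
    model_name: str,
    display_name: str,
    machine_type: str = "n1-standard-8",
    accelerator_type: str = "NVIDIA_TESLA_T4",
    accelerator_count: int = 1,
    min_replicas: int = 1,
    max_replicas: int = 1,
) -> dict:
    """Describe what to deploy and on which hardware (hypothetical helper)."""
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    return {
        "model": model_name,
        "display_name": display_name,
        "dedicated_resources": {
            # Autoscaling moves the replica count between these two bounds.
            "min_replica_count": min_replicas,
            "max_replica_count": max_replicas,
            "machine_spec": {
                "machine_type": machine_type,
                "accelerator_type": accelerator_type,
                "accelerator_count": accelerator_count,
            },
        },
    }


def validate_traffic_split(traffic_split: dict) -> dict:
    """Check that the traffic percentages add up to 100."""
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"traffic split must sum to 100, got {total}")
    return traffic_split


config = build_deployed_model_config(
    model_name="projects/PROJECT/locations/REGION/models/MODEL_ID",  # placeholder
    display_name="vit-base-deployed",  # hypothetical name
    max_replicas=3,  # assumed value, not from the post
)
# In a deploy request, the key "0" stands for the model being deployed now.
traffic_split = validate_traffic_split({"0": 100})
```

Everything operational here is data: changing the machine type, the replica bounds, or the share of traffic per version is an edit to these values rather than to cluster manifests.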


ML engineers who deploy TensorFlow-based vision models to production can choose Vertex AI as follows: embed the preprocessing and postprocessing in the SavedModel, store it in GCS, and implement the four-step API of the google-cloud-aiplatform SDK (model upload → endpoint creation → deployment → prediction request). This provides autoscaling and monitoring without the overhead of managing Kubernetes.
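
The last of the four steps, the prediction request, can be sketched as follows. This standard-library example only builds the request instances and does not call Vertex AI. The `{"b64": ...}` wrapper is a common JSON convention for sending binary inputs to TensorFlow models; the input key `image_input`, the helper name, and the stand-in bytes are hypothetical, and the key must match the serving signature of the exported SavedModel.

```python
import base64
import json


def build_instances(image_bytes: bytes, input_name: str = "image_input") -> list:
    """Wrap raw image bytes as one JSON-serializable prediction instance.

    input_name is a hypothetical key; use the name from the model's signature.
    """
    b64_string = base64.b64encode(image_bytes).decode("utf-8")
    return [{input_name: {"b64": b64_string}}]


# Stand-in bytes for illustration; a real client would read an image file.
fake_image = b"\xff\xd8\xff\xe0 placeholder bytes, not a real JPEG"
instances = build_instances(fake_image)

# JSON body in the shape of a REST prediction request.
request_body = json.dumps({"instances": instances})
```

Because the payload is plain JSON with a base64 string, the client needs no image library or knowledge of the model's input resolution.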
