Cutting Operating Costs by 66% and Eliminating API Dependency by Adopting a Local LLM Based on Gemma 4 9B
I Ran Gemma 4 on a $7/Month Server and Built an AI-Powered News Monitor That Costs $0 to Operate
I Ran Hermes Agent Locally on CPU-Only Hardware With llamafile — No GPU, No Server, No Cloud API
Aximo - offline-first STT API
Aximo — a local Rust STT API for CPU-only inference
Accelerating Stable Diffusion Inference on Intel CPUs
Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs
Scaling-up BERT Inference on CPU (Part 1)