Cutting Eval Costs by 60% with a GPU Warm Pool and Batching
Stop paying for idle GPUs in your CI: batching LLM eval jobs
Chat is Dead: How JSON Prompting Cut My AI Costs by 73%
Frontend rate limiting can save you $10,000
Teaching an AI to Pick Its Own Brain: Building Adaptive Model Routing
Defluffer - reduce token usage 📉 by 45% using this one simple trick! [Earthday challenge]
How I Cut My AI Chatbot Costs by 55% With One Architecture Change
YAML vs Markdown vs JSON vs TOON: Which Format Is Most Efficient for the Claude API
I was burning through AI tokens without realizing it. Here's how I fixed it.
I Replaced JSON with TOON in My LLM Prompts and Saved 40% on Tokens. Here's How
Universal Claude.md – cut Claude output tokens by 63%
Built a Caching Proxy for OpenAI — Saved 40% on API Bills