NVIDIA NIM Economics: Where Self-Host Beats Every API

A single NVIDIA H100 GPU running a self-hosted NIM container costs roughly $1,950 per month on RunPod at $2.69 per hour, yet serves the same OpenAI-compatible /v1/chat/completions endpoint as GPT-4.1, which bills $6 per million blended tokens. The crossover where NIM beats every per-token API sits around 300–500 million …
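
The break-even arithmetic behind those figures can be sketched in a few lines of Python. This is a minimal illustration, not a full cost model: the function names are hypothetical, the 730-hour month is an assumption (8,760 hours per year divided by 12, with the GPU rented around the clock), and the sketch assumes the single GPU can actually sustain the break-even volume.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 h / 12), GPU rented 24/7


def monthly_gpu_cost(hourly_rate_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Fixed monthly rental cost of one always-on GPU."""
    return hourly_rate_usd * hours


def breakeven_million_tokens(monthly_cost_usd: float, api_usd_per_million: float) -> float:
    """Monthly volume (in millions of tokens) at which API spend equals the fixed GPU cost."""
    if api_usd_per_million <= 0:
        raise ValueError("API price must be positive")
    return monthly_cost_usd / api_usd_per_million


if __name__ == "__main__":
    # Figures quoted in the text: $2.69 per hour on RunPod, $6 per million blended tokens.
    cost = monthly_gpu_cost(2.69)
    print(f"GPU cost per month: ${cost:,.2f}")
    print(f"Break-even: {breakeven_million_tokens(cost, 6.0):,.0f}M tokens/month")
```

Using the rounded $1,950 monthly figure and the $6 blended rate, the break-even lands at 325 million tokens per month. Below that volume the per-token API is cheaper; above it, the fixed GPU cost wins, provided the hardware keeps up with the load.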