On an H100 SXM5 80GB running Llama 3.3 70B Instruct at FP8, SGLang serves 1,920 tokens per second at 50 concurrent requests, just 3.8% faster than vLLM’s 1,850 tok/s. But switch to Llama 3.1 8B and that gap widens to nearly 30%: SGLang hits 16,200 tok/s versus vLLM’s 12,500. The inference engine you …