batch invariance Archives


LLM Inference Nondeterminism: Why Temperature 0 Fails You

July 2, 2026

LLM inference nondeterminism means identical prompts can return different outputs even at temperature 0. The main cause is dynamic batching: the server load decides how many requests share a batch, and the batch size changes the order of floating-point reductions inside GPU kernels that lack a property called batch invariance. Thinking Machines Lab measured 80 distinct completions from 1,000 identical requests on Qwen3-235B, and researchers documented up to …
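Why reduction order matters can be seen with a toy example in plain Python. This is only a sketch of the underlying floating-point effect, not a GPU kernel: the helper `reduce_in_chunks` is a hypothetical name, and its `chunk_size` parameter merely stands in for the way a kernel might split a reduction differently depending on batch size.

```python
def reduce_in_chunks(values, chunk_size):
    """Sum values chunk by chunk, then sum the per-chunk partial results.

    Hypothetical helper: chunk_size plays the role of a reduction split
    that a real kernel might choose differently for different batch sizes.
    """
    partials = []
    for start in range(0, len(values), chunk_size):
        acc = 0.0
        # Plain left-to-right accumulation inside one chunk.
        for v in values[start:start + chunk_size]:
            acc += v
        partials.append(acc)

    total = 0.0
    # Combine the partial sums, again left to right.
    for p in partials:
        total += p
    return total


# Same numbers, same math on paper; only the grouping differs.
values = [1e16, 1.0, -1e16, 1.0]

one_chunk = reduce_in_chunks(values, chunk_size=4)
two_chunks = reduce_in_chunks(values, chunk_size=2)

print(one_chunk, two_chunks)    # 1.0 0.0
print(one_chunk == two_chunks)  # False
```

Floating-point addition is not associative, so the two groupings round differently and disagree. In a model, a discrepancy of this kind in one logit can be enough to flip a greedy token choice, after which the rest of the completion diverges.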

Editorial team