FULL CUDA graph capture hangs distributed TP inference on 2026-02-09 build

Distributed TP inference with gpt-oss-120b hangs after 1-2 requests on the new 2026-02-09 build (PyTorch 2.10 + Triton 3.6.0). The hang is specific to FULL CUDA graph capture mode. Send 2-3 chat ...
