Add slime RL post-training image#125

Draft
abatilo wants to merge 2 commits into abatilo/sglang from abatilo/slime

@abatilo abatilo commented Feb 24, 2026

WIP: Add slime RL post-training image

Adds Docker build infrastructure for THUDM/slime, an RL post-training framework that coordinates Megatron-LM (training) + SGLang (rollout inference) via Ray.

What this does

Layers the slime training stack on top of the existing sglang image:

| Component | Version / commit | Why |
| --- | --- | --- |
| TransformerEngine | 2.10.0 | Upgraded from base 2.4 (slime requirement) |
| Apex | 10417ace | Slime-pinned commit (different from torch-extras) |
| SGLang (patched) | 4b6f62e2 | 42-file patch for RL weight sync, scheduling, memory management |
| Megatron-LM (patched) | 3714d81d | 17-file patch for MoE routing replay, sandwich norm, MTP fixes |
| slime | b964eedc | RL post-training framework + int4_qat CUDA kernel |
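
A minimal sketch of how the one version-pinned row (TransformerEngine 2.10.0) might be checked from inside the built image. This is not part of the PR: the helper name `check_version_pin` is hypothetical, and the distribution name `transformer_engine` is an assumption about how the wheel registers itself. The commit-pinned rows carry no version string, so they cannot be checked this way.

```python
from importlib import metadata


def check_version_pin(dist_name, expected_prefix, lookup=metadata.version):
    """Return (ok, installed_version) for one pinned distribution.

    ok is True when the installed version string starts with expected_prefix,
    so local suffixes such as "+cu129" are tolerated. `lookup` is injectable
    so the check can be exercised without the package being installed.
    """
    try:
        installed = lookup(dist_name)
    except metadata.PackageNotFoundError:
        # Distribution is not installed in this environment.
        return False, None
    return installed.startswith(expected_prefix), installed


if __name__ == "__main__":
    # Assumed distribution name; adjust if the wheel registers differently.
    ok, installed = check_version_pin("transformer_engine", "2.10.0")
    print(f"transformer_engine: installed={installed} matches_pin={ok}")
```

Prefix matching is deliberately loose; an exact comparison would be stricter but would reject wheels that carry a local build suffix.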

Architecture

Follows the established ml-containers two-stage build pattern:

sglang image (base)
  ├── builder stage: build.bash compiles TE 2.10, apex, patched sglang, int4_qat → /wheels/
  └── final stage: install.bash installs wheels + Megatron-LM (editable) + slime (editable)

Inherits Blackwell sm_100a support and multi-arch (amd64+arm64) from the base sglang image.

TODO

  • Validate with a real docker build against the latest sglang image
  • Smoke test: import slime, megatron, sglang, transformer_engine, apex
  • Confirm arm64 build works
  • Update default base image tag in workflow once latest sglang is built
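
The smoke-test item above could look roughly like the sketch below. The module names come from the TODO list; the helper `smoke_test` and its `importer` parameter are hypothetical, and the sketch only confirms that each module imports, not that the CUDA kernels behind it work.

```python
import importlib

# Module names taken from the TODO item above.
MODULES = ("slime", "megatron", "sglang", "transformer_engine", "apex")


def smoke_test(modules=MODULES, importer=importlib.import_module):
    """Try to import each module and return {name: error} for the failures."""
    failures = {}
    for name in modules:
        try:
            importer(name)
        except Exception as exc:
            # Catch broadly: compiled extensions can fail at import time with
            # errors other than ImportError (e.g. missing shared libraries).
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Inside the container, `smoke_test()` returning an empty dict would mean all five imports succeeded; a non-empty dict names each failing module with its error.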

abatilo commented Feb 24, 2026

This change is part of the following stack:

Change managed by git-spice.

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22359629088
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-6bbebab-d6cea4b-386fabe-nccl-cuda12.8.0-ubuntu22.04-nccl2.25.1-1-torch2.6.0-vision0.21.0-audio2.6.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22393112086
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-c469ae4-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22401124059
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-b896655-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22393112087
Image: ghcr.io/coreweave/ml-containers/sglang:abatilo-slime-c469ae4-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22403841820
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-2a176de-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22406520501
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-9b08cb3-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22416790591
Image: ghcr.io/coreweave/ml-containers/sglang:abatilo-slime-79ebda6-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22416790628
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-79ebda6-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22417654701
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-67b6432-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22417446269
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-07a6203-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22495614431
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-8403328-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22495614471
Image: ghcr.io/coreweave/ml-containers/sglang:abatilo-slime-8403328-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22508310210
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-029b183-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@github-actions

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22512778429
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-006aa6e-abatilo-sglang-d8259f9-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

github-actions bot commented Mar 3, 2026

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22647629481
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-a58dea5-abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

github-actions bot commented Mar 4, 2026

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22678705175
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-fbbd7f8-abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@abatilo abatilo force-pushed the abatilo/slime branch 4 times, most recently from dffa047 to 8891b14 Compare March 5, 2026 03:08

github-actions bot commented Mar 5, 2026

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22700394547
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-8891b14-abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@abatilo abatilo force-pushed the abatilo/slime branch 2 times, most recently from 10fe935 to 5faef68 Compare March 5, 2026 03:55

github-actions bot commented Mar 5, 2026

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22701505825
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-5faef68-abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1

@abatilo abatilo force-pushed the abatilo/slime branch 2 times, most recently from 597ffe5 to e7cf547 Compare March 5, 2026 17:38

github-actions bot commented Mar 5, 2026

@abatilo Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/22729124486
Image: ghcr.io/coreweave/ml-containers/slime:abatilo-slime-e7cf547-abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1
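
The image tags in the build comments above end in a run of toolchain versions (CUDA, Ubuntu, NCCL, torch, torchvision, torchaudio, ABI). The sketch below pulls those fields out so two builds can be compared. The field layout is inferred only from the tags posted in this thread, not from a documented naming scheme, and `parse_tag_suffix` is a hypothetical helper.

```python
import re

# Pattern inferred from the tags in this thread (an assumption, not a spec).
TAG_SUFFIX = re.compile(
    r"cuda(?P<cuda>[\d.]+)"
    r"-ubuntu(?P<ubuntu>[\d.]+)"
    r"-nccl(?P<nccl>[\d.]+-\d+)"
    r"-torch(?P<torch>[\d.]+)"
    r"-vision(?P<vision>[\d.]+)"
    r"-audio(?P<audio>[\d.]+)"
    r"-abi(?P<abi>\d+)$"
)


def parse_tag_suffix(image_ref):
    """Return the toolchain versions encoded at the end of an image tag.

    Returns a dict of strings, or None when the tag does not match the
    inferred layout.
    """
    match = TAG_SUFFIX.search(image_ref)
    return match.groupdict() if match else None


if __name__ == "__main__":
    ref = (
        "ghcr.io/coreweave/ml-containers/slime:abatilo-slime-e7cf547-"
        "abatilo-sglang-d5fb1dd-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-"
        "torch2.10.0-vision0.25.0-audio2.10.0-abi1"
    )
    print(parse_tag_suffix(ref))
```

Applied to the first and last tags in this thread, the parsed fields show the base moving from CUDA 12.8.0 with torch 2.6.0 to CUDA 12.9.1 with torch 2.10.0.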

SLIME's post-training pipeline (Megatron + SGLang) requires SGLang v0.5.9.
This upgrades from v0.4.x and rebases the image onto a plain torch base,
dropping the torch-extras layer (DeepSpeed, Apex, xFormers) that neither
SGLang nor SLIME actually uses.

FlashInfer moves from JIT to v0.6.3 AOT compilation via TVM. sgl-kernel
is now built with scikit-build-core and enables SM100A (Blackwell) and
FP4 support. vLLM and Triton are removed from this image since they are
served by the dedicated vllm-tensorizer image.

SLIME combines Megatron-LM and SGLang for reinforcement-learning-based
post-training of large language models. This image builds
TransformerEngine 2.10, Apex, and a patched SGLang wheel on top of our
sglang base, then installs Megatron-LM with SLIME's patches for
routing replay and memory management.

Patches are versioned under slime/patches/v0.5.7/ and documented in
README.md with their upstream origins from THUDM/SLIME.
@abatilo abatilo force-pushed the abatilo/sglang branch 6 times, most recently from d246ce3 to 34f06f0 Compare March 9, 2026 14:37