Codex Orchestrator is the CLI + runtime that coordinates Codex-driven runs, pipelines, and delegation MCP tooling. The npm release focuses on running pipelines locally, emitting auditable manifests, and hosting the delegation server.
- Global install (recommended for CLI use):
npm i -g @kbediako/codex-orchestrator
- After install, use either codex-orchestrator or the short alias codex-orch:
codex-orchestrator --version
- Or run via npx:
npx @kbediako/codex-orchestrator --version
Node.js >= 20 is required.
- Run a pipeline with a task id so artifacts are grouped under .runs/<task-id>/:
codex-orch start diagnostics --format json --task <task-id>
The command prints the run_id plus the manifest path under .runs/<task-id>/cli/<run-id>/manifest.json.
- Watch status:
codex-orch status --run <run-id> --watch --interval 10
- Resume if needed:
codex-orch resume --run <run-id>
Tip: if you prefer npx, replace codex-orch with npx @kbediako/codex-orchestrator.
Tip: for multiple commands, you can also export MCP_RUNNER_TASK_ID=<task-id> once.
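The artifact layout above can be sketched as a small shell helper. This is only an illustration: `manifest_path` is a hypothetical function name, not part of the CLI, and the task and run ids in the example are made up.

```shell
# Hypothetical helper: build the manifest path for a run from the layout
# described above (.runs/<task-id>/cli/<run-id>/manifest.json).
manifest_path() {
  task_id="$1"
  run_id="$2"
  printf '%s\n' ".runs/${task_id}/cli/${run_id}/manifest.json"
}

# Example: print the expected location for a made-up task/run pair.
manifest_path demo-task run-001
```

A helper like this is handy when a script needs to open the manifest after reading the run_id from the start command's JSON output.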
Use this when you want Codex to drive work inside another repo with the CO defaults.
- Install templates:
codex-orchestrator init codex --cwd /path/to/repo
One-shot (templates + optional CO-managed Codex CLI install):
codex-orchestrator init codex --codex-cli --yes
This seeds AGENTS.md, mcp-client.json, and downstream .codex/config.toml + .codex/agents/* role files (sourced from templates/codex/.codex/*), plus codex.orchestrator.json.
- Register the delegation MCP server (one-time per machine):
codex mcp add delegation -- codex-orchestrator delegate-server --repo /path/to/repo
- Optional (managed/pinned CLI path): set up a CO-managed Codex CLI:
codex-orchestrator codex setup
Use this when you want a pinned binary, build-from-source behavior, or a custom fork. Stock/global codex is still the default selection; activate managed binary routing with:
export CODEX_CLI_USE_MANAGED=1
- Optional (additive global defaults in ~/.codex/config.toml):
codex-orchestrator codex defaults
codex-orchestrator codex defaults --yes
This updates only the CO baseline keys/role wiring and preserves unrelated config entries.
- Optional (fast refresh helper for downstream users):
scripts/codex-cli-refresh.sh --repo /path/to/codex --align-only
Repo-only helper (not included in npm package). Add --no-push when you only want local alignment and do not want to update origin/main. To refresh the CO-managed CLI, run a separate command with --force-rebuild (without --align-only). Set CODEX_REPO or CODEX_CLI_SOURCE to avoid passing --repo each time.
Run the delegation MCP server over stdio:
codex-orchestrator delegate-server --repo /path/to/repo
Optional: add --mode question_only to disable delegate.spawn/pause/cancel, keeping only delegate.question.* + delegate.status in the delegate namespace. GitHub tools remain available when GitHub integration is enabled.
Register it with Codex once. Delegation MCP is enabled by default (the only MCP enabled by default). To override the default or re-enable after disabling:
codex mcp add delegation -- codex-orchestrator delegate-server --repo /path/to/repo
codex -c 'mcp_servers.delegation.enabled=true' ...
delegate-server is the canonical name; delegation-server is supported as an alias (older docs may use it).
Codex built-ins are default, explorer, worker, and awaiter. researcher is user-defined.
- spawn_agent defaults to default when agent_type is omitted, so always set agent_type explicitly.
- Multi-turn loops are supported (spawn_agent -> send_input -> wait/resume_agent -> close_agent), so subagents can iterate before parent synthesis.
In Codex CLI 0.105.0, built-in explorer no longer pins an older model profile; it inherits top-level defaults unless you attach a role config_file.
CO now ships this downstream starter config via init codex (source template: templates/codex/.codex/config.toml; installed as .codex/config.toml in target repos):
model = "gpt-5.3-codex"
model_reasoning_effort = "xhigh"
[agents]
max_threads = 12
max_depth = 4
max_spawn_depth = 4
[agents.explorer_fast]
description = "Fast explorer (spark text-only)."
config_file = "./agents/explorer-fast.toml"
[agents.worker_complex]
description = "Complex worker role."
config_file = "./agents/worker-complex.toml"
[agents.awaiter]
description = "Awaiter override (keeps awaiter behavior with latest codex/high reasoning)."
config_file = "./agents/awaiter-high.toml"
# .codex/agents/explorer-fast.toml
model = "gpt-5.3-codex-spark"
model_reasoning_effort = "xhigh"
# .codex/agents/worker-complex.toml
model = "gpt-5.3-codex"
model_reasoning_effort = "xhigh"
init codex also writes downstream .codex/agents/awaiter-high.toml from templates/codex/.codex/agents/awaiter-high.toml so CO users can keep awaiter semantics while meeting a high-reasoning minimum.
Caveats:
- gpt-5.3-codex-spark is text-only (no image inputs). Keep it for fast search/synthesis.
- Leave agents.explorer undefined unless you intentionally want to override built-in explorer behavior.
- Keep RLM/collab built-ins-first by default; add specialist custom roles only when a measured benefit justifies ongoing maintenance.
- max_threads = 12, max_depth = 4, and max_spawn_depth = 4 are CO's standard multi-agent baseline.
- Fallbacks are contingency-only: use 8/2/2 on constrained hosts or deterministic high-risk lanes; use 6/1/1 only as break-glass under severe contention.
- Awaiter triage: long waits are expected for long-running jobs; treat it as stuck only after multiple polling windows with no status/progress movement.
- codex review delegates with collab tools disabled in review threads; keep review expectations single-agent even when multi-agent is enabled elsewhere.
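The baseline and fallback limits in the caveats above can be sketched as a helper that prints an [agents] fragment for a named tier. The tier names (baseline, constrained, break-glass) and the function name are hypothetical, and the sketch assumes that 8/2/2 and 6/1/1 map to max_threads/max_depth/max_spawn_depth in the same order as the baseline.

```shell
# Hypothetical helper: print an [agents] TOML fragment for a named tier.
# Assumption: "8/2/2" and "6/1/1" follow the order
# max_threads/max_depth/max_spawn_depth used for the 12/4/4 baseline.
agents_limits() {
  case "$1" in
    baseline)    threads=12; depth=4; spawn=4 ;;
    constrained) threads=8;  depth=2; spawn=2 ;;
    break-glass) threads=6;  depth=1; spawn=1 ;;
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
  printf '[agents]\nmax_threads = %s\nmax_depth = %s\nmax_spawn_depth = %s\n' \
    "$threads" "$depth" "$spawn"
}

# Example: show the limits for a constrained host.
agents_limits constrained
```

The printed fragment is meant for review before hand-editing .codex/config.toml; the helper does not write any file.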
Delegation guard profile:
- CODEX_ORCHESTRATOR_GUARD_PROFILE=auto (default): strict in CO-style repos, warn in lightweight repos.
- Set CODEX_ORCHESTRATOR_GUARD_PROFILE=warn for ad-hoc/no-task-id runs.
- Set CODEX_ORCHESTRATOR_GUARD_PROFILE=strict to enforce full delegation evidence checks.
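One way to apply the guard profile guidance above in a shell session is sketched below. It is an assumption-laden illustration: `pick_guard_profile` is a hypothetical function, and it treats an unset or empty MCP_RUNNER_TASK_ID as the signal for an ad-hoc run, which ignores task ids passed via --task.

```shell
# Hypothetical helper: pick a guard profile for the current shell.
# Assumption: MCP_RUNNER_TASK_ID being unset or empty means an ad-hoc run.
pick_guard_profile() {
  if [ -z "${MCP_RUNNER_TASK_ID:-}" ]; then
    echo warn
  else
    echo auto
  fi
}

# Export the chosen profile so later commands in this shell inherit it.
export CODEX_ORCHESTRATOR_GUARD_PROFILE="$(pick_guard_profile)"
echo "guard profile: ${CODEX_ORCHESTRATOR_GUARD_PROFILE}"
```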
RLM (Recursive Language Model) is the long-horizon loop used by the rlm pipeline (codex-orchestrator rlm "<goal>" or codex-orchestrator start rlm --goal "<goal>"). Delegated runs only enter RLM when the child is launched with the rlm pipeline (or the rlm runner directly). In auto mode it resolves to symbolic only when context is large (RLM_SYMBOLIC_MIN_BYTES) and an explicit context signal is present (RLM_CONTEXT_PATH or delegated run); otherwise it stays iterative. The runner writes state to .runs/<task-id>/cli/<run-id>/rlm/state.json and stops when the validator passes or budgets are exhausted.
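The auto-mode rule in the paragraph above can be sketched as a small decision function. This is not the runner's code: `resolve_rlm_mode` is a hypothetical name, "large" is assumed to mean greater than or equal to the threshold, and the threshold is passed in explicitly because the default value of RLM_SYMBOLIC_MIN_BYTES is not stated here.

```shell
# Sketch of the auto-mode decision described above (not the runner's code).
# Assumptions: "large" means context_bytes >= min_bytes, and the caller
# supplies the threshold that RLM_SYMBOLIC_MIN_BYTES would provide.
resolve_rlm_mode() {
  context_bytes="$1"   # size of the context in bytes
  min_bytes="$2"       # stand-in for RLM_SYMBOLIC_MIN_BYTES
  has_signal="$3"      # 1 if RLM_CONTEXT_PATH is set or the run is delegated
  if [ "$context_bytes" -ge "$min_bytes" ] && [ "$has_signal" = "1" ]; then
    echo symbolic
  else
    echo iterative
  fi
}

resolve_rlm_mode 2000000 1000000 1   # large context with a context signal
resolve_rlm_mode 2000000 1000000 0   # large context, no signal
resolve_rlm_mode 500 1000000 1       # small context
```

Both conditions must hold for symbolic mode; either one missing keeps the run iterative.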
For symbolic mode, the Option 2 alignment checker is enabled by default (RLM_ALIGNMENT_CHECKER=1) and writes append-only alignment artifacts under .runs/<task-id>/cli/<run-id>/rlm/alignment/ (ledger + projection). Rollback toggle: set RLM_ALIGNMENT_CHECKER=0. Enforcement is opt-in via RLM_ALIGNMENT_CHECKER_ENFORCE=1.
Symbolic subcalls can optionally use collab tools. Fast path: codex-orchestrator rlm --multi-agent auto "<goal>" (legacy alias: --collab auto; sets RLM_SYMBOLIC_MULTI_AGENT=1 plus legacy RLM_SYMBOLIC_COLLAB=1 for compatibility, and implies symbolic mode). Collab requires multi_agent=true in codex features list (collab remains a legacy alias). Collab tool calls parsed from codex exec --json --enable multi_agent are stored in manifest.collab_tool_calls (bounded by CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS, set to 0 to disable). For auditable role routing, prefix spawned prompts with [agent_type:<role>] and set spawn_agent.agent_type when supported; lifecycle validation enforces prompt-role evidence and validates agent_type when present (RLM_SYMBOLIC_MULTI_AGENT_ROLE_POLICY=warn|off, legacy alias RLM_COLLAB_ROLE_POLICY; RLM_SYMBOLIC_MULTI_AGENT_ALLOW_DEFAULT_ROLE=1, legacy alias RLM_COLLAB_ALLOW_DEFAULT_ROLE). codex-orchestrator codex setup remains available when you want a managed/pinned CLI path (opt-in via CODEX_CLI_USE_MANAGED=1).
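The role-prefix convention mentioned above can be sketched as a tiny prompt-tagging helper. `tag_prompt` is a hypothetical function, and the single space between the marker and the prompt text is an assumption; only the [agent_type:<role>] marker itself comes from the text.

```shell
# Hypothetical helper: prefix a prompt with the [agent_type:<role>] marker.
# Assumption: one space separates the marker from the prompt text.
tag_prompt() {
  role="$1"
  shift
  printf '[agent_type:%s] %s\n' "$role" "$*"
}

# Example: tag a prompt intended for the built-in explorer role.
tag_prompt explorer "Map the test layout under tests/"
```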
For batch fan-out jobs, prefer native spawn_agents_on_csv before building custom orchestration wrappers.
flowchart TB
A["Parent run<br/>(delegation MCP enabled)"]
C["Delegation MCP server"]
D["delegate.spawn"]
E["Child run<br/>(pipeline resolved)"]
N{Pipeline = rlm?}
P["Standard pipeline<br/>(plan/build/test/review)"]
RLM["RLM pipeline<br/>(see next chart)"]
A --> C --> D --> E --> N
N -- yes --> RLM
N -- no --> P
E -. optional .-> Q["delegate.question.enqueue/poll"] -.-> A
flowchart TB
F["Resolve mode<br/>(auto -> iterative/symbolic)"]
G{Symbolic?}
H["Context store<br/>(chunk + search)"]
I["Planner JSON<br/>(select subcalls)"]
J["Subcalls<br/>(tool + edits, collab optional)"]
K["Validator<br/>(test command)"]
L["State + artifacts<br/>.runs/<task-id>/cli/<run-id>/rlm/state.json"]
M["Exit status"]
F --> G
G -- yes --> H --> I --> J --> K
G -- no --> J
J --> K
K --> L --> M
K -- fail & budget left --> F
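The retry edge in the chart above (validator fails, budget left, loop again) can be sketched as a plain shell loop. This is a simplified illustration, not the runner's implementation: `rlm_loop` is a hypothetical function, the budget is assumed to be an iteration count, and the step and validator are arbitrary shell commands supplied by the caller.

```shell
# Sketch of the validate-and-retry loop from the chart (not the runner's code).
# Assumptions: the budget is a plain iteration count, and both the step and
# the validator are shell commands supplied by the caller.
rlm_loop() {
  budget="$1"      # maximum number of iterations
  step_cmd="$2"    # stand-in for planner + subcalls
  validator="$3"   # stand-in for the validator test command
  i=1
  while [ "$i" -le "$budget" ]; do
    sh -c "$step_cmd"
    if sh -c "$validator"; then
      echo "passed after $i iteration(s)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "budget exhausted after $budget iteration(s)"
  return 1
}

# Demo with a counter file: the validator passes once three steps have run.
counter="$(mktemp)"
rlm_loop 5 "echo x >> '$counter'" "[ \"\$(grep -c x '$counter')\" -ge 3 ]"
rm -f "$counter"
```

The two exit paths mirror the text: the loop stops when the validator passes or when the budget runs out.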
Recommended one-shot bootstrap (skills + delegation + DevTools wiring):
codex-orchestrator setup --yes
# Optional: overwrite existing bundled skills in $CODEX_HOME/skills
# codex-orchestrator setup --yes --refresh-skills
The release ships skills under skills/ for downstream packaging. If you already have global skills installed, treat those as the primary reference and use bundled skills as the shipped fallback. Install bundled skills into $CODEX_HOME/skills:
codex-orchestrator skills install
Options:
- --force overwrites existing files.
- --only <skills> installs only selected skills (comma-separated). Combine with --force to overwrite only those.
- --codex-home <path> targets a different Codex home directory.
Bundled skills (may vary by release):
- collab-subagents-first
- chrome-devtools
- delegation-usage
- standalone-review
- docs-first
- collab-evals
- collab-deliberation
- long-poll-wait
- release
- agent-first-adoption-steering
- delegate-early (compatibility alias; use delegation-usage)
Check readiness (deps + capability wiring):
codex-orchestrator doctor --format json
Auto-fix wiring (delegation + DevTools):
codex-orchestrator doctor --apply --yes
Usage snapshot (scans local .runs/):
codex-orchestrator doctor --usage
doctor --usage prints adoption KPIs (advanced/cloud/rlm/collab/delegation coverage), and per-run run-summary.json now includes a usageKpi section plus cloud fallback metadata when preflight downgrades to MCP.
doctor also includes a codex-defaults advisory section (model/reasoning/agent baseline drift) and points to additive remediation via codex-orchestrator codex defaults --yes.
Issue bundle logging (downstream dogfooding / repro handoff):
codex-orchestrator doctor --issue-log --issue-title "Observed failure" --issue-notes "what happened"
doctor --issue-log appends docs/codex-orchestrator-issues.md (override via --issue-log-path) and writes a JSON bundle under out/<resolved-task>/doctor/issue-bundles/ with doctor/cloud context (latest run context is included when available).
Auto-capture issue bundles when runs fail:
codex-orchestrator start <pipeline> --auto-issue-log
codex-orchestrator flow --task <task-id> --auto-issue-log
This captures both post-manifest run failures and setup failures that occur before a run manifest is created (for example strict repo-config enforcement).
Cloud preflight check (without starting a pipeline):
codex-orchestrator doctor --cloud-preflight
- Bootstrap + wire everything: codex-orchestrator setup --yes (non-destructive for existing skills by default; add --refresh-skills to overwrite)
- Enable required MCP servers with least privilege: codex-orchestrator mcp enable --servers delegation --yes (plan with --format json; omit --servers only when you intentionally want all disabled servers enabled; env/secret values are redacted in displayed command lines)
- Low-friction docs->implementation guardrails: codex-orchestrator flow --task <task-id>
- Validate + measure adoption locally: codex-orchestrator doctor --usage --format json
- Run docs relevance as an advisory lane (non-blocking): codex-orchestrator start docs-relevance-advisory --task <task-id>
- Capture reproducible downstream failures: codex-orchestrator doctor --issue-log --issue-title "<title>" --issue-notes "<notes>"
- Auto-capture failed run issue bundles: codex-orchestrator start <pipeline> --auto-issue-log or codex-orchestrator flow --auto-issue-log
- Active PR watch-resolve-merge loop: codex-orchestrator pr resolve-merge --pr <number> --quiet-minutes <window> (add --auto-merge when approved; exits early when author action is required).
- Passive PR monitor loop: codex-orchestrator pr watch-merge --pr <number> --quiet-minutes <window> (monitor-only behavior; keeps waiting unless terminal/timeout).
- Review checkpoints (npm-only safe): NOTES="Goal: ... | Summary: ... | Risks: ..." codex-orchestrator review --task <task-id> for manifest-backed standalone review wrapper behavior (auto-skips repo-only diff-budget script when unavailable in downstream installs); use codex review "<focus>" for quick prompt-only checks; use codex-orchestrator start implementation-gate --task <task-id> --format json when you want a full gate run.
- Downstream simulation before shipping wrapper/skill changes: npm run pack:smoke (packaged CLI in temp mock repo; validates review artifacts and long-poll-wait install path).
- Delegation: codex-orchestrator doctor --apply --yes, then enable for a Codex run with: codex -c 'mcp_servers.delegation.enabled=true' ...
- Collab (symbolic RLM subagents): codex-orchestrator rlm --multi-agent auto "<goal>" (legacy alias: --collab auto; requires Codex features.multi_agent=true)
- Cloud: set CODEX_CLOUD_ENV_ID (and optional CODEX_CLOUD_BRANCH), then run: codex-orchestrator start <pipeline> --cloud --target <stage-id>
- Cloud fail-fast (avoid fallback reliance): set CODEX_ORCHESTRATOR_CLOUD_FALLBACK=deny
- Repo-config fail-fast (deny packaged config fallback): set CODEX_ORCHESTRATOR_REPO_CONFIG_REQUIRED=1 or pass --repo-config-required
- Cloud status retry tuning (optional): CODEX_CLOUD_STATUS_RETRY_LIMIT, CODEX_CLOUD_STATUS_RETRY_BACKOFF_MS
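The cloud bullets above depend on environment variables being set before --cloud is passed. A minimal sketch of a local check follows; `cloud_env_ready` is a hypothetical helper that only inspects the environment and does not replace doctor --cloud-preflight.

```shell
# Hypothetical preflight helper: confirm the cloud variables named above are
# set before passing --cloud. It only inspects the environment.
cloud_env_ready() {
  if [ -z "${CODEX_CLOUD_ENV_ID:-}" ]; then
    echo "CODEX_CLOUD_ENV_ID is not set" >&2
    return 1
  fi
  echo "env=${CODEX_CLOUD_ENV_ID} branch=${CODEX_CLOUD_BRANCH:-<unset>}"
}

# Example with made-up values, run in a subshell so nothing leaks out.
( CODEX_CLOUD_ENV_ID=env-demo; CODEX_CLOUD_BRANCH=main; cloud_env_ready )
```

Pairing a check like this with CODEX_ORCHESTRATOR_CLOUD_FALLBACK=deny keeps a missing variable from silently turning into a fallback run.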
Print DevTools MCP setup guidance:
codex-orchestrator devtools setup
- codex-orchestrator start <pipeline>: run a pipeline (add --auto-issue-log for automatic failure bundle capture; add --repo-config-required for strict repo-local config mode).
- codex-orchestrator flow --task <task-id>: run docs-review then implementation-gate in sequence (supports --auto-issue-log and --repo-config-required).
- codex-orchestrator start docs-relevance-advisory --task <task-id>: run non-blocking docs relevance signals (warn-mode freshness + advisory review lane).
- NOTES="Goal: ... | Summary: ... | Risks: ..." codex-orchestrator review --task <task-id>: run standalone review wrapper with manifest-backed evidence (supports run-review flags/env).
- codex-orchestrator plan <pipeline>: preview pipeline stages.
- codex-orchestrator exec <cmd>: run a one-off command with the exec runtime.
- codex-orchestrator init codex: install starter templates (mcp-client.json, AGENTS.md, downstream .codex/config.toml + .codex/agents/* role files sourced from templates/codex/.codex/*, codex.orchestrator.json) into a repo.
- codex-orchestrator setup --yes: install bundled skills and configure delegation + DevTools wiring (add --refresh-skills to overwrite existing skills in $CODEX_HOME/skills).
- codex-orchestrator init codex --codex-cli --yes --codex-source <path>: optionally provision a CO-managed Codex CLI binary (build-from-source default; set CODEX_CLI_SOURCE to avoid passing --codex-source every time, and CODEX_CLI_USE_MANAGED=1 to route runs to it).
- codex-orchestrator init codex --codex-cli --yes --codex-download-url <url> --codex-download-sha256 <sha>: opt in to a prebuilt Codex CLI download.
- codex-orchestrator codex setup: plan/apply a CO-managed Codex CLI install (optional managed/pinned path; use --download-url + --download-sha256 for prebuilts; activate with CODEX_CLI_USE_MANAGED=1).
- codex-orchestrator codex defaults: plan/apply additive global defaults in ~/.codex/config.toml and ~/.codex/agents/*.toml (--yes applies, --force allows role file overwrite).
- codex-orchestrator delegation setup --yes: configure delegation MCP server wiring.
- codex-orchestrator mcp enable --servers <csv> --yes: enable specific disabled MCP servers from existing Codex config entries.
- codex-orchestrator self-check --format json: JSON health payload.
- codex-orchestrator mcp serve: Codex MCP stdio server.
- npm run pack:smoke: maintainer smoke gate for packaged downstream behavior (tarball install + review/skill checks). Core lane runs it on downstream-facing diffs; .github/workflows/pack-smoke-backstop.yml runs a weekly main backstop.
- CLI + built-in pipelines
- Delegation MCP server (delegate-server)
- Bundled skills under skills/
- Schemas and templates needed by the CLI
Repo internals, development workflows, and deeper architecture notes (contributor/internal) live in the GitHub repository:
- docs/README.md
- docs/diagnostics-prompt-guide.md (first-run diagnostics prompt + expected outputs)
- docs/guides/collab-vs-mcp.md (agent-first decision guide)
- docs/guides/rlm-recursion-v2.md (RLM recursion reference)
- docs/guides/cloud-mode-preflight.md (cloud-mode preflight + fallback guidance)
- docs/guides/review-artifacts.md (where codex-orchestrator review / npm run review write prompt/output artifacts)
- docs/standalone-review-guide.md (repo-local wrapper behavior + downstream-safe review alternatives)
Seeded OOLONG accuracy curves (Wilson 95% CI, runs=5). In these runs, the baseline accuracy degrades as context length grows, while RLM stays near the ceiling across the tested lengths.