Description
What variant of Codex are you using?
CLI (v0.106.0)
What feature would you like to see?
Summary
The compiled codex binary is 80MB on macOS arm64. Analysis of the source (codex-rs/) and the binary itself shows this is primarily due to heavy dependencies compiled unconditionally, with no feature gating for optional functionality. The runtime architecture is solid (async throughout, adaptive streaming, render caching); this is purely a dependency/packaging concern.
For comparison, the source is ~420K lines of Rust across 67 workspace crates, with a Cargo.lock containing 1,043 crates.
Findings
1. Heavy dependencies compiled unconditionally
Several large libraries are always included even though they serve optional functionality:
| Dependency | Estimated binary contribution | Used for | Could be optional |
|---|---|---|---|
| `image` (jpeg, png, gif, webp) | ~5-8MB | Clipboard paste, inline image viewing (~100 lines of app code) | Yes, via feature flag |
| `syntect` + `two-face` | ~4-6MB | Syntax highlighting in TUI (`highlight.rs`, 1.3KB file) | Yes, could lazy-load themes |
| OpenTelemetry (5 crates) | ~3-5MB | Observability, not used in basic CLI mode (#12913 notes it's not even emitting metrics) | Yes, via feature flag |
| `rama` (8 crates) | ~2-4MB | HTTP/SOCKS5 network proxy for sandboxing; using ~10% of the framework | Yes, could use lighter alternative |
| `sqlx` with `migrate` feature | ~1-2MB | State/history storage; migration tooling not needed at runtime | Partially, drop `migrate` feature |
2. tokio compiled with features = ["full"]
The chatgpt crate depends on tokio with ["full"], which pulls in every tokio subsystem (io, net, fs, signal, process, sync, time, rt-multi-thread, macros). Other crates in the workspace use more targeted feature sets. This inflates the final binary.
3. Both regex and regex-lite included
regex = "1.12.3"
regex-lite = "0.1.8"
Both are workspace dependencies. regex-lite exists specifically as a smaller alternative to regex; having both defeats the purpose.
4. ts-rs (TypeScript type generation) compiled into the CLI binary
ts-rs = "11"
This is a build-time code generation library. If it's only used for generating TypeScript types from Rust structs, it should be a build dependency or gated behind a feature, not compiled into the release binary.
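One way the gating could look is sketched below. It assumes ts-rs is made an optional dependency behind a `codegen` feature; the `SessionInfo` type and `describe` helper are hypothetical and not taken from the repository. With the feature off, the `cfg_attr` line expands to nothing, so the derive (and the crate behind it) is not needed to compile the type.

```rust
// Cargo.toml sketch (assumed layout, not taken from the repository):
//   [features]
//   codegen = ["dep:ts-rs"]
//   [dependencies]
//   ts-rs = { version = "11", optional = true }

/// Hypothetical protocol type, used only for illustration.
#[derive(Debug, Clone, PartialEq)]
// The TypeScript derive is applied only when the `codegen` feature is enabled.
#[cfg_attr(feature = "codegen", derive(ts_rs::TS))]
pub struct SessionInfo {
    pub id: String,
    pub turn_count: u32,
}

/// Ordinary runtime code uses the type the same way in both configurations.
pub fn describe(info: &SessionInfo) -> String {
    format!("{} ({} turns)", info.id, info.turn_count)
}
```

A release build would then leave `codegen` off, and a separate type-generation step would enable it (for example with `cargo test --features codegen`, assuming the bindings are exported from tests).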
5. All four image formats compiled
image = { version = "^0.25.9", default-features = false, features = ["jpeg", "png", "gif", "webp"] }
For a CLI tool, JPEG and PNG would cover the vast majority of use cases. GIF and WebP decoders add binary weight for edge cases.
6. Core crate is a monolith
The core crate contains: all tool specifications, OpenAI integrations, skill framework, state management, authentication, configuration, models, and threading. Every binary in the workspace that depends on core pulls in all of this. Splitting it into core-api, core-tools, core-models etc. would allow dead code elimination to work more effectively.
What's already done well
Credit where it's due. The build configuration is excellent:
[profile.release]
lto = "fat"
split-debuginfo = "off"
strip = "symbols"
codegen-units = 1
And the runtime architecture is strong:
- Adaptive streaming with queue-depth-based gear switching (smooth/catch-up modes)
- Frame-rate-limited TUI rendering with transcript overlay caching
- Proper `tokio::task::spawn_blocking()` for sync operations
- Event-driven throughout, no blocking in main loop
- Platform-specific binary distribution (only the current platform's binary is installed)
Suggested improvements
High impact (estimated 15-25MB savings):
- Gate `image` behind a `clipboard-images` feature flag. Most CLI users never paste images.
- Gate OpenTelemetry behind an `otel` feature flag. It's not even emitting metrics currently (#12913: `codex exec` emits no OTel metrics; `codex mcp-server` emits no OTel telemetry at all).
- Gate `syntect` behind a `syntax-highlighting` feature flag, or lazy-load theme databases.
- Evaluate whether `rama` can be replaced with a lighter proxy implementation for the subset of features used.
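To make the feature-flag suggestion concrete, here is a minimal sketch for the image case. It assumes `image` becomes an optional dependency enabled by a `clipboard-images` feature; `PasteResult` and `decode_pasted_image` are hypothetical names, not the project's actual API. The same pattern would apply to an `otel` or `syntax-highlighting` feature.

```rust
// Cargo.toml sketch (assumed layout, not taken from the repository):
//   [features]
//   clipboard-images = ["dep:image"]
//   [dependencies]
//   image = { version = "0.25", optional = true, default-features = false,
//             features = ["jpeg", "png"] }

/// Outcome of trying to decode image bytes pasted from the clipboard.
#[derive(Debug, PartialEq)]
pub enum PasteResult {
    /// The bytes decoded successfully; carries the image dimensions.
    Image { width: u32, height: u32 },
    /// The bytes were not a decodable image.
    Invalid,
    /// This binary was built without image support.
    Unsupported,
}

/// Compiled only when the feature is enabled; this is the only code that
/// references the `image` crate.
#[cfg(feature = "clipboard-images")]
pub fn decode_pasted_image(bytes: &[u8]) -> PasteResult {
    match image::load_from_memory(bytes) {
        Ok(img) => PasteResult::Image { width: img.width(), height: img.height() },
        Err(_) => PasteResult::Invalid,
    }
}

/// Fallback with the same signature, so callers need no `cfg` of their own.
#[cfg(not(feature = "clipboard-images"))]
pub fn decode_pasted_image(_bytes: &[u8]) -> PasteResult {
    PasteResult::Unsupported
}
```

Keeping both variants behind one function signature means the TUI code that handles paste events stays identical in both builds and only has to show a message when it receives `Unsupported`.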
Medium impact (estimated 3-8MB savings):
- Replace `["full"]` tokio features in the `chatgpt` crate with the specific features actually used.
- Remove `migrate` from sqlx features in the release binary.
- Consolidate `regex` and `regex-lite` to one.
- Move `ts-rs` to a build dependency or gate behind a `codegen` feature.
- Drop GIF and WebP from image features if usage data supports it.
Lower effort, longer-term:
- Split the `core` crate into smaller modules to improve dead code elimination.
Conservatively, feature-gating the top four dependencies could bring the binary from 80MB down to ~55-60MB with no loss of default functionality (users who need otel or image paste could opt in).
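As a rough cross-check of that projection, the per-dependency estimates from the Findings table can be summed directly. The sketch below uses only the table's figures for the top four rows and the 80MB starting size; the type and function names are illustrative.

```rust
/// An estimated size range in megabytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RangeMb {
    pub low: u32,
    pub high: u32,
}

/// Top four rows of the Findings table: image, syntect + two-face,
/// OpenTelemetry, rama.
pub const TOP_FOUR: [RangeMb; 4] = [
    RangeMb { low: 5, high: 8 },
    RangeMb { low: 4, high: 6 },
    RangeMb { low: 3, high: 5 },
    RangeMb { low: 2, high: 4 },
];

/// Sum the low and high ends of each estimate independently.
pub fn total_savings(estimates: &[RangeMb]) -> RangeMb {
    RangeMb {
        low: estimates.iter().map(|r| r.low).sum(),
        high: estimates.iter().map(|r| r.high).sum(),
    }
}

/// Projected binary size: the largest savings give the smallest binary.
pub fn projected_size(current_mb: u32, savings: RangeMb) -> RangeMb {
    RangeMb {
        low: current_mb.saturating_sub(savings.high),
        high: current_mb.saturating_sub(savings.low),
    }
}
```

With the table's figures, the four ranges sum to 14-23MB, which would put the binary at roughly 57-66MB. That is in the same neighborhood as the ~55-60MB figure above, and both rest on the same unmeasured estimates.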
Additional information
Analysis performed on the open source codex-rs/ directory. Binary inspected: @openai/codex-darwin-arm64 v0.106.0 (Mach-O 64-bit arm64, 74MB __TEXT segment, 440 exported symbols, properly stripped).
Environment: macOS Darwin 25.2.0 arm64, Node v20.19.4 (npm install)