Replies: 6 comments 1 reply
To my knowledge, this report isn't something we publish or have an affiliation with. You'd need to ask the people who publish this leaderboard.
@etraut-openai, do you mean that the Codex CLI research/engineering team is not affiliated with the team that wrote the release blog post on GPT-5.3 Codex? I'm just trying to understand the relationship between the open-source CLI and the leaderboard score (it sounds like these are unrelated teams?).
(Thanks for the quick reply, by the way.)
Ah, I misinterpreted your question. I don't know the answer here. I'm going to convert this to a discussion, since it's not a bug or feature request.
My two cents: since the result carries the "We ran this evaluation and verified the results" checkmark, and the official harness for Terminal Bench is Harbor (https://github.com/laude-institute/harbor), the harness is presumably the Codex CLI agent that ships with Harbor (https://github.com/laude-institute/harbor/blob/main/src/harbor/agents/installed/codex.py#L533-L557). But this is just an assumption.
I want to know too!

What is the type of issue?
Documentation is missing
What is the issue?
Is there any public information on what exactly the "Simple Codex" harness is? As of Feb 18, 2026, it sits at the top of the leaderboard by a significant margin. Since the Codex CLI is open source, it would be nice to know what configuration is used to achieve these scores; presumably that configuration correlates with getting the most out of Codex (the model plus the harness) as a developer or end user.
Is it just the default Codex CLI harness (headless) for GPT-5.3 Codex, but with certain interactive tools (ask-user-question, etc.) dropped?
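For context, a headless Codex CLI run is normally configured through `~/.codex/config.toml`. A hypothetical sketch of what such a setup might look like follows; the key names match the open-source Codex CLI's config format, but the specific values, and whether the leaderboard run uses anything like this, are pure guesses:

```toml
# Hypothetical ~/.codex/config.toml for a non-interactive benchmark run.
# Values are assumptions for illustration, not the leaderboard's settings.
model = "gpt-5.3-codex"          # assumption: the model under test
approval_policy = "never"        # a headless run cannot prompt the user
sandbox_mode = "workspace-write" # let the agent edit files in the task dir
```

If the harness really is just the stock CLI in headless mode, differences from defaults like these would be exactly the information the question is asking for.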
X-posted to the T-bench repo in case that's a better spot for this discussion: laude-institute/terminal-bench#1417
Where did you find it?
https://www.tbench.ai/leaderboard/terminal-bench/2.0