Replies: 6 comments 1 reply
To my knowledge, this report isn't something we publish or have an affiliation with. You'd need to ask the people who publish this leaderboard.
@etraut-openai, do you mean that the Codex CLI research/engineering team is not affiliated with the team that wrote the release blog post on GPT-5.3 Codex? I'm just trying to understand the relationship between the open-source CLI and the leaderboard score (it sounds like these are unrelated teams?).
(Thanks for the quick reply, by the way.)
Ah, I misinterpreted your question. I don't know the answer here. I'm going to convert this to a discussion, since it's not a bug or feature request.
My two cents: since the result carries the "We ran this evaluation and verified the results" checkmark, and the official harness for Terminal Bench is Harbor (https://github.com/laude-institute/harbor), the harness is presumably the Codex CLI agent that ships with Harbor (https://github.com/laude-institute/harbor/blob/main/src/harbor/agents/installed/codex.py#L533-L557). But this is just an assumption.
I want to know too!

What is the type of issue?
Documentation is missing
What is the issue?
Is there any public information on what exactly the "Simple Codex" harness is? As of Feb 18, 2026, it sits at the top of the leaderboard by a significant margin. Since the Codex CLI is open source, it would be nice to know what configuration is used to achieve these scores; presumably that configuration correlates with getting the most out of Codex (the model plus the harness) as a developer or end user.
Is it just the default Codex CLI harness (headless) for GPT-5.3 Codex, but with certain interactive tools (ask-user-question, etc.) dropped?
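For context, a headless Codex CLI run is normally configured through `~/.codex/config.toml`. A hypothetical sketch of what such a setup might look like follows; the key names match the open-source Codex CLI's config format, but the specific values, and whether the leaderboard run uses anything like this, are pure guesses:

```toml
# Hypothetical ~/.codex/config.toml for a non-interactive benchmark run.
# Values are assumptions for illustration, not the leaderboard's settings.
model = "gpt-5.3-codex"          # assumption: the model under test
approval_policy = "never"        # a headless run cannot prompt the user
sandbox_mode = "workspace-write" # let the agent edit files in the task dir
```

If the harness really is just the stock CLI in headless mode, differences from defaults like these would be exactly the information the question is asking for.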
X-posted to the T-bench repo in case that's a better spot for this discussion: laude-institute/terminal-bench#1417
Where did you find it?
https://www.tbench.ai/leaderboard/terminal-bench/2.0