OpenAI launches GPT-5.3-Codex as faster coding agent
OpenAI has launched GPT-5.3-Codex, a new Codex model it says expands Codex from coding assistance into a computer-using agent for a broader range of professional tasks.
OpenAI described the release as a step up from GPT-5.2-Codex, combining coding performance with stronger reasoning and professional knowledge in one system. It said the model runs 25% faster than the prior generation in Codex.
GPT-5.3-Codex is available on paid ChatGPT plans wherever Codex is supported, including the Codex app, command line interface, integrated development environment extension, and the web version. OpenAI is working to enable API access.
Benchmark results
OpenAI reported that GPT-5.3-Codex sets an "industry high" on SWE-Bench Pro and Terminal-Bench 2.0, and performs strongly on OSWorld and GDPval. Together, these benchmarks assess software engineering tasks, terminal-based workflows, desktop computer use, and structured knowledge work.
SWE-Bench Pro focuses on real-world software engineering. OpenAI said it covers four programming languages and is designed to be more resistant to contamination than earlier variants. Terminal-Bench 2.0 measures skills needed for work in a terminal environment.
For web development, OpenAI cited internal tests in which GPT-5.3-Codex built and iterated on two browser games over extended runs. The company said it used a web game development skill and follow-up prompts such as "fix the bug" and "improve the game," describing the work as autonomous iteration across millions of tokens.
OpenAI also said the model handles day-to-day website requests differently: underspecified prompts now produce sites with more functionality and more sensible defaults than GPT-5.2-Codex.
Broader workflow
GPT-5.3-Codex is designed for tasks across the software lifecycle, including debugging, deployment, monitoring, and testing, OpenAI said. It also listed product requirement documents, copy editing, user research, and metrics work, along with tasks outside software engineering such as slide decks and spreadsheet analysis.
GDPval is OpenAI's evaluation for well-specified knowledge-work tasks across 44 occupations. OpenAI said GPT-5.3-Codex matches GPT-5.2 on GDPval when used with custom skills similar to those used for earlier results.
OSWorld tests whether an agent can complete productivity tasks in a visual desktop environment. OpenAI said GPT-5.3-Codex shows stronger computer-use performance than previous GPT models.
Interactive control
OpenAI also highlighted how users supervise long-running agent work. It said the Codex app provides frequent updates on key decisions and progress, and lets users interact while a task is underway rather than waiting for final output.
According to OpenAI, GPT-5.3-Codex can respond to feedback mid-task and maintain context during interaction. It compared the workflow to steering a colleague.
Self-improving development
OpenAI said GPT-5.3-Codex is its first model that was "instrumental in creating itself." The Codex team used early versions to debug training runs, manage deployment, and diagnose evaluations.
OpenAI also described internal use in engineering operations, saying Codex helped identify context-rendering bugs and causes of low cache hit rates. During the launch, GPT-5.3-Codex was also used to scale GPU clusters in response to traffic and keep latency stable.
In one internal example, OpenAI said a researcher used GPT-5.3-Codex to build simple regex classifiers to estimate patterns such as clarifications and user responses in session logs. The model then ran the analysis and produced a report. In another example, OpenAI said a data scientist used it to build new data pipelines and visualisations, then co-analysed results with the model.
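OpenAI did not publish the classifiers themselves. As a rough illustration of the technique only, the Python sketch below labels session-log messages with a few regular expressions and estimates how often each label appears. The pattern set, the label names, and the functions `classify` and `estimate_rates` are hypothetical and are not taken from OpenAI's tooling.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; the source does not say which
# expressions or labels the researcher actually used.
PATTERNS = {
    "clarification": re.compile(
        r"\b(did you mean|could you clarify|which (file|one))\b", re.IGNORECASE
    ),
    "positive_response": re.compile(
        r"\b(thanks|great|perfect|works now)\b", re.IGNORECASE
    ),
    "negative_response": re.compile(
        r"\b(still broken|doesn't work|not what i)\b", re.IGNORECASE
    ),
}


def classify(message):
    """Return every label whose pattern matches the message."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(message)]


def estimate_rates(messages):
    """Estimate the share of messages that carry each label."""
    counts = Counter(label for message in messages for label in classify(message))
    total = len(messages)
    return {
        label: (counts[label] / total if total else 0.0) for label in PATTERNS
    }


if __name__ == "__main__":
    sample_log = [
        "Could you clarify which file?",
        "Thanks, works now",
        "It is still broken",
        "ok",
    ]
    for label, rate in estimate_rates(sample_log).items():
        print(f"{label}: {rate:.0%}")
```

Regex classifiers of this kind are cheap to write and easy to audit, but they only estimate the patterns they are written for, so their output is best treated as a rough signal rather than a precise measurement.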
Cybersecurity focus
OpenAI has reported performance gains on cybersecurity tasks in recent months and says it has prepared strengthened safeguards. GPT-5.3-Codex is the first model it classifies as "High capability" for cybersecurity-related tasks under its Preparedness Framework.
The company says it trained the model to identify software vulnerabilities, but does not have definitive evidence it can automate cyber attacks end to end. As a precaution, it is deploying what it describes as its most comprehensive cybersecurity safety stack to date, including safety training, automated monitoring, trusted access for advanced capabilities, and enforcement pipelines that incorporate threat intelligence.
A pilot programme called Trusted Access for Cyber is intended to accelerate cyber defence research. OpenAI is also expanding a private beta for Aardvark, described as a security research agent and the first product in a planned suite of Codex Security tools.
The company says it is partnering with open-source maintainers to provide free codebase scanning for widely used projects, including Next.js. It is also committing US$10 million in API credits for cyber defence work, with a focus on open-source software and critical infrastructure systems.
Infrastructure and rollout
OpenAI said GPT-5.3-Codex was co-designed for, trained with, and served on Nvidia GB200 NVL72 systems. It also said infrastructure and inference changes are behind the 25% speed increase for Codex users.