This is great.
In the past month, OpenAI has released for Codex users:
- subagent support
- a better multi-agent interface (Codex app)
- 40% faster inference
No joke, with the first two my productivity is already up like 3x. I am so stoked to try this out.
This is for the API only.
Shoot me
Try Claude and you can get x^2 performance. OpenAI is sweating
It may be a bit different depending on what kind of work you're doing, but for me 5.2-codex finally reached a higher level than Opus.
5.2-codex is pretty solid and you get dramatically higher usage rates with cheap plans. I would assume API use is much cheaper as well.
It’s interesting that they kept the price the same, even though doing inference on Cerebras is much more expensive.
I don't think this is Cerebras. Running on Cerebras would change model behavior a bit, could potentially give a ~10x speedup, and would be more expensive. So most likely this is them writing new, more optimized kernels for the Blackwell series, maybe?
Fair point, but the question remains: why is this speedup available only in the API and not in ChatGPT?
This is almost certainly not being done on Cerebras.
OpenAI in my estimation has the habit of dropping a model's quality after its introduction. I definitely recall ChatGPT 5.2 being a lot better when it was introduced. A week or two later, its quality suddenly dropped. The initial high looked to be to throw off journalists and benchmarks. As such, nothing that OpenAI says in terms of model speed can be trusted. All they have to do is lower the reasoning effort on average, and boom, it becomes 40% faster. I hope I am wrong, because if I am right, it's a con game.
It's good to be skeptical, but I'm happy to share that we don't pull shenanigans like this. We actually take quite a bit of care to report evals fairly, keep API model behavior constant, and track down reports of degraded performance in case we've accidentally introduced bugs. If we were degrading model behavior, it would be pretty easy to catch us with evals against our API.
In this particular case, I'm happy to report that the speedup is time per token, so it's not a gimmick from outputting fewer tokens at lower reasoning effort. Model weights and quality remain the same.
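To illustrate the distinction drawn above, here is a minimal sketch, not anything OpenAI has published: it models generation time as output tokens times seconds per token, then checks which factor changed between two runs. The `Run` type, the `explain_speedup` helper, the 5% tolerance, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    output_tokens: int      # tokens generated, including any reasoning tokens
    elapsed_seconds: float  # wall-clock generation time (hypothetical measurement)

    @property
    def seconds_per_token(self) -> float:
        return self.elapsed_seconds / self.output_tokens


def explain_speedup(before: Run, after: Run, tolerance: float = 0.05) -> str:
    """Attribute a latency drop to faster tokens, fewer tokens, or both."""
    token_ratio = after.output_tokens / before.output_tokens
    per_token_ratio = after.seconds_per_token / before.seconds_per_token
    fewer_tokens = token_ratio < 1 - tolerance
    faster_tokens = per_token_ratio < 1 - tolerance
    if faster_tokens and fewer_tokens:
        return "both"
    if faster_tokens:
        return "faster per token"
    if fewer_tokens:
        return "fewer tokens"
    return "no clear speedup"


# Hypothetical numbers: same total time saved, two different causes.
baseline = Run(output_tokens=1000, elapsed_seconds=50.0)
same_length_faster = Run(output_tokens=1000, elapsed_seconds=35.7)
shorter_same_rate = Run(output_tokens=714, elapsed_seconds=35.7)

print(explain_speedup(baseline, same_length_faster))
print(explain_speedup(baseline, shorter_same_rate))
```

Both runs finish in the same wall-clock time, so total latency alone cannot tell them apart. Only the first matches a time-per-token speedup; to separate the two cases, compare the output token count as well as the elapsed time.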
It looks like you do pull shenanigans like these [0]. The person you're replying to even mentioned "ChatGPT 5.2", but you're specifically talking only about the API, while making it sound like it applies across the board.
[0] https://x.com/chetaslua/status/2018819186425008465
Hey Ted, can you confirm whether this 40% improvement is specific to API customers or if that's just a wording thing because this is the OpenAI Developers account posting?
You're confirming you don't alter "juice" levels..?
I mean, you can just run the benchmark again.