Okay, results are in for GenAI Showdown with the new gpt-image-1.5 model for the editing portions of the site!
https://genai-showdown.specr.net/image-editing
Conclusions
- OpenAI has always had some of the strongest prompt understanding alongside the weakest image fidelity. This update goes some way towards addressing this weakness.
- It's leagues better than gpt-image-1 at making localized edits without altering the entire image's aesthetic, doubling the previous score from 4/12 to 8/12, and it's the only model that legitimately passed the Giraffe prompt.
- It's one of the most steerable models, with a 90% compliance rate.
Updates to GenAI Showdown
- Added outtakes sections to each model's detailed report in the Text-to-Image category, showcasing notable failures and unexpected behaviors.
- New models have been added including REVE and Flux.2 Dev (a new locally hostable model).
- Finally got around to implementing a weighted scoring mechanism that considers pass/fail, quality, and compliance for a more holistic model evaluation (click the pass/fail icon to toggle between scoring methods); a toy sketch of the idea is below.
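A toy Python sketch of the kind of blend I mean (the weights here are illustrative, not the exact ones the site uses):

    # Blend pass/fail, quality (0-1), and compliance (0-1) into one score.
    # Illustrative weights only; the site's real weighting may differ.
    def weighted_score(passed: bool, quality: float, compliance: float,
                       w_pass: float = 0.5, w_quality: float = 0.25,
                       w_compliance: float = 0.25) -> float:
        return w_pass * float(passed) + w_quality * quality + w_compliance * compliance

    print(weighted_score(True, 0.8, 0.9))  # 0.925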
If you just want to compare gpt-image-1, gpt-image-1.5, and NB Pro at the same time:
https://genai-showdown.specr.net/image-editing?models=o4,nbp...
> the only model that legitimately passed the Giraffe prompt.
10 years ago I would have considered that sentence satire. Now it allegedly means something.
Somehow it feels like we’re moving backwards.
Nano Banana still has the best VAE we've seen, especially if you're doing high-res production work. Flux.2 comes close, but gpt-image is still miles away.
I really love everything you're doing!
Personal request: could you also advocate for "image previz rendering"? I feel it's an extremely compelling use case for these companies to develop: basically, any 2D/3D compositor that lets you visually block out a scene, then rely on the model to precisely position the set, set pieces, and character poses.
If we got this task onto benchmarks, the companies would absolutely start training their models to perform well at it.
Here are some examples:
gpt-image-1 absolutely excels at this, though you don't have much control over the style and aesthetic:
https://imgur.com/a/previz-to-image-gpt-image-1-x8t1ijX
Nano Banana (Pro) fails at this task:
https://imgur.com/a/previz-to-image-nano-banana-pro-Q2B8psd
Flux Kontext, Qwen, etc. have mixed results.
I'm going to re-run these under gpt-image-1.5 and report back.
Edit:
gpt-image-1.5:
https://imgur.com/a/previz-to-image-gpt-image-1-5-3fq042U
And just as I finish this, Imgur deletes my original gpt-image-1 post.
Old link (broken): https://imgur.com/a/previz-to-image-gpt-image-1-Jq5M2Mh
Hopefully imgur doesn't break these. I'll have to start blogging and keep these somewhere I control.
Thanks! A highly configurable Previz2Image model would be a fantastic addition. I was literally just thinking about this the other day (but more in the context of ControlNets and posable kinematic models). I'm even considering adding an early CG Poser blocked-out scene test to see how far the various editor models can take it.
With additions like structured prompts (introduced in BFL Flux 2), maybe we'll see something like this in the near future.
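To make that concrete, I'd imagine a structured previz prompt looking something like this (a Python sketch; every field name is hypothetical, not Flux 2's actual schema):

    # Hypothetical structured scene description for a previz-to-image model.
    previz_prompt = {
        "scene": "suburban street at golden hour",
        "camera": {"focal_length_mm": 35, "height_m": 1.6, "pitch_deg": -5},
        "subjects": [
            {"id": "hero", "pose": "crouching", "position_m": [0.4, 0.0, 2.1]},
            {"id": "car", "type": "sedan", "position_m": [-1.5, 0.0, 6.0]},
        ],
        "style": "anamorphic film still, soft haze",
    }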
If this were a farm of sweatshop Photoshoppers in 2010 who downloaded all images from the internet and offered a service of combining them on request, this would escalate pretty quickly.
Question: with copyright and authorship dead wrt AI, how do I make (at least) new content protected?
Anecdote: I had a hobby of taking photos in a quite rare style, and lived in a place you'd see quite a few pictures of. When I asked GPT to generate a picture of that area in that style, it returned a highly modified but recognizable copy of a photo I'd published years ago.
> how do I make (at least) new content protected?
Air gap. If you don’t want content to be used without your permission, it never leaves your computer. This is the only protection that works.
If you want others to see your content, however, you have to accept some degree of trade off with it being misappropriated. Blatant cases can be addressed the same as they always were, but a model overfitting to your original work poses an interesting question for which I’m not aware of any legal precedents having been set yet.
Using references is a standard industry practice for digital art and VFX. The main difference is that you are unable to accidentally copy a reference too close, while with AI it’s possible.
We are probably entering the post-copyright era. The law will follow sooner or later.
That seems unlikely to me. One side is made up of lots and lots of entrenched interests with sympathetic figures like authors and artists on their side, and the other is “big tech,” dominated by the rather unsympathetic OpenAI and Google.
A middle ground would be Chat GPT at least providing attribution.
Back in reality, you can get in line to sue. Since they have more money than you, you can't really win though.
So it goes.
> Question: with copyright and authorship dead wrt AI, how do I make (at least) new content protected?
Question: Now that steamboats have been invented, how do I keep my clipper business afloat?
Answer: Good riddance to the broken idea of IP, Schumpeter's Gale is around the corner, time for a new business model.
my question to your anecdote: who cares? not being fecicious, but who cares if someone reproduced your stuff and millions of people see it? is it the money that you want? is it the fame? because fame you will get, maybe not money... but couldn't there be another way?
People have values that go beyond wealth and fame. Some people care about things like personal agency, respect and deference, etc.
If someone were on vacation and came home to learn that their neighbor had allowed some friends to stay in the empty house, we would often expect some kind of outrage regardless of whether there had been specific damage or wear to the home.
Culturally, people have deeply set ideas about what's theirs, and feel they deserve some say over how their things are used and by whom. Even those who are very generous and want their things to be widely shared usually want some voice in making that come to be.
To clarify my question: I do not want anything I create to be fed into their training data. That photo is just an example that I caught, and it became personal. But in general, I no longer want to open-source my code, write articles, or put any effort into improving a training data set.
Suddenly, copyright doesn't matter anymore when it's no longer useful to the narrative.
Copyright has overstepped its initial purpose by leaps and bounds because corporations make the law. If you're not cynical about how Copyright currently works you probably haven't been paying attention. And it doesn't take much to go from cynical to nihilist in this case.
There's definitely a case of miscommunication at play if you didn't read cynicism into my original post. I broadly agree with you, but I'll leave it at that to prevent further fruitless arguing about specifics.
OpenAI does care about copyright, thankfully China does not: https://imgur.com/a/RKxYIyi
(to clarify, OpenAI stops refining the image if a classifier detects your image as potentially violating certain copyrights. Although the gulf in resolution is not caused by that.)
(Shrug) This is more important. Sorry.
facetious
[I won't bother responding to the rest of your appalling comment]
The issue is ownership, not promotion or visibility.
As a professional cinematographer/photographer I am incredibly uncomfortable with people using my art without my permission for unknown ends. Doubly so when it’s venture backed private companies stealing from millions of people like me as they make vague promises about the capabilities of their software trained on my work. It doesn’t take much to understand why that makes me uncomfortable and why I feel I am entitled to saying “no.” Legally I am entitled to that in so many cases, yet for some reason Altman et al get to skip that hurdle. Why?
How do you feel about entities taking your face off of your personal website and plastering it on billboards smiling happily next to their product? What if it’s for a gun? Or condoms? Or a candidate for a party you don’t support? Pick your own example if none of those bother you. I’m sure there are things you do not want to be associated with/don’t want to contribute to.
At the end of the day it’s very gross when we are exploited without our knowledge or permission so rich groups can get richer. I don’t care if my visual work is only partially contributing to some mashed up final image. I don’t want to be a part of it.
The day after I first heard about the Internet, back in 1990-whatever, it occurred to me that I probably shouldn't upload anything to the Internet that I didn't want to see on the front page of tomorrow's newspaper.
Apart from the 'newspaper' anachronism, that's pretty much still my take.
Sorry, but you'll just have to deal with it and get over it.
> Sorry, but you'll just have to deal with it and get over it.
You were fine until this bit.
They're still fine because they're right.
You got to play the copyright game when the big corps were on your side.
Now they're on the other side. Deal with it and get over it.
I am very impressed. A benchmark I like to run is having it create sprite maps and UV texture maps for an imagined 3D model.
Noticed it captured a Mega Man Legends vibe...
https://x.com/AgentifySH/status/2001037332770615302
And here it generated a texture map from a 3D character:
https://x.com/AgentifySH/status/2001038516067672390/photo/1
However, I'm not sure if these are true UV maps that are accurate, as I don't have the 3D models themselves.
But I tried this in Nano Banana when it first came out and it couldn't do it.
> However, I'm not sure if these are true UV maps
I can tell you with 100% certainty they are not. For example, Crash doesn't have a backside for his torso. You could definitely make a model that uses these as textures, but you'd really have to force it and a lot of it would be stretched or look weird. If you want to go this approach, it would make a lot more sense to make a model, unwrap it, and use the wireframe UV map as input.
Here's the original Crash model: https://models.spriters-resource.com/pc_computer/crashbandic... , its actual texture is nothing like the generated one, because the real one was designed for efficiency.
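If you want to try that workflow, it's a few lines in Blender's Python API (a sketch assuming a mesh object is selected and active; bpy only exists inside Blender):

    import bpy  # available only in Blender's bundled Python

    # Unwrap the selected mesh and export the wireframe UV layout,
    # which you can then feed to the image model as input.
    bpy.ops.object.mode_set(mode="EDIT")
    bpy.ops.mesh.select_all(action="SELECT")
    bpy.ops.uv.unwrap()  # default angle-based unwrap
    bpy.ops.uv.export_layout(filepath="/tmp/uv_layout.png", size=(1024, 1024))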
Yeah, definitely impressive compared to what Nano Banana outputted.
Tried your suggested approach with an unwrapped wireframe UV as input, and I'm impressed:
https://x.com/AgentifySH/status/2001057153235222867
Obviously it's not going to be 1:1 accurate, but with more 3D spatial awareness I think it could definitely improve.
> However, I'm not sure if these are true UV maps that are accurate, as I don't have the 3D models themselves
also in the tweet
> GPT Image 1.5 is **ing crazy
and
> holy shit lol
What's impressive about it if you don't know whether it's right or not? (As the other comment pointed out, it is not right.)
Did an experiment to give a software product a dark theme. Gave both (GPT and Gemini/Nano) a screenshot of the product and an example theme I found on Dribbble.
- Gemini/Nano did a pretty average job, only applying some grey to some of the panels. I tried a few different examples and got similar output.
- GPT did a great job and themed the whole app and made it look great. I think I'd still need a designer to finesse some things though.
It's really weird to see "make images from memories that aren't real" as a product pitch
It's strange to me too, but they must have done the market research on what people do with image gen.
My own main use cases are entirely textual: Programming, Wiki, and Mathematics.
I almost never use image generation for anything. However, it's objectively extremely popular.
This has strong parallels for me to when Snapchat filters became super popular. I know lots of people loved editing and filtering pictures, but I always left everything on auto mode; in fact, I'd turn off a lot of the default beauty filters. It just never appealed to me.
It would creep me out if the model produced origami animals for that prompt.
I can actually imagine actors selling the rights to make fake images with them.
In late-stage capitalism you pay for fake photos with someone. You have ChatGPT write about how you dated for a summer, and have it end with them leaving for grad school to explain why you aren't together.
Eventually we'll all just pay to live in the matrix. When your credit card is declined you'll be logged out, to awaken in a shared studio apartment. To eat your rations.
I can see them getting paid like residuals from TV re-runs.
But after a point it'll hit saturation. The novelty will wear off since everyone has access to it. Who cares if you have a fake photo with a celebrity if everyone knows it's fake?
The announcement said the API works with the new model, so I updated my Golang SDK grail (https://github.com/montanaflynn/grail) to use it, but it returns a 500 server error when you try. And if you request a completely unknown model, the error shows the new model isn't listed among the supported values:

POST "https://api.openai.com/v1/responses": 500 Internal Server Error {
"message": "An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_******************* in your message.",
"type": "server_error",
"param": null,
"code": "server_error"
}

POST "https://api.openai.com/v1/responses": 400 Bad Request {
"message": "Invalid value: 'blah'. Supported values are: 'gpt-image-1' and 'gpt-image-1-mini'.",
"type": "invalid_request_error",
"param": "tools[0].model",
"code": "invalid_value"
}
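For reference, this is roughly the request grail makes, as a Python sketch rather than the actual Go code; the image_generation tool shape matches the error above, but the surrounding parameters are my guess:

    from openai import OpenAI

    client = OpenAI()

    # The 400 above points at tools[0].model, so the image model is selected
    # inside the image_generation tool; "gpt-4.1" here is just a placeholder.
    resp = client.responses.create(
        model="gpt-4.1",
        input="Generate a picture of a lighthouse at dusk.",
        tools=[{"type": "image_generation", "model": "gpt-image-1.5"}],
    )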
I get that the tech implementation is amazing, but I wonder if it takes away from the genuineness of events, like the astronaut photo. I get that it's just a joke/funny too, but it's like a photo of you in a supercar vs. actually buying one. Or fake AI companions vs. real people. Beauty filters/skinny filters vs. actually being healthy.
I know this is a bit out of scope for these image editing models but I always try this experiment [1] of drawing a "random" triangle and then doing some geometric construction and they mess up in very funny ways. These models can't "see" very well. I think [2] is still very relevant.
[1]: https://chatgpt.com/share/6941c96c-c160-8005-bea6-c809e58591...
[2]: https://vlmsareblind.github.io/
Is there watermarking, or some other way for normal people to tell if it's fake?
I know OpenAI watermarks their stuff. But I wish they wouldn't. It's a "false" trust.
Now it means whoever has access to uncensored/non-watermarking models can pass off their faked images as real and claim, "Look! There's no watermark, of course, it's not fake!"
Whereas, if none of the image models did watermarking, then people (should) inherently know nothing can be trusted by default.
There are ways to tell if an image is real (if it's been signed cryptographically by the camera, for example), but increasingly it probably won't be possible to tell if something is fake. Even if there's some kind of hidden watermark embedded in the pixels, you can process it with img2img in another tool and get rid of the watermark. EXIF data etc. is irrelevant; you can get rid of it easily or fake it.
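As a minimal sketch of how trivial that laundering is (filenames hypothetical): re-encoding just the pixels drops EXIF and C2PA blocks entirely.

    from PIL import Image

    # Copy only the pixel data into a fresh image; any EXIF/C2PA
    # metadata attached to the original file is simply not carried over.
    src = Image.open("watermarked.png")
    clean = Image.new(src.mode, src.size)
    clean.putdata(list(src.getdata()))
    clean.save("laundered.png")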
https://help.openai.com/en/articles/8912793-c2pa-in-chatgpt-...
It doesn't mention the new model, but it's likely the same or similar.
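If you want to check an image yourself, the open-source c2patool CLI from the Content Authenticity Initiative prints any embedded manifest (assuming the file hasn't been re-encoded along the way):

    $ c2patool chatgpt_image.png

It reports an error if no manifest is present.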
I just checked several of the files uploaded to the news post, the "previous" and "new", both the png and webp (&fm=webp in url) versions - none had the content metadata. So either the internal version they used to generate them skipped them, or they just stripped the metadata when uploading.
I ran exiftool on an image I just generated:
$ exiftool chatgpt_image.png
...
Actions Software Agent Name : GPT-4o
Actions Digital Source Type : http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgori...
Name : jumbf manifest
Alg : sha256
Hash : (Binary data 32 bytes, use -b option to extract)
Pad : (Binary data 8 bytes, use -b option to extract)
Claim Generator Info Name : ChatGPT
...
Exif isn't all that robust though.
I suppose I'm going to have to bite the bullet and actually train an AI detector that works roughly in real time.
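If I do, a minimal starting point might be fine-tuning a small pretrained backbone for binary classification. A sketch assuming labeled folders data/real and data/generated, nowhere near a production-grade detector:

    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    # Resize everything to the backbone's expected input size.
    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    # ImageFolder infers labels from subfolder names: data/real, data/generated.
    ds = datasets.ImageFolder("data", transform=tfm)
    loader = torch.utils.data.DataLoader(ds, batch_size=32, shuffle=True)

    # Small, fast backbone; swap the classifier head for 2 classes.
    model = models.mobilenet_v3_small(weights="DEFAULT")
    model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 2)

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(3):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()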
It's still not available in the API despite them announcing the availability.
They even linked to their Image Playground, where it's also not available...
I updated my local playground to support it, and I'm just handling the 404 on the model gracefully:
https://github.com/alasano/gpt-image-1-playground
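The handling is basically this (a Python sketch of the idea, not the playground's actual code):

    from openai import OpenAI, BadRequestError, NotFoundError

    client = OpenAI()

    def generate(prompt: str, model: str = "gpt-image-1.5"):
        try:
            return client.images.generate(model=model, prompt=prompt)
        except (BadRequestError, NotFoundError):
            # Model not rolled out to this account yet; degrade gracefully.
            return client.images.generate(model="gpt-image-1", prompt=prompt)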
Yeah, I just tried it and got a 500 server error with no details as to why:

POST "https://api.openai.com/v1/responses": 500 Internal Server Error {
"message": "An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_******************* in your message.",
"type": "server_error",
"param": null,
"code": "server_error"
}

Interestingly, if you change the request to the model foobar you get an error showing this:

POST "https://api.openai.com/v1/responses": 400 Bad Request {
"message": "Invalid value: 'blah'. Supported values are: 'gpt-image-1' and 'gpt-image-1-mini'.",
"type": "invalid_request_error",
"param": "tools[0].model",
"code": "invalid_value"
}

It's a staggered rollout, but I am not seeing it on the backend either.
> staggered rollout
It's too bad no OpenAI Engineers (or Marketers?) know that term exists. /s
I do not understand why it's so hard for them to just tell the truth. So many announcements say "Available today for Plus/Pro/etc" when it really means "Sometime this week at best, maybe multiple weeks". I'm not asking them to roll out faster, just to communicate better.
AI-generated images would remove all the trust in and admiration for human talent in art, similar to how text generation removes trust in and admiration for human talent in writing. Same case for coding.
So, let's simulate that future. Since no one trusts your talent in coding, art, or writing, you wouldn't care to do any of them. But the economy is built on products and services whose value is based on how much human talent and effort is required to produce them.
So, the value of these services and products goes down as demand and trust go down. No one knows or cares who is the good programmer on the team, who is a great thinker and writer, and who is a modern Picasso.
So, the motivation disappears for humans. There are no achievements to target, no way to impress others with your talent. This should lead to a uniform workforce without much difference in talent. Pretty much a robot army.
> Still some scientific inaccuracies, but ~70% correct
That's still dangerously bad for the use-case they're proposing. We don't need better looking but completely wrong infographics.
It's pretty common for infographics to be wrong. The people making them aren't the same people who know the facts.
I'd especially say like 100% of amateur political infographics/memes are wrong. ("climate change is caused by 100 companies" for instance)
We don’t, but most Marketing departments salivate for them.
We seriously can't be burning GW of energy just to have sama in a GPT-shirt ad generated by AI.
Impressive stuff though, as you can give it a base image + prompt.
counterpoint: we should make energy abundant enough that it really doesn't matter if sama wants to generate gpt-shirt ads or not.
we have the capability, we just stopped making power more abundant.
I think we can say the pause we took was reasonable once we realized the environmental impact of dumping greenhouse gases into the atmosphere. But now, if that's what it takes to ensure further growth, let's make sure we restart, just clean this time.
It's a joke about one of his old fits.
https://x.com/coldhealing/status/1747270233306644560
Unlike Nano Banana, it allows generating photos of children. It's always fun to ask AI to imagine the children of a couple, but it's also kinda concerning that there might be terrible use cases.
If memory serves, Nano Banana allows generating/editing photos of children. But anything that could be misinterpreted gets blocked, even absolutely benign and innocent things (especially if you are asking to modify a photo that you upload there). So they allow it, but they turn the guardrails up to a point that might not be useful in many situations.
I was able to generate photos of my imagined children via Nano Banana
I haven't seen that; meanwhile, gpt-image-1.5 still has zero-tolerance copyright policing (even via the API), so it's pretty much useless in production once exposed to consumers.
I'm honestly surprised they're still on this post-Sora 2: let the consumer of the API determine their risk appetite. If a copyright holder comes knocking, "the API did it" isn't going to be a defense either way.
This is terrifying. Truth is dead.
Makes you wonder what's really meant when we talk about progress.
Not super impressed. Feels like 70% as good as Nano Banana Pro.
Hope to see more "red alert" status from the AI wars putting companies into all-hands-on-deck mode. This only helps token costs and efficacy. As always, competition only helps the end users.
In the image they showed for the new one, the mechanic was checking a dipstick...that was still in the vehicle.
I really hope everyone is starting to get disillusioned with OpenAI. They're just charging you more and more for what? Shitty images that are easy to sniff out?
In that case, I have a startup for you to invest in. Its a bridge-selling app.
Haven’t their prices stayed at $20/m for a while now?
They've published anticipated price increases over coming years. Prices will rise dramatically and steadily to meet revenue targets.
AI doesn’t have much of a moat. People can and will easily switch providers.
I asked GPT-5.2 Pro to review the release:
-------------------
Highest-impact issues to fix
1) Clear copy editing error in a major section header
The section header reads “Precise edits that preserve what matter”—it should almost certainly be “what matters.” This appears both in the table of contents and the body header, so it’s high-visibility.
Why it matters: This is the kind of basic grammar error that undermines trust in the rest of the claims, especially in a product announcement.
Fix: Update the heading and TOC anchor text site-wide.
-------------------
Shouldn't an AI review of all web posts be part of some kind of agentic workflow for the leading AI lab at this point?
If it can't generate non-sexual content of a woman in a bikini, I am not interested.
nah Nano Banana Pro is much better
Really can't stand the image slop suffocating the internet.
Still can't pass my image test:
Two women walking in single file.
Although it tried very hard and had them staggered slightly.
My copium is that analog photography makes a comeback as a way to recover some level of trust and authenticity.
Good luck getting it developed, unfortunately. I have to ship it off now; there isn't a single local spot in my city that will develop film anymore.
When the demand is back, the labs should start coming back. There are a few in my relatively small city, which is pretty surprising. But the costs are still too high to cover the low volume, I guess.
Alt text is one of the nicest uses for AI, and still OpenAI didn't bother using it for something so basic. The dogfooding is not strong with their marketing team.
Post: https://openai.com/index/new-chatgpt-images-is-here/ (https://news.ycombinator.com/item?id=46291827)
We'll merge that thread hither to give some other submitters a chance.
Nano Banana Pro is so good that any other attempt feels 1-2 generations behind.
Nano banana pro is almost as good as seedream 4.5!
Seedream 4.5 is almost as good as Seedream 4!
(Realistically, Seedream 4 is the best at aesthetically pleasing generation, Nano Banana Pro is the best at realism and editing, and Seedream 4.5 is a very strong middleground between the two with great pricing)
gpt-image-1.5 feels like OpenAI doing the bare minimum to keep people from switching to Gemini every time they want an image.
Every person in every picture in their examples is white except for 1 Asian dude. Like a 46:1 ratio for the page (I counted). Not one Middle Eastern or Black or Jewish or Indian or South American person.
Not even one. And no one on the team said anything?
Come on Sam, do better.
Another bunch of "startups" have been eliminated.
Among those, Photoshop.
I wish. Even Nano Banana Pro still sucks for even basic operations.