How ChatGPT’s New Image Generator Stacks Up Against Gemini’s Nano Banana Pro


Following the major image editing upgrades added to Google Gemini back in August—under the whimsical codename Nano Banana—it’s OpenAI’s turn to supercharge the tools you get for image manipulations inside ChatGPT. The new update is called GPT Image 1.5, and is rolling out now for all users.

One of the key improvements here, as was the case with Nano Banana, is the way that ChatGPT can now edit a specific part of an image while keeping everything else consistent. You can add or remove something, or change the color or style of something, without ending up with an entirely different looking picture.

Another feature ChatGPT has now borrowed from Gemini: the ability to combine multiple images together in one scene. Want you and your best friend in front of Sydney Harbour Bridge? No problem—just supply the source pictures and the AI will do the rest. You can also change visual styles while maintaining consistent details.

OpenAI says the new image editor and generator is able to follow instructions “more reliably,” and render pictures up to four times faster than before. Text can be more varied in style and size, and images should be more realistic and error-free in general—though OpenAI also admits there’s still room for improvement.

It’s the best image generator tool we’ve ever seen in ChatGPT, and it all looks impressive at first glance—but how does it stack up in practice against Gemini and Nano Banana? I put the two models to the test via the $20-per-month plan on both platforms (that’s ChatGPT Plus and Google AI Pro, respectively) to see how they compared.

Rendering and editing images

Open up ChatGPT on the web or on mobile and you’ll see there’s a new Images tab on the left-hand navigation pane. This takes you to a library of your existing pictures, together with some new prompts for creating images. You get some suggestions for prompts, plus an assortment of preset portrait image styles you can apply.

A journalist, lamp, and countryside scene courtesy of Gemini.
Credit: Gemini

A journalist, lamp, and countryside scene courtesy of ChatGPT.
Credit: ChatGPT

I tested the new GPT Image 1.5 model by getting ChatGPT to generate a busy tech journalist, a lamp in the middle of an empty warehouse, and a cartoon-style rolling landscape of hills in the fog. I then got Gemini to create the same pictures with the same prompts. While the results varied quite a bit, they were roughly equal in quality and realism: there was the occasional issue with weird physics and repetition, but nothing too bad.

Both ChatGPT and Gemini are now quite competent at clean image edits, too: Both AI bots seamlessly switched the journalist’s clothing to a shirt and tie without touching any other part of the picture. This would have taken a significant amount of time to do manually, even by a Photoshop expert, and shows just how transformative AI imaging is becoming.

Color changes were all handled with aplomb, but the AIs struggled a bit with perspective changes, where I asked to see the same shot from another angle. In these cases, instructions were less well-followed and the images were less consistent (as new areas needed to be rendered), though ChatGPT did a little better than Gemini at getting good results.

Clothing can now be swapped out in seconds (Gemini edition).
Credit: Gemini

Clothing can now be swapped out in seconds (ChatGPT edition).
Credit: ChatGPT

The classic “remove an object from this picture” challenge posed no problem: Both Gemini and ChatGPT were able to remove a cottage from the countryside scene with surgical precision, leaving everything else intact. Again, this is the kind of time-intensive image edit that would previously have needed a lot of careful effort, and that can now be done in seconds.


Gemini’s attempt at removing a cottage.
Credit: Gemini

ChatGPT’s attempt at removing a cottage.
Credit: ChatGPT

Combining and remixing images

Another talent ChatGPT and Gemini now have is combining images. You can take separate photos of you and your parents, put them together in the same shot, and then add in a background of wherever you like. You can get perfect family photos without actually gathering your relatives together or going anywhere.

This was an area where Gemini and ChatGPT did struggle a bit more: The editing dexterity was still impressive, but the results didn’t always look like a single, coherent scene. Lighting is sometimes off, or elements from different images appear at different scales, and you’ll have to do a bit more tweaking, editing, and reprompting to get everything right.

ChatGPT did fare slightly better at blending different images and elements together, and changing the overall look of a picture. When I tried to get the AIs to mix all my images together in a moody film noir shot, ChatGPT produced something pretty consistent—the Gemini effort looked a lot more like a cut-and-paste job.

It can be fun remixing photos again and again (adding new people, changing the weather, moving the location), and both these bots are now capable of some rather incredible results. Remixing photos of family and friends will be popular, but it’s not all that easy: With people you know, anything the AI generates tends to look wrong, because neither ChatGPT nor Gemini knows exactly what these people look like, how they smile, how they’re built, or how they tend to stand or sit.

Gemini can combine images—but they look like different images.
Credit: Gemini

ChatGPT did a better job at creating a new image that looked correct.
Credit: ChatGPT

In terms of ChatGPT vs. Gemini, they’re both at a high level now, one that puts advanced Photoshop-style editing capabilities at everyone’s fingertips. If either AI model has the edge right now, it’s ChatGPT, but there’s not much in it. It’s also going to be fascinating to see where these image editing capabilities go next.
