How Apple Plans to Improve AI Image Editors


Apple might be dead last in the AI race—at least when you consider competition from companies like OpenAI, Google, and Meta—but that doesn’t mean the company isn’t working on the tech. In fact, it seems most of the work Apple does on AI is behind the scenes: While Apple Intelligence is, well, there, the company’s researchers are working on other ways to improve AI models for everyone, not just Apple users. The latest project? Improving AI image editors based on text prompts.

In a paper published last week, researchers introduced Pico-Banana-400K, a dataset of 400,000 text-guided image edits curated to improve AI-based image editing. Apple believes its dataset improves upon existing sets by including higher-quality images with more diversity: The researchers found that existing datasets either rely on images produced by AI models or aren't varied enough, both of which can hinder efforts to improve the models.

Funnily enough, Pico-Banana-400K was built using Nano Banana, Google's image-editing model. The researchers say they used Nano Banana to generate 35 different types of edits, then tapped Gemini-2.5-Pro to assess the quality of those edits and decide whether each one should remain part of the overall dataset.
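
For a rough sense of how that generate-then-judge loop works, here's a minimal Python sketch. The function names, the quality threshold, and the stubbed-out model calls are all placeholders for illustration, not anything exposed by the paper or the dataset:

```python
# A minimal sketch of the generate-then-judge loop described in the paper.
# `generate_edit` and `score_edit` are hypothetical stand-ins for calls to
# Nano Banana and Gemini-2.5-Pro; they are not real APIs.

from dataclasses import dataclass

@dataclass
class EditCandidate:
    source_image: str     # path to the original photo
    instruction: str      # text prompt, e.g. "apply a Van Gogh style"
    edited_image: str     # path to the model's output
    quality_score: float  # judge score, 0.0 to 1.0

def generate_edit(image: str, instruction: str) -> str:
    """Placeholder for the image-editing model (Nano Banana)."""
    return image.replace(".jpg", "_edited.jpg")

def score_edit(image: str, instruction: str, edited: str) -> float:
    """Placeholder for the judge model (Gemini-2.5-Pro)."""
    return 0.8

def build_dataset(pairs, quality_threshold=0.7):
    """Generate an edit for each (image, instruction) pair,
    then keep or reject it based on the judge's score."""
    kept, rejected = [], []
    for image, instruction in pairs:
        edited = generate_edit(image, instruction)
        score = score_edit(image, instruction, edited)
        candidate = EditCandidate(image, instruction, edited, score)
        (kept if score >= quality_threshold else rejected).append(candidate)
    return kept, rejected

kept, rejected = build_dataset([("photo.jpg", "apply a Van Gogh style")])
```

Note that the rejected edits aren't necessarily wasted: paired with a successful edit of the same image, a failed attempt is exactly what makes up a "preference pair" in the breakdown below.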

Of those 400,000 images, 258,000 are single-edit samples (which pair an original image with an edited version); 56,000 are "preference pairs," which set a failed edit generation against a successful one; and 72,000 are "multi-turn sequences," which walk through two to five edits in a row.
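
To make those three buckets concrete, here's how the record types might look if you sketched them in Python. The field names are my own assumptions for illustration, not the dataset's actual schema:

```python
# Illustrative record types for the three subsets; field names are assumed.

from dataclasses import dataclass, field

@dataclass
class SingleEdit:             # ~258,000 samples
    original: str             # the source photo
    instruction: str          # the text prompt
    edited: str               # the accepted edited result

@dataclass
class PreferencePair:         # ~56,000 samples
    original: str
    instruction: str
    successful_edit: str      # the edit the judge accepted
    failed_edit: str          # the edit the judge rejected

@dataclass
class MultiTurnSequence:      # ~72,000 samples
    original: str
    steps: list = field(default_factory=list)  # two to five (instruction, edited) pairs

example = SingleEdit("beach.jpg", "add film grain", "beach_grainy.jpg")
```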

The researchers note that different functions had different success rates in this dataset. Global edits and stylization are "easy," achieving the highest success rates; object semantics and scene context are "moderate"; and precise geometry, layout, and typography are "hard." The highest-performing function, "strong artistic style transfer," which could include changing an image's style to "Van Gogh" or anime, has a 93% success rate. The lowest-performing function, "change font style or color of visible text if there is text," only succeeded 58% of the time. Other tested functions include "add new text" (67% success rate), "zoom in" (74%), and "add film grain or vintage filter" (91%).

Unlike many of Apple’s products, which are typically locked to the company’s own platforms, Pico-Banana-400K is open for all researchers and AI developers to use. It’s cool to see Apple researchers contributing to open research like this, especially in an area where Apple generally lags. Will we actually get an AI-powered Siri anytime soon? Unclear. But it is clear Apple is actively working on AI, perhaps just in its own way.
