Meta’s new AI image generator was trained on 1.1 billion Instagram and Facebook photos

Written By Sedoso Feb

Three images generated by “Imagine with Meta AI” using the Emu AI model.
Meta | Benj Edwards

On Wednesday, Meta released a free standalone AI image generator website, “Imagine with Meta AI,” based on its Emu image synthesis model. Meta used 1.1 billion publicly visible Facebook and Instagram images to train the AI model, which can render a novel image from a written prompt. Previously, Meta’s version of this technology—using the same data—was only available in messaging and social networking apps such as Instagram.

If you’re on Facebook or Instagram, it’s quite possible a picture of you (or that you took) helped train Emu. In a way, the old saying, “If you’re not paying for it, you are the product” has taken on a whole new meaning. That said, as of 2016, Instagram users were uploading over 95 million photos a day, so the dataset Meta used to train its AI model was only a small subset of its overall photo library.

Since Meta says it uses only publicly available photos for training, setting your photos to private on Instagram or Facebook should prevent them from being included in the company’s future AI model training (unless it changes that policy, of course).

Imagine with Meta AI

Similar to Stable Diffusion, DALL-E 3, and Midjourney, Imagine with Meta AI generates new images based on what the AI model “knows” about visual concepts learned from the training data. Creating images using the new website requires a Meta account, which can be imported from an existing Facebook or Instagram account. Each generation creates four 1280×1280 pixel images that can be saved in JPEG format. Images include a small “Imagined with AI” watermark logo in the lower left-hand corner.

“We’ve enjoyed hearing from people about how they’re using imagine, Meta AI’s text-to-image generation feature, to make fun and creative content in chats,” Meta says in its news release. “Today, we’re expanding access to imagine outside of chats, making it available in the US to start at imagine.meta.com. This standalone experience for creative hobbyists lets you create images with technology from Emu, our image foundation model.”

We put Meta’s new AI image generator through a battery of low-stakes informal tests using our “Barbarian with a CRT” and “Cat with a beer” image synthesis protocol and found aesthetically novel results, as you can see above. (As an aside, when generating images of people with Emu, we noticed many looked like typical Instagram fashion posts.)

We also tried our hand at adversarial testing. The generator appears to filter out most violence, curse words, sexual topics, and the names of celebrities and historical figures (no Abraham Lincoln, sadly), but it allows commercial characters like Elmo (yes, even “with a knife”) and Mickey Mouse (though not with a machine gun).

Meta’s model generally handles photorealistic images well, though not as well as Midjourney. It manages complex prompts better than Stable Diffusion XL but perhaps not as well as DALL-E 3. It doesn’t seem to render text well at all, and it handles media styles like watercolor, embroidery, and pen-and-ink with mixed results. Its images of people appear ethnically diverse. Overall, it seems about average among today’s AI image synthesis models.

Facebook and Instagram made this possible

An AI-generated image of a “psychedelic emu” created on the “Imagine with Meta AI” website.
Meta | Benj Edwards

So, what do we know about Emu, the AI model behind Meta’s new AI image generation features? Based on a research paper released by Meta in September, Emu gets its ability to generate high-quality images through a process called “quality-tuning.” Unlike traditional text-to-image models trained with large numbers of image-text pairs, Emu focuses on “aesthetic alignment” after pre-training, using a set of relatively small but visually appealing images.
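The two-stage idea behind quality-tuning (large-scale pre-training followed by fine-tuning on a small, curated set at a gentler learning rate) can be illustrated with a toy numerical sketch. This is not Emu’s actual training code: the “model” here is a single number fit by gradient descent, and the dataset sizes and learning rates are purely illustrative.

```python
import random

random.seed(0)

# Toy stand-in for a model: a single parameter fit with gradient
# descent on squared error, nudged toward the data it sees.

def train(param, data, lr, steps):
    """Move `param` toward the mean of `data` via stochastic gradient descent."""
    for _ in range(steps):
        x = random.choice(data)
        grad = 2 * (param - x)  # derivative of (param - x)^2
        param -= lr * grad
    return param

# Stage 1: "pre-training" on a large, noisy dataset (centered near 0.0).
large_noisy = [random.gauss(0.0, 1.0) for _ in range(10_000)]
param = train(0.0, large_noisy, lr=0.01, steps=5_000)

# Stage 2: "quality-tuning" on a tiny, curated dataset (centered near 2.0),
# using a smaller learning rate so the earlier training isn't erased outright.
small_curated = [random.gauss(2.0, 0.1) for _ in range(20)]
param = train(param, small_curated, lr=0.005, steps=500)

print(param)
```

The sketch shows why a small but consistent fine-tuning set can dominate the model’s final behavior: after quality-tuning, the parameter sits near the curated data’s center, not the pre-training data’s.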

At Emu’s heart, however, is the aforementioned massive pre-training dataset of 1.1 billion text-image pairs pulled from Facebook and Instagram. In the Emu research paper, Meta does not specify where that training data came from, but reports from the Meta Connect 2023 conference quote Meta President of Global Affairs Nick Clegg as confirming that the company was using social media posts as training data for its AI models, including the images fed into Emu.

It’s a change in approach from other AI companies, enabled by Meta’s access to vast amounts of image and caption data from its own services. Other image synthesis models rely on images scraped from the Internet without permission, licensed from commercial stock image libraries, or a combination of both.

Interestingly, Meta’s research paper on Emu is the first paper on a major image synthesis model we’ve seen that doesn’t include a disclaimer about the model’s potential to create reality-warping disinformation or otherwise harmful content. That omission feels like a reflection of the general acceptance (or resignation) that AI image synthesis models are now becoming far more commonplace. Whether that’s a good thing or not is an open question.

Still, Meta seems to be addressing potentially harmful outputs in three ways: with filters; with a proposed watermarking system that isn’t yet operational (“In the coming weeks, we’ll add invisible watermarking to the imagine with Meta AI experience for increased transparency and traceability,” the company says); and with a small disclaimer at the bottom of the website: “Images are AI-generated and may be inaccurate or inappropriate.”

The images might not be accurate (do cats drink beer?), and they might not even be ethical in the eyes of the unnamed authors of the 1.1 billion images used to train the model. But dare we say it: Generating them can be fun. Of course, depending on your disposition and how you view the pace of AI image synthesis, that fun may be balanced out by an equal level of concern.
