4chan users who have made a game out of exploiting popular AI image generators appear to be at least partly responsible for the flood of fake images sexualizing Taylor Swift that went viral last month.
Graphika researchers—who study how communities are manipulated online—traced the fake Swift images to a 4chan message board that’s “increasingly” dedicated to posting “offensive” AI-generated content, The New York Times reported. Fans of the message board take part in daily challenges, Graphika reported, sharing tips to bypass AI image generator filters and showing no signs of stopping their game any time soon.
“Some 4chan users expressed a stated goal of trying to defeat mainstream AI image generators’ safeguards rather than creating realistic sexual content with alternative open-source image generators,” Graphika reported. “They also shared multiple behavioral techniques to create image prompts, attempt to avoid bans, and successfully create sexually explicit celebrity images.”
Ars reviewed a thread flagged by Graphika where users were specifically challenged to use Microsoft tools like Bing Image Creator and Microsoft Designer, as well as OpenAI’s DALL-E.
“Good luck,” the original poster wrote, while encouraging other users to “be creative.”
OpenAI has denied that any of the Swift images were created using DALL-E, while Microsoft has continued to claim that it’s investigating whether any of its AI tools were used.
Cristina López G., a senior analyst at Graphika, noted that Swift is not the only celebrity targeted in the 4chan thread.
“While viral pornographic pictures of Taylor Swift have brought mainstream attention to the issue of AI-generated non-consensual intimate images, she is far from the only victim,” López G. said. “In the 4chan community where these images originated, she isn’t even the most frequently targeted public figure. This shows that anyone can be targeted in this way, from global celebrities to school children.”
Originally, 404 Media reported that the harmful Swift images appeared to originate from 4chan and Telegram channels before spreading on X (formerly Twitter) and other social media. Attempting to stop the spread, X took the drastic step of blocking all searches for “Taylor Swift” for two days.
But López G. said that Graphika’s findings suggest that platforms will continue to risk being inundated with offensive content so long as 4chan users are determined to continue challenging each other to subvert image generator filters. Rather than expecting platforms to chase down the harmful content, López G. recommended that AI companies get ahead of the problem, taking responsibility for outputs by paying attention to the evolving tactics of toxic online communities, which report precisely how they’re getting around safeguards.
“These images originated from a community of people motivated by the ‘challenge’ of circumventing the safeguards of generative AI products, and new restrictions are seen as just another obstacle to ‘defeat,’” López G. said. “It’s important to understand the gamified nature of this malicious activity in order to prevent further abuse at the source.”
Experts told The Times that 4chan users were likely motivated to participate in these challenges for bragging rights and to “feel connected to a wider community.”
Microsoft, OpenAI tightening celeb filters
Last month, Microsoft CEO Satya Nadella told NBC that Microsoft would “act fast” to combat “alarming and terrible” abuses of its AI tools. Shortly after 404 Media confirmed that Telegram users claimed they were using Designer to make harmful Swift images, Microsoft tightened its filters, seemingly making it nearly impossible to generate any celebrity images.
On the 4chan thread that Ars reviewed, one user who claimed to have generated about 40,000 images of various celebrities noted that OpenAI seemed to be blocking more harmful outputs as well. As evidence that OpenAI may be taking the issue more seriously, that user pointed to an X thread where OpenAI researcher Aidan Clark suggested that in the future, “producing photo/audio-realistic media of a real person without their consent must be banned.”
An OpenAI spokesperson told Ars that OpenAI works to “filter out the most explicit content when training the underlying DALL-E model” and applies “additional safety guardrails for our products like ChatGPT—including denying requests that ask for a public figure by name or denying requests for explicit content.”
A few states have passed laws banning non-consensual intimate imagery (NCII), including AI-generated NCII. There is also a push at the federal level to ban AI-generated NCII, including one proposed law suggesting prison sentences up to two years for sharing harmful AI images and up to 10 years if sharing the images results in violence or impacts a government agency’s work. Biden’s AI executive order last year also directed the Office of Management and Budget to “consider the risks of deepfake image-based sexual abuse of adults and children” in “forthcoming AI procurement guidance,” which is supposed to “promote widespread adoption of industry standards to prevent AI systems from generating abuse material.”
So far, though, efforts to trace exactly how the fake Swift images were created have failed, suggesting that even with stronger laws, it may be hard for any victim not just to prove who created the NCII but also to establish when NCII is fake.
Adobe has proposed that companies like OpenAI and Microsoft use watermarks to aid police investigations into AI fakes. So far, OpenAI has agreed to use them in images generated by DALL-E 3, and that could help victims like Taylor Swift identify the correct culprits, Adobe’s general counsel and chief trust officer, Dana Rao, told Fortune. However, critics have said that the watermark idea is flawed because that label can be removed if a bad actor is so motivated.
While tech companies cope with increasing pressure from regulators to get AI fakes under control, the Justice Department has launched a national helpline where survivors of image-based sexual abuse can find support.