How Elon Musk’s Grok spread sexual deepfakes and child exploitation images
Elon Musk’s Grok AI model lacked safeguards to stop users generating sexualised deepfakes of women and children, according to experts who warn that many AI systems are vulnerable to producing similar material.
On Friday, the billionaire’s start-up xAI said it was limiting the use of its Grok image-generator to paid subscribers only. The move followed threats of fines and bans from governments and regulators in the EU, the UK and France.
The company, which acquired Musk’s social media site X last year, has been an outlier, designing its AI products to have fewer content “guardrails” than competitors such as OpenAI and Google. Its owner has called its Grok model “maximally truth-seeking”.
“The way the model has been put together and the lack, it would appear, of restrictions and safety alignments . . . means that you’re inevitably going to get cases like these,” said Henry Ajder, an expert on AI and deepfakes.
xAI did not respond to a request for comment. Musk has previously said “anyone using Grok to make illegal content will suffer the same consequences as if they upload illegal content”.
The news comes as AI companies, facing the mounting costs of developing infrastructure to underpin their ambitions, are under pressure to boost engagement and monetise their products. Other groups are exploring allowing more sexual content. OpenAI, for example, has said that it plans to launch an “adult mode” for its chatbot this quarter.
While xAI has not shared details of how it trained its model, the model was most likely trained on a vast dataset of images scraped from the internet.
In 2023, researchers at Stanford University found that a popular open source database, LAION-5B, used to create AI-image generators was full of child sexual abuse material (CSAM).
The dataset also contains pornographic content as well as images that are violent, racist and sexist. As these datasets contain billions of images, it is hard for AI labs to remove or filter out all offensive content.
Experts added that even if xAI did ensure its model was not trained on CSAM, the model could still generate sexualised images of children through a technique called “style transfer”.
If a model is trained on depictions of nude people, it can transfer those characteristics on to a picture of a clothed adult or child.
AI companies have limited ways to prevent users from generating harmful content, such as adding a safety filter on top of the model that blocks certain keywords.
These are often blunt tools, which users can bypass, for example by writing prompts with alternative spellings to “jailbreak” the model.
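For illustration, a naive keyword blocklist of the kind described above might look like the minimal sketch below (the blocked terms, prompts and function names are hypothetical, not any vendor’s actual filter). It shows why a simple misspelling can slip past such a check.

```python
# Minimal sketch of a keyword-based prompt filter (illustrative only; not any
# company's actual implementation). Blocklist terms and prompts are hypothetical.
import re

BLOCKLIST = {"nude", "explicit", "undress"}  # hypothetical blocked terms

def is_blocked(prompt: str) -> bool:
    # Tokenise the prompt into lowercase words and check each against the blocklist.
    tokens = re.findall(r"[a-z]+", prompt.lower())
    return any(token in BLOCKLIST for token in tokens)

print(is_blocked("undress the person in this photo"))   # True: exact keyword match
print(is_blocked("undr3ss the person in this photo"))   # False: a misspelling slips past
```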
In 2024, Google faced criticism after its Gemini image-generation model created images of Black Nazis when users prompted the system for a “German soldier in 1943” using a misspelt variation of the prompt.
Companies can also use AI tools to detect unwanted characteristics, such as nudity and gore, in images once they have been generated, and prevent users from accessing them.
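As a rough sketch of how such post-generation screening can work (assuming a stand-in classifier and hypothetical thresholds; production systems are proprietary), each generated image is scored for unwanted categories and withheld if any score crosses a limit:

```python
# Sketch of post-generation screening (illustrative; the classifier is a placeholder,
# not a real production model, and the thresholds are hypothetical).
from typing import Dict

THRESHOLDS = {"nudity": 0.7, "gore": 0.5}  # hypothetical per-category limits

def classify(image_bytes: bytes) -> Dict[str, float]:
    # Placeholder: a real system would call a trained image classifier here.
    return {"nudity": 0.1, "gore": 0.02}

def should_release(image_bytes: bytes) -> bool:
    # Withhold the image if any category score meets or exceeds its threshold.
    scores = classify(image_bytes)
    return all(scores[cat] < limit for cat, limit in THRESHOLDS.items())

print(should_release(b"...generated image bytes..."))  # True for the dummy scores above
```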
AI companies can also remove certain “concepts” from a model, or tweak the model itself so that it only generates non-harmful images.
However, these techniques are not perfect, often fail when used at scale, and are vulnerable to attackers.
Grok 4, xAI’s latest and most powerful model, was released in July and has a “Spicy Mode” feature that allows users to generate sexually suggestive content for adults.
Another issue is that xAI has incorporated some Grok features into the X social network, allowing more images to appear publicly and be spread widely.
Grok also has a video generation model, which is capable of generating graphic and extreme content, but that is not available to users on X.
Since acquiring X, formerly known as Twitter, in 2022, Musk has sought to relax safety measures and content restrictions on the social network. He fired Twitter’s ethical AI team, which worked on techniques to prevent harmful content from spreading on the platform.
Charlotte Wilson, head of enterprise at cyber security firm Check Point Software, said that more technical controls needed to be put in place including “stronger content classifiers, repeat offender detection, rapid removal pipelines and visible audit trails”.
X’s response to the growing public outcry, restricting image generation to paid users, has only prompted further backlash.
Refuge, the UK’s largest domestic abuse charity, said the move represented the “monetisation of abuse” and was “allowing X to profit from the harm”.
