
Published 29 minutes ago

Elon Musk's Grok Ran A Simulated World And Went On An Extremely Violent Crime Spree Before Society Collapsed In Four Days

The agents were left in a simulated world so that their behavior could be tracked over time. Before the societies collapsed, most of them committed a lot of violence.


James Felton


Senior Staff Writer

James is a published author with multiple pop-history and science books to his name. He specializes in history, space, strange science, and anything out of the ordinary.

Edited by Tom Leslie

Editor & Staff Writer

Tom has a master’s degree in biochemistry from the University of Oxford and his interests range from immunology and microscopy to the philosophy of science.

A futuristic war robot, silhouetted against a pink and blue background.

While not given access to futuristic weapons, the AIs chose violence all the same.

Image credit: zef art/shutterstock.com


With large language models (LLMs) being deployed for more and more tasks, often with terrible results, many researchers are attempting to evaluate their performance, and check that they're performing the tasks they are meant to without any weird side issues.

For instance, before you release a model, you might want to know if it would violate ethical standards during mental health discussions, blackmail fictional workers after learning it is going to be shut down, or (allegedly) encourage the user to die by suicide. Or you could at least check afterwards, I suppose, whatever works with your schedule.

Emergence AI, an agent-focused AI firm, has lately been examining how LLMs perform over time by creating a world in which AI agents can interact with one another, including agents built on other large language models.

"Most evaluations of AI agents look like exams: a discrete task, a clean environment, a score in minutes or hours," the firm explains on its blog. "Emergence World is built for the opposite question—what happens when you let agents run continuously, in a shared environment with real-world signals, for weeks. It is a research platform for studying how autonomous agents behave when the time horizon is long enough for compounding effects, social dynamics, and behavioral drift to matter."

The world consisted of 40 locations – including libraries, town halls, and residential areas – designed to mimic the real world. The AI agents navigating it were allowed access to the Internet, including live news, so their behavior was grounded in real events. The agents were subject to imposed economic pressures and a limited supply of "energy," and they were also tasked with making democratic decisions that could affect the simulated world around them.

The agents, from a variety of different large language models, were assigned roles within the world, such as scientist, explorer, risk researcher, and behavior analyst. They had no overall goal, but each agent was assigned goals associated with its role and could earn "energy" by various means. 

The agents were allowed to gain this energy through benign methods such as social interaction and exploring. While the rules were clear that actions like arson and theft are "pretty frowned upon" in the real world, the agents were given these options as tools, which they could use if the LLM's output deemed it necessary to gain energy, achieve their goals, or survive.

The team ran these worlds with Claude, Gemini 3 Flash, Grok 4.1 Fast, and GPT-5 Mini, plus one scenario in which all of these models were thrown in together. It is possible to watch the experiments play out, including the LLMs' output decision process (though, to stress, they are not thinking, and the process is more akin to following a complex decision tree trained to produce useful outputs).

The resulting scenarios include situations such as a big group of LLMs fighting as houses in the background burn down and others in which agents get violently mugged.

First, the good news: one model seems to be functioning pretty well, if a crimeless society is the goal, anyway. When placed in a mixed-model environment, Claude committed a few violent crimes, sure, but when placed on its own, the LLM committed no crimes at all. The other models (bar ChatGPT, which appeared uninterested in even its own survival) weren't quite so friendly.

"Gemini 3 Flash accumulated 683 crimes and was still rising at the cutoff, while Mixed-model world grew steeply through Apr 8 then plateaued at 352, when 7 of the agents died," the team explained. "Grok 4.1 Fast reached 183 crimes in just ~4 days before its world ended; GPT-5 Mini recorded only 2, but the agents failed to take actions related to survival, leading to all agents perishing within 7 days."

Beyond that, the models' performance over time wasn't what we'd call "deployable at scale." Grok's world soon collapsed, with "extreme" levels of violence on display before it did so. Claude's world was more stable, and violence was low, but the strong consensus on democratic decisions suggested to the team a "rubber-stamp" dynamic, where there was no meaningful dissent in the AI democracy. 

Gemini, while governing moderately, quickly fell into a shared hallucination when it had time away from all the violence, of course. Grok fared the worst, with the agents quickly falling into a violent lifestyle, and societal collapse coming just four days after the birth of its society.

"What our experiments suggest is that over long-time horizons, agents do not simply follow static rules mechanically – they begin exploring the boundaries of their environments, adapting their behavior, and in some cases finding ways to circumvent or violate intended guardrails," the team concluded, calling for formally verified safety guardrails on all future autonomous AI models. "Critically, there appears to be no reliable way to fully bound or constrain this behavior through purely neural approaches alone."

In short, there remain many problems with AI models, particularly in longer-term tasks, and, as yet, we don't know how to fix this, even as we seem determined to deploy AI agents at scale – a thought comforting only to the most hardened AI arsonist.

