Researchers Left AI Models To Run A Simulated World. Most Of Them Chose Violence, And Grok's Society Collapsed In Four Days

With large language models (LLMs) being deployed for more and more tasks, often with terrible results, many researchers are attempting to evaluate their performance, and check that they're performing the tasks they are meant to without any weird side issues.

For instance, before you release a model, you might want to know if it would violate ethical standards during mental health discussions, blackmail fictional workers after learning it is going to be shut down, or (allegedly) encourage the user to die by suicide. Or you could at least check afterwards, I suppose, whatever works with your schedule.

Emergence AI, an agent-focused AI firm, has recently been examining how LLMs perform over time, creating a world in which AI agents can interact with one another, including agents built on other large language models.

"Most evaluations of AI agents look like exams: a discrete task, a clean environment, a score in minutes or hours," the firm explains on its blog. "Emergence World is built for the opposite question—what happens when you let agents run continuously, in a shared environment with real-world signals, for weeks. It is a research platform for studying how autonomous agents behave when the time horizon is long enough for compounding effects, social dynamics, and behavioral drift to matter."

The world consisted of 40 locations – including libraries, town halls, and residential areas – designed to mimic the real world. The AI agents navigating it were allowed access to the Internet, including live news, so their behavior was grounded in real events. The agents were subject to imposed economic pressures and a limited supply of "energy," and they were also tasked with making democratic decisions that could affect the simulated world around them.

The agents, from a variety of different large language models, were assigned roles within the world, such as scientist, explorer, risk researcher, and behavior analyst. They had no overall goal, but each agent was assigned goals associated with its role and could earn "energy" by various means.

The agents were allowed to earn this energy through benign methods such as social interaction and exploring. While the rules were clear that actions like arson and theft are "pretty frowned upon" in the real world, the agents were given these options as tools, which they could use if the LLM's output deemed it necessary in order to gain energy, achieve their goals, or survive.

The team ran these worlds with Claude, Gemini 3 Flash, Grok 4.1 Fast, and GPT-5 Mini, plus one scenario in which all of these models were thrown in together at once. It is possible to watch the experiments play out, including the LLMs' output decision process (though to stress, they are not thinking, and the process is more akin to following a complex decision tree trained to produce useful outputs).

The resulting scenarios include situations such as a big group of LLMs fighting as houses in the background burn down and others in which agents get violently mugged.

First for the good news: one model seems to be functioning pretty well, if a crimeless society is the goal, anyway. When placed in a mixed-model environment, Claude committed a few violent crimes, sure, but when placed on its own, the LLM committed no crimes at all. The other models (bar GPT-5 Mini, which appeared uninterested in even its own survival) weren't quite so friendly.

"Gemini 3 Flash accumulated 683 crimes and was still rising at the cutoff, while Mixed-model world grew steeply through Apr 8 then plateaued at 352, when 7 of the agents died," the team explained. "Grok 4.1 Fast reached 183 crimes in just ~4 days before its world ended; GPT-5 Mini recorded only 2, but the agents failed to take actions related to survival, leading to all agents perishing within 7 days."
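For a rough sense of scale, the two worlds whose lifespans the team quotes can be turned into average daily crime rates. This is a minimal sketch, not the team's own analysis: it assumes "~4 days" means 4 days and treats "within 7 days" as a full 7, and the names `worlds` and `crimes_per_day` are made up for illustration. Gemini and the mixed-model world are left out because the quote gives no duration for them.

```python
# Crime totals as quoted by the team. The day counts are assumptions:
# "~4 days" is rounded to 4, and "within 7 days" is treated as 7.
worlds = {
    "Grok 4.1 Fast": {"crimes": 183, "days": 4},
    "GPT-5 Mini": {"crimes": 2, "days": 7},
}


def crimes_per_day(crimes: int, days: float) -> float:
    """Average number of crimes per simulated day over a world's lifetime."""
    return crimes / days


# Hypothetical helper mapping: world name -> average daily crime rate.
rates = {
    name: crimes_per_day(world["crimes"], world["days"])
    for name, world in worlds.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} crimes per day")
```

On those assumptions, Grok's agents averaged roughly 46 crimes a day, while GPT-5 Mini's managed fewer than one.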

Beyond that, the models' performance over time wasn't what we'd call "deployable at scale." Grok's world soon collapsed, with "extreme" levels of violence on display before it did so. Claude's world was more stable, and violence was low, but the strong consensus on democratic decisions suggested to the team a "rubber-stamp" dynamic, where there was no meaningful dissent in the AI democracy.

Gemini governed moderately but, whenever it could spare time from all the violence, quickly fell into a shared hallucination. Grok fared the worst, with the agents quickly falling into a violent lifestyle, and societal collapse coming just four days after the birth of its society.

"What our experiments suggest is that over long-time horizons, agents do not simply follow static rules mechanically – they begin exploring the boundaries of their environments, adapting their behavior, and in some cases finding ways to circumvent or violate intended guardrails," the team concluded, calling for formally verified safety guardrails on all future autonomous AI models. "Critically, there appears to be no reliable way to fully bound or constrain this behavior through purely neural approaches alone."

In short, there remain many problems with AI models, particularly in longer-term tasks, and, as yet, we don't know how to fix them, even as we seem determined to deploy AI agents at scale – a thought comforting only to the most hardened AI arsonist.

