“In a kingdom long forgotten, there lived a cartographer named Elias whose maps were famous throughout the world.” That’s what ChatGPT just spewed when I asked it to tell me a story.
Now you try. Open up your preferred text-based AI and ask for a story. Chances are it will invent a tale about someone called Elias Thorne who takes care of a remote lighthouse – although he might also be a librarian, a clockmaker, a detective, or some other wistfully evocative profession. Map-maker fits the bill pretty nicely.
In a recent study – which is yet to undergo peer review – researchers analyzed 20,000 AI stories generated by four popular large language models (LLMs), finding that the name Elias appeared in over a quarter of them. They also noted that a small collection of 11 terms appeared in 88 percent of these bot-based fables.
Among this group of words were the names Elias, Elara, and Mara. The word “lighthouse”, meanwhile, appeared in more than half of the stories, with “clockmaker” and “librarian” also on the list.
Given that AI models are trained on pre-existing data, one may assume that the prevalence of these themes in chatbot fantasies reflects the fact that they commonly appear in human-generated fiction. Yet the study authors found that this is absolutely not the case, and that the name Elias is 900 times more common in AI stories than it is in real literature.
Training data therefore can’t explain AI’s obsession with Elias Thorne and his lighthouse. However, the researchers believe that the weird phenomenon may reflect the fact that LLMs have to follow certain rules when creating their own content.
For example, after training, most AI models show a tendency to reference adult content or trademarked characters, yet are programmed not to present such material to users, thus averting any potential lawsuits for their developers. This immediately reduces the reference pool from which these models can draw when producing fiction.
This is then constrained further by preference data, whereby AI models learn to tailor their output based on user feedback. For example, LLMs often present users with two different answers to the same prompt, and ask them which they prefer. The responses they receive then inform their future output.
The study authors therefore speculate that all AI models are trained on the same small amount of preference data, which shows that people like hearing stories about Elias, lighthouses, clockmakers, and librarians.
As more and more of these stories appear online, they are fed back into the training data used to program AI, which then learns from its own slop.
This weird vicious cycle is known as mode collapse and creates a situation in which bots end up training themselves. In other words, the more stories AI writes about Elias Thorne’s lighthouse, the more it learns that these are the kinds of tales people like to read, to the point that almost every AI-generated story ends up following the exact same formula.
According to software engineer Daniel May, the phenomenon is relatively recent, with Google searches for Elias Thorne exploding in the early months of 2026. Since then, the character has escaped from the realm of chatbot fiction and is now listed as the author of numerous dodgy AI-generated books – including a dangerous-sounding alternative guide to cancer treatment.
More broadly, mode collapse illustrates one of AI’s major limitations, demonstrating how easily it can contaminate itself with its own drivel if the input it receives isn’t broad enough.
The preprint, which has not yet been subjected to peer review, is posted to arXiv.
[H/T: 404 Media]





