Artificial intelligence (AI) has already passed one hefty test, proving better than humans at those online "I am not a robot" CAPTCHA challenges. But can it pass the Turing test?
The Turing test (which renowned mathematician and computer scientist Alan Turing originally, and far more modestly, called the Imitation Game) asks an evaluator to tell a human from a machine through conversation alone. In the original version, a human evaluator reads a text conversation between one human and one AI, and must determine which is which.
In later variations on the thought experiment, researchers have put chatbots to the test directly, having the evaluator converse with the AI themselves. In one such recent experiment – conducted following the explosion of large language model (LLM) chatbots such as ChatGPT and Google Bard – over a million humans took part.
In a preprint paper that has yet to be peer-reviewed, the team detailed how they built an online game called Human or Not, a kind of conversational roulette: volunteers' only task was to determine whether they were talking to an AI or a fellow human, and the game would randomly assign them either a fellow player or a bot.
The team wrote backstories for the bots, which were powered by several different chat AIs.
"We created a diverse array of bots, each with its unique personality and objective," the team wrote in their study. "We were motivated by the desire to keep the conversations interesting and less repetitive for recurring users, and to obscure the tell-tale signs of bots, making detection more challenging."
One character bio included in the paper – Adam – is instructed to try to convince other users that he is them from the future. While that may not sound like something a human would do, it is the sort of thing a human pretending to be an AI might do, adding to the game's difficulty and fun.
In the first month, the game collected over 10 million guesses from 1.5 million unique users, making it a decent dataset to analyze.
"From the vast pool of interactions, we identified several types of human players that excelled in different aspects of the game: players who were proficient at recognizing fellow humans, players who convincingly signaled their own humanity, and players who masterfully impersonated bots."
Humans in the experiment paid close attention to typos and slang, believing these generally wouldn't come from a machine – but the AI chatbots simulated many of them too. When humans tried to convince their partners that they were human, they would often use more slang and more typos, and answer personal and emotional questions.
"A more distinctive approach involved using foul language, expressing controversial views, and posing messages that AI bots tend to avoid answering," they wrote, adding "the use of rude language revealed a player’s humanity 86.7 percent of the time".
Overall, humans guessed their partner's identity correctly in only 68 percent of the games. When the partner was a chatbot, they guessed correctly 60 percent of the time, while they correctly identified a human partner 73 percent of the time.
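Those three figures also hint at how often players were paired with humans versus bots, something not stated here. As a rough back-of-the-envelope check (an illustration, not a result from the paper), the overall accuracy is a weighted average of the two per-condition rates, so one can solve for the implied share of human-versus-human games:

```python
# Illustrative sanity check of the reported accuracies.
# The human/bot pairing ratio below is inferred, not reported in the article.
acc_vs_human = 0.73   # correct-guess rate when the partner was human
acc_vs_bot = 0.60     # correct-guess rate when the partner was a chatbot
overall = 0.68        # reported overall accuracy

# overall = p * acc_vs_human + (1 - p) * acc_vs_bot, where p is the
# fraction of games played against a human partner; solve for p.
p_human = (overall - acc_vs_bot) / (acc_vs_human - acc_vs_bot)
print(f"Implied share of games against a human partner: {p_human:.2f}")
```

On these numbers, roughly 60 percent of games would have paired the guesser with a human for the three percentages to be mutually consistent.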
The team acknowledges the test has its limitations – not least that the nature of the game likely raised participants' suspicions and shaped their conversational strategies – but says it offers insight into the strategies humans use to work out whether they are talking to a fellow human or a machine.
The paper is available on the pre-print server arXiv.