A recent paper published in the journal Nature Communications has drawn heavy criticism, with critics likening it to the pseudosciences of phrenology and physiognomy.
The study, titled "Tracking historical changes in trustworthiness using machine learning analyses of facial cues in paintings", attempted to assess how trustworthiness changed over the centuries, and whether it was linked to a rise in living standards and a decline in interpersonal violence.
The title alone throws up some red flags. The study's methods, devised by three social psychologists, have come under fierce criticism from scientists and AI researchers, as well as art historians. The researchers evaluated social "trustworthiness" through the ages by studying physical facial features in European portraits from 1500 to 2000. To do this, they used machine-learning artificial intelligence (AI) to quantify how "trustworthy" faces appeared in portraits from the UK's National Portrait Gallery and the Web Gallery of Art, and in selfies posted to Instagram, based on "facial action units (smile, eyebrows, etc)".
This is, of course, the part that people have likened to phrenology, the long-debunked pseudoscience that tried to link mental traits to bumps on the head, and physiognomy, the pseudoscience that attempted to read personality traits from physical characteristics, particularly the face.
The practice of attempting to discern character from physical characteristics has a long and dark history: it was used by white Europeans as a "scientific" (read: "bullshit") justification for their racist beliefs about ethnic minorities, and by misogynists to justify their belief that women aren't as smart or capable as men. It has been thoroughly debunked by evidence-based research and has been widely regarded as nonsense in scientific communities since the mid-19th century.
Of particular concern to online critics was that the paper appeared to link structures of the face, such as pronounced cheekbones and a wide chin, with perceived trustworthiness.
The algorithm was trained on avatars and tested for validity on databases of faces that had been rated by human participants. As such, what the AI deems correct is based on our own biases about perceived trustworthiness and physical facial features. As critics have pointed out, the study starts with a flawed assumption (that you can read trustworthiness in a face) and uses a suspect methodology that doesn't take into account how perceptions of trustworthiness changed across historical periods, let alone personal human bias, or the likelihood, if not inevitability, of feeding the AI data that reflects human bias on both race and gender.
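The point about human ratings can be illustrated with a toy example. What follows is a minimal Python sketch on entirely synthetic data, not the study's actual pipeline: the single `smile` feature, the rater-bias formula, and the `fit_line` helper are all assumptions made for illustration. The "actual" reliability flag is generated independently of the face, yet a model fitted to the ratings still learns that smiling means trustworthy.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical synthetic data: each "face" has one measurable feature
# (smile intensity in [0, 1]) and a flag for how the person actually
# behaves, generated independently of that feature.
faces = []
for _ in range(2000):
    smile = random.random()
    actually_reliable = random.random() < 0.5  # unrelated to the face
    # Assumed rater bias: human raters score smiling faces as more trustworthy.
    rating = 0.2 + 0.6 * smile + random.gauss(0, 0.05)
    faces.append((smile, actually_reliable, rating))


def fit_line(xs, ys):
    """Ordinary least squares on a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


smiles = [f[0] for f in faces]

# Model "trained" on human ratings: it picks up the raters' preference.
slope_ratings, _ = fit_line(smiles, [f[2] for f in faces])

# Same feature compared against actual behavior: no real relationship.
slope_truth, _ = fit_line(smiles, [1.0 if f[1] else 0.0 for f in faces])

print(f"slope learned from human ratings: {slope_ratings:.2f}")
print(f"slope against actual behavior:    {slope_truth:.2f}")
```

Under these assumptions, the slope fitted to the ratings lands near the bias built into the raters, while the slope against actual behavior stays close to zero. The model has learned what raters think, not what people do.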
What's more, people have lambasted the absence of historians and art historians from the research, and the lack of knowledge or context about how the paintings were intended to be viewed and how galleries curated them over time.
"Now we come to the part were the historian (and the data scientist) can get a heart attack. The authors have concluded that... trustworthiness increases with affluence!" one historian pointed out on Twitter.
"This they have done on the basis of datasets spanning almost 700 years, containing portraits mostly depicting the elite, commissioned by the elite and executed in the style reflecting the fashions of the elite. They could have known that if they, you know, asked a historian."
The research is far from the first machine learning study to run into accusations of phrenology. In June, a study claimed to be able to predict whether somebody is a criminal "with 80 percent accuracy" and "no racial bias" based on their face. As with this study, it was swiftly pointed out that the system will simply replicate the inherent racial biases of the data it's fed. The system would identify the face of someone who the police may profile, a jury may convict, and a judge may sentence. All of which is tainted by prejudice.