Scientists have succeeded in creating an artificial neural network that functionally replicates object recognition as it occurs in the brains of primates. The work could inform therapies for visual impairments and help improve artificial intelligence. James DiCarlo of MIT was senior author of the paper, which appeared in PLOS Computational Biology.
This announcement is particularly exciting because scientists have been trying to build a neural network with this capability for about 40 years. The success is a testament to how well neuroscientists now understand object recognition in the brain, an understanding that could be useful in exploring other neural processes such as language and speech recognition.
“The fact that the models predict the neural responses and the distances of objects in neural population space shows that these models encapsulate our current best understanding as to what is going on in this previously mysterious portion of the brain,” DiCarlo said in a press release.
In primates, visual information travels along the optic nerve to the visual cortex and then on to the inferotemporal cortex. Beyond this point, processing depends on the type of stimulus. The researchers replicated this pathway in the neural network by stacking layers of simple computations, each of which performs a basic analysis of the image and passes its result to the next, until the object is identified. To identify objects more efficiently, information about the object's location and movement is discarded. This simplifies the process, which matters most when dealing with complex objects.
“Each individual element is typically a very simple mathematical expression,” added the paper's lead author, Charles Cadieu. “But when you combine thousands and millions of these things together, you get very complicated transformations from the raw signals into representations that are very good for object recognition.”
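To make the idea of simple elements stacked in layers concrete, here is a minimal Python sketch. It is an illustration only and not the model from the paper: the function names, the hand-picked "edge" weights, and the one-dimensional toy signals are all assumptions made for this example. Each unit is just a weighted sum with a cutoff at zero, and a final pooling step keeps the strongest response while throwing away where it occurred, which mirrors the way location information is discarded.

```python
def unit(inputs, weights, bias=0.0):
    """One simple element: a weighted sum followed by a cutoff at zero."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, total)


def filter_layer(signal, weights):
    """Apply the same unit at every position along a 1-D signal."""
    width = len(weights)
    return [unit(signal[i:i + width], weights)
            for i in range(len(signal) - width + 1)]


def pool(responses):
    """Keep only the strongest response, discarding where it occurred."""
    return max(responses)


# Hypothetical hand-picked weights: the unit responds where the signal steps up.
EDGE = [-1.0, 1.0]

# The same made-up "object" placed at two different positions.
left = [0, 1, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 1, 1, 0]

left_score = pool(filter_layer(left, EDGE))
right_score = pool(filter_layer(right, EDGE))

# Both positions produce the same pooled score, so position no longer matters.
print(left_score == right_score)
```

A real network of this kind stacks many such layers and learns its weights from data rather than having them chosen by hand.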
Advances in computing power helped the scientists reach this goal. Rather than relying solely on the computer's central processing unit (CPU), they were able to use the graphics processing unit (GPU), which can handle many more tasks at once.
Just as an animal's brain must be trained to identify certain objects, so must this neural network. The system could not accurately identify every test object at first, but it learned to do so through targeted training in which it was told whether each identification was correct. Over time, the layers of calculations came to retain the patterns associated with the objects, leading to more correct answers. While the team can get the computer to identify these objects, they cannot yet describe exactly how it does so.
“That’s a pro and a con,” Cadieu said. “It’s very good in that we don’t have to really know what the things are that distinguish those objects. But the big con is that it’s very hard to inspect those networks, to look inside and see what they really did. Now that people can see that these things are working well, they’ll work more to understand what’s happening inside of them.”
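The feedback-driven training described above, in which the system guesses, is told whether it was right, and adjusts, can be sketched in a few lines of Python. This sketch uses the classic perceptron update rule as a stand-in and is not the training procedure used in the study. The two-feature toy examples, the learning rate, and all function names are assumptions chosen for illustration.

```python
def predict(weights, bias, features):
    """Answer 1 ("is the object") if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train(examples, epochs=20, rate=0.1):
    """Perceptron rule: adjust the weights only when a guess is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    mistakes_per_epoch = []
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            # Feedback signal: 0 if the guess was right, +1 or -1 if wrong.
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x
                           for w, x in zip(weights, features)]
                bias += rate * error
        mistakes_per_epoch.append(mistakes)
    return weights, bias, mistakes_per_epoch


# Hypothetical toy data: two made-up features per example, label 1 or 0.
EXAMPLES = [
    ([1.0, 1.0], 1),
    ([0.9, 0.8], 1),
    ([0.1, 0.2], 0),
    ([0.0, 0.1], 0),
]

weights, bias, mistakes_per_epoch = train(EXAMPLES)
print(mistakes_per_epoch)  # mistakes made in each pass over the examples
```

On this toy data the mistake count falls to zero after the first pass. The learned weights are just numbers, though, with no built-in explanation of what they stand for, which is a small-scale version of the inspection problem Cadieu describes.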
Moving forward, the team would like to better understand what allows these neural networks to recognize objects the way they do. Another goal is to combine object recognition with tracking the object's location and movement.