The field of robotics has just taken a step forward with the development of an algorithm that will give artificial intelligence an increased ability for object recognition. This will help robots navigate their surroundings and become better equipped to help out around the house. Lawson Wong of MIT is lead author of the paper, which will appear in an upcoming issue of the International Journal of Robotics Research.
When robots are becoming familiar with objects, they view them from many different perspectives so that they recognize a coffee mug as a coffee mug, whether the handle is pointed to the left or right. The robot then needs to scan its database to find the identity of the object. Unfortunately, once the artificial intelligence system has learned to recognize a large number of items, searching the database and making a correct identification takes a long time.
Wong’s team has developed an algorithm that aggregates the different viewpoints, resulting in object identification that is up to ten times faster and makes fewer mistakes than previous versions that take only a single perspective into account. This allows the robot to operate more seamlessly, making decisions and taking actions in real time.
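To make the idea of aggregating viewpoints concrete, here is a minimal Python sketch. It is not the method from Wong's paper: it assumes each viewpoint yields an independent probability for every candidate label and combines them by summing log-probabilities. The function name `fuse_views`, the labels, and the numbers are all hypothetical.

```python
import math

def fuse_views(per_view_scores):
    """Combine per-view label probabilities into one estimate.

    Assumption (not from the article): views are treated as independent,
    so their probabilities multiply, which is done here in log space.
    """
    labels = list(per_view_scores[0].keys())
    log_totals = {label: 0.0 for label in labels}
    for scores in per_view_scores:
        for label in labels:
            # Floor tiny values so a single bad view cannot produce log(0).
            log_totals[label] += math.log(max(scores[label], 1e-9))
    # Shift by the largest value before exponentiating, then normalize.
    peak = max(log_totals.values())
    unnormalized = {l: math.exp(v - peak) for l, v in log_totals.items()}
    total = sum(unnormalized.values())
    return {l: p / total for l, p in unnormalized.items()}

# Hypothetical detector outputs for one object seen from three viewpoints.
views = [
    {"mug": 0.4, "bowl": 0.6},  # handle hidden, so this view is misleading
    {"mug": 0.8, "bowl": 0.2},
    {"mug": 0.7, "bowl": 0.3},
]
fused = fuse_views(views)
best = max(fused, key=fused.get)
```

In this toy case the first view alone would favor "bowl", but the combined estimate favors "mug", which is the kind of single-view error that moving to other viewpoints is meant to correct.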
“If you just took the output of looking at it from one viewpoint, there’s a lot of stuff that might be missing, or it might be the angle of illumination or something blocking the object that causes a systematic error in the detector,” Wong said in a press release. “One way around that is just to move around and go to a different viewpoint.”
This new algorithm makes it easier for the robot to make the correct choice when it needs to identify a particular object in a crowded setting, such as choosing the correct glass when opening a full cabinet. Traditionally, the AI system would have to work through sequential steps, scanning the images in its memory and picking out the one it believes is most likely to be correct. When there are multiple perspectives of each object, the identification process becomes extremely convoluted.
To help the computer make better sense of these multiple images, Wong’s team first tried a tracking system that would allow the AI to understand when it was looking at two images of the same object. However, the computer still had to sift through the images and determine which ones corresponded to a single object, which takes time and delays the robot’s decision.
To allow the computer to make up its mind more quickly, the algorithm allows hypotheses about object identity to overlap, randomly sampling images in order to make the best possible guess in a fraction of the time it would normally take. Even when the computer makes a mistake and needs to re-analyze some images, the time needed to make these corrections is still small compared to the previous method.
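The article does not describe the sampling procedure in detail, so the following Python sketch only illustrates the general idea of randomly sampling hypotheses about which detections belong to the same object, rather than checking every combination. It is an assumption-laden toy, not the authors' algorithm: detections are reduced to 2-D positions, the number of objects `k` is given in advance, and the names `cost` and `sample_association` are hypothetical.

```python
import random

def cost(points, assignment, k):
    """Sum of squared distances from each detection to its group's centroid."""
    total = 0.0
    for group in range(k):
        members = [p for p, a in zip(points, assignment) if a == group]
        if not members:
            continue
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        total += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
    return total

def sample_association(points, k, restarts=20, steps=300, seed=0):
    """Randomly sample groupings of detections and keep the cheapest one."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(restarts):
        # Start from a random hypothesis about which detection is which object.
        assignment = [rng.randrange(k) for _ in points]
        current = cost(points, assignment, k)
        for _ in range(steps):
            # Propose moving one randomly chosen detection to a random group.
            i = rng.randrange(len(points))
            old = assignment[i]
            assignment[i] = rng.randrange(k)
            proposed = cost(points, assignment, k)
            if proposed <= current:
                current = proposed   # keep the change
            else:
                assignment[i] = old  # undo a bad guess cheaply
        if current < best_cost:
            best, best_cost = list(assignment), current
    return best, best_cost

# Hypothetical detections: three views of one object, three of another.
detections = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
groups, final_cost = sample_association(detections, k=2)
```

Undoing a bad proposal costs one re-evaluation, which loosely mirrors the point above that correcting a mistake is cheap compared with exhaustively sorting every image.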