While many of us might use YouTube to get our daily fix of adorable or hilarious cat videos, the site can also be a very useful learning platform. It has thousands of educational videos that can teach us an amazing variety of things, such as how to play the guitar, or facts about the world and universe we live in. But it’s not just people who can learn from YouTube; robots now can, too.
In a new study, a team of scientists from the University of Maryland and the Australian research center NICTA successfully taught a robot how to use tools by showing it cooking videos on YouTube. The result is an important step towards the development of futuristic, self-learning helper robots. The published work will be presented soon at the Association for the Advancement of Artificial Intelligence’s 29th annual conference.
The ability to learn actions from human demonstrations is critical if we want to develop service robots that can teach themselves new skills, but it’s been a major hurdle for scientists working on artificial intelligence. In particular, teaching robots to manipulate objects has been very tricky, since many actions can be performed in a variety of different ways. Cooking, for example, requires a huge range of manipulation actions that future service robots are likely to need, which is why the team chose this skill for their study.
To teach their robot, the researchers used a method of artificial intelligence training known as “deep learning,” in which layered networks of artificial neurons learn to recognize patterns in raw inputs such as audio and image data. Key to this technique was a type of network called a convolutional neural network (CNN), which not only served as a sophisticated image recognition system, but also allowed the robot to break down the actions presented.
The researchers’ system used a pair of CNNs that performed different roles. One observed the cook in the YouTube video and identified various actions, such as a particular grasp used on an object, while the other broke down that action in order to work out how the object was being manipulated. The latter was also capable of predicting the action most likely to be performed next with the object.
After training on data from 88 different YouTube cooking videos, which are particularly challenging because the scenery and demonstrators vary so widely, the robot was able to identify which type of grasp was used and which object was being grasped. It then selected the most appropriate manipulator, such as a vacuum gripper, from a small repertoire to replicate the grasp.
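To make that last step concrete, here is a minimal Python sketch of how recognition scores might be turned into a choice of manipulator. It is not the researchers’ code. The label lists, the mapping from grasp to manipulator, the `min_confidence` threshold and the function names are all hypothetical, invented for illustration; only the vacuum gripper is mentioned in the study coverage above. The sketch assumes each recognizer outputs one raw score per label.

```python
from math import exp

# Hypothetical label sets; the study's actual categories may differ.
GRASP_LABELS = ["power", "precision", "flat"]
OBJECT_LABELS = ["knife", "bowl", "tomato"]

# Hypothetical repertoire: which manipulator replicates which grasp.
MANIPULATOR_FOR_GRASP = {
    "power": "parallel gripper",
    "precision": "fingertip gripper",
    "flat": "vacuum gripper",
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_likely(labels, scores):
    """Return the highest-probability label and its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


def select_manipulator(grasp_scores, object_scores, min_confidence=0.5):
    """Pick a manipulator for the recognized grasp, or None if unsure."""
    grasp, grasp_prob = most_likely(GRASP_LABELS, grasp_scores)
    obj, _ = most_likely(OBJECT_LABELS, object_scores)
    if grasp_prob < min_confidence:
        return None  # too uncertain to commit to a grasp
    return {
        "grasp": grasp,
        "object": obj,
        "manipulator": MANIPULATOR_FOR_GRASP[grasp],
    }


if __name__ == "__main__":
    # Made-up scores standing in for the output of the two recognizers.
    print(select_manipulator([0.1, 0.2, 3.0], [2.5, 0.1, 0.3]))
```

Returning `None` when the grasp is uncertain is a design choice of this sketch: a robot holding a knife is arguably better off declining to act than guessing.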
“We believe this preliminary integrated system raises hope towards a fully intelligent robot for manipulation tasks that can automatically enrich its own knowledge resource by ‘watching’ recordings from the World Wide Web,” the researchers concluded.