A team of researchers from China’s Zhejiang University (where the Jueying’s hardware was also developed) and the University of Edinburgh didn’t so much teach the Jueying how to recover after an assault as let the robot figure it out. It’s a dramatic departure from how a hardware developer like Boston Dynamics goes about teaching a robot how to move, using decades of human experience to hard-code, line by line, the way a robot is supposed to react to stimuli like, um, a person’s foot.
But there’s got to be a better way. Imagine, if you will, a soccer team. Midfielders, strikers, and a goalkeeper all do generally soccer-esque things like running and kicking, but each position has its own specialized skills that make it unique. The goalkeeper, for instance, is the only person on the field who can grab the ball with their hands without getting yelled at.
In traditional methods of training robots, you’d have to meticulously code all of those specialized behaviors. For instance, how should the actuators (the motors that move a robot’s limbs) coordinate to make the machine run like a midfielder? “The reality is that if you want to send a robot into the wild to do a wide range of different tasks and missions, you need different skills, right?” says University of Edinburgh roboticist Zhibin Li, corresponding author on a recent paper in the journal Science Robotics describing the system.
Li and his colleagues started by training the software that would guide a virtual version of the robot dog. They developed a learning architecture with eight algorithmic “experts” that would help the dog produce complex behaviors. For each of these, a deep neural network was used to train the computer model of the robot to achieve a particular skill, like trotting or righting itself if it fell on its back. If the virtual robot tried something that got it closer to the goal, it got a digital reward. If it did something non-ideal, it got a digital demerit. This is known as reinforcement learning. After many such guided rounds of trial and error, the simulated robot would become an expert in a skill.
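To make the reward-and-demerit idea concrete, here is a minimal sketch in Python. It is not the researchers' system: they trained deep neural networks on a simulated robot, while this toy swaps in a simple lookup table (tabular Q-learning) and a made-up task in which an agent on a six-position track learns to walk toward a goal. The track, the reward values, and every name in the code are assumptions for illustration only.

```python
import random

# Hypothetical toy task, not the Jueying simulation: positions 0..5 on a track,
# with the goal at position 5.
N_STATES = 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Move one step (clipped to the track) and score the attempt."""
    next_state = min(max(state + action, 0), GOAL)
    # Digital reward for getting closer to the goal, digital demerit otherwise.
    reward = 1.0 if next_state > state else -1.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn the value of each action by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state = rng.randrange(GOAL)  # start each attempt from a random position
        for _ in range(50):  # cap the length of a single attempt
            if rng.random() < epsilon:
                a = rng.randrange(2)  # occasionally try something random
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # otherwise act greedily
            next_state, reward = step(state, ACTIONS[a])
            # Nudge the estimate toward the reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if state == GOAL:
                break
    return q


def greedy_policy(q):
    """Best learned action for every non-goal position."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:GOAL]]
```

Calling `greedy_policy(train())` returns the learned action for each position. After a few hundred attempts the table favors stepping toward the goal everywhere, the toy equivalent of becoming an expert in one skill.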
Compare this to the traditional line-by-line way of coding a robot to do something as seemingly simple as climbing stairs—this actuator turns this much, this other actuator turns this much. “The AI approach is very different in the sense that it captures experience, which the robot has tried hundreds of thousands of times, or even millions of times,” says Li. “So in the simulated environment, I can create all possible scenarios. I can create different environments or different configurations. For example, the robot can start in a different pose, such as lying down on the ground, standing, falling over, and so on.”