Humans are handy to have around
Nov 11, 2021
Ugo Cupcic, our Chief Technical Architect, writes about our favourite topic… grasping!
At the Shadow Robot Company, I’m always looking for exciting ways to make grasping and manipulation easier for non-roboticists. Teaching from demonstration is one of the many ways to address this.
The idea is quite simple: you show your robot how to do something (in my case, how to grasp a given object). The robot then learns from that demonstration, refining what it has been shown until it “works” well.
For my purpose, the dream goal is to show one grasp that is good enough, by manually shaping the robot hand around the object, and then have the robot try it a few times on its own until that grasp is really stable. This is usually called One-Shot demonstration learning.
In this story, I’ll focus on a very simple approach to address this problem using Bayesian Optimisation.
Grasping considered as a mathematical function
Bayesian Optimisation is a very useful tool to have in your toolbox for optimising black box functions. There are multiple ways to consider the problem of grasping an object. One of them is to look at it as a function optimisation problem: given my inputs (for example, the target angle for each of my joints), how can I get the highest output, meaning the most stable grasp?
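To make the "grasp as a function" idea concrete, here is a minimal sketch. The `grasp_quality` formula is a made-up stand-in: in practice the score would come from a simulated or real grasp, not from a closed-form expression, and the three-joint candidates are purely illustrative.

```python
import math

# Hypothetical stand-in for the real black box: in practice this value would
# come from running a grasp in simulation, not from a formula.
def grasp_quality(joint_angles):
    """Toy score in (0, 1]; highest when every joint is at 0.5 rad."""
    squared_error = sum((angle - 0.5) ** 2 for angle in joint_angles)
    return math.exp(-squared_error)

# Inputs are joint targets, the output is a single stability score.
candidates = [
    [0.0, 0.0, 0.0],
    [0.4, 0.5, 0.6],
    [0.5, 0.5, 0.5],
]

# Optimising the grasp means finding the input with the highest output.
best = max(candidates, key=grasp_quality)
```

The optimiser never looks inside `grasp_quality`; it only sees inputs and outputs, which is what makes it a black box.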
To put things in perspective, considering grasping as an optimisation problem is the same as trying to find the highest peak in a landscape. In grasping, I often look for a high enough peak that’s quite broad, rather than a high and very narrow one. If my peak is too narrow, I’m going to have a hard time being exactly in that configuration, given all the uncertainties that add up even with a state-of-the-art robot.
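A small sketch of why a broad peak can beat a taller, narrower one. Both peak shapes, the single-joint setup and the 0.05 rad positioning error are assumptions chosen for illustration only.

```python
import math

def narrow_peak(x):
    # Tall but thin: height 1.0, width about 0.01 rad (hypothetical values)
    return 1.0 * math.exp(-((x - 1.0) / 0.01) ** 2)

def broad_peak(x):
    # Lower but wide: height 0.8, width about 0.3 rad (hypothetical values)
    return 0.8 * math.exp(-((x - 2.0) / 0.3) ** 2)

def expected_quality(f, target, noise=0.05, n=101):
    """Average f over evenly spaced joint errors in [-noise, +noise]."""
    offsets = [-noise + 2 * noise * i / (n - 1) for i in range(n)]
    return sum(f(target + offset) for offset in offsets) / n

# With perfect positioning the narrow peak wins...
perfect = (narrow_peak(1.0), broad_peak(2.0))

# ...but once joint errors are averaged in, the broad peak is the better target.
realistic = (expected_quality(narrow_peak, 1.0), expected_quality(broad_peak, 2.0))
```

Here `perfect` favours the narrow peak while `realistic` favours the broad one, which is the trade-off described above.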
A brute force approach
In my grasping case, let’s remember that I want to find the highest (and broadest) peak, but that I have no idea what the landscape looks like. As you can see below, a very easy way to estimate this landscape is to sample everywhere in the space. This means grasping the object with every possible combination of joint targets and getting the associated grasp robustness. Even though it would work, it is far from efficient: the number of combinations grows exponentially with the number of joints.
In my current problem, I can also make use of the grasp demonstrated by the human. By using that grasp as a starting point, I can narrow my search space. I will only be looking in that neighbourhood, as depicted below. If I’m still considering a brute force approach, this means that I need far fewer samples to guess the shape of my landscape.
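To get a feel for the cost of sampling everywhere, here is a rough sketch that enumerates a regular grid of joint targets. The joint range, the 10 steps per joint and the 20-joint figure are illustrative assumptions, not measurements from the real hand.

```python
import itertools

def grid(low, high, steps):
    """Evenly spaced joint targets between low and high (inclusive)."""
    return [low + (high - low) * i / (steps - 1) for i in range(steps)]

def brute_force_samples(n_joints, steps, low=0.0, high=1.5):
    """Lazily yield every combination of joint targets on the grid."""
    return itertools.product(grid(low, high, steps), repeat=n_joints)

def sample_count(n_joints, steps):
    """Number of grasps a full grid search would have to try."""
    return steps ** n_joints

# Three joints with five steps each is easy to enumerate...
small = list(brute_force_samples(n_joints=3, steps=5))

# ...but a hypothetical 20-joint hand with ten steps per joint is hopeless.
huge = sample_count(n_joints=20, steps=10)
```

Each extra joint multiplies the number of grasps to try by the number of steps, which is why the full grid is only workable for toy problems.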
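Narrowing the search around the demonstrated grasp can be sketched as a box of joint bounds centred on the demonstration. The function name `search_bounds`, the 0.2 rad radius and the joint limits are hypothetical choices for this example.

```python
def search_bounds(demo, joint_limits, radius=0.2):
    """Per-joint (low, high) search interval around the demonstrated grasp.

    demo: demonstrated joint angles, in radians
    joint_limits: (low, high) mechanical limits for each joint
    radius: how far from the demonstration to search (an assumed value)
    """
    bounds = []
    for angle, (low, high) in zip(demo, joint_limits):
        # Stay close to the demonstration, but never leave the joint's range.
        bounds.append((max(low, angle - radius), min(high, angle + radius)))
    return bounds

demo_grasp = [0.1, 1.0, 1.5]
limits = [(0.0, 1.57)] * 3
bounds = search_bounds(demo_grasp, limits)
```

Any sampler, brute force or otherwise, can then be restricted to these intervals instead of the full joint range.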
Sampling more intelligently
Even though the search region as depicted above looks quite small on my diagram, you have to keep in mind that I’m in a high-dimensional space: I have lots of joints. So exploring that region can take a lot of time. What I need is a better way to sample that space.
Bayesian Sampling is a well-studied method that aims to maximise the information gathered each time I take a sample. The main idea is to gather as much information as possible about our peaks while not spending too much time sampling areas where grasps are bad.
In this method, I can also tweak how much effort is spent on exploration vs exploitation. Focusing more on exploitation — as you can see on the diagram on the left — will gather more data around the peaks. So this leaves more uncertainties: I could miss some peaks, but the joint targets for my most stable grasp will be very precise. If I focus on exploration instead — the diagram on the right — I will gather more data all around my space, reducing my uncertainties everywhere. But the joint targets for my most stable grasp will be less precise.
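One common way to expose the exploration vs exploitation dial is an upper confidence bound acquisition function, sketched below. This is an assumption for illustration: the post does not say which acquisition function was used, and the candidate means and uncertainties are made-up numbers.

```python
def upper_confidence_bound(mean, std, kappa):
    """Score a candidate: predicted quality plus a bonus for uncertainty."""
    return mean + kappa * std

def pick_next(candidates, kappa):
    """candidates: list of (name, predicted_mean, predicted_std) tuples."""
    best = max(candidates, key=lambda c: upper_confidence_bound(c[1], c[2], kappa))
    return best[0]

# Hypothetical predictions from the model of the landscape.
candidates = [
    ("near known peak", 0.80, 0.05),    # good and well understood
    ("unexplored region", 0.40, 0.30),  # mediocre guess, very uncertain
]

exploit_choice = pick_next(candidates, kappa=0.5)  # small kappa: exploitation
explore_choice = pick_next(candidates, kappa=3.0)  # large kappa: exploration
```

A small `kappa` keeps sampling around the known peak, while a large `kappa` sends the next sample to the region the model knows least about.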
Refining the grasp in simulation
Running the optimisation in simulation is very similar to the process I used to gather my dataset for predicting grasp quality. In the simulation, I grasp my object with the joint targets that the Bayesian process gives me, then lift the object, and record an objective grasp quality. This grasp quality is derived from the variation of the distance between the palm and the object: the smaller the variation, the better the grasp. It’s a very simple idea: if I move the hand and the grasp is stable, the object moves with it!
For each iteration of the Bayesian Optimisation, I run this process with the new joint targets to sample from. The output of my black box function is the grasp quality measured from shaking the object. This gives me a new value for my black box function, reducing the uncertainty in that given region!
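The grasp quality idea can be sketched from recorded palm and object positions. The function name `quality_from_trajectory`, the use of the distance range as the "variation" and the mapping to a score between 0 and 1 are all assumptions for this example, not the metric used in the actual code.

```python
import math

def quality_from_trajectory(palm_positions, object_positions):
    """Score a grasp from 3D positions recorded while the hand moves.

    A stable grasp keeps the palm-object distance constant, so a small
    variation of that distance maps to a score close to 1.
    """
    distances = [math.dist(p, o) for p, o in zip(palm_positions, object_positions)]
    variation = max(distances) - min(distances)
    return 1.0 / (1.0 + variation)  # hypothetical mapping to (0, 1]

# The palm lifts by 10 cm per step.
palm = [(0.0, 0.0, 0.1 * i) for i in range(5)]

# Stable grasp: the object follows the palm, 5 cm below it.
held = [(0.0, 0.0, 0.1 * i - 0.05) for i in range(5)]

# Failed grasp: the object stays where it was.
dropped = [(0.0, 0.0, -0.05)] * 5

stable_score = quality_from_trajectory(palm, held)
failed_score = quality_from_trajectory(palm, dropped)
```

The held object scores close to 1, while the dropped one scores lower because the palm moves away from it.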
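The iteration loop can be sketched end to end on a single joint. Everything here is a simplified assumption: a tiny hand-written Gaussian process with a squared-exponential kernel stands in for the model, an upper confidence bound picks the next sample, and `simulated_grasp_quality` is a made-up function replacing the simulated grasp. The real code linked below may be structured quite differently.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel: nearby joint angles give similar quality."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = 1.0 - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))

def simulated_grasp_quality(angle):
    # Hypothetical stand-in for one simulated grasp trial; best at 0.9 rad.
    return math.exp(-((angle - 0.9) / 0.15) ** 2)

def optimise(demo_angle, radius=0.3, iterations=15, kappa=2.0, steps=61):
    """Refine one joint angle, searching only around the demonstration."""
    grid = [demo_angle - radius + 2 * radius * i / (steps - 1) for i in range(steps)]
    tried = [steps // 2]  # start from the demonstrated grasp (grid centre)
    ys = [simulated_grasp_quality(grid[tried[0]])]
    for _ in range(iterations):
        xs = [grid[i] for i in tried]

        def ucb(i):
            mean, std = gp_predict(xs, ys, grid[i])
            return mean + kappa * std

        untried = [i for i in range(steps) if i not in tried]
        nxt = max(untried, key=ucb)                     # most promising sample
        tried.append(nxt)
        ys.append(simulated_grasp_quality(grid[nxt]))   # run the "simulation"
    best = max(range(len(ys)), key=lambda j: ys[j])
    return grid[tried[best]], ys[best]

best_angle, best_quality = optimise(demo_angle=0.7)
```

Each pass through the loop adds one measured grasp to the model, so the uncertainty shrinks around the sampled angles and the best grasp found ends up better than the demonstrated one at 0.7 rad.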
At the end of the process we have a grasp that is more robust than the initial one. My colleagues at Shadow even tested the grasp on the real robot and it was very stable, as you can see below.
If you want to take a look at the code used for this, you can find it on GitHub.
Final words
This process works very well if I can easily import the object into the simulation. There is still a lot of work to be done at the simulation level to have a simulation that’s closer to the real robot. This is a big challenge, especially when you tackle grasping and manipulation problems.
The optimisation could also be run on the real robot. For this, I’d need to be able to replicate my objective grasp quality on the real robot, which isn’t a trivial task. Once the grasp prediction algorithm works well though…
Using human intelligence to help robots get better at what they do is a great way to move things forward. Most of this work was done by Pavan, who did his internship at Shadow. Thanks Pavan!