I had a lot of fun “teaching” a Sony Aibo to find and pick up “trash.” This project uses some rudimentary computer vision, simple tracking, audio localization, and neural network classification. Plus, it’s cute!
This was the final project for a grad class at Georgia Tech on robotics. The course was a bit of a survey, and the project reflects that: it covers a little bit of everything, from vision (my specialty), to tracking (my other specialty… now), to locomotion, to neural networks. It was a ton of fun to do… enjoy!
Below is a description of the project along with our solutions to several problems… Finally, at the end you can find our project code.
I’d like to send a shout out to my two esteemed colleagues on this project: Brad Schwagler and Brian Stefanovic.
Now, before we start, I’d like to whet your appetites with a little video demo:
This robot is really just a cute project to play around with the Aibo. However, we made up a whole “back story,” so here it is: there are disorganized objects in the world such as litter, household clutter, etc.
It sure would be nice if there were a robot that could find, classify, acquire, and put away that stuff! Let’s make the Aibo do that! So, we used green geometric shapes to represent “trash.” The dog finds, classifies, and picks up the trash, then brings it to a base station.
Locomotion & Implementation
We used URBI & liburbi for Matlab for all the programming. I’m a big fan of Matlab, of course, and the URBI package made interfacing to the Aibo a breeze. URBI also has a bunch of built-in functions for locomotion. Also, you can use the Webots simulator to play with the Aibo virtually… This is great if A) you don’t have an Aibo, or B) you don’t want to deal with charging its batteries!
We used rangefinders for basic collision avoidance and for localizing on the object when we were ready to pick it up. To do this, we recorded the distance the robot sees when looking “at infinity” as its head sweeps around. Then, we compare that with the values we get in situ to make sure there’s an object there.
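The baseline-comparison idea is simple enough to sketch in a few lines. This is an illustration, not our actual code; the units and the 50 mm margin are made up for the example:

```python
import numpy as np

def object_present(baseline, sweep, margin=50.0):
    """Compare a head-sweep of range readings against the recorded
    "at infinity" baseline; readings much shorter than the baseline
    mean something is actually in front of the robot."""
    baseline = np.asarray(baseline, dtype=float)
    sweep = np.asarray(sweep, dtype=float)
    return sweep < (baseline - margin)

# Baseline sweep with nothing ahead vs. a sweep with an object centered.
baseline = [900, 880, 870, 880, 900]
sweep    = [900, 880, 200, 880, 900]
print(object_present(baseline, sweep))  # [False False  True False False]
```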
We also use some cooler sensors like stereo microphones, and vision.
We wanted the robot to be able to return to a “home base” with its collected “trash.” To do this, we had the base broadcast a tone (we wanted it to be ultrasonic, but the mics weren’t good enough, so we used 2 kHz).
Then, based on the phase difference in the two microphones, we determined the angle to the source. This is a technique known as Phase Interferometry. Below are some phase plots from the left and right ears… As you can see, there’s not a huge amount of difference to work with.
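The geometry behind phase interferometry fits in a few lines: the phase difference gives a time delay, the delay gives a path-length difference, and the arcsine of that (over the mic spacing) gives the bearing. The microphone spacing below is an illustrative value, not the Aibo’s actual geometry:

```python
import numpy as np

def bearing_from_phase(phase_diff, f=2000.0, d=0.05, c=343.0):
    """Estimate the bearing (radians) to a tone source from the
    left-right phase difference between two microphones.

    phase_diff : measured phase difference in radians
    f          : tone frequency (Hz); the base station broadcast 2 kHz
    d          : microphone spacing in meters (illustrative)
    c          : speed of sound (m/s)
    """
    # phase difference -> time delay -> path difference -> bearing
    sin_theta = c * phase_diff / (2.0 * np.pi * f * d)
    return np.arcsin(np.clip(sin_theta, -1.0, 1.0))

# A source dead ahead arrives at both mics at once: zero phase
# difference, zero bearing.
print(np.degrees(bearing_from_phase(0.0)))  # 0.0
```

Note that with a 2 kHz tone the half-wavelength (about 8.6 cm) exceeds this mic spacing, so the phase measurement is unambiguous.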
Since we cheated a little and made all of our objects green, segmentation was simplified somewhat. Here’s a sample image from Aibo’s camera:
We mapped our image to a chromatic color space known as YBR. This has three components like the familiar RGB, but Y is intensity (the grayscale version of the image), and b and r are the fractions of blue and red, respectively, that make up that intensity. Here are the three channels we get:
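The mapping itself might look like the sketch below. The exact normalization is an assumption on my part (one reasonable reading of “fraction of the intensity”), not necessarily the formula we used:

```python
import numpy as np

def rgb_to_ybr(img):
    """Map an H x W x 3 RGB image into a chromatic space: y is
    intensity, and b and r are blue's and red's share of it."""
    img = np.asarray(img, dtype=float)
    y = img.sum(axis=2)                 # intensity
    safe = np.where(y == 0, 1.0, y)     # avoid division by zero
    r = img[..., 0] / safe              # red fraction
    b = img[..., 2] / safe              # blue fraction
    return y, b, r

# A pure green pixel: full intensity, but zero red and blue fractions,
# which is exactly what makes green objects pop out in b and r.
y, b, r = rgb_to_ybr([[[0.0, 1.0, 0.0]]])
print(y[0, 0], b[0, 0], r[0, 0])  # 1.0 0.0 0.0
```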
Notice how the box stands out a bit better in this space (it’s the only green thing). Now, based on some training data, we produced a probability distribution function for the objects, very much like these guys. Based on this probability, we can extract something like this:
Then, with a little morphology, we turn it into this beauty:
These segmentations are important for tracking the object (we extract its centroid) and for classification… You’ll see that one in a bit…
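The whole segment-then-clean pipeline can be sketched as below. This is my after-the-fact guess at the obvious implementation (a normalized 2D color histogram as the PDF, then a morphological opening), not our original training code:

```python
import numpy as np
from scipy import ndimage

def train_pdf(chroma, bins=32):
    """Build a crude color PDF: a normalized 2D histogram over (b, r)
    chromaticity pairs sampled from hand-labeled object pixels."""
    hist, b_edges, r_edges = np.histogram2d(
        chroma[:, 0], chroma[:, 1], bins=bins, range=[[0, 1], [0, 1]])
    return hist / hist.sum(), b_edges, r_edges

def segment(b, r, pdf, b_edges, r_edges, thresh=1e-3):
    """Threshold each pixel's probability of being object-colored, then
    clean the mask with a morphological opening to kill speckle."""
    bi = np.clip(np.digitize(b, b_edges) - 1, 0, pdf.shape[0] - 1)
    ri = np.clip(np.digitize(r, r_edges) - 1, 0, pdf.shape[1] - 1)
    mask = pdf[bi, ri] > thresh
    return ndimage.binary_opening(mask, structure=np.ones((3, 3)))
```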
So, with these segmentations the tracking is pretty simple. We use this handy little algorithm:
The “Find Object” step consists of taking an image, segmenting it, and finding how many degrees from center the object is. Then, if we lose the object, we look around for it. This ends up being pretty tedious because the Aibo locomotion package we were using wasn’t great.
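One iteration of that loop might look like this. The field of view and image width below are typical ERS-7 camera numbers used for illustration; the real constants depend on the camera mode:

```python
import numpy as np

def find_object(frame, segment_fn, fov_deg=56.9, width=208):
    """One "Find Object" step: segment the frame and return the
    object's horizontal offset from image center in degrees, or None
    if the object was lost (the caller then sweeps the head around)."""
    mask = segment_fn(frame)
    if not mask.any():
        return None                    # lost it -> search behavior
    cx = mask.nonzero()[1].mean()      # centroid column of the blob
    return (cx - width / 2.0) * (fov_deg / width)

# With a fake segmenter that puts the blob exactly at the center
# column, the offset is zero and the robot just walks forward.
centered = np.zeros((160, 208), dtype=bool)
centered[:, 104] = True
print(find_object(centered, lambda f: f))  # 0.0
```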
The geometric shapes that represented different kinds of “trash” were cubes, spheres, and tetrahedrons. We used a neural net for classifying the objects. We tried a bunch of features before finding one that worked well. The trick finally ended up being border angles. We computed the angle from one point to the next along the border of the shapes. The results look a bit like this:
Notice the four levels for the cube, the sinusoid for the sphere, and the three levels for the tetrahedron. This ended up being a great way to do shape classification with neural networks!
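The feature extraction itself is just the point-to-point angle along the boundary. Here’s a small sketch of it; the hand-built square boundary is just a test fixture:

```python
import numpy as np

def border_angles(boundary):
    """Angle from each boundary point to the next, for an ordered
    (N x 2) array of contour points -- the feature described above."""
    d = np.diff(np.vstack([boundary, boundary[:1]]), axis=0)
    return np.arctan2(d[:, 1], d[:, 0])

# Trace a small axis-aligned square counterclockwise: the angle signal
# is a staircase with exactly four levels, one per side.
square = np.array(
    [(x, 0) for x in range(4)] +          # bottom edge
    [(3, y) for y in range(1, 4)] +       # right edge
    [(x, 3) for x in range(2, -1, -1)] +  # top edge
    [(0, y) for y in range(2, 0, -1)],    # left edge
    dtype=float)
angles = border_angles(square)
print(len(np.unique(np.round(angles, 6))))  # 4
```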
We used a backpropagation network with 70 inputs and two outputs (00 = cube, 01 = sphere, 10 = tetra). Our network had 150 nodes in the hidden layer. We did the whole implementation with the Matlab toolbox.
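Just to show the shape of that network and the output encoding, here’s a sketch in numpy with random (untrained) weights; the real thing was trained with Matlab’s toolbox, and the tanh/sigmoid choices here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_net(n_in=70, n_hidden=150, n_out=2):
    """Random weights for the 70-150-2 architecture described above."""
    return (0.1 * rng.standard_normal((n_in, n_hidden)),
            0.1 * rng.standard_normal((n_hidden, n_out)))

def forward(x, w1, w2):
    h = np.tanh(x @ w1)                      # hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2)))   # sigmoid outputs

def decode(out):
    """Map the two thresholded outputs back to a shape label
    (00 = cube, 01 = sphere, 10 = tetra)."""
    bits = tuple(int(o > 0.5) for o in out)
    return {(0, 0): "cube", (0, 1): "sphere",
            (1, 0): "tetra"}.get(bits, "unknown")

w1, w2 = init_net()
out = forward(rng.standard_normal(70), w1, w2)
print(out.shape, decode([0.1, 0.9]))  # (2,) sphere
```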
Also, to make this work, we needed a lot of examples, both positive and negative. We also added noise to our training set. In the end we were getting 95%+ success. (That was with a best-of-five-classifications scheme that took 5 different segmentations.)
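The best-of-five scheme is just a majority vote over five independently segmented views; the voting rule below is my guess at the obvious implementation:

```python
from collections import Counter

def best_of_five(classify, segmentations):
    """Classify five segmentations of the same object independently
    and return the majority label."""
    votes = [classify(seg) for seg in segmentations]
    return Counter(votes).most_common(1)[0][0]

# Three of five (noisy) classifications agree, so the vote says cube.
print(best_of_five(lambda s: s,
                   ["cube", "cube", "sphere", "cube", "tetra"]))  # cube
```

A single bad segmentation no longer flips the final answer, which is where most of the jump to 95%+ comes from.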
All in all, the project was a total success. We accomplished everything we set out to do and got an A! Also, we learned a lot and entertained our classmates with a live demo in class.
It would be fun to try to do this with fewer assumptions (more complex vision, more object classes, etc.). Also, the Aibo wasn’t really the best platform for this application. A wheeled robot would have been much more capable (but far less cute).
I was impressed with how well the neural network worked. It was really amazing the results we were getting by the end.
As usual, here are some nice files to help you out… This code is *terribly* commented, and probably not readable. However, here it is if you’re looking for guidance:
project.zip (374 Kb)