Over the last month or so I've made some quite substantial improvements to the vision systems of the Rodney and GROK2 robots, using a new dense stereo correspondence algorithm called ELAS. They can now reliably see the structure of the environment around them under most conditions, probably with enough accuracy to enable a "functional vision" capability. Whilst the depth resolution may not be as good as that of active sensing methods, such as laser scanners or the Kinect, for purely passive vision this is quite a pleasing result.
Here is an example depth map using the ELAS algorithm, where disparities are colour coded. The different colours make it easier to see structure in the distance than is otherwise the case with a monochrome representation of depth.
Converting the disparities to ranges and then projecting them into 3D gives a reasonable point cloud model. Here a few stereo images taken from different pan and tilt angles are combined into a unified 3D model.
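As a rough illustration of the disparity-to-range step and of merging glimpses taken at different pan and tilt angles, here is a minimal Python sketch. It is not the code running on the robots: the function names, the axis and sign conventions, and the assumption that the camera rotates about its own optical centre are all mine, and the disparity maps are assumed to come from whatever stereo matcher you use (ELAS itself is not called here). It relies on the standard rectified-stereo relation, range = focal length x baseline / disparity.

```python
import math

def disparity_to_range(disparity_px, focal_px, baseline_mm):
    """Range along the optical axis in mm, or None for an invalid disparity."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_mm / disparity_px

def pixel_to_point(u, v, disparity_px, focal_px, baseline_mm, cx, cy):
    """Back-project pixel (u, v) into camera coordinates.

    Assumed convention: x to the right, y down, z forward, with (cx, cy)
    the principal point in pixels.
    """
    z = disparity_to_range(disparity_px, focal_px, baseline_mm)
    if z is None:
        return None
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)

def rotate_pan_tilt(point, pan_deg, tilt_deg):
    """Rotate a camera-frame point into a common frame.

    Assumption: tilt is applied about the x axis (positive looks up),
    then pan about the y axis, with no offset between the rotation
    axes and the camera's optical centre.
    """
    x, y, z = point
    t = math.radians(tilt_deg)
    p = math.radians(pan_deg)
    # Tilt about the x axis.
    y1 = y * math.cos(t) - z * math.sin(t)
    z1 = y * math.sin(t) + z * math.cos(t)
    # Pan about the y axis.
    x2 = x * math.cos(p) + z1 * math.sin(p)
    z2 = -x * math.sin(p) + z1 * math.cos(p)
    return (x2, y1, z2)

def merge_glimpses(glimpses, focal_px, baseline_mm, cx, cy):
    """Combine (pan_deg, tilt_deg, disparity_map) glimpses into one point list.

    Each disparity_map is a list of rows of disparities in pixels;
    zero or negative values are treated as "no match" and skipped.
    """
    cloud = []
    for pan_deg, tilt_deg, disparity_map in glimpses:
        for v, row in enumerate(disparity_map):
            for u, d in enumerate(row):
                pt = pixel_to_point(u, v, d, focal_px, baseline_mm, cx, cy)
                if pt is not None:
                    cloud.append(rotate_pan_tilt(pt, pan_deg, tilt_deg))
    return cloud
```

Because range is inversely proportional to disparity, a one pixel disparity error matters far more for distant points than for near ones, which is one reason the far features in a passive stereo point cloud look less well defined.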
In this model you can see the desk with the Rodney robot on the left, a cup, the keyboard and screen, and a small Surveyor robot in front of Rodney.
Whilst this isn't quite at the same quality as the CMU hallway runs produced by Moravec in 2000-2002, it's getting into the same ballpark. Some ray modelling, with the rays projected into an occupancy grid, would certainly help to resolve the longer distance features.
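To give a flavour of what projecting rays into an occupancy grid involves, here is a much simplified 2D sketch in Python. It is not Moravec's sensor model and not something the robots currently run: the function names, the dictionary-based grid and the evidence weights are all hypothetical. Each stereo range reading is treated as a ray from the camera cell to the cell where the surface was seen; cells the ray passes through collect "vacant" evidence and the end cell collects "occupied" evidence.

```python
def cells_on_ray(x0, y0, x1, y1):
    """Bresenham line: integer grid cells from (x0, y0) to (x1, y1), inclusive."""
    cells = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        cells.append((x, y))
        if (x, y) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
    return cells

def insert_ray(grid, camera_cell, hit_cell, vacant=-0.4, occupied=0.9):
    """Add evidence for one range reading to a sparse grid.

    grid maps (x, y) cells to accumulated evidence (a log-odds style
    score, 0.0 meaning unknown). The two weights are illustrative
    guesses, not calibrated values.
    """
    ray = cells_on_ray(camera_cell[0], camera_cell[1], hit_cell[0], hit_cell[1])
    # Everything between the camera and the surface was seen through.
    for cell in ray[:-1]:
        grid[cell] = grid.get(cell, 0.0) + vacant
    # The final cell is where the surface was observed.
    grid[ray[-1]] = grid.get(ray[-1], 0.0) + occupied
    return grid

# Example: two readings of the same surface reinforce each other.
grid = {}
insert_ray(grid, (0, 0), (3, 0))
insert_ray(grid, (0, 0), (3, 0))
```

The appeal for noisy long range stereo is that evidence accumulates over many observations: a distant surface seen from several glimpses gains occupied evidence in roughly the same cells, while scattered outliers are gradually cancelled by rays passing through them.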
My last attempt to do this, using an orange juice carton as a subject, was in 2004/5. Although the overall approach was the same, the results were not as good because the algorithms and camera calibration were cruder. It should also be noted that this progress has little to do with raw processing power, and much more to do with improvements in algorithms. The ELAS algorithm is pretty fast, especially on low resolution images, and would have been usable on a computer of five years ago had it been available back then.
Friday, December 10, 2010
4 comments:
Good stuff as usual, I always find it amazing that you seem to be able to get results which are in the same ballpark as some of the big labs. Do you have any plans to play with Kinect?
Yes I'll probably try using a Kinect in the new year. I expect from the videos I've seen that it will have better depth resolution, with the distant features being more clearly defined.
There are also other possible projects, such as creating a system which can generate a 3D model of an object suitable for use in Second Life or for use with rapid prototypers.
Is this test data generated by the surveyor stereo kit? Or do you have a different camera setup for this?
The results are just amazing. Are there conditions where it falls apart?
This isn't using the Surveyor SVS, but I could probably use that too if necessary. The processing for dense stereo probably needs to be done on a PC, or something more powerful than the Blackfin DSPs, but the Surveyor SVS can broadcast its images to be processed elsewhere.
These results were obtained using the Minoru webcam (the depth map video) and a pair of Logitech Quickcam 9000s which constitute the forward stereo camera of the GROK2 robot. The point clouds were generated from six stereo glimpses in different pan and tilt orientations.
As always with stereo vision there are conditions under which this degrades. The most obvious is a checkerboard pattern, where the "picket fence" effect (a repeating pattern matching equally well at several offsets) can mean that its depth is miscalculated. Striped textures, rather than checkers, seem to be detected OK. Walls which appear to have no texture are still detected, but their range has a bigger standard deviation than that of more textured surfaces (they look more fuzzy from above).