Thursday, January 22, 2009

Another year, another android


The march, or perhaps the slightly awkward head movements, of the androids continues in 2009 with Steve Grand's new web site Grandroids. We haven't heard very much from Steve in recent years, apart from the occasional Biota podcast appearance, but it seems that he has been busy building some sort of new fembot.

Whether there is really a market for entertaining exhibition androids remains to be seen. I suppose there might be if he can come up with something sufficiently impressive and interactive. Honda has been hiring out their ASIMO robot, for some no doubt phenomenal fee, for many years. One counter-argument against this business model is that as consumer robot toys become increasingly sophisticated and commonplace, the "wow" factor of having a humanoid robot on display at an exhibition stand may be somewhat diminished, raising the bar on what is considered to be novel and at the cutting edge of technology.

Building android or anthropomorphic robots is certainly fun, and I've done some amount of this myself in the past, but my approach now is quite different from this one and I'm not convinced that Jacques de Vaucanson-style mechanical curiosities are the best way to make progress on the robotics problem. Within the last ten years there have been several large humanoids, mostly of Japanese origin, yet none of these appears to have made the slightest progress towards being a practical or commercially viable machine. The showcase android may have its five minutes of fame on YouTube or elsewhere, but the selection pressures within this domain favour short-term, ad hoc, gimmicky features. These don't make incremental progress towards the larger goal of more intelligent machines, and they ignore or avoid what I would consider to be the key problems which need to be addressed.

I'd agree with Steve's observation that AI is always a harder problem than you think it is. I think even the most enthusiastic supporters amongst the AGI and singularist movements frequently underestimate, or dismiss as trivial or unimportant, aspects of robotics which have mostly evaded adequate solutions thus far.

"People sometimes worry that robots are going to take over the world, but frankly the greatest threat robots have posed to humanity so far is that they tend to be heavy and fall over a lot."

On this I used to agree entirely, having experienced first-hand just how stupid even the best robotic technology can be. The above statement would have been true five or more years ago, but as the previous blog post suggests, the situation is changing rapidly. It remains the case that I don't believe that AI or robotic systems themselves are intrinsically a threat to humanity on any short to medium term time scale. However, I think there is a growing danger of the use of telerobotics as a means of directing and amplifying human malevolence.

Based upon reading the Grandroids site I would guess that Steve is in the Microsoft robotics camp, with mentions of MSRS and Microsoft speech technologies. At the moment the robotics community still seems to be divided as far as choice of operating system and software development methodology goes. On the one hand you have some using Windows and proprietary systems including MSRS, and on the other there are those who mainly stick with Linux and open source.

6 comments:

Steve said...

Hey Bob,

Thanks for the post. I'd just like to correct a couple of the things you said:

Firstly, it's important to recognize that our Grandroids project is business, not research. Everything I do I try to make innovative in some way and I think we are going to have some interesting stuff going on under the hood here too, but research can only continue unfunded for so long and a long-overdue income stream is the primary motivation for this project.

Building a humanoid that's expected to speak and behave in human-like ways means that I have to swallow my pride on a lot of my AI principles. I agree that the "Vaucanson's Duck" approach is not the best way to make progress in AI. I've always said so. But biologically and neurologically inspired AI still has a long way to go and I can't spend any more years eating my life savings with no external support.

It's no different from a Shakespearean actor grudgingly accepting a part in a commercial - we all have to make a living. Having said that, I'm going to stick as close to my principles as I can, given the system requirements, and there are some interesting ideas emerging. Intellectually speaking I'm mostly looking at the project as a piece of cybernetic art, which has its own contribution to make.

Secondly, we aren't using Microsoft Robotics Studio. You may have been confused by the fact that Sara is an expert on MSRS and has written books and articles on it. But she has learned through hard-won and authoritative experience that MSRS is far more of a hindrance than a help under most circumstances. She confirmed my own misgivings - when Tandy first told me about it years ago I thought it was unlikely to achieve what it set out to do, no matter how well-intentioned, and that seems to be the case. It's too big a topic to discuss here, but frankly I wouldn't touch MSRS with a bargepole.

We are using MS speech technologies, but then we're both Windows programmers so that's not surprising.

I think there's a much bigger market for this sort of thing here in the US than you suggest, as long as the robots are genuinely interesting and not just a gimmick. But we don't expect it to be long-term. It's just a foot in the door to help us fund some more interesting projects. At least it'll be a darn sight cheaper to rent than Asimo!

Oh, and I've actually been busy writing a new artificial life simulation game, which is why you haven't seen much output from me, but I've taken time off from finishing it to build this robot.

Cheers,
Steve

Bob Mottram said...

So far I've also avoided using MSRS, but if it gains popularity and becomes more mainstream I might look further into it. My main issues with this system would be that in order to be able to do some of the stuff I want to do I would really need access to the low level source code, which as far as I know in MSRS is just not possible. It would be nice if robotics had progressed to the point where it became a high level programming problem, and the low level details could be subcontracted to some other company, but unfortunately that's not the case right now.

I look forward to the new game, and the androids. Last year I tried playing Spore, but was really not that impressed. There seems to be very little actual evolution or emergence going on within Spore, and some of the features from earlier demos which would have made the game more interesting seem to have been removed in the final release. Perhaps game designers are fearful of unpredictable emergent effects, which are the antithesis of the current paradigm which is typically a very scripted and controlled user experience.

Steve said...

I think you're right: *everyone* seems afraid of emergent effects, even though emergence is the one thing we're all trying to create!

In MSRS you can get pretty low-level in the sense that you can write services to interface to your own sensors, etc., including in the simulator.

Nevertheless, the main problem with the system, imho, is the huge amount of overhead - not just computational but in terms of programmer effort - needed to make even the simplest thing possible. You can spend days just developing the infrastructure to interface a bump sensor to a two-wheeled robot. And worst of all, every single request to read a sensor involves constructing a bunch of XML, wrapping it up in a SOAP header and sending it through HTTP. The result is then assembled by the sensor's service in the same fashion and the received XML is finally parsed to return a value. That involves millions of clock cycles just to read a simple port bit, which ordinarily would take a nanosecond!

Also it's not very stable and Sara had a lot of problems with the strong-name signing, which meant the whole project could suddenly fail to compile for no apparent reason, necessitating a lot of fiddling about to get all the DLLs recompiled in the right sequence.

All this huge amount of heartache, basically to give you a concurrent processing system. And I'm not convinced that most robots are all that concurrent in the first place. Usually there's some central point where sensory data are bound together in order to decide on a system-wide action. It's not like you can simply leave the bump sensors to tell the robot to stop because it has hit something, and no other part of the system needs to know. Polling is usually a better and more controlled way of handling things. I think the designers of MSRS (who were not, in the main, roboticists) made some assumptions about what needs to happen in a robot that only really apply to certain robotic paradigms and big team projects.
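
To illustrate the polling approach described above, here is a minimal Python sketch. It is not taken from Grandroids or MSRS; the sensor names, thresholds and the `Action` type are all hypothetical. The point is only that every sensor is read in one pass and a single central function then decides on a system-wide action.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Action:
    """Hypothetical wheel speeds for a two-wheeled robot (-1.0 to 1.0)."""
    left_speed: float
    right_speed: float


def decide(readings: Dict[str, float]) -> Action:
    """Central decision point: all sensor data are considered together."""
    if readings["bump"] > 0.5:
        # Bumper pressed: reverse, turning slightly.
        return Action(-0.5, -0.2)
    if readings["range_m"] < 0.3:
        # Obstacle close ahead: turn on the spot.
        return Action(0.3, -0.3)
    # Nothing in the way: drive straight ahead.
    return Action(1.0, 1.0)


def poll_once(sensors: Dict[str, Callable[[], float]]) -> Action:
    """Read every sensor once, then choose one system-wide action."""
    readings = {name: read() for name, read in sensors.items()}
    return decide(readings)


# Stand-in sensors (assumed values) so the sketch runs without hardware.
sensors = {"bump": lambda: 0.0, "range_m": lambda: 0.2}
action = poll_once(sensors)
```

In a real robot `poll_once` would be called at a fixed rate from the main loop, so the timing and ordering of sensor reads stay under the programmer's control.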

For my part I tend to keep things massively parallel all the way through - from the camera pixels straight to the neurons and out to the motors without ever being turned into easily transmitted symbols such as "there's a cup 25 degrees to your left". That's pretty concurrent stuff, but the thought of trying to pass millions of individual data items per second by assembling and parsing SOAP packets would be ludicrous!

It's not for the faint-hearted, that's for sure.

Bob Mottram said...

It's emergent effects which make things surprising and fun, and a game based around emergence could have a lot more lasting appeal for the player due to the endless creation of novelty.

Actually I'm following a somewhat similar path to MSRS, by turning sensor and actuator values into XML, then passing the messages between programs and computers via TCP. Admittedly from a purist's perspective this is hopelessly inefficient, with a lot of parsing overhead. However, in my case the amount of information shuttling around is quite small (I'm not trying to pass the state of a thousand neurons from one program to another, merely things like encoder values and button presses). From a development point of view there is some convenience in being able to debug systems by intercepting the messages and displaying the relatively human readable content. I could pass the data directly without the XML formatting, but that would mean that I'd need to create some fixed-width message formats.

On a PC all this extra XML parsing is not really an issue in terms of performance, although on a microcontroller or other low power embedded system it would be far more impractical.
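
The trade-off between readable XML messages and fixed-width binary ones can be sketched roughly as follows. This is only an illustration in Python: the element names, attribute names and the binary layout are assumptions, not the actual message format used in this project.

```python
import struct
import xml.etree.ElementTree as ET


def encode_xml(left_ticks: int, right_ticks: int, button: bool) -> bytes:
    """Package two encoder counts and a button state as a small XML message."""
    msg = ET.Element("sensors")
    ET.SubElement(msg, "encoder", side="left").text = str(left_ticks)
    ET.SubElement(msg, "encoder", side="right").text = str(right_ticks)
    ET.SubElement(msg, "button").text = "1" if button else "0"
    return ET.tostring(msg)


def decode_xml(data: bytes):
    """Parse the XML message back into (left, right, button)."""
    msg = ET.fromstring(data)
    ticks = {e.get("side"): int(e.text) for e in msg.findall("encoder")}
    return ticks["left"], ticks["right"], msg.findtext("button") == "1"


# Fixed-width alternative: two little-endian int32 values plus one bool.
FIXED = struct.Struct("<ii?")


def encode_fixed(left_ticks: int, right_ticks: int, button: bool) -> bytes:
    return FIXED.pack(left_ticks, right_ticks, button)


def decode_fixed(data: bytes):
    return FIXED.unpack(data)


xml_msg = encode_xml(1200, -35, True)
fixed_msg = encode_fixed(1200, -35, True)
```

The XML version can be read directly when intercepted on the wire, whereas the fixed-width version is only 9 bytes but is meaningless without knowledge of the layout.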

Sending images packaged as XML would be very slow and silly, so what I'm doing is that most of the stereo calculations are done in a self-contained way, then something like 200 disparity values are packaged up as XML and broadcast to any other programs which might want to consume that data. This gives me the flexibility that the main part of the software doesn't care how the range data was obtained, or where it comes from (it could potentially be broadcast over the internet). So, for example, I can either use a webcam based stereo vision system, or use an off-the-shelf device such as the Surveyor SVS. In a world where hardware changes quickly, this sort of abstraction layer can be useful and help to ensure that a single robot project can be sustained over a number of years.
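
As a rough sketch of that abstraction layer (again in Python, with a made-up message layout and hypothetical source names), the consumer below extracts the disparity values without ever looking at which device produced them. The network broadcast itself is left out; only the packaging and parsing are shown.

```python
import xml.etree.ElementTree as ET
from typing import List


def package_disparities(values: List[float], source: str) -> bytes:
    """Wrap a list of disparity values in one XML element.

    The 'source' attribute is informational only (an assumption of this sketch).
    """
    root = ET.Element("disparities", source=source, count=str(len(values)))
    root.text = " ".join(f"{v:.2f}" for v in values)
    return ET.tostring(root)


def consume(message: bytes) -> List[float]:
    """Extract the disparity values, ignoring where they came from."""
    root = ET.fromstring(message)
    return [float(token) for token in (root.text or "").split()]


values = [1.5, 2.25, 10.0]
from_webcam = consume(package_disparities(values, "webcam_stereo"))
from_svs = consume(package_disparities(values, "surveyor_svs"))
```

Swapping the stereo hardware then only changes the producer; anything downstream of `consume` is untouched.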

Steve said...

In that case maybe you'd like MSRS, although you still have to commit yourself to learning their service-oriented architecture, which is a pretty big undertaking. By the time you've mastered MSRS you'll probably have forgotten what it was you wanted to use it for!

I guess the best thing about it is that you can test your code on the simulation engine, although for a non-standard robot platform that too is a fairly major undertaking. And not much use for your work in stereo vision, I should imagine, since any virtual cameras will return unnaturally clean, simple images derived from the renderer.

I can see your logic in using XML data for inter-PC comms on small data packets, if you're not strapped for processor power to run a big ANN or whatever. I bet your code has fewer layers of overhead than MSRS though - the latter certainly seems to hog the processor.

Bob Mottram said...

Simulation based testing can be advantageous, and I am trying to do some of that - devising unit tests wherever possible - although I'm not using anything as complicated as the 3D robot simulators available.

As you say, simulation based testing for vision is almost useless and I have not even attempted to go down that route. Even the best ray tracing systems currently available cannot accurately reproduce some of the optical effects which can easily be seen on any cheap webcam. 3D rendering produces much too perfect a result, which can lead to a false sense of confidence when running vision algorithms. It's possible to devise algorithms which work perfectly in simulation, but completely break down when faced with the uncertainties of the real world.

Another reason for not using MSRS is that I now use Linux for almost everything, so I'd like to retain the flexibility of running the system either on Linux or Windows. In some regards having the system Linux based simplifies things, and in the unlikely event that this project is a roaring success and I wanted to begin commercial mass production, Linux would have clear manufacturing cost advantages in the consumer market.