Voice Recognition Becomes More Intelligent
These days searching for a number in a five centimetre thick telephone directory seems very old fashioned. Voice recognition systems are becoming more and more common and efficient: the best of them apparently recognise 49 out of every 50 words.
These devices save companies a huge amount of money. Stephen Evans in New York has been talking to the machines and to the men who design them.
I had a bit of a Basil Fawlty moment the other day. I rang 411, the American directory enquiries service, which now uses a voice recognition system. I told the machine I wanted the number for "Harlem Auto Mall" and she -- for this machine had a female voice -- replied "Harlem Public School 154". No doubt like lots of people, I found myself ranting.
Machines, you see, have personalities, and banks, phone companies, railways and all kinds of alleged help-lines are spending a lot of money trying to find out what kinds of voices to give the machines that speak to us, the public, on their behalf.
Much of the research is conducted in a small room -- Room 325 in McClatchy Hall -- at Stanford University in California. It's the site of the drily-entitled but fascinating laboratory for "Communication between Humans and Interactive Media", and the domain of a genial, enthusiastic professor called Clifford Nass who studies, quite simply, how people and machines get on, particularly when the machines talk to the people.
In his lab, a stream of students and local people of all shapes and sizes undergo tests. Voices of different ages and accents are played to them and their reactions noted: "Did you trust that voice?" "Did this one have authority?"
Generally, the tests show that people are less persuaded by female voices than by male ones (though people are more likely to be antagonised by a male voice). On the up side, male-voiced machines are perceived to have energy and authority. One result of that, for example, is that in Japan a stock-broking company used a female voice on its machine to give information on stocks and shares but then a male one to make the actual sale.
Now, in many parts of the world, when you hire a car, you get a navigation system -- a little electronic map on a screen with a machine voice. In America, it's a female voice (whom I like to call Gladys). She tells me, say, to make a right in two miles and -- I fancy, at least -- gets exasperated if I don't follow her directions: "Recalculating Route", she snaps, in her American English.
Now, in Germany when they tried a similar system, men reacted against being given directions by a female voice so it had to be taken off the market. Old people, by the way, take advice more readily from young people than from people their own age.
Tone matters to drivers. Professor Nass is working on a system where the machine-voice changes according to how you address it. He's discovered that irritable drivers can calm down if the voice on the navigation system is subdued -- though, for some reason that he doesn't quite understand, calm drivers get wound up by subdued, low-key voices that don't vary in pitch. So the next task is to vary the navigation system's voice according to how grumpy you, the driver, are. If you sound aggressive to the machine, the machine will change tone to calm you down.
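The rule Professor Nass describes can be put down as a tiny sketch. Everything named here is an assumption made for illustration: the mood labels, the `choose_voice_style` function and the "lively" style are invented, and the piece does not say how the real system judges a driver's mood or what voice it uses for calm drivers, only that subdued voices wind them up.

```python
# Illustrative sketch only: the labels and the function name are hypothetical,
# not taken from Professor Nass's system.

def choose_voice_style(driver_mood: str) -> str:
    """Pick a navigation voice style from the driver's apparent mood."""
    if driver_mood == "irritable":
        # Irritable drivers were found to calm down with a subdued voice.
        return "subdued"
    if driver_mood == "calm":
        # Calm drivers got wound up by subdued, flat voices, so this sketch
        # assumes a voice with more variation in pitch suits them better.
        return "lively"
    raise ValueError(f"unknown mood: {driver_mood!r}")
```

A call such as `choose_voice_style("irritable")` would hand back the subdued voice. The hard part, working out from someone's speech how grumpy they are, is left out entirely.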
The technology is improving all the time. Basically, to build a machine that speaks, a human actor first records countless different words and syllables, and a computer then re-assembles the sounds into coherent sentences, according to what it thinks you've said to it. These machines are getting better and better, and are better able to recognise more accents and variations.
They're also better able to talk back without sounding like a machine. It seems the androids are getting very good indeed.
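The record-and-reassemble idea described above can be shown in miniature. This is a rough sketch under stated assumptions, not how any particular product works: the `RECORDED_UNITS` table, the sample values and the `assemble_sentence` function are all made up, and short lists of numbers stand in for the actor's recordings.

```python
from typing import Dict, List

# Hypothetical stand-ins for an actor's recordings: each word maps to a
# short list of audio samples. Real systems store far more, and far smaller,
# units of sound.
RECORDED_UNITS: Dict[str, List[float]] = {
    "recalculating": [0.1, 0.3, 0.2],
    "route": [0.4, 0.1],
    "turn": [0.2, 0.2],
    "right": [0.5, 0.3, 0.1],
}

# A single silent sample, used here as the gap between words.
PAUSE: List[float] = [0.0]


def assemble_sentence(words: List[str]) -> List[float]:
    """Join the recorded units for each word into one stream of samples."""
    samples: List[float] = []
    for index, word in enumerate(words):
        key = word.lower()
        if key not in RECORDED_UNITS:
            raise KeyError(f"no recording for {word!r}")
        if index > 0:
            samples.extend(PAUSE)  # short silence between words
        samples.extend(RECORDED_UNITS[key])
    return samples
```

Calling `assemble_sentence(["Recalculating", "Route"])` strings the two recordings together with a pause between them. A word the actor never recorded raises an error, which is one reason so many words and syllables have to be recorded in the first place.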
And companies like them a great deal. They even construct personas around the voices on the machines that speak for them. One of the Canadian telephone companies published a biography of the imaginary woman its machine was imitating. She was Emily, a nice small-town girl who had a history degree and went back-packing round Asia after college. With some panache, a local radio host decided to call her up. Emily, of course, being a machine could only answer his chat with lines like "You're calling to check your account balance. Is that right?"
It may be, though, that the company has the last laugh. Emily is paid no wages and the telephone company reckons it saves three million dollars a year by employing her instead of a crowd of expensive, high-maintenance human-beings. There's no doubt that soon the androids will speak better than we do -- and they're much, much cheaper -- they're much, much cheaper -- much, much cheaper.