IBM Makes Most Realistic Computerized Voice Ever

Logan Westbrook

Transform, Roll Out, Etc
Feb 21, 2008
17,672
0
0


Getting stuck with an automated service when on the phone might just get a little bit more bearable, thanks to IBM.

The new technology, called 'generating paralinguistic phenomena via markup in text-to-speech synthesis', gives the machine voice more human attributes, such as verbal tics, sighing, or even coughing to get your attention.

Andy Aaron, of IBM's Thomas J Watson research group speech team, said, "These sounds can be incredibly subtle, even unnoticeable, but have a profound psychological effect. It can be extremely reassuring to have a more attentive-sounding voice... When you are on the telephone on an automated service helping you fix your computer or buy insurance, this could make the difference between being a happy customer or hanging up and canceling a service."

Aaron was quick to assuage any fears that this technology might be used to replace actual people, saying, "We are almost at the point where the voice is indistinguishable from a human, but that is not our goal. We don't want to fool anybody."

Source: The Telegraph [http://www.telegraph.co.uk/scienceandtechnology/technology/technologynews/4420798/IBM-develop-most-realistic-computerised-voice.html]



 

Limos

New member
Jun 15, 2008
789
0
0
Why is there no link to where we can hear this Oh-so-fancy new voice?
 

Virgil

Legacy
Jun 13, 2002
1,507
0
41
The sound of the voice is not the problem. The problem with their terrible voice-driven automated systems is that it takes 12 tries to say "No" before it finally decides to fail out and give you to an operator.

Seriously, if they're not just going to give me a person to begin with, I'll take knowing it's a machine and pressing the number keys over any more of these experiments in 'customer satisfaction'.
 

MaxFan

New member
Nov 15, 2008
251
0
0
Virgil said:
The sound of the voice is not the problem. The problem with their terrible voice-driven automated systems is that it takes 12 tries to say "No" before it finally decides to fail out and give you to an operator.

Seriously, if they're not just going to give me a person to begin with, I'll take knowing it's a machine and pressing the number keys over any more of these experiments in 'customer satisfaction'
Yep, when they changed one of my authorization systems from getting a human right away to trying to do it automated first, the issue was not whether the voice was realistic, but that it now took ten times as long to do the same thing because of pathetic programming.
 

cleverlymadeup

New member
Mar 7, 2008
5,256
0
0
i don't really mind a lot of the voice systems, tho i will say american ones are worse than canadian ones. bell's voice system emily is pretty good, the rogers one is pretty good too

the biggest issue i've found is with the ppl not the software, ppl tend to babble on instead of just saying 1 or 2 words that are needed. that and accents can be a real pain.

i think the biggest issue i've found with the american systems is the bad phone lines in america
 

mangus

New member
Jan 2, 2009
399
0
0
I call bull on the "almost indistinguishable from a human", even without having heard it.
Also, the key to robot phone systems is to talk like you are also a soulless machine. Don't get mad, get I AM ROBOT.
 

theultimateend

New member
Nov 1, 2007
3,621
0
0
Finally Phone Sex without another person.

To think that people said technology wasn't advancing for the betterment of mankind. IBM is showing those people the door with a hearty "Good Day to you Sir!"

But seriously, who really cares? I think the bigger problem is that IBM is working on a more realistic voice to replace people instead of hiring people who need jobs.
 

9of9

New member
Feb 14, 2008
199
0
0
If you look up IBM voice demos, there's a couple online - including this one: http://www-01.ibm.com/software/pervasive/tech/demos/tts.shtml

It's actually pretty damned good - it struggles with less common words sometimes, but the intonation is spot-on. And it doesn't look like this is their most recent, state-of-the-art technology, so I'm assuming that the new tech they're boasting about in the article is better than that.
 

SmugFrog

Ribbit
Sep 4, 2008
1,239
4
43
Nice find 9of9. Most of those sound pretty much like a computer though, with the exception of Kate. That's impressive. If the tech in the article is even better than this, that will be creepy!

EDIT: I wonder if it could be used in games! Imagine the impact this could have on the modding community.
 

SmugFrog

Ribbit
Sep 4, 2008
1,239
4
43
I still say it could have great applications in the modding community - especially for a game like Oblivion or Fallout 3. Check out this link - this seems more like what the article is about, but I can't find a section where you can put your own words into it:

http://www.research.ibm.com/tts/

Normal - http://www.research.ibm.com/tts/samples/expressive/gdncookiesN.wav
Expressive - http://www.research.ibm.com/tts/samples/expressive/gdncookiesE.wav

Normal - http://www.research.ibm.com/tts/samples/expressive/apodoorN.wav
Expressive - http://www.research.ibm.com/tts/samples/expressive/apodoorE.wav

Normal - http://www.research.ibm.com/tts/samples/expressive/QYNbeachN.wav
Expressive - http://www.research.ibm.com/tts/samples/expressive/QYNbeachE.wav

Awesome! I would love to use this program to read forum posts or different pages on the internet while I'm on a different window. I still don't think this one sounds as "good" as the British / Australian voice (Kate) that 9of9 linked. It sounds better in presentation, almost like it has emotion to it; but I think if they could make it sound more like "Kate", it would be really awesome.

Danzorz said:
I seriously doubt it was worth the money and next time they should just use common sense.
You really think it is a waste of time? I would hate to see this development be used only for telemarketers. It has such potential for so many other applications! You know, there are blind people that use the internet via text-to-speech programs - I'm sure this would be wonderful to listen to instead of the default stuff. Or, perhaps, Stephen Hawking could have a better voice?

I just really want it for the modding aspects. Have you never played Oblivion or Fallout mods where the mod creator only added new text instead of a voice for his NPC? It ruins the immersion a bit - I think this could be used to make it read from a file and sound realistic as an NPC's voice.