The Turing Test Has Been Beaten

Kordie · Jun 26, 2012

OK, MOSTLY beaten: the test calls for a 30% success rate and the AI only achieved 29%.

Bot with boyish personality wins biggest Turing test [http://www.newscientist.com/blogs/onepercent/2012/06/bot-with-boyish-personality-wi.html?DCMP=OTC-rss&nsref=online-news]

For those who can't follow the link,

Eugene Goostman, a chatbot with the personality of a 13-year-old boy, won the biggest Turing test ever staged, on 23 June, the 100th anniversary of the birth of Alan Turing.

Held at Bletchley Park near Milton Keynes, UK, where Turing cracked the Nazi Enigma code during the second world war, the test involved over 150 separate conversations, 30 judges (including myself), 25 hidden humans and five elite, chattering software programs.

By contrast, the most famous Turing test - the annual Loebner prize, also held at Bletchley Park this year to honour Turing - typically involves just four human judges and four machines.

"With 150 Turing tests conducted, this is the biggest Turing test contest ever," says Huma Shah, a researcher at the University of Reading, UK, who organised the mammoth test.

That makes the result more statistically significant than any other previous Turing test, says Eugene's creator Vladimir Veselov based in Raritan, New Jersey. "It was a pretty huge number of conversations," he said, shortly after he was awarded first prize: "I am very excited."

First conceived by Turing in the early 1950s, the test is the most famous evaluation of machine intelligence. Human judges converse via a text interface with both hidden bots and humans - and say in each case whether they are chatting to a human or machine.

Turing said that a machine that fooled humans into thinking it was human 30 per cent of the time would have beaten the test. Just short of this, Eugene fooled its judges 29 per cent of the time. In a close second place, came JFred, the brain child of Robby Garner, and in third place Rollo Carpenter's Cleverbot. The other two bots to compete were Ultra Hal and Elbot.

Unlike several of Eugene's rivals, which put together sentences by imitating people they have spoken to before or by searching through Twitter transcripts for conversational ideas, Veselov has given his bot a consistent and specific personality. "He has created very much a person where Cleverbot is everybody," says Carpenter.

Eugene's character is that of a 13-year-old boy living in Odessa, Ukraine. He has a pet guinea pig and a father who is a gynaecologist. Is 13 about the right age for a chatbot, then? "Thirteen years old is not too old to know everything and not too young to know nothing," explains Veselov.

A veteran of the Loebner prize and the Chatterbox challenge, Eugene was due a win. "We took second place several times but never were we the winners," says Veselov.

Did having a personality give him an advantage? "I think any appearance of a particular personality is likely to have a persuasive effect on judges," says John Barnden, an AI researcher specialising in machine understanding of metaphor at the University of Birmingham, UK, and a fellow judge.

He cautions against concluding that this was Eugene's edge, however - for that you would have to compare two versions of the same bot, but in one case with personality suppressed.

"In my own case it's not so much personality in the abstract that's key as how the system responds to a comment - is the response relevant and non-vacuous?" he adds.

I can sympathise with that: in some cases I knew it was a machine because the entity didn't seem to follow the sense of the conversation. I was, however, delighted by how funny and zany some of the conversations were with beings that I labelled as bots (Disclaimer: the best judge award is still to be awarded so I don't actually know how often I was right). They also forced me to consider in a new way just what it is that makes humans human.

I wonder if Chris Hansen will be out of a job now that a computer can pretend to be a 13 year old in chat rooms.

CAPTCHA · Jun 26, 2012

And the SCP-050 [http://www.scp-wiki.net/scp-050] award goes to...

Doclector · Jun 26, 2012

You fools, you gave them a personality? The next step is sentience. We're doomed.

So, place your bets: are we in for a Skynet, a group of Cylons who "have a plan", or a slower approach, like an I, Robot-esque scheme, or simple AI personality cores going haywire and either tricking people into setting off ancient weapons of mass destruction, or becoming obsessed with cake and testing?

I'm betting cylons, personally.

Rowan93 · Jun 26, 2012

30% is still depressing, and pretending to be a young human is cheating. At least 18 years old, but if the AI is going to pretend to be a specific human that seems like a separate set of skills anyway.

True AI won't get here for at least another 15 years.

David Bjur · Jun 26, 2012

Doclector said:
You fools, you gave them a personality? The next step is sentience. We're doomed.

So, place your bets: are we in for a Skynet, a group of Cylons who "have a plan", or a slower approach, like an I, Robot-esque scheme, or simple AI personality cores going haywire and either tricking people into setting off ancient weapons of mass destruction, or becoming obsessed with cake and testing?

I'm betting cylons, personally.

I, Robot scheme, or at least I hope so, since then we will have Will Smith on our side.

OT: So it might be a year away until an AI can beat the Turing Test? I sure hope so; then it will be much easier for me to finally find friends.

Jonluw · Jun 26, 2012

Cleverbot took third place?
That sort of takes a bit of the glory from taking first place in the competition, doesn't it?

And it seems to me that instead of creating a machine that is actually intelligent enough to fool humans, they've created a machine that has certain presets that make it harder to determine if it's a machine.

A stupid robot dressed in human skin is going to make a more convincing human than a smart robot that looks like a robot.
In this analogy, Eugene is the former.

Hoplon · Jun 26, 2012

Kordie said:
OK, MOSTLY beaten: the test calls for a 30% success rate and the AI only achieved 29%.

It fell short at only 29%. To pass the test it has to be 30% or better.

So no, the threshold is still intact.

porpoise hork · Jun 26, 2012

huh.. interesting.. I agree that using a 13 year old personality is kinda cheap.

On a side note, my wife looked at me funny when I ordered my "I failed the Turing Test" shirt..

Esotera · Jun 26, 2012

I think we're a long way off from a robot uprising...

This is pretty cool though. I'd like to see it repeated, and to have them increase the percentage with which they can beat it.

Kordie · Jun 26, 2012

Hoplon said:
Kordie said:

OK, MOSTLY beaten: the test calls for a 30% success rate and the AI only achieved 29%.

It fell short at only 29%. To pass the test it has to be 30% or better.

So no, the threshold is still intact.

Almost like that was the first sentence in the post...

ReservoirAngel · Jun 26, 2012

Doclector said:
You fools, you gave them a personality? The next step is sentience. We're doomed.

So, place your bets: are we in for a Skynet, a group of Cylons who "have a plan", or a slower approach, like an I, Robot-esque scheme, or simple AI personality cores going haywire and either tricking people into setting off ancient weapons of mass destruction, or becoming obsessed with cake and testing?

I'm betting cylons, personally.

I'll take that bet. Since it basically relies on humans being kind of pricks to the robots, and being rampant pricks is what we're best at as a species.

captcha: "Gregory Peck"

Erm...

Art thou appeased?

Hoplon · Jun 26, 2012

Kordie said:
Almost like that was the first sentence in the post...

*head desk* My apologies.
