Wednesday 27 May 2009

IBM Watson


How IBM Plans to Win Jeopardy!

IBM's Watson will showcase the latest tricks in natural-language processing.

By David Talbot

What is Watson?: IBM is preparing a natural-language computer system that will compete against humans on TV’s Jeopardy!, which is hosted by Alex Trebek. 
Credit: IBM

For decades, humans have struggled to create machines that can extract meaning from human language, with all its messiness, subtle context, humor, and irony. Traditional approaches require a great deal of manual work up front to render material understandable to computer algorithms. The ultimate goal is to make this step unnecessary.

IBM hopes to advance toward this objective with Watson, a computer system that will play Jeopardy!, the popular TV trivia game show, against human contestants. Demonstrations of the system are expected this year, with a final televised matchup, hosted by the show's Alex Trebek, sometime next year. Questions will be spoken aloud by Trebek but fed into the machine in text format during the show.

The company has not yet published any research papers describing how its system will tackle Jeopardy!-style questions. But David Ferrucci, the IBM computer scientist leading the effort, explains that the system breaks a question into pieces, searches its own databases for "related knowledge," and then finally makes connections to assemble a result. Watson is not designed to search the Web, and IBM's end goal is a system that it can sell to its corporate customers who need to make large quantities of information more accessible.

Ferrucci describes how the technology would handle the following Jeopardy!-style question: "It's the opera mentioned in the lyrics of a 1970 number-one hit by Smokey Robinson and the Miracles."

The Watson engine uses natural-language processing techniques to break the question into structural components. In this case, the pieces include 1) an opera; 2) the opera is mentioned in a song; 3) the song was a hit in 1970; and 4) the hit was by Smokey Robinson and the Miracles.
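As a rough illustration only, the four pieces above could be written down as simple subject-relation-value records. IBM has not published how Watson represents a parsed question, so the `Constraint` type, its field names, and the relation labels below are all hypothetical, and the decomposition is entered by hand rather than produced by a parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """One structural piece of a clue (hypothetical representation)."""
    subject: str
    relation: str
    value: str


def example_constraints():
    """Return the article's four pieces for the Smokey Robinson clue.

    Written by hand for illustration; no language parsing happens here.
    """
    return [
        Constraint("answer", "is_a", "opera"),
        Constraint("answer", "mentioned_in", "song"),
        Constraint("song", "hit_in_year", "1970"),
        Constraint("song", "performed_by", "Smokey Robinson and the Miracles"),
    ]


if __name__ == "__main__":
    for c in example_constraints():
        print(f"{c.subject} --{c.relation}--> {c.value}")
```

Writing the pieces this way makes it explicit that two of them describe the unknown answer and two describe an intermediate entity, the song, which must be identified before the answer can be.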

In searching its databases for information that could be relevant to these segments, the system might find hundreds of passages. These could include the following three:

"Pagliacci," the opera about a clown who tries to keep his feelings "hid";

Smokey Robinson's Motown hit record of the '60s "Tears of a Clown"; 

"Tears of a Clown" by the Miracles hit #1 in the UK in 1970.

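To make the idea of "making connections" concrete, here is a toy sketch that links the three passages above through the words they share. It is not IBM's method, which is unpublished. Plain word overlap is a stand-in technique, and the stopword list, the function names, and the `CLUE_TERMS` set are assumptions made for this example.

```python
import re

# The three example passages from the article.
PASSAGES = [
    '"Pagliacci," the opera about a clown who tries to keep his feelings "hid"',
    "Smokey Robinson's Motown hit record of the '60s \"Tears of a Clown\"",
    '"Tears of a Clown" by the Miracles hit #1 in the UK in 1970',
]

# Small hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "the", "of", "in", "to", "his", "by", "who", "about", "s"}

# Keywords taken by hand from the clue's structural pieces (hypothetical).
CLUE_TERMS = {"opera", "1970", "smokey", "robinson", "miracles"}


def content_words(text):
    """Lowercase the text and return its non-stopword tokens as a set."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def shared_terms(a, b):
    """Words that two passages have in common: a crude 'connection'."""
    return content_words(a) & content_words(b)


def uncovered_terms(passages, clue_terms):
    """Clue keywords that none of the given passages mention."""
    seen = set()
    for passage in passages:
        seen |= content_words(passage)
    return clue_terms - seen


if __name__ == "__main__":
    print(shared_terms(PASSAGES[0], PASSAGES[1]))   # opera passage vs. song passage
    print(shared_terms(PASSAGES[1], PASSAGES[2]))   # the two song passages
    print(uncovered_terms(PASSAGES, CLUE_TERMS))    # empty set if all are covered
```

In this toy version the opera passage connects to the song passages only through the word "clown," while the two song passages together supply the artist and the year. No single passage satisfies the whole clue, which is why the pieces have to be chained.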