Saturday 23 October 2010

The Neverending Story | Speech Technology Magazine Blog

In a basement computer center in Pittsburgh, a computer is beginning to learn like a human. NELL, the Never-Ending Language Learner, is a computer that scientists at Carnegie Mellon University are teaching to read the internet and “learn” from the information.

In other words, NELL tries to extract not only facts, but relationships among facts; it looks for patterns in information. The project website explains that NELL learns facts that fall into certain categories, for example that “Barack Obama” is a person and a politician. NELL ran without supervision for six months, but then the scientists stepped in to check on NELL’s progress.

They estimate that its accuracy was 71%, and now they use their feedback to help NELL become more accurate. For example, the NYT reported on an error NELL made: it couldn’t understand the word ‘cookies’ when applied to the internet, and ended up deciding that Internet cookies were the kind of cookies you eat, which to me makes sense (who decided they should be called cookies, anyhow? I have to think it was some kind of nerdy in-joke).

“We are still trying to understand what causes it to become increasingly competent at reading some types of information, but less accurate over time for others. Beginning in June, 2010, we began periodic review sessions every few weeks in which we would spend about 5 minutes scanning each category and relation. During this 5 minutes, we determined whether NELL was learning to read it fairly correctly, and in case not, we labeled the most blatant errors in the knowledge base. NELL now uses this human feedback in its ongoing training process, along with its own self-labeled examples. In July, a spot test showed the average precision of the knowledge base was approximately 87% over all categories and relations. We continue to add new categories and relations to the ontology over time, as NELL continues learning to populate its growing knowledge base.”

Now, what could this mean for speech technology? As the NYT reports (and, I guess, it’s fairly obvious), this kind of technology could really improve natural language technology. In theory, you could have a ‘smarter’ system that is able to gather information in a new way. While I may have watched one too many episodes of Battlestar Galactica, and sometimes this kind of news makes me a bit twitchy, I can see how this kind of breakthrough could be potentially useful and, well, fascinating.

You can see a video here for more info on how it works, or just go to the website.
