Why IBM's Watson learned curse words
Anyone who has ever tried to learn a language knows that you can practice all you want, but you can't really catch on until you're immersed in a culture that predominantly speaks that language. That's because language is always evolving. We add words and make up new phrases that you won't find in any textbook or language-learning class.
IBM recognized the evolving nature of human language when it was developing its famous, Jeopardy-winning artificial intelligence computer system, Watson. To help Watson learn the intricacies of the English language, IBM's developers turned to the one place you can learn all the slang you've ever wanted to learn (and more): the Urban Dictionary, the online dictionary with definitions of nearly 7 million English slang words and phrases. IBM's developers had Watson memorize the dictionary. The only problem was that Watson didn't have a computer-to-mouth filter:
Watson couldn't distinguish between polite language and profanity -- which the Urban Dictionary is full of. Watson picked up some bad habits from reading Wikipedia as well. In tests it even used the word "bullshit" in an answer to a researcher's query.
Ultimately, Brown's 35-person team developed a filter to keep Watson from swearing and scraped the Urban Dictionary from its memory. But the trial proves just how thorny it will be to get artificial intelligence to communicate naturally. Brown is now training Watson as a diagnostic tool for hospitals. No knowledge of OMG required.
That has been the biggest challenge for Eric Brown, an IBM research scientist who works with Watson: teaching it the subtlety of language in hopes of getting it to pass the Turing test, which checks whether an AI's conversation is indistinguishable from a human's. Brown told CNN, "As humans, we don't realize just how ambiguous our communication is." And I bet Brown didn't realize how raunchy English can be until Watson learned the Urban Dictionary.
[h/t The Atlantic]