Kristian Hammond quit artificial intelligence 10 years ago, but that is exactly when he created an intelligent machine. Hammond, a professor at the Intelligent Information Laboratory at Northwestern University, has built a computer that can create movie reviews by curating text found online on blogs and on Twitter.
The machine can also produce original sports stories from data. And what's surprising is that the sports stories aren't actually that bad. Local news stations could find this kind of news generation useful, but Hammond's personal goal is to create content for every Little League player in the country so that each player's family can read about their performance. Enter the world of robot journalism.
What made you think a machine could generate news? What gave you that idea?
Our lab works on personalized information systems. They take the form of whatever the task is and do a search that would find documents that would be useful to you. Then we realized instead of handing back lists of documents, we could create an experience for you.
News at Seven came out of that. We started working on that a few years ago. We built different kinds of news segments. My favorite is the movie reviews. It goes and finds who is in the movie and who directed the movie. It checks to see whether people like it or do not like it, and then picks a narrative arc. If a lot of people liked the movie and a lot didn't, it will create a conversation between the characters from the information it finds online.
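As an illustration only, the arc-picking step Hammond describes might look something like the sketch below. It assumes the online opinions have already been reduced to two counts, and it reads a roughly even split as the case that calls for a conversation. The function name `pick_arc`, the arc labels, and the `split_margin` threshold are all hypothetical and are not taken from News at Seven.

```python
def pick_arc(liked: int, disliked: int, split_margin: float = 0.2) -> str:
    """Choose a narrative arc from counts of positive and negative opinions.

    Hypothetical rule: if the share of positive opinions is within
    split_margin of 50%, treat the audience as divided and stage a debate.
    """
    total = liked + disliked
    if total == 0:
        return "no_opinion"  # nothing found online to build on
    share = liked / total
    if abs(share - 0.5) <= split_margin:
        return "debate"  # divided audience: a conversation between characters
    return "rave" if share > 0.5 else "pan"


print(pick_arc(48, 52))
print(pick_arc(90, 10))
```

A real system would need a way to decide whether a given blog post or tweet counts as positive or negative, which this sketch leaves out.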
Are you happy with the news segments?
I'm very happy with a lot of elements of it, but we are not happy with the voices. The voices are relatively flat. There's not a lot we can do about it because we are consumers of speech generation systems.
Then how did you get into creating sports stories?
It's a natural progression. We are looking at computer-generated content. We got to a point where, having generated from existing text online, we could generate from existing data. The reality is there is a ton of data, but it's numerical data. It's not that compelling to most people, but a story can be. We can do it automatically. We are generating content where it didn't exist before because we can get to the data and generate these stories. We think we can generate stories for any sport where there is numerical data.
We are generating college-level softball stories. You can go to Bigtennetwork.com to see the Stats Monkey stories on women's softball. Our byline is "Narrative Science." We get the scores. The system looks at the score, figures out which plays and which players were important, and then builds a characterization of the game. Was it a heroic effort from one player or was it a team effort? The story is then sent back to Bigtennetwork.com and back to the schools.
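To make the pipeline concrete, here is a minimal sketch of the score-to-story steps described above: take the numbers, decide whether one player dominated, and pick a sentence to match. Everything in it is an assumption for illustration. The `PlayerLine` record, the use of runs batted in as the only measure of importance, the 50 percent `hero_share` threshold, and the sentence templates are hypothetical and are not Stats Monkey's actual rules.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PlayerLine:
    name: str
    rbis: int  # runs batted in for the winning team


def characterize(lines: List[PlayerLine],
                 hero_share: float = 0.5) -> Tuple[str, Optional[PlayerLine]]:
    """Label the game 'heroic' if one player drove in at least
    hero_share of the team's RBIs, otherwise 'team'."""
    total = sum(p.rbis for p in lines)
    if total == 0:
        return "team", None
    top = max(lines, key=lambda p: p.rbis)
    if top.rbis / total >= hero_share:
        return "heroic", top
    return "team", None


def write_story(winner: str, loser: str, winner_runs: int, loser_runs: int,
                lines: List[PlayerLine]) -> str:
    """Build a two-sentence recap from the final score and batting lines."""
    angle, hero = characterize(lines)
    lead = f"{winner} beat {loser} {winner_runs}-{loser_runs}."
    if angle == "heroic" and hero is not None:
        return f"{lead} {hero.name} carried the offense, driving in {hero.rbis} runs."
    contributors = sum(1 for p in lines if p.rbis > 0)
    return f"{lead} It was a team effort, with {contributors} players driving in runs."


# Made-up data, not a real game.
print(write_story("Team A", "Team B", 5, 2,
                  [PlayerLine("Player One", 4), PlayerLine("Player Two", 1)]))
```

The point of the design is that the characterization is decided first and the wording follows from it, so adding a new angle means adding a rule and a template rather than rewriting the generator.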
What else could this be used for?
We are also looking at working in the financial realm to create quarterly reports. We want to create personalized market reports based upon what your portfolio is. We can create personalized medical information. Anywhere there is data and there is interest. Our hope is to be able to do something for the 2010 census and provide every local newspaper with information about their town and how they compare to the rest of the country.
How is your AI different than traditional AI?
Rather than thinking about what we are doing in terms of deep cognitive models, we hook up semantics to a combination of search and statistics. And by doing this, we get a lot of what people are striving for in artificial intelligence. We were able to create experiences that are compelling and, every once in a while, a little amusing. Artificial intelligence has rested on the foundation of very deep representations of events and the control of inferences. Ours begins with a set of numbers, and we apply statistics.
What do other AI researchers think of your work?
We get more feedback from journalists. People really like the stories. And sports writers really like the stories. People don't have the time or the staff to create the kind of content we generate. If they can get more content by depending upon us, it enriches their websites without taking people away from more traditional tasks.
People are hungry for genuine content. We will have content that will make individuals happy. If we can get hold of data, we can generate a story in a quarter of a second. We are really looking at what we can do to provide a set of services. As more data comes online, we will be there to generate the stories.
Here is a movie review. It's interesting, but Hammond was right. The voices are really, really flat.