By John Herrman
Posting in Design
Tucked into DARPA's 2012 budget request is an audacious plan: to create a universal, instant language translator.
In a way, it's shocking that machine translation--the computer-aided translation of human languages--hasn't been perfected yet. Philosophers and mathematicians have been proposing its possibility since before linguistics was a formal discipline, and, as Spencer Ackerman at Danger Room notes, researchers were making bold and specific claims about the imminence of effective machine translation in the mid-1950s, when computers still depended on punch cards.
Processing power has since reached comparatively stratospheric levels. And yet machine translation, to be blunt, is still pretty terrible. Services like Google Translate are impressive mainly because their predecessors were so ineffective. Asking Google to translate a webpage or a block of text will usually net you something good enough to provide a gist, but that's it.
It's no surprise, then, that universal language translation is something that the Pentagon is hungry for. It is a surprise that the DoD, via DARPA, seems to think that it's an attainable goal in the near term. DARPA's 2012 budget request (PDF) contains the following initiative, for which $15 million will be earmarked in 2012:
The Boundless Operational Language Translation (BOLT) program will enable communication regardless of medium (voice or text), and genre (conversation, chat, or messaging) through expansion of language translation capabilities, human-machine multimodal dialogue, and language generation. The BOLT program will enable warfighters and military/government personnel to readily communicate with coalition partners and local populations and will enhance intelligence through better exploitation of all language sources including messaging and conversations. The program will also enable sophisticated search of stored language information and analysis of the information by increasing the capability of machines for deep language comprehension.
Bureaucratic language has a tendency to flatten even the most ambitious claims, so let's unpack this a little. DARPA is planning a universal translation technology that does the following:
- Translates text
- Translates voice messages
- Understands colloquial errors as well as incorrect and incomplete syntax
- Interprets poor pronunciation.
As is so often the case with DARPA's plans, this sounds impossible--or at the very least, implausible--in the near future. There's also the issue of redundancy: private companies have been hard at work on machine translation for years, and have invested millions of dollars with varying degrees of success.
Google recently released a smartphone app that bridges its voice recognition and translation services, allowing people to do something like what is described in this proposal, albeit not to the standard laid out for BOLT. I find it hard to imagine that a team of engineers could even replicate Google's present successes for $15 million, much less something significantly more capable.
To dismiss BOLT as a pipe dream, though, is a mistake. A year of research and a chunk of change won't summon perfect battlefield translation devices into existence, but they may provide valuable insights and translation techniques that help bring the dream of universal, instant and good translation into reality.
Google Translate has been built on an effective but narrow set of techniques. Google, instead of attempting to construct a translation ruleset from scratch, has allowed its computers to deduce the rules on their own. According to the company:
[The computers learn rules] by analyzing millions and millions of documents that have already been translated by human translators. These translated texts come from books, organizations like the UN and websites from all around the world. Our computers scan these texts looking for statistically significant patterns -- that is to say, patterns between the translation and the original text that are unlikely to occur by chance. Once the computer finds a pattern, it can use this pattern to translate similar texts in the future.
This has worked out pretty well! The Google Translate service has reached levels of accuracy far beyond earlier services like AltaVista's (now Yahoo!'s) Babelfish. It's moderately effective at producing results from flat, syntactical text, like the content of a news article or an email--the kind of stuff that people most commonly want Google to translate, and which the company can then sell the most ads against. The translations are still conspicuously broken, but they're good enough for plenty of uses.
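To make the statistical idea concrete, here is a toy sketch in Python. It is not Google's system: the four-sentence English/Spanish corpus, the function names, and the choice of the Dice coefficient as the "pattern" score are all assumptions made for illustration. It counts how often words appear together in already-translated sentence pairs, then keeps the strongest pairing for each word.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus (English, Spanish), invented for this sketch.
PARALLEL = [
    ("green house", "casa verde"),
    ("green book", "libro verde"),
    ("big house", "casa grande"),
    ("big book", "libro grande"),
]


def learn_word_table(pairs):
    """Map each source word to the target word it co-occurs with most strongly."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every (source word, target word) pairing seen in this sentence pair.
        joint.update(product(src_words, tgt_words))

    def dice(s, t):
        # Dice coefficient: 1.0 when two words always appear together.
        return 2 * joint[(s, t)] / (src_count[s] + tgt_count[t])

    return {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}


def translate(sentence, table):
    """Word-for-word substitution; unknown words pass through unchanged."""
    return " ".join(table.get(word, word) for word in sentence.split())


if __name__ == "__main__":
    table = learn_word_table(PARALLEL)
    print(translate("big green house", table))  # prints "grande verde casa"
```

The output gets every word right and the word order wrong, which is roughly the "good enough for a gist" quality described above. Real statistical systems learn from vastly more text and model phrases and ordering, not just single words.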
BOLT seems to be less focused on creating a massive database of words and rules, and more focused on the thornier problems of translation; a day when you can enter a news article into Google Translate and be returned a perfect translation is easily conceivable and likely to occur soon, with or without DARPA's help. But a day when you can hold up a device to a panicked villager's mouth in a war-torn area of the world, and convert his colloquial, unusually inflected and regionally pronounced speech into usable data is still over the horizon.
New research into techniques for deep comprehension, inference and subtle contextual clues--the truly hard problems of machine translation--won't net usable results immediately. Or maybe even soon! But even as a complement to existing technologies rather than a replacement, a fresh and novel attempt at a true universal translator can't hurt. This thing is long overdue.
Top image depicts the output of the MIT-designed Whirlwind translation computer, as printed in the January 1956 issue of Scientific American. Accessed on the fantastic ModernMechanix.com.
Feb 21, 2011
The real reason for the absence of a great translator today is simply human stupidity + lack of motivation. It could have been done decades ago. Google's and Yahoo's awful translators are perfect examples of such applied stupidity. Human intelligence is the missing link between languages and computer power. @ sboverie@..."One of the problems with fighting a war in another country..." Why fight wars in other countries?
How will it translate typos? For example in your closing paragraph you state: "But even as a complement to existing technologies rather than a replacement, a fresh and novel attempt at a true universal translator can't help." Can't help what? Or did you mean "can't hurt"?
The problem with this is in employing a still highly imperfect technology (which machine/computer translation will continue to be for some time) in situations like intel analysis or combat area communications (like in rural Afghanistan) where one small mistake will have a massive impact, whether on foreign policy decisions or even life & death. http://www.schreibertransblog.com
Great idea. I have used some of the free translators online, and one that is a widget on the Mac. When translating a foreign language into English, the translator fails to translate most of the words. But enough is translated that I get the idea of what is being said. I would think, with the processing power of computers, a universal translator is doable, especially with languages that are widely used such as Spanish and French. But I can see a problem with languages that are not widely spoken, such as those spoken by Native Americans or dialects spoken in the different tribes in Africa.
I think it is a good idea to start somewhere, and it is long overdue. Let's face it: most arguments, disagreements and misunderstandings are because of poor communication. Even a teacher's hardest job is to communicate a concept or an idea. With translators, just maybe more people will understand one another and there will be a few fewer problems in the world. Signed, Just an opinion
One of the problems with fighting a war in another country is that the language barrier can make a small problem into an international incident. I read a story by an Iraqi war veteran talking about controlling traffic; the locals did not understand that they had to stop when the soldiers were trying to tell them. In most cases, the drivers were trying to go somewhere safe but could not understand the soldier's excitement or why the soldiers shot at them. A universal translator might have been used to prevent miscommunication that sometimes led to deaths of the driver and passengers.
Keep in mind, for the face-to-face, interactive translation aspect of it, they're not trying to make Faulkner comprehensible in Farsi. Translating a relatively small vocabulary of unambiguous military commands in real time is not remotely similar to, for example, translating a diplomatic address to the United Nations. Military command language is not subtle. Nobody is going to say "You can't put too much water in a nuclear reactor."
Here's a half-assed idea like the universal remote control. We have enough trouble trying to understand each other in just one language (and, except for dead ones like Ancient Egyptian, they change every day). But let's pretend that something like this might be possible at an unknown time years from now. Everyone knows that all of our dreams will come true in the future, won't they?
because not even two humans speaking the same language can understand each other (ask your wife or husband).
A fantastic idea. As a test for your article, how about using Google to translate it into half a dozen languages, just for the record for today. And, in five years do the same thing to see what advances we have.
I'm surprised the authors didn't mention MASTOR from IBM, a two-way, free-form speech translator. Its focus is to convey the meaning of what was said, even if minor errors are made by the speaker(s) or speech recognizer. The users speak into a microphone (one at a time), MASTOR recognizes and translates the speech, then vocalizes the translation in the target language for the foreign language speaker to hear. The foreign language speaker can then speak into the microphone in their own language, and MASTOR translates and vocalizes their speech in the original language. MASTOR is a software-based solution that can run on a PDA, tablet PC or laptop computer. See http://domino.research.ibm.com/comm/research_projects.nsf/pages/mastor.index.html Also in 2007, IBM donated the speech translation technology to foster better communication and humanitarian efforts in Iraq. It included 1,000 devices, 10,000 copies of the software, with a vocabulary of over 50,000 English and 100,000 Iraqi Arabic words, designed for environments such as hospitals and training. See http://www-03.ibm.com/press/us/en/pressrelease/21323.wss
In the late '70s, a classmate of mine joined a civilian intelligence branch of the Army. He was trained in 7 dialects of the Czech language, and spent several years translating intercepted communications for military intelligence-gathering efforts. I have no idea how many translators he worked with, but the big drawback with an organization like that is that to handle crisis situations which require large throughput, you must maintain high staffing levels. If you cut costs by cutting staff levels - well, guess what? You can't recruit people off the street and put them to work translating a language they don't know. By the time they've gotten past "Bonjour, Monsieur Ledoux. Ou est la salle de bain?", the bomb has dropped. With an effective automated tool like they are proposing, you can have rapidly-scalable machines doing most of the translation and early analysis, searching tirelessly through vast piles of data for content that needs immediate human attention. In normal times, the staff can spot-check the translations, adding corrections when errors are found, making the overall system more effective. In time of crisis, the staff's full attention can be focused on the flagged data streams. This thing could pay for itself in 5 or 10 years. The project could also have a side benefit. You could feed in the National Enquirer, iteratively translate from English to Chinese to German to Swahili to French to Japanese ad nauseam, and see how many virtual translators it takes to produce the complete works of Shakespeare. Aren't we all waiting for that?
This IS preserving the nation - 1 service member's life at a time. A language translator could be the difference between life & death for US troops in the field & the people they come into contact with. Although the ideal gadget may be many years off, the ability to understand & communicate a limited set of basic concepts is itself useful.
Yes, that gap is what they must attempt to overcome. For all we know, their technique could involve extremely specialized hardware, not teraflops of performance.
Human language interpretation is not a problem of computational capability; there is a huge gap between human languages and what computers can structure and process. http://en.wikipedia.org/wiki/Noam_Chomsky
The recent contest on Jeopardy featuring IBM's "WATSON" vs. two previous top winners on the program showed that the computer, as good as it was, did not have the complete knowledge and command of the language that humans have. A universal language translator is an immense and commensurately expensive project. In a day and time when the mere survival of the USA is at stake, funds should be concentrated on the preservation of the nation first, and all that is nice but not absolutely necessary should be postponed!