TALES is a multilingual, multi-modal analytic system that translates foreign-language news broadcasts and websites into English.
The Translingual Automatic Language Exploitation System is a multilingual, multi-modal analytic system that lets English speakers collect, index and access information contained in foreign-language news broadcasts and Websites. TALES technology is built on top of the IBM Unstructured Information Management Architecture (UIMA) platform and uses multiple IBM natural language technology components.
To facilitate the collection of foreign-language news, we addressed technical challenges in at least four areas:
Speech-to-text (STT). We addressed problems regarding background noise, crosstalk (interference from unintentional coupling to another communication channel or incidental conversation) and multiple dialects. We also had to break speech output down into phrases to aid machine translation.
Machine translation (MT). We dealt with words not currently found in standard dictionaries. We also confronted the issue of properly translating named entities, that is, names of persons, organizations and locations, as well as expressions of time, quantities, monetary values, percentages, etc.
Named entity information extraction (IE). We handled different types of incoming text, such as speech recognition output, which appears in all upper-case letters and contains no punctuation.
Intuitive presentation of translated transcription. We used subtitles/closed captioning to display translated transcriptions along with the video during playback.
TALES also had to deal with system-level engineering requirements, such as 24x7 video capture, near real-time video monitoring, flexible network deployment topology, system stability and error recovery, to name a few.
Addressing the research challenges
To address the research challenges, TALES relied on various IBM innovations, technologies and procedures:
• We developed speaker and dialect detection engines to better handle crosstalk and dialects. We wrote a "phraser" component, based on a statistical model, that uses silence duration information, language models and video keyframe location information to generate more intelligible phrases.
• We developed machine translation models that receive nightly updates based on user feedback, including out-of-vocabulary (OOV) words, which cannot be translated correctly. We obtained English translations of named entities by using the alignment information* generated by our MT engines. Our MT models also adopted the tokenization component from information extraction to produce more accurate alignments.
• We adapted our information extraction model to work with speech recognition output, restoring sentence casing along the way.
• We displayed the translated transcription as subtitles/closed-captioning together with the video. We also developed intelligent segmentation and timing algorithms to display the closed captioning effectively.
• We developed a Web translation tool that displays translations as soon as they are available instead of waiting for the entire page to be translated. It lets end users browse and translate the Web simultaneously.
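To make the "phraser" idea above concrete, here is a minimal Python sketch that splits time-stamped speech recognition output into phrases wherever the silence between two words is long enough. It is an illustration only, not the TALES component: the real phraser is described as a statistical model that also uses language models and keyframe locations, whereas this sketch uses a fixed silence threshold. The function name `split_into_phrases`, the `(token, start, end)` input format and the default values of `min_silence` and `max_words` are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical input format: (token, start_sec, end_sec) for each recognized word.
Word = Tuple[str, float, float]


def split_into_phrases(
    words: List[Word],
    min_silence: float = 0.35,  # assumed threshold, in seconds
    max_words: int = 12,        # assumed cap so phrases stay short
) -> List[List[str]]:
    """Group recognized words into phrases using silence duration only."""
    phrases: List[List[str]] = []
    current: List[str] = []
    prev_end = None
    for token, start, end in words:
        # Silence between the previous word's end and this word's start.
        gap = 0.0 if prev_end is None else start - prev_end
        # Close the current phrase on a long pause or when it grows too long.
        if current and (gap >= min_silence or len(current) >= max_words):
            phrases.append(current)
            current = []
        current.append(token)
        prev_end = end
    if current:
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    stt_output = [
        ("THE", 0.0, 0.2), ("PRESIDENT", 0.25, 0.8), ("ARRIVED", 0.85, 1.3),
        ("TODAY", 1.9, 2.3), ("IN", 2.35, 2.45), ("CAIRO", 2.5, 3.0),
    ]
    for phrase in split_into_phrases(stt_output):
        print(" ".join(phrase))
```

A fixed threshold is the simplest possible boundary rule. A statistical phraser would instead score each candidate boundary using several signals at once, which is why the source mentions language models and keyframe locations alongside silence duration.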
Addressing the engineering challenges
We learned to control satellite receivers with programmable external infrared emitters that act as remote controls. We developed customized video capture software to simulate real-time video monitoring with a relatively short latency of 4.5 minutes. We devised algorithms to detect the most representative image among those on a webpage.
Several prototypes were built and successfully deployed to customer sites. During the course of deployment, we also addressed various unique yet critical customer requirements, such as accessing the entire TALES system only through port 80 and getting access to satellite video signals when the server room is too far from the roof.
How our customers benefited
These deployed TALES prototype systems let our customers search foreign-language news, play back streaming video with English closed captioning, monitor live video with low latency time, browse and translate foreign websites, etc. This means the number of human translators can be reduced significantly, thus saving our customers money. Translators also can focus on highly relevant documents, which are first identified by TALES. The flexible UIMA architecture lets customers develop and deploy their own analytical components. One customer embedded TALES technology as a portlet on a Web page.
Note that TALES is not merely an application but rather a “platform” in the sense that it offers a series of standard Web services that users can access directly. One customer already had an information analytics Web portal and wished to embed TALES capability as a portlet. They were able to do so with minimal effort because of TALES’ services and flexible architecture.
Today TALES supports Arabic, Chinese, Spanish and English. It is available as a publicly accessible demo.** In the future we expect to improve the quality of speech recognition and machine translation, support more languages, add on-demand data processing and the ability to process streaming web media, convert TALES into a turn-key appliance and generally provide more value to the customer. We also intend to expand the TALES system into a Web-service-based TALES platform, on top of which customers can build their own applications and user interfaces, further extending the reach of IBM technology.
* Natural Language Processing researchers have two ways of translating named entities. They can translate the text to the target language and extract named entities there; or they can extract the named entities in the source language, translate the text into the target language and then find the corresponding “mapped” tokens in the translated text. Token alignment information is required for this latter approach.
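The second approach in the footnote above (extract in the source language, then map through token alignments) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the TALES implementation: the function name `project_entity`, the representation of alignments as `(source_index, target_index)` pairs, and the example sentence and its alignment are all hypothetical.

```python
from typing import List, Optional, Tuple


def project_entity(
    src_span: Tuple[int, int],         # entity position in the source: [start, end)
    alignment: List[Tuple[int, int]],  # assumed format: (source_index, target_index)
    target_tokens: List[str],
) -> Optional[str]:
    """Return the target-language text aligned to a source-language entity."""
    start, end = src_span
    # Collect every target token aligned to a token inside the entity span.
    tgt_indices = sorted({t for s, t in alignment if start <= s < end})
    if not tgt_indices:
        return None  # the entity has no aligned tokens in the translation
    # Take the contiguous target range covering all aligned tokens.
    return " ".join(target_tokens[tgt_indices[0]: tgt_indices[-1] + 1])


if __name__ == "__main__":
    source = ["el", "presidente", "de", "la", "Cruz", "Roja", "habló", "ayer"]
    target = ["the", "president", "of", "the", "Red", "Cross", "spoke", "yesterday"]
    # Hypothetical word alignment; note that "Cruz Roja" is reordered in English.
    alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5), (5, 4), (6, 6), (7, 7)]
    # "Cruz Roja" occupies source positions 4 and 5.
    print(project_entity((4, 6), alignment, target))
```

Taking the smallest contiguous target range is a deliberate simplification. It handles local reordering, as in the example, but a noisy alignment can stretch the range over unrelated words, so a production system would likely filter or score alignment links first.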
** If you would like to evaluate the demo system, contact raduf@us.ibm.com.
____________________________________________________________________________
TALES showcases a wide range of IBM technologies and products
Among these:
• UIMA framework
• STT/MT/IE engines
• Omnifind search engine
• CueVideo keyframe extraction technology (IBM Almaden)
• IBM DS400 storage area network (SAN) hardware
Related links
TALES project home page
To see a TALES demo or for information about the TALES TransBrowser, contact Leiming Qian at qianl@us.ibm.com.
Last updated December 21, 2007







