Operationalization of interactive Multilingual Access Gateways (iMAGs) in the Traouiero project

Abstract: We explain and demonstrate iMAGs (interactive Multilingual Access Gateways), in particular on a scientific laboratory Web site and on the Greater Grenoble (La Métro) Web site. This bilingual presentation has been obtained using an iMAG.

Presentation

This presentation is an adaptation and update of an article presented as a demonstration only at TALN-2010. The names of the files have been kept the same, although their contents are slightly different.

The iMAG concept was proposed by Ch. Boitet and V. Bellynck in 2006 (Boitet & al. 2008, Boitet & al. 2005), and reached prototype status in November 2008, with a first demonstration on the LIG laboratory Web site. It was adapted to the DSR (Digital Silk Road) Web site in April 2009, and then to more than 50 other Web sites. These first prototypes are extensions of the SECTra_w (Huynh & al. 2008) online translation corpora support system. Since the beginning of 2011, we have been operationalizing this software with a view to deploying it as a multilingual access infrastructure, in the context of the French ANR (National Agency for Research) Traouiero "emergence" project.

At first sight, an iMAG looks very much like Google Translate: one gives it a URL (starting Web site) and an access language, and then navigates in that access language. When the cursor hovers over a segment (usually a sentence or a title), a palette shows the source segment and proposes to contribute by correcting the target segment, in effect post-editing an MT result. With Google Translate, the page does not change after a contribution, and if another page contains the same segment, its translation is still the rough MT result, not the polished post-edited version.
The more recent Google Translator Toolkit enables one to MT-translate and then post-edit full Web pages online, from sites such as Wikipedia, but again the corrected segments do not appear when one later browses the Wikipedia page in the access language.

By contrast, an iMAG is dedicated to an elected Web site, or rather to the elected sublanguage defined by one or more URLs and their textual content. It contains a translation memory (TM) and a specific, preterminological dictionary (pTD), both dedicated to the elected sublanguage. Segments are pretranslated not by a single MT system, but by a (selectable) set of MT systems. Systran and Google are mainly used now, but specialized systems based on Moses and developed from the post-edited part of the TM will also be used in the future. The powerful online contributive platforms SECTra_w and PIVAX (Nguyen & al. 2007) are used to support the TMs and pTDs.

Translated pages are built with the best segment translations available so far. While reading a translated page, it is possible not only to contribute to the segment under the cursor, but also to switch seamlessly to the SECTra_w online post-editing environment, equipped with proactive dictionary help and good filtering and search-and-replace functions, and then back to the reading context.

A translation relay is being implemented to define the iMAGs or other translation gateways used by an elected Web site, to select and parameterize the MT systems and translation routes used for various language pairs, and to manage users, groups, projects (some contributions may be organized, others opportunistic), and access rights.

Finally, MT systems tailored to the selected sublanguage can be built (by combinations of empirical and expert methods) from the TM and the pTD dedicated to a given elected Web site. That approach will inherently raise the linguistic and terminological quality of the MT results, hopefully converting them from rough into raw translations.
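The page-building behaviour described above (pretranslate each segment with a set of MT systems, store the results in a site-specific TM, and always display the best translation available so far) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Candidate`, `TranslationMemory`, `build_page`, and `fake_mt`, the two quality ranks, and the in-memory storage are hypothetical and do not reflect the actual SECTra_w data model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical quality ranks: a post-edited segment outranks raw MT output.
RAW_MT = 1
POST_EDITED = 2


@dataclass
class Candidate:
    text: str      # target-language text of the segment
    quality: int   # RAW_MT or POST_EDITED in this sketch
    origin: str    # name of the MT system or contributor (illustrative)


@dataclass
class TranslationMemory:
    """In-memory stand-in for a TM dedicated to one elected Web site."""
    entries: Dict[Tuple[str, str], List[Candidate]] = field(default_factory=dict)

    def add(self, segment: str, lang: str, candidate: Candidate) -> None:
        self.entries.setdefault((segment, lang), []).append(candidate)

    def best(self, segment: str, lang: str) -> Optional[Candidate]:
        """Return the highest-quality candidate stored so far, if any."""
        candidates = self.entries.get((segment, lang), [])
        if not candidates:
            return None
        return max(candidates, key=lambda c: c.quality)


MTSystem = Callable[[str, str], str]  # (source segment, target language) -> translation


def build_page(segments: List[str], lang: str, tm: TranslationMemory,
               mt_systems: Dict[str, MTSystem]) -> List[str]:
    """Build a translated page from the best segment translations available."""
    page = []
    for seg in segments:
        if tm.best(seg, lang) is None:
            # Unknown segment: pretranslate it with every selected MT system.
            for name, translate in mt_systems.items():
                tm.add(seg, lang, Candidate(translate(seg, lang), RAW_MT, name))
        best = tm.best(seg, lang)
        # Fall back to the source segment if no MT system is configured.
        page.append(best.text if best else seg)
    return page


def fake_mt(segment: str, lang: str) -> str:
    """Placeholder MT system used only to make the sketch runnable."""
    return f"[{lang}] {segment}"
```

Because a contribution is stored in the shared TM rather than in a single page, a post-edited segment added through `tm.add(..., Candidate(..., POST_EDITED, ...))` would be picked up by every later call to `build_page` that contains the same source segment, which is the contrast with Google Translate drawn above.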
The demonstration will use some iMAGs created by the AXiMAG startup for various Web sites, such as the LIG lab Web site (http://service.aximag.fr:8180/xwiki/bin/view/imag/liglab) and the La Métro (Greater Grenoble) Web site (http://service.aximag.fr:8180/xwiki/bin/view/imag/lametro), where access in Chinese and English was enabled in 2010 for the Shanghai Expo.
Document type:
Conference paper
Translating and the Computer Conference 2011, Nov 2011, London, United Kingdom
Contributor: Valérie Bellynck
Submitted on: Thursday, 16 June 2016 - 19:07:22
Last modified on: Thursday, 11 January 2018 - 01:49:04


Files produced by the author(s)


  • HAL Id : hal-01333093, version 1



Christian Boitet, Valérie Bellynck, Achille Falaise, Hong-Thai Nguyen. Operationalization of interactive Multilingual Access Gateways (iMAGs) in the Traouiero project. Translating and the Computer Conference 2011, Nov 2011, London, United Kingdom. 〈hal-01333093〉



