Towards the Primary Platform for Language Technologies in Europe

Unsupervised MT for Low-resourced language pairs

Short Name: MT4All
Name: Unsupervised MT for Low-resourced language pairs
Coordinator: Gorka Labaka, University of the Basque Country
Consortium: University of the Basque Country, Barcelona Supercomputing Centre, Secretary of State for Digital Advancement, Iconic Translation Machines Limited, Unbabel, Tilde
Project Runtime: 1 January 2020 – 31 December 2021
Funded by: European Commission
The MT4All CEF action aims to provide bilingual resources (bilingual dictionaries and machine translation systems) for under-resourced languages in fields of public interest at the EU level, such as e-Health and e-Justice. MT4All will contribute to the CEF Automated Translation building block by extending its coverage to language pairs and domains for which no parallel data exist. Using monolingual data, MT4All will apply the latest advances in unsupervised machine translation to derive bilingual dictionaries and translation models. The specific use cases, defined by the validation partners, address the following language pairs and domains:
Finnish, Norwegian and Latvian with English in the financial domain
Ukrainian, Georgian and Kazakh with English in the legal domain
Norwegian, Spanish and German with English in the customer support domain
Basque and Catalan with English in the general domain