Department of Computational Linguistics

Machine Translation of Film Subtitles

The Department of Computational Linguistics develops and evaluates machine translation systems for the media industry. In cooperation with a large Scandinavian subtitling company, we have developed translation systems for English to Swedish and for Swedish to Danish and Norwegian. Development benefited from the close relationship between the Scandinavian languages, from the subtitle time codes, which support automatic alignment, and from large amounts of human-translated subtitles (more than 50 million words per language). All systems are in practical use and translate large volumes of subtitles every day.
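To illustrate how time codes can support automatic alignment, the following is a minimal sketch that pairs subtitles from two language versions of the same film by their temporal overlap. It is not the method used in the project, which this page does not describe. The names `Subtitle`, `overlap`, `align_by_timecode` and the `min_ratio` threshold are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    start: float  # display start time in seconds
    end: float    # display end time in seconds
    text: str


def overlap(a: Subtitle, b: Subtitle) -> float:
    """Return the number of seconds during which both subtitles are on screen."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def align_by_timecode(source, target, min_ratio=0.5):
    """Pair each source subtitle with the target subtitle it overlaps most.

    A pair is kept only if the overlap covers at least `min_ratio` of the
    shorter subtitle's duration (an assumed threshold, not a project value).
    """
    pairs = []
    for s in source:
        best = max(target, key=lambda t: overlap(s, t), default=None)
        if best is None:
            continue  # no target subtitles at all
        shorter = min(s.end - s.start, best.end - best.start)
        if shorter > 0 and overlap(s, best) / shorter >= min_ratio:
            pairs.append((s.text, best.text))
    return pairs
```

Subtitles without a sufficiently overlapping counterpart are simply dropped in this sketch. A production system would presumably also have to handle subtitles that are split or merged differently in the two languages.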

We are currently building subtitle translation systems for additional language pairs. The challenge lies in maintaining high translation quality even when the languages are typologically further apart and less training material is available.

Project head:



  1. Mark Fishel, Yota Georgakopoulou, Sergio Penkale, Volha Petukhova, Matej Rojc, Martin Volk and Andy Way (2012): From Subtitles to Parallel Corpora. In: Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012). Trento.
  2. Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maucec, Andy Way, Panayota Georgakopoulou and Martin Volk (2012): SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012). Istanbul.
  3. Martin Volk and Rico Sennrich (2011): Disambiguation of English Contractions for Machine Translation of TV Subtitles. In: Proceedings of the 18th Nordic Conference of Computational Linguistics (Nodalida 2011). Riga.
  4. Martin Volk, Rico Sennrich, Christian Hardmeier and Frida Tidström (2010): Machine Translation of TV Subtitles for Large Scale Production. In: Second Joint EM+/CNGL Workshop. Denver.
  5. Christian Hardmeier and Martin Volk (2009): Using Linguistic Annotations in Statistical Machine Translation of Film Subtitles. In: Proceedings of Nodalida. Odense.
  6. Martin Volk (2008): The Automatic Translation of Film Subtitles. A Machine Translation Success Story? In: Festschrift for Anna Sågvall Hein. Uppsala.
  7. Martin Volk and Søren Harder (2007): Evaluating MT with Translations or Translators. What is the Difference? In: Proc. of MT-Summit XI. Copenhagen.
