Main Publications

The following publications are useful for understanding the technology behind the Europe Media Monitor (EMM) family of applications. We invite you to first explore the multilingual EMM applications live. If you want to read more, go to the full list of publications.

Introduction to EMM's multilingual media monitoring applications

Multilingual event extraction and visualisation

Building social networks based on information extraction from the multilingual news

Design principles to build highly multilingual applications

Finite-state pattern recognition engine used for some EMM information extraction tasks

Geo-tagging: recognition and disambiguation of place names in multilingual text

  • Pouliquen Bruno, Marco Kimler, Ralf Steinberger, Camelia Ignat, Tamara Oellinger, Ken Blackler, Flavio Fuart, Wajdi Zaghouani, Anna Widiger, Ann-Charlotte Forslund, Clive Best (2006). Geocoding multilingual texts: Recognition, Disambiguation and Visualisation. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006), pp. 53-58. Genoa, Italy, 24-26 May 2006.

Multilingual quotation recognition (direct speech)

Multilingual multi-label document categorisation, using the Eurovoc Thesaurus

Multilingual named entity recognition: person and organisation names, matching and grounding of name variants

  • Pouliquen Bruno & Ralf Steinberger (2009). Automatic Construction of Multilingual Name Dictionaries. In: Cyril Goutte, Nicola Cancedda, Marc Dymetman & George Foster (eds.): Learning Machine Translation. pp. 59-78. MIT Press - Advances in Neural Information Processing Systems Series (NIPS).

Medical Information System MedISys and PULS - gathering and classifying multilingual Public Health-related news, early warning, medical event extraction

  • Steinberger Ralf, Flavio Fuart, Erik van der Goot, Clive Best, Peter von Etter & Roman Yangarber (2008). Text Mining from the Web for Medical Intelligence. In: Fogelman-Soulié Françoise, Domenico Perrotta, Jakub Piskorski & Ralf Steinberger (eds.): Mining Massive Data Sets for Security. pp. 295-310. IOS Press, Amsterdam, The Netherlands.

JRC-Acquis multilingual parallel corpus, sentence-aligned (22 languages)



Please send comments on this page to Ralf Steinberger.

Last update: 22 August 2012