Towards the Primary Platform for Language Technologies in Europe

National Competence Centre Hungary

The Languages of Hungary

Hungarian is the official language of Hungary and has approximately 13 million speakers. It is also an official language in Vojvodina (an autonomous province of Serbia) and in three regions in Slovenia. Outside Hungary, it is spoken in the neighbouring countries and in the United States, Canada and Israel. It is recognised as a minority or regional language in Austria, Croatia, Romania, Ukraine and Slovakia.
Hungarian shows relatively little dialectal variation: there are only seven dialects in Hungary and two in Romania. The dialects differ little from the standard and from each other.
Hungarian is the most widely spoken Uralic language in Europe. It belongs to the Finno-Ugric group, as do Finnish, Estonian and a number of minority languages. Its vocabulary differs considerably from that of most other European languages, which belong to the Indo-European family.
Features of Hungarian:

  • There is no grammatical gender. "He" and "she" are expressed by the same word.
  • Verbs are conjugated only in the present and past tenses. Other tenses have to be expressed periphrastically.
  • Hungarian morphology is rich in derivation and inflection, which leads to long words.
  • Words are subject to vowel harmony. Vowels fall into two classes, back ("deep") and front ("high") vowels. The vowels of a suffix change depending on the stem to which it attaches, so that the resulting word contains vowels of one class.
Wikipedia contributors. (2020, June 6). Hungarian language. In Wikipedia, The Free Encyclopedia. Retrieved 17:00, June 11, 2020, from https://en.wikipedia.org/wiki/Hungarian_language.
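
The vowel harmony described in the feature list above can be illustrated with a minimal sketch. It is not taken from the source and makes strong simplifying assumptions: a suffix is assumed to have exactly two variants (one with a back vowel, one with a front vowel), and a stem is assumed to take the back variant whenever it contains at least one back vowel. Real Hungarian has neutral vowels, three-way suffixes and lexical exceptions that this rule does not cover. The names BACK_VOWELS, vowel_class and attach are hypothetical and chosen for illustration only.

```python
# Simplified illustration of Hungarian vowel harmony (assumption-laden sketch).
# Back ("deep") vowels of Hungarian; all other vowels are treated as front here.
BACK_VOWELS = set("aáoóuú")


def vowel_class(stem: str) -> str:
    """Return "back" if the stem contains any back vowel, otherwise "front".

    Simplification: neutral vowels and irregular stems are not handled.
    """
    return "back" if any(ch in BACK_VOWELS for ch in stem.lower()) else "front"


def attach(stem: str, back_suffix: str, front_suffix: str) -> str:
    """Attach the suffix variant that matches the vowel class of the stem."""
    suffix = back_suffix if vowel_class(stem) == "back" else front_suffix
    return stem + suffix


if __name__ == "__main__":
    # The inessive case ("in ...") has the two variants -ban and -ben.
    for stem in ("ház", "kert", "város", "szék"):
        print(stem, "->", attach(stem, "ban", "ben"))
```

For the four sample stems, the sketch selects -ban for "ház" (house) and "város" (city) and -ben for "kert" (garden) and "szék" (chair).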

NCC Lead Hungary

Dr. Tamás Váradi is deputy director and head of the Department of Language Technology and Applied Linguistics of the Hungarian Research Institute for Linguistics. His research interests are language technology, especially corpus linguistics, machine translation, computational lexicography and, more recently, neural NLP.
Since the 1990s, he has played a significant role in building up the European, and especially the Hungarian, language technology infrastructure. He has coordinated several EU-funded projects, such as CESAR and iTranslate4.eu, and is currently leading the MARCELL and CURLICAT CEF Telecom projects. He was one of the founders of the CLARIN infrastructure project and serves as Secretary General of the European Federation of National Institutions for Language (EFNIL).
He has (co-)authored over 100 papers and conference proceedings.

Current National Initiatives

  • There is no dedicated LT programme, but some projects cover LT applications.
  • Since 2012, the independent Hungarian LT Research Group has been running at Pázmány Péter Catholic University, supporting the salaries of about six postdoctoral researchers and PhD students.
  • A national AI research project started recently. It focuses on neural methods and their applications in various areas, including LT, which makes up only a small part of the project.

META-NET White Paper on Hungarian

Eszter Simon, Piroska Lendvai, Géza Németh, Gábor Olaszy, and Klára Vicsi. A magyar nyelv a digitális korban – The Hungarian Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.


Availability of Tools and Resources for Hungarian (as of 2012)

The following table illustrates the support of the Hungarian language through speech technologies, machine translation, text analytics and language resources.

Speech technologies:  Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Machine translation:  Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Text analytics:       Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Language resources:   Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support