Towards the Primary Platform for Language Technologies in Europe

National Competence Centre Slovenia

The Languages of Slovenia

With over 90% of the Slovenian population speaking Slovene as their first language, Slovenia is one of the most linguistically homogeneous EU countries. Apart from Slovene, the most widely spoken languages are the minority languages of the former Yugoslavia. In addition, Italian and Hungarian are official minority languages in some regions near the border. Slovene is also spoken in parts of Italy, Austria, Hungary and Croatia, and in the US, Canada, Argentina and Australia. Altogether, Slovene has approximately 2.4 million native speakers, of whom 1.85 million live in Slovenia.
Slovene belongs to the western subgroup of the South Slavic languages.
Although Slovene is spoken by a relatively small population, it has over 40 regional dialects, which are grouped into 7 main dialect groups. This abundance of dialects complicated the development of a spoken standard.
Features of Slovene:

  • The rich inflectional system has 3 genders, 6 cases and 3 numbers. Slovene is one of the few Indo-European languages that inflect nouns, adjectives, pronouns, numerals and verbs for the dual number.
  • Adjectives are also inflected for degree and definiteness. As a result of this rich inflectional system, a single adjective can have 164 word forms.
  • Slovene has a relatively free word order. The sentence “Eve gave an apple to Adam.” can have 120 permutations. Different word orders stress different aspects of the sentence.
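The figure of 120 matches 5!, the number of ways to order five distinct words. The short sketch below checks that arithmetic. The Slovene rendering "Eva je dala Adamu jabolko" is our own assumption for illustration and is not given in the source. The count is purely combinatorial and says nothing about how natural each ordering sounds to a native speaker.

```python
from itertools import permutations
from math import factorial

# Assumed five-word Slovene rendering of "Eve gave an apple to Adam."
# (illustrative only; the source does not quote the Slovene sentence).
words = ["Eva", "je", "dala", "Adamu", "jabolko"]

# Enumerate every ordering of the five distinct words.
orderings = [" ".join(p) for p in permutations(words)]

# The number of orderings of n distinct items is n! (here 5! = 120).
count = len(orderings)
assert count == factorial(len(words))

print(count)          # 120
print(orderings[:3])  # the first few orderings, for inspection
```

A language with fixed word order would allow only a small fraction of these orderings, which is why the number is quoted as a sign of flexibility.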
Wikipedia contributors. (2020, July 6). Indo-European languages. In Wikipedia, The Free Encyclopedia. Retrieved 15:00, July 6, 2020, from https://en.wikipedia.org/wiki/Indo-European_languages
Government Communication Office of Slovenia. (2020, June 23). Official Language. In GOV.SI Portal. Retrieved 14:30, July 6, 2020, from https://www.gov.si/en/topics/official-language/

NCC Lead Slovenia

Dr. Simon Krek is a researcher at the Artificial Intelligence Laboratory of the “Jožef Stefan” Institute and head of the Centre for Language Resources and Technologies at the University of Ljubljana. His research interests include lexicography and lexicogrammar, corpus linguistics, natural language processing, language technology infrastructure, and computer-assisted language learning and teaching. He takes part in many national and international projects. At the national level, he currently leads four projects that support the development of language technology and resource infrastructure. At the international level, he participates in pan-European projects such as ELEXIS, CLARIN.SI and MARCELL.
Simon Krek has authored or co-authored over 350 scientific articles and conference papers. He was also editor-in-chief of the Oxford-DZS Comprehensive English-Slovenian Dictionary.

Current National Initiatives

  • Funding is currently limited to CLARIN.SI and the long-term research programme “LRs/LTs for Slovene”. A call worth EUR 4 million was published in November 2019, covering LT topics for Slovene such as speech technology, machine translation (MT), semantic technologies, corpus upgrades and a terminology portal.

META-NET White Paper on Slovene

Simon Krek. Slovenski jezik v digitalni dobi – The Slovene Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.

Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper

Availability of Tools and Resources for Slovene (as of 2012)

The following table illustrates the level of support for the Slovene language in speech technologies, machine translation, text analytics and language resources.

Each area is rated on a five-level scale: Excellent support, Good support, Moderate support, Fragmentary support, Weak/no support. The areas rated are:

  • Speech technologies
  • Machine translation
  • Text analytics
  • Language resources

(The rating assigned to each area is not preserved in this text version.)