Towards the Primary Platform for Language Technologies in Europe

National Competence Centre Serbia

The Languages of Serbia

Approximately eight million native speakers of Serbian live in the Balkans. This number includes Serbs in the Republic of Serbia and Serbs in other countries of the former Yugoslavia. Outside the Balkans, Serbian is spoken by approximately 0.5-1.5 million native speakers. Serbia's population is highly multilingual. The official use of minority languages is regulated by the “Law on the Official Use of Language and the Alphabet”, which provides that laws and legal acts are issued in the languages of ethnic minorities.
Serbian belongs to the South Slavic branch of the Indo-European languages.

Features of Serbian:

  • Morphophonemic alternations in inflection and word formation can produce word forms that differ considerably from one another.
  • The spoken language has two different pronunciation systems, Ekavian and Ijekavian, which are reflected in written texts.
  • Nouns have a grammatical and a semantic gender.
  • The rich inflectional system comprises three types of inflection: declension, conjugation and comparison. Each type has its own paradigms and exceptions. In addition, Serbian has seven cases.
  • Serbian is an SVO language with a relatively free word order. The English sentence “Mary gave John an apple.” can be expressed in 24 different ways.
  • Serbian can be written in both the Cyrillic and the Latin alphabets. The Cyrillic alphabet is used for official communication, while both alphabets are in widespread use.
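
The figure of 24 given for the example sentence matches the number of ways four constituents can be arranged (4! = 24), assuming the sentence is split into “Mary”, “gave”, “John” and “an apple”. The following is a minimal Python sketch of that count under this assumption. The names CONSTITUENTS and orderings are illustrative only, and the English constituents stand in for their Serbian counterparts.

```python
from itertools import permutations

# Assumption: the sentence is treated as four movable constituents.
# The English words are placeholders for the Serbian equivalents.
CONSTITUENTS = ("Mary", "gave", "John", "an apple")


def orderings(constituents):
    """Return every linear arrangement of the given constituents."""
    return [" ".join(order) for order in permutations(constituents)]


if __name__ == "__main__":
    all_orders = orderings(CONSTITUENTS)
    # Four distinct constituents give 4! = 24 arrangements.
    print(len(all_orders))
    for sentence in all_orders:
        print(sentence)
```

The sketch only enumerates arrangements; it does not judge which of them sound natural or carry which emphasis in Serbian.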

Wikipedia contributors. (2020, April 20). Serbian language. In Wikipedia, The Free Encyclopedia. Retrieved 14:30, July 02, 2020, from https://en.wikipedia.org/wiki/Serbian_language.

NCC Lead Serbia

Prof. Cvetana Krstev teaches at the Library and Information Science Department of the Faculty of Philology, University of Belgrade, and has been a guest lecturer at several European universities within various programmes. She holds a Master’s degree in Mathematics and a PhD in Computer Science.
In her scientific work, she has taken part in many national and international projects, including META-NET, CESAR and several COST actions. She has developed the Serbian morphological e-dictionary and is one of the key contributors to the development of Serbian WordNet, the Corpus of Contemporary Serbian, the parallel Serbian/English corpus, the Named-Entity Recognition System for Serbian, and many other language resources and tools. Furthermore, she is the Editor-in-Chief of the journal for Digital Humanities “InfoTheca”, a member of the editorial boards of two journals, and a member of the organizing and programme committees of numerous international conferences. She has published more than 180 scientific articles in journals and conference proceedings, in areas ranging from library and information science to natural language processing. She is one of the co-founders of the Association for Language Resources and Tools (JeRTeh). She actively maintains cooperation with numerous research centers and groups from France, Poland, Bulgaria, Greece, and North Macedonia.

Current National Initiatives

  • There are no LT funding programmes. Recently, the government established a working group with the aim of formulating a strategy for the development of AI 2020-2025. It is not yet clear whether LT will have a specific place in it.
  • Also, an AI institute has been established in the Science Technology Park with the aim of connecting academia and industry.
  • In 2020, the thoroughly reorganised funding of national scientific programmes began with a first call for proposals in the field of AI, including LT. The results of this call were not positive for Language Technologies. However, the number of students at all levels of study who are interested in computational linguistics, NLP and Language Technologies is rising steadily.
  • Industry makes only modest use of existing LT and is even less involved in the development of LT for Serbian.

META-NET White Paper on Serbia

Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, and Mladen Stanojević. Српски језик у дигиталном добу – The Serbian Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.

Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper

Availability of Tools and Resources for Serbian (as of 2012)

The following table illustrates the support of the Serbian language through speech technologies, machine translation, text analytics and language resources.

Speech technologies: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Machine translation: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Text analytics: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Language resources: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support