Towards the Primary Platform for Language Technologies in Europe

National Competence Centre Latvia

The Languages of Latvia

Of the 1.75 million native speakers of Latvian, 1.3 million live in Latvia. The other 450,000 are spread around the world, with the largest communities in Lithuania, Estonia and Germany. Latvian is the only official language of the country. The largest minority speaks Russian, but the number of residents who do not speak Latvian has decreased over the three decades since Latvia's independence.
Latvian belongs to the East Baltic branch of the Indo-European languages. The only other surviving language of this branch is Lithuanian. Three main dialects are spoken in Latvia: the Central dialect, Tamian and the High Latvian dialect. In addition, there are approximately 500 vernaculars and sub-dialects. The standard literary language is based on the Central dialect.

Features of Latvian:

  • In Latvian, the tone and the length of a syllable differentiate the meaning of words. This feature, rare among European languages, is one of the challenges for language technologies.
  • The rich inflection system has many grammatical categories. Besides gender, number, person and tense, these include voice, degree of comparison, definiteness of the ending, mood and reflexivity. How a word inflects depends on its part of speech, which determines the set of features the speaker has to choose.
  • Complex punctuation rules are difficult for learners to understand. They combine grammatical and intonational principles, such as grammatical linking and the marking of pauses.

Wikipedia contributors. (2020, June 16). Latvian language. In Wikipedia, The Free Encyclopedia. Retrieved 15:00, June 22, 2020, from https://en.wikipedia.org/wiki/Latvian_language.

NCC Lead Latvia

Prof. Inguna Skadiņa is Senior Researcher at the Institute of Mathematics and Computer Science and Professor at the University of Latvia. Furthermore, she is the Chief Scientific Officer of Tilde.
She has been working for over 30 years in the field of natural language processing. Her research interests include language resources and tools, machine translation and human-computer interaction. She has led and participated in many national and international projects (FP5-FP7, ICT PSP, H2020 and CEF) related to language technology, including the scientific coordination of the FP7 project Accurat and the ICT PSP project META-NORD. At the moment, she is the principal investigator of the large-scale national project “Multilingual Artificial Intelligence Based Human Computer Interaction”.

Inguna Skadiņa is a national coordinator of CLARIN-LV, the CLARIN research infrastructure in Latvia. She has (co-)authored more than 70 research papers. She is an expert of the Latvian Council of Science and a member of several professional organizations and of committees of scientific events related to language resources and tools and natural language processing.

Current National Initiatives

  • There is no dedicated language technology (LT) funding programme, but some support exists through various national projects.
  • Two state research programmes, “Latvian Language” and “Digital resources for humanities: integration and development”, support the creation of language resources and tools.
  • Several projects of the Latvian Council of Science are running, including projects that support the creation of a Latvian language learner corpus and a Latvian WordNet. Research and development activities are also supported through European Structural Funds: three large projects are currently running, on human-computer interaction, domain-specific speech recognition and support for multilingual speech recognition. In addition, some funding comes from the project that supports research infrastructures in Latvia, including CLARIN-LV.

Events

2020
2nd Regional ELG Workshop: Baltic Countries
Regional workshop, Kaunas, Lithuania, September 21

META-NET White Paper on Latvian

Inguna Skadiņa, Andrejs Veisbergs, Andrejs Vasiļjevs, Tatjana Gornostaja, Iveta Keiša, and Alda Rudzīte. Latviešu valoda digitālajā laikmetā – The Latvian Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.

Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper

Availability of Tools and Resources for Latvian (as of 2012)

The following table illustrates the support of the Latvian language through speech technologies, machine translation, text analytics and language resources.

Rating scale (columns): Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Areas (rows): Speech technologies | Machine translation | Text analytics | Language resources