Towards the Primary Platform for Language Technologies in Europe

National Competence Centre Denmark

The Danish Language

Danish is the official language of Denmark and is spoken by around six million people. About 90% of these are ethnic Danes with Danish as their mother tongue. For the other 10%, only one minority language, German, is officially established. Apart from the Danish speakers who live in Denmark, Danish is also the native or cultural language of around 50,000 Germano-Danish citizens living in the south of Schleswig. In the Faroe Islands and Greenland, the law of autonomy guarantees Danish official equality with the Faroese and Greenlandic languages, and Danish is an obligatory subject in schools.
Danish derives from the East Norse dialect group. A more recent classification separates modern spoken Danish along with Norwegian and Swedish into the Mainland Scandinavian group.

Features of Danish:

  • The Danish vocabulary is highly flexible: new compounds can be formed dynamically.
  • At the syntactic level, Danish allows considerable movement of words. This flexibility poses challenges for natural language processing.
  • During the last 50 years, changes in the language have been dominated by a tendency towards less dialectal variation, a less distinct pronunciation of certain sounds in the spoken language, and some influence from English on both grammar and lexis.
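
As an illustration of why dynamically formed compounds matter for language processing, the following is a minimal sketch of dictionary-based compound splitting. The lexicon, the set of linking elements and the function name `split_compound` are assumptions made for this example only; they do not describe any tool or resource mentioned on this page.

```python
from typing import List, Optional, Set

# Toy lexicon, purely illustrative (assumption, not a real Danish resource).
LEXICON: Set[str] = {"sprog", "teknologi", "tale", "genkendelse", "land", "by"}

# Linking elements that may appear between compound parts (simplified assumption).
LINKERS = ("", "s", "e")


def split_compound(word: str, lexicon: Set[str] = LEXICON) -> Optional[List[str]]:
    """Return the lexicon parts of `word`, or None if no split is found."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try every prefix that is a known word, shortest first.
    for i in range(1, len(word)):
        head = word[:i]
        if head not in lexicon:
            continue
        rest = word[i:]
        for linker in LINKERS:
            if not rest.startswith(linker):
                continue
            tail = rest[len(linker):]
            if not tail:
                continue
            # Recursively split whatever follows the optional linking element.
            parts = split_compound(tail, lexicon)
            if parts is not None:
                return [head] + parts
    return None


print(split_compound("sprogteknologi"))  # language technology
print(split_compound("landsby"))         # village, with linking "s"
```

A real system would need a full lexicon and a way to rank competing splits; the sketch simply returns the first analysis it finds.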

Danish in Ethnologue
Eberhard, David M., Gary F. Simons, and Charles D. Fennig (eds.). 2020. Ethnologue: Languages of the World. Twenty-third edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com.

NCC Lead Denmark

Bolette Sandford Pedersen is a Professor and Vice Head at the Department of Nordic Studies and Linguistics at the University of Copenhagen. She leads the Centre for Language Technology, where her main research areas are computational lexical semantics, computational lexicography and linguistic ontologies.
She led the Danish Wordnet project, DanNet (The Danish Research Council “Yngre Forskere”, 2004-2008), as well as the work on semantic annotation and processing in SemDaX (The Danish Research Council for Culture and Communication, 2013-2018). She is currently working on lexicography used in language technology (EU H2020 project ELEXIS, 2018-2021). Bolette Sandford Pedersen was President of NEALT (Northern European Association of Language Technology) 2014-2016. She has served on the board of representatives of the Dansk Sprognævn and is currently a board member of the Global WordNet Association. She is a member of the LT advisory board of the Danish Agency for Digitalization and a member of Riksbankens Jubileumsfond’s Assessment Committee for Humanities and the Social Sciences (Sweden).

Current National Initiatives

  • Centre for Language Technology, UCPH, hosts the DK-CLARIN Platform, which is continuously updated with Danish corpora and LT resources.
  • In 2018, the Ministry of Culture set up an LT Committee led by the Danish Language Council, with the participation of researchers and stakeholders in Denmark, including the Centre for Language Technology, UCPH. The recommendations of the LT Committee were published in 2019. The plan was to develop a platform containing free Danish LRs and functionalities aimed at the NLP industry and to embark on new LR projects. First steps included the upgrade of existing Danish dictionaries and lexical resources as well as the development of a time-encoded Danish speech recognition corpus.
  • In March 2019, the Government presented The National Strategy for AI which includes an initiative of 4 million EUR for developing a Danish language resource to boost and scale up Danish language-centred AI.
  • In January 2020, a national NLP network was set up by the Alexandra Institute.
  • In June 2020, the first version of the Danish LT platform was published by the Danish Agency for Digitalization, containing a little under 100 resources and tools for Danish LT. A call for development of a speech recognition corpus is planned for January 2021.

META-NET White Paper on Danish

Bolette Sandford Pedersen, Jürgen Wedekind, Steen Bøhm-Andersen, Peter Juel Henrichsen, Sanne Hoffensetz-Andresen, Sabine Kirchmeier-Andersen, Jens Otto Kjærum, Louise Bie Larsen, Bente Maegaard, Sanni Nimb, Jens-Erik Rasmussen, Peter Revsbech, and Hanne Erdman Thomsen. Det danske sprog i den digitale tidsalder – The Danish Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.

Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper

Availability of Tools and Resources for Danish (as of 2012)

The following table illustrates the support of the Danish language through speech technologies, machine translation, text analytics and language resources.

Speech technologies: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Machine translation: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Text analytics: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support
Language resources: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support