Towards the Primary Platform for
Language Technologies in Europe

National Competence Centre Netherlands

The Languages of the Netherlands

Dutch is spoken by approximately 24 million people as a first language and by five million as a second language. In addition to the Netherlands, it is an official language in the Flemish part of Belgium, Suriname, Aruba, Curaçao and Sint Maarten. Dutch emigrants spread the language around the world, and it is still spoken in small communities in France, Germany, Brazil, South Africa, Indonesia, Canada and the United States. In the Netherlands, Frisian is an official minority language in the province of Friesland.
Dutch has many dialects, which differ in syntactic constructions and lexical meanings. The most significant differences exist between the dialects spoken in the Netherlands and those spoken in Flanders. Dutch is an Indo-European language of the West Germanic family and belongs to the Low Franconian branch.

Features of Dutch:

  • Word order is relatively free. Subjects, objects and adverbials can all commonly occupy the first position of the sentence.
  • New words are formed by compounding (composition), which is a highly productive process of word formation.
  • There are so-called “R-pronouns”, which tend to occur at a distance from the prepositions they belong to. In addition, these pronouns sometimes have more than one function or belong to more than one preposition. For Natural Language Processing, it is difficult to assign these pronouns to their phrases.
  • Dutch, like German, has verbs with separable prefixes, which can occur in different positions in the sentence.
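
To illustrate why productive compounding matters for language technology, the following minimal sketch splits a compound into known parts by dictionary lookup. The tiny lexicon and the function name `split_compound` are hypothetical and chosen only for illustration. The sketch ignores linking elements such as -s- and -en-, which a real splitter for Dutch would have to handle.

```python
from typing import List, Optional, Set

# Hypothetical toy lexicon (assumption); a real system would use a full Dutch word list.
LEXICON: Set[str] = {"taal", "technologie", "spraak", "herkenning"}


def split_compound(word: str, lexicon: Set[str] = LEXICON) -> Optional[List[str]]:
    """Return the lexicon entries that make up `word`, or None if no full split exists."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try the longest known prefix first, then recurse on the remainder.
    for end in range(len(word) - 1, 0, -1):
        head = word[:end]
        if head in lexicon:
            tail = split_compound(word[end:], lexicon)
            if tail is not None:
                return [head] + tail
    return None


print(split_compound("taaltechnologie"))   # ['taal', 'technologie']
print(split_compound("spraakherkenning"))  # ['spraak', 'herkenning']
print(split_compound("fiets"))             # None (not covered by the toy lexicon)
```

Because new compounds can be coined freely, a fixed word list will never contain all of them, so a decomposition step of roughly this kind is one way to relate unseen words to known ones.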

Wikipedia contributors. (2020, June 28). Dutch language. In Wikipedia, The Free Encyclopedia. Retrieved 15:00, June 29, 2020, from https://en.wikipedia.org/wiki/Dutch_language.

NCC Lead Netherlands

Dr. Vincent Vandeghinste is a senior researcher at the Instituut voor de Nederlandse Taal (INT, Dutch Language Institute), where he coordinates the tasks on Contemporary Dutch and works on topics such as CLARIN and other linguistic infrastructure, treebanking, machine translation, and language technology for inclusion. He has a PhD in Linguistics and a Master's degree in (Experimental) Psychology. He is also affiliated with the Centre for Computational Linguistics and Leuven.AI at the University of Leuven, where he is involved in courses on Machine Translation, Computational Linguistics, Computational Lexicography and Language Engineering Applications.

Current National Initiatives

  • There is no dedicated programme for LT development, though several projects are ongoing.
  • Some LT development takes place in the context of CLARIAH, especially on speech recognition, event extraction and POS tagging. Thanks to the STEVIN programme, there is no immediate danger of digital extinction for the Dutch language. The META-NET White Papers increased the awareness of the Interparliamentary Committee for the Dutch Language Union of the importance of LT.
  • In 2015, without committing any funding, the committee invited the Dutch LT community to submit a proposal for a new LT programme. However, such a proposal has never been defined.

Events

2021
December 03: 12th National ELG Workshop: Netherlands (national workshop, Netherlands)

META-NET White Paper on Dutch

Jan Odijk. Het Nederlands in het Digitale Tijdperk – The Dutch Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.

Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper

Availability of Tools and Resources for Dutch (as of 2012)

Support for the Dutch language is rated in four areas, each on the same five-level scale.

Areas rated:

  • Speech technologies
  • Machine translation
  • Text analytics
  • Language resources

Rating scale (from highest to lowest):

  • Excellent support
  • Good support
  • Moderate support
  • Fragmentary support
  • Weak/no support