One of the main goals of the European Language Grid (ELG) is to combat the fragmentation of the European Language Technology community. But how exactly can the ELG be used to aid communication across languages? One use case is Coreon’s pilot project “Multilingual Knowledge Systems as Linguistic Linked Open Data”. Michael Wetzel, Managing Director of the Berlin-based company, explains how the project was carried out in collaboration with the ELG and how it provides access to multilingual resources. Find out how Coreon and the ELG together help bridge the gap between languages and lower barriers to communication.
Early in its runtime, the ELG project issued a first open call for pilot projects. The idea was to help fund innovative language technology (LT) that would incorporate the ELG platform and be accessible through it. The call had two aims: to allow the ELG to grow and communicate with its user base, and to give applicants the opportunity to realize creative visions for LT through collaboration with the platform and the project. One of the projects that received funding was initiated by a German company called Coreon.
Who is Coreon?
Coreon is a Berlin-based company that produces software of the same name, a Multilingual Knowledge System. This system allows users to manage, visualize and model data in so-called concept maps, which can be arranged and explored in forms such as tree diagrams. Coreon combines this system with terminology management: a browsable example on the company’s website shows the Coreon system being used to model Eurovoc, a multilingual thesaurus run by the European Union. Here, the relations between words and their respective translations into 22 languages are visualised in a tree diagram. Mapping out knowledge and language in this way can also be useful to software developers looking to create translation tools, chatbots or other software.
What is the Pilot Project?
The example mentioned above is accessible directly from a browser. While this works well for exploring a thesaurus, things become more complicated when trying to integrate Coreon’s technology into other developers’ projects. Until now, this was only possible by exporting the data or via a rich yet proprietary API. The ELG’s open call for pilot projects gave Coreon the opportunity to create a simpler and faster way to access its repositories, with the ELG providing additional funding and support and thus lessening the risk.
According to Michael Wetzel, Managing Director of Coreon, the goal of the pilot project, titled “Multilingual Knowledge Systems (MKS) as Linguistic Linked Open Data”, was to create “a way easier and more straight-forward way to put our resources in other software applications”. The main development was the creation of a SPARQL endpoint, which allows repositories to be queried directly and in real time. The endpoint is available through the ELG, so users can access the Coreon repositories straight from the platform. The ELG had not originally foreseen this type of service, Wetzel explains: “We helped the technical folks in the grid to enhance their technical support for the kind of service which we were developing”. This led to a close cooperation between Coreon and the ELG.
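To give a rough idea of what querying a SPARQL endpoint involves, here is a minimal Python sketch. It is not Coreon’s or the ELG’s documented interface: the endpoint address is a placeholder, the helper names `build_label_query` and `build_request` are made up for illustration, and the query assumes the repository exposes its concepts in the SKOS vocabulary, which the article does not confirm. The sketch only prepares a request following the standard SPARQL 1.1 Protocol and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address (assumption): the real endpoint is listed on the ELG.
ENDPOINT_URL = "https://example.org/coreon/sparql"


def build_label_query(term: str, lang: str = "en", limit: int = 10) -> str:
    """Build a SPARQL query for concepts whose preferred label equals `term`.

    Assumes (not confirmed by the article) that concepts are modelled
    with SKOS, i.e. skos:Concept and skos:prefLabel.
    """
    if not lang.replace("-", "").isalnum():
        raise ValueError(f"unexpected language tag: {lang!r}")
    # Escape backslashes and double quotes so the term is a valid literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        "  ?concept a skos:Concept ;\n"
        "           skos:prefLabel ?label .\n"
        f'  FILTER (STR(?label) = "{escaped}" && LANG(?label) = "{lang}")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query: str, endpoint: str = ENDPOINT_URL) -> Request:
    """Prepare, but do not send, a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialisation of SPARQL results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request(build_label_query("fish"))
    print(request.full_url)
```

Against a real endpoint, the prepared request could be sent with `urllib.request.urlopen(request)`; in the standard SPARQL JSON results format, the matching rows would then appear under `results` and `bindings` in the decoded response.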
How does Coreon work?
As the ELG itself aims to bridge the gap between languages and make Language Technology more accessible, Coreon’s pilot project was of particular interest to the ELG because it combines Knowledge Graphs, as a way to model data, with terminology management, which can incorporate many languages. But what do these Knowledge Graphs look like?
Knowledge Graphs, or in this case multilingual concept maps, are systems that link concepts with their subcategories in tree diagrams. For example, typing the word fish into the search bar of Coreon’s Eurovoc visualisation produces a tree diagram centred on that concept.
With one look at the resulting tree diagram, it is clear that the concept fish is a subcategory of a variety of other concepts and is itself divided into sea fish and freshwater fish. The sidebar notably lists a number of translations for fish and some of its related concepts. Linking knowledge and language in this way can be useful to software developers, e.g. for training a chatbot or in other NLP applications. With the successful implementation of the collaborative pilot project between Coreon and the ELG, this is now possible in a far more convenient way.
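The structure described above can be sketched as a small data model. The following Python example is purely illustrative and says nothing about how Coreon stores its data: the `Concept` class and the `build_fish_example` helper are hypothetical, the German and French terms are ordinary dictionary translations rather than entries copied from Eurovoc, and modelling only narrower concepts is a simplification, since fish itself sits under several broader concepts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One node of a multilingual concept map (hypothetical model)."""

    labels: Dict[str, str]  # language code -> term in that language
    narrower: List["Concept"] = field(default_factory=list)

    def add(self, child: "Concept") -> "Concept":
        """Attach a narrower concept and return it."""
        self.narrower.append(child)
        return child

    def find(self, term: str, lang: str = "en") -> Optional["Concept"]:
        """Depth-first search for a concept by its term in one language."""
        if self.labels.get(lang) == term:
            return self
        for child in self.narrower:
            hit = child.find(term, lang)
            if hit is not None:
                return hit
        return None


def build_fish_example() -> Concept:
    """Build the 'fish' subtree described in the text (labels illustrative)."""
    fish = Concept({"en": "fish", "de": "Fisch", "fr": "poisson"})
    fish.add(Concept({"en": "sea fish", "de": "Seefisch", "fr": "poisson de mer"}))
    fish.add(Concept({"en": "freshwater fish", "de": "Süßwasserfisch",
                      "fr": "poisson d'eau douce"}))
    return fish


if __name__ == "__main__":
    fish = build_fish_example()
    # Look a concept up by its German term, then list its English label.
    print(fish.find("Seefisch", lang="de").labels["en"])
    print([child.labels["en"] for child in fish.narrower])
```

Because every concept carries its labels in several languages, a lookup in one language leads directly to the terms in all the others, which is the property that makes such a map useful for translation tools or chatbots.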
Because the pilot project was completed only a few months ago, its impact is still hard to predict, but Michael Wetzel is optimistic: “We are seeing that other software companies do start using these endpoints that we’ve developed. If you ask me in a year or two from now, I think we will see quite some integrations based on that technology.” His hope is that the pilot project will have a more unifying effect on the European LT community because multilingual knowledge systems are now more easily available to software developers for the creation of translation tools, chatbots and other software.
In a more general sense, this is quite similar to the goals of the ELG itself. Europe is wonderfully diverse, but this diversity also causes fragmentation, particularly in the LT community. To overcome this challenge, the ELG aims to become the central hub for European Language Technology. This would help bridge the gap between LT developers working on different languages by strengthening communication, collaboration and the availability of multilingual products like Coreon’s.
Wetzel is hopeful in this respect. “The technological fragmentation, we can overcome. In Europe, we will continue to have hundreds of language or software companies, focusing on language technologies, but let’s help them so that they can more easily connect with and complement each other’s services. I think this is what the ELG really is good for.”