Towards the Primary Platform for
Language Technologies in Europe

Become an active member of the ELG Community – 5 simple steps for your organisation to join the European Language Grid

If you are reading this tutorial, you have most likely received a link from us that leads to your organisation’s entry in the European Language Grid (ELG). There are many good reasons to have your company, research department or academic institution listed in ELG, the non-profit platform for Language Technologies in Europe. An overview of the ideas behind ELG and its many benefits can be found here. In this short five-step tutorial, we explain how you can take over (“claim”) your organisation’s page in ELG as your own. Brief step-by-step instructions can be found at the end of the tutorial.

To make things easier for you, we have taken the liberty of creating a default entry for your organisation. This means that you do not have to set up a new page but can simply claim your organisation’s existing page and then modify it. We used public information to set up the page and invite you to make it complete, individual and representative by adding your logo, keywords and contact details. An edited page could look like this:

Screenshot of the ILSP organisation

The first step is to be logged into your ELG account. If you have not registered an ELG account yet, here is a guide on how to create it. An important note: An organisation’s ELG page can only be claimed by one user, so it would be ideal if you used your professional email address and ensured that you are the right person to do this for your organisation.

Once you are logged in and have your organisation’s ELG page open, click the “Claim” button in the top right corner. This sends an automatic message to us and we will validate your request. Afterwards, you will receive an email from us confirming your request. At the same time, the organisation entry is unpublished from the ELG, which means that you can now edit it.

Screenshot of an organisation page

When you enter the “My Grid” section, which is found in the top right corner next to your name, you will find your organisation’s entry under “My items”. Here, you can edit the page, add further information and list contact details. Once you are finished, submit the entry for publication. Our ELG team will do a quick technical check and re-publish your organisation’s page to the European Language Grid.

Screenshot of the My Grid section with the claimed organisation

Newly claimed organisations are frequently featured in a short profile in our ELT Newsletter, which has more than 4,000 readers, and will soon also be highlighted on the frontpage of the European Language Grid. So don’t wait and join the European Language Grid with your organisation and become part of the European non-profit network for Language Technology services, resources, companies, research organisations and users.

How to take over your organisation’s page in short:

1. Log in or register to ELG (please use your professional email address)

2. Open your organisation’s page and click the ‘Claim’ button (top right) – the ELG team validates your claim and informs you via email

3. Open your organisation’s page under ‘My items’ in the ‘My Grid’ section

4. Edit your organisation’s page

5. Click ‘Submit for publication’ – the ELG team will then publish your page


Lowering Language Barriers – How Coreon uses the ELG to provide access to multilingual resources

One of the main goals of the European Language Grid is to combat the fragmentation of the European Language Technology community. But how exactly can the ELG be used to aid communication across languages? A use case can be found in Coreon’s pilot project “Multilingual Knowledge Systems as Linguistic Linked Open Data”. Michael Wetzel, Managing Director of the Berlin-based company, explains the project, the collaboration with the ELG and how it provides access to multilingual resources. Find out how Coreon and the ELG together help bridge the gap between languages and lower barriers of communication.

Logo of the company Coreon

Early on in its runtime, the ELG project put out a first open call for pilot projects. The idea was to help fund innovative language technology that would incorporate the ELG platform and be accessible through it. The call was opened with several intentions: to allow the ELG to grow and communicate with its user base and to let applicants have the opportunity to realize creative visions for LT through collaboration with the platform and the project. One of the projects that received funding was initiated by a German company called Coreon.

Who is Coreon?

Coreon is a Berlin-based company that produces software of the same name, a Multilingual Knowledge System. This system allows users to manage, visualise and model data in so-called concept maps that can be arranged and explored in forms such as tree diagrams. Coreon combines this system with terminology management: a browsable example on their website shows the Coreon system being used to model Eurovoc, a multilingual thesaurus run by the European Union. Here, the relations between words and their respective translations into 22 languages are visualised in a tree diagram. Mapping out knowledge and language can also be useful to software developers looking to create translation tools, chatbots or other software.

What is the Pilot Project?

The example mentioned above is accessible directly from a browser. While this works well for exploring a thesaurus, things become more complicated when trying to integrate Coreon’s technology into other developers’ projects. Until now, this had only been possible by exporting the data or via a rich yet proprietary API. The ELG’s open call for pilot projects gave Coreon the opportunity to create a simpler and faster way to access their repositories, with the ELG providing additional funding and support and thus lessening the risk.

According to Michael Wetzel, Managing Director of Coreon, the goal of the pilot project titled Multilingual Knowledge Systems (MKS) as Linguistic Linked Open Data was to create “a way easier and more straight-forward way to put our resources in other software applications”. The main development was the creation of a SPARQL endpoint, which allows for real-time and direct querying of repositories. This SPARQL endpoint can be found on the ELG, enabling users to access the Coreon repositories straight from the ELG. This service was technically not yet foreseen by the ELG, Wetzel explains: “We helped the technical folks in the grid to enhance their technical support for the kind of service which we were developing”. This led to a close cooperation between Coreon and the ELG.
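To make the idea of querying a repository directly more concrete, here is a minimal sketch of a SPARQL Protocol request built with the Python standard library. The endpoint URL is a placeholder, and the use of SKOS labels in the query is an assumption about how the data is modelled; the real address and vocabulary should be taken from the service's page on the ELG. The sketch only builds the request and parses a result document, so it makes no network call by itself.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL; the real address is listed on the service's ELG page.
ENDPOINT = "https://example.org/sparql"

# Assumes the repository exposes SKOS labels; the actual vocabulary may differ.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept skos:prefLabel "fish"@en ;
           skos:prefLabel ?label .
}
LIMIT 50
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def labels_by_language(results: dict) -> dict:
    """Map language tag -> label from a SPARQL JSON results document."""
    labels = {}
    for binding in results["results"]["bindings"]:
        label = binding["label"]
        labels[label.get("xml:lang", "")] = label["value"]
    return labels


# To run against a live endpoint:
#     with urllib.request.urlopen(build_request(ENDPOINT, QUERY)) as resp:
#         print(labels_by_language(json.load(resp)))
```

The two helper functions are illustrative names, not part of Coreon's or the ELG's tooling.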

How does Coreon work?

The ELG itself aims to bridge the gap between languages and make Language Technology more accessible. Coreon’s pilot project was therefore of particular interest to the ELG, as it combines Knowledge Graphs, used to model data, with terminology management, which can incorporate many languages. But what do these Knowledge Graphs look like?

Knowledge Graphs, or in this case multilingual concept maps, are systems that link concepts with their subcategories in tree diagrams. For example, typing the word fish into the search bar of their Eurovoc visualisation results in this graph.

With one look at the resulting tree diagram, it is clear that the concept fish is a subcategory of a variety of other concepts and is itself divided into sea fish and freshwater fish. The sidebar notably lists a number of translations for fish and some of its related concepts. Linking knowledge and language in this way can be useful to software developers, e.g. for training a chatbot or in other NLP applications. With the successful implementation of the collaborative pilot project between Coreon and the ELG, this is now possible in a far more convenient way.

Summary

Because the pilot project was completed only a few months ago, its impact is still hard to predict, but Michael Wetzel is optimistic: “We are seeing that other software companies do start using these endpoints that we’ve developed. If you ask me in a year or two from now, I think we will see quite some integrations based on that technology.” His hope is that the pilot project will have a more unifying effect on the European LT community because multilingual knowledge systems are now more easily available to software developers for the creation of translation tools, chatbots and other software.

In a more general sense, this is quite similar to the goals of the ELG itself. Europe is wonderfully diverse, but the diversity also causes fragmentation, particularly in the LT community. To overcome this challenge, the ELG aims to become the central hub for European Language Technology. This would help bridge the gap between LT developers of different languages, as aspects like communication, collaboration and the availability of multilingual products like Coreon’s would be strengthened.

Wetzel is hopeful in this respect. “The technological fragmentation, we can overcome. In Europe, we will continue to have hundreds of language or software companies, focusing on language technologies, but let’s help them so that they can more easily connect with and complement each other’s services. I think this is what the ELG really is good for.”


How to use the ELG: Video tutorial for the European Language Grid

The ELG Video Tutorial walks you through the basics of the European Language Grid: how to browse it, register, become a provider and upload, store and share resources like Language Technology tools and corpora. It also touches upon the ELG Python SDK, which is explained in detail in the ELG Documentation. For a short introduction to the functionalities and many advantages of the European Language Grid, have a look!


Choose the right tool to create your ELG service in Python

With the Python SDK, the European Language Grid provides tools to facilitate the creation of an ELG service from your Language Technology (LT) tool in Python. It is very easy to convert your LT tool running on your computer to an ELG service accessible and usable by everyone. In this blog post, we will show you the different options that the Python SDK offers for building your ELG service, and when to use them.

The ELG Python SDK

Before we start, a little reminder of what the ELG Python SDK is and what it does.

The Python SDK is a pip package and can be installed easily via pip/PyPI:

pip install elg

It provides access to most of the ELG functionalities in Python, including the catalogue of resources: its methods allow you to search the catalogue for corpora, services, and information on organisations. The Python SDK also allows you to call the services available in ELG and even combine them with a pipeline mechanism. In addition, and this is what interests us in this blog post, the Python SDK contains utilities to build ELG compatible services.

If you want to learn more about the Python SDK, you can have a look at the documentation.

ELG compatible services

To create an ELG service, it is not necessary to understand what a service in the ELG is, but it does help, so we will explain it briefly here. The ELG architecture is described in more detail in several publications, as well as in our documentation.

ELG services are Docker containers that run in ELG’s Kubernetes cluster. Not all services are running all the time, i.e. not all Docker images of services are launched; in fact, we use knative to run them on demand, only when users want to use them. These Docker containers (which correspond to the ELG services) are not accessible from outside the cluster and communicate only with the REST server (LT Service Execution Orchestrator). The communication between the ELG services and the REST server is done according to the specification of the ELG Internal LT Service API – we will come back to this point a little later in this post. When we use an ELG service, the request is therefore made to the REST server which then forwards the request to the service, and returns the response obtained; only the REST server is accessible from outside the cluster.
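As a rough illustration of what travels between the REST server and a service, the sketch below builds a text request and the two kinds of reply (success and failure) as plain JSON-compatible dictionaries. The field names follow our reading of the ELG Internal LT Service API and should be checked against the specification; the helper functions `success` and `failure` are hypothetical and not part of the SDK.

```python
import json

# Request that the REST server forwards to a text-processing service.
request_message = {
    "type": "text",
    "content": "This is a sentence in English.",
}


def success(features: dict) -> dict:
    """Wrap document-level features in a successful 'annotations' response."""
    return {"response": {"type": "annotations", "features": features}}


def failure(code: str, text: str) -> dict:
    """Wrap a single error in a failure message."""
    return {"failure": {"errors": [{"code": code, "text": text, "params": []}]}}


print(json.dumps(request_message, indent=2))
print(json.dumps(success({"en": 0.99}), indent=2))
print(json.dumps(failure("elg.service.internalError", "Something went wrong"), indent=2))
```

In practice you rarely build these messages by hand, since the SDK's request and response classes take care of the serialisation.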

Technical architecture of the ELG


To recap: an ELG service is a Docker container which communicates with the REST server via the ELG Internal LT Service API. Starting from this, there are three options for integrating your LT tool:

Option 1: Your LT tool is in the Docker image

This is the most common way of creating an ELG service based on your LT tool: the Docker image that exposes an ELG compatible endpoint contains your LT tool and runs it.

Option 2: Your LT tool is already a Docker image

If your LT tool is already packed in a Docker image that exposes an HTTP endpoint but is not compatible with ELG, the ELG service can simply be an adapter that will be compatible with the ELG Internal LT Service API, and that will forward the requests to your LT tool Docker container also running inside the ELG cluster (technically, the LT tool and the ELG service Docker containers run in the same Kubernetes pod).

Option 3: Your LT tool is running outside of ELG

Similar to the second option, if your LT tool is already exposing an HTTP endpoint outside of ELG, the ELG service deployed into the ELG cluster can simply be a proxy that will be compatible with the ELG Internal LT Service API, and that will forward the requests to your LT tool running outside of ELG. This option is used when you don't want your LT tool to be deployed into ELG.

The three different options for integrating a LT tool into ELG
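For options 2 and 3, the ELG service mostly translates between two message formats. The following sketch shows that translation for a purely hypothetical LT tool that accepts `{"text": ...}` and returns `{"language": ..., "confidence": ...}`; both the tool format and the function names are assumptions made for illustration, and the ELG message layout should be checked against the ELG Internal LT Service API specification.

```python
def to_tool_payload(elg_request: dict) -> dict:
    """Translate an ELG text request into the (hypothetical) tool's input format."""
    if elg_request.get("type") != "text":
        raise ValueError("this adapter only handles text requests")
    return {"text": elg_request["content"]}


def to_elg_response(tool_output: dict) -> dict:
    """Translate the (hypothetical) tool's output into an ELG annotations response."""
    features = {tool_output["language"]: tool_output["confidence"]}
    return {"response": {"type": "annotations", "features": features}}


# Round trip with a faked tool answer instead of a real HTTP call.
payload = to_tool_payload({"type": "text", "content": "Bonjour tout le monde"})
reply = to_elg_response({"language": "fr", "confidence": 0.98})
print(payload)
print(reply)
```

Whether the tool runs in the same pod (option 2) or outside of ELG (option 3) only changes where the HTTP call in the middle is sent, not this translation step.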


Creation of your ELG service using the Python SDK

To create an ELG service, you must create a Docker image that exposes an HTTP endpoint compatible with the ELG Internal LT Service API (this Docker image may or may not contain your LT tool depending on the integration option you have chosen). This can be done relatively easily for most programming languages, using Flask or FastAPI for Python, or Spring for Java for example. However, this always takes a little time and can be complex to optimize depending on your LT tool and the integration option chosen. Fortunately, the ELG Python SDK simplifies this step of creating the Docker image compatible with ELG.

The Python SDK provides two classes for this, FlaskService and QuartService, based on Flask and Quart respectively, that allow you to easily create an ELG service. To understand how they work, let's look at an example.

These classes are not installed by default; you need to install the ELG pip package with extra dependencies to use them:

pip install elg[flask] # to use the FlaskService class
pip install elg[quart] # to use the QuartService class

Create an ELG service using the FlaskService class

Imagine your LT tool is a language detection tool that runs as follows:

import langdetect

results = langdetect.detect_langs("This is a sentence in English.")
print(results)
# Example output: [en:0.999998356414909]

Converting your tool into an ELG service using the FlaskService class is as easy as creating this Python script:

from elg import FlaskService
from elg.model import TextRequest, AnnotationsResponse
import langdetect

class ELGService(FlaskService):
    def process_text(self, content: TextRequest):
        langs = langdetect.detect_langs(content.content)
        ld = {}
        for l in langs:
            ld[l.lang] = l.prob
        return AnnotationsResponse(features=ld)

service = ELGService("LangDetection")
app = service.app

Let us go through each part of this file to see how it works:

from elg import FlaskService

We import the FlaskService class from the ELG Python SDK. This class implements a Flask web server so you only have to redefine the process_text, process_audio, or process_structured_text method depending on the input type of your service.

from elg.model import TextRequest, AnnotationsResponse

Here, we import two classes that represent the request and the response messages of our ELG service. These classes correspond to the specification of the ELG Internal LT Service API, so by using them you ensure that your ELG service will be ELG compatible. Here, the LT tool takes text as input and returns an annotation response, but all the ELG message types are available.

import langdetect

This import is needed to run the LT tool. It is used to perform the language identification.

class ELGService(FlaskService):

We create our ELG service class that inherits from the FlaskService class to take advantage of what is already implemented in the FlaskService class.

    def process_text(self, content: TextRequest):

We redefine the process_text method because the ELG service takes text as input. The content parameter contains all the data of the input request.

        langs = langdetect.detect_langs(content.content)
        ld = {}
        for l in langs:
            ld[l.lang] = l.prob

This part corresponds to the LT tool. We are performing the language identification of the input content. We are changing the form of the output to facilitate the creation of the output message.
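The reshaping step can be tried in isolation. In this small sketch, a namedtuple stands in for the objects returned by langdetect (which expose `lang` and `prob` attributes, as used above), and the values are made up; the loop is written as an equivalent dict comprehension.

```python
from collections import namedtuple

# Stand-in for the objects returned by langdetect.detect_langs;
# the probabilities below are invented for illustration.
Language = namedtuple("Language", ["lang", "prob"])

langs = [Language("en", 0.86), Language("nl", 0.14)]

# Same reshaping as the loop in process_text, written as a dict comprehension.
ld = {l.lang: l.prob for l in langs}
print(ld)  # {'en': 0.86, 'nl': 0.14}
```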

        return AnnotationsResponse(features=ld)

Now we return the output of the LT tool as an AnnotationsResponse message to make sure that the ELG service is ELG compatible.

service = ELGService("LangDetection")
app = service.app

Finally, we instantiate the class and define the app variable that will be the endpoint of the Docker container.


After having created the elg_service.py file, you can run the following command:

elg docker create -n ELGService -p elg_service.py -r langdetect

If you are using this command with the QuartService class, you need to add: --service_type quart

This will create all the files needed to build the Docker image of your ELG service. Among those files is the Dockerfile, from which you can build the image by running the usual docker build …

A more precise example of how to use the FlaskService class is available in our documentation.

The image that we just built is an ELG service that can be deployed in the ELG cluster. Easy, right? We hope so, and if not, please feel free to send us your feedback. We are happy to hear your thoughts.

Choose the right tool depending on your use case

Depending on the LT tool you want to integrate into ELG and the integration option you choose, the requirements for the Docker image of the ELG service might differ. The FlaskService class we just presented is suitable when the LT tool is inside the Docker image (integration option 1), but not when the LT tool is running outside of the Docker image (options 2 & 3), because the Flask server is synchronous, which means it has to wait for the response from the LT tool before it can handle another request. To solve this issue, we created the QuartService class, which works very similarly to the FlaskService class but asynchronously. Therefore, the QuartService class is recommended when the ELG service is only an adapter or a proxy (integration options 2 & 3). In addition, the QuartService class can stream the request without storing it locally, which can be very useful when dealing with large input data such as audio*.
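The benefit of asynchronous handling for a proxy can be seen in a small standard-library experiment. This sketch does not use Quart or the SDK: `asyncio.sleep` simulates the time spent waiting for a remote LT tool, and the function names are illustrative. Three simulated requests that each wait 0.2 seconds finish in roughly 0.2 seconds overall, whereas handling them one after another would take about 0.6 seconds.

```python
import asyncio
import time


async def call_backend(delay: float) -> str:
    """Stand-in for an HTTP call to an LT tool; asyncio.sleep simulates the wait."""
    await asyncio.sleep(delay)
    return "done"


async def handle_concurrently(n: int, delay: float) -> float:
    """Handle n requests at once and return the total wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(call_backend(delay) for _ in range(n)))
    return time.perf_counter() - start


elapsed = asyncio.run(handle_concurrently(n=3, delay=0.2))
print(f"3 requests handled in {elapsed:.2f}s")
```

While one request is waiting for its backend, the event loop is free to start or resume the others, which is the situation an adapter or proxy service spends most of its time in.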

The strength of the FlaskService and the QuartService classes is that they work almost the same way, so you can optimize your ELG service very easily without much configuration. For example, here is the Python script for an ELG Speech-to-Text service that proxies the request to the LT tool running outside of ELG without caching the audio file locally:

import traceback
import aiohttp

from elg import QuartService
from elg.model import TextsResponse
from elg.quart_service import ProcessingError

class Proxy(QuartService):

    consume_generator = False

    async def setup(self):
        self.session = aiohttp.ClientSession()

    async def shutdown(self):
        if self.session is not None:
            await self.session.close()

    async def process_audio(self, content):
        try:
            # Stream the incoming audio straight to the remote LT tool
            async with self.session.post("https://example.com/endpoint", data=content.generator) as client_response:
                status_code = client_response.status
                result = await client_response.json()
        except Exception:
            # Log the stack trace and report a generic failure to the caller
            traceback.print_exc()
            raise ProcessingError.InternalError('Error during the call')

        if status_code >= 400:
            raise ProcessingError.InternalError('Error during the call')

        # Wrap the transcription returned by the LT tool in an ELG response
        return TextsResponse(texts=[{"content": result["text"]}])

service = Proxy("Proxy")
app = service.app


We try to make the creation of an ELG service as simple as possible with the ELG Python SDK. You can convert your Python LT tool into a Docker image that exposes an ELG compatible endpoint quickly and easily by following the steps described above.

We hope you enjoyed this blog post – if you have questions or feedback on the ELG Python SDK, please feel free to contact us.


* The streaming of the request is currently only supported for audio requests.