With the Python SDK, the European Language Grid provides tools to facilitate the creation of an ELG service from your Language Technology (LT) tool in Python. It is very easy to convert your LT tool running on your computer to an ELG service accessible and usable by everyone. In this blog post, we will show you the different options that the Python SDK offers for building your ELG service, and when to use them.
The ELG Python SDK
Before we start, a little reminder of what the ELG Python SDK is and what it does.
The Python SDK is a pip package and can be installed easily via pip/PyPI:
pip install elg
It provides access to most of the ELG functionalities in Python. This includes the catalogue of resources, with methods that allow you to search the catalogue and look for corpora, services, and information on organisations. The Python SDK also allows you to call the services available in ELG and even combine them with a pipeline mechanism. In addition, and this is what interests us in this blog post, the Python SDK contains utilities to build ELG compatible services.
If you want to learn more about the Python SDK, you can have a look at the documentation.
ELG compatible services
To create an ELG service, it is not necessary to understand what a service in the ELG is, but it does help, so we will explain it briefly here. Note that if you want more details on the ELG architecture, it is detailed in several publications, as well as in our documentation.
ELG services are Docker containers that run in ELG's Kubernetes cluster. Not all services are running all the time, i.e. not all Docker images of services are launched; in fact, we use Knative to run them on demand, only when users want to use them. These Docker containers (which correspond to the ELG services) are not accessible from outside the cluster and communicate only with the REST server (LT Service Execution Orchestrator). The communication between the ELG services and the REST server follows the specification of the ELG Internal LT Service API – we will come back to this point a little later in this post. When we use an ELG service, the request is therefore made to the REST server, which forwards the request to the service and returns the response obtained; only the REST server is accessible from outside the cluster.
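To make this request flow more concrete, here is a small sketch in plain Python with no ELG dependency. The message shapes (a text request with type and content fields, a response wrapped in a response object, and a failure object with a list of errors) reflect our reading of the ELG Internal LT Service API and should be checked against the specification. The functions toy_service and orchestrator are hypothetical stand-ins, not part of the SDK.

```python
import json


def make_text_request(text: str) -> dict:
    """Build a text request (assumed shape of the ELG Internal LT Service API)."""
    return {"type": "text", "content": text}


def toy_service(request: dict) -> dict:
    """Hypothetical stand-in for an ELG service container: counts tokens."""
    if request.get("type") != "text":
        # Assumed failure shape; the error code is given for illustration only.
        return {"failure": {"errors": [{
            "code": "elg.request.type.unsupported",
            "text": "Request type {0} not supported by this service",
            "params": [str(request.get("type"))],
        }]}}
    token_count = len(request["content"].split())
    return {"response": {"type": "annotations",
                         "features": {"token_count": token_count}}}


def orchestrator(raw_body: str) -> str:
    """Hypothetical stand-in for the REST server: forwards and relays."""
    request = json.loads(raw_body)      # request coming from outside the cluster
    reply = toy_service(request)        # forwarded to the service container
    return json.dumps(reply)            # response relayed back to the caller


if __name__ == "__main__":
    body = json.dumps(make_text_request("hello ELG world"))
    print(orchestrator(body))
```

The caller only ever talks to the orchestrator function, which mirrors the fact that only the REST server is reachable from outside the cluster.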
To recap: an ELG service is a Docker container which communicates with the REST server via the ELG Internal LT Service API. Starting from this, there are three options for integrating your LT tool:
Option 1: Your LT tool is in the Docker image
This is the most common way of creating an ELG service based on your LT tool: the docker image that exposes an ELG compatible endpoint contains your LT tool and runs it.
Option 2: Your LT tool is already a Docker image
If your LT tool is already packed in a Docker image that exposes an HTTP endpoint but is not compatible with ELG, the ELG service can simply be an adapter that will be compatible with the ELG Internal LT Service API, and that will forward the requests to your LT tool Docker container also running inside the ELG cluster (technically, the LT tool and the ELG service Docker containers run in the same Kubernetes pod).
Option 3: Your LT tool is running outside of ELG
Similar to the second option, if your LT tool is already exposing an HTTP endpoint outside of ELG, the ELG service deployed into the ELG cluster can simply be a proxy that will be compatible with the ELG Internal LT Service API, and that will forward the requests to your LT tool running outside of ELG. This option is used when you don't want your LT tool to be deployed into ELG.
Creation of your ELG service using the Python SDK
To create an ELG service, you must create a Docker image that exposes an HTTP endpoint compatible with the ELG Internal LT Service API (this Docker image may or may not contain your LT tool depending on the integration option you have chosen). This can be done relatively easily for most programming languages, using Flask or FastAPI for Python, or Spring for Java for example. However, this always takes a little time and can be complex to optimize depending on your LT tool and the integration option chosen. Fortunately, the ELG Python SDK simplifies this step of creating the Docker image compatible with ELG.
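To give an idea of the plumbing you would otherwise write yourself, here is a rough sketch of a hand-written endpoint using only the Python standard library (WSGI). It is not how the SDK is implemented. The /process path, the message shapes, the error code, and the detect_language placeholder are all assumptions made for illustration.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def detect_language(text: str) -> dict:
    """Hypothetical placeholder for an LT tool."""
    return {"en": 1.0} if text.isascii() else {"und": 1.0}


def app(environ, start_response):
    """Hand-written WSGI endpoint; the /process path is an assumption."""
    if environ["REQUEST_METHOD"] != "POST" or environ["PATH_INFO"] != "/process":
        # Illustrative failure message, not an official ELG error code.
        error = {"failure": {"errors": [
            {"code": "example.not.found", "text": "Not found", "params": []}]}}
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [json.dumps(error).encode("utf-8")]
    # Read and parse the JSON request body.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    request = json.loads(environ["wsgi.input"].read(length))
    # Run the tool and wrap its output in the assumed response shape.
    body = {"response": {"type": "annotations",
                         "features": detect_language(request["content"])}}
    payload = json.dumps(body).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(payload)))])
    return [payload]


def call(wsgi_app, path: str, body: dict):
    """Invoke the WSGI app in-process, without opening a socket."""
    raw = json.dumps(body).encode("utf-8")
    environ = {}
    setup_testing_defaults(environ)
    environ.update({"REQUEST_METHOD": "POST", "PATH_INFO": path,
                    "CONTENT_LENGTH": str(len(raw)),
                    "wsgi.input": io.BytesIO(raw)})
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = wsgi_app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))


if __name__ == "__main__":
    print(call(app, "/process", {"type": "text", "content": "Hello"}))
```

Even for this toy case, routing, body parsing, and response formatting take up more lines than the tool itself.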
The Python SDK contains two classes, FlaskService and QuartService, based on Flask and Quart respectively, that allow you to easily create an ELG service. To understand how they work, let's look at an example.
These classes are not available by default; you need to install the ELG pip package with extra dependencies to use them:
pip install elg[flask] # to use the FlaskService class
pip install elg[quart] # to use the QuartService class
Create an ELG service using the FlaskService class
Imagine your LT tool is a language detection tool that runs as follows:
import langdetect

results = langdetect.detect_langs("This is a sentence in English.")
print(results)
>> [en:0.999998356414909]
To convert your tool into an ELG service using the FlaskService class, it is as easy as creating this Python script:
from elg import FlaskService
from elg.model import TextRequest, AnnotationsResponse
import langdetect

class ELGService(FlaskService):

    def process_text(self, content: TextRequest):
        langs = langdetect.detect_langs(content.content)
        ld = {}
        for l in langs:
            ld[l.lang] = l.prob
        return AnnotationsResponse(features=ld)

service = ELGService("LangDetection")
app = service.app
Let's go through each part of this file to understand how it works:
from elg import FlaskService
We import the FlaskService class from the ELG Python SDK. This class implements a Flask web server so you only have to redefine the process_text, process_audio, or process_structured_text method depending on the input type of your service.
from elg.model import TextRequest, AnnotationsResponse
Here, we import two classes that represent the request and the response messages of our ELG service. These classes correspond to the specification of the ELG Internal LT Service API, so by using them you ensure that your ELG service will be ELG compatible. Here, the LT tool takes text as input and returns an annotation response, but all the ELG message types are available.
import langdetect
This import is needed to run the LT tool. It is used to perform the language identification.
class ELGService(FlaskService):
We create our ELG service class that inherits from the FlaskService class to take advantage of what is already implemented in the FlaskService class.
def process_text(self, content: TextRequest):
We redefine the process_text method because the ELG service takes text as input. The content parameter contains all the data of the input request.
langs = langdetect.detect_langs(content.content)
ld = {}
for l in langs:
    ld[l.lang] = l.prob
This part corresponds to the LT tool. We are performing the language identification of the input content. We are changing the form of the output to facilitate the creation of the output message.
return AnnotationsResponse(features=ld)
Now we return the output of the LT tool as an AnnotationsResponse message to make sure that the ELG service is ELG compatible.
service = ELGService("LangDetection")
app = service.app
Finally, we instantiate the class and define the app variable that will be the endpoint of the Docker container.
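The reshaping step inside process_text can be tried on its own, without installing langdetect or the SDK. In the sketch below, Language is a hypothetical stand-in for the objects returned by langdetect.detect_langs (we only rely on their lang and prob attributes, as the script above does), and to_features is a helper name introduced for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects returned by langdetect.detect_langs:
# only the .lang and .prob attributes are used.
Language = namedtuple("Language", ["lang", "prob"])


def to_features(langs) -> dict:
    """Map each detected language code to its probability."""
    return {l.lang: l.prob for l in langs}


if __name__ == "__main__":
    detected = [Language("en", 0.9), Language("fr", 0.1)]
    print(to_features(detected))
```

The resulting dictionary is what gets passed as the features argument of AnnotationsResponse.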
After having created the elg_service.py file, you can run the following command:
elg docker create -n ELGService -p elg_service.py -r langdetect
If you are using this command with the QuartService class, you need to add:
--service_type quart
This will create all the files needed to build the Docker image of your ELG service. Among those files is the Dockerfile, from which you can build the image by running the usual docker build ...
A more precise example of how to use the FlaskService class is available in our documentation.
The image that we just built is an ELG service that can be deployed in the ELG cluster. Easy, right? We hope so, and if not, please feel free to send us your feedback. We are happy to hear your thoughts.
Choose the right tool depending on your use case
Depending on the LT tool you want to integrate into ELG and the integration option you choose, the requirements for the Docker image of the ELG service might differ. The FlaskService class we just presented is suitable when the LT tool is inside the Docker image (integration option 1), but not when the LT tool is running outside of the Docker image (options 2 & 3), because the Flask server is synchronous: it has to wait for the response from the LT tool before it can handle another request. To solve this issue, we created the QuartService class, which works very similarly to the FlaskService class, except asynchronously. Therefore, the QuartService class is recommended when the ELG service is only an adapter or a proxy (integration options 2 & 3). In addition, the QuartService class can stream the request without storing it locally, which can be very useful when dealing with large input data such as audio*.
The strength of the FlaskService and the QuartService classes is that they work almost the same way, so you can optimize your ELG service very easily and without much configuration. For example, here is the Python script for an ELG Speech-to-Text service that proxies the request to the LT tool running outside of ELG without caching the audio file locally:
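The benefit of handling requests asynchronously in a proxy can be illustrated with a small standard-library sketch. It does not use Quart or the SDK. The slow_upstream coroutine is a hypothetical stand-in for a call to an LT tool running elsewhere, and the 0.05 second delay is arbitrary.

```python
import asyncio


async def slow_upstream(request_id: int, events: list) -> str:
    """Hypothetical stand-in for forwarding a request to a remote LT tool."""
    events.append(f"start {request_id}")
    await asyncio.sleep(0.05)           # waiting on the remote tool
    events.append(f"end {request_id}")
    return f"result {request_id}"


async def handle_all(n: int, events: list) -> list:
    """Handle n requests concurrently; each wait yields to the others."""
    return await asyncio.gather(*(slow_upstream(i, events) for i in range(n)))


if __name__ == "__main__":
    log = []
    print(asyncio.run(handle_all(3, log)))
    print(log)
```

All three requests are started before the first one finishes, so the time spent waiting on the remote tool overlaps instead of adding up as it would with one blocking request after another.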
import traceback

import aiohttp
from elg import QuartService
from elg.model import TextsResponse
from elg.quart_service import ProcessingError

class Proxy(QuartService):

    # Keep the request body as a generator so that it can be streamed
    consume_generator = False

    async def setup(self):
        self.session = aiohttp.ClientSession()

    async def shutdown(self):
        if self.session is not None:
            await self.session.close()

    async def process_audio(self, content):
        try:
            # Make the remote call, streaming the audio to the LT tool
            async with self.session.post("https://example.com/endpoint", data=content.generator) as client_response:
                status_code = client_response.status
                result = await client_response.json()
        except Exception:
            traceback.print_exc()
            raise ProcessingError.InternalError('Error during the call')

        if status_code >= 400:
            raise ProcessingError.InternalError('Error during the call')

        return TextsResponse(texts=[{"content": result["text"]}])

service = Proxy("Proxy")
app = service.app
We try to make the creation of an ELG service as simple as possible with the ELG Python SDK. You can convert your Python LT tool into a Docker image that exposes an ELG compatible endpoint quickly and easily by following the steps described above.
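The idea of forwarding a request body chunk by chunk, rather than loading it fully first, can be sketched with a plain async generator. This is only an illustration of the principle, not the SDK's internals. The audio_chunks generator is a hypothetical stand-in for the request body, and the sizes are arbitrary.

```python
import asyncio


async def audio_chunks(total_bytes: int, chunk_size: int):
    """Hypothetical stand-in for a streamed request body."""
    sent = 0
    while sent < total_bytes:
        size = min(chunk_size, total_bytes - sent)
        yield bytes(size)               # one chunk of (silent) audio data
        sent += size
        await asyncio.sleep(0)          # let other tasks run between chunks


async def forward(chunks) -> tuple:
    """Consume the stream chunk by chunk, never holding the whole body."""
    forwarded = 0
    largest = 0
    async for chunk in chunks:
        forwarded += len(chunk)         # a real proxy would send the chunk on
        largest = max(largest, len(chunk))
    return forwarded, largest


if __name__ == "__main__":
    print(asyncio.run(forward(audio_chunks(10_000, 4096))))
```

The largest piece held in memory at any time is one chunk, whatever the size of the audio file.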
We hope you enjoyed this blog post – if you have questions or feedback on the ELG Python SDK, please feel free to contact us.
* The streaming of the request is currently only supported for audio requests.