Deploying Serverless spaCy Transformer Model with AWS Lambda
Oct 12, 2021
With transformers becoming essential for many NLP tasks thanks to their unmatched performance, useful and impactful new NLP models are created every day. However, many NLP practitioners still find it challenging to deploy models into production. According to this report, 90% of machine learning models never make it into production.
Model deployment enables you to host your model in a server environment so it can be used to output prediction when called by an API, for example.
In this tutorial, I will show you how to push an NER spaCy transformer model to Hugging Face and deploy it on AWS Lambda to run predictions.
According to the AWS website:
“AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. With Lambda, you can run code for virtually any type of application or backend service — all with zero administration.”
Deploying models without the need to manage backend servers is a game changer. It enables developers and small startups that lack DevOps resources to deploy production-ready models.
Below are the steps we are going to follow:
- Deploy a trained spaCy transformer model on Hugging Face
- Store the model in S3
- Deploy the model on AWS Lambda
- Run the AWS Lambda function to output predictions based on the user's input
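To make the end goal concrete, here is a minimal sketch of the Lambda handler these steps build toward, assuming an API Gateway proxy-style event. The `nlp` placeholder and the helper function are illustrative, not the tutorial's final code; in the real function, `nlp` would be the spaCy pipeline loaded from the model stored in S3.

```python
import json

# Placeholder: in the deployed function this would be the spaCy pipeline,
# loaded once per container from the model artifact stored in S3.
nlp = None

def extract_entities(pipeline, text):
    """Run the pipeline on the text and return entities as plain dicts."""
    doc = pipeline(text)
    return [{"text": ent.text, "label": ent.label_} for ent in doc.ents]

def lambda_handler(event, context):
    # API Gateway proxy events carry the request payload as a JSON string
    # in the "body" field.
    body = json.loads(event.get("body") or "{}")
    text = body.get("text", "")
    entities = extract_entities(nlp, text) if nlp is not None else []
    return {
        "statusCode": 200,
        "body": json.dumps({"entities": entities}),
    }
```

The handler returns a JSON response so any HTTP client can consume the predictions without knowing anything about spaCy.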
Deploy a spaCy Transformer Model on Hugging Face
In this tutorial, we fine-tuned the transformer NER model SciBERT to extract materials, processes, and tasks from scientific abstracts. The annotation was done using the UBIAI text annotation tool. We followed the same approach presented in my previous article, where we leveraged Google Colab to train the model. The next step after training the model is to host it on Hugging Face so it can be accessed by API. For more information on how to push a spaCy model to Hugging Face, check this link.
First, install spacy-huggingface-hub from pip:
```
pip install spacy-huggingface-hub
```

Build a .whl file from the trained spaCy pipeline (make sure to create the output directory beforehand):

```
huggingface-cli login
python -m spacy package ./model_science_scibert/model-last ./output --build wheel
```

Push the wheel file to the Hugging Face hub:

```
cd ./output/en_scibert_ScienceIE-0.0.0/dist
python -m spacy huggingface-hub push en_scibert_ScienceIE-0.0.0-py3-none-any.whl
```
Let’s check that the model has been successfully uploaded to Hugging Face:
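Once the push succeeds, the wheel lives at a predictable Hub URL, which is also how we will reference it later when installing the pipeline. The helper below is illustrative only; the username `UBIAI` is a hypothetical placeholder, and the URL pattern follows the Hub's `resolve/main` convention for repository files.

```python
# Illustrative helper (not part of the original tutorial): build the URL
# where the pushed wheel can be downloaded, plus the matching pip command.
# "UBIAI" is a hypothetical Hub username; substitute your own.

def hub_wheel_url(user: str, repo: str, wheel: str) -> str:
    """Return the direct download URL for a file in a Hub model repo."""
    return f"https://huggingface.co/{user}/{repo}/resolve/main/{wheel}"

url = hub_wheel_url(
    "UBIAI",
    "en_scibert_ScienceIE",
    "en_scibert_ScienceIE-0.0.0-py3-none-any.whl",
)
print(f"pip install {url}")
```

Installing the wheel this way makes the pipeline loadable with `spacy.load("en_scibert_ScienceIE")`, just like any packaged spaCy model.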