How to Build a Knowledge Graph with Neo4J and Transformers
Nov 21, 2021
In my previous article, “Building a Knowledge Graph for Job Search using BERT Transformer”, we explored how to create a knowledge graph from job descriptions using entities and relations extracted by a custom transformer model. While the Python library networkX gave us great visuals of our nodes and relations, the graph itself lived in Python memory and wasn’t stored in a database. That becomes a problem when building a scalable application that must store an ever-growing knowledge graph. This is where Neo4j excels: it stores the graph in a fully functional database that can manage large amounts of data. In addition, Neo4j’s Cypher query language is rich, easy to use, and very intuitive.
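To give a flavor of Cypher before we start, here is a small illustrative query pair; the node labels (Job, Skill) and relation type (REQUIRES) are placeholder assumptions for this sketch, not taken from the article’s dataset:

```cypher
// Upsert a job and a skill, then link them (MERGE avoids duplicates)
MERGE (j:Job {name: "Data Scientist"})
MERGE (s:Skill {name: "Python"})
MERGE (j)-[:REQUIRES]->(s)

// Find the three most frequently required skills
MATCH (:Job)-[:REQUIRES]->(s:Skill)
RETURN s.name AS skill, count(*) AS freq
ORDER BY freq DESC LIMIT 3
```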
In this article, I will show how to build a knowledge graph from job descriptions using a fine-tuned transformer-based Named Entity Recognition (NER) model and spaCy’s relation extraction model. The method described here can be applied to many other fields, such as biomedicine, finance, and healthcare.
Below are the steps we are going to take:
- Load our fine-tuned transformer NER and spaCy relation extraction models in Google Colab
- Create a Neo4j Sandbox and add our entities and relations
- Query our graph to find the highest job match for a target resume, the three most popular skills, and the highest skill co-occurrence
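For the second step, loading extracted entities and relations into the Sandbox can be sketched as below. This is a minimal sketch under stated assumptions, not the article’s exact code: the labels (Job, Skill), relation type (REQUIRES), and the entity/relation tuple shapes are placeholders, and `push_to_neo4j` requires the official `neo4j` Python driver (`pip install neo4j`) plus the URI and credentials from your Sandbox’s “Connection details” tab.

```python
def entity_cypher(label: str) -> str:
    """Build a parameterized MERGE statement that upserts an entity node."""
    return f"MERGE (:{label} {{name: $name}})"

def relation_cypher(head_label: str, rel_type: str, tail_label: str) -> str:
    """Build a parameterized MERGE statement linking two upserted nodes."""
    return (
        f"MATCH (h:{head_label} {{name: $head}}), (t:{tail_label} {{name: $tail}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )

def push_to_neo4j(uri, user, password, entities, relations):
    """Write extracted entities and relations to a Neo4j Sandbox.

    entities:  iterable of (label, name) tuples
    relations: iterable of (head_label, head, rel_type, tail_label, tail) tuples
    """
    from neo4j import GraphDatabase  # pip install neo4j

    driver = GraphDatabase.driver(uri, auth=(user, password))
    with driver.session() as session:
        for label, name in entities:
            session.run(entity_cypher(label), name=name)
        for head_label, head, rel_type, tail_label, tail in relations:
            session.run(
                relation_cypher(head_label, rel_type, tail_label),
                head=head, tail=tail,
            )
    driver.close()
```

Using MERGE rather than CREATE keeps the graph free of duplicate nodes as the same skills reappear across many job descriptions.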
For more information on how to generate training data with UBIAI and fine-tune the NER and relation extraction models, check out the articles below:
- Introducing UBIAI: Easy-to-Use Text Annotation for NLP Applications
- How to Train a Joint Entities and Relation Extraction Classifier using BERT Transformer with spaCy 3
- How to Fine-Tune BERT Transformer with spaCy 3
The dataset of job descriptions is publicly available on Kaggle.
By the end of this tutorial, we will have created the knowledge graph shown below.