Auto-labeling using GPT-4V

Nov 29th, 2023

We are thrilled to unveil the integration of GPT-4V into UBIAI: you can now auto-label PDFs and images more accurately than ever. The integration combines GPT-4V’s multi-modal capabilities with UBIAI’s annotation platform to make data labeling faster and more accurate.

Hugging Face Integration

July 5th, 2023

UBIAI now supports auto-labeling with Hugging Face models out of the box. You can use any of the thousands of models available on Hugging Face to auto-label your data with just a few clicks!

The auto-labeling process becomes incredibly straightforward and efficient. Simply input the desired Hugging Face model from the library and let the model automatically generate accurate labels for your data.


AI Builder

June 1st, 2023

AI Builder by UBIAI simplifies document automation, removing resource limitations and the need for complex AI models.


With a few clicks, create customized intelligent document workflows tailored to your business needs without coding.

It offers higher accuracy, LLM integration, and human-in-the-loop capabilities.


Streamline your document processes with AI Builder.

Zero-Shot Labeling

May 25, 2023

Zero-Shot Labeling uses OpenAI’s GPT-3.5 to automatically label PDF documents. It relies on zero-shot and few-shot methods, so UBIAI can perform LLM-assisted labeling without any manually labeled examples. This makes complex tasks such as Named Entity Recognition (NER) more accessible. It also speeds up the training of your AI models and significantly reduces the time spent on labeling.

Composed Models

May 15, 2023

Composed Models are a feature that revolutionizes the auto-labeling process. By combining multiple models into a single composed model, users can cluster various models for each document template. This breakthrough approach significantly enhances the efficiency of auto-labeling.


With Composed Models, each model can be trained using only five documents, streamlining the setup process. The integrated model system intelligently directs new documents to the most suitable model within the composed structure, resulting in precise and efficient data extraction tailored to the unique requirements of each template.

Few-Shot Labeling using GPT

April 12, 2023

Say goodbye to tedious and time-consuming manual labeling with the latest integration of OpenAI’s GPT model. We are thrilled to introduce Few-Shot Automated Labeling using GPT – the solution you’ve been waiting for! 

You can now provide a few labeled examples and let GPT auto-label your data instantly in any format, including PDFs. This new feature paves the way for a more efficient and streamlined approach to data annotation, enabling you to unlock your team’s full potential. We currently support Named Entity Recognition (NER) tasks and will be adding document classification and relation extraction soon. Stay tuned!
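
As a rough illustration of the few-shot idea, the sketch below assembles a handful of labeled examples into a prompt that an LLM could complete. The prompt layout, the `build_few_shot_prompt` helper, and the sample sentences are assumptions made for this example; they do not describe how UBIAI builds its prompts internally.

```python
import json


def build_few_shot_prompt(labels, examples, text):
    """Assemble a few-shot NER prompt (hypothetical format, for illustration only)."""
    lines = [
        "Extract entities from the text. Allowed labels: " + ", ".join(labels) + ".",
        "Answer with a JSON list of [span, label] pairs.",
        "",
    ]
    # Each labeled example becomes a Text/Entities pair the model can imitate.
    for example_text, entities in examples:
        lines.append("Text: " + example_text)
        lines.append("Entities: " + json.dumps(entities))
        lines.append("")
    # The unlabeled text goes last, leaving the answer for the model to fill in.
    lines.append("Text: " + text)
    lines.append("Entities:")
    return "\n".join(lines)


examples = [
    ("Invoice issued by Acme Corp on 2023-04-01.",
     [["Acme Corp", "ORG"], ["2023-04-01", "DATE"]]),
    ("Globex paid the balance on 2023-03-15.",
     [["Globex", "ORG"], ["2023-03-15", "DATE"]]),
]
prompt = build_few_shot_prompt(
    ["ORG", "DATE"], examples, "Initech sent a receipt on 2023-04-10."
)
print(prompt)
```

The resulting string would then be sent to the model of your choice, and the returned JSON parsed back into annotations.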

Template Form Recognizer

March 10, 2023

Does your dataset have a consistent layout? No need to label thousands of documents to train a sophisticated deep learning model.


With template-based models, you can get started with only 5 labeled documents and still get high model performance. This new feature will save you time and effort, allowing you to quickly and accurately label your data at scale. 


Try out this new feature today and see how it can help streamline your data labeling process.

Model-assisted labeling for collaboration

Dec 10th, 2022

This new feature will help speed up the labeling process by using machine learning models to automatically generate labels for some of the data. 


With model-assisted labeling, collaboration tasks will be even more efficient and effective. 


By using machine learning models to generate labels, we can reduce the amount of time and effort required to label large datasets.

AWS Delegated Access

Nov 15th, 2022

With this new feature, you can securely and seamlessly keep your data in your own cloud environment by granting UBIAI the necessary IAM access.


Without your data ever leaving your premises, UBIAI will display your documents so you can label them.


We currently support AWS S3 but will be adding GCP and Azure very soon. So stay tuned!

OCR Inter-Annotator Agreement

Nov 10th, 2022

Measuring the inter-annotator agreement (IAA) of your team is critical to the success of your project. 

Although IAA is very common in unstructured text, it is much harder to achieve for semi-structured documents such as invoices, receipts, tickets, etc. 


We are excited to announce the new OCR IAA feature, which enables you to compare annotations on native PDFs, invoices, scanned images, and other semi-structured documents, and to review annotation conflicts directly on them.
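
For readers new to IAA, the sketch below computes Cohen's kappa, one widely used agreement metric, for two annotators who labeled the same tokens, and lists the positions where they disagree. The source does not say which metric UBIAI reports, so the choice of kappa, the `cohen_kappa` helper, and the sample labels are assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same token sequence."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty token list.")
    n = len(labels_a)
    # Observed agreement: fraction of tokens given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labeled at random with their own label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label everywhere
    return (observed - expected) / (1 - expected)


# Hypothetical token-level labels for one invoice ("O" means no entity).
annotator_1 = ["VENDOR", "O", "DATE", "TOTAL", "O", "O"]
annotator_2 = ["VENDOR", "O", "DATE", "O", "O", "TOTAL"]

kappa = cohen_kappa(annotator_1, annotator_2)
conflicts = [i for i, (a, b) in enumerate(zip(annotator_1, annotator_2)) if a != b]
print(f"kappa = {kappa:.2f}, conflicting token positions = {conflicts}")
```

A kappa of 1 means perfect agreement, while values near 0 mean the agreement is no better than chance.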

OCR Text Edit

Nov 5th, 2022

We all know that OCR is an imperfect process; if you feed it low-resolution or handwritten documents, you can get a low-quality result. 



UBIAI’s new OCR text edit feature lets you correct OCR parsing errors by editing the text directly in the labeling interface, so you can improve your OCR annotation results and achieve near-perfect accuracy.

LayoutLM Fine-tuning and Auto-labeling

Sep 19th, 2022

Semi-structured documents such as invoices, receipts, and tickets require training advanced NLP models.


With this release, you can fine-tune Microsoft’s latest LayoutLM model on your annotated documents and use it to auto-label the rest of your unlabeled data!

Object Detection

Jan 20th, 2023

With the new object detection feature, you can draw a bounding box around an image and assign a label. 


This is useful for supplementing your OCR annotation with non-textual entities such as signatures, figures, and images.

Project Comparison

Aug 28th, 2022

Comparing project annotations against each other can reveal very useful insights. 


For example, you may want to compare human-labeled data with model-labeled data.


With UBIAI’s new project comparison feature, you can compare two projects against each other.
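
To make the idea concrete, here is a minimal sketch of comparing a human-labeled project with a model-labeled one. It assumes each project can be exported as a set of `(doc_id, start, end, label)` tuples; that representation and the `compare_projects` helper are hypothetical and not part of UBIAI.

```python
def compare_projects(project_a, project_b):
    """Compare two projects given as sets of (doc_id, start, end, label) tuples."""
    shared = project_a & project_b
    union = project_a | project_b
    return {
        "shared": sorted(shared),
        "only_in_a": sorted(project_a - project_b),
        "only_in_b": sorted(project_b - project_a),
        # Share of all annotations that both projects agree on exactly.
        "overlap_ratio": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical annotations: the model mislabels one span in doc1.
human_labeled = {
    ("doc1", 0, 9, "ORG"),
    ("doc1", 20, 30, "DATE"),
    ("doc2", 5, 12, "TOTAL"),
}
model_labeled = {
    ("doc1", 0, 9, "ORG"),
    ("doc1", 20, 30, "TOTAL"),
    ("doc2", 5, 12, "TOTAL"),
}

report = compare_projects(human_labeled, model_labeled)
print(report["overlap_ratio"])
print(report["only_in_a"], report["only_in_b"])
```

Annotations that appear in only one project are the ones worth reviewing first.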

Span categorizer training and auto-labeling

Jul 21st, 2022

When dealing with nested entities or long sentence classification, NER models are generally not the right choice. 


UBIAI now supports spaCy’s span categorizer training and auto-labeling with a click of a button!

Multi-lingual OCR Annotation

Mar 06th, 2022

UBIAI now supports multi-language OCR annotation including Arabic, Hebrew, Japanese, Chinese, etc. 


With the OCR feature, you can annotate directly on PDFs or images in any language without losing any of the layout information.

Model-Assisted Labeling

Dec 12th, 2021

Why hand label when you can automate? 


With this new feature, you can train an NER and relation model to auto-label your documents and cut your annotation time by 50-80%. 


In the annotation interface, simply select a model and click on the run button to start the training on your own annotations! 


The feature is only available for Team and Team Pro packages.

IAA Conflict Visualization

Dec 12th, 2021

It can be difficult to spot conflicts between annotators and resolve disagreements. With this new feature, you can compare the annotations of any two annotators and spot conflicts between them.


The feature is only available for Team and Team Pro packages.

Add Entity Filter

Nov 13th, 2021

With this new feature, you can filter documents by annotated entities, relations and text classes. 


In addition, you can combine filters with the logical operators AND, OR, and NOT to filter documents by entities or document state.
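
The logic behind such filters can be sketched with small composable predicates. The document structure, the state names, and the helper functions below are assumptions made for illustration; in UBIAI the filter is configured in the interface.

```python
def has_entity(label):
    """Predicate: the document contains at least one entity with this label."""
    return lambda doc: label in doc["entities"]


def in_state(state):
    """Predicate: the document is in the given state."""
    return lambda doc: doc["state"] == state


def and_(*predicates):
    return lambda doc: all(p(doc) for p in predicates)


def or_(*predicates):
    return lambda doc: any(p(doc) for p in predicates)


def not_(predicate):
    return lambda doc: not predicate(doc)


def filter_documents(documents, predicate):
    """Return the names of the documents that satisfy the predicate."""
    return [doc["name"] for doc in documents if predicate(doc)]


# Hypothetical documents with their annotated entity labels and state.
documents = [
    {"name": "invoice_1.pdf", "entities": {"VENDOR", "TOTAL"}, "state": "validated"},
    {"name": "invoice_2.pdf", "entities": {"VENDOR"}, "state": "pending"},
    {"name": "receipt_1.pdf", "entities": {"TOTAL", "DATE"}, "state": "validated"},
]

# TOTAL AND NOT DATE
total_without_date = filter_documents(
    documents, and_(has_entity("TOTAL"), not_(has_entity("DATE")))
)
# DATE OR state is pending
date_or_pending = filter_documents(
    documents, or_(has_entity("DATE"), in_state("pending"))
)
print(total_without_date, date_or_pending)
```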

OCR Optimized Multi-words Selection

Nov 06th, 2021

Previously, multi-word selection required holding down Shift and clicking the first and last tokens.

You can now select any word sequence by drawing a box frame around the words.

All the tokens inside the frame will be automatically selected.
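
Conceptually, box selection is a containment test between the frame and each token's bounding box. The sketch below assumes tokens carry `(x0, y0, x1, y1)` page coordinates and that a token counts as selected only when it lies fully inside the frame; the `Token` class, the coordinates, and the full-containment rule are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """An OCR token with its bounding box (top-left and bottom-right corners)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def select_tokens(tokens, frame):
    """Return the tokens whose bounding box lies entirely inside the frame."""
    fx0, fy0, fx1, fy1 = frame
    return [
        t for t in tokens
        if t.x0 >= fx0 and t.y0 >= fy0 and t.x1 <= fx1 and t.y1 <= fy1
    ]


# Hypothetical OCR output for part of an invoice.
tokens = [
    Token("Total", 10, 100, 50, 112),
    Token("due:", 55, 100, 85, 112),
    Token("$1,250.00", 90, 100, 150, 112),
    Token("Thank", 10, 130, 50, 142),
]

selected = select_tokens(tokens, frame=(50, 95, 155, 115))
print(" ".join(t.text for t in selected))
```

A more lenient variant could select tokens that merely overlap the frame, which tends to be more forgiving of imprecise mouse movement.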

API Support

Oct 22nd, 2021

UBIAI now supports fully programmatic upload, labeling, model training and inference. 


The API feature is only available for Team, Team Pro and Enterprise packages.

Fine-tune Relation Extraction Model

Oct 06th, 2021

In addition to fine-tuning a custom NER model for auto-annotation, you can now train a relation extraction model to auto-annotate the relations between your entities.

Add Free-form Input Text

Sep 19th, 2021

You can now add free-form input text for each document.


The text input interface is best used to ask the annotator to translate a given text or to leave feedback about a manual annotation task.

Upgrade to spaCy 3

Aug 18th, 2021

You can now train the latest spaCy 3 model to auto-annotate your documents.

spaCy 3 supports more languages, such as Arabic, Hindi, Tamil, and Albanian.

In addition, you can export your annotations directly in spaCy’s DocBin format.

Multiple Dictionary Pre-annotation

May 20th, 2021

This new feature allows you to add multiple separate dictionaries to pre-annotate your documents. 

This is useful if you want to perform multi-pass pre-annotation on your data.
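
A multi-pass dictionary pre-annotation can be sketched as follows. Each dictionary maps a term to a label, and the dictionaries are applied in order. The rule that later matches are skipped when they overlap an earlier annotation, the `pre_annotate` helper, and the sample dictionaries are assumptions for this example rather than a description of UBIAI's behavior.

```python
import re


def pre_annotate(text, dictionaries):
    """Apply term-to-label dictionaries in order and return (start, end, label) spans."""
    annotations = []
    for dictionary in dictionaries:  # one pass per dictionary
        for term, label in dictionary.items():
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                start, end = match.start(), match.end()
                # Assumption: spans that overlap an existing annotation are skipped.
                overlaps = any(start < e and s < end for s, e, _ in annotations)
                if not overlaps:
                    annotations.append((start, end, label))
    return sorted(annotations)


text = "Aspirin reduces fever; aspirin overdose harms the liver."
drug_dictionary = {"aspirin": "DRUG"}
clinical_dictionary = {
    "aspirin overdose": "CONDITION",
    "fever": "SYMPTOM",
    "liver": "ORGAN",
}

spans = pre_annotate(text, [drug_dictionary, clinical_dictionary])
labeled = [(text[start:end], label) for start, end, label in spans]
print(labeled)
```

Because the drug dictionary runs first, "aspirin overdose" is not annotated in the second pass; reordering the dictionaries changes which one takes priority.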

Multi-tag Annotation

May 07th, 2021

With the multi-tag feature, you can assign multiple labels to a token or create overlapping entities.


This is extremely useful in biomedical or machine translation annotation, for example.
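
As an illustration of overlapping entities, the sketch below stores annotations as token-index spans and reports every label that covers each token. The span representation, the `labels_per_token` helper, and the biomedical example are assumptions and not UBIAI's export format.

```python
from collections import defaultdict


def labels_per_token(tokens, spans):
    """Pair each token with every label whose span covers it.

    Spans are (start, end, label) over token indices, with an exclusive end.
    """
    labels = defaultdict(list)
    for start, end, label in spans:
        for index in range(start, end):
            labels[index].append(label)
    return [(token, labels[index]) for index, token in enumerate(tokens)]


tokens = ["non-small", "cell", "lung", "cancer"]
spans = [
    (0, 4, "DISEASE"),  # the whole phrase is a disease mention
    (2, 3, "ORGAN"),    # "lung" is also an organ, nested inside it
]

tagged = labels_per_token(tokens, spans)
for token, token_labels in tagged:
    print(token, token_labels)
```

A single-label scheme would force a choice between the two spans; allowing overlaps keeps both.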

OCR Annotation

Apr 13th, 2021

With the OCR feature, you can annotate directly on PDFs or images without losing any of the layout information.


This is useful for invoice extraction where text sequence and spatial information are equally important. 


UBIAI offers the most complete and comprehensive OCR annotation solution on the market.

Document Classification Annotation

Apr 06th, 2021

In addition to entity and relation annotation, you can now assign a label to the whole document in the same task, without having to create a separate one!
