Description Guided Zero-Shot Labeling for NLP Applications

Using LLM

Sept 27th, 2023

Zero-shot labeling with an LLM such as GPT is a promising approach to quickly create training data with minimal human input. It enables training AI systems without needing to manually label the entire dataset. However, one disadvantage of this approach is that it struggles to accurately classify complex and ambiguous entities.

 

Imagine a scenario where an AI system needs to label entities in news articles. While classifying straightforward topics like “sports” or “politics” might be a breeze, things get tricky when we encounter more intricate entities like “artificial intelligence regulations,” “climate change agreements,” or “financial market fluctuations.” These labels often carry inherent ambiguity, and traditional auto-labeling systems may stumble when trying to disentangle the subtle nuances that differentiate one label from another.

 

This is where the concept of “Description guided zero-shot labeling” enters the scene. By providing concise and informative descriptions for each label, we equip our LLM with invaluable context and clarity. This approach holds the promise of significantly enhancing the accuracy of zero-shot auto-labeling by offering guidance and disambiguation precisely when it’s needed most.

 

In this article, we explore the challenges posed by complex entities, and demonstrate how the inclusion of label descriptions can be a game-changer. We will examine the mechanics of label description guided auto-labeling, present real-world case studies and experiments, discuss its potential applications across industries, and explore the challenges that lie ahead on the road to achieving enhanced accuracy.

Setting up Zero-Shot Labeling

In this section, we will walk through the process of enabling description-guided auto-labeling using UbiAI, a powerful labeling platform designed to streamline the labeling process and model fine-tuning. We’ll illustrate this tutorial with practical examples from litigation case analysis.

 

For this tutorial, we are going to identify the plaintiff, the defendant, and their claims from litigation cases using zero-shot labeling. First, we upload the document to UbiAI. Below is a small snippet of the document:

Litigation Document.

We will extract the Defendant and Plaintiff names as well as the claims for each party. To do so, we simply add the labels in UbiAI and enable the zero shot LLM feature:

UbiAI Labeling Interface.

Let’s first run zero-shot labeling without adding any label descriptions:

 
UbiAI Zero-shot Labeling Configuration Window.

Here is the result:
LLM Zero-shot Labeling Without Description


Although the plaintiff and defendant names were identified correctly, the claims of each party were not extracted.

 

Now, let’s add descriptions for each label:

 

PLAINTIFF: Identify the name of the plaintiff. Do not extract sentences.

 

DEFENDANT: Identify the name of the defendant. Do not extract sentences.

 

CLAIM_PLAINTIFF: Identify the sentence describing the claim of the plaintiff.

 

CLAIM_DEFENDANT: Identify the sentence describing the claim of the defendant.

 

Under the hood, UbiAI uses OpenAI’s “function calling” feature to attach a description to each label.
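
As a rough illustration of how per-label descriptions can travel with a function-calling request, the sketch below builds a function definition in the JSON Schema shape that function calling expects, with one array property per label and the description attached to that property. The function name extract_entities, the helper build_function_schema, and the overall schema layout are assumptions made for illustration; UbiAI’s actual request format is not shown in this article.

```python
import json

# Label descriptions used in this tutorial.
LABEL_DESCRIPTIONS = {
    "PLAINTIFF": "Identify the name of the plaintiff. Do not extract sentences.",
    "DEFENDANT": "Identify the name of the defendant. Do not extract sentences.",
    "CLAIM_PLAINTIFF": "Identify the sentence describing the claim of the plaintiff.",
    "CLAIM_DEFENDANT": "Identify the sentence describing the claim of the defendant.",
}


def build_function_schema(label_descriptions):
    """Build a function definition with one described property per label.

    Hypothetical helper: the layout follows the JSON Schema format used for
    function calling, but it is not UbiAI's actual schema.
    """
    properties = {
        label: {
            "type": "array",              # a label may match several spans
            "items": {"type": "string"},  # each span is returned as text
            "description": description,   # the per-label guidance
        }
        for label, description in label_descriptions.items()
    }
    return {
        "name": "extract_entities",  # assumed name, for illustration only
        "description": "Return the text spans found for each label.",
        "parameters": {
            "type": "object",
            "properties": properties,
            # No "required" list: a label with no match can simply be omitted.
        },
    }


schema = build_function_schema(LABEL_DESCRIPTIONS)
print(json.dumps(schema, indent=2))
```

Putting the description on the property, rather than in one long prompt, keeps each instruction next to the label it governs.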

 

It is important to add clear and concise descriptions for each label to guide the LLM effectively. We’ve also noticed that adding positive and negative examples to the description improves accuracy.
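
One way to keep positive and negative examples consistent across labels is to assemble each description from its parts. The sketch below is a minimal illustration only: the helper build_description and the example strings are hypothetical, and are not taken from the litigation document or from UbiAI.

```python
def build_description(instruction, positives=(), negatives=()):
    """Combine an instruction with optional positive and negative examples.

    Hypothetical helper for illustration; the result is a plain string that
    could be pasted into a label's description field.
    """
    parts = [instruction.strip()]
    if positives:
        quoted = "; ".join(f'"{p}"' for p in positives)
        parts.append(f"Examples to extract: {quoted}.")
    if negatives:
        quoted = "; ".join(f'"{n}"' for n in negatives)
        parts.append(f"Do not extract: {quoted}.")
    return " ".join(parts)


# Made-up example strings, not from the tutorial's document.
description = build_description(
    "Identify the name of the plaintiff. Do not extract sentences.",
    positives=["Acme Corp"],
    negatives=["The plaintiff alleges breach of contract"],
)
print(description)
```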

LLM Zero-Shot Configuration Window

In UbiAI, you can select either the GPT-3 or the GPT-4 model. To enable per-label descriptions, we need to switch to the 16k context length, since the larger context leaves room for the descriptions in the input. Then click the edit description button to enter each description.

Description Guided Zero-Shot Labeling

We are now ready to run the LLM Zero-Shot Labeling with the added description. With the help of the provided description, the LLM is now able to correctly extract the plaintiff claims as shown below.

LLM Zero-shot Labeling With Description

However, it also labeled a defendant claim (CLAIM_DEFENDANT) even though the document contains none, which is a false positive. Clarifying the description further should help.

 

LLM Zero-shot Labeling With Description.

Conclusion

In this tutorial, we showcased the practical application of description-guided auto-labeling using UbiAI. We covered setting up labels, defining clear descriptions, configuring the auto-labeling process, and manually verifying the results.

 

By providing crucial context and guidance, label descriptions help the LLM make more informed identifications and mitigate ambiguity, ultimately leading to more accurate classifications.

 

As this tutorial showed, guiding the LLM with concise and clear descriptions is crucial to avoid false positives. With that care taken, the approach has substantial potential benefits for industry, research, and other applications.