Blog

Description Guided Zero-Shot Labeling for NLP Applications

In this article, we explore the challenges posed by complex entities and demonstrate how the inclusion of label descriptions can be a game-changer.

AUG 28TH, 2023

Streamlining Lease Abstraction with AI

In this tutorial, we’ll delve into how to automate lease abstraction using a custom-trained AI model in conjunction with a Large Language Model (LLM).

JULY 18TH, 2023

Unlocking Legal Litigation Analysis with ChatGPT

In this tutorial, we extract relevant information from environmental litigation cases, such as named entities and the facts presented, and summarize the cases using the LLM GPT-3.5-Turbo.

JULY 18TH, 2023

Automating Entity Extraction from Safety Data Sheets (SDS) using LLMs

In this article, we will explore how to leverage the power of Large Language Models (LLMs) to automate entity extraction from SDS and improve the overall efficiency of SDS companies.

JULY 15TH, 2023

How to automate entity extraction from PDF using LLMs

In this tutorial, we present a method to auto-label unstructured and semi-structured documents using the in-context learning capabilities of Large Language Models (LLMs).

JUNE 21ST, 2023

Understanding Data Labels and Data Labeling: Definition, Types, and How it Works for Machine Learning

This article aims to shed light on the concept of data labels, their importance in machine learning, and how data labeling works. We will also delve into different types of machine learning labels, data labeling techniques, quality control measures, and the emerging trend of human-in-the-loop labeling.

JUNE 21ST, 2023

Mastering Model Validation with Data Annotations: Understanding Date Format Validation, Display, DataType, and Required Fields

In this article, we will explore the various aspects of data annotation validation, with a specific focus on date format validation, display, datatype, and required fields.

JUNE 19TH, 2023

A Comprehensive Guide to Data Labeling with Unsupervised Learning

In this article, we explore the world of unsupervised data labeling and its significance in the field of machine learning. We delve into various topics such as clustering algorithms, dimensionality reduction techniques, active learning, evaluation metrics, challenges, applications, hybrid approaches, ethics, and future research trends.

JUNE 16TH, 2023

Exploring the Different Types of Text Annotation and Use Cases

This comprehensive guide explores different types of text annotation and their diverse use cases. From Named Entity Recognition and Sentiment Analysis to Text Categorization and Question-Answering Annotations, we examine how each type contributes to language understanding and enables applications in various fields. 

JUNE 11TH, 2023

Advancing Language Understanding: Multilingual Semantic Annotation Systems and Multilingual Annotation Systems

This article explores the advancements, challenges, and diverse applications of multilingual semantic annotation systems and multilingual annotation systems.

JUNE 10TH, 2023

Unlocking the Potential of Machine Learning with Data Annotation: Types, Techniques, and Importance

This article will explore in depth the different types of data annotation and the critical role they play in machine learning, as well as the difference between data labeling and data annotation in data preparation for machine learning.

JUNE 9TH, 2023

Transforming Raw Data into Actionable Insights: The Significance of Data Annotation

In this article, we will discuss the main purpose of data labeling, its importance, and its use in different industries such as healthcare, finance, retail, and manufacturing.

JUNE 6TH, 2023

How to Analyze Company Risk Factors from SEC Reports with AI

In this tutorial, we delve into the key steps involved in training a custom AI model that identifies risk factors from SEC 10-K reports and integrating it into a workflow that analyses the results using ChatGPT. We also highlight the importance of human-in-the-loop review for refining the model’s predictions.

JUNE 8TH, 2023

Data Labeling: Fueling Machine Learning Algorithms for Success

In this article, we will explore the importance of data labeling, its examples, and its use in machine learning. We will also discuss the data labeling process, including the project’s requirements, the appropriate labeling technique, the team of experts, the labeling guidelines, and the continuous improvement of labeled data quality.

MAY 31, 2023

How to Automate Document Extraction from Insurance Documents Using Custom AI and ChatGPT

In today’s fast-paced insurance industry, processing a vast array of documents is a critical but often cumbersome task. Intelligent Document Extraction has emerged as a game-changing solution for insurance companies.

MAY 31, 2023

Intelligent Document Extraction for Logistics and Supply Chain

In this tutorial, we will show how to train a custom AI model on logistics documents, host it, and integrate it into a workflow, with no coding or extensive AI knowledge required. Let’s dive in!

MAY 31, 2023

The Future of Data Extraction from PDFs: Unveiling Intelligent Methods

In the digital age, extracting valuable data from PDFs efficiently is crucial for organizations across industries. Throughout this comprehensive article, we will delve into the transformative techniques and tools that have revolutionized this domain. Join me as we explore the future of data extraction.

MAY 25, 2023

Introducing AI Builder: the AI engine for building intelligent document applications

As AI models are becoming commoditized, high-quality training data is becoming key to successful and applicable AI. ChatGPT is the perfect example: by feeding GPT-3 a small amount of high-quality human-labeled data using RLHF, OpenAI …

MAY 20, 2023

How to Automate Data Extraction from Bank Statements Using a Custom-Trained AI Model

In this tutorial, we will learn how to automate data extraction from bank statements using custom-trained AI models and automated table extraction.

MAY 11, 2023

Mastering Entity Extraction for Business Success

Entity extraction, also known as named entity recognition (NER) or entity identification, is a sub-field of natural language processing (NLP) that involves identifying and classifying key information elements or “entities” within unstructured text. These entities may include people’s names, locations, organizations, dates, and more.

MAR 23, 2023

Data Labeling and Annotation

Data labeling and annotation are key components of machine learning and artificial intelligence. These processes add relevant information, tags, or labels to the raw data to help train machine learning models. Labeled data helps machine learning algorithms recognize patterns and make predictions based on new, unseen data.

Data Labeling and Annotation - Manual labeling - few-shot labeling - zero-shot labeling - weak labeling
MAY 04, 2023

Intelligent Document Processing (IDP) and Robotic Process Automation (RPA): A Comparative Study on Automation of Business Processes

A comparative study of IDP and RPA, exploring their definitions, applications, benefits, and limitations. 

APR 6, 2023

How Few-Shot Learning is Automating Document Labeling

In this article, we’ll take a closer look at how few-shot learning is transforming document labeling, specifically for Named Entity Recognition, which is the most important task in document processing.

FEB 27, 2023

Entity-based Synthetic Data Generation with ChatGPT

The development of text generation models has been greatly accelerated by the introduction of large pre-trained language models like GPT (Generative Pre-trained Transformer). These models are trained on massive amounts of text data using unsupervised learning techniques and can generate high-quality text that is often indistinguishable from human-written text.

FEB 27, 2023

What is Synthetic Data Generation?

In this article, we will introduce the concept of synthetic data, how we generate it, its types, techniques, and tools. In the next article, we will show a few examples of generating the data using named entities extracted from real text. This series will give you the knowledge required to produce synthetic datasets for solving data-related issues.

JAN 23, 2023

How to Train the LILT Model on Invoices and Run Inference

In the realm of document understanding, deep learning models have played a significant role. These models are able to accurately interpret the content and structure of documents, making them valuable tools for tasks such as invoice processing, resume parsing, and contract analysis. 

UBIAI BLOG
JAN 12, 2023

Revolutionize your Data Extraction Process with OCR and NLP

If you’ve ever wondered how you can automate data extraction from your goods receipts and shipment documents, then you’ve come to the right place.

In this article, we’ll explain how Natural Language Processing can quickly and easily extract data from semi-structured documents using OCR, labeling, and fine-tuning models.

July 18, 2022

LayoutLM v3 vs LayoutLM v2: Fine-tuning LayoutLM v3 for Invoice Processing

Step-by-step tutorial for fine-tuning Microsoft’s latest LayoutLM v3 on invoices, starting with annotations performed with the UBIAI OCR and text annotation tools, then comparing its performance to LayoutLM v2.

Nov 29, 2021

Analyzing Scientific Articles with fine-tuned SciBERT NER Model and Neo4j

Step-by-step tutorial on analyzing datasets of scientific abstracts using the Neo4j graph database and a fine-tuned SciBERT model.

Jan 14, 2022

Interpretable and Explainable NER with LIME

Training an NER model that predicts Skills, Experience, Diploma, and Diploma Majors from job descriptions and explaining its output using the LIME algorithm.

Oct 12, 2021

Deploying Serverless spaCy Transformer Model with AWS Lambda

Step-by-step tutorial for deploying an NER spaCy Transformer model to Hugging Face and running predictions on AWS Lambda.

May 31, 2021

How to Annotate PDFs and Scanned Images for NLP Applications using UBIAI Text Annotation Tool

Increase efficiency, productivity, and cost savings by using NLP/NER for information retrieval in the analysis of unstructured and structured documents (invoices, receipts, contracts).

Feb 13, 2022

UBIAI – A Data Labelling and Text Annotation Tool That’s Different. Here’s How

Build in-house, outsource data labeling, OCR, model-assisted labeling, fully optimized UI, easy to use, Microsoft Word, PDFs, team collaboration, auto-labeling.

May 17, 2021

Building a Knowledge Graph for Job Search using BERT Transformer

Tutorial on how to create a job recommendation from unstructured text, how to extract entities and relations from job descriptions using the BERT model, and how to create a knowledge graph.

Apr 1, 2021

How to Train a Joint Entities and Relation Extraction Classifier using BERT Transformer with spaCy 3

Train a joint entity and relation extraction classifier with BERT Transformer and spaCy 3 to automate information retrieval from unstructured texts, contracts, financial documents, and healthcare records.

Feb 28, 2021

How to Fine-Tune BERT Transformer with spaCy 3 for NER

Step-by-step tutorial for cloning and fine-tuning a Hugging Face library model on a dataset.

Jul 24, 2020

Building a job entity recognizer using Amazon Comprehend

Step-by-step tutorial on how to annotate job descriptions using the UBIAI tool and train a custom entity recognizer in Amazon Comprehend.

Sep 4, 2020

Introducing UBIAI: Easy-to-Use Text Annotation for NLP Applications

UBIAI – easy-to-use UI, multilingual (Arabic, Chinese, etc.) auto-annotation, entity recognition, chatbot training, entity sentiment analysis, text classification.

Jul 24, 2020

How to Automate Job Searches Using Named Entity Recognition

NER model for entity extraction from job descriptions, CVs, resumes, and cover letters for targeted recruitment and job search.

Jun 21, 2021

Fine-Tuning Transformer Model for Invoice Recognition

Step-by-step tutorial for cloning and fine-tuning a Hugging Face library model on a dataset.

Jan 14, 2022

Metadata Extraction from Rental Agreements Using AI

AI and its application in rental agreements: identifying, labeling, annotating, and extracting metadata from documents (rental agreements, contracts).

Nov 21, 2021

How to Build a Knowledge Graph with Neo4j and Transformers

Build a knowledge graph from job descriptions using fine-tuned transformer-based Named Entity Recognition (NER) and spaCy’s relation extraction models.

Dec 7, 2021

Building An NLP Project From Zero To Hero (Project Overview)

Gentle refresher on the core concepts of NLP, project overview, data collection, preprocessing, labeling, model training, deployment, monitoring, and text mining.

Dec 14, 2021

Build An NLP Project From Zero To Hero (Data Collection)

NER model, extracting info from tweets, open datasets, public APIs, web scraping, Python Reddit API Wrapper, public open dataset, financial tweets, Kaggle, data quality.

Dec 26, 2021

Build An NLP Project From Zero To Hero (Preprocessing)

Data analysis, cleaning, integration, reduction, transformation, tokenization, normalization, denoising, preprocessing, NER model, data observation, text quality filtering, spaCy pre-annotation.

Jan 11, 2022

Build An NLP Project From Zero To Hero (Data Labeling)

UBIAI: a friendly user interface, domain knowledge consensus, data quality assessment, NER annotation, pre-annotation, metrics interface, model-assisted labeling.

Jan 11, 2022

Build An NLP Project From Zero To Hero (5): Model Training

Selecting the model, verifying the integrity of the input data, evaluating the model, training, and saving it.

Mar 31, 2022

Build An NLP Project From Zero To Hero (Model Integration)

Integrate a spaCy NLP model into a web application and use it to provide services to users over HTTP using the Twitter Developer API and Stock Market Tweets Analyzer.

Mar 31, 2022

Types of Text Data Annotation Techniques

Get the most accurate insights, transform your data, train your NLP models, extract important information, and deploy without developing any code.

June 7, 2022

Natural Language Processing and Optical Character Recognition

NLP roots, challenges, Optical Character Recognition, OCR stages, segmentation, entity recognition, NER, document classification, post-processing, dictionary & ML-based.

June 7, 2022

Use case of NLP in Supply Chain

Parsing with OCR technology, custom fine-tuned models, shipment document automation, translation, and summarization, monitoring change, obtaining data on benchmarks.

June 7, 2022

Is Weak Labeling Capable of Replacing Human-Labeled Data?

Unsupervised vs. fully supervised data labeling: step-by-step demonstration of NLP model performance trained on weakly labeled data versus hand-labeled data.

June 7, 2022

Fine-Tuning LayoutLM v2 For Invoice Recognition

Step-by-step tutorial for fine-tuning LayoutLM v2, from data annotation with the UBIAI OCR text annotation tool to model training and inference.

June 13, 2022

Auto-Label Data Using Transformer Models

Dictionary pre-annotation, NER, relation, and document classification Model training, spaCy and transformer training, model auto-labeling, custom trained models.

June 24, 2022

NLP, AI, Data Labeling & annotation in Manufacturing

UBIAI advantages in high-quality auto-labeling, process automation, inventory management, and operation optimization in manufacturing, healthcare, automotive, and advertising.

June 24, 2022

Invoice & semi-structured documents Processing with OCR

OCR annotation tools to convert text found in scanned documents into machine-readable text: scanned documents, printed text, images, handwritten text.

July 7, 2022

NLP, AI, Data Labeling & annotation in Insurance

Personalized customer experience, claim processing, fraud detection, underwriting process automation, annotating property, customer demographic, claims, and pricing data.

July 7, 2022

OCR & NLP are Transforming Supply Chain Industry

Automating data extraction, making sense of semi-structured data (invoice processes, purchase order maintenance), empowering the supply chain, NLP, OCR, and ML challenges.

July 18, 2022

NLP Applications and Importance in Contract Management

NLP and ML models’ applications and key benefits in business and Contract Lifecycle Management via data auto-labeling and team management solutions

August 9, 2022

Medical Report Using NER with spaCy Transformers and OCR with EasyOCR

Extract text from image files related to COVID-19 and recognize medical entities in unstructured text using fine-tuned spaCy transformers for NER and EasyOCR.

August 9, 2022

Multimodal Transformers for structured & unstructured data.

Fine-tune a pre-trained model for data classification: business understanding, work environment preparation, data understanding and preparation, results evaluation.

August 25, 2022

Machine Learning Ops tools for Natural Language Processing

Best practices for Machine Learning lifecycle approaches (traditional pipeline, advanced MLOps) in NLP projects.

August 25, 2022

Data Annotation (Text, Image, Audio, Video) in Supervised ML

Data annotation in supervised machine learning: semantic, instance, and panoptic segmentation; sentiment, semantic, and OCR annotation; text categorization.

September 19, 2022

Annotate Text Using Google Apps Script and ML APIs

Active learning is an ML technique that reduces the amount of labeled data needed to train a model by labeling the instances that are most likely to improve the model.

Sep 19, 2022

Annotate Text From Google Sheet Using Google Apps Script and Machine Learning APIs

Combine the power of Google Apps Script and Machine Learning APIs to annotate textual HR and recruitment data from Google Sheets.

October 9, 2022

Speed Up Data Labeling Using Clustering: Tools and Techniques for Enhanced Data Labeling

Speed up, reduce annotation cost & time, Data flow, Email labeling, supervised & unsupervised learning, data labeling, and Classification vs Clustering.

October 26, 2022

LayoutLM vs OCR-free Donut Model

Data extraction tutorial: using the OCR-free Donut model for document classification, document question answering, and synthetic data generation.

Nov 23, 2022

Types of sentiment analysis and their applications in Business

Sentiment analysis is a text classification technique that identifies and extracts data from the source material, allowing data analysts to gain a deeper understanding of the social perception surrounding their product or service while monitoring online chats …

November 23, 2022

Natural Language Processing use cases in Finance

Businesses are betting big on Natural Language Processing (NLP) to pick up their financial game in today’s digital age …

November 23, 2022

Natural Language Processing use cases in the insurance industry

Because of claims, insurance policies, and customer relationships, the insurance industry generates a large amount of unstructured text, making it difficult for insurers to leverage their datasets using traditional methods …

November 23, 2022

Top Open-source Optical Character Recognition programs

OCR is still a relatively new technology for business process automation, which is why most industries continue to rely on traditional systems …

Nov 23, 2022

6 Natural Language Processing Models you should know

Natural language processing, or NLP, is one of the most fascinating topics in artificial intelligence, and it has already spawned our everyday technological utilities …

December 22, 2022

NLP Data extraction tools as a military weapon

Nowadays, data has taken over the military and warfare sectors because whoever has accurate information at the right time has an advantage in operations and strategic moves…

December 13, 2022

GPT-3 : Use Cases, Advantages, and Limitations

GPT-3 requires a small amount of input text to generate large volumes of relevant and sophisticated machine-generated text such as code, stories, poems, and…

December 1, 2022

Customizable NLP Models

Strategic decision making is the key to all businesses’ success, but in order for a company to make accurate predictions and decisions at the right time, they must obtain accurate insights from…

January 4, 2023

Active Learning

What is active learning? Active learning is a special case of machine learning in which a learning algorithm can interactively query a user (or some other information source) to label new data points with the desired outputs…

UBIAI