
Retrieval Augmented Generation on Structured Data

Feb 19th, 2024

If you want to combine the capabilities of pretrained Large language models with external data sources for improved capabilities, then this article is for you. In this article, we aim to provide a comprehensive exploration of the application of Retrieval Augmented Generation (RAG) and its intricate relationship with large language models.
In this article we cover :

  • What is RAG
  • Why use RAG to improve LLMs
  • How does RAG work
  • Applications of RAG
  • Example of Application
  • Conclusion

1- What is RAG:

RAG, or Retrieval Augmented Generation, is an approach that combines the strengths of pre-trained large language models (LLMs), such as GPT-3 or GPT-4, with external data sources. By integrating these components, RAG pairs the sophisticated language understanding and generation capabilities of LLMs with the precision and depth of specialized data search techniques. This fusion lets the system deliver nuanced, precise responses and adapt dynamically to a wide range of user queries and information needs. The result is a versatile and robust framework that generates contextually relevant and informative outputs across diverse domains and applications.


2- Why use RAG to improve LLMs:

When using traditional large language models (LLMs) on their own, there are several limitations to consider:

  • Lack of Specific Information: LLMs are limited to providing generic answers based on their training data. If users ask specific questions about the software you sell or require in-depth troubleshooting guidance, traditional LLMs may not provide accurate responses. This is because they lack training on data specific to your organization, and their training data has a cutoff date, limiting their ability to offer up-to-date information.
  • Hallucinations: LLMs can “hallucinate,” meaning they might confidently generate false responses based on imagined facts. These algorithms can also provide off-topic responses if they don’t have an accurate answer to the user’s query, leading to a poor customer experience.
  • Generic Responses: Language models often provide generic responses that aren’t tailored to specific contexts. In customer support scenarios, personalized responses based on individual user preferences are usually necessary for a better customer experience.


RAG addresses these limitations by integrating the general knowledge base of LLMs with access to specific information, such as data from your product database and user manuals. This approach enables highly accurate and tailored responses that meet your organization’s needs.

3- How does RAG work:

Now that you understand what RAG is, let’s explore the steps involved in setting up this framework:

  • Step 1: Data Collection

Begin by gathering all the necessary data for your application. For a customer support chatbot in an electronics company, this might include user manuals, a product database, and a list of FAQs.

  • Step 2: Data Chunking

Data chunking involves breaking down your data into smaller, more manageable pieces. For example, a lengthy 100-page user manual can be divided into different sections, each potentially addressing different customer queries. This approach focuses each chunk on a specific topic, making retrieved information more directly applicable to user queries and improving efficiency by quickly obtaining relevant information instead of processing entire documents.
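To make this concrete, here is a minimal word-based chunking sketch. The chunk size, the overlap, and the user_manual.txt file are illustrative choices for this article, not part of any particular library:

def chunk_text(text, chunk_size=200, overlap=20):
    # Split on whitespace and group the words into overlapping chunks
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks

# Example: split a long user manual into overlapping ~200-word chunks
manual_text = open("user_manual.txt").read()  # hypothetical file
manual_chunks = chunk_text(manual_text)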

  • Step 3: Document Embeddings

After breaking down the source data, it needs to be converted into a vector representation using document embeddings. These numeric representations capture the semantic meaning behind the text, allowing the system to understand user queries and match them with relevant information in the source dataset based on meaning rather than a simple word-to-word comparison. This ensures that responses are relevant and aligned with the user’s query.
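Continuing the sketch above, the chunks can be turned into vectors with LangChain's OpenAIEmbeddings wrapper. Any embedding model would work; this one simply matches the LangChain stack used later in this article:

from langchain.embeddings import OpenAIEmbeddings

# Reads the OPENAI_API_KEY environment variable
embedder = OpenAIEmbeddings()
# Each chunk becomes a fixed-length vector that captures its meaning
chunk_vectors = embedder.embed_documents(manual_chunks)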

  • Step 4: Handling User Queries

When a user query enters the system, it is also converted into an embedding or vector representation using the same model as for document embeddings to ensure consistency. The system then compares the query embedding with the document embeddings, identifying and retrieving chunks whose embeddings are most similar to the query embedding using measures like cosine similarity and Euclidean distance. These chunks are considered the most relevant to the user’s query.
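Continuing the same sketch, the query is embedded with the same model and compared against every chunk vector. Here cosine similarity ranks the chunks and the top three are kept; both the example query and the cutoff of three are arbitrary:

import numpy as np

def cosine_sim(a, b):
    # Cosine similarity: dot product divided by the product of the norms
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vector = embedder.embed_query("How do I reset the device?")
scores = [cosine_sim(query_vector, v) for v in chunk_vectors]
# Indices of the three highest-scoring chunks, best first
top_idx = np.argsort(scores)[::-1][:3]
top_chunks = [manual_chunks[i] for i in top_idx]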

  • Step 5: Generating Responses with an LLM

The retrieved text chunks and the initial user query are fed into a language model, which uses this information to generate a coherent response to the user's questions through a chat interface.
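To close the loop on the sketch, the retrieved chunks and the question are combined into one prompt and passed to a chat model; the gpt-3.5-turbo model name and the prompt wording are assumptions made for illustration:

from langchain.chat_models import ChatOpenAI

question = "How do I reset the device?"
context = "\n\n".join(top_chunks)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
print(chat.predict(prompt))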


To seamlessly execute these steps for generating responses with LLMs, you can use a data framework like LlamaIndex. This solution enables you to develop your own LLM applications by efficiently managing the flow of information from external data sources to language models like GPT-3. To learn more about this framework and how to build LLM-based applications, read our tutorial on LlamaIndex.
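For a sense of how compact this becomes with such a framework, here is a minimal LlamaIndex sketch (pre-0.10 import style; the data folder and the query are placeholders):

from llama_index import VectorStoreIndex, SimpleDirectoryReader

# Load documents from a local folder and build a vector index over them
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# One call handles embedding the query, retrieval, and generation
query_engine = index.as_query_engine()
print(query_engine.query("How do I reset the device?"))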

4- Applications of RAG:

Retrieval Augmented Generation (RAG) has several applications across various domains. Some key applications include:

  • Information Retrieval: RAG can be used to enhance traditional information retrieval systems by generating contextually relevant responses to user queries. It can improve the quality of search results by incorporating both retrieval and generation capabilities.
  • Question Answering: RAG can be applied to question answering tasks, where it can retrieve relevant passages from a knowledge base and generate concise and accurate answers to user questions.
  • Language Understanding: RAG can improve language understanding tasks by leveraging large-scale pre-trained language models to retrieve and generate text that captures nuanced meanings and context.
  • Conversation Systems: RAG can enhance conversational AI systems by enabling them to retrieve and generate relevant responses based on the context of the conversation. This can lead to more engaging and natural interactions with users.
  • Content Creation: RAG can be used in content creation tasks such as summarization, paraphrasing, and text generation. It can help in generating diverse and coherent content based on retrieved information.

5- Example of Application:

Let’s set up a system where queries about job opportunities are answered using LangChain for database interaction and OpenAI’s GPT-3.5 model for natural language responses. The generate function takes a query, retrieves relevant information from the database, constructs a prompt with the query and database context, and uses the LangChain ChatOpenAI model to generate a response (a minimal sketch of this function appears after the helper code below).


First, let’s install the needed libraries (langchain_experimental is required for the SQL chain used later):

!pip install -q langchain
!pip install -q langchain_experimental
!pip install -q openai

Then set the OpenAI API key. You can get your key from the official OpenAI website.

import os
os.environ['OPENAI_API_KEY'] = 'Enter your OpenAI key'

import numpy as np
import pandas as pd
import sqlite3

# If you have your data in Excel, load it into a DataFrame
df = pd.read_excel("Your Excel file")

Now let’s check the data we’re working with. In our example, the DataFrame contains the following columns:

df.columns

The output lists the columns of our dataset (NAME, EMAIL, SKILL, JOBTITLE, and so on), matching the table schema below.
import sqlite3

# Connect to the SQLite database (created if it doesn't exist)
conn = sqlite3.connect('People.sqlite')
c = conn.cursor()

# Create the People table if it doesn't exist
c.execute('''CREATE TABLE IF NOT EXISTS People (
    _File TEXT,
    CERTIFICATE TEXT,
    CHARACTERISTIC TEXT,
    COMPANY TEXT,
    DATE TEXT,
    EDUCATION TEXT,
    EMAIL TEXT,
    INDUSTRY TEXT,
    JOBTITLE TEXT,
    LANGUAGES TEXT,
    LOCATION TEXT,
    NAME TEXT,
    NUMBER TEXT,
    SKILL TEXT,
    TIME TEXT,
    URL TEXT,
    Text TEXT
)''')
conn.commit()

# Insert data from the DataFrame into the People table
df.to_sql('People', conn, if_exists='replace', index=False)

# Retrieve and print all rows from the People table
c.execute('SELECT * FROM People')
for row in c.fetchall():
    print(row)

At this stage, the SQL database has been created from the Excel file.


Now let us define a function that takes an SQL query and a database file as input and returns the query results from the database.

import sqlite3

def read_sql_query(sql, db):
    # Open a connection, run the query, and return all rows
    conn = sqlite3.connect(db)
    cur = conn.cursor()
    cur.execute(sql)
    rows = cur.fetchall()
    conn.close()
    return rows

# Example usage
db_file = 'People.sqlite'
sql_query = 'SELECT * FROM People'
result_rows = read_sql_query(sql_query, db_file)

for row in result_rows:
    print(row)
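With this helper in place, the generate function described at the start of this section can be sketched. This is a minimal illustration, not a full implementation: the SQL query supplying the context is fixed by hand here, and the gpt-3.5-turbo model name is an assumption. The SQLDatabaseChain used below automates the SQL-generation step instead:

from langchain.chat_models import ChatOpenAI

def generate(user_query, sql, db_file):
    # Retrieve context rows from the database (fixed SQL query for illustration)
    rows = read_sql_query(sql, db_file)
    # Construct a prompt combining the user query and the database context
    prompt = (
        "Answer the question using only the database rows below.\n"
        f"Rows: {rows}\n\n"
        f"Question: {user_query}"
    )
    # Generate a natural-language response with GPT-3.5
    chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
    return chat.predict(prompt)

# Example usage
print(generate("Which candidates know Python?", "SELECT NAME, SKILL FROM People", "People.sqlite"))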

Next, import the libraries needed for the LangChain SQL chain:

from langchain.llms import OpenAI
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

Now create an instance of SQLDatabase pointing at our SQLite file, plus an instance of the OpenAI LLM. In this example, we set the temperature to 0 so the model's output is deterministic.

input_db = SQLDatabase.from_uri('sqlite:///People.sqlite')
llm_1 = OpenAI(temperature=0)

Then set up the LangChain SQL chain. Under the hood, SQLDatabaseChain uses the LLM to translate a natural-language question into a SQL query, runs that query against the database, and has the LLM phrase the result as an answer; with verbose=True, the intermediate SQL is printed along the way.

db_agent = SQLDatabaseChain.from_llm(llm=llm_1, db=input_db, verbose=True)

Question 1: 

db_agent.run("Give me the top 5 skills")

Question 2:

db_agent.run("Give me the top 5 candidates that have experience in Python, Angular and AWS")

6- Conclusion:

In conclusion, Retrieval Augmented Generation (RAG) stands out as a leading technique for combining the language capabilities of Large Language Models (LLMs) with specialized databases. By grounding generation in retrieved data, RAG systems address critical challenges in natural language processing, such as outdated knowledge and hallucination.

Despite their advancements, RAG applications are not without limitations, particularly in their dependence on the quality of the input data. To maximize the effectiveness of RAG systems, human oversight is essential. Careful curation of data sources, combined with expert knowledge, is crucial to guarantee the reliability of these solutions.
