
Instruction Fine Tuning on UbiAI

Dec 27th, 2024

What is Instruction Fine Tuning and Why is it Important?

Instruction fine-tuning is one of many specialized techniques designed to enhance the performance and usability of large language models. It is the process of training a model on explicit instructions paired with desired responses so that it can understand and handle complex queries more reliably.

Unlike traditional fine-tuning, which tunes the model's parameters on a broader variety of data, instruction fine-tuning uses only instruction-response pairs, which helps the model align more closely with the user's intended meaning. This matters because better interaction quality translates into higher user satisfaction.

How Instruction Fine-Tuning Works

Instruction fine-tuning relies on supervised learning using datasets comprising labeled instruction-response pairs. Each data point typically includes:

  1. Instruction: A natural language prompt specifying a task.
  2. Optional Context: Additional information to clarify the instruction.
  3. Desired Output: The expected response, which serves as the ground truth during training.

This process optimizes the model’s parameters to improve its ability to follow instructions, thereby making its responses more useful and predictable.
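
As a quick illustration, a single training example could look like the sketch below (a minimal Python example; the field names are illustrative, not a required schema):

# A minimal, illustrative instruction-tuning example (field names are hypothetical)
example = {
    "instruction": "Classify the sentiment of the following review as positive or negative.",
    "context": "The battery lasts two full days and the screen is gorgeous.",
    "output": "positive",
}

# During training, the instruction (plus optional context) is the model input,
# and the desired output is the ground-truth target the model learns to reproduce.
print(example["instruction"], "->", example["output"])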

Let’s say, for example, that your goal is to teach an LLM to reason and analyze its mistakes.

A good dataset for this is the fluently-sets/reasoning-1-1k dataset on Hugging Face.

How Instruction Fine-Tuning Works - huggingface
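
If you want to inspect the data before importing it into UbiAI, here is a minimal sketch using the Hugging Face datasets library (assuming the dataset exposes a train split; the column names are the ones we map later in UbiAI):

from datasets import load_dataset

# Load the reasoning dataset from the Hugging Face Hub (assumes a "train" split)
ds = load_dataset("fluently-sets/reasoning-1-1k", split="train")

# Columns we will map in UbiAI: prompt, system_prompt, completion
print(ds.column_names)
print(ds[0]["prompt"][:200])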

Instruction Fine-Tuning using UbiAI:

After downloading the dataset, head over to UbiAI and import it.

Instruction Fine-Tuning using UbiAI: Import dataset

Select Prompt and Response, as is the case with this dataset.

Instruction Fine-Tuning using UbiAI: upload dataset

Click on Upload dataset and upload the file we downloaded from Hugging Face.

Instruction Fine-Tuning using UbiAI: dataset details

Fill out the name and language of the file and click Next.

The file is in Apache Parquet (.parquet) format, a format designed for storing tabular data on disk. It is based on the format described in Google’s Dremel paper (Dremel later became BigQuery).

Parquet files store data in a binary format, which means that they can be efficiently read by computers but are difficult for people to read.

You can easily convert it to CSV format with a few lines of pandas:

				
pip install pandas pyarrow   # pandas needs a Parquet engine such as pyarrow

import pandas as pd

# Read the Parquet file and write it back out as CSV
df = pd.read_parquet('path/to/file.parquet')
df.to_csv('path/to/file.csv', index=False)

On this page, upload your dataset and fill out the necessary columns for data mapping:

  • User Prompt: maps to the prompt column.
  • System Prompt: maps to the system_prompt column.
  • Response: maps to the completion column.

Then click on Finish.

Instruction Fine-Tuning using UbiAI: validate dataset

After that, you need to validate each row of the dataset.

The next step is to assign the dataset we created and create a model; we will choose ‘gpt4’.

Instruction Fine-Tuning using UbiAI: training the model

And that’s it! Now you have just trained your model using Instruction Fine-Tuning!

Now, let’s compare our model with the untrained Llama 3.1 8B. For that, we are going to use the Groq playground and the model hosted there.

First Test: Math Analysis

Prompt:

				
Two trains, Train A and Train B, are traveling toward each other on the same straight track. Train A is 300 meters long and moving at a speed of 60 km/h. Train B is 200 meters long and moving at a speed of 40 km/h. There is a tunnel that is 500 meters long between them.

Train A enters the tunnel at the same time Train B starts moving.
Train B must clear the tunnel completely before the trains collide.

Question: Will Train B make it out of the tunnel in time, or will the trains collide?

Correct answer:

Train B will not make it out of the tunnel in time, and the trains will collide.

Untrained Llama 3.1 8B:


Fine-Tuned ‘llama3-1-b’:


Second Test: Logical Puzzles

Prompt:

				
Tanya is older than Eric.
Cliff is older than Tanya.
Eric is older than Cliff.

If the first two statements are true, the third statement is
a. true
b. false
c. uncertain

Correct answer:

b. false. If Cliff is older than Tanya and Tanya is older than Eric, then Cliff must be older than Eric, so the third statement cannot be true.

Untrained Llama 3.1 8B:


Fine-Tuned ‘llama3-1-b’:


Third Test: Probability Problem

Prompt:

				
You have 2 boxes:

Box 1 contains 3 red balls and 2 blue balls.
Box 2 contains 1 red ball and 4 blue balls.
You randomly pick one box and then randomly pick one ball from that box.

Question:
What is the probability of picking a red ball?

Correct answer:

The probability of picking a red ball is 2/5.
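
As a quick sanity check, this follows from the law of total probability: each box is chosen with probability 1/2, so P(red) = 1/2 × 3/5 + 1/2 × 1/5 = 3/10 + 1/10 = 2/5. A few lines of Python confirm it:

from fractions import Fraction

# Law of total probability: average the per-box chance of drawing a red ball
p_box = Fraction(1, 2)
p_red_box1 = Fraction(3, 5)  # 3 red balls out of 5 in Box 1
p_red_box2 = Fraction(1, 5)  # 1 red ball out of 5 in Box 2

p_red = p_box * p_red_box1 + p_box * p_red_box2
print(p_red)  # 2/5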

Untrained Llama 3.1 8B:


Fine-Tuned ‘llama3-1-b’:


Final Test: Task Scheduling with Resource Constraints

Prompt:

				
You are a project manager tasked with scheduling three tasks for a team of workers. The tasks and their requirements are as follows:

Task A requires 4 hours of work and must be done by Worker 1.
Task B requires 3 hours of work and must be done by Worker 2.
Task C requires 5 hours of work and can be done by either Worker 1 or Worker 2.
You have two workers available, Worker 1 and Worker 2, who each have a maximum of 8 hours available to work.

Question:
How can you schedule the tasks so that all tasks are completed within the available work hours, while adhering to the constraints (who can work on which task and the total number of hours)?

Correct Answer:

Assign Task A to Worker 1 (4 hours) and Tasks B and C to Worker 2 (3 + 5 = 8 hours). All tasks are completed within the available work hours: Worker 1 has 4 hours left, and Worker 2 is fully booked.
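
To double-check this schedule (Task A to Worker 1, Tasks B and C to Worker 2), a small Python sketch can total the assigned hours per worker against the 8-hour limit:

# Schedule from the answer above: task -> (hours, assigned worker)
schedule = {"A": (4, "Worker 1"), "B": (3, "Worker 2"), "C": (5, "Worker 2")}
LIMIT = 8

hours = {}
for task, (h, worker) in schedule.items():
    hours[worker] = hours.get(worker, 0) + h

for worker, total in sorted(hours.items()):
    print(f"{worker}: {total}h used, {LIMIT - total}h left")

# Both workers stay within their 8 available hours
assert all(total <= LIMIT for total in hours.values())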

Untrained ‘Llama 3.1 8B’:


Fine-Tuned ‘llama3-1-b’:


Conclusion

Although both models are based on Llama 3.1 8B, the fine-tuned model performs noticeably better, producing more structured and logical outputs because it was instruction-tuned on a reasoning dataset.

 

This demonstrates that instruction fine-tuning significantly enhances our Large Language Model’s performance on specific tasks, enabling it to produce more structured and accurate outputs.
